Streaming Isolation Forest
Liu, J., Liu, F., Bifet, A., Pfahringer, B., & Cassales, G. (2025)
Introduction
Streaming Isolation Forest (SIF) is an online anomaly‑detection method designed for evolving data streams, where data arrives continuously and patterns change over time. It extends the classic Isolation Forest algorithm, originally built for static datasets, so it can operate efficiently, adapt to drift, and detect anomalies in real time without storing historical data.
Problems
The original Isolation Forest algorithm needs access to the full dataset and cannot update incrementally
Real‑world streams change over time, so models must adapt continuously
Storing all past data is impractical in high‑velocity streams
Anomaly detection must work without labeled feedback
Online updates must preserve the randomness and structure that make Isolation Forest effective
Method
Streaming Isolation Forest updates its isolation trees incrementally as new data arrives instead of rebuilding the whole forest from scratch. Each incoming instance is used to refresh or replace parts of the existing trees, so the model adapts continuously to changes in the data stream. To handle concept drift, the algorithm uses mechanisms such as sliding windows or decay functions, so that older, less relevant data gradually loses influence. Updates draw only on small samples of data, which keeps computation fast and memory usage low. Anomaly scores are computed against the current structure of the trees, so unusual or rare patterns can be detected in real time without storing historical data.
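The general idea can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is one simple way to combine a bounded sliding window with periodic tree replacement, assuming the standard Isolation Forest path-length score. The class name `SlidingWindowIsolationForest`, its parameters (`n_trees`, `sample_size`, `window`, `refresh_every`), and the neutral score of 0.5 before any tree exists are all hypothetical choices made for illustration.

```python
import math
import random
from collections import deque

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, rng, depth, max_depth):
    """Isolate points with random axis-parallel splits, as in a classic isolation tree."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def path_length(tree, x):
    """Depth at which x lands, plus the usual correction for leaves left unsplit."""
    depth = 0
    while "dim" in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
        depth += 1
    return depth + c_factor(tree["size"])


class SlidingWindowIsolationForest:
    """Hypothetical sketch: bounded window of recent points, oldest tree replaced periodically."""

    def __init__(self, n_trees=25, sample_size=64, window=256, refresh_every=16, seed=0):
        self.sample_size = sample_size
        self.refresh_every = refresh_every
        self.window = deque(maxlen=window)  # bounded memory: old points fall out
        self.trees = deque(maxlen=n_trees)  # appending drops the oldest tree
        self.rng = random.Random(seed)
        self.seen = 0

    def score(self, x):
        """Anomaly score in (0, 1]; higher means the point is easier to isolate."""
        if not self.trees:
            return 0.5  # no model yet: neutral score (an arbitrary assumption)
        # Each tree is normalised by the sample size it was built from.
        norm = [path_length(tree, x) / c_factor(n) for tree, n in self.trees]
        return 2.0 ** (-sum(norm) / len(norm))

    def update(self, x):
        """Absorb one instance and, every refresh_every instances, add one new tree."""
        self.window.append(tuple(x))
        self.seen += 1
        if self.seen % self.refresh_every == 0 and len(self.window) >= 2:
            n = min(self.sample_size, len(self.window))
            sample = self.rng.sample(list(self.window), n)
            max_depth = math.ceil(math.log2(n))
            self.trees.append((build_tree(sample, self.rng, 0, max_depth), n))


# Usage: score each point before learning from it (test-then-train).
stream_rng = random.Random(1)
forest = SlidingWindowIsolationForest()
for _ in range(500):
    point = (stream_rng.gauss(0, 1), stream_rng.gauss(0, 1))
    current_score = forest.score(point)
    forest.update(point)
```

In this sketch, memory is bounded by the window size and the number of trees, and drift is handled only by letting old points and old trees age out. A decay-based variant would weight older data down rather than discarding it outright.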
Key outcomes
SIF detects anomalies with low latency
The model adapts to changing data distributions
Historical data does not need to be stored
SIF performs competitively against other streaming anomaly‑detection methods
The method is efficient enough for real‑world, high‑throughput applications
Findings
The core finding is that Isolation Forest can be successfully adapted to streaming environments by using incremental tree updates and adaptive scoring. Streaming Isolation Forest maintains the strengths of the original algorithm while enabling fast, memory‑efficient, and drift‑aware anomaly detection in evolving data streams.
How to cite this article
APA 7th: Liu, J. J., Cassales, G. W., Liu, F. T., Pfahringer, B., & Bifet, A. (2025). Streaming Isolation Forest. In X. Wu et al. (Eds.), Advances in Knowledge Discovery and Data Mining (PAKDD 2025, Lecture Notes in Computer Science, Vol. 15870). Springer. https://doi.org/10.1007/978-981-96-8170-9_8
MLA 9th: Liu, Justin Jia, et al. “Streaming Isolation Forest.” Advances in Knowledge Discovery and Data Mining, edited by Xindong Wu et al., vol. 15870, Lecture Notes in Computer Science, Springer, 2025. https://doi.org/10.1007/978-981-96-8170-9_8.
Chicago (Author-Date): Liu, Justin Jia, G. W. Cassales, F. T. Liu, Bernhard Pfahringer, and Albert Bifet. 2025. “Streaming Isolation Forest.” In Advances in Knowledge Discovery and Data Mining, edited by Xindong Wu et al., Lecture Notes in Computer Science, vol. 15870. Singapore: Springer. https://doi.org/10.1007/978-981-96-8170-9_8.