Streaming Isolation Forest

Liu, J., Liu, F., Bifet, A., Pfahringer, B., & Cassales, G. (2025)

Introduction

Streaming Isolation Forest (SIF) is an online anomaly‑detection method designed for evolving data streams, where data arrives continuously and patterns change over time. It extends the classic Isolation Forest algorithm, originally built for static datasets, so it can operate efficiently, adapt to drift, and detect anomalies in real time without storing historical data.
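For context, the score used by the classic (batch) Isolation Forest of Liu, Ting, and Zhou (2008) can be written as below. This is the well-known batch formulation, given only as background; the summary does not state how SIF adapts it. Here $n$ is the subsample size used to build each tree, $h(x)$ is the path length of instance $x$ in one tree, and $E[h(x)]$ is its mean over the forest.

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}},
\qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n},
\qquad
H(i) \approx \ln i + 0.5772156649
```

Scores close to 1 indicate instances that are isolated after few splits (likely anomalies), while scores well below 0.5 indicate instances that sit deep in the trees (likely normal).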

Problems

  • The original Isolation Forest algorithm needs access to the full dataset and cannot be updated incrementally

  • Real‑world streams change over time, so models must adapt continuously

  • Storing all past data is impractical in high‑velocity streams

  • Anomaly detection must work without labeled feedback

  • Online updates must preserve the randomness and structure that make Isolation Forest effective

Method

Streaming Isolation Forest updates its isolation trees incrementally as new data arrives instead of rebuilding the entire forest from scratch. Each incoming instance is used to partially refresh or replace parts of the existing trees, so the model adapts continuously to changes in the data stream. To handle concept drift, the algorithm incorporates mechanisms such as sliding windows or decay functions, so that older, less relevant data gradually loses influence. Updates draw on only small samples of data, which keeps computation fast and memory usage low. Anomaly scores are computed from the current structure of the trees, so the system can detect unusual or rare patterns in real time without storing historical data.
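The general idea of refreshing a forest from recent data can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified sliding-window variant in which the oldest tree is periodically replaced by one grown on a random sample of the most recent instances. The class name `SlidingWindowIsolationForest` and the parameters `n_trees`, `window_size`, `sample_size`, and `refresh_every` are hypothetical choices made for illustration, and scoring uses the classic batch Isolation Forest formula.

```python
import math
import random
from collections import deque

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Grow one isolation tree with random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    return {
        "dim": dim,
        "split": split,
        "left": build_tree([p for p in points if p[dim] < split], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p[dim] >= split], depth + 1, max_depth, rng),
    }


def path_length(node, x):
    """Depth at which x reaches a leaf, plus the usual correction for unsplit leaves."""
    depth = 0
    while "dim" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


class SlidingWindowIsolationForest:
    """Hypothetical illustration only; not the SIF algorithm from the paper."""

    def __init__(self, n_trees=25, window_size=256, sample_size=64, refresh_every=32, seed=0):
        self.n_trees = n_trees
        self.sample_size = sample_size
        self.refresh_every = refresh_every
        self.max_depth = math.ceil(math.log2(sample_size))
        self.window = deque(maxlen=window_size)  # bounded buffer of recent instances
        self.trees = deque(maxlen=n_trees)       # appending to a full deque drops the oldest tree
        self.rng = random.Random(seed)
        self.seen = 0

    def learn(self, x):
        """Buffer x; every `refresh_every` instances, grow trees on the recent window."""
        self.window.append(tuple(x))
        self.seen += 1
        if len(self.window) < self.sample_size or self.seen % self.refresh_every:
            return
        # Fill the forest on the first refresh, then replace one tree per refresh.
        n_new = max(1, self.n_trees - len(self.trees))
        for _ in range(n_new):
            sample = self.rng.sample(list(self.window), self.sample_size)
            self.trees.append(build_tree(sample, 0, self.max_depth, self.rng))

    def score(self, x):
        """Classic Isolation Forest score in (0, 1]; None until the first trees exist."""
        if not self.trees:
            return None
        mean_depth = sum(path_length(t, x) for t in self.trees) / len(self.trees)
        return 2.0 ** (-mean_depth / c_factor(self.sample_size))


# Usage: learn from a stream centred on the origin, then score two points.
stream_rng = random.Random(1)
model = SlidingWindowIsolationForest()
for _ in range(500):
    model.learn((stream_rng.gauss(0, 1), stream_rng.gauss(0, 1)))
print(model.score((0.0, 0.0)), model.score((10.0, 10.0)))
```

In this sketch the point far from the learned cluster should receive the higher score. Because old trees are dropped as new ones are grown, a region that becomes densely populated after a drift is gradually scored as normal. Note that this variant keeps a bounded window of recent instances, so memory stays constant but is not zero.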

Key outcomes

  • SIF detects anomalies quickly with low latency

  • The model adapts to changing data distributions

  • It does not need to store historical data

  • It performs competitively against other streaming anomaly‑detection methods

  • It is efficient enough for real‑world, high‑throughput applications

Findings

The core finding is that Isolation Forest can be successfully adapted to streaming environments by using incremental tree updates and adaptive scoring. Streaming Isolation Forest maintains the strengths of the original algorithm while enabling fast, memory‑efficient, and drift‑aware anomaly detection in evolving data streams.

How to cite this article

APA 7th: Liu, J. J., Cassales, G. W., Liu, F. T., Pfahringer, B., & Bifet, A. (2025). Streaming Isolation Forest. In X. Wu et al. (Eds.), Advances in Knowledge Discovery and Data Mining (PAKDD 2025, Lecture Notes in Computer Science, Vol. 15870). Springer. https://doi.org/10.1007/978-981-96-8170-9_8

MLA 9th: Liu, Justin Jia, et al. “Streaming Isolation Forest.” Advances in Knowledge Discovery and Data Mining, edited by Xindong Wu et al., vol. 15870, Lecture Notes in Computer Science, Springer, 2025. https://doi.org/10.1007/978-981-96-8170-9_8.

Chicago (Author-Date): Liu, Justin Jia, G. W. Cassales, F. T. Liu, Bernhard Pfahringer, and Albert Bifet. 2025. “Streaming Isolation Forest.” In Advances in Knowledge Discovery and Data Mining, edited by Xindong Wu et al., Lecture Notes in Computer Science, vol. 15870. Singapore: Springer. https://doi.org/10.1007/978-981-96-8170-9_8.