Adaptive random forests for evolving data stream classification
Introduction
This research addresses the growing need for machine-learning methods that can learn from continuous, real-time data streams, an increasingly common requirement in areas such as recommendation systems, autonomous vehicles, financial forecasting, and environmental monitoring. In these settings, data arrives rapidly, cannot be stored indefinitely, and often changes over time, so models must adapt quickly and remain accurate as the underlying patterns shift.
Outcomes
The study introduces Adaptive Random Forests (ARF), a streaming classifier that extends the traditional Random Forest algorithm to operate effectively on evolving data streams. ARF combines a theoretically grounded online resampling method with adaptive mechanisms that detect warnings and concept drift at the level of individual trees. When drift is detected in a tree, that outdated tree is replaced with a newly trained one, so the ensemble evolves alongside the data. The method is designed to be flexible, scalable, and compatible with multiple drift-detection techniques. The authors also demonstrate that a parallel implementation shows no loss of classification performance relative to a serial one. Extensive evaluation shows that ARF achieves strong accuracy and resource efficiency compared with state-of-the-art streaming algorithms.
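The per-tree adaptation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. All class names are hypothetical, a majority-class learner stands in for the streaming decision trees, the detector is a toy windowed error-rate check rather than any specific drift-detection technique, and the Poisson rate `lam=6.0` is an illustrative default not taken from this summary. Training a background learner once a warning is raised is one plausible reading of "replaced with newly trained ones".

```python
import math
import random
from collections import Counter


class MajorityClassLearner:
    """Stand-in base learner (assumption): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y, weight=1):
        self.counts[y] += weight


class WindowErrorDetector:
    """Toy detector (assumption): reports warning/drift from the recent error rate."""

    def __init__(self, window=30, warn_at=0.3, drift_at=0.5):
        self.window, self.warn_at, self.drift_at = window, warn_at, drift_at
        self.errors = []

    def update(self, is_error):
        self.errors = (self.errors + [int(is_error)])[-self.window:]
        if len(self.errors) < self.window:
            return "stable"
        rate = sum(self.errors) / self.window
        if rate >= self.drift_at:
            return "drift"
        return "warning" if rate >= self.warn_at else "stable"


def poisson(rng, lam):
    """Sample from Poisson(lam) using Knuth's multiplication method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class Member:
    """One ensemble member: a learner, its own detector, an optional background learner."""

    def __init__(self):
        self.learner = MajorityClassLearner()
        self.detector = WindowErrorDetector()
        self.background = None


class ToyAdaptiveForest:
    def __init__(self, n_members=5, lam=6.0, seed=0):
        self.rng = random.Random(seed)
        self.lam = lam
        self.members = [Member() for _ in range(n_members)]
        self.replacements = 0

    def predict(self, x):
        votes = Counter(m.learner.predict(x) for m in self.members)
        votes.pop(None, None)  # ignore members that have not learned anything yet
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, y):
        for m in self.members:
            # Test-then-train: score the member before it sees the label.
            state = m.detector.update(m.learner.predict(x) != y)
            if state == "warning" and m.background is None:
                m.background = MajorityClassLearner()
            elif state == "drift":
                # Swap in the background learner if one exists, else start fresh.
                fresh = m.background if m.background is not None else MajorityClassLearner()
                m.learner, m.background = fresh, None
                m.detector = WindowErrorDetector()
                self.replacements += 1
            # Online resampling: each member weights the instance by a Poisson draw.
            k = poisson(self.rng, self.lam)
            if k > 0:
                m.learner.learn(x, y, k)
                if m.background is not None:
                    m.background.learn(x, y, k)


# Synthetic stream with an abrupt change: label "a" for 200 steps, then "b".
forest = ToyAdaptiveForest()
for _ in range(200):
    forest.learn(None, "a")
before = forest.predict(None)
for _ in range(200):
    forest.learn(None, "b")
after = forest.predict(None)
```

In this toy stream the ensemble's vote follows the label change because each member is replaced independently once its own detector fires. Because members share no state during learning, the per-member loop is the natural unit to parallelise.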
This work features contributions from Professor Albert Bifet, Bernhard Pfahringer, Heitor M. Gomes, and Geoff Holmes, whose expertise supports TAIAO's mission to develop adaptive, real-time AI methods for environmental and ecological data streams.
Gomes, Heitor Murilo, et al. “Adaptive Random Forests for Evolving Data Stream Classification.” Machine Learning, vol. 106, no. 9–10, 2017, pp. 1469–1495.