Research and Publications
Auto-Reg: A Dynamic AutoML Framework for Streaming Regression
Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025)
Introduction
Auto‑Reg is a dynamic AutoML framework for streaming regression, designed for situations where data arrives continuously and changes over time. Instead of relying on static hyperparameter tuning or fixed models, Auto‑Reg continuously searches, updates, and improves entire machine‑learning pipelines as the data stream evolves. Its goal is to deliver strong predictive performance while staying efficient under strict time and memory constraints.
Problems
Traditional AutoML assumes access to the full dataset and repeated retraining, neither of which is possible on a real‑time stream.
Most streaming AutoML work focuses on classification, leaving regression underexplored.
Changing data patterns require models and hyperparameters to adapt continuously.
Stream learners must update quickly without exceeding time or memory limits.
Many AutoML search strategies converge too slowly for fast‑moving data.
Existing methods often tune only the model, ignoring preprocessing and feature selection.
Method
Auto‑Reg maintains a set of candidate pipelines (preprocessing + feature selection + model + hyperparameters) and updates them in exploration windows. At each window, it identifies the best pipeline so far and generates new candidates using Probability‑Weighted Hyperparameter Sampling (PWHS), which focuses the search around promising regions while still exploring new options. It also uses dynamic budget allocation, adjusting how much effort goes into exploration vs. exploitation based on recent performance. This creates a continuous, adaptive AutoML process suited for evolving data streams.
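The summary above does not give the exact PWHS formula or budget rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a small categorical search space (`SEARCH_SPACE`), a single `focus` parameter that sets how much probability mass goes to the best pipeline's current values, and a toy rule (`next_focus`) for shifting between exploration and exploitation. All of these names and values are hypothetical.

```python
import random

# Hypothetical search space: every name and value here is an illustrative assumption.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "model": ["linear", "knn", "tree"],
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
}


def sample_near_best(best, space, focus=0.7, rng=random):
    """Draw a new configuration, weighting each option by whether `best` uses it.

    For every hyperparameter, the value used by the current best pipeline
    gets weight `focus`; the remaining mass is split evenly among the other
    options, so every option keeps a non-zero chance of being explored.
    """
    candidate = {}
    for name, options in space.items():
        others = max(len(options) - 1, 1)  # avoid dividing by zero for single-option entries
        weights = [
            focus if option == best[name] else (1.0 - focus) / others
            for option in options
        ]
        candidate[name] = rng.choices(options, weights=weights, k=1)[0]
    return candidate


def next_focus(focus, improved, step=0.1, low=0.3, high=0.9):
    """Toy budget rule (an assumption): exploit more after an improvement, explore more otherwise."""
    focus = focus + step if improved else focus - step
    return min(high, max(low, focus))
```

In a windowed loop, one would call `sample_near_best` at the end of each exploration window to generate new candidates around the current best pipeline, then call `next_focus` with a flag saying whether the best pipeline's error improved over the window.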
Key outcomes
Auto‑Reg outperforms major baselines across synthetic and real regression streams.
Dynamic parameters allow rapid response to drift and performance changes.
Time and memory usage remain competitive despite full‑pipeline optimization.
Using an ensemble of top pipelines improves robustness.
PWHS ensures the search can reach the optimal pipeline under stable conditions.
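The ensemble outcome listed above can be illustrated with a short sketch. The summary does not say how Auto‑Reg combines its top pipelines, so a plain average over the `k` pipelines with the lowest recent error is assumed here. The function name, the dictionary layout, and the default `k=3` are all hypothetical.

```python
from statistics import fmean


def top_k_ensemble_predict(pipelines, recent_errors, x, k=3):
    """Average the predictions of the k pipelines with the lowest recent error.

    `pipelines` maps a name to a callable x -> prediction; `recent_errors`
    maps the same names to an error measured on the latest window.
    Both structures are hypothetical stand-ins for real pipeline objects.
    """
    ranked = sorted(pipelines, key=lambda name: recent_errors[name])
    return fmean([pipelines[name](x) for name in ranked[:k]])
```

Averaging means that a single pipeline degrading after a drift has a limited effect on the combined prediction until the ranking is refreshed in the next window.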
Findings
When dynamic AutoML is combined with probability‑weighted search and adaptive resource allocation, it can reliably outperform both traditional stream learners and existing AutoML systems in streaming regression tasks. Auto‑Reg shows that full‑pipeline optimization, not just hyperparameter tuning, is essential for handling evolving data streams, and that a carefully balanced exploration–exploitation strategy leads to consistently strong performance over time.
Journal Publications
RMIDDM: an unsupervised and interpretable concept drift detection method for data streams
Bayesian Stream Tuner: Dynamic Hyperparameter Optimization for Real-Time Data Streams
Conference Publications
How to cite this article
APA 7th: Gomes, H. M., Read, J., Bifet, A., Barddal, J. P., Enembreck, F., & Pfahringer, B. (2025). SLEADE: Disagreement‑based semi‑supervised learning for sparsely labeled evolving data streams. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050
MLA 9th: Gomes, Heitor M., et al. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering, 2025, https://doi.org/10.1109/TKDE.2025.3647050.
Chicago (Author-Date): Gomes, Heitor M., Jesse Read, Maciej Grzenda, Bernhard Pfahringer, and Albert Bifet. 2025. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050.
The University of Waikato
University of Canterbury
The University of Auckland
Victoria University of Wellington
MetService
Beca