Auto-Reg: A Dynamic AutoML Framework for Streaming Regression

Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025)

Introduction

Auto‑Reg is a dynamic AutoML framework for streaming regression, designed for situations where data arrives continuously and changes over time. Instead of relying on static hyperparameter tuning or fixed models, Auto‑Reg continuously searches, updates, and improves entire machine‑learning pipelines as the data stream evolves. Its goal is to deliver strong predictive performance while staying efficient under strict time and memory constraints.

Problems

  • Traditional AutoML assumes access to the full dataset and repeated retraining, neither of which is feasible in real‑time streams.

  • Most streaming AutoML work focuses on classification, leaving regression underexplored.

  • Changing data patterns require models and hyperparameters to adapt continuously.

  • Stream learners must update quickly without exceeding time or memory limits.

  • Many AutoML search strategies converge too slowly for fast‑moving data.

  • Existing methods often tune only the model, ignoring preprocessing and feature selection.

Method

Auto‑Reg maintains a set of candidate pipelines (preprocessing + feature selection + model + hyperparameters) and updates them in exploration windows. At each window, it identifies the best pipeline so far and generates new candidates using Probability‑Weighted Hyperparameter Sampling (PWHS), which focuses the search around promising regions while still exploring new options. It also uses dynamic budget allocation, adjusting how much effort goes into exploration vs. exploitation based on recent performance. This creates a continuous, adaptive AutoML process suited for evolving data streams.
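The search loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the search space, the weighting rule in `sample_near`, the update rule in `adjust_focus`, and the `evaluate` callable are all assumptions made for illustration. The sketch shows one way to bias sampling toward the current best pipeline while keeping some probability on other options, and one way to shift between exploration and exploitation after each window.

```python
import random

# Hypothetical search space: every name and value here is an illustrative
# assumption, not the configuration space used by Auto-Reg.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "model": ["linear", "knn", "tree"],
    "learning_rate": [0.001, 0.01, 0.1],
}


def sample_near(best, space, focus, rng):
    """Draw one configuration, biased toward the incumbent `best`.

    For each hyperparameter, the incumbent's value receives probability
    `focus`; the remaining 1 - focus is split evenly over the other values.
    """
    candidate = {}
    for name, values in space.items():
        if len(values) == 1:
            candidate[name] = values[0]
            continue
        rest = (1.0 - focus) / (len(values) - 1)
        weights = [focus if v == best[name] else rest for v in values]
        candidate[name] = rng.choices(values, weights=weights, k=1)[0]
    return candidate


def adjust_focus(focus, improved, step=0.1, low=0.3, high=0.9):
    """Assumed rule: exploit more after an improvement, explore more otherwise."""
    new_focus = focus + step if improved else focus - step
    return min(high, max(low, new_focus))


def run_window(best, best_error, evaluate, focus, n_candidates, rng):
    """One exploration window: propose candidates, keep the lowest-error one.

    `evaluate` is a hypothetical callable returning a pipeline's error on
    the current window of the stream.
    """
    improved = False
    for _ in range(n_candidates):
        candidate = sample_near(best, SEARCH_SPACE, focus, rng)
        error = evaluate(candidate)
        if error < best_error:
            best, best_error, improved = candidate, error, True
    return best, best_error, adjust_focus(focus, improved)
```

Calling `run_window` repeatedly, feeding each call the incumbent and focus returned by the previous one, mimics the window-by-window process. In a real stream, `evaluate` would score each candidate on the most recent instances rather than on a fixed function.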

Key outcomes

  • Auto‑Reg outperforms major baselines across synthetic and real regression streams.

  • Dynamic parameters allow rapid response to drift and performance changes.

  • Time and memory usage remain competitive despite full‑pipeline optimization.

  • Using an ensemble of top pipelines improves robustness.

  • PWHS ensures the search can reach the optimal pipeline under stable conditions.
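The ensemble outcome listed above can be pictured with a small sketch. It assumes the ensemble is an unweighted mean over the k pipelines with the lowest recent error; the source does not state the combination rule, so this rule and the names `top_k_ensemble_predict`, `pipelines`, and `recent_errors` are hypothetical.

```python
from statistics import fmean


def top_k_ensemble_predict(pipelines, recent_errors, x, k=3):
    """Average the predictions of the k pipelines with the lowest recent error.

    `pipelines` is a list of callables mapping an input to a numeric
    prediction; `recent_errors` holds one error value per pipeline.
    The unweighted mean is an assumed combination rule.
    """
    ranked = sorted(range(len(pipelines)), key=lambda i: recent_errors[i])
    return fmean([pipelines[i](x) for i in ranked[:k]])
```

Averaging over several strong pipelines limits the damage when any single one degrades after a change in the stream, which is one plausible reading of the robustness gain.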

Findings

When dynamic AutoML is combined with probability‑weighted search and adaptive resource allocation, it can reliably outperform both traditional stream learners and existing AutoML systems in streaming regression tasks. Auto‑Reg shows that full‑pipeline optimization, not just hyperparameter tuning, is essential for handling evolving data streams, and that a carefully balanced exploration–exploitation strategy leads to consistently strong performance over time.

How to cite this article

APA 7th: Gomes, H. M., Read, J., Bifet, A., Barddal, J. P., Enembreck, F., & Pfahringer, B. (2025). SLEADE: Disagreement‑based semi‑supervised learning for sparsely labeled evolving data streams. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050

MLA 9th: Gomes, Heitor M., et al. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering, 2025, https://doi.org/10.1109/TKDE.2025.3647050.

Chicago (Author-Date): Gomes, Heitor M., Jesse Read, Maciej Grzenda, Bernhard Pfahringer, and Albert Bifet. 2025. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050.