ASML-REG: Automated Machine Learning for Data Stream Regression

Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025)

Introduction

ASML-REG, referred to in this summary as Auto‑Reg, is a dynamic AutoML framework for streaming regression that automatically builds, tunes, and updates full machine‑learning pipelines as data arrives continuously. It is designed for real‑time environments where data evolves, labels may be delayed, and models must adapt quickly without manual intervention. The framework aims to deliver high predictive accuracy while respecting the strict time and memory constraints of data‑stream processing.

Problems

  • Traditional AutoML assumes access to the full dataset and repeated retraining, neither of which is feasible in streaming settings.

  • Most AutoML research focuses on classification, leaving regression tasks comparatively underexplored.

  • Changing data distributions require models and hyperparameters to adapt continuously.

  • Stream learners must update rapidly with minimal memory and processing time.

  • Many AutoML search methods converge too slowly for fast‑moving data streams.

  • Existing systems often tune only the model, ignoring preprocessing and feature selection.

Method

Auto‑Reg maintains a population of candidate pipelines, each containing preprocessing steps, feature selection, a regression model, and hyperparameters. It operates in exploration windows, where it evaluates performance and then generates new pipelines using:

  • Probability‑Weighted Hyperparameter Sampling (PWHS) to focus search around strong performers,

  • Random sampling to maintain exploration, and

  • Dynamic budget allocation to balance exploration vs. exploitation based on recent performance.
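The three generation mechanisms listed above can be illustrated with a minimal sketch. The summary does not give the exact formulas, so everything here is an assumption made for illustration: the exponential weighting of errors, the Gaussian perturbation of a single hypothetical hyperparameter `lr`, and the fixed-step rule that moves budget between the two strategies. It is not the authors' implementation.

```python
import math
import random


def selection_weights(errors, temperature=1.0):
    """Map per-pipeline errors to sampling probabilities (lower error, higher weight).

    Assumption: an exponential (softmax-style) weighting; the paper may differ.
    """
    best = min(errors)
    raw = [math.exp(-(e - best) / temperature) for e in errors]
    total = sum(raw)
    return [r / total for r in raw]


def sample_near(parent, rng, scale=0.2):
    """Exploit: perturb a strong performer's (hypothetical) learning rate."""
    lr = parent["lr"] * math.exp(rng.gauss(0.0, scale))
    return {"lr": min(max(lr, 1e-5), 1.0)}


def sample_random(rng):
    """Explore: draw the learning rate log-uniformly from the whole range."""
    return {"lr": 10 ** rng.uniform(-5, 0)}


def split_budget(n_new, exploit_share):
    """Divide the candidates to generate between exploitation and exploration."""
    n_exploit = round(n_new * exploit_share)
    return n_exploit, n_new - n_exploit


def update_share(share, exploit_err, explore_err, step=0.1, low=0.1, high=0.9):
    """Shift budget toward whichever strategy had the lower error last window."""
    if exploit_err < explore_err:
        share += step
    elif explore_err < exploit_err:
        share -= step
    return min(max(share, low), high)


def propose(population, errors, n_new, exploit_share, rng):
    """Generate new candidate configurations for the next exploration window."""
    weights = selection_weights(errors)
    n_exploit, n_explore = split_budget(n_new, exploit_share)
    parents = rng.choices(population, weights=weights, k=n_exploit)
    exploited = [sample_near(p, rng) for p in parents]
    explored = [sample_random(rng) for _ in range(n_explore)]
    return exploited + explored
```

Calling `propose(population, errors, n_new=10, exploit_share=0.7, rng=random.Random(0))` would return seven configurations sampled near the better pipelines and three drawn at random. Keeping `exploit_share` within `[low, high]` means neither strategy is ever switched off entirely, which is one simple way to keep some exploration alive under drift.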

The system keeps only the top‑performing pipelines, forming a continuously adapting AutoML process that responds to drift and evolving data patterns.
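The window-by-window process can be sketched in a few lines of plain Python. This is a simplified illustration, not the published algorithm: the `Pipeline` class, the centering step, the SGD linear model, the window length, and the purely random refill of the population are all assumptions chosen to keep the example self-contained. Each pipeline is scored on an instance before learning from it (test-then-train), and at the end of each window only the `keep` best pipelines survive.

```python
import random


class Pipeline:
    """Hypothetical pipeline: optional centering step plus a linear model trained by SGD."""

    def __init__(self, lr, center, n_features):
        self.lr, self.center = lr, center
        self.w, self.b = [0.0] * n_features, 0.0
        self.abs_err, self.n = 0.0, 0  # error statistics for the current window

    def _prep(self, x):
        # Stand-in preprocessing: center inputs that are assumed to lie in [0, 1].
        return [v - 0.5 for v in x] if self.center else list(x)

    def predict(self, x):
        return self.b + sum(wi * zi for wi, zi in zip(self.w, self._prep(x)))

    def learn(self, x, y):
        z = self._prep(x)
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * zi for wi, zi in zip(self.w, z)]
        self.b -= self.lr * err

    def window_mae(self):
        return self.abs_err / self.n if self.n else float("inf")


def random_pipeline(rng, n_features):
    """Sample a pipeline configuration (learning rate and preprocessing choice)."""
    return Pipeline(lr=10 ** rng.uniform(-3, -0.5), center=rng.random() < 0.5,
                    n_features=n_features)


def run(stream, n_features, pop_size=6, keep=3, window=200, seed=0):
    rng = random.Random(seed)
    population = [random_pipeline(rng, n_features) for _ in range(pop_size)]
    best, deployed_errors = population[0], []
    for t, (x, y) in enumerate(stream, start=1):
        deployed_errors.append(abs(best.predict(x) - y))
        for p in population:  # test-then-train: score first, then update
            p.abs_err += abs(p.predict(x) - y)
            p.n += 1
            p.learn(x, y)
        if t % window == 0:  # end of an exploration window
            population.sort(key=Pipeline.window_mae)
            survivors = population[:keep]
            best = survivors[0]
            for p in survivors:  # reset statistics for the next window
                p.abs_err, p.n = 0.0, 0
            fresh = [random_pipeline(rng, n_features) for _ in range(pop_size - keep)]
            population = survivors + fresh
    return best, deployed_errors


def drifting_stream(n, seed=1):
    """Synthetic stream whose target function changes abruptly halfway through."""
    rng = random.Random(seed)
    for t in range(n):
        x = [rng.random() for _ in range(3)]
        coef = (2.0, -1.0, 0.5) if t < n // 2 else (-1.0, 3.0, 0.0)
        yield x, sum(c * v for c, v in zip(coef, x)) + rng.gauss(0.0, 0.05)
```

A call such as `best, errors = run(drifting_stream(2000), n_features=3)` returns the pipeline selected after the last window together with the per-instance absolute error of whichever pipeline was deployed at each step. One simplification worth noting is that freshly sampled pipelines start untrained and compete against already trained survivors; a fuller system would need some way to give newcomers a fair comparison.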

Findings

The authors report that Auto‑Reg achieves state‑of‑the‑art predictive performance across multiple real and synthetic regression streams. It adapts quickly to concept drift, maintains competitive time and memory usage, and consistently outperforms existing streaming regression and AutoML baselines. The key finding is that dynamic, full‑pipeline AutoML, guided by probability‑weighted search, can reliably track the best model configuration over time, which makes Auto‑Reg well suited to real‑world streaming regression tasks.

How to cite this article

APA 7th: Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025). ASML-REG: Automated machine learning for data stream regression.

MLA 9th: Verma, Nilesh, et al. ASML-REG: Automated Machine Learning for Data Stream Regression. 2025.

Chicago (Author-Date): Verma, Nilesh, Albert Bifet, Bernhard Pfahringer, and Maroua Bahri. 2025. ASML-REG: Automated Machine Learning for Data Stream Regression.