Research and Publications

ASML-REG: Automated Machine Learning for Data Stream Regression

Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025)

Introduction

ASML-REG (called Auto‑Reg in this summary) is an AutoML framework for streaming regression that automatically builds, tunes, and updates complete machine‑learning pipelines as data arrives. It targets real‑time settings where the data distribution evolves, labels may be delayed, and models must adapt quickly without manual intervention. The goal is high predictive accuracy within the strict time and memory constraints of data‑stream processing.

Problems

Traditional AutoML requires access to the full dataset and repeated retraining, neither of which is feasible when data arrives as an unbounded stream.
Most AutoML research targets classification, so regression tasks have received far less attention.
Changing data distributions require models and hyperparameters to adapt continuously.
Stream learners must update rapidly with minimal memory and processing time.
Many AutoML search methods converge too slowly for fast‑moving data streams.
Existing systems often tune only the model, ignoring preprocessing and feature selection.

Method

Auto‑Reg maintains a population of candidate pipelines, each combining preprocessing steps, feature selection, a regression model, and that model's hyperparameter settings. It works in exploration windows: within each window it evaluates the candidates' performance and then generates new pipelines using:

Probability‑Weighted Hyperparameter Sampling (PWHS) to focus search around strong performers,
Random sampling to maintain exploration, and
Dynamic budget allocation to balance exploration vs. exploitation based on recent performance.
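
The three ingredients above can be illustrated with a small Python sketch. It is not the authors' implementation: the softmax weighting over errors, the Gaussian perturbation around a parent configuration, the fixed-step budget rule, and every function and parameter name here (`selection_weights`, `sample_near`, `exploit_fraction`, `propose`, `temperature`, `scale`, `step`) are assumptions chosen only to make the idea concrete.

```python
import math
import random


def selection_weights(errors, temperature=1.0):
    """Turn per-pipeline errors (lower is better) into sampling probabilities."""
    # Softmax over negated errors, shifted by the best error for numerical stability.
    best = min(errors)
    scores = [math.exp(-(e - best) / temperature) for e in errors]
    total = sum(scores)
    return [s / total for s in scores]


def sample_near(parent, bounds, rng, scale=0.1):
    """Draw a new configuration close to a parent, clipped to the allowed ranges."""
    child = {}
    for name, (low, high) in bounds.items():
        noise = rng.gauss(0.0, scale * (high - low))
        child[name] = min(high, max(low, parent[name] + noise))
    return child


def sample_random(bounds, rng):
    """Draw a configuration uniformly at random to keep exploring."""
    return {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}


def exploit_fraction(previous_best, current_best, fraction, step=0.1):
    """Assumed budget rule: exploit more after an improvement, explore more otherwise."""
    fraction += step if current_best < previous_best else -step
    return min(0.9, max(0.1, fraction))


def propose(population, errors, bounds, n_new, fraction, rng):
    """Generate n_new configurations, split between weighted and random sampling."""
    n_exploit = round(n_new * fraction)
    parents = rng.choices(population, weights=selection_weights(errors), k=n_exploit)
    children = [sample_near(p, bounds, rng) for p in parents]
    children += [sample_random(bounds, rng) for _ in range(n_new - n_exploit)]
    return children
```

In this sketch a configuration is a plain dictionary of numeric hyperparameters, so calling `propose(population, errors, bounds, n_new=10, fraction=0.7, rng=random.Random(0))` would draw seven candidates near strong performers and three at random.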

The system keeps only the top‑performing pipelines, forming a continuously adapting AutoML process that responds to drift and evolving data patterns.
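
A minimal sketch of such a window-based loop is shown below, assuming test-then-train evaluation and a toy pipeline made of an optional rescaling step plus a normalized least-mean-squares regressor. The class `OnlineLinearPipeline`, the helpers `random_pipeline` and `run`, and all parameter values are hypothetical stand-ins rather than the components used in the paper. For brevity, new candidates are drawn purely at random here, whereas the framework described above mixes probability-weighted and random sampling.

```python
import random


class OnlineLinearPipeline:
    """Toy pipeline: optional input scaling followed by a normalized-LMS linear regressor."""

    def __init__(self, learning_rate, scale_inputs, n_features):
        self.learning_rate = learning_rate
        self.scale_inputs = scale_inputs
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.window_error = 0.0  # absolute error accumulated in the current window

    def _transform(self, x):
        # Stand-in preprocessing step: a fixed rescaling, switched on or off per pipeline.
        return [v / 10.0 for v in x] if self.scale_inputs else list(x)

    def _output(self, z):
        return self.bias + sum(w * v for w, v in zip(self.weights, z))

    def predict(self, x):
        return self._output(self._transform(x))

    def learn(self, x, y):
        z = self._transform(x)
        # The normalized step keeps the update stable for learning rates in (0, 2).
        step = self.learning_rate * (y - self._output(z)) / (1.0 + sum(v * v for v in z))
        self.bias += step
        self.weights = [w + step * v for w, v in zip(self.weights, z)]


def random_pipeline(rng, n_features):
    """Sample a learning rate log-uniformly in [0.001, 1] and a scaling flag."""
    return OnlineLinearPipeline(10 ** rng.uniform(-3, 0), rng.random() < 0.5, n_features)


def run(stream, n_features, population_size=6, keep=3, window=200, seed=0):
    rng = random.Random(seed)
    population = [random_pipeline(rng, n_features) for _ in range(population_size)]
    best = population[0]
    deployed_errors = []
    for i, (x, y) in enumerate(stream, start=1):
        deployed_errors.append(abs(y - best.predict(x)))  # error of the deployed pipeline
        for p in population:
            p.window_error += abs(y - p.predict(x))  # test first ...
            p.learn(x, y)                            # ... then train
        if i % window == 0:
            # End of an exploration window: rank, keep the top performers, refill.
            population.sort(key=lambda p: p.window_error)
            best = population[0]
            survivors = population[:keep]
            for p in survivors:
                p.window_error = 0.0
            fresh = [random_pipeline(rng, n_features) for _ in range(population_size - keep)]
            population = survivors + fresh
    return best, deployed_errors
```

Calling `run(stream, n_features=2)` on an iterable of `(features, target)` pairs returns the pipeline selected at the last window boundary together with the per-instance absolute errors of whichever pipeline was deployed at each step.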

Findings

The authors report that Auto‑Reg achieves state‑of‑the‑art performance across multiple real and synthetic regression streams. It adapts quickly to concept drift, keeps time and memory usage competitive, and consistently outperforms existing streaming regression and AutoML baselines. The key finding is that dynamic, full‑pipeline AutoML guided by probability‑weighted search can reliably track the best model configuration over time, which makes Auto‑Reg well suited to real‑world streaming regression tasks.


Journal Publications

RMIDDM: an unsupervised and interpretable concept drift detection method for data streams
Linear adaptive filtering for regression in data streams
Accelerated Weka: GPU Machine Learning with Weka Workbench
Automatic species identification from images for Aotearoa
A comparative study of four deep learning algorithms for predicting tree stem radius measured by dendrometer: A case study
Bayesian Stream Tuner: Dynamic Hyperparameter Optimization for Real-Time Data Streams

How to cite this article

APA 7th: Verma, N., Bifet, A., Pfahringer, B., & Bahri, M. (2025). ASML-REG: Automated machine learning for data stream regression.

MLA 9th: Verma, Nilesh, et al. ASML-REG: Automated Machine Learning for Data Stream Regression. 2025.

Chicago (Author-Date): Verma, Nilesh, Albert Bifet, Bernhard Pfahringer, and Maroua Bahri. 2025. ASML-REG: Automated Machine Learning for Data Stream Regression.
