Research and Publications
SLEADE: Disagreement-Based Semi-Supervised Learning for Sparsely Labeled Evolving Data Streams
Gomes, H., Bifet, A., Pfahringer, B., Read, J., & Grzenda, M. (2025)
Introduction
SLEADE is a semi‑supervised learning approach designed for evolving data streams with extremely sparse labels, where traditional supervised models struggle. It leverages ensemble disagreement and unsupervised drift detection to make effective use of unlabeled data and adapt to concept drift in real time.
Problems
Real‑world streams often provide very few labeled instances, making supervised learning unreliable
Data distributions change over time, requiring models to adapt continuously
Using pseudo‑labels can introduce noise; the challenge is to exploit them without degrading performance
Drift detection typically relies on labeled data, which is scarce in this setting
Disagreement must be meaningful to guide pseudo‑labeling and adaptation
Method
SLEADE uses an ensemble of classifiers that generate pseudo‑labels when they disagree, following a “majority trains the minority” strategy: if most models agree on a label, that pseudo‑label is used to train the minority models. A confidence‑based weighting function controls how strongly pseudo‑labeled instances influence learning. The system also incorporates unsupervised drift detection, enabling the ensemble to adapt to changes in the data distribution without requiring labeled feedback.
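The "majority trains the minority" step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `StubLearner`, `pseudo_label_step`, and the use of the vote fraction as the confidence weight are all assumptions made for the example, since the summary does not specify the actual weighting function or base learners.

```python
from collections import Counter


class StubLearner:
    """Hypothetical incremental learner: predicts the label with the most accumulated weight."""

    def __init__(self, default_label):
        self.default_label = default_label
        self.label_weights = Counter()

    def predict(self, x):
        # Before any training, fall back to the default label.
        if not self.label_weights:
            return self.default_label
        return self.label_weights.most_common(1)[0][0]

    def learn(self, x, y, weight=1.0):
        # A weighted update: larger weights pull the learner harder toward y.
        self.label_weights[y] += weight


def pseudo_label_step(ensemble, x):
    """Process one unlabeled instance x.

    If a strict majority of models agree and at least one model disagrees,
    the majority label becomes a pseudo-label and only the disagreeing
    (minority) models are trained on it. The training weight is the vote
    fraction, an assumed stand-in for a confidence-based weighting function.
    Returns (pseudo_label, weight, minority_indices).
    """
    votes = [model.predict(x) for model in ensemble]
    label, count = Counter(votes).most_common(1)[0]
    confidence = count / len(votes)

    # No disagreement, or no strict majority: nothing to learn from.
    if count == len(votes) or confidence <= 0.5:
        return None, 0.0, []

    minority = [i for i, vote in enumerate(votes) if vote != label]
    for i in minority:
        ensemble[i].learn(x, label, weight=confidence)
    return label, confidence, minority


# Usage: two models say "a", one says "b"; the dissenting model is trained on "a".
ensemble = [StubLearner("a"), StubLearner("a"), StubLearner("b")]
first = pseudo_label_step(ensemble, x=[0.1, 0.2])
second = pseudo_label_step(ensemble, x=[0.1, 0.2])  # now unanimous, so no update
```

Training only the minority keeps the majority models from reinforcing their own predictions, and the weight limits the damage when the pseudo-label is wrong.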
Key outcomes
Experiments on real and synthetic streams show SLEADE outperforming several existing semi‑supervised stream learners
Pseudo‑labeling with confidence weighting reduces noise and boosts accuracy
Unsupervised drift detection enables timely adaptation to evolving concepts
Demonstrated benefits across multiple domains and data types
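To illustrate what drift detection without labels can look like, here is a generic sketch that compares a recent window of an unlabeled signal against a reference window. The summary does not say which detector SLEADE uses, so `WindowMeanDriftDetector`, its window size, and its threshold are hypothetical and stand in for the real method only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class WindowMeanDriftDetector:
    """Hypothetical label-free detector: flags drift when the mean of a recent
    window moves too far from the mean of a reference window."""

    def __init__(self, window_size=20, threshold=3.0):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = []
        self.recent = deque(maxlen=window_size)

    def update(self, value):
        """Feed one unlabeled value (e.g. a feature or a model score).
        Returns True if drift is flagged at this step."""
        # Fill the reference window first.
        if len(self.reference) < self.window_size:
            self.reference.append(value)
            return False

        self.recent.append(value)
        if len(self.recent) < self.window_size:
            return False

        # Distance between window means, in units of the standard error.
        spread = pstdev(self.reference) or 1e-9
        standard_error = spread / self.window_size ** 0.5
        z = abs(mean(self.recent) - mean(self.reference)) / standard_error

        if z > self.threshold:
            # Treat the recent window as the new reference and start over.
            self.reference = list(self.recent)
            self.recent.clear()
            return True
        return False


# Usage: a stream whose values shift upward halfway through.
stream = [float(i % 2) for i in range(100)] + [5.0 + float(i % 2) for i in range(100)]
detector = WindowMeanDriftDetector(window_size=20, threshold=3.0)
drift_points = [i for i, value in enumerate(stream) if detector.update(value)]
```

Because the detector only looks at the unlabeled signal, it can trigger adaptation even when no labeled instances arrive around the change.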
Findings
The key finding is that disagreement‑based semi‑supervised learning can overcome the challenges of sparsely labeled evolving data streams. By combining ensemble disagreement, confidence‑weighted pseudo‑labeling, and unsupervised drift detection, SLEADE shows that unlabeled data (traditionally seen as a limitation) can become a resource for maintaining accuracy and adaptability in dynamic environments.
Journal Publications
RMIDDM: an unsupervised and interpretable concept drift detection method for data streams
Bayesian Stream Tuner: Dynamic Hyperparameter Optimization for Real-Time Data Streams
Conference Publications
How to cite this article
APA 7th: Gomes, H. M., Read, J., Grzenda, M., Pfahringer, B., & Bifet, A. (2025). SLEADE: Disagreement‑based semi‑supervised learning for sparsely labeled evolving data streams. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050
MLA 9th: Gomes, Heitor M., et al. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering, 2025, https://doi.org/10.1109/TKDE.2025.3647050.
Chicago (Author-Date): Gomes, Heitor M., Jesse Read, Maciej Grzenda, Bernhard Pfahringer, and Albert Bifet. 2025. “SLEADE: Disagreement‑Based Semi‑Supervised Learning for Sparsely Labeled Evolving Data Streams.” IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2025.3647050.