Description
Concept Drift Detection Methods for Stream Data.
Description
A system designed for detecting concept drift in streaming datasets. It offers a suite of statistical methods for detecting concept drift, including methods that monitor changes in data distributions over time. The package supports several tests: the Drift Detection Method (DDM), the Early Drift Detection Method (EDDM), the Hoeffding Drift Detection Methods (HDDM_A, HDDM_W), Kolmogorov-Smirnov test-based Windowing (KSWIN), ADaptive WINdowing (ADWIN), and the Page-Hinkley (PH) test. The methods implemented in this package are based on established research and have been shown to be effective in real-time data analysis. For more details on the methods, see the following sources: Kobylińska et al. (2023) <doi:10.48550/arXiv.2308.11446>, S. Kullback & R.A. Leibler (1951) <doi:10.1214/aoms/1177729694>, Gama et al. (2004) <doi:10.1007/978-3-540-28645-5_29>, Baena-Garcia et al. (2006) <https://www.researchgate.net/publication/245999704_Early_Drift_Detection_Method>, Frías-Blanco et al. (2014) <https://ieeexplore.ieee.org/document/6871418>, Bifet and Gavalda (2007) <doi:10.1137/1.9781611972771>, Raab et al. (2020) <doi:10.1016/j.neucom.2019.11.111>, Page (1954) <doi:10.1093/biomet/41.1-2.100>, Montiel et al. (2018) <https://jmlr.org/papers/volume19/18-251/18-251.pdf>.
README.md
datadriftR
datadriftR is an R package for detecting data drift in streaming data. It monitors when statistical properties of your data change over time, which is essential for maintaining machine learning model performance in production.
Available Methods
| Method | Description |
|---|---|
| ddm | Drift Detection Method |
| eddm | Early Drift Detection Method |
| hddm_a | Hoeffding's bound with averaging |
| hddm_w | Hoeffding's bound with weighting |
| kswin | Kolmogorov-Smirnov Windowing |
| adwin | ADaptive WINdowing |
| page_hinkley | Page-Hinkley Test |
| kl_divergence | KL Divergence |
| profile_difference | Profile Difference |
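To give a feel for what a window-based method such as kswin does, here is a minimal base-R sketch of a sliding-window Kolmogorov-Smirnov check. It is not the package's implementation: `ks_window_sketch` and its arguments are hypothetical names, and the sketch compares the newest observations with the entire older part of the window, whereas KSWIN as described by Raab et al. (2020) compares them with a random sample drawn from the older part.

```r
# Hypothetical helper, not part of the datadriftR API.
# Returns the first index at which the newest `stat_size` observations differ
# significantly from the older observations in the same sliding window,
# or NA if no such index is found.
ks_window_sketch <- function(x, window_size = 100, stat_size = 30, alpha = 0.005) {
  stopifnot(stat_size < window_size, length(x) >= window_size)
  for (i in seq(window_size, length(x))) {
    window <- x[(i - window_size + 1):i]
    older  <- window[seq_len(window_size - stat_size)]
    recent <- window[(window_size - stat_size + 1):window_size]
    # Two-sample Kolmogorov-Smirnov test from the base `stats` package
    p_value <- stats::ks.test(recent, older)$p.value
    if (p_value < alpha) return(i)
  }
  NA_integer_
}

# Illustrative numeric stream whose mean shifts after observation 300
set.seed(1)
numeric_stream <- c(rnorm(300, mean = 0), rnorm(300, mean = 3))
ks_window_sketch(numeric_stream)
```

Because the test is repeated at every step, a small `alpha` keeps the false-alarm rate down at the cost of a slightly later detection.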
Installation
# Install from CRAN
install.packages("datadriftR")
# Or install the development version from GitHub with pak
# install.packages("pak")
pak::pak("ugurdar/datadriftR")
# Or install the development version from GitHub with remotes
# install.packages("remotes")
# remotes::install_github("ugurdar/datadriftR")
Documentation: https://ugurdar.github.io/datadriftR
Simple DDM Example
library(datadriftR)
# Create a stream with drift at position 501
set.seed(123)
stream <- c(
sample(c(0, 1), 500, replace = TRUE, prob = c(0.7, 0.3)),
sample(c(0, 1), 500, replace = TRUE, prob = c(0.3, 0.7))
)
# Detect drift
results <- detect_drift(stream, method = "ddm")
print(results)
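For readers who want to see the rule behind DDM, the following is a minimal base-R sketch of the decision criterion from Gama et al. (2004), applied to the same kind of 0/1 stream, where 1 is treated as a prediction error. It is an illustration under stated assumptions, not the package's code: `ddm_sketch` and its arguments are hypothetical names, and the warning state is omitted for brevity.

```r
# Hypothetical helper, not part of the datadriftR API.
# Tracks the running error rate p and its standard deviation s, remembers the
# point where p + s was smallest, and reports the first index at which
# p + s exceeds p_min + drift_level * s_min. Returns NA if that never happens.
ddm_sketch <- function(errors, min_n = 30, drift_level = 3) {
  p <- 0
  p_min <- Inf
  s_min <- Inf
  for (n in seq_along(errors)) {
    p <- p + (errors[n] - p) / n        # incremental mean of the error indicator
    s <- sqrt(p * (1 - p) / n)          # binomial standard deviation of the rate
    if (n < min_n) next                 # wait for a minimum number of observations
    if (p + s < p_min + s_min) {        # new best (lowest) operating point
      p_min <- p
      s_min <- s
    }
    if (p + s > p_min + drift_level * s_min) return(n)
  }
  NA_integer_
}

# Same construction as the stream above: the error probability rises at 501
set.seed(123)
stream <- c(
  sample(c(0, 1), 500, replace = TRUE, prob = c(0.7, 0.3)),
  sample(c(0, 1), 500, replace = TRUE, prob = c(0.3, 0.7))
)
ddm_sketch(stream)
```

Since the error rate is averaged over the whole stream seen so far, the reported index generally lags the true change point; that delay is the price of the method's low memory footprint.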
Citation
@article{dar2025datadriftr,
title={datadriftR: Drift Detection Methods for Stream Data},
author={Dar, Ugur and Cavus, Mustafa},
journal={Journal of Open Source Software},
year={2025}
}
Authors
- Ugur Dar - Eskisehir Technical University
- Mustafa Cavus - Eskisehir Technical University