Leveraging Experiment Lines to Data Analytics.
DAL Toolbox
As research experiments grow in scale and complexity, data analytics demands tools that go beyond isolated functions. DAL Toolbox is a framework that organizes a comprehensive set of data analytics capabilities into an integrated workflow environment. Inspired by the Experiment Line model (doi:10.1007/978-3-642-02279-1_20), it supports essential tasks such as data preprocessing, classification, regression, clustering, and time series prediction. With a unified data model, a consistent method API, and support for hyperparameter tuning, DAL Toolbox enables the construction and execution of end-to-end analytics pipelines. It also offers integration with existing libraries and languages, promoting usability, extensibility, and reproducibility in data science.
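To make the idea of a consistent method API concrete, the following is a minimal sketch in base R. It is not the daltoolbox API: the generics `fit_step` and `apply_step` and the constructors `minmax_step` and `majority_step` are hypothetical names introduced only for illustration. It shows how a preprocessing step and a baseline classifier can share the same two verbs, so that either step could be swapped for an alternative without changing the surrounding pipeline.

```r
# Hypothetical sketch in base R; these names are NOT the daltoolbox API.
# Two generic verbs shared by every step of the pipeline.
fit_step <- function(obj, data, ...) UseMethod("fit_step")
apply_step <- function(obj, data, ...) UseMethod("apply_step")

# Preprocessing step: min-max scaling of the numeric columns.
minmax_step <- function() structure(list(ranges = NULL), class = "minmax_step")

fit_step.minmax_step <- function(obj, data, ...) {
  num <- names(data)[vapply(data, is.numeric, logical(1))]
  obj$ranges <- lapply(data[num], range)  # learn min and max on training data
  obj
}

apply_step.minmax_step <- function(obj, data, ...) {
  for (col in names(obj$ranges)) {
    r <- obj$ranges[[col]]
    data[[col]] <- (data[[col]] - r[1]) / (r[2] - r[1])
  }
  data
}

# Baseline classifier step: always predicts the majority class.
majority_step <- function(target) {
  structure(list(target = target, label = NULL), class = "majority_step")
}

fit_step.majority_step <- function(obj, data, ...) {
  counts <- table(data[[obj$target]])
  obj$label <- names(counts)[which.max(counts)]
  obj
}

apply_step.majority_step <- function(obj, data, ...) {
  rep(obj$label, nrow(data))
}

# End-to-end run on the built-in iris dataset.
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
test <- iris[-idx, ]

scaler <- fit_step(minmax_step(), train)   # fit on training data only
train_s <- apply_step(scaler, train)
test_s <- apply_step(scaler, test)         # reuse the training ranges

model <- fit_step(majority_step("Species"), train_s)
pred <- apply_step(model, test_s)
accuracy <- mean(pred == test_s$Species)
```

Because the scaler is fitted on the training split and then reused on the test split, the test data never influences the learned ranges. The same fit-then-apply discipline applies to any step that follows this pattern.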
Documentation
The documentation was reorganized to support two complementary entry points:
- a guided tutorial track for readers who want to learn the workflow step by step
- thematic example collections for readers who want to inspect a specific family of methods
If you are new to daltoolbox, start with the tutorials. If you already know the package structure, the thematic collections remain available and were reorganized with more didactic descriptions.
Guided tutorial track
- Tutorials
  - a 13-part learning sequence covering first experiment, sampling, data quality, preprocessing, baselines, metrics, model comparison, tuning, end-to-end pipelines, regression, clustering, visual analysis, and custom extensions.
Thematic example collections
- Transformations
  - sampling, balancing, cleaning, scaling, encoding, smoothing, feature selection, dimensionality reduction, and curvature-based heuristics.
- Classification
  - baseline models, core classifier families, and model selection.
- Regression
  - interpretable models, nonlinear learners, and tuning for numeric prediction.
- Clustering
  - partitional, medoid-based, and density-based methods, plus clustering model selection.
- Graphics
  - comparison, distribution, relationship, time-oriented, and export-focused visualizations.
- Custom extensions
  - examples showing how to integrate new transformations, classifiers, regressors, and clustering methods into the Experiment Line workflow.
Documentation design
The examples were revised to be more useful for learning:
- files inside each collection are now numbered in a suggested reading order
- category README files group examples by subject rather than only by class name
- many examples now include more explanation between code blocks, including interpretation hints and common mistakes
Installation
The latest CRAN release of DAL Toolbox is available at: https://CRAN.R-project.org/package=daltoolbox
You can install the stable version of DAL Toolbox from CRAN with:
install.packages("daltoolbox")
You can install the development version of DAL Toolbox from GitHub (https://github.com/cefet-rj-dal/daltoolbox) with:
# Requires the devtools package: install.packages("devtools")
devtools::install_github("cefet-rj-dal/daltoolbox", force = TRUE, dependencies = FALSE, upgrade = "never")
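After either installation route, a quick check can confirm that the package is visible to the current R session. The sketch below uses only base R (`requireNamespace` and `utils::packageVersion`); the variable name `installed` is introduced here for illustration.

```r
# Report the installed daltoolbox version, or note that it is missing.
if (requireNamespace("daltoolbox", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("daltoolbox"))
  message("daltoolbox ", installed, " is installed")
} else {
  installed <- NA_character_
  message("daltoolbox is not installed")
}
```

Checking with `requireNamespace` avoids attaching the package, so the check itself does not change the search path.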