Python-Based Extensions for Data Analytics Workflows
The daltoolboxdp package provides Python-based extensions for data analytics workflows, particularly for data preprocessing and predictive modeling. It includes tools for:
- Data sampling and transformation
- Feature selection
- Balancing strategies (e.g., SMOTE)
- Model construction and tuning
These capabilities call Python libraries through the reticulate interface, which gives R code access to the broader Python machine learning ecosystem. The package also supports instance selection and hybrid workflows that combine R and Python functionality in flexible, reproducible analytical pipelines.
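To give a rough idea of what a balancing strategy such as SMOTE does, here is a minimal pure-Python sketch of SMOTE-style interpolation. It is not the package's implementation, which delegates to Python libraries through reticulate. The function name `smote_sketch`, its parameters, and the sample data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration only; not part of daltoolboxdp.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    # A point cannot have more neighbours than there are other points.
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_synthetic=6)
print(len(new_points))  # prints 6
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without exact duplicates. A production workflow would normally rely on an established library implementation rather than a sketch like this one.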
The architecture is inspired by the Experiment Lines approach, which promotes modularity, extensibility, and interoperability across tools.
More information on Experiment Lines is available in Ogasawara et al. (2009).
Examples
Example scripts are available at:
Installation
You can install the latest stable version from CRAN:
install.packages("daltoolboxdp")
To install the development version from GitHub (this requires the devtools package):
library(devtools)
devtools::install_github("cefet-rj-dal/daltoolboxdp", force = TRUE, dependencies = FALSE, upgrade = "never")
Bug reports and feature requests
Please report issues or suggest new features via: