Learning Equations for Automated Function Discovery.
leaf
leaf (Learning Equations for Automated Function Discovery) is an R package for symbolic regression designed for ecological modelling. It discovers interpretable mathematical equations from data using a principled four-stage workflow that separates equation discovery from model evaluation, so that the selected models are accurate, parsimonious, and generalizable.
leaf wraps symbolic regression (SR) engines, including PySR and RSRM, and is built around Multi-view Symbolic Regression (MvSR), which finds a single generalizable equation structure that can be fitted to multiple groups (e.g., species, sites, archipelagos) independently.
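To make the "one structure, many groups" idea concrete, here is a minimal base-R sketch. It does not call leaf or any SR engine, and it does not search for an equation: the structure is fixed in advance to the power law S = c * Area^z, and only the per-group fitting step is shown. The data are simulated, and the names `make_group`, `sim`, `fit_group`, and all parameter values are assumptions made for illustration.

```r
# Illustrative only: base R, no leaf functions are called.
set.seed(1)

# Simulate species-area data for one hypothetical group
make_group <- function(name, c0, z, n = 50) {
  area <- runif(n, min = 1, max = 1000)
  richness <- c0 * area^z * exp(rnorm(n, sd = 0.05))
  data.frame(group = name, Area = area, S = richness)
}
sim <- rbind(
  make_group("birds",  c0 = 5,  z = 0.25),
  make_group("plants", c0 = 20, z = 0.35)
)

# One shared structure, log(S) = log(c) + z * log(Area), fitted per group
fit_group <- function(d) {
  m <- lm(log(S) ~ log(Area), data = d)
  c(c = exp(unname(coef(m)[1])), z = unname(coef(m)[2]))
}

# Rows are groups, columns are the group-specific parameters c and z
params <- t(sapply(split(sim, sim$group), fit_group))
print(round(params, 2))
```

Both groups share the same functional form but end up with their own values of c and z, which is the sense in which a single equation structure is fitted to each group independently.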
Installation
You can install leaf from CRAN with:
install.packages("leaf")
After installing, you need to install the Python backend once:
library(leaf)
leaf::install_leaf()
Example
library(leaf)
# Initialize a symbolic regressor
regressor <- SymbolicRegressor$new(
  engine = "rsrm",
  loss = "MSE"
)
# Search for candidate equations
train_data <- leaf_data("eg_train")
regressor$search_equations(
  data = train_data,
  formula = "y ~ f(x1, x2)"
)
# Fit parameters and evaluate
regressor$fit(data = train_data)
regressor$evaluate(metrics = c("RMSE", "R2"), data = leaf_data("eg_test"))
# Inspect results
print(regressor$get_pareto_front())
Core features
- Formula interface: R-style formula syntax supporting transformations and group-level parameters, e.g. S ~ f(log(Area), Age | species)
- Multiple SR engines: Wraps PySR, RSRM, and a baseline Exhaustive GLM Search
- Multi-view Symbolic Regression: Finds equations that generalise across groups
- Classification support: Includes sigmoid-based classification with cross-validated threshold selection
- Biological constraints: Enforce sign consistency of predictor effects across groups
- Principled model selection: Pareto front and Elbow Metric for complexity-performance trade-offs
- Equation inspection: Print final equations in full un-normalised algebraic form
- Model management: Save, load, and merge results across runs
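The Pareto-front idea behind the model-selection feature can be sketched in base R. The candidate table below is made up for illustration, and `pareto_front` is a hypothetical helper, not leaf's implementation or part of its API. It keeps every candidate that is not dominated, meaning no other candidate is at least as simple and at least as accurate while being strictly better on one of the two.

```r
# Illustrative only: made-up candidates, not output from leaf.
candidates <- data.frame(
  equation   = c("a", "a*x1", "a*x1 + b", "a*x1^b", "a*x1^b + c*x2"),
  complexity = c(1, 3, 5, 5, 9),
  loss       = c(4.0, 2.1, 1.5, 1.8, 1.4)
)

# Keep rows that no other row dominates on (complexity, loss); lower is better.
pareto_front <- function(df) {
  dominated <- vapply(seq_len(nrow(df)), function(i) {
    any(df$complexity <= df$complexity[i] & df$loss <= df$loss[i] &
        (df$complexity < df$complexity[i] | df$loss < df$loss[i]))
  }, logical(1))
  df[!dominated, ]
}

front <- pareto_front(candidates)
print(front)
```

In this toy table, "a*x1^b" is dropped because "a*x1 + b" has the same complexity and a lower loss; the remaining candidates each trade some extra complexity for a lower loss.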
Documentation
See the vignettes for detailed usage:
vignette("getting_started", package = "leaf")
vignette("minimal_example", package = "leaf")
vignette("cross_validation", package = "leaf")
vignette("MvSR", package = "leaf")
Citation
A citation will be available once the accompanying paper is published.
Acknowledgements
This work was developed at INESC-ID and supported by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project [REF].