Description

Learning Equations for Automated Function Discovery.

A unified framework for symbolic regression (SR) and multi-view symbolic regression (MvSR) designed for complex, nonlinear systems, with particular applicability to ecological datasets. The package implements a four-stage workflow: data subset generation, functional form discovery, numerical parameter optimization, and multi-objective evaluation. It provides a high-level formula-style interface that abstracts and extends multiple discovery engines: genetic programming (via PySR), reinforcement learning with Monte Carlo tree search (via RSRM), and exhaustive generalized linear model search. 'leaf' extends these methods in two ways: it enables multi-view discovery, where functional structures are shared across groups while parameters are fitted locally, and it supports the enforcement of domain-specific constraints, such as sign consistency across groups. The framework automatically handles data normalization, link functions, and back-transformation, so that discovered symbolic equations remain interpretable and valid on the original data scale. Implements methods following ongoing work by the authors (2026, in preparation).
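
To make the back-transformation step concrete, here is a minimal base-R sketch. It does not use leaf or reflect its internals; the synthetic data, the variable names, and the simple linear form are assumptions chosen for illustration. A model fitted on a z-scored predictor is rewritten algebraically so that its coefficients apply directly to the raw data.

```r
# Illustrative only: base R with synthetic data, not leaf's internal code.
set.seed(1)
x <- runif(50, min = 10, max = 100)       # predictor on its original scale
y <- 3 + 0.5 * x + rnorm(50, sd = 0.1)    # response with small noise

# Normalize the predictor (z-score) before fitting.
mu <- mean(x)
sigma <- sd(x)
z <- (x - mu) / sigma
fit_z <- lm(y ~ z)                        # fitted form: y = a + b * z

# Back-transform by substituting z = (x - mu) / sigma:
#   y = (a - b * mu / sigma) + (b / sigma) * x
a <- unname(coef(fit_z)[1])
b <- unname(coef(fit_z)[2])
intercept_orig <- a - b * mu / sigma
slope_orig <- b / sigma

# Reference fit on the raw predictor, for comparison.
fit_raw <- lm(y ~ x)
```

The back-transformed coefficients agree with the direct fit on the raw predictor up to floating-point error, which is the property that keeps an equation valid on the original data scale.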

leaf

leaf (Learning Equations for Automated Function Discovery) is an R package for symbolic regression designed for ecological modeling. It discovers interpretable mathematical equations from data using a principled four-stage workflow that separates equation discovery from model evaluation, so that selected models are accurate, parsimonious, and generalizable.

leaf wraps powerful SR engines including PySR and RSRM, and is built around Multi-view Symbolic Regression (MvSR), which finds a single generalizable equation structure that can be fitted to multiple groups (e.g., species, sites, archipelagos) independently.
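
The multi-view idea (one equation structure, parameters fitted separately per group) can be sketched in base R. This is not leaf's API: the power-law form S = c * Area^z, the three synthetic groups, and every variable name below are assumptions made for illustration, and the structure is fixed by hand here rather than discovered by a search engine.

```r
# Illustrative only: base R with synthetic data, not leaf's API.
set.seed(42)
groups <- c("A", "B", "C")
true_c <- c(A = 2, B = 5, C = 9)            # hypothetical group-specific scale
true_z <- c(A = 0.20, B = 0.30, C = 0.25)   # hypothetical group-specific exponent

dat <- do.call(rbind, lapply(groups, function(g) {
  area <- runif(40, min = 1, max = 1000)
  noise <- exp(rnorm(40, sd = 0.05))        # multiplicative noise
  data.frame(group = g, Area = area, S = true_c[[g]] * area^true_z[[g]] * noise)
}))

# Shared structure: S = c * Area^z, with c and z fitted separately per group.
# Taking logs makes the form linear: log(S) = log(c) + z * log(Area).
fits <- lapply(split(dat, dat$group), function(d) {
  m <- lm(log(S) ~ log(Area), data = d)
  c(c = exp(unname(coef(m)[1])), z = unname(coef(m)[2]))
})
params <- do.call(rbind, fits)              # one row of parameters per group
print(round(params, 3))

# Sign-consistency check: the exponent has the same sign in every group.
sign_consistent <- length(unique(sign(params[, "z"]))) == 1
```

Each row of `params` holds one group's local parameters for the same equation. The final check mirrors the kind of sign-consistency constraint described above.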

Installation

You can install leaf from CRAN with:

install.packages("leaf")

After installing, you need to install the Python backend once:

library(leaf)
leaf::install_leaf()

Example

library(leaf)

# Initialize a symbolic regressor
regressor <- SymbolicRegressor$new(
  engine = "rsrm",
  loss = "MSE"
)

# Search for candidate equations
train_data <- leaf_data("eg_train")
regressor$search_equations(
  data = train_data,
  formula = "y ~ f(x1, x2)"
)

# Fit parameters and evaluate
regressor$fit(data = train_data)
regressor$evaluate(metrics = c("RMSE", "R2"), data = leaf_data("eg_test"))

# Inspect results
print(regressor$get_pareto_front())

Core features

  • Formula interface: R-style formula syntax supporting transformations and group-level parameters, e.g. S ~ f(log(Area), Age | species)
  • Multiple SR engines: Wraps PySR, RSRM, and a baseline Exhaustive GLM Search
  • Multi-view Symbolic Regression: Finds equations that generalize across groups
  • Classification support: Includes sigmoid-based classification with cross-validated threshold selection
  • Biological constraints: Enforce sign consistency of predictor effects across groups
  • Principled model selection: Pareto front and Elbow Metric for complexity-performance trade-offs
  • Equation inspection: Print final equations in full unnormalized algebraic form
  • Model management: Save, load, and merge results across runs
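
As a rough illustration of Pareto-based model selection, the sketch below filters a set of candidate equations down to those that are not dominated on complexity and error. It uses base R only and is not leaf's implementation: the candidate table, its column names, and the `is_dominated` helper are hypothetical, and the Elbow Metric is not reproduced here.

```r
# Illustrative only: base R with hypothetical candidates, not leaf's API.
candidates <- data.frame(
  equation   = c("a*x1", "a*x1 + b", "a*x1^b", "a*x1^b + c*x2", "a*exp(b*x1) + c*x2^d"),
  complexity = c(3, 5, 5, 9, 13),
  rmse       = c(2.10, 1.40, 1.60, 0.90, 0.95)
)

# A candidate is dominated if another one is no worse on both objectives
# and strictly better on at least one of them.
is_dominated <- function(i, df) {
  any(
    df$complexity <= df$complexity[i] & df$rmse <= df$rmse[i] &
      (df$complexity < df$complexity[i] | df$rmse < df$rmse[i])
  )
}

dominated <- vapply(seq_len(nrow(candidates)), is_dominated, logical(1), df = candidates)
pareto_front <- candidates[!dominated, ]
pareto_front <- pareto_front[order(pareto_front$complexity), ]
print(pareto_front)
```

Every equation left in `pareto_front` trades extra complexity for lower error, so choosing among them is a judgment about how much complexity a given accuracy gain is worth.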

Documentation

See the vignettes for detailed usage:

  • vignette("getting_started", package = "leaf")
  • vignette("minimal_example", package = "leaf")
  • vignette("cross_validation", package = "leaf")
  • vignette("MvSR", package = "leaf")

Citation

A citation will be available once the accompanying paper is published.

Acknowledgements

This work was developed at INESC-ID and supported by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project [REF].

Metadata

Version

0.1.0

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows