Description

Fast Local Polynomial Regression and Kernel Density Estimation.

Non-Uniform Fast Fourier Transform ('NUFFT')-accelerated local polynomial regression and kernel density estimation for large, scattered, or complex-valued datasets. Provides automatic bandwidth selection via Generalized Cross-Validation (GCV) for regression and Likelihood Cross-Validation (LCV) for density estimation. This is the 'R' port of the 'fastLPR' 'MATLAB'/'Python' toolbox, achieving O(N + M log M) computational complexity through a custom 'NUFFT' implementation with Gaussian gridding. Supports 1D/2D/3D data, complex-valued responses, heteroscedastic variance estimation, and confidence interval computation. Performance is optimized with vectorized 'R' code and compiled helpers via 'Rcpp'/'RcppArmadillo'. Extends the 'FKreg' toolbox of Wang et al. (2022) <doi:10.48550/arXiv.2204.07716> with 'Python' and 'R' ports. Applied in Li et al. (2022) <doi:10.1016/j.neuroimage.2022.119190>. Uses 'NUFFT' methods based on Greengard and Lee (2004) <doi:10.1137/S003614450343200X>, the binning-accelerated kernel estimation of Wand (1994) <doi:10.1080/10618600.1994.10474656>, and the local polynomial regression framework of Fan and Gijbels (1996, ISBN:978-0412983214).

fastlpr: Fast Local Polynomial Regression for R


NUFFT-accelerated local polynomial regression and kernel density estimation for R. This is the R port of the fastLPR MATLAB/Python toolbox. It achieves O(N + M log M) computational complexity by using a Non-Uniform Fast Fourier Transform (NUFFT) with Gaussian gridding.
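
The complexity figure above has two parts: spreading the N samples onto an M-point grid (linear in N) and convolving on that grid with an FFT (M log M). The following base-R sketch illustrates that split for 1D Gaussian KDE. It uses simple linear binning on a uniform grid, in the spirit of Wand (1994), rather than the package's NUFFT with Gaussian gridding, and `binned_kde()` is a hypothetical helper that is not part of fastlpr.

```r
# Sketch only: binned Gaussian KDE evaluated with an FFT convolution.
# This is NOT fastlpr's NUFFT code; binned_kde() is a hypothetical helper.
binned_kde <- function(x, h, M = 512) {
  n     <- length(x)
  lo    <- min(x) - 4 * h                 # pad the grid by 4 bandwidths
  hi    <- max(x) + 4 * h
  grid  <- seq(lo, hi, length.out = M)
  delta <- grid[2] - grid[1]

  # Step 1, O(N): linear binning. Each point splits its unit mass
  # between the two nearest grid nodes.
  pos  <- (x - lo) / delta                # fractional 0-based grid index
  left <- floor(pos)
  frac <- pos - left
  idx  <- c(left + 1, pmin(left + 2, M))  # 1-based node indices
  wts  <- c(1 - frac, frac)
  sums <- rowsum(wts, idx)                # total weight per occupied node
  counts <- numeric(M)
  counts[as.integer(rownames(sums))] <- sums[, 1]

  # Step 2, O(M log M): zero-padded FFT convolution with the kernel,
  # stored in wrap-around order (lags 0..M-1, then -(M-1)..-1).
  lags <- c(0:(M - 1), 0, -((M - 1):1))
  kern <- dnorm(lags * delta, sd = h)
  kern[M + 1] <- 0                        # unused padding slot
  conv <- fft(fft(c(counts, numeric(M))) * fft(kern), inverse = TRUE)

  # R's inverse FFT is unnormalised, so divide by the padded length 2M.
  list(x = grid, y = Re(conv)[1:M] / (2 * M * n))
}

set.seed(42)
x   <- rnorm(2000)
fit <- binned_kde(x, h = 0.3)
```

The result can be compared against a direct evaluation of the kernel sum at the grid nodes; the two should agree closely because the only approximation is the binning step.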

Features

  • Fast: 20-60x faster than the ks package for KDE, 100-1000x faster than naive implementations
  • Flexible: Supports 1D/2D/3D data, complex-valued responses, heteroscedastic variance estimation
  • Automatic: GCV/LCV-based bandwidth selection with 1-SE rule for robustness
  • Complete: Confidence intervals, DOF estimation, multiple polynomial orders (0/1/2)
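
As an illustration of the 1-SE rule mentioned in the list above, here is a minimal base-R sketch of the rule as it is commonly defined: among all candidates whose score lies within one standard error of the best score, take the smoothest one (the largest bandwidth). How fastlpr estimates the standard error internally is not described here, so the scores and standard errors below are made-up inputs and `pick_one_se()` is a hypothetical helper.

```r
# Sketch only: the usual 1-SE rule applied to a bandwidth grid.
# pick_one_se() is a hypothetical helper, not a fastlpr function.
pick_one_se <- function(h, score, se, k = 1) {
  stopifnot(length(h) == length(score), length(h) == length(se))
  i_min     <- which.min(score)                 # plain minimiser
  threshold <- score[i_min] + k * se[i_min]     # tolerance band
  max(h[score <= threshold])                    # smoothest eligible fit
}

# Illustrative (made-up) CV scores and standard errors.
h     <- c(0.05, 0.10, 0.20, 0.40)
score <- c(1.30, 1.00, 1.05, 1.60)
se    <- c(0.10, 0.08, 0.08, 0.12)

pick_one_se(h, score, se)          # 0.2: within 1 SE of the minimum
pick_one_se(h, score, se, k = 0)   # 0.1: the plain minimiser
```

Setting `k = 0` reduces the rule to ordinary minimisation, which is the behaviour one would expect from a "0 = minimum GCV" setting.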

Installation

From Source (Development)

# Clone repository and navigate to fastLPR_R
setwd("path/to/fastLPR_R")

# Option 1: Source setup script (for development)
source("setup.R")

# Option 2: Install as package (recommended)
# From command line:
# R CMD INSTALL .

Dependencies

Required:

  • R >= 4.0.0
  • Matrix (>= 1.0.0)
  • Rcpp (>= 1.0.0)
  • RcppArmadillo (>= 0.10.0)

Suggested:

  • testthat (>= 3.0.0) - for testing
  • R.matlab - for cross-validation against MATLAB references
  • akima, interp, fields - for additional interpolation methods

Quick Start

Kernel Density Estimation (KDE)

library(fastlpr)

# 1D KDE
set.seed(42)
x <- matrix(rnorm(500), ncol = 1)
hlist <- get_hlist(20, c(0.05, 0.5), "logspace")
opt <- list(order = 0)  # order=0 for KDE
kde <- cv_fastkde(x, hlist, opt)

# Selected bandwidth
print(kde$h)

# 2D KDE
x2d <- matrix(rnorm(2000), ncol = 2)
hlist2d <- get_hlist(c(10, 10), rbind(c(0.1, 1), c(0.1, 1)), "logspace")
kde2d <- cv_fastkde(x2d, hlist2d, list(order = 0))
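
For reference, the bandwidth search that the KDE example automates can be written naively in base R. The sketch below scores each candidate bandwidth by leave-one-out likelihood cross-validation for a 1D Gaussian kernel and keeps the maximiser. It costs O(N²) per bandwidth, is only meant to show the criterion, and makes no claim about how `cv_fastkde` computes it; `lcv_score()` is a hypothetical helper.

```r
# Sketch only: naive leave-one-out likelihood cross-validation (LCV).
# lcv_score() is a hypothetical helper, not a fastlpr function.
lcv_score <- function(x, h) {
  n <- length(x)
  K <- dnorm(outer(x, x, "-"), sd = h)   # n x n kernel matrix, O(n^2)
  diag(K) <- 0                           # leave each point out of its own fit
  mean(log(rowSums(K) / (n - 1)))        # mean log leave-one-out density
}

set.seed(42)
x      <- rnorm(500)
hgrid  <- exp(seq(log(0.05), log(0.5), length.out = 20))  # log-spaced grid
scores <- vapply(hgrid, function(h) lcv_score(x, h), numeric(1))
h_lcv  <- hgrid[which.max(scores)]       # bandwidth with the highest LCV
```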

Local Polynomial Regression

# 1D Local Linear Regression
x <- matrix(runif(500), ncol = 1)
y <- sin(2 * pi * x) + 0.1 * matrix(rnorm(500), ncol = 1)
hlist <- get_hlist(20, c(0.01, 0.5), "logspace")

opt <- list(
  order = 1,       # 0=Nadaraya-Watson, 1=local linear, 2=local quadratic
  calc_dof = TRUE  # Enable DOF estimation for GCV
)
regs <- cv_fastlpr(x, y, hlist, opt)

# Results
print(regs$gcv_yhat$h)  # Selected bandwidth
yhat <- regs$yhat       # Fitted values
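
To make the role of the degrees of freedom in GCV concrete, the following base-R sketch fits a local linear smoother naively and selects the bandwidth by GCV, using GCV(h) = mean squared residual / (1 - dof/n)², where dof is the trace of the smoother matrix. It is an O(N²) reference written under those textbook definitions, not the package's algorithm, and `local_linear()` is a hypothetical helper.

```r
# Sketch only: naive local linear regression with GCV bandwidth selection.
# local_linear() is a hypothetical helper, not a fastlpr function.
local_linear <- function(x, y, h) {
  n    <- length(x)
  yhat <- numeric(n)
  lev  <- numeric(n)                      # diagonal of the smoother matrix
  for (i in seq_len(n)) {
    d  <- x - x[i]
    w  <- dnorm(d, sd = h)                # Gaussian kernel weights
    s0 <- sum(w); s1 <- sum(w * d); s2 <- sum(w * d^2)
    l  <- w * (s2 - d * s1) / (s0 * s2 - s1^2)  # equivalent-kernel weights
    yhat[i] <- sum(l * y)
    lev[i]  <- l[i]
  }
  dof <- sum(lev)                         # trace of the smoother matrix
  list(yhat = yhat, dof = dof,
       gcv = mean((y - yhat)^2) / (1 - dof / n)^2)
}

set.seed(42)
x <- runif(500)
y <- sin(2 * pi * x) + 0.1 * rnorm(500)
hgrid <- exp(seq(log(0.01), log(0.5), length.out = 20))
fits  <- lapply(hgrid, function(h) local_linear(x, y, h))
best  <- which.min(vapply(fits, function(f) f$gcv, numeric(1)))
h_gcv <- hgrid[best]                      # GCV-selected bandwidth
```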

Heteroscedastic Variance Estimation

# Mean estimation
opt_mean <- list(order = 1, calc_dof = TRUE)
res_mean <- cv_fastlpr(x, y, hlist, opt_mean)

# Variance estimation using squared residuals
residuals_sq <- (y - res_mean$yhat)^2
opt_var <- list(order = 1, calc_dof = TRUE, y_type_out = "variance")
res_var <- cv_fastlpr(x, residuals_sq, hlist, opt_var)
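
The two-step idea above (smooth the response, then smooth the squared residuals) can be tried without the package. The sketch below uses `stats::loess` as a stand-in smoother, so it is not the fastlpr estimator; the spans and the simulated noise model are arbitrary choices made for illustration.

```r
# Sketch only: two-step variance estimation with stats::loess as a
# stand-in smoother (not fastlpr). Spans are arbitrary illustrative values.
set.seed(1)
n     <- 500
x     <- sort(runif(n))
sigma <- 0.05 + 0.3 * x                   # true noise sd grows with x
y     <- sin(2 * pi * x) + sigma * rnorm(n)

# Step 1: estimate the mean function.
fit_mean <- loess(y ~ x, span = 0.3, degree = 1)

# Step 2: smooth the squared residuals to estimate the variance function.
r2      <- residuals(fit_mean)^2
fit_var <- loess(r2 ~ x, span = 0.5, degree = 1)

# A smoothed variance can dip below zero, so clamp before the square root.
sd_hat <- sqrt(pmax(fitted(fit_var), 0))
```

Plotting `sd_hat` against `x` should show the estimated noise level rising from left to right, following the simulated `sigma`.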

Package Structure

fastLPR_R/
├── R/                          # Source files (26 files)
│   ├── cv_fastlpr.R           # Main regression API
│   ├── cv_fastkde.R           # Main KDE API
│   ├── fastlpr_*.R            # Core algorithm components
│   ├── nufft.R                # NUFFT implementation
│   ├── lwp_estimator.R        # Local polynomial estimator
│   └── utils.R                # Utility functions
├── src/                        # C++ acceleration (Rcpp/RcppArmadillo)
├── inst/
│   ├── examples/              # JSS paper figure reproduction scripts
│   │   ├── example_fig2_fastkde.R
│   │   ├── example_fig3_boundary_comparison.R
│   │   ├── example_fig4_complex.R
│   │   ├── example_fig5_heteroscedasticity.R
│   │   ├── example_fig6_applications.R
│   │   └── reproduce_all_figures.R
│   └── benchmarks/            # Performance comparison scripts
│       ├── benchmark_tab4_kde_comparison.R
│       ├── benchmark_kde_large_scale.R
│       └── benchmark_lpr_npregfast.R
├── tests/
│   └── testthat/              # Test suite
│       ├── test-cross-validation.R  # 10-test MATLAB cross-validation
│       └── test-*.R                 # Component tests
├── man/                        # Documentation (roxygen2)
├── data/                       # Example datasets
├── dev/                        # Development files (excluded from build)
├── DESCRIPTION                 # Package metadata
├── NAMESPACE                   # Exported functions
└── LICENSE                     # GPL-3

Examples (JSS Paper Figures)

Reproduce all figures from the JSS paper:

# Run all figure scripts
source(system.file("examples", "reproduce_all_figures.R", package = "fastlpr"))

# Or run individual figures
source(system.file("examples", "example_fig2_fastkde.R", package = "fastlpr"))

Figure | Description                    | Script
-------|--------------------------------|-----------------------------------
Fig 2  | KDE (1D/2D/3D)                 | example_fig2_fastkde.R
Fig 3  | Boundary comparison (NW/LL/LQ) | example_fig3_boundary_comparison.R
Fig 4  | Complex-valued regression      | example_fig4_complex.R
Fig 5  | Heteroscedastic variance       | example_fig5_heteroscedasticity.R
Fig 6  | Applications (qEEG/MRI)        | example_fig6_applications.R

Benchmarks

Performance comparison with existing R packages:

# Run benchmark scripts
source(system.file("benchmarks", "benchmark_tab4_kde_comparison.R", package = "fastlpr"))

KDE: fastlpr vs ks package (N=50,000)

  • fastlpr: 0.34s
  • ks: 19.98s
  • Speedup: 59x

LPR: fastlpr vs statsmodels (Python, N=5,000)

  • Naive O(N²): 31.7s
  • fastlpr O(N log N): 0.03s
  • Speedup: 1000x+

Testing

Run the test suite:

# Using testthat
setwd("fastLPR_R")
testthat::test_dir("tests/testthat")

# Or via devtools
devtools::test()

Cross-validation results (10 tests against MATLAB reference):

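Test       | N     | dx | order | Type     | MATLAB Time | R Time | Ratio  | GCV MaxErr | BW MaxErr | Mean MaxErr | Var MaxErr | Status
-----------|-------|----|-------|----------|-------------|--------|--------|------------|-----------|-------------|------------|-------
1D KDE     | 1400  | 1  | 0     | mean     | 0.341s      | 0.081s | 0.24x  | 1.94e+01*  | 0.00e+00  | -           | -          | PASS
2D KDE     | 2000  | 2  | 0     | mean     | 0.125s      | 0.062s | 0.49x  | 1.84e-01*  | 0.00e+00  | -           | -          | PASS
3D KDE     | 2000  | 3  | 0     | mean     | 0.145s      | 0.869s | 6.00x  | 3.32e-03*  | 0.00e+00  | -           | -          | PASS
1D Order 1 | 500   | 1  | 1     | mean     | 1.546s      | 0.060s | 0.04x  | 5.86e-04   | 0.00e+00  | 7.31e-03    | -          | PASS
2D Order 1 | 1200  | 2  | 1     | mean     | 0.406s      | 0.873s | 2.15x  | 9.64e-04   | 1.33e-02  | 1.90e-02    | -          | PASS
1D Order 2 | 500   | 1  | 2     | mean     | 0.185s      | 0.116s | 0.63x  | 1.46e-03   | 0.00e+00  | 7.77e-03    | -          | PASS
2D Order 2 | 1200  | 2  | 2     | mean     | 1.343s      | 5.109s | 3.81x  | 1.45e-03   | 0.00e+00  | 8.30e-03    | -          | PASS
Complex    | 10000 | 1  | 1     | mean     | 0.334s      | 7.444s | 22.29x | 6.89e-05   | 0.00e+00  | 3.45e-04    | -          | PASS
Hetero 1D  | 10000 | 1  | 1     | mean+var | 0.466s      | 1.652s | 3.54x  | 2.33e-05   | 0.00e+00  | 2.11e-03    | 1.64e-03   | PASS
Hetero 2D  | 1200  | 2  | 1     | mean+var | 1.167s      | 3.916s | 3.35x  | 2.89e-02   | 0.00e+00  | 4.29e-02    | 3.02e-02   | PASS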

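* For KDE tests (order = 0), the GCV MaxErr column shows the LCV MaxErr instead. Ratio is the R time divided by the MATLAB time, so values below 1x mean the R port was faster. All metrics must pass for a test to be marked PASS.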

API Reference

Main Functions

Function                      | Description
------------------------------|--------------------------------------------
cv_fastlpr(x, y, hlist, opt)  | Cross-validated local polynomial regression
cv_fastkde(x, hlist, opt)     | Cross-validated kernel density estimation
get_hlist(n, range, type)     | Generate bandwidth grid
fastlpr_predict(regs, x_new)  | Predict at new points
fastlpr_interval(regs, alpha) | Confidence intervals
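
To show what a log-spaced bandwidth grid looks like, here is a small base-R sketch. It assumes that "logspace" means candidates with a constant ratio between neighbours, and that a multi-dimensional grid is the Cartesian product of per-dimension grids. Both are assumptions about `get_hlist`, not statements about its implementation, and `logspace_grid()` is a hypothetical helper.

```r
# Sketch only: log-spaced bandwidth candidates in base R.
# logspace_grid() is a hypothetical helper, not a fastlpr function.
logspace_grid <- function(n, range) {
  exp(seq(log(range[1]), log(range[2]), length.out = n))
}

# 1D: 20 candidates between 0.05 and 0.5 with a constant ratio.
h1 <- logspace_grid(20, c(0.05, 0.5))

# 2D (assumed layout): every combination of the per-dimension candidates.
h2 <- expand.grid(h_x1 = logspace_grid(10, c(0.1, 1)),
                  h_x2 = logspace_grid(10, c(0.1, 1)))
```

A log-spaced grid places more candidates at small bandwidths, where the cross-validation score usually changes fastest.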

Options

opt <- list(
  order = 1,                # Polynomial order: 0, 1, or 2
  kernel_type = "gaussian", # Kernel: "gaussian" or "epanechnikov"
  N = 100,                  # Grid size per dimension
  accuracy = 6,             # NUFFT accuracy (1-12)
  calc_dof = TRUE,          # Compute DOF for GCV
  dstd = 0                  # Std devs for 1-SE rule (0 = minimum GCV)
)

Citation

If you use fastlpr in your research, please cite:

@article{fastlpr2025,
  title = {fastLPR: Fast Local Polynomial Regression with NUFFT Acceleration},
  author = {Wang, Ying and Li, Min},
  journal = {Journal of Statistical Software},
  year = {2025},
  note = {R package version 1.0.0}
}

License

GPL-3

Metadata

Version

1.0.1

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows