MyNixOS website logo
Description

Conformal Prediction and Uncertainty Quantification.

Implements conformal prediction methods for constructing prediction intervals (regression) and prediction sets (classification) with finite-sample coverage guarantees. Methods include split conformal, 'CV+' and 'Jackknife+' (Barber et al. 2021) <doi:10.1214/20-AOS1965>, 'Conformalized Quantile Regression' (Romano et al. 2019) <doi:10.48550/arXiv.1905.03222>, 'Adaptive Prediction Sets' (Romano, Sesia, Candes 2020) <doi:10.48550/arXiv.2006.02544>, 'Regularized Adaptive Prediction Sets' (Angelopoulos et al. 2021) <doi:10.48550/arXiv.2009.14193>, Mondrian conformal prediction for group-conditional coverage (Vovk et al. 2005), weighted conformal prediction for covariate shift (Tibshirani et al. 2019), and adaptive conformal inference for sequential prediction (Gibbs and Candes 2021). All methods are distribution-free and provide calibrated uncertainty quantification without parametric assumptions. Works with any model that can produce predictions from new data, including 'lm', 'glm', 'ranger', 'xgboost', and custom user-defined models.

predictset

CRAN status Downloads Lifecycle: stable License: MIT

predictset is an R package for model-agnostic conformal prediction and distribution-free uncertainty quantification. It constructs prediction intervals (regression) and prediction sets (classification) with finite-sample coverage guarantees - no distributional assumptions required. Works with any model: lm, glm, ranger, xgboost, or custom user-defined models via make_model().

Installation

# Install from CRAN
install.packages("predictset")

# Or install the development version from GitHub
# install.packages("devtools")
devtools::install_github("charlescoverdale/predictset")
library(predictset)

# Get 90% prediction intervals around any model - 3 lines of code
result <- conformal_split(x, y, model = y ~ ., x_new = x_new, alpha = 0.10)
result$lower  # lower bounds
result$upper  # upper bounds

What is conformal prediction?

Standard machine learning models produce point predictions: a single number for regression, a single class for classification. But in practice, you almost always need to know how uncertain that prediction is. Conformal prediction is a framework for wrapping any model in a layer of calibrated uncertainty quantification. Given a target coverage level (say 90%), it produces prediction intervals or prediction sets that are guaranteed to contain the true value at least 90% of the time, regardless of the underlying model or data distribution.

The key property is that this guarantee holds in finite samples. It's not asymptotic, and it doesn't require distributional assumptions. The only requirement is that the calibration data and test data are exchangeable (roughly: drawn from the same distribution). This makes conformal prediction fundamentally different from parametric confidence intervals, bootstrap intervals, or Bayesian credible intervals, all of which depend on modelling assumptions that may not hold.

predictset implements the main conformal methods from the recent literature (split conformal, Jackknife+, CV+, conformalized quantile regression for regression, and APS, RAPS, and LAC for classification) in a lightweight package with only two dependencies (cli and stats).


How does predictset compare to other packages?

FeaturepredictsetprobablyconformalInferenceMAPIE
LanguageRRRPython
RegressionYesYesYesYes
ClassificationYesNoNoYes
Model-agnosticYestidymodels onlyYesscikit-learn only
On CRANPendingYesNo (GitHub only)N/A
Jackknife+ / CV+YesNoYesYes
CQRYesYesYesYes
APS / RAPSYesNoNoYes
Mondrian CPYesNoNoYes
Weighted CPYesNoNoYes
Adaptive CIYesNoNoNo
Conditional diagnosticsYesNoNoPartial
Dependencies214+5N/A
Last updated2026202420192024

predictset is designed to complement rather than compete with probably. If you're working in the tidymodels ecosystem and only need regression intervals, probably integrates neatly with your workflow. predictset fills the gaps: classification methods (APS, RAPS, LAC), Jackknife+/CV+ for regression, and a model-agnostic interface that works with any model, not just tidymodels workflows.

conformalInference by Ryan Tibshirani was foundational research code, but it hasn't been updated since 2019, isn't on CRAN, and doesn't cover classification.


Quick start

Regression: prediction intervals with coverage verification

library(predictset)

set.seed(42)
n <- 500
x <- matrix(rnorm(n * 5), ncol = 5)
y <- x[, 1] * 2 + x[, 2] + rnorm(n)
x_new <- matrix(rnorm(100 * 5), ncol = 5)
y_new <- x_new[, 1] * 2 + x_new[, 2] + rnorm(100)  # true values (for evaluation)

# Fit conformal intervals
result <- conformal_split(x, y, model = y ~ ., x_new = x_new, alpha = 0.10)
print(result)
#> ── Conformal Prediction Intervals (Split Conformal) ──
#> • Coverage target: "90%"
#> • Training: 250 | Calibration: 250 | Predictions: 100
#> • Conformal quantile: 1.876
#> • Median interval width: 3.7519

# Extract intervals as a data frame
head(data.frame(pred = result$pred, lower = result$lower, upper = result$upper))

# Verify coverage: should be ≥ 90%
coverage(result, y_new)

# Predict on new data later (fit once, predict many times)
future_data <- matrix(rnorm(50 * 5), ncol = 5)
predict(result, newdata = future_data)

Classification: prediction sets

set.seed(42)
n <- 400
x <- matrix(rnorm(n * 4), ncol = 4)
y <- factor(ifelse(x[, 1] + x[, 2] > 0, "A", "B"))
x_new <- matrix(rnorm(50 * 4), ncol = 4)

clf <- make_model(
  train_fun = function(x, y) glm(y ~ ., data = data.frame(y = y, x),
                                  family = "binomial"),
  predict_fun = function(object, x_new) {
    df <- as.data.frame(x_new)
    names(df) <- paste0("X", seq_len(ncol(x_new)))
    p <- predict(object, newdata = df, type = "response")
    cbind(A = 1 - p, B = p)
  },
  type = "classification"
)

result <- conformal_aps(x, y, model = clf, x_new = x_new, alpha = 0.10)
result$sets[[1]]   # prediction set for first observation: e.g. c("A")
result$sets[[10]]  # ambiguous case might include: c("A", "B")
table(set_size(result))  # distribution of set sizes

Choosing a method

ScenarioRecommended methodWhy
Default for regressionconformal_split()Fast, single model fit
Small dataset, need tight intervalsconformal_cv() or conformal_jackknife()*Uses all data for both training and calibration
Heteroscedastic dataconformal_split(..., score_type = "normalized") or conformal_cqr()Adaptive interval widths
Default for classificationconformal_aps()Adaptive set sizes, well-calibrated
Many classes, want small setsconformal_raps()Regularized APS, penalises large sets
Coverage must hold per subgroupconformal_mondrian() / conformal_mondrian_class()Group-conditional guarantees
Covariate shift between train/testconformal_weighted()Importance-weighted calibration
Sequential/online predictionconformal_aci()**Adapts to distribution drift over time

*Jackknife+ and CV+ have a theoretical coverage guarantee of 1-2α (Barber et al. 2021), weaker than split conformal's 1-α. In practice, coverage is typically near 1-α. **ACI provides asymptotic (not finite-sample) coverage guarantees.


Model interface

There are three ways to specify a model. This flexibility means predictset works with anything from a simple linear model to a custom deep learning wrapper.

1. Formula shorthand (fits lm internally):

result <- conformal_split(x, y, model = y ~ ., x_new = x_new)

This is the quickest way to get started. Pass a formula and predictset handles the fitting.

2. Fitted model (auto-detected for lm, glm, and ranger):

fit <- lm(y ~ ., data = data.frame(y = y, x))
result <- conformal_split(x, y, model = fit, x_new = x_new)

If you've already fitted a model and want conformal intervals around its predictions, pass it directly. predictset recognises standard R model objects and extracts the training and prediction functions automatically.

3. Custom model via make_model() (works with anything):

xgb_model <- make_model(
  train_fun = function(x, y) {
    dtrain <- xgboost::xgb.DMatrix(x, label = y)
    xgboost::xgb.train(list(objective = "reg:squarederror"), dtrain, nrounds = 100)
  },
  predict_fun = function(object, x_new) {
    predict(object, xgboost::xgb.DMatrix(x_new))
  },
  type = "regression"
)

result <- conformal_split(x, y, model = xgb_model, x_new = x_new)

make_model() takes a training function, a prediction function, and a type ("regression" or "classification"). This is how you use conformal prediction with xgboost, keras, lightgbm, or any other model.


Methods

FunctionTypeMethodReference
conformal_split()RegressionSplit conformalVovk et al. (2005)
conformal_cv()RegressionCV+Barber et al. (2021)
conformal_jackknife()RegressionJackknife+Barber et al. (2021)
conformal_cqr()RegressionConformalized Quantile RegressionRomano et al. (2019)
conformal_mondrian()RegressionMondrian (group-conditional)Vovk et al. (2005)
conformal_weighted()RegressionWeighted conformal (covariate shift)Tibshirani et al. (2019)
conformal_aps()ClassificationAdaptive Prediction SetsRomano, Sesia & Candes (2020)
conformal_raps()ClassificationRegularized APSAngelopoulos et al. (2021)
conformal_lac()ClassificationLeast Ambiguous ClassifierSadinle, Lei & Wasserman (2019)
conformal_mondrian_class()ClassificationMondrian (group-conditional)Vovk et al. (2005)
conformal_aci()SequentialAdaptive Conformal InferenceGibbs & Candes (2021)
coverage()DiagnosticEmpirical coverage rate
coverage_by_group()DiagnosticCoverage within subgroups
coverage_by_bin()DiagnosticCoverage by prediction quantile bin
interval_width()DiagnosticWidth of prediction intervals
set_size()DiagnosticSize of prediction sets
conformal_pvalue()DiagnosticConformal p-values
conformal_compare()DiagnosticCompare multiple methods
make_model()UtilityWrap custom train/predict functions

More examples

Jackknife+ with ranger

Jackknife+ uses leave-one-out refitting to produce prediction intervals without splitting the data. This gives tighter intervals than split conformal (because it uses all the data for both training and calibration) at the cost of refitting the model n times.

set.seed(42)
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3)
y <- x[, 1]^2 + x[, 2] + rnorm(n, sd = 0.5)
x_new <- matrix(rnorm(50 * 3), ncol = 3)

rf <- make_model(
  train_fun = function(x, y) {
    ranger::ranger(y ~ ., data = data.frame(y = y, x))
  },
  predict_fun = function(object, x_new) {
    predict(object, data = as.data.frame(x_new))$predictions
  },
  type = "regression"
)

result <- conformal_jackknife(x, y, model = rf, x_new = x_new, alpha = 0.10)
print(result)
plot(result)

Normalised conformal for heteroscedastic data

When the noise varies across the input space (e.g. predictions are more uncertain at extreme values), standard conformal intervals are too wide in low-noise regions and too narrow in high-noise ones. Normalised conformal scoring fixes this by scaling residuals by a local estimate of variability.

set.seed(42)
n <- 500
x <- matrix(runif(n, 0, 10), ncol = 1)
y <- sin(x[, 1]) + rnorm(n, sd = 0.1 + 0.3 * x[, 1])  # noise grows with x
x_new <- matrix(seq(0, 10, length.out = 100), ncol = 1)

result <- conformal_split(
  x, y, model = y ~ ., x_new = x_new, alpha = 0.10,
  score_type = "normalized"
)

# Intervals are narrower near x = 0, wider near x = 10
plot(result)

Mondrian conformal: group-conditional coverage

Standard conformal prediction guarantees marginal coverage (across all test points), but coverage can vary wildly across subgroups. Mondrian conformal computes a separate quantile for each group, guaranteeing coverage within each subgroup. This is critical for fairness and regulatory compliance.

No other R package on CRAN implements Mondrian conformal prediction.

set.seed(42)
n <- 600
x <- matrix(rnorm(n * 3), ncol = 3)
groups <- factor(ifelse(x[, 1] > 0, "high", "low"))
y <- x[, 1] * 2 + ifelse(groups == "high", 3, 0.5) * rnorm(n)
x_new <- matrix(rnorm(200 * 3), ncol = 3)
groups_new <- factor(ifelse(x_new[, 1] > 0, "high", "low"))

result <- conformal_mondrian(x, y, model = y ~ ., x_new = x_new,
                              groups = groups, groups_new = groups_new)
print(result)

# Check per-group coverage
y_new <- x_new[, 1] * 2 + ifelse(groups_new == "high", 3, 0.5) * rnorm(200)
coverage_by_group(result, y_new, groups_new)

Weighted conformal: handling covariate shift

When the test distribution differs from the training distribution (covariate shift), standard conformal coverage guarantees break down. Weighted conformal prediction uses importance weights to correct for this shift.

set.seed(42)
n <- 500
x <- matrix(rnorm(n * 3), ncol = 3)
y <- x[, 1] * 2 + rnorm(n)
x_new <- matrix(rnorm(100 * 3, mean = 1), ncol = 3)  # shifted test data

# Importance weights (likelihood ratio of test vs training distributions)
weights <- dnorm(x[, 1], mean = 1) / dnorm(x[, 1], mean = 0)

result <- conformal_weighted(x, y, model = y ~ ., x_new = x_new,
                              weights = weights)
print(result)

Conformalized Quantile Regression

CQR combines conformal prediction with quantile regression to produce intervals that naturally adapt to heteroscedasticity. Instead of fitting a model for the mean and adding symmetric bands, CQR fits models for the lower and upper quantiles and then adjusts them to guarantee coverage.

set.seed(42)
n <- 500
x <- matrix(rnorm(n * 3), ncol = 3)
y <- x[, 1] + x[, 2]^2 + rnorm(n, sd = 0.5 + abs(x[, 1]))
x_new <- matrix(rnorm(100 * 3), ncol = 3)

# In practice, use quantile regression (e.g. quantreg::rq).
# Here we approximate with shifted linear models for illustration.
model_lo <- make_model(
  train_fun = function(x, y) lm(y ~ ., data = data.frame(y = y, x)),
  predict_fun = function(obj, x_new) predict(obj, newdata = as.data.frame(x_new)) - 1.5,
  type = "regression"
)
model_hi <- make_model(
  train_fun = function(x, y) lm(y ~ ., data = data.frame(y = y, x)),
  predict_fun = function(obj, x_new) predict(obj, newdata = as.data.frame(x_new)) + 1.5,
  type = "regression"
)

result <- conformal_cqr(x, y, model_lo, model_hi, x_new = x_new, alpha = 0.10)
print(result)
plot(result)

Adaptive Conformal Inference (sequential prediction)

ACI adapts the miscoverage level online based on observed coverage, maintaining long-run coverage even under distribution shift. No other R package implements ACI.

set.seed(42)
n <- 500
y_true <- cumsum(rnorm(n, sd = 0.1)) + rnorm(n)  # drifting process
y_pred <- c(0, y_true[-n])                         # naive lag-1 predictor

result <- conformal_aci(y_pred, y_true, alpha = 0.10, gamma = 0.01)
print(result)
plot(result)  # intervals + adaptive alpha trace

Comparing methods

Benchmark multiple conformal methods side-by-side:

set.seed(42)
n <- 500
x <- matrix(rnorm(n * 3), ncol = 3)
y <- x[, 1] * 2 + rnorm(n)
x_new <- matrix(rnorm(200 * 3), ncol = 3)
y_new <- x_new[, 1] * 2 + rnorm(200)

comp <- conformal_compare(x, y, model = y ~ ., x_new = x_new, y_new = y_new,
                           methods = c("split", "cv", "jackknife"))
print(comp)

Diagnostics

After producing predictions, use the diagnostic functions to evaluate calibration and efficiency.

# Suppose y_test contains the true values for x_new
coverage(result, y_test)
#> [1] 0.91

# Average interval width (regression)
mean(interval_width(result))
#> [1] 3.42

# Prediction set sizes (classification)
table(set_size(result))
#>  1  2  3
#> 74 21  5

# Coverage within subgroups
coverage_by_group(result, y_test, groups = groups_test)
#>   group coverage   n target
#> 1  high    0.920  98    0.9
#> 2   low    0.891 102    0.9

# Coverage by prediction quantile bin
coverage_by_bin(result, y_test, bins = 5)

# Conformal p-values for outlier detection
pvals <- conformal_pvalue(result$scores, new_scores)

coverage() should be close to 1 - alpha. If it's substantially lower, something has gone wrong (likely a violation of exchangeability). interval_width() and set_size() measure efficiency: narrower intervals and smaller sets are better, conditional on achieving the target coverage.

coverage_by_group() and coverage_by_bin() diagnose conditional coverage. Marginal coverage can mask severe under-coverage in subgroups (e.g. by demographic group or by prediction magnitude). These diagnostics help identify where intervals fail, and are essential for fairness evaluation.


Theory and references

Conformal prediction was introduced by Vovk, Gammerman, and Shafer in the early 2000s. The key insight is that if calibration and test data are exchangeable (i.e. their joint distribution is invariant to permutation), then the conformal p-value is uniformly distributed, which gives an exact finite-sample coverage guarantee. Unlike bootstrap or Bayesian intervals, this guarantee holds regardless of model misspecification.

The recent explosion of interest in conformal prediction has been driven by several methodological advances that make it practical for modern machine learning:

For an accessible introduction to the field, see Angelopoulos and Bates (2023), A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification.


Limitations

  • Split methods halve the training data. Split conformal, APS, RAPS, and LAC all divide the data into a training set and a calibration set. With small datasets, this can noticeably reduce model quality. Jackknife+ and CV+ avoid this at the cost of refitting the model multiple times.
  • Jackknife+ and CV+ are computationally expensive. Jackknife+ refits the model n times; CV+ refits it K times. For large datasets or expensive models, this may be impractical.
  • The coverage guarantee requires exchangeability. If the calibration data and test data come from different distributions (for example, if there is temporal drift) the coverage guarantee does not hold. This means conformal prediction is not directly applicable to time series forecasting without modification (e.g. conformal methods for time series exist but are not implemented here).
  • Classification methods depend on probability estimates. APS, RAPS, and LAC require the model to output well-calibrated class probabilities. If the probabilities are poorly calibrated, the prediction sets will still have valid coverage but may be unnecessarily large.

Related packages

PackageDescription
probablyConformal regression within the tidymodels ecosystem
conformalInferenceResearch code by Tibshirani et al. (2019, GitHub only)
onsUK Office for National Statistics data
boeBank of England data
fredFederal Reserve Economic Data (FRED)
readecbEuropean Central Bank data
readoecdOECD data
inflateRInflation adjustment

Issues

Found a bug or have a feature request? Please open an issue on GitHub.

Metadata

Version

0.3.0

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows