MyNixOS website logo
Description

Plausible Naive Bayes Classifier Using PDE.

A nonparametric, multicore-capable plausible naive Bayes classifier based on the Pareto density estimation (PDE) featuring a plausible approach to a pitfall in the Bayesian theorem covering low evidence cases. Stier, Q., Hoffmann, J., and Thrun, M.C.: "Classifying with the Fine Structure of Distributions: Leveraging Distributional Information for Robust and Plausible Naive Bayes" (2026), Machine Learning and Knowledge Extraction (MAKE), <DOI:10.3390/make8010013>.

CRAN_Status_Badge CRAN RStudio mirror downloads CRAN RStudio mirror downloads

PDEnaiveBayes

Table of Contents

1. Introduction
2. Installation
3. Additional Resources
4. References

1. Introduction

A nonparametric, multicore-capable plausible naive Bayes classifier based on the Pareto density estimation (PDE) featuring a plausible approach to a pitfall in the Bayesian theorem covering low evidence cases [Stier et al., 2026].

The PDEnaiveBayes package provides a nonparametric approach to Naive Bayes classification, relying on Pareto Density Estimation (PDE) for likelihood modeling. It offers functionalities for:

  • Training standard and multicore PDE-based Naive Bayes classifiers
  • Estimating likelihoods and priors nonparametrically
  • Applying Bayes’ theorem in a numerically stable manner
  • Plotting likelihoods, posteriors, and decision boundaries

Training and Prediction

Train_naiveBayes

Train a Pareto-density-based Naive Bayes classifier.

data(Hepta)
Data <- Hepta$Data
Cls  <- Hepta$Cls
model <- Train_naiveBayes(Data, Cls, Gaussian = FALSE)
table(Cls, model$ClsTrain)

Train_naiveBayes_multicore

Multicore version of the PDE-based Naive Bayes training.

if (requireNamespace("FCPS")) {
  data(Hepta)
  Data <- Hepta$Data
  Cls  <- Hepta$Cls
  model_mc <- Train_naiveBayes_multicore(
    cl       = NULL,
    Data     = Data,
    Cls      = Cls,
    Gaussian = FALSE,
    Predict  = TRUE
  )
  table(Cls, model_mc$ClsTrain)
}

Predict_naiveBayes

Predict class memberships for new data using a trained naive Bayes model.

if (requireNamespace("FCPS")) {
  V    <- FCPS::ClusterChallenge("Hepta", 1000)
  Data <- V$Hepta
  Cls  <- V$Cls

  ind       <- 1:length(Cls)
  ind_train <- sample(ind, 800)
  ind_test  <- setdiff(ind, ind_train)

  model <- Train_naiveBayes(Data[ind_train, ], Cls[ind_train], Gaussian = FALSE)
  res   <- Predict_naiveBayes(Data[ind_test, ], Model = model)

  table(Cls[ind_test], res$ClsTest)
}

predict.PDEbayes

S3 prediction method for PDE-based naive Bayes models.

if (requireNamespace("FCPS")) {
  V    <- FCPS::ClusterChallenge("Hepta", 1000)
  Data <- V$Hepta
  Cls  <- V$Cls

  ind       <- 1:length(Cls)
  ind_train <- sample(ind, 800)
  ind_test  <- setdiff(ind, ind_train)

  model   <- Train_naiveBayes(Data[ind_train, ], Cls[ind_train], Gaussian = FALSE)
  ClsTest <- predict.PDEbayes(object = model, newdata = Data[ind_test, ])
  table(Cls[ind_test], ClsTest)
}

Probability and Density Utilities

getPriors

Compute class priors from class proportions.

if (requireNamespace("FCPS")) {
  data(Hepta)
  Cls    <- Hepta$Cls
  Priors <- getPriors(Cls)
  print(Priors)
}

Visualization Tools

The following functions visualize either the learned likelihoods, the resulting posterior probabilities, or a 2D Bayesian decision surface. They are primarily useful for inspection and model interpretability.

PlotNaiveBayes

Visualize the PDE-based naive Bayes model for all or selected features.

Data <- as.matrix(iris[, 1:4])
Cls  <- as.numeric(iris[, 5])

model <- Train_naiveBayes(Data = Data, Cls = Cls, Plausible = FALSE)
FeatureNames <- colnames(Data)

PlotNaiveBayes(Model = model$Model, FeatureNames = FeatureNames)

PlotLikelihoods

Plot likelihoods for feature-class combinations.

Data <- as.matrix(iris[, 1:4])
Cls  <- as.numeric(iris[, 5])

model <- Train_naiveBayes(Data = Data, Cls = Cls, Plausible = FALSE)

PlotLikelihoods(
  Likelihoods = model$Model$ListOfLikelihoods,
  Data        = Data
)

PlotLikelihoodFuns

Plot likelihood functions as obtained from PDE or Gaussian estimation.

Data <- as.matrix(iris[, 1:4])
Cls  <- as.numeric(iris[, 5])

model <- Train_naiveBayes(Data = Data, Cls = Cls, Plausible = FALSE)

PlotLikelihoodFuns(
  LikelihoodFuns = model$Model$PDFs_funs,
  Data           = Data
)

PlotPosteriors

Plot posterior class probabilities for a selected class.

Data <- as.matrix(iris[, 1:4])
Cls  <- as.numeric(iris[, 5])

model <- Train_naiveBayes(Data = Data, Cls = Cls, Plausible = FALSE)

PlotPosteriors(
  Data       = Data,
  Posteriors = model$Posteriors,
  Class      = 1
)

PlotBayesianDecision2D

Display a 2D Bayesian decision surface.

Data <- as.matrix(iris[, 1:4])
Cls  <- as.numeric(iris[, 5])

model <- Train_naiveBayes(Data = Data, Cls = Cls, Plausible = FALSE)

PlotBayesianDecision2D(
  X          = Data[, 1],
  Y          = Data[, 2],
  Posteriors = model$Posteriors,
  Class      = 1
)

Tutorial Examples

The tutorial with several examples can be found on in the vignette on CRAN.

2. Installation

Installation using CRAN (once published)

install.packages("PDEnaiveBayes", dependencies = TRUE)

Installation using GitHub

Please note, that dependencies have to be installed manually.

remotes::install_github("Mthrun/PDEnaiveBayes")

Installation using R Studio

Please note, that dependencies have to be installed manually.

Tools -> Install Packages -> Repository (CRAN) -> PDEnaiveBayes

3. Additional Resources

  • View package on CRAN

Manual

The full manual for users or developers is available here: Package documentation

4. References

[Ultsch, 2005] Ultsch, A.: Pareto Density Estimation: A Density Estimation for Knowledge Discovery, in Baier, D.; Wernecke, K.-D. (Eds.), Innovations in Classification, Data Science, and Information Systems (Proc. 27th Annual GfKl Conference), Springer, Berlin, pp. 91–100, https://doi.org/10.1007/3-540-26981-9_12, 2005.

[John/Langley, 1995] John, G. H. / Langley, P.: Estimating Continuous Distributions in Bayesian Classifiers, in Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI–95), Morgan Kaufmann, San Mateo, pp. 338–345, 1995.

[Stier et al., 2026] Stier, Q.,Hoffmann, J. & Thrun, M. C.: Classifying with the Fine Structure of Distributions: Leveraging Distributional Information for Robust and Plausible Naïve Bayes, Machine Learning and Knowledge Extraction (MAKE), Vol. 8(1), 13, doi 10.3390/make8010013, MDPI, 2026.

Metadata

Version

0.2.9

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows