Description

Relative Importance PCA Regression.

Performs Principal Components Analysis (PCA) dimensionality reduction in the context of a linear regression. In most cases, PCA dimensionality reduction is performed independently of the response variable for a regression. This captures the majority of the variance of the model's predictors, but may not be the optimal dimensionality reduction for a regression against the response variable. An alternative method, optimized for a regression against the response variable, is to use both PCA and a relative importance measure. This package applies PCA to a given data frame of predictors, and then calculates the relative importance of each PCA factor against the response variable. It outputs ordered factors that are optimized for model fit. By performing dimensionality reduction with this method, an individual can achieve the same r-squared value as with PCA alone, but with fewer PCA factors. References: Yuri Balasanov (2017) <https://ilykei.com>.

cran

RelimpPCR - Relative Importance PCA Regression

Acknowledgements

The concepts and ideas used to create this package/code were learned from Yuri Balasanov at the University of Chicago Master of Science in Analytics Program.

Recent Additions

  • Now able to handle matrices that, through PCA, produce very small eigenvalues, which previously caused the calc.relimp() function to fail. RelimpPCR can now either a) iteratively drop the last N PCA factors until a suitable matrix for calc.relimp() is produced, or b) drop the last M PCA factors, where M is a value set by the user.
  • Now able to perform train/test split on data.
  • Plots now show R2 progress for both train and test data sets.
  • Added a prediction function, RelimpPCR.predict().
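
The small-eigenvalue handling in the first item above can be illustrated with a minimal base R sketch. It is not the package's code: an eigenvalue tolerance (the hypothetical `tol`) stands in for "calc.relimp() succeeds", and the collinear column `disp_copy` is added only to force a degenerate factor.

```r
# Sketch only: base R, not the RelimpPCR implementation.
data(mtcars)
X <- mtcars[, c("disp", "hp", "wt")]
X$disp_copy <- 2 * X$disp  # exactly collinear column, gives a near-zero eigenvalue

pca <- prcomp(X, center = TRUE, scale. = TRUE)
eigenvalues <- pca$sdev^2

tol <- 1e-8  # hypothetical tolerance, chosen for illustration
n_keep <- length(eigenvalues)

# Drop trailing factors one at a time while the last one is numerically degenerate.
while (n_keep > 1 && eigenvalues[n_keep] < tol) {
  n_keep <- n_keep - 1
}

# Keep only the factors that survived the check.
scores <- pca$x[, seq_len(n_keep), drop = FALSE]
print(n_keep)
```

Option b) in the same item corresponds to skipping the loop and subtracting a user-chosen M from the number of factors directly.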

Description

This package performs PCA dimensionality reduction in the context of a linear regression. In most cases, PCA dimensionality reduction is performed independently of the Y values for a regression. This captures the majority of the variance of the X values, but may not be the optimal dimensionality reduction for a regression against Y.

An alternative method, optimized for a regression against Y, is to use both PCA and a relative importance measure. This package applies PCA to a given data frame of X values, and then calculates the relative importance of each PCA factor against Y. It outputs ordered factors that are optimized for model fit. By performing dimensionality reduction with this method, an individual can achieve the same r-squared value as with PCA alone, but with fewer predictors.
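
The idea can be sketched in base R without the package. This is an illustration under stated assumptions, not the RelimpPCR implementation: because PCA scores are uncorrelated in-sample, each factor's squared correlation with Y is used here as a stand-in for a relative importance measure, whereas the package itself relies on calc.relimp(). The helper `cumulative_r2` and the choice of `mpg` as Y are hypothetical.

```r
# Sketch only: base R (stats + datasets), not the RelimpPCR implementation.
data(mtcars)
y <- mtcars$mpg
X <- mtcars[, setdiff(names(mtcars), "mpg")]

# PCA on centered, scaled predictors.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pca$x

# PCA scores are mutually uncorrelated, so each factor's share of the
# regression R^2 is its squared correlation with y.
importance <- as.vector(cor(scores, y))^2
names(importance) <- colnames(scores)

pca_order <- colnames(scores)  # ordered by variance explained
relimp_order <- names(sort(importance, decreasing = TRUE))  # ordered by importance for y

# Hypothetical helper: R^2 after adding factors one at a time in the given order.
cumulative_r2 <- function(factor_order) {
  sapply(seq_along(factor_order), function(k) {
    cols <- factor_order[seq_len(k)]
    fit <- lm(y ~ ., data = as.data.frame(scores[, cols, drop = FALSE]))
    summary(fit)$r.squared
  })
}

r2_pca <- cumulative_r2(pca_order)
r2_relimp <- cumulative_r2(relimp_order)

# Compare how quickly each ordering accumulates R^2.
print(round(rbind(pca = r2_pca, relimp = r2_relimp), 3))
```

On the training data, the importance ordering can never fall behind the variance ordering at any number of factors, and both reach the same r-squared once all factors are included. That guarantee does not carry over to held-out data.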

Installation

In a shell / command prompt, build the package with "R CMD build RelimpPCR" (from the directory above "RelimpPCR").
In a shell / command prompt, install the package with "R CMD INSTALL RelimpPCR" (from the directory above "RelimpPCR").

Use

After importing the package, you can use the "RelimpPCR()" function to train a relative importance PCA regression.
After training a relative importance PCA regression, you can use it for prediction with the "RelimpPCR.predict()" function.

Example

Note in the example below that RelimpPCR reaches higher r-squared values with fewer factors than the methods it is compared against.

[Figure: Example of RelimpPCR on mtcars data]

Metadata

Version

0.3.0

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows