Normalisation of Multiple Variables in Large-Scale Datasets.
The robustness of many of the statistical techniques, such as factor analysis, applied in the social sciences rests upon the assumption of item-level normality. However, when dealing with real data, this assumption is often not met. The Box-Cox transformation (Box & Cox, 1964) <http://www.jstor.org/stable/2984418> provides an optimal transformation for non-normal variables. Yet, for large datasets of continuous variables, its application in current software programs is cumbersome, with analysts having to take several steps to normalise each variable. We present an R package 'normalr' that enables researchers to make convenient optimal transformations of multiple variables in datasets. This R package enables users to quickly and accurately: (1) anchor all of their variables at 1.00, (2) select the desired precision with which the optimal lambda is estimated, (3) apply each unique exponent to its variable, (4) rescale resultant values to within their original X1 and X(n) ranges, and (5) provide original and transformed estimates of skewness, kurtosis, and other inferential assessments of normality.
normalr lets you normalise multiple variables in large-scale datasets.
normalr is available from CRAN. Install it with:
install.packages("normalr")
You can install the development version of normalr from GitHub with:
# install.packages("devtools")
devtools::install_github("kcha193/normalr")
The following example uses normalr to normalise the 11 continuous variables in the mtcars dataset.
library(normalr)
normaliseData(mtcars, getLambda(mtcars))
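To make the five steps in the description concrete, here is a minimal base R sketch of what they could look like for a single column. It is an illustration under stated assumptions, not the code normalr runs: the helper names (`box_cox`, `box_cox_loglik`, `normalise_sketch`, `skewness`), the lambda grid from -3 to 3, and the choice of a profile log-likelihood grid search are all assumptions made for this example.

```r
# Illustrative sketch only; these helpers are hypothetical and are NOT part of normalr.

# Box-Cox transform for strictly positive x.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for one variable (intercept-only model).
box_cox_loglik <- function(x, lambda) {
  y <- box_cox(x, lambda)
  n <- length(x)
  -n / 2 * log(sum((y - mean(y))^2) / n) + (lambda - 1) * sum(log(x))
}

normalise_sketch <- function(x, lambdas = seq(-3, 3, by = 0.01)) {
  # (1) Anchor the variable so that its minimum is 1.
  anchored <- x - min(x) + 1
  # (2) Estimate lambda on a grid; the step size sets the precision.
  ll <- vapply(lambdas, function(l) box_cox_loglik(anchored, l), numeric(1))
  lambda <- lambdas[which.max(ll)]
  # (3) Apply the selected exponent.
  y <- box_cox(anchored, lambda)
  # (4) Rescale linearly back to the original minimum and maximum.
  rescaled <- (y - min(y)) / (max(y) - min(y)) * (max(x) - min(x)) + min(x)
  list(lambda = lambda, values = rescaled)
}

# (5) A simple moment-based skewness estimate for before/after comparison.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

res <- normalise_sketch(mtcars$hp)
print(res$lambda)
print(c(before = skewness(mtcars$hp), after = skewness(res$values)))
```

A grid search is used here because its step size maps directly onto the "desired precision" of step (2); a numerical optimiser such as `optimize()` would be a reasonable alternative. Because the transform and the rescaling are both monotonic, the ordering of observations is preserved.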
normalr also includes a Shiny application that normalises any dataset the user supplies.
The Shiny app is also available from shinyapps.io.