Description
Refined Modified Stahel-Donoho (MSD) Estimators for Outlier Detection (Parallel Version)
This package contains a parallel function for multivariate outlier detection based on the modified Stahel-Donoho (MSD) estimators. The function RMSDp() is intended for elliptically distributed datasets and identifies outliers using the Mahalanobis distance. It is meant for higher dimensional datasets that cannot be handled by the single-core function RMSD() included in the 'RMSD' package. See Wada and Tsubaki (2013) <doi:10.1109/CLOUDCOM-ASIA.2013.86> for details of the algorithm.

RMSDp

The package RMSDp contains a function RMSDp. It aims to provide a tool for multivariate outlier detection based on the Modified Stahel-Donoho estimators to cope with higher dimensional datasets than the function RMSD in the package RMSD. On the other hand, the function RMSD is faster than the function RMSDp for lower dimensional datasets.

The parameter 'cores' controls the number of cores used for the computation. If this parameter is omitted, all cores are used.

Since the function RMSDp uses random numbers, results can differ between runs unless a random seed is provided using the parameter 'sd'.

The default threshold for declaring an observation an outlier is the 99.9th percentile of the F-statistic. It can be changed using the parameter 'pt'.
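The three parameters described above can be combined in a single call. The following is a minimal sketch, not taken from the package documentation: the simulated data and the names dat, n_avail, n_use and res are illustrative, and it assumes that 'pt' is given as a probability (so 0.99 would mean the 99th percentile). The call is guarded so that the snippet still runs when RMSDp is not installed.

```r
# Simulated, roughly elliptical data (illustrative only; not from the package).
set.seed(42)
dat <- as.data.frame(matrix(rnorm(500 * 5), ncol = 5))

# Leave one core free for the rest of the system; detectCores() can return NA.
n_avail <- parallel::detectCores()
n_use <- if (is.na(n_avail)) 1L else max(1L, n_avail - 1L)

if (requireNamespace("RMSDp", quietly = TRUE)) {
  # 'sd' fixes the random seed so that repeated runs can be compared.
  # Assumption: 'pt' is a probability (0.99 = 99th percentile).
  res <- RMSDp::RMSDp(dat, cores = n_use, sd = 1234, pt = 0.99)
  print(table(res$ot))  # counts per value of the outlier flag
}
```

Lowering 'pt' makes the threshold less strict, so more observations would be expected to be flagged.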

This function is an improved version of 'msd.parallel' at https://github.com/kazwd2008/MSD.parallel. It adds a final step that decides outliers by calculating the squared Mahalanobis distance of each observation and the corresponding F-statistic.
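The idea behind this final step can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the location and scatter are estimated from rows known to be clean as a stand-in for the robust MSD estimates, the scaling that turns the squared distance into an F-statistic is one common choice and may differ from the constants used in RMSDp, and the flag coding (1 = normal, 2 = outlier) is assumed.

```r
set.seed(1)
n <- 200
p <- 3
x <- matrix(rnorm(n * p), ncol = p)
x[1:5, ] <- x[1:5, ] + 8   # plant five gross outliers in rows 1-5

# Assumption: estimates from the known-clean rows stand in for the robust
# MSD mean and covariance that the package computes.
center <- colMeans(x[-(1:5), ])
scatter <- cov(x[-(1:5), ])

# Squared Mahalanobis distance of every observation.
d2 <- mahalanobis(x, center, scatter)

# Assumption: one common scaling relating d2 to an F(p, n - p) variable.
f_stat <- d2 * ((n - p) * n) / ((n^2 - 1) * p)

# Flag observations above the 99.9th percentile of F(p, n - p).
cutoff <- qf(0.999, df1 = p, df2 = n - p)
flag <- ifelse(f_stat > cutoff, 2L, 1L)   # 1 = normal, 2 = outlier (assumed)
which(flag == 2L)
```

With a non-robust mean and covariance, the planted outliers would inflate the scatter estimate and could mask themselves, which is why robust estimates are used before this step.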

Installation

from CRAN

The package RMSDp is on CRAN.

install.packages("RMSDp")
library(RMSDp)

from GitHub

You can also install the package from GitHub with:

install.packages("devtools")
devtools::install_github("kazwd2008/RMSDp")

Example: wine data [13 variables]

This is an example with the wine data set from the UCI machine learning repository (https://archive.ics.uci.edu/ml/datasets/wine). Download 'wine.data' into your current directory. The dataset contains 1 categorical variable and 13 numerical variables.

RMSDp assumes an elliptical data distribution.

# number of cores of your PC
getOption("mc.cores", parallel::detectCores())

library(RMSDp)
wine <- read.csv("wine.data", header=FALSE)
ot1 <- RMSDp(wine[,-1], cores=2)

# scatterplot with the outlier flag
plot(wine, col=ot1$ot, pch=19, main="Scatterplot (outliers are shown in red)")

# final weights
plot(ot1$wt, pch=19, col=ot1$ot)

# parallel coordinate plot
MASS::parcoord(wine[,-1], col=ot1$ot, lty=c(3,1)[ot1$ot], lwd=c(1,2)[ot1$ot])
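
To list which observations were flagged, the 'ot' component can be tabulated and indexed. The sketch below uses a small hand-made stand-in (the hypothetical object ot_demo) instead of a real result, so that it runs without the package or the data file. The coding 1 = normal, 2 = outlier is an assumption inferred from how 'ot' is used as a colour and line-type index in the plots above.

```r
# Hypothetical stand-in for the list returned by RMSDp(); only 'ot' is mimicked.
ot_demo <- list(ot = c(1, 1, 2, 1, 2, 1))

# Assumption: 1 = normal observation, 2 = outlier.
table(ot_demo$ot)                    # how many observations fall in each group
flagged <- which(ot_demo$ot == 2)    # row indices of the flagged observations
flagged
```

With a real result, replacing ot_demo by ot1 would give the row numbers of the wine data that were flagged.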

Metadata

Version

0.1.1

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows