Description

Local Density-Based Outlier Detection.

Flexible procedures to compute local density-based outlier scores for ranking outliers. Both exact and approximate nearest neighbor search can be implemented, while also accommodating multiple neighborhood sizes and four different local density-based methods. It allows for referencing a random subsample of the input data or a user-specified reference data set to compute outlier scores against, so both unsupervised and semi-supervised outlier detection can be implemented.

Overview

The ldbod package provides flexible functions for computing local density-based outlier scores. Both exact and approximate nearest neighbor search are available, based on an efficient k-d tree method. The functions accommodate multiple neighborhood sizes and four different local density-based methods: LOF, LDF, RKOF, and LPDF. Outlier scores can be computed against a random subsample of the input data or against a user-specified reference data set, so both unsupervised and semi-supervised outlier detection can be done.

The package includes two functions, ldbod and ldbod.ref. Function ldbod(X,k,...) computes outlier scores referencing a random subsample of the input data, X. Function ldbod.ref(X,Y,k,...) computes outlier scores for X based on a reference data set, Y. Y can be a set of "normal" data points for semi-supervised outlier detection. Note: Outlier score lpdr is only designed for unsupervised outlier detection and should not be used in the semi-supervised setting. Both functions can return nine outlier scores based on the methods LOF, LDF, RKOF, and LPDF. Each method returns both densities and relative densities.
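The two calls described above might look like the following minimal sketch. The data are synthetic and purely illustrative. The argument names nsub and method, and the assumption that each returned score (for example lof) is a matrix with one column per value of k, come from the package help pages as best recalled; check ?ldbod and ?ldbod.ref before relying on them.

```r
library(ldbod)

set.seed(1)
# Synthetic data: 200 "normal" points plus 5 points shifted far away
X <- rbind(matrix(rnorm(400), ncol = 2),
           matrix(rnorm(10, mean = 6), ncol = 2))

# Unsupervised: score X against a random subsample of itself
# (nsub and method are assumed argument names; see ?ldbod)
scores <- ldbod(X, k = c(10, 20), nsub = 100, method = c("lof", "lpdf"))

# Assumed layout: one row per observation, one column per neighborhood size
str(scores$lof)

# Semi-supervised: score X against a reference set Y of "normal" points
Y <- matrix(rnorm(400), ncol = 2)
ref_scores <- ldbod.ref(X, Y, k = c(10, 20), method = "lof")

# Row indices of the highest LOF scores for the first neighborhood size
top <- head(order(ref_scores$lof[, 1], decreasing = TRUE))
top
```

Because scores are returned per neighborhood size, they can be combined afterwards, for example by averaging across columns, to form an ensemble score.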

All kNN computations are carried out using the nn2 function from the RANN package. For method LPDF, multivariate t densities are computed using the dmt function from the mnormt package. Refer to the specific packages for more details. Note: all neighborhoods are strictly of size k; therefore, the algorithms for LOF, LDF, and RKOF are not exact implementations, but they are similar in most situations and equivalent when the distance to the k-th nearest neighbor is unique. If there are many duplicate data points, these implementations could give dramatically different results from implementations that allow neighborhood sizes larger than k, especially if k is relatively small. Removing duplicates is recommended before computing outlier scores unless there is good reason to keep them.
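To illustrate why duplicates matter when neighborhoods are strictly of size k, here is a small sketch that calls nn2 from RANN directly on toy data (the data and the choice k = 3 are illustrative assumptions, not taken from the package). A point with enough copies gets a neighborhood made up entirely of zero-distance duplicates, which is the situation where density estimates can degenerate.

```r
library(RANN)

# Toy data: the point (0, 0) appears three times
X <- rbind(c(0, 0), c(0, 0), c(0, 0),
           c(1, 2), c(3, 1), c(2, 2))

# k = 3 nearest neighbors of every row, searched within X itself
# (each point counts among its own neighbors here)
nn <- nn2(data = X, query = X, k = 3)

# For the triplicated point, all three neighbor distances are zero
nn$nn.dists[1, ]

# Drop repeated rows before scoring; duplicated() works row-wise on a matrix
X_unique <- X[!duplicated(X), , drop = FALSE]
nrow(X_unique)
```

With the duplicates removed, the k-th nearest neighbor distance of each remaining point is strictly positive in this toy example.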

Motivation

The main motivation for this package is the need for more flexible implementations of local density-based outlier detection methods that can be used to create ensemble outlier scores. The package is based on the PhD dissertation work by K. T. Williams (2016).

Installation

To install the most up-to-date version from GitHub, run the following commands in R:

install.packages("devtools")
devtools::install_github("kwilliams83/ldbod")

Alternatively, install the released version from CRAN:

install.packages("ldbod")

Metadata

Version

0.1.2

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows