Description

Data Normalization and Transformation.

Provides functions for data normalization and transformation in preprocessing stages. Implements scaling methods (min-max, Z-score, L2 normalization) and power transformations (Box-Cox, Yeo-Johnson). Box-Cox transformation is described in Box and Cox (1964) <doi:10.1111/j.2517-6161.1964.tb00553.x>, Yeo-Johnson transformation in Yeo and Johnson (2000) <doi:10.1093/biomet/87.4.954>.

prepkit: Robust preprocessing for Digital Health


Full Documentation & Tutorials: https://gonrui.github.io/prepkit/

"When Z-Score fails, use M-Score."

prepkit is a comprehensive R package designed for the preprocessing of longitudinal behavioral data, with a specific focus on gerontology, digital health, and sensor analytics.

Its flagship feature is the M-Score (Mode-Range Normalization), a novel algorithm designed to detect anomalies in data characterized by "habitual plateaus" (e.g., daily step counts, heart rate), where traditional methods like Z-Score or Min-Max scaling often fail due to skewed distributions and high-frequency routine noise.

📦 Installation

prepkit is rigorously tested on Linux, macOS, and Windows, with compatibility verified up to R 4.5 (development version).

You can install the stable version from GitHub:

# install.packages("devtools")
devtools::install_github("Gonrui/prepkit")

🚀 Key Algorithms

| Function | Algorithm / Description | Use Case |
|---|---|---|
| norm_mode_range | M-Score (New!) | Detects frailty/falls in elderly behavioral data by suppressing routine noise. |
| trans_boxcox | Robust Box-Cox | MLE-optimized power transform. Auto-handles non-positive values. |
| trans_yeojohnson | Yeo-Johnson | Power transform natively supporting negative values. |
| norm_l2 | Spatial Sign | Projects data onto a unit hypersphere (L2 Norm). Ideal for high-dim clustering. |
| pp_plot | Density Visualizer | Instant "Before vs. After" visualization for normality checks. |
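
For readers who want to see what the power transforms and the L2 normalization in the table compute, here is a minimal base-R sketch of the textbook formulas. It does not use prepkit and is not the package's implementation: the helper names `box_cox`, `yeo_johnson`, and `l2_normalize` are hypothetical, lambda is fixed by hand rather than estimated by MLE, and the Box-Cox helper simply rejects non-positive input instead of handling it automatically.

```r
# Illustrative base-R versions of three transforms named in the table.
# These are NOT prepkit's functions; lambda is chosen by hand here.

# Box-Cox: defined for strictly positive y only.
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Yeo-Johnson: extends the same idea to zero and negative values.
yeo_johnson <- function(y, lambda) {
  out <- numeric(length(y))
  pos <- y >= 0
  out[pos] <- if (abs(lambda) < 1e-8) {
    log1p(y[pos])
  } else {
    ((y[pos] + 1)^lambda - 1) / lambda
  }
  out[!pos] <- if (abs(lambda - 2) < 1e-8) {
    -log1p(-y[!pos])
  } else {
    -((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  out
}

# L2 normalization: rescale a vector to unit Euclidean length.
l2_normalize <- function(x) x / sqrt(sum(x^2))

box_cox(c(1, 2, 4), lambda = 0)       # same as log(c(1, 2, 4))
yeo_johnson(c(-2, 0, 3), lambda = 1)  # lambda = 1 leaves values unchanged
l2_normalize(c(3, 4))                 # 0.6 0.8
```

With lambda = 1 the Yeo-Johnson transform is the identity on both branches, which makes it a convenient sanity check before trying other values.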

📊 Quick Start: The Power of M-Score

Real-world sensor data often contains routine plateaus (e.g., an older adult consistently walking 3000 steps) and sensor noise (floating-point jitter).

M-Score handles both elegantly:

library(prepkit)

# 1. Simulate Sensor Data
# - Routine: ~3000 steps (with sensor jitter like 3000.1, 2999.9)
# - Anomaly: 200 steps (Fall/Frailty)
# - Anomaly: 6000 steps (Hyperactivity)
steps <- c(3000.1, 3000.2, 2999.8, 200.0, 3000.0, 6000.0)

# 2. Apply M-Score (with precision protection)
# The 'digits' parameter rounds values internally to identify the semantic mode,
# while preserving the precision of the anomalies.
m_scores <- norm_mode_range(steps, digits = 0)

# 3. View Results
print(data.frame(Raw = steps, M_Score = m_scores))

📐 Mathematical Foundation

The M-Score transforms data based on its Mode Interval (the routine plateau). Unlike Z-Score which penalizes stability (low variance), M-Score treats the most frequent range as the "Safe Zone" (Score = 0).
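
One way to read "penalizes stability" is that a very small spread shrinks the Z-score denominator, so trivial deviations look large. The base-R sketch below illustrates this on made-up data (the vector `stable` is a hypothetical example, not from the package): a single extra step on an otherwise constant series scores about 1.79 standard deviations.

```r
# A very stable week: four days at 3000 steps, one day at 3001.
stable <- c(3000, 3000, 3000, 3000, 3001)

# Classical Z-score: (x - mean) / sd, using the sample standard deviation.
z <- (stable - mean(stable)) / sd(stable)
round(z, 2)
# -0.45 -0.45 -0.45 -0.45  1.79
```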

The transformation function $M(x)$ is defined as:

$$
M(x) = \begin{cases}
-\frac{k_L - x}{k_L - k_{min}} & \text{if } x < k_L \quad (\text{Left Tail / Frailty}) \\
0 & \text{if } k_L \le x \le k_R \quad (\text{Routine Plateau}) \\
\frac{x - k_R}{k_{max} - k_R} & \text{if } x > k_R \quad (\text{Right Tail / Hyper})
\end{cases}
$$

Monotonicity (strict on each tail, constant at 0 on the plateau) was validated via symbolic computation (Wolfram Mathematica).
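
The piecewise definition of $M(x)$ can be sketched directly in base R. This is an illustration under stated assumptions, not the code behind `norm_mode_range`: the function name `m_score_sketch` is hypothetical, the plateau bounds `k_L` and `k_R` are supplied by hand (how prepkit locates the mode interval is not described here), and $k_{min}$ and $k_{max}$ are assumed to be the observed minimum and maximum.

```r
# Sketch of the piecewise M(x). Requires data on both sides of the plateau,
# otherwise a tail denominator would be zero.
m_score_sketch <- function(x, k_L, k_R, k_min = min(x), k_max = max(x)) {
  stopifnot(k_min < k_L, k_L <= k_R, k_R < k_max)
  out <- numeric(length(x))   # plateau values keep the score 0
  left  <- x < k_L
  right <- x > k_R
  out[left]  <- -(k_L - x[left]) / (k_L - k_min)
  out[right] <- (x[right] - k_R) / (k_max - k_R)
  out
}

steps <- c(3000.1, 3000.2, 2999.8, 200.0, 3000.0, 6000.0)
m_score_sketch(steps, k_L = 2999.5, k_R = 3000.5)
# 0 0 0 -1 0 1
```

With the plateau taken as [2999.5, 3000.5], the jittered routine readings all map to 0, while the 200-step and 6000-step days reach the ends of the scale at -1 and 1.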

📚 Citation

If you use prepkit or the M-Score algorithm in your research, please cite:

Gong, R. (2026). M-Score: A Robust Normalization Method for Detecting Anomalies in Longitudinal Behavioral Data. arXiv preprint.

📄 License

MIT © Rui Gong (Tokyo Metropolitan Institute of Gerontology)

Metadata

Version

0.1.1

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows