Pre-Processing XY Data from Experimental Methods.
alkahest
Overview
alkahest is a lightweight, dependency-free toolbox for pre-processing XY data from experimental methods (i.e. any signal that can be measured along a continuous variable). It provides methods for baseline estimation and correction, smoothing, normalization, integration and peak detection.
- Baseline estimation methods: Linear, Polynomial (Lieber and Mahadevan-Jansen 2003), Asymmetric Least Squares (Eilers and Boelens 2005), Rolling Ball (Kneen and Annegarn 1996), Rubberband, SNIP (Morháč et al. 1997; Morháč and Matoušek 2008; Ryan et al. 1988), 4S Peak Filling (Liland 2015).
- Smoothing methods: Rectangular, Triangular, Loess, Savitzky-Golay Filter (Gorry 1990; Savitzky and Golay 1964), Whittaker (Eilers 2003), Penalized Likelihood (De Rooi et al. 2014).
To cite alkahest in publications use:
Frerebeau N (2024). alkahest: Pre-Processing XY Data from Experimental Methods. Université Bordeaux Montaigne, Pessac, France. doi:10.5281/zenodo.7081524 (https://doi.org/10.5281/zenodo.7081524), R package version 1.2.0, https://packages.tesselle.org/alkahest/.
This package is part of the tesselle project (https://www.tesselle.org).
Installation
You can install the released version of alkahest from CRAN with:
install.packages("alkahest")
And the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("tesselle/alkahest")
Usage
## Load the package
library(alkahest)
alkahest expects the input data to be in the simplest form (a two-column matrix or data frame, a two-element list or two numeric vectors).
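As a rough illustration of these input forms, the sketch below builds the same hypothetical measurements as a matrix, a data frame, a list and two vectors, and reduces each to a pair of x and y vectors with base R's grDevices::xy.coords(). This is a stand-in used only to show that the four forms carry the same information; it is an assumption, not a statement about how alkahest parses its arguments. The variable names and the x/y list names (required by xy.coords()) are illustrative.

```r
## Hypothetical measurements: position and intensity
pos <- c(1, 2, 3, 4, 5)
int <- c(10, 12, 30, 11, 9)

## The same data in three container forms
as_matrix <- cbind(pos, int)                  # two-column matrix
as_frame <- data.frame(pos = pos, int = int)  # two-column data frame
as_list <- list(x = pos, y = int)             # two-element list

## xy.coords() reduces each form to a list with x and y components
forms <- list(
  matrix = grDevices::xy.coords(as_matrix),
  data.frame = grDevices::xy.coords(as_frame),
  list = grDevices::xy.coords(as_list),
  vectors = grDevices::xy.coords(pos, int)    # two numeric vectors
)

## Check that every form yields the same coordinates
same_x <- vapply(forms, function(f) identical(f$x, pos), logical(1))
same_y <- vapply(forms, function(f) identical(f$y, int), logical(1))
```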
## X-ray diffraction
data("XRD")
## 4S Peak Filling baseline
baseline <- baseline_peakfilling(XRD, n = 10, m = 5, by = 10, sparse = TRUE)
plot(XRD, type = "l", xlab = expression(2*theta), ylab = "Count")
lines(baseline, type = "l", col = "red")
## Correct baseline
XRD <- signal_drift(XRD, lag = baseline, subtract = TRUE)
## Find peaks
peaks <- peaks_find(XRD, SNR = 3, m = 11)
plot(XRD, type = "l", xlab = expression(2*theta), ylab = "Count")
lines(peaks, type = "p", pch = 16, col = "red")
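To make the idea of baseline correction concrete, here is a minimal base R sketch on synthetic data. It is not the 4S Peak Filling algorithm and does not use alkahest: the baseline is simply assumed to be a straight line through the first and last points, and the data and variable names are hypothetical.

```r
## Synthetic signal: a Gaussian peak on top of a linear drift
x <- seq(0, 10, length.out = 201)
drift <- 2 + 0.5 * x
peak <- 10 * dnorm(x, mean = 5, sd = 0.3)
y <- peak + drift

## Crude baseline estimate: straight line through the two end points
ends <- c(1, length(x))
base_y <- approx(x = x[ends], y = y[ends], xout = x)$y

## Baseline correction: subtract the estimated baseline from the signal
corrected <- y - base_y

## Position of the highest point after correction
x_max <- x[which.max(corrected)]
```

Because the peak is negligible at both ends of the range, the straight line recovers the drift and the corrected signal is close to the peak alone. Real baselines are rarely this simple, which is why the package offers several estimation methods.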
## Simulate data
set.seed(12345)
x <- seq(-4, 4, length.out = 100)
y <- dnorm(x)
z <- y + rnorm(100, mean = 0, sd = 0.01) # Add some noise
## Plot raw data
plot(x, z, type = "l", xlab = "", ylab = "", main = "Raw data")
lines(x, y, type = "l", lty = 2, col = "red")
## Savitzky–Golay filter
smooth <- smooth_savitzky(x, z, m = 21, p = 2)
plot(smooth, type = "l", xlab = "", ylab = "", main = "Savitzky–Golay filter")
lines(x, y, type = "l", lty = 2, col = "red")
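For comparison, the simplest kind of smoother (an unweighted moving average) can be sketched in base R with stats::filter(). This is an illustration on the same simulated data, not alkahest's own implementation; the window size m = 5 is an arbitrary choice.

```r
## Same simulated data as in the example above
set.seed(12345)
x <- seq(-4, 4, length.out = 100)
y <- dnorm(x)
z <- y + rnorm(100, mean = 0, sd = 0.01)

## Unweighted moving average over a centred window of m points
m <- 5
smoothed <- as.numeric(stats::filter(z, rep(1 / m, m), sides = 2))

## The centred window is undefined at the edges, so those values are NA
keep <- !is.na(smoothed)

## Root mean square error against the noise-free curve
rmse_raw <- sqrt(mean((z[keep] - y[keep])^2))
rmse_smooth <- sqrt(mean((smoothed[keep] - y[keep])^2))
```

A wider window removes more noise but flattens the peak more, which is the trade-off that a Savitzky–Golay filter addresses by fitting a local polynomial instead of a plain mean.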
Contributing
Please note that the alkahest project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
References
De Rooi, Johan J., Niek M. Van Der Pers, Ruud W. A. Hendrikx, Rob Delhez, Amarante J. Böttger, and Paul H. C. Eilers. 2014. “Smoothing of X-ray Diffraction Data and Kα2 Elimination Using Penalized Likelihood and the Composite Link Model.” Journal of Applied Crystallography 47 (3): 852–60. https://doi.org/10.1107/S1600576714005809.
Eilers, Paul H. C. 2003. “A Perfect Smoother.” Analytical Chemistry 75 (14): 3631–36. https://doi.org/10.1021/ac034173t.
Eilers, Paul H. C., and Hans F. M. Boelens. 2005. “Baseline Correction with Asymmetric Least Squares Smoothing.” October 21, 2005.
Gorry, Peter A. 1990. “General Least-Squares Smoothing and Differentiation by the Convolution (Savitzky-Golay) Method.” Analytical Chemistry 62 (6): 570–73. https://doi.org/10.1021/ac00205a007.
Kneen, M. A., and H. J. Annegarn. 1996. “Algorithm for Fitting XRF, SEM and PIXE X-ray Spectra Backgrounds.” Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 109–110 (April): 209–13. https://doi.org/10.1016/0168-583X(95)00908-6.
Lieber, Chad A., and Anita Mahadevan-Jansen. 2003. “Automated Method for Subtraction of Fluorescence from Biological Raman Spectra.” Applied Spectroscopy 57 (11): 1363–67. https://doi.org/10.1366/000370203322554518.
Liland, Kristian Hovde. 2015. “4S Peak Filling – Baseline Estimation by Iterative Mean Suppression.” MethodsX 2: 135–40. https://doi.org/10.1016/j.mex.2015.02.009.
Morháč, Miroslav, Ján Kliman, Vladislav Matoušek, Martin Veselský, and Ivan Turzo. 1997. “Background Elimination Methods for Multidimensional Coincidence γ-Ray Spectra.” Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 401 (1): 113–32. https://doi.org/10.1016/S0168-9002(97)01023-1.
Morháč, Miroslav, and Vladislav Matoušek. 2008. “Peak Clipping Algorithms for Background Estimation in Spectroscopic Data.” Applied Spectroscopy 62 (1): 91–106. https://doi.org/10.1366/000370208783412762.
Ryan, C. G., E. Clayton, W. L. Griffin, S. H. Sie, and D. R. Cousens. 1988. “SNIP, a Statistics-Sensitive Background Treatment for the Quantitative Analysis of PIXE Spectra in Geoscience Applications.” Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 34 (3): 396–402. https://doi.org/10.1016/0168-583X(88)90063-8.
Savitzky, Abraham, and M. J. E. Golay. 1964. “Smoothing and Differentiation of Data by Simplified Least Squares Procedures.” Analytical Chemistry 36 (8): 1627–39. https://doi.org/10.1021/ac60214a047.