Description

Dereplicate and Cherry-Pick Mass Spectrometry Spectra.

Convenient wrapper functions for the analysis of matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) spectra data, used to select only representative spectra (a step also called cherry-picking). The package covers the preprocessing and dereplication steps (based on Strejcek, Smrhova, Junkova and Uhlik (2018) <doi:10.3389/fmicb.2018.01294>) needed to cluster MALDI-TOF spectra before the final cherry-picking step. It enables the easy exclusion of spectra and/or clusters to accommodate complex cherry-picking strategies. Alternatively, cherry-picking using taxonomic identification MALDI-TOF data is made easy with functions to import inconsistently formatted reports.

maldipickr

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

  • Are you using the MALDI-TOF[^1] Biotyper to identify bacterial isolates? Yes
  • Do you want to select representative isolates for further experiments? Yes
  • Do you need fast and automated selection decisions that you can retrace? Yes

The {maldipickr} package is right for your needs! The documented and tested R functions will help you dereplicate MALDI-TOF data and cherry-pick representative spectra of microbial isolates.

Quickstart

How to cherry-pick bacterial isolates with MALDI Biotyper:

Using spectra data

library(maldipickr)
# Set up the directory location of your spectra data
spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")

# Import and process the spectra
processed <- spectra_dir %>%
  import_biotyper_spectra() %>%
  process_spectra()

# Delineate spectra clusters using cosine similarity
#  and cherry-pick one representative spectrum per cluster.
#  The chosen ones are indicated by the `to_pick` column
processed %>%
  list() %>%
  merge_processed_spectra() %>%
  coop::tcosine() %>%
  delineate_with_similarity(threshold = 0.92) %>%
  set_reference_spectra(processed$metadata) %>%
  pick_spectra() %>%
  dplyr::relocate(name, to_pick)
#> # A tibble: 6 × 7
#>   name         to_pick membership cluster_size   SNR peaks is_reference
#>   <chr>        <lgl>        <int>        <int> <dbl> <dbl> <lgl>       
#> 1 species1_G2  FALSE            1            4  5.09    21 FALSE       
#> 2 species2_E11 FALSE            2            2  5.54    22 FALSE       
#> 3 species2_E12 TRUE             2            2  5.63    23 TRUE        
#> 4 species3_F7  FALSE            1            4  4.89    26 FALSE       
#> 5 species3_F8  TRUE             1            4  5.56    25 TRUE        
#> 6 species3_F9  FALSE            1            4  5.40    25 FALSE
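
To make the similarity step more concrete, here is a minimal base R sketch of what cosine similarity between spectra and a 0.92 threshold amount to. The intensity values are made up, and the grouping loop is only a simple stand-in (spectra linked at or above the threshold end up in the same group); it is an assumption for illustration and not necessarily how `delineate_with_similarity()` is implemented.

```r
# Toy intensity matrix: one row per spectrum, one column per peak bin.
# The values are invented for illustration only.
intensities <- rbind(
  s1 = c(10, 0, 5, 0),
  s2 = c(9, 1, 5, 0),
  s3 = c(0, 8, 0, 7)
)

# Cosine similarity between rows: dot products divided by the row norms.
row_norms <- sqrt(rowSums(intensities^2))
similarity <- tcrossprod(intensities) / outer(row_norms, row_norms)

# Link spectra whose similarity reaches the threshold, then label the
# connected groups (assumed stand-in for the package's delineation step).
threshold <- 0.92
linked <- similarity >= threshold
membership <- integer(nrow(linked))
for (i in seq_len(nrow(linked))) {
  if (membership[i] == 0L) {
    # Grow the group from spectrum i until no new spectrum is reachable.
    group <- i
    repeat {
      reachable <- which(colSums(linked[group, , drop = FALSE]) > 0)
      if (all(reachable %in% group)) break
      group <- union(group, reachable)
    }
    membership[group] <- max(membership) + 1L
  }
}
names(membership) <- rownames(intensities)
membership
```

In this toy case, `s1` and `s2` share most of their peaks and land in the same group, while `s3` forms its own group.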

Using taxonomic identification report

library(maldipickr)
# Import Biotyper CSV report
#  and glimpse at the table
report_tbl <- read_biotyper_report(
  system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
  dplyr::select(name, bruker_species, bruker_log)
#> # A tibble: 4 × 3
#>   name              bruker_species               bruker_log
#>   <chr>             <chr>                             <dbl>
#> 1 unknown_isolate_1 not reliable identification        1.33
#> 2 unknown_isolate_2 not reliable identification        1.4 
#> 3 unknown_isolate_3 Faecalibacterium prausnitzii       1.96
#> 4 unknown_isolate_4 Faecalibacterium prausnitzii       2.07

# Delineate clusters from the identifications
#   and cherry-pick one representative spectrum per cluster.
#   The chosen ones are indicated by the `to_pick` column
report_tbl %>%
  delineate_with_identification() %>%
  pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
  dplyr::relocate(name, to_pick, bruker_species)
#> Generating clusters from single report
#> # A tibble: 4 × 11
#>   name       to_pick bruker_species membership cluster_size sample_name hit_rank
#>   <chr>      <lgl>   <chr>               <int>        <int> <chr>          <int>
#> 1 unknown_i… TRUE    not reliable …          2            1 <NA>               1
#> 2 unknown_i… TRUE    not reliable …          3            1 <NA>               1
#> 3 unknown_i… FALSE   Faecalibacter…          1            2 <NA>               1
#> 4 unknown_i… TRUE    Faecalibacter…          1            2 <NA>               1
#> # ℹ 4 more variables: bruker_quality <chr>, bruker_taxid <dbl>,
#> #   bruker_hash <chr>, bruker_log <dbl>
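
The selection logic in this example can be sketched in base R as "within each cluster, keep the row with the highest value of the criteria column". The data frame below copies the `membership` and `bruker_log` values printed above, assuming the rows follow the report order; the `ave()` approach is an illustration only and not the package's own implementation.

```r
# Values copied from the report and cluster output shown above
# (row order assumed to match the report).
clusters <- data.frame(
  name = paste0("unknown_isolate_", 1:4),
  membership = c(2L, 3L, 1L, 1L),
  bruker_log = c(1.33, 1.40, 1.96, 2.07)
)

# Within each cluster, flag the row with the highest criteria value.
# ave() returns numeric 0/1 here, so compare to 1 to get a logical column.
clusters$to_pick <- ave(
  clusters$bruker_log,
  clusters$membership,
  FUN = function(x) seq_along(x) == which.max(x)
) == 1
clusters
```

Singleton clusters are always picked, and in the two-isolate cluster only the isolate with the higher log score is flagged, which agrees with the `to_pick` column in the output above.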

Installation

{maldipickr} is available on CRAN and on GitHub.

To install the latest CRAN release, use the following command in R:

install.packages("maldipickr")

To install the development version, use the following command in R:

remotes::install_github("ClavelLab/maldipickr", build_vignettes = TRUE)

Usage

The comprehensive vignettes will walk you through the package functions and showcase how to:

  1. Import spectra data and identification reports from Bruker MALDI Biotyper into R.
  2. Process, dereplicate and cherry-pick representative spectra, from simple to complex design.

Acknowledgments

This R package is developed for spectra data generated by the Bruker MALDI Biotyper device. The {maldipickr} package is built from a suite of R Markdown files using the {fusen} package by Rochette S (2023). It relies on:

  1. the {MALDIquant} package from Gibb & Strimmer (2012) for spectra functions
  2. the work of Strejcek et al. (2018) for the dereplication procedure.

Disclaimer

The developers of this package are part of the Clavel Lab and are not affiliated with the company Bruker. This package is therefore independent of the company and is distributed under the GPL-3.0 License.

The hexagonal logo was created by Charlie Pauvert and uses the Hack font and a color palette generated at https://coolors.co.

References

[^1]: Matrix-Assisted Laser Desorption/Ionization-Time-Of-Flight (MALDI-TOF)

Metadata

Version

1.3.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows