Chemical Metrics for Microbial Communities.
chem16S calculates chemical metrics for microbial communities by combining taxonomic abundances with genomic reference sequences for proteins. The chemical representation of communities has applications ranging from human microbiomes to Earth-life coevolution.
Read the paper in Bioinformatics: chem16S: community-level chemical metrics for exploring genomic adaptation to environments.
View the manual and vignettes: Chemical metrics of reference proteomes for taxa, Integration of chem16S with phyloseq, and Plotting two chemical metrics.
Supported input formats:
- phyloseq-class objects created using phyloseq
- RDP Classifier
Supported reference databases:
- Genome Taxonomy Database (GTDB release 214)
- NCBI Reference Sequence Database (RefSeq release 206)
Description
The chem16S R package combines taxonomic classifications of high-throughput 16S rRNA gene sequences with precomputed amino acid compositions of reference proteomes for archaea and bacteria to obtain the amino acid compositions of community reference proteomes. Chemical metrics of community reference proteomes such as carbon oxidation state (Z_C) and stoichiometric hydration state (nH2O) reveal new types of adaptations of microbial genomes to environmental conditions. For instance, an association of lower nH2O with higher salinity in the Baltic Sea suggests a genomically encoded dehydration trend:
PSU stands for practical salinity units. The sequence data analyzed for this plot were taken from Herlemann et al. (2016), and the code to make this plot is available in the help page for chem16S::plot_metrics.
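The way taxonomic abundances and reference proteomes combine into a community-level metric can be illustrated with a small base R sketch. This is not the chem16S implementation and does not call any chem16S functions: the two taxa, their amino acid counts, and the abundances are hypothetical, and only four amino acids are included to keep the example short. It assumes that the community amino acid composition is the abundance-weighted sum of the taxon compositions and uses the standard formula for carbon oxidation state of a neutral molecule with c carbon, h hydrogen, n nitrogen, o oxygen, and s sulfur atoms: Z_C = (-h + 3n + 2o + 2s) / c.

```r
# Elemental formulas of four amino acids (standard chemical formulas)
aa_formula <- rbind(
  Ala = c(C = 3, H = 7,  N = 1, O = 2, S = 0),
  Gly = c(C = 2, H = 5,  N = 1, O = 2, S = 0),
  Leu = c(C = 6, H = 13, N = 1, O = 2, S = 0),
  Asp = c(C = 4, H = 7,  N = 1, O = 4, S = 0)
)

# Hypothetical amino acid counts in the reference proteomes of two taxa
taxon_aa <- rbind(
  taxonA = c(Ala = 10, Gly = 10, Leu = 10, Asp = 10),
  taxonB = c(Ala = 10, Gly = 5,  Leu = 20, Asp = 5)
)

# Hypothetical classification counts for each taxon in one sample
abundance <- c(taxonA = 30, taxonB = 10)

# Community reference proteome: abundance-weighted sum of taxon compositions
community_aa <- drop(abundance[rownames(taxon_aa)] %*% taxon_aa)

# Elemental composition of the community reference proteome
community_elem <- drop(community_aa[rownames(aa_formula)] %*% aa_formula)

# Carbon oxidation state from elemental counts (neutral species)
carbon_oxidation_state <- function(elem) {
  (-elem[["H"]] + 3 * elem[["N"]] + 2 * elem[["O"]] + 2 * elem[["S"]]) / elem[["C"]]
}

community_Zc <- carbon_oxidation_state(community_elem)
print(community_Zc)  # -900 / 6300, about -0.143
```

The sketch uses formulas of free amino acids rather than residues in a polypeptide. This does not affect the result, because removing H2O for each peptide bond changes the numerator by 2 - 2 = 0 and leaves the carbon count unchanged. Stoichiometric hydration state (nH2O) is not covered here because it depends on a choice of basis species.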
Methods
Scripts in the GTDB_214 and RefSeq_206 directories were used to generate reference proteomes for genus- and higher-level archaeal and bacterial taxa (and viruses for RefSeq).
It is recommended to use 16S rRNA sequences from GTDB for taxonomic classification (files are available for DADA2 and the RDP Classifier) so that taxonomic assignments can be automatically matched to GTDB reference proteomes available in chem16S.
For taxonomic classifications made using the RDP training set (No. 18 07/2020, used in RDP Classifier version 2.13), chem16S includes manual mappings to the NCBI taxonomy described by Dick and Tan (2023).
Installation
First install phyloseq from Bioconductor:
if(!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("phyloseq")
Then install the release version of chem16S from CRAN:
install.packages("chem16S")
Or use install_github from remotes or devtools to install the development version of chem16S from GitHub:
if(!require("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("jedick/chem16S", build_vignettes = TRUE)