Title
MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format.
Description
Provides the 'Molecular Signatures Database' (MSigDB) gene sets typically used with the 'Gene Set Enrichment Analysis' (GSEA) software (Subramanian et al. 2005 <doi:10.1073/pnas.0506580102>, Liberzon et al. 2015 <doi:10.1016/j.cels.2015.12.004>) in a standard R data frame with key-value pairs. The package includes the human genes as listed in MSigDB as well as the corresponding symbols and IDs for frequently studied model organisms such as mouse, rat, pig, fly, and yeast.
README.md
msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format
Overview
The msigdbr R package provides Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software:
- in an R-friendly tidy/long format with one gene per row
- for multiple frequently studied model organisms, such as mouse, rat, pig, zebrafish, fly, and yeast, in addition to the original human genes
- as gene symbols as well as NCBI Entrez and Ensembl IDs
- that can be installed and loaded as a package without requiring additional external files
Installation
The package can be installed from CRAN.
install.packages("msigdbr")
Usage
The package data can be accessed using the msigdbr() function, which returns a data frame of gene sets and their member genes. For example, you can retrieve mouse genes from the C2 (curated) CGP (chemical and genetic perturbations) gene sets.
library(msigdbr)
genesets = msigdbr(species = "Mus musculus", category = "C2", subcategory = "CGP")
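Because the result has one gene per row, it often needs to be reshaped before it is passed to an enrichment tool. The following is a minimal sketch of that step using only base R on a small hand-made data frame, so it runs without the package data. The column names gs_name and gene_symbol are assumed to match those in the msigdbr() output; the set names and gene assignments are invented purely for illustration.

```r
# Toy stand-in for msigdbr() output: one gene per row.
# Set names and memberships are made up for illustration.
toy_genesets = data.frame(
  gs_name = c("SET_A", "SET_A", "SET_B", "SET_B", "SET_B"),
  gene_symbol = c("Trp53", "Myc", "Myc", "Egfr", "Kras"),
  stringsAsFactors = FALSE
)

# Collapse the long table into a named list of character vectors,
# with one list element per gene set.
geneset_list = split(x = toy_genesets$gene_symbol, f = toy_genesets$gs_name)

# Count the member genes in each set.
set_sizes = lengths(geneset_list)
print(set_sizes)
```

The same split() call should work on a real msigdbr() result if those two columns are present. Whether a list or the original data frame is the better input depends on the downstream tool.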
Check the documentation website for more information.