Description

Access and Format Single-Cell RNA-Seq Datasets from Public Resources.

The goal of 'scfetch' is to access and format single-cell RNA-seq datasets. It can download single-cell RNA-seq datasets from widely used public resources, including GEO <https://www.ncbi.nlm.nih.gov/geo/>, Zenodo <https://zenodo.org/>, CELLxGENE <https://cellxgene.cziscience.com/>, Human Cell Atlas <https://www.humancellatlas.org/>, PanglaoDB <https://panglaodb.se/index.html> and UCSC Cell Browser <https://cells.ucsc.edu/>. It can also convert objects between SeuratObject <https://satijalab.org/seurat/>, loom <http://loompy.org/>, h5ad <https://scanpy.readthedocs.io/en/stable/>, SingleCellExperiment <https://bioconductor.org/packages/release/bioc/html/scran.html>, CellDataSet <http://cole-trapnell-lab.github.io/monocle-release/> and cell_data_set <https://cole-trapnell-lab.github.io/monocle3/>.

scfetch - Access and Format Single-cell RNA-seq Datasets from Public Resources

Introduction

scfetch is designed to help users download and prepare single-cell datasets from public resources more quickly. It can be used to:

  • Download fastq files from GEO/SRA and format them in the standard style recognized by 10x software (e.g. CellRanger).
  • Download bam files from GEO/SRA, including original 10x-generated bam files (with custom tags) and normal bam files, and convert bam files to fastq files.
  • Download scRNA-seq matrices and annotation (e.g. cell type) information from GEO, PanglaoDB and UCSC Cell Browser, and load the downloaded matrices into Seurat.
  • Download processed objects from Zenodo, CELLxGENE and Human Cell Atlas.
  • Convert between widely used single-cell object formats (SeuratObject, AnnData, SingleCellExperiment, CellDataSet/cell_data_set and loom).
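
To make the "10x standard style" in the first item concrete, here is a minimal shell sketch of the file naming that Cell Ranger expects (`<sample>_S1_L001_R1_001.fastq.gz`). It is not how scfetch's SplitSRA is implemented. It assumes the input files are named `<run>_1.fastq.gz` and `<run>_2.fastq.gz`, that `_1` corresponds to R1 and `_2` to R2, and that the sample index `S1` and lane `L001` are fixed placeholders. The function names `tenx_name` and `rename_to_tenx` are hypothetical.

```shell
# Hypothetical helper: build a Cell Ranger style file name.
# Assumption: read index 1 maps to R1 and read index 2 maps to R2.
tenx_name() {
  local sample="$1" read_idx="$2" lane="${3:-1}"
  printf '%s_S1_L%03d_R%s_001.fastq.gz\n' "$sample" "$lane" "$read_idx"
}

# Rename every <run>_1.fastq.gz / <run>_2.fastq.gz in a directory
# to the 10x naming pattern.
rename_to_tenx() {
  local dir="$1" f base run idx
  for f in "$dir"/*_[12].fastq.gz; do
    [ -e "$f" ] || continue        # skip when the glob matches nothing
    base="$(basename "$f" .fastq.gz)"
    run="${base%_*}"               # e.g. SRR0000001
    idx="${base##*_}"              # 1 or 2
    mv -- "$f" "$dir/$(tenx_name "$run" "$idx")"
  done
}
```

Calling `rename_to_tenx ./fastq` would rename the files in place. Runs that also contain index reads, or whose read order differs, need a different mapping, so check the read lengths before relying on this.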

Framework

[Figure: scfetch framework]

Installation

You can install the development version of scfetch from GitHub with:

# install.packages("devtools")
devtools::install_github("showteeth/scfetch")

For installation issues, please refer to INSTALL.md.

For data structure conversion, scfetch requires several Python packages, which you can install with:

# install python packages
conda install -c bioconda loompy anndata
# or
pip install anndata loompy
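
After installing, it can help to confirm that both packages are importable. The following is a minimal sketch, not part of scfetch: `check_py_module` and the `PYTHON_BIN` variable are hypothetical names, and the check assumes that the interpreter being tested is the same one your R session uses, which may not be the case if several Python environments are installed.

```shell
# Report whether a Python module can be imported.
# PYTHON_BIN is a hypothetical override; it defaults to python3.
check_py_module() {
  local module="$1" py="${PYTHON_BIN:-python3}"
  if "$py" -c "import $module" >/dev/null 2>&1; then
    echo "$module: ok"
  else
    echo "$module: missing"
    return 1
  fi
}

# Check the two packages named in the install commands above.
for m in anndata loompy; do
  check_py_module "$m" || true
done
```

To test a specific environment, set the interpreter explicitly, for example `PYTHON_BIN=/path/to/env/bin/python` before running the loop.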

Usage

Vignette

Detailed usage is available on the website.


Function list

| Type | Function | Usage |
|------|----------|-------|
| Download and format fastq | ExtractRun | Extract runs with GEO accession number or GSM number |
| Download and format fastq | DownloadSRA | Download sra files |
| Download and format fastq | SplitSRA | Split sra files to fastq files and format to 10x standard style |
| Download and convert bam | DownloadBam | Download bam (supports 10x original bam) |
| Download and convert bam | Bam2Fastq | Convert bam files to fastq files |
| Download matrix and load to Seurat | ExtractGEOMeta | Extract sample metadata from GEO |
| Download matrix and load to Seurat | ParseGEO | Download matrix from GEO and load to Seurat |
| Download matrix and load to Seurat | ExtractPanglaoDBMeta | Extract sample metadata from PanglaoDB |
| Download matrix and load to Seurat | ExtractPanglaoDBComposition | Extract cell type composition of PanglaoDB datasets |
| Download matrix and load to Seurat | ParsePanglaoDB | Download matrix from PanglaoDB and load to Seurat |
| Download matrix and load to Seurat | ShowCBDatasets | Show all available datasets in UCSC Cell Browser |
| Download matrix and load to Seurat | ExtractCBDatasets | Extract UCSC Cell Browser datasets with attributes |
| Download matrix and load to Seurat | ExtractCBComposition | Extract cell type composition of UCSC Cell Browser datasets |
| Download matrix and load to Seurat | ParseCBDatasets | Download UCSC Cell Browser datasets and load to Seurat |
| Download objects | ExtractZenodoMeta | Extract sample metadata from Zenodo with DOIs |
| Download objects | ParseZenodo | Download rds/rdata/h5ad/loom from Zenodo with DOIs |
| Download objects | ShowCELLxGENEDatasets | Show all available datasets in CELLxGENE |
| Download objects | ExtractCELLxGENEMeta | Extract metadata of CELLxGENE datasets with attributes |
| Download objects | ParseCELLxGENE | Download rds/h5ad from CELLxGENE |
| Download objects | ShowHCAProjects | Show all available projects in Human Cell Atlas |
| Download objects | ExtractHCAMeta | Extract metadata of Human Cell Atlas projects with attributes |
| Download objects | ParseHCA | Download rds/rdata/h5/h5ad/loom from Human Cell Atlas |
| Convert between different single-cell objects | ExportSeurat | Convert SeuratObject to AnnData, SingleCellExperiment, CellDataSet/cell_data_set and loom |
| Convert between different single-cell objects | ImportSeurat | Convert AnnData, SingleCellExperiment, CellDataSet/cell_data_set and loom to SeuratObject |
| Convert between different single-cell objects | SCEAnnData | Convert between SingleCellExperiment and AnnData |
| Convert between different single-cell objects | SCELoom | Convert between SingleCellExperiment and loom |
| Summarize datasets based on attributes | StatDBAttribute | Summarize datasets in PanglaoDB, UCSC Cell Browser and CELLxGENE based on attributes |

Contact

For any questions, feature requests or bug reports, please write an email to [email protected].


Code of Conduct

Please note that the scfetch project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.


Metadata

Version

0.5.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows