Description

A Comprehensive Interface for Accessing the Protein Data Bank.

Streamlines the interaction with the RCSB Protein Data Bank (PDB) <https://www.rcsb.org/>. This interface offers an intuitive and powerful tool for searching and retrieving a diverse range of data types from the PDB. It includes advanced functionalities like BLAST and sequence motif queries. Built upon the existing XML-based API of the PDB, it simplifies the creation of custom requests, thereby enhancing usability and flexibility for researchers.

rPDBapi: A Comprehensive R Package Interface for Accessing the Protein Data Bank

Introduction

rPDBapi is an R package designed to provide seamless access to the RCSB Protein Data Bank (PDB). It simplifies the retrieval and analysis of 3D structural data of large biological molecules, essential for bioinformatics and structural biology research. This package leverages the PDB's XML-based API to facilitate custom queries, data retrieval, and advanced search capabilities within the R programming environment.

Features

  • User-Friendly Interface: Simplifies access to PDB data for the R community.
  • Custom Queries: Streamlines the process of crafting custom queries for efficient data retrieval.
  • Advanced Search Capabilities: Includes specialized search functions for PubMed IDs, organisms, experimental methods, protein structure similarities, and more.
  • Data Retrieval: Facilitates downloading of PDB files in various formats and extraction of FASTA sequences.
  • Integration with R: Provides functions for data manipulation and analysis directly within R, enhancing research workflows.

Installation

You can install the stable version of rPDBapi from CRAN:

install.packages("rPDBapi", repos = "https://cran.us.r-project.org")

To install the development version from GitHub:

devtools::install_github("selcukorkmaz/rPDBapi")
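In a setup script it can be convenient to skip installation when the package is already present. The following is a minimal sketch in base R, not something rPDBapi provides: `install_if_missing` and its `installer` argument are hypothetical names introduced here for illustration.

```r
# Hypothetical helper (not part of rPDBapi): install a package only if it
# cannot already be loaded. The installer is a parameter so the function can
# be exercised without touching the network.
install_if_missing <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already available, nothing was installed
  }
  installer(pkg)
  invisible(TRUE)             # an installation was attempted
}

# Typical call in a setup script (left commented out so that sourcing this
# block has no side effects):
# install_if_missing("rPDBapi")
```

The function returns `FALSE` when the package was already available and `TRUE` when it called the installer, so a script can log which case occurred.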

Usage

Loading the Package

library(rPDBapi)

Retrieving PDB IDs

Retrieve PDB IDs related to a specific term, such as "hemoglobin":

pdbs <- query_search(search_term = "hemoglobin")
head(pdbs)

Advanced Searches

Search by PubMed ID:

pdbs <- query_search(search_term = 32453425, query_type = "PubmedIdQuery")
pdbs

Search by source organism:

pdbs <- query_search(search_term = '7227', query_type = 'TreeEntityQuery')
head(pdbs)

Search by experimental method:

pdbs <- query_search(search_term = 'SOLID-STATE NMR', query_type='ExpTypeQuery')
head(pdbs)

Data Retrieval

Fetch data based on user-defined IDs and properties:

properties <- list(rcsb_entry_info = c("molecular_weight"), exptl = "method", rcsb_accession_info = "deposit_date")
ids <- query_search("CRISPR")
df <- data_fetcher(id = ids, data_type = "ENTRY", properties = properties, return_as_dataframe = TRUE)
df
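Judging from the example above, properties is a named list in which each name is a section of the entry data and each value is a character vector of fields in that section. If the wanted fields are kept as "section.field" strings, a small helper can assemble that list. This is a minimal sketch in base R: `make_properties` is a hypothetical helper, not a function of rPDBapi, and the dotted notation is an assumption made only for this example.

```r
# Hypothetical helper (not part of rPDBapi): turn "section.field" strings
# into a named list shaped like the `properties` argument shown above.
make_properties <- function(paths) {
  parts <- strsplit(paths, ".", fixed = TRUE)
  if (any(lengths(parts) != 2)) {
    stop("each path must have the form 'section.field'")
  }
  sections <- vapply(parts, `[`, character(1), 1)
  fields   <- vapply(parts, `[`, character(1), 2)
  # Group fields by section, keeping sections in order of first appearance.
  split(fields, factor(sections, levels = unique(sections)))
}

properties <- make_properties(c(
  "rcsb_entry_info.molecular_weight",
  "exptl.method",
  "rcsb_accession_info.deposit_date"
))
str(properties)
```

The resulting list has the same names and values as the hand-written properties in the example above, so it could be passed to data_fetcher in the same way.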

Describing Chemical Compounds

Retrieve comprehensive descriptions of chemical compounds:

chem_desc <- describe_chemical('ATP')
chem_desc$rcsb_chem_comp_descriptor$smiles

Retrieving PDB Files

Download PDB files in various formats:

pdb_file <- get_pdb_file(pdb_id = "4HHB", filetype = "cif")
head(pdb_file$atom)
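Each download contacts the RCSB servers, so it can help to check an ID locally before requesting it. The sketch below assumes classic four-character PDB IDs (a digit from 1 to 9 followed by three letters or digits). `is_pdb_id` is a hypothetical helper written for this example and is not part of rPDBapi.

```r
# Hypothetical helper (not part of rPDBapi): TRUE for strings that look like
# classic four-character PDB IDs, e.g. "4HHB". Assumes the first character is
# a digit 1-9 and the remaining three are alphanumeric.
is_pdb_id <- function(x) {
  grepl("^[1-9][A-Za-z0-9]{3}$", x)
}

candidates <- c("4HHB", "1abc", "hemoglobin", "HHB4", "4HH")
valid <- candidates[is_pdb_id(candidates)]
print(valid)
```

Only the entries that pass the check would then be handed to get_pdb_file, one ID at a time.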

Additional Functions

  • get_info: Retrieve detailed information about a specific PDB entry.
  • get_fasta_from_rcsb_entry: Fetch FASTA sequences for specified PDB entry IDs.

Documentation

For more detailed examples and usage, please refer to the package documentation.

Authors

  • Selcuk Korkmaz - Trakya University, Department of Biostatistics
  • Bilge Eren Yamasan - Trakya University, Department of Biophysics

License

This package is licensed under the MIT License.

Metadata

Version

1.3

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows