Description

Root Exudate Feature Toolkit.

Provides tools for molecule-oriented and reaction-centred analysis of root exudate datasets. It supports structural matching based on 'PubChem', calculation of molecular descriptors, and inference of candidate microbe-associated metabolic reactions using Kyoto Encyclopedia of Genes and Genomes ('KEGG') identifiers and Enzyme Commission ('EC') numbers. For background on these databases, see Kanehisa et al. (2023) <doi:10.1093/nar/gkac963> and Kim et al. (2023) <doi:10.1093/nar/gkac956>.

REFT

REFT (Root Exudate Feature Toolkit) is an R package for molecule-oriented analysis of root exudate and metabolomics annotation tables. It supports batch database searching, SMILES matching, and calculation of six molecular descriptors.

Main features

  • Read Excel annotation tables
  • Match records against PubChem, trying the fields in the order Name -> Other_name(Kegg_name) -> Kegg_ID -> HMDB_ID
  • Return a matching log and unmatched records
  • Optionally export Excel output files when output_dir is explicitly supplied
  • Calculate six molecular descriptors using rcdk:
    • MW
    • nHBAcc
    • ALogP
    • FMF
    • HybRatio
    • nAcid
  • Query KEGG reactions linked to EC numbers for microbe-associated reaction annotation
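
The matching order listed above can be read as a fallback chain: each field is tried in turn and the first hit wins. The sketch below illustrates that idea only. It is not REFT's internal code; lookup_smiles(), match_record(), and the mock_pubchem table are hypothetical stand-ins for a real PubChem query, so the example runs without network access.

```r
# Hypothetical lookup table standing in for PubChem (illustrative values only).
mock_pubchem <- c(
  "ethanol" = "CCO",
  "C00469"  = "CCO"
)

# Hypothetical helper: return a SMILES string for a query, or NA if no match.
lookup_smiles <- function(query) {
  if (is.null(query) || is.na(query) || !nzchar(query)) {
    return(NA_character_)
  }
  unname(mock_pubchem[query])
}

# Try each field in the documented order and stop at the first match.
match_record <- function(record,
                         fields = c("Name", "Other_name(Kegg_name)",
                                    "Kegg_ID", "HMDB_ID")) {
  for (field in fields) {
    smiles <- lookup_smiles(record[[field]])
    if (!is.na(smiles)) {
      return(list(smiles = smiles, matched_by = field))
    }
  }
  list(smiles = NA_character_, matched_by = NA_character_)
}

record <- list(
  Name = "unknown compound",
  `Other_name(Kegg_name)` = NA_character_,
  Kegg_ID = "C00469",
  HMDB_ID = ""
)

hit <- match_record(record)
print(hit$matched_by)
```

Recording which field produced the match, as matched_by does here, is one way a matching log could be assembled; the actual log columns in REFT may differ.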

Install dependencies

install.packages(c(
  "readxl", "dplyr", "purrr", "stringr", "tibble",
  "writexl", "webchem", "rcdk", "rcdklibs"
))
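
Before installing, it can help to see which of these packages are already present. The following is a minimal base-R sketch; the variable names are illustrative and nothing here is part of REFT.

```r
# Dependencies as listed in the install command above.
deps <- c(
  "readxl", "dplyr", "purrr", "stringr", "tibble",
  "writexl", "webchem", "rcdk", "rcdklibs"
)

# system.file() returns "" for packages that are not installed,
# and it does not load the package (so Java is not touched here).
installed <- vapply(
  deps,
  function(pkg) nzchar(system.file(package = pkg)),
  logical(1)
)

missing_pkgs <- deps[!installed]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```

Passing missing_pkgs to install.packages() would then install only what is absent.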

Install REFT

Option 1: Install from CRAN

install.packages("REFT")

Option 2: Install from a source tarball

install.packages("REFT_0.1.4.tar.gz", repos = NULL, type = "source")

Quick start

By default, reft_run() and reft_run_simple() return results in R and do not write files.

library(REFT)

res <- reft_run_simple(
  input_file = "example_root_exudate.xlsx"
)

head(res$descriptors)

To write output files, explicitly provide an output directory. In examples, use tempdir() or another user-chosen location.

res <- reft_run_simple(
  input_file = "example_root_exudate.xlsx",
  output_dir = tempdir()
)
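
The file name example_root_exudate.xlsx above is a placeholder. A small test table could be built as sketched below, assuming the input uses the default column names shown under "Custom column names". The rows and identifier values are illustrative and should be checked before real use; whether REFT expects additional columns is not stated here.

```r
# Illustrative annotation table with the four identifier columns.
# check.names = FALSE keeps the parentheses in "Other_name(Kegg_name)".
example_input <- data.frame(
  Name = c("citric acid", "malic acid"),
  `Other_name(Kegg_name)` = c("Citrate", NA),
  Kegg_ID = c("C00158", NA),
  HMDB_ID = c("HMDB0000094", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write to a temporary location only if writexl is available.
input_path <- file.path(tempdir(), "example_root_exudate.xlsx")
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(example_input, input_path)
}

print(names(example_input))
```

The resulting input_path could then be passed as input_file.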

Custom column names

library(REFT)

res <- reft_run(
  input_file = "your_data.xlsx",
  name_col = "Name",
  other_col = "Other_name(Kegg_name)",
  hmdb_col = "HMDB_ID",
  kegg_col = "Kegg_ID",
  output_dir = tempdir()
)

KEGG-microbe workflow

res <- reft_kegg_microbe_run(
  input_file = "microbe_ec.csv",
  output_dir = tempdir()
)

head(res$results)
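
Because this workflow is driven by EC numbers, a quick format check on the input can catch typos before any KEGG query is made. The sketch below is an assumption-laden illustration: the column names Microbe and EC are hypothetical (the required layout of microbe_ec.csv is not described here), and the pattern accepts only the common four-field form with optional "-" placeholders.

```r
# Hypothetical input table; the real column names may differ.
microbe_ec <- data.frame(
  Microbe = c("Strain_A", "Strain_B", "Strain_C"),
  EC = c("1.1.1.1", "2.7.1.-", "not-an-ec"),
  stringsAsFactors = FALSE
)

# Four dot-separated fields; fields 2 to 4 may be "-" for partial EC numbers.
# Preliminary serials such as "n1" are deliberately not covered by this sketch.
ec_pattern <- "^[0-9]+\\.([0-9]+|-)\\.([0-9]+|-)\\.([0-9]+|-)$"

microbe_ec$ec_ok <- grepl(ec_pattern, microbe_ec$EC)

print(microbe_ec[!microbe_ec$ec_ok, ])
```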

Optional PubChem cache

If you want to cache PubChem results, explicitly choose a path. In examples or tests, use tempdir().

options(REFT.pubchem_cache_file = file.path(tempdir(), "REFT_pubchem_cache.rds"))
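
To illustrate what an option-driven file cache of this kind generally looks like, here is a minimal base-R sketch. It shows the pattern only: read_cache() and write_cache() are hypothetical helpers, and the structure REFT actually stores in the .rds file is not documented here and may differ.

```r
# Point the option at a temporary file, as in the example above.
cache_file <- file.path(tempdir(), "REFT_pubchem_cache.rds")
options(REFT.pubchem_cache_file = cache_file)

# Hypothetical helper: load the cache if the file exists, else start empty.
read_cache <- function(path = getOption("REFT.pubchem_cache_file")) {
  if (!is.null(path) && file.exists(path)) readRDS(path) else list()
}

# Hypothetical helper: persist the cache only when a path was chosen.
write_cache <- function(cache, path = getOption("REFT.pubchem_cache_file")) {
  if (!is.null(path)) saveRDS(cache, path)
  invisible(cache)
}

cache <- read_cache()
cache[["ethanol"]] <- "CCO"   # illustrative query -> SMILES entry
write_cache(cache)

reloaded <- read_cache()
print(reloaded[["ethanol"]])
```

Leaving the option unset means no path is available, so nothing is written; this mirrors the explicit opt-in described above.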

Output files

When output_dir is explicitly supplied, reft_run() writes:

  • metabolites_6_descriptors.xlsx
  • unmatched_smiles.xlsx
  • pubchem_match_log.xlsx

When output_dir is explicitly supplied, reft_kegg_microbe_run() writes:

  • microbe_ec_kegg_reactions.xlsx

Returned object

Both reft_run() and reft_run_simple() return a named list:

  • descriptors: final result table containing SMILES and six descriptors
  • unmatched: records that were not matched to SMILES
  • match_log: PubChem matching log

reft_kegg_microbe_run() returns a named list:

  • results: final microbe-EC-reaction table
  • ec_to_reaction: EC-to-reaction mapping table
  • reaction_details: reaction detail table
  • compound_table: compound formula table
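
As a rough illustration of working with the list returned by reft_run(), the sketch below builds a mock object with the same three element names and computes a match rate. The column names and values inside each table are assumptions for illustration, as is the premise that descriptors holds only matched records.

```r
# Mock result with the documented element names; contents are illustrative.
res <- list(
  descriptors = data.frame(
    Name = c("ethanol", "acetic acid", "methanol"),
    SMILES = c("CCO", "CC(=O)O", "CO"),
    stringsAsFactors = FALSE
  ),
  unmatched = data.frame(Name = "unknown compound", stringsAsFactors = FALSE),
  match_log = data.frame(
    Name = c("ethanol", "acetic acid", "methanol", "unknown compound"),
    matched = c(TRUE, TRUE, TRUE, FALSE),
    stringsAsFactors = FALSE
  )
)

# Share of input records that received a SMILES string.
n_matched <- nrow(res$descriptors)
n_unmatched <- nrow(res$unmatched)
match_rate <- n_matched / (n_matched + n_unmatched)

print(match_rate)
```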

Notes

  • Kegg_ID and HMDB_ID values currently follow the original workflow: they are submitted to PubChem as name-based queries, so the hit rate may vary among identifiers.
  • rcdk requires a working Java environment.
  • If PubChem or KEGG network access is unstable, some records may return NA.
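
For unstable connections, one common mitigation is to retry a failed query a few times before accepting NA. The sketch below shows that pattern in base R with a simulated failure. with_retry() and flaky_query() are hypothetical and not part of REFT, and whether REFT retries internally is not stated here.

```r
# Hypothetical helper: call fun() up to max_tries times, return NA on failure.
with_retry <- function(fun, max_tries = 3, wait = 0) {
  for (i in seq_len(max_tries)) {
    result <- tryCatch(fun(), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
    if (i < max_tries) Sys.sleep(wait)  # pause between attempts
  }
  NA
}

# Simulated query that fails twice, then succeeds.
attempts <- 0
flaky_query <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("simulated network failure")
  "CCO"
}

smiles <- with_retry(flaky_query)
print(smiles)
```

In real use, a non-zero wait between attempts is kinder to public services such as PubChem and KEGG.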

Java / rcdk note

REFT does not load rcdk at package startup, so the package itself can be installed and attached without it. Only the molecular descriptor calculation step requires rcdk/rJava.

If descriptor calculation fails on Windows, check that Java is installed and that R and Java use matching architectures (for example, both 64-bit).
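
A few base-R checks can help narrow down such a problem before descriptor calculation is attempted. This is a diagnostic sketch only; it does not start a JVM and does not prove that rcdk will work.

```r
# Is a java executable on the PATH? Sys.which() returns "" when it is not.
java_path <- Sys.which("java")

# Architecture R was built for (for example "x86_64" or "aarch64").
r_arch <- R.version$arch

# JAVA_HOME is often consulted when locating Java; NA means it is unset.
java_home <- Sys.getenv("JAVA_HOME", unset = NA)

# Check that rJava is installed without loading it.
rjava_installed <- nzchar(system.file(package = "rJava"))

message("java on PATH:    ", if (nzchar(java_path)) java_path else "not found")
message("R architecture:  ", r_arch)
message("JAVA_HOME:       ", if (is.na(java_home)) "unset" else java_home)
message("rJava installed: ", rjava_installed)
```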

Metadata

Version

0.1.4

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows