RbioRXN: an R package to manage biochemical reaction data from the Rhea and MetaCyc databases

RbioRXN is an R package that facilitates retrieving and processing biochemical reaction data from databases such as Rhea, MetaCyc, BioCyc, and EcoCyc. The package provides functions to download and parse data, instantiate generic reactions, convert compound IDs between well-known biochemical databases (e.g. KEGG, PubChem, and ChEBI), and check mass balance. Its aim is to support the construction of integrated metabolic networks and genome-scale metabolic models.

RbioRXN 1.0 was released on April 11, 2013.

RbioRXN functions

Getting biochemical data into data frame objects

get.Rhea(), get.ChEBI(), parse.Rhea(), parse.ChEBI(), parse.MetaCyc.c(), parse.MetaCyc.r()

# install and load the package
> install.packages("RbioRXN")
> library(RbioRXN)

# get and parse Rhea & ChEBI data
> rhea = get.Rhea()
> parsed_rhea = parse.Rhea(owl=rhea)
> head(parsed_rhea)

> chebi = get.ChEBI()
> parsed_chebi = parse.ChEBI(owl=chebi)
> head(parsed_chebi)

# parse MetaCyc data
> parsed_metacyc.c = parse.MetaCyc.c(file="/path/to/compounds.dat")
> parsed_metacyc.r = parse.MetaCyc.r(file="/path/to/reactions.dat")
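
To give a feel for what parsing a MetaCyc-style flat file into a data frame involves, here is a minimal base-R sketch. It assumes the attribute-value layout of the .dat files ("ATTRIBUTE - value" lines, records terminated by "//"). The function name parse_flat_records and the sample records are hypothetical, and this is not the implementation used by parse.MetaCyc.c() or parse.MetaCyc.r(); real files have further conventions, such as multi-line values, that the sketch ignores.

```r
# Hypothetical miniature of an attribute-value flat file (not real MetaCyc data).
dat_lines <- c(
  "# comment line",
  "UNIQUE-ID - CPD-A",
  "COMMON-NAME - compound A",
  "//",
  "UNIQUE-ID - CPD-B",
  "COMMON-NAME - compound B",
  "INCHI - InChI=1S/H2O/h1H2",
  "//"
)

parse_flat_records <- function(lines) {
  # Drop empty lines and comment lines
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Each "//" closes a record, so the running count of terminators
  # gives the index of the record a line belongs to
  is_end <- lines == "//"
  record_id <- (cumsum(is_end) + 1L)[!is_end]
  body <- lines[!is_end]
  # Split every line at the first " - " into attribute and value
  sep <- as.integer(regexpr(" - ", body, fixed = TRUE))
  ok <- sep > 0
  attr_name <- substr(body, 1, sep - 1)[ok]
  value <- substr(body, sep + 3, nchar(body))[ok]
  record_id <- record_id[ok]
  # One row per record, one column per attribute seen in the file
  fields <- unique(attr_name)
  mat <- matrix(NA_character_, nrow = max(record_id), ncol = length(fields),
                dimnames = list(NULL, fields))
  for (i in seq_along(value)) {
    old <- mat[record_id[i], attr_name[i]]
    # Repeated attributes within a record are joined with "; "
    mat[record_id[i], attr_name[i]] <-
      if (is.na(old)) value[i] else paste(old, value[i], sep = "; ")
  }
  data.frame(mat, check.names = FALSE, stringsAsFactors = FALSE)
}

parsed <- parse_flat_records(dat_lines)
print(parsed)
```

Attributes missing from a record (INCHI for the first record here) are left as NA, so every record fits the same set of columns.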

Instantiating generic reactions

Rhea.instantiate(), BioCyc.instantiate()

# load example data
data(example)

# Rhea.instantiate(parsed_Rhea,parsed_ChEBI,Rhea_ID,multicore=1)
pC = example$parsed_ChEBI # sample ChEBI
Rg = example$Rhea_generic # sample Rhea generic reaction
data(thermo) # this is for the package 'CHNOSZ'
instanceR = Rhea.instantiate(Rg, pC, Rg[1,'ID'], multicore=1)

# BioCyc.instantiate(parsed_MetaCyc.r,parsed_MetaCyc.c,BioCyc_ID,multicore=1)
pMc = example$parsed_MetaCyc.c # sample MetaCyc compound
Mg = example$MetaCyc_generic # sample generic reaction
instanceM = BioCyc.instantiate(Mg, pMc, Mg[1,'ID'], multicore=1)
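
The idea behind instantiation can be illustrated with a small base-R sketch: each generic compound class is replaced by its concrete members, and only combinations that are mass balanced are kept. Everything here is an assumption made for illustration: the toy reaction, the membership table, and the helper names count_atoms, is_balanced, and instantiate. It ignores stoichiometric coefficients and charge, and it is not how Rhea.instantiate() or BioCyc.instantiate() are implemented.

```r
# Toy inputs (hypothetical): class membership and molecular formulas
members <- list(
  "an alcohol"  = c("methanol", "ethanol"),
  "an aldehyde" = c("formaldehyde", "acetaldehyde")
)
formula <- c(methanol = "CH4O", ethanol = "C2H6O",
             formaldehyde = "CH2O", acetaldehyde = "C2H4O",
             hydrogen = "H2")
# Generic reaction: an alcohol = an aldehyde + hydrogen
generic <- list(left = "an alcohol", right = c("an aldehyde", "hydrogen"))

# Total atom counts over a set of simple formulas such as "C2H6O"
count_atoms <- function(formulas) {
  tokens <- unlist(regmatches(formulas,
                              gregexpr("[A-Z][a-z]?[0-9]*", formulas)))
  elements <- sub("[0-9]+$", "", tokens)
  counts <- sub("^[A-Za-z]+", "", tokens)
  counts[counts == ""] <- "1"   # a bare symbol means one atom
  tapply(as.numeric(counts), elements, sum)
}

# A reaction is mass balanced when both sides hold the same atoms
is_balanced <- function(left, right) {
  l <- count_atoms(left)
  r <- count_atoms(right)
  identical(names(l), names(r)) && all(l == r)
}

instantiate <- function(rxn, members, formula) {
  classes <- intersect(unlist(rxn), names(members))
  # Every combination of one member per generic class
  combos <- expand.grid(members[classes], stringsAsFactors = FALSE)
  results <- character(0)
  for (i in seq_len(nrow(combos))) {
    choice <- setNames(unlist(combos[i, ], use.names = FALSE), classes)
    fill <- function(side) ifelse(side %in% classes, choice[side], side)
    left <- fill(rxn$left)
    right <- fill(rxn$right)
    if (is_balanced(unname(formula[left]), unname(formula[right]))) {
      results <- c(results, paste(paste(left, collapse = " + "), "=",
                                  paste(right, collapse = " + ")))
    }
  }
  results
}

instances <- instantiate(generic, members, formula)
print(instances)
```

Of the four possible substitutions, only the two that conserve every element survive, which is why a mass balance check is a natural filter when expanding generic reactions.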

Converting compound IDs to their counterparts in other chemical databases

Rhea2KEGG(), Rhea2cName(), BioCyc2KEGG(), BioCyc2PubChem(), BioCyc2ChEBI(), BioCyc2cName()

# Rhea conversion (Rhea2KEGG, Rhea2cName)
Rc = example$Rhea_conv # sample Rhea data

R2KEGG = Rhea2KEGG(pC, Rc) # ChEBI ID to KEGG ID

R2cName = Rhea2cName(pC, Rc) # ChEBI ID to compound name

# MetaCyc conversion (BioCyc2KEGG, BioCyc2PubChem, BioCyc2ChEBI, BioCyc2cName)
Mc = example$MetaCyc_conv # sample MetaCyc data
print(Mc)

B2KEGG = BioCyc2KEGG(pMc, Mc) # BioCyc ID to KEGG ID

B2ChEBI = BioCyc2ChEBI(pMc, Mc) # BioCyc ID to ChEBI ID

B2PubChem = BioCyc2PubChem(pMc, Mc) # BioCyc ID to PubChem ID

B2cName = BioCyc2cName(pMc, Mc) # BioCyc ID to compound name
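
Conceptually, each of these conversions is a lookup in a cross-reference table. The base-R sketch below shows that idea with match(). The table layout and the function name convert_ids are assumptions for illustration and do not reflect the internals of the Rhea2* or BioCyc2* functions; the last ID in the example is a made-up placeholder with no counterpart.

```r
# Small cross-reference table (layout is an assumption for this sketch)
xref <- data.frame(
  chebi = c("CHEBI:15377", "CHEBI:15422"),
  kegg  = c("C00001", "C00002"),
  name  = c("water", "ATP"),
  stringsAsFactors = FALSE
)

convert_ids <- function(ids, xref, from, to) {
  # Position of each ID in the source column (NA when unknown)
  hit <- match(ids, xref[[from]])
  converted <- xref[[to]][hit]
  # Keep the original ID where no counterpart is known
  ifelse(is.na(converted), ids, converted)
}

# "CHEBI:99999999" is a made-up placeholder that has no mapping
participants <- c("CHEBI:15422", "CHEBI:15377", "CHEBI:99999999")
kegg_ids <- convert_ids(participants, xref, from = "chebi", to = "kegg")
print(kegg_ids)
```

Leaving unmapped IDs unchanged, rather than dropping them, keeps the reaction complete and makes the gaps easy to spot afterwards.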

RbioRXN: future development

To support the discovery and design of novel biosynthetic pathways in systems and synthetic biology applications, later versions of RbioRXN will include additional functions for genome-scale model building, flux balance analysis, retrosynthetic pathway prediction, and the sharing and interchange of biochemical reaction data with existing R packages (e.g. rBiopaxParser, rsbml, or BiGGR).