RbioRXN: an R package to manage biochemical reaction data from the Rhea and MetaCyc databases

RbioRXN is an R package that facilitates retrieving and processing biochemical reaction data from databases such as Rhea, MetaCyc, BioCyc, and EcoCyc. The package provides functions to download and parse data, instantiate generic reactions, convert compound IDs among well-known biochemical databases (e.g. KEGG, PubChem, and ChEBI), and check mass balance. The package is intended to support the construction of integrated metabolic networks and genome-scale metabolic models.

RbioRXN has been released (RbioRXN 1.0, April 11, 2013)


RbioRXN functions

Getting biochemical data and loading it into data frame objects

get.Rhea(), get.ChEBI(), parse.Rhea(), parse.ChEBI(), parse.MetaCyc.c(), parse.MetaCyc.r()

# install and load the package
> install.packages("RbioRXN")
> library(RbioRXN)

# get and parse Rhea & ChEBI data
> rhea = get.Rhea()
> parsed_rhea = parse.Rhea(owl=rhea)
> head(parsed_rhea)

> chebi = get.ChEBI()
> parsed_chebi = parse.ChEBI(owl=chebi)
> head(parsed_chebi)

# parse MetaCyc data
> parsed_metacyc.c = parse.MetaCyc.c(file="/path/to/compounds.dat")
> parsed_metacyc.r = parse.MetaCyc.r(file="/path/to/reactions.dat")
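To give a feel for what parsing such a flat file involves, here is a minimal base-R sketch. It is not the package's parser. It assumes the BioCyc attribute-value layout, where each line reads "ATTRIBUTE - value", records end with "//", and comment lines start with "#". The helper name parse_flat and the toy records are hypothetical and only illustrate the idea.

```r
# Toy excerpt in the attribute-value flat-file layout (illustrative content only).
dat_lines <- c(
  "# comment line",
  "UNIQUE-ID - WATER",
  "COMMON-NAME - H2O",
  "//",
  "UNIQUE-ID - ATP",
  "COMMON-NAME - ATP",
  "//"
)

# Hypothetical helper, not part of RbioRXN: one data frame row per record.
parse_flat <- function(lines, fields = c("UNIQUE-ID", "COMMON-NAME")) {
  lines  <- lines[!grepl("^#", lines)]        # drop comment lines
  ends   <- which(lines == "//")              # record terminators
  starts <- c(1, head(ends, -1) + 1)          # first line of each record
  rows <- Map(function(s, e) {
    rec <- lines[s:(e - 1)]
    key <- sub(" - .*$", "", rec)                   # text before the first " - "
    val <- sub("^.*? - ", "", rec, perl = TRUE)     # text after the first " - "
    vapply(fields, function(f) {
      hit <- val[key == f]
      # Repeated attributes are joined; missing ones become NA.
      if (length(hit) == 0) NA_character_ else paste(hit, collapse = "///")
    }, character(1))
  }, starts, ends)
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- fields
  out
}

parsed <- parse_flat(dat_lines)
print(parsed)
```

The real .dat files carry many more attributes per record, so parse.MetaCyc.c() and parse.MetaCyc.r() remain the functions to use on actual data.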


Instantiating generic reactions

Rhea.instantiate(), BioCyc.instantiate()

# load example data
data(example)

# Rhea.instantiate(parsed_Rhea,parsed_ChEBI,Rhea_ID,multicore=1)
pC = example$parsed_ChEBI # sample ChEBI
Rg = example$Rhea_generic # sample Rhea generic reaction
data(thermo) # this is for the package 'CHNOSZ'
instanceR = Rhea.instantiate(Rg, pC, Rg[1,'ID'], multicore=1)
print(instanceR)

# BioCyc.instantiate(parsed_MetaCyc.r,parsed_MetaCyc.c,BioCyc_ID,multicore=1)
pMc = example$parsed_MetaCyc.c # sample MetaCyc compound
Mg = example$MetaCyc_generic # sample generic reaction
instanceM = BioCyc.instantiate(Mg, pMc, Mg[1,'ID'], multicore=1)
print(instanceM)
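Conceptually, instantiating a generic reaction means replacing a compound class (such as "an alcohol") with specific member compounds, which yields one concrete reaction per member. The following base-R sketch illustrates that idea on a toy equation. It is an assumption about the general concept, not the algorithm used by Rhea.instantiate() or BioCyc.instantiate(). The function instantiate and the class-to-instance table are hypothetical.

```r
# Toy generic reaction: "an alcohol" and "an aldehyde" are compound classes.
generic <- "an alcohol + NAD+ = an aldehyde + NADH + H+"

# Hypothetical table pairing each specific alcohol with its aldehyde.
instances <- data.frame(
  alcohol  = c("ethanol", "1-propanol"),
  aldehyde = c("acetaldehyde", "propanal"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: substitute each class name with one specific compound.
instantiate <- function(equation, table) {
  vapply(seq_len(nrow(table)), function(i) {
    eq <- sub("an alcohol", table$alcohol[i], equation, fixed = TRUE)
    sub("an aldehyde", table$aldehyde[i], eq, fixed = TRUE)
  }, character(1))
}

instance_eqs <- instantiate(generic, instances)
print(instance_eqs)
```

In this sketch the pairing of substrates and products is supplied by hand. The package functions instead take the parsed compound table and a reaction ID as inputs, as shown in the calls above.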


Converting compound IDs into counterpart IDs in other chemical databases

Rhea2KEGG(), Rhea2cName(), BioCyc2KEGG(), BioCyc2PubChem(), BioCyc2ChEBI(), BioCyc2cName()

# Rhea conversion (Rhea2KEGG, Rhea2cName)
Rc = example$Rhea_conv # sample Rhea data
print(Rc)

R2KEGG = Rhea2KEGG(pC, Rc) # ChEBI ID to KEGG ID
print(R2KEGG)

R2cName = Rhea2cName(pC, Rc) # ChEBI ID to compound name
print(R2cName)

# MetaCyc conversion (BioCyc2KEGG, BioCyc2PubChem, BioCyc2ChEBI, BioCyc2cName)
Mc = example$MetaCyc_conv # sample MetaCyc data
print(Mc)

B2KEGG = BioCyc2KEGG(pMc, Mc) # BioCyc ID to KEGG ID
print(B2KEGG)

B2ChEBI = BioCyc2ChEBI(pMc, Mc) # BioCyc ID to ChEBI ID
print(B2ChEBI)

B2PubChem = BioCyc2PubChem(pMc, Mc) # BioCyc ID to PubChem ID
print(B2PubChem)

B2cName = BioCyc2cName(pMc, Mc) # BioCyc ID to compound name
print(B2cName)
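At its core, ID conversion is a table lookup: each source ID is matched against a cross-reference table and replaced by its counterpart. The base-R sketch below shows that pattern with match(). It is only an illustration of the idea, not how the Rhea2* and BioCyc2* functions are implemented. The helper convert_ids, the two-row lookup table, and the unmapped ID "CHEBI:0" are hypothetical.

```r
# Small illustrative cross-reference table (water and ATP).
lookup <- data.frame(
  chebi = c("CHEBI:15377", "CHEBI:15422"),
  kegg  = c("C00001", "C00002"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map each ID through the table; unmapped IDs become NA.
convert_ids <- function(ids, from, to) {
  to[match(ids, from)]
}

query <- c("CHEBI:15422", "CHEBI:15377", "CHEBI:0")  # last ID is deliberately unmapped
converted <- convert_ids(query, lookup$chebi, lookup$kegg)
print(data.frame(query, converted, stringsAsFactors = FALSE))
```

Returning NA for unmapped IDs keeps the output aligned with the input, which makes it easy to spot compounds that lack a counterpart in the target database.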


RbioRXN: future development

To support the discovery and design of novel biosynthetic pathways in systems and synthetic biology applications, later versions of RbioRXN will include additional functions for genome-scale model building, flux balance analysis, retrosynthetic pathway prediction, and the sharing and interchange of biochemical reaction data with existing R packages (e.g. rBiopaxParser, rsbml, or BiGGR).