For large-scale genome-based metabolic networks, there are mainly two types of
analysis methods: (1) stoichiometric-matrix-based methods such as flux balance
analysis (FBA); (2) graph-theory-based structural analysis methods.
Flux balance analysis is a popular tool for functional analysis of genome-scale
metabolic networks. However, FBA requires adding extra transport/exchange
reactions and choosing which metabolites are external metabolites. These
steps are not straightforward for KEGG-based metabolic networks.
Therefore, this web tool focuses on graph-theory-based structural analysis.
Graph representation of metabolic network
Graph-theory-based methods are mainly used to examine the system-level
organization of metabolic networks. A graph is a simplified representation of a
metabolic network. For example, the reaction A + B = C + D can be represented as
a graph with four metabolic links: A-C, A-D, B-C and B-D. Many reactions
include so-called currency metabolites such as H2O, CO2 and ATP. Links
through currency metabolites in a metabolic graph may lead to biologically
meaningless pathways. For example, in the glycolysis pathway, if ADP is included
in the graph, we may get a two-step path from glucose to pyruvate via ADP, as
shown in the figure. To obtain a metabolic graph that captures the true
biological connectivity, the connections through currency metabolites should be
excluded. Two approaches are used in this web tool. One is based on the metabolic
connection database compiled by Ma and Zeng (Bioinformatics). In this database,
the reactions were manually examined to determine which metabolic connections
should be included. The other is based on the KEGG Rpair database. For each
reaction, only the "main" Rpairs (those that appear in the KEGG pathway maps) are
included.
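The conversion from a reaction to metabolite links can be sketched as follows. This is only a minimal illustration: it drops currency metabolites with a fixed, hypothetical list (`CURRENCY`), which is a simpler rule than the manually curated connection database or the KEGG "main" Rpairs that the web tool relies on. The function name `reaction_to_links` is likewise an assumption made for this example.

```python
from itertools import product

# Hypothetical currency-metabolite list, for illustration only.
# The web tool's own rules (curated connections, KEGG main Rpairs) are not reproduced here.
CURRENCY = {"H2O", "CO2", "ATP", "ADP"}

def reaction_to_links(substrates, products, currency=CURRENCY):
    """Return all substrate-product links of one reaction, skipping currency metabolites."""
    subs = [m for m in substrates if m not in currency]
    prods = [m for m in products if m not in currency]
    return sorted(product(subs, prods))

# A + B = C + D gives the four links A-C, A-D, B-C and B-D.
links_plain = reaction_to_links(["A", "B"], ["C", "D"])

# First step of glycolysis: glucose + ATP = glucose 6-phosphate + ADP.
# With ATP and ADP filtered out, only the glucose link remains.
links_hexokinase = reaction_to_links(
    ["glucose", "ATP"], ["glucose 6-phosphate", "ADP"]
)

print(links_plain)
print(links_hexokinase)
```

A fixed list is the bluntest possible filter. A metabolite such as ATP can be a genuine intermediate in some reactions, which is why a per-reaction decision, as in the two approaches described above, is preferable in practice.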
Network structure analysis
Many structural features of the reconstructed metabolic networks can be
calculated using the web tool. A brief description of each network structure
property is given below, with links to detailed descriptions in Wikipedia.
Degree: the number of links
connected to a node. In a directed network, a node has an in-degree and an
out-degree, depending on the direction of the links. Nodes with high degree are
often important nodes in a network.
Degree distribution: the distribution of node degrees in a network. Many complex
networks, including metabolic networks, are scale-free networks,
which have a power-law degree distribution.
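As a rough illustration of the degree and degree distribution definitions, the sketch below counts in-degree and out-degree from a list of directed links and then tabulates the fraction of nodes at each total degree. The edge list and the function names `degrees` and `degree_distribution` are made up for this example and are not part of the web tool.

```python
from collections import Counter

# Hypothetical directed links (source, target), for illustration only.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

def degrees(edges):
    """Map each node to its (in-degree, out-degree)."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    nodes = set(in_deg) | set(out_deg)
    return {n: (in_deg[n], out_deg[n]) for n in nodes}

def degree_distribution(edges):
    """Fraction of nodes having each total degree (in-degree + out-degree)."""
    deg = degrees(edges)
    counts = Counter(i + o for i, o in deg.values())
    return {k: c / len(deg) for k, c in sorted(counts.items())}

print(degrees(edges))
print(degree_distribution(edges))
```

For a scale-free network, plotting this distribution on log-log axes would be expected to give an approximately straight line. A four-node example is of course far too small to show that.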
Average path length: the path length is defined as the
number of steps in the shortest path from one node
to another in a graph. The average path length is the average of the path
lengths over all connected pairs of nodes in a graph.
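The definition above can be sketched with a breadth-first search from every node, averaging only over ordered pairs for which a path exists. The adjacency dictionary `adj` is a hypothetical example, and the code assumes that every node appears as a key in it.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path length (number of steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def average_path_length(adj):
    """Average shortest path length over all ordered pairs connected by a path."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

# Hypothetical linear pathway A -> B -> C -> D.
adj = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(average_path_length(adj))
```

In this example six ordered pairs are connected, with path lengths 1, 2, 3, 1, 2 and 1, so the average is 10/6.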
Centrality: Closeness centrality measures
how close a node is to the other nodes connected to it. Betweenness centrality
is the fraction of shortest paths between pairs of nodes that pass through a
given node or edge. Load centrality is a variant of betweenness centrality. For
details, see Ulrik Brandes: On Variants of Shortest-Path
Betweenness Centrality and their Generic Computation.
Social Networks 30(2):136-145, 2008.
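The two main centrality measures can be sketched in plain Python as below. Several conventions are assumptions of this example rather than statements about the web tool: closeness is taken as the number of reachable nodes divided by the sum of their distances, betweenness is left unnormalized (for each node, the sum over source-target pairs of the fraction of shortest paths passing through it), and every node must appear as a key of the adjacency dictionary. The betweenness routine follows Brandes' accumulation scheme for unweighted graphs.

```python
from collections import deque

def closeness(adj, source):
    """Reachable-node count divided by the summed distances from source (0.0 if none)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(adj):
    """Unnormalized node betweenness for an unweighted directed graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes to the source.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Hypothetical linear pathway A -> B -> C -> D.
adj = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(closeness(adj, "A"), betweenness(adj))
```

Normalization conventions differ between tools, so absolute values from this sketch should not be compared directly with the values reported by the web tool. The ranking of nodes is the more portable result.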
Connected component: a subgraph in which any two vertices are
connected to each other by paths, and which is connected to no additional
vertices. Such a subgraph is strongly connected if the link direction is
considered in a directed graph and is weakly connected if direction is ignored.
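The difference between strongly and weakly connected components can be illustrated with the sketch below. It uses a deliberately simple method, not necessarily the one behind the web tool: the strong component of a node is the intersection of the nodes it can reach and the nodes that can reach it, and weak components are found by searching the graph with link directions ignored. The example graph and all function names are hypothetical.

```python
def reachable(adj, start):
    """Set of nodes reachable from start along directed links, including start."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse(adj):
    """Graph with every link reversed; contains every node as a key."""
    rev = {n: [] for n in adj}
    for src, targets in adj.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def strongly_connected_components(adj):
    rev = reverse(adj)
    assigned, comps = set(), []
    for node in rev:
        if node not in assigned:
            # Nodes that node can reach AND that can reach node.
            comp = reachable(adj, node) & reachable(rev, node)
            assigned |= comp
            comps.append(comp)
    return comps

def weakly_connected_components(adj):
    rev = reverse(adj)
    undirected = {n: list(adj.get(n, ())) + rev[n] for n in rev}
    assigned, comps = set(), []
    for node in undirected:
        if node not in assigned:
            comp = reachable(undirected, node)
            assigned |= comp
            comps.append(comp)
    return comps

# Hypothetical graph: a cycle A -> B -> C -> A with a branch to D, plus E -> F.
adj = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": [], "E": ["F"], "F": []}
scc = sorted(sorted(c) for c in strongly_connected_components(adj))
wcc = sorted(sorted(c) for c in weakly_connected_components(adj))
print(scc)
print(wcc)
```

The intersection approach repeats a search for every component and is therefore slow on genome-scale networks. Linear-time algorithms such as Tarjan's or Kosaraju's would be the usual choice there.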
Input and output domains: the output domain of a node is defined as the number
of nodes that can be reached from the node through paths. The input domain of a
node is defined as the number of nodes that can reach the node through paths.
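Both domains reduce to a reachability search, once on the graph and once on the graph with all links reversed. The sketch below assumes that a node is not counted in its own domain, which is a convention chosen for this example. The graph and the function names are hypothetical.

```python
def reachable(adj, start):
    """Nodes reachable from start along directed links, excluding start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse(adj):
    """Graph with every link reversed."""
    rev = {n: [] for n in adj}
    for src, targets in adj.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def domains(adj):
    """Map each node to (input domain size, output domain size)."""
    rev = reverse(adj)
    return {n: (len(reachable(rev, n)), len(reachable(adj, n))) for n in rev}

# Hypothetical branched pathway: A -> B -> D and A -> C -> D.
adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(domains(adj))
```

Here A has the largest output domain (B, C and D) and D has the largest input domain (A, B and C).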
Bow-tie structure: a common global-level organization found in many directed
networks. A bow-tie structure has four main subsets: the giant strongly
connected component, the input, the output and the isolated subsets. For
details, see Ma and Zeng, Bioinformatics 19:1423.
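The four subsets can be derived from reachability alone, as in the following sketch. It assumes that the giant strongly connected component is simply the largest strong component, that the input subset contains the nodes that can reach it, that the output subset contains the nodes reachable from it, and that everything else is isolated. Every node must appear as a key of the adjacency dictionary. The example graph and the function names are hypothetical and may differ from the procedure in the cited paper.

```python
def reachable(adj, starts):
    """Nodes reachable from any node in starts, including the starts themselves."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse(adj):
    """Graph with every link reversed."""
    rev = {n: [] for n in adj}
    for src, targets in adj.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def bow_tie(adj):
    """Split the nodes into giant strong component, input, output and isolated subsets."""
    rev = reverse(adj)
    gsc, assigned = set(), set()
    for node in adj:
        if node not in assigned:
            comp = reachable(adj, [node]) & reachable(rev, [node])
            assigned |= comp
            if len(comp) > len(gsc):
                gsc = comp
    out_set = reachable(adj, gsc) - gsc   # reachable from the GSC
    in_set = reachable(rev, gsc) - gsc    # able to reach the GSC
    isolated = set(adj) - gsc - in_set - out_set
    return {"gsc": gsc, "in": in_set, "out": out_set, "isolated": isolated}

# Hypothetical network: S feeds a cycle A -> B -> C -> A, which produces P; X -> Y is separate.
adj = {
    "S": ["A"], "A": ["B"], "B": ["C"], "C": ["A", "P"],
    "P": [], "X": ["Y"], "Y": [],
}
parts = bow_tie(adj)
print(parts)
```

A node cannot fall into both the input and the output subset: if it could both reach the giant strong component and be reached from it, it would belong to that component by definition.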