KEGG Modules

The KEGG MODULE database consists of KEGG modules identified by M numbers and KEGG reaction modules identified by RM numbers, which are manually defined functional units of gene sets and reaction sets, respectively. KEGG modules are further divided into pathway modules and signature modules, giving the three categories listed below.
  • pathway modules – functional units of gene sets in metabolic pathways, including molecular complexes
  • signature modules – functional units of gene sets that characterize phenotypic features
  • reaction modules – functional units of successive reaction steps in metabolic pathways
The complete lists of KEGG modules and KEGG reaction modules can be viewed in the BRITE hierarchy files. The content of each KEGG module can be viewed in the module diagram interface (see Module Diagram below), and the content of each KEGG reaction module can be viewed in an HTML file, such as the one for RM001 (see KEGG Reaction Module for more details).

Logical Expression

A pathway module is defined by a logical expression of K numbers, and a signature module is defined by a logical expression of K numbers and M numbers. These expressions allow automatic evaluation of whether the gene set is complete, i.e., whether the functional unit is present, in a given genome. A space or a plus sign, representing a connection in the pathway or the molecular complex, respectively, is treated as an AND operator. A comma, used for alternatives, is treated as an OR operator. A minus sign designates an optional item in the complex.
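The operator rules above can be sketched as a small recursive-descent evaluator. This is a minimal illustration, not the implementation used by KEGG: the function name `is_complete`, the example definition, and the K numbers in it are hypothetical. It also assumes that parentheses group sub-expressions and that `+` and `-` bind tightest, followed by the space, followed by the comma. Other notations that may appear in definitions (such as `--`) are not handled and raise an error.

```python
import re

ID = re.compile(r"[KM]\d{5}")


def is_complete(definition, present):
    """Return True if `definition` is satisfied by the identifiers in `present`."""
    # Collapse whitespace so that each remaining space is exactly one AND operator.
    tokens = re.findall(r"[KM]\d{5}|[()+, -]", " ".join(definition.split()))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def alternatives():  # a,b  -> OR (assumed to bind loosest)
        result = sequence()
        while peek() == ",":
            take()
            rhs = sequence()
            result = result or rhs
        return result

    def sequence():  # a b  -> AND between pathway steps
        result = complex_()
        while peek() == " ":
            take()
            rhs = complex_()
            result = result and rhs
        return result

    def complex_():  # a+b -> AND within a complex; a-b -> b is optional
        result = item()
        while peek() in ("+", "-"):
            op = take()
            rhs = item()  # always parsed, so the tokens are consumed
            if op == "+":
                result = result and rhs
        return result

    def item():
        tok = take()
        if tok == "(":
            result = alternatives()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return result
        if not ID.fullmatch(tok):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in present

    result = alternatives()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


# Hypothetical definition, not taken from any real module.
definition = "K00001 (K00002,K00003+K00004-K00005) K00006"
print(is_complete(definition, {"K00001", "K00002", "K00006"}))  # True
print(is_complete(definition, {"K00001", "K00003", "K00006"}))  # False
```

Because a signature module may refer to M numbers, `present` can hold both K numbers found in the genome and M numbers of modules already judged complete.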

Each space-separated unit is called a block. When completeness is evaluated, such as in KEGG Mapper, a distinction is made among:
  • complete modules
  • incomplete but almost complete modules with only 1 or 2 blocks missing
  • all modules that contain any matching K numbers
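The block-level distinction can be illustrated with the following sketch, which splits a definition into top-level blocks and counts how many are unsatisfied. It is an approximation under stated assumptions, not the KEGG Mapper algorithm: the helper names (`split_top`, `satisfied`, `classify`), the returned labels, and the example K numbers are all hypothetical. Blocks are taken to be separated by spaces outside parentheses, and inside a block the comma is assumed to bind loosest. The rule that a module with no matching identifier is reported as "no match" is likewise an assumption.

```python
import re


def split_top(expr, seps):
    """Split `expr` at characters in `seps` that lie outside parentheses.

    Returns (separator, part) pairs; the separator of the first part is ''.
    """
    parts, depth, start, sep = [], 0, 0, ""
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch in seps and depth == 0:
            parts.append((sep, expr[start:i]))
            sep, start = ch, i + 1
    parts.append((sep, expr[start:]))
    return parts


def satisfied(expr, present):
    """Evaluate one expression: ',' is OR, ' ' and '+' are AND, '-' is optional."""
    for seps, combine in ((",", any), (" ", all)):
        parts = split_top(expr, seps)
        if len(parts) > 1:
            return combine(satisfied(p, present) for _, p in parts)
    parts = split_top(expr, "+-")
    if len(parts) > 1:
        # Components introduced by '-' are optional and therefore skipped.
        return all(satisfied(p, present) for sep, p in parts if sep != "-")
    if expr.startswith("(") and expr.endswith(")"):
        return satisfied(expr[1:-1], present)
    return expr in present


def classify(definition, present):
    """Label a module by the number of top-level blocks that are missing."""
    blocks = [p for _, p in split_top(" ".join(definition.split()), " ")]
    missing = sum(not satisfied(b, present) for b in blocks)
    if missing == 0:
        return "complete"
    if not any(k in present for k in re.findall(r"[KM]\d{5}", definition)):
        return "no match"
    if missing <= 2:
        return f"incomplete, {missing} block{'s' if missing > 1 else ''} missing"
    return "partial match"


# Hypothetical four-block definition, not taken from any real module.
definition = "K00001 (K00002,K00003) K00004+K00005-K00006 K00007"
print(classify(definition, {"K00001", "K00003", "K00004", "K00005"}))
```

In this example only the last block (K00007) is unsatisfied, so the module falls into the "1 block missing" category.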

The reaction module is defined by the logical expression of RC numbers (Reaction Class identifiers), but this expression is not currently used for any evaluation purpose.

Module Diagram

KEGG modules are associated with dynamically generated graphical diagrams. For example, M00002 represents the glycolysis core module involving three-carbon compounds. Its organism-specific module takes the form hsa_M00002 and can be selected from the Organism menu at the top (for all modules) or from the pop-up menu (for complete modules only). While KEGG pathway maps are all manually drawn, KEGG module diagrams are computationally generated from the text definition of the logical expression.

Ortholog Table

The ortholog table is a useful tool to check completeness and consistency of genome annotations. It shows currently annotated genes in individual genomes for a given set of K numbers, together with coloring of adjacent genes (operon-like structures) on the chromosome. Each KEGG module contains a link to the corresponding ortholog table, such as for M00165, together with options to select complete or other modules.

Taxonomy Mapping

Different types of taxonomy mapping are available for each module. First, the "Taxonomy" link shows which organisms have complete modules under the classification of KEGG organisms in the NCBI taxonomy. Second, the "Module table" link summarizes the abundance of complete modules at varying levels of organism groups under the NCBI taxonomy.

Third, the taxonomy link from the ortholog table (designated by T) allows mapping of both complete and incomplete modules against the taxonomic classification of KEGG organisms. The result is color-coded by the following categories:
  • complete
  • incomplete, 1 block missing
  • incomplete, 2 blocks missing

Pathway Modules and Reaction Modules

Pathway modules and reaction modules are extracted independently, the former from genomic properties and the latter from chemical properties. It is therefore interesting to note the correspondence between them, which suggests co-evolution of genomic and chemical networks.

A new category of KEGG metabolic pathway maps, called overview maps, shows this correspondence as well as an overall architecture of the metabolic network. The following is an example taken from the overview map for Degradation of aromatic compounds.

A single M number or a combination of M numbers can be used for characterizing phenotypic features encoded in the genome. For example, the BTX (benzene, toluene, and xylene) degradation capacity can be seen from the following diagram where M numbers are linked to the ortholog tables indicating which organisms have complete modules.
benzene → M00548 → catechol
toluene → M00538 → benzoate → M00551 → catechol → M00569 (meta-cleavage) or M00568 (ortho-cleavage)
xylene → M00537 → methylbenzoate → M00551 → methylcatechol → M00569 (meta-cleavage) or M00568 (ortho-cleavage)
This example can be rewritten in terms of the reaction modules.
benzene → RM006 → catechol
toluene → RM003 → benzoate → RM005 → catechol → RM009 (meta-cleavage) or RM008 (ortho-cleavage)
xylene → RM003 → methylbenzoate → RM005 → methylcatechol → RM009 (meta-cleavage) or RM008 (ortho-cleavage)
See more details in KEGG Reaction Module.
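As an illustration of how a combination of M numbers can characterize a phenotype, the BTX diagram above can be encoded as alternative chains of modules. The sketch below reports the substrates for which at least one chain is fully complete. The M numbers are those shown in the diagram. The data structure, the function name `degradable_substrates`, and the rule that every module in a chain must be complete are assumptions made for this example.

```python
# Each substrate maps to alternative routes; a route is a chain of M numbers
# taken from the BTX diagram (meta- or ortho-cleavage as the final step).
BTX_ROUTES = {
    "benzene": [["M00548"]],
    "toluene": [["M00538", "M00551", "M00569"],
                ["M00538", "M00551", "M00568"]],
    "xylene": [["M00537", "M00551", "M00569"],
               ["M00537", "M00551", "M00568"]],
}


def degradable_substrates(complete_modules):
    """Return substrates with at least one route whose modules are all complete."""
    return [
        substrate
        for substrate, routes in BTX_ROUTES.items()
        if any(all(m in complete_modules for m in route) for route in routes)
    ]


# Hypothetical set of modules found complete in one genome.
print(degradable_substrates({"M00538", "M00551", "M00568"}))  # ['toluene']
```

The set passed in would come from a completeness check of each module against the genome's K number annotations.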

Last updated: May 1, 2021