KO (KEGG ORTHOLOGY) Database

Current statistics (2025/12/27)

	Protein	RNA
Number of KOs (K numbers)	27,505	523
KOs with references	25,898	522
KOs with sequences	24,198	502
KOs linked to pathway maps	14,384	151
KOs linked to brite hierarchies	22,834	523
KOs linked to brite tables	854
KOs linked to KEGG modules	3,086	1

KO Database of Molecular Functions

The KO (KEGG Orthology) database is a database of molecular functions represented in terms of functional orthologs. A functional ortholog is manually defined in the context of KEGG molecular networks, namely KEGG pathway maps, BRITE hierarchies and KEGG modules, and is given a KO identifier called a K number. Most KOs are defined from experimentally characterized genes and proteins in specific organisms, and are then generalized to other organisms based on sequence similarity. The granularity of "function" is context-dependent: the resulting KO grouping may correspond to a group of highly similar sequences within a limited organism group, or it may be a more divergent group.
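The K numbers shown on this page (for example K00973) consist of the letter K followed by five digits. The following is a minimal sketch of a format check based on that observation. The function names is_k_number and extract_k_numbers are hypothetical, and the check is purely syntactic: it does not confirm that an identifier exists in the KO database.

```python
import re

# Assumed format, taken from the identifiers on this page: "K" plus five digits.
K_NUMBER_RE = re.compile(r"K[0-9]{5}")


def is_k_number(text: str) -> bool:
    """Return True if text has the form of a K number (syntactic check only)."""
    return K_NUMBER_RE.fullmatch(text) is not None


def extract_k_numbers(line: str) -> list:
    """Collect well-formed K numbers from a whitespace-separated string."""
    return [token for token in line.split() if is_k_number(token)]


# Tokens that do not match the assumed format are silently dropped.
print(extract_k_numbers("K00973 K01710 k00067 K123"))
```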

The term KO system is used for a network-based classification of KOs shown below:

00001 KEGG Orthology (KO)

It consists of six top categories (09100 to 09160) for KEGG pathway maps and one top category (09180) for BRITE hierarchies, as well as one top category (09190) for those KOs that are not yet included in either of them. The category numbers for these top categories and the second-level categories under metabolism (09101 to 09112) are used to define color coding of functions (see KEGG Color Codes).
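The category numbering described above can be sketched as a small lookup. This is an illustration only: the function name describe_category is hypothetical, and the specific six pathway-map numbers listed in the code (09100, 09120, 09130, 09140, 09150, 09160) are an assumption not spelled out in the text above, which gives only the range 09100 to 09160.

```python
# Assumption: the six pathway-map top categories within the stated range.
PATHWAY_TOP_CATEGORIES = {"09100", "09120", "09130", "09140", "09150", "09160"}


def describe_category(code: str) -> str:
    """Describe a KO system category number according to the text above."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected a five-digit category number, got {code!r}")
    if code in PATHWAY_TOP_CATEGORIES:
        return "top category for KEGG pathway maps"
    if code == "09180":
        return "top category for BRITE hierarchies"
    if code == "09190":
        return "top category for KOs not yet included in pathway maps or BRITE"
    # Second-level categories under metabolism, as stated in the text.
    if 9101 <= int(code) <= 9112:
        return "second-level category under metabolism"
    return "not covered by this sketch"


for example in ("09100", "09105", "09180", "09190"):
    print(example, "->", describe_category(example))
```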

Efforts have been made to associate KO entries with publication records reporting experimental evidence for functionally characterized sequence data, as shown in the SEQUENCE field of the KO entry page. In many cases such data are not available for genes and proteins in the KEGG organisms with completely sequenced genomes. Thus, the addendum (ag) category was introduced in the GENES database, enabling functionally characterized individual protein sequences to be included in KEGG. As a byproduct of these efforts, sequence data have also been associated with EC numbers in Enzyme Nomenclature.

Genome Annotation in KEGG

Genome annotation in KEGG has two distinctive aspects, KO assignment and KEGG mapping, as summarized below.

KO assignment

Molecular functions are stored in the KO (KEGG Orthology) database, which contains orthologs of experimentally characterized genes/proteins.
Genome annotation in KEGG assigns KO identifiers (K numbers) to individual genes in the genome, rather than giving text descriptions of functions.

KEGG mapping

Cellular and organism-level functions are stored in the PATHWAY, BRITE and MODULE databases in terms of the molecular networks, which are all created as networks of K number nodes.
The KO assignment procedure converts a gene set in the genome to a K number set and leads to automatic reconstruction of KEGG pathways and other networks by the process called KEGG mapping, enabling interpretation of high-level functions.
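The two steps summarized above can be sketched with plain set operations. This is a toy illustration under stated assumptions, not the KEGG implementation: the function names, the gene identifiers and the network names are hypothetical, and the network memberships are invented for the example (only the K numbers themselves are taken from the examples on this page).

```python
def assign_kos(genes, gene_to_ko):
    """KO assignment: convert a gene set to a K number set.

    Genes without an assigned K number are skipped.
    """
    return {gene_to_ko[gene] for gene in genes if gene in gene_to_ko}


def map_to_networks(ko_set, networks):
    """KEGG mapping: report, per network, which K number nodes are present."""
    hits = {}
    for name, nodes in networks.items():
        found = ko_set & nodes
        if found:
            hits[name] = found
    return hits


# Hypothetical genome: four genes, one of them without a K number.
gene_to_ko = {"gene1": "K00973", "gene2": "K01710", "gene3": "K01790"}
genome = ["gene1", "gene2", "gene3", "gene4"]

# Hypothetical networks of K number nodes (membership is made up).
networks = {
    "toy_network_A": {"K00973", "K01710", "K01790", "K00067"},
    "toy_network_B": {"K23987"},
}

ko_set = assign_kos(genome, gene_to_ko)
for name, found in map_to_networks(ko_set, networks).items():
    print(name, sorted(found))
```

Because the networks are defined over K numbers rather than organism-specific genes, the same network definitions can be reused for any genome once its genes have been assigned K numbers.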

Ortholog table

The ortholog table (OT) has existed since the beginning of the KEGG project. For a given set of K numbers, it displays the current assignment of genes in KEGG organisms and viruses.

The ortholog table additionally displays positional correlations of genes in the chromosome, such as operon structures, by coloring. The same color means that the genes are adjacent.
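The coloring rule can be illustrated with a short sketch. It assumes that "adjacent" means consecutive ordinal gene positions on the same chromosome; the actual criterion used by KEGG is not stated here. The function name color_adjacent_genes and the input format are hypothetical.

```python
from typing import Dict, Optional


def color_adjacent_genes(positions: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Give the same color id to genes at consecutive positions.

    positions maps a gene name to its ordinal position on one chromosome.
    Genes with no adjacent neighbour in the input get None (no color).
    """
    ordered = sorted(positions, key=positions.get)

    # Split the ordered genes into runs of consecutive positions.
    runs = []
    for gene in ordered:
        if runs and positions[gene] - positions[runs[-1][-1]] == 1:
            runs[-1].append(gene)
        else:
            runs.append([gene])

    # Only runs of two or more genes receive a color id.
    colors: Dict[str, Optional[int]] = {}
    next_color = 0
    for run in runs:
        if len(run) > 1:
            for gene in run:
                colors[gene] = next_color
            next_color += 1
        else:
            colors[run[0]] = None
    return colors
```

In this sketch, a run of genes sharing one color id would correspond to a candidate operon-like cluster in one row of the table.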

Module table of taxonomic group

The module table presents a summary view of taxonomic groups for a given set of KOs and/or modules (see more details in Taxonomy mapping).

Last updated: December 3, 2025

Enter K numbers (Example) K00973 K01710 K01790 K00067 K23987

Enter K numbers (Example) K22014 K21512 K26964 K26963 K21511 K26962 K26961 K26965 K26966