Database & Software

GENA

GENA is a database of genes and their products, built to make it easy to look up their full names, symbols, and synonyms. It currently covers H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, and S. cerevisiae. The data are extracted automatically from the following databases: HUGO, OMIM, Genatlas, LocusLink, GDB, MGI, FlyBase, WormBase, CYGD, and SGD. GENA is designed to support the automatic extraction of information about genes and their products from articles by natural language processing on a local UNIX machine. However, because the data set is large, we do not offer FTP download at present.
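As a rough illustration of the kind of lookup such a name dictionary supports, the sketch below indexes a few records by symbol, full name, and synonym. The record layout, field names, and the `build_index` and `lookup` helpers are assumptions made for this example; they do not reflect GENA's actual schema or interface.

```python
from collections import defaultdict

# Illustrative records only; the field names are hypothetical, not GENA's schema.
RECORDS = [
    {
        "species": "H. sapiens",
        "symbol": "MAPK1",
        "full_name": "mitogen-activated protein kinase 1",
        "synonyms": ["ERK2", "p42MAPK"],
    },
    {
        "species": "M. musculus",
        "symbol": "Mapk1",
        "full_name": "mitogen-activated protein kinase 1",
        "synonyms": ["Erk2"],
    },
]


def build_index(records):
    """Map every lower-cased symbol, full name, and synonym to its records."""
    index = defaultdict(list)
    for rec in records:
        names = [rec["symbol"], rec["full_name"], *rec["synonyms"]]
        for name in names:
            index[name.lower()].append(rec)
    return index


def lookup(index, name, species=None):
    """Return records matching a name, optionally restricted to one species."""
    hits = index.get(name.lower(), [])
    return [r for r in hits if species is None or r["species"] == species]


INDEX = build_index(RECORDS)
for rec in lookup(INDEX, "ERK2", species="H. sapiens"):
    print(rec["symbol"], "|", rec["full_name"], "|", ", ".join(rec["synonyms"]))
```

Matching is case-insensitive here so that a synonym written differently across species resolves to all candidate records; a text-mining pipeline might instead prefer stricter, species-aware matching.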

PRIME

PRIME is a further developed version of the Kinase Pathway Database, an integrated database covering the major eukaryotes with completely sequenced genomes. It contains the classification of protein kinases, their functional conservation and orthologous tables among species, protein-protein interaction data, domain information, structural information, and an interface that automatically draws pathway graph images. The protein-protein interactions are extracted from abstracts by natural language processing (NLP), using basic word patterns and GENA, a protein name dictionary developed by our group. In this system, pathways can be compared easily among species using more than 1,510,000 protein interactions and the orthologous tables. Interaction prediction is also possible using interaction data from other organisms. Note that the protein-protein interactions are extracted automatically and are not checked manually. Before building further work on these data, please check the cited references.
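To make the idea of extraction by basic word patterns and a protein name dictionary concrete, here is a minimal sketch. It assumes one simple pattern, "NAME verb NAME", and a tiny hand-written name set and verb list; the patterns and dictionary actually used by PRIME are not described here, so everything in the block is illustrative.

```python
import re

# Hypothetical stand-ins for a protein name dictionary and a verb list.
PROTEIN_NAMES = {"MEK1", "ERK2", "Raf-1"}
INTERACTION_VERBS = ["phosphorylates", "activates", "binds", "interacts with"]


def extract_interactions(sentence, names=PROTEIN_NAMES, verbs=INTERACTION_VERBS):
    """Return (protein, verb, protein) triples matching 'NAME verb NAME'."""
    # Longer names first so that a short name never shadows a longer one.
    name_alt = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    verb_alt = "|".join(re.escape(v) for v in verbs)
    pattern = re.compile(rf"\b({name_alt})\b\s+({verb_alt})\s+\b({name_alt})\b")
    return [m.groups() for m in pattern.finditer(sentence)]


print(extract_interactions("Raf-1 activates MEK1, and MEK1 phosphorylates ERK2."))
```

A pattern this simple misses passive constructions, negation, and interactions that span sentences, which is one reason automatically extracted interactions should be checked against their source references.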

KinasePathwayDatabase

KinasePathwayDatabase is an integrated database covering the major eukaryotes with completely sequenced genomes. It contains the classification of protein kinases, their functional conservation and orthologous tables among species, protein-protein interaction data, domain information, structural information, and an interface that automatically draws pathway graph images. The protein-protein interactions are extracted from abstracts by natural language processing (NLP), using basic word patterns and GENA, a protein name dictionary developed by our group. In this system, pathways can be compared easily among species using more than 47,000 protein interactions and the orthologous tables.
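One way to picture a cross-species comparison built on interaction data and orthologous tables is to map each protein to an ortholog group and then compare interactions at the group level. The sketch below does this with placeholder protein names, a hypothetical table layout, and hypothetical `to_group_pairs` and `compare_species` helpers; it is not the database's actual method or data.

```python
# Placeholder ortholog table: group id -> {species: protein}. Names are invented.
ORTHOLOGS = {
    "OG1": {"human": "HsKinA", "yeast": "ScKinA"},
    "OG2": {"human": "HsKinB", "yeast": "ScKinB"},
    "OG3": {"human": "HsKinC"},
}

# Placeholder interaction sets per species, stored as unordered protein pairs.
INTERACTIONS = {
    "human": {("HsKinA", "HsKinB"), ("HsKinC", "HsKinA")},
    "yeast": {("ScKinA", "ScKinB")},
}


def to_group_pairs(species, orthologs=ORTHOLOGS, interactions=INTERACTIONS):
    """Rewrite one species' interactions as pairs of ortholog-group ids."""
    protein_to_group = {
        members[species]: group
        for group, members in orthologs.items()
        if species in members
    }
    pairs = set()
    for a, b in interactions[species]:
        if a in protein_to_group and b in protein_to_group:
            pairs.add(frozenset((protein_to_group[a], protein_to_group[b])))
    return pairs


def compare_species(species_a, species_b):
    """Split group-level interactions into shared and species-specific sets."""
    a, b = to_group_pairs(species_a), to_group_pairs(species_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


result = compare_species("human", "yeast")
print(sorted(sorted(pair) for pair in result["shared"]))
```

Under these assumptions, an interaction seen in one species but not in the other, where both partners have orthologs in the second species, could be treated as a candidate prediction there, pending verification.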

MetaGene

MetaGene is a program for finding prokaryotic genes in environmental genome shotgun sequences or metagenomic sequences. The program predicts genes in anonymous, short sequence fragments from bacteria and archaea without prior knowledge (codon usage, etc.). It also predicts the domain to which the input sequence belongs. The software is freely available for academic use.