Introduction to our research activities
Reassembly of biological systems using knowledge processing and their analyses

- Information extraction from literature
- Predictions and analyses of interaction data
- Knowledge representation and knowledge discovery
- Information extraction from genomic sequences
- Research project on integrating human sciences
A substantial amount of biomedical knowledge is expressed in natural language, in the form of research papers and textbooks. We study how to extract this knowledge from the literature using natural language processing and information retrieval technologies. More precisely, we have been developing methods for extracting biological and medical concepts, such as protein, gene, and disease names, as well as the relationships among them. In addition, we are proposing a system that supports knowledge discovery by interpreting experimental data in light of the extracted knowledge. Our other interests include developing ways to give an overview of knowledge spread across multiple papers and to answer biological or medical questions posed in natural language.
Information on interactions between biomolecules provides many clues for systematically understanding living organisms. Computational approaches using interaction data help researchers gain initial insights into biological systems before conducting experiments. We are therefore studying methods for predicting interactions between biomolecules, for extracting functional modules from interaction networks, and for predicting how these modules function together in a living organism.
Constructing biological systems on a computer without losing their intrinsic properties is an important step toward systematically understanding explicitly described knowledge, discovering latent knowledge, and uncovering the variety and constancy in biological systems. We study methods for constructing biological systems, especially pathways and networks, on a computer and for discovering biological knowledge from them.
Genomic DNA molecules play a central role in biological systems. Exhaustive identification of functional regions in the genomic sequence, such as genes and regulatory elements, is therefore fundamental to understanding the mechanisms of biological networks and processes. To this end, we are developing bioinformatics tools that extract biologically meaningful information from the genomic sequences of various organisms. For instance, our laboratory has developed tools that predict mammalian genes and promoters based on comparative genome analyses, as well as a tool that finds genes in whole-genome shotgun reads of environmental genomes.
In biology, the volume of scientific data and information is expanding. For example, in genomics and proteomics, individual scientists routinely contribute their sequence data to centralized databases at the time of publication. As a result, the databases and archives are growing daily. It has become difficult to grasp the big picture of biology, and this is recognized as a problem worldwide. To address this problem, the "Science Integration Program - Humans" has been developing a knowledge sharing system, an integrated neural simulator, and human cellular databases and browsers.
