Artificial intelligence approaches to knowledge discovery in functional genomics

Client ID: ARRS (BI-IT/05-08-011)
Project type: Bilateral Collaboration Project
Project duration: 2006 - 2009


Of all the questions scientists pose, what makes a living being is probably the one asked most often. With recent advances in molecular biology and bioinformatics, we have only begun to scratch the surface of the incredible complexity of living organisms. Within contemporary biology, gene function annotation and the discovery of genetic networks have become key challenges. Today’s functional genomics targets genome-wide discovery of gene functions, observing the organism, its elements and their interactions as a whole. The challenge for biologists is to devise high-throughput experimental techniques to collect the data, which they then pass to fellow data analysts, who develop and apply dedicated techniques for data mining and knowledge discovery.

In the proposed project, we are collaborating with the group of Prof. Riccardo Bellazzi, who leads the Laboratory for Biomedical Informatics at the University of Pavia, Italy. The scope of the collaboration is the development of artificial intelligence techniques to support knowledge discovery in functional genomics. The collaboration builds on our past experience in temporal data analysis (Pavia) and on research in the construction of regulatory genetic networks and the analysis of gene expression and sequences (Ljubljana). The particular methods we will rely on are various machine learning approaches, temporal abstraction, abductive and inductive inference, and inference of qualitative genetic networks.

Besides the theoretical development of new methods, the collaboration will also focus on practical results, delivered as software components within the open-source data mining framework Orange and as applications in functional genomics, gene annotation and gene network discovery for the budding yeast S. cerevisiae and the amoeba D. discoideum. For these applications, the project will use available databases of genome sequences, gene expression and functional annotation. The principal goal of the project is to combine the methods developed by each of the partners into new approaches that enable genome-wide construction of gene regulatory networks from experimental data, one of the major challenges in today’s bioinformatics.