ECML/PKDD 2004, Pisa, Italy, September 20-24, 2004

Second European Workshop on Data Mining and Text Mining for Bioinformatics

24 September 2004, Pisa, Italy

in conjunction with ECML/PKDD 2004: The Fifteenth European Conference on Machine Learning (ECML) and The Eighth European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), 20-24 September, 2004, Pisa, Italy.

Abstract

In recent years, research in molecular biology and molecular medicine has accumulated enormous amounts of data. These include genomic sequences gathered by the Human Genome Project, gene expression data from microarray experiments, protein identification and quantification data from proteomics experiments, and SNP data from high-throughput SNP arrays. However, our understanding of the biological processes underlying these data lags far behind. There is strong interest in employing methods of knowledge discovery and data mining to generate models of biological systems. Mining biological databases poses challenges that knowledge discovery and data mining have to address and that are the focus of the workshop.

  • Important background knowledge in bioinformatics is often buried in textual documents, such as scientific publications or database annotations. Text mining and information extraction approaches currently being studied range from term recognition to extraction of complex relationships of interaction between proteins.
  • Analysing data from biological databases often requires the consideration of data from multiple relations rather than from one single table. Approaches such as propositionalization algorithms are currently being studied that utilize multi-relational data and yet meet the efficiency requirements of large-scale data mining problems.
  • It is difficult and requires profound understanding of both knowledge discovery and computational biology to identify problems and optimization criteria which, when maximized by knowledge discovery algorithms, actually contribute to a better understanding of biological systems. Identification of appropriate knowledge discovery problems and development of evaluation methods for knowledge discovery results are ongoing efforts.

Goals and Intended Audience

In order to build knowledge discovery systems that contribute to our understanding of biological systems, solutions to the above problems have to be assembled into efficient and scalable systems. The workshop aims at facilitating this process and at enhancing the exchange of knowledge between computational biologists and knowledge discovery researchers. Accordingly, our intended audience comprises both computational biologists and KDD researchers.

Program

The program consists of invited talks by Yves Moreau and Alfonso Valencia and of contributed presentations. Each contributed paper will be presented in a 10-minute oral spotlight presentation in the morning session and as a poster in the afternoon sessions.

10:30-11:00
Invited Talk
Yves Moreau: Integrating Text and Microarray Data: Gene Expression and Comparative Genomic Hybridization.
11:10-12:50
Paper Spotlight Presentations
Hong Chai and Carlotta Domeniconi: An Evaluation of Gene Selection Methods for Multi-Class Microarray Data Classification.
Nikolai Daraselia, Sergei Egorov, Andrey Yazhuk, Svetlana Novichkova, Anton Yuryev, and Ilya Mazo: Extracting Protein Function Information from MEDLINE Using a Full-Sentence Parser.
Yoshikazu Kaneta, Md. Ahaduzzaman Munna, and Takenao Ohkawa: A Method of Extracting Sentences Related to Protein Interaction from Literature using a Structure Database.
Svetlana Kiritchenko, Stan Matwin, and A. Fazel Famili: Hierarchical Text Categorization as a Tool of Associating Genes with Gene Ontology Codes.
Judice L.Y. Koh, Mong Li Lee, Asif M. Khan, Paul T.J. Tan, and Vladimir Brusic: Duplicate Detection in Biological Data using Association Rule Mining.
Andrea Malossini, Enrico Blanzieri, and Raymond T. Ng: Assessment of SVM Reliability for Microarrays Data Analysis.
Brigitte Mathiak and Silke Eckstein: Five Steps to Text Mining in Biomedical Literature.
Oleg Okun: Protein Fold Recognition with K-Local Hyperplane Distance Nearest Neighbor Algorithm.
F. Psomopoulos, S. Diplaris, and P. A. Mitkas: A Finite State Automata Based Technique for Protein Classification Rules Induction.
Fabio Rinaldi, Gerold Schneider, Kaarel Kaljurand, James Dowdall, Christos Andronis, Andreas Persidis, and Ourania Konstanti: Mining relations in the GENIA corpus.
13:00-13:30
Invited Talk
Alfonso Valencia: Information Extraction in Molecular Biology.
15:00-16:00
Poster Presentation
All technical contributions
16:00-16:30
Coffee Break
16:30-
Poster Presentation
All technical contributions

Proceedings

Get the workshop proceedings as one PDF file.

Workshop Chair

Tobias Scheffer, Humboldt-Universität zu Berlin.

Program Committee


Support

The workshop received support from the European Network of Excellence in Knowledge Discovery, KD-Net.

Submissions and Dates (passed)

The submission deadline, which had been extended to June 21, has passed.
Submission deadline: June 21, 2004
Notification of acceptance: July 12, 2004
Camera-ready copies due: July 19, 2004
Workshop: September 24, 2004

Previous Event

The First European Workshop on Data Mining and Text Mining for Bioinformatics was held at ECML/PKDD 2003 in Cavtat, Croatia.