LLL'05 Challenge: Genic Interaction Extraction with Alignments and Finite State Automata
Supplementary information
Data sources
Learning Language in Logic 2005 (LLL05) workshop website
Challenge task website, with training and evaluation data
List of nouns indicating interactions (INouns)
List of verbs indicating interactions (IVerbs)
Final dictionary with modifications
Evaluation corpus from BioCreAtIvE, 1000 sentences (available on request)
Mapping from corpus sentences to PubMed IDs
Part-of-speech tags: Penn Treebank (PTB) tag set; for a list see e.g. http://www.comp.leeds.ac.uk/amalgam/tagsets/upenn.html
Substitution matrix for the PTB (full tag set; we used 19 tags in the challenge)
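
To illustrate how a substitution matrix over part-of-speech tags can be used, the following is a minimal sketch that scores a global alignment of two tag sequences. It is an illustration only and not necessarily the alignment procedure used in the challenge. The scores in RELATED, the values of MATCH, MISMATCH and GAP, and the function names tag_score and align_score are all assumptions made for this example; the values in the actual matrix are not reproduced here.

```python
# Hypothetical scoring scheme (assumed values, not taken from the real matrix).
MATCH = 2      # identical tags
MISMATCH = -1  # unrelated tags
GAP = -1       # tag aligned against a gap

# Hypothetical pairs of related tags that receive a reduced positive score.
RELATED = {
    frozenset(("NN", "NNS")): 1,   # singular vs. plural noun
    frozenset(("VBZ", "VBD")): 1,  # present vs. past tense verb
}


def tag_score(a, b):
    """Return the substitution score for aligning tag a with tag b."""
    if a == b:
        return MATCH
    return RELATED.get(frozenset((a, b)), MISMATCH)


def align_score(xs, ys):
    """Score of the best global alignment of two tag sequences."""
    n, m = len(xs), len(ys)
    # dp[i][j] holds the best score for aligning xs[:i] with ys[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP
    for j in range(1, m + 1):
        dp[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + tag_score(xs[i - 1], ys[j - 1]),  # substitute
                dp[i - 1][j] + GAP,                                  # gap in ys
                dp[i][j - 1] + GAP,                                  # gap in xs
            )
    return dp[n][m]


if __name__ == "__main__":
    print(align_score(["NN", "VBZ", "NN"], ["NNS", "VBD", "NN"]))
```

With these assumed scores, related tags such as NN and NNS still contribute positively to an alignment, whereas unrelated tags are penalized like a mismatch.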
Software
POS tagging: TnT, Thorsten Brants, see http://www.coli.uni-sb.de/~thorsten/tnt/
Stemming: Porter stemmer, Martin F. Porter, see http://www.tartarus.org/~martin/PorterStemmer/
Please send any questions and requests to hakenberg(a)informatik.hu-berlin.de.