Collecting a Large Corpus from all of Medline

Supplementary information

This sample is a large set of sentences that contain protein-protein interactions (PPIs). We collected it by searching all of Medline for protein pairs listed in IntAct (a protein-interaction database). The search yielded typical sentences, or parts of sentences, that are often used to describe PPIs, such as

  PTN blocks PTN
  PTN suppressed PTN
  PTN binds the PTN
  PTN that regulates PTN
(where PTN is a wildcard for a protein name). In addition, we found many more 'exotic' examples, such as
  PTN, but neither was influenced by PTN
  PTN requires the presence of comparable amounts of PTN or PTN
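
The collection step described above can be illustrated with a minimal sketch. It is not the software used to build the corpus. It assumes that protein names are matched as exact strings and that each sentence is checked against one known pair at a time. The function name `mask_pair` and the protein names `ProtA` and `ProtB` are hypothetical and serve only as illustration.

```python
import re


def mask_pair(sentence, protein_a, protein_b, wildcard="PTN"):
    """Replace both proteins of a known interacting pair with a wildcard.

    Returns the masked sentence, or None if the sentence does not mention
    both proteins. Assumption: names are matched as exact, case-sensitive
    strings that are not embedded in a longer word.
    """
    # Longest name first, so a longer name wins over a shorter prefix.
    names = sorted({protein_a, protein_b}, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")

    # Keep the sentence only if every name of the pair occurs in it.
    if set(pattern.findall(sentence)) != set(names):
        return None
    return pattern.sub(wildcard, sentence)


# Hypothetical pair and made-up sentences, for illustration only.
print(mask_pair("ProtA binds the ProtB", "ProtA", "ProtB"))
print(mask_pair("ProtA is expressed in liver", "ProtA", "ProtB"))
```

The first call returns a masked fragment of the kind listed above, and the second returns None because only one protein of the pair is mentioned. A real pipeline would also need synonym handling and sentence splitting, which this sketch leaves out.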

Data

Software


Please send any questions and requests to hakenberg(a)informatik.hu-berlin.de.