draheim@informatik.hu-berlin.de

2006-11-07
(C) Guido Draheim
guidod@gmx.de

 

Literature (MASCOT)

Let's try to recount the available literature references on the MASCOT algorithm. I do not try to list the descriptions of the underlying MOWSE scoring; I only try to find information about the "probabilistic" approach to the topic. Up to this attempt it looks to me like a "holistic" approach, given the lack of public information.

As always, let's look at what CiteSeer can tell us about it:
http://citeseer.nj.nec.com/cs?q=mascot (note that your results may differ from day to day). Unsurprisingly, the best matches come from searching for the inventors of the MOWSE/MASCOT series of algorithms, Perkins and Pappin. (Note: the CiteSeer search is often down; in that case try Google with mascot/protein or protein/perkins.)

One such reference entry reads (the entry itself is not indexed by CiteSeer):

Perkins, D., Pappin, D., Creasy, D., Cottrell, J.: Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20 (1997) 3551-3567

It is also linked as a Matrix Science literature reference. (At the local HU library I found out that vol. 20 is actually from 1999, not 1997, and that the full text is available electronically only.) Still, this seems to be the main reference, and it is a good search key for the web and for electronic citation databases, since all modern texts on MS algorithms include it in their reference list in the format shown above.

However, the paper does not reveal much to a computer scientist. Rather than the original paper, I recommend looking at the presentation given specifically about it, recently at the nearby molgen facilities: www.molgen.mpg.de/~wolski/downloads/MascotT.pdf. It has some nice pictures that complement the text of the Perkins/Pappin paper well.

On another note, Christopher Diehl appears to be another good author in the same field, with interesting publications. He is cited by the same papers that cite Perkins/Pappin, so his name can be used to find further papers. His top hit is the following paper: I-J. Wang, C. P. Diehl, and F. J. Pineda, "A Statistical Model of Proteolytic Digestion", JHU/APL technical report, 2003. His homepage offers the text online as csb2003poster_final.pdf.

Still, not much can be found about how to build the probabilities into a database search for a match, including a confidence score that addresses the "false positive" problem. Some hints can be derived from the following report, which is clearly oriented much more towards programmers and computer scientists (rather than towards mathematical chemistry and lab usage): http://sib-dea.unil.ch/~lmuller/Stage.pdf
"Implementation of a MS/MS identification algorithm by spectral alignment and optimization of the scoring function by genetic programming" (2003-11-25, Lukas Mueller, Proteome Informatics Group).

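To make the "confidence score over false positives" idea a bit more concrete, here is a minimal sketch. It assumes the commonly described convention for probability-based scores, namely reporting a match as -10*log10(P), where P is the probability that the match is a random event, and comparing it against a threshold derived from the number of candidates tested. The function names and the default significance level of 0.05 are my own illustrative choices and are not taken from any of the papers above.

```python
import math


def probability_score(p_random: float) -> float:
    """Turn the probability that a match is random into a -10*log10(P) score.

    Smaller probabilities of a chance match give larger scores.
    """
    if not 0.0 < p_random <= 1.0:
        raise ValueError("p_random must be in (0, 1]")
    return -10.0 * math.log10(p_random)


def significance_threshold(n_candidates: int, alpha: float = 0.05) -> float:
    """Score a match must exceed to be called significant.

    Assumption: each of the n_candidates database entries is an independent
    chance to produce a random match, so the per-match probability must stay
    below alpha / n_candidates (a simple multiple-testing correction).
    """
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    return -10.0 * math.log10(alpha / n_candidates)


def is_significant(p_random: float, n_candidates: int, alpha: float = 0.05) -> bool:
    """True if the match scores above the threshold for this search space."""
    return probability_score(p_random) > significance_threshold(n_candidates, alpha)


if __name__ == "__main__":
    # Hypothetical numbers: 1.5 million candidates fall inside the mass window.
    n = 1_500_000
    print(round(significance_threshold(n), 2))
    print(is_significant(1e-9, n), is_significant(1e-5, n))
```

The point of the sketch is that the same raw probability can be significant against a small database and insignificant against a large one, which is one way of looking at the false positive problem.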
... and last but not least, I found a very good presentation that stands out for at least one thing: its material includes a very good literature annex. Here are some excerpts, with a link to the original ppt file first:

  • http://bio.informatics.indiana.edu/L529/papers/L529_proteomics.ppt
    Mass Spectrometry - showing it off at a contemporary level
  • http://www.genetik.uni-bielefeld.de/.../stunde_ws0203_10.pdf
    Lecture "Genome Research" - Principles of proteomics, 2DGE&MS
  • http://sashimi.sf.net/extra/oral.pdf
    about my personal favourite topic: pipelining the data and comparing multiple MS results
  • http://www.proteomecenter.org/PDFs/Keller.AnalChem.02.pdf
    "How to estimate correctness of MS/MS prediction"... That should give an impression of the topic, since there is generally no comparison of the results of different scoring methods, just some "expert validation" by humans.