Uninterpreted Spectra Searching of MS/MS Data
--- Peptide sequencing and protein identification from
ms/ms data can be done by a variety of methods. Perhaps
the most intuitive method is to derive the amino acid sequence
of each peptide from its spectrum and then submit those sequences
for a search against a specific database. This is the standard
BLAST search that most researchers are familiar with. The
advantage of this method is that it works well for similarity
searches because homologous residue substitutions can be considered.
The disadvantages of this method are that it is very time
consuming and is therefore a poor choice for most LC/MS/MS
experiments, which produce a large number of ms/ms spectra.
Another problem is that during manual sequencing (or
computer-assisted manual sequencing) of a complicated spectrum,
mistakes can be made and an incorrect amino acid assigned.
The final disadvantage arises when a sample contains more
than one protein: the peptide sequences then belong to
different proteins, and when they are BLAST searched together
they can distort the score results of the search.
--- A different approach for
protein identification using ms/ms data is known as
uninterpreted spectra searching.
The basis of this method is to search a database using only
masses of the ions generated during an ms/ms experiment. This
method is much quicker than the manual method and works well
for samples containing more than one protein. It also avoids
any bias towards a specific residue that could mistakenly
occur during manual sequencing. Its limitation is
that it is not the best method for similarity searches,
because a single amino acid substitution that changes the
mass of the peptide can cause that protein to be missed
during identification.
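
The effect of a substitution on a mass-based search can be illustrated with a short Python sketch. The peptide sequences and the 0.5 Da tolerance below are hypothetical values chosen for illustration; the residue masses are standard monoisotopic values. The sketch only shows why a mass-changing substitution falls outside the search window, while an isobaric one (leucine to isoleucine) does not.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def within_tolerance(observed, theoretical, tol_da):
    """True if the two masses agree within the given tolerance (Da)."""
    return abs(observed - theoretical) <= tol_da


# Hypothetical sequences: one database entry and two variants of it.
database_peptide = "SAMPLER"
variant_val = "SAMPVER"  # L -> V changes the peptide mass
variant_ile = "SAMPIER"  # L -> I leaves the peptide mass unchanged

for variant in (variant_val, variant_ile):
    found = within_tolerance(
        peptide_mass(variant), peptide_mass(database_peptide), tol_da=0.5
    )
    print(variant, "matches database peptide mass:", found)
```

In this sketch the valine variant would be discarded before any fragment ions are compared, which is the situation described above.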
--- When a peptide is ionized
and fragmented during an ms/ms experiment, there are two
pieces of information that are known for each peptide.
The first is the molecular weight of the peptide. This is
determined from the m/z of the ion and total charge on the
ion. The second piece of information is the set of fragment ion masses
produced during the ms/ms scan. These two pieces of
information are then used to search a database in the following
sequence: All entries in the database are digested “in
silico” with the appropriate enzyme and the modification
masses are added to the mass of the peptides. Then, the peptides
are placed into bins based on their mass. The experimental
peptide is then compared to the bin containing peptides of
the same mass, within operator-set tolerances, and all
peptides of other masses are discarded. Then, the theoretical
fragment ion masses for all of the peptides in the selected
bin are compared to the experimental fragment ion masses for
the peptide, within operator-set tolerances, and a weighted
ion score is assigned. This continues until all ms/ms spectra
have been analyzed. The software then groups peptides that
hit the same protein together in the results window.
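
The search sequence described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the search engine's actual implementation: the enzyme is assumed to be trypsin with no missed cleavages, modifications are omitted, only singly charged b and y ions are considered, and the "score" is a plain count of matched fragment ions rather than a weighted ion score. All function names, the example database, and the tolerances are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def neutral_mass(mz, charge):
    """Peptide neutral mass from the precursor m/z and its charge."""
    return (mz - PROTON) * charge


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def digest_trypsin(protein):
    """In silico digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def fragment_ions(peptide):
    """Theoretical singly charged b and y ion m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(peptide)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion, first i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion, remaining residues
    return ions


def search(database, precursor_mz, charge, observed_fragments,
           precursor_tol=0.5, fragment_tol=0.5):
    """Return (matched_count, protein_id, peptide) tuples, best first."""
    target = neutral_mass(precursor_mz, charge)
    hits = []
    for protein_id, sequence in database.items():
        for peptide in digest_trypsin(sequence):
            # Discard peptides whose mass is outside the precursor tolerance.
            if abs(peptide_mass(peptide) - target) > precursor_tol:
                continue
            theoretical = fragment_ions(peptide)
            # Count observed fragments explained by a theoretical ion.
            matched = sum(
                any(abs(obs - theo) <= fragment_tol for theo in theoretical)
                for obs in observed_fragments
            )
            hits.append((matched, protein_id, peptide))
    return sorted(hits, reverse=True)


# Hypothetical two-protein database and a simulated spectrum.
database = {"protA": "MKSAMPLERGGK", "protB": "AAAKTTTR"}
observed = fragment_ions("SAMPLER")[:4]
print(search(database, precursor_mz=402.2076, charge=2,
             observed_fragments=observed))
```

Note that the peptide sequence is only read back from the database entry once the masses match; the search itself uses nothing but masses.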
--- The
score for an ms/ms match is probability based: it reflects the
probability that the observed match between the experimental data and
the database sequence is a random event. It is composed of
a peptide mass match and ion scores from ms/ms data. Because a
larger database offers more chances for a random match,
the more possible peptides there are in a database the lower
the score will be for the same ms/ms spectrum.
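
The dependence of the score on database size can be illustrated with a small Python sketch. Mascot reports scores as -10*log10(P), where P is the probability that the match is a random event. How P is actually computed is not described here, so the sketch makes a simplifying assumption purely for illustration: the chance of a random match grows in proportion to the number of candidate peptides in the mass window. The function names and the value of `p_single` are hypothetical.

```python
import math


def probability_score(p_random):
    """Convert the probability of a random match into a -10*log10(P) score."""
    return -10.0 * math.log10(p_random)


def score_with_candidates(p_single, n_candidates):
    """Score when n_candidates peptides could each match at random.

    Assumption for illustration only: the overall random-match
    probability is p_single * n_candidates, capped at 1.
    """
    p_random = min(1.0, p_single * n_candidates)
    return probability_score(p_random)


# Same spectrum (same hypothetical p_single), increasingly large databases.
for n in (100, 10_000, 1_000_000):
    print(n, "candidates ->", round(score_with_candidates(1e-6, n), 1))
```

Under this assumption, every tenfold increase in the number of candidate peptides lowers the score by 10 until it reaches zero.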
In this method of database searching, the sequence of the
peptide is never used. When it is reported in the results
window, it has simply been translated from the appropriate
entry in the database.
For a more detailed explanation of MASCOT scoring and searching,
please see the Mascot page on the UVic-Genome BC Proteomics website.