Protein Analysis
What We Can Do
- Identify proteins from a gel band or simple solution
- Identify multiple proteins in complex mixtures
- Gel Band
  - Coomassie Stain
  - Silver Stain
  - SYPRO Stain
- Solution
How is it done?
The majority of protein sequence analysis today uses mass spectrometry. There are several steps in analyzing a protein.
- Digest the protein to peptides (in gel or in solution). Mass spectrometry currently gets limited sequence data from whole proteins, but can easily analyze peptides.
- Trypsin is the first choice for digestion: it is readily available and specific, the majority of its peptides are an ideal size for analysis, and the peptides behave nicely in a mass spectrometer.
- Separate the peptides, usually on a reverse phase column with an acetonitrile gradient. We use columns 75 µm in diameter. We use formic acid in the solvents because the commonly used trifluoroacetic acid interferes with ionization.
- Place ionized peptides in the vapor phase by passing the column eluate, containing peptides and solvent, through a fine tip to form tiny droplets. After the solvent evaporates, the peptides are left in the vapor phase. Charged surfaces move the ionized peptides into the mass spectrometer. Using chromatography to introduce molecules into a mass spectrometer is called LC-MS (liquid chromatography mass spectrometry) or HPLC-MS.
- Measure mass of peptides.
- Fragment peptides. Collisions with gas molecules fragment peptides at peptide bonds. This is CID (collision induced dissociation) or HCD (higher energy collisional dissociation).
- Measure mass of fragments from peptides. Because there are two steps of mass spectrometry (mass of the peptide, mass of the fragments of the peptide), this is called MS/MS, or MSn when there are two or more fragmentation steps. A two-step process is also called tandem mass spectrometry.
- Use the fragment mass data to determine the sequence of the peptide by finding which combinations of amino acids give the observed masses of the peptide fragments, searching against a protein sequence database from UniProt.
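As a rough illustration of the digestion and peptide mass steps above, the following Python sketch performs an in silico tryptic digest and computes the monoisotopic mass and m/z of each peptide. It assumes the commonly used trypsin rule (cleave after K or R, but not before P) and standard monoisotopic residue masses; the function names and the example sequence are hypothetical and are not part of our analysis pipeline.

```python
# Monoisotopic residue masses in daltons (standard values, 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # mass of the charge carrier in positive ion mode


def tryptic_digest(protein):
    """Split a sequence after K or R, except when the next residue is P.

    Assumption: complete digestion with no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mz(mass, charge):
    """m/z observed for a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


# Hypothetical example sequence, for illustration only.
example_protein = "MKWVTFISLLR"
for pep in tryptic_digest(example_protein):
    m = peptide_mass(pep)
    print(pep, round(m, 4), round(mz(m, 2), 4))
```

Real digests also contain missed cleavages and modified residues, which this sketch ignores.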
The two mass measurements (mass of the peptide and mass of its fragments) require a tandem mass spectrometer, or MS/MS. The two measurements can be performed in
- two different parts of the instrument (tandem in space)
- one cell of the instrument that switches modes (tandem in time)
Most data analysis is done by computer, by comparison with known sequences; SEQUEST is our standard program. For new sequences and confirmation of important sequences, data analysis is done by hand.
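To give a feel for what comparison with known sequences means, here is a minimal Python sketch that computes the singly charged b and y fragment ions expected for candidate peptides and counts how many observed peaks each candidate explains. This is a toy shared-peak count written for illustration; it is not how SEQUEST scores spectra, and the function names, tolerance, and candidate peptides are assumptions.

```python
# Monoisotopic residue masses in daltons (standard values, 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def fragment_ions(peptide):
    """Singly charged b ions followed by y ions for an unmodified peptide.

    b_i holds the first i residues; y_i holds the last i residues plus water.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b + y


def count_matches(observed, peptide, tol=0.02):
    """Number of observed peaks within `tol` (m/z units) of a theoretical ion."""
    theoretical = fragment_ions(peptide)
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_candidate(observed, candidates, tol=0.02):
    """Candidate peptide that explains the most observed peaks."""
    return max(candidates, key=lambda p: count_matches(observed, p, tol))


# Hypothetical data: a perfect spectrum of PEPTIDE and three candidate sequences.
observed = fragment_ions("PEPTIDE")
candidates = ["GASPVTK", "PEPTIDR", "PEPTIDE"]
print(best_candidate(observed, candidates))
```

In a real search the candidates would come from digesting every protein in the database and selecting peptides whose mass matches the measured peptide mass.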
Why do you not get complete sequence data for every protein?
Seeing enough peptides to show 70% of the sequence of a protein (70% coverage) is a very successful protein analysis. In a project by the Cell Migration Consortium to analyze a number of proteins involved in cell migration, 90% coverage of a protein is considered sufficient. There are several reasons why an analysis does not find all amino acids.
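Coverage as used here is the percentage of residues in the protein that fall inside at least one identified peptide. A minimal Python sketch of that calculation, with a hypothetical function name and example data:

```python
def sequence_coverage(protein, peptides):
    """Percent of residues in `protein` covered by at least one peptide.

    Each peptide is located by exact string match; overlapping peptides
    are only counted once per residue.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical example: only the larger of two tryptic peptides was identified.
protein = "MKWVTFISLLR"
identified = ["WVTFISLLR"]
print(round(sequence_coverage(protein, identified), 1))
```

Here 9 of 11 residues are covered, so the two-residue peptide MK accounts for the missing coverage.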
- protein does not digest well
- peptides too hydrophilic or small: they pass through the reverse phase column with the salt and are not analyzed
- peptides too large or hydrophobic: they stick in the gel, adsorb to tubes, do not elute from the column, or are too large for the mass spectrometer to analyze because of poor fragmentation
- peptides fragment in ways that cannot be analyzed. Many spectra in an analysis cannot be interpreted, and some spectra give only limited data; proline, histidine, and internal lysine and arginine are some of the reasons peptides do not give complete fragmentation data
If more data is needed, another protease is used for digestion.
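The sketch below illustrates why a second protease can help. It digests a made-up sequence with two cleavage rules, keeps only peptides inside a length window, and compares which residues are covered. The second protease here is Glu-C (cleaving after E), chosen purely as an example; the length window of 6 to 30 residues, the function names, and the sequence are all assumptions, not facility settings.

```python
def digest(protein, cleave_after, blocked_by=""):
    """Return (start, end) spans of peptides from a simple cleavage rule.

    Cleaves after any residue in `cleave_after`, unless the next residue
    is in `blocked_by`. Assumes complete digestion.
    """
    spans, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1:i + 2]  # empty string at the C-terminus
        if aa in cleave_after and (not next_aa or next_aa not in blocked_by):
            spans.append((start, i + 1))
            start = i + 1
    if start < len(protein):
        spans.append((start, len(protein)))
    return spans


def covered_positions(spans, min_len=6, max_len=30):
    """Residue positions in peptides whose length is inside the assumed window."""
    covered = set()
    for start, end in spans:
        if min_len <= end - start <= max_len:
            covered.update(range(start, end))
    return covered


# Made-up sequence, for illustration only.
protein = "AAEAAAKGKSSESSSR"
trypsin = covered_positions(digest(protein, "KR", blocked_by="P"))
glu_c = covered_positions(digest(protein, "E"))

print(len(trypsin), "of", len(protein), "residues covered by trypsin alone")
print(len(trypsin | glu_c), "of", len(protein), "residues covered by both digests")
```

In this example the short tryptic peptide GK falls below the length window, but the same residues sit inside a longer peptide from the second digest.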