Class for the enzymatic digestion of proteins.
#include <OpenMS/CHEMISTRY/EnzymaticDigestion.h>
| Classes | |
| struct | BindingSite | 
| struct | CleavageModel | 
| Public Types | |
| enum | Enzyme { ENZYME_TRYPSIN, SIZE_OF_ENZYMES } | 
| Possible enzymes for the digestion (adapt NamesOfEnzymes and nextCleavageSite_() if you add more enzymes here). | |
| enum | Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY } | 
| When querying for valid digestion products, this determines whether the specificity of the two peptide ends is considered important. | |
| Public Member Functions | |
| EnzymaticDigestion () | |
| Default constructor. | |
| EnzymaticDigestion (const EnzymaticDigestion &rhs) | |
| Copy constructor. | |
| EnzymaticDigestion & | operator= (const EnzymaticDigestion &rhs) | 
| Assignment operator. | |
| SignedSize | getMissedCleavages () const | 
| Returns the number of missed cleavages for the digestion. | |
| void | setMissedCleavages (SignedSize missed_cleavages) | 
| Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used. | |
| Enzyme | getEnzyme () const | 
| Returns the enzyme for the digestion. | |
| void | setEnzyme (Enzyme enzyme) | 
| Sets the enzyme for the digestion (default is ENZYME_TRYPSIN). | |
| Specificity | getSpecificity () const | 
| Returns the specificity for the digestion. | |
| void | setSpecificity (Specificity spec) | 
| Sets the specificity for the digestion (default is SPEC_FULL). | |
| void | digest (const AASequence &protein, std::vector< AASequence > &output) const | 
| Performs the enzymatic digestion of a protein. | |
| Size | peptideCount (const AASequence &protein) | 
| Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings. | |
| bool | isLogModelEnabled () const | 
| Returns whether the trained model is used when digesting. | |
| void | setLogModelEnabled (bool enabled) | 
| Enables or disables the trained model. | |
| DoubleReal | getLogThreshold () const | 
| Returns the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data). | |
| void | setLogThreshold (DoubleReal threshold) | 
| bool | isValidProduct (const AASequence &protein, Size pep_pos, Size pep_length) | 
| Returns true if the peptide at position pep_pos with length pep_length within the protein protein was generated by the current model. | |
| Static Public Member Functions | |
| static Enzyme | getEnzymeByName (const String &name) | 
| static Specificity | getSpecificityByName (const String &name) | 
| Static Public Attributes | |
| static const std::string | NamesOfEnzymes [SIZE_OF_ENZYMES] | 
| Names of the enzymes. | |
| static const std::string | NamesOfSpecificity [SIZE_OF_SPECIFICITY] | 
| Names of the specificities. | |
| Protected Member Functions | |
| void | nextCleavageSite_ (const AASequence &sequence, AASequence::ConstIterator &p) const | 
| Moves the iterator p behind (i.e., C-term) the next cleavage site of the sequence. | |
| bool | isCleavageSite_ (const AASequence &sequence, const AASequence::ConstIterator &p) const | 
| Tests whether the position pointed to by p (N-term side) is a valid cleavage site. | |
| Protected Attributes | |
| SignedSize | missed_cleavages_ | 
| Number of missed cleavages. | |
| Enzyme | enzyme_ | 
| Used enzyme. | |
| Specificity | specificity_ | 
| Specificity of the enzyme. | |
| bool | use_log_model_ | 
| Use the log model or naive digestion (with missed cleavages). | |
| DoubleReal | log_model_threshold_ | 
| Threshold to decide if a position is cleaved or missed (only for the model). | |
| Map< BindingSite, CleavageModel > | model_data_ | 
| Holds the cleavage model. | |
Class for the enzymatic digestion of proteins.
Digestion can be performed using simple regular expressions, e.g. [KR] | [^P] for trypsin (cleave after K or R unless the next residue is P). Missed cleavages can also be modelled, i.e. adjacent peptides remain joined because the enzyme malfunctioned or could not access the site. If n missed cleavages are given, all possible resulting peptides (cleaved and uncleaved) with up to n missed cleavages are returned. No random selection of just n specific missed cleavage sites is performed.
An alternative model is also available, in which the protein is cleaved only at positions where a cleavage model trained on real data exceeds a certain threshold. The model is published in Siepen et al. (2007), "Prediction of missed cleavage sites in tryptic peptides aids protein identification in proteomics", doi: 10.1021/pr060507u. The model is only available for trypsin and ignores the missed cleavage setting. You should, however, use setLogThreshold() to adjust the false-positive (FP) vs. false-negative (FN) rates. A higher threshold increases the number of cleavages predicted.
| enum Enzyme | 
Possible enzymes for the digestion (adapt NamesOfEnzymes & nextCleavageSite_() if you add more enzymes here)
| Enumerator | |
|---|---|
| ENZYME_TRYPSIN | |
| SIZE_OF_ENZYMES | |
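| enum Specificity | 
When querying for valid digestion products, this determines whether the specificity of the two peptide ends is considered important.
| Enumerator | |
|---|---|
| SPEC_FULL | |
| SPEC_SEMI | |
| SPEC_NONE | |
| SIZE_OF_SPECIFICITY | |
| EnzymaticDigestion | ( | ) | 
Default constructor.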
| EnzymaticDigestion | ( | const EnzymaticDigestion & | rhs | ) | 
Copy constructor.
| void digest | ( | const AASequence & | protein, | 
| std::vector< AASequence > & | output | ||
| ) | const | 
Performs the enzymatic digestion of a protein.
| Enzyme getEnzyme | ( | ) | const | 
Returns the enzyme for the digestion.
| static Enzyme getEnzymeByName | ( | const String & | name | ) | 
Converts an enzyme name string to the corresponding enum value. Returns SIZE_OF_ENZYMES if name is not valid.
| DoubleReal getLogThreshold | ( | ) | const | 
Returns the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data)
| SignedSize getMissedCleavages | ( | ) | const | 
Returns the number of missed cleavages for the digestion.
| Specificity getSpecificity | ( | ) | const | 
Returns the specificity for the digestion.
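| static Specificity getSpecificityByName | ( | const String & | name | ) | 
 | static | 
Converts a specificity name string to the corresponding enum value. Returns SIZE_OF_SPECIFICITY if name is not valid.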
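| bool isCleavageSite_ | ( | const AASequence & | sequence, | 
| const AASequence::ConstIterator & | p | ||
| ) | const | 
 | protected | 
Tests whether the position pointed to by p (N-term side) is a valid cleavage site.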
| bool isLogModelEnabled | ( | ) | const | 
Returns whether the trained model is used when digesting.
| bool isValidProduct | ( | const AASequence & | protein, | 
| Size | pep_pos, | ||
| Size | pep_length | ||
| ) | 
Returns true if the peptide at position pep_pos with length pep_length within the protein protein was generated by the current model.
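| void nextCleavageSite_ | ( | const AASequence & | sequence, | 
| AASequence::ConstIterator & | p | ||
| ) | const | 
 | protected | 
Moves the iterator p behind (i.e., C-term) the next cleavage site of the sequence.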
| EnzymaticDigestion& operator= | ( | const EnzymaticDigestion & | rhs | ) | 
Assignment operator.
| Size peptideCount | ( | const AASequence & | protein | ) | 
Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings. 
| void setEnzyme | ( | Enzyme | enzyme | ) | 
Sets the enzyme for the digestion (default is ENZYME_TRYPSIN).
| void setLogModelEnabled | ( | bool | enabled | ) | 
Enables or disables the trained model.
| void setLogThreshold | ( | DoubleReal | threshold | ) | 
Sets the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data). The default is 0.25.
| void setMissedCleavages | ( | SignedSize | missed_cleavages | ) | 
Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used.
| void setSpecificity | ( | Specificity | spec | ) | 
Sets the specificity for the digestion (default is SPEC_FULL).
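| Enzyme enzyme_ | protected |
Used enzyme.
| DoubleReal log_model_threshold_ | protected |
Threshold to decide if a position is cleaved or missed (only for the model).
| SignedSize missed_cleavages_ | protected |
Number of missed cleavages.
| Map< BindingSite, CleavageModel > model_data_ | protected |
Holds the cleavage model.
| const std::string NamesOfEnzymes [SIZE_OF_ENZYMES] | static |
Names of the enzymes.
| const std::string NamesOfSpecificity [SIZE_OF_SPECIFICITY] | static |
Names of the specificities.
| Specificity specificity_ | protected |
Specificity of the enzyme.
| bool use_log_model_ | protected |
Use the log model or naive digestion (with missed cleavages).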
| OpenMS / TOPP release 1.11.1 | Documentation generated on Thu Nov 14 2013 11:19:27 using doxygen 1.8.5 |