PROSCAN is a tool that searches for patterns within a given protein sequence and performs composition analysis.
The table below shows example patterns and, for each one, the amino acids accepted at each position of the identified sequences.

Pattern to be searched | First position | Second position | Third position | Fourth position |
---|---|---|---|---|
G[IL]KP | G | I or L | K | P |
G.KP | G | any amino acid | K | P |
G[^L]KP | G | any amino acid except L | K | P |
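
The three patterns in the table can be tried out with a short sketch. It assumes that PROSCAN patterns behave like standard regular expressions and uses Python's `re` module as a stand-in; the sequence and the helper `find_matches` are hypothetical and not part of PROSCAN.

```python
import re

# Hypothetical example sequence, not taken from the PROSCAN documentation.
sequence = "MAGIKPTTGLKPSSGAKPEE"


def find_matches(pattern, seq):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]


for pattern in ("G[IL]KP", "G.KP", "G[^L]KP"):
    print(pattern, find_matches(pattern, sequence))
```

In this sequence, G[IL]KP finds GIKP and GLKP, G.KP also finds GAKP, and G[^L]KP finds GIKP and GAKP but skips GLKP.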
Advanced features for pattern definition:
The star symbol "*" means a repetition (zero or more times) of the previous
symbol.
The plus symbol "+" means a repetition (one or more times) of the previous symbol.
Example: the pattern GK.*HA identifies segments that start with GK and end with
HA, separated by any number of amino acids (including none).
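
The difference between "*" and "+" can be illustrated as follows. This is a minimal sketch that again assumes standard regular-expression semantics, here provided by Python's `re` module; the candidate segments and the helper `is_whole_match` are made up for illustration.

```python
import re


def is_whole_match(pattern, segment):
    """True if the entire segment matches the pattern."""
    return re.fullmatch(pattern, segment) is not None


# Hypothetical candidate segments.
candidates = ["GKHA", "GKLHA", "GKLLLHA", "GKH"]

for segment in candidates:
    star = is_whole_match("GK.*HA", segment)  # zero or more residues between GK and HA
    plus = is_whole_match("GK.+HA", segment)  # at least one residue between GK and HA
    print(segment, star, plus)
```

GKHA matches only the "*" version, because nothing separates GK from HA; GKLHA and GKLLLHA match both versions; GKH matches neither.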
Parentheses ( ) can be used to group a sequence and apply a rule to the group.
You can specify how many repetitions you are searching for by using a pair of
curly braces { } with one or two numbers inside.
Example: the pattern (PQV){3,5} identifies the following sequences:
PQVPQVPQV - that is, the pattern PQV repeated 3 times;
PQVPQVPQVPQV - that is, the pattern PQV repeated 4 times;
PQVPQVPQVPQVPQV - that is, the pattern PQV repeated 5 times.
You can leave the lower or the upper number of repetitions unspecified.
Example: (PQV){2,} identifies PQV repeated two or more times.
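
The repetition counts can be checked with a small sketch. It assumes that the curly-brace syntax follows standard regular-expression rules, as implemented by Python's `re` module; the helper `is_repeat` is hypothetical.

```python
import re


def is_repeat(pattern, segment):
    """True if the entire segment matches the pattern."""
    return re.fullmatch(pattern, segment) is not None


for n in range(1, 7):
    segment = "PQV" * n  # PQV repeated n times
    bounded = is_repeat("(PQV){3,5}", segment)  # 3 to 5 repetitions
    open_ended = is_repeat("(PQV){2,}", segment)  # 2 or more repetitions
    print(n, segment, bounded, open_ended)
```

Under these assumptions, (PQV){3,5} accepts exactly 3, 4 or 5 repetitions, while (PQV){2,} accepts any count from 2 upward.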