PROSCAN


PROSCAN is a tool that searches for patterns within a given protein sequence and performs composition analysis.

See below for a description of patterns to be searched.



Type the sequence to analyze:

Description:
Sequence:

If you want to highlight a pattern, type it here:

N.B.: patterns are defined in the same way as Perl regular expressions. For example, the pattern G[GR]K finds both GGK and GRK; the pattern G.K finds G and K separated by exactly one amino acid of any type; the pattern G[^G]K finds G, then any amino acid except G, then K; the pattern GK.*GK finds segments in which two GK pairs are separated by any number of amino acids (including none). See below for more instructions.


Pattern definition

You can define a pattern using Perl regular expression syntax. These examples refer to a simple four-residue pattern:

Pattern to be searched   Identified sequences
                         First position   Second position            Third position   Fourth position
G[IL]KP                  G                I or L                     K                P
G.KP                     G                any amino acid             K                P
G[^L]KP                  G                any amino acid except L    K                P
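
The three patterns in the table can be tried outside PROSCAN with any Perl-compatible regular expression engine. The sketch below uses Python's re module, which treats character classes and the dot the same way for these simple cases. The test sequence and the helper find_pattern are hypothetical and serve only as an illustration; they are not part of PROSCAN.

```python
import re

# Hypothetical test sequence, chosen only to illustrate the table above.
sequence = "MGIKPAAGLKPTTGAKPGLKP"


def find_pattern(pattern, seq):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]


for pattern in ("G[IL]KP", "G.KP", "G[^L]KP"):
    print(pattern, find_pattern(pattern, sequence))
```

In this sequence, G[IL]KP finds the GIKP and GLKP segments, G.KP additionally finds GAKP, and G[^L]KP finds only GIKP and GAKP.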

Advanced features for pattern definition:
The star symbol "*" means a repetition (zero or more times) of the previous symbol.
The plus symbol "+" means a repetition (one or more times) of the previous symbol.
Example: the pattern GK.*HA identifies segments consisting of GK and HA separated by any number of amino acids (including none).
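
The difference between "*" and "+" can be checked with a short sketch, again using Python's re module as a stand-in for Perl regular expressions. The sequences are hypothetical. Note that in Perl-style engines these quantifiers are greedy by default, so the match extends to the last possible HA; whether PROSCAN reports matches in exactly this way is an assumption.

```python
import re

# Hypothetical sequences used only for illustration.
adjacent = "GKHA"            # GK directly followed by HA
spaced = "AAGKTTHACCHAGG"    # GK followed by two HA occurrences

# ".*" allows zero residues between GK and HA, so the adjacent case matches.
star_adjacent = re.search("GK.*HA", adjacent)

# ".+" requires at least one residue in between, so the adjacent case fails.
plus_adjacent = re.search("GK.+HA", adjacent)

# Greedy matching: the match runs from GK to the LAST HA in the sequence.
star_spaced = re.search("GK.*HA", spaced)

print(star_adjacent.group())   # GKHA
print(plus_adjacent)           # None
print(star_spaced.group())     # GKTTHACCHA
```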

Parentheses ( ) can be used to group a sequence and apply a rule to the group.
You can specify how many repetitions you are searching for by using a pair of curly braces { } with one or two numbers inside.
Example: the pattern (PQV){3,5} identifies the following sequences:
PQVPQVPQV - that is, the pattern PQV repeated 3 times;
PQVPQVPQVPQV - that is, the pattern PQV repeated 4 times;
PQVPQVPQVPQVPQV - that is, the pattern PQV repeated 5 times.
You can leave the lower or the upper number of repetitions unspecified.
Example: (PQV){2,} identifies any repetition of PQV (two or more times).
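
As a rough check of the repetition counts, the sketch below tests which numbers of PQV repeats are accepted by each pattern. It uses Python's re module in place of Perl, and the helper accepted_counts is hypothetical, introduced only for this illustration.

```python
import re

UNIT = "PQV"
bounded = re.compile(r"(PQV){3,5}")     # between 3 and 5 repeats
open_ended = re.compile(r"(PQV){2,}")   # 2 or more repeats


def accepted_counts(regex, max_repeats=7):
    """Return the repeat counts n for which the whole string UNIT * n matches."""
    return [n for n in range(1, max_repeats + 1) if regex.fullmatch(UNIT * n)]


print(accepted_counts(bounded))      # [3, 4, 5]
print(accepted_counts(open_ended))   # [2, 3, 4, 5, 6, 7]
```

When a pattern is searched inside a longer sequence rather than tested against a whole string, a run of six PQV units still yields a match for (PQV){3,5}, covering the first five units.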