localalign
Return local optimal and suboptimal alignments between two sequences
Description
AlignStruct = localalign(Seq1,Seq2) returns information about the first optimal (highest scoring) local alignment between two sequences in a MATLAB® structure.
AlignStruct = localalign(Seq1,Seq2,Name=Value) specifies options using one or more name-value arguments in addition to the arguments in the previous syntax.
Examples
Limit the number of alignments to return between two sequences by specifying the number of alignments.
Create variables containing two amino acid sequences.
Seq1 = "VSPAGMASGYDPGKA"; Seq2 = "IPGKATREYDVSPAG";
Use the NumAln argument to return information about the top three local alignments.
struct1 = localalign(Seq1,Seq2,NumAln=3)
struct1 = struct with fields:
Score: [3×1 double]
Start: [3×2 double]
Stop: [3×2 double]
Alignment: {3×1 cell}
View the scores of the first and second alignments.
struct1.Score(1:2)
ans = 2×1
11.0000
9.6667
View the first alignment.
struct1.Alignment{1}
ans = 3×5 char array
'VSPAG'
'|||||'
'VSPAG'
Limit the number of alignments to return between two sequences by specifying a minimum score.
Create variables containing two amino acid sequences.
Seq1 = "VSPAGMASGYDPGKA"; Seq2 = "IPGKATREYDVSPAG";
Use MinScore to return information about only local alignments with a score greater than 8. Use DoAlignment to exclude the actual alignments.
struct2 = localalign(Seq1,Seq2,MinScore=8,DoAlignment=false)
struct2 = struct with fields:
Score: [2×1 double]
Start: [2×2 double]
Stop: [2×2 double]
Limit the number of alignments to return between two sequences by specifying a percentage of the highest score.
Create variables containing two amino acid sequences.
Seq1 = "VSPAGMASGYDPGKA"; Seq2 = "IPGKATREYDVSPAG";
Use Percent to return information about only local alignments with a score within 15% of the maximum score.
struct3 = localalign(Seq1,Seq2,Percent=15)
struct3 = struct with fields:
Score: [2×1 double]
Start: [2×2 double]
Stop: [2×2 double]
Alignment: {2×1 cell}
Specify a scoring matrix and gap opening penalty when aligning two sequences.
Create variables containing two nucleotide sequences.
Seq1 = "CCAATCTACTACTGCTTGCAGTAC"; Seq2 = "AGTCCGAGGGCTACTCTACTGAAC";
Create a scoring matrix with a match score of 10 and a mismatch score of -9.
sm = [10 -9 -9 -9;
-9 10 -9 -9;
-9 -9 10 -9;
-9 -9 -9 10];
Use ScoringMatrix and GapOpen when returning information about the top three local alignments.
struct4 = localalign(Seq1,Seq2, ...
    Alphabet="nt", ...
    ScoringMatrix=sm, ...
    GapOpen=20, ...
    NumAln=3)
struct4 = struct with fields:
Score: [3×1 double]
Start: [3×2 double]
Stop: [3×2 double]
Alignment: {3×1 cell}
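The match/mismatch matrix used in this example can also be built programmatically instead of typed out. The following is a minimal sketch; the variable names matchScore and mismatchScore are illustrative and are not part of localalign.

```matlab
matchScore = 10;      % score for identical nucleotides (illustrative name)
mismatchScore = -9;   % score for differing nucleotides (illustrative name)

% Fill every entry with the mismatch score, then raise the diagonal
% entries to the match score.
sm = mismatchScore*ones(4) + (matchScore - mismatchScore)*eye(4);
```

The resulting 4-by-4 matrix has 10 on the diagonal and -9 elsewhere, so it can be passed as ScoringMatrix in the same way as the hand-written matrix.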
Input Arguments
First amino acid or nucleotide sequence (Seq1), specified as one of these values:
- Character vector or string of letters representing amino acids or nucleotides, such as returned by int2aa or int2nt
- Vector of integers representing amino acids or nucleotides, such as returned by aa2int or nt2int
- MATLAB structure array containing a Sequence field, such as returned by fastaread, fastqread, emblread, getembl, genbankread, getgenbank, getgenpept, genpeptread, getpdb, pdbread, or sffread
For help with letter and integer representations of amino acids and nucleotides, see Amino Acid Lookup or Nucleotide Lookup.
Data Types: double | char | string | struct
Second amino acid or nucleotide sequence (Seq2), specified as one of these values:
- Character vector or string of letters representing amino acids or nucleotides, such as returned by int2aa or int2nt
- Vector of integers representing amino acids or nucleotides, such as returned by aa2int or nt2int
- MATLAB structure array containing a Sequence field, such as returned by fastaread, fastqread, emblread, getembl, genbankread, getgenbank, getgenpept, genpeptread, getpdb, pdbread, or sffread
For help with letter and integer representations of amino acids and nucleotides, see Amino Acid Lookup or Nucleotide Lookup.
Data Types: double | char | string | struct
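As an illustration of the accepted input types, the sketch below passes the same two amino acid sequences once as letters and once as integer vectors produced by aa2int. It assumes that both representations describe the same sequences to localalign and therefore lead to the same scores; the variable names are illustrative.

```matlab
Seq1 = 'VSPAGMASGYDPGKA';
Seq2 = 'IPGKATREYDVSPAG';

% Convert the letter representation to the integer representation.
int1 = aa2int(Seq1);
int2 = aa2int(Seq2);

% Align the letter form and the integer form of the same sequences.
sLetters  = localalign(Seq1,Seq2);
sIntegers = localalign(int1,int2);
```

Under the stated assumption, sLetters.Score and sIntegers.Score should agree.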
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN, where Name is
the argument name and Value is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Example: NumAln=3
Number of alignments to return (NumAln), specified as a positive number not exceeding
2^12. The localalign function returns
the top NumAln local, nonintersecting alignments (optimal and
suboptimal). If the number of optimal alignments is greater than the specified number
of alignments, then localalign returns the first
NumAln alignments based on their order in the traceback
matrix.
Use NumAln to return multiple alignments when you are aligning
low complexity sequences and must consider several local alignments.
Data Types: double
Minimum score of local, nonintersecting alignments (optimal and suboptimal) to
return, specified as a positive number. Use MinScore to return
suboptimal alignments, for example when you are interested in accounting for
sequencing errors or imperfect scoring matrices.
Data Types: double
Percent of the highest score, specified as a positive number between 0 and 100.
This value limits the return of local, nonintersecting alignments (optimal and
suboptimal) to those alignments with a score within the specified percentage of the
highest score. For example, if the highest score is 10.5 and you
specify 5, then localalign determines a
minimum score of 10.5 – (10.5 * 0.05) = 9.975. It returns all
alignments with a score of 9.975 or higher.
Use Percent to return optimal and suboptimal alignments when
you do not know how similar the two sequences are or how well they score against a
given scoring matrix.
Data Types: double
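The threshold arithmetic described for Percent can be reproduced directly. This is a minimal sketch of the calculation only, not of how localalign is implemented; the variable names and the list of candidate scores are made up for illustration.

```matlab
highestScore = 10.5;   % highest alignment score, from the example in the text
pct = 5;               % value passed as Percent

% Minimum score an alignment must reach to be returned.
minScore = highestScore - highestScore*(pct/100);

% Hypothetical candidate scores, filtered with the same rule.
scores = [10.5; 10.0; 9.5];
kept = scores(scores >= minScore);
```

Here minScore evaluates to 9.975, so the hypothetical scores 10.5 and 10.0 are kept and 9.5 is dropped.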
Indicator to include the pairwise alignments in the Alignment
field of the output structure array (DoAlignment), specified as true or
false.
Data Types: logical
Type of sequences (Alphabet), specified as "AA" for amino acids or
"NT" for nucleotides.
Data Types: char | string
Scoring matrix to use for local alignment (ScoringMatrix), specified as one of these values:
- A string or a corresponding character vector naming a scoring matrix. For amino acid sequences, the choices are:
  - "BLOSUM62"
  - "BLOSUM30" increasing by 5 up to "BLOSUM90"
  - "BLOSUM100"
  - "PAM10" increasing by 10 up to "PAM500"
  - "DAYHOFF"
  - "GONNET"
  The default values are:
  - "BLOSUM50": if Alphabet is set to "AA".
  - "NUC44": if Alphabet is set to "NT".
  The previous scoring matrices, provided with the software, also include a structure containing a scale factor that converts the units of the output score to bits. You can also use the Scale argument to specify an additional scale factor to convert the output score from bits to another unit.
- A matrix representing the scoring matrix to use for the local alignment, such as returned by the blosum, pam, dayhoff, gonnet, or nuc44 function.
  If you use a scoring matrix that you created or that was created by one of the previous functions, the matrix does not include a scale factor. The output score is returned in the same units as the scoring matrix. You can use the Scale argument to specify a scale factor to convert the output score to another unit.
If you need to compile localalign into a stand-alone
application or software component using MATLAB
Compiler™, use a matrix instead of a character vector or string for
ScoringMatrix.
Data Types: char | string
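When a numeric matrix is required, for example for compilation, one option is to obtain it from the blosum function and pass its scale factor separately. The sketch below assumes that the second output of blosum is a structure whose Scale field holds the factor that converts scores to bits; the variable names M, info, sMatrix, and sName are illustrative.

```matlab
Seq1 = 'VSPAGMASGYDPGKA';
Seq2 = 'IPGKATREYDVSPAG';

% Numeric BLOSUM62 matrix plus a structure describing it.
[M, info] = blosum(62);

% A numeric matrix carries no scale factor, so supply it through Scale
% (assumption: info.Scale is the factor that converts scores to bits).
sMatrix = localalign(Seq1,Seq2,ScoringMatrix=M,Scale=info.Scale);

% Same alignment requested by name, for comparison.
sName = localalign(Seq1,Seq2,ScoringMatrix="BLOSUM62");
```

If the assumption about info.Scale holds, both calls should report the same score, with the first call avoiding the name-based lookup.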
Scale factor applied to the output scores, thereby controlling the units of the
output scores, specified as a positive number. For example, if the output score is
initially determined in bits, and you specify the scale factor
log(2), then localalign returns
Score in nats. The default value 1 does not
change the units of the output score.
If the ScoringMatrix argument also specifies a scale factor,
then localalign uses it first to scale the output score. It then
applies the scale factor specified by Scale to rescale the output
score.
Before comparing alignment scores from multiple alignments, ensure that the scores
are in the same units. Use Scale to control the units of the output
scores.
Data Types: double
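The effect of Scale can be checked by aligning the same sequences twice, once with the default and once with log(2). This is a minimal sketch that relies on the behavior described above (default amino acid scores in bits, rescaled by Scale); the variable names are illustrative.

```matlab
Seq1 = 'VSPAGMASGYDPGKA';
Seq2 = 'IPGKATREYDVSPAG';

% Default Scale of 1: the score keeps the units given by the scoring matrix (bits).
sBits = localalign(Seq1,Seq2);

% Scale of log(2): the same alignment, with the score converted to nats.
sNats = localalign(Seq1,Seq2,Scale=log(2));

% Ratio of the two scores, expected to match the scale factor.
ratio = sNats.Score / sBits.Score;
```

Comparing ratio with log(2) is a quick way to confirm which units a given score is in before comparing it with scores from other alignments.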
Penalty for opening a gap in the alignment (GapOpen), specified as a positive number.
Data Types: double
Output Arguments
Information about the local optimal and suboptimal alignments between two sequences, returned as a MATLAB structure array or array of structure arrays. Each structure array represents an optimal or suboptimal alignment and contains these fields.
| Field | Description |
|---|---|
| Score | Score for the local optimal or suboptimal alignment. |
| Start | 1-by-2 vector of indices indicating the starting point in each sequence for the alignment. |
| Stop | 1-by-2 vector of indices indicating the stopping point in each sequence for the alignment. |
| Alignment | 3-by-N character array showing the two sequences in the first and third rows, with symbols marking the alignment between them in the second row. |
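The Start and Stop fields can be used to pull the aligned region out of each input sequence. The sketch below assumes that the first column of Start and Stop refers to Seq1 and the second column to Seq2, and that gaps in the Alignment field are written as '-'; neither detail is stated in this page, and the variable names are illustrative.

```matlab
Seq1 = 'VSPAGMASGYDPGKA';
Seq2 = 'IPGKATREYDVSPAG';
s = localalign(Seq1,Seq2,NumAln=3);

k = 1;  % index of the alignment to inspect

% Assumption: column 1 of Start/Stop indexes Seq1, column 2 indexes Seq2.
sub1 = Seq1(s.Start(k,1):s.Stop(k,1));
sub2 = Seq2(s.Start(k,2):s.Stop(k,2));

% Rows 1 and 3 of the Alignment entry hold the aligned sequences.
% Assumption: gaps are written as '-'; removing them should recover
% the same subsequences.
row1 = strrep(s.Alignment{k}(1,:),'-','');
row3 = strrep(s.Alignment{k}(3,:),'-','');
```

If the assumptions hold, sub1 matches row1 and sub2 matches row3, which gives a simple consistency check between the index fields and the Alignment field.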
More About
Nonintersecting alignments: Alignments having no matches or mismatches in common.
Optimal alignment: An alignment with the highest score.
Suboptimal alignment: An alignment with a score less than the highest score.
References
[1] Barton, G. (1993). An efficient algorithm to locate all locally optimal alignments between two sequences allowing for gaps. CABIOS 9, 729–734.
Version History
Introduced in R2009b
See Also
multialignread | fastaread | gethmmalignment | seqalignviewer | multialign | nwalign | swalign | blosum | pam | dayhoff | gonnet | nuc44