oligoprop

Calculate sequence properties of DNA oligonucleotide

Description

SeqProperties = oligoprop(SeqNT) returns the sequence properties for a DNA oligonucleotide as a structure.

SeqProperties = oligoprop(SeqNT,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

Examples

  1. Create a random sequence.

    seq = randseq(25)
    
    seq =
    
    TAGCTTCATCGTTGACTTCTACTAA
  2. Calculate sequence properties of the sequence.

    S1 = oligoprop(seq)
    
    S1 = 
    
                    GC: 36
               GCdelta: 0
              Hairpins: [0x25 char]
                Dimers: 'tAGCTtcatcgttgacttctactaa'
             MolWeight: 7.5820e+003
        MolWeightdelta: 0
                    Tm: [52.7640 60.8629 62.2493 55.2870 54.0293 61.0614]
               Tmdelta: [0 0 0 0 0 0]
                Thermo: [4x3 double]
           Thermodelta: [4x3 double]
  3. List the thermodynamic calculations for the sequence.

    S1.Thermo
    
    ans =
    
     -178.5000 -477.5700  -36.1125
     -182.1000 -497.8000  -33.6809
     -190.2000 -522.9000  -34.2974
     -191.9000 -516.9000  -37.7863
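
The third column of Thermo can be cross-checked against the first two with the Gibbs relation, delta G = delta H - T * delta S. The sketch below hard-codes the values printed above so that it runs without the toolbox. The temperature of 25 degrees Celsius (298.15 K) is an assumption; the output does not state it, although it reproduces the listed free energies.

```matlab
% Values copied from the S1.Thermo output above.
thermo = [-178.5 -477.57 -36.1125
          -182.1 -497.80 -33.6809
          -190.2 -522.90 -34.2974
          -191.9 -516.90 -37.7863];

T  = 25 + 273.15;           % ASSUMED temperature in kelvin (25 degrees Celsius)
dH = thermo(:,1);           % enthalpy, kcal/mol
dS = thermo(:,2) / 1000;    % entropy, converted from cal/(mol*K) to kcal/(mol*K)
dG = dH - T * dS;           % Gibbs relation, kcal/mol

% Largest deviation from the reported third column. The displayed values
% are rounded, so agreement is limited to about four decimal places.
maxDiff = max(abs(dG - thermo(:,3)));
disp(maxDiff)
```

Because the entropy is reported in calories and the other two columns in kilocalories, the division by 1000 is needed before combining them.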
  1. Calculate sequence properties of the sequence ACGTAGAGGACGTN.

    S2 = oligoprop('ACGTAGAGGACGTN')
    
    S2 = 
    
                    GC: 53.5714
               GCdelta: 3.5714
              Hairpins: 'ACGTagaggACGTn'
                Dimers: [3x14 char]
             MolWeight: 4.3329e+003
        MolWeightdelta: 20.0150
                    Tm: [38.8357 42.2958 57.7880 52.4180 49.9633 55.1330]
               Tmdelta: [1.4643 1.4643 10.3885 3.4633 0.2829 3.8074]
                Thermo: [4x3 double]
           Thermodelta: [4x3 double]
    
  2. List the potential dimers for the sequence.

    S2.Dimers
    
    ans =
    
    ACGTagaggacgtn
    ACGTagaggACGTn
    acgtagagGACGTN
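
The GC and GCdelta values reported for this sequence can be recomputed by hand from the field descriptions in Output Arguments. The following is a minimal sketch of that arithmetic, not the internal implementation of oligoprop; the variable names are illustrative.

```matlab
seq = 'ACGTAGAGGACGTN';

nGC = sum(seq == 'G' | seq == 'C');   % unambiguous G or C bases
nN  = sum(seq == 'N');                % ambiguous bases
len = numel(seq);

gcMin = 100 * nGC / len;              % every N assumed to be not G/C
gcMax = 100 * (nGC + nN) / len;       % every N assumed to be G/C

GC      = (gcMin + gcMax) / 2;        % midpoint value
GCdelta = (gcMax - gcMin) / 2;        % half-width of the possible range

fprintf('GC = %.4f, GCdelta = %.4f\n', GC, GCdelta);
```

For this sequence the sketch gives 53.5714 and 3.5714, which agree with the GC and GCdelta fields of S2.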

Input Arguments

SeqNT

DNA oligonucleotide sequence, represented by any of the following:

  • Character vector or string containing the letters A, C, G, T, or N

  • Vector of integers containing the values 1, 2, 3, 4, or 15

  • Structure containing a Sequence field that contains a nucleotide sequence
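
The three input forms can be sketched as follows. The integer coding used here (A=1, C=2, G=3, T=4, N=15) is assumed to follow the convention of the toolbox functions nt2int and int2nt; it is not spelled out in this section.

```matlab
seqChar   = 'ACGTAGAGGACGTN';                  % character vector
seqInt    = [1 2 3 4 1 3 1 3 3 1 2 3 4 15];    % same bases as integers (assumed coding)
seqStruct = struct('Sequence', seqChar);       % structure with a Sequence field

% Show the integer vector as letters to confirm the assumed coding.
disp(int2nt(seqInt))

P1 = oligoprop(seqChar);
P2 = oligoprop(seqInt);
P3 = oligoprop(seqStruct);
```

If the coding assumption holds, all three calls describe the same oligonucleotide, so fields such as GC should agree across P1, P2, and P3.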

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: 'Salt',0.02,'Temp',20 specifies a salt concentration of 0.02 moles/liter and a temperature of 20 degrees Celsius.

Salt

Specify the salt concentration in moles/liter for melting temperature calculations.

Example: 'Salt',0.02

Temp

Specify the temperature in degrees Celsius for nearest-neighbor calculations of free energy.

Example: 'Temp',20

Primerconc

Specify the primer concentration in moles/liter for melting temperature calculations.

Example: 'Primerconc',40e-6

HPBase

Specify the minimum number of paired bases that form the neck of a hairpin.

Example: 'HPBase',6

HPLoop

Specify the minimum number of bases that form the loop of a hairpin.

Example: 'HPLoop',2

Dimerlength

Specify the minimum number of aligned bases between the sequence and its reverse.

Example: 'Dimerlength',6
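
Several options can be combined in a single call. The sketch below reuses the sequence from the first example together with the example values listed in this section; the particular combination is illustrative only and is not a recommended setting.

```matlab
seq = 'TAGCTTCATCGTTGACTTCTACTAA';

% Option values are taken from the per-option examples and are illustrative.
S = oligoprop(seq, 'Salt', 0.02, 'Temp', 20, ...
                   'Primerconc', 40e-6, ...
                   'HPBase', 6, 'HPLoop', 2);

disp(S.Tm)       % melting temperatures for the six methods
disp(S.Thermo)   % enthalpy, entropy, and free energy for each parameter set
```

Comparing S.Tm and S.Thermo with the output of oligoprop(seq) is a quick way to see which fields a given option affects.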

Output Arguments

Sequence properties for a DNA oligonucleotide as a structure with the following fields:

GC

Percent GC content for the DNA oligonucleotide. Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, GC is the midpoint value, and its uncertainty is expressed by GCdelta.

GCdelta

The difference between GC (midpoint value) and either the maximum or minimum value GC could assume. The maximum and minimum values are calculated by assuming all N characters are G/C or not G/C, respectively. Therefore, GCdelta defines the possible range of GC content.

Hairpins

H-by-length(SeqNT) matrix of characters displaying all potential hairpin structures for the sequence SeqNT. Each row is a potential hairpin structure of the sequence, with the hairpin-forming nucleotides designated by capital letters. H is the number of potential hairpin structures for the sequence. Ambiguous N characters in SeqNT are considered to potentially complement any nucleotide.

Dimers

D-by-length(SeqNT) matrix of characters displaying all potential dimers for the sequence SeqNT. Each row is a potential dimer of the sequence, with the self-dimerizing nucleotides designated by capital letters. D is the number of potential dimers for the sequence. Ambiguous N characters in SeqNT are considered to potentially complement any nucleotide.

MolWeight

Molecular weight of the DNA oligonucleotide. Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, MolWeight is the midpoint value, and its uncertainty is expressed by MolWeightdelta.

MolWeightdelta

The difference between MolWeight (midpoint value) and either the maximum or minimum value MolWeight could assume. The maximum and minimum values are calculated by assuming all N characters are G or C, respectively. Therefore, MolWeightdelta defines the possible range of molecular weight for SeqNT.

Tm

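A vector with melting temperature values, in degrees Celsius, calculated by six different methods, listed in the following order:

  • Basic (Marmur et al., 1962)

  • Salt adjusted (Howley et al., 1979)

  • Nearest-neighbor (Breslauer et al., 1986)

  • Nearest-neighbor (SantaLucia Jr. et al., 1996)

  • Nearest-neighbor (SantaLucia Jr., 1998)

  • Nearest-neighbor (Sugimoto et al., 1996)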

Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, Tm is the midpoint value, and its uncertainty is expressed by Tmdelta.

Tmdelta

A vector containing the differences between Tm (midpoint value) and either the maximum or minimum value Tm could assume for each of the six methods. Therefore, Tmdelta defines the possible range of melting temperatures for SeqNT.

Thermo

4-by-3 matrix of thermodynamic calculations.

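The rows correspond to nearest-neighbor parameters from:

  • Breslauer et al., 1986

  • SantaLucia Jr. et al., 1996

  • SantaLucia Jr., 1998

  • Sugimoto et al., 1996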

The columns correspond to:

  • delta H: Enthalpy in kilocalories per mole, kcal/mol

  • delta S: Entropy in calories per mole-kelvin, cal/(mol·K)

  • delta G: Free energy in kilocalories per mole, kcal/mol

Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, Thermo is the midpoint value, and its uncertainty is expressed by Thermodelta.

Thermodelta

4-by-3 matrix containing the differences between Thermo (midpoint value) and either the maximum or minimum value Thermo could assume for each calculation and method. Therefore, Thermodelta defines the possible range of thermodynamic values for SeqNT.
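
Each midpoint field and its delta field together define a range: the lower bound is the midpoint minus the delta, and the upper bound is the midpoint plus the delta. A minimal sketch for Tm, using the values printed for the sequence ACGTAGAGGACGTN in the Examples section (hard-coded here so that the code runs without the toolbox; the variable names are illustrative):

```matlab
% Values copied from the S2 example output.
Tm      = [38.8357 42.2958 57.7880 52.4180 49.9633 55.1330];
Tmdelta = [ 1.4643  1.4643 10.3885  3.4633  0.2829  3.8074];

TmMin = Tm - Tmdelta;      % lowest melting temperature each method could give
TmMax = Tm + Tmdelta;      % highest melting temperature each method could give

TmRange = [TmMin; TmMax];  % 2-by-6: row 1 lower bounds, row 2 upper bounds
disp(TmRange)
```

The same pattern applies to GC with GCdelta, MolWeight with MolWeightdelta, and Thermo with Thermodelta.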

References

[1] Breslauer, K.J., Frank, R., Blöcker, H., and Marky, L.A. (1986). Predicting DNA duplex stability from the base sequence. Proceedings of the National Academy of Sciences USA 83, 3746–3750.

[2] Chen, S.H., Lin, C.Y., Cho, C.S., Lo, C.Z., and Hsiung, C.A. (2003). Primer Design Assistant (PDA): A web-based primer design tool. Nucleic Acids Research 31(13), 3751–3754.

[3] Howley, P.M., Israel, M.A., Law, M., and Martin, M.A. (1979). A rapid method for detecting and mapping homology between heterologous DNAs. Evaluation of polyomavirus genomes. The Journal of Biological Chemistry 254(11), 4876–4883.

[4] Marmur, J., and Doty, P. (1962). Determination of the base composition of deoxyribonucleic acid from its thermal denaturation temperature. Journal of Molecular Biology 5, 109–118.

[5] Panjkovich, A., and Melo, F. (2005). Comparison of different melting temperature calculation methods for short DNA sequences. Bioinformatics 21(6), 711–722.

[6] SantaLucia Jr., J., Allawi, H.T., and Seneviratne, P.A. (1996). Improved Nearest-Neighbor Parameters for Predicting DNA Duplex Stability. Biochemistry 35, 3555–3562.

[7] SantaLucia Jr., J. (1998). A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proceedings of the National Academy of Sciences USA 95, 1460–1465.

[8] Sugimoto, N., Nakano, S., Yoneyama, M., and Honda, K. (1996). Improved thermodynamic parameters and helix initiation factor to predict stability of DNA duplexes. Nucleic Acids Research 24(22), 4501–4505.

Version History

Introduced before R2006a