
If I have a primer with redundant bases, how do I generate all associated primer combinations?

As an example, if I have the following primer:
primer = 'AGCTYRSWKMACGT';
And these are the options for each redundant base:
options = {'C', 'T'; ... % Y
           'A', 'G'; ... % R
           'C', 'G'; ... % S
           'A', 'T'; ... % W
           'G', 'T'; ... % K
           'A', 'C'};    % M
How do I generate all primer combinations, e.g. 1) AGCTCACAGAACGT, 2) AGCTTACAGAACGT, 3) AGCTCGCAGAACGT, etc.? There should be 64 (2^6) primer combinations for the above example. Thanks!

Accepted Answer

Tim DeFreitas on 2 May 2023
Per your last comment, here's a longer but more robust approach that works regardless of where the ambiguous bases are in the primer sequence:
primer = 'AWCTARCTAMGT';
allPrimers = char.empty(1,0);
for b = 1:numel(primer)
    base = primer(b);
    switch base
        case 'Y'
            nextBases = 'CT';
        case 'R'
            nextBases = 'AG';
        case 'S'
            nextBases = 'CG';
        case 'W'
            nextBases = 'AT';
        case 'K'
            nextBases = 'GT';
        case 'M'
            nextBases = 'AC';
        otherwise
            nextBases = base; % Unambiguous base
    end
    % Extend allPrimers by the first (and possibly only) candidate base
    allPrimers(:, end+1) = nextBases(1);
    if numel(nextBases) > 1
        % Make a copy of all current primers and change the trailing base to the
        % other candidate for the ambiguous base
        alternatePrimers = allPrimers;
        alternatePrimers(:, end) = nextBases(2);
        allPrimers = [allPrimers; alternatePrimers];
    end
end
allPrimers
allPrimers = 8×12 char array
    'AACTAACTAAGT'
    'ATCTAACTAAGT'
    'AACTAGCTAAGT'
    'ATCTAGCTAAGT'
    'AACTAACTACGT'
    'ATCTAACTACGT'
    'AACTAGCTACGT'
    'ATCTAGCTACGT'
If you want to automate against a bunch of primers, I'd suggest turning the above script into a function with the primer sequence as the input.
Hope this helps,
-Tim
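For readers who want the function form suggested above, here is a minimal sketch. The name expandPrimer is hypothetical, and the lookup table is an assumption that goes beyond this thread: it adds the three- and four-base IUPAC codes (B, D, H, V, N) to the six two-base codes discussed here, and it upper-cases the input. It builds the rows in the same order as the loop above, but is written so that a code may stand for any number of bases.

```matlab
function allPrimers = expandPrimer(primer)
%EXPANDPRIMER List every sequence encoded by a degenerate primer.
%   Hypothetical helper. The table follows the IUPAC nucleotide codes;
%   B, D, H, V and N are an addition not used in the original thread.
codes = 'RYSWKMBDHVN';
bases = {'AG','CT','CG','AT','GT','AC','CGT','AGT','ACT','ACG','ACGT'};

allPrimers = char.empty(1, 0);              % one row, zero columns to start
for b = 1:numel(primer)
    thisBase = upper(primer(b));            % assumption: accept lower case too
    idx = find(codes == thisBase, 1);
    if isempty(idx)
        nextBases = thisBase;               % unambiguous base, kept as-is
    else
        nextBases = bases{idx};
    end
    nRows = size(allPrimers, 1);
    % Stack one copy of the current rows per candidate base, then append
    % the candidate that belongs to each copy as a new trailing column.
    allPrimers = [repmat(allPrimers, numel(nextBases), 1), ...
                  repelem(nextBases(:), nRows, 1)];
end
end
```

Saved as expandPrimer.m, calling expandPrimer('AWCTARCTAMGT') should return the same 8×12 char array shown above, and a cell array of primers can be processed with cellfun(@expandPrimer, primers, 'UniformOutput', false).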

More Answers (1)

Tim DeFreitas on 1 May 2023
Here's one way to do it:
options = ['CT' 'AG' 'CG' 'AT' 'GT' 'AC'];
% Enumerate indices into options producing valid primers
base = 1:2:11;
offsets = dec2bin(0:63) == '1';
allPrimers = cell(1,64);
for p = 1:64
    allPrimers{p} = ['AGCT', options(base + offsets(p, :)), 'ACGT'];
end
This works by arranging the options string so that indexing into it with an odd number selects the base from the first set of options, and indexing with an even number selects the base from the second set. For instance,
options([1, 3, 5, 7, 9, 11])
ans = 'CACAGA'
selects entirely from your first column, and
options([2, 4, 6, 8, 10, 12])
ans = 'TGGTTC'
selects entirely from your second column. If we then enumerate every way of indexing into the string that picks exactly one element from each pair, we produce every possible primer. Because each of the 6 ambiguous positions has 2 choices, there are 2^6 = 64 combinations, and the 6-bit binary representations of 0 through 2^6-1 returned by dec2bin give exactly these offsets.
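To make the indexing concrete, here is a small worked example that traces a single row of offsets. The variables p, bits, idx, and middle are introduced only for illustration; the rest is taken from the code above.

```matlab
options = 'CTAGCGATGTAC';           % same string as ['CT' 'AG' 'CG' 'AT' 'GT' 'AC']
base = 1:2:11;                      % index of the first option in each pair
offsets = dec2bin(0:63) == '1';     % 64-by-6 logical, one row per primer

p = 6;                              % row 6 corresponds to the number 5
bits = dec2bin(p - 1, 6);           % '000101'
idx = base + offsets(p, :);         % [1 3 5 8 9 12]
middle = options(idx);              % 'CACTGC'
primer = ['AGCT', middle, 'ACGT'];  % 'AGCTCACTGCACGT'
disp(primer)
```

A 0 bit leaves the index on the odd position (first option of the pair), and a 1 bit bumps it to the even position (second option), so here only W and M take their second option.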
1 Comment
Vishwaratn Asthana on 2 May 2023
This is an interesting approach @Tim DeFreitas! One concern I have is how I would go about automating the above code. Specifically, it appears I need to manually set the non-redundant bases in the following portion of the code:
allPrimers{p} = ['AGCT', options(base + offsets(p, :)), 'ACGT'];
This works for 'AGCTYRSWKMACGT', but what if the primer sequence were 'AWCTARCTAMGT'?
Again, your help is greatly appreciated!
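One possible way to automate the dec2bin approach, sketched here under a few assumptions: only the six two-base codes from the question are handled, and the names codes, codeIdx, pos, and n are introduced for illustration. The idea is to locate the ambiguous positions with ismember instead of hard-coding the flanking 'AGCT' and 'ACGT', then overwrite only those columns.

```matlab
primer = 'AWCTARCTAMGT';
codes = 'YRSWKM';
options = ['CT'; 'AG'; 'CG'; 'AT'; 'GT'; 'AC'];  % row k holds the bases for codes(k)

[isAmbiguous, codeIdx] = ismember(primer, codes);
pos = find(isAmbiguous);                     % positions of the ambiguous bases
n = numel(pos);
offsets = (dec2bin(0:2^n-1, n) == '1') + 1;  % 2^n-by-n matrix of 1s and 2s

allPrimers = repmat(primer, 2^n, 1);         % start from 2^n copies of the primer
for k = 1:n
    % Column k of offsets picks option 1 or 2 for the k-th ambiguous base
    allPrimers(:, pos(k)) = options(codeIdx(pos(k)), offsets(:, k)).';
end
disp(allPrimers)
```

This produces the same set of primers as a base-by-base loop, though the row order differs because the first ambiguous position varies slowest here.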
