How to replace characters with digits

Question

0 votes

I have to replace each character of the string s = 'ACGT' with the following digits:

        'A' becomes 11
        'C' becomes 00
        'G' becomes 01
        'T' becomes 10

1 Comment

Jan on 27 Feb 2013

You write 11 without surrounding quotes. This isn't an accident, is it? What do you want as output? How long is the input?


Answer 1 (accepted)

Azzi Abdelmalek on 27 Feb 2013

Edited: Azzi Abdelmalek on 27 Feb 2013

2 votes

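s   = 'ACGT';                  % characters to replace
e   = ['11';'00';'01';'10'];   % replacement codes, one row per character of s
in  = 'AGCTAG';                % Your initial data
out = in;
for k = 1:numel(s)
    % replace every occurrence of s(k) with its two-digit code
    out = regexprep(out, s(k), e(k,:));
end
disp(out)                      % 110100101101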

3 Comments (1 older comment not shown)

Jan on 28 Feb 2013

STRREP is less powerful, but much faster than REGEXPREP.

Cedric on 28 Feb 2013

STRREP is probably the best solution. It is a little slower than my solution, but more memory friendly.
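The comments recommend STRREP but the thread never shows it. Here is a minimal sketch of what that variant could look like, reusing the sample input from the answer above; the variable names `letters`, `codes` and `seq` are chosen for illustration only.

```matlab
letters = 'ACGT';
codes   = {'11', '00', '01', '10'};   % mapping from the question
seq     = 'AGCTAG';                   % same sample input as in the answer

out = seq;
for k = 1:numel(letters)
    % strrep does plain substring replacement, no pattern matching
    out = strrep(out, letters(k), codes{k});
end
disp(out)   % 110100101101
```

Replacing one letter after another is safe here because the codes contain only digits, so a later pass can never match text inserted by an earlier one.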


Answer 2

Jan on 28 Feb 2013

1 vote

And a lookup table:

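seqIn = 'ACGTTGCA';
lut = repmat('0', 2, 255);   % one 2-character column per character code, default '00'
lut(1, 'AT') = '1';          % first digit is '1' for A and T
lut(2, 'AG') = '1';          % second digit is '1' for A and G
result = reshape(lut(:, seqIn), 1, []);   % '1100011010010011'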

Does this work? I do not have access to Matlab currently.

3 Comments (1 older comment not shown)

Cedric on 28 Feb 2013

Edited: Cedric on 28 Feb 2013

Actually, STRREP still wins in my view. The solution based on ISMEMBER profiles between the REGEXPREP solution and the two solutions based on array indexing. But the latter require much more memory during the indexing operation than STRREP. You start seeing that at around 1e8 characters. On my laptop with 8 GB of RAM, for example, only STRREP can process more than 2e8 characters without swapping (or being killed by Process Lasso).

Jan on 28 Feb 2013

@Azzi: Is this a typo?! Your function needs 14 seconds with REGEXPREP and 0.007 seconds with STRREP? Then my minor suggestion caused a speedup by a factor of 1900? Wow, this would be the most efficient suggestion I ever gave. And it would be a strong reason to warn about the low efficiency of regexprep in this forum.
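To check timings like these on your own machine, something along the following lines could be used. This is only a sketch: the random test sequence and its length of 1e6 characters are assumptions made for illustration, not the data used in this thread, and the measured times will depend on hardware and MATLAB release.

```matlab
letters = 'ACGT';
codes   = {'11', '00', '01', '10'};
seq     = letters(randi(4, 1, 1e6));   % hypothetical random test sequence

% Time the loop over REGEXPREP
tic
out1 = seq;
for k = 1:numel(letters)
    out1 = regexprep(out1, letters(k), codes{k});
end
tRegexprep = toc;

% Time the same loop over STRREP
tic
out2 = seq;
for k = 1:numel(letters)
    out2 = strrep(out2, letters(k), codes{k});
end
tStrrep = toc;

assert(isequal(out1, out2));   % both variants must give the same string
fprintf('regexprep: %.4f s, strrep: %.4f s\n', tRegexprep, tStrrep);
```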


Answer 3

Cedric on 27 Feb 2013

Edited: Cedric on 27 Feb 2013

0 votes

If you need to process long sequences, you might want to optimize for efficiency a little. A MEX-based solution would be most efficient, I guess, but here is one way you could go using basic MATLAB only.

If you want to replace character 'A' with a numeric array (1,1) and so on, you can do:

 aa  = 'ACGT' ;
 seq = 'AAGCTCAGGTTCA' ;
 rep = zeros(2, max(aa), 'uint8') ;
 rep(:,aa) = [1 0 0 1; 1 0 1 0] ;
 result = reshape(rep(:,seq), 1, []) ;

This outputs the numeric array:

 result =
   1  1  1  1  0  1  0  0  1  0  0  0  1  1  0  1  0  1  1  0  1  0  0  0  1  1

If you want to replace character 'A' with characters '11' and so on, you can do:

 aa  = 'ACGT' ;
 seq = 'AAGCTCAGGTTCA' ;
 rep = zeros(2, max(aa), 'uint8') ;
 rep(:,aa) = ['11'; '00'; '01'; '10'].' ;
 result = char(reshape(rep(:,seq), 1, [])) ;

This outputs the string '11110100100011010110100011'.

EDIT: note that there are slightly different ways of doing this that are a little slower but more memory-friendly.
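The answer does not say which variants are meant. One possible reading, given purely as an assumption, is to preallocate the output string and set the '1' digits by position, which avoids building the intermediate 2-by-N array:

```matlab
seq = 'AAGCTCAGGTTCA';

% Start with all zeros; character k of seq owns positions 2k-1 and 2k
result = repmat('0', 1, 2 * numel(seq));

% First digit is '1' for A (11) and T (10)
result(2 * find(seq == 'A' | seq == 'T') - 1) = '1';

% Second digit is '1' for A (11) and G (01)
result(2 * find(seq == 'A' | seq == 'G')) = '1';

disp(result)   % 11110100100011010110100011
```

This still creates temporary logical and index arrays, so it is a sketch of the idea, not a measured improvement.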

Cheers,

Cedric


Answer 4

Jos (10584) on 28 Feb 2013

0 votes

Here is a simpler approach than looping over REGEXPREP or STRREP:

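seqIn = 'ACGTTGCA' ; % input sequence
letters = 'ACGT' ;
symb = {'11','00','01','10'} ; % codes in the same order as letters; stored as a cell array of strings!
[tf,idx] = ismember(seqIn,letters) ; % idx(k) is the position of seqIn(k) in letters
seqOut = [symb{idx}] % concatenates the codes: '1100011010010011'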
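One thing to watch with this approach: ISMEMBER returns an index of 0 for any character that is not in the list, and indexing the cell array with 0 raises an error. A minimal sketch of one way to guard against that, assuming you simply want to skip unknown characters (the input containing 'N' is a made-up example):

```matlab
seqIn   = 'ACGTNGCA';                 % hypothetical input with an unexpected 'N'
letters = 'ACGT';
symb    = {'11', '00', '01', '10'};   % mapping from the question

[tf, idx] = ismember(seqIn, letters);
if ~all(tf)
    % report which characters have no code before dropping them
    fprintf('Skipping %d unknown character(s): %s\n', nnz(~tf), unique(seqIn(~tf)));
end
seqOut = [symb{idx(tf)}];             % use only the valid indices
disp(seqOut)                          % 11000110010011
```

Whether skipping, raising an error, or substituting a placeholder is the right choice depends on your data.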
