
How do I make a cell with the following contents?

Mohannad Abboushi on 15 Jan 2017
Commented: Guillaume on 16 Jan 2017
I am writing a program that takes a string s representing a single strand of DNA and returns the amino acid sequence of the longest gene it finds. Here, a gene is defined as a reading frame that starts with an AUG codon and ends with a UAA, UAG, or UGA codon.
I tried making a cell array of the different "frames", but since they are not the same length I can't put them into an array. How do I work around this? Here's my code:
function [ptn]=Seq_transcribe2(x)
y=seq_transcribe1(x);
frames={};
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
y(3:end)};
starts=[];
stops=[];
allorfs={};
for i=1:3:numel(frames)-2
codon= frames([i i+1 i+2])
if codon=='AUG'
starts(end+1)=codon;
if strcmp(codon,'UAA') || strcmp(codon,'UAG') || strcmp(codon,'UGA')
stops(end+1)=codon;
end
stops= find(stops>starts,1)
lengthofthisstart=stops-starts
allorfs{end+1}=frame(starts:stops-1)
  2 Comments
Niels on 15 Jan 2017
It would be helpful if you added the error message.
Mohannad Abboushi on 15 Jan 2017
Error using vertcat
Dimensions of matrices being concatenated are not consistent.


Accepted Answer

Guillaume on 15 Jan 2017
If I understood correctly, a simple way to find all genes would be:
[genesequences, starts, stops] = regexp(x, 'AUG.*?(UAA|UAG|UGA)', 'match', 'start', 'end');
And the longest sequence is of course:
[~, longestidx] = max(stops - starts);
longestsequence = genesequences{longestidx}
  2 Comments
Arthur Goldsipe on 16 Jan 2017
I think you need a slight change to account for the fact that all codons are 3 characters long:
[genesequences, starts, stops] = regexp(x, 'AUG(...)*?(UAA|UAG|UGA)', 'match', 'start', 'end');
Guillaume on 16 Jan 2017
Oh yes. As I know nothing about genes and codons, I didn't know that the number of characters between the start and end codons must be a multiple of three, but I should have inferred that from the original code.
Thanks.
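To make the difference between the two patterns concrete, here is a minimal sketch on a made-up strand. The string `x` and the variable names `lazyGenes`, `genes` and `longest` are assumptions chosen for illustration, not data from the question. The strand contains an out-of-frame UAA before the first in-frame stop codon.

```matlab
% Hypothetical strand: AUG at 1-3, out-of-frame UAA at 5-7, in-frame UAG at 10-12.
x = 'AUGCUAAGGUAGCC';

% Original pattern: stops at the first stop codon in ANY frame.
lazyGenes = regexp(x, 'AUG.*?(UAA|UAG|UGA)', 'match');

% Codon-aware pattern: only whole codons are allowed between start and stop.
[genes, starts, stops] = regexp(x, 'AUG(...)*?(UAA|UAG|UGA)', ...
                                'match', 'start', 'end');

% Pick the longest match, guarding against the case where nothing matched.
longest = '';
if ~isempty(genes)
    [~, longestidx] = max(stops - starts);
    longest = genes{longestidx};
end

disp(lazyGenes{1})   % AUGCUAA      (7 characters, not a whole number of codons)
disp(longest)        % AUGCUAAGGUAG (12 characters, four codons)
```

Note that regexp returns non-overlapping matches found while scanning left to right, so a gene that starts inside an earlier match would not be reported by this sketch.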


More Answers (1)

Niels on 15 Jan 2017
If I understood you correctly, your problem is in one of the following lines:
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
allorfs{end+1}=frame(starts:stops-1)
If so, I can't replicate your problem. In cell arrays, the length of the elements is irrelevant:
>> a=1:3;
>> b=1:4;
>> c=1:5;
>> cell={a b c}
cell =
1×3 cell array
[1×3 double] [1×4 double] [1×5 double]
%=================================
>> a={}
a =
0×0 empty cell array
>> a{end+1}=1
a =
cell
[1]
>> a{end+1}=2
a =
1×2 cell array
[1] [2]
>> a{end+1}=[2 1]
a =
1×3 cell array
[1] [2] [1×2 double]
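The same holds for character vectors such as the reading frames in the question. The following is a small sketch, assuming a made-up strand `x`; it also shows that braces return the stored string while parentheses on that string index its characters, which is how a single codon can be pulled out and compared.

```matlab
% Hypothetical strand; the three frames have lengths 11, 10 and 9.
x = 'GAUGGCCUAAC';
frames = {x(1:end) x(2:end) x(3:end)};

% Length of every element, even though the lengths differ.
lens = cellfun(@numel, frames);

% Braces extract the char vector stored in the second cell.
frame = frames{2};

% Parentheses on the char vector select characters, here the first codon.
codon = frame(1:3);

% strcmp compares whole strings and returns a single logical value.
isStart = strcmp(codon, 'AUG');

disp(lens)
disp(isStart)
```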
  2 Comments
Mohannad Abboushi on 15 Jan 2017
I feel like there's got to be an easier way to do this.
Guillaume on 15 Jan 2017
If the following line
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
y(3:end)};
is indeed written on two lines, then yes, MATLAB is going to issue a concatenation error, since a line break inside the braces is interpreted as a vertical concatenation.
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end) y(3:end)};
or
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end) ...
y(3:end)};
would fix the error
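A short sketch of both behaviours, assuming two made-up strands in place of the real `x` and `y` (the values below are hypothetical): the first statement is wrapped in try/catch so the script keeps running after the error, and the second uses the continuation.

```matlab
% Hypothetical short strands standing in for x and y from the question.
x = 'AUGGCCUAA';
y = 'UUAGGCCAU';

% A line break inside {} starts a new row, so this tries to stack a
% 1x5 row on top of a 1x1 row, which fails.
try
    frames = {x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
              y(3:end)};
catch err
    disp(err.message)   % the concatenation error reported in the question
end

% With ... the statement continues on the next line and builds one row.
frames = {x(1:end) x(2:end) x(3:end) y(1:end) y(2:end) ...
          y(3:end)};
disp(size(frames))      % one row, six columns
```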
