How to split a DNA sequence into three letters each

Aarsha mv on 12 May 2017
Answered: Luuk van Oosten on 12 May 2017
How do I split a DNA sequence into groups of three letters each?
3 Comments
Stephen23 on 12 May 2017
Edited: Stephen23 on 12 May 2017
Pick from:
reshape
reshape and num2cell
mat2cell
regexp
...
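As a rough sketch of how the options listed above might look in practice (the variable names `seq`, `codonMatrix`, `codonCellA`, and so on are only illustrative, and the sequence length is assumed to be a multiple of 3):

```matlab
seq = 'ATGGCCTAA';                        % example sequence, length is a multiple of 3

% reshape: build a 3-row char matrix, then transpose so each row is one triplet
codonMatrix = reshape(seq, 3, []).';      % ['ATG'; 'GCC'; 'TAA']

% reshape and num2cell: put each row of the char matrix into its own cell
codonCellA = num2cell(codonMatrix, 2).';  % {'ATG', 'GCC', 'TAA'}

% mat2cell: cut the 1-by-N char vector into 1-by-3 pieces
codonCellB = mat2cell(seq, 1, 3 * ones(1, numel(seq) / 3));

% regexp: match consecutive runs of exactly three characters
codonCellC = regexp(seq, '.{3}', 'match');
```

The char-matrix form is convenient for indexing by row, while the cell-array forms are easier to pass to functions that expect a list of strings.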
John D'Errico on 12 May 2017
Get used to working in MATLAB by thinking about variables in terms of their shape and their size, as complete arrays. Think about how you can transform those arrays into something that will be useful to you. Then do some quick reading in the help for the tools in elmat.
help elmat
There are many useful tools in elmat. Some of them, like reshape, might be useful to you.
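To illustrate what thinking in terms of shape and size can mean here, a small sketch with a made-up six-letter sequence (the names `seq`, `cols`, and `rows` are hypothetical):

```matlab
seq = 'ATGGCC';               % a 1-by-6 char row vector
size(seq)                     % displays the size, 1 by 6

% reshape fills the result column by column, so each COLUMN holds one triplet
cols = reshape(seq, 3, [])    % 3-by-2 char matrix with columns 'ATG' and 'GCC'

% transposing turns the columns into rows, giving one triplet per row
rows = cols.'                 % 2-by-3 char matrix: ['ATG'; 'GCC']
```

Because MATLAB stores arrays in column-major order, the triplets end up in the columns first; that is why a transpose is usually applied afterwards.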


Answers (1)

Luuk van Oosten on 12 May 2017
Dear Aarsha mv,
I assume you want to get the codons from your piece of DNA, but correct me if I am wrong.
Let us take the example of a piece of DNA coding for the protein insulin (I took a part of this piece of DNA).
Define a string of nucleotides as being your DNA sequence:
your_DNA = 'CTCGAGGGGCCTAGACATTGCCCTCCAGAGAGAGCACCCAACACCCTCCAGGCTTGACCGGCCAGGGTG';
To get the codons you can use 'reshape', as was suggested by Stephen Cobeldick and John D'Errico.
Use reshape as follows:
codons = reshape(your_DNA(:),3,length(your_DNA)/3)'
In a real-life scenario you would probably need a workaround for the case where length(your_DNA) is not a multiple of 3, because the reshape call above fails with an error in that case.
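One possible workaround, sketched here under the assumption that incomplete trailing nucleotides should simply be set aside rather than padded (the names `seq`, `nFull`, and `leftover` are hypothetical and not part of the original answer):

```matlab
seq = 'ATGGCCTAAGC';                          % 11 nucleotides, not a multiple of 3

nFull = floor(numel(seq) / 3);                % number of complete triplets
codons = reshape(seq(1:3*nFull), 3, []).';    % ['ATG'; 'GCC'; 'TAA']
leftover = seq(3*nFull+1:end);                % 'GC', the incomplete trailing part
```

Whether the leftover nucleotides should be dropped, reported, or treated as an error depends on where the sequence came from, so this choice is best made explicitly.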
Besides looking at the tips that Cobeldick and D'Errico already gave you, I suggest you have a look at the Nucleotide Sequence Analysis overview page. There might be some functions there that will aid you in your DNA analysis.
