Exploring Genome-wide Differences in DNA Methylation Profiles documentation

Theo on 17 Mar 2015
Answered: elcebir on 1 Feb 2016
Could someone please explain to me how I am supposed to do the following steps?
"(1) downloaded the files SRR030222.sra, SRR030223.sra, SRR030224.sra and SRR030225.sra containing the unmapped short reads for two replicates from the DICERex5 sample and two replicates from the HCT116 sample respectively. Converted them to FASTQ-formatted files using the NCBI SRA Toolkit.
(2) produced SAM-formatted files by mapping the short reads to the reference human genome (NCBI Build 37.5) using the Bowtie [2] algorithm. Only uniquely mapped reads are reported.
(3) compressed the SAM-formatted files to BAM and ordered them by reference name first, then by genomic position, using SAMtools [3]."
For part 1, I downloaded the files from the FTP site and used fastq-dump.exe to convert them to FASTQ format. After that I have no idea what to do. Any help is appreciated. I wish the MATLAB documentation were more descriptive, though.

Accepted Answer

Paola Favaretto on 20 Mar 2015
Hello,
Here are the pre-processing steps needed to start the analysis outlined in the demo. Please note that you must have bowtie and samtools installed.
1. In the directories where you have saved your .sra files, do the following for each file:
fastq-dump SRRxxxx.sra
2. Download and uncompress the relevant bowtie indexes (pre-built indexes are available on the Bowtie website; check the right-hand panel).
3. In each directory where you have your SRR* files, run bowtie against the reference genome. The options below read FASTQ input (-q), discard reads that align to more than one location (-m 1), report only the single best alignment (--best -k 1), allow no more than 2 mismatches (-v 2), write the result in SAM format (--sam), use 8 threads (-p 8) and print the time taken by each step (-t):
bowtie -t -q --sam -m 1 --best -k 1 -v 2 -p 8 /location_of_reference_index/hg18 SRRxxxx.fastq SRRxxxx.sam
4. Recover the reference as a FASTA file (samtools needs it for the conversion and ordering in step 5):
bowtie-inspect hg18 > hg18.fasta
5. Convert to BAM and order
samtools view -bt hg18.fasta SRRxxxx.sam > SRRxxxxunordered.bam
samtools sort SRRxxxxunordered.bam SRRxxxx
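The steps above can be strung together for the four accessions named in the question. What follows is only a minimal sketch, not a tested pipeline: it prints the commands instead of running them so they can be checked first. The INDEX variable and the print_pipeline function name are placeholders introduced here for illustration, and the index path is the same stand-in used in step 3.

```shell
#!/bin/sh
# Sketch: print the commands from steps 1, 3, 4 and 5 for the four
# accessions in the question. Nothing is executed here.
# INDEX is an assumed placeholder for the bowtie index basename.
INDEX="${INDEX:-/location_of_reference_index/hg18}"

print_pipeline() {
    # Step 4 is needed only once, not once per sample.
    echo "bowtie-inspect $INDEX > hg18.fasta"
    for s in SRR030222 SRR030223 SRR030224 SRR030225; do
        # Step 1: convert .sra to FASTQ.
        echo "fastq-dump $s.sra"
        # Step 3: map reads, keeping uniquely mapped reads only.
        echo "bowtie -t -q --sam -m 1 --best -k 1 -v 2 -p 8 $INDEX $s.fastq $s.sam"
        # Step 5: convert to BAM, then sort.
        echo "samtools view -bt hg18.fasta $s.sam > ${s}unordered.bam"
        echo "samtools sort ${s}unordered.bam $s"
    done
}

print_pipeline
```

If the printed commands look right, the output can be piped to sh to run them. The sort line keeps the older output-prefix form shown in step 5; more recent samtools releases expect the output file to be given with -o instead.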
Please let me know if you have further questions.
1 Comment
Theo on 3 Apr 2015
Thanks, Paola, for the detailed answer. Your steps are clear and I think they will work. I'll come back to you if I run into a problem in the meantime.


More Answers (1)

elcebir on 1 Feb 2016
For bowtie 2, the flags and usage in the accepted answer need to be corrected. In step 3, the command should look like this:
$BT2_HOME/bowtie2 -p 16 --local -M 3 -x hsGRCh38 -U $BT2_HOME/example/reads/SRR030224.fastq -S SRR030224.sam
The --best and -v options are not valid in bowtie2.
In the last step (step 5), the sort can be run like this:
samtools sort -l 9 -n -T abc SRR030224unordered.bam -o SRR030224.bam
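For anyone applying this to all four samples, here is a rough sketch that prints the bowtie2 and samtools commands per accession without running them. The bowtie2 flags are copied as given in this answer and have not been re-verified here. BT2_INDEX and print_bt2_pipeline are hypothetical names introduced for illustration. One deliberate difference: -n (sort by read name) is left out of the sort command, on the assumption that the demo wants the ordering quoted in the question, by reference name first and then by genomic position.

```shell
#!/bin/sh
# Sketch: print bowtie2 / newer-samtools commands for one sample.
# BT2_INDEX is an assumed placeholder for the bowtie2 index basename.
BT2_INDEX="${BT2_INDEX:-hsGRCh38}"

print_bt2_pipeline() {
    s="$1"   # accession, e.g. SRR030224
    # Step 3: map reads with bowtie2 (flags taken from the answer above).
    echo "bowtie2 -p 16 --local -M 3 -x $BT2_INDEX -U $s.fastq -S $s.sam"
    # Step 5a: convert SAM to BAM.
    echo "samtools view -b $s.sam > ${s}unordered.bam"
    # Step 5b: coordinate sort (no -n), output named with -o.
    echo "samtools sort -l 9 -T abc -o $s.bam ${s}unordered.bam"
}

for s in SRR030222 SRR030223 SRR030224 SRR030225; do
    print_bt2_pipeline "$s"
done
```

Note that this does not reproduce the "uniquely mapped reads only" filter that -m 1 gave in bowtie 1; bowtie2 has no direct equivalent of that flag, so any such filtering would need to be added separately.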
