bioinfo.pipeline.block.GenomicsViewer

Open Genomics Viewer

Since R2024a

Description

A GenomicsViewer block enables you to open the Genomics Viewer app from a bioinformatics pipeline and visualize next-generation sequencing (NGS) data. The Genomics Viewer app lets you view and explore such data with an embedded version of the Integrative Genomics Viewer (IGV) [1][2].

Creation

Syntax

b = bioinfo.pipeline.block.GenomicsViewer

Description

b = bioinfo.pipeline.block.GenomicsViewer creates a GenomicsViewer block.

Properties

`ErrorHandler` — Function to handle errors from `run` method
`[]` (default) | function handle

Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two inputs:

Structure with these fields:

Field	Description
identifier	Identifier of the error that occurred
message	Text of the error message
index	Linear index indicating which block process failed in the parallel run. By default, the index is 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.

Input structure passed to the run method when it fails

Data Types: function_handle
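
The handler contract described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name handleViewerError, the mock error values, and the choice of an empty placeholder for the GenomicsViewer output are all assumptions. The first part calls the handler directly with mock inputs, so it runs without a pipeline.

```matlab
% Mock error information with the three documented fields.
errInfo = struct("identifier","demo:viewer:failed", ...
                 "message","Reference file not found", ...
                 "index",1);

% Mock input structure that uses the GenomicsViewer input port names.
blockInputs = struct("Reference","missing.fa","Cytoband",[],"Tracks",[]);

% Call the handler directly to check the shape of its return value.
out = handleViewerError(errInfo,blockInputs);

% To attach the handler to a block (requires Bioinformatics Toolbox):
% gv = bioinfo.pipeline.block.GenomicsViewer;
% gv.ErrorHandler = @handleViewerError;

function out = handleViewerError(errInfo,blockInputs)
    % Report which run failed and why. blockInputs is available for
    % diagnostics but is not used in this sketch.
    fprintf("Run %d failed (%s): %s\n", ...
        errInfo.index,errInfo.identifier,errInfo.message);
    % Return a structure whose field name matches the block output port.
    % The empty placeholder value is an assumption of this sketch.
    out = struct("GenomicsViewer",[]);
end
```

Returning a structure with the same field name as the output port is what allows the pipeline to continue. Whether an empty value is acceptable depends on what any downstream block expects.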

`Inputs` — Input ports
structure

This property is read-only.

Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.

The GenomicsViewer block Inputs structure has the following fields:

Reference — Reference genome file name. This input is required.
Cytoband — Cytoband ideogram file name. This input is optional.
Tracks — Alignment or genomics data file names. This input is optional.

The default value for each field is a bioinfo.pipeline.datatype.Unset object, which means that the Value property of the input is not set yet.

Data Types: struct

`Outputs` — Output ports
structure

This property is read-only.

Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The field names of the output structure returned by the block run method are the same as the output port names.

The GenomicsViewer block Outputs structure has a field named GenomicsViewer, which is a genomicsViewer object.

Data Types: struct

Object Functions

`compile`	Perform block-specific additional checks and validations
`copy`	Copy array of handle objects
`emptyInputs`	Create input structure for use with `run` method
`eval`	Evaluate block object
`run`	Run block object

Examples

Download NGS Data from SRA Using Bioinformatics Pipeline

Import the pipeline and block objects needed for the example so that you can create these objects without specifying the entire namespace.

import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

Create a pipeline.

P = Pipeline;

Create an SRAFasterqDump block and specify the accession number SRR11846824 as the block input. SRR11846824 has two reads per spot and no unaligned reads.

SRAFQDump = SRAFasterqDump;
SRAFQDump.Inputs.SRRID.Value = "SRR11846824";
addBlock(P,SRAFQDump);

Run the pipeline to download the corresponding FASTQ files from SRA for the specified accession number.

run(P);

Get the results of the SRAFQDump block.

R = results(P,SRAFQDump)

R = struct with fields:
      Reads: [1×1 bioinfo.pipeline.datatype.Incomplete]
    Reads_1: [1×1 bioinfo.pipeline.datatype.File]
    Reads_2: [1×1 bioinfo.pipeline.datatype.File]
    Reads_3: [1×1 bioinfo.pipeline.datatype.Incomplete]
    Reads_4: [1×1 bioinfo.pipeline.datatype.Incomplete]
    Reads_5: [1×1 bioinfo.pipeline.datatype.Incomplete]

View the names of the downloaded files by using the unwrap function.

unwrap(R.Reads_1)
unwrap(R.Reads_2)

By default, the block uses the SplitType="SplitThree" option and downloads only biological reads. With this option, the block splits each spot into reads. For spots with two reads, the block produces *_1.fastq and *_2.fastq files and returns them in the Reads_1 and Reads_2 fields, respectively. The block saves any unaligned reads in a *.fastq file and returns it in the Reads field. Because this accession has no unaligned reads, the block does not produce a *.fastq file, and the Reads field is returned as Incomplete. Reads_3, Reads_4, and Reads_5 are also Incomplete because the block uses SplitType="SplitThree". For more details on the block output behavior, see Outputs.
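
You can also check programmatically which fields hold files and which are Incomplete. The following is a minimal sketch that assumes R is the structure returned by results(P,SRAFQDump) in the preceding step. The variable names fieldClasses and isIncomplete are illustrative, and the class name is taken from the display of R shown earlier.

```matlab
% Assumes R = results(P,SRAFQDump) from the preceding step.
% Map each output field name to the class of its value.
fieldClasses = structfun(@class,R,"UniformOutput",false)

% Flag the fields for which the block produced no file.
isIncomplete = structfun(@(v) isa(v,"bioinfo.pipeline.datatype.Incomplete"),R)
```

For the results shown above, only the entries for Reads_1 and Reads_2 in isIncomplete would be false.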

You can specify other download options by using an SRAFasterqDumpOptions object. For instance, to download a FASTA-formatted file, set FastaOutput to true and rerun the block.

opt = SRAFasterqDumpOptions;
opt.FastaOutput = true;
SRAFQDump.Options = opt;

You can also download SAM files from SRA using the SRASAMDump block.

SRASDump = SRASAMDump;

Specify the accession number to download.

SRASDump.Inputs.SRRID.Value = "SRR11846824";

Specify the options using an SRASAMDumpOptions object. For instance, set the output filename and compress the output file using bzip2.

samdumpopt = SRASAMDumpOptions;
samdumpopt.BZip2 = 1;
samdumpopt.OutputFileName = "SRR11846824.sam.bz2"

samdumpopt = 
  SRASAMDumpOptions with properties:

   Default properties:
       ExtraCommand: ""
        FastaOutput: 0
        FastqOutput: 0
               GZip: 0
      HideIdentical: 0
         IncludeAll: 0
      MinMapQuality: 0
      OutputPrimary: 0
    OutputUnaligned: 0
            Version: "3.0.6"

   Modified properties:
              BZip2: 1
     OutputFileName: "SRR11846824.sam.bz2"

SRASDump.Options = samdumpopt;

Add the block to the pipeline and run the pipeline.

addBlock(P,SRASDump);
run(P);

Get the block results.

R2 = results(P,SRASDump);

View the names of the output files by using the unwrap function.

unwrap(R2.OutputFiles)

After downloading the files, you can use them for downstream analyses. For instance, you can run bowtie2 to map the reads to the reference sequence, and then visualize the mapped reads in the Genomics Viewer app.

First, download the C. elegans reference sequence.

celegans_refseq = fastaread("https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/ce11/ce11.fa");

Save the Chromosome 3 reference data in a FASTA file.

celegans_chr3 = celegans_refseq(3).Sequence;
fastawrite("celegans_chr3.fa",celegans_chr3);

Create a FileChooser block to select the Chromosome 3 reference file.

fcRef = FileChooser;
fcRef.Files = fullfile(pwd,"celegans_chr3.fa");
addBlock(P,fcRef);

Build a set of index files using the Bowtie2Build block. Set the base name of the index files and the name of the reference FASTA file.

buildIndex = Bowtie2Build;
buildIndex.Inputs.IndexBaseName.Value = "celegans_chr3_index";
addBlock(P,buildIndex);
connect(P,fcRef,buildIndex,["Files","ReferenceFASTAFiles"]);
run(P);

Align reads to the reference using the Bowtie2 block. Create the block and then connect it to buildIndex and SRAFQDump blocks.

alignReads = Bowtie2;
alignReads.OutFilename = "SRR11846824_mapped.sam";
addBlock(P,alignReads);
connect(P,buildIndex,alignReads,["IndexBaseName","IndexBaseName"]);
connect(P,SRAFQDump,alignReads,["Reads_1","Reads1Files";"Reads_2","Reads2Files"]);
run(P);

Bowtie2 produces a SAM file. To visualize the mapped reads in the Genomics Viewer app, convert the SAM file to a BAM file.

First, make a UserFunction block to create a BioMap object from the SAM file.

biomapObj = UserFunction;
biomapObj.Function = "BioMap";
biomapObj.RequiredArguments = "inputSAM";
biomapObj.OutputArguments = "biomapObject";
addBlock(P,biomapObj);

Next, connect the biomapObj block to the alignReads block, which provides the SAM file needed. Suppress two informational warnings issued during the creation of a BioMap object.

connect(P,alignReads,biomapObj,["SAMFile","inputSAM"]);
w = warning;
warning("off","bioinfo:BioMap:BioMap:UnsortedReadsInSAMFile");
warning("off","bioinfo:saminfo:InvalidTagField");
run(P);
warning(w); % Restore warnings

Use the write method of the BioMap object to convert the SAM file to a BAM file.

sam2bam = UserFunction;
sam2bam.Function = "write";
sam2bam.RequiredArguments = ["biomapObj","BAMFileName"];
sam2bam.NameValueArguments = "Format";
sam2bam.Inputs.BAMFileName.Value = "../../../SRR11846824_mapped.bam";
sam2bam.Inputs.Format.Value = "BAM";
addBlock(P,sam2bam);
connect(P,biomapObj,sam2bam,["biomapObject","biomapObj"]);
run(P);

Create a FileChooser block to select the generated BAM file.

fcBAM = FileChooser;
fcBAM.Files = fullfile(pwd,"SRR11846824_mapped.bam");
addBlock(P,fcBAM);

Create a FileChooser block to select the C. elegans cytoband file, which is provided with the toolbox.

fcCyto = FileChooser;
fcCyto.Files = fullfile(pwd,"celegans_cytoBandIdeo.txt.gz");
addBlock(P,fcCyto);

View the alignment data using the Genomics Viewer app.

gv = GenomicsViewer;
addBlock(P,gv);
connect(P,fcRef,gv,["Files","Reference"]);
connect(P,fcCyto,gv,["Files","Cytoband"]);
connect(P,fcBAM,gv,["Files","Tracks"]);
run(P);

Use the zoom slider to zoom in and see the features. Alternatively, enter the following in the search text box: Generated:3,711,861-3,711,940.

Delete the pipeline results and downloaded files.

deleteResults(P,IncludeFiles=true);

References

[1] Robinson, J., H. Thorvaldsdóttir, W. Winckler, M. Guttman, E. Lander, G. Getz, J. Mesirov. 2011. Integrative Genomics Viewer. Nature Biotechnology. 29:24–26.

[2] Thorvaldsdóttir, H., J. Robinson, J. Mesirov. 2013. Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration. Briefings in Bioinformatics. 14:178–192.

Version History

Introduced in R2024a
