results

Get bioinformatics pipeline results

Since R2023a

Description

blockResults = results(pipeline,block) returns the results of block in the pipeline.

Tip

Use fetchResults instead of results if you are running in parallel. fetchResults waits for the block to complete before returning its results.

Examples

Import the pipeline and block objects needed for the example.

import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

Create a pipeline.

P = Pipeline;

Create the FileChooser and SamSort blocks.

FCB = FileChooser(which("ex1.sam"));
SSB = SamSort;

Add blocks to the pipeline and connect them.

addBlock(P,[FCB,SSB]);
connect(P,FCB,SSB,["Files","SAMFile"]);

Run the pipeline.

run(P);

Get the results of each block. The outputs of the FileChooser and SamSort blocks are files that are saved to your file system.

fcbResults = results(P,FCB)
fcbResults = struct with fields:
    Files: [1×1 bioinfo.pipeline.datatype.File]

ssbResults = results(P,SSB)
ssbResults = struct with fields:
    SortedSAMFile: [1×1 bioinfo.pipeline.datatype.File]

Tip: Use the unwrap method to see the location of the output file. For example, unwrap(ssbResults.SortedSAMFile) shows the location of the sorted SAM file.

Import the pipeline and block objects needed for the example.

import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

Create a pipeline.

P = Pipeline;

A FileChooser block can take in a URL of a remote file as an input and download the file to make it available for the downstream blocks. Download the file Homo_sapiens.GRCh38.dna.chromosome.19.fa.gz that contains the human reference genome chromosome 19 in the FASTA format.

chr19url = "http://ftp.ensembl.org/pub/release-104/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome.19.fa.gz";
fileChooserBlock1 = FileChooser(chr19url);

Create a UserFunction block to unzip the downloaded reference genome file using the gunzip function. When you create the block, you can specify the function to call and set the input and output port names, which map to the input and output arguments of that function, respectively. In this example, name the input port "ZippedFilenames" and the output port "UnzippedFilenames".

gunzipUserFunctionBlock = UserFunction(@gunzip,RequiredArguments="ZippedFilenames",OutputArguments="UnzippedFilenames");

The reference genome file needs to be indexed before reads can be aligned to it. To generate the indices, create a Bowtie2Build block.

bowtie2BuildBlock = Bowtie2Build;

Add the blocks.

addBlock(P,[fileChooserBlock1,gunzipUserFunctionBlock,bowtie2BuildBlock]);

Connect the output port named "Files" of fileChooserBlock1 to the input port named "ZippedFilenames" of gunzipUserFunctionBlock. Also connect the output port "UnzippedFilenames" of gunzipUserFunctionBlock to the input port "ReferenceFASTAFiles" of bowtie2BuildBlock.

connect(P,fileChooserBlock1,gunzipUserFunctionBlock,["Files","ZippedFilenames"]);
connect(P,gunzipUserFunctionBlock,bowtie2BuildBlock,["UnzippedFilenames","ReferenceFASTAFiles"]);

Create blocks for downloading RNA-seq data.

adrenal_1_url = "https://usegalaxy.org/dataset/display?dataset_id=d44d2a324474d1aa&to_ext=fq";
adrenal_2_url = "https://usegalaxy.org/dataset/display?dataset_id=d08360a1c0ffdc62&to_ext=fq";
brain_1_url =   "https://usegalaxy.org/dataset/display?dataset_id=f187acb8015d6c7f&to_ext=fq";
brain_2_url =   "https://usegalaxy.org/dataset/display?dataset_id=08c45996966d7ded&to_ext=fq";
fileChooserBlock2 = FileChooser([brain_1_url;adrenal_1_url]);
fileChooserBlock3 = FileChooser([brain_2_url;adrenal_2_url]);

Create a Bowtie2 block for mapping reads.

bowtie2Block = Bowtie2;

Add blocks to the pipeline.

addBlock(P,[fileChooserBlock2,fileChooserBlock3,bowtie2Block]);

Connect the blocks.

connect(P,bowtie2BuildBlock,bowtie2Block,["IndexBaseName","IndexBaseName"]);
connect(P,fileChooserBlock2,bowtie2Block,["Files","Reads1Files"]);
connect(P,fileChooserBlock3,bowtie2Block,["Files","Reads2Files"]);

Run the pipeline in parallel.

run(P,UseParallel=true);
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to parallel pool with 4 workers.

If you try to get the block results while the pipeline is still running, you get an incomplete result.

bt2Results = results(P,bowtie2Block)
bt2Results = 
  Incomplete pipeline result.

Use fetchResults to wait for the blocks that are running in parallel to complete and get the results.

bt2Results = fetchResults(P,bowtie2Block)
bt2Results = struct with fields:
    SAMFile: [1×1 bioinfo.pipeline.datatype.File]

Tip: Use the unwrap method to see the location of the output file. For example, unwrap(bt2Results.SAMFile) shows the location of the SAM file that contains the aligned reads.

Alternatively, you can use the following two commands instead of fetchResults.

wait(P,bowtie2Block);
bt2Results = results(P,bowtie2Block);

Input Arguments

pipeline: Bioinformatics pipeline, specified as a bioinfo.pipeline.Pipeline object.

block: Block in the pipeline, specified as a scalar bioinfo.pipeline.Block object, or as a character vector or string scalar containing the block name. To get the list of block names, enter pipeline.BlockNames at the command line.
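
As a minimal sketch of passing the block by name instead of by object, the following code rebuilds a one-block version of the first example. It assumes that the sample file ex1.sam is on the MATLAB path. The variable name blockNames is illustrative, and the call to string is a defensive assumption about the type that BlockNames returns.

```matlab
import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

% Build and run a pipeline that contains a single FileChooser block.
P = Pipeline;
FCB = FileChooser(which("ex1.sam"));
addBlock(P,FCB);
run(P);

% List the block names, then request the results by name.
blockNames = string(P.BlockNames);
fcbResults = results(P,blockNames(1));
```

Because the pipeline contains only one block, blockNames(1) refers to the FileChooser block, so this call is expected to match results(P,FCB).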

Output Arguments

blockResults: Block results, returned as a structure or a bioinfo.pipeline.datatype.Incomplete object. The Incomplete object is returned if the block results are not yet computed or available.

If blockResults is a structure, the field names are the output port names of the block, and the field values are the output values. If you have not run the pipeline or the results are not available yet, each output value has the default value of Incomplete, which is a bioinfo.pipeline.datatype.Incomplete object.
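
Because the output can be either a structure or an Incomplete object, code that consumes the results may want to check the type first. The following is a sketch of one way to do that with isstruct, not a prescribed pattern. It assumes that the sample file ex1.sam is on the MATLAB path, and the variable name portNames is illustrative.

```matlab
import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*

% Build and run a pipeline that contains a single FileChooser block.
P = Pipeline;
FCB = FileChooser(which("ex1.sam"));
addBlock(P,FCB);
run(P);

blockResults = results(P,FCB);
if isstruct(blockResults)
    % The field names are the output port names of the block.
    portNames = fieldnames(blockResults);
    disp(portNames)
else
    % Not a structure, so the results are not available yet.
    disp("Results are incomplete. Use wait or fetchResults first.")
end
```

Individual fields of a returned structure can still hold Incomplete values, so a stricter check would also inspect each field before using it.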

Data Types: struct

Version History

Introduced in R2023a