bioinfo.pipeline.Input
Description
Each input port of a bioinfo.pipeline.Block object is a bioinfo.pipeline.Input object.
Creation
Create the object using bioinfo.pipeline.Input.
Properties
SplitDimension - Instruction on how to split block inputs
[] (default) | vector of positive integers | "all"
Instruction on how to split the block inputs across multiple runs of the block in a pipeline, specified as a vector of positive integers or "all".
Some blocks in a bioinformatics pipeline operate on their input data array as a single input, while other blocks can operate on individual elements or slices of the input data array independently. The SplitDimension property of a block input controls how to split the block input data (or input array) across multiple runs of the same block in a pipeline. By default, the block input data is passed unchanged (that is, with no dimensional splitting) to the run method of the block, which means that the block runs once for the entire input data array.
Specify a vector of integers to indicate which dimensions of the input array to split and pass to the block run method. By splitting the input array, you specify how many times to run the same block with different inputs. Use "all" to pass all elements of the input value to the run method of the block independently. If there are n elements, the block runs n times independently. For example, you can use a Bowtie2 block to align three input files to a single SAM file, or use "all" to let Bowtie2 run three times, aligning each input file to a distinct SAM file.
When a block has a single input with split dimensions, the input value is split in the corresponding dimensions (such as the row dimension or column dimension) before being passed to the run method of the block. The total number of times the block runs within a pipeline is the product of the sizes of the input value in the split dimensions.
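The run-count rule can be checked with plain MATLAB array functions. The following is a minimal sketch, not part of the pipeline API: inputValue is a hypothetical 3-by-2 array standing in for a block input value, and all variable names are illustrative.

```matlab
% Hypothetical 3-by-2 input value (for example, file names arranged in a grid).
inputValue = strings(3,2);

% SplitDimension = 1: split along rows, so the block would run once per row.
runsSplitRows = prod(size(inputValue,1));      % 3 runs

% SplitDimension = [1 2]: split along both rows and columns.
runsSplitBoth = prod(size(inputValue,[1 2]));  % 3*2 = 6 runs

% SplitDimension = "all": one run per element of the input value.
runsSplitAll = numel(inputValue);              % 6 runs

% SplitDimension = [] (default): no splitting, so a single run.
runsNoSplit = 1;
```

For a 2-by-1 array of files, as in the example later on this page, the same arithmetic gives two runs for SplitDimension = 1 and one run for the default.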
For details, see Bioinformatics Pipeline SplitDimension.
Data Types: double | char | string
Required - Flag to indicate input port is required
true or 1 | false or 0
This property is read-only.
Flag to indicate if the input port is required for the block to run, specified as a numeric or logical 1 (true) or 0 (false).
A required input port (Required=true) must be satisfied. Otherwise, the pipeline fails to compile and does not run.
You can set the value as true or false when you define a block subclass. For details, see Subclass Pipeline Block.
Data Types: double | logical
Value - Input port value
bioinfo.pipeline.datatype.Unset (default)
Input port value. By default, the value is set to a bioinfo.pipeline.datatype.Unset object, which means that no value is provided and the input value comes from a connected upstream block or from an input structure passed to the run call.
If an input port with a set value is also connected to an output port of another block, the value coming from the connected block is used instead of the set value.
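As a quick way to see these properties on a concrete port, the following sketch inspects the GenomicAlignmentFiles input of a Cufflinks block, the same block and port used in the example on this page. It assumes that Bioinformatics Toolbox and any support package the Cufflinks block needs are installed. The variable name inputPort is illustrative, and the comments describe the values expected from the property descriptions here rather than captured output.

```matlab
import bioinfo.pipeline.block.*

% Create a block and get one of its input ports.
cufflinksBlock = Cufflinks;
inputPort = cufflinksBlock.Inputs.GenomicAlignmentFiles;

class(inputPort)          % expected: bioinfo.pipeline.Input
inputPort.SplitDimension  % expected: [] (default, no splitting)
inputPort.Required        % read-only flag, true or false
inputPort.Value           % expected: an Unset object until a value is set
```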
Examples
Split Input SAM Files and Assemble Transcriptomes Using Bioinformatics Pipeline
Import the pipeline and block objects needed for the example.
import bioinfo.pipeline.Pipeline
import bioinfo.pipeline.block.*
Create a pipeline.
P = Pipeline
P =
  Pipeline with properties:

        Blocks: [0×1 bioinfo.pipeline.Block]
    BlockNames: [0×1 string]
Use a FileChooser block to select the provided SAM files. The files contain aligned reads for Mycoplasma pneumoniae from two samples.
fileChooserBlock = FileChooser([which("Myco_1_1.sam"); which("Myco_1_2.sam")]);
Create a Cufflinks block.
cufflinksBlock = Cufflinks;
Add the blocks to the pipeline.
addBlock(P,[fileChooserBlock,cufflinksBlock]);
Connect the blocks.
connect(P,fileChooserBlock,cufflinksBlock,["Files","GenomicAlignmentFiles"]);
Set SplitDimension to 1 for the GenomicAlignmentFiles input port. The value of 1 corresponds to the row dimension of the input, which means that the Cufflinks block runs on each individual SAM file (Myco_1_1.sam and Myco_1_2.sam).
cufflinksBlock.Inputs.GenomicAlignmentFiles.SplitDimension = 1;
Run the pipeline. The pipeline runs the Cufflinks block two times independently and generates a set of four files for each SAM file.
run(P);
Get the block results.
cufflinksResults = results(P,cufflinksBlock)
cufflinksResults = struct with fields:
TranscriptsGTFFile: [2×1 bioinfo.pipeline.datatype.File]
IsoformsFPKMFile: [2×1 bioinfo.pipeline.datatype.File]
GenesFPKMFile: [2×1 bioinfo.pipeline.datatype.File]
SkippedTranscriptsGTFFile: [2×1 bioinfo.pipeline.datatype.File]
Use the process table to check the total number of runs for each block. Cufflinks ran two times independently.
t = processTable(P,Expanded=true);
Set SplitDimension to empty [] (which is the default). In this case, the pipeline does not split the input files and runs Cufflinks just once for both SAM files, processing each SAM file one after another.
cufflinksBlock.Inputs.GenomicAlignmentFiles.SplitDimension = [];
deleteResults(P,IncludeFiles=true);
run(P);
cufflinksResults = results(P,cufflinksBlock)
cufflinksResults = struct with fields:
TranscriptsGTFFile: [2×1 bioinfo.pipeline.datatype.File]
IsoformsFPKMFile: [2×1 bioinfo.pipeline.datatype.File]
GenesFPKMFile: [2×1 bioinfo.pipeline.datatype.File]
SkippedTranscriptsGTFFile: [2×1 bioinfo.pipeline.datatype.File]
Check the process table, which confirms that Cufflinks ran just once.
t2 = processTable(P,Expanded=true);
Tip: You can speed up the pipeline run by setting UseParallel=true if you have Parallel Computing Toolbox™. The pipeline can schedule independent executions of blocks on parallel pool workers.
run(P,UseParallel=true)
Version History
Introduced in R2023a