bioinfo.pipeline.block.FeatureCount

Bioinformatics pipeline block to count reads mapped to genomic features

Since R2023a

Description

A FeatureCount block enables you to compute the number of reads mapped to genomic features.

Creation

Description

b = bioinfo.pipeline.block.FeatureCount creates a FeatureCount block.

b = bioinfo.pipeline.block.FeatureCount(options) also specifies additional options.

b = bioinfo.pipeline.block.FeatureCount(Name=Value) specifies additional options as the property names and values of a FeatureCountOptions object. This object is set as the value of the Options property of the block.

Input Arguments

options

FeatureCount options, specified as a FeatureCountOptions object.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Note

The following list of arguments is partial. For the complete list, see the properties of the FeatureCountOptions object.

Feature type, specified as a character vector or string. The block considers only features of this type from the GTF file. The default is 'exon'.

Attribute type, specified as a character vector or string. The block uses this attribute from the GTF file to group features into metafeatures and to summarize the read counts.

Properties

ErrorHandler

Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two inputs:

  • Structure with these fields:

    Field        Description
    identifier   Identifier of the error that occurred
    message      Text of the error message
    index        Linear index indicating which block process failed in the parallel run. By default, the index is 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.

  • Input structure passed to the run method when it fails

Data Types: function_handle
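
The handler contract described above can be sketched as follows. This is a minimal illustration, not a recommended recovery strategy: the handler name fallback is hypothetical, and the use of empty tables as values "compatible" with the CountsTable and SummaryTable output ports is an assumption.

```matlab
% Hypothetical fallback handler: ignores the failure details and returns
% a structure whose fields match the block output port names.
% Assumption: empty tables are acceptable placeholder values.
fallback = @(err, in) struct("CountsTable", table(), "SummaryTable", table());

% Attach the handler to a FeatureCount block.
b = bioinfo.pipeline.block.FeatureCount;
b.ErrorHandler = fallback;

% Call the handler directly with a mock error structure to see what
% a downstream block would receive after a failure.
err = struct("identifier", "example:failure", ...
             "message", "Simulated failure", ...
             "index", 1);
out = fallback(err, struct());
disp(fieldnames(out))
```

A handler that needs to log err.identifier, err.message, or err.index would typically be written as a named function rather than an anonymous one.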

Inputs

This property is read-only.

Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.

The FeatureCount block Inputs structure has the following fields:

  • GTFFile — GTF-formatted file name. This input is a required input that must be satisfied.

  • GenomicAlignmentFiles — Names of BAM- or SAM-formatted files. This input is a required input that must be satisfied.

The default value for each input field is a bioinfo.pipeline.datatype.Unset object, which means that the input value is not set yet.

Data Types: struct

Outputs

This property is read-only.

Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The field names of the output structure returned by the block run method are the same as the output port names.

The FeatureCount block Outputs structure has the following fields:

  • CountsTable — Results containing sequence reads mapped to genomic features, returned as a table.

  • SummaryTable — Summary of assigned and unassigned alignment entries, returned as a table.

Data Types: struct
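
As a quick way to confirm the port names described above, you can inspect the Inputs and Outputs structures of a new block and create a matching input structure with emptyInputs. This is a small sketch that only lists names; the variable names are illustrative.

```matlab
% Create a block with default options.
b = bioinfo.pipeline.block.FeatureCount;

% Port names are the field names of the Inputs and Outputs structures.
inputNames = fieldnames(b.Inputs);
outputNames = fieldnames(b.Outputs);
disp(inputNames)
disp(outputNames)

% emptyInputs returns a structure with one field per input port,
% which you fill in before passing it to the run method of the block.
in = emptyInputs(b);
disp(fieldnames(in))
```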

Options

FeatureCount options, specified as a FeatureCountOptions object. The default is a FeatureCountOptions object with default property values.

Object Functions

compile       Perform block-specific additional checks and validations
copy          Copy array of handle objects
emptyInputs   Create input structure for use with run method
eval          Evaluate block object
run           Run block object

Examples

Use a FeatureCount block to count reads that are mapped to exons and summarize the total number of reads at the gene level.

import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline

FC1 = FileChooser(which("Dmel_BDGP5_nohc.gtf"));
FC2 = FileChooser(which("rnaseq_sample1.sam"));
F = FeatureCount;

P = Pipeline;
addBlock(P,[FC1,FC2,F]);
connect(P,FC1,F,["Files","GTFFile"]);
connect(P,FC2,F,["Files","GenomicAlignmentFiles"]);

run(P);
Processing GTF file C:\Program Files\MATLAB\R2023a\toolbox\bioinfo\bioinfodata\Dmel_BDGP5_nohc.gtf ...
Processing SAM file C:\Program Files\MATLAB\R2023a\toolbox\bioinfo\bioinfodata\rnaseq_sample1.sam ...
Processing reference chr2L ...
Processing reference chr2R ...
Processing reference chr3L ...
Processing reference chr3R ...
Processing reference chr4 ...
Processing reference chrX ...
Done.

Get the block results.

R = results(P,F);
head(R.CountsTable)
          ID           Reference    rnaseq_sample1
    _______________    _________    ______________

    {'FBgn0002121'}    {'chr2L'}           9      
    {'FBgn0067779'}    {'chr2L'}           2      
    {'FBgn0005278'}    {'chr2L'}           4      
    {'FBgn0031220'}    {'chr2L'}           4      
    {'FBgn0025683'}    {'chr2L'}          13      
    {'FBgn0053635'}    {'chr2L'}           2      
    {'FBgn0016977'}    {'chr2L'}          22      
    {'FBgn0086902'}    {'chr2L'}          27   
R.SummaryTable
ans =

  9×1 table

                                    rnaseq_sample1
                                    ______________

    TotalEntries                        33354     
    Assigned                            16399     
    Unassigned_ambiguous                  167     
    Unassigned_filtered                     0     
    Unassigned_lowMappingQuality            0     
    Unassigned_multiMapped                  0     
    Unassigned_noFeature                16788     
    Unassigned_supplementary                0     
    Unassigned_unmapped                     0 
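
The summary table lends itself to simple follow-up calculations, such as the fraction of alignment entries that were assigned to a feature. The sketch below rebuilds the table from the values displayed above so that it runs without the toolbox; with a real pipeline run, you would use R.SummaryTable instead of S. The variable names are illustrative.

```matlab
% Rebuild the displayed summary table (values copied from the output above).
rowNames = ["TotalEntries"; "Assigned"; "Unassigned_ambiguous"; ...
            "Unassigned_filtered"; "Unassigned_lowMappingQuality"; ...
            "Unassigned_multiMapped"; "Unassigned_noFeature"; ...
            "Unassigned_supplementary"; "Unassigned_unmapped"];
counts = [33354; 16399; 167; 0; 0; 0; 16788; 0; 0];
S = table(counts, VariableNames="rnaseq_sample1", RowNames=rowNames);

% Fraction of alignment entries assigned to a feature.
assignedFraction = S{"Assigned", "rnaseq_sample1"} / ...
                   S{"TotalEntries", "rnaseq_sample1"};

% Total of all unassigned categories, selected by row-name prefix.
isUnassigned = startsWith(S.Properties.RowNames, "Unassigned_");
unassignedTotal = sum(S.rnaseq_sample1(isUnassigned));

fprintf("Assigned fraction: %.4f\n", assignedFraction);
fprintf("Unassigned entries: %d\n", unassignedTotal);
```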

Version History

Introduced in R2023a