bagOfFeaturesDBoW

Bag of visual words using DBoW2 library

Since R2024b

    Description

    Use the bagOfFeaturesDBoW object to create a bag of words (BoW) vocabulary to use for loop closure detection with vSLAM algorithms. This object uses the distributed bag of words DBoW2 library. This object supports only oriented FAST and rotated BRIEF (ORB) features.

    Creation

    Description

    bag = bagOfFeaturesDBoW(imds) creates a bag of features from the images specified by the datastore imds. The algorithm uses oriented FAST and rotated BRIEF (ORB) features to create the visual vocabulary.

    bag = bagOfFeaturesDBoW(features) creates a bag of features from ORB feature descriptors specified by features.

    bag = bagOfFeaturesDBoW(___,Name=Value) sets properties by using one or more name-value arguments, in addition to any combination of input arguments from the previous syntaxes. For example, Normalization="L2" sets the normalization to L2.

    bag = bagOfFeaturesDBoW(vocabularyFileName) loads an existing DBoW vocabulary file.

    Input Arguments

    imds: Images, specified as an ImageDatastore object.

    features: ORB feature descriptors, specified as a cell array of binaryFeatures objects.

    vocabularyFileName: DBoW vocabulary file, specified as a character string.
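
    The cell array form of the features input can be assembled from a datastore one image at a time. The following is a minimal sketch, assuming Computer Vision Toolbox R2024b or later and the stop sign images that ship with the toolbox. The variable names and the 1-by-N shape of the cell array are illustrative choices, not requirements stated on this page.

```matlab
% Sketch only: assumes Computer Vision Toolbox R2024b or later and the
% stop sign images that ship with the toolbox.
folder = fullfile(toolboxdir("vision"),"visiondata","stopSignImages");
imds = imageDatastore(folder);

numImages = numel(imds.Files);
features = cell(1,numImages);          % one binaryFeatures object per image
for k = 1:numImages
    I = im2gray(readimage(imds,k));    % detectORBFeatures expects grayscale input
    points = detectORBFeatures(I);
    features{k} = extractFeatures(I,points);
end

% Build the vocabulary from the precomputed descriptors
bag = bagOfFeaturesDBoW(features);
```

    Extracting the descriptors yourself is useful when you already compute ORB features elsewhere in a vSLAM pipeline and want to reuse them for the vocabulary.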

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    TreeProperties: Vocabulary tree properties, specified as a 2-element vector of the form [depthLevel,branchingFactor]. depthLevel is the number of levels in the vocabulary tree, specified as an integer. branchingFactor is the number of branches at each node, which controls how much the vocabulary can grow at successive levels in the tree, specified as an integer.

    The maximum number of visual words that a vocabulary tree can represent is branchingFactor^depthLevel, that is, branchingFactor raised to the power of depthLevel. For example, a tree with five levels and ten branches per node can hold up to 10^5 = 100,000 visual words. Typically, depthLevel ranges from 1 to 6 and branchingFactor ranges from 10 to 500. Use empirical analysis to identify the optimal values for these parameters.

    Increasing the branching factor enlarges the vocabulary, potentially improving the accuracy of image similarity assessments. However, this adjustment also leads to an increase in the time required to encode images. Implementing a vocabulary tree with multiple levels can facilitate the creation of vocabularies comprising over 10,000 visual words. Although this multi-level approach streamlines the encoding process for images associated with large vocabularies, it necessitates a more extended setup phase.

    Alternatively, for vocabularies containing only 100 to 1,000 visual words, employing a tree with a single level is advisable. This configuration simplifies the structure and accelerates the creation process, albeit for smaller vocabularies.
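
    To make the trade-off concrete, the sketch below computes the capacity of a few tree shapes. A tree in which every node has branchingFactor children and that is depthLevel levels deep has at most branchingFactor^depthLevel leaves, and each leaf is one visual word. The three settings are chosen for illustration only and are not recommendations.

```matlab
% Vocabulary capacity for a few illustrative [depthLevel,branchingFactor] pairs.
% The values below are examples only, not recommendations.
treeProperties = [1 500; 2 100; 5 10];   % each row: [depthLevel, branchingFactor]
depthLevel = treeProperties(:,1);
branchingFactor = treeProperties(:,2);

% Upper bound on the number of visual words (leaf nodes) in the tree
capacity = branchingFactor.^depthLevel;

disp(table(depthLevel,branchingFactor,capacity))
```

    The single-level tree stays within the 100 to 1,000 word range, while the two multilevel trees reach 10,000 and 100,000 words with far fewer branches per node.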

    Normalization: Type of normalization applied to the features, specified as "L1" or "L2". The normalization determines the norm used when evaluating the similarity or dissimilarity between the feature descriptors for the bag of features.

    • L1 — Calculates the sum of the absolute values of the vector elements.

    • L2 — Calculates the square root of the sum of the squared values of the vector elements.
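
    The two norms can be compared on a small vector. The sketch below uses a made-up weight vector v and plain MATLAB arithmetic. It illustrates the definitions above only and makes no claim about how the object applies the norm internally.

```matlab
% Hypothetical visual-word weight vector, for illustration only
v = [3 0 4 0];

l1Norm = sum(abs(v));        % L1: sum of absolute values, here 7
l2Norm = sqrt(sum(v.^2));    % L2: square root of the sum of squares, here 5

vL1 = v / l1Norm;            % entries of vL1 sum to 1
vL2 = v / l2Norm;            % vL2 has unit Euclidean length
```

    The built-in calls norm(v,1) and norm(v,2) return the same two values.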

    Properties

    This property is read-only.

    DepthLevel: Number of levels in the vocabulary tree, returned as an integer.

    The TreeProperties name-value argument sets this property.

    This property is read-only.

    BranchingFactor: Number of branches at every node in the vocabulary tree, returned as an integer.

    The TreeProperties name-value argument sets this property.

    Normalization: Type of normalization applied to the features, returned as "L1" or "L2".

    The Normalization name-value argument sets this property.

    Examples

    Create a bag of features object using an existing distributed bag of words (DBoW) vocabulary file.

    bag = bagOfFeaturesDBoW("bagOfFeatures.bin.gz")
    bag = 
      bagOfFeaturesDBoW with properties:
    
             DepthLevel: 5
        BranchingFactor: 10
          Normalization: 'L1'
    
    

    Create an image datastore for a set of images with stop signs.

    folder = fullfile(toolboxdir("vision"),"visiondata","stopSignImages");
    imds = imageDatastore(folder);

    Visualize an image from the datastore.

    imshow(preview(imds))

    Create a bag of features vocabulary representation from the images in the datastore.

    bag = bagOfFeaturesDBoW(imds)
    bag = 
      bagOfFeaturesDBoW with properties:
    
             DepthLevel: 5
        BranchingFactor: 10
          Normalization: 'L1'
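
    To request a different tree shape, pass the TreeProperties name-value argument. The following sketch assumes the same toolbox image folder as above. The [3,10] setting is an arbitrary illustration that caps the vocabulary at 10^3 = 1,000 visual words.

```matlab
% Sketch only: assumes Computer Vision Toolbox R2024b or later and the
% stop sign images that ship with the toolbox.
folder = fullfile(toolboxdir("vision"),"visiondata","stopSignImages");
imds = imageDatastore(folder);

% Three levels with ten branches per node: at most 10^3 = 1000 visual words
bag = bagOfFeaturesDBoW(imds,TreeProperties=[3,10]);

% The read-only properties reflect the requested tree shape
disp([bag.DepthLevel, bag.BranchingFactor])
```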
    
    

    Read an image.

    I = imread("cameraman.tif");
    imshow(I)

    Detect ORB features in the image.

    points = detectORBFeatures(I);
    imshow(I)
    hold on
    plot(points,ShowScale=false)

    Extract ORB features from the detected points in the image. The extractFeatures function returns the features and their corresponding locations. This example uses only the features, which are what loop closure detection requires.

    features = extractFeatures(I,points);
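
    If you also need the locations, request the second output. The sketch below is illustrative, with validPoints as an arbitrary variable name. It checks that the descriptors come back as a binaryFeatures object with one descriptor per valid point.

```matlab
% Sketch only: assumes Computer Vision Toolbox is installed.
I = imread("cameraman.tif");
points = detectORBFeatures(I);

% Second output holds the points for which a descriptor was computed
[features,validPoints] = extractFeatures(I,points);

disp(class(features))                               % descriptor container type
disp(features.NumFeatures == validPoints.Count)     % one descriptor per valid point
```

    Points too close to the image border can be dropped during extraction, so validPoints may contain fewer points than the detector returned.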

    Create a bag of features using the extracted ORB features. Specify the L2 normalization method to normalize the image encodings in the bag.

    bag = bagOfFeaturesDBoW({features},Normalization="L2")
    bag = 
      bagOfFeaturesDBoW with properties:
    
             DepthLevel: 5
        BranchingFactor: 10
          Normalization: 'L2'
    
    

    You can now apply this bag to tasks such as loop closure detection in visual SLAM workflows using the dbowLoopDetector object.
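
    As a rough sketch of that next step, the code below passes a bag to dbowLoopDetector, registers one view, and queries the detector. The addVisualFeatures and detectLoop calls are written as I understand them from the dbowLoopDetector documentation and are not described on this page, so verify the signatures against your release. The view ID of 1 is an arbitrary choice.

```matlab
% Sketch only: assumes Computer Vision Toolbox R2024b or later.
I = imread("cameraman.tif");
points = detectORBFeatures(I);
features = extractFeatures(I,points);

% Build a vocabulary from this single image, as in the example above
bag = bagOfFeaturesDBoW({features});

% Create the loop detector and register the image as view 1 (arbitrary ID)
loopDetector = dbowLoopDetector(bag);
viewId = 1;
addVisualFeatures(loopDetector,viewId,features);

% Query the database with a feature set. The same features are reused here,
% so this only demonstrates the call sequence, not a realistic loop closure.
loopViewIds = detectLoop(loopDetector,features);
disp(loopViewIds)
```

    In a real vSLAM workflow, you would add features for each key frame as it arrives and query with the features of the current frame.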

    References

    [1] Gálvez-López, D., and J. D. Tardós. “Bags of Binary Words for Fast Place Recognition in Image Sequences.” IEEE Transactions on Robotics, vol. 28, no. 5, Oct. 2012, pp. 1188–97. https://doi.org/10.1109/TRO.2012.2197158.

    Version History

    Introduced in R2024b