Acquire Image and Body Data Using Kinect V2

In Detect the Kinect V2 Devices, you see that the two sensors on the Kinect® for Windows® device are represented by two device IDs, one for the color sensor and one for the depth sensor. In that example, Device 1 is the color sensor and Device 2 is the depth sensor. This example shows how to create a videoinput object for the color sensor to acquire RGB images and then for the depth sensor to acquire body data.

  1. Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.

    vid = videoinput('kinect',1);

    Note that you do not need to provide the video format as you do for a Kinect V1 device, since only one format is used in Kinect V2 devices (RGB_1920x1080).
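    If you want to confirm which DeviceID belongs to which sensor before creating the object, you can query the adaptor. This is a sketch, assuming the Image Acquisition Toolbox Kinect support package is installed:

    ```matlab
    % Sketch: list the Kinect devices the adaptor detects, to confirm
    % which DeviceID belongs to each sensor.
    info = imaqhwinfo('kinect');
    {info.DeviceInfo.DeviceName}          % names of the two sensors
    info.DeviceInfo(1).SupportedFormats   % formats for DeviceID 1 (color)
    ```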

  2. Look at the device-specific properties on the source device, which is the color sensor on the Kinect V2 camera.

    src = getselectedsource(vid);
    
    src
    
    Display Summary for Video Source Object:
     
          General Settings:
            Parent = [1x1 videoinput]
            Selected = on
            SourceName = Kinect V2 Color Source
            Tag = 
            Type = videosource
     
          Device Specific Properties:
            ExposureTime = 4000
            FrameInterval = 333333
            Gain = 1
            Gamma = 2.2
    

    The output shows that the color sensor has a set of device-specific properties. These properties are read-only on Kinect V2 devices (you can set them on Kinect V1 devices, but not on Kinect V2). The Kinect V2 device adjusts them automatically, depending on conditions.

    Device-Specific Property (Color Sensor)   Description
    ExposureTime     Indicates the exposure time in increments of 1/10,000 of a second.
    FrameInterval    Indicates the frame interval in units of 1/1,000,000 of a second.
    Gain             Indicates a multiplier for the RGB color values.
    Gamma            Indicates the gamma measurement.
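    As a quick sketch (assuming the src object from step 2), you can convert the raw property units described above into more familiar quantities:

    ```matlab
    % Sketch: convert the raw device-property units into seconds and
    % frames per second. Assumes src = getselectedsource(vid) as in step 2.
    exposureSeconds = src.ExposureTime / 10000;   % increments of 1/10,000 s
    frameRate = 1e6 / src.FrameInterval;          % interval is in 1/1,000,000 s
    fprintf('Exposure: %.4f s, frame rate: %.1f fps\n', exposureSeconds, frameRate);
    ```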
  3. Preview the color stream by calling preview on the color sensor object you created.

    preview(vid);

    When you are done previewing, close the preview window.

    closepreview(vid);
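    If you only need a single RGB frame rather than a live preview, a snapshot is often enough. This sketch uses the toolbox getsnapshot function on the color object:

    ```matlab
    % Sketch: acquire and display one RGB frame from the color sensor.
    img = getsnapshot(vid);   % 1080x1920x3 uint8 image (RGB_1920x1080 format)
    imshow(img);
    ```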
  4. Create the videoinput object for the depth sensor. Note that a second object is created (vid2), and DeviceID 2 is used for the depth sensor.

    vid2 = videoinput('kinect', 2);
  5. Look at the device-specific properties on the source device, which is the depth sensor on the Kinect V2 camera.

    src = getselectedsource(vid2);
    
    src
    
    Display Summary for Video Source Object:
     
          General Settings:
            Parent = [1x1 videoinput]
            Selected = on
            SourceName = Kinect V2 Depth Source
            Tag = 
            Type = videosource
     
          Device Specific Properties:
            EnableBodyTracking = off

    The output shows that the depth sensor has one device-specific property associated with body tracking. This property is specific to the depth sensor.

    Device-Specific Property (Depth Sensor)   Description
    EnableBodyTracking    Indicates the tracking state. When set to on, getdata returns body metadata. The default is off.
  6. Collect body metadata by turning on body tracking, which is off by default.

    src.EnableBodyTracking = 'on';
  7. Start the second videoinput object (the depth stream).

    start(vid2);
  8. Access body tracking data as metadata on the depth stream using getdata. The function returns:

    • Frames of size 512x424 in mono13 format and uint16 data type

    • Time stamps

    • Metadata

    % Get the data on the object.
    [frame, ts, metaData] = getdata(vid2);
    
    % Look at the metadata to see the parameters in the body data.
    metaData
    
    metaData = 
     
    11x1 struct array with fields:
        IsBodyTracked: [1x6 logical]
        BodyTrackingID: [1x6 double]
        BodyIndexFrame: [424x512 double]
        ColorJointIndices: [25x2x6 double]
        DepthJointIndices: [25x2x6 double]
        HandLeftState: [1x6 double] 
        HandRightState: [1x6 double]
        HandLeftConfidence: [1x6 double]
        HandRightConfidence: [1x6 double]
        JointTrackingStates: [25x6 double]
        JointPositions: [25x3x6 double]

    These metadata fields are related to tracking the bodies.

    Metadata              Description
    IsBodyTracked         A 1 x 6 Boolean matrix of true/false values for the
                          tracking of each of the six possible bodies. A 1
                          indicates the body is tracked, and a 0 indicates it
                          is not. See step 9 below for an example.
    BodyTrackingID        A 1 x 6 double that represents the tracking IDs for
                          the bodies.
    ColorJointIndices     A 25 x 2 x 6 double matrix of x- and y-coordinates
                          for 25 joints, in pixels relative to the color image,
                          for the six possible bodies.
    DepthJointIndices     A 25 x 2 x 6 double matrix of x- and y-coordinates
                          for 25 joints, in pixels relative to the depth image,
                          for the six possible bodies.
    BodyIndexFrame        A 424 x 512 double that indicates which pixels belong
                          to tracked bodies and which do not. Use this metadata
                          to acquire segmentation data.
    HandLeftState         A 1 x 6 double that identifies possible hand states
                          for the left hands of the bodies. Values include:
                          0 = unknown, 1 = not tracked, 2 = open, 3 = closed,
                          4 = lasso
    HandRightState        A 1 x 6 double that identifies possible hand states
                          for the right hands of the bodies, using the same
                          values as HandLeftState.
    HandLeftConfidence    A 1 x 6 double that identifies the tracking
                          confidence for the left hands of the bodies. Values
                          include: 0 = low, 1 = high
    HandRightConfidence   A 1 x 6 double that identifies the tracking
                          confidence for the right hands of the bodies. Values
                          include: 0 = low, 1 = high
    JointTrackingStates   A 25 x 6 double matrix that identifies the tracking
                          states for the joints. Values include:
                          0 = not tracked, 1 = inferred, 2 = tracked
    JointPositions        A 25 x 3 x 6 double matrix indicating the location of
                          each joint in 3-D space. See the Joint Positions
                          section for a list of the 25 joint positions.
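    As an illustration (assuming the metaData struct array from the getdata call above), this sketch translates the numeric hand-state codes in the table into readable labels for each tracked body:

    ```matlab
    % Sketch: decode the HandLeftState/HandRightState codes (0-4) from
    % the table above for every tracked body in the most recent frame.
    handStates = {'unknown','not tracked','open','closed','lasso'};
    m = metaData(end);                    % metadata for the most recent frame
    for b = find(m.IsBodyTracked)
        fprintf('Body %d: left hand %s, right hand %s\n', b, ...
            handStates{m.HandLeftState(b) + 1}, ...
            handStates{m.HandRightState(b) + 1});
    end
    ```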
  9. Look at any individual property by drilling into the metadata. For example, look at the IsBodyTracked property.

    metaData.IsBodyTracked
    
    ans = 
     
         1     0     0     0     0     0

    In this case the data shows that of the six possible bodies, there is one body being tracked and it is in the first position. If you have multiple bodies, this property is useful to confirm which ones are being tracked.

  10. Get the joint locations for the first body using the JointPositions property. Since this is the body in position 1, the index uses 1.

    metaData.JointPositions(:,:,1)
    
    ans =
     
       -0.1408   -0.3257    2.1674
       -0.1408   -0.2257    2.1674
       -0.1368   -0.0098    2.2594
       -0.1324    0.1963    2.3447
       -0.3024   -0.0058    2.2574
       -0.3622   -0.3361    2.1641
       -0.3843   -0.6279    1.9877
       -0.4043   -0.6779    1.9877
        0.0301   -0.0125    2.2603
        0.2364    0.2775    2.2117
        0.3775    0.5872    2.2022
        0.4075    0.6372    2.2022
       -0.2532   -0.4392    2.0742
       -0.1869   -0.8425    1.8432
       -0.1869   -1.2941    1.8432
       -0.1969   -1.3541    1.8432
       -0.0360   -0.4436    2.0771
        0.0382   -0.8350    1.8286
        0.1096   -1.2114    1.5896
        0.1196   -1.2514    1.5896
        0.2969    1.2541    1.2432
        0.1360    0.5436    1.1771
        0.1382    0.7350    1.5286
        0.2096    1.2114    1.3896
        0.0196    1.1514    1.4896

    The columns represent the X, Y, and Z coordinates in meters of the 25 points on body 1.
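    Because the coordinates are in meters, you can compute real-world distances directly from these values. A sketch, using the joint order given in the Joint Positions section (SpineBase = 1, Head = 4):

    ```matlab
    % Sketch: compute the 3-D distance between two joints of body 1.
    % Joint indices follow the Joint Positions list; units are meters.
    joints = metaData(end).JointPositions(:,:,1);
    headToBase = norm(joints(4,:) - joints(1,:));
    fprintf('Head to spine base: %.2f m\n', headToBase);
    ```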

  11. Optionally view the segmentation data as an image using the BodyIndexFrame property.

    % View the segmentation data as an image. Use the metadata from
    % the most recent frame.
    imagesc(metaData(end).BodyIndexFrame);
    % Set the color map to jet to color code the people detected.
    colormap(jet);
    

Joint Positions

The EnableBodyTracking property indicates whether body metadata is collected. When it is set to on, the Kinect V2 adaptor returns the joints in the JointPositions property in the following order:

   SpineBase = 1;
   SpineMid = 2;
   Neck = 3;
   Head = 4;
   ShoulderLeft = 5;
   ElbowLeft = 6;
   WristLeft = 7;
   HandLeft = 8;
   ShoulderRight = 9;
   ElbowRight = 10;
   WristRight = 11;
   HandRight = 12;
   HipLeft = 13;
   KneeLeft = 14;
   AnkleLeft = 15;
   FootLeft = 16; 
   HipRight = 17;
   KneeRight = 18;
   AnkleRight = 19;
   FootRight = 20;
   SpineShoulder = 21;
   HandTipLeft = 22;
   ThumbLeft = 23;
   HandTipRight = 24;
   ThumbRight = 25;
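To visualize these joints, you can overlay them on a depth frame using the DepthJointIndices metadata, which gives pixel coordinates in the 512x424 depth image. A sketch, assuming the frame and metaData variables from step 8:

```matlab
% Sketch: plot the 25 joints of the first tracked body over the most
% recent depth frame.
m = metaData(end);
b = find(m.IsBodyTracked, 1);         % index of the first tracked body
if ~isempty(b)
    imagesc(frame(:,:,1,end));        % most recent depth frame
    colormap(gray); hold on;
    xy = m.DepthJointIndices(:,:,b);  % 25 x 2 matrix of [x y] pixel indices
    plot(xy(:,1), xy(:,2), 'r.', 'MarkerSize', 15);
    hold off;
end
```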