Acquire Image and Skeletal Data Using Kinect V1
In Detect the Kinect V1 Devices, you see that the two sensors on the Kinect® for Windows® device are represented by two device IDs, one for the color sensor and one for the depth sensor. In that example, Device 1 is the color sensor and Device 2 is the depth sensor. This example shows how to create a videoinput object for the color sensor to acquire RGB images and then for the depth sensor to acquire skeletal data.
Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.
vid = videoinput('kinect',1,'RGB_640x480');
Look at the device-specific properties on the source device, which is the color sensor on the Kinect camera.
src = getselectedsource(vid);
src

Display Summary for Video Source Object:

   General Settings:
     Parent = [1x1 videoinput]
     Selected = on
     SourceName = ColorSource
     Tag =
     Type = videosource

   Device Specific Properties:
     Accelerometer = [0.0 -1.0 0.0]
     AutoExposure = on
     AutoWhiteBalance = on
     BacklightCompensation = AverageBrightness
     Brightness = 0.2156
     CameraElevationAngle = 3
     Contrast = 1
     ExposureTime = 1.0
     FrameInterval = 0
     FrameRate = 30
     Gain = 0
     Gamma = 2.2
     Hue = 0
     PowerLineFrequency = Disabled
     Saturation = 1
     Sharpness = 0.5
     WhiteBalance = 2700
As you can see in the output, the color sensor has a set of device-specific properties.
Device-Specific Properties – Color Sensor

Accelerometer
Returns a 3-D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing. This 1 x 3 double represents the x, y, and z values of acceleration in gravity units g (9.81 m/s^2). For example, [0.06 -1.00 -0.09] represents values of x as 0.06 g, y as -1.00 g, and z as -0.09 g.

AutoExposure
Use to set the exposure automatically. This property controls whether other related properties are activated. Values are on (default) and off.
on means that exposure is set automatically, and these properties cannot be set and will throw a warning: FrameInterval, ExposureTime, and Gain.
off means that these properties cannot be set and will throw a warning: PowerLineFrequency, BacklightCompensation, and Brightness.

AutoWhiteBalance
Use to enable or disable the automatic white balance setting. on (default) means that white balance is configured automatically and the WhiteBalance property cannot be set. off means that the WhiteBalance property is settable.

BacklightCompensation
Configures backlight compensation modes to adjust the camera to capture images dependent on environmental conditions. Note that this property is only valid if AutoExposure is set to on. The default is AverageBrightness. Values are:
AverageBrightness favors an average brightness level
CenterPriority favors the center of the scene
LowLightsPriority favors a low light level
CenterOnly favors the center only

Brightness
Indicates the brightness level. The value range is 0.0 to 1.0, and the default value is 0.2156. Note that this property is only valid if AutoExposure is set to on.

CameraElevationAngle
Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the depth sensor.

Contrast
Indicates the contrast level. Values must be in the range 0.5 to 2, with a default value of 1.

ExposureTime
Indicates the exposure time in increments of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0. Note that this property is only valid if AutoExposure is set to off.

FrameInterval
Indicates the frame interval in units of 1/10,000 of a second. The value range is 0 to 4000, and the default is 0. Note that this property is only valid if AutoExposure is set to off.

FrameRate
Frames per second for the acquisition. This property is read only, and the possible values for the color sensor are 12, 15, and 30 (default). It reflects the actual frame rate when running.

Gain
Indicates a multiplier for the RGB color values. The value range is 1.0 to 16.0, and the default is 1.0. Note that this property is only valid if AutoExposure is set to off.

Gamma
Indicates the gamma measurement. Values must be in the range 1 to 2.8, with a default value of 2.2.

Hue
Indicates the hue setting. Values must be in the range -22 to 22, with a default value of 0.

PowerLineFrequency
Option for reducing flicker caused by the frequency of a power line. Values are Disabled, FiftyHertz, and SixtyHertz. The default is Disabled. Note that this property is only valid if AutoExposure is set to on.

Saturation
Indicates the saturation level. Values must be in the range 0 to 2, with a default value of 1.

Sharpness
Indicates the sharpness level. Values must be in the range 0 to 1, with a default value of 0.5.

WhiteBalance
Indicates the color temperature in degrees Kelvin. The value range is 2700 to 6500, and the default is 2700. Note that this property is only valid if AutoWhiteBalance is set to off.

You can optionally set some of the properties shown in the previous step. For example, you might be acquiring images in a low-light situation. You could adjust the acquisition for this by setting the BacklightCompensation property to LowLightsPriority, which favors a low light level.
src.BacklightCompensation = 'LowLightsPriority';
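The AutoExposure dependency described for the color sensor properties can be sketched as follows. This is a minimal sketch, assuming a connected Kinect V1 and the src object from the earlier step; the numeric values are arbitrary illustrations chosen from within the documented ranges, not recommended settings.

```matlab
% Sketch: switch the color sensor to manual exposure.
% Assumes src is the color source returned by getselectedsource(vid).
src.AutoExposure = 'off';   % manual mode: ExposureTime, FrameInterval, and Gain apply
src.ExposureTime = 200;     % illustrative value, in units of 1/10,000 s (range 0 to 4000)
src.Gain = 2.0;             % illustrative multiplier (range 1.0 to 16.0)

% Return to automatic exposure so that BacklightCompensation and
% Brightness apply again.
src.AutoExposure = 'on';
```

Per the property descriptions, setting a property that is not valid for the current AutoExposure state is expected to produce a warning rather than take effect.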
Preview the color stream by calling preview on the color sensor object you created.
preview(vid);
When you are done previewing, close the preview window.
closepreview(vid);
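To keep an RGB image rather than only preview the stream, a single frame can be grabbed from the color object. The following is a minimal sketch, assuming a connected device and the vid object created earlier; getsnapshot is the Image Acquisition Toolbox function for acquiring one frame, and the variable name rgbFrame is only illustrative.

```matlab
% Sketch: acquire one RGB frame from the color sensor object.
% Assumes vid is the videoinput object created for DeviceID 1.
rgbFrame = getsnapshot(vid);   % returns one frame; start is not required first

% Display the frame with base MATLAB graphics.
image(rgbFrame);
axis image;                    % keep the pixel aspect ratio
```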
Create the videoinput object for the depth sensor. Note that a second object is created (vid2), and DeviceID 2 is used for the depth sensor.
vid2 = videoinput('kinect',2,'Depth_640x480');
Look at the device-specific properties on the source device, which is the depth sensor on the Kinect.
src = getselectedsource(vid2);
src

Display Summary for Video Source Object:

   General Settings:
     Parent = [1x1 videoinput]
     Selected = on
     SourceName = DepthSource
     Tag =
     Type = videosource

   Device Specific Properties:
     Accelerometer = [0.0 -1.0 0.0]
     BodyPosture = Standing
     CameraElevationAngle = 4
     DepthMode = Default
     FrameRate = 30
     IREmitter = on
     SkeletonsToTrack = [1x0 double]
     TrackingMode = off
As you can see in the output, the depth sensor has a set of device-specific properties associated with skeletal tracking. These properties are specific to the depth sensor.
Device-Specific Properties – Depth Sensor

Accelerometer
Returns a 3-D vector of acceleration data for both the color and depth sensors. The data is updated while the device is running or previewing. This 1 x 3 double represents the x, y, and z values of acceleration in gravity units g (9.81 m/s^2). For example, [0.06 -1.00 -0.09] represents values of x as 0.06 g, y as -1.00 g, and z as -0.09 g.

BodyPosture
Indicates whether the tracked skeletons are standing or sitting. Values are Standing (gives 20 point skeleton data) and Seated (gives 10 point skeleton data, using joint indices 2 - 11). Standing is the default.
Note that if BodyPosture is set to Seated mode, and TrackingMode is set to Position, no position is returned, since Position is the location of the hip joint and the hip joint is not tracked in Seated mode.
See the subsection “BodyPosture Joint Indices” at the end of this example for the list of indices of the 20 skeletal joints.

CameraElevationAngle
Controls the angle of the sensor lens. This is the camera angle relative to the ground. The value must be an integer in the range -27 to 27 degrees. The default value is the last set value, since this is a sticky setting. Only set it if you want to change the angle of the camera. This property is shared with the color sensor.

DepthMode
Indicates the range of depth in the depth map. Values are Default (range of 50 to 400 cm) and Near (range of 40 to 300 cm).

FrameRate
Frames per second for the acquisition. This property is read only and is fixed at 30 for the depth sensor for all formats. It reflects the actual frame rate when running.

IREmitter
Controls whether the IR emitter is on or off. Values are on and off. Initially, the default value is on. However, this is a sticky property, so the default value is the last set value. If you set it to off, it remains off in future uses until you change the setting. This property is useful for avoiding interference when using multiple Kinect devices.

SkeletonsToTrack
Indicates the Skeleton Tracking ID returned as part of the metadata. Values are:
[] Default tracking
[TrackingID1] Track 1 skeleton with Tracking ID = TrackingID1
[TrackingID1 TrackingID2] Track 2 skeletons with Tracking IDs = TrackingID1 and TrackingID2

TrackingMode
Indicates the tracking state. Values are:
Skeleton tracks full skeleton with joints
Position tracks hip joint position only
Off disables skeleton position tracking (default)
Note that if BodyPosture is set to Seated mode, and TrackingMode is set to Position, no position is returned, since Position is the location of the hip joint and the hip joint is not tracked in Seated mode.

Start the second videoinput object (the depth stream).
start(vid2);
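The property display for the depth source shows TrackingMode = off, so skeleton tracking presumably has to be enabled on the depth source before the object is started for the skeletal metadata to be populated. A minimal sketch of that variant of the start step, assuming a connected device and a vid2 object that is not yet running (the name srcDepth is only illustrative):

```matlab
% Sketch: enable skeletal tracking on the depth source, then start acquiring.
% Assumes vid2 is the depth videoinput object and has not been started yet.
srcDepth = getselectedsource(vid2);
srcDepth.TrackingMode = 'Skeleton';   % track full skeletons with joints
srcDepth.BodyPosture = 'Standing';    % 20 point skeleton data (the default)

start(vid2);
```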
Skeletal data is accessed as metadata on the depth stream using getdata.
% Get the data on the object.
[frame, ts, metaData] = getdata(vid2);

% Look at the metadata to see the parameters in the skeletal data.
metaData

metaData =

10x1 struct array with fields:
    AbsTime: [1x1 double]
    FrameNumber: [1x1 double]
    IsPositionTracked: [1x6 logical]
    IsSkeletonTracked: [1x6 logical]
    JointDepthIndices: [20x2x6 double]
    JointImageIndices: [20x2x6 double]
    JointTrackingState: [20x6 double]
    JointWorldCoordinates: [20x3x6 double]
    PositionDepthIndices: [2x6 double]
    PositionImageIndices: [2x6 double]
    PositionWorldCoordinates: [3x6 double]
    RelativeFrame: [1x1 double]
    SegmentationData: [640x480 double]
    SkeletonTrackingID: [1x6 double]
    TriggerIndex: [1x1 double]
These metadata fields are related to tracking the skeletons.
MetaData Fields

AbsTime
A 1 x 1 double that represents the full timestamp, including date and time, in MATLAB® clock format.

FrameNumber
A 1 x 1 double that represents the frame number.

IsPositionTracked
A 1 x 6 Boolean matrix of true/false values for the tracking of the position of each of the six skeletons. A 1 indicates the position is tracked and a 0 indicates it is not.

IsSkeletonTracked
A 1 x 6 Boolean matrix of true/false values for the tracked state of each of the six skeletons. A 1 indicates it is tracked and a 0 indicates it is not.

JointDepthIndices
If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the depth image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints.

JointImageIndices
If the BodyPosture property is set to Standing, this is a 20 x 2 x 6 double matrix of x- and y-coordinates for 20 joints in pixels relative to the color image, for the six possible skeletons. If BodyPosture is set to Seated, this would be a 10 x 2 x 6 double for 10 joints.

JointTrackingState
This 20 x 6 integer matrix contains enumerated values for the tracking accuracy of each joint for all six skeletons. Values include:
0 not tracked
1 position inferred
2 position tracked

JointWorldCoordinates
If BodyPosture is set to Standing, this is a 20 x 3 x 6 double matrix of x-, y-, and z-coordinates for 20 joints, in meters from the sensor, for the six possible skeletons. If it is set to Seated, this would be a 10 x 3 x 6 double for 10 joints.
See step 9 for the syntax on how to see this data.

PositionDepthIndices
A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the depth image.

PositionImageIndices
A 2 x 6 double matrix of X and Y coordinates of each skeleton in pixels relative to the color image.

PositionWorldCoordinates
A 3 x 6 double matrix of the X, Y, and Z coordinates of each skeleton in meters relative to the sensor.

RelativeFrame
This 1 x 1 double represents the frame number relative to the execution of a trigger if triggering is used.

SegmentationData
A double array the size of the image, with each pixel mapped to a tracked or detected skeleton, represented by the numbers 1 to 6. This segmentation map is a bitmap with pixel values corresponding to the index of the person in the field of view who is closest to the camera at that pixel position. A value of 0 means there is no tracked skeleton.

SkeletonTrackingID
This 1 x 6 integer matrix contains the tracking IDs of all six skeletons. These IDs track specific skeletons using the SkeletonsToTrack property in step 5.
Tracking IDs are generated by the Kinect and change from acquisition to acquisition.

TriggerIndex
A 1 x 1 double that represents the trigger the event is associated with if triggering is used.

Look at any individual property by drilling into the metadata. For example, look at the IsSkeletonTracked property.
metaData.IsSkeletonTracked

ans =

     1     0     0     0     0     0
In this case the data shows that of the six possible skeletons, there is one skeleton being tracked and it is in the first position. If you have multiple skeletons, this property is useful to confirm which ones are being tracked.
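When several people are in view, the positions of the tracked skeletons can be pulled out of this logical vector with find. The sketch below uses the IsSkeletonTracked value shown above as a literal so that it runs without a device. With a live acquisition, the vector would instead come from one element of the metadata struct array, for example metaData(end).IsSkeletonTracked, since getdata returned one element per frame in the output above.

```matlab
% IsSkeletonTracked value copied from the example output above.
isTracked = logical([1 0 0 0 0 0]);

% Positions (1 to 6) of the skeletons that are currently tracked.
trackedIdx = find(isTracked);

% Number of tracked skeletons.
numTracked = numel(trackedIdx);

fprintf('Tracking %d skeleton(s) at position(s): %s\n', ...
    numTracked, mat2str(trackedIdx));
```

The resulting positions can then be used as the third index into fields such as JointWorldCoordinates.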
Get the joint locations for the first person in world coordinates using the JointWorldCoordinates property. Since this is the person in position 1, the index uses 1.
metaData.JointWorldCoordinates(:,:,1)

ans =

   -0.1408   -0.3257    2.1674
   -0.1408   -0.2257    2.1674
   -0.1368   -0.0098    2.2594
   -0.1324    0.1963    2.3447
   -0.3024   -0.0058    2.2574
   -0.3622   -0.3361    2.1641
   -0.3843   -0.6279    1.9877
   -0.4043   -0.6779    1.9877
    0.0301   -0.0125    2.2603
    0.2364    0.2775    2.2117
    0.3775    0.5872    2.2022
    0.4075    0.6372    2.2022
   -0.2532   -0.4392    2.0742
   -0.1869   -0.8425    1.8432
   -0.1869   -1.2941    1.8432
   -0.1969   -1.3541    1.8432
   -0.0360   -0.4436    2.0771
    0.0382   -0.8350    1.8286
    0.1096   -1.2114    1.5896
    0.1196   -1.2514    1.5896
The columns represent the X, Y, and Z coordinates in meters of the 20 points on skeleton 1.
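Because each row is one joint, simple geometry on the rows gives quantities such as the distance between two joints. The following is a minimal sketch using two rows copied from the output above, with the row numbers taken from the joint order listed in the “BodyPosture Joint Indices” subsection (Hip_Center = 1, Head = 4). The variable names are only illustrative.

```matlab
% Rows copied from the JointWorldCoordinates output above (meters).
hipCenter = [-0.1408 -0.3257 2.1674];   % joint index 1 (Hip_Center)
head      = [-0.1324  0.1963 2.3447];   % joint index 4 (Head)

% Euclidean distance between the two joints, in meters.
hipToHead = norm(head - hipCenter);

% Z-coordinate of the head, in meters.
headZ = head(3);

fprintf('Hip-to-head distance: %.3f m, head z: %.3f m\n', hipToHead, headZ);
```

With live data, the same rows would be read as joints = metaData(k).JointWorldCoordinates(:,:,1) followed by joints(1,:) and joints(4,:), where k selects one frame of the metadata struct array.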
Optionally view the segmentation data as an image.
% View the segmentation data as an image.
imagesc(metaData.SegmentationData);

% Set the color map to jet to color code the people detected.
colormap(jet);
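The segmentation map can also be used to isolate the pixels that belong to one person. The following is a minimal sketch on a tiny hand-made matrix (a stand-in, not real sensor output) so that it runs without a device. With a live acquisition, seg would be the SegmentationData field of one metadata element.

```matlab
% Toy 3-by-3 stand-in for SegmentationData (hypothetical values, not sensor output).
% 0 = no tracked skeleton; 1 and 2 = skeleton indices.
seg = [0 0 1;
       0 1 1;
       2 2 0];

% Logical mask of the pixels assigned to skeleton 1.
mask1 = (seg == 1);

% Number of pixels covered by skeleton 1.
nPixels1 = nnz(mask1);

% Indices of all skeletons present in the map, excluding the background value 0.
present = setdiff(unique(seg), 0);
```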
BodyPosture Joint Indices
The BodyPosture property, in step 5, indicates whether the tracked skeletons are standing or sitting. Values are Standing (gives 20 point skeleton data) and Seated (gives 10 point skeleton data, using joint indices 2 - 11).
This is the order of the joints returned by the Kinect adaptor:
Hip_Center = 1;
Spine = 2;
Shoulder_Center = 3;
Head = 4;
Shoulder_Left = 5;
Elbow_Left = 6;
Wrist_Left = 7;
Hand_Left = 8;
Shoulder_Right = 9;
Elbow_Right = 10;
Wrist_Right = 11;
Hand_Right = 12;
Hip_Left = 13;
Knee_Left = 14;
Ankle_Left = 15;
Foot_Left = 16;
Hip_Right = 17;
Knee_Right = 18;
Ankle_Right = 19;
Foot_Right = 20;
When BodyPosture is set to Standing, all 20 indices are returned, as shown above. When BodyPosture is set to Seated, numbers 2 through 11 are returned, since this represents the upper body of the skeleton.