Acquire Image and Body Data Using Kinect V2
In Detect the Kinect V2 Devices, you see that the two sensors on the Kinect® for Windows® device are represented by two device IDs, one for the color sensor and one for the depth sensor. In that example, Device 1 is the color sensor and Device 2 is the depth sensor. This example shows how to create a videoinput object for the color sensor to acquire RGB images and then for the depth sensor to acquire body data.

Create the videoinput object for the color sensor. DeviceID 1 is used for the color sensor.

vid = videoinput('kinect', 1);

Note that you do not need to provide the video format as you do for a Kinect V1 device, since only one format is used by Kinect V2 devices.
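If you are not sure which device ID maps to which sensor, you can query the adaptor first. This is a sketch using the Image Acquisition Toolbox imaqhwinfo function; the exact device names printed depend on your installation:

```matlab
% List the devices available through the kinect adaptor.
info = imaqhwinfo('kinect');

% Display each device ID with its name so you can confirm
% which ID is the color sensor and which is the depth sensor.
for k = 1:numel(info.DeviceInfo)
    fprintf('Device %d: %s\n', ...
        info.DeviceInfo(k).DeviceID, info.DeviceInfo(k).DeviceName);
end
```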
Look at the device-specific properties on the source device, which is the color sensor on the Kinect V2 camera.
src = getselectedsource(vid);
src

Display Summary for Video Source Object:

  General Settings:
    Parent = [1x1 videoinput]
    Selected = on
    SourceName = Kinect V2 Color Source
    Tag =
    Type = videosource

  Device Specific Properties:
    ExposureTime = 4000
    FrameInterval = 333333
    Gain = 1
    Gamma = 2.2
The output shows that the color sensor has a set of device-specific properties. Unlike on Kinect V1 devices, where you can set these properties, they are read-only on Kinect V2 devices. The Kinect V2 device itself can change the properties, depending on conditions.
Device-Specific Property – Color Sensor    Description

ExposureTime     Indicates the exposure time in increments of 1/10,000 of a second.
FrameInterval    Indicates the frame interval in units of 1/1,000,000 of a second.
Gain             Indicates a multiplier for the RGB color values.
Gamma            Indicates the gamma measurement.
Preview the color stream by calling preview on the color sensor object you created. When you are done previewing, close the preview window.
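A minimal sketch of this step, using the toolbox preview and closepreview functions on the object created above:

```matlab
% Open a live preview window for the color stream.
preview(vid);

% ... inspect the live stream ...

% Close the preview window when you are done.
closepreview(vid);
```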
Create the videoinput object for the depth sensor. Note that a second object is created, and DeviceID 2 is used for the depth sensor.

vid2 = videoinput('kinect', 2);
Look at the device-specific properties on the source device, which is the depth sensor on the Kinect V2 camera.
src = getselectedsource(vid2);
src

Display Summary for Video Source Object:

  General Settings:
    Parent = [1x1 videoinput]
    Selected = on
    SourceName = Kinect V2 Depth Source
    Tag =
    Type = videosource

  Device Specific Properties:
    EnableBodyTracking = off
The output shows that the depth sensor has one device-specific property associated with body tracking. This property is specific to the depth sensor.
Device-Specific Property – Depth Sensor    Description

EnableBodyTracking    Indicates the tracking state. When set to on, it returns body metadata. The default is off.
Collect body metadata by turning on body tracking, which is off by default.
src.EnableBodyTracking = 'on';
Start the second videoinput object (the depth stream).
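As a sketch, starting the acquisition could look like this. wait is the toolbox function that blocks until the acquisition finishes; the 10-second timeout is an assumption, not part of the original example:

```matlab
% Start acquiring frames on the depth object.
start(vid2);

% Block until the acquisition completes (up to 10 seconds).
wait(vid2, 10);
```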
Access the body tracking data as metadata on the depth stream using getdata. The function returns the acquired frames, their timestamps, and the body metadata.
% Get the data on the object.
[frame, ts, metaData] = getdata(vid2);

% Look at the metadata to see the parameters in the body data.
metaData

metaData =

11x1 struct array with fields:

    IsBodyTracked: [1x6 logical]
    BodyTrackingID: [1x6 double]
    BodyIndexFrame: [424x512 double]
    ColorJointIndices: [25x2x6 double]
    DepthJointIndices: [25x2x6 double]
    HandLeftState: [1x6 double]
    HandRightState: [1x6 double]
    HandLeftConfidence: [1x6 double]
    HandRightConfidence: [1x6 double]
    JointTrackingStates: [25x6 double]
    JointPositions: [25x3x6 double]
These metadata fields are related to tracking the bodies.

Body Metadata Field    Description

IsBodyTracked – A 1 x 6 logical matrix of true/false values for the tracking of the position of each of the six possible bodies. A 1 indicates the body is tracked, and a 0 indicates it is not. See the example later in this section.

BodyTrackingID – A 1 x 6 double that represents the tracking IDs for the bodies.

ColorJointIndices – A 25 x 2 x 6 double matrix of x- and y-coordinates for the 25 joints in pixels relative to the color image, for the six possible bodies.

DepthJointIndices – A 25 x 2 x 6 double matrix of x- and y-coordinates for the 25 joints in pixels relative to the depth image, for the six possible bodies.

BodyIndexFrame – A 424 x 512 double that indicates which pixels belong to tracked bodies and which do not. Use this metadata to acquire segmentation data.

HandLeftState – A 1 x 6 double that identifies the possible hand states for the left hands of the bodies.

HandRightState – A 1 x 6 double that identifies the possible hand states for the right hands of the bodies.

HandLeftConfidence – A 1 x 6 double that identifies the tracking confidence for the left hands of the bodies.

HandRightConfidence – A 1 x 6 double that identifies the tracking confidence for the right hands of the bodies.

JointTrackingStates – A 25 x 6 double matrix that identifies the tracking states for the joints.

JointPositions – A 25 x 3 x 6 double matrix indicating the location of each joint in 3-D space. See the joint list at the end of this example for the order of the 25 joints.
Look at any individual property by drilling into the metadata. For example, look at the IsBodyTracked field.

metaData.IsBodyTracked

ans =

     1     0     0     0     0     0
In this case the data shows that of the six possible bodies, there is one body being tracked and it is in the first position. If you have multiple bodies, this property is useful to confirm which ones are being tracked.
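As a small usage sketch, you can convert this logical vector into body indices with find. metaData(1) is assumed here because metaData is a struct array with one element per acquired frame:

```matlab
% Indices of the bodies that are tracked in the first frame.
trackedBodies = find(metaData(1).IsBodyTracked);

% Number of tracked bodies in this frame.
numBodies = numel(trackedBodies);
```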
Get the joint locations for the first body using the JointPositions field. Since this is the body in position 1, the third index is 1.
metaData.JointPositions(:,:,1)

ans =

   -0.1408   -0.3257    2.1674
   -0.1408   -0.2257    2.1674
   -0.1368   -0.0098    2.2594
   -0.1324    0.1963    2.3447
   -0.3024   -0.0058    2.2574
   -0.3622   -0.3361    2.1641
   -0.3843   -0.6279    1.9877
   -0.4043   -0.6779    1.9877
    0.0301   -0.0125    2.2603
    0.2364    0.2775    2.2117
    0.3775    0.5872    2.2022
    0.4075    0.6372    2.2022
   -0.2532   -0.4392    2.0742
   -0.1869   -0.8425    1.8432
   -0.1869   -1.2941    1.8432
   -0.1969   -1.3541    1.8432
   -0.0360   -0.4436    2.0771
    0.0382   -0.8350    1.8286
    0.1096   -1.2114    1.5896
    0.1196   -1.2514    1.5896
    0.2969    1.2541    1.2432
    0.1360    0.5436    1.1771
    0.1382    0.7350    1.5286
    0.2096    1.2114    1.3896
    0.0196    1.1514    1.4896
The columns represent the X, Y, and Z coordinates in meters of the 25 points on body 1.
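One way to visualize these joints is a 3-D scatter plot. This sketch is not part of the original example, and it assumes the first frame's metadata, metaData(1):

```matlab
% Joint positions of body 1: 25 rows of [X Y Z] in meters.
joints = metaData(1).JointPositions(:,:,1);

% Plot each joint as a point in 3-D space.
plot3(joints(:,1), joints(:,2), joints(:,3), 'o');
grid on;
xlabel('X (m)');
ylabel('Y (m)');
zlabel('Z (m)');
title('Kinect V2 joint positions, body 1');
```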
Optionally view the segmentation data as an image using the BodyIndexFrame metadata.

% View the segmentation data as an image.
imagesc(metaData(1).BodyIndexFrame);
% Set the color map to jet to color code the people detected.
colormap(jet);
The EnableBodyTracking property indicates whether body metadata is collected. When it is set to on, the joints are returned by the Kinect V2 adaptor in the following order in the JointPositions field:

SpineBase = 1;
SpineMid = 2;
Neck = 3;
Head = 4;
ShoulderLeft = 5;
ElbowLeft = 6;
WristLeft = 7;
HandLeft = 8;
ShoulderRight = 9;
ElbowRight = 10;
WristRight = 11;
HandRight = 12;
HipLeft = 13;
KneeLeft = 14;
AnkleLeft = 15;
FootLeft = 16;
HipRight = 17;
KneeRight = 18;
AnkleRight = 19;
FootRight = 20;
SpineShoulder = 21;
HandTipLeft = 22;
ThumbLeft = 23;
HandTipRight = 24;
ThumbRight = 25;
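Using this ordering, you can index individual joints by name. In this brief sketch, the Head index of 4 comes from the list above, and metaData(1) and body 1 are assumptions carried over from the earlier steps:

```matlab
% Index of the head joint, from the list above.
Head = 4;

% [X Y Z] position in meters of the head of body 1.
headPos = metaData(1).JointPositions(Head, :, 1);

% Corresponding pixel location of the head in the depth image.
headDepthPx = metaData(1).DepthJointIndices(Head, :, 1);
```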