How to start developing with Kinect
If you haven’t already done so, you will need to set up the Kinect on your computer. Follow the instructions in the link.
The Kinect sensor contains several components that allow it to capture and stream different types of data:
- Image & Video – An RGB camera enables the Kinect sensor to capture images and videos.
- Infrared – Like the Image & Video data, but captured with infrared light.
- Depth – An IR depth sensor enables the Kinect to sense the distance of objects in front of it and create a 3D mapping (point cloud).
- Audio (including its direction) – An array of 4 microphones enables the Kinect sensor to capture audio and also determine the direction and source location of the sound.
- Orientation of Kinect Sensor – A 3-axis accelerometer reports the orientation (angle) that the Kinect is currently in.
- Skeleton Tracking – Tracks 20 joints of the human skeleton, for up to 2 people at once. This advanced tracking feature is integrated into the Kinect sensor and relies on the depth sensor.
In total, the Kinect sensor streams 4 types of data: audio, color, depth data, and skeleton data.
The Kinect SDK (Software Development Kit) offers an API that allows you to get real-time data from a stream. In practice, this means you can easily access data from the Kinect sensor in your code (C++, C#…) by using libraries that were installed along with the Kinect SDK. You will instantiate classes supplied by these libraries to operate the Kinect sensor and access its data streams.
Basic Concepts of Capturing Data From Sensor
You may think of a data stream as a continuous flow of data that adds a new piece of data to the stream at every time interval. A movie shot at 24 frames per second, when played on your TV, can be thought of as a data stream: the stream is the movie transmitted over the cable, and the data are the picture frames. There are 24 frames per second, so new data (a new frame) is added to the stream every 1/24 of a second (about 41.7 milliseconds).
The Kinect sensor also streams data, at about 30 frames per second, for each of its 4 data types: audio, color, depth, and skeleton. This means a new frame of data is received every 1/30 of a second, and you access the streamed data by reading frames as they arrive.
To process every frame in real time, your program must handle each frame faster than new frames are streamed. To avoid dropping frames, make sure your application processes and releases each frame in a timely fashion. Recent frames can also be stored in a buffer, so that if a temporary load slows your program down and causes it to miss frames, it can still go back and access the last few frames.
There are two ways to access frames: by polling for a frame, or by subscribing to an event that fires when a new frame arrives.
Polling – The application code (your program) requests a frame from a stream and states how long it is willing to wait for a new frame. Your program blocks, not continuing execution, until a frame is received or the timeout expires. You can also specify an infinite timeout, in which case, if a frame is never sent, your application will wait forever.
Event – Your application registers a method to be invoked whenever a new frame is received. In this case, your application does not wait for a new frame; it continues execution, and when a new frame arrives the supplied method handles it.
As a developer, you will want to look at the samples provided with the Kinect SDK to get started. To view them:
- Click Start > type in “SDK Browser” and open the app.
- In the top menu bar, select the samples for your choice of language (C# or C++).
- Press Install on the sample whose source code you want to download.
- Open the source code with an IDE of your choice (Visual Studio, Eclipse, etc…)
The official documentation is an important resource for the developer.