In our previous post, we talked about our efforts to use the Microsoft Kinect to control a large-scale table-top interface running Grasshopper. In short: we used the Kinect to sense touch and gestures…which we then used to control the canvas via keyboard and mouse events. We’ve had a lot of fun building Kinect Multitouch Interactions but – being an architecture firm – we can only spend so much time developing the code. We think we’ve created a solid foundation and would like to share it with the broader community to use, modify, and extend. Obviously, Grasshopper is only one possible application and we’d love to see how others will use this interaction. So, in the spirit of openness, we’re providing the complete source – as well as Visual Studio solution files – downloadable below. It’s our hope that those of you reading this will adapt and improve on what we’ve started.
BE WARNED: Before you jump in and try this at home, we encourage you to read this post (and the next one) to get a sense of how we went about this. Rest assured, downloading and compiling our sample code is not a terribly complex endeavor, provided you have some experience with linking and building applications. You’ll need to have some experience with C/C++ programming…so please read the detailed technical descriptions included in the SDK (and in these posts).
To get started, you’ll need the following…
1. Microsoft Kinect for Xbox with external power source and USB cables.
2. Microsoft Windows 7 or later.
3. Kinect for Windows PC SDK with drivers. We used the first public beta.
4. Microsoft .NET Framework 4.0.
5. Visual Studio 2010 IDE or similar. Express editions should work fine.
6. Our source code, which includes the Visual Studio Solution files…scroll down.
7. Projector (or similar) display. Our setup, while not expensive, is a bit exotic.
8. Kinect stabilizing mount (for walls, tables and ceilings).
Obligatory Disclaimer/Terms and Conditions of Use: By downloading, you agree to use this software at your own risk. Under no circumstances shall LMN or LMNts be liable for direct, indirect, special, incidental, or consequential damages resulting from the use, misuse, or inability to use this software. The software is provided “as is,” whereas we cannot provide a guarantee of support. LMNts does not guarantee that this software is bug-free or that it will solve all your problems…
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Ok, let’s get into the fun details of how this works…
There are a great number of possible configurations and methods of display, from something as simple as tracking touch on a standard LCD monitor to registering touch on the surface of a wall or a chair. In our example, we used a large ground glass table and a giant mirror to project the interactive workspace and UI from beneath the table. Our goal was to create a poor man’s Microsoft Surface. As it’s unlikely that you will have a similar setup, let’s assume you want to project onto a standard table from above. The table top should diffuse the projector’s light well enough for comfortable viewing without exhibiting excessive specular behavior. That’s a fancy way of saying: you will need a table with a dull surface finish, and you should position the projector at an angle (adjusting the image keystone) so that it doesn’t bounce light directly back into the depth sensor. This is important since the Kinect sensor operates by projecting infrared light across its field of view; light bouncing off any reflective surface would be reproduced as noise or invalid depth data. You don’t want that.
If you set this up on a table, the RGBD sensor (i.e., the Kinect) should be installed overhead with a field of view covering the table top. While the minimum distance from the sensor to the table is restricted by the Kinect’s capabilities (typically around 0.75 m), the maximum distance to the table determines the accuracy of touch calculations. A typical (and recommended) distance to the table is 1.00 m for the specific table dimensions. You will want to mount the Kinect in such a way that it cannot be moved or disturbed…even small vibrations cause noise in the depth data and could throw off the interaction.
If you set this up on a wall (or similar surface), the above rules still apply, but you will want to be careful to position the Kinect where its view of the surface won’t be occluded by users standing directly in the way. Our touch implementation is based on line-of-sight…overhead mounting works best for wall interaction as well.
A complete walkthrough of the Microsoft Kinect PC SDK is well beyond the scope of this post. Microsoft has some great documentation and support for getting up and running with the SDK. For clarity, we’ve built our demo interaction around Microsoft’s SkeletalViewer sample code (a popular place to start for good reason). We have NOT included Microsoft’s source, so it will be necessary to link and recompile with the actual Microsoft SDK if you want to make major changes. The Visual Studio Solution file is set up to link to the necessary dependencies, but you will have to copy the relevant sources from the SDK into the project folder of your choosing.
The source code is written in C++ and targets the .NET Framework 4.0. We have included three executable builds: SurfaceCalibration.exe, ExtentsCalibration.exe, and MissionMode.exe.
Each of these builds is a modified version of the main TouchGestures sample code. The sample included in our download builds the MissionMode executable by default. (Should you like to rebuild SurfaceCalibration.exe or ExtentsCalibration.exe, TouchGestures.h contains a globally defined switch that will trigger #ifndef code in the implementation…after building, simply rename the executables).
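A compile-time switch of this kind might look like the minimal sketch below. The macro names `BUILD_SURFACE_CALIBRATION` and `BUILD_EXTENTS_CALIBRATION` and the function `buildModeName` are hypothetical, chosen only for illustration; check TouchGestures.h for the real identifiers. The sketch also uses `#if defined` rather than `#ifndef`, purely for readability.

```cpp
#include <string>

// Hypothetical switch names; the real identifiers live in TouchGestures.h.
// Define at most one of these before building to select a calibration
// build. Leaving both undefined selects the default build.
// #define BUILD_SURFACE_CALIBRATION
// #define BUILD_EXTENTS_CALIBRATION

// Reports which variant the preprocessor selected at compile time.
std::string buildModeName() {
#if defined(BUILD_SURFACE_CALIBRATION)
    return "SurfaceCalibration";
#elif defined(BUILD_EXTENTS_CALIBRATION)
    return "ExtentsCalibration";
#else
    return "MissionMode";  // default when no switch is defined
#endif
}
```

With neither macro defined, `buildModeName()` returns "MissionMode", mirroring the default build described above.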
The executables should be run in the following order (one at a time):
1. SurfaceCalibration.exe – necessary for depth estimation across the interaction surface. This utility calibrates the Kinect relative to a surface in its field of view. Once the executable completes, a dump file called DepthAverage.dump is created in the Release folder containing the depth information in millimeters for each of the 640×480 pixels.
2. ExtentsCalibration.exe – required to map the 2D table top space to the projected interaction space (in our case, the Grasshopper canvas). Build and run this executable. If you are using a projector, move the window directly over the area you wish to track. The executable displays a square pixelated area on the display, which acts as the target point to be mapped onto the 2D depth space (normally restricted to a subset of the Kinect’s field-of-view). The square appears at each corner in turn (starting bottom-right and moving clockwise) and ends in the center; touch and hold it for a few seconds until it moves to the next calibration location. Watch carefully: the better the tracking, the quicker it moves. This continues until all five points have been displayed and logged by the calibration utility. At the end of the process, a data file called Calibration.dat is created in the Release directory containing the coordinates logged during the extents calibration process.
The controls in the UI adjust the touch parameters and thresholds used during image processing: this section displays the frame rate and provides control sliders to modify the DSurface, Finger and Mean filter threshold values. The filter kernel size can also be modified using the text control box (we’ll explain these terms in the next post). The default values in the UI are usually sufficient for a typical use case but, depending on where you position the Kinect relative to the tracked area, you may need to adjust the parameters to remove some of the noise.
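To give a feel for how a calibrated surface and a pair of depth thresholds can work together, here is a minimal sketch. It is not the code from our download. It assumes that the surface baseline is a per-pixel average of several depth frames in millimeters, that a zero depth marks an invalid reading, and that a pixel counts as a touch when its height above the surface falls inside a band. The names `TouchParams`, `surfaceMm`, `fingerMm`, `averageSurface` and `touchMask` are hypothetical and only loosely mirror the sliders above. The mean filter is not covered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Depth frame in millimeters, row-major. A value of 0 marks an invalid
// reading (an assumption of this sketch).
using DepthFrame = std::vector<std::uint16_t>;

// Hypothetical parameters, loosely mirroring the UI sliders.
struct TouchParams {
    int surfaceMm;  // minimum height above the surface to count as touch
    int fingerMm;   // maximum height above the surface to count as touch
};

// Per-pixel mean over several frames, skipping invalid (zero) samples.
DepthFrame averageSurface(const std::vector<DepthFrame>& frames) {
    if (frames.empty()) return {};
    const std::size_t n = frames.front().size();
    std::vector<std::uint32_t> sum(n, 0), count(n, 0);
    for (const DepthFrame& f : frames) {
        for (std::size_t i = 0; i < n && i < f.size(); ++i) {
            if (f[i] != 0) { sum[i] += f[i]; ++count[i]; }
        }
    }
    DepthFrame avg(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        if (count[i] > 0) avg[i] = static_cast<std::uint16_t>(sum[i] / count[i]);
    }
    return avg;
}

// Marks pixels whose height above the surface lies in [surfaceMm, fingerMm].
// With the sensor overhead, a smaller depth means a point closer to the
// sensor, so height = surface depth - live depth.
std::vector<bool> touchMask(const DepthFrame& surface, const DepthFrame& live,
                            const TouchParams& p) {
    std::vector<bool> mask(surface.size(), false);
    for (std::size_t i = 0; i < surface.size() && i < live.size(); ++i) {
        if (surface[i] == 0 || live[i] == 0) continue;  // skip invalid pixels
        const int height = static_cast<int>(surface[i]) - static_cast<int>(live[i]);
        mask[i] = (height >= p.surfaceMm && height <= p.fingerMm);
    }
    return mask;
}
```

In this sketch the lower bound rejects sensor noise at the surface itself, and the upper bound rejects hands and arms hovering well above it; widening or narrowing the band is the kind of adjustment the sliders are there for.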
3. MissionMode.exe – this is the sample touch and image processing executable that processes touch inputs, identifying and translating them into gestures and mouse events on the Grasshopper canvas. This executable should always be open and running to enable touch gesture tracking, but it does not need to be coincident with the tracked area. The window displays the touch points at ~30 frames per second. This exe should only be run once the calibration procedure is complete. In our example, the Grasshopper canvas can be pushed to the display space (in our case: the table) and the fun begins.
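Turning a touch point in depth space into a position on the display needs some mapping built from the calibration corners. One common way to do this, sketched below, is a planar homography solved from four corner correspondences. This is a swapped-in technique for illustration, not necessarily what our code does, and the names `Point`, `Homography`, `solveHomography` and `applyHomography` are hypothetical. The sketch uses only the four corners; the fifth (center) point could serve as a sanity check on the result.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <utility>

struct Point { double x; double y; };

// Row-major 3x3 homography with the last element fixed to 1.
using Homography = std::array<double, 9>;

// Solves for the homography that maps src[i] to dst[i] for four corners.
// Returns false if the corners are degenerate (e.g. three on one line).
bool solveHomography(const std::array<Point, 4>& src,
                     const std::array<Point, 4>& dst, Homography& h) {
    double a[8][9] = {};  // augmented 8x8 linear system
    for (std::size_t i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
        const double r1[9] = {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        for (int c = 0; c < 9; ++c) { a[2 * i][c] = r0[c]; a[2 * i + 1][c] = r1[c]; }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r) {
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        }
        if (std::fabs(a[pivot][col]) < 1e-12) return false;
        for (int c = 0; c < 9; ++c) std::swap(a[col][c], a[pivot][c]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int c = col; c < 9; ++c) a[r][c] -= f * a[col][c];
        }
    }
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    h[8] = 1.0;
    return true;
}

// Maps a point from depth-image space into display space.
Point applyHomography(const Homography& h, const Point& p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

A typical use would be to call `solveHomography` once with the logged corner positions and the corresponding display corners, then call `applyHomography` on every detected touch point before generating a mouse event.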
So that’s the overview. If you are testing this out with Grasshopper (which we recommend), the above gestures are implemented and should work fine without any changes to the code. In our next post, we’ll unpack and walk through each of the code functions in greater detail…