Posts Tagged ‘gesture’

Development: complete!

I’ve been quiet for the past few weeks because I’ve been frantically fixing those last-minute bugs that one invariably finds in a development project like this. But it was all worth it, because I spent three days this week using my software to teach 9- and 10-year-olds about astronomy.

Shirley Community Nursery &amp; Primary School in Cambridge were kind enough to let me teach a few lessons. I had an excellent time at Shirley: the students and staff were all helpful, friendly and enthusiastic, which made my job both pleasant and easy!

The sessions were well received by the children. They drew a few ‘oohs’, ‘aahs’ and ‘awesomes’, so the biggest challenge – engaging the children in the activity – was met. I’m studying for a computing degree, so my main focus is to assess the technical implementation of my work. That naturally means ‘did it work?’, but there is no point in producing something technically proficient unless it achieves its purpose well. In this case the purpose was to impart learning, and feedback indicates that this aim was also met.

I asked the children to complete a Likert-scale survey after the intervention. The ‘enjoyment factor’ shown in the results is very positive. Similarly, the children say that they learned about astronomy because of the lesson. Good news! Some of the discursive feedback was really helpful and insightful too. Comments such as ‘I think the kinect should be able to track which person is controlling it so it doesn’t get confused’ and ‘the information text should also be read out by the computer so that blind people can hear it’ show a depth of thought that I perhaps foolishly wasn’t expecting.

Technically, the software worked well. I included three different control types: deictic gestures (pointing and hovering to control a cursor), symbolic gestures (rotate, zoom, pan) and voice commands. Probably the most highly developed and accurate method (from a technical perspective) is my deictic control method. This is highly tuned and ‘just works’. Interestingly, the children pretty much ignored this control method: for them it was just expected behaviour, requiring very little effort to understand and use proficiently.
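At its core, ‘hover to select’ deictic control is just a dwell timer: the cursor has to stay within a small radius of where it settled for a set time before a click fires, and any larger movement resets the clock. Here’s a minimal sketch of that idea in Python (the project itself is C#/.NET, and the class name and thresholds here are mine, not from the original):

```python
class HoverClickTracker:
    """Hover-to-select for a pointer cursor (deictic control).

    If the cursor stays within `radius` pixels of where it first
    settled for at least `dwell_time` seconds, a 'click' fires once.
    Moving further than `radius` resets the timer. All constants are
    illustrative, not taken from the original project.
    """

    def __init__(self, radius=25.0, dwell_time=1.5):
        self.radius = radius
        self.dwell_time = dwell_time
        self.anchor = None       # where the current hover started
        self.anchor_time = None  # when it started
        self.fired = False       # has this dwell already clicked?

    def update(self, x, y, t):
        """Feed one cursor sample; returns True once per completed dwell."""
        moved_away = (
            self.anchor is None
            or (x - self.anchor[0]) ** 2 + (y - self.anchor[1]) ** 2
            > self.radius ** 2
        )
        if moved_away:
            # Start (or restart) the dwell at the new position.
            self.anchor, self.anchor_time, self.fired = (x, y), t, False
            return False
        if not self.fired and t - self.anchor_time >= self.dwell_time:
            self.fired = True
            return True
        return False
```

In the real system the (x, y) samples would come from the tracked hand joint projected into screen space, and the returned ‘click’ would drive whatever UI element sits under the cursor.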

My symbolic control method was arguably the least successful technically. The success rate of recognising gestures was around 60% (as compared with around 95% for the deictic method). The children commented on this, were occasionally frustrated and had a long list of ideas to improve it. Great!

And finally, the voice control method was very popular. Of the three, it was voice control that prompted the most ‘wow’s, and the children really enjoyed shouting at the computer to make it work. Technically it performed well: I told the Kinect to listen only to audio coming from an angle of 0 radians and to suppress background noise and echoes, and it did a sterling job of picking out only the audio we wanted and ignoring the rest. Microsoft’s Speech Recognition Engine could do with being a little faster, but overall this part of my project was remarkably stable.

I hope to run these sessions in another school in January, so it will be interesting to see whether I get similar results. In the meantime I have enough information to begin writing up my findings. Here’s hoping I can articulate what I’ve done in such a way as to yield marks. I fear a mismatch between the amount of effort I’ve put into this project and the marking scheme that will define my overall grade.

This evening I finally published the gesture recording and recognition project I’ve been working on. With the help of the Kinect community, especially a member who goes by the name of Rhemyst, we have produced a library which introduces developers to vector-based gesture recognition.

Many of the approaches I’ve seen elsewhere use specific positional tracking to recognise gestures – i.e. tracking a hand and matching its movement profile against a series of coordinates or something. This is great, of course, and can actually offer very good recognition. But the Dynamic Time Warping approach is more flexible in that it can be very easily programmed by a novice. It’s great for rapid prototyping and, with the help of the community, I hope this can grow into a production-capable recognition engine. It’s not quite there yet, though…
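For the curious, the heart of a DTW recogniser really is small: compute the minimum cumulative point-to-point distance over all monotonic alignments of a candidate gesture path against each recorded template, and pick the template with the best score. A language-agnostic sketch in Python (KinectDTW itself is C#, and the function names and threshold here are illustrative, not from the library):

```python
import math

def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping between two gesture paths.

    Each sequence is a list of (x, y) joint positions; the score is
    the minimum cumulative Euclidean distance over all monotonic
    alignments of the two paths, so gestures performed at different
    speeds still match.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of seq_a[:i] against seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch: repeat a frame of seq_b
                cost[i][j - 1],      # stretch: repeat a frame of seq_a
                cost[i - 1][j - 1],  # advance both paths together
            )
    return cost[n][m]

def recognise(candidate, templates, threshold=2.0):
    """Return the name of the closest recorded gesture, or None."""
    best_name, best_score = None, threshold
    for name, template in templates.items():
        score = dtw_distance(candidate, template)
        if score < best_score:
            best_name, best_score = name, score
    return best_name
```

This is why a novice can program it: ‘recording’ a new gesture is just storing one example path, with no feature engineering or training step.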

So what are you waiting for? Grab a copy of the first release of KinectDTW from Codeplex now!

Please share your recorded gestures and recognition parameters with the community so that we can all learn and benefit from your experience!


Another little piece of the jigsaw: controlling a WPF ScrollViewer (with added animated easing wizardry) using swipe gestures. Freakin’ yah!

Yeah, one of those.

The trouble with the standard WPF ScrollViewer is that it doesn’t scroll very nicely. Firstly it snaps to the extremities of its components (i.e. to the edges of images) making smooth scrolling impossible. Secondly it doesn’t support .NET’s reasonably powerful animation effects. I sorted this by making my own animation mediator for a ScrollViewer, enabling all that stuff I just mentioned. Here’s a demo:

The intention is to hook this up to the Kinect gesture recogniser so that natural interactions (i.e. swipe gestures) can have a ‘natural’ effect on a menu system. Because, really, if you perform a swipe action on an object you don’t expect it to move uniformly and snap to unnatural positions; instead you would expect it to have inertia, to decelerate, and to rest in a natural position (i.e. a function of how hard you ‘pushed’ it). This WPF extension achieves that.
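That ‘inertia’ behaviour can be modelled as simple exponential deceleration: the swipe imparts a velocity, friction decays it each frame, and the offset comes to rest wherever the decay (or the content edges) stops it. A rough Python sketch of the physics (the real mediator is WPF/C#; the constants here are guesses, not values from my code):

```python
import math

def inertial_scroll(start, velocity, friction=4.0, dt=1 / 60, max_offset=1000.0):
    """Simulate a flicked scroll offset with exponential deceleration.

    `velocity` is the speed imparted by the swipe (pixels/second);
    each simulated frame the speed decays by a friction factor until
    it is negligible, and the offset is clamped to the scrollable
    range. Returns the natural resting position of the content.
    """
    offset, v = start, velocity
    while abs(v) > 1.0:                  # stop once movement is negligible
        offset += v * dt                 # advance by this frame's velocity
        v *= math.exp(-friction * dt)    # smooth, frame-rate-aware decay
        offset = min(max(offset, 0.0), max_offset)  # clamp to content edges
    return offset
```

A harder ‘push’ (larger initial velocity) travels further before resting, which is exactly the rest-position-as-a-function-of-push behaviour described above.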

Today I made a touch and hold style menu system for my Celestia project. It works by tracking a player’s hands, comparing where and for how long they are ‘hovered’, and if a certain time target is reached an event is fired.

The tricky part is to make this modular so that many different menus can be created without reinventing stuff. Currently this only works for rectangular areas but I will enhance this to cater for circles and irregular shapes soon.
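The modular version amounts to hit-testing the hand against a list of registered rectangles and running a dwell timer for whichever one it is currently inside. A Python sketch of that structure (not the actual Celestia code; the names and hold time are mine):

```python
class HoverMenu:
    """Touch-and-hold menu with registered rectangular regions.

    Tracks which region a hand is inside and for how long; once the
    hand has stayed in one region for `hold_time` seconds, that
    region's callback fires exactly once. Leaving and re-entering a
    region restarts its timer. A sketch of the idea only.
    """

    def __init__(self, hold_time=1.0):
        self.hold_time = hold_time
        self.items = []        # (name, rect, callback) triples
        self.current = None    # name of the region being hovered
        self.entered_at = None
        self.fired = False

    def add_item(self, name, rect, callback):
        """Register a menu item; rect is (left, top, width, height)."""
        self.items.append((name, rect, callback))

    def _hit(self, x, y):
        """Return (name, callback) of the region containing (x, y)."""
        for name, (left, top, w, h), cb in self.items:
            if left <= x <= left + w and top <= y <= top + h:
                return name, cb
        return None, None

    def update(self, x, y, t):
        """Feed one hand sample (screen coords plus a timestamp)."""
        name, cb = self._hit(x, y)
        if name != self.current:
            # Entered a new region (or left all regions): restart timer.
            self.current, self.entered_at, self.fired = name, t, False
        elif name and not self.fired and t - self.entered_at >= self.hold_time:
            self.fired = True
            cb()
```

Swapping `_hit` for a circle or polygon test is all it would take to support the non-rectangular shapes mentioned above, which is the point of keeping the hit-testing separate from the timing.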

An interesting thing I had to consider here is that the active area around a player must be scaled in such a way that all menu items can be reached without stretching, but is small enough that items aren’t triggered by accident. I think this will mainly be trial and error, and I may not have time to change the scaling based on how far away from the camera the player stands.
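One way to frame that scaling problem: define a reachable box centred on the player’s shoulder and map it linearly onto the whole screen, clamping at the edges. A hypothetical Python sketch (the `reach` constant is exactly the trial-and-error part, and could later become a function of the player’s distance from the camera):

```python
def hand_to_screen(hand_x, hand_y, shoulder, reach=0.45,
                   screen_w=1920, screen_h=1080):
    """Map a hand position (metres, skeleton space) to screen pixels.

    The active area is a box 2 * `reach` metres across, centred on
    the shoulder, so every on-screen item is reachable without
    stretching; positions outside the box clamp to the screen edges.
    All constants are illustrative guesses, not tuned values.
    """
    sx, sy = shoulder
    # Normalise to 0..1 across the reachable box, clamping at its edges.
    nx = min(max((hand_x - sx + reach) / (2 * reach), 0.0), 1.0)
    # Skeleton y grows upward but screen y grows downward, so flip it.
    ny = min(max((sy - hand_y + reach) / (2 * reach), 0.0), 1.0)
    return nx * screen_w, ny * screen_h
```

Making `reach` larger means less stretching but more accidental triggers, and vice versa, which is precisely the trade-off described above.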

So now I have voice controls and touch menus from Microsoft’s SDK, and full-body gesture controls from NITE. I have to decide whether to re-create the NITE stuff in the Kinect SDK, to wait for someone else to make a library of gesture controls in the SDK, or to use an SDK->NITE binding (which doesn’t even exist yet). For now, though, I think I’ll go to bed.