Why Gesture?

As I mentioned in another post here, one of the primary driving forces behind my thesis project was the desire to explore the use of stereoscopic 3D in interactive media.  To do this, I looked at some of the tools that virtual reality researchers have been using for years and applied them to a home setting.  With VR, the goal is often to replicate the “real” world in a digital realm to the point where the audience/player reacts in the virtual world as they would in the real world.  Natural interaction goes hand-in-hand with this goal, and for me the most natural input mechanism is gestural control.

Consider for a moment a presentation in a large 3D theater like the ones in theme parks, where extreme stereoscopic 3D is used to send objects into the faces of audience members.  Watch any audience, especially a younger one, as an object comes off the screen plane and into the theater space.  Most audience members make an instinctual motion to reach out into space and touch it.  After all, to each observer, it is floating right there in front of their face.  We want to be able to touch the things that appear in our personal space, even though a virtual object cannot give us the same haptic touch-and-feel feedback as a physical one.  What that simple motion in the theater tells us is that stereoscopic 3D and gestural input are married in a fundamentally human way.  With gestural controls, we can capture that motion in a home setting.  This is just one of the reasons I decided to use the Microsoft Kinect for my thesis project.

Gesture input has been common in the home video gaming market for more than five years, since the introduction of Nintendo’s Wii.  The Wii’s incredible success brought about the Microsoft Kinect and the Sony PlayStation Move.  Among these, though, only the Kinect is truly “hands-free,” which allows for the natural gestures I mention above.  In addition, only the Kinect replicates a motion-capture volume in which the player’s skeleton is trackable.  This is key when combined with stereoscopic 3D.  Because the system knows where the player is in space, the stereo depth can be adjusted moment by moment, producing a 3D effect that more closely replicates the way we see the real world.  With this tracking, the 3D calls less attention to itself (just as depth does in everyday life), letting the player move and interact far more comfortably in a simulated virtual space.
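To make that concrete, here is a minimal sketch of one way that moment-by-moment adjustment could work.  This is not actual Kinect SDK code: get_head_position() is a hypothetical stand-in for whatever skeleton-tracking call supplies the head joint each frame, and the comfort numbers are illustrative assumptions, not values from my project.

```python
# Sketch: per-frame stereo depth adjustment driven by head tracking.
# Assumptions are flagged below; nothing here is Kinect SDK code.

EYE_SEPARATION_M = 0.063   # average human interpupillary distance, meters
MAX_PARALLAX_RATIO = 0.02  # illustrative comfort cap: parallax vs. distance


def get_head_position():
    """Hypothetical tracker call: returns (x, y, z) of the viewer's head
    in meters, with z measured outward from the screen plane."""
    return (0.0, 0.1, 2.4)  # placeholder sample frame


def stereo_settings(head_z):
    """Derive per-frame stereo camera settings from viewing distance.

    Placing the convergence (zero-parallax) plane at the viewer's actual
    distance anchors the screen surface, and shrinking the virtual camera
    separation as the viewer moves closer limits how far objects can pop
    out, so the effect stays comfortable as the player moves around.
    """
    convergence_m = head_z
    separation_m = min(EYE_SEPARATION_M, MAX_PARALLAX_RATIO * head_z)
    return separation_m, convergence_m


def update_frame():
    """Called once per rendered frame."""
    _, _, head_z = get_head_position()
    separation, convergence = stereo_settings(head_z)
    # Hand separation/convergence to the renderer's left/right camera rig.
    print(f"separation={separation:.3f} m, convergence={convergence:.1f} m")


if __name__ == "__main__":
    update_frame()
```

The point of the sketch is simply that camera separation and convergence become functions of the tracked viewing distance, rather than constants fixed when the content is authored.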

-Michael Annetta
April 16, 2012