Development of a vision-based gesture human-computer interface

Bibliographic Details
Main Author: Lim, Chun Ping.
Other Authors: Seet Gim Lee, Gerald
Format: Final Year Project
Language: English
Published: 2010
Online Access:http://hdl.handle.net/10356/40327
Institution: Nanyang Technological University
Description
Summary: A human-computer interface (HCI) involves information passing between the human and the computer. Information is imparted from the computer to the human through visual and audio means. In a traditional HCI scheme, however, a physical interface is required for intention to be conveyed to the computer; the keyboard and mouse, in particular, are pervasive in the world today. For a more intuitive HCI, the information passing from the human to the computer ought to be visual or auditory as well. Hence, the vision-based gesture (VG) HCI emerged to fill this gap. Hand gesture combined with vision, being among the most fundamental and universal means of communication, is ideal for development into an intuitive HCI.

However, detecting the hand in the image seen through the lens of a video camera is not as intuitive to the computer as it is to people. As images are represented as arrays of numbers, extracting accurate information about the hand from the captured data is a daunting task. To make matters worse, attaching meaning to the detected hand sign and hand motion requires high-level intelligence that is still elusive to computers. Nonetheless, with reasonable assumptions and restrictions, algorithms capable of detecting the hand and understanding gestures do exist. Building on the foundation of such algorithms, a VG HCI system was constructed.

The VG HCI developed consists of two stages: hand detection and gesture recognition. Hand detection involves localization of the hand and extraction of its geometrical information; gesture recognition involves attaching meaning to the hand sign and hand motion detected. The first stage was built on a 2-D appearance-based approach with an explicit shape model of the hand. With a few reasonable assumptions, this stage was achieved with three image processing techniques: background subtraction by the codebook method, skin segmentation, and contour convexity finding. To enhance the robustness of the system, a variant of the codebook method was created for use in skin segmentation.

The gesture recognition stage comprises static communicative gesture recognition and dynamic manipulative gesture extraction. A hand shape model, characterizing the hand sign, was defined as a means to identify static communicative gestures, while dynamic manipulative gestures were interpreted as the hand's motion. The hand was tracked using a Kalman tracker to endow the system with noise rejection capability.

The VG HCI system was subjected to a series of performance evaluation experiments, and the system's parameters were fine-tuned with insights gained from the experiments. From the evaluations, it was concluded that the two key modules, skin segmentation by the codebook method and the Kalman tracker, are robust; this in turn translates into the robustness of the VG HCI as a whole.
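To make the techniques named in the summary concrete, a few illustrative sketches follow. They are reconstructions of the general methods, not the project's own code. The first is a heavily simplified per-pixel codebook background model for grayscale frames: each pixel keeps a list of codewords stored as [low, high] intensity bounds, training either widens a matching codeword or spawns a new one, and at run time a pixel that matches no codeword is declared foreground. The full codebook method in the literature models colour and brightness per codeword and prunes stale entries, which this sketch omits.

import numpy as np

class CodebookModel:
    """Simplified per-pixel codebook background model (grayscale only)."""

    def __init__(self, shape, tol=10):
        self.tol = tol  # intensity tolerance when matching a codeword
        h, w = shape
        self.books = [[[] for _ in range(w)] for _ in range(h)]

    def train(self, gray):
        """Absorb one background-only frame into the model."""
        for y, row in enumerate(self.books):
            for x, book in enumerate(row):
                v = int(gray[y, x])
                for cw in book:
                    if cw[0] - self.tol <= v <= cw[1] + self.tol:
                        cw[0], cw[1] = min(cw[0], v), max(cw[1], v)
                        break
                else:
                    book.append([v, v])  # no codeword matched: create one

    def foreground(self, gray):
        """Return a 0/255 mask of pixels that match no codeword."""
        mask = np.zeros(gray.shape, np.uint8)
        for y, row in enumerate(self.books):
            for x, book in enumerate(row):
                v = int(gray[y, x])
                if not any(cw[0] - self.tol <= v <= cw[1] + self.tol
                           for cw in book):
                    mask[y, x] = 255
        return mask

Training on a few dozen background-only frames and then calling foreground() per frame reproduces the basic behaviour. The same matching logic, trained on skin-coloured samples rather than background frames, is one plausible reading of the report's "variant codebook method" for skin segmentation.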
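The project's skin segmentation uses its variant codebook method, which is not reproduced here. For orientation, the sketch below shows a common fixed-threshold alternative in the YCrCb colour space; the Cr/Cb bounds are widely used rule-of-thumb values, not values from the report.

import cv2
import numpy as np

def skin_mask(bgr):
    """Classify skin pixels with fixed Cr/Cb bounds (illustrative only)."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], np.uint8)     # Cr in [133, 173],
    upper = np.array([255, 173, 127], np.uint8)  # Cb in [77, 127]
    mask = cv2.inRange(ycrcb, lower, upper)
    # Remove speckle noise with a small morphological opening.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)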
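Contour convexity finding, used to extract the hand's geometry, can be sketched with OpenCV's convex hull and convexity-defect routines: deep, sharp defects along the hand contour are read as valleys between extended fingers. The depth and angle thresholds below are illustrative guesses, not the hand shape model defined in the report.

import cv2
import numpy as np

def count_fingers(mask):
    """Estimate extended fingers from the largest contour's convexity defects."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 0
    valleys = 0
    for s, e, f, depth in defects[:, 0]:
        start, end, far = hand[s][0], hand[e][0], hand[f][0]
        a = np.linalg.norm(end - start)
        b = np.linalg.norm(far - start)
        c = np.linalg.norm(far - end)
        # Angle at the defect point, via the law of cosines.
        angle = np.arccos(np.clip((b**2 + c**2 - a**2) / (2 * b * c + 1e-9),
                                  -1.0, 1.0))
        if depth / 256.0 > 20 and angle < np.pi / 2:  # deep, sharp valley
            valleys += 1
    return valleys + 1 if valleys else 0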
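Finally, the Kalman tracker over the hand's position can be set up as a constant-velocity filter, for example with OpenCV's cv2.KalmanFilter: state [x, y, vx, vy], measurement [x, y]. The noise covariances here are illustrative placeholders standing in for the fine-tuned values the report's experiments produced.

import cv2
import numpy as np

def make_hand_tracker(dt=1.0):
    """Constant-velocity Kalman filter over the hand centroid (x, y)."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1,  0],
                                    [0, 0, 0,  1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

# Per frame: predict first, then correct with the measured centroid, e.g.
#   kf.predict()
#   kf.correct(np.array([[cx], [cy]], np.float32))
# Between missed detections, the prediction alone carries the track, which
# is what gives the tracker its noise rejection capability.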