ADAPTATION OF MOBILENETV2-TSM FOR GESTURE BASED NATURAL USER INTERFACE
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/50589 |
Institution: | Institut Teknologi Bandung |
Summary: | Hand gestures can compensate for the current shortcomings of
natural user interfaces. Kopuklu et al. (2019) studied hand gesture recognition
for interfaces using a 3D-CNN. This approach provides good accuracy but has a
high computation time. In another study, Lin et al. (2019) proposed the
Temporal Shift Module (TSM), a 2D-CNN-based method for learning spatiotemporal
features. TSM has a lower computation time than a 3D-CNN. However, it cannot
directly replace the 3D-CNN model in a hand gesture interface, because the two
approaches differ in their input and output characteristics.
In this final project, TSM is adapted so that it can be applied to a
hand gesture interface. MobileNetV2 is used as the TSM backbone because it
has a low computation time. The adaptations adjust the detection and
activation mechanisms for hand gestures to the input and output
characteristics of TSM. In the detection mechanism, the detector uses a motion
detection algorithm so that it can process hand gesture input frame by frame.
In the activation mechanism, in addition to the weighted accuracy method
of Kopuklu et al. (2019), a frame abandonment mechanism is
applied after activation.
With these adaptations, the resulting hand gesture interface performs well
in terms of both accuracy and computation time. The adapted interface
achieves 96.54% accuracy at a processing speed of 67 fps. This result is
also better than that of the 3D-CNN-based approach.
|
---|
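
The summary relies on the temporal shift idea from Lin et al. (2019) without showing it. The following is a minimal sketch of that operation, not the implementation used in this final project. It assumes a clip stored as `clip[t][c]`, one feature value per frame and channel, with spatial dimensions omitted for brevity. The names `temporal_shift` and `fold_div` are illustrative assumptions. One fraction of the channels is taken from the next frame, another from the previous frame, and the rest stay in place, so a 2D network can mix information across neighbouring frames without 3D convolutions.

```python
from typing import List

# clip[t][c]: one feature value for frame t and channel c
# (spatial dimensions are left out to keep the sketch small).
Clip = List[List[float]]


def temporal_shift(clip: Clip, fold_div: int = 8) -> Clip:
    """Shift a fraction of the channels along the time axis.

    Hypothetical helper for illustration only. The first 1/fold_div of the
    channels is read from the next frame, the second 1/fold_div from the
    previous frame, and the remaining channels are copied unchanged.
    Positions with no source frame are zero-padded.
    """
    num_frames = len(clip)
    num_channels = len(clip[0])
    fold = num_channels // fold_div
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1  # read from the next frame
            elif c < 2 * fold:
                src = t - 1  # read from the previous frame
            else:
                src = t      # channel is not shifted
            if 0 <= src < num_frames:
                shifted[t][c] = clip[src][c]
    return shifted
```

A per-frame interface such as the one described in the summary cannot read from future frames, so an online variant would presumably keep only the shift from the previous frame. That detail is an assumption here and is not stated in the summary.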