PVT3: a pruned video-vision transformer for tactile texture classification

Bibliographic Details
Main Author: Ouyang, Yanjia
Other Authors: Lin Zhiping
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/158296
Institution: Nanyang Technological University
Description
Summary: With recent advances in tactile sensing technology, a variety of tactile sensors have been deployed on robots, giving them the ability to perceive complex environments through touch. A typical robot touch task is recognizing different materials from the tactile data generated by different textures. In this report, we propose PVT3, a lightweight Transformer-based architecture with pruning layers that models texture representations. By using a Video-Vision Transformer backbone, spatial and temporal features are well preserved and utilized. The multi-dimensional pruning layers reduce model complexity and size without sacrificing performance. Three tactile datasets are used to test the PVT3 model. Overall, our proposed model achieves higher material classification accuracy with a smaller model size than state-of-the-art tactile texture models. This work was written as a paper and submitted to the International Conference on Intelligent Robots and Systems (IROS) 2022.
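The report itself is not reproduced in this record, so the sketch below only illustrates the kind of architecture the abstract describes: a Video-Vision-Transformer-style encoder over tactile video tokens with token pruning between blocks. It assumes PyTorch, and every name in it (TactileViViT, PruneTokens, keep_ratio, the tubelet size, the norm-based pruning score) is a hypothetical illustration rather than the authors' actual PVT3 implementation.

```python
import torch
import torch.nn as nn


class PruneTokens(nn.Module):
    """Keep the top-k tokens ranked by L2 norm (a stand-in for a learned pruning score)."""

    def __init__(self, keep_ratio: float = 0.7):
        super().__init__()
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, N, D)
        k = max(1, int(x.size(1) * self.keep_ratio))
        scores = x.norm(dim=-1)                             # (B, N)
        idx = scores.topk(k, dim=1).indices                 # (B, k)
        idx = idx.unsqueeze(-1).expand(-1, -1, x.size(-1))  # (B, k, D)
        return torch.gather(x, 1, idx)


class TactileViViT(nn.Module):
    """Tubelet-embed a tactile video, encode with Transformer blocks, prune, classify."""

    def __init__(self, in_ch=3, dim=128, depth=4, heads=4, num_classes=20,
                 tubelet=(2, 4, 4), keep_ratio=0.7):
        super().__init__()
        # A 3D convolution turns (T, H, W) tubelets into spatio-temporal tokens.
        self.embed = nn.Conv3d(in_ch, dim, kernel_size=tubelet, stride=tubelet)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
            for _ in range(depth)
        )
        self.prune = PruneTokens(keep_ratio)
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, num_classes))

    def forward(self, video: torch.Tensor) -> torch.Tensor:  # video: (B, C, T, H, W)
        x = self.embed(video).flatten(2).transpose(1, 2)      # (B, N, D) tokens
        for blk in self.blocks:
            x = self.prune(blk(x))                            # drop low-scoring tokens after each block
        return self.head(x.mean(dim=1))                       # mean-pool the surviving tokens


model = TactileViViT()
logits = model(torch.randn(2, 3, 8, 32, 32))  # e.g. 8 frames of a 32x32 tactile array
print(logits.shape)                           # torch.Size([2, 20])
```

Pruning a fixed fraction of tokens after every block is one simple way to shrink the token sequence as depth grows; the multi-dimensional pruning criterion actually used by PVT3 is described in the report, not here.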