PANOPTIC SEGMENTATION PERCEPTION SYSTEM FOR AUTONOMOUS ELECTRIC TRAM IN THE MIXED TRAFFIC ENVIRONMENT


Bibliographic Details
Main Author: Bintang Kusumawardhana, Dhimas
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/70809
Institution: Institut Teknologi Bandung
Description
Summary: Semantic segmentation aims to recognize all objects in an image without regard to how many objects belong to the same class. Instance segmentation, by contrast, distinguishes the individual foreground entities. Panoptic segmentation combines these two contrasting paradigms in a single architecture to provide a unified segmentation of the whole scene. Research on panoptic segmentation for autonomous trams in a mixed-traffic environment requires a dataset that represents the real-world scenario, but such data are scarce in the common public datasets. This research focuses on developing panoptic segmentation for autonomous trams, with its novelty in the creation of a dataset for the mixed-traffic environment. The thesis consists of four consecutive stages: evaluation of public datasets, data recording, data annotation, and model training. The Cityscapes, COCO, and RailSem19 datasets are evaluated to assess their compatibility with the local context. Inference on local data using models trained on those datasets fails to recognize some unique local objects. The specification for the local panoptic dataset is then derived from the findings of this evaluation. The local dataset consists of 22 object categories, of which 17 are instance categories and 5 are semantic categories. Sensor data of the mixed-traffic environment were recorded on Jl. Slamet Riyadi, Kota Solo, using 4 Sekonix cameras connected to an Nvidia DRIVE AGX Pegasus development kit. The recording yielded data sequences 120 minutes in length for each camera. The data were annotated by assigning polygon labels to every object in the recorded frames. The annotation team produced 1000 data frames that passed quality control. Post-processing of the annotated images yields a dataset in a COCO-like format, organized in nuScenes-like scenarios.
The panoptic segmentation model is trained to verify that the produced dataset is usable as a source of knowledge. Training uses the Panoptic-FPN architecture in the MMDetection framework. The qualitative evaluation shows a fair result, indicated by decent segmentation of some unique objects. The quantitative analysis reports 69% Recognition Quality (RQ), 140.6% Segmentation Quality (SQ), and 110.6% Panoptic Quality (PQ). The SQ and PQ values exceed their 100% upper bound because double annotations cause an incorrect intersection-over-union (IoU) calculation, which inflates both metrics.
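
For reference, the three reported metrics can be computed as in the minimal sketch below. It follows the standard per-class panoptic quality definition, in which a predicted segment matches a ground-truth segment only when their IoU is greater than 0.5. It is not the thesis's evaluation code. The function name `panoptic_quality`, its parameters, and the validation step are assumptions made for illustration.

```python
from typing import Dict, Sequence


def panoptic_quality(
    matched_ious: Sequence[float], num_fp: int, num_fn: int
) -> Dict[str, float]:
    """Compute RQ, SQ, and PQ for one class (hypothetical helper).

    matched_ious: IoU of every matched (true positive) segment pair.
    num_fp: number of unmatched predicted segments.
    num_fn: number of unmatched ground-truth segments.
    """
    # A valid match has 0.5 < IoU <= 1. Anything outside that range
    # points to an upstream problem, such as overlapping or duplicated
    # annotation polygons being counted twice.
    for iou in matched_ious:
        if not 0.5 < iou <= 1.0:
            raise ValueError(f"IoU {iou} is outside the valid range (0.5, 1.0]")

    num_tp = len(matched_ious)
    denominator = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        # No segments of this class in either prediction or ground truth.
        return {"rq": 0.0, "sq": 0.0, "pq": 0.0}

    # SQ: mean IoU over matched segments. RQ: an F1-style detection score.
    sq = sum(matched_ious) / num_tp if num_tp else 0.0
    rq = num_tp / denominator
    return {"rq": rq, "sq": sq, "pq": sq * rq}
```

With three matches of IoU 0.8, 0.9, and 0.7, one false positive, and one false negative, the sketch gives SQ of about 0.8, RQ of 0.75, and PQ of about 0.6. Because every valid IoU is at most 1, SQ and PQ cannot exceed 100% under this definition, so values such as 140.6% and 110.6% indicate an error in the IoU inputs and not model quality.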