INDONESIAN STREET FOOD CALORIE ESTIMATION USING MASK R-CNN AND MULTIPLE LINEAR REGRESSION

Bibliographic Details
Main Author: Aditama, Nadya
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/63085
Institution: Institut Teknologi Bandung
Description
Summary: ABSTRACT. INDONESIAN STREET FOOD CALORIE ESTIMATION USING MASK R-CNN AND MULTIPLE LINEAR REGRESSION. By Nadya Aditama, NIM: 23520039 (Master’s Program in Informatics).

There are two problems in building an image-based calorie estimation model that uses Mask R-CNN to obtain the food shape and a linear regression model to predict the food weight. First, according to Abdelhady et al. (2019), a simple linear regression model has a lower R-squared score than a multiple linear regression model, so more features need to be added to the measurement. Second, some food objects are occluded, so the system cannot obtain the true food shape. In this research, the Mask R-CNN model is therefore trained on an amodally annotated dataset, so that the model is expected to produce segmentation results that include the occluded part of each object, as evaluated by Qi et al. (2019) for the amodal instance segmentation task on the KINS dataset.

An image dataset of Indonesian street food was created for this research. The foods in the dataset are tahu, tempe, bakwan, cireng, bolu, and serabi. The images were taken manually, with varying amounts of food on plates and in different arrangements, both occluded and non-occluded.

The calorie estimation model is divided into three models: the detection model, the estimation model, and a combined model that joins detection and estimation. In the combined model, the image is first segmented by the Mask R-CNN model. From the segmentation results, the area, perimeter, length, and width of each object are extracted and used to predict the weight of the food with a multiple linear regression model. The predicted weight is then converted to kilocalories.

In the development of the detection model, the ResNeXt-101-FPN model had a validation mAP close to that of the ResNet-101-FPN model in segmenting amodally annotated objects: 91.74% for ResNet-101-FPN and 91.47% for ResNeXt-101-FPN. In the estimation model, the multiple linear regression model with the four proposed features has an R-squared score of 0.804, and its average MAE over all classes on the test data is 5.254. In the combined model, the best Mask R-CNN model is the one with the ResNeXt-101-FPN backbone. This model detected and segmented food with an average F1 score of 0.821 at an IoU threshold above 0.85 for images containing occluded objects, and 0.994 at an IoU threshold above 0.9 for images containing non-occluded objects. The proposed multiple linear regression model obtains an average MAE over all classes of 8.354 in the occluded object scenario and 11.256 in the non-occluded object scenario.

Even so, the model has drawbacks. In the detection model, there are still false positives among occluded objects, and the segmentation results do not closely match the ground truth. When predicting objects that are not in the training data, false positive detections still occur among occluded objects. Overall, amodal instance segmentation helps obtain the features needed to estimate the calories of occluded food, with an average MAE that is not too large. However, in the occluded object scenario, the multiple linear regression model with the proposed features does not perform as well as the other models for some food classes, because the segmentation of occluded objects is imperfect. A multiple linear regression model is more appropriate for measuring calories of non-overlapping objects.

Keywords: Mask R-CNN, Multiple Linear Regression, Calorie Estimation, Amodal Instance Segmentation, Indonesian Street Food.
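The combined model described in the abstract (segmentation mask, then area, perimeter, length, and width, then predicted weight, then kilocalories) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The feature definitions (pixel counts and bounding-box extents), the function names, the regression coefficients, and the energy density value are all assumptions made for illustration; the thesis may compute these quantities differently, and no fitted coefficients are given in the abstract.

```python
from typing import Dict, List

Mask = List[List[int]]  # binary instance mask: 1 = food pixel, 0 = background

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def mask_features(mask: Mask) -> Dict[str, float]:
    """Return area, perimeter, length and width of one mask, in pixels.

    Assumed definitions: area is the foreground pixel count, perimeter is
    the number of foreground pixels with a background (or out-of-image)
    4-neighbour, and length/width are the longer/shorter bounding-box sides.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    ys: List[int] = []
    xs: List[int] = []
    perimeter = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            ys.append(y)
            xs.append(x)
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < rows and 0 <= nx < cols)
                if outside or not mask[ny][nx]:
                    perimeter += 1  # count each boundary pixel once
                    break
    if not ys:
        return {"area": 0.0, "perimeter": 0.0, "length": 0.0, "width": 0.0}
    height = max(ys) - min(ys) + 1
    breadth = max(xs) - min(xs) + 1
    return {
        "area": float(len(ys)),
        "perimeter": float(perimeter),
        "length": float(max(height, breadth)),
        "width": float(min(height, breadth)),
    }


def predict_weight(features: Dict[str, float],
                   coefficients: Dict[str, float],
                   intercept: float) -> float:
    """Multiple linear regression prediction: intercept + sum(b_k * x_k)."""
    return intercept + sum(coefficients[k] * features[k] for k in coefficients)


def weight_to_kcal(weight_g: float, kcal_per_100g: float) -> float:
    """Convert a weight in grams to kilocalories using an energy density."""
    return weight_g * kcal_per_100g / 100.0


if __name__ == "__main__":
    # A 3 x 5 pixel rectangle of "food" inside a 5 x 7 image.
    demo_mask = [[0] * 7] + [[0, 1, 1, 1, 1, 1, 0] for _ in range(3)] + [[0] * 7]
    # Hypothetical coefficients and energy density, for illustration only.
    demo_coef = {"area": 0.5, "perimeter": 0.1, "length": 0.2, "width": 0.3}
    feats = mask_features(demo_mask)
    grams = predict_weight(feats, demo_coef, intercept=2.0)
    print(feats, grams, weight_to_kcal(grams, kcal_per_100g=200.0))
```

In practice the coefficients would be fitted per food class on measured weights, and pixel features would need a consistent camera distance or a reference object to be comparable across images; neither detail is specified in the abstract.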