DEVELOPMENT OF A BIRD SONG IDENTIFICATION MODEL USING A CONVOLUTIONAL NEURAL NETWORK (CNN)

Bibliographic Details
Main Author: Hannania, Nabila
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/74796
Institution: Institut Teknologi Bandung
Description
Summary: Birds are a highly diverse group of animals, but this diversity is threatened with extinction by human activities and natural phenomena. Identifying the bird species still present in an ecosystem is a first step toward protecting that diversity. In this final project, a Convolutional Neural Network (CNN) model was built to automate the identification of bird species in recordings of bird song. The experiments covered 24 CNN models that differed in architecture, feature extraction technique, and type of data. Three architectures were used: AlexNet, DenseNet, and VGG. Four feature extraction techniques were used: mel-spectrogram, harmonic component-based mel-spectrogram, percussive component-based mel-spectrogram, and MFCC. Two types of data were used: clean data and raw data. Three of the 24 models were then selected for hyperparameter tuning. Based on the experimental analysis, the best configuration combined the DenseNet architecture, percussive component-based mel-spectrogram features, and clean data as model input. Evaluation used two testing methods: testing with chunk data and testing with complete data. The best configuration reached an accuracy of 85.67% for method 1 and 90.42% for method 2.
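
The count of 24 models matches the product of the three design factors named in the summary (3 architectures x 4 feature extraction techniques x 2 data types). The following is a minimal sketch of how such an experiment grid could be enumerated, assuming the 24 models are the full cross-product of those factors. The variable names, the dictionary layout, and the `build_experiment_grid` function are hypothetical and do not come from the thesis.

```python
from itertools import product

# Design factors as listed in the summary; the labels are illustrative strings.
ARCHITECTURES = ["AlexNet", "DenseNet", "VGG"]
FEATURES = [
    "mel-spectrogram",
    "harmonic mel-spectrogram",
    "percussive mel-spectrogram",
    "MFCC",
]
DATA_TYPES = ["clean", "raw"]


def build_experiment_grid():
    """Return one dict per (architecture, feature, data type) combination.

    Assumption: the thesis trained every combination of the three factors.
    """
    return [
        {"architecture": arch, "feature": feat, "data": data}
        for arch, feat, data in product(ARCHITECTURES, FEATURES, DATA_TYPES)
    ]


grid = build_experiment_grid()
print(len(grid))  # number of model configurations in the grid
```

Under this assumption, the best configuration reported in the summary (DenseNet, percussive component-based mel-spectrogram, clean data) is one entry of the grid.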