Query-by-singing based music retrieval

Bibliographic Details
Main Author: Goh, Li-Xian.
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2010
Online Access: http://hdl.handle.net/10356/38814
Institution: Nanyang Technological University
Description
Summary: Imagine that you have heard a new song on the radio or in a music shop, liked it a lot, and recorded it with your mobile phone. Or imagine that you remember part of a song but have forgotten its title. In both situations, neither the song title nor the artiste name is known, but you would like to retrieve the song. However, the most common method of retrieving a song currently requires the song title or artiste name as input. Hence, a query-by-singing based music retrieval system is proposed as a solution for these scenarios. The system is implemented in MATLAB and is built on a music database of 500 songs. Experiments are conducted to test the accuracy and efficiency of the completed system. Implementing the music retrieval system involves segmenting the songs into sentences based on time-coded lyrics. Each sentence is further divided into fixed-length overlapping frames for feature extraction. Mel-Frequency Cepstral Coefficients (MFCCs) are extracted as features from the music database and clustered using the K-means algorithm. The clustering output is then processed with a Bag-of-Words model to form a histogram for each sentence of a song. A Random Projection Tree (RP-Tree) is used to index the histograms for efficient retrieval. Two types of queries are collected for the experiments: recorded playback and a user singing a sentence of a song. The same process of audio segmentation, feature extraction and data training is applied to each query. The system compares a query with the music database and retrieves the results closest to the query, ranked by the chi-square distance between the query and the database entries. Analysis of the results shows that the system's accuracy is reasonably high for recorded queries but lower for sung queries. The efficiency of the system was greatly improved by the use of the RP-Tree.
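
The histogram-and-distance stage described in the summary can be illustrated with a minimal sketch. This is not the project's MATLAB code: it is a pure-Python toy under stated assumptions. The two-dimensional feature vectors stand in for MFCC frames, the hand-written `codebook` stands in for K-means centroids, the sentence ids are hypothetical, and a brute-force scan replaces the RP-Tree index. The chi-square distance has several common variants; the symmetric form with a factor of 0.5 used here is an assumption.

```python
from collections import Counter


def split_into_frames(samples, frame_len, hop):
    """Split a sequence into fixed-length frames; hop < frame_len gives overlap."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def nearest_codeword(vec, codebook):
    """Index of the codebook centroid closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])))


def bow_histogram(frame_features, codebook):
    """Normalized Bag-of-Words histogram of codeword counts for one sentence."""
    counts = Counter(nearest_codeword(v, codebook) for v in frame_features)
    total = len(frame_features)
    return [counts[k] / total for k in range(len(codebook))]


def chi_square(h1, h2):
    """Symmetric chi-square distance (assumed variant); bins empty in both are skipped."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def rank(query_hist, database):
    """Sentence ids sorted by increasing chi-square distance (brute force, no RP-Tree)."""
    return sorted(database, key=lambda sid: chi_square(query_hist, database[sid]))


# Toy data: every value below is made up for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]  # stand-in for K-means centroids
sentences = {
    "song_a/s1": [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.0, 0.2]],
    "song_b/s1": [[5.1, 4.9], [4.8, 5.2], [1.0, 1.0], [5.0, 5.0]],
}
database = {sid: bow_histogram(feats, codebook) for sid, feats in sentences.items()}

query_features = [[0.2, 0.1], [1.1, 1.0], [0.9, 0.8], [0.1, 0.1]]
query_hist = bow_histogram(query_features, codebook)
ranking = rank(query_hist, database)  # closest sentence first
```

In a fuller pipeline, `split_into_frames` would be applied to the audio samples of each lyric sentence and an MFCC routine would turn each frame into a feature vector. The linear scan in `rank` is the step that an index such as the RP-Tree is meant to speed up.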