Flickr tag segmentation and classification
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2014 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/58956 |
Institution: | Nanyang Technological University |
Summary: | With the growing popularity of social media sites such as Flickr and Facebook, the amount of information on the Internet is also increasing. Some of this information is stored in tags, which are keywords or terms assigned to a piece of information. As the amount of data grows, users need to search for and obtain the information they require efficiently. Tags are therefore important, as they are widely used to filter out unwanted information and reduce the search space.
The words in tags are joined together without spaces, as in “thisphoto”. The aim of the project is therefore to implement a program that carries out tag segmentation and classification. During segmentation, the program splits a tag into its component words (for example, “thisphoto” becomes “this photo”). After segmentation, the program classifies the segmented tags (for example, “New York” is classified as “LOCATION”). The dataset of tags used in this project was obtained from Flickr, a photo-sharing website, so the tags used are for photos and images only.
The completed program was tested on a total of 1000 tags randomly extracted from the Flickr dataset. The observed segmentation accuracy was 96.5%, and the observed classification accuracy was 86.21%. Based on these results, the program is able to segment and classify the tags at an acceptable accuracy. |
---|
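
The summary describes two steps: splitting a joined tag into words, then assigning a category to the result. The record does not say which algorithms the project used, so the following is only a minimal sketch under stated assumptions: a dictionary-based segmenter that prefers the split with the fewest words, followed by a plain lookup table for classification. The names `VOCAB`, `GAZETTEER`, `segment`, `classify`, and the fallback label `OTHER` are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Set

# Toy vocabulary (assumption): a real system would load a large word list.
VOCAB: Set[str] = {"this", "photo", "new", "york", "city", "sun", "set", "sunset"}

# Toy lookup table (assumption) mapping segmented phrases to categories.
GAZETTEER: Dict[str, str] = {"new york": "LOCATION"}


def segment(tag: str, vocab: Set[str] = VOCAB) -> Optional[List[str]]:
    """Split a joined tag into vocabulary words, preferring the fewest words.

    Returns None when the tag cannot be fully covered by the vocabulary.
    """
    tag = tag.lower()
    n = len(tag)
    # best[i] holds the shortest word list covering tag[:i], or None.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            word = tag[start:end]
            if prefix is None or word not in vocab:
                continue
            candidate = prefix + [word]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


def classify(words: List[str], gazetteer: Dict[str, str] = GAZETTEER) -> str:
    """Look up the space-joined phrase; fall back to a hypothetical OTHER label."""
    return gazetteer.get(" ".join(words), "OTHER")


if __name__ == "__main__":
    for raw in ["thisphoto", "newyork", "sunset"]:
        words = segment(raw)
        label = classify(words) if words is not None else "UNSEGMENTED"
        print(raw, words, label)
```

Preferring the fewest words keeps “sunset” whole instead of splitting it into “sun set”, which is one simple way to resolve ambiguous splits. A frequency-based score would be a common alternative, but nothing in this record indicates which approach the project took.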