Predicting domain adaptivity: Redo or recycle?

Over the years, academic researchers have contributed a variety of visual concept classifiers. Nevertheless, given a new dataset, most researchers still prefer to develop a large number of classifiers from scratch, despite expensive labeling efforts and limited computing resources. A valid question is w...

Bibliographic Details
Main Authors: YAO, Ting, NGO, Chong-wah, ZHU, Shiai
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2012
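Subjects: cross-domain concept learning; domain adaptation; Data Storage Systems; Graphics and Human Computer Interfaces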
Online Access: https://ink.library.smu.edu.sg/sis_research/6472
https://ink.library.smu.edu.sg/context/sis_research/article/7475/viewcontent/2393347.2396321.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7475
record_format dspace
spelling sg-smu-ink.sis_research-7475 2022-01-10T05:58:52Z Predicting domain adaptivity: Redo or recycle? YAO, Ting NGO, Chong-wah ZHU, Shiai
2012-11-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6472 info:doi/10.1145/2393347.2396321 https://ink.library.smu.edu.sg/context/sis_research/article/7475/viewcontent/2393347.2396321.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University cross-domain concept learning domain adaptation Data Storage Systems Graphics and Human Computer Interfaces
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic cross-domain concept learning
domain adaptation
Data Storage Systems
Graphics and Human Computer Interfaces
description Over the years, academic researchers have contributed a variety of visual concept classifiers. Nevertheless, given a new dataset, most researchers still prefer to develop a large number of classifiers from scratch, despite expensive labeling efforts and limited computing resources. A valid question is why the multimedia community does not “embrace the green” and recycle off-the-shelf classifiers for a new dataset. The difficulty originates from the domain gap: many different factors govern the development of a classifier and eventually drive its performance to emphasize certain aspects of a dataset. Reapplying a classifier to an unseen dataset may end up as GIGO (garbage in, garbage out), and the performance could be much worse than that of a new classifier developed from very few training examples. In this paper, we explore different parameters, including shift of data distribution and visual and context diversities, that may hinder or otherwise encourage the recycling of old classifiers for a new dataset. In particular, we give empirical insights into when to recycle available resources and when to redo from scratch by completely forgetting the past and training a brand-new classifier. Based on this analysis, we further propose an approach for predicting the negative transfer of a concept classifier to a different domain given the observed parameters. Experimental results show that a prediction accuracy of over 75% can be achieved when transferring concept classifiers learnt from LSCOM (news video domain), ImageNet (Web image domain) and Flickr-SF (weakly tagged Web image domain), respectively, to the TRECVID 2011 dataset (Web video domain).
format text
author YAO, Ting
NGO, Chong-wah
ZHU, Shiai
title Predicting domain adaptivity: Redo or recycle?
publisher Institutional Knowledge at Singapore Management University
publishDate 2012
url https://ink.library.smu.edu.sg/sis_research/6472
https://ink.library.smu.edu.sg/context/sis_research/article/7475/viewcontent/2393347.2396321.pdf
_version_ 1770575969455702016