Two-view Transductive Support Vector Machines

Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam detection, which changes at a very brisk pace. For some problems, there may exist multiple perspectives, so-called views...

Bibliographic Details
Main Authors: LI, Guangxia, HOI, Steven C. H., CHANG, Kuiyu
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2010
Subjects: Computer Sciences; Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/2360
https://ink.library.smu.edu.sg/context/sis_research/article/3360/viewcontent/Two_view_Transductive_2010.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-3360
record_format dspace
spelling sg-smu-ink.sis_research-3360 2018-12-04T07:03:24Z
2010-05-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/2360
info:doi/10.1137/1.9781611972801.21
https://ink.library.smu.edu.sg/context/sis_research/article/3360/viewcontent/Two_view_Transductive_2010.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Computer Sciences
Databases and Information Systems
Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Computer Sciences
Databases and Information Systems
Numerical Analysis and Scientific Computing
description Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam detection, which changes at a very brisk pace. For some problems, there may exist multiple perspectives, so-called views, of each data sample. For example, in text classification, the typical view contains a large number of raw content features such as term frequency, while a second view may contain a small number of highly informative domain-specific features. We thus propose a novel two-view transductive SVM that takes advantage of both the abundance of unlabeled data and its multiple representations to improve classifier performance. The idea is fairly simple: train a classifier on each of the two views of both labeled and unlabeled data, and impose a global constraint that the two classifiers assign the same class label to every labeled and unlabeled sample. We applied our two-view transductive SVM to the WebKB course dataset and to a real-life review spam classification dataset. Experimental results show that our proposed approach performs up to 5% better than a single-view learning algorithm, especially when the amount of labeled data is small. The other advantage of our two-view approach is its significantly improved stability, which is especially useful for noisy real-world data.
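The description above gives the idea only at a high level, so the following is a minimal sketch of it and not the authors' optimization procedure. It assumes linear classifiers trained by plain subgradient descent on the hinge loss, and it approximates the agreement constraint with a simple alternating heuristic: both views are retrained against one shared label vector for the unlabeled samples, taken as the sign of the summed scores. All names (`train_linear_svm`, `two_view_tsvm`, `u_weight`) and the toy data are hypothetical.

```python
def decision(w, b, x):
    """Signed score of a linear classifier for one sample."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(X, y, weights, lam=0.01, epochs=200, lr=0.1):
    """Weighted hinge loss + L2 penalty, minimized by batch subgradient descent."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi, ci in zip(X, y, weights):
            if yi * decision(w, b, xi) < 1:  # margin violated
                for j in range(dim):
                    gw[j] -= ci * yi * xi[j] / len(X)
                gb -= ci * yi / len(X)
        w = [wi - lr * gj for wi, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def two_view_tsvm(Xa_l, Xb_l, y_l, Xa_u, Xb_u, rounds=10, u_weight=0.5):
    """Alternate between labeling unlabeled data and retraining both views.

    Heuristic assumption: the shared label of an unlabeled sample is the
    sign of the sum of the two views' scores, so both classifiers are
    always trained toward the same label for every sample.
    """
    n_l = len(y_l)
    # Start from classifiers trained on the labeled data only.
    fa = train_linear_svm(Xa_l, y_l, [1.0] * n_l)
    fb = train_linear_svm(Xb_l, y_l, [1.0] * n_l)
    y_u = None
    for _ in range(rounds):
        new_y_u = [1 if decision(*fa, xa) + decision(*fb, xb) >= 0 else -1
                   for xa, xb in zip(Xa_u, Xb_u)]
        if new_y_u == y_u:  # shared labels stopped changing
            break
        y_u = new_y_u
        # Unlabeled samples count less than labeled ones (assumed weighting).
        weights = [1.0] * n_l + [u_weight] * len(y_u)
        fa = train_linear_svm(Xa_l + Xa_u, y_l + y_u, weights)
        fb = train_linear_svm(Xb_l + Xb_u, y_l + y_u, weights)
    return fa, fb, y_u


# Toy data, made up for illustration: 2 labeled and 4 unlabeled samples.
# View A has two features, view B has one.
Xa_l, Xb_l, y_l = [[2.0, 0.0], [-2.0, 0.0]], [[2.0], [-2.0]], [1, -1]
Xa_u = [[1.5, 0.3], [-1.8, -0.2], [2.2, 0.1], [-1.6, 0.4]]
Xb_u = [[1.9], [-2.1], [1.7], [-2.0]]
fa, fb, y_u = two_view_tsvm(Xa_l, Xb_l, y_l, Xa_u, Xb_u)
print(y_u)
```

Summing the two scores before taking the sign is only one way to force a single label per sample; a faithful implementation would follow the constrained formulation in the paper itself.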
format text
author LI, Guangxia
HOI, Steven C. H.
CHANG, Kuiyu
title Two-view Transductive Support Vector Machines
publisher Institutional Knowledge at Singapore Management University
publishDate 2010
url https://ink.library.smu.edu.sg/sis_research/2360
https://ink.library.smu.edu.sg/context/sis_research/article/3360/viewcontent/Two_view_Transductive_2010.pdf
_version_ 1770572110654078976