Two-view Transductive Support Vector Machines
Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam detection, which changes at a very brisk pace. For some problems, there may exist multiple perspectives, so-called views...
Main Authors: | LI, Guangxia; HOI, Steven C. H.; CHANG, Kuiyu |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2010 |
Subjects: | Computer Sciences; Databases and Information Systems; Numerical Analysis and Scientific Computing |
Online Access: | https://ink.library.smu.edu.sg/sis_research/2360 https://ink.library.smu.edu.sg/context/sis_research/article/3360/viewcontent/Two_view_Transductive_2010.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-3360 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-3360 2018-12-04T07:03:24Z 2010-05-01T07:00:00Z text application/pdf info:doi/10.1137/1.9781611972801.21 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Computer Sciences Databases and Information Systems Numerical Analysis and Scientific Computing |
description |
Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam detection, which changes at a very brisk pace. For some problems, there may exist multiple perspectives, so-called views, of each data sample. For example, in text classification, the typical view contains a large number of raw content features such as term frequency, while a second view may contain a small number of highly informative domain-specific features. We thus propose a novel two-view transductive SVM that takes advantage of both the abundant unlabeled data and their multiple representations to improve the performance of classifiers. The idea is fairly simple: train a classifier on each of the two views of both labeled and unlabeled data, and impose a global constraint that each classifier assigns the same class label to each labeled and unlabeled sample. We applied our two-view transductive SVM to the WebKB course dataset and a real-life review spam classification dataset. Experimental results show that our proposed approach performs up to 5% better than a single-view learning algorithm, especially when the amount of labeled data is small. The other advantage of our two-view approach is its significantly improved stability, which is especially useful for noisy real-world data. |
format |
text |
author |
LI, Guangxia HOI, Steven C. H. CHANG, Kuiyu |
title |
Two-view Transductive Support Vector Machines |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2010 |
url |
https://ink.library.smu.edu.sg/sis_research/2360 https://ink.library.smu.edu.sg/context/sis_research/article/3360/viewcontent/Two_view_Transductive_2010.pdf |
_version_ |
1770572110654078976 |
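
The core idea stated in the abstract (one classifier per view, with both classifiers constrained to assign the same label to every sample) can be illustrated with a small sketch. This is not the authors' optimization procedure, which the record does not describe. It is a simplified alternating heuristic under stated assumptions: linear classifiers trained with a Pegasos-style subgradient method, a shared label for each unlabeled sample taken as the sign of the summed decision values of the two views, and a down-weighting factor `u_weight` for unlabeled samples. All function names, parameters, and the toy data are hypothetical.

```python
import random

def decision(w, x):
    """Linear decision value w·[x, 1]; the bias is the last weight."""
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))

def train_linear_svm(X, y, sample_weight, lam=0.01, epochs=100, seed=0):
    """Pegasos-style subgradient descent on a weighted, L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            violated = y[i] * decision(w, X[i]) < 1   # margin check before the update
            w = [(1.0 - 1.0 / t) * wj for wj in w]    # shrink by 1 - eta * lam
            if violated:                              # hinge subgradient step
                w = [wj + eta * sample_weight[i] * y[i] * xj
                     for wj, xj in zip(w, X[i] + [1.0])]
    return w

def two_view_tsvm(XA_l, XB_l, y_l, XA_u, XB_u, rounds=5, u_weight=0.5):
    """Alternate between (1) one shared label per unlabeled sample and
    (2) retraining each view's classifier on labeled + pseudo-labeled data."""
    ones = [1.0] * len(y_l)
    wA = train_linear_svm(XA_l, y_l, ones)
    wB = train_linear_svm(XB_l, y_l, ones)
    y_u = []
    for _ in range(rounds):
        # Agreement constraint (assumed form): both views must use the same label,
        # chosen here as the sign of the summed decision values.
        y_u = [1 if decision(wA, a) + decision(wB, b) >= 0 else -1
               for a, b in zip(XA_u, XB_u)]
        weights = ones + [u_weight] * len(y_u)
        wA = train_linear_svm(XA_l + XA_u, y_l + y_u, weights)
        wB = train_linear_svm(XB_l + XB_u, y_l + y_u, weights)
    return wA, wB, y_u

# Hypothetical toy data: view A has two features, view B has one.
XA_l = [[2.0, 1.5], [1.5, 2.5], [-2.0, -1.5], [-1.5, -2.5]]
XB_l = [[3.0], [2.5], [-3.0], [-2.5]]
y_l = [1, 1, -1, -1]
XA_u = [[2.5, 2.0], [1.0, 1.8], [-2.2, -1.0], [-1.0, -2.0]]
XB_u = [[2.0], [3.5], [-2.0], [-3.5]]

w_a, w_b, shared_labels = two_view_tsvm(XA_l, XB_l, y_l, XA_u, XB_u)
print(shared_labels)  # prints [1, 1, -1, -1]
```

On this well-separated toy set both views agree from the first round, so the shared labels never change. On noisy data the summed decision value lets the more confident view override the other, which is one plausible reading of why a two-view constraint could improve stability.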