Online feature selection for mining big data

Most studies of online learning require access to all the attributes/features of training instances. Such a classical setting is not always appropriate for real-world applications when data instances are of high dimensionality or it is expensive to acquire the full set of attributes/f...

Bibliographic Details
Main Authors:	Hoi, Steven C. H., Wang, Jialei., Zhao, Peilin., Jin, Rong.
Other Authors:	School of Computer Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/98983 http://hdl.handle.net/10220/12629
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-98983
record_format	dspace
spelling	sg-ntu-dr.10356-98983 2020-05-28T07:18:24Z Online feature selection for mining big data Hoi, Steven C. H. Wang, Jialei. Zhao, Peilin. Jin, Rong. School of Computer Engineering International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (1st : 2012 : Beijing, China) DRNTU::Engineering::Computer science and engineering 2013-07-31T06:43:57Z 2019-12-06T20:02:01Z 2012 Conference Paper Hoi, S. C. H., Wang, J., Zhao, P., & Jin, R. (2012). Online feature selection for mining big data. Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining Algorithms, Systems, Programming Models and Applications - BigMine '12, 93-100. https://hdl.handle.net/10356/98983 http://hdl.handle.net/10220/12629 10.1145/2351316.2351329 en
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Hoi, Steven C. H. Wang, Jialei. Zhao, Peilin. Jin, Rong. Online feature selection for mining big data
description	Most studies of online learning require access to all the attributes/features of training instances. Such a classical setting is not always appropriate for real-world applications when data instances are of high dimensionality or it is expensive to acquire the full set of attributes/features. To address this limitation, we investigate the problem of Online Feature Selection (OFS), in which the online learner is only allowed to maintain a classifier involving a small and fixed number of features. The key challenge of Online Feature Selection is how to make accurate predictions using a small and fixed number of active features. This is in contrast to the classical setup of online learning, where all the features are active and can be used for prediction. We address this challenge by studying sparsity regularization and truncation techniques. Specifically, we present an effective algorithm to solve the problem, give the theoretical analysis, and evaluate the empirical performance of the proposed algorithms for online feature selection on several public datasets. We also demonstrate the application of our online feature selection technique to tackle real-world problems of big data mining, which is significantly more scalable than some well-known batch feature selection algorithms. The encouraging results of our experiments validate the efficacy and efficiency of the proposed techniques for large-scale applications.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Hoi, Steven C. H. Wang, Jialei. Zhao, Peilin. Jin, Rong.
format	Conference or Workshop Item
author	Hoi, Steven C. H. Wang, Jialei. Zhao, Peilin. Jin, Rong.
author_sort	Hoi, Steven C. H.
title	Online feature selection for mining big data
title_sort	online feature selection for mining big data
publishDate	2013
url	https://hdl.handle.net/10356/98983 http://hdl.handle.net/10220/12629
_version_	1681056323572072448

Similar Items