Embedding-based representation of categorical data by hierarchical value coupling learning

Bibliographic Details
Main Authors: JIAN, Songlei, CAO, Longbing, PANG, Guansong, LU, Kai, GAO, Hang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
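Subjects: Machine Learning: Data Mining; Machine Learning: Unsupervised Learning; Databases and Information Systems; Data Storage Systems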
Online Access: https://ink.library.smu.edu.sg/sis_research/7143
https://ink.library.smu.edu.sg/context/sis_research/article/8146/viewcontent/0269.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8146
record_format dspace
spelling sg-smu-ink.sis_research-8146 2022-04-22T04:20:38Z 2017-08-01T07:00:00Z text application/pdf info:doi/10.24963/ijcai.2017/269 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Machine Learning: Data Mining
Machine Learning: Unsupervised Learning
Databases and Information Systems
Data Storage Systems
description Learning the representation of categorical data with hierarchical value coupling relationships is very challenging but critical for the effective analysis and learning of such data. This paper proposes a novel coupled unsupervised categorical data representation (CURE) framework and its instantiation, i.e., a coupled data embedding (CDE) method, for representing categorical data by hierarchical value-to-value cluster coupling learning. Unlike existing embedding- and similarity-based representation methods which can capture only a part or none of these complex couplings, CDE explicitly incorporates the hierarchical couplings into its embedding representation. CDE first learns two complementary feature value couplings which are then used to cluster values with different granularities. It further models the couplings in value clusters within the same granularity and with different granularities to embed feature values into a new numerical space with independent dimensions. Substantial experiments show that CDE significantly outperforms three popular unsupervised embedding methods and three state-of-the-art similarity-based representation methods.
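
The pipeline outlined in the description (learn value couplings, cluster values at several granularities, embed values from their cluster memberships) can be illustrated with a small sketch. This is not the authors' CDE implementation: the function names, the single co-occurrence coupling (the description mentions two complementary couplings), the plain k-means step, and the one-hot cluster indicators are all assumptions chosen for illustration. The final step that models couplings between clusters and yields independent dimensions is omitted.

```python
from collections import Counter


def value_coupling_matrix(rows):
    """Assumed coupling: entry [i][j] estimates P(value_j | value_i) from co-occurrence."""
    values = sorted({(f, v) for row in rows for f, v in enumerate(row)})
    index = {val: i for i, val in enumerate(values)}
    count, pair = Counter(), Counter()
    for row in rows:
        ids = [index[(f, v)] for f, v in enumerate(row)]
        for a in ids:
            count[a] += 1
            for b in ids:
                pair[(a, b)] += 1
    n = len(values)
    matrix = [[pair[(a, b)] / count[a] for b in range(n)] for a in range(n)]
    return values, matrix


def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def embed_values(matrix, granularities):
    """Concatenate one-hot cluster memberships, one block per granularity k."""
    embedding = [[] for _ in matrix]
    for k in granularities:
        labels = kmeans(matrix, k)
        for i, lab in enumerate(labels):
            embedding[i].extend(1.0 if c == lab else 0.0 for c in range(k))
    return embedding


def embed_objects(rows, values, value_embedding):
    """Represent each object as the sum of the embeddings of its feature values."""
    index = {val: i for i, val in enumerate(values)}
    objects = []
    for row in rows:
        vectors = [value_embedding[index[(f, v)]] for f, v in enumerate(row)]
        objects.append([sum(col) for col in zip(*vectors)])
    return objects


# Toy data: two categorical features per object.
rows = [("red", "round"), ("red", "round"), ("green", "long"), ("green", "long")]
values, matrix = value_coupling_matrix(rows)
value_embedding = embed_values(matrix, granularities=(2, 3))
objects = embed_objects(rows, values, value_embedding)
```

Each object ends up with one numeric vector whose length is the sum of the chosen granularities, so objects that share strongly coupled values land close together. A fuller version would add the second coupling and a decorrelation step before using the vectors downstream.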
format text
author JIAN, Songlei
CAO, Longbing
PANG, Guansong
LU, Kai
GAO, Hang
title Embedding-based representation of categorical data by hierarchical value coupling learning
publisher Institutional Knowledge at Singapore Management University
publishDate 2017
url https://ink.library.smu.edu.sg/sis_research/7143
https://ink.library.smu.edu.sg/context/sis_research/article/8146/viewcontent/0269.pdf
_version_ 1770576231214874624