Robust bipoly-matching for multi-granular entities

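Entity matching across two data sources is a prevalent need in many domains, including e-commerce. Of interest is the scenario where entities have varying granularity, e.g., a coarse product category may match multiple finer categories. Previous work on one-to-many matching generally presumes that the 'one' necessarily comes from a designated source and the 'many' from the other source. In contrast, we propose a novel formulation that allows concurrent one-to-many matching in either direction. Beyond flexibility, we also seek matching that is more robust to noisy similarity values arising from diverse entity descriptions, by introducing the notions of receptivity and reclusivity. In addition to an optimal formulation, we also propose an efficient and performant heuristic. Experiments on multiple real-life datasets from e-commerce sources show that our proposed algorithms are effective and outperform the baselines.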

Bibliographic Details
Main Authors: LEE, Ween Jiann, TKACHENKO, Maksim, LAUW, Hady W.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
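Subjects: entity resolution; matching; one-to-many; poly; bipoly; Databases and Information Systems; Data Science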
Online Access: https://ink.library.smu.edu.sg/sis_research/6434
https://ink.library.smu.edu.sg/context/sis_research/article/7437/viewcontent/icdm21.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7437
record_format dspace
spelling sg-smu-ink.sis_research-7437 2022-05-10T05:28:32Z 2021-12-01T08:00:00Z text application/pdf info:doi/10.1109/ICDM51629.2021.00143 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic entity resolution
matching
one-to-many
poly
bipoly
Databases and Information Systems
Data Science
description Entity matching across two data sources is a prevalent need in many domains, including e-commerce. Of interest is the scenario where entities have varying granularity, e.g., a coarse product category may match multiple finer categories. Previous work on one-to-many matching generally presumes that the 'one' necessarily comes from a designated source and the 'many' from the other source. In contrast, we propose a novel formulation that allows concurrent one-to-many matching in either direction. Beyond flexibility, we also seek matching that is more robust to noisy similarity values arising from diverse entity descriptions, by introducing the notions of receptivity and reclusivity. In addition to an optimal formulation, we also propose an efficient and performant heuristic. Experiments on multiple real-life datasets from e-commerce sources show that our proposed algorithms are effective and outperform the baselines.
format text
author LEE, Ween Jiann
TKACHENKO, Maksim
LAUW, Hady W.
title Robust bipoly-matching for multi-granular entities
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6434
https://ink.library.smu.edu.sg/context/sis_research/article/7437/viewcontent/icdm21.pdf