Entity Synonyms for Structured Web Search

Many queries issued to search engines aim to find values from structured data (e.g., movie showtimes at a specific location). In such scenarios, there is often a mismatch between the values in the structured data (how content creators describe entities) and the web queries (h...

Bibliographic Details
Main Authors: CHENG, Tao, LAUW, Hady W., PAPARIZOS, Stelios
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2012
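Subjects: Entity synonym; fuzzy matching; structured data; web query; query log; Databases and Information Systems; Numerical Analysis and Scientific Computing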
Online Access: https://ink.library.smu.edu.sg/sis_research/1549
https://ink.library.smu.edu.sg/context/sis_research/article/2548/viewcontent/tkde12b.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-2548
record_format dspace
spelling sg-smu-ink.sis_research-2548
2017-12-26T08:26:29Z
2012-10-01T07:00:00Z
text
application/pdf
info:doi/10.1109/TKDE.2011.168
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Entity synonym
fuzzy matching
structured data
web query
query log
Databases and Information Systems
Numerical Analysis and Scientific Computing
description Many queries issued to search engines aim to find values from structured data (e.g., movie showtimes at a specific location). In such scenarios, there is often a mismatch between the values in the structured data (how content creators describe entities) and the web queries (how different users try to retrieve them). Recognizing the alternative ways people refer to an entity is therefore crucial for structured web search. In this paper, we study the problem of automatically generating entity synonyms over structured data, toward closing the gap between users and structured data. We propose an offline, data-driven approach that mines query logs for instances where content creators and web users apply a variety of strings to refer to the same webpages. In this way, given a set of strings that reference entities, we generate an expanded set of equivalent strings (entity synonyms) for each entity. Our framework consists of three modules: candidate generation, candidate selection, and noise cleaning. We further study the cause of the problem by identifying different classes of entity synonyms. Experiments on real-life data sets verify the proposed method, showing that it can significantly increase the coverage of structured web queries with good precision.
format text
author CHENG, Tao
LAUW, Hady W.
PAPARIZOS, Stelios
author_facet CHENG, Tao
LAUW, Hady W.
PAPARIZOS, Stelios
author_sort CHENG, Tao
title Entity Synonyms for Structured Web Search
title_short Entity Synonyms for Structured Web Search
title_full Entity Synonyms for Structured Web Search
title_fullStr Entity Synonyms for Structured Web Search
title_full_unstemmed Entity Synonyms for Structured Web Search
title_sort entity synonyms for structured web search
publisher Institutional Knowledge at Singapore Management University
publishDate 2012
url https://ink.library.smu.edu.sg/sis_research/1549
https://ink.library.smu.edu.sg/context/sis_research/article/2548/viewcontent/tkde12b.pdf
_version_ 1770571296711639040
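
Illustrative sketch (not part of the catalog record). The description above outlines a three-module pipeline over query logs: candidate generation, candidate selection, and noise cleaning. The record does not say how each module works, so the Python below is only a toy reading of that outline. The click-log format, the function name mine_synonyms, the thresholds min_overlap and min_clicks, the scoring rule, and the example.org URLs are all assumptions made for illustration and should not be taken as the authors' method.

```python
from collections import Counter, defaultdict


def mine_synonyms(click_log, entity, min_overlap=0.5, min_clicks=2):
    """Toy synonym miner over (query, clicked_url) pairs.

    All parameters and scoring rules here are hypothetical; the catalog
    record only names the three modules, not their internals.
    """
    # Index the log: for each query string, count clicks per URL.
    urls_by_query = defaultdict(Counter)
    for query, url in click_log:
        urls_by_query[query.lower()][url] += 1

    key = entity.lower()
    entity_urls = set(urls_by_query.get(key, ()))

    # Candidate generation (assumed): any other query that clicked on at
    # least one page also clicked for the entity string.
    candidates = [
        q for q, clicks in urls_by_query.items()
        if q != key and entity_urls & set(clicks)
    ]

    # Candidate selection (assumed): keep candidates whose clicks land
    # mostly on the entity's pages.
    selected = {}
    for q in candidates:
        clicks = urls_by_query[q]
        total = sum(clicks.values())
        shared = sum(n for url, n in clicks.items() if url in entity_urls)
        if shared / total >= min_overlap:
            selected[q] = total

    # Noise cleaning (assumed): drop candidates with too little support.
    return sorted(q for q, total in selected.items() if total >= min_clicks)


# Made-up click log for demonstration only.
TITLE = "indiana jones and the kingdom of the crystal skull"
log = [
    (TITLE, "example.org/indy4"),
    (TITLE, "example.org/indy4-review"),
    ("indy 4", "example.org/indy4"),
    ("indy 4", "example.org/indy4-review"),
    ("indiana jones", "example.org/indy4"),
    ("indiana jones", "example.org/raiders"),
    ("indiana jones", "example.org/temple"),
    ("indiana jones", "example.org/crusade"),
    ("crystal skull", "example.org/indy4"),
]

print(mine_synonyms(log, TITLE))
```

In this toy log, "indiana jones" is rejected at selection because most of its clicks go to pages about other films, and "crystal skull" is rejected at noise cleaning because it has a single supporting click. Lowering min_clicks to 1 would admit it.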