Selecting the right search term in query-based systems for deduplication

Bibliographic Details
Main Author:	Jele, Harald
Format:	Article
Language:	English
Published:	2021
Subjects:	Library and information science
Online Access:	https://hdl.handle.net/10356/154222

Description
Summary:	Essentially, three approaches can be identified for choosing a suitable search term to detect bibliographic duplicates. Stop words are excluded in all of them; then (1) just the first term of an entry is selected, or (2) the term that produces the smallest number of hits is selected, or (3) a term is used whose number of hits lies below a defined threshold. These three procedures are compared with each other here. The results derive from a series of measurements carried out with bibliographic data from the Austrian Central Catalog.
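
The three selection procedures named in the summary can be illustrated with a minimal sketch. It is not the author's implementation: the stop word list, the function names, the `hit_count` callable (a stand-in for a catalog query that returns the number of hits for a term) and the toy hit counts are all assumptions made for illustration. In particular, procedure (3) is read here as "take the first non-stop-word term whose hit count is below the threshold", which is only one possible interpretation of the summary.

```python
from typing import Callable, List, Optional

# Assumed stop word list; the article's actual list is not given in the record.
STOP_WORDS = frozenset({"the", "a", "an", "of", "in", "for", "and", "to", "on"})


def candidate_terms(entry: str, stop_words=STOP_WORDS) -> List[str]:
    """Split an entry into lowercase terms and drop stop words."""
    return [t for t in entry.lower().split() if t not in stop_words]


def first_term(terms: List[str]) -> Optional[str]:
    """Procedure (1): select the first remaining term."""
    return terms[0] if terms else None


def rarest_term(terms: List[str], hit_count: Callable[[str], int]) -> Optional[str]:
    """Procedure (2): select the term with the smallest number of hits."""
    return min(terms, key=hit_count) if terms else None


def first_term_below_threshold(
    terms: List[str], hit_count: Callable[[str], int], threshold: int
) -> Optional[str]:
    """Procedure (3), as interpreted here: the first term with fewer hits
    than the threshold, or None if no term qualifies."""
    for term in terms:
        if hit_count(term) < threshold:
            return term
    return None


# Toy hit counts, invented for this example (not measurements from the article).
TOY_HITS = {
    "selecting": 120, "right": 900, "search": 4000, "term": 700,
    "query-based": 15, "systems": 5200, "deduplication": 40,
}


def toy_hit_count(term: str) -> int:
    """Hypothetical stand-in for a catalog query."""
    return TOY_HITS.get(term, 0)


entry = "Selecting the right search term in query-based systems for deduplication"
terms = candidate_terms(entry)
chosen_first = first_term(terms)
chosen_rarest = rarest_term(terms, toy_hit_count)
chosen_threshold = first_term_below_threshold(terms, toy_hit_count, threshold=100)
```

In a real system, procedure (2) needs one query per candidate term before the actual duplicate search, while procedure (3) can stop at the first term that qualifies; procedure (1) needs no preliminary queries at all.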
