Knowledge graph embedding with deep learning

Bibliographic Details
Main Author:	Chen, Chen
Other Authors:	Lam Kwok Yan
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering
Online Access:	https://hdl.handle.net/10356/173397
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-173397
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering
description	Knowledge graphs (KGs) are widely used to represent structured knowledge, such as entities and their relationships, in applications like natural language processing, information retrieval, and recommendation systems. However, real-world domains are complex, leading to incomplete and error-prone KGs. Knowledge graph completion (KGC) addresses this by predicting missing links and improving KG quality. Knowledge graph embedding (KGE) is a promising approach for KGC, converting KG data into low-dimensional vector representations using deep learning and other techniques. This thesis focuses on deep learning methods for knowledge graph embedding. First, we focus on graph-based KGC methods. Existing graph-based methods for KGC generally learn continuous embeddings for entities and relations with shallow linear transformations or deep convolutional modules. These methods either suffer from poor expressiveness or impose an unnecessary image-specific inductive bias on the KGC embedding models, which can degrade model performance. To avoid these issues, we propose a Transformer-based Patch Refinement Model (PatReFormer) under a “Separate-and-Aggregate” framework, which segments the input entity and relation embeddings into patches and uses a cross-attentive Transformer architecture for aggregation. Secondly, we incorporate textual information, such as entity and relation descriptions, into KGC and propose a method based on an encoder-only pre-trained language model (PLM). Recently proposed fine-tuned PLMs often focus overwhelmingly on textual information and overlook structural knowledge. To address this issue, we propose CSProm-KG (Conditional Soft Prompts for KGC), which maintains a balance between structural information and textual knowledge. CSProm-KG tunes only the parameters of the Conditional Soft Prompts, which are generated from the entity and relation representations, and freezes the parameters of the PLM. In this way, the approach can weigh both kinds of information equally and effectively during the KGC process. Thirdly, rather than relying on an encoder-only system to utilize and learn KG textual information, we propose a novel approach based on the sequence-to-sequence (Seq2Seq) paradigm that directly predicts the target entity text. Existing solutions for KGC often cater to specific graph structures, resulting in incompatible methods for different KGC tasks. Such methodological discrepancies not only incur significant maintenance costs but also hinder adaptability to evolving knowledge queries, ingestion processes, and presentation requirements. To address these challenges, we leverage the strong performance and technical homogeneity demonstrated by Seq2Seq PLMs across various NLP tasks. We introduce a straightforward yet highly effective Seq2Seq PLM framework, called KG-S2S, that adapts to diverse knowledge graph structures. Lastly, we extend KGC techniques to address challenges in Internet of Things (IoT) services. IoT profiling has recently gained attention as a promising method for validating the normal behavior of connected devices in these services. However, a significant challenge is how to effectively process the vast number of IoT profiles to identify suspicious devices that require closer monitoring. To tackle this challenge, we propose a holistic and novel framework, HABIT, which regards the behaviors of connected devices as a KG and detects “false” knowledge using KGC techniques. By applying cutting-edge KGC techniques, HABIT offers a comprehensive profiling approach for accurately identifying anomalous behaviors in IoT services.
author2	Lam Kwok Yan
format	Thesis-Doctor of Philosophy
author	Chen, Chen
title	Knowledge graph embedding with deep learning
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/173397
spelling	sg-ntu-dr.10356-173397 2024-03-07T08:52:05Z Knowledge graph embedding with deep learning Chen, Chen Lam Kwok Yan School of Computer Science and Engineering kwokyan.lam@ntu.edu.sg Engineering Doctor of Philosophy 2024-02-02T00:17:00Z 2024-02-02T00:17:00Z 2024 Thesis-Doctor of Philosophy Chen, C. (2024). Knowledge graph embedding with deep learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/173397 10.32657/10356/173397 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University

Similar Items