NATURAL LANGUAGE PROCESSING-BASED INFORMATION EXTRACTION FOR MEDICAL KNOWLEDGE GRAPH CONSTRUCTION

Bibliographic Details
Main Author: Reinhart, Richard
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/87686
Institution: Institut Teknologi Bandung
id id-itb.:87686
timestamp 2025-02-01T16:15:13Z
keywords Medical Knowledge Graph, Named Entity Recognition, Part-of-Speech Tagging, Dependency Parsing
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Every day, hospitals generate new patient data. Because this information is highly diverse and voluminous, tasks that healthcare professionals perform routinely, such as retrieving patient medical records, become repetitive and time-consuming. Harnoune et al. (2021) proposed a knowledge graph as a solution to this problem. However, their knowledge graph used the same edge label for every edge connecting entities of the same two types, so it could not convey information such as whether a relationship is positive or negative. This study constructs a medical knowledge graph using Named Entity Recognition (NER) to identify entities such as disease names, medications, or medical procedures. Part-of-Speech (POS) tagging and dependency parsing are used to identify the words that function as verbs and roots, and these words are then used as the relationships between entities in the knowledge graph. The approach aims to produce a graph whose relationships reflect the contextual connections between entities in the medical domain. The resulting knowledge graph is evaluated both quantitatively and qualitatively. The quantitative evaluation measures precision, recall, and F1-score, which reached 0.89, 0.93, and 0.91, respectively. The qualitative evaluation asks experts in the medical and informatics domains to assess the correctness and informativeness of the constructed knowledge graph, which scored 4.2 out of 5 and 3.9 out of 5, respectively. The results show that the approach based on NER, POS tagging, and dependency parsing can construct an informative and valid medical knowledge graph, with favorable quantitative and qualitative evaluation results.
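The extraction step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the `Token` structure, the entity labels, the dependency labels, and the example sentence are all assumptions made for illustration, and the annotations are written by hand rather than produced by an NER model or parser. The sketch takes a verb that is also the dependency root and uses it as the edge label between a subject entity and an object entity. It also checks that the reported F1-score is consistent with the reported precision and recall.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    """One pre-annotated token (hypothetical structure, not from the thesis)."""
    text: str
    pos: str                   # coarse POS tag, e.g. "VERB", "NOUN"
    dep: str                   # dependency label, e.g. "ROOT", "nsubj", "obj"
    head: int                  # index of the head token (the root points to itself)
    ent: Optional[str] = None  # entity type assigned by NER, or None


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (subject entity, root verb, object entity) triples."""
    triples = []
    for i, tok in enumerate(tokens):
        # Only a verb that is also the dependency root becomes an edge label.
        if tok.pos != "VERB" or tok.dep != "ROOT":
            continue
        # Entities attached directly to this verb.
        ents = [t for t in tokens if t.ent is not None and t.head == i]
        subjects = [t for t in ents if t.dep.startswith("nsubj")]
        objects = [t for t in ents if t.dep in ("obj", "dobj")]
        for s in subjects:
            for o in objects:
                triples.append((s.text, tok.text, o.text))
    return triples


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hand-annotated example sentence (assumed, for illustration only).
sentence = [
    Token("Metformin", "PROPN", "nsubj", 1, "MEDICATION"),
    Token("treats", "VERB", "ROOT", 1),
    Token("diabetes", "NOUN", "obj", 1, "DISEASE"),
]

triples = extract_triples(sentence)
reported_f1 = round(f1_score(0.89, 0.93), 2)
```

Because the verb itself becomes the edge label, two entity pairs of the same types can carry different relations, which is the limitation of a single shared edge label that the description points out.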
format Theses
author Reinhart, Richard
title NATURAL LANGUAGE PROCESSING-BASED INFORMATION EXTRACTION FOR MEDICAL KNOWLEDGE GRAPH CONSTRUCTION
url https://digilib.itb.ac.id/gdl/view/87686