Removing sensitive part of a text

With the onset of an era of digitalisation, data across many industries are now becoming digitalised. It is no surprise that the healthcare industry has moved from paper records to maintaining health records on an online portal or a system. With the vast amount of medical information in the health r...

Full description


Bibliographic details
Main author:	Architha, Gopinath
Other authors:	Tay Wee Peng
Format:	Final Year Project
Language:	English
Published:	2019
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online access:	http://hdl.handle.net/10356/77990
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-77990
record_format	dspace
spelling	sg-ntu-dr.10356-77990 2023-07-07T16:46:27Z Removing sensitive part of a text Architha, Gopinath Tay Wee Peng School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Bachelor of Engineering (Electrical and Electronic Engineering) 2019-06-11T01:23:03Z 2019-06-11T01:23:03Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/77990 en Nanyang Technological University 109 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Electrical and electronic engineering
spellingShingle	DRNTU::Engineering::Electrical and electronic engineering Architha, Gopinath Removing sensitive part of a text
description	As digitalisation spreads, data across many industries are being moved into digital form. Unsurprisingly, the healthcare industry has moved from paper records to health records kept on an online portal or system. With the vast amount of medical information in these records, medical researchers can synthesize and find new medicines for existing diseases. They can also try to gain a deeper understanding of the underlying causes of new diseases by comparing information across relevant medical records. Despite the benefits of such data sharing, the same data can undeniably lead to privacy loss: medical records contain many sensitive identifiers that can easily identify the patient. Whenever medical records are shared for research purposes, they therefore need to be anonymized and stripped of any personal information. A combination of NLTK and spaCy models can be used to address this issue. With these methods, the machine assigns a meaning to each word in the document. Any patient identifier found is removed and replaced with the general PI (Patient Identifier) category it refers to. This project uses Python 3.5 (64-bit), NLTK 3.3.0 and spaCy. This report covers the research carried out, the project implementation and the results of the project.
author2	Tay Wee Peng
author_facet	Tay Wee Peng Architha, Gopinath
format	Final Year Project
author	Architha, Gopinath
author_sort	Architha, Gopinath
title	Removing sensitive part of a text
title_short	Removing sensitive part of a text
title_full	Removing sensitive part of a text
title_fullStr	Removing sensitive part of a text
title_full_unstemmed	Removing sensitive part of a text
title_sort	removing sensitive part of a text
publishDate	2019
url	http://hdl.handle.net/10356/77990
_version_	1772829116718907392
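
The approach outlined in the description (label the words in a record, then replace each patient identifier with the general category it refers to) can be illustrated with a minimal sketch. This is not the project's implementation: the `redact` function, the `Span` type, the bracketed placeholder format and the sample sentence are all hypothetical, and the entity spans are supplied by hand rather than produced by NLTK or spaCy. With spaCy, spans of this shape could plausibly be built from the `start_char`, `end_char` and `label_` attributes of each entity in `doc.ents`.

```python
from typing import List, Tuple

# One detected identifier: (start_char, end_char, label).
# Hypothetical shape; a tagger such as spaCy could supply these values.
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each span with a bracketed category placeholder."""
    pieces = []
    cursor = 0
    # Process spans left to right so character offsets stay valid.
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip a span that overlaps one already replaced.
            continue
        pieces.append(text[cursor:start])
        pieces.append("[{}]".format(label))
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


# Made-up sample record; offsets are written by hand for illustration.
note = "John Tan visited NTU Clinic on 3 May 2019."
spans = [(0, 8, "PERSON"), (17, 27, "ORG"), (31, 41, "DATE")]
print(redact(note, spans))
```

Keeping the replacement step separate from the tagging step means the same function works whichever tagger produces the spans, so the detection model can be swapped without touching the redaction logic.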
