Evaluating the use of different machine learning techniques to predict traditional Chinese medicine diagnosis

Bibliographic Details
Main Author: Kon, Wen Xuan
Other Authors: Goh Wen Bin Wilson
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/152316
Institution: Nanyang Technological University
Description
Summary: Traditional Chinese Medicine (TCM) diagnostic features use patient perceptions, which can aid in the development of robust diagnostic tools. We explored the information value of these features by modelling the TCM diagnostic process with machine learning, namely decision tree, random forest, and multi-layered perceptron (MLP). Before evaluating performance, different metrics were tested using dummy models. Accuracy and balanced accuracy were deemed unsuitable because the large number of true negatives (TN) inflates both metrics. Precision and recall are also unsuitable for determining overall performance when used alone. The F1 and threat scores are suitable metrics that ignore the large TN count. The Matthews correlation coefficient (MCC) is the best metric, as it includes TN in the computation while being insensitive to class imbalance. The percentage of correctly predicted diagnoses may only be useful for determining whether the models are functioning. Although the three models did not perform well, MLP had the best performance. Ways to improve model performance include better record keeping of the TCM data, synthetic minority over-sampling technique (SMOTE), selection of top-performing features, using more complex models, and optimisation.
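The metric comparison described in the summary can be illustrated with a minimal sketch that computes each metric from the four confusion-matrix counts. The function names (`safe_div`, `confusion_metrics`) and the counts are hypothetical, chosen only to mimic a dataset dominated by true negatives; they are not taken from the project's data or code. Undefined ratios (zero denominator) are returned as 0.0 here, which is one common convention and an assumption of this sketch.

```python
import math


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0


def confusion_metrics(tp, fp, fn, tn):
    """Compute the metrics named in the summary from confusion-matrix counts."""
    recall = safe_div(tp, tp + fn)          # true positive rate
    specificity = safe_div(tn, tn + fp)     # true negative rate
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": safe_div(tp, tp + fp),
        "recall": recall,
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),   # ignores TN
        "threat_score": safe_div(tp, tp + fp + fn),  # ignores TN
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts: 10 positive labels among 1000, so TN dominates.
weak = confusion_metrics(tp=2, fp=8, fn=8, tn=982)    # a weak model
dummy = confusion_metrics(tp=0, fp=0, fn=10, tn=990)  # always predicts negative

for name in weak:
    print(f"{name:18s} weak={weak[name]:.3f} dummy={dummy[name]:.3f}")
```

With these illustrative counts, accuracy stays above 0.98 for both the weak model and the all-negative dummy, whereas F1, the threat score, and MCC remain low, which matches the summary's reasoning for preferring those metrics.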