Machine learning for chemical components testing

Terahertz time domain spectroscopy (THz-TDS) involves the use of THz radiation to identify chemicals via their absorption spectra characteristics. Building on the preceding Final Year Project which proved the feasibility of incorporating machine learning with THz-TDS to identify pure chemicals, this...

Bibliographic Details
Main Author: Tan, Ashley Zhao Kiat
Other Authors: Cai Yiyu
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
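Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Mechanical engineering::Robots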
Online Access: https://hdl.handle.net/10356/150216
Institution: Nanyang Technological University
id sg-ntu-dr.10356-150216
record_format dspace
spelling sg-ntu-dr.10356-150216
2021-05-25T06:22:46Z
Machine learning for chemical components testing
Tan, Ashley Zhao Kiat
Cai Yiyu
School of Mechanical and Aerospace Engineering
Anor Technologies Pte Ltd
MYYCai@ntu.edu.sg
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Mechanical engineering::Robots
Bachelor of Engineering (Mechanical Engineering)
2021-05-25T06:22:46Z
2021-05-25T06:22:46Z
2021
Final Year Project (FYP)
Tan, A. Z. K. (2021). Machine learning for chemical components testing. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/150216
https://hdl.handle.net/10356/150216
en
C050
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Mechanical engineering::Robots
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Mechanical engineering::Robots
Tan, Ashley Zhao Kiat
Machine learning for chemical components testing
description Terahertz time-domain spectroscopy (THz-TDS) uses THz radiation to identify chemicals from the characteristics of their absorption spectra. Building on the preceding Final Year Project, which demonstrated the feasibility of combining machine learning with THz-TDS to identify pure chemicals, this report explores improvements in chemical mixture identification. Various new pre-processing approaches are applied to data collected in the lab from the industrial partner, Anor Technologies. These include Mixture Synthesis, which bolsters the mixture dataset, and a Stacked Area approach, which averages out the inconsistencies between individual datapoints obtained from the THz-TDS machine. Two new machine learning approaches are then evaluated for their effectiveness on chemical mixture identification. The first applies multi-label problem transformation techniques and algorithm adaptations such as Binary Relevance, Classifier Chain, Label Powerset and MLkNN; the second applies a 1D CNN. Results from training and testing show that while the Stacked Area approach can raise the training and validation recall and precision scores to as much as 0.99, its drawback is a fivefold reduction in dataset size, which can affect model generalization performance. Further testing shows that the 1D CNN model generalizes very well to completely unseen data, achieving recall and precision scores of around 0.98. The two novel approaches are shown to be very effective for chemical detection using THz-TDS: Mixture Synthesis can effectively double the size of the existing datasets, and Stacked Area leads to trained models with consistently better recall and precision scores than those trained on the original data.
Future work could incorporate data augmentation, randomising the offset, slope, and multiplication of the original absorption spectra to produce even more datapoints. The model could also be developed further by switching to a regression model that can quantitatively detect the composition of chemicals in a mixture.
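The record does not say how the Stacked Area approach is implemented. As a rough illustration only, the sketch below assumes it amounts to averaging non-overlapping groups of five spectra from the same sample, point by point, which would match the reported fivefold reduction in dataset size. The function name `stack_spectra`, the `group_size` parameter, and the toy values are hypothetical and are not taken from the report.

```python
from statistics import fmean


def stack_spectra(spectra, group_size=5):
    """Average consecutive, non-overlapping groups of spectra point by point.

    `spectra` is a list of equal-length absorption spectra (lists of floats)
    assumed to come from the same sample. Trailing spectra that do not fill
    a whole group are dropped.
    """
    stacked = []
    for start in range(0, len(spectra) - group_size + 1, group_size):
        group = spectra[start:start + group_size]
        # zip(*group) yields, for each frequency bin, the values across the group.
        stacked.append([fmean(values) for values in zip(*group)])
    return stacked


# Ten toy spectra with three frequency bins each (hypothetical values).
raw = [[1.0 + 0.1 * i, 2.0, 3.0 - 0.1 * i] for i in range(10)]
averaged = stack_spectra(raw)
print(len(raw), "spectra reduced to", len(averaged))
```

Averaging in this way trades dataset size for smoother inputs, which is consistent with the trade-off the abstract describes between higher scores and possible loss of generalization.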
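The augmentation idea named above (random offset, slope, and multiplication) can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's method: the function name `augment_spectrum`, the perturbation ranges, and the sample frequency grid and absorbance values are all hypothetical.

```python
import random


def augment_spectrum(freqs, absorbance, rng,
                     max_offset=0.05, max_slope=0.02, max_scale=0.1):
    """Return a perturbed copy of one spectrum.

    Each point becomes scale * a + offset + slope * (f - f0), where the
    offset, slope and scale are drawn once per spectrum from uniform ranges.
    The default ranges are placeholders, not values from the report.
    """
    offset = rng.uniform(-max_offset, max_offset)
    slope = rng.uniform(-max_slope, max_slope)
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    f0 = freqs[0]
    return [scale * a + offset + slope * (f - f0)
            for f, a in zip(freqs, absorbance)]


rng = random.Random(0)              # seeded so the run is reproducible
freqs = [0.5, 1.0, 1.5, 2.0]        # hypothetical frequency grid in THz
spectrum = [0.2, 0.8, 0.4, 0.6]     # hypothetical absorbance values
augmented = [augment_spectrum(freqs, spectrum, rng) for _ in range(3)]
print(len(augmented), "augmented copies of one spectrum")
```

Each augmented copy keeps the label of the spectrum it was derived from, so the three perturbations enlarge the training set without new measurements.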
author2 Cai Yiyu
author_facet Cai Yiyu
Tan, Ashley Zhao Kiat
format Final Year Project
author Tan, Ashley Zhao Kiat
author_sort Tan, Ashley Zhao Kiat
title Machine learning for chemical components testing
title_short Machine learning for chemical components testing
title_full Machine learning for chemical components testing
title_fullStr Machine learning for chemical components testing
title_full_unstemmed Machine learning for chemical components testing
title_sort machine learning for chemical components testing
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/150216
_version_ 1701270628591468544