Machine learning for chemical components testing
Terahertz time-domain spectroscopy (THz-TDS) uses THz radiation to identify chemicals from the characteristics of their absorption spectra. Building on the preceding Final Year Project, which proved the feasibility of combining machine learning with THz-TDS to identify pure chemicals, this...
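The full description further down in this record names Binary Relevance and Label Powerset among the multi-label problem transformation techniques applied to mixture identification. As a rough illustration of how those two transformations reshape the labels, here is a minimal sketch in plain Python. The three-chemical label rows and both function names are hypothetical, chosen only for this example, and are not taken from the report.

```python
from typing import Dict, List, Sequence, Tuple

LabelRow = Tuple[int, ...]  # one 0/1 flag per chemical in a sample


def binary_relevance_targets(Y: Sequence[LabelRow]) -> List[List[int]]:
    """Binary Relevance: one independent binary target column per label."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def label_powerset_targets(
    Y: Sequence[LabelRow],
) -> Tuple[List[int], Dict[LabelRow, int]]:
    """Label Powerset: each distinct label combination becomes one class id."""
    class_of: Dict[LabelRow, int] = {}
    targets: List[int] = []
    for row in Y:
        key = tuple(row)
        if key not in class_of:
            class_of[key] = len(class_of)  # ids assigned in order of first appearance
        targets.append(class_of[key])
    return targets, class_of


# Hypothetical samples over three chemicals (A, B, C); 1 means "present".
Y = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 0), (0, 1, 1)]

br_columns = binary_relevance_targets(Y)       # three binary problems
lp_targets, lp_classes = label_powerset_targets(Y)  # one multi-class problem
```

Binary Relevance would train one binary classifier per column of `br_columns`, while Label Powerset would train a single multi-class classifier on `lp_targets`, so it can only predict mixtures whose exact combination appeared in training.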
Main Author: | Tan, Ashley Zhao Kiat |
---|---|
Other Authors: | Cai Yiyu |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
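Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Mechanical engineering::Robots |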
Online Access: | https://hdl.handle.net/10356/150216 |
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-150216 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-150216 2021-05-25T06:22:46Z Machine learning for chemical components testing Tan, Ashley Zhao Kiat Cai Yiyu School of Mechanical and Aerospace Engineering Anor Technologies Pte Ltd MYYCai@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Mechanical engineering::Robots Bachelor of Engineering (Mechanical Engineering) 2021-05-25T06:22:46Z 2021-05-25T06:22:46Z 2021 Final Year Project (FYP) Tan, A. Z. K. (2021). Machine learning for chemical components testing. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/150216 https://hdl.handle.net/10356/150216 en C050 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Mechanical engineering::Robots |
spellingShingle |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Mechanical engineering::Robots Tan, Ashley Zhao Kiat Machine learning for chemical components testing |
description |
Terahertz time-domain spectroscopy (THz-TDS) uses THz radiation to identify chemicals from the characteristics of their absorption spectra. Building on the preceding Final Year Project, which proved the feasibility of combining machine learning with THz-TDS to identify pure chemicals, this report explores improvements in chemical mixture identification. Several new pre-processing approaches are applied to data collected in the lab from the industrial partner, Anor Technologies. These include Mixture Synthesis, which bolsters the mixture dataset, and a Stacked Area approach, which averages out the inconsistencies between individual datapoints obtained from the THz-TDS machine. Two new machine learning approaches are then evaluated for chemical mixture identification: multi-label problem transformation techniques and algorithm adaptations such as Binary Relevance, Classifier Chain, Label Powerset and MLkNN, and a 1D CNN. Training and testing results show that while the Stacked Area approach can raise the training and validation recall and precision scores to as high as 0.99, it comes at the cost of a fivefold reduction in dataset size, which can affect model generalization performance. Further testing shows that the 1D CNN model generalizes very well to completely unseen data, achieving recall and precision scores of around 0.98. The two novel approaches are shown to be very effective for chemical detection using THz-TDS: Mixture Synthesis can effectively double the size of the existing datasets, and Stacked Area leads to trained models with consistently better recall and precision scores than those trained on the original data. 
Future work could incorporate data augmentation by randomising the offset, slope, and multiplication of the original absorption spectra to produce even more datapoints. The model could also be developed further by switching to a regression model that quantitatively determines the composition of chemicals in a mixture. |
author2 |
Cai Yiyu |
author_facet |
Cai Yiyu Tan, Ashley Zhao Kiat |
format |
Final Year Project |
author |
Tan, Ashley Zhao Kiat |
author_sort |
Tan, Ashley Zhao Kiat |
title |
Machine learning for chemical components testing |
title_short |
Machine learning for chemical components testing |
title_full |
Machine learning for chemical components testing |
title_fullStr |
Machine learning for chemical components testing |
title_full_unstemmed |
Machine learning for chemical components testing |
title_sort |
machine learning for chemical components testing |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/150216 |
_version_ |
1701270628591468544 |
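
The description in this record proposes, as future work, augmenting the data by randomising the offset, slope, and multiplication of the original absorption spectra. The record gives no further detail, so the following is only a minimal sketch of one way such an augmentation might look. The function name, the perturbation ranges, and the sample spectrum are all assumptions made for illustration, not values from the report.

```python
import random
from typing import List, Sequence


def augment_spectrum(
    freqs: Sequence[float],
    absorbance: Sequence[float],
    rng: random.Random,
    max_offset: float = 0.05,  # assumed range for the additive baseline shift
    max_slope: float = 0.05,   # assumed range for the linear baseline tilt
    max_scale: float = 0.10,   # assumed range for the multiplicative factor
) -> List[float]:
    """Return a perturbed copy of one absorption spectrum.

    Each point becomes scale * a + offset + slope * (f - f0), where the
    three random parameters are drawn once per spectrum.
    """
    offset = rng.uniform(-max_offset, max_offset)
    slope = rng.uniform(-max_slope, max_slope)
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    f0 = freqs[0]
    return [scale * a + offset + slope * (f - f0) for f, a in zip(freqs, absorbance)]


# Hypothetical spectrum: frequencies in THz and made-up absorbance values.
freqs = [0.5, 1.0, 1.5, 2.0]
absorbance = [0.10, 0.35, 0.20, 0.50]

rng = random.Random(0)  # fixed seed so the augmented copies are reproducible
augmented = [augment_spectrum(freqs, absorbance, rng) for _ in range(3)]
```

Drawing the three parameters once per spectrum keeps the peak positions fixed while varying the baseline and overall intensity, which is the kind of variation the description attributes to individual THz-TDS measurements.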