Data augmentation strategies for machine learning in polymer composites

Bibliographic Details
Main Author: Ho, Agnes Lin Xuan
Other Authors: Sunil Chandrakant Joshi
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/159161
Institution: Nanyang Technological University
Description
Summary: Machine learning has become ubiquitous across many sectors in recent years. This report presents prevalent machine learning techniques and their potential applications in the field of polymer composites. Learning algorithms, however, require a sufficiently large dataset to make accurate predictions. The report therefore discusses existing data augmentation strategies, which increase the amount of available data to meet the requirements of machine learning, and this discussion led to the implementation of the Knowledge-Based Data Boosting (KBDB) technique for this research project. Software implementing the technique was developed and applied to case studies associated with polymer composites. After four iterations of the augmentation process, the first case study achieved a boosting factor of approximately 10:1 for two datasets. The second attained a boosting factor of nearly 9:1, with the potential to multiply the number of data points further for all datasets if supplementary rounds of augmentation are performed. Collecting experimental data can be time-consuming and costly. By generating an additional set of realistic data points, the KBDB data augmentation strategy enables researchers to apply machine learning algorithms to existing datasets even when they lack ample data.