Sensor-based activity recognition via learning from distributions


Bibliographic Details
Main Author: Qian, Hangwei
Other Authors: Pan, Sinno Jialin
Format: Thesis-Doctor of Philosophy
Language:English
Published: Nanyang Technological University 2020
Subjects:
Online Access:https://hdl.handle.net/10356/137691
Institution: Nanyang Technological University
Description
Summary: Wearable-sensor-based activity recognition aims to predict users' activities from multi-dimensional streams of sensor readings received from ubiquitous sensors. To apply machine learning techniques to sensor-based activity recognition, previous approaches focused on composing a feature vector to represent the sensor-reading stream received within a period of some length. With the constructed feature vectors, e.g., built from predefined orders of statistical moments, and their corresponding activity labels, standard classification algorithms can be applied to train a predictive model, which is then used to make predictions. However, we argue that the prevalent success of existing methods rests on two crucial prerequisites: proper feature extraction and sufficient labeled training data. The former is important for differentiating activities, while the latter is crucial for building a precise learning model. These two prerequisites have become bottlenecks to making existing methods practical: most existing feature extraction methods depend heavily on domain knowledge, while obtaining labeled data requires intensive human annotation effort. In this thesis, we propose novel methods to tackle these problems. The first research issue is how to extract proper features from the partitioned segments of multivariate sensor readings. Both feature-engineering-based machine learning models and deep learning models have been explored for wearable-sensor-based human activity recognition, and each has drawbacks: 1) feature-engineering-based methods can extract meaningful features, such as statistical or structural information underlying the segments, but usually require manual feature design for each application, which is time-consuming; and 2) deep learning models can learn temporal and/or spatial features from the sensor data automatically, but fail to capture statistical information.
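To make the feature-engineering baseline concrete, the sketch below builds a feature vector from predefined orders of statistical moments over a sensor segment, as the abstract describes. This is an illustrative example only, not code from the thesis; the channel count, segment length, and choice of moment orders are assumptions.

```python
import numpy as np

def moment_features(segment, orders=(1, 2, 3, 4)):
    """Compose a feature vector from predefined orders of moments.

    segment: (T, D) array of T sensor readings over D channels.
    Returns a 1-D vector with len(orders) * D statistical features.
    """
    feats = []
    for k in orders:
        if k == 1:
            feats.append(segment.mean(axis=0))        # mean per channel
        elif k == 2:
            feats.append(segment.var(axis=0))         # variance per channel
        else:
            # standardized central moment of order k (k=3: skewness, k=4: kurtosis)
            mu = segment.mean(axis=0)
            sd = segment.std(axis=0) + 1e-12          # avoid division by zero
            feats.append((((segment - mu) / sd) ** k).mean(axis=0))
    return np.concatenate(feats)

rng = np.random.default_rng(0)
seg = rng.normal(size=(128, 3))   # e.g., 128 readings from 3 accelerometer axes
x = moment_features(seg)
print(x.shape)                    # (12,) — 4 moment orders x 3 channels
```

A vector like `x`, paired with an activity label, is what a standard classifier would consume; the thesis's point is that choosing the right moments by hand requires domain knowledge.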
To solve these problems, we first aim to capture the statistical information carried by higher-order moments when constructing features. We propose a new method, denoted SMM_AR, based on learning from distributions for sensor-based activity recognition. Specifically, we consider the sensor readings received within a period as a sample, which can be represented by a feature vector of infinite dimensions in a Reproducing Kernel Hilbert Space (RKHS) using kernel embedding techniques. We then train a classifier in the RKHS. To scale up the proposed method, we further offer an accelerated version, R-SMM_AR, which utilizes an explicit feature map instead of a kernel function. In addition, we propose a novel deep learning model that automatically learns meaningful features, including statistical, temporal, and spatial-correlation features, for activity recognition in a unified framework. The second research issue is how to alleviate the demand for sufficient labeled training data. We propose a novel method, named Distribution-based Semi-Supervised Learning (DSSL for short), to tackle the aforementioned limitations. The proposed method is capable of automatically extracting powerful features with no domain knowledge required, while alleviating the heavy annotation effort through semi-supervised learning. Specifically, we treat the data stream of sensor readings received in a period as a distribution, and map all training distributions, both labeled and unlabeled, into an RKHS using the kernel mean embedding technique. The RKHS is further altered by exploiting the underlying geometric structure of the unlabeled distributions. Finally, in the altered RKHS, a classifier is trained with the labeled distributions. We also investigate the situation where only the coarse sequence of activity labels is known, while the starting and ending points of activities are unknown.
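The acceleration idea behind R-SMM_AR can be sketched as follows: instead of an implicit kernel between distributions, each segment is mapped to a finite-dimensional vector by averaging an explicit feature map over its readings. The sketch below uses random Fourier features approximating an RBF kernel as one such explicit map; the bandwidth, feature count, and data are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier feature map approximating an RBF kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def mean_embedding(segment, W, b):
    """Approximate kernel mean embedding: average the explicit feature map
    over all readings, so each segment (a sample from a distribution)
    becomes a single finite-dimensional vector."""
    return rff_map(segment, W, b).mean(axis=0)

rng = np.random.default_rng(0)
D, m = 3, 64                                  # sensor channels, random features
gamma = 1.0                                   # RBF bandwidth parameter (assumed)
W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, m))
b = rng.uniform(0.0, 2 * np.pi, size=m)

seg_a = rng.normal(size=(100, D))             # segment from one "activity"
seg_b = rng.normal(loc=2.0, size=(100, D))    # segment from a shifted distribution
mu_a = mean_embedding(seg_a, W, b)
mu_b = mean_embedding(seg_b, W, b)
# Inner products between such embeddings approximate the kernel between the
# underlying distributions, so a linear classifier on these vectors stands in
# for the classifier trained in the RKHS.
print(mu_a.shape)                             # (64,)
```

Because the embedding is explicit and fixed-dimensional, training cost no longer grows with pairwise kernel evaluations between segments, which is the point of the accelerated variant.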
We propose a unified weakly-supervised framework that jointly segments sensor streams and extracts statistical features from the sensory readings of each segment; we name this algorithm S-SMM_AR. Extensive evaluations are conducted on various large-scale datasets to demonstrate the effectiveness of our proposed methods compared with state-of-the-art baselines.