Structured prediction for feature selection and performance evaluation
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2014 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/55288 |
Institution: | Nanyang Technological University |
Summary: | Machine learning methods can be employed to discover the relationship between inputs and their desired outputs from a large collection of data points. The outputs of many real-world problems are naturally structured objects whose elements are interdependent according to the given structure. Although conventional methods can be applied directly to these problems by treating the elements of each output object independently, they tend to yield suboptimal performance because they ignore the interdependency information in the structured outputs. In recent years, structured prediction has provided a natural way to directly model the relationship between inputs and structured outputs. A structured prediction model consists of three components: feature engineering, learning the optimal hypothesis from data by minimizing a specific loss, and prediction. This thesis focuses on the first two components, which are interwoven in the following three pieces of work, with a generalized linear model used for prediction. The first work of this thesis deals with the feature engineering problem, which is critical in structured prediction modeling. The success of structured prediction models is attributed to the fact that their discriminative models can account for overlapping features over the whole input observation. These features are usually generated by applying a given set of templates to labeled data, but improper templates may lead to degraded performance. To alleviate the difficulty of template selection in the feature engineering phase, a novel multiple template learning paradigm has been proposed to learn a structured prediction model and the importance of each template simultaneously, so that hundreds of arbitrary templates can be added to the learning model without concern for degraded performance. 
This paradigm has been further extended for structured prediction using generalized p-block norm regularization. The second work of this thesis focuses on application-based loss functions in a special structured prediction problem, automatic image annotation, which is one of the major tools for enhancing the semantic understanding of web images. However, the insufficient performance of image annotation methods prevents these applications from being practical. Although many image annotation methods have been proposed, most of them are trapped in suboptimal performance because the measure they optimize is not the measure used for performance evaluation. To address this issue, a variety of objective-guided performance measures are first summarized under a unified representation. A unified multi-label learning framework has then been proposed that directly optimizes a variety of performance measures for multi-label learning tasks. Instead of template selection for structured prediction, the third work of this thesis studies how to select features for binary classification by optimizing specific multivariate performance measures based on a structured prediction model. A generalized sparse regularizer has been proposed. Based on this regularizer, a unified feature selection framework has also been presented for general loss functions. In particular, I have studied a novel feature selection paradigm that optimizes multivariate performance measures based on Structural SVM. To solve the challenging problem posed by the resulting formulation for high-dimensional data, a two-layer cutting plane algorithm has been proposed, and its convergence has been proved. In addition, the proposed method has been adapted to optimize multivariate measures for multiple instance learning problems. |
---|
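
The summary refers to multivariate performance measures, which are computed over a whole set of predictions and cannot be written as a sum of per-example losses. The following is a minimal sketch, assuming F1 as the measure; the record does not say which measures the thesis optimizes, and the function names here are hypothetical, not taken from the thesis.

```python
from typing import Sequence, Tuple


def contingency(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 depends on the whole contingency table, not on single examples."""
    tp, fp, fn, _ = contingency(y_true, y_pred)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def f1_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Loss over the full label vector, treated as one structured output."""
    return 1.0 - f1_score(y_true, y_pred)
```

Under this view the entire label vector of a dataset acts as a single structured output, which is one way such a measure can be plugged into a structured prediction model as its loss.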