Learning semantically rich network-based multi-modal mobile user interface embeddings

Semantically rich information from multiple modalities, namely text, code, images, and categorical and numerical data, co-exists in the user interface (UI) design of mobile applications. Moreover, each UI design is composed of inter-linked UI entities that support different functions of an application, e.g., a UI screen comprising a UI taskbar, a menu, and multiple button elements. Existing UI representation learning methods, unfortunately, are not designed to capture both the multi-modal attributes of UI entities and the linkage structure between them. To support effective search and recommendation applications over mobile UIs, we need UI representations that integrate the latent semantics present in both the multi-modal information and the linkages between UI entities. In this article, we present a novel self-supervised model, the Multi-modal Attention-based Attributed Network Embedding (MAAN) model. MAAN is designed to capture the structural network information present in the linkages between UI entities, as well as the multi-modal attributes of the UI entity nodes. Based on the variational autoencoder framework, MAAN learns semantically rich UI embeddings in a self-supervised manner by reconstructing the attributes of UI entities and the linkages between them. The generated embeddings can be applied to a variety of downstream tasks: predicting UI elements associated with UI screens, inferring missing UI screen and element attributes, predicting UI user ratings, and retrieving UIs. Extensive experiments, including user evaluations, conducted on datasets from RICO, a rich real-world mobile UI repository, demonstrate that MAAN outperforms other state-of-the-art models. The number of linkages between UI entities can provide further information on the roles of different UI entities in UI designs; MAAN, however, does not capture such edge attributes. To extend and generalize MAAN to learn even richer UI embeddings, we further propose EMAAN, which additionally captures edge attributes. Additional extensive experiments on EMAAN show that it improves on the performance of MAAN and similarly outperforms state-of-the-art models.
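As a concrete illustration of the objective sketched above, the following is a minimal, hypothetical PyTorch sketch of a variational graph autoencoder that learns node embeddings by reconstructing both node attributes and linkages, which is the self-supervised recipe the abstract attributes to MAAN. It is not the authors' implementation: MAAN's attention-based multi-modal fusion is simplified here to concatenation of per-modality projections, and all names (MultiModalVGAE, loss_fn) and dimensions are our own assumptions.

    # Hypothetical sketch, not the authors' MAAN code: a VGAE-style model
    # that jointly reconstructs node attributes and linkages.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiModalVGAE(nn.Module):
        def __init__(self, modality_dims, hidden_dim=64, latent_dim=32):
            super().__init__()
            # One projection per input modality (e.g., text, image, categorical).
            self.proj = nn.ModuleList([nn.Linear(d, hidden_dim) for d in modality_dims])
            fused = hidden_dim * len(modality_dims)
            self.enc_mu = nn.Linear(fused, latent_dim)
            self.enc_logvar = nn.Linear(fused, latent_dim)
            # Attribute decoder reconstructs the concatenated raw features.
            self.attr_dec = nn.Linear(latent_dim, sum(modality_dims))

        def forward(self, feats, adj):
            # Fuse modalities by concatenating projected features
            # (MAAN uses attention-based fusion here instead).
            h = torch.cat([F.relu(p(x)) for p, x in zip(self.proj, feats)], dim=-1)
            h = adj @ h  # one GCN-style propagation step over the linkages
            mu, logvar = self.enc_mu(h), self.enc_logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
            adj_logits = z @ z.t()         # inner-product link decoder
            attr_recon = self.attr_dec(z)  # attribute decoder
            return adj_logits, attr_recon, mu, logvar

    def loss_fn(adj_logits, adj, attr_recon, feats, mu, logvar):
        # Self-supervised objective: link reconstruction + attribute
        # reconstruction + KL regularizer from the variational framework.
        link = F.binary_cross_entropy_with_logits(adj_logits, adj)
        attr = F.mse_loss(attr_recon, torch.cat(feats, dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return link + attr + kl

    # Toy usage: 5 UI entity nodes (e.g., a screen, a taskbar, buttons) with
    # a 16-d text modality and an 8-d layout modality; random data only.
    feats = [torch.randn(5, 16), torch.randn(5, 8)]
    adj = (torch.rand(5, 5) > 0.5).float()
    model = MultiModalVGAE([16, 8])
    adj_logits, attr_recon, mu, logvar = model(feats, adj)
    loss = loss_fn(adj_logits, adj, attr_recon, feats, mu, logvar)

In this toy setting, the mu vectors play the role of the UI embeddings used for the downstream tasks listed above; EMAAN's extension would, in addition, condition the link decoder on edge attributes.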

Bibliographic Details
Main Authors: ANG, Meng Kiat Gary; LIM, Ee-peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2022-05-01
Collection: Research Collection School Of Computing and Information Systems
DOI: 10.1145/3533856
License: CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Subjects: network embedding; mobile application user interface; unsupervised retrieval; self-supervised learning; multi-modal; user interface design; Databases and Information Systems; OS and Networks
Online Access: https://ink.library.smu.edu.sg/sis_research/7269
https://ink.library.smu.edu.sg/context/sis_research/article/8272/viewcontent/3533856.pdf
Institution: Singapore Management University