Learning network-based multi-modal mobile user interface embeddings

Bibliographic Details
Main Authors: ANG, Gary; LIM, Ee-Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/7049
https://ink.library.smu.edu.sg/context/sis_research/article/8052/viewcontent/3397481.3450693.pdf
Institution: Singapore Management University
Description
Summary: Rich multi-modal information - text, code, images, categorical and numerical data - co-exists in the user interface (UI) design of mobile applications. UI designs are composed of UI entities that support different functions and together enable the application. To support effective search and recommendation applications over mobile UIs, we need to be able to learn UI representations that integrate latent semantics. In this paper, we propose the Multi-modal Attention-based Attributed Network Embedding (MAAN) model, a novel unsupervised model designed to capture both multi-modal and structural network information. Based on the encoder-decoder framework, MAAN learns UI representations that allow UI design reconstruction. The generated embeddings can be applied to a variety of tasks: predicting UI elements associated with UI screens, inferring missing UI screen and element attributes, predicting UI user ratings, and retrieving UIs. Extensive experiments, including user evaluations, conducted on two datasets from RICO, a rich real-world mobile UI repository, demonstrate that MAAN outperforms other state-of-the-art models.
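
The abstract describes MAAN only at a high level: attention-based fusion of multi-modal UI attributes inside an encoder-decoder that reconstructs the UI design. The sketch below is a minimal, illustrative PyTorch example of that general idea, not the authors' implementation; all module names, dimensions, and the specific fusion scheme are assumptions made for illustration.

# Illustrative sketch only: attention-weighted fusion of multi-modal node
# attributes with an encoder-decoder reconstruction objective.
# Module/variable names are hypothetical and do NOT reproduce MAAN itself.
import torch
import torch.nn as nn


class MultiModalAttentionEncoder(nn.Module):
    def __init__(self, modality_dims, embed_dim):
        super().__init__()
        # Project each modality (e.g. text, image, categorical features)
        # into a common space before fusing.
        self.projections = nn.ModuleList(
            [nn.Linear(d, embed_dim) for d in modality_dims]
        )
        # A learned query scores each modality for attention weighting.
        self.attn_query = nn.Parameter(torch.randn(embed_dim))

    def forward(self, modality_features):
        # modality_features: list of tensors, each (num_nodes, dim_m)
        projected = torch.stack(
            [proj(x) for proj, x in zip(self.projections, modality_features)],
            dim=1,
        )  # (num_nodes, num_modalities, embed_dim)
        scores = torch.softmax(projected @ self.attn_query, dim=1)
        # Attention-weighted sum over modalities -> one embedding per node.
        return (scores.unsqueeze(-1) * projected).sum(dim=1)


class ReconstructionDecoder(nn.Module):
    def __init__(self, embed_dim, modality_dims):
        super().__init__()
        # One decoder head per modality to reconstruct its attributes.
        self.heads = nn.ModuleList(
            [nn.Linear(embed_dim, d) for d in modality_dims]
        )

    def forward(self, z):
        return [head(z) for head in self.heads]


# Toy usage: 5 UI nodes with two modalities (e.g. 300-d text, 64-d layout).
dims = [300, 64]
encoder = MultiModalAttentionEncoder(dims, embed_dim=128)
decoder = ReconstructionDecoder(128, dims)
feats = [torch.randn(5, d) for d in dims]
z = encoder(feats)          # (5, 128) UI embeddings
recon = decoder(z)          # reconstructed per-modality features
loss = sum(nn.functional.mse_loss(r, x) for r, x in zip(recon, feats))

Because the model is trained only on a reconstruction loss, the procedure is unsupervised, matching the abstract's framing. Capturing the structural network information the abstract also mentions (e.g. links between UI screens and UI elements) would require an additional structural decoder, which is omitted from this sketch.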