SEMANTIC COMPOSITIONAL NETWORK WITH TOP-DOWN ATTENTION FOR PRODUCT TITLE GENERATION FROM IMAGE


Description

Bibliographic Details
Main Author: Wijaya, Nicholas
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/50032
Physical Description
Summary: E-commerce is currently one of the most widely used forms of transaction, and many e-commerce applications exist today. Among the several types of e-commerce is customer-to-customer (C2C), in which the seller can be anyone and product data is therefore entered manually. This manual process can lead to problems such as inconsistent or mistyped product titles. A system that generates a product title from a product image would therefore be very useful for sellers in C2C e-commerce. In this work, an image-captioning approach to product title generation is proposed by combining two recent works: Semantic Compositional Networks by Gan et al. (2017) and Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering by Anderson et al. (2018). In practice, this work uses Semantic Compositional Networks combined with the top-down attention from Anderson et al.'s work. With this approach, the system can generate reasonably good product titles from images. The combined architecture achieves ROUGE-L, BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.8313, 0.7911, 0.6784, 0.5240, and 0.4179, respectively. On the same dataset, it outperforms the two reference works, which score 0.8183, 0.7463, 0.6445, 0.4859, and 0.3812 (Gan et al.) and 0.7922, 0.7867, 0.6816, 0.5159, and 0.3989 (Anderson et al. without bottom-up attention).
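The BLEU-n scores reported above measure n-gram overlap between a generated title and the reference title. As a minimal sketch (not the thesis's evaluation code, and using a simple smoothing choice assumed here for illustration), sentence-level BLEU can be computed in pure Python as the geometric mean of modified n-gram precisions times a brevity penalty:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions for n = 1..max_n, times a brevity penalty that
    punishes candidates shorter than the reference."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        # Tiny epsilon avoids log(0); a crude smoothing assumption.
        precisions.append(max(overlap, 1e-9) / total)
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Hypothetical product titles for illustration only.
reference = "red cotton t shirt size m".split()
generated = "red t shirt size m".split()
print(round(bleu(generated, reference, max_n=2), 4))
```

BLEU-1 uses only unigram precision (max_n=1), while BLEU-4 averages over 1- to 4-grams, which is why the reported scores decrease from BLEU-1 to BLEU-4.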