Portrait matting using an attention-based memory network
Matting is the process of predicting an alpha matte and foreground with rich detail from input images. Traditional matting algorithms face three major challenges. First, most of them focus on auxiliary-based matting, which is impractical for everyday use because the additional input is unavailable in most scenarios. Second, temporal-guided modules must be constructed to exploit temporal coherence for video matting tasks. Third, matting datasets are scarce. This thesis addresses these challenges and proposes a novel auxiliary-free video matting network. To eliminate the reliance on additional inputs, we adopt an interleaved training strategy in which binary masks from segmentation outputs help our model locate the portrait and separate its boundary from the background. We then design a temporal-guided memory module based on the attention mechanism to compute and store temporal coherence among video frames. Moreover, we provide direct supervision for the attention-based memory block to further boost the network's robustness. Finally, we collect multiple matting datasets to generate synthesized video clips for training and testing. The validation results show that our method outperforms several state-of-the-art methods in terms of alpha and foreground prediction quality and temporal consistency.
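The matting task summarized in the abstract is conventionally defined by the compositing equation I = αF + (1 − α)B, where the alpha matte α blends the foreground F over a background B. The sketch below illustrates that equation only; it is not code from the thesis, and the function name and array layout are illustrative assumptions.

```python
import numpy as np

def composite(alpha, foreground, background):
    """Composite a foreground over a background with an alpha matte.

    alpha:      (H, W, 1) matte in [0, 1]; 1 = foreground, 0 = background.
    foreground: (H, W, 3) predicted foreground colors.
    background: (H, W, 3) replacement background.
    Implements the standard compositing equation I = alpha*F + (1-alpha)*B.
    """
    return alpha * foreground + (1.0 - alpha) * background

# Tiny example: a 2x2 matte that keeps the left column and drops the right.
alpha = np.array([[[1.0], [0.0]],
                  [[1.0], [0.0]]])
fg = np.ones((2, 2, 3))   # white "portrait" foreground
bg = np.zeros((2, 2, 3))  # black replacement background
out = composite(alpha, fg, bg)
print(out[:, :, 0])       # left column 1.0, right column 0.0
```

Predicting both α and F, as the thesis does, lets the portrait be re-composited onto any new background with this one equation.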
Saved in: DR-NTU (NTU Library)

Main Author: Song, Shufeng
Other Authors: Lin Zhiping; School of Electrical and Electronic Engineering
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/166590
Institution: Nanyang Technological University
Degree: Master of Engineering
Citation: Song, S. (2023). Portrait matting using an attention-based memory network. Master's thesis, Nanyang Technological University, Singapore.
DOI: 10.32657/10356/166590
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).