Human-guided cross-domain synthesis: generating virtual robotic arm imagery and videos
A multitude of interaction methods between humans and robotic arms has emerged; one effective strategy is to have robotic arms imitate human arm movements, enabling intuitive operation. With technological advances, robotic arms can now learn and imitate actions by watching videos or images of those actions. This dissertation proposes a method that uses cross-domain conversion and image-generation technology to transform videos of human arm movements into videos of robotic arm actions. The method gives real robotic arms opportunities to learn and imitate, and further enables direct interaction through mimicry of human arm movements. By splitting videos into frames and combining generative adversarial networks with a contrastive learning framework, the mutual information between input-domain and output-domain image patches is maximized, effectively achieving cross-domain conversion. To improve the model's generalization, techniques such as image masking and human skeleton keypoint detection are also introduced. This broadens the model's range of application, offers insights for other cross-domain conversion tasks, and opens up additional possibilities for robotic arm learning.
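The abstract's core mechanism (pairing a generative adversarial network with contrastive learning so that mutual information between input- and output-domain image patches is maximized) matches the patchwise InfoNCE objective popularized by Contrastive Unpaired Translation (CUT). The record contains no code, so the following is only a minimal PyTorch sketch of such a loss under stated assumptions: the function name, feature shapes, and temperature are illustrative, not details from the dissertation.

```python
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_src, feat_tgt, temperature=0.07):
    """Patchwise InfoNCE: each translated patch (query) should match the
    source patch at the same spatial location (positive) and repel patches
    from other locations (negatives).

    feat_src, feat_tgt: (num_patches, dim) features from the same encoder
    applied to the input image and the generated image. Shapes and names
    are illustrative assumptions, not taken from the thesis.
    """
    feat_src = F.normalize(feat_src, dim=1)
    feat_tgt = F.normalize(feat_tgt, dim=1)
    # Cosine-similarity logits between every translated patch and every
    # source patch; the diagonal holds the positive pairs.
    logits = feat_tgt @ feat_src.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    # Cross-entropy with diagonal labels is the InfoNCE loss, a lower
    # bound on the mutual information between corresponding patches.
    return F.cross_entropy(logits, labels)

# Illustrative usage with random features for 256 patches of dimension 128.
src = torch.randn(256, 128)
tgt = torch.randn(256, 128)
print(patch_nce_loss(src, tgt).item())
```

In a full CUT-style setup this term would be added to the usual adversarial loss of the generator, tying each output patch to its corresponding input patch without requiring paired training data.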
Main Author: | Wang, Ruofeng |
---|---|
Other Authors: | Wen, Bihan |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Cross-domain conversion; Image Generation; Robotic arm; Adversarial Generative Networks; Contrastive Learning |
Online Access: | https://hdl.handle.net/10356/173713 |
Institution: | Nanyang Technological University |
Citation: | Wang, R. (2024). Human-guided cross-domain synthesis: generating virtual robotic arm imagery and videos. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/173713 |
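The abstract also names two preprocessing ingredients: splitting videos into frames, and image masking used alongside human skeleton keypoint detection. As a rough illustration of that pipeline, here is a hedged Python sketch using OpenCV; the frame-subsampling step and the patchwise masking scheme are assumptions for demonstration, since the record does not specify them, and a pose estimator (e.g., MediaPipe Pose or OpenPose) would supply the skeleton keypoints in a complete pipeline.

```python
import cv2
import numpy as np

def video_to_frames(path, step=1):
    """Split a video into frames; the pipeline in the abstract starts here.
    `step` subsamples every step-th frame (an assumption, not stated in
    the thesis record)."""
    cap = cv2.VideoCapture(path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

def random_mask(frame, patch=32, drop_ratio=0.3, seed=None):
    """Illustrative image masking: zero out a random subset of square
    patches. The record does not specify the actual masking scheme; this
    is a stand-in for whatever augmentation the thesis uses."""
    rng = np.random.default_rng(seed)
    h, w = frame.shape[:2]
    out = frame.copy()
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if rng.random() < drop_ratio:
                out[y:y + patch, x:x + patch] = 0
    return out

# Illustrative usage: mask every extracted frame before translation.
# frames = [random_mask(f) for f in video_to_frames("human_arm.mp4", step=2)]
```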