Embodied object hunt
This study investigates the use of multimodal encoders in the Embodied Object Hunt task. The approach is motivated by recent developments in joint multimodal encoders such as CLIP, which can extract features common to images and text. This ability is ideal for tasks combining...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175084 |