Are vision language models multimodal learners?
Since the release of accessible vision language models (VLMs) such as GPT-4V and Gemini Pro in 2023, scholars have envisaged using these artificial intelligence (AI) models to support instructors and learners widely. In particular, their capability to simultaneously process visual and textual dat...
Main Author: | |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181109 https://www.ntu.edu.sg/mae/ai-education-singapore-2024/activities/keynote-invited-talk#Content_C021_Col00 |
Institution: | Nanyang Technological University |