Q-align: teaching LMMs for visual scoring via discrete text-defined levels
The explosion of visual content available online underscores the requirement for an accurate machine assessor to robustly evaluate scores across diverse types of visual contents. While recent studies have demonstrated the exceptional potentials of large multi-modality models (LMMs) on a wide rang...
Saved in:
Main Authors: | , , , , , , , , , , , , , |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: |
2024
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/178466 http://arxiv.org/abs/2312.17090v1 https://openreview.net/forum?id=PHjkVjR78A https://icml.cc/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Be the first to leave a comment!