Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Multi-modality foundation models, as represented by GPT-4V, have brought a new paradigm for low-level visual perception and understanding tasks: a single model can now respond to a broad range of natural human instructions. While existing foundation models have shown exciting potential on low-level...
Main Authors:
Format: Conference or Workshop Item
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/178464
http://arxiv.org/abs/2311.06783v1
https://openaccess.thecvf.com/content/CVPR2024/papers/Wu_Q-Instruct_Improving_Low-level_Visual_Abilities_for_Multi-modality_Foundation_Models_CVPR_2024_paper.pdf