Automating dataset updates towards reliable and timely evaluation of Large Language Models

Large language models (LLMs) have achieved impressive performance across various natural language benchmarks, prompting a continual need to curate more difficult datasets for larger LLMs, which is costly and time-consuming. In this paper, we propose to automate dataset updating and provide systemati...

Bibliographic Details
Main Authors: YING, Jiahao, CAO, Yixin, BAI, Yushi, SUN, Qianru, WANG, Bo, TANG, Wei, DING, Zhaojun, YANG, Yizhe, HUANG, Xuanjing, YAN, Shuicheng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: LLM
Online Access: https://ink.library.smu.edu.sg/sis_research/9439
Institution: Singapore Management University