Automating dataset updates towards reliable and timely evaluation of Large Language Models

Large language models (LLMs) have achieved impressive performance across various natural language benchmarks, prompting a continual need to curate more difficult datasets for larger LLMs, which is costly and time-consuming. In this paper, we propose to automate dataset updating and provide systemati...

Bibliographic Details
Main Authors:	YING, Jiahao; CAO, Yixin; BAI, Yushi; SUN, Qianru; WANG, Bo; TANG, Wei; DING, Zhaojun; YANG, Yizhe; HUANG, Xuanjing; YAN, Shuicheng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Large language models; LLM; Dataset update; Benchmark update; Automation; Artificial Intelligence and Robotics
Online Access:	https://ink.library.smu.edu.sg/sis_research/9439
Institution:	Singapore Management University