ALI-agent: Assessing LLMs’ alignment with human values via agent-based evaluation

Large Language Models (LLMs) can elicit unintended and even harmful content when misaligned with human values, posing severe risks to users and society. To mitigate these risks, current evaluation benchmarks predominantly employ expert-designed contextual scenarios to assess how well LLMs align with...

Bibliographic Details
Main Authors:	ZHENG, Jingnan, WANG, Han, NGUYEN, Tai D., ZHANG, An, SUN, Jun, CHUA, Tat-Seng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/9834 https://ink.library.smu.edu.sg/context/sis_research/article/10834/viewcontent/8621_ALI_Agent_Assessing_LLMs_.pdf
Institution:	Singapore Management University

Similar Items