Chain of preference optimization: Improving chain-of-thought reasoning in LLMs

The recent development of chain-of-thought (CoT) decoding has enabled large language models (LLMs) to generate explicit logical reasoning paths for complex problem-solving. However, research indicates that these paths are not always deliberate and optimal. The tree-of-thought (ToT) method employs tr...

Bibliographic Details
Main Authors:	ZHANG, Xuan, DU, Chao, PANG, Tianyu, LIU, Qian, GAO, Wei, LIN, Min
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Databases and Information Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/9881 https://ink.library.smu.edu.sg/context/sis_research/article/10881/viewcontent/2406.09136v2.pdf
Institution:	Singapore Management University

Similar Items