Evaluation of Orca 2 against other LLMs for Retrieval Augmented Generation

This study presents a comprehensive evaluation of Microsoft Research’s Orca 2, a small yet potent language model, in the context of Retrieval Augmented Generation (RAG). The research involved comparing Orca 2 with other significant models such as Llama-2, GPT-3.5-Turbo, and GPT-4, particularly focusing on its application in RAG. Key metrics, including faithfulness, answer relevance, overall score, and inference speed, were assessed. Experiments conducted on high-specification PCs revealed Orca 2’s exceptional performance in generating high-quality responses and its efficiency on consumer-grade GPUs, underscoring its potential for scalable RAG applications. This study highlights the pivotal role of smaller, efficient models like Orca 2 in the advancement of conversational AI and their implications for various IT infrastructures. The source code and datasets of this paper are available at https://github.com/inflaton/Evaluation-of-Orca-2-for-RAG.
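The metrics named in the abstract (faithfulness, answer relevance, overall score) are in practice usually computed with LLM-based judges; the sketch below is only a rough lexical-overlap illustration of what each metric measures, not the authors' implementation, and all function names and the sample texts are hypothetical.

```python
# Hypothetical token-overlap proxies for RAG evaluation metrics.
# Real frameworks score faithfulness and relevance with an LLM judge;
# lexical overlap is used here only to make the definitions concrete.

def _tokens(text: str) -> set[str]:
    """Lowercased word tokens with trailing punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split() if w}

def faithfulness(answer: str, context: str) -> float:
    """Fraction of answer tokens grounded in the retrieved context."""
    ans, ctx = _tokens(answer), _tokens(context)
    return len(ans & ctx) / len(ans) if ans else 0.0

def answer_relevance(answer: str, question: str) -> float:
    """Fraction of question tokens the answer actually addresses."""
    q, ans = _tokens(question), _tokens(answer)
    return len(q & ans) / len(q) if q else 0.0

def overall_score(answer: str, question: str, context: str) -> float:
    """Simple average of the two proxy metrics."""
    return 0.5 * (faithfulness(answer, context) + answer_relevance(answer, question))

context = "Orca 2 is a small language model released by Microsoft Research."
question = "Who released Orca 2?"
answer = "Orca 2 was released by Microsoft Research."

print(round(faithfulness(answer, context), 2))     # grounded-token ratio
print(round(answer_relevance(answer, question), 2))
```

A faithfulness below 1.0 flags answer tokens with no support in the retrieved context, which is the kind of hallucination signal the paper's evaluation is designed to surface.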


Bibliographic Details
Main Authors: HUANG, Donghao, WANG, Zhaoxia
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: Large Language Model (LLM); Generative Pre-trained Transformer (GPT); Retrieval Augmented Generation (RAG); Question Answering; Model Comparison; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/9052
https://ink.library.smu.edu.sg/context/sis_research/article/10055/viewcontent/RAFDA_2024_Empirical_Evaluation_of_Orca_2_Models_for_Retrieval_Augmented_Generation.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10055
record_format dspace
spelling sg-smu-ink.sis_research-10055 2024-10-17T06:44:08Z
date 2024-05-01T07:00:00Z
format text (application/pdf)
doi info:doi/10.1007/978-981-97-2650-9_1
rights http://creativecommons.org/licenses/by-nc-nd/4.0/
collection Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Large Language Model (LLM)
Generative Pre-trained Transformer (GPT)
Retrieval Augmented Generation (RAG)
Question Answering
Model Comparison
Artificial Intelligence and Robotics
Databases and Information Systems
Numerical Analysis and Scientific Computing
description This study presents a comprehensive evaluation of Microsoft Research’s Orca 2, a small yet potent language model, in the context of Retrieval Augmented Generation (RAG). The research involved comparing Orca 2 with other significant models such as Llama-2, GPT-3.5-Turbo, and GPT-4, particularly focusing on its application in RAG. Key metrics, including faithfulness, answer relevance, overall score, and inference speed, were assessed. Experiments conducted on high-specification PCs revealed Orca 2’s exceptional performance in generating high-quality responses and its efficiency on consumer-grade GPUs, underscoring its potential for scalable RAG applications. This study highlights the pivotal role of smaller, efficient models like Orca 2 in the advancement of conversational AI and their implications for various IT infrastructures. The source code and datasets of this paper are available at https://github.com/inflaton/Evaluation-of-Orca-2-for-RAG.
format text
author HUANG, Donghao
WANG, Zhaoxia
author_sort HUANG, Donghao
title Evaluation of Orca 2 against other LLMs for Retrieval Augmented Generation
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9052
https://ink.library.smu.edu.sg/context/sis_research/article/10055/viewcontent/RAFDA_2024_Empirical_Evaluation_of_Orca_2_Models_for_Retrieval_Augmented_Generation.pdf
_version_ 1814047925545730048