The devil is in the tails: How long-tailed code distributions impact large language models

Learning-based techniques, especially advanced Large Language Models (LLMs) for code, have gained considerable popularity in various software engineering (SE) tasks. However, most existing works focus on designing better learning-based models and pay less attention to the properties of datasets. Lea...

Bibliographic Details
Main Authors: ZHOU, Xin, KIM, Kisub, XU, Bowen, LIU, Jiakun, HAN, DongGyun, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
Online Access: https://ink.library.smu.edu.sg/sis_research/8568
https://ink.library.smu.edu.sg/context/sis_research/article/9571/viewcontent/The_devil_is_in_the_tails.pdf
Institution: Singapore Management University