Can identifier splitting improve open-vocabulary language model of code?

Statistical language models on source code have successfully assisted software engineering tasks. However, developers can create or pick arbitrary identifiers when writing source code. Freely chosen identifiers lead to the notorious out-of-vocabulary (OOV) problem that negatively affects model perfo...

Bibliographic Details
Main Authors:	SHI, Jieke, YANG, Zhou, HE, Junda, XU, Bowen, LO, David
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Open vocabulary; Identifier splitting; Language model of code; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/7698 https://ink.library.smu.edu.sg/context/sis_research/article/8701/viewcontent/can_identifier.pdf
Institution:	Singapore Management University

Similar Items

Split group codes
by: Ding, C., et al.
Published: (2013)

Open-vocabulary object detection via debiased curriculum self-training
by: Zhang, Hanlue, et al.
Published: (2024)

Repeated Exposure to Vocabulary Websites and Academic Vocabulary Retention by University Students in Japan
by: Mirador, Josephine, et al.
Published: (2022)

Greening large language models of code
by: SHI, Jieke, et al.
Published: (2024)

On the feasibility of detecting cross-platform code clones via identifier similarity
by: CHENG, Xiao, et al.
Published: (2016)

Relationships between vocabulary learning strategies and vocabulary knowledge and reading comprehension
by: Sukanlaya Thavornpon
Published: (2013)

Ecological Approach to Vocabulary Density Estimation
by: Juan Carlos Olmos Alcoy
Published: (2022)

The validity of and the test anxiety produced by rational cloze test as a test of vocabulary
by: Nurhuda Benjama
Published: (2017)

Open-vocabulary video anomaly detection
by: WU, Peng, et al.
Published: (2024)

Simple image-level classification improves open-vocabulary object detection
by: FANG, Ruohuan, et al.
Published: (2024)

Monitoring and identifying the sources of Giardia and Cryptosporidium contamination in the Bangkok Chao Phraya River.
by: Yaowalark Sukthana
Published: (2015)

The Effect of image creation interventions on the vocabulary achievement of elementary public school students
by: MARICAR, BLASQUILLO
Published: (2015)

Korean Vocabulary Learning Strategies of University Students in Thailand
by: Yurim Lee, et al.
Published: (2020)

Duadic codes over Z2k
by: Ling, S., et al.
Published: (2014)

Duadic codes over F2 + uF2
by: Ling, S., et al.
Published: (2014)

A Vocabulary of Philippine Food and Well-being
by: Sta. Maria, Felice Prudente
Published: (2024)

INFLUENCE OF MATERNAL BEHAVIOURS DURING JOINT ATTENTION AT 6 MONTHS ON VOCABULARY AT 18 MONTHS
by: FU HUIYUN, ERIN
Published: (2013)

Learning transferable negative prompts for out-of-distribution detection
by: LI, Tianqi, et al.
Published: (2024)

The Relationship between Vocabulary Learning Strategies and Vocabulary Knowledge of Thai Undergraduate Students
by: Tinutda Komol
Published: (2016)

Stealthy backdoor attack for code models
by: YANG, Zhou, et al.
Published: (2024)

Alternate form reliability of the PPVT-III in 100 ESL university students
by: Hilton, Laurence M.
Published: (2014)

The Effects of Using WordSift in English Vocabulary Teaching on Student Vocabulary Retention and Depth at a Thai Secondary School
by: Supatida Dumchoo
Published: (2019)

Application of etymology-visualization techniques to teaching financial English vocabulary: A Korean experience
by: Ri, Chung-Sim, et al.
Published: (2019)

FRESHMEN’S LEARNER AUTONOMY IN VOCABULARY LEARNING AT FELTE – ULIS - VNU
by: Đỗ, Thị Thanh Dung
Published: (2021)

Unveiling memorization in code models
by: YANG, Zhou, et al.
Published: (2024)

AN INVESTIGATION INTO YOUNG LEARNERS’ PREFERENCES FOR COMMON VOCABULARY ACTIVITIES AT BRITISH COUNCIL
by: Dư, Thị Huyền Trang
Published: (2021)

Vocabulary size and vocabulary learning strategies of Thai university students
by: Supika Nirattisai
Published: (2015)

ตำรา ศัพท์สูติ-นรีเวช (Vocabulary of Obstetrics & Gynaecology)
by: สุนีย์ สุนทรมีเสถียร
Published: (2020)

A comparative study on long-term vocabulary recall: word lists and a vocabulary trainer.
by: Krey, Thomas
Published: (2015)

LEARNER AUTONOMY IN VOCABULARY LEARNING - A STUDY ON STUDENTS OF THE INTERNATIONAL STANDARD PROGRAM, VNU
by: Trần, Thị Ngân
Published: (2021)

System for learning vocabulary and spelling through humor
by: Aban, Vic Roland L., et al.
Published: (2010)

Emergent semantic segmentation: training-free dense-label-free extraction from vision-language models
by: Luo, Jiayun
Published: (2024)

Polyadic codes revisited
by: Ling, S., et al.
Published: (2014)

Speech Perception, Metalinguistic Awareness, Reading, and Vocabulary in Chinese-English Bilingual Children
by: Cheung, H., et al.
Published: (2014)

An efficient algorithm to test the observability of rational nonlinear systems with unmeasured inputs
by: Shi, Xiaodong, et al.
Published: (2022)

A Comparative study of passive and active vocabulary knowledge of Prince of Songkla University and South China Agricultural University EFL learners
by: Zhong, Zhiying
Published: (2005)

Augmented reality as multimedia: the case for situated vocabulary learning
by: Santos, Marc Ericson C, et al.
Published: (2016)

A note on asymptotic splitting and its applications
by: Sheng, Q., et al.
Published: (2014)

ANALYSE ACOUSTIQUE DE SONS BIEN IDENTIFIÉS PAR UN SYSTEME DE RECONNAISSANCE AUTOMATIQUE DE LA PAROLE
by: BONNEAU, Anne, et al.
Published: (2015)

Identifying Code Reading Strategies in Debugging using STA with a Tolerance Algorithm
by: Tablatin, Christine Lourrine S, et al.
Published: (2022)