Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning

We present a study on reinforcement learning (RL) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT). We investigate the reliability of human bandit feedback, and analyze the influence of reliability on the learnability of...

Bibliographic Details
Main Authors:	Kreutzer, Julia, Uyheng, Joshua, Riezler, Stefan
Format:	text
Published:	Archīum Ateneo 2018
Subjects:	Cognitive Psychology; Psychology
Online Access:	https://archium.ateneo.edu/psychology-faculty-pubs/357 https://aclanthology.org/P18-1165/
Institution:	Ateneo De Manila University

Similar Items

Is there evidence for cross-domain congruency sequence effect? A replication of Kan et al. (2013)
by: ACZEL, Balazs, et al.
Published: (2021)

Leadership and Human Values
by: DEKLE, Dawn Jeanine
Published: (2003)

Beyond the Special Case: Applying Neural Theories of Consciousness to Non-Human Animals
by: FARBER, Ilya
Published: (2007)

Variational learning from implicit bandit feedback
by: TRUONG, Quoc Tuan, et al.
Published: (2021)

Cognitive Polyphasia in a Global South Populist Democracy: Mapping Social Representations of Duterte's Regime in the Philippines
by: Montiel, Cristina Jayme, et al.
Published: (2020)

What is an academic emotion? Insights from Filipino bilingual students’ emotion words associated with learning
by: Bernardo, Allan Benedict I., et al.
Published: (2009)

The good, the bad and the ugly: Internet use, outcomes and the role of regulation in the Philippines
by: Hechanova, Ma. Regina, et al.
Published: (2017)

Exploring connections between teacher interpersonal behaviour, student motivation and competency level in competence-based learning environments
by: Misbah, Zainun, et al.
Published: (2021)

Human Mental Model of Humanoid Robot
by: LAU, Ivy Yee-Man
Published: (2005)

Keberhasilan Belajar Mahasiswa Ditinjau dari Style of Learning
by: Rustam, Amrizal, et al.
Published: (1992)

Are Happy People More Prone to Have Memory Errors?: Effects of Positive Affect on False Memory
by: YANG, Hwajin, et al.
Published: (2008)

The Happier, the More Controlled: Effects of Positive Mood on Working Memory
by: YANG, Hwajin, et al.
Published: (2008)

Easy to Remember and Hard to Forget: Effects of Emotional Prosody on Suppressed Verbal Memory
by: YANG, Hwajin, et al.
Published: (2009)

Differential Effects of Mood on Risky Choice Versus Advice
by: YANG, Hwajin, et al.
Published: (2009)

An Inferential Model of Choice Blindness
by: FARBER, Ilya
Published: (2008)

Socio-cognitive Correlates of Biculturalism
by: CHENG, Chi-Ying, et al.
Published: (2012)

Using Comparative Neuroanatomy to Assess the Capacity of Nonhuman Animals for Morally Relevant Forms of Suffering
by: FARBER, Ilya
Published: (2006)

An Alternative Framework for Understanding Change Blindness
by: FARBER, Ilya
Published: (2006)

Viewing Composite Sketches: Showups and Lineups Compared
by: DEKLE, Dawn Jeanine
Published: (2005)

Effects of Emotional Prosody on Verbal Memory
by: YANG, Hwajin, et al.
Published: (2009)

When Persuading Others Becomes Persuading Oneself
by: LAU, Ivy Yee-Man
Published: (2007)

Predicting on-Line and Memory-Based Judgments: The Role of Mental Models
by: TONG, Yuk-Yue
Published: (2007)

Interference between verbal concept formation and spatial mental rotation in female subjects
by: MAKANY, Tamas, et al.
Published: (2002)

Partner's understanding of affective-cognitive meta-bases predicts relationship quality
by: TAN, Kenneth, et al.
Published: (2015)

Cognitive control adjustments are dependent on the level of conflict: A replication of Zhang et al. (2021).
by: BOGNAR, M., et al.
Published: (2024)

Reward sensitivity, impulse control, and social cognition as mediators of the link between childhood family adversity and externalizing behavior in eight countries
by: Lansford, Jennifer E, et al.
Published: (2017)

Methodological Issues in Research on Cognition and Culture
by: CHAN, David
Published: (2006)

Pengaruh pelatihan imajeri dan penalaran terhadap kreativitas menurut perspektif perbedaan individu
by: SUHARNAN, SUHARNAN
Published: (1998)

PENGARUH ERGONOMI PARTISIPATORI DALAM MEREDUKSI TINGKAT AGRESIVITAS DI TEMPAT KERJA
by: Harsani, Intaglia
Published: (2013)

Reliabilitas Tes Wartegg
by: Suhapti, Retno
Published: (1989)

FUNGSI MIXED METHODS DALAM PENELITIAN PSIKOLOGI KELUARGA
by: Afiatin, Tina
Published: (2013)

FAKTOR-FAKTOR YANG MEMENGARUHI KEPERCAYAAN EPISTEMOLOGIS MAHASISWA
by: Ghufron, M. Nur
Published: (2012)

Laporan Penelitian Penyusunan Skala Kreativitas
by: Sukarti, Sukarti
Published: (1989)

Laporan Penelitian Hubungan antara 'self-esteem' dengan 'stress' dalam Kerja pada Karyawan PEMDA Tingkat II Kabupaten Sleman Yogyakarta
by: Sanmustari, Rasimin Budjang, et al.
Published: (1992)

Melenyapkan Kecemasan
by: Basri, Hasan
Published: (1976)

Laporan Penelitian Validitas dan Reliabilitas Tes Dasar Pengertian Mekanik
by: Purnamaningsih, Esti Hayu
Published: (1991)

Laporan Penelitian Pembakuan Norma Tes Kraepelin untuk Siswa-siswa Lulusan SMA di Yogyakarta
by: Atamimi, Nuryati
Published: (1983)

Tinjauan Statistik pada Nilai-Nilai Ipsative
by: Wirawan, Yapsir G.
Published: (1979)

Kemandulan di Indonesia Variasi dari Daerah ke Daerah
by: Hull, Terry, et al.
Published: (1978)

Tanggapan Remaja Mengenai Diri dan Kehidupannya
by: Meichati, Siti
Published: (1979)