Reducing estimation bias via triplet-average deep deterministic policy gradient

The overestimation caused by function approximation is a well-known property in Q-learning algorithms, especially in single-critic models, which leads to poor performance in practical tasks. However, the opposite property, underestimation, which often occurs in Q-learning methods with double critics...

Bibliographic Details
Main Authors:	WU, Dongming, DONG, Xingping, SHEN, Jianbing, HOI, Steven C. H.
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Averaging technology; deep reinforcement learning (DRL); estimation bias; triplet networks; Numerical Analysis and Scientific Computing; Software Engineering; Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/5920 https://ink.library.smu.edu.sg/context/sis_research/article/6923/viewcontent/tnnls19ReducingBias_av.pdf
Institution:	Singapore Management University

Similar Items

Learning multiple maps from conditional ordinal triplets
by: LE, Duy Dung, et al.
Published: (2019)

Context-Aware REpresentation: Jointly learning item features and selection from triplets
by: ALVES, Rodrigo, et al.
Published: (2024)

Resonantly Pumped Bright-Triplet Exciton Lasing in Cesium Lead Bromide Perovskites
by: Ying, Guanhua, et al.
Published: (2022)

Learning to solve 3-D bin packing problem via deep reinforcement learning and constraint programming
by: JIANG, Yuan, et al.
Published: (2023)

Inferring phylogenetic relationships avoiding forbidden rooted triplets
by: He, Y.-J., et al.
Published: (2013)

Inferring a level-1 phylogenetic network from a dense set of rooted triplets
by: Jansson, J., et al.
Published: (2013)

Robust estimating functions and bias correction for longitudinal data analysis
by: Wang, Y.-G., et al.
Published: (2014)

Moving average reversion strategy for on-line portfolio selection
by: LI, Bin, et al.
Published: (2015)

Algorithms for combining rooted triplets into a galled phylogenetic network
by: Jansson, J., et al.
Published: (2013)

Multi-zone thermal processing in semiconductor manufacturing: Bias estimation
by: Yan, H., et al.
Published: (2014)

Deep reinforcement learning for solving vehicle routing problems with backhauls
by: WANG, Conghui, et al.
Published: (2024)

Computing a smallest multilabeled phylogenetic tree from rooted triplets
by: Guillemot, S., et al.
Published: (2013)

Surgical activity triplet recognition via triplet disentanglement
by: CHEN, Yiliang, et al.
Published: (2023)

Computing the Rooted Triplet Distance Between Phylogenetic Networks
by: Jansson, Jesper, et al.
Published: (2022)

Triplet spike time dependent plasticity in a floating-gate synapse
by: Gopalakrishnan, Roshan, et al.
Published: (2016)

Deep reinforcement learning for solving the heterogeneous capacitated vehicle routing problem
by: LI, Jingwen, et al.
Published: (2021)

Estimation of population size from biased samples using non-parametric binary regression
by: Chen, S.X., et al.
Published: (2014)

Test-enhanced learning for pairs and triplets: When and why does transfer occur?
by: Rickard, Timothy C, et al.
Published: (2022)

On bias in the estimation of structural break points
by: JIANG, Liang, et al.
Published: (2014)

Biased Domination Games
by: Tharit Sereekiatdilok
Published: (2023)

Empirical evaluation of three common assumptions in building political media bias datasets
by: GANGULY, Soumen, et al.
Published: (2020)

Decision-aided carrier phase estimation with selective averaging for low-cost optical coherent communication
by: Huang, D., et al.
Published: (2014)

Equivariance and invariance inductive bias for learning from insufficient data
by: WANG, Tan, et al.
Published: (2022)

GENERALIZATION TECHNIQUES IN DEEP REINFORCEMENT LEARNING
by: MUHAMMAD RIZKI AULIA RAHMAN MAULANA
Published: (2023)

Higher Order Bias Correcting Moment Equation for M-Estimation and Its Higher Order Efficiency
by: KIM, Kyoo-il
Published: (2006)

Recent advances in attention bias modification for substance addictions
by: Zhang, Melvyn Weibin, et al.
Published: (2018)

Deep learning for person re-identification: A survey and outlook
by: YE, Mang, et al.
Published: (2022)

Aspect sentiment triplet extraction incorporating syntactic constituency parsing tree and commonsense knowledge graph
by: HU, Zhenda, et al.
Published: (2023)

A systematic review of attention biases in opioid, cannabis, stimulant use disorders
by: Fung, Daniel S. S., et al.
Published: (2018)

Parameter estimation and bias correction for diffusion processes
by: Tang, C.Y., et al.
Published: (2014)

HEURISTICS AND BIASES TO BEHAVIOURAL ECONOMICS: A SOCIOLOGY OF A PSYCHOLOGY OF ERROR
by: ZARA THOKOZANI KAMWENDO
Published: (2018)

Reinforcement learning based online request scheduling framework for workload-adaptive edge deep learning inference
by: TAN, Xinrui, et al.
Published: (2024)

Distributed Average Consensus based on Structural Weight-Balanceability
by: Haghighi, Reza, et al.
Published: (2016)

A moving average Cholesky factor model in covariance modelling for longitudinal data
by: Zhang, W., et al.
Published: (2014)

Beyond triplet loss: Person re-identification with fine-grained difference-aware pairwise loss
by: YAN, Cheng, et al.
Published: (2022)

Systematic error modeling and bias estimation
by: Zhang, F, et al.
Published: (2020)

MODEL AVERAGING FOR LONGITUDINAL COVARIANCE ESTIMATION AND BAYESIAN NONPARAMETRIC REGRESSION
by: WANG JINGLI
Published: (2019)

LEARNING SCENE HIERARCHY: FROM CATEGORY-LEVEL SIMILARITY TO ATTRIBUTE-LEVEL SIMILARITY
by: LENG YUSONG
Published: (2019)

Gamified cognitive bias modification interventions for psychiatric disorders : review
by: Zhang, Melvyn, et al.
Published: (2019)

Hedonic and non-hedonic bias toward the future
by: Greene, Preston, et al.
Published: (2021)