Wenjia Meng
Verified email at sdu.edu.cn
Title
Cited by
Year
Two-bit networks for deep learning on resource-constrained embedded devices
W Meng, Z Gu, M Zhang, Z Wu
arXiv preprint arXiv:1701.00485, 2017
Cited by 42, 2017
An off-policy trust region policy optimization method with monotonic improvement guarantee for deep reinforcement learning
W Meng, Q Zheng, Y Shi, G Pan
IEEE Transactions on Neural Networks and Learning Systems 33 (5), 2223-2235, 2021
Cited by 37, 2021
Qualitative measurements of policy discrepancy for return-based deep q-network
W Meng, Q Zheng, L Yang, P Li, G Pan
IEEE Transactions on Neural Networks and Learning Systems 31 (10), 4374-4380, 2019
Cited by 29, 2019
A unified approach for multi-step temporal-difference learning with eligibility traces in reinforcement learning
L Yang, M Shi, Q Zheng, W Meng, G Pan
arXiv preprint arXiv:1802.03171, 2018
Cited by 22, 2018
Off-policy proximal policy optimization
W Meng, Q Zheng, G Pan, Y Yin
Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9162-9170, 2023
Cited by 1, 2023
Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline
W Meng, Q Zheng, L Yang, Y Yin, G Pan
arXiv preprint arXiv:2405.02572, 2024
2024
Articles 1–6