Tengyang Xie
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
T Xie, Y Ma, YX Wang
Advances in Neural Information Processing Systems, 9665-9675, 2019
Provably efficient Q-learning with low switching cost
Y Bai, T Xie, N Jiang, YX Wang
arXiv preprint arXiv:1905.12849, 2019
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison
T Xie, N Jiang
Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence …, 2020
Batch value-function approximation with only realizability
T Xie, N Jiang
International Conference on Machine Learning, 11404-11413, 2021
A block coordinate ascent algorithm for mean-variance optimization
T Xie, B Liu, Y Xu, M Ghavamzadeh, Y Chow, D Lyu, D Yoon
Advances in Neural Information Processing Systems 31, 1065-1075, 2018
A variant of the Wang-Foster-Kakade lower bound for the discounted setting
P Amortila, N Jiang, T Xie
arXiv preprint arXiv:2011.01075, 2020
Finite sample analysis of minimax offline reinforcement learning: Completeness, fast rates and first-order efficiency
M Uehara, M Imaizumi, N Jiang, N Kallus, W Sun, T Xie
arXiv preprint arXiv:2102.02981, 2021
Marginalized Off-Policy Evaluation for Reinforcement Learning
T Xie, YX Wang, Y Ma
NeurIPS 2018 Workshop on Causal Learning, 2018
Bellman-consistent Pessimism for Offline Reinforcement Learning
T Xie, CA Cheng, N Jiang, P Mineiro, A Agarwal
arXiv preprint arXiv:2106.06926, 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
T Xie, N Jiang, H Wang, C Xiong, Y Bai
arXiv preprint arXiv:2106.04895, 2021
Interaction-Grounded Learning
T Xie, J Langford, P Mineiro, I Momennejad
arXiv preprint arXiv:2106.04887, 2021
Privacy Preserving Off-Policy Evaluation
T Xie, PS Thomas, G Miklau
arXiv preprint arXiv:1902.00174, 2019
