
HONG KONG BAPTIST UNIVERSITY
FACULTY OF SCIENCE

Department of Computer Science Seminar
2020 Series

Improving Sample Efficiency of Online Temporal Difference Learning

Mr. Yangchen Pan
PhD Candidate
Department of Computer Science
University of Alberta (Canada)

Date: October 15, 2020 (Thursday)
Time: 11:15 am - 12:15 pm
Venue: Zoom ID: 957 9692 7070
(The password and direct link will only be provided to registrants)

Registration: https://bit.ly/sem-zm
(Deadline: 2:00 pm, 14 October 2020)

Abstract

Reinforcement Learning (RL) has achieved several remarkable successes in recent years, such as playing Atari games at a human level, power station control, and financial portfolio management. However, RL is still far from realizing its full potential. One of the most important scientific hurdles is that RL algorithms suffer from low sample efficiency: an RL agent typically needs many physical interactions with the real world to learn a reasonably good policy, and such interactions are typically quite expensive. I will introduce my efforts to improve the sample efficiency of online RL algorithms for both policy evaluation and control problems. These efforts follow four directions: 1) bringing preconditioning acceleration techniques to policy evaluation algorithms in a linear function approximation setting; 2) investigating efficient sampling distributions for model-based control problems; 3) designing a special regularization method to leverage the intrinsic structure of a PDE control problem; 4) designing an efficient, scalable activation function for sparse representation learning, applicable to a broad class of deep RL algorithms. All of the methods I have developed are supported by strong empirical evidence.

Biography

Yangchen Pan is currently a PhD candidate at the University of Alberta. He is co-supervised by Dr. Martha White from the University of Alberta and Dr. Amir-massoud from the University of Toronto. His primary research interest is reinforcement learning, and his long-term goal is to develop RL agents that interactively learn from data to solve complex real-world tasks. During his PhD program, he has worked on fundamental research in reinforcement learning, including policy evaluation problems, model-based reinforcement learning control problems, sparse representation learning methods, and extremely high-dimensional continuous control problems. He has published refereed papers at several well-known conferences, including ICML, ICLR, NeurIPS, IJCAI, AAAI, and UAI. He also serves as a committee member/reviewer for those conferences and for the Journal of Machine Learning Research (JMLR).

********* ALL INTERESTED ARE WELCOME ***********
(For enquiries, please contact the Computer Science Department at 3411 2385)

http://www.comp.hkbu.edu.hk/v1/?page=seminars&id=564