Reinforcement Learning Conference (RLC) 2024, August 9
Amherst, MA, United States
Contact us: deployable.rl@gmail.com
Real-world applications pose distinct challenges for decision-making algorithms, especially during the deployment phase. These include high-dimensional observation and action spaces, tasks that may be partially observable or non-stationary, and feedback that is often unspecified, delayed, or corrupted. Furthermore, feedback rarely comes in the form of a scalar reward function, exploration is often prohibitively costly, and safety considerations are a prerequisite for trained models to be deployed.
Reinforcement learning (RL) and contextual bandit (CB) algorithms have been studied in many settings, including healthcare, recommender and advertising systems, resource allocation and operations management, hardware and engineering design, and, more recently, large foundation models. Although many studies apply RL to such domains, most of the resulting solutions are never deployed. An important question for the RL community to ask is:
What pieces of the puzzle are we missing and what new methods are needed to push these ideas forward so that they become truly deployable?
This workshop aims to place a spotlight on this topic, with the goal of advancing RL and CB algorithms towards becoming a widespread industry standard. To this end, we invite contributions on theory and practice of RL aimed at facilitating deployment to real-world problems, examples of which include but are not limited to:
The above are only a handful of suitable topics. We welcome submissions on any topic that focuses on the RL deployment process. Submissions may be 4-8 pages long and may be either theoretical/methodological or empirical in nature.
Hamsa Bastani
Wharton, University of Pennsylvania
Daniel Russo
Columbia Business School
Aviral Kumar
Google DeepMind, CMU
John Langford
Microsoft Research NYC
Hongseok Namkoong
Columbia Business School
Omer Gottsman
Amazon
Daniel Russo
Columbia Business School
Kaushik Subramanian
Sony AI
Cathy Wu
MIT
Zheqing (Bill) Zhu
Meta
Panel moderator
The schedule below is tentative and subject to change.
AM | |
08:55 - 09:00 | Opening remarks |
09:00 - 09:40 | Keynote: Daniel Russo |
 |     Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective |
09:40 - 10:20 | Keynote: Hongseok Namkoong |
 |     Adaptive Experimentation at Scale |
10:20 - 11:20 | Morning poster session + coffee break |
11:20 - 12:00 | Keynote: John Langford |
 |     Contextual Bandit Systems |
PM | |
12:00 - 01:00 | Lunch break (on your own) |
01:00 - 02:00 | Panel: Omer Gottsman, Roberta Raileanu, Daniel Russo, Kaushik Subramanian, Cathy Wu, Zheqing (Bill) Zhu |
 |     Challenges Surrounding RL Deployment |
02:00 - 02:40 | Afternoon poster session + coffee break |
02:40 - 03:20 | Keynote: Hamsa Bastani |
 |     Lessons from Deploying Algorithms for Public Health |
03:20 - 04:00 | Keynote: Aviral Kumar |
 |     Converting Foundation Models to Deployable Agents via Offline and Online RL |
04:00 - 04:05 | Concluding remarks |
Meta
Yonathan Efroni
Meta
Mohammad Ghavamzadeh
Amazon
Daniel R. Jiang
Meta
Aldo Pacchiano
Boston University
Yi Wan
Meta
Kelly W. Zhang
Columbia Business School
Angela Zhou
Marshall School of Business, USC