Deployable RL Workshop @ RLC 2024

Announcements

We've posted the call for papers! The paper submission deadline is May 17, 2024 (extended from May 8 to accommodate RLC notifications).
We've posted a tentative schedule for the full-day event on August 9, 2024 in Amherst, MA.
We've posted the list of accepted papers!

Real-world applications pose distinct challenges for decision-making algorithms, especially during the deployment phase. These include high-dimensional observation and action spaces, tasks that may be partially observable or non-stationary, and feedback that is often unspecified, delayed, or corrupted. Furthermore, feedback rarely comes in the form of a scalar reward function, exploration is often prohibitively costly, and safety considerations are a prerequisite for trained models to be deployed.

Reinforcement learning (RL) and contextual bandit (CB) algorithms have been studied in many settings, including healthcare, recommender and advertising systems, resource allocation and operations management, hardware and engineering design, and more recently, large foundation models. Although many studies apply RL to these domains, most of the resulting solutions are never deployed. An important question for the RL community to ask is:

What pieces of the puzzle are we missing and what new methods are needed to push these ideas forward so that they become truly deployable?

This workshop aims to place a spotlight on this topic, with the goal of advancing RL and CB algorithms towards becoming a widespread industry standard. To this end, we invite contributions on theory and practice of RL aimed at facilitating deployment to real-world problems, examples of which include but are not limited to:

Methods to enable deployability in RL: offline RL, offline policy evaluation (OPE), offline to online RL, safe learning and exploration, preference learning, learning in partially observable settings, interpretable policies in high-stakes settings, and methods for improving deployment efficiency.
Applications in personalization and recommendation systems: applications of RL and CB methods in technology platforms that interact with humans (e.g., audio and video recommendations, online advertising, LLMs).
Applications in industrial automation: applications of RL and CB methods in the control of physical systems or allocation of resources (e.g., inventory management, ride-sharing, commercial cooling systems, data center congestion control).
Applications in decision systems for healthcare: applications of RL and CB methods in healthcare (e.g., managing chronic conditions, digital treatment recommendations, human-in-the-loop RL).

The above are only a handful of suitable topics. We welcome submissions on any topic that focuses on the RL deployment process. Submissions can be 4-8 pages long and may be theoretical, methodological, or empirical in nature.

Keynote Speakers

Hamsa Bastani
Wharton, University of Pennsylvania

Daniel Russo
Columbia Business School

Aviral Kumar
Google DeepMind, CMU

John Langford
Microsoft Research NYC

Hongseok Namkoong
Columbia Business School

Panel

Omer Gottesman
Amazon

Daniel Russo
Columbia Business School

Kaushik Subramanian
Sony AI

Cathy Wu
MIT

Zheqing (Bill) Zhu
Meta
Panel moderator

Schedule

The schedule below is tentative and subject to change.

AM
08:55 - 09:00	Opening remarks
09:00 - 09:40	Keynote: Daniel Russo
	Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective
09:40 - 10:20	Keynote: Hongseok Namkoong
	Adaptive Experimentation at Scale
10:20 - 11:20	Morning poster session + coffee break
11:20 - 12:00	Keynote: John Langford
	Contextual Bandit Systems

PM
12:00 - 01:00	Lunch break (on your own)
01:00 - 02:00	Panel: Omer Gottesman, Roberta Raileanu, Daniel Russo, Kaushik Subramanian, Cathy Wu, Zheqing (Bill) Zhu
	Challenges Surrounding RL Deployment
02:00 - 02:40	Afternoon poster session + coffee break
02:40 - 03:20	Keynote: Hamsa Bastani
	Lessons from Deploying Algorithms for Public Health
03:20 - 04:00	Keynote: Aviral Kumar
	Converting Foundation Models to Deployable Agents via Offline and Online RL
04:00 - 04:05	Concluding remarks

Organizers

Jalaj Bhandari

Deployable RL: From Research to Practice
