Fair Average MDP

TLDR: We propose a new notion of fairness for resource allocation settings modeled as average-reward Markov decision processes (AMDPs), and give an algorithm that provably converges to the optimal policy under the fairness constraint.

Abstract: Fairness has emerged as an important concern in automated decision-making in recent years, especially when these decisions affect human welfare. In this work, we study fairness in temporally extended decision-making settings, specifically those formulated as Markov Decision Processes (MDPs). Our proposed notion of fairness ensures that each state's long-term visitation frequency is at least a specified fraction. This quota-based notion of fairness is natural in many resource-allocation settings where the dynamics of a single resource being allocated are governed by an MDP and the distribution of the shared resource is captured by its state-visitation frequency. In an average-reward MDP (AMDP) setting, we formulate the problem as a bilinear saddle point program and, for a generative model, solve it using a Stochastic Mirror Descent (SMD) based algorithm. The proposed solution guarantees a simultaneous approximation on the expected average reward and the fairness requirement. We give sample complexity bounds for the proposed algorithm and validate our theoretical results with experiments on simulated data.
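To make the quota-based constraint concrete, here is a minimal sketch that checks it for a fixed policy on a toy two-state MDP. It is not the SMD algorithm from the paper: it only computes the long-run state-visitation frequencies of a given policy by power iteration and compares them to per-state quotas. The names `stationary_distribution`, `is_fair`, `P`, `policy`, and `rho`, as well as the toy numbers, are illustrative assumptions, and the induced Markov chain is assumed to be ergodic.

```python
def stationary_distribution(P, policy, iters=10_000, tol=1e-12):
    """Approximate long-run state-visitation frequencies under `policy`.

    P[s][a][t] is the probability of moving from state s to state t under
    action a; policy[s][a] is the probability of taking action a in state s.
    Assumes the induced chain is ergodic so power iteration converges.
    """
    n = len(P)
    # Transition matrix of the Markov chain induced by the policy.
    P_pi = [[sum(policy[s][a] * P[s][a][t] for a in range(len(P[s])))
             for t in range(n)] for s in range(n)]
    mu = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(mu[s] * P_pi[s][t] for s in range(n)) for t in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, mu)) < tol:
            return nxt
        mu = nxt
    return mu


def is_fair(mu, rho):
    """True if every state's visitation frequency meets its quota rho[s]."""
    return all(m >= r for m, r in zip(mu, rho))


# Toy MDP (hypothetical numbers): action 0 tends to stay, action 1 tends to move.
P = [
    [[0.9, 0.1], [0.1, 0.9]],  # transitions from state 0
    [[0.1, 0.9], [0.9, 0.1]],  # transitions from state 1
]
# Deterministic policy that steers the process toward state 0.
policy = [[1.0, 0.0], [0.0, 1.0]]

mu = stationary_distribution(P, policy)
fair_loose = is_fair(mu, rho=[0.05, 0.05])  # quotas the policy satisfies
fair_tight = is_fair(mu, rho=[0.2, 0.2])    # quotas the policy violates
```

In this toy instance the policy spends most of its time in state 0, so it meets a 5% quota per state but fails a 20% quota per state. A fairness-constrained method would have to trade some reward for more visits to state 1.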

@inproceedings{ghalme2022long,
  title = {Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment},
  author = {Ghalme, Ganesh and Nair, Vineet and Patil, Vishakha and Zhou, Yilun},
  booktitle = {Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS)},
  year = {2022},
  month = {May}
}

Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment

Ganesh Ghalme, Technion Israel Institute of Technology
Vineet Nair, Google Research, India
Vishakha Patil, Indian Institute of Science, Bangalore
Yilun Zhou, Massachusetts Institute of Technology

Alphabetical author ordering

[Full Paper] [GitHub Repo]

AAMAS 2022
