Balancing Welfare and Stability in Decision-Making Systems

Research in decision-making systems, particularly matching markets and facility location problems, is shifting toward more nuanced and adaptable approaches. There is growing emphasis on balancing utilitarian and Rawlsian welfare metrics while preserving market stability, and on maximizing social welfare in resource-scarce settings. In contextual bandits, recent work tackles non-uniform exploration and supervised learning, especially in dynamic environments with delayed feedback. The field is also advancing combinatorial online learning, with a focus on rising rewards and efficient algorithms that minimize policy regret. Finally, the study of submodular maximization under constant recourse constraints is revealing new trade-offs between consistency and approximation ratios.
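The utilitarian/Rawlsian distinction mentioned above boils down to two objectives: utilitarian welfare sums agents' utilities, while Rawlsian (max-min) welfare scores an outcome by its worst-off agent. A minimal toy illustration (the matchings and utility values are hypothetical, not from any cited paper):

```python
def utilitarian_welfare(utilities):
    """Utilitarian welfare: total utility across all agents."""
    return sum(utilities)

def rawlsian_welfare(utilities):
    """Rawlsian (max-min) welfare: utility of the worst-off agent."""
    return min(utilities)

# Two hypothetical matchings, each listing per-agent utilities.
matching_a = [9, 8, 1]   # higher total, but one agent is poorly served
matching_b = [6, 6, 5]   # lower total, more equitable

print(utilitarian_welfare(matching_a), rawlsian_welfare(matching_a))  # 18 1
print(utilitarian_welfare(matching_b), rawlsian_welfare(matching_b))  # 17 5
```

The two objectives can disagree on which matching is better, which is why balancing them while keeping the market stable is a nontrivial design problem.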

Noteworthy papers include one that introduces a welfarist approach to matching markets, balancing utilitarian and Rawlsian welfare while maintaining market stability, and another that connects Optimal Transport theory to facility location problems, identifying optimal mechanisms for maximizing social welfare. A third paper stands out for its analysis of regression oracles in contextual bandits, highlighting the need for adaptable algorithms to avoid instability in policy performance over time.
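The regression-oracle pattern referenced above can be sketched generically: a per-arm supervised model estimates expected reward from context, and epsilon-greedy exploration keeps those estimates from going stale as the environment drifts. This is a hedged, simplified sketch (a standard design, not the exact system described in the Adyen paper); the class and parameter names are illustrative.

```python
import random

class OnlineLinearOracle:
    """Minimal regression oracle: online least squares fitted by SGD."""
    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, reward):
        # Gradient step on squared error (prediction - reward)^2.
        err = self.predict(x) - reward
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]

class EpsilonGreedyBandit:
    """Contextual bandit that delegates reward estimation to oracles."""
    def __init__(self, n_arms, dim, epsilon=0.1):
        self.oracles = [OnlineLinearOracle(dim) for _ in range(n_arms)]
        self.epsilon = epsilon

    def choose(self, x):
        if random.random() < self.epsilon:              # explore uniformly
            return random.randrange(len(self.oracles))
        preds = [o.predict(x) for o in self.oracles]    # exploit best estimate
        return preds.index(max(preds))

    def learn(self, arm, x, reward):
        self.oracles[arm].update(x, reward)

# Usage on synthetic data: arm 1 pays off when the first feature is large.
random.seed(0)
bandit = EpsilonGreedyBandit(n_arms=2, dim=2)
for _ in range(2000):
    x = [random.random(), 1.0]                # context with a bias feature
    arm = bandit.choose(x)
    true_reward = x[0] if arm == 1 else 0.3
    bandit.learn(arm, x, true_reward + random.gauss(0, 0.05))
```

Without continued exploration, the oracles only see data from the arms the current policy favors, which is exactly the feedback loop that can destabilize policy performance over time.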

Sources

Bandit Learning in Matching Markets: Utilitarian and Rawlsian Perspectives

Designing Optimal Mechanisms to Locate Facilities with Insufficient Capacity for Bayesian Agents

Contextual Bandits in Payment Processing: Non-uniform Exploration and Supervised Learning at Adyen

Combinatorial Rising Bandit

The Cost of Consistency: Submodular Maximization with Constant Recourse
