Advances in Algorithmic Decision Making

The field of algorithmic decision making is moving towards more adaptive and responsive strategies that integrate multiple sources of feedback and learn from dynamic environments. Recent research has explored generative models, reinforcement learning, and bandit algorithms for optimizing decisions in complex systems. Notable proposals for handling non-stationarity and uncertainty in these systems include natural policy gradient methods and value-guided exploration. These advances could improve the efficiency and effectiveness of decision making in applications ranging from advertising and pricing to supply chain management.

Three papers in this area are particularly noteworthy. Generative Auto-Bidding with Value-Guided Explorations introduces a framework for offline auto-bidding that accommodates various advertising objectives and integrates an action exploration mechanism with an RTG-based evaluation method. Fusing Reward and Dueling Feedback in Stochastic Bandits proposes two fusion approaches for combining absolute and relative feedback in stochastic bandits, achieving regret that matches the lower bound up to a constant under a common assumption. Natural Policy Gradient for Average Reward Non-Stationary RL proposes and analyzes the first model-free policy-based algorithm for non-stationary reinforcement learning in the infinite-horizon average-reward setting.
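As a rough illustration of the quantity behind an "RTG-based" evaluation, the sketch below assumes that RTG stands for return-to-go, the sum of rewards from a given step to the end of an episode, as it commonly does in sequence-model approaches to offline reinforcement learning. The function name `returns_to_go`, the optional discount `gamma`, and the sample rewards are hypothetical and are not taken from the auto-bidding paper, whose evaluation method may differ.

```python
from typing import List, Sequence


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go at each step t: rewards[t] + gamma * rewards[t+1] + ...

    Assumption: "RTG" means return-to-go. This is a generic sketch, not the
    evaluation method of any specific paper.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already accumulated
    # for the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


# Hypothetical per-step rewards from one logged bidding episode.
episode_rewards = [1.0, 2.0, 3.0]
episode_rtg = returns_to_go(episode_rewards)
```

With the undiscounted default, the first entry of `episode_rtg` is the total reward of the episode and the last entry equals the final reward, which is what makes the quantity usable as a score for comparing candidate actions at a given step.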

