The field of reinforcement learning is shifting toward more efficient and scalable methods, with a focus on bilevel optimization and novel exploration strategies. Researchers are developing theoretical foundations, such as sample-complexity bounds, to narrow the gap between theory and practice, and new algorithms are being proposed to improve exploration efficiency, including generative models and hierarchical representations. These advances could strengthen reinforcement learning systems in complex environments and sparse-reward settings. Noteworthy papers include:
- On The Sample Complexity Bounds In Bilevel Reinforcement Learning, which presents the first sample complexity result for bilevel reinforcement learning.
- KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies, which introduces a novel approach to balancing exploration strategies.
- Adventurer: Exploration with BiGAN for Deep Reinforcement Learning, which proposes a novelty-driven exploration algorithm based on Bidirectional Generative Adversarial Networks.
- Synthesizing world models for bilevel planning, which introduces TheoryCoder, an instantiation of theory-based reinforcement learning that exploits hierarchical representations of theories.
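The novelty-driven exploration theme above (as in the Adventurer bullet) can be illustrated with a much simpler stand-in: instead of a full BiGAN, the sketch below uses a tiny linear autoencoder whose reconstruction error serves as an intrinsic novelty bonus. States seen often reconstruct well (low bonus); unfamiliar states reconstruct poorly (high bonus). All names here (`LinearAutoencoder`, `novelty_bonus`, the `beta` weight) are illustrative assumptions, not the papers' actual APIs.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearAutoencoder:
    """Tiny linear model standing in for a generative novelty detector:
    frequently visited states reconstruct well, so their error (novelty) is low."""

    def __init__(self, state_dim, latent_dim, lr=0.01):
        self.W = rng.normal(scale=0.1, size=(latent_dim, state_dim))
        self.lr = lr

    def reconstruction_error(self, s):
        z = self.W @ s              # encode state into latent space
        s_hat = self.W.T @ z        # decode back to state space
        return float(np.sum((s - s_hat) ** 2))

    def update(self, s):
        # One gradient step on L(W) = ||W^T W s - s||^2.
        z = self.W @ s
        e = self.W.T @ z - s
        grad = 2.0 * (np.outer(z, e) + np.outer(self.W @ e, s))
        self.W -= self.lr * grad

def novelty_bonus(model, s, beta=0.1):
    """Intrinsic reward added to the extrinsic reward: beta * novelty(s)."""
    return beta * model.reconstruction_error(s)

# Usage sketch: a state visited many times becomes familiar,
# so its bonus drops below that of an unseen state.
ae = LinearAutoencoder(state_dim=4, latent_dim=2)
s_familiar = np.array([1.0, 0.0, 0.0, 0.0])
s_novel = np.array([0.0, 0.0, 0.0, 1.0])
for _ in range(500):
    ae.update(s_familiar)
assert novelty_bonus(ae, s_familiar) < novelty_bonus(ae, s_novel)
```

In the actual Adventurer algorithm the density model is a BiGAN rather than a linear autoencoder, but the shaping pattern is the same: the agent optimizes extrinsic reward plus a bonus that decays as states become familiar, which keeps exploration alive in sparse-reward settings.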