The field of explainable AI is moving toward more transparent and trustworthy models. Recent research has focused on models that provide interpretable, uncertainty-aware reasoning, enabling more reliable and effective collaboration between humans and AI systems. Innovations in this area include compositional and probabilistic reasoning systems, as well as methods for generating natural language explanations of agent behavior. There is also a growing emphasis on human-AI interaction and on treating explainability as a bidirectional process. Notable papers in this area include:
- Bonsai, which introduces a tunable reasoning system that generates adaptable inference trees and handles varied domains reliably.
- Model-Agnostic Policy Explanations with Large Language Models, which proposes a method for generating natural language explanations of agent behavior without access to the agent's underlying model (a rough sketch of this idea appears after the list).
- Interactive Explanations for Reinforcement-Learning Agents, which presents an interactive explanation system that lets users query the agent's behavior and identify faulty actions (see the second sketch below).
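To make the model-agnostic idea concrete, here is a minimal sketch of one way such an approach could work: log a black-box agent's (state, action) trajectories and ask a language model to summarize the observed behavior. The `query_llm` callable, the prompt wording, and the trajectory format are illustrative assumptions, not the paper's actual method.

```python
from typing import Callable, List, Tuple

def explain_policy(
    trajectories: List[List[Tuple[str, str]]],
    query_llm: Callable[[str], str],
) -> str:
    """Summarize a black-box agent's behavior from observed (state, action) pairs.

    No access to the agent's internals is required: the explanation is
    generated purely from logged interactions.
    """
    # Format the observed behavior as plain text the language model can read.
    lines = []
    for i, traj in enumerate(trajectories):
        steps = "; ".join(f"in {state} the agent chose {action}" for state, action in traj)
        lines.append(f"Episode {i + 1}: {steps}")
    observations = "\n".join(lines)

    prompt = (
        "Below are logged observations of an agent acting in an environment.\n"
        f"{observations}\n\n"
        "In two or three sentences, describe the strategy the agent appears "
        "to follow and note any situations where its choices look inconsistent."
    )
    return query_llm(prompt)

# Usage with a stand-in for a real LLM call:
if __name__ == "__main__":
    logged = [[("a low-battery state", "return to charger"),
               ("a clear corridor", "move forward")]]
    print(explain_policy(logged, query_llm=lambda p: "(LLM summary would appear here)"))
```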
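And a second sketch, loosely in the spirit of interactive explanation for RL agents: a query loop where a user probes specific states and the system reports the agent's chosen action alongside its estimated action values, which helps surface faulty actions. The toy Q-table, state names, and dialogue format are assumptions for illustration only.

```python
from typing import Dict

# Toy action-value table standing in for a trained agent's Q-function;
# the states, actions, and values are illustrative only.
Q_TABLE: Dict[str, Dict[str, float]] = {
    "intersection": {"stop": 0.9, "accelerate": 0.2},
    "open_road": {"stop": 0.1, "accelerate": 0.8},
}

def answer_query(state: str) -> str:
    """Answer 'what would the agent do in this state, and how strongly?'"""
    if state not in Q_TABLE:
        return f"No data recorded for state '{state}'."
    values = Q_TABLE[state]
    chosen = max(values, key=values.get)
    ranked = ", ".join(f"{a}={v:.2f}" for a, v in sorted(values.items(), key=lambda kv: -kv[1]))
    return f"In '{state}' the agent would choose '{chosen}' (estimated values: {ranked})."

if __name__ == "__main__":
    # Interactive loop: the user probes states to spot faulty actions.
    while True:
        state = input("Query a state (blank to quit): ").strip()
        if not state:
            break
        print(answer_query(state))
```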