Conversational AI research is moving toward more robust and adaptable agents that can interact effectively with users and carry out complex tasks. Recent work has focused on improving evaluation methods for large language model-based agents, with an emphasis on taxonomies that capture the key components of conversational agents and the dimensions along which they are evaluated. There is also growing interest in decoupling in-context learning from memorization to overcome the limitations of traditional single-agent systems. Other work develops agents that handle ambiguous GUI navigation tasks by asking follow-up questions, as well as compositional frameworks that delegate cognitive responsibilities across generalist and specialist models. Noteworthy papers include:
- Factored Agents: Decoupling In-Context Learning and Memorization for Robust Tool Use, which proposes a novel factored agent architecture that improves planning accuracy and error resilience.
- Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents, which delegates cognitive responsibilities across generalist and specialist models and establishes new state-of-the-art performance on three prominent computer use benchmarks.
- Navi-plus: Managing Ambiguous GUI Navigation Tasks with Follow-up, which develops a new dataset and evaluation method to benchmark GUI agents with follow-up question capabilities.