Advancements in Autonomous Driving: Integrating VLMs and LLMs with RL

The field of autonomous driving is increasingly integrating vision-language models (VLMs) and large language models (LLMs) with reinforcement learning (RL) to improve decision-making and safety. This trend addresses the limitations of traditional RL approaches, which often rely on manually engineered rewards and generalize poorly. By incorporating VLMs and LLMs, researchers can generate more nuanced, semantically rich reward signals, align autonomous vehicle (AV) decisions with human preferences, and bring human input directly into the driving process. This integration improves the adaptability and efficiency of AVs in complex driving scenarios while also enhancing the interpretability and safety of autonomous driving systems. In parallel, work on behavior-based neural networks and on user-friendly environment descriptions for RL is contributing to more robust and accessible autonomous driving technologies. Together, these advancements signal a shift towards more intelligent, adaptable, and human-aligned autonomous driving systems.
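
To make the reward-generation idea concrete, here is a minimal sketch that uses CLIP image-text similarity as a dense semantic reward term inside a standard RL step. It is not the exact formulation of VLM-RL or CLIP-RLDrive; the model checkpoint, prompts, and weighting coefficient are illustrative assumptions.

```python
# Minimal sketch: CLIP-based semantic reward shaping for a driving RL agent.
# The checkpoint, prompts, and blending weight below are assumptions, not the
# cited papers' settings.
import torch
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Language descriptions of desired vs. undesired driving behavior (assumed prompts).
goal_prompts = [
    "the car drives smoothly in its lane at a safe speed",
    "the car is about to collide with another vehicle",
]

@torch.no_grad()
def semantic_reward(camera_frame) -> float:
    """Return a shaped reward in [-1, 1] from CLIP similarity between the
    current camera frame (PIL image or numpy array) and the behavior prompts."""
    inputs = processor(text=goal_prompts, images=camera_frame,
                       return_tensors="pt", padding=True).to(device)
    out = clip(**inputs)
    probs = out.logits_per_image.softmax(dim=-1)[0]  # P(goal) vs. P(failure)
    return (probs[0] - probs[1]).item()

# Inside an ordinary RL loop, the semantic term is blended with the simulator reward:
# total_reward = env_reward + 0.1 * semantic_reward(obs["camera"])  # 0.1 is an assumed weight
```

Taking the softmax over one positive and one negative prompt keeps the shaping signal bounded, so it can be added to the simulator reward without destabilizing the underlying RL algorithm.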

Noteworthy Papers

  • VLM-RL: Introduces a framework combining VLMs with RL for semantic reward generation, significantly reducing collision rates and improving route completion.
  • CLIP-RLDrive: Utilizes CLIP-based reward shaping to align AV decisions with human preferences, enhancing decision-making in complex scenarios.
  • Autoware.Flex: Proposes a system that integrates human instructions into the autonomous driving system (ADS), improving the appropriateness and safety of its decisions.
  • Large Language Model guided Deep Reinforcement Learning: Presents a framework in which an LLM guides DRL, improving learning efficiency and decision-making performance (see the sketch below).
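
The sketch below shows one way an LLM's high-level suggestion can be folded into a DRL agent's action selection as a soft prior. The query_llm stub, action set, and guidance weight are hypothetical placeholders, not the cited paper's method.

```python
# Minimal sketch: LLM guidance as a log-prior over discrete driving actions.
# `query_llm`, the action set, and the guidance strength are assumptions.
import numpy as np

ACTIONS = ["keep_lane", "change_left", "change_right", "slow_down"]

def query_llm(scene_description: str) -> str:
    """Hypothetical LLM call: replace with a real API client.
    Expected to return one of the action names in ACTIONS."""
    return "slow_down"  # placeholder response

def llm_guidance_prior(scene_description: str, strength: float = 2.0) -> np.ndarray:
    """Convert the LLM's suggested maneuver into a log-prior over actions."""
    suggestion = query_llm(scene_description)
    prior = np.zeros(len(ACTIONS))
    if suggestion in ACTIONS:
        prior[ACTIONS.index(suggestion)] += strength
    return prior

def select_action(policy_logits: np.ndarray, scene_description: str) -> int:
    """Bias the DRL policy's logits with the LLM prior and sample an action."""
    guided = policy_logits + llm_guidance_prior(scene_description)
    probs = np.exp(guided - guided.max())
    probs /= probs.sum()
    return int(np.random.choice(len(ACTIONS), p=probs))

# Example: the policy slightly prefers keep_lane, but the LLM's suggestion to
# slow down (e.g., a pedestrian near a crosswalk) shifts the sampled choice.
logits = np.array([1.0, 0.2, 0.1, 0.8])
action = select_action(logits, "a pedestrian is waiting at the crosswalk ahead")
print(ACTIONS[action])
```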

Sources

VLM-RL: A Unified Vision Language Models and Reinforcement Learning Framework for Safe Autonomous Driving

CLIP-RLDrive: Human-Aligned Autonomous Driving via CLIP-Based Reward Shaping in Reinforcement Learning

Autoware.Flex: Human-Instructed Dynamically Reconfigurable Autonomous Driving Systems

Application of Multimodal Large Language Models in Autonomous Driving

Towards Selection and Transition Between Behavior-Based Neural Networks for Automated Driving

Environment Descriptions for Usability and Generalisation in Reinforcement Learning

Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving
