Integrating Language and World Knowledge in Autonomous Driving

Recent autonomous driving research shows a clear shift toward integrating natural language instructions and world knowledge into driving systems. The shift is driven by the need for more context-aware and adaptive planning, especially when perception is limited. Large multimodal models (LMMs) and generative pre-training frameworks are enabling more nuanced and flexible responses in real-world driving conditions. These models process diverse data inputs and handle a broad range of tasks, from perception and prediction to planning, demonstrating generalization across different datasets and tasks. In addition, closed-loop frameworks and multi-actor dynamics generation systems are moving the field toward more realistic and interactive traffic simulation. Together, these developments aim to improve the safety and effectiveness of human-vehicle collaboration and to bring end-to-end autonomous driving closer to real-world use.

Two papers stand out. One introduces a novel dataset of human-vehicle instruction interactions that emphasizes actionable directives tied to scene objects. The other proposes a framework that improves driving performance under perception-limited conditions by combining perception capabilities with world knowledge.

Sources

doScenes: An Autonomous Driving Dataset with Natural Language Instruction for Human Interaction and Vision-Language Navigation

World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving

Driving with InternVL: Oustanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024

DriveMM: All-in-One Large Multimodal Model for Autonomous Driving

GPD-1: Generative Pre-training for Driving

ChatDyn: Language-Driven Multi-Actor Dynamics Generation in Street Scenes

Doe-1: Closed-Loop Autonomous Driving with Large World Model
