Autonomous Driving Research

Report on Current Developments in Autonomous Driving Research

General Direction of the Field

The field of autonomous driving is shifting toward advanced generative AI models as a way to improve the diversity, realism, and regulatory compliance of training datasets. The trend is driven by the need for simulations that capture both the complexity and the rare, long-tail events of real-world driving. A key emerging innovation is the integration of vision language models (VLMs) with video generation frameworks, which enables not only the creation of realistic driving videos but also the generation of corresponding narrations that aid traffic scene understanding and navigation.

One of the primary advancements is the use of latent diffusion models, such as Stable Diffusion XL (SDXL), combined with ControlNet for structural conditioning and Hotshot-XL for temporal coherence, to generate high-quality, diverse driving scenarios. These models are trained on large-scale, real-world datasets such as KITTI and Waymo, which allows the generated videos to closely mimic the variability and unpredictability of actual driving conditions. This approach both enriches the training data available to autonomous driving systems and opens new possibilities for the virtual environments used in simulation and validation.
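
To make this pipeline concrete, the following is a minimal sketch of the per-frame generation step using the Hugging Face diffusers library. The model IDs, file names, prompt, and conditioning scale are illustrative assumptions rather than the GenDDS configuration, and the Hotshot-XL temporal layers that turn individual frames into coherent video are omitted here.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Illustrative checkpoints; not the GenDDS models.
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# A depth map extracted from a real dataset frame (e.g. KITTI) steers the scene
# layout, while the text prompt controls weather, lighting, and scenario content.
depth = load_image("kitti_depth_frame.png")  # hypothetical file name
frame = pipe(
    prompt="dashcam view, rainy highway at dusk, truck merging from the right lane",
    image=depth,
    num_inference_steps=30,
    controlnet_conditioning_scale=0.7,
).images[0]
frame.save("scenario_frame.png")
```

In a full prompt-to-video setup, this conditioning step would run per frame while a temporal module such as Hotshot-XL enforces consistency across the sequence.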

Another notable development is the exploration of how generative AI can support compliance with regulatory frameworks, particularly the European Union's Artificial Intelligence Act (EU AI Act). This legislation imposes stringent requirements on high-risk AI systems, including those used in autonomous driving. Generative AI models are being examined for their potential to address requirements related to transparency, robustness, and safety, helping developers meet regulatory standards while improving the overall reliability of autonomous driving systems.

Noteworthy Innovations

  • GenDDS: Introduces a prompt-to-video approach for generating diverse driving scenarios by combining Stable Diffusion XL with ControlNet and Hotshot-XL, significantly enhancing the realism and diversity of training data.
  • DriveGenVLM: Proposes a framework that integrates video generation with vision language models, producing realistic driving videos with corresponding narrations that enhance traffic scene understanding and navigation (see the sketch after this list).
  • Generative AI and EU AI Act Compliance: Examines how generative AI models could address regulatory requirements in autonomous driving, focusing on transparency and robustness, and highlights open questions for further research.
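
As a sketch of the narration side of a DriveGenVLM-style framework, the snippet below captions a single generated frame with an off-the-shelf VLM via Hugging Face transformers. BLIP-2, the checkpoint name, the file name, and the prompt are all illustrative assumptions; DriveGenVLM's actual model pairing and interface are not reproduced here.

```python
import torch
from PIL import Image
from transformers import Blip2ForConditionalGeneration, Blip2Processor

# BLIP-2 stands in as a generic vision language model for illustration only.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to("cuda")

# Narrate one frame of a generated driving video (hypothetical file name).
frame = Image.open("generated_frame.png")
prompt = "Question: Describe the traffic scene and any hazards for the ego vehicle. Answer:"
inputs = processor(images=frame, text=prompt, return_tensors="pt").to("cuda", torch.float16)

out = model.generate(**inputs, max_new_tokens=60)
print(processor.decode(out[0], skip_special_tokens=True))
```

A full framework would run this kind of captioning over whole video sequences, pairing each generated clip with a narration usable for scene-understanding or navigation tasks.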

Sources

GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model

DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving

The Artificial Intelligence Act: critical overview

How Could Generative AI Support Compliance with the EU AI Act? A Review for Safe Automated Driving Perception