Advances in Multimodal Analysis and Visualization

Data visualization research is evolving rapidly, with growing attention to multimodal analysis and to more sophisticated models for understanding and interpreting visual representations of data. Recent work has explored the capabilities and limitations of large language models and vision language models in tasks such as chart question answering and graphical perception. These models show promising results but still struggle with complex and dynamic data. In response, new benchmarks and evaluation methods are being developed to test model performance in real-world scenarios. Noteworthy papers include:

  • Evaluating Graphical Perception with Multimodal LLMs, which investigates the performance of multimodal large language models in graphical perception tasks.
  • Probing the Visualization Literacy of Vision Language Models, which introduces a new approach for visualizing the internal reasoning of vision language models.
  • ChartQAPro, which presents a new benchmark for chart question answering that includes a diverse set of charts and questions.
  • VADIS, which proposes a visual analytics pipeline for dynamic document representation and information-seeking in the biomedical domain.

Sources

Evaluating Graphical Perception with Multimodal LLMs

Probing the Visualization Literacy of Vision Language Models: the Good, the Bad, and the Ugly

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

VADIS: A Visual Analytics Pipeline for Dynamic Document Representation and Information-Seeking
