Research in artificial intelligence is moving toward more trustworthy and transparent multimodal models, with a focus on fairness, ethics, and explainability. Recent work emphasizes building these considerations directly into the development of vision-language models, large language models, and other AI systems.
A key direction in this area is the development of methods for explaining and interpreting model decisions, such as attention maps, gradient-based attribution, and counterfactual analysis. These techniques expose the reasoning behind a model's outputs, making its decisions more transparent and accountable.
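As a concrete illustration of the gradient-based family, the sketch below computes a simple input-gradient saliency map for an off-the-shelf image classifier. The choice of ResNet-18 and the random input are illustrative assumptions, not tied to any of the works surveyed here.

```python
# Minimal sketch of a gradient-based saliency map (vanilla gradients).
# The classifier and the dummy input are placeholders for illustration.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)   # any differentiable classifier works
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # dummy RGB input

logits = model(image)
target = logits.argmax(dim=1).item()    # explain the predicted class

# Backpropagate the class score to the input pixels.
logits[0, target].backward()

# Per-pixel sensitivity: large values mark pixels the prediction depends on most.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)   # shape (224, 224)
print(saliency.shape)
```

The gradient magnitude is the simplest attribution signal; attention maps and counterfactual edits serve an analogous role for transformer-based and generative models.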
Another important area of research is the construction of frameworks and benchmark datasets for evaluating trustworthiness, including metrics that quantify bias, fairness, and explainability. These efforts aim to ensure that AI systems behave fairly and transparently and can be trusted to act in the interests of their users.
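To make the metric side concrete, here is a minimal sketch of two common group-fairness measures, demographic parity difference and equal opportunity difference, computed on synthetic data. The labels, predictions, and two-group protected attribute are assumptions for illustration, not drawn from any specific benchmark.

```python
# Minimal sketch of two group-fairness metrics on synthetic binary data.
import numpy as np

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])                  # ground-truth labels
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])                  # model predictions
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])  # protected attribute

def demographic_parity_diff(y_pred, group):
    """Gap in positive-prediction rates between the two groups."""
    return abs(y_pred[group == "a"].mean() - y_pred[group == "b"].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Gap in true-positive rates between the two groups."""
    def tpr(g):
        return y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr("a") - tpr("b"))

print(demographic_parity_diff(y_pred, group))          # 0.5 on this toy data
print(equal_opportunity_diff(y_true, y_pred, group))   # 0.5 on this toy data
```

Production evaluation frameworks add confidence intervals, intersectional group definitions, and many more criteria, but they reduce to comparisons of this kind across protected groups.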
Noteworthy papers in this area include:
- Building Trustworthy Multimodal AI, which provides a comprehensive review of fairness, transparency, and ethics in vision-language tasks.
- Walk the Talk, which introduces an approach for measuring the faithfulness of large language model explanations.
- BMRL, which proposes a bi-modal guided multi-perspective representation learning framework for zero-shot deepfake attribution.
- CHAINSFORMER, which presents a chain-based framework for numerical reasoning on knowledge graphs.
- CRAVE, which proposes a conflicting-reasoning approach for explainable claim verification using large language models.