Multilingual and Multimodal LLM Advancements

Recent advances in multilingual and multimodal Large Language Models (LLMs) have significantly extended these models' capabilities across diverse languages and cultural contexts. A notable trend is the development of pipelines that transfer datasets and annotations between languages, which is crucial for supporting low-resource languages and accelerating research on them. This has been complemented by efforts to improve the robustness and consistency of LLM reasoning, particularly in handling relationships such as equivalence and inheritance consistently across languages. There is also a growing focus on evaluating and enhancing the factual recall mechanisms of LLMs in multilingual settings, which is essential for maintaining accuracy and reliability in non-English contexts.
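The dataset-transfer idea above can be sketched concretely. The snippet below is a minimal, hypothetical illustration (not the pipeline from any of the cited papers): it translates each example's text while carrying its label over unchanged, which suffices for sentence-level tasks such as classification. The `translate` function is a stub standing in for whatever MT system or LLM call a real pipeline would use; span-level annotations would additionally require word alignment or marker-based projection.

```python
def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Stub translator; a real pipeline would call an MT model or an LLM here."""
    return f"[{target_lang}] {text}"  # placeholder output for illustration


def transfer_dataset(dataset: list[dict], source_lang: str, target_lang: str) -> list[dict]:
    """Translate each example's text into the target language, keeping its label.

    This simple label-preserving scheme works for sentence-level annotations;
    token- or span-level labels need an extra alignment step.
    """
    return [
        {
            "text": translate(ex["text"], source_lang, target_lang),
            "label": ex["label"],  # label is language-independent for classification
        }
        for ex in dataset
    ]


# Toy source-language dataset and its transferred counterpart.
english = [{"text": "The movie was great.", "label": "positive"}]
german = transfer_dataset(english, "en", "de")
```

In practice the quality of the transferred dataset hinges on translation fidelity, which is why such pipelines typically add filtering or back-translation checks.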

Another emerging area is the evaluation of reward models, which play a critical role in aligning LLMs with human preferences. Recent benchmarks assess these models' sensitivity to subtle content differences and style variations, both of which matter for effective alignment. There is also growing interest in cross-lingual transfer of reward models, which aims to extend the benefits of Reinforcement Learning from Human Feedback (RLHF) to multilingual settings.

Noteworthy papers include 'Be My Donor. Transfer the NLP Datasets Between the Languages Using LLM,' which presents a novel pipeline for dataset and annotation transfer using LLMs, and 'Towards Robust Knowledge Representations in Multilingual LLMs for Equivalence and Inheritance based Consistent Reasoning,' which introduces a new approach to enhance consistency in LLM reasoning across languages.

Sources

Be My Donor. Transfer the NLP Datasets Between the Languages Using LLM

Towards Robust Knowledge Representations in Multilingual LLMs for Equivalence and Inheritance based Consistent Reasoning

How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms

How to Evaluate Reward Models for RLHF

M-RewardBench: Evaluating Reward Models in Multilingual Settings

Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following

Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model

Findings of the Third Shared Task on Multilingual Coreference Resolution

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

Exploring Pretraining via Active Forgetting for Improving Cross Lingual Transfer for Decoder Language Models

Cross-lingual Transfer of Reward Models in Multilingual Alignment

SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness

Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

A Systematic Survey on Instructional Text: From Representation Formats to Downstream NLP Tasks

Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation