Large Language Models: Bias Mitigation, Domain Specialization, and Multilingual Capabilities

Report on Current Developments in the Research Area

General Direction of the Field

Recent work in this area centers on strengthening the robustness, domain specialization, and multilingual capabilities of large language models (LLMs) and their applications. The field is moving toward domain-specific, context-aware models, which are crucial for improving performance on specialized tasks and for reducing bias in existing systems.

  1. Mitigating Bias in Neural Network-based Systems: There is growing emphasis on mitigating biases in neural network-based systems, particularly in Automatic Essay Scoring (AES). A notable innovation is adversarial training at the phrase level: adversarial essay variants are generated to challenge the model's predictions, and training on them improves robustness under adversarial conditions while contributing to a more equitable evaluation process.
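
The exact perturbation strategy is summarized rather than specified above, so the following is a minimal sketch of the general idea, assuming a simple insert-an-irrelevant-phrase perturbation and a toy stand-in scorer; it does not reproduce the paper's actual model or attack.

```python
# Minimal sketch: phrase-level adversarial example generation for AES.
# toy_scorer is a stand-in for a trained AES model; the perturbation
# (inserting an off-topic phrase) is an illustrative assumption.
import random

def toy_scorer(essay: str) -> float:
    """Toy AES stand-in: rewards length and vocabulary size."""
    words = essay.split()
    return min(10.0, 0.01 * len(words) + 0.02 * len(set(words)))

OFF_TOPIC = [
    "the mitochondria is the powerhouse of the cell",
    "in conclusion, water is wet",
]

def phrase_level_attack(essay: str, n_trials: int = 20, eps: float = 0.2):
    """Insert irrelevant phrases; keep variants whose score shifts > eps."""
    base = toy_scorer(essay)
    sentences = essay.split(". ")
    adversarial = []
    for _ in range(n_trials):
        s = sentences[:]
        s.insert(random.randrange(len(s) + 1), random.choice(OFF_TOPIC))
        candidate = ". ".join(s)
        if abs(toy_scorer(candidate) - base) > eps:
            # Pair the perturbed essay with the ORIGINAL score, so that
            # retraining teaches the model to ignore such insertions.
            adversarial.append((candidate, base))
    return adversarial
```

Adversarial training then mixes these (perturbed essay, original score) pairs back into the training set, penalizing the model whenever an irrelevant phrase moves the predicted score.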

  2. Domain-Specific Large Language Models (LLMs): The trend toward specialized LLMs for particular domains, such as telecommunications, is gaining momentum. These models handle domain-specific terminology and mathematical representations more effectively than general-purpose LLMs. Building domain-specific datasets and evaluation metrics is a critical step in this direction, enabling models that outperform their general-purpose counterparts on in-domain tasks while retaining broader capabilities.
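
The adaptation recipe behind such models is typically continued pretraining on curated domain text. Below is a generic, hedged sketch of that step using the Hugging Face Trainer; the base model ("gpt2") and corpus file ("telecom_corpus.txt") are placeholders, not the Tele-LLMs artifacts.

```python
# Hedged sketch of domain-adaptive continued pretraining (causal LM).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "gpt2"  # placeholder; real efforts start from larger open models
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Placeholder corpus: in practice, curated telecom text (standards
# documents, technical papers) rich in domain terminology.
ds = load_dataset("text", data_files={"train": "telecom_corpus.txt"})["train"]
ds = ds.map(lambda batch: tok(batch["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tele-ckpt", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()  # continue next-token pretraining on the domain corpus
```

Evaluating the resulting checkpoint on both domain benchmarks and general benchmarks is what substantiates the claim that a specialized model outperforms in-domain while retaining broader capabilities.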

  3. Multilingual Capabilities in LLM-Based Systems: There is increasing recognition that LLM-based systems must support a diverse range of languages, particularly in applications such as recommender systems. Recent studies document performance disparities when using non-English prompts and suggest retraining with multilingual prompts to achieve more balanced performance across languages. This underscores the need for future research on evaluation datasets and models that cover a wider array of languages.
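
One way to make such a disparity measurable is to issue the same recommendation request in several languages and compare the outputs, as in the sketch below; query_llm is a hypothetical stub rather than any particular system's API, and the prompt wordings are illustrative.

```python
# Sketch: measure cross-language agreement of an LLM recommender.
PROMPTS = {
    "en": "Recommend five movies similar to: {titles}",
    "es": "Recomienda cinco películas similares a: {titles}",
    "de": "Empfiehl fünf Filme, die diesen ähneln: {titles}",
}

def query_llm(prompt: str) -> list[str]:
    """Hypothetical stub; swap in a real chat-completion call."""
    return ["Movie A", "Movie B"]  # deterministic placeholder output

def language_consistency(history: list[str]) -> dict[str, float]:
    """Jaccard overlap of each language's recommendations vs. English."""
    titles = ", ".join(history)
    recs = {lang: set(query_llm(p.format(titles=titles)))
            for lang, p in PROMPTS.items()}
    base = recs["en"]
    return {lang: len(r & base) / len(r | base) for lang, r in recs.items()}

print(language_consistency(["Inception", "Interstellar"]))
```

Low overlap (or lower recommendation quality) for a given language flags exactly the kind of disparity these studies report, and the multilingual prompt set doubles as material for the suggested retraining.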

  4. Prompting Strategies for LLMs: The role of prompting strategies in eliciting knowledge from LLMs is being explored in depth, particularly for low-resource languages. Studies compare the effectiveness of different strategies, such as native versus non-native language prompts, to understand how they affect LLM performance across NLP tasks. This research is crucial for optimizing the interaction between users and LLMs, especially in multilingual settings.
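
To make the native/non-native comparison concrete, the sketch below scores a native-language instruction against an English one on the same labelled inputs, using Arabic as an example target language; classify() is a hypothetical stub for an LLM call, and the two samples are illustrative toy data.

```python
# Sketch: compare native vs. non-native (English) instruction prompts.
NATIVE = "صنّف مشاعر هذه الجملة (إيجابي/سلبي): {text}"  # Arabic instruction
NON_NATIVE = ("Classify the sentiment of this Arabic sentence "
              "(positive/negative): {text}")

SAMPLES = [  # (text, gold label) — illustrative only
    ("أحببت هذا الفيلم كثيرا", "positive"),
    ("كانت الخدمة سيئة للغاية", "negative"),
]

def classify(prompt: str) -> str:
    """Hypothetical stub; replace with a real LLM completion call."""
    return "positive"

def accuracy(template: str) -> float:
    hits = sum(classify(template.format(text=text)) == gold
               for text, gold in SAMPLES)
    return hits / len(SAMPLES)

print("native:", accuracy(NATIVE), "| non-native:", accuracy(NON_NATIVE))
```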

Noteworthy Papers

  • Phrase-Level Adversarial Training for Mitigating Bias in Neural Network-based Automatic Essay Scoring: Introduces a novel model-agnostic method to generate adversarial essay sets, significantly improving AES model robustness.

  • Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications: Develops the first series of LLMs tailored for telecommunications, outperforming general-purpose models on domain-specific tasks.

  • Multilingual Prompts in LLM-Based Recommenders: Performance Across Languages: Highlights the performance gap in non-English prompts and suggests retraining with multilingual prompts to achieve more balanced performance.

  • Native vs Non-Native Language Prompting: A Comparative Analysis: Conducts extensive experiments on prompting strategies, finding that non-native prompts generally perform better across various NLP tasks.

Sources

Phrase-Level Adversarial Training for Mitigating Bias in Neural Network-based Automatic Essay Scoring

Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications

Multilingual Prompts in LLM-Based Recommenders: Performance Across Languages

Native vs Non-Native Language Prompting: A Comparative Analysis