Bias Mitigation and Linguistic Diversity in LLMs

Recent research on Large Language Models (LLMs) has focused heavily on identifying and mitigating biases that arise from both pretraining data and model outputs. A prominent line of work traces how biases in training data are amplified in model outputs, underscoring the value of intervening early, at the pretraining stage. Other studies examine how tuning methods and hyperparameters affect bias expression, with some finding that instruction tuning can partially alleviate representational biases. There is also growing interest in resource-efficient, interpretable bias-mitigation methods that reduce bias without compromising model performance, alongside fine-tuning techniques designed to enhance linguistic diversity and reduce demographic bias. Notably, attention is shifting toward understanding and evaluating how biases transfer between pre-trained models and their prompt-adapted versions, highlighting the importance of fairness in pre-trained models for downstream tasks. Overall, the field is moving toward more nuanced and comprehensive approaches to bias detection, mitigation, and linguistic diversity enhancement in LLMs.
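
To make the bias-transfer idea concrete, below is a minimal sketch (not the methodology of any of the cited papers) of how one might compare a gender-association gap in a causal LM before and after a simple prompt adaptation. The model name, the template sentence, the instruction prefix, and the log-likelihood gap itself are illustrative assumptions, not established benchmarks.

```python
# Minimal, illustrative sketch: does a simple "prompt adaptation" (an instruction
# prefix) change a gender-association gap in a small causal LM?
# All names below (model, template, prefix) are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumption: any small causal LM works for this sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def avg_neg_log_likelihood(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return out.loss.item()

def gender_gap(template: str, prefix: str = "") -> float:
    """NLL("She ...") minus NLL("He ...") for one template.
    Positive values mean the model finds the 'She' sentence less likely."""
    she = avg_neg_log_likelihood(prefix + template.format(pronoun="She"))
    he = avg_neg_log_likelihood(prefix + template.format(pronoun="He"))
    return she - he

template = "{pronoun} is a brilliant engineer."
instruction = "Answer without relying on gender stereotypes. "  # toy prompt adaptation

print("base model gap:    ", gender_gap(template))
print("prompt-adapted gap:", gender_gap(template, prefix=instruction))
```

Comparing the two printed gaps gives a rough, single-template indication of whether the bias present in the base model persists, shrinks, or grows under the adapted prompt; a real evaluation would aggregate over many templates and attribute terms.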

Sources

How far can bias go? -- Tracing bias from pretraining data to alignment

The Femininomenon of Inequality: A Data-Driven Analysis and Cluster Profiling in Indonesia

Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments

The "LLM World of Words" English free association norms generated by large language models

Towards Resource Efficient and Interpretable Bias Mitigation in Large Language Models

Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning

Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models

CBEval: A framework for evaluating and interpreting cognitive biases in LLMs

Exploring the Influence of Label Aggregation on Minority Voices: Implications for Dataset Bias and Model Training
