AI and Language

Report on Current Developments in the AI and Language Research Area

General Direction of the Field

Recent work in the AI and language research area shows a clear shift toward linguistic diversity, fairness, and inclusivity in AI technologies. Researchers increasingly argue that the field should move beyond the English-centric paradigm that has dominated it and adopt practices that accommodate a wider range of languages and cultures. This shift is driven by the recognition that AI technologies, particularly large language models (LLMs), can perpetuate biases and stereotypes if not carefully managed, and can thereby influence public perception and decision-making on a global scale.

One key area of focus is the examination of biases embedded in AI systems, particularly biases that are language-specific. Recent work has highlighted negative sentiment and stereotypes in Chinese-language AI technologies (a case study covers Baidu, Ernie, and Qwen), emphasizing the need for a more global perspective in AI research. That perspective covers both the development of fair and inclusive AI systems and the promotion of linguistically diverse research and publication practices.

Another important trend is the exploration of International Auxiliary Languages (IALs) and their potential role in fostering cross-cultural communication and understanding. Researchers are analyzing the quality and usage of IALs on platforms like Wikipedia, aiming to improve their content and visibility. This work not only contributes to the growth of IAL communities but also provides a new methodology for categorizing and understanding the impact of these languages in digital spaces.

Sentiment analysis using LLMs has also gained traction, especially in the context of global crises such as the COVID-19 pandemic. Researchers are using these models to analyze social media data and trace how public sentiment evolves over time; one longitudinal study focuses on Sinophobia during the pandemic. This work underscores the importance of transparent communication in mitigating negative sentiment during times of crisis.
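The longitudinal step in this kind of analysis can be sketched as a simple aggregation. The sketch below assumes each post has already been given a sentiment label by some classifier (an LLM in the work described above); the function name, the label set, and the toy records are illustrative assumptions, not the method or data of any cited study.

```python
from collections import Counter, defaultdict
from datetime import date


def monthly_negative_share(records):
    """Return {(year, month): fraction of posts labeled "negative"}.

    `records` is an iterable of (date, label) pairs. The labels are
    assumed to come from an upstream classifier that is not shown here.
    """
    counts = defaultdict(Counter)
    for day, label in records:
        counts[(day.year, day.month)][label] += 1
    # Sort by (year, month) so the result reads as a time series.
    return {
        month: tally["negative"] / sum(tally.values())
        for month, tally in sorted(counts.items())
    }


# Hand-made toy labels for illustration only; not data from any study.
sample = [
    (date(2020, 1, 5), "neutral"),
    (date(2020, 1, 20), "negative"),
    (date(2020, 2, 3), "negative"),
    (date(2020, 2, 14), "negative"),
    (date(2020, 2, 28), "positive"),
    (date(2020, 2, 29), "neutral"),
]
shares = monthly_negative_share(sample)
```

Plotting or comparing the monthly shares is then one way to describe how sentiment changes over the period of interest.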

Finally, there is growing interest in the diachronic analysis of language: how words and concepts evolve over time in response to societal change. One study traces meaning shifts in English over the last two centuries, using the move from "cart" to "truck" as an example. This research highlights the connections between language and societal shifts and emphasizes the ethical considerations involved in interpreting such results.
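One common way to quantify a change in meaning, offered here only as an illustration and not necessarily the approach taken in the work summarized above, is to compare vector representations of a word learned from texts of two different periods. The sketch assumes the two vectors already live in a shared, aligned space; the function name and the toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy 2-dimensional vectors standing in for period-specific embeddings
# of the same word; real embeddings would have many more dimensions.
vector_period_a = [1.0, 0.0]
vector_period_b = [0.0, 1.0]

no_shift = cosine_distance(vector_period_a, vector_period_a)
full_shift = cosine_distance(vector_period_a, vector_period_b)
```

A distance near 0 suggests the word is used in similar contexts in both periods, while a larger distance suggests its usage has drifted.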

Noteworthy Papers

  • Language-Diverse Publishing in AI: This paper provocatively argues for a shift away from English-centric publishing in AI, proposing practical steps to promote linguistic diversity and inclusivity in the field.

  • Bias in Chinese-Language AI Technologies: A case study of the biases embedded in Chinese-language AI tools (Baidu, Ernie, and Qwen), highlighting the importance of fairness and inclusivity in AI technologies from a global perspective.

  • Impact of ChatGPT on Writing Style: This paper reports a significant effect of ChatGPT on the writing style of condensed matter physicists, particularly among non-native English speakers, which points to widespread adoption of the tool.

Sources

A global AI community requires language-diverse publishing

Comparing diversity, negativity, and stereotypes in Chinese-language AI technologies: a case study on Baidu, Ernie and Qwen

Constructing a Common Ground: Analyzing the quality and usage of International Auxiliary Languages in Wikipedia

A longitudinal sentiment analysis of Sinophobia during COVID-19 using large language models

From cart to truck: meaning shift through words in English in the last two centuries

Impact of ChatGPT on the writing style of condensed matter physicists