Advances in Large Language Models and Information Retrieval

Natural language processing is seeing rapid progress in large language models (LLMs) and information retrieval. Recent work targets both the effectiveness and the efficiency of these models, particularly in passage re-ranking, cross-encoder fine-tuning, and in-context learning. Noteworthy papers include methods for generating synthetic oracle datasets to analyze the impact of noise, improving context copying in linear recurrence models with retrieval, and boosting listwise ranking with collaborative frameworks that pair small and large ranking agents. These papers report substantial gains over conventional approaches, underscoring the importance of handling feature noise, refining attention mechanisms, and designing more efficient ranking algorithms. Overall, the field is moving toward models that capture finer-grained distinctions in language while improving performance across a range of NLP tasks.
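To make the re-ranking setup concrete, the sketch below shows a generic cross-encoder re-ranking pass over retrieved passages. It is a minimal illustration assuming the sentence-transformers CrossEncoder API and a public MS MARCO checkpoint; it is not the specific model, training recipe, or ranking framework proposed in any of the papers listed under Sources.

```python
# Minimal cross-encoder re-ranking sketch (illustrative only; not any paper's exact method).
# Assumes the sentence-transformers package and a public MS MARCO cross-encoder checkpoint.
from sentence_transformers import CrossEncoder

# Illustrative checkpoint; any cross-encoder trained for relevance scoring would work here.
MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"


def rerank(model: CrossEncoder, query: str, passages: list[str], top_k: int = 3):
    # The cross-encoder jointly encodes each (query, passage) pair and returns a relevance score.
    scores = model.predict([(query, p) for p in passages])
    # Sort passages by descending relevance and keep the top_k for the final ranking.
    ranked = sorted(zip(scores, passages), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    model = CrossEncoder(MODEL_NAME)
    query = "what is passage re-ranking?"
    passages = [
        "Passage re-ranking re-orders candidates returned by a first-stage retriever.",
        "Copenhagen is the capital of Denmark.",
        "Cross-encoders score a query and a passage jointly with full attention.",
    ]
    for score, passage in rerank(model, query, passages):
        print(f"{score:.3f}\t{passage}")
```

In practice, a cheap first-stage retriever (e.g. BM25 or a bi-encoder) supplies the candidate passages, and the cross-encoder is applied only to that short list, since scoring every query-passage pair jointly is too expensive at corpus scale.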

Sources

Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers

Generating Synthetic Oracle Datasets to Analyze Noise Impact: A Study on Building Function Classification Using Tweets

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance

TRA: Better Length Generalisation with Threshold Relative Attention

Focus Directions Make Your Language Models Pay More Attention to Relevant Contexts

CoRanking: Collaborative Ranking with Small and Large Ranking Agents

On the Reproducibility of Learned Sparse Retrieval Adaptations for Long Documents

An extension of linear self-attention for in-context learning

Boundless Byte Pair Encoding: Breaking the Pre-tokenization Barrier

Multi-Token Attention

Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval

LLM-VPRF: Large Language Model Based Vector Pseudo Relevance Feedback

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

From Smør-re-brød to Subwords: Training LLMs on Danish, One Morpheme at a Time

Chain of Correction for Full-text Speech Recognition with Large Language Models

Why do LLMs attend to the first token?

On Vanishing Variance in Transformer Length Generalization
