Efficient and Scalable Large Language Models for Diverse Applications

Recent work on Large Language Models (LLMs) spans several application domains, most visibly software engineering and speech recognition. In software development tooling, LLMs show potential to improve text- and code-related tasks, while inference-side techniques such as speculative decoding with dynamic token tree structures reduce generation latency. In speech recognition, the emphasis has shifted toward adapting multilingual models to diverse speech variabilities, with measurable reductions in word error rates and character error rates.

On the training side, scaling laws for predicting downstream performance offer a cheaper proxy for performance estimation, reducing the compute needed for model selection. Converging LLM architectures and adaptive data optimization techniques further streamline training and make it more scalable, and in time series foundation models, encoder-only Transformers have been reported to scale more favorably than decoder-only architectures, yielding practical guidelines for future model scaling. Taken together, these papers point toward more efficient, scalable, and versatile models that handle a wide range of tasks with modest computational overhead.
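
To make the speculative decoding idea concrete, the toy Python sketch below implements the standard draft-and-verify acceptance test on a hypothetical five-token vocabulary. The distributions `p` and `q`, the vocabulary size, and the helper name `speculative_step` are illustrative assumptions, not code from DySpec or Cerberus; those systems build dynamic token trees and parallel decoding on top of this basic primitive.

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(p, q):
    """One speculative-sampling acceptance test (illustrative sketch).

    p: target-model token distribution (1-D array summing to 1)
    q: draft-model token distribution  (1-D array summing to 1)
    Returns a token index whose marginal distribution equals p,
    which is the standard speculative-sampling guarantee.
    """
    x = rng.choice(len(q), p=q)               # draft model proposes a token
    if rng.random() < min(1.0, p[x] / q[x]):  # accept with prob min(1, p/q)
        return x
    residual = np.maximum(p - q, 0.0)         # otherwise resample from the
    residual /= residual.sum()                # normalized residual distribution
    return rng.choice(len(residual), p=residual)

# Hypothetical 5-token vocabulary: the draft roughly approximates the target.
p = np.array([0.50, 0.20, 0.15, 0.10, 0.05])
q = np.array([0.40, 0.30, 0.15, 0.10, 0.05])

samples = [speculative_step(p, q) for _ in range(10_000)]
print(np.bincount(samples, minlength=5) / len(samples))  # empirically close to p
```

The speed-up in practice comes from verifying several drafted tokens per target-model forward pass; the snippet only shows why acceptance and residual resampling preserve the target distribution.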
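
The scaling-law discussion can be illustrated in a similarly minimal way: fit a saturating power law L(N) = a * N^(-alpha) + c to a few (parameter count, loss) pairs and extrapolate to a larger model. The data points, initial guesses, and the specific functional form below are hypothetical placeholders, a sketch of the general idea rather than the estimation procedures used in the cited papers.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical (parameter count, validation loss) pairs, for illustration only.
params = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
loss   = np.array([3.10, 2.85, 2.62, 2.45, 2.31])

def power_law(n, a, alpha, c):
    # L(N) = a * N^(-alpha) + c : a common saturating power-law form,
    # where c plays the role of an irreducible loss floor.
    return a * n ** (-alpha) + c

(a, alpha, c), _ = curve_fit(power_law, params, loss,
                             p0=(10.0, 0.1, 2.0), maxfev=10_000)
print(f"fitted exponent alpha = {alpha:.3f}, irreducible loss c = {c:.2f}")
print(f"predicted loss at 1e11 params: {power_law(1e11, a, alpha, c):.2f}")
```

Predicting downstream task performance rather than pretraining loss typically requires an additional mapping from loss to task metrics, which is where the cited work goes beyond this basic curve fit.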

Sources

Studying and Benchmarking Large Language Models For Log Level Suggestion

Scaling Laws for Predicting Downstream Performance in LLMs

From N-grams to Pre-trained Multilingual Models For Language Identification

Enhancing Indonesian Automatic Speech Recognition: Evaluating Multilingual Models with Diverse Speech Variabilities

Survey and Evaluation of Converging Architecture in LLMs based on Footsteps of Operations

Tending Towards Stability: Convergence Challenges in Small Language Models

DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

A Hitchhiker's Guide to Scaling Law Estimation

Optimizing Low-Resource Language Model Training: Comprehensive Analysis of Multi-Epoch, Multi-Lingual, and Two-Stage Approaches

Towards Neural Scaling Laws for Time Series Foundation Models

Scaling Laws for Multilingual Language Models

Cerberus: Efficient Inference with Adaptive Parallel Decoding and Sequential Knowledge Enhancement

Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR

Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges

Scaling Wearable Foundation Models

Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding
