Optimizing Training Efficiency and Scaling Laws in Large Models

Recent work on large language models (LLMs) and recommendation systems shows a clear shift toward optimizing training efficiency and characterizing the scaling laws that govern it. Researchers are developing strategies to manage the computational demands of LLM training, with particular emphasis on incremental training and heterogeneity-adaptive sequence parallelism. These approaches balance the trade-off between computational cost and model performance, and they often reveal new scaling laws, for example with respect to batch size or test-time compute, that guide how training is configured.

In parallel, recommendation models built around large embedding tables are scaling their network parameters in ways that mirror the successes seen in LLMs; generative recommendation models in particular leverage scaling laws to achieve substantial performance gains. The field is also moving toward data-centric approaches that assess data quality more rigorously rather than relying on data quantity alone, highlighting metrics such as Approximate Entropy for sequential recommendation data. Taken together, current research reflects a concerted effort to make the training and deployment of large models more efficient, with close attention to both compute budgets and data quality.
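To illustrate how scaling laws of this kind are typically used, the sketch below fits a saturating power law L(N) = E + A·N^(-alpha) to hypothetical (model size, validation loss) measurements and extrapolates it to a larger model. The data points, constants, and the specific functional form are illustrative assumptions, not results from any of the papers listed below.

```python
# Minimal sketch: fitting a saturating power-law scaling curve
# L(N) = E + A * N**(-alpha) to hypothetical (model size, loss) pairs.
# All numbers here are made up for demonstration purposes.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, E, A, alpha):
    """Irreducible loss E plus a power-law term that decays with model size."""
    return E + A * n_params ** (-alpha)

# Hypothetical measurements: parameter counts and final validation losses.
n = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
loss = np.array([3.42, 2.96, 2.61, 2.38, 2.21])

# Fit the three coefficients; p0 is a rough starting guess near the expected scale.
(E, A, alpha), _ = curve_fit(scaling_law, n, loss, p0=(2.0, 300.0, 0.3), maxfev=10_000)
print(f"E={E:.3f}, A={A:.1f}, alpha={alpha:.3f}")

# Extrapolate to a larger model to inform compute allocation.
print("predicted loss at 1e11 params:", scaling_law(1e11, E, A, alpha))
```

Extrapolations like this are only trustworthy within a modest range beyond the fitted points, which is why much of the cited work focuses on making the fitted exponents more reliable.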
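The Approximate Entropy mentioned above can be computed directly on interaction sequences. The sketch below implements the standard ApEn(m, r) definition and applies it to toy item-ID sequences; the preprocessing, parameter choices, and sequences are assumptions for illustration, not the exact procedure of the cited paper.

```python
# Minimal sketch of Approximate Entropy (ApEn) as a data-quality signal
# for a user's interaction sequence. Generic ApEn(m, r) definition;
# lower values indicate a more regular (more predictable) sequence.
import numpy as np

def approximate_entropy(sequence, m=2, r=0.5):
    """Return ApEn(m, r) of a 1-D numeric sequence."""
    x = np.asarray(sequence, dtype=float)
    n = len(x)

    def phi(m):
        # All overlapping windows of length m.
        windows = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev distance between every pair of windows.
        dists = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=-1)
        # Fraction of windows within tolerance r of each window (self-matches included).
        counts = np.mean(dists <= r, axis=1)
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)

# Toy item-ID sequences: a repetitive one vs. a noisier one.
regular = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
noisy   = [5, 1, 9, 2, 7, 3, 8, 1, 6, 4, 9, 2]
print(approximate_entropy(regular), approximate_entropy(noisy))  # regular << noisy
```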

Sources

On the Effectiveness of Incremental Training of Large Language Models

A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models

Predictive Models in Sequential Recommendations: Bridging Performance Laws with Data Quality Insights

Scaling New Frontiers: Insights into Large Recommendation Models

Data-Centric and Heterogeneity-Adaptive Sequence Parallelism for Efficient LLM Training

Scaling Law for Language Models Training Considering Batch Size

Densing Law of LLMs

Establishing Task Scaling Laws via Compute-Efficient Model Ladders
