Advancements in Large Language Models

The field of large language models (LLMs) is evolving rapidly, with current work focused on improving performance, efficiency, and robustness. Researchers are exploring novel fine-tuning paradigms, such as Mask Fine-Tuning, which has been shown to improve model performance across various domains. Another area of focus is the mitigation of massive activations in LLMs, with studies proposing hybrid strategies that balance suppressing these activations against preserving downstream performance. There is also growing interest in applying LLMs to multimodal tasks such as polymer property prediction, where combining text embeddings with molecular structure embeddings has shown promising results. Noteworthy papers in this area include A Refined Analysis of Massive Activations in LLMs, which challenges prior assumptions about the detrimental effects of massive activations, and Multimodal machine learning with large language embedding model for polymer property prediction, which showcases the potential of LLMs in materials science applications.
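To make the multimodal fusion idea concrete, the sketch below shows one common pattern: concatenating a text embedding with a molecular structure embedding and fitting a simple regression head on the fused features. This is a minimal illustration under assumed shapes and synthetic data, not the method from the cited paper; in practice the text embedding would come from an LLM encoder and the molecular embedding from, e.g., a fingerprint or graph model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two modalities (assumed dimensions):
# in a real pipeline, text_emb would be LLM embeddings of polymer
# descriptions and mol_emb would encode the repeat-unit structure.
n_samples, d_text, d_mol = 200, 64, 32
text_emb = rng.normal(size=(n_samples, d_text))
mol_emb = rng.normal(size=(n_samples, d_mol))
y = rng.normal(size=n_samples)  # synthetic property values

# Multimodal fusion by simple concatenation of the two embeddings.
X = np.concatenate([text_emb, mol_emb], axis=1)

# Ridge regression head on the fused features (closed-form solution).
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
pred = X @ w
```

Concatenation is the simplest fusion strategy; richer approaches (cross-attention, gated fusion) follow the same structure of mapping both modalities into a shared feature space before the prediction head.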

Sources

A Refined Analysis of Massive Activations in LLMs

STADE: Standard Deviation as a Pruning Metric

Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets

Boosting Large Language Models with Mask Fine-Tuning

Multimodal machine learning with large language embedding model for polymer property prediction

Learning Towards Emergence: Paving the Way to Induce Emergence by Inhibiting Monosemantic Neurons on Pre-trained Models

Not All LoRA Parameters Are Essential: Insights on Inference Necessity

Model Hemorrhage and the Robustness Limits of Large Language Models

Leaking LoRa: An Evaluation of Password Leaks and Knowledge Storage in Large Language Models

A machine learning platform for development of low flammability polymers

NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior

$\mu$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models

ZClip: Adaptive Spike Mitigation for LLM Pre-Training
