Advances in Optimization and Model Merging for Machine Learning

Recent developments in machine learning optimization and model merging show significant progress in tackling complex, non-convex optimization landscapes. Researchers are deriving sharper risk bounds for minimax problems, which are crucial for applications such as adversarial training and robust optimization. The field is also seeing innovations in Riemannian gradient descent methods for joint blind super-resolution and demixing in integrated sensing and communication (ISAC). Bayesian optimization is being enhanced with novel acquisition functions that better manage the explore-exploit trade-off in batched settings. There is growing interest in differentiable multilevel optimization, with new gradient-based approaches that simplify the handling of nested structures and improve both computational efficiency and solution accuracy. Black-box optimization is being advanced with sharpness-aware minimization strategies that enhance model generalization. There is also a notable shift toward characterizing neural network loss landscapes without over-parametrization, which is crucial for establishing convergence guarantees for optimization methods. The challenge of non-local model merging is being addressed through techniques that correct for variance collapse, improving the performance of merged multi-task models. Finally, discrete optimization is being improved with decoupled temperature settings in the Straight-Through Gumbel-Softmax estimator, enhancing gradient fidelity and performance across tasks (see the sketch below).
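
To make the last point concrete, below is a minimal PyTorch sketch of a Straight-Through Gumbel-Softmax estimator with decoupled temperatures for the forward sample and the backward gradient path. The function and parameter names (`decoupled_st_gumbel_softmax`, `tau_forward`, `tau_backward`) are illustrative, and the exact decoupling proposed in the cited paper may differ from this sketch.

```python
import torch
import torch.nn.functional as F

def decoupled_st_gumbel_softmax(logits, tau_forward=1.0, tau_backward=1.0):
    """Straight-Through Gumbel-Softmax with separate forward/backward temperatures.

    The forward pass emits a hard one-hot sample, while gradients flow through a
    softmax whose temperature (tau_backward) can be set independently of the
    temperature used to draw the sample (tau_forward).
    """
    # Sample Gumbel(0, 1) noise and perturb the logits.
    gumbels = -torch.empty_like(logits).exponential_().log()
    perturbed = logits + gumbels

    # Soft relaxations at the two temperatures.
    y_forward = F.softmax(perturbed / tau_forward, dim=-1)
    y_backward = F.softmax(perturbed / tau_backward, dim=-1)

    # Hard one-hot sample taken from the forward-temperature relaxation.
    index = y_forward.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(logits).scatter_(-1, index, 1.0)

    # Straight-through trick: forward value is y_hard, gradient comes from y_backward.
    return y_hard + y_backward - y_backward.detach()
```

A conventional estimator corresponds to `tau_forward == tau_backward`; decoupling the two lets the sampling sharpness and the gradient smoothness be tuned separately.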

Noteworthy papers include one that introduces a novel gradient-based approach to multilevel optimization, significantly reducing computational complexity while improving solution accuracy. Another notable contribution proposes a sharpness-aware black-box optimization algorithm that improves model generalization through a reparameterization strategy.
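
The black-box algorithm itself avoids explicit gradients via its reparameterization strategy; for orientation only, here is a minimal PyTorch sketch of the standard two-step sharpness-aware minimization (SAM) update that such methods build on, with `rho` as an illustrative neighbourhood radius. This is a sketch of the generic white-box SAM step, not the paper's black-box algorithm.

```python
import torch

def sam_step(model, loss_fn, inputs, targets, optimizer, rho=0.05):
    """One sharpness-aware update: ascend to the worst-case weights within an
    L2 ball of radius rho, then apply the optimizer step using the gradient
    computed at those perturbed weights."""
    # First pass: gradient at the current weights.
    loss_fn(model(inputs), targets).backward()
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))

    # Ascend: move each parameter toward the worst-case neighbour.
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    optimizer.zero_grad()

    # Second pass: gradient at the perturbed weights.
    loss_fn(model(inputs), targets).backward()

    # Restore the original weights, then descend with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
```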

Sources

Towards Sharper Risk Bounds for Minimax Problems

Riemannian Gradient Descent Method to Joint Blind Super-Resolution and Demixing in ISAC

Batched Energy-Entropy acquisition for Bayesian Optimization

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Sharpness-Aware Black-Box Optimization

Loss Landscape Characterization of Neural Networks without Over-Parametrization

The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse

SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques

Improving Discrete Optimisation Via Decoupled Straight-Through Gumbel-Softmax
