Advances in Vision Transformers and Out-of-Distribution Generalization

Computer vision research is currently focused on improving the performance and efficiency of vision transformers. Recent work shows that deriving task-specific models from pre-trained vision transformers can make them more efficient and effective on edge devices. Interest is also growing in out-of-distribution (OOD) generalization: researchers are exploring new methods for detecting OOD samples and for making vision transformers more robust to distribution shifts. Entropy-based methods have been proposed for OOD detection, and new frameworks have been introduced for generating OOD test cases. Structured pruning has also been shown to improve the efficiency of vision transformers while maintaining their performance on domain generalization tasks.

Noteworthy papers include:

NuWa derives lightweight task-specific vision transformers for edge devices, improving model accuracy and accelerating inference.

EOOD proposes an entropy-based OOD detection framework that outperforms state-of-the-art methods.

Resilience of Vision Transformers evaluates the robustness of vision transformers to OOD noisy images and demonstrates their effectiveness in domain generalization.

The Effects of Grouped Structural Global Pruning introduces a novel pruning method for pre-trained vision transformers and achieves significant improvements in inference speed and fine-tuning time.

Prior2Former proposes an evidential modeling approach for open-world panoptic segmentation and achieves state-of-the-art performance on anomaly instance segmentation and open-world panoptic segmentation.

Intermediate Layer Classifiers explores the utility of intermediate layers for OOD generalization and finds that they frequently generalize better than last-layer representations.
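To make the entropy-based OOD detection idea mentioned above concrete, here is a minimal sketch of one common baseline: score each input by the Shannon entropy of the classifier's softmax output and flag high-entropy inputs as OOD. This is not the EOOD method itself, whose details are not given here. The function names, the example logits, and the threshold are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def is_ood(logits, threshold):
    """Flag a sample as OOD when its predictive entropy exceeds the threshold."""
    return predictive_entropy(logits) > threshold


# Hypothetical logits for a 4-class classifier (assumed values, not real data).
confident = [10.0, 0.0, 0.0, 0.0]  # peaked prediction, low entropy
uncertain = [1.0, 1.0, 1.0, 1.0]   # uniform prediction, maximal entropy

# Assumed threshold: half of the maximum possible entropy, log(num_classes).
threshold = 0.5 * math.log(4)

print(is_ood(confident, threshold), is_ood(uncertain, threshold))
```

In practice the threshold would be tuned on held-out in-distribution data, for example to hit a target false-positive rate, rather than fixed as a fraction of the maximum entropy.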

Sources

NuWa: Deriving Lightweight Task-Specific Vision Transformers for Edge Devices

EOOD: Entropy-based Out-of-distribution Detection

Resilience of Vision Transformers for Domain Generalisation in the Presence of Out-of-Distribution Noisy Images

The Effects of Grouped Structural Global Pruning of Vision Transformers on Domain Generalisation

Prior2Former -- Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation

Intermediate Layer Classifiers for OOD generalization
