Research on image processing and vision transformers is currently focused on improving the accuracy and efficiency of image restoration, retargeting, and feature extraction. Recent work centers on new architectures and techniques, including deformable sliding window transformers, query-aware selective attention, and encoder-decoder frameworks. These approaches report improvements over prior state-of-the-art results in applications such as image super-resolution, deraining, and rectification. Noteworthy papers include HALO, which introduces a human-aligned end-to-end image retargeting method, and EDIT, which proposes an encoder-decoder architecture to mitigate attention sink in vision transformers. Content-Aware Transformer and Rethinking LayerNorm further demonstrate the potential of tailored attention mechanisms and normalization strategies for image restoration tasks.