Foundation Models and Self-Supervised Learning in Remote Sensing

Recent developments in remote sensing and image segmentation show a marked shift toward leveraging foundation models to address long-standing challenges. A notable trend is the adaptation of models such as the Segment Anything Model (SAM) to varied tasks, including historical map segmentation and bone segmentation in CT scans, demonstrating their versatility and robustness. There is also growing emphasis on self-supervised and semi-supervised learning, which proves effective when labeled data are scarce, as seen in frameworks such as PIEViT and AACL. These approaches strengthen representation learning from unlabeled data and improve the generalization and transferability of models across tasks and datasets.
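A common pattern behind the semi-supervised methods above is consistency regularization with pseudo-labels: a weakly augmented view of an unlabeled image supplies confident hard labels that supervise a strongly augmented view. The sketch below shows this FixMatch-style loss in plain numpy; the function name, the confidence threshold value, and the per-pixel layout are illustrative assumptions, not any specific framework's API.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """Weak-to-strong consistency on unlabeled pixels.

    weak_logits, strong_logits: (num_pixels, num_classes) predictions for
    weakly and strongly augmented views of the same unlabeled image.
    Pixels whose weak-view confidence falls below `threshold` are masked out.
    Returns the mean masked cross-entropy against the hard pseudo-labels.
    """
    weak_probs = softmax(weak_logits)
    confidence = weak_probs.max(axis=1)
    pseudo_label = weak_probs.argmax(axis=1)
    mask = confidence >= threshold          # keep only confident pixels

    strong_probs = softmax(strong_logits)
    # Cross-entropy of the strong view against the pseudo-labels.
    ce = -np.log(strong_probs[np.arange(len(pseudo_label)), pseudo_label] + 1e-12)
    return (ce * mask).sum() / max(mask.sum(), 1)
```

Only confident pixels contribute, so early in training (when predictions are uncertain) the unsupervised term is effectively switched off, which is what makes the weak-to-strong scheme stable with very few labels.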

Another emerging direction is cross-modal fusion, exemplified by CoMiX, which improves semantic segmentation of hyperspectral images paired with an auxiliary modality (HSI-X) by integrating complementary information from the two data types. The approach underscores the value of capturing dynamic interactions between modalities to enhance feature extraction and fusion.
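To make the fusion idea concrete, here is a minimal numpy sketch of channel-wise gated fusion between a hyperspectral branch and an auxiliary "X" branch. This is a deliberate simplification: CoMiX itself models the cross-modal interaction with deformable convolutions, whereas the gate here is just a learned sigmoid over pooled channel statistics. All names and shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(hsi_feat, x_feat, w_gate, b_gate):
    """Channel-wise gated fusion of two modality feature maps.

    hsi_feat, x_feat: (C, H, W) features from the hyperspectral branch and
    the auxiliary "X" branch (e.g. LiDAR or SAR). A gate computed from the
    concatenated channel descriptors decides, per channel, how much of each
    modality to keep. w_gate has shape (C, 2C); b_gate has shape (C,).
    """
    # Global average pooling over spatial dims -> (2C,) joint descriptor.
    desc = np.concatenate([hsi_feat.mean(axis=(1, 2)), x_feat.mean(axis=(1, 2))])
    gate = sigmoid(w_gate @ desc + b_gate)          # (C,) values in (0, 1)
    # Convex combination of the two modalities per channel.
    return gate[:, None, None] * hsi_feat + (1.0 - gate)[:, None, None] * x_feat
```

Because the gate is computed from both modalities jointly, the network can lean on the auxiliary channel where the hyperspectral signal is weak, which is the intuition the deformable-convolution version realizes at a finer spatial granularity.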

Noteworthy papers include 'MLPMatch: Multi-Level-Perturbation Match for Semi-Supervised Semantic Segmentation,' which introduces a novel framework that achieves state-of-the-art performance by integrating network perturbations with weak-to-strong consistency regularization. Another standout is 'MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps,' which presents a parameter-efficient fine-tuning strategy that adapts SAM for prompt-free historical map segmentation tasks.
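The parameter-efficient fine-tuning strategy mentioned for MapSAM can be illustrated with a generic LoRA-style adapter: the pretrained weight is frozen and only a low-rank correction is trained. The class below is a numpy sketch of that general idea, not a reproduction of MapSAM's actual adapter design; all names and hyperparameters are assumptions.

```python
import numpy as np

class LoRALinear:
    """Low-rank adaptation of a frozen linear layer (LoRA-style PEFT).

    The pretrained weight W stays frozen; only the low-rank factors A and B
    are trained, contributing a correction (alpha / rank) * B @ A. With B
    initialized to zero, the adapted layer starts out identical to the
    pretrained one.
    """

    def __init__(self, weight, rank=4, alpha=8, rng=None):
        rng = np.random.default_rng(rng)
        out_dim, in_dim = weight.shape
        self.weight = weight                            # frozen pretrained weight
        self.A = rng.normal(0, 0.01, (rank, in_dim))    # trainable down-projection
        self.B = np.zeros((out_dim, rank))              # trainable up-projection, zero init
        self.scale = alpha / rank                       # so the delta starts at 0

    def __call__(self, x):
        # Output equals the frozen layer's output until B is updated.
        return x @ (self.weight + self.scale * self.B @ self.A).T
```

Training only A and B (a few percent of the original parameter count) is what makes adapting a large foundation model like SAM tractable for a niche domain such as historical maps.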

Sources

Revisiting Network Perturbation for Semi-Supervised Semantic Segmentation

A Nerf-Based Color Consistency Method for Remote Sensing Images

STARS: Sensor-agnostic Transformer Architecture for Remote Sensing

Curriculum Learning for Few-Shot Domain Adaptation in CT-based Airway Tree Segmentation

Joint-Optimized Unsupervised Adversarial Domain Adaptation in Remote Sensing Segmentation with Prompted Foundation Model

Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing

Superpixel Segmentation: A Long-Lasting Ill-Posed Problem

Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs

United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images

MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps

XPoint: A Self-Supervised Visual-State-Space based Architecture for Multispectral Image Registration

Large-scale Remote Sensing Image Target Recognition and Automatic Annotation

CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory

Slender Object Scene Segmentation in Remote Sensing Image Based on Learnable Morphological Skeleton with Segment Anything Model

Zero-shot capability of SAM-family models for bone segmentation in CT scans

Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation

CoMiX: Cross-Modal Fusion with Deformable Convolutions for HSI-X Semantic Segmentation

Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation

Adaptively Augmented Consistency Learning: A Semi-supervised Segmentation Framework for Remote Sensing

Instruction-Driven Fusion of Infrared-Visible Images: Tailoring for Diverse Downstream Tasks
