Advancements in Image Segmentation: Efficiency and Accuracy through Foundation Models

Recent developments in image segmentation, particularly in the medical and biological domains, show a significant shift toward foundation models such as the Segment Anything Model (SAM) and its variants for more efficient, accurate, and less labor-intensive segmentation. Innovations focus primarily on reducing dependence on large labeled datasets and manual annotation: new frameworks extend existing models through few-shot learning, domain adaptation, and techniques such as Layer-wise Relevance Propagation (LRP), Compact Convolutional Transformers (CCT), and prototype-guided prompt learning. These advances not only improve segmentation accuracy and efficiency but also open new avenues for applications in medical diagnosis, biological research, and beyond.

Noteworthy papers include:

  • A novel framework that uses LRP to increase the spatial resolution of attention maps from attention-based Multiple Instance Learning (MIL) and uses them to prompt SAM2, significantly improving segmentation accuracy for small structures such as hyper-reflective foci in OCT images.
  • PGP-SAM, which introduces a prototype-based few-shot tuning approach for medical image segmentation, achieving superior performance with minimal manual intervention.
  • SST, a label-efficient method for fine-grained segmentation in biological specimen images, demonstrating high-quality segmentation with just one labeled image per species.
  • SAM-DA, a decoder adapter for efficient medical domain adaptation, showcasing improved segmentation across tasks with minimal trainable parameters.
  • Guided SAM, which enhances part segmentation efficiency by learning positional prompts from coarse patch annotations, significantly reducing manual effort.
  • FATE-SAM, a method for few-shot adaptation of SAM2 for 3D medical image segmentation, eliminating the need for large annotated datasets and expert intervention.
  • PartCATSeg, a framework for open-vocabulary part segmentation that addresses challenges in part-level image-text correspondence and structural understanding, setting a new baseline for generalization to unseen part categories.
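Several of the papers above (PGP-SAM, Guided SAM, SST) share a common idea: instead of a human clicking prompts, derive point prompts automatically from a small amount of labeled data and feed them to a SAM-style model. The sketch below is a minimal, hypothetical illustration of that idea, not any paper's actual implementation: it builds a class "prototype" (the mean feature vector over a labeled support mask) and selects the most similar pixel in an unlabeled query image as a positive point prompt. All shapes and names are assumptions.

```python
import numpy as np

def prototype_point_prompt(support_feats, support_mask, query_feats):
    """Toy prototype-guided prompt selection (hypothetical, not PGP-SAM itself).

    support_feats: (H, W, C) feature map of a labeled support image
    support_mask:  (H, W) binary mask of the target class in the support image
    query_feats:   (H, W, C) feature map of the unlabeled query image
    Returns (row, col) of the query pixel most similar to the class prototype,
    usable as a positive point prompt for a promptable segmenter.
    """
    # Class prototype: mean feature vector over the masked support region.
    proto = support_feats[support_mask.astype(bool)].mean(axis=0)
    proto = proto / (np.linalg.norm(proto) + 1e-8)

    # Cosine similarity between every query pixel and the prototype.
    q = query_feats / (np.linalg.norm(query_feats, axis=-1, keepdims=True) + 1e-8)
    sim = q @ proto  # (H, W) similarity map

    # Best-matching pixel becomes the point prompt.
    return np.unravel_index(np.argmax(sim), sim.shape)
```

In the papers summarized here, such a prompt would then be passed to SAM/SAM2 (e.g. as a positive point coordinate) to produce the final mask; the feature maps would come from a pretrained image encoder rather than being raw arrays as in this toy.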

Sources

Weakly Supervised Segmentation of Hyper-Reflective Foci with Compact Convolutional Transformers and SAM2

PGP-SAM: Prototype-Guided Prompt Learning for Efficient Few-Shot Medical Image Segmentation

Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation

SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation

Crowdsourced human-based computational approach for tagging peripheral blood smear sample images from Sickle Cell Disease patients using non-expert users

TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations

Guided SAM: Label-Efficient Part Segmentation

Boosting Sclera Segmentation through Semi-supervised Learning with Fewer Labels

SkipClick: Combining Quick Responses and Low-Level Features for Interactive Segmentation in Winter Sports Contexts

Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation

Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
