Advancements in Surgical Automation and Scene Understanding

The field of surgical research is seeing significant advances in automation and scene understanding, driven by innovations in artificial intelligence, computer vision, and machine learning. Recent work focuses on improving the accuracy and efficiency of surgical procedures, enhancing patient outcomes, and streamlining surgical training. Notably, there is growing emphasis on automated assessment and feedback systems, as well as on large datasets and foundation models that support perception and decision-making in surgical settings. Some noteworthy papers in this area include:

LLM-SAP, which introduces a surgical action planning framework based on large language models that predicts future actions and generates text responses by interpreting natural-language prompts of surgical goals.

fine-CLIP, which proposes a vision-language model that learns object-centric features and exploits the hierarchy of the triplet formulation to improve zero-shot recognition of novel surgical triplets.

Surg-3M, which presents a comprehensive dataset and foundation model for perception in surgical settings, with strong results on downstream tasks such as surgical phase recognition, action recognition, and tool presence detection.
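The zero-shot recognition that fine-CLIP builds on follows the standard CLIP-style recipe: embed an image and a set of candidate text prompts, then rank labels by cosine similarity. A minimal sketch of that matching step, using toy NumPy vectors standing in for real encoder outputs (the function name, labels, and embedding values here are illustrative, not from the paper):

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels):
    """Rank candidate labels by cosine similarity between one image
    embedding and a stack of text-prompt embeddings (CLIP-style)."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                               # cosine similarities
    probs = np.exp(sims) / np.exp(sims).sum()      # softmax over labels
    order = np.argsort(-sims)                      # best match first
    return [(labels[i], float(probs[i])) for i in order]

# Toy surgical-triplet labels and embeddings (hypothetical values).
labels = ["grasper,grasp,gallbladder", "hook,dissect,cystic_duct"]
rng = np.random.default_rng(0)
text_embs = rng.normal(size=(2, 8))
image_emb = text_embs[0] + 0.1 * rng.normal(size=8)  # near label 0

ranked = zero_shot_classify(image_emb, text_embs, labels)
```

Because classification reduces to embedding comparison, new triplet labels can be recognized at test time simply by encoding their text descriptions, with no retraining.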

Sources

AI-driven Automation of End-to-end Assessment of Suturing Expertise

Surgical Action Planning with Large Language Models

Innovative Automated Stretch Elastic Waistband Sewing Machine for Garment Manufacturing

EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos

fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models

Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings

Dataset and Analysis of Long-Term Skill Acquisition in Robot-Assisted Minimally Invasive Surgery

Synergistic Bleeding Region and Point Detection in Surgical Videos

Endo-TTAP: Robust Endoscopic Tissue Tracking via Multi-Facet Guided Attention and Hybrid Flow-point Supervision

EndoLRMGS: Complete Endoscopic Scene Reconstruction combining Large Reconstruction Modelling and Gaussian Splatting
