Out-of-Distribution Detection and Anomaly Detection

Current Developments in Out-of-Distribution Detection and Anomaly Detection

The field of out-of-distribution (OOD) detection and anomaly detection has seen significant advances over the past week, driven by new methodologies and datasets. These developments are crucial for the robustness and reliability of machine learning models, particularly in real-world applications where distribution shifts are common.

General Direction of the Field

  1. Post-Hoc and Lightweight Methods: There is a growing trend towards post-hoc methods that do not require retraining or access to the original training data. These methods are designed to be lightweight and adaptable across different architectures, addressing the limitations of current OOD detection techniques that often struggle with model transferability.

  2. Focus on Semantic Shifts and Covariate Shifts: Researchers are increasingly distinguishing between semantic shift (inputs from entirely new classes) and covariate shift (familiar classes whose input distribution has changed, e.g., through corruption or a different sensor). New datasets and methodologies are being developed to target each case, so that models generalize better under both kinds of shift.

  3. Integration of Large Language Models (LLMs): The integration of LLMs into anomaly and OOD detection is gaining traction. LLMs offer advanced comprehension and generative capabilities that can be leveraged for detecting anomalies in text and other sequential data, marking a significant shift from traditional paradigms.

  4. Compression and Efficiency for Embedded Systems: There is a strong emphasis on developing OOD detectors that can operate efficiently on embedded systems with memory and power constraints. Techniques such as quantization, pruning, and knowledge distillation are being explored to compress models without significantly compromising detection performance.

  5. Unsupervised and Semi-Supervised Approaches: Unsupervised and semi-supervised methods are being favored because they can detect OOD samples with little or no labeled data, which is valuable in the many scenarios where labels are scarce or unavailable.

  6. Benchmarking and Standardization: The introduction of new benchmark datasets is helping to standardize evaluations and comparisons of OOD detection methods. These datasets are designed to reflect real-world challenges and are being used to validate the effectiveness of new techniques.
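To make the post-hoc idea in point 1 concrete, the sketch below scores a trained classifier's raw logits without any retraining or access to training data. It uses two standard baselines from the literature, the maximum softmax probability (Hendrycks & Gimpel) and the energy score (Liu et al.), rather than the specific method of any paper surveyed here; the function names and the example threshold are illustrative.

```python
import math

def energy_score(logits, T=1.0):
    """Negative free energy, T * logsumexp(logits / T): higher => more in-distribution."""
    m = max(logits)  # max-shift for numerical stability
    return m + T * math.log(sum(math.exp((z - m) / T) for z in logits))

def msp_score(logits):
    """Maximum softmax probability baseline: higher => more in-distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

def is_ood(logits, threshold, score_fn=energy_score):
    # Flag a sample whose score falls below a threshold chosen on
    # held-out in-distribution data (e.g., its 5th percentile).
    return score_fn(logits) < threshold

id_logits = [8.0, 0.5, 0.2]    # peaked: the model is confident
ood_logits = [1.1, 1.0, 0.9]   # near-uniform: the model is unsure
print(energy_score(id_logits) > energy_score(ood_logits))  # prints True
```

Because both scores read only the logits of an already-trained model, they transfer across architectures, which is exactly the post-hoc, lightweight property the trend above describes.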

Noteworthy Papers

  1. Logit Scaling for Out-of-Distribution Detection: Introduces a simple, post-hoc method that effectively distinguishes between in-distribution and OOD samples across various architectures, demonstrating state-of-the-art performance.

  2. SOOD-ImageNet: A large-scale dataset designed to address semantic shift and covariate shift in OOD detection, showcasing its potential to significantly advance OOD research in computer vision.

  3. VQ-Flow: A novel flow-based method for multi-class anomaly detection, leveraging hierarchical vector quantization to achieve state-of-the-art performance in a unified training scheme.

  4. Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment: Proposes a design methodology that combines quantization, pruning, and knowledge distillation to develop lean OOD detectors capable of real-time inference on embedded systems.

  5. Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey: Provides a comprehensive survey of the integration of LLMs into anomaly and OOD detection, proposing a new taxonomy and discussing potential future directions.
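Paper 4 combines quantization, pruning, and knowledge distillation; as a minimal illustration of just the first of these, the sketch below applies symmetric per-tensor post-training int8 quantization to a list of weights. This is a generic textbook scheme, not that paper's pipeline; real deployments typically quantize per-channel and also handle activations and biases.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= scale * q, q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back if all zeros
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [scale * v for v in q]

weights = [0.82, -1.27, 0.003, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight is within half a quantization step of the original.
assert all(abs(w - r) <= scale / 2 + 1e-12 for w, r in zip(weights, restored))
```

Storing one int8 code plus a single float scale per tensor cuts weight memory roughly 4x versus float32, which is the kind of saving that makes real-time inference feasible on the embedded targets discussed above.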

These developments highlight the ongoing efforts to create more robust, efficient, and adaptable OOD detection methods, paving the way for safer and more reliable machine learning applications in diverse real-world settings.

Sources

Logit Scaling for Out-of-Distribution Detection

DNN-GDITD: Out-of-distribution detection via Deep Neural Network based Gaussian Descriptor for Imbalanced Tabular Data

SOOD-ImageNet: a Large-Scale Dataset for Semantic Out-Of-Distribution Image Classification and Semantic Segmentation

LoGex: Improved tail detection of extremely rare histopathology classes via guided diffusion

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization

A Review of Image Retrieval Techniques: Data Augmentation and Adversarial Learning Approaches

Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment

Supervised Pattern Recognition Involving Skewed Feature Densities

Latent Distillation for Continual Object Detection at the Edge

PMLBmini: A Tabular Classification Benchmark Suite for Data-Scarce Applications

Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey

Can Your Generative Model Detect Out-of-Distribution Covariate Shift?

Oddballness: universal anomaly detection with language models

Image Recognition for Garbage Classification Based on Pixel Distribution Learning

Resultant: Incremental Effectiveness on Likelihood for Unsupervised Out-of-Distribution Detection

Ultra-imbalanced classification guided by statistical information