Advances in Audio and Acoustic Signal Processing

Audio and acoustic signal processing has advanced recently in three areas: noise adaptation, real-time anomaly detection, and multimodal data synthesis. Noise adaptation networks have improved the robustness and accuracy of Morse code image classification under varied noise conditions, addressing a gap in handling diverse noise types. In real-time acoustic anomaly detection, hybrid models that combine temporal convolutions with representation learning have shown superior performance in industrial machinery monitoring.
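To make the "temporal convolution" building block concrete, here is a minimal NumPy sketch of a causal dilated 1-D convolution, the basic operation in temporal convolutional networks. It is an illustration only, not the cited paper's model: the function name, the zero padding, and the fixed (unlearned) kernel are assumptions made for this example.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before the start of the signal are treated as zero, so each
    output depends only on current and past inputs (causality).
    Hypothetical helper for illustration, not code from any cited paper.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation          # history needed by the filter
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation              # shift back by k * dilation samples
        y += w * padded[start:start + len(x)]
    return y
```

Stacking such layers with growing dilation (1, 2, 4, ...) widens the receptive field quickly while keeping the per-sample cost low, which is one reason this family of models suits real-time monitoring.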

Multimodal data synthesis has been advanced through hierarchical mixture models, enabling the generation of high-resolution images from incomplete data across different modalities. This approach not only tackles the challenge of missing information but also leverages dataset-level insights for more effective synthesis.
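As a rough illustration of why product-of-experts models cope with missing modalities, the sketch below fuses per-modality Gaussian estimates using the standard product-of-Gaussians identity: precisions add, and the fused mean is the precision-weighted average. It assumes diagonal Gaussian experts and does not reproduce the hierarchical mixture structure of the cited work; the function name is hypothetical.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Fuse Gaussian experts N(mean_i, var_i) into a single Gaussian.

    means, variances: arrays of shape (n_experts, dim), one row per
    observed modality. A missing modality is handled by leaving its row out.
    Hypothetical helper for illustration only.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances
    fused_var = 1.0 / precisions.sum(axis=0)                  # precisions add
    fused_mean = fused_var * (precisions * means).sum(axis=0)  # precision-weighted mean
    return fused_mean, fused_var
```

Because the fusion is a sum over whichever experts are present, the same function works for any subset of observed modalities, and confident experts (small variance) dominate the fused estimate.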

In music source separation, ensemble methods that combine multiple models have achieved better separation performance, particularly for hierarchical stem separation. The results point to further research on extending models beyond the traditional stems.
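One simple way to combine several separation models is a weighted average of their per-stem outputs, sketched below. This is an assumed combination rule chosen for illustration; the cited paper's ensemble scheme may differ (for example, by selecting a different model per stem). The function name and the dictionary layout are hypothetical.

```python
import numpy as np

def ensemble_stems(estimates, weights=None):
    """Average per-stem waveform estimates from several separation models.

    estimates: list of dicts mapping stem name -> waveform array, one dict
               per model; all models are assumed to output the same stems
               with the same length.
    weights:   optional per-model weights, normalized to sum to 1.
    Hypothetical helper for illustration only.
    """
    if weights is None:
        weights = np.ones(len(estimates))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return {
        stem: sum(w * np.asarray(est[stem], dtype=float)
                  for w, est in zip(weights, estimates))
        for stem in estimates[0]
    }
```

In practice the weights could be set from validation scores per model, so that a model that separates one stem well contributes more to the final estimate.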

Noteworthy papers include:

  • Noise Adaption Network for Morse Code Image Classification: Introduces a two-stage approach that improves accuracy and robustness in noisy environments.
  • Temporal Convolution-based Hybrid Model Approach with Representation Learning for Real-Time Acoustic Anomaly Detection: Combines semi-supervised learning with representation learning to effectively handle intricate anomaly patterns in acoustic data.
  • Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts: Proposes a deep mixture model that synthesizes missing images from observed images in different modalities, addressing key challenges in multimodal data synthesis.

Together, these advances give researchers and practitioners in audio and acoustic signal processing new tools and methods.

Sources

Noise Adaption Network for Morse Code Image Classification

Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts

Temporal Convolution-based Hybrid Model Approach with Representation Learning for Real-Time Acoustic Anomaly Detection

GreenEye: Development of Real-Time Traffic Signal Recognition System for Visual Impairments

Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning

An Ensemble Approach to Music Source Separation: A Comparative Analysis of Conventional and Hierarchical Stem Separation

ByteNet: Rethinking Multimedia File Fragment Classification through Visual Perspectives

OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup

Producer vs. Rapper: Who Dominates the Hip Hop Sound? A Case Study

Knowledge Distillation for Real-Time Classification of Early Media in Voice Communications

A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth

Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks

Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization

USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis

Gaussian Derivative Change-point Detection for Early Warnings of Industrial System Failures

SoundCollage: Automated Discovery of New Classes in Audio Datasets

Neurobench: DCASE 2020 Acoustic Scene Classification benchmark on XyloAudio 2

Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
