Music and dance generation is advancing rapidly, with current work focused on producing high-quality audio and dance movements that remain synchronized with one another. Recent proposals include frameworks built on gating mechanisms, compression-based sequence modeling, and chain-of-thought prompting, all aimed at improving the alignment and creativity of the generated music and dance.
Noteworthy papers in this area include LZMidi, a lightweight symbolic music generation framework built on a Lempel-Ziv-induced sequential probability assignment, which achieves results competitive with state-of-the-art diffusion models while substantially reducing computational overhead. MusiCoT, in turn, proposes a chain-of-thought prompting technique tailored to music generation: an autoregressive model first outlines the overall musical structure before emitting audio tokens, improving the coherence and creativity of the resulting compositions.
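To make the compression-based idea concrete, the following is a minimal sketch of an LZ78-induced sequential probability assignment of the kind LZMidi builds on: an LZ78 parse tree is grown over the token stream, and next-symbol probabilities come from smoothed counts at the current tree node. The smoothing constant, alphabet size, and token stream here are illustrative assumptions, not LZMidi's actual tokenization or hyperparameters.

```python
import numpy as np

class LZ78SPA:
    """Minimal LZ78-induced sequential probability assignment (SPA).

    Walks an LZ78 parse tree over the input; at each node it predicts the
    next symbol from smoothed child counts, then updates the tree. When a
    phrase ends (unseen child), a new leaf is added and parsing restarts
    at the root, following the standard LZ78 incremental parse.
    """

    def __init__(self, alphabet_size, gamma=0.5):
        self.A = alphabet_size
        self.gamma = gamma          # smoothing constant (assumed value)
        self.root = {"counts": np.zeros(alphabet_size), "children": {}}
        self.node = self.root       # current position in the parse tree

    def predict(self):
        """Distribution over the next symbol at the current tree node."""
        c = self.node["counts"]
        return (c + self.gamma) / (c.sum() + self.gamma * self.A)

    def update(self, symbol):
        """Record `symbol`, then descend into (or grow) the tree."""
        self.node["counts"][symbol] += 1
        child = self.node["children"].get(symbol)
        if child is None:
            # Phrase boundary: add a leaf and restart the parse at the root.
            self.node["children"][symbol] = {
                "counts": np.zeros(self.A), "children": {}}
            self.node = self.root
        else:
            self.node = child

    def log_loss(self, sequence):
        """Average log loss in bits/symbol, the quantity an SPA minimizes."""
        total = 0.0
        for s in sequence:
            total -= np.log2(self.predict()[s])
            self.update(s)
        return total / len(sequence)


# Fit the SPA to a token stream, then sample new tokens from it.
rng = np.random.default_rng(0)
spa = LZ78SPA(alphabet_size=16)
train = rng.integers(0, 16, size=5000)   # stand-in for symbolic music tokens
print(f"train log loss: {spa.log_loss(train):.3f} bits/symbol")

generated = []
for _ in range(64):
    s = int(rng.choice(spa.A, p=spa.predict()))
    spa.update(s)
    generated.append(s)
print("sample:", generated[:16])
```

Because the model is just a count tree, both training and sampling run in time linear in the sequence length, which is the source of the computational savings relative to diffusion models.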
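The decoding order behind MusiCoT-style chain-of-thought prompting can likewise be sketched as a two-stage autoregressive loop: structure tokens are generated first as the "thought," then audio tokens are decoded conditioned on that outline. This is a hypothetical illustration only; `next_token_logits` stands in for a trained language model, and the vocabulary sizes, separator token, and tuple-tagged context are placeholder assumptions rather than MusiCoT's actual interface.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder vocabularies (assumptions): MusiCoT derives its structure
# tokens and audio tokens from learned representations, not these sizes.
STRUCT_VOCAB, AUDIO_VOCAB = 64, 1024
BOS, SEP = "<bos>", "<sep>"   # separator marks the end of the "thought"

def next_token_logits(context, vocab_size):
    """Stand-in for an autoregressive LM forward pass (hypothetical)."""
    return rng.standard_normal(vocab_size)

def sample(logits, temperature=1.0):
    """Draw one token from a temperature-scaled softmax over logits."""
    scaled = logits / temperature
    p = np.exp(scaled - np.max(scaled))
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

def generate(n_struct=16, n_audio=256):
    context = [BOS]
    # Stage 1: chain of thought -- outline the piece's overall structure
    # as a short token sequence before any audio is produced.
    structure = []
    for _ in range(n_struct):
        tok = sample(next_token_logits(context, STRUCT_VOCAB))
        structure.append(tok)
        context.append(("struct", tok))
    context.append(SEP)
    # Stage 2: decode audio tokens conditioned on the structural outline,
    # so long-range form is fixed before local detail is filled in.
    audio = []
    for _ in range(n_audio):
        tok = sample(next_token_logits(context, AUDIO_VOCAB))
        audio.append(tok)
        context.append(("audio", tok))
    return structure, audio

structure, audio = generate()
print("structure outline:", structure)
print("first audio tokens:", audio[:12])
```

The design point the sketch captures is that the structural outline sits in the context for every subsequent audio token, giving the model a global plan to attend to rather than committing to form one audio token at a time.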