
The Ultimate 2026 Guide to AI Vocal Separation Models: Deep Dive into SDR, Fullness, and Bleedless Metrics

At the intersection of audio engineering and machine learning, AI audio source separation has moved well past the question of whether separation is possible. Today, the goal is near-lossless, mastering-grade separation.

As the MVSEP leaderboard has shifted from early Hybrid Demucs models to the current dominance of BS-Roformer variants, audio producers face increasingly complex model choices.

This guide draws on the latest MVSEP Multisong benchmark data to analyze current state-of-the-art (SOTA) separation models and to suggest selection strategies for different musical scenarios.


Key Metrics Explained: SDR, Fullness, and Bleedless

When evaluating AI vocal separation quality, three core dimensions are widely recognized in the industry:

  1. SDR (Signal-to-Distortion Ratio)
    Measures overall separation accuracy: the ratio, in dB, of the energy of the true stem to the energy of the error in the separated stem. Higher is better.

  2. Fullness
    Indicates how well a model preserves instrumental details, dynamic range, and low-frequency texture.

  3. Bleedless
    Measures how effectively the model removes vocal remnants and artifacts from the instrumental track.

Note: Fullness and Bleedless usually trade off against each other. Pushing for an extremely clean result can thin out the instrumental. Choose the model to suit the material; studio masters and live recordings, for example, often call for different priorities.
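To make the SDR figure concrete, here is a minimal sketch of the basic energy-ratio definition in plain Python. It assumes the simple form SDR = 10 * log10(signal energy / error energy); MVSEP's own evaluation code may differ in details such as channel handling. The function name sdr_db and the toy sine signal are illustrative only.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-Distortion Ratio in dB for two equal-length sample sequences."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps guards against division by zero and log10(0) on silent or perfect input
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy data (assumption): one second of a 440 Hz sine as the "true" stem,
# and an estimate whose amplitude is off by 1%.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.99 * r for r in reference]

print(round(sdr_db(reference, estimate), 2))  # 40.0
```

A 1% amplitude error leaves an error signal with 1/10,000 of the reference energy, which is why the toy example lands at 40 dB. The roughly 17 dB scores in the benchmark correspond to an error energy of about 2% of the stem's energy.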


Key Factors When Choosing a Model

1. Song Type and Genre

Each song differs in instrumentation, mixing style, and effect processing. A model that performs well on one track may perform differently on another.

2. Fullness vs Bleedless Metrics

  • Fullness represents how well the accompaniment details are preserved.
  • Bleedless indicates how effectively residual vocals are removed.

MVSEP publishes Multisong benchmark results, so you can sort and compare models by these metrics.

3. Phase Fix Technology

If you encounter vocal remnants or low-frequency humming, tools such as Phase Fixer or Phase Swapper in UVR > Tools can help correct phase-related artifacts.


Major AI Vocal / Instrumental Separation Models in 2026

The following data is based on the MVSEP Multisong Dataset, showing the performance of individual models:

Model Name                   | Architecture | Inst. Fullness | Inst. Bleedless | SDR (dB) | Core Use Case
-----------------------------|--------------|----------------|-----------------|----------|--------------
Becruily Mel-Roformer "Deux" | Mel-Roformer | 34.25          | 41.36           | 17.55    | All-round champion: balanced, high SDR, no phase correction needed
Unwa HyperAce v2             | BS-Roformer  | 38.03          | 37.87           | 17.40    | Extreme detail: wide soundstage, ideal for complex vocal arrangements
BS-Roformer Resurrection     | BS-Roformer  | 34.93          | 40.14           | 17.25    | Piano & electric guitar: smooth mid-low frequencies, ultra-low noise floor
Unwa Mel-Roformer V1e+       | Mel-Roformer | 37.89          | 36.53           | 16.65    | Modern mixes: great for electronic, trap, and high-energy backgrounds
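The benchmark figures above can be sorted by whichever metric matters most for a given track. The sketch below hard-codes the four rows from the table; the ModelScore class, the rank_by and rank_weighted helpers, and the weighted blend are illustrative assumptions and not part of MVSEP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScore:
    name: str
    fullness: float   # Inst. Fullness
    bleedless: float  # Inst. Bleedless
    sdr: float        # SDR in dB


# Values copied from the benchmark table in this guide.
MODELS = [
    ModelScore('Becruily Mel-Roformer "Deux"', 34.25, 41.36, 17.55),
    ModelScore("Unwa HyperAce v2", 38.03, 37.87, 17.40),
    ModelScore("BS-Roformer Resurrection", 34.93, 40.14, 17.25),
    ModelScore("Unwa Mel-Roformer V1e+", 37.89, 36.53, 16.65),
]


def rank_by(metric, models=MODELS):
    """Return models sorted from best to worst on one metric name."""
    return sorted(models, key=lambda m: getattr(m, metric), reverse=True)


def rank_weighted(fullness_weight, models=MODELS):
    """Hypothetical heuristic: blend Fullness and Bleedless with a weight in [0, 1]."""
    if not 0.0 <= fullness_weight <= 1.0:
        raise ValueError("fullness_weight must be between 0 and 1")
    w = fullness_weight
    return sorted(
        models,
        key=lambda m: w * m.fullness + (1.0 - w) * m.bleedless,
        reverse=True,
    )


for metric in ("fullness", "bleedless", "sdr"):
    print(metric, "->", rank_by(metric)[0].name)
```

Sorting by Fullness puts Unwa HyperAce v2 first, while sorting by Bleedless or SDR puts Becruily "Deux" first, which illustrates the trade-off described earlier. The weighted blend is only one possible way to express a preference and should not be read as an official score.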

Expert Model Analysis

1. Becruily Dual Mel-Roformer "Deux"

A leading SOTA model that automatically performs internal phase inversion correction.

Technical Highlights

  • Excellent for commercial mixes
  • Outstanding preservation of instruments like piano
  • Minimizes common artifacts such as watery or phasey sounds

Advanced Tuning

Recommended accompaniment parameter:

chunk_size ≈ 705,600

Larger chunk sizes may increase fullness, but exceeding 882,000 may reduce SDR.


2. Unwa HyperAce v2 (BS-Roformer)

The preferred model when the goal is the highest score on aura_mrstft, a multi-resolution STFT metric.

Sound Characteristics

  • Highly transparent acoustic instrument reproduction
  • Fuller sound compared to V1e+

Limitations

  • Less effective for vocoder-style audio
  • Slower inference compared to Resurrection

3. BS-Roformer Resurrection

Designed specifically to reduce phase distortion artifacts.

Recommended Usage

For minimalist piano pieces or tracks with quiet sections, Resurrection significantly reduces background hiss and subtle noise artifacts.


Practical Optimization Tips

1. Audio Segmentation & Chunk Size

Recommended settings:

  • Becruily Deux: 661,500 – 749,700 (higher may reduce SDR)
  • V1e+: the default of roughly 570,000 works well
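Chunk sizes are given in samples, which makes them hard to compare at a glance. The sketch below converts the values quoted in this guide to seconds. It assumes a 44.1 kHz sample rate, which the guide does not state; the helper name chunk_seconds is illustrative.

```python
SAMPLE_RATE = 44_100  # assumption: CD-quality audio at 44.1 kHz


def chunk_seconds(chunk_size, sample_rate=SAMPLE_RATE):
    """Convert a chunk size in samples to a duration in seconds."""
    return chunk_size / sample_rate


# Chunk sizes mentioned in this guide for Becruily Deux.
for chunk_size in (661_500, 705_600, 749_700, 882_000):
    print(f"{chunk_size:>7,} samples = {chunk_seconds(chunk_size):.0f} s")

# 661,500 samples = 15 s
# 705,600 samples = 16 s
# 749,700 samples = 17 s
# 882,000 samples = 20 s
```

Under that assumption the recommended range corresponds to windows of 15 to 17 seconds, and the 882,000 limit to 20 seconds.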

2. Phase Fix / Phase Swapper

In UVR > Tools:

  • Phase Fix can remove low-frequency humming
  • Also helps reduce minor vocal remnants

Using the output of a bleedless-oriented model as the phase reference can further improve results.
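Conceptually, a phase swap keeps the magnitude of one separation and borrows the phase of another. The sketch below shows that idea on a handful of complex STFT bins. It is an illustration under stated assumptions, not UVR's implementation: it assumes both outputs were transformed with the same STFT settings, and real tools may restrict the swap to a frequency band. The function name swap_phase and the sample bins are hypothetical.

```python
import cmath


def swap_phase(target_bins, reference_bins):
    """Keep each target bin's magnitude and take the phase from the reference bin."""
    if len(target_bins) != len(reference_bins):
        raise ValueError("both spectra must have the same number of bins")
    return [
        cmath.rect(abs(t), cmath.phase(r))
        for t, r in zip(target_bins, reference_bins)
    ]


# Toy bins (assumption): target from a fullness-oriented model,
# reference from a bleedless-oriented model.
target = [1 + 1j, 0 - 2j]
reference = [3 + 0j, 0 + 5j]

fixed = swap_phase(target, reference)
for before, after in zip(target, fixed):
    print(f"magnitude {abs(before):.3f} -> {abs(after):.3f}, "
          f"phase {cmath.phase(before):+.3f} -> {cmath.phase(after):+.3f}")
```

Each bin's magnitude is unchanged while its phase now matches the reference, which is the property the bleedless-oriented reference is meant to contribute.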


3. Model Comparison & Hybrid Workflow

Combining models often yields the best results:

  • Piano solos: use Resurrection
  • Dense vocal arrangements: use HyperAce v2

Segmented processing or multi-model comparisons can significantly improve separation quality.


4. Reference MVSEP Benchmark Data

MVSEP provides quantitative metrics including Fullness, Bleedless, and SDR, which are essential when selecting models.

MVSEP model test results:
https://mvsep.com/quality_checker/entry/9475


Offline Processing Workflow Recommendations

1. Privacy & Lossless Output

LyRuno runs vocal separation completely offline, so files are never uploaded and stay private.

https://lyruno.com/


2. Batch Processing

Import multiple tracks at once to improve workflow efficiency.


3. Overlap & Chunking Parameters

Setting an Overlap value (e.g., 8) makes neighbouring chunks overlap, which helps reduce artifacts at chunk boundaries during chunk-based processing.
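The following sketch shows one common way an overlap setting can be interpreted: the hop between chunks is chunk_size divided by the overlap value, so each sample is processed several times and the results can be averaged. This interpretation is an assumption; the guide does not say how any particular tool defines Overlap. The helper names chunk_starts and coverage, and the 60-second example, are illustrative.

```python
def chunk_starts(total_samples, chunk_size, overlap):
    """Start offsets of chunks, assuming hop = chunk_size // overlap."""
    if overlap < 1 or chunk_size < overlap:
        raise ValueError("overlap must be >= 1 and no larger than chunk_size")
    step = chunk_size // overlap
    last_start = max(total_samples - chunk_size, 0)
    starts = list(range(0, last_start + 1, step))
    # Add a final chunk aligned to the end so the tail is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return starts


def coverage(sample_index, starts, chunk_size):
    """Number of chunks that contain the given sample."""
    return sum(1 for s in starts if s <= sample_index < s + chunk_size)


# Example (assumption): 60 s of 44.1 kHz audio, chunk_size 705,600, Overlap 8.
total = 60 * 44_100
starts = chunk_starts(total, 705_600, 8)

print(len(starts))                           # 23 chunks
print(starts[1] - starts[0])                 # hop of 88,200 samples
print(coverage(1_000_000, starts, 705_600))  # 8 chunks cover this sample
```

With these numbers a sample in the middle of the track is covered by eight chunks, so a boundary in any one chunk is outweighed by the other seven when the outputs are averaged. The cost is roughly eight times as much inference work as non-overlapping chunks.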


4. Handling Large Audio Files

For extremely large or long audio files, segmented separation is recommended.

Tools like LyRuno handle very large file sizes and long durations effectively.


Frequently Asked Questions (FAQ)

Q1: Why does the separated instrumental sometimes sound "synthetic"?

Usually this occurs when the model over-suppresses the fundamental frequencies.

Try:

  • Increasing chunk_size
  • Using Becruily Deux to improve phase consistency

Q2: Should I use a 2-stem or 4-stem model?

If your goal is clean vocal extraction, 2-stem models generally achieve higher SDR.

4-stem models allow separation of drums and bass but often introduce more frequency leakage at the boundaries.


Q3: How can I quickly remove slight vocal remnants?

Use a denoise/bleedless model first, then apply Phase Fix for additional cleanup.


Q4: How should MVSEP benchmark data be interpreted?

MVSEP provides metrics like Fullness, Bleedless, and SDR that allow users to rank and compare models objectively. These metrics are extremely helpful for model selection.

