The Decoder | Video AI | 2026-04-04

Netflix Open-Sources VOID: An AI That Erases Objects From Video and Rewrites the Physics They Left Behind

Netflix and INSAIT Sofia University released VOID (Video Object and Interaction Deletion) on April 3, 2026 — an Apache 2.0 framework that removes objects from video and automatically regenerates the physical interactions those objects caused, including shadows, reflections, and collision effects.

Netflix has open-sourced VOID, a video AI framework that solves a problem no previous tool has cleanly addressed: removing objects from video in a way that also removes every physical consequence the object had on its surroundings.

Standard video inpainting tools fill the gap left by a removed object with plausible background content. VOID goes further: it recalculates how the remaining scene elements would behave if the removed object had never been there. Remove a falling glass from footage and the liquid it would have spilled disappears too. Remove a person from a room and the shadow they cast on a wall is regenerated as empty wall.

Technically, VOID is a fine-tuned version of CogVideoX-Fun-V1.5-5b-InP, an Alibaba PAI video diffusion model. The pipeline works in stages: the user supplies a text description of the target object; Google's Gemini 3 Pro analyzes the scene and identifies the physically affected areas; Meta's SAM2 segments the target object precisely; the diffusion model then regenerates the cleaned scene. An optional optical flow correction pass handles shape distortions that appear during complex camera movements.
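
The staged flow described above can be sketched as a small orchestration function. This is only an illustration of the ordering of the stages, not code from the VOID repository: `RemovalStages`, `remove_object`, and every field name are hypothetical, and each stage is a plain callable that a reader would have to back with a real model.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence


@dataclass
class RemovalStages:
    """Hypothetical stage callables; none of these names come from the VOID codebase."""

    # Scene analysis: finds regions physically affected by the object
    # (the article attributes this step to a vision-language model).
    find_affected_regions: Callable[[Sequence[Any], str], Any]
    # Segmentation of the target object itself (SAM2 in the article).
    segment_target: Callable[[Sequence[Any], str], Any]
    # Diffusion-based regeneration of the cleaned scene.
    regenerate: Callable[[Sequence[Any], Any, Any], List[Any]]
    # Optional optical flow correction pass.
    correct_flow: Optional[Callable[[List[Any]], List[Any]]] = None


def remove_object(frames: Sequence[Any], prompt: str, stages: RemovalStages) -> List[Any]:
    """Run the stages in the order the article describes and return cleaned frames."""
    affected = stages.find_affected_regions(frames, prompt)
    target_mask = stages.segment_target(frames, prompt)
    cleaned = stages.regenerate(frames, target_mask, affected)
    if stages.correct_flow is not None:
        cleaned = stages.correct_flow(cleaned)
    return cleaned


if __name__ == "__main__":
    # Stub stages that only record the call order; real models would go here.
    order: List[str] = []
    stubs = RemovalStages(
        find_affected_regions=lambda f, p: order.append("analyze") or "regions",
        segment_target=lambda f, p: order.append("segment") or "mask",
        regenerate=lambda f, m, r: order.append("regenerate") or list(f),
    )
    remove_object(["frame0", "frame1"], "the falling glass", stubs)
    print(order)
```

Passing the stages in as callables keeps the sketch independent of any particular model API, and leaving `correct_flow` unset mirrors the article's point that the optical flow pass is optional.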

The system was developed in collaboration with INSAIT (Institute for Computer Science, Artificial Intelligence and Technology) at Sofia University. Training data came from Google's Kubric synthetic dataset and Adobe's HUMOTO for interaction detection scenarios.

In a 25-participant user study, VOID was preferred 64.8% of the time. Runway came second at 18.4%. The paper (arXiv:2604.02296) documents additional comparisons against ProPainter, DiffuEraser, MiniMax-Remover, ROSE, and Gen-Omnimatte.

VOID is available on GitHub (Netflix/void-model) and Hugging Face under Apache 2.0. The practical applications are immediately obvious for any content production team: removing boom mics, unwanted signage, brand logos, and background people from sensitive footage, and correcting scenes without reshoots.

Panel Takes