Realtime Video AI with Diffusion Models – SVTA University

Conference Proceedings

Realtime Video AI with Diffusion Models

Description

Generative AI is opening new possibilities for creating and transforming video in real time. In this talk, I’ll explore how recent models such as StreamDiffusion and LongLive push diffusion techniques into practical use for low-latency video generation and transformation. I’ll give a deep technical walkthrough of how these systems can be adapted for streaming use cases, unpacking the full pipeline – from decoding, through the diffusion process, to encoding – and highlighting optimisation strategies, such as KV caching, that make interactive generation possible. I’ll also discuss the tradeoffs between ultra-low-latency video transformation and generating longer, more coherent streams.

To make it concrete, I’ll present demos of StreamDiffusion (served with the open-source cloud service Daydream) and LongLive (explored with the open-source research tool Scope), showcasing practical examples of both video-to-video transformation and streaming text-to-video generation.

This talk was presented at Demuxed 2025 in London, a conference by and for engineers working in video. Every year we host a conference with lots of great new talks like this – learn more at https://demuxed.com
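The decode, diffusion, encode pipeline mentioned in the description can be pictured as a chain of streaming stages. The sketch below is only an illustration under loud assumptions: `decode`, `transform`, `encode`, `blend_step`, and `run_pipeline` are hypothetical names, the "model step" is a trivial average rather than a diffusion model, and the bounded `context` window merely stands in for whatever state a real system keeps between frames (a KV cache, for example). It is not how StreamDiffusion or LongLive are implemented.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List

# Hypothetical stand-in for a decoded image: a flat list of values in [0, 1].
Frame = List[float]
Step = Callable[[Frame, List[Frame]], Frame]


def decode(packets: Iterable[bytes]) -> Iterator[Frame]:
    """Stub decoder: map each byte of a packet to a value in [0, 1]."""
    for packet in packets:
        yield [b / 255.0 for b in packet]


def transform(frames: Iterable[Frame], step: Step, context_size: int = 2) -> Iterator[Frame]:
    """Run a per-frame model step, passing a bounded window of earlier outputs.

    A small window keeps per-frame work and memory constant (low latency);
    a larger one gives the step more history to stay coherent with.
    """
    context: Deque[Frame] = deque(maxlen=context_size)
    for frame in frames:
        out = step(frame, list(context))
        context.append(out)  # the oldest entry is dropped automatically
        yield out


def encode(frames: Iterable[Frame]) -> Iterator[bytes]:
    """Stub encoder: clamp to [0, 1] and quantise back to bytes."""
    for frame in frames:
        yield bytes(round(max(0.0, min(1.0, v)) * 255) for v in frame)


def blend_step(frame: Frame, context: List[Frame]) -> Frame:
    """Placeholder for a model step: average with the previous output."""
    if not context:
        return frame
    prev = context[-1]
    return [(a + b) / 2 for a, b in zip(frame, prev)]


def run_pipeline(packets: Iterable[bytes]) -> List[bytes]:
    """Chain the three stages; each frame flows through as soon as it arrives."""
    return list(encode(transform(decode(packets), blend_step)))
```

Because every stage is a generator, a frame is decoded, transformed, and encoded before the next packet is read, which is the property a streaming pipeline needs. The `context_size` parameter is a toy version of the latency versus coherence tradeoff the talk describes.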

Speakers

Rafal Leszko

Blockchain Staff Engineer

Other Proceedings

Here are some other proceedings that you might find interesting.

What Codec Should I Use?

Alan Resnick

Doing Server-Side Ad Insertion on Live Sports for 25.3M Concurrent Users

Ashutosh Agrawal

Is now the time to solve the deepfake threat?

Roderick Hodgson

Super Resolution: The scaler of tomorrow, here today!

Nick Chadwick

The do's and don'ts about Streaming security

Javier Brines Garcia

Modeling the conceptual structure of FFmpeg in JavaScript

Ryan Harvey

Objectionable Uses of Objective Quality Metrics

Richard Fliam

RTMP: web video innovation or Web 1.0 hack… how did we get to now?

Sarah Allen

Large-Scale Media Archive Migration to the Cloud

Konstantin Wilms

HEVC Upload Experiments

Chris Ellsworth
