Conference Proceedings
The media processing pipelines behind AI
Description
“This talk covers the complex media processing pipelines behind media AI models (translation, lip-syncing, text-to-video, etc.).
AI models are very picky about media specs (resolution, frame rate, sample rate, colorspace, etc.). Also, most of the time, the AI tools you see that, for instance, generate a fantastic, sharp video from a text prompt are really built on several AI models working in conjunction, each with its own constraints.
Our team is responsible for ingesting 1 BILLION media assets DAILY and delivering 1 TRILLION views every 24 hours; for that we use (highly optimized) typical media processing pipelines. In this talk we will explain how we leveraged all of that experience, and those building blocks, to add media AI inference as another offering of those pipelines. Now you can upload an asset and deliver it with ABR encodings + CDN, and ALSO alter its content via AI (e.g., add a hat to all the dogs in the scene). And all of that while trying NOT to break the bank (GPU time is really expensive).
We think this talk could be useful in revealing the hidden complexities of delivering AI, especially at scale.
This talk was presented at Demuxed 2025 in London, a conference by and for engineers working in video. Every year we host a conference with lots of great new talks like this – learn more at https://demuxed.com”
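As a rough illustration of the point about per-model constraints, the sketch below plans the conversions needed when one asset passes through several models in sequence. It is not the speakers' implementation: `MediaSpec`, `conversions_needed`, `plan_pipeline`, the stage names, and all spec values are hypothetical, and it assumes each model outputs media in the same spec it accepts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MediaSpec:
    """Hypothetical description of the media format a stage produces or requires."""
    width: int
    height: int
    fps: float
    sample_rate: int
    colorspace: str


def conversions_needed(current: MediaSpec, required: MediaSpec) -> list[str]:
    """List the conversions needed to turn `current` into `required`."""
    steps = []
    if (current.width, current.height) != (required.width, required.height):
        steps.append(
            f"scale {current.width}x{current.height} -> {required.width}x{required.height}"
        )
    if current.fps != required.fps:
        steps.append(f"convert frame rate {current.fps} -> {required.fps}")
    if current.sample_rate != required.sample_rate:
        steps.append(f"resample audio {current.sample_rate} -> {required.sample_rate}")
    if current.colorspace != required.colorspace:
        steps.append(f"convert colorspace {current.colorspace} -> {required.colorspace}")
    return steps


def plan_pipeline(
    source: MediaSpec, stages: list[tuple[str, MediaSpec]]
) -> list[tuple[str, list[str]]]:
    """Walk the chained models and record the conversions required before each one."""
    plan = []
    current = source
    for name, required in stages:
        plan.append((name, conversions_needed(current, required)))
        # Assumption: a model emits media in the same spec it consumes.
        current = required
    return plan


if __name__ == "__main__":
    # Hypothetical uploaded asset and hypothetical model requirements.
    source = MediaSpec(1920, 1080, 29.97, 48000, "bt709")
    stages = [
        ("lipsync", MediaSpec(512, 512, 25.0, 16000, "bt709")),
        ("text_overlay", MediaSpec(1024, 576, 25.0, 16000, "bt709")),
    ]
    for name, steps in plan_pipeline(source, stages):
        print(name, steps)
```

Planning conversions up front like this makes it easier to see where cheap CPU-side preprocessing can be done before any GPU time is spent on inference.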
Other Proceedings
Here are some other proceedings that you might find interesting.
What Codec Should I Use?
Alan Resnick
Doing Server-Side Ad Insertion on Live Sports for 25.3M Concurrent Users
Ashutosh Agrawal
Is now the time to solve the deepfake threat?
Roderick Hodgson
Super Resolution: The scaler of tomorrow, here today!
Nick Chadwick
The do's and don'ts about Streaming security
Javier Brines Garcia
Modeling the conceptual structure of FFmpeg in JavaScript
Ryan Harvey
Objectionable Uses of Objective Quality Metrics
Richard Fliam
RTMP: web video innovation or Web 1.0 hack… how did we get to now?
Sarah Allen
Large-Scale Media Archive Migration to the Cloud
Konstantin Wilms
HEVC Upload Experiments
Chris Ellsworth