Using AI to get relevant KPIs for Audio Quality
Description
Good metrics for audio quality are hard to find, and some of them give little information about what the problem is, or whether the quality is actually good or bad. Examples include POLQA and ViSQOL, but both are full-reference algorithms: they require both the reference and the degraded audio samples to be complete.
We propose two new algorithms based on deep learning, specifically on transformers. The first, SpeechQ, focuses purely on speech transcription. The second, ASQ-ViT, performs audio classification with audio spectrogram transformers, using a convolutional neural network as a backbone for feature extraction. The results from this training have led to a very high correlation with subjective scoring, which demonstrates the effectiveness and simplicity of this approach for extracting an objective quality measurement from audio outputs.
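As an illustration of what "correlation with subjective scoring" usually means in practice, the sketch below compares objective scores against subjective mean opinion scores (MOS) using Pearson and Spearman correlation. It is a minimal sketch and not the evaluation code from the talk: the function names are hypothetical, and the sample values are placeholders chosen only to exercise the code, not results from SpeechQ or ASQ-ViT.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Placeholder data (assumption): one entry per audio clip.
    subjective_mos = [1.5, 2.0, 3.1, 3.8, 4.6]
    objective_score = [1.2, 2.4, 2.9, 4.1, 4.4]
    print("Pearson: ", pearson(objective_score, subjective_mos))
    print("Spearman:", spearman(objective_score, subjective_mos))
```

Pearson measures how linearly the objective score tracks MOS, while Spearman only checks that the ranking of clips agrees, so reporting both is a common choice when validating a quality metric.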
This talk was presented at Demuxed ’23, a conference for video nerds in San Francisco featuring amazing talks like this one.
Other Proceedings
Here are some other proceedings that you might find interesting.
What Codec Should I Use?
Alan Resnick
Doing Server-Side Ad Insertion on Live Sports for 25.3M Concurrent Users
Ashutosh Agrawal
Is now the time to solve the deepfake threat?
Roderick Hodgson
Super Resolution: The scaler of tomorrow, here today!
Nick Chadwick
The do's and don'ts about Streaming security
Javier Brines Garcia
Modeling the conceptual structure of FFmpeg in JavaScript
Ryan Harvey
Objectionable Uses of Objective Quality Metrics
Richard Fliam
RTMP: web video innovation or Web 1.0 hack… how did we get to now?
Sarah Allen
Large-Scale Media Archive Migration to the Cloud
Konstantin Wilms
HEVC Upload Experiments
Chris Ellsworth