AI Enhanced GPU Video Coding: Achieving Joint High Compression Efficiency and Throughput

Conference Proceedings

Description

We present a novel approach to significantly boost video compression efficiency on NVIDIA NVENC hardware encoders by leveraging AI-driven pre-analysis and pre-processing algorithms. We refer to this method as AI Enhanced GPU Video Coding. It combines NVENC’s high density, low latency, and high throughput with ML-based techniques to improve compression efficiency and visual quality while maintaining that throughput.

NVENC, a leading hardware-based encoder, excels at high throughput and low latency but generally offers lower compression efficiency than CPU-based software encoders. Our AI-driven GPU video compression approach aims to combine the advantages of NVENC and AI algorithms to achieve both high compression efficiency and high throughput. Our optimization algorithms mainly include:

1. ML-based Scene & Region Classification: Selecting effective coding tools based on scene and region classification.
2. Regions of Interest (ROI) Identification: Focusing on perceptually significant regions, such as faces and jersey numbers in typical sports videos.
3. Pre-processing Techniques: Applying deblurring, denoising, sharpening, contrast adjustment, etc. to improve visual quality.
4. Hierarchical Pre-analysis and Pre-classification: Setting fine-grained quantization parameters (QPs), including block-based QPs, and enabling quick quality monitoring.

Combined, these techniques improve video compression efficiency, boosting both objective and subjective quality while achieving significant bitrate savings. We have applied these methods to large UGC content platforms, and our results demonstrate promising improvements in compression efficiency for both VOD and live use cases. Using the NVIDIA T4 Tensor Core GPU, we maintained the same high throughput across multiple parallel encoding threads while achieving a 15-20% bitrate saving and a 1-2 point VMAF improvement on typical UGC & PUGC content, compared to the out-of-the-box NVENC approach. Further enhancements, such as re-encoding, are currently being developed, and further compression gains are expected.

This talk was presented at Demuxed 2024, a conference by and for engineers working in video. Every year we host a conference with lots of great new talks like this in San Francisco. Learn more at https://demuxed.com
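
As a rough illustration of how ROI identification and block-based QPs could fit together, the sketch below turns a list of detected regions into a per-block QP delta map. The abstract does not describe the actual implementation, so everything here is an assumption made for illustration: the `Roi` type, the `build_qp_delta_map` function, the 16-pixel block size, and the delta values are all hypothetical, and the step that hands the map to the encoder is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Roi:
    """Pixel-space rectangle from a hypothetical ROI detector (e.g. a face)."""
    x: int
    y: int
    width: int
    height: int
    qp_delta: int  # negative values lower the QP, i.e. spend more bits here


def build_qp_delta_map(
    frame_w: int,
    frame_h: int,
    rois: List[Roi],
    block: int = 16,           # assumed block size, not taken from the talk
    background_delta: int = 0,
) -> List[List[int]]:
    """Return a rows x cols grid of QP deltas, one entry per block."""
    cols = (frame_w + block - 1) // block  # round up to cover partial blocks
    rows = (frame_h + block - 1) // block
    qp_map = [[background_delta] * cols for _ in range(rows)]

    for roi in rois:
        if roi.width <= 0 or roi.height <= 0:
            continue  # ignore degenerate rectangles
        # Convert the pixel rectangle to an inclusive block range, clamped to the frame.
        c0 = max(roi.x // block, 0)
        r0 = max(roi.y // block, 0)
        c1 = min((roi.x + roi.width - 1) // block, cols - 1)
        r1 = min((roi.y + roi.height - 1) // block, rows - 1)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                # Where regions overlap, keep the strongest quality boost.
                qp_map[r][c] = min(qp_map[r][c], roi.qp_delta)
    return qp_map


if __name__ == "__main__":
    demo_rois = [
        Roi(x=16, y=0, width=16, height=16, qp_delta=-4),   # e.g. a face
        Roi(x=20, y=10, width=30, height=10, qp_delta=-2),  # e.g. a jersey number
    ]
    for row in build_qp_delta_map(64, 32, demo_rois):
        print(row)
```

Taking the minimum delta on overlap is one possible policy; a production system might instead weight regions by classifier confidence or cap the total bit budget shifted toward ROIs.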