MMAction2
An open source toolkit for video understanding by OpenMMLab. It supports state-of-the-art models for action recognition, temporal action detection, and spatio-temporal detection in videos.
A tool to automatically synchronize subtitles with video by analyzing audio tracks. It uses speech detection to align subtitle timing via FFmpeg and machine learning.
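As a rough illustration of the alignment idea, the sketch below finds the constant time shift that best matches a subtitle track to detected speech. It is not this tool's actual implementation. It assumes both signals have already been reduced to per-frame 0/1 lists (speech present or not, subtitle shown or not), and the names `best_offset`, `speech`, and `subs` are hypothetical.

```python
def best_offset(speech, subs, max_shift):
    """Return the shift (in frames) of `subs` that best agrees with `speech`.

    Both inputs are lists of 0/1 flags, one per fixed-length audio frame.
    A positive result means the subtitles should be delayed by that many frames.
    """
    def score(shift):
        # Count frames where the shifted subtitle track agrees with speech.
        total = 0
        for i, s in enumerate(speech):
            j = i - shift
            # Frames shifted in from outside the track count as "no subtitle".
            sub = subs[j] if 0 <= j < len(subs) else 0
            if s == sub:
                total += 1
        return total

    # Brute-force search over all candidate shifts in the allowed window.
    return max(range(-max_shift, max_shift + 1), key=score)


# Hypothetical data: the subtitles appear two frames earlier than the speech.
speech = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
subs = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
offset = best_offset(speech, subs, max_shift=4)
```

The brute-force search is enough for a toy example; a real tool working on long audio tracks would likely use a faster correlation method.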
An open source deepfakes tool that allows the swapping of faces in videos using machine learning. It provides implementations of several face-swapping algorithms and a GUI.
RIFE (AI Frame Interpolation)
An AI-based frame interpolation method (Real-Time Intermediate Flow Estimation) that generates intermediate frames between existing ones, for example to create slow-motion video.
An AI model that achieves accurate lip-syncing in videos. Given an input video of a person and a target speech audio, Wav2Lip generates a video where the person’s lip movements
An open source deepfake application that provides tools to extract, train, and swap faces in videos. FaceSwap has an active community and supports plugins for different neural network models.
A widely used open source toolkit for creating deepfakes. It lets users swap faces in videos using machine learning, with support for multiple models and GPU acceleration.
A real-time multi-person keypoint detection library (for body, face, and hands) by CMU. Often used on video to extract pose information frame-by-frame for animation or analysis.
An AI model (ICCV 2023) for zero-shot text-to-video generation using image diffusion models. It allows generating short video clips from text prompts without training on video data.
Video Annotator: Building Video Classifiers using Vision-Language
A Netflix Tech Blog post on a framework for training ML models for video understanding. It describes an internal system for creating video classifiers using combined video and text (subtitle/metadata) analysis.