video-ai.md

🏠Home

Video

Text to video generation

Frame Interpolation (Temporal Interpolation)

Segmentation & Tracking

  • Segment and Track Anything, code. A framework that combines the Segment Anything Model (SAM) with the DeAOT tracking model for precise, multimodal object tracking in video, with strong reported benchmark performance
  • Track Anything, code. extends the Segment Anything Model (SAM) to achieve high-performance, interactive tracking and segmentation in videos with minimal human intervention, addressing SAM's limitations in consistent video segmentation
  • MAGVIT A single model for multiple video synthesis tasks, reported to outperform existing methods in quality and inference time, code and models, paper
  • FastSAM Fast Segment Anything, a CNN-based model that achieves performance comparable to SAM at 50× higher run-time speed
  • SAM-PT Extending SAM to zero-shot video segmentation with point-based tracking, paper
  • DEVA Tracking Anything with Decoupled Video Segmentation, paper
  • Cutie Putting the Object Back into Video Object Segmentation, paper
  • YOLOv10 Real-Time End-to-End Object Detection
  • SAM2 enables fast, precise selection of any object in any video or image

Super Resolution (Spatial Interpolation)

Spatio-Temporal Interpolation

NeRF

  • Instant-ngp Train NeRFs in under 5 seconds on Windows/Linux with GPU support
  • NeRFstudio A collaboration-friendly studio for NeRFs that simplifies creating, training, and testing NeRFs, with a web-based visualizer, benchmarks, and pipeline support
  • Threestudio A framework for 3D content creation from text prompts, single images, and few-shot images, including single images generated by text-to-image models
  • Zero-1-to-3 Zero-shot One Image to 3D Object for novel view synthesis and 3D reconstruction
  • localrf NeRFs for reconstructing large-scale stabilized scenes from shaky videos, paper, project page
  • gaussian-splatting reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering", paper
  • 4d-gaussian-splatting Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting, paper

Deepfakes

  • roop one-click deepfake (face swap)
    • rope GUI-focused roop
  • streamv2v Official PyTorch implementation of StreamV2V
  • MusePose Pose-driven image-to-video framework for generating virtual humans
  • V-Express Generates a talking head video controlled by a reference image, an audio clip, and a sequence of V-Kps images
  • Deep-Live-Cam Real-time face swap and one-click video deepfake from a single image

Benchmarking

Inpainting Outpainting

  • ProPainter Improving Propagation and Transformer for Video Inpainting, paper