Web server and full video synthesis pipeline based on NeRF. RCOS

Benjamin-Avrahami/vidtonerf

 
 

NeRF or Nothing core repository

A microservices-based project for rendering novel perspectives of input videos using neural radiance fields.
Learn more about NeRFs »

View Demo · Report Bug · Request Feature

About The Project

This repository contains the backend for the NeRF (Neural Radiance Fields) or Nothing web application, which takes raw user video and renders a novel, realistic view of the captured scene. Neural radiance fields are a recent technique in novel view synthesis that has reached state-of-the-art results.

Some Important Background About NeRFs

NeRFs start from a set of input images taken at known camera locations. Rays are projected from each image into 3D space using a pinhole camera model. Provided the images all capture different perspectives of the same scene, these reprojected rays intersect in the center of the scene, forming a field of light rays that produce the input images (these are the initial radiance fields).

A small neural network is then trained to predict the intensity and color of light throughout this intersecting region, modeling the radiance field that must have produced the input images. The network is initialized randomly for each new scene and trained only on that scene. Once training is complete, the network can predict the color and intensity of a ray when polled at a specific angle and location in the scene. Ray tracing then polls the network along all the rays pointing towards a new virtual camera, producing a picture of the scene from a perspective never seen before.

Important to this project: training a NeRF requires the camera location for each image. We obtain these locations by running structure from motion (using COLMAP) on the input video. To learn more, please visit the learning resources in the wiki.
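The two ideas above (casting a ray through a pixel with a pinhole model, then polling a field along that ray) can be illustrated with a minimal, dependency-free sketch. It is not taken from this repository: the function names, the camera convention (x right, y up, looking down -z), and the `field` callback standing in for the trained network are all assumptions, and the compositing rule is the standard quadrature from the original NeRF formulation rather than anything specific to TensoRF.

```python
import math


def pixel_ray(px, py, width, height, focal, rotation, origin):
    """World-space ray (origin, unit direction) through pixel (px, py).

    Assumes a pinhole camera looking down -z with x right and y up, and a
    3x3 camera-to-world rotation given as nested lists (hypothetical layout).
    """
    # Direction through the pixel in camera space.
    d_cam = ((px - width / 2) / focal, -(py - height / 2) / focal, -1.0)
    # Rotate into world space and normalise to unit length.
    d_world = [sum(rotation[r][c] * d_cam[c] for c in range(3)) for r in range(3)]
    norm = math.sqrt(sum(v * v for v in d_world))
    return tuple(origin), tuple(v / norm for v in d_world)


def render_ray(field, origin, direction, near, far, n_samples=64):
    """Composite one colour by polling `field` at evenly spaced points.

    `field(point, direction)` is a stand-in for the trained network and
    must return (density, (r, g, b)).
    """
    delta = (far - near) / n_samples
    transmittance = 1.0  # fraction of light not yet absorbed
    colour = [0.0, 0.0, 0.0]
    for k in range(n_samples):
        t = near + (k + 0.5) * delta
        point = tuple(o + t * d for o, d in zip(origin, direction))
        density, rgb = field(point, direction)
        alpha = 1.0 - math.exp(-density * delta)  # opacity of this segment
        weight = transmittance * alpha
        colour = [c + weight * v for c, v in zip(colour, rgb)]
        transmittance *= 1.0 - alpha
    return tuple(colour)
```

Rendering a full image under these assumptions means calling `pixel_ray` for every pixel of the virtual camera and passing each result to `render_ray`.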

General Pipeline:

  1. Run structure from motion on the input video (using the COLMAP implementation) to localize the camera position in 3D space for each input frame
  2. Convert the structure from motion data to Normalized Device Coordinates (NDC)
  3. Train the NeRF (implemented with TensoRF) on the input frames and their corresponding NDC coordinates
  4. Render a new virtual "flythrough" of the scene using the trained NeRF
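
As a rough illustration of step 1 only, the sketch below builds the command lines for extracting frames with ffmpeg and running COLMAP's sparse reconstruction. The function name, directory layout, and frame rate are assumptions made for this example; the colmap worker in this repository may invoke these tools with different options.

```python
from pathlib import Path


def sfm_commands(video, workdir, fps=2):
    """Return command lines for frame extraction and sparse reconstruction.

    The layout under `workdir` (images/, database.db, sparse/) is a
    hypothetical choice for this sketch.
    """
    workdir = Path(workdir)
    frames = workdir / "images"
    database = workdir / "database.db"
    sparse = workdir / "sparse"
    return [
        # Sample the video at `fps` frames per second into numbered PNGs.
        ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}",
         str(frames / "%04d.png")],
        # Detect image features and store them in the COLMAP database.
        ["colmap", "feature_extractor",
         "--database_path", str(database), "--image_path", str(frames)],
        # Match features between every pair of frames.
        ["colmap", "exhaustive_matcher", "--database_path", str(database)],
        # Recover camera poses and a sparse point cloud.
        ["colmap", "mapper", "--database_path", str(database),
         "--image_path", str(frames), "--output_path", str(sparse)],
    ]
```

Each command could then be run with `subprocess.run(cmd, check=True)` once the `images` and `sparse` directories exist.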

Project Structure

Full Project Structure Diagram

Since running COLMAP and TensoRF takes upwards of 30 minutes per input video, this project uses RabbitMQ to queue work orders so that asynchronous workers can complete user requests. MongoDB is used to keep track of active and past user jobs. The worker implementations are in the NeRF and colmap folders, while the central web server is in web-server. For more information on how these components communicate and how data is formatted, see the READMEs within each of these folders.
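
The queue-and-worker pattern described above can be sketched without a broker. In the example below, `queue.Queue` stands in for RabbitMQ queues and a plain dict stands in for the MongoDB job collection; the message fields, status names, and the direct hand-off from the colmap worker to the nerf worker are all assumptions for illustration, so consult the READMEs in each folder for the actual formats.

```python
import json
import queue
import uuid

sfm_queue = queue.Queue()   # stand-in for the queue read by the colmap worker
nerf_queue = queue.Queue()  # stand-in for the queue read by the nerf worker
jobs = {}                   # stand-in for the MongoDB job collection


def submit(video_path):
    """Record a new job and enqueue a work order for structure from motion."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued_sfm", "video": video_path}
    sfm_queue.put(json.dumps({"id": job_id, "video": video_path}))
    return job_id


def colmap_worker_step():
    """Take one work order, pretend to localize cameras, pass the job on."""
    order = json.loads(sfm_queue.get())
    # A real worker would run structure from motion here.
    jobs[order["id"]]["status"] = "queued_nerf"
    nerf_queue.put(json.dumps({"id": order["id"], "poses": "placeholder"}))


def nerf_worker_step():
    """Take one work order, pretend to train and render, mark the job done."""
    order = json.loads(nerf_queue.get())
    # A real worker would train the NeRF and render the flythrough here.
    jobs[order["id"]]["status"] = "done"
```

Because the web server only enqueues a message and returns, a request does not block for the full duration of training; the job record is what a client would poll for progress.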

Getting Started

To run the project, install and run the web-server, the nerf worker, and the colmap worker in any order by following the installation instructions in their READMEs. Once these are running, the front end can be started by visiting the front-end repo. Once everything is running, the website should be available at localhost:3000, and a video can be uploaded to test the application.

Prerequisites

  1. Have Docker installed locally
  2. Install COLMAP
  3. Install ffmpeg
  4. If you intend to run the NeRF and COLMAP workers locally, ensure you have NVIDIA GPUs with at least 6 GB of VRAM, as these are resource-intensive applications
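
A quick way to confirm the first three prerequisites is to check that the tools are on `PATH`. The helper below is a small sketch, not part of this repository; it assumes the executables are named `docker`, `colmap`, and `ffmpeg`, and it does not check the GPU requirement.

```python
import shutil

# Executable names are assumptions; adjust them if your install differs.
REQUIRED_TOOLS = ("docker", "colmap", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All command-line prerequisites found.")
```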

Output Example

Converting the training images from the lego example of the nerf-synthetic dataset to a video and then running vidtonerf produces the following result:

Example Output

Roadmap

TODO

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Interested in the project?

Come join our discord server!

Or, inquire at: [email protected]

Acknowledgments
