Initial commit!

MultiPath committed Oct 9, 2020
0 parents commit 81de8f4
Showing 65 changed files with 8,659 additions and 0 deletions.
Binary file added .DS_Store
17 changes: 17 additions & 0 deletions .gitignore
@@ -0,0 +1,17 @@
*.pyc
*.so
*.pkl
.DS_Store/
__pycache__/
*.pt
.idea/
.vscode/
build/
*.egg-info/
images/
*.blend1
.vscode/
.history/
tools/
fb_sweep/
fb_scripts/
45 changes: 45 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,45 @@
# Open Source Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to make participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others’ private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within all project spaces, and it also applies when an individual is representing the project or its community in public spaces. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at [email protected]. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org
28 changes: 28 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,28 @@
# Contributing to Neural Sparse Voxel Fields (NSVF)
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

## License
By contributing to Neural Sparse Voxel Fields,
you agree that your contributions will be licensed under the LICENSE file in
the root directory of this source tree.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
221 changes: 221 additions & 0 deletions README.md
@@ -0,0 +1,221 @@
# Neural Sparse Voxel Fields (NSVF)

Photo-realistic free-viewpoint rendering of real-world scenes using classical computer graphics techniques is a challenging problem because it requires the difficult step of capturing detailed appearance and geometry models.
Neural rendering is an emerging field that employs deep neural networks to implicitly learn scene representations encapsulating both geometry and appearance from 2D observations with or without a coarse geometry.
However, existing approaches in this field often show blurry renderings or suffer from a slow rendering process. We propose [Neural Sparse Voxel Fields (NSVF)](https://arxiv.org/abs/2007.11571), a new neural scene representation for fast and high-quality free-viewpoint rendering.

Here is the official repo for the paper:

* [Neural Sparse Voxel Fields (Liu et al., 2020)](https://arxiv.org/abs/2007.11571).

<img src='docs/figs/framework.png'/>

## Requirements and Installation

This code is implemented in PyTorch using the [fairseq framework](https://github.com/pytorch/fairseq).

The code has been tested on the following system:

* Python >= 3.6
* PyTorch 1.4.0
* [Nvidia apex library](https://github.com/NVIDIA/apex) (optional)
* Nvidia GPU (Tesla V100 32GB) with CUDA 10.1

Only training and rendering on GPUs are supported.

To install, first clone this repo and install all dependencies:

```bash
pip install -r requirements.txt
```

Then, run

```bash
pip install --editable ./
```

Or, if you only want to build the extensions in place without installing the package, run:

```bash
python setup.py build_ext --inplace
```
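
After installation, a quick sanity check (a minimal sketch, assuming a CUDA-capable GPU is visible to PyTorch and the editable install succeeded) is to confirm that the GPU is detected and that the ``fairnr`` package imports:

```bash
# Verify that PyTorch detects a GPU (training and rendering are GPU-only)
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

# Verify that the fairnr package (and its compiled extensions) imports cleanly
python -c "import fairnr"
```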

## Dataset

You can download the pre-processed synthetic and real datasets used in our paper.
Please also cite the original papers if you use any of them in your work.

Dataset | Download Link | Notes on Dataset Split
---|---|---
Synthetic-NSVF | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NSVF.zip) | 0_\* (training) 1_\* (validation) 2_\* (testing)
[Synthetic-NeRF](https://github.com/bmild/nerf) | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NeRF.zip) | 0_\* (training) 1_\* (validation) 2_\* (testing)
[BlendedMVS](https://github.com/YoYo000/BlendedMVS) | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/BlendedMVS.zip) | 0_\* (training) 1_\* (testing)
[Tanks&Temples](https://www.tanksandtemples.org/) | [download (.zip)](https://dl.fbaipublicfiles.com/nsvf/dataset/TanksAndTemple.zip) | 0_\* (training) 1_\* (testing)

### Prepare your own dataset

To prepare a new dataset of a single scene for training and testing, please follow the data structure:

```bash
<dataset_name>
|-- bbox.txt         # bounding-box file
|-- intrinsics.txt   # 4x4 camera intrinsics
|-- rgb
    |-- 0.png        # target image for each view
    |-- 1.png
    ...
|-- pose
    |-- 0.txt        # camera pose for each view (4x4 matrices)
    |-- 1.txt
    ...
[optional]
|-- test_traj.txt    # camera pose for free-view rendering demonstration (4N x 4)
```

where the ``bbox.txt`` file contains a line describing the initial bounding box and voxel size:

```bash
x_min y_min z_min x_max y_max z_max initial_voxel_size
```
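
For example, a ``bbox.txt`` for a scene roughly bounded by a unit cube around the origin would contain a single line like the one below (the numbers are purely illustrative; use the bounds of your own scene):

```bash
# x_min y_min z_min x_max y_max z_max initial_voxel_size
-1.0 -1.0 -1.0 1.0 1.0 1.0 0.25
```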

Note that the file names of the target images and of the corresponding camera-pose files do not have to match exactly; however, their orders (sorted as strings) must correspond. The datasets are split by view index.
For example, "``train (0..100)``, ``valid (100..200)`` and ``test (200..400)``" means the first 100 views are used for training, views 100–199 for validation, and views 200–399 for testing.
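
Since the splits follow the sorted file order, you can preview which images a range such as ``0..100`` selects with a small shell check (a sketch; adjust the range to your own split):

```bash
# Views are indexed by the string-sorted order of the files in rgb/ (and pose/).
# Show the 100 image files selected by the range "0..100":
ls ${DATASET}/rgb | sort | head -n 100
```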

## Train a new model

Given the dataset of a single scene (``{DATASET}``), we use the command below to train an NSVF model that synthesizes novel views at ``800x800`` pixels, with a batch size of ``4`` images per GPU and ``2048`` rays per image. By default, the code automatically detects all available GPUs.

In the following example, we use a pre-defined architecture ``nsvf_base`` with specific arguments:

* By setting ``--no-sampling-at-reader``, the model samples training pixels only within the image region onto which the sparse voxels project.
* By default, we set the ray-marching step size to ``1/8 (0.125)`` of the voxel size, which is typically specified in the ``bbox.txt`` file.
* It is optional to turn on ``--use-octree``. It builds a sparse voxel octree to speed up ray-voxel intersection, which helps especially when the number of voxels is greater than ``10000``.
* By setting ``--pruning-every-steps`` to ``2500``, the model performs self-pruning every ``2500`` steps.
* By setting ``--half-voxel-size-at`` and ``--reduce-step-size-at`` to ``5000,25000,75000``, the voxel size and the ray-marching step size are both halved at ``5k``, ``25k`` and ``75k`` updates.

Note that, although the above parameter settings are used for most of the experiments in the paper, it is possible to tune them to achieve better quality. The remaining parameters can be left at their default values.

Besides the architecture ``nsvf_base``, you may check other architectures or define your own architectures in the file ``fairnr/models/nsvf.py``.
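
Before launching training, point ``${DATASET}`` at a prepared scene and ``${SAVE}`` at a directory for checkpoints and logs. The paths below are placeholders, not shipped with the repo:

```bash
# Placeholder paths -- replace with your own locations
DATASET="/path/to/<dataset_name>"         # a scene prepared as described above
SAVE="/path/to/checkpoints/<experiment>"  # checkpoints, logs and tensorboard files go here
mkdir -p ${SAVE}
```

With these set, launch training: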

```bash
python -u train.py ${DATASET} \
--user-dir fairnr \
--task single_object_rendering \
--train-views "0..100" --view-resolution "800x800" \
--max-sentences 1 --view-per-batch 4 --pixel-per-view 2048 \
--no-preload \
--sampling-on-mask 1.0 --no-sampling-at-reader \
--valid-views "100..200" --valid-view-resolution "400x400" \
--valid-view-per-batch 1 \
--transparent-background "1.0,1.0,1.0" --background-stop-gradient \
--arch nsvf_base \
--initial-boundingbox ${DATASET}/bbox.txt \
--use-octree \
--raymarching-stepsize-ratio 0.125 \
--discrete-regularization \
--color-weight 128.0 --alpha-weight 1.0 \
--optimizer "adam" --adam-betas "(0.9, 0.999)" \
--lr 0.001 --lr-scheduler "polynomial_decay" --total-num-update 150000 \
--criterion "srn_loss" --clip-norm 0.0 \
--num-workers 0 \
--seed 2 \
--save-interval-updates 500 --max-update 150000 \
--virtual-epoch-steps 5000 --save-interval 1 \
--half-voxel-size-at "5000,25000,75000" \
--reduce-step-size-at "5000,25000,75000" \
--pruning-every-steps 2500 \
--keep-interval-updates 5 --keep-last-epochs 5 \
--log-format simple --log-interval 1 \
--save-dir ${SAVE} \
--tensorboard-logdir ${SAVE}/tensorboard \
| tee -a $SAVE/train.log
```

The checkpoints are saved in ``${SAVE}``. You can launch TensorBoard to check the training progress:

```bash
tensorboard --logdir=${SAVE}/tensorboard --port=10000
```

More example training scripts for reproducing the results of our paper can be found under [examples](./examples/train/).

## Evaluation

Once the model is trained, use the following command to evaluate the rendering quality on the test views, given the trained checkpoint ``{MODEL_PATH}``.

```bash
python validate.py ${DATASET} \
--user-dir fairnr \
--valid-views "200..400" \
--valid-view-resolution "800x800" \
--no-preload \
--task single_object_rendering \
--max-sentences 1 \
--valid-view-per-batch 1 \
--path ${MODEL_PATH} \
    --model-overrides '{"chunk_size":512,"raymarching_tolerance":0.01,"tensorboard_logdir":"","eval_lpips":True}'
```

Note that we override ``raymarching_tolerance`` to ``0.01`` to enable early termination and speed up rendering.
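
If you want to trade quality for speed, one option (a sketch, not part of the released scripts) is to sweep the tolerance and compare the reported metrics:

```bash
# Sweep the early-termination tolerance; larger values render faster but may
# reduce quality (0.0 effectively disables early termination).
for TOL in 0.0 0.01 0.05; do
    python validate.py ${DATASET} \
        --user-dir fairnr \
        --task single_object_rendering \
        --valid-views "200..400" --valid-view-resolution "800x800" \
        --no-preload --max-sentences 1 --valid-view-per-batch 1 \
        --path ${MODEL_PATH} \
        --model-overrides "{\"chunk_size\":512,\"raymarching_tolerance\":${TOL}}"
done
```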

## Free Viewpoint Rendering

Free-viewpoint rendering can be achieved once a model is trained and a rendering trajectory is specified. For example, the following command renders with a circular trajectory (angular speed 3 degrees/frame, 15 frames per GPU). It outputs per-view rendered images and merges the images into a ``.mp4`` video in ``${SAVE}/output``, as follows:

<img src='docs/figs/results.gif'/>

By default, the code can detect all available GPUs.

```bash
python render.py ${DATASET} \
--user-dir fairnr \
--task single_object_rendering \
--path ${MODEL_PATH} \
--model-overrides '{"chunk_size":512,"raymarching_tolerance":0.01}' \
--render-beam 1 --render-angular-speed 3 --render-num-frames 15 \
--render-save-fps 24 \
--render-resolution "800x800" \
--render-path-style "circle" \
--render-path-args "{'radius': 3, 'h': 2, 'axis': 'z', 't0': -2, 'r':-1}" \
--render-output ${SAVE}/output \
--render-output-types "color" "depth" "voxel" "normal" --render-combine-output \
--log-format "simple"
```

Our code also supports rendering for given camera poses.
For instance, the following command renders with the camera poses defined in the 200th–399th files under the folder ``${DATASET}/pose``:

```bash
python render.py ${DATASET} \
--user-dir fairnr \
--task single_object_rendering \
--path ${MODEL_PATH} \
--model-overrides '{"chunk_size":512,"raymarching_tolerance":0.01}' \
--render-save-fps 24 \
--render-resolution "800x800" \
--render-camera-poses ${DATASET}/pose \
--render-views "200..400" \
--render-output ${SAVE}/output \
--render-output-types "color" "depth" "voxel" "normal" --render-combine-output \
--log-format "simple"
```

The code also supports rendering with camera poses defined in a ``.txt`` file. Please refer to this [example](./examples/render/render_jade.sh).

## License

NSVF is MIT-licensed.
The license applies to the pre-trained models as well.

## Citation

Please cite as:
```bibtex
@article{liu2020neural,
title={Neural Sparse Voxel Fields},
author={Liu, Lingjie and Gu, Jiatao and Lin, Kyaw Zaw and Chua, Tat-Seng and Theobalt, Christian},
journal={NeurIPS},
year={2020}
}
```
Binary file added docs/figs/framework.png
Binary file added docs/figs/results.gif
32 changes: 32 additions & 0 deletions examples/render/render_jade.sh
@@ -0,0 +1,32 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# just for debugging
DATA="Jade"
RES="576x768"
ARCH="nsvf_base"
SUFFIX="v1"
DATASET=/private/home/jgu/data/shapenet/release/BlendedMVS/${DATA}
SAVE=/checkpoint/jgu/space/neuralrendering/new_release/$DATA
MODEL=$ARCH$SUFFIX
MODEL_PATH=$SAVE/$MODEL/checkpoint_last.pt

# additional rendering args
MODELTEMP='{"chunk_size":%d,"raymarching_tolerance":%.3f,"use_octree":True}'
MODELARGS=$(printf "$MODELTEMP" 256 0.0)

# rendering with pre-defined testing trajectory
python render.py ${DATASET} \
--user-dir fairnr \
--task single_object_rendering \
--path ${MODEL_PATH} \
--render-beam 1 \
--render-save-fps 24 \
--render-camera-poses $DATASET/test_traj.txt \
--model-overrides $MODELARGS \
--render-resolution $RES \
--render-output ${SAVE}/$ARCH/output \
--render-output-types "color" "depth" "voxel" "normal" \
--render-combine-output --log-format "simple"
