
CUDA Path Tracer

University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3

  • Yian Chen
  • Tested on: Windows 10, AMD Ryzen 5800 HS with Radeon Graphics CPU @ 3.20GHz 16GB, NVIDIA GeForce RTX3060 Laptop 8GB

Implemented Features

  • Core (as required in Project 3)
    • Diffuse & Specular
    • Jittering (Antialiasing)
    • First Bounce Cache
    • Sort by material
  • Load gltf
  • BVH & SAH
  • Texture mapping & bump mapping
  • Environment Mapping
  • Microfacet BSDF
  • Emissive BSDF (with Emissive Texture)
  • Direct Lighting
  • Multiple Importance Sampling
  • Depth of Field
  • Tone Mapping & Gamma Correction

Core Features (as required by the project instructions)

  • Diffuse & Specular

Diffuse & Specular Demo

  • Jittering
Before jittering After jittering
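Concretely, all jittering does is offset the ray's target point by a random sub-pixel amount each iteration, so edge pixels average over their whole footprint instead of always sampling the pixel center. A minimal host-side sketch (illustrative names; the actual kernel would use a per-thread RNG such as cuRAND/thrust):

```cpp
#include <random>

// Map pixel (x, y) plus a random sub-pixel offset to normalized device
// coordinates in [-1, 1). Averaging many such samples per pixel is what
// turns hard stair-stepped edges into smooth ones.
void jitteredNdc(std::mt19937& rng, int x, int y, int width, int height,
                 float& ndcX, float& ndcY) {
    std::uniform_real_distribution<float> u01(0.0f, 1.0f);
    float jx = u01(rng);  // sub-pixel offset in [0, 1)
    float jy = u01(rng);
    ndcX = 2.0f * (x + jx) / width  - 1.0f;
    ndcY = 2.0f * (y + jy) / height - 1.0f;
}
```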

glTF Loading & a Better Workflow

In this pathtracer, the supported scene format is glTF, chosen for its expressive power in describing 3D scenes. Please view this page for more details about glTF.

Eventually, during development, most scenes used for testing were exported directly from Blender, which allows much more flexible testing.

  • scenes/pathtracer_robots_demo.glb Link


BVH

On the host, we could construct and traverse a BVH recursively. In this project, however, the code runs on the GPU. Although recent CUDA versions allow recursive functions on the device, recursion may require a dynamically sized stack and slow the kernel down, a risk a performance-oriented raytracer cannot afford.

Thanks to this paper, this pathtracer adopts a novel, stack-free BVH construction and traversal algorithm called MTBVH.

This pathtracer implements only a simplified version of MTBVH: instead of constructing 6 BVHs and choosing one of them at runtime, only 1 BVH is constructed. This means the pathtracer still has room for further speedup.
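To make the stack-free idea concrete, here is a minimal host-side sketch of a threaded BVH whose nodes carry hit/miss links, so traversal is a single loop with no stack. Names and layout are illustrative, not this project's; MTBVH proper stores six link orderings, one per dominant ray direction, of which this pathtracer builds only one.

```cpp
#include <vector>

// Threaded ("stackless") BVH node: each node stores where traversal goes
// next on a hit (into its subtree) and on a miss (past its subtree).
struct BVHNode {
    float bmin[3], bmax[3];
    int hitLink;    // next node if the ray hits this AABB
    int missLink;   // next node if the ray misses it (-1 = done)
    int primIndex;  // >= 0 for a leaf, -1 for an interior node
};

// Slab test for ray/AABB intersection.
bool hitAABB(const BVHNode& n, const float o[3], const float invD[3]) {
    float tmin = 0.0f, tmax = 1e30f;
    for (int a = 0; a < 3; ++a) {
        float t0 = (n.bmin[a] - o[a]) * invD[a];
        float t1 = (n.bmax[a] - o[a]) * invD[a];
        if (t0 > t1) { float tmp = t0; t0 = t1; t1 = tmp; }
        if (t0 > tmin) tmin = t0;
        if (t1 < tmax) tmax = t1;
        if (tmax < tmin) return false;
    }
    return true;
}

// Stack-free traversal: a single loop following hit/miss links. Returns
// the indices of the leaf primitives whose AABBs the ray enters.
std::vector<int> traverse(const std::vector<BVHNode>& nodes,
                          const float o[3], const float invD[3]) {
    std::vector<int> hits;
    int cur = 0;  // start at the root
    while (cur != -1) {
        const BVHNode& n = nodes[cur];
        if (hitAABB(n, o, invD)) {
            if (n.primIndex >= 0) hits.push_back(n.primIndex);
            cur = n.hitLink;
        } else {
            cur = n.missLink;
        }
    }
    return hits;
}
```

Because the loop body is branch-light and needs no per-thread stack memory, this style maps well onto GPU warps.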

  • With BVH & Without BVH:
With BVH Without BVH

As expected, the speedup is huge, up to 40 times. With a more complex scene, BVH should give an even higher speedup.

Texture Mapping & Bump Mapping

To enhance the details of mesh surfaces and geometries, texture mapping is a must. We have not implemented mipmapping on the GPU here, though it should not be too difficult to do so.
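For reference, when textures live in plain device buffers rather than CUDA texture objects, the fetch has to be done by hand. A host-side sketch of a bilinear lookup with repeat addressing (illustrative names, not this project's API):

```cpp
#include <cmath>

struct RGB { float r, g, b; };

// Bilinear texture fetch with repeat (wrap) addressing. In the real
// tracer this would be a __device__ function reading a buffer that the
// glTF loader uploaded.
RGB sampleBilinear(const RGB* texels, int w, int h, float u, float v) {
    u -= std::floor(u);                    // wrap uv into [0, 1)
    v -= std::floor(v);
    float x = u * w - 0.5f, y = v * h - 0.5f;
    int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
    float fx = x - x0, fy = y - y0;        // fractional weights
    auto wrap = [](int i, int n) { return ((i % n) + n) % n; };
    auto at = [&](int xi, int yi) -> const RGB& {
        return texels[wrap(yi, h) * w + wrap(xi, w)];
    };
    const RGB &c00 = at(x0, y0),     &c10 = at(x0 + 1, y0);
    const RGB &c01 = at(x0, y0 + 1), &c11 = at(x0 + 1, y0 + 1);
    RGB out;
    out.r = (1-fx)*(1-fy)*c00.r + fx*(1-fy)*c10.r + (1-fx)*fy*c01.r + fx*fy*c11.r;
    out.g = (1-fx)*(1-fy)*c00.g + fx*(1-fy)*c10.g + (1-fx)*fy*c01.g + fx*fy*c11.g;
    out.b = (1-fx)*(1-fy)*c00.b + fx*(1-fy)*c10.b + (1-fx)*fy*c01.b + fx*fy*c11.b;
    return out;
}
```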

  • scenes/pathtracer_test_texture.glb Link
Before bump mapping After bump mapping

Microfacet BSDF

To support a variety of materials, BSDFs more complicated than pure diffuse/specular are required. Here we first implement the classic microfacet BSDF to extend the material capabilities of this pathtracer.

This pathtracer uses a microfacet implementation based on pbrt.
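The two core ingredients of that model are the GGX (Trowbridge-Reitz) normal distribution and the Smith masking term; a host-side sketch following the pbrt formulas (the alpha = roughness² remapping is an assumption common to metallic-roughness workflows, not necessarily what this project uses):

```cpp
#include <cmath>

const float PI = 3.14159265358979f;

// GGX / Trowbridge-Reitz normal distribution D(h). cosThetaH is the
// cosine between the half vector and the surface normal.
float ggxD(float cosThetaH, float alpha) {
    float a2 = alpha * alpha;
    float c2 = cosThetaH * cosThetaH;
    float d  = c2 * (a2 - 1.0f) + 1.0f;
    return a2 / (PI * d * d);
}

// Smith masking-shadowing term G1 for GGX (separable, one direction).
float smithG1(float cosTheta, float alpha) {
    float a2   = alpha * alpha;
    float c2   = cosTheta * cosTheta;
    float tan2 = (1.0f - c2) / c2;  // tan^2(theta)
    return 2.0f / (1.0f + std::sqrt(1.0f + a2 * tan2));
}
```

As roughness (hence alpha) grows, ggxD spreads the half-vector distribution out and the highlight widens, which is exactly the left-to-right progression in the demo below.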

Metalness = 1. Roughness goes from 0 to 1, left to right.

Please note that the sphere used here is not an analytic sphere but an icosphere mesh.

  • scenes/pathtracer_test_microfacet.glb Link

Microfacet Demo

With texture mapping implemented, we can now use the metallicRoughness texture. Luckily, glTF has good support for the metallic workflow.
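For context, glTF 2.0 packs the two scalars into one texture: roughness in the green channel and metalness in the blue channel, each scaled by a per-material factor. A tiny sketch of the decode (illustrative names):

```cpp
struct RGB { float r, g, b; };

// Per the glTF 2.0 spec, the metallicRoughness texture stores roughness
// in G and metalness in B (R is free, and is often used for packed
// occlusion). The per-material factors multiply the texel values.
void decodeMetallicRoughness(const RGB& texel,
                             float metallicFactor, float roughnessFactor,
                             float& metallic, float& roughness) {
    roughness = texel.g * roughnessFactor;
    metallic  = texel.b * metallicFactor;
}
```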

  • scenes/pathtracer_robot.glb Link

Metallic Workflow Demo

Direct Lighting & MIS

To stress the convergence speedup from MIS, Russian roulette is disabled for the renders in this section.

A tiny dark stripe is visible in some renders; this is because double-sided lighting is disabled by default in this pathtracer.

By default, the number of light samples is set to 3.

When sampling the direction of the next bounce, we mostly rely on importance sampling of the BSDF. This speeds up convergence for specular materials, since the sampling distribution closely matches the expected radiance distribution over the hemisphere. For diffuse/matte surfaces, however, this strategy can be improved: the dominant factor in their radiance distribution is the lights rather than the outgoing directions. Sampling the lights directly is therefore a valuable strategy for speeding up convergence on rough surfaces.

In this demo scene, 3 metal planes are lit by 4 cube lights. When we only sample the BSDF, the expected radiance on the metal planes converges quickly. When we only sample the lights, the rougher parts of the scene, such as the white back wall, converge faster. We therefore want a strategy that combines the advantages of both: multiple importance sampling.
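The standard way to combine the two strategies is Veach's power heuristic, which is also what pbrt uses; a sketch (illustrative name):

```cpp
// Power heuristic with beta = 2: weights one sampling strategy against
// another by their pdf values. nf/fPdf belong to the strategy being
// weighted, ng/gPdf to the competing one. Samples that the other
// strategy would also generate with high probability get down-weighted,
// which is what suppresses the variance spikes of either strategy alone.
float powerHeuristic(int nf, float fPdf, int ng, float gPdf) {
    float f = nf * fPdf;
    float g = ng * gPdf;
    return (f * f) / (f * f + g * g);
}
```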

  • scenes/pathtracer_mis_demo.glb Link
Only sample bsdf 500spp Only sample light 500spp MIS 500spp

To see more details about this part, see this part of pbrt or this post of mine.

Tested on the bunny scene; faster convergence can be observed.

  • scenes/pathtracer_bunny_mis.glb Link
Without MIS 256spp With MIS 256spp
Without MIS 5k spp With MIS 5k spp

Depth of Field

For depth of field, we define two variables: focal_length & aperture.

More details can be viewed in this post.
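The thin-lens model behind these two variables is simple: jitter the ray origin across the aperture, then re-aim the ray at the point where the original pinhole ray crosses the focal plane. A 1D host-side sketch (illustrative names; a real implementation samples a 2D point on the lens disk):

```cpp
#include <random>

// Thin-lens depth of field for a camera looking down +z. Geometry at
// distance focalLength stays sharp; everything else blurs in proportion
// to the aperture size.
void thinLens(std::mt19937& rng, float aperture, float focalLength,
              float& originX, float& dirX, float& dirZ) {
    // where the un-jittered (pinhole) ray crosses the focal plane
    float t = focalLength / dirZ;
    float focusX = originX + t * dirX;
    // offset the origin across the aperture
    std::uniform_real_distribution<float> u(-0.5f, 0.5f);
    originX += aperture * u(rng);
    // re-aim at the focus point (direction left unnormalized for brevity)
    dirX = focusX - originX;
    dirZ = focalLength;
}
```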

Depth of Field (Aperture=0.3)

How to Run

Due to time limits, this code has not been refactored to be production-ready. Many features are controlled by macros or are even hardcoded, which is admittedly not ideal. More refactoring may happen in the future.

To run the program for now, users need to adjust the code directly. Most changes should be made within pathtrace.cu:

```cpp
...
// Default settings
#define USE_FIRST_BOUNCE_CACHE 0
#define USE_SORT_BY_MATERIAL 0
#define MIS 1 // Multiple Importance Sampling
#define RUSSIAN_ROULETTE 0
#if RUSSIAN_ROULETTE
	#define RR_THRESHOLD 0.7f
#endif
#define USE_ENV_MAP 1
#define USE_BVH 1
#define TONE_MAPPING 0
...
// Change loaded scene here
static Scene * hst_scene = new Scene("..\\scenes\\pathtracer_robots_demo.glb");
...
```

Most settings should be self-explanatory.

Another place where users can change settings is bvh.h:

```cpp
...
#define BVH_NAIVE 1
#define BVH_SAH 0
...
```

Here, we choose whether the BVH uses a naive split or a surface area heuristic.
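For reference, the SAH estimates the cost of a candidate split from the surface areas of the two child boxes relative to the parent, and the builder keeps the cheapest candidate. A sketch of the cost function (the traversal/intersection constants are illustrative, not this project's):

```cpp
// SAH cost of splitting a node into two children:
//   C = C_trav + (SA_L / SA_P) * N_L * C_isect
//            + (SA_R / SA_P) * N_R * C_isect
// The surface-area ratios approximate the probability that a random ray
// through the parent box also enters each child box.
float sahCost(float saLeft, int nLeft, float saRight, int nRight,
              float saParent, float cTrav = 1.0f, float cIsect = 2.0f) {
    return cTrav + (saLeft  / saParent) * nLeft  * cIsect
                 + (saRight / saParent) * nRight * cIsect;
}
```

The naive split, by contrast, just cuts the primitive list in half along the longest axis, which is cheaper to build but usually yields slower traversal.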

Other Result

Future (If possible)

Cuda Side

  • More cuda optimization
    • Bank conflict
    • Loop unroll
      • Light sample loop (if multiple light rays)
    • Higher parallelism (Use streams?)
  • Tile-based raytracing
    • Potentially, it should increase rendering speed, as it maximizes locality within one pixel/tile. No more realtime camera movement, though.

Render Side

  • Adaptive Sampling
  • Mipmap
  • ReSTIR
  • Refractive
  • True BSDF (Add some subsurface scattering if possible?)
  • Volume Rendering (Ready for NeRF)

Below are the development history & some bloopers of this program. If you are not interested, there is no need to keep reading.

History

  • Load mesh within arbitrary scene

    • Triangle
    • Integrate tinygltf
    • Scene Node Tree
  • Core

    • G Buffer
    • Russian Roulette
    • Sort by material
  • More BSDF

    • Diffuse
    • Emissive
    • Microfacet
    • Reflective
    • Refractive
    • Disney
  • BVH

    • Basic BVH
      • BoundingBox Array
      • Construct BVH
      • Traverse BVH
    • Better Heuristics
      • SAH
    • MTBVH
  • Texture

    • Naive texture sampling
      • A Resource Manager to help get the handle to a texture?
    • Bump mapping
    • Displacement mapping
    • Deal with antialiasing
  • Better sampler

    • Encapsulate a sampler class
      • Gotta deal with cmake issue
    • Monte carlo sampling
    • Importance sampling
    • Direct Lighting
    • Multiple Importance Sampling
  • Camera

    • Jitter
    • Depth of field
    • Motion blur
  • Denoiser

    • Use Intel OpenImage Denoiser for now

Log

09.20

  • Basic raytracer
  • Refactor integrator
  • First triangle!

09.21-22

  • Load arbitrary scene (geometry only)
    • Triangle
    • Primitive assembly phase (this will not work; see the README of this commit)
    • Use tinygltf. Remember to check the data type before using an accessor.
    • Done with loading a scene with node tree! blender_reference rendered

      Can't tell how excited I am! Now my raytracer is open to most of the scenes!

      • Scene with parenting relationship with_parenting

09.23-09.26

Wasted too much time on OOP. Eventually used C-style coding.

09.26 Finally finished glTF loading and basic BSDFs.

  • A brief trial
    • Note that this difference might be due to the different BSDFs in use: for convenience, we use the most naive diffuse BSDF, while Blender uses the Principled BSDF by default.

09.27

Naive BVH (probably done...). Tested on a scene with 1k faces.

  • One bounce, normal shading
    • Without BVH: FPS 10.9
    • With BVH: FPS 53.4
    • 5 times faster
  • Multiple bounces
    • Without BVH: FPS 7.6
    • With BVH: FPS 22.8

09.28

  • SAH BVH (probably done...)

  • Texture sampling

    • Try billboarding

09.29

  • Texture mapping

    • Texture mapping test (baseColor shading only)

    • Bump mapping

      • Normals in world coordinate
      • Before bump mapping
      • After bump mapping: it might be a little difficult to notice the difference, so please observe the logo on the box, where more details are added.
    • Texture aliasing is indeed quite serious!

      • However, to implement antialiasing for texture mapping, I may need to consider implementing mipmapping.
  • Microfacet model (pbr) failed...

    • Need to read through the microfacet part and think carefully about how to actually use roughness and metallic

09.30

  • Microfacet
    • Metal Fresnel hack
    • Conductor
      • After mixing, need to consider how to sample
  • Camera
    • Antialiasing

10.1-10.2 Try to refactor camera

  • Failed. glTF seems to have a really ambiguous definition of the camera.

10.3

  • Denoising
    • OpenImage Denoiser built
      • CPU only for now
      • Figure out how to build oidn for cuda
    • Integrate it into project

10.4-10.6

  • Microfacet

10.7

  • Environment map

10.8

  • Fix random number issue (maybe generate a better random number array in the future?)

    • Before
    • After

    Please notice the fracture on the rabbit's head before the fix.

10.9

  • MIS (Finally!)

  • Russian Roulette

    • Pro: Speeds up rendering by 60%
    • Con: Lowers convergence speed
  • Depth of field

    • Add a realtime slider to adjust