Project 3: Matt Elser #24

Open
wants to merge 31 commits into
base: main
6131dfc
got casting of bounce rays working
mattelser Sep 27, 2021
9e9d075
got basic shading working
mattelser Sep 28, 2021
29881c3
Began stream compaction
mattelser Sep 29, 2021
7575eab
use stream compaction to terminate paths by bounce count each iteration
mattelser Sep 29, 2021
ce87fc9
make path culling more efficient
mattelser Sep 30, 2021
9a9c218
can now sort by materials. combined that functionality with culling.
mattelser Oct 1, 2021
c5a51a3
got core functionality done
mattelser Oct 1, 2021
025727a
added refraction and fresnel
mattelser Oct 2, 2021
ae3b16e
made fresnel more configurable, found/fixed firefly issue
mattelser Oct 2, 2021
fd04a07
WIP on DOF
mattelser Oct 2, 2021
55f8401
more WIP on DOF
mattelser Oct 2, 2021
12efa92
fixed refraction issue
mattelser Oct 3, 2021
e104c2a
got DOF working
mattelser Oct 3, 2021
1deb385
use of DOF now a scene setting
mattelser Oct 3, 2021
8b48d8b
WIP on sub-pixel sampling
mattelser Oct 3, 2021
514e53e
chasing a possible race condition with subpixel sampling
mattelser Oct 4, 2021
f3218f4
back out all subpixel sampling code, it has a persistant strange display
mattelser Oct 5, 2021
b901809
added antialiasing
mattelser Oct 5, 2021
610d861
initial groundwork for obj loading
mattelser Oct 7, 2021
de35db9
WIP on arbitrary mesh loading
mattelser Oct 8, 2021
59b4970
got teapot on screen (with borked normals)
mattelser Oct 8, 2021
003e9f8
got mesh loading working!
mattelser Oct 8, 2021
23817b8
got bounding box detection working
mattelser Oct 8, 2021
a6c4fdd
make bbox configurable option
mattelser Oct 8, 2021
8765b60
WIP adaptive sampling
mattelser Oct 9, 2021
397bdbd
more WIP on adaptive sampling
mattelser Oct 9, 2021
b583899
fixed hit detection issue: back of faces were not being hit
mattelser Oct 9, 2021
f92fde6
README draft complete, adaptive sampling nearly complete, but has bug
mattelser Oct 10, 2021
3795f34
add two test scenes
mattelser Oct 10, 2021
6a9b5cd
improved readme
mattelser Oct 12, 2021
5f9f65c
fix a bug in back face intersection detection and lessen anti-alias j…
mattelser Oct 19, 2021
179 changes: 174 additions & 5 deletions README.md
@@ -3,11 +3,180 @@ CUDA Path Tracer

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**

* Matt Elser
  * [LinkedIn](https://www.linkedin.com/in/matt-elser-97b8151ba/), [Twitter](https://twitter.com/__mattelser__)
* Tested on: Windows 10, i3-10100F @ 3.6GHz 16GB, GeForce 1660 Super 6GB

### Features
![Stanford tyrannosaurus](img/trex.png)
![DOF example](img/DOF.png)

This is a GPU-based forward path tracer. It renders a scene by tracing "camera rays" as they bounce around the scene,
each one simulating an individual light ray. The renderer supports the following features:
- Arbitrary mesh loading using .obj file format
- Multiple shader BSDFs, including refraction
- Anti-aliasing
- Depth of Field
- Several optimizations
- Adaptive Sampling* (not fully implemented)
- Bloopers

### Arbitrary mesh loading
The renderer supports loading arbitrary meshes via .obj files using
[`tinyobjloader`](https://github.com/tinyobjloader/tinyobjloader).

A bounding box is calculated at load time and used to optimize ray intersection
detection. The bounding box is a mesh itself, consisting of tris. Each ray in
the scene is initially tested for intersection with these tris, and only if an
intersection is found will the ray be checked against the mesh's tris.
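
The culling idea can also be sketched with an axis-aligned box that is tested analytically. Note that this is a swapped-in technique (the standard "slab" test), not the triangle-based bounding box described above, and `Aabb` and `rayHitsAabb` are hypothetical names chosen for illustration. One property of this variant is that clamping the entry distance to zero makes a ray that starts inside the box count as a hit.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

// Hypothetical axis-aligned bounding box: lower and upper corners.
struct Aabb { Vec3 lower, upper; };

// Slab test: intersect the ray with the three pairs of axis-aligned planes
// and check that the resulting parameter intervals overlap.
inline bool rayHitsAabb(Vec3 origin, Vec3 dir, const Aabb& box) {
    const float o[3]  = {origin.x, origin.y, origin.z};
    const float d[3]  = {dir.x, dir.y, dir.z};
    const float lo[3] = {box.lower.x, box.lower.y, box.lower.z};
    const float hi[3] = {box.upper.x, box.upper.y, box.upper.z};

    float tNear = 0.0f;  // clamped to the ray start, so origins inside the box still hit
    float tFar  = std::numeric_limits<float>::max();

    for (int axis = 0; axis < 3; ++axis) {
        if (d[axis] == 0.0f) {
            // Ray is parallel to this slab: it must already lie between the planes.
            if (o[axis] < lo[axis] || o[axis] > hi[axis]) return false;
            continue;
        }
        float t0 = (lo[axis] - o[axis]) / d[axis];
        float t1 = (hi[axis] - o[axis]) / d[axis];
        if (t0 > t1) std::swap(t0, t1);
        tNear = std::max(tNear, t0);
        tFar  = std::min(tFar, t1);
        if (tNear > tFar) return false;  // intervals no longer overlap
    }
    return true;
}
```

Only when `rayHitsAabb` returns true would the ray go on to be tested against the mesh's triangles.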

One issue discovered was that the triangle intersection function initially
used (`glm::intersectRayTriangle()`) does not report intersections with the
"back" of faces. This caused problems for open meshes like the Newell Teapot
and for meshes assigned a refraction shader, as can be seen here:
![newell teapot hole image](img/back_face_cull_issue.png)
Also somewhat visible through the noise is a secondary effect of this back face
issue: collisions with the bounding box are not detected from within the
bounding box. Notice the sharp lines on the floor cutting off the diffuse
bounce close to the teapot. This is from rays on the floor near the teapot
casting outward and missing the teapot's bounding box, and therefore never
being checked against any of the teapot's actual tris.
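
A two-sided ray/triangle test can be sketched as follows, assuming the Möller–Trumbore formulation. This is an illustration of the idea rather than the fix actually used in the project, and `Vec3` and `intersectTriangleTwoSided` are hypothetical names. The key difference from a one-sided test is that the determinant is rejected only when it is close to zero, not when it is negative.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns true and writes the hit distance to tOut if the ray hits the
// triangle (v0, v1, v2) from either side.
inline bool intersectTriangleTwoSided(Vec3 origin, Vec3 dir,
                                      Vec3 v0, Vec3 v1, Vec3 v2, float& tOut) {
    const float kEpsilon = 1e-7f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    // A one-sided test would reject det < 0 here, which discards back faces.
    if (std::fabs(det) < kEpsilon) return false;  // ray is parallel to the triangle
    float invDet = 1.0f / det;

    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * invDet;                 // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;

    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * invDet;               // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;

    float t = dot(e2, q) * invDet;
    if (t <= kEpsilon) return false;              // hit is behind the ray origin
    tOut = t;
    return true;
}
```

With this test, a ray that starts inside a closed or open mesh still reports the triangle it exits through, which is what refraction requires.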

Performance impacts:
- No spatial optimizations are made (other than the bounding box), so each
ray that hits the bounding box is checked against every triangle in the mesh.
As a result, large meshes can have a significant impact on render times.

Known limitations:
- Only .obj files consisting entirely of triangles are supported. Quads are
interpreted with inconsistent results by tinyobjloader.
- Vertex normals must be included in the .obj file. Missing normals will
inherit whatever memory value the normals were initialized to.
### Multiple shaders
BSDFs are implemented for purely diffuse objects, objects that combine diffuse
and reflective behavior, and objects with both reflection and refraction. The
Fresnel effect is calculated using Schlick's approximation.

Since physically correct models do not always provide the preferred result, the
Fresnel effect is tuneable via a user parameter. Note that this is separate from the
index of refraction (which is also tuneable): it is an additional parameter that controls
the power used in Schlick's approximation.
![Fresnel power comparison](img/fresnel_comparison.png)
The sphere on the left has a Fresnel power of 1, which dramatically changes
the reflect/refract ratio in favor of reflection. The sphere in the middle has
a Fresnel power of 3, which is only a subtle change from the (standard)
Fresnel power of 5 on the rightmost sphere.
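
A minimal sketch of Schlick's approximation with the exponent exposed as a parameter is shown below. The function name `schlickReflectance` and its argument names are hypothetical; only the standard exponent of 5 and the idea of a tuneable power come from the description above.

```cpp
#include <cmath>

// Fraction of light that is reflected (the rest is refracted).
// cosTheta:     cosine of the angle between the incoming ray and the normal, in [0, 1]
// iorOutside:   index of refraction of the medium the ray is travelling in
// iorInside:    index of refraction of the medium the ray is entering
// fresnelPower: exponent in Schlick's approximation (5 is the standard value)
inline float schlickReflectance(float cosTheta, float iorOutside, float iorInside,
                                float fresnelPower) {
    float r0 = (iorOutside - iorInside) / (iorOutside + iorInside);
    r0 *= r0;  // reflectance at normal incidence
    return r0 + (1.0f - r0) * std::pow(1.0f - cosTheta, fresnelPower);
}
```

Because `1 - cosTheta` lies in [0, 1], lowering the power raises the result at every angle between normal and grazing incidence, which matches the stronger reflections seen at a Fresnel power of 1. A path tracer would typically compare this value against a uniform random number to choose between the reflected and refracted direction.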

Known limitations:
- objects with refractions are assumed to have reflection. An object can be reflective without
refraction, but not vice-versa.

### Anti-aliasing
Anti-aliasing was accomplished by jittering the origin of camera rays for the initial bounce.
![image without anti-aliasing](img/antialias_off.png)
![image with anti-aliasing](img/antialias_on.png)
The first image has no antialiasing and has jagged pixelated edges along
horizontal lines. The second image has cleaner lines with no notable "jaggies".
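
One common way to jitter camera rays is to offset the sample position inside each pixel by a small random amount before generating the ray. The sketch below assumes that approach; `jitteredSample` and `jitterScale` are hypothetical names, and it uses a host-side `std::mt19937` where a GPU kernel would use a per-thread random number generator.

```cpp
#include <random>

struct Vec2 { float x, y; };

// Returns the image-plane sample position for pixel (px, py).
// jitterScale = 0 always samples the pixel center (no anti-aliasing);
// jitterScale = 1 spreads samples over the full width of the pixel.
inline Vec2 jitteredSample(int px, int py, float jitterScale, std::mt19937& rng) {
    std::uniform_real_distribution<float> offset(-0.5f, 0.5f);
    float dx = jitterScale * offset(rng);
    float dy = jitterScale * offset(rng);
    return {px + 0.5f + dx, py + 0.5f + dy};
}
```

Averaging many iterations with different offsets blends the colors on either side of an edge in proportion to how much of the pixel each one covers.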

Performance impacts:
- An unnoticeable impact on the time each pixel takes to converge, as a result of adding a small amount of randomness.

Known limitations:
- This cannot be combined with the "first bounce cache" optimization, since it depends on
slightly varied camera ray origins each iteration.
### Depth of Field
Depth of field can optionally be simulated, with tuneable parameters for aperture size, lens radius,
and focal distance.
![DOF example](img/DOFOFF.png)
![DOF example](img/DOF.png)
The first image shows a scene with no simulated depth of field. The second
image has depth of field turned on, simulating the blur according to distance
in the same way a physical camera lens would.
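
The usual way to simulate this is a thin-lens camera: each ray starts from a random point on a lens disk and is aimed at the point where the original pinhole ray crosses the plane of focus. The sketch below assumes that model in camera space; `thinLensRay` and its parameters are hypothetical, and it uses only a lens radius and a focal distance rather than the project's full set of parameters.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin, dir; };

inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Camera space: the camera sits at the origin and looks down +z.
// pinholeDir: direction of the ordinary pinhole ray for this pixel (z > 0).
// u1, u2:     uniform random numbers in [0, 1).
inline Ray thinLensRay(Vec3 pinholeDir, float lensRadius, float focalDistance,
                       float u1, float u2) {
    const float kTwoPi = 6.28318530718f;

    // Point where the pinhole ray crosses the plane of focus z = focalDistance.
    float tFocus = focalDistance / pinholeDir.z;
    Vec3 focusPoint = {pinholeDir.x * tFocus, pinholeDir.y * tFocus, focalDistance};

    // Uniformly sample a point on the lens disk (sqrt keeps the density uniform).
    float r = lensRadius * std::sqrt(u1);
    float phi = kTwoPi * u2;
    Vec3 lensPoint = {r * std::cos(phi), r * std::sin(phi), 0.0f};

    Vec3 toFocus = {focusPoint.x - lensPoint.x,
                    focusPoint.y - lensPoint.y,
                    focusPoint.z - lensPoint.z};
    return {lensPoint, normalize(toFocus)};
}
```

Every ray for a given pixel passes through the same point on the plane of focus, so objects at that distance stay sharp while nearer and farther objects are smeared across neighboring pixels. A lens radius of zero reduces to the pinhole camera.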

Performance impacts:
- Using DOF requires a greater number of iterations to produce a clean image. The blur is the result
of a stochastic process, so the greater the blur, the larger the variance of each blurred pixel.

Known limitations:
- This cannot be combined with the "first bounce cache" optimization, since it depends on
slightly varied camera rays each iteration.
### Optimizations
- First bounce cache
  An option is provided to cache the first bounce of each camera ray from iteration 1, and use that cache
  for each subsequent iteration (until the camera is moved, generating a new iteration 1 and a new cache).
- Sort by materials
  In order to decrease divergence (i.e. multiple threads in a warp taking
  different code paths as a result of conditionals), rays can optionally be
  sorted by their material id. This minimizes the number of warps containing
  different materials, which may take different amounts of time as a result of
  evaluating different BSDFs.
- Use stream compaction to cull dead bounces
  Bounces that do not hit an object (i.e. which go off into space) are culled every iteration.
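
The culling and sorting steps can be sketched on the host with the standard library. This is a CPU analogue under stated assumptions, not the project's kernels: `PathSegment` and its fields are hypothetical, and on the GPU the corresponding operations would typically be Thrust's `partition` and `sort_by_key`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-path state.
struct PathSegment {
    int pixelIndex;
    int materialId;
    int remainingBounces;  // 0 means the path has terminated
};

// Stream compaction: move live paths to the front (keeping their order)
// and return how many remain.
inline std::size_t compactLivePaths(std::vector<PathSegment>& paths, std::size_t numPaths) {
    auto first = paths.begin();
    auto last = first + static_cast<std::ptrdiff_t>(numPaths);
    auto firstDead = std::stable_partition(first, last,
        [](const PathSegment& p) { return p.remainingBounces > 0; });
    return static_cast<std::size_t>(firstDead - first);
}

// Group the live paths by material so that neighboring threads shade the same BSDF.
inline void sortByMaterial(std::vector<PathSegment>& paths, std::size_t numLive) {
    auto first = paths.begin();
    auto last = first + static_cast<std::ptrdiff_t>(numLive);
    std::stable_sort(first, last,
        [](const PathSegment& a, const PathSegment& b) { return a.materialId < b.materialId; });
}
```

The count returned by `compactLivePaths` is what every later step in the same bounce should use as its path count. Launching with the pre-compaction count would process dead paths again.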

The following data was gathered from a single test scene using multiple
shaders, across all available BSDFs. All renders were run to 100 iterations at
a resolution of 720x480. Here is the test scene at 5000 iterations:
![test scene full render](img/test_scene.png)

![optimization comparison](img/optimization_comparison.png)
Performance impacts:
- Notably, every optimization is slightly slower at a trace depth of 8, where
the benefit of the optimization has not yet outweighed its overhead.
- First bounce cache provides a steady but minor improvement.
- Stream compaction provides the most dramatic improvement, even in a scene
that is mostly filled by collidable objects.
- Sorting by material provides a notable decrease in render times, which grows
slightly as the trace depth increases.
- All optimizations provide a performance increase of approximately 2x!

Known limitations:
- As noted above, first bounce cache cannot be combined with DOF or anti-aliasing.
### Adaptive Sampling* (incomplete)
(Note: this feature is incomplete. Everything described here is implemented, but a
bug prevents it from working properly.)

Adaptive sampling is the process of determining whether a pixel needs further
iterations, and only sampling further if so. To do this, each pixel's variance
is tracked, and updated each time the pixel color is updated (at the "final
gather" stage). After a user-defined minimum number of samples, this variance
is compared with a user-defined value. Once the pixel's variance falls below
this threshold, its color is considered "converged" and it is marked for
culling. Converged pixels are culled and rays for these pixels are no longer
cast, saving potentially significant resources.
As a result of this process, pixels take varying numbers of iterations. The
number of iterations needed by a pixel can be useful information, and so with
each saved image, a "heatmap" is saved alongside it. This heatmap shows the
number of iterations taken for each pixel, and can therefore be used to display
areas where greater or fewer resources are spent.

How this feature is incomplete:
As stated above, all of the aspects of adaptive sampling described are
implemented. However, some misalignment exists between the culling of pixels
and the casting of rays in the following iteration. As a result, the last
pixels of the image are always culled instead of the pixels which have
converged. This is likely due to a sorting mismatch, or using the wrong number
of paths when calling some relevant function. This has not been fixed in time.

![Dysfunctional adaptive sampling](img/adaptiveSampleBug.png)
![accurate heatmap showing incorrect sampling](img/heatmap.png)
White represents pixels which required the maximum number of iterations, black
indicates immediate culling. Note the gradient in the heat map showing that
pixels are culled from the bottom up as iterations increase. This reflects what
can be seen in the render itself: the bottom is noisy and gets less noisy
towards the top.
### Bloopers
![stream compaction VHS](img/stream_compaction_blooper.png)
This image is the result of a stream compaction issue. The VHS-like look of it
is amplified by the banded noise at the top, which is the result of a race
condition when sorting paths.
![refraction hall of horrors](img/refraction_mesh_blooper.png)
![refraction hall of horrors](img/refraction_mesh_blooper2.png)
These works of art were created while attempting to fix the mesh back face
collision detection issue described above.

### Notable Sources
- As noted above,
[`tinyobjloader`](https://github.com/tinyobjloader/tinyobjloader) was used for
mesh loading.
- As noted in the comments, Stack Exchange and Stack Overflow
provided the math for two vector manipulation methods.
- Matt Pharr & Greg Humphreys' *Physically Based Rendering* textbook provided useful context.

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
