Project 1: (Charles) Zixin Zhang #7

Open
wants to merge 38 commits into
base: main
38 commits
beac6fd
Change app resolution and dot size
Sep 9, 2021
be45504
complete Rule 1
Sep 9, 2021
082f1e6
First Part First Try
Sep 9, 2021
b90439b
Fixed: some boids don't move at all, or change color.
Sep 9, 2021
7f4ccc9
switch to Uniform Grid in main.cpp
Sep 10, 2021
8e765b0
Allocate and deallocate addtional buffers
Sep 10, 2021
6cb84f3
unit tested labeling each boid with the index of its grid cell
Sep 10, 2021
33719da
unit tested setting up a parallel array of int indices as pointers to…
Sep 10, 2021
6b0e4f9
unit tested sorting
Sep 10, 2021
d53f57b
unit tested populating start and end indices arrays
Sep 10, 2021
8c5bd37
Part 2 first try
Sep 10, 2021
eb58033
Fixed: should account more neighboring cells
Sep 10, 2021
34100dd
Fixed: some particles not moving, particles gone after some time
Sep 11, 2021
21c2a3c
Delete useless GridApproaches functions
Sep 11, 2021
708906b
Optimization: Only check exactly 8 cells
Sep 11, 2021
5863fd6
Part 2.3 Complete
Sep 11, 2021
b1c2c89
Fixed: Need to pingpong dev_pos and dev_vel
Sep 11, 2021
b6fa391
Delete redundent dev_thrust ponters assignment
Sep 11, 2021
3d95792
Highlight checkpoint
Sep 12, 2021
d5e073a
Test ReadMe
Sep 12, 2021
877581c
Typo Fixed
Sep 12, 2021
3469de1
Smaller Logo
Sep 12, 2021
09db353
Fixed typo
Sep 12, 2021
6b4138c
Update REAME
Sep 12, 2021
cbbf610
Sunglasses
Sep 12, 2021
7d26f3e
Checkpoint: where you generate first image in README
Sep 12, 2021
e294e35
Upload plotting pics
Sep 12, 2021
061af0e
First Draft
Sep 12, 2021
93d693d
Merge branch 'CIS565-Fall-2021:main' into main
Sep 12, 2021
a77901f
Better number of boids
Sep 12, 2021
8c3de49
Typo fixed
Sep 12, 2021
24434da
Add more demo
Sep 12, 2021
26bcb50
center the about project image
Sep 12, 2021
d64df62
add matliplot
Sep 12, 2021
af7a78f
Update README
Sep 12, 2021
6f29c45
Fixed Typo
Sep 12, 2021
5a84c43
Typo Fixed
Sep 12, 2021
d6ccf86
Update REAME
Sep 13, 2021
90 changes: 82 additions & 8 deletions README.md
@@ -1,11 +1,85 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**
<p align="center">
<img src="images/logo.png" alt="Logo" width="150" height="150">
<h2 align="center">Author: (Charles) Zixin Zhang</h2>
<p align="center">
A flocking simulation based on the <strong>Reynolds Boids algorithm</strong>
</p>
</p>

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
---
## About The Project
<p align="center">
<img src="images/logo.gif" alt="Outside Cube Pic" width="400" height="400">
</p>

### (TODO: Your README)

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
A flocking simulation based on the <strong>Reynolds Boids algorithm</strong>, along with two levels of optimization: a <strong>uniform grid</strong>, and a <strong>uniform grid with semi-coherent memory access</strong>.

---
## Highlights

<p align="center">

<img src="images/outSideCube.gif" alt="Outside Cube Pic" width="640" height="360">
<img src="images/insideCube.gif" alt="Inside Cube Pic" width="640" height="360">
</p>


Stats:
- Coherent uniform grid approach
- CPU: i7-10700F @ 2.90GHz
- GPU: SM 8.6 NVIDIA GeForce RTX 3080
- Number of Boids: 12 million
- Average FPS: ~40

Note: The first animation uses a larger timestep to speed up the simulation and make the overall trend easier to observe, whereas the second uses a smaller timestep to show the movement of the particles more clearly (it also looks cool :sunglasses:).

---

## Performance Analysis

In this project, I investigate 3 approaches to implement the Reynolds Boids algorithm:

1. The naive approach has each boid check every other boid in the simulation.
2. The uniform grid approach culls unnecessary neighbor checks using a data structure called a uniform spatial grid.
3. The coherent uniform grid approach improves upon the second approach by cutting one level of indirection when accessing the boids' data.
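
As a rough illustration of the uniform grid idea, the sketch below labels a boid with the index of the grid cell that contains it. This is a CPU-side sketch, not the project's CUDA kernel: the names `Vec3`, `gridIndex3Dto1D`, and `cellIndexOf`, and the grid parameters, are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D vector type used only for this sketch.
struct Vec3 { float x, y, z; };

// Flatten a 3D cell coordinate into a 1D index (x varies fastest).
int gridIndex3Dto1D(int x, int y, int z, int gridResolution) {
    return x + y * gridResolution + z * gridResolution * gridResolution;
}

// Label a boid with the index of the cell that contains its position.
// gridMin is the minimum corner of the grid; the grid is assumed to be a
// cube with gridResolution cells per axis, each cellWidth wide.
int cellIndexOf(const Vec3& pos, const Vec3& gridMin,
                float cellWidth, int gridResolution) {
    assert(cellWidth > 0.0f && gridResolution > 0);
    int cx = static_cast<int>(std::floor((pos.x - gridMin.x) / cellWidth));
    int cy = static_cast<int>(std::floor((pos.y - gridMin.y) / cellWidth));
    int cz = static_cast<int>(std::floor((pos.z - gridMin.z) / cellWidth));
    return gridIndex3Dto1D(cx, cy, cz, gridResolution);
}
```

Boids that share a cell index are neighbor candidates for each other, so a boid only has to look at the few cells around its own instead of at every other boid.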

---
To validate these optimizations, I use ```Matplotlib``` to plot how the framerate changes as the number of boids increases for each of the 3 approaches. The average framerate is estimated by eye. Note that the experiments below use ```scene_scale=100.0f```, because the scene scale determines how densely a given number of particles fills the scene and therefore affects FPS. Additionally, I consider 30 to 60 FPS to be an acceptable framerate.

<img src="images/naive.png">

<img src="images/uniform.png">

<img src="images/coherent.png">

Based on the 3 plots above, I conclude that each step from the naive approach to the coherent uniform grid approach yields approximately a **10x** improvement in the number of boids the method can handle. For example, the naive approach can handle tens of thousands of particles, whereas the coherent grid approach can handle millions of particles with ease. The optimizations work as expected because of two factors:

1. Most neighbor checks are culled, because each boid only checks particles in at most 8 cells.
2. The extra level of indirection incurred when accessing the position/velocity arrays is eliminated. This is done by reshuffling the arrays so that the positions and velocities of all boids in one cell are contiguous in memory.
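
The reshuffling step in the second factor can be sketched on the CPU as follows. This is only an illustration under assumptions: the actual project runs on the GPU, and the names `sortBoidsByCell` and `reshuffle` are hypothetical. The idea is to sort boid indices by cell index, then gather the per-boid data into that order so that a cell's boids sit next to each other in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical 3D vector type used only for this sketch.
struct Vec3 { float x, y, z; };

// Return boid indices ordered by the grid cell each boid falls in.
// cellOfBoid[i] is the cell index of boid i.
std::vector<int> sortBoidsByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ...
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    return order;
}

// Gather per-boid data (positions or velocities) into cell order, so that
// data[order[i]] no longer has to be looked up through the index array.
std::vector<Vec3> reshuffle(const std::vector<Vec3>& data,
                            const std::vector<int>& order) {
    assert(data.size() == order.size());
    std::vector<Vec3> coherent(data.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        coherent[i] = data[order[i]];
    }
    return coherent;
}
```

After the gather, the neighbor search can read `coherent[i]` directly over a cell's start-to-end range, instead of going through the sorted index array on every access.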

Furthermore, the program runs more efficiently without visualization. Drawing all the boids in OpenGL takes time and resources.

---
I also plot framerate change with increasing block size to investigate the effect of block size on the efficiency of the algorithm. Note that the following parameters are used when running this experiment:

- Visualization: off
- Approach: coherent grid
- Number of boids: 50000
- scene_scale: 100.0f

<img src="images/blocksize.png">

At ```blocksize=1024```, the program achieves the highest framerate.
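
For context on what the block size changes, the sketch below shows the usual way a CUDA launch derives the number of blocks from the number of boids and the block size. The helper name `blocksFor` is an assumption for this example, not a function confirmed to exist in this project.

```cpp
#include <cassert>

// Number of thread blocks needed so that every object gets one thread:
// the integer ceiling of numObjects / blockSize.
int blocksFor(int numObjects, int blockSize) {
    assert(numObjects >= 0 && blockSize > 0);
    return (numObjects + blockSize - 1) / blockSize;
}
```

With 50000 boids and a block size of 1024, this gives 49 blocks (50176 threads), so the 176 surplus threads in the last block have no boid to process.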

---
In this implementation, the cell width of the uniform grid is hardcoded to be twice the neighborhood distance. Therefore, the program can get away with at most 8 neighbor cell checks. However, if I change the cell width to be the neighborhood distance, 27 neighboring cells will need to be checked. To investigate this further, two setups are used to compare the performance:

1. Uniform grid approach with 50000 boids
2. Uniform grid approach with 500000 boids

Using the first setup, checking 27 cells with ```gridCellWidth = std::max(std::max(rule1Distance, rule2Distance), rule3Distance);``` did not noticeably affect performance, since 50000 boids populate the space sparsely. Using the second setup, where boids populate the space densely, checking 27 cells performs better than checking only 8 cells.
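
The 8-versus-27 cell counts follow from how many cells the search interval `[p - d, p + d]` overlaps on each axis, where `d` is the neighborhood distance. The sketch below counts them; the function names are hypothetical, and it ignores clamping at the grid boundary.

```cpp
#include <cassert>
#include <cmath>

// Number of cells along one axis overlapped by [p - d, p + d],
// where d is the neighborhood distance.
int cellsTouchedPerAxis(float p, float neighborhoodDistance, float cellWidth) {
    assert(cellWidth > 0.0f && neighborhoodDistance >= 0.0f);
    int lo = static_cast<int>(std::floor((p - neighborhoodDistance) / cellWidth));
    int hi = static_cast<int>(std::floor((p + neighborhoodDistance) / cellWidth));
    return hi - lo + 1;
}

// Number of cells overlapped in 3D for a boid at (x, y, z).
int cellsTouched3D(float x, float y, float z,
                   float neighborhoodDistance, float cellWidth) {
    return cellsTouchedPerAxis(x, neighborhoodDistance, cellWidth) *
           cellsTouchedPerAxis(y, neighborhoodDistance, cellWidth) *
           cellsTouchedPerAxis(z, neighborhoodDistance, cellWidth);
}
```

One plausible explanation for the dense-scene result, offered here as an assumption rather than a measured finding: 8 cells of width `2d` cover a volume of `(4d)^3 = 64d^3`, while 27 cells of width `d` cover `(3d)^3 = 27d^3`. At uniform density the 27-cell search therefore examines fewer than half as many candidate boids, which can outweigh the cost of visiting more cells.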

### TODO

Replace the GIFs with a YouTube link: https://stackoverflow.com/questions/11804820/how-can-i-embed-a-youtube-video-on-github-wiki-pages
Binary file added images/blocksize.png
Binary file added images/coherent.png
Binary file added images/insideCube.gif
Binary file added images/logo.gif
Binary file added images/logo.png
Binary file added images/naive.png
Binary file added images/outSideCube.gif
189 changes: 189 additions & 0 deletions images/plotting/.ipynb_checkpoints/CUDA Flocking-checkpoint.ipynb

Large diffs are not rendered by default.

189 changes: 189 additions & 0 deletions images/plotting/CUDA Flocking.ipynb

Large diffs are not rendered by default.

Binary file added images/uniform.png