Project 1: Zhihao Ruan #3

Open: wants to merge 32 commits into `main`.

Commits (32):
- `cad68ad` run clang-format (Sep 1, 2021)
- `ea88301` ignore vscode (Sep 2, 2021)
- `23ab7cb` finish section 1.2 (Sep 6, 2021)
- `5204087` print CUDA compilation flags; set OpenGL preference to GLVND (Sep 6, 2021)
- `af155d0` add one-liner indicator for Thrust unit test (Sep 6, 2021)
- `2838b17` add const specifiers to brute force functions (Sep 6, 2021)
- `fbfdfaf` finish kernComputeIndices() in 2.1 (Sep 6, 2021)
- `b09fd60` remove TODO in kernComputeIndices() (Sep 6, 2021)
- `27dae94` finish kernIdentifyCellStartEnd() for part 2.1 (Sep 6, 2021)
- `8eafa66` fix var type error in kernComputeIndices() (Sep 6, 2021)
- `e8dd53f` add more wrappers for computing grid idx from pos (Sep 6, 2021)
- `3eab61d` fix CUDA block allocation error in naive simulation (Sep 7, 2021)
- `2418c9d` add cudaDeviceSynchronize() in naive simulation (Sep 7, 2021)
- `92f9d68` WIP on debugging part 2.1 (Sep 7, 2021)
- `1867c46` turn on uniform grid simulation (Sep 7, 2021)
- `0c918fd` cudaFree part 2.1 device arrays (Sep 7, 2021)
- `cccf816` add include of device_launch_parameters.h (Sep 7, 2021)
- `a991948` fix bugs where `<=` mistook as `<`; rewrite 1/gridCellWidth to gridIn… (Sep 7, 2021)
- `dce1a99` fix bug where using cellWidth instead of cellWidth/2 to check boid ce… (Sep 7, 2021)
- `8e72fbe` bug fix (Sep 8, 2021)
- `a2a506a` add FIXME for potential kernel code improvement (Sep 8, 2021)
- `75f1f92` improvement on 8-neighbor search (Sep 9, 2021)
- `a4f350d` add "print_glm" utility (Sep 10, 2021)
- `b94d02f` fix boid vel2 calculation bug in Part 2.1 (Sep 10, 2021)
- `ecdd989` change grid loop from x-y-z to z-y-x (Sep 10, 2021)
- `a352091` finish part 2.3 (Sep 10, 2021)
- `68a6afc` add Part 2.1, 2.3 demo GIF (Sep 10, 2021)
- `b61d893` increase block size; increase num. of boids (Sep 12, 2021)
- `4521fe6` double rendering for hiDPI display (Sep 12, 2021)
- `95e6fcc` add visualization; finish write-up (Sep 12, 2021)
- `fc54dd3` Update CUDA Computes List in CMake (#1) (shineyruan, Sep 12, 2021)
- `bf2f17f` Merge pull request #2 from CIS565-Fall-2021/main (shineyruan, Sep 12, 2021)
2 changes: 2 additions & 0 deletions .gitignore
@@ -562,3 +562,5 @@ xcuserdata
*.xccheckout
*.moved-aside
*.xcuserstate

.vscode
5 changes: 5 additions & 0 deletions CMakeLists.txt
@@ -29,12 +29,15 @@ list(APPEND CUDA_NVCC_FLAGS ${CUDA_GENERATE_CODE})
list(APPEND CUDA_NVCC_FLAGS_DEBUG "-g -G")
set(CUDA_VERBOSE_BUILD ON)

message(STATUS "CUDA compilation flags: ${CUDA_NVCC_FLAGS}")

if(WIN32)
# Set up include and lib paths
set(CUDA_HOST_COMPILER ${CMAKE_CXX_COMPILER} CACHE FILEPATH "Host side compiler used by NVCC" FORCE)
endif(WIN32)
########################################

set(OpenGL_GL_PREFERENCE GLVND)
find_package(OpenGL REQUIRED)

if(UNIX)
@@ -67,13 +70,15 @@ set(headers
src/kernel.h
src/main.hpp
src/utilityCore.hpp
src/print_glm.hpp
)

set(sources
src/glslUtility.cpp
src/kernel.cu
src/main.cpp
src/utilityCore.cpp
src/print_glm.cpp
)

list(SORT headers)
40 changes: 34 additions & 6 deletions README.md
@@ -1,11 +1,39 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* Zhihao Ruan ([email protected])
* [LinkedIn](https://www.linkedin.com/in/zhihao-ruan-29b29a13a/), [personal website](https://zhihaoruan.xyz/)
* Tested on: Ubuntu 20.04 LTS, Ryzen 3700X @ 2.22GHz 48GB, RTX 2060 Super @ 7976MB

![](images/uniform-grid-flocking-coherent-highdpi.gif)

## Introduction: Flocking Simulation

Flocking is the collective, coordinated motion of a large group of individuals. In nature it appears in flocks of birds and schools of fish: birds often fly together as a single mass in the sky, moving from one place to another. Remarkably, although the shape of the flock may change dramatically, each bird moves as if it knew the next move of every other bird, never diverging from the group.

Biologists have studied flocking behavior for a long time; in this context each individual is called a **boid**. One might wonder whether some form of communication within the flock keeps it together. In fact, no communication between pairs of individuals is needed: according to the [notes from Conrad Parker](http://www.vergenet.net/~conrad/boids/), each boid stays close to the others as long as it follows 3 simple rules:
1. Boids try to fly towards the centre of mass of neighboring boids.
2. Boids try to keep a small distance away from other objects (including other boids).
3. Boids try to match velocity with near boids.
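The three rules map almost directly onto a per-boid velocity update. Below is a minimal host-side C++ sketch of that update; the function name, distance thresholds, and scale factors are hypothetical names for illustration, not the project's actual constants.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

float dist(Vec3 a, Vec3 b) {
  Vec3 d = a - b;
  return std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
}

// Combine the three flocking rules into one velocity change for boid `self`:
//   rule 1 (cohesion):   steer toward the perceived center of mass of neighbors
//   rule 2 (separation): steer away from neighbors that are too close
//   rule 3 (alignment):  steer toward the average velocity of neighbors
Vec3 computeVelocityChange(int self, int n, const Vec3* pos, const Vec3* vel,
                           float rule1Dist, float rule2Dist, float rule3Dist,
                           float rule1Scale, float rule2Scale, float rule3Scale) {
  Vec3 center{0, 0, 0};    // rule 1: sum of neighbor positions
  Vec3 separate{0, 0, 0};  // rule 2: accumulated repulsion
  Vec3 avgVel{0, 0, 0};    // rule 3: sum of neighbor velocities
  int n1 = 0, n3 = 0;
  for (int i = 0; i < n; ++i) {
    if (i == self) continue;
    float d = dist(pos[i], pos[self]);
    if (d < rule1Dist) { center = center + pos[i]; ++n1; }
    if (d < rule2Dist) { separate = separate - (pos[i] - pos[self]); }
    if (d < rule3Dist) { avgVel = avgVel + vel[i]; ++n3; }
  }
  Vec3 dv{0, 0, 0};
  if (n1 > 0) dv = dv + (center * (1.0f / n1) - pos[self]) * rule1Scale;
  dv = dv + separate * rule2Scale;
  if (n3 > 0) dv = dv + avgVel * (1.0f / n3) * rule3Scale;
  return dv;
}
```

In the project this loop runs per thread in a CUDA kernel, one thread per boid; the sketch only shows the rule arithmetic.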


The objective of this project is to build a flocking simulation in CUDA based on these 3 simple rules. A demo of the final result is shown at the top of this page.

## Performance Analysis
**For each implementation, how does changing the number of boids affect performance? Why do you think this is?**
- The FPS decreases as the number of boids increases. The GPU must update more boid states per time step and therefore launch more threads per kernel; in the brute-force implementation each boid also checks every other boid, so the work grows quadratically with the boid count.

**For each implementation, how does changing the block count and block size affect performance? Why do you think this is?**
- The FPS increases as the block size increases: larger blocks allow more boids to be processed in parallel per block, improving occupancy and overall throughput.
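One concrete effect of the block size is on the number of blocks launched: with one thread per boid, the grid size is a ceiling division over the block size. A small sketch (the function name is mine, not the project's):

```cpp
// Ceiling division: one thread per boid, with the last block possibly
// only partially full. This mirrors the common CUDA launch pattern
//   dim3 grid((N + blockSize - 1) / blockSize);
int fullBlocksPerGrid(int numBoids, int blockSize) {
  return (numBoids + blockSize - 1) / blockSize;
}
```

For the 50k-boid runs measured above, a block size of 128 would launch 391 blocks, while 512 would launch only 98.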

**For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?**
- My performance did improve with the coherent grid compared to the scattered grid. This was the expected outcome: the GPU serves a warp's memory requests in contiguous chunks, and with the coherent grid all threads in a warp read neighboring boid data from the same chunk, so far fewer memory transactions are needed per time step.
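The coherent-grid optimization amounts to one extra gather pass: after sorting boid indices by grid cell, positions and velocities are copied into cell-contiguous buffers so that consecutive threads read consecutive memory. A minimal host-side sketch (the buffer and function names are hypothetical):

```cpp
#include <vector>
#include <cstddef>

// Gather values into cell order. `sortedIdx[i]` is the original index of
// the i-th boid after sorting by grid cell, so after this pass boids in
// the same cell sit next to each other in memory (coherent reads for the
// simulation kernel).
void reshuffleCoherent(const std::vector<int>& sortedIdx,
                       const std::vector<float>& bufIn,
                       std::vector<float>& bufOut) {
  for (std::size_t i = 0; i < sortedIdx.size(); ++i) {
    bufOut[i] = bufIn[sortedIdx[i]];
  }
}
```

In the project this gather runs as its own kernel over both the position and velocity arrays before the neighbor-search kernel.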

### FPS Graph Plots

**FPS change with increasing number of boids in different modes:**
![](images/FPS_num_boids.png)

**FPS change with different block sizes (coherent uniform grid, number of boids set to 50k):**
![](images/FPS_block_size.png)

10 changes: 10 additions & 0 deletions cmake/.clang-format
@@ -0,0 +1,10 @@
---
BasedOnStyle: Google
---
Language: Cpp
AccessModifierOffset: -2
AlignConsecutiveAssignments: true
AlignConsecutiveMacros: true
SortIncludes: false
---
