University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking

Dear Grader,

As you may not know, I registered late for the class (on the last day of registration). I negotiated the due dates with Professor Shehzan via email, and my Project 1 due date is Sept 22. However, as you can see, I submitted about 26 hours late. I know there are 4 late days without penalty, but I only want to use ONE on this project; I don't want to spend a second late day on just the extra 2 hours. You can deduct points for being one day late (hopefully only 2 hours!).

Thanks,

Xiao Wei
Tested on: Windows 10, i9-9900k @ 3.6GHz 16.0GB, RTX 2080 SUPER 16GB


Simulation gif

For each implementation, how does changing the number of boids affect performance? Why do you think this is?

Performance drops as the number of boids increases. Each boid in the flock takes up a thread and the computation needed to update its velocity and position, so more boids means more work.
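To make the "more boids, more work" point concrete, here is a minimal CPU-side C++ sketch (not the project's CUDA kernel) of a brute-force neighbor search. It assumes each boid compares itself against every other boid, as a naive implementation would; `Vec3`, `SearchStats`, `countNeighborChecks`, and the `radius` parameter are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3D point type, for illustration only.
struct Vec3 { float x, y, z; };

// Tally of the work done by one full brute-force pass.
struct SearchStats {
    std::size_t checks;     // distance checks performed
    std::size_t neighbors;  // ordered (i, j) pairs closer than the radius
};

// On the GPU, each iteration of the outer loop would be one thread, so the
// work per thread grows with the total number of boids.
SearchStats countNeighborChecks(const std::vector<Vec3>& pos, float radius) {
    assert(radius >= 0.0f);
    SearchStats stats{0, 0};
    const float r2 = radius * radius;
    for (std::size_t i = 0; i < pos.size(); ++i) {
        for (std::size_t j = 0; j < pos.size(); ++j) {
            if (i == j) continue;  // a boid is not its own neighbor
            ++stats.checks;
            const float dx = pos[i].x - pos[j].x;
            const float dy = pos[i].y - pos[j].y;
            const float dz = pos[i].z - pos[j].z;
            if (dx * dx + dy * dy + dz * dz < r2) ++stats.neighbors;
        }
    }
    return stats;
}
```

In this sketch the number of checks is N * (N - 1) for N boids, so doubling the boid count roughly quadruples the total work.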

For each implementation, how does changing the block count and block size affect performance? Why do you think this is?

Changing the block size does not change performance drastically. Using a smaller block size just means launching more blocks, which only takes a simple operation.
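The relationship between block size and block count can be sketched as follows. This is a plain C++ illustration that assumes the common one-thread-per-boid launch with a ceiling division; `blocksForBoids` is a hypothetical helper name, not taken from the project code.

```cpp
#include <cassert>

// Number of blocks needed to cover numBoids threads when each block holds
// blockSize threads (ceiling division). Assumes one thread per boid.
int blocksForBoids(int numBoids, int blockSize) {
    assert(numBoids >= 0 && blockSize > 0);
    return (numBoids + blockSize - 1) / blockSize;
}
```

For example, 5000 boids need 40 blocks of 128 threads or 10 blocks of 512 threads. Either way, the number of threads doing real work stays at 5000.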

For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?

Yes, the performance improved. I expected this outcome because the coherent grid no longer has to go through dev_particleArrayIndices to reach each boid's data.
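The difference between the two memory layouts can be sketched on the CPU. The example below assumes an index array that plays the role of dev_particleArrayIndices (already sorted by grid cell) and shows the one-time reshuffle that lets later reads be contiguous. `reorderCoherent` and the single `float` payload are hypothetical simplifications.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scattered layout: every read goes through the index array,
//   value = data[particleArrayIndices[i]]
// Coherent layout: copy the data once into cell-sorted order so that later
// reads are simply coherent[i], which touches contiguous memory.
std::vector<float> reorderCoherent(
    const std::vector<float>& data,
    const std::vector<std::size_t>& particleArrayIndices) {
    std::vector<float> coherent(particleArrayIndices.size());
    for (std::size_t i = 0; i < particleArrayIndices.size(); ++i) {
        assert(particleArrayIndices[i] < data.size());
        coherent[i] = data[particleArrayIndices[i]];
    }
    return coherent;
}
```

After the reshuffle, boids that share a grid cell sit next to each other in memory, so the neighbor search reads them without the extra lookup.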

Did changing cell width and checking 27 vs 8 neighboring cells affect performance? Why or why not? Be careful: it is insufficient (and possibly incorrect) to say that 27-cell is slower simply because there are more cells to check!

Checking 27 cells was actually faster. My guess is that the smaller cells used with the 27-cell search give finer granularity for parallelization.
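One way to reason about this result is to compare the volume of space each search covers. The sketch below assumes the 27-cell search uses a cell width equal to the neighborhood distance d and the 8-cell search uses a cell width of 2d; those widths and the helper name `searchVolume` are assumptions for illustration, not measured values.

```cpp
#include <cassert>

// Volume of the cube of cells that one boid's neighbor search visits.
// cellsPerAxis is 3 for a 27-cell search and 2 for an 8-cell search.
double searchVolume(int cellsPerAxis, double cellWidth) {
    assert(cellsPerAxis > 0 && cellWidth > 0.0);
    const double side = cellsPerAxis * cellWidth;
    return side * side * side;
}
```

With d = 1, the 27-cell search covers 3 x 3 x 3 = 27 units of volume, while the 8-cell search covers 4 x 4 x 4 = 64. Under these assumptions and a roughly uniform boid density, the 27-cell search would examine fewer candidate boids even though it visits more cells.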
