Project 5: Han Wang #29

Open: wants to merge 8 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
build
36 changes: 31 additions & 5 deletions README.md
@@ -3,10 +3,36 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Han Wang

### (TODO: Your README)
* Tested on: Windows 11, 11th Gen Intel(R) Core(TM) i9-11900H @ 2.50GHz 22GB, RTX 3070 Laptop GPU

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
![Unlock FPS](img/hw5.gif)


## Summary


This project implements a grass renderer built on Vulkan. The work has three main parts: the rendering pipeline, the compute shader, and the grass simulation. The rendering pipeline follows the standard Vulkan stage order: vertex shader, tessellation control shader, primitive generator, tessellation evaluation shader, geometry shader, and finally fragment shader. To keep rendering efficient at high blade counts, a compute shader pass runs ahead of the graphics pipeline, so that blades which do not need to be drawn never reach the later stages. That pass includes a culling system made up of three tests: distance culling, view-frustum culling, and orientation culling.




![Unlock FPS](img/hw5_2.gif)

## Analysis

I measured the render time for several blade counts and got the following graph:

![Unlock FPS](img/runtime.png)

The graph shows that runtime grows with the number of blades to be drawn. The growth is modest, however, because the blades are processed in parallel: more blades mean more work, but much of that work runs concurrently, so the effect on overall performance is relatively small.

Since all three culling methods are implemented, I also enabled each one separately and got the following graph:

![Unlock FPS](img/culling.png)


The graph shows clear differences in frames per second (FPS) across the three culling methods. I believe culling improves FPS mainly by reducing the number of grass blades that have to be rendered.

The three methods differ in how many blades each one can remove from the rendering pipeline. The principle is straightforward: the fewer grass blades that must be processed and rendered, the higher the FPS and the smoother the result for the user.
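
The shader source is not part of this diff, so the following is only a CPU-side sketch of how the three culling tests are commonly written. The function names, the `0.9` threshold, the bucket scheme, and the symmetric clip-space bounds are all assumptions for illustration, not the values used in this project.

```cpp
#include <cassert>
#include <cmath>

// Minimal vector types for the sketch; the real code would use GLM or GLSL types.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Orientation culling: a blade seen edge-on is thinner than a pixel, so drop it
// when its width direction is nearly parallel to the view direction.
// Both vectors are assumed normalized; the threshold is a hypothetical tuning value.
inline bool orientationCulled(Vec3 viewDir, Vec3 bladeWidthDir, float threshold = 0.9f) {
    return std::fabs(dot(viewDir, bladeWidthDir)) > threshold;
}

// View-frustum culling: test a point that is already in clip space.
// The tolerance widens the frustum so blades near the edge do not pop.
inline bool inFrustum(Vec4 clip, float tolerance) {
    const float h = clip.w + tolerance;
    return clip.x >= -h && clip.x <= h &&
           clip.y >= -h && clip.y <= h &&
           clip.z >= -h && clip.z <= h;
}

// Distance culling: split blades into buckets by id and keep fewer buckets
// the farther the blade is from the camera; beyond maxDist everything is culled.
inline bool distanceCulled(float dist, float maxDist, unsigned bladeId, unsigned buckets) {
    assert(maxDist > 0.0f && buckets > 0);
    if (dist >= maxDist) return true;
    const unsigned keep =
        static_cast<unsigned>(std::floor(buckets * (1.0f - dist / maxDist)));
    return (bladeId % buckets) >= keep;
}
```

A blade would be written to the culled-blades buffer only if it passes all enabled tests, which is why enabling a test that rejects more blades leaves less work for the tessellation and fragment stages.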
Binary file added bin/Debug/vulkan_grass_rendering.exe
Binary file not shown.
Binary file added bin/Debug/vulkan_grass_rendering.pdb
Binary file not shown.
Binary file added img/culling.png
Binary file added img/hw5.gif
Binary file added img/hw5_2.gif
Binary file added img/runtime.png
2 changes: 1 addition & 1 deletion src/Blades.cpp
@@ -51,7 +51,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode

VkBuffer Blades::GetBladesBuffer() const {
return bladesBuffer;
}
}

VkBuffer Blades::GetCulledBladesBuffer() const {
return culledBladesBuffer;
179 changes: 175 additions & 4 deletions src/Renderer.cpp
@@ -6,6 +6,10 @@
#include "Camera.h"
#include "Image.h"

#include <iostream>
#include <chrono>
#include <ctime>

static constexpr unsigned int WORKGROUP_SIZE = 32;

Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* camera)
@@ -14,6 +18,9 @@ Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* c
swapChain(swapChain),
scene(scene),
camera(camera) {
//reference https://stackoverflow.com/questions/997946/how-to-get-current-time-and-date-in-c
auto start = std::chrono::system_clock::now();


CreateCommandPools();
CreateRenderPass();
@@ -33,6 +40,15 @@ Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* c
CreateComputePipeline();
RecordCommandBuffers();
RecordComputeCommandBuffer();

auto end = std::chrono::system_clock::now();

std::chrono::duration<double> elapsed_seconds = end - start;
std::time_t end_time = std::chrono::system_clock::to_time_t(end);

std::cout << "finished computation at " << std::ctime(&end_time)
<< "elapsed time: " << elapsed_seconds.count()*1000 << "ms"
<< std::endl;
}

void Renderer::CreateCommandPools() {
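
The timer added to the constructor above covers one-time setup (pipeline creation and command buffer recording), not the per-frame cost discussed in the README. For interval measurements `std::chrono::steady_clock` is usually preferred over `system_clock`, because it is monotonic. A per-frame measurement might look like the sketch below; `elapsedMs` and `FrameStats` are hypothetical helpers, not part of this project.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock never jumps backwards, unlike system_clock.
template <typename F>
double elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Accumulates per-frame timings so an average and an FPS figure can be reported.
struct FrameStats {
    double totalMs = 0.0;
    unsigned frames = 0;

    void add(double ms) {
        totalMs += ms;
        ++frames;
    }

    double averageMs() const {
        assert(frames > 0);  // averaging zero frames is a caller error
        return totalMs / frames;
    }

    double fps() const { return 1000.0 / averageMs(); }
};
```

In a render loop this could be used as `stats.add(elapsedMs([&] { renderer->Frame(); }));`, assuming a per-frame entry point with that name exists. Because GPU work is asynchronous, a CPU-side timer only reflects GPU cost if the timed call waits for the device to finish.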
@@ -198,6 +214,39 @@ void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating what types of information
// will be stored at each binding
// reference https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkDescriptorType.html#:~:text=VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER%20specifies%20a%20uniform%20buffer,a%20dynamic%20uniform%20buffer%20descriptor.
VkDescriptorSetLayoutBinding bladesBuffer = {};
bladesBuffer.binding = 0;
bladesBuffer.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bladesBuffer.descriptorCount = 1;
bladesBuffer.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bladesBuffer.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding culledBladesBuffer = {};
culledBladesBuffer.binding = 1;
culledBladesBuffer.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
culledBladesBuffer.descriptorCount = 1;
culledBladesBuffer.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
culledBladesBuffer.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding numBladesBuffer = {};
numBladesBuffer.binding = 2;
numBladesBuffer.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
numBladesBuffer.descriptorCount = 1;
numBladesBuffer.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
numBladesBuffer.pImmutableSamplers = nullptr;

std::vector<VkDescriptorSetLayoutBinding> bindings = { bladesBuffer, culledBladesBuffer, numBladesBuffer};

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) {
throw std::runtime_error("Failed to create descriptor set layout");
}

}

void Renderer::CreateDescriptorPool() {
@@ -216,6 +265,8 @@ void Renderer::CreateDescriptorPool() {
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate

{ VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,static_cast<uint32_t>(scene->GetBlades().size()*3) }
};

VkDescriptorPoolCreateInfo poolInfo = {};
@@ -320,6 +371,46 @@ void Renderer::CreateModelDescriptorSets() {
void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

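// vkAllocateDescriptorSets reads descriptorSetCount entries from pSetLayouts,
// so provide one layout per set instead of a single-element array
std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();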

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate descriptor set");
}

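std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());
// Each write only stores a pointer to its buffer info, so the infos must stay
// alive until vkUpdateDescriptorSets is called after the loop
std::vector<VkDescriptorBufferInfo> modelBufferInfos(grassDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
VkDescriptorBufferInfo& modelBufferInfo = modelBufferInfos[i];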
modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer();
modelBufferInfo.offset = 0;
modelBufferInfo.range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &modelBufferInfo;
descriptorWrites[i].pImageInfo = nullptr;
descriptorWrites[i].pTexelBufferView = nullptr;



}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(grassDescriptorSets.size()), descriptorWrites.data(), 0, nullptr);



}

void Renderer::CreateTimeDescriptorSet() {
@@ -360,6 +451,74 @@ void Renderer::CreateTimeDescriptorSet() {
void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades

computeDescriptorSets.resize(scene->GetBlades().size());

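// vkAllocateDescriptorSets reads descriptorSetCount entries from pSetLayouts,
// so provide one layout per set instead of a single-element array
std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();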

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());
// Each write only stores a pointer to its buffer info, so the infos must stay
// alive until vkUpdateDescriptorSets is called after the loop
std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
// Binding 0: all blades (input to the compute shader)
VkDescriptorBufferInfo& BladeBuffer_info = bufferInfos[3 * i + 0];
BladeBuffer_info.buffer = scene->GetBlades()[i]->GetBladesBuffer();
BladeBuffer_info.offset = 0;
BladeBuffer_info.range = sizeof(Blade) * NUM_BLADES;

descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 0].dstBinding = 0;
descriptorWrites[3 * i + 0].dstArrayElement = 0;
descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 0].descriptorCount = 1;
descriptorWrites[3 * i + 0].pBufferInfo = &BladeBuffer_info;
descriptorWrites[3 * i + 0].pImageInfo = nullptr;
descriptorWrites[3 * i + 0].pTexelBufferView = nullptr;

// Binding 1: blades that survive culling (bound later as the vertex buffer)
VkDescriptorBufferInfo& culledBladesBuffer_info = bufferInfos[3 * i + 1];
culledBladesBuffer_info.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
culledBladesBuffer_info.offset = 0;
culledBladesBuffer_info.range = sizeof(Blade) * NUM_BLADES;

descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 1].dstBinding = 1;
descriptorWrites[3 * i + 1].dstArrayElement = 0;
descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 1].descriptorCount = 1;
descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBuffer_info;
descriptorWrites[3 * i + 1].pImageInfo = nullptr;
descriptorWrites[3 * i + 1].pTexelBufferView = nullptr;

// Binding 2: indirect draw arguments read by vkCmdDrawIndirect
VkDescriptorBufferInfo& numBladesBuffer_info = bufferInfos[3 * i + 2];
numBladesBuffer_info.buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
numBladesBuffer_info.offset = 0;
numBladesBuffer_info.range = sizeof(BladeDrawIndirect);

descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 2].dstBinding = 2;
descriptorWrites[3 * i + 2].dstArrayElement = 0;
descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 2].descriptorCount = 1;
descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBuffer_info;
descriptorWrites[3 * i + 2].pImageInfo = nullptr;
descriptorWrites[3 * i + 2].pTexelBufferView = nullptr;

}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);


}

void Renderer::CreateGraphicsPipeline() {
@@ -717,7 +876,11 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.pName = "main";

// TODO: Add the compute descriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };




std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout };

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
@@ -885,6 +1048,14 @@ void Renderer::RecordComputeCommandBuffer() {

// TODO: For each group of blades bind its descriptor set and dispatch

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {

vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr);
vkCmdDispatch(computeCommandBuffer, NUM_BLADES / WORKGROUP_SIZE, 1, 1);
}



// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
throw std::runtime_error("Failed to record compute command buffer");
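
The dispatch above uses `NUM_BLADES / WORKGROUP_SIZE`, which covers every blade only when the blade count is a multiple of the workgroup size; integer division drops the remainder otherwise. A rounded-up group count could be computed as in this sketch. `groupCount` is a hypothetical helper, and using it assumes the compute shader skips invocations whose index is past the last blade.

```cpp
#include <cassert>
#include <cstdint>

// Number of workgroups needed so that groupCount * groupSize >= itemCount.
// Written as quotient plus a remainder check to avoid overflow near UINT32_MAX.
inline std::uint32_t groupCount(std::uint32_t itemCount, std::uint32_t groupSize) {
    assert(groupSize > 0);
    return itemCount / groupSize + (itemCount % groupSize != 0 ? 1u : 0u);
}
```

With this helper the call would read `vkCmdDispatch(computeCommandBuffer, groupCount(NUM_BLADES, WORKGROUP_SIZE), 1, 1);`. For a blade count that is already a multiple of 32 the result is the same as the plain division.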
@@ -976,13 +1147,13 @@ void Renderer::RecordCommandBuffers() {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model

vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, graphicsPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);
// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
5 changes: 5 additions & 0 deletions src/Renderer.h
@@ -56,13 +56,18 @@ class Renderer {
VkDescriptorSetLayout cameraDescriptorSetLayout;
VkDescriptorSetLayout modelDescriptorSetLayout;
VkDescriptorSetLayout timeDescriptorSetLayout;
VkDescriptorSetLayout computeDescriptorSetLayout;

VkDescriptorPool descriptorPool;

VkDescriptorSet cameraDescriptorSet;
std::vector<VkDescriptorSet> modelDescriptorSets;

VkDescriptorSet timeDescriptorSet;

std::vector<VkDescriptorSet> grassDescriptorSets;
std::vector<VkDescriptorSet> computeDescriptorSets;

VkPipelineLayout graphicsPipelineLayout;
VkPipelineLayout grassPipelineLayout;
VkPipelineLayout computePipelineLayout;
1 change: 1 addition & 0 deletions src/Scene.cpp
@@ -1,5 +1,6 @@
#include "Scene.h"
#include "BufferUtils.h"
#include <iostream>

Scene::Scene(Device* device) : device(device) {
BufferUtils::CreateBuffer(device, sizeof(Time), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, timeBuffer, timeBufferMemory);