Avoid Unnecessary Graph Searches by Checking for State Changes #2886
Merged
…n than necessary by checking if the section actually changed in a way that's relevant to the graph search
jellysquid3
approved these changes
Dec 2, 2024
jellysquid3
pushed a commit
that referenced
this pull request
Dec 2, 2024
Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
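As a rough illustration of the idea described above, the following is a minimal sketch, not Sodium's actual code. The class names `GraphDirtySketch`, `Section`, and `Manager`, and the choice of a single `long` for visibility data, are all assumptions made for this example. It shows a block update always requesting a graph search (so the rebuild task can be scheduled), while a returning task result only requests one if the section's state actually changed.

```java
/** Sketch only: all names here are hypothetical, not Sodium's actual classes. */
final class GraphDirtySketch {
    /** Per-section state that the graph search depends on (assumed fields). */
    static final class Section {
        private long visibilityData;
        private boolean built;

        /**
         * Applies a finished rebuild task and reports whether anything
         * relevant to the graph search actually changed.
         */
        boolean applyBuildResult(long newVisibilityData) {
            boolean changed = !built || visibilityData != newVisibilityData;
            visibilityData = newVisibilityData;
            built = true;
            return changed;
        }
    }

    /** Tracks whether the render lists need another graph search. */
    static final class Manager {
        private boolean graphDirty;
        private int searchCount;

        /** A block update always needs a search so the rebuild gets scheduled. */
        void onBlockUpdate() {
            graphDirty = true;
        }

        /** A task result only dirties the graph if the section state changed. */
        void onTaskResult(Section section, long newVisibilityData) {
            if (section.applyBuildResult(newVisibilityData)) {
                graphDirty = true;
            }
        }

        /** Runs the (simulated) graph search at most once per frame, if needed. */
        void frame() {
            if (graphDirty) {
                searchCount++;
                graphDirty = false;
            }
        }

        int searchCount() {
            return searchCount;
        }
    }
}
```

In this sketch, a section that is rebuilt every tick with identical visibility data triggers one search per update (for scheduling) instead of two, which matches the "avoids half the graph searches" behavior the description mentions.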
ThatMG393
added a commit
to ThatMG393/sodium
that referenced
this pull request
Dec 15, 2024
* Correctly handle colorization on NeoForge
* Combine the vertex position attributes (CaffeineMC#2753)
  This improves terrain rendering performance significantly on Intel Xe-LP graphics under Linux.
* Add option for Fullscreen Resolution (CaffeineMC#2642)
  The resolution controls would not fit in the allocated space, so the rendering of slider controls was changed to enable rendering the slider bar and the value text on separate lines.
  Co-authored-by: MeeniMc <[email protected]>
* Only enable Fullscreen Resolution option on Windows
  Additionally, adjust the rendering of the controls to be less confusing when disabled, and provide an explanation as to what the option does.
* Use consistent vertex ordering in entity rendering
  Some core shaders were relying on the model part faces being written out in a specific order. We still don't support core shaders, but the fix here is trivial enough.
  Fixes CaffeineMC#2745
* Add check for NeoForge per-quad AO flag
* Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773)
* Fix excessively large allocations in chunk meshing
  The requested capacity was being multiplied by the vertex stride more than once, which resulted in far too much memory being allocated.
  Closes CaffeineMC#2792
* Fix some issues with Uint32 representation
  This increases the maximum size of vertex and index buffers to 4 billion elements, since the Uint32 types stored in memory are now safely represented with Int64. For vertex buffers, this increases their maximum size to 80 GiB, and index buffers have a maximum size of 16 GiB, whereas both were limited to 2 GiB prior.
* Fix cull bitmask ordering in entity rendering
  Closes CaffeineMC#2788
* Add support for Maven Local publishing
* Fix incorrect warning message when D3DKMT is not supported
* Add Flawless Frames handler for NeoForge
* Add angle-based section visibility path occlusion (CaffeineMC#2811)
  This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount.
* Disable material downgrading on Intel Gen8 and older
  Fixes CaffeineMC#2830
* Delay normal face calculation to use
  This potentially fixes some cases of CaffeineMC#2835.
* Skip particle rendering optimizations for incompatible mods
  Fixes CaffeineMC#2827
* Update project URLs in source and documentation
  We're no longer a Fabric-exclusive mod, so let's get rid of the suffix.
* Add third-party license notice for Fabric API
* Optimizations for some block models (CaffeineMC#2508)
  Co-authored-by: muzikbike <[email protected]>
* Fix sorting failures on rotated cuboids (CaffeineMC#2812)
  Use the accurate vertex positions for unaligned and aligned (but rotated) quads.
* Rework the Gradle build scripts for multi-loader
  * Shared logic is moved into a build plugin where possible
  * Build time is significantly improved when the Gradle daemon is warmed
  * Mixins are remapped in-place now, eliminating the need for refmaps at runtime. This also gets rid of some warning messages at startup.
  * Module relationships are now correctly represented in IDEA for other source sets (fixes a lot of code analysis features)
  * Split Java source and resources into different configurations
  * Run configurations are now consistent between NeoForge/Fabric
  * The common project is no longer remapped unnecessarily
  * Updated Gradle and build plugins
* Restore versioning schema to build script
* Make organization of platform mixin packages consistent
  Fixes CaffeineMC#2688
* Exclude README documentation from processed resources
  These files are only meant to be in the source distribution, and Minecraft doesn't like them.
* Don't try to load a refmap from the mixin plugin
* Enumerate additional PCI classes in the graphics adapter probe
  Some integrated GPUs, such as RDNA3.5, appear to use the PCI_CLASS_DISPLAY_OTHER class.
* Remove KDE and GNOME specific backends for browsing URLs
  The bugs with xdg-open have been resolved upstream and most Linux distributions are shipping the patches. Also, make sure we get a successful exit code from the XDG implementation.
* Remove Minecraft from classpath of the pre-launch source set
  This will help to avoid class-load issues and makes the code more hygienic.
* Update to Minecraft 1.21.3
* Ensure tooltips are constrained to the screen (CaffeineMC#2845)
* Update authors and contributors list
* Remove leftover popPose
* Trust existing fog color
* Block the Overwolf Overlay due to graphical corruption
  The overlay does not correctly restore the texture unit state in OpenGL, which causes problems when Minecraft thinks a texture has already been bound to a slot. Since disabling the OpenGL state cache globally is not an acceptable solution (it would severely hurt performance) and their software doesn't give us any method to detect the problematic version, we block all versions.
  gep_minecraft.dll is the payload they actually inject, which has no version information or description.
  Fixes CaffeineMC#2862
* Avoid static memory allocation in EntityRenderer
  Just allocate on the stack, since it's a small amount of memory (<1 KiB) and avoids needing complex finalizers.
* Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864)
* Fix memory leak and double free in CloudRenderer
* Improve color mixing functions
  The existing functions did not implement rounding correctly (often leading to off-by-one errors). Additionally, the improved variants are both slightly faster and easier to understand.
* Fixup documentation in ColorMixer
* Unify color mixing/swizzling utilities
  The Fabric integration code was re-implementing a lot of the utilities that already exist in Sodium unnecessarily. Also, improve the documentation so that ABGR and RGBA are not used interchangeably.
* Add optimized function for bi-linear interpolation
  This reduces the number of ALU ops significantly and creates a common utility function in the project.
* Reduce time complexity for box blurs
  Measuring the time spent per box blur in biome blending, the following results were observed.

  Radius     Before    After     % Improvement
  7 blocks   9100ns    3700ns    59%
  3 blocks   5400ns    3200ns    41%
  1 blocks   3700ns    2600ns    29%

* Revert detection of Overwolf Overlay
  They claim this has since been fixed. We will re-examine in the future if we see additional reports.
  This reverts commit e7ea6f7.
* Bump version to 0.6.0-beta.5
* Bump version to 0.6.0-final
* Update render code for chunk status map
  Fixes CaffeineMC#2881
* Rollup of fixes and improvements for cloud rendering
  Some changes were made to cloud rendering in newer versions that needed to be replicated in Sodium.
  - The alpha cutoff for clouds was changed to (a < 10).
  - Texture loading can now gracefully fail, and it is expected that rendering is skipped when this happens.
  - The movement/positioning of clouds was slightly changed.
  - The render pass system now needs to be told about render target usages (fixes CaffeineMC#2883).
  This commit also improves mesh building time by around 35% on a fast processor (AMD Ryzen HX AI 370) through various micro-optimizations.
* Fix culling behavior between transparent and opaque blocks
  Minecraft 1.21.2 changed some of the rules, and this was causing the faces of transparent blocks to be rendered even when they were hidden by full opaque blocks.
  Fixes CaffeineMC#2850
* Fix precision issues in cloud rendering at far distances
* Use alternative workaround for NVIDIA drivers
  The NVIDIA driver enables a driver feature called "Threaded Optimizations" when it finds Minecraft, which causes severe performance issues and sometimes even crashes.
  Newer versions of the driver seem to use a slightly different heuristic which our workaround doesn't address. So, instead, use an alternative method that enables GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably disable the functionality *even if* the user has configured it otherwise in their driver settings.
  Additionally, on Windows, we now always indicate to the driver that Minecraft is running, so that users with hybrid graphics don't see regressed performance.
* Sort render lists for regions and sections after traversal (CaffeineMC#2780)
  Render sections and regions are sorted after the graph traversal is performed. This decouples their ordering from the graph, which isn't entirely correct for draw call sorting.
  Fixes CaffeineMC#2266
* Add support for new NeoForge fluid overlay API
* Ensure depth test is configured when rendering clouds
  The state of the depth test prior to cloud rendering is undefined. After rendering, it is expected to be disabled again.
* Fix rounding error in ColorMixer#mix
  Rounding of the values now happens after the 16-bit intermediaries are added together. This affected some animated textures, causing them to exhibit flickering behavior.
* Add additional optimized block models
  This covers the following additional blocks:
  - Cauldrons
  - Brewing Stands
  - Bells
  Co-authored-by: JellySquid <[email protected]>
* Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886)
  Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
* Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879)
  This fixes some problems where very large block entities in nearby sections may be incorrectly culled. But it does not comprehensively fix the problem for all other sections, since that would require visiting the 27-neighborhood of every section, which is too slow.
* Bump version to 0.6.1
* Bump dependency versions
* Update mod manifest
* Ensure ItemRenderContext.isDefaultTranslucent is initialized
* Update compatible mods listing
* Update mod manifest to restrict Minecraft versions
* Update to Minecraft 1.21.4
* Update NeoForge manifest for Minecraft 1.21.4
* Fix glyph effect orientation
* Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907)
* Fix detection for specific Intel OpenGL ICDs
  The OpenGL ICD name now includes the file extension, which the regex expressions were not matching.
* Avoid showing the incompatible driver error in some cases
  For systems with hybrid graphics, it may be the case that an incompatible graphics driver is installed, but that it isn't used for the OpenGL context.
  We can avoid showing errors in this situation by checking the vendor string of the context immediately after creation. This is not the most robust check, but in practice, a single system should not have multiple graphics drivers installed from the same vendor, so checking the string should be relatively safe.
* Fix lambda mappings
* Bump version and dependency requirements
* Use correct coordinates for sorting chunk sections (CaffeineMC#2924)
  Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin.
  Closes CaffeineMC#2918
* Update README.md
---------
Co-authored-by: IMS212 <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: MeeniMc <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: IThundxr <[email protected]>
Co-authored-by: muzikbike <[email protected]>
ThatMG393
pushed a commit
to ThatMG393/sodium
that referenced
this pull request
Dec 15, 2024
…eineMC#2886) Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
ThatMG393
pushed a commit
to ThatMG393/sodium
that referenced
this pull request
Dec 15, 2024
…eineMC#2886) Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
ThatMG393
added a commit
to ThatMG393/sodium
that referenced
this pull request
Dec 15, 2024
* rename some things for clarity * fix waterlogged glass panes (once again, but more this time) by avoiding distance sorting through the detection of primary intersectors when geometry is intersecting and then sorting them in a fixed order * use Mth.clamp for clarity * refactor buffer and sort result handling, buffers are now freed immediately instead of keeping them to avoid memory usage buffer caching would be a better solution but that's complicated and doesn't currently work correctly * reduce number of unique triggers by around 5 percent without impacting sorting or building performance * importantly sort a little farther away, sort tasks are fast * use defer zero frames for important sort tasks by default * fix build * clarify authorship of BitArray * fix bug with radix sort for SNR heuristic in BSP partition generating wrong indexes * combine draw commands * correctly reset accumulated element count * remove draw call combining for indexed rendering as it's broken and hard to fix * skip heuristic if there's no quads * refactor primary intersector detection to handle large cases better, also removed the warning message about unpartitionable geometry as it seems to not be a relevant problem * fix topo sorting in some situations where the dot product was wrongly not recalculated when the normal is quantized. also fixed aligned quads not receiving the more accurate center based on the average of the unique vertexes. * reorder vertex ranges before uploaded to optimize for combined draw commands * tune primary intersector detection to handle situations where only a small amount of geometry is intersecting * Correctly handle colorization on NeoForge * Combine the vertex position attributes (CaffeineMC#2753) This improves terrain rendering performance significantly on Intel Xe-LP graphics under Linux. 
* Add option for Fullscreen Resolution (CaffeineMC#2642) The resolution controls would not fit in the allocated space, so the rendering of slider controls was changed to enable rendering the slider bar and the value text on separate lines. Co-authored-by: MeeniMc <[email protected]> * Only enable Fullscreen Resolution option on Windows Additionally, adjust the rendering of the controls to be less confusing when disabled, and provide an explanation as to what the option does. * fix draw command combining, remove aggressive non-empty command skipping because it seems broken * Use consistent vertex ordering in entity rendering Some core shaders were relying on the model part faces being written out in a specific order. We still don't support core shaders, but the fix here is trivial enough. Fixes CaffeineMC#2745 * Add check for NeoForge per-quad AO flag * Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773) * fix graphical corruption when there's a lot of geometry by appropriately picking the size of the required shared index buffer * cleanup unused and broken code * cleanup calculation of mask bit and element count * cleanup meshing, storage, and renderer * fix translucent rendering by correctly decoding vertex segments * cleanup misc, remove unused code * refactor translucent AnyOrderData to not generate its own trivial index buffer and instead share this type of data within regions * add Index Pool arena size * add arena content caching * Fix excessively large allocations in chunk meshing The requested capacity was being multiplied by the vertex stride more than once, which resulted in far too much memory being allocated. Closes CaffeineMC#2792 * Fix some issues with Uint32 representation This increases the maximum size of vertex and index buffers to 4 billion elements, since the Uint32 types stored in memory are now safely represented with Int64. 
For vertex buffers, this increases their maximum size to 80 GiB, and index buffers have a maximum size of 16 GiB, whereas both were limited to 2 GiB prior. * refactor storage to cope with larger amounts of geometry and use less ugly hacks, rename a bunch of methods to be consistent and clearer * remove debug code * Fix cull bitmask ordering in entity rendering Closes CaffeineMC#2788 * Add support for Maven Local publishing * Fix incorrect warning message when D3DKMT is not supported * Add Flawless Frames handler for NeoForge * Add angle-based section visibility path occlusion (CaffeineMC#2811) This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount. * Disable material downgrading on Intel Gen8 and older Fixes CaffeineMC#2830 * Delay normal face calculation to use This potentially fixes some cases of CaffeineMC#2835. * Skip particle rendering optimizations for incompatible mods Fixes CaffeineMC#2827 * Update project URLs in source and documentation We're no longer a Fabric-exclusive mod, so let's get rid of the suffix. * Add third-party license notice for Fabric API * Optimizations for some block models (CaffeineMC#2508) Co-authored-by: muzikbike <[email protected]> * Fix sorting failures on rotated cuboids (CaffeineMC#2812) Use the accurate vertex positions for unaligned and aligned (but rotated) quads. * Rework the Gradle build scripts for multi-loader * Shared logic is moved into a build plugin where possible * Build time is significantly improved when the Gradle daemon is warmed * Mixins are remapped in-place now, eliminating the need for refmaps at runtime. This also gets rid of some warning messages at startup. 
* Module relationships are now correctly represented in IDEA for other source sets (fixes a lot of code analysis features) * Split Java source and resources into different configurations * Run configurations are now consistent between NeoForge/Fabric * The common project is no longer remapped unnecessarily * Updated Gradle and build plugins * Restore versioning schema to build script * Make organization of platform mixin packages consistent Fixes CaffeineMC#2688 * Exclude README documentation from processed resources These files are only meant to be in the source distribution, and Minecraft doesn't like them. * Don't try to load a refmap from the mixin plugin * Enumerate additional PCI classes in the graphics adapter probe Some integrated GPUs, such as RDNA3.5, appear to use the PCI_CLASS_DISPLAY_OTHER class. * Remove KDE and GNOME specific backends for browsing URLs The bugs with xdg-open have been resolved upstream and most Linux distributions are shipping the patches. Also, make sure we get a successful exit code from the XDG implementation. * Remove Minecraft from classpath of the pre-launch source set This will help to avoid class-load issues and makes the code more hygienic. * Update to Minecraft 1.21.3 * Ensure tooltips are constrained to the screen (CaffeineMC#2845) * Update authors and contributors list * Remove leftover popPose * Trust existing fog color * Block the Overwolf Overlay due to graphical corruption The overlay does not correctly restore the texture unit state in OpenGL, which causes problems when Minecraft thinks a texture has already been bound to a slot. Since disabling the OpenGL state cache globally is not an acceptable solution (it would severely hurt performance) and their software doesn't give us any method to detect the problematic version, we block all versions. gep_minecraft.dll is the payload they actually inject, which has no version information or description. 
Fixes CaffeineMC#2862 * Avoid static memory allocation in EntityRenderer Just allocate on the stack, since it's a small amount of memory (<1 KiB) and avoids needing complex finalizers. * Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864) * Fix memory leak and double free in CloudRenderer * Improve color mixing functions The existing functions did not implement rounding correctly (often leading to off-by-one errors). Additionally, the improved variants are both slightly faster and easier to understand. * Fixup documentation in ColorMixer * Unify color mixing/swizzling utilities The Fabric integration code was re-implementing a lot of the utilities that already exist in Sodium unnecessarily. Also, improve the documentation so that ABGR and RGBA are not used interchangeably. * Add optimized function for bi-linear interpolation This reduces the number of ALU ops significantly and creates a common utility function in the project. * Reduce time complexity for box blurs Measuring the time spent per box blur in biome blending, the following results were observed. Radius Before After % Improvement 7 blocks 9100ns 3700ns 59% 3 blocks 5400ns 3200ns 41% 1 blocks 3700ns 2600ns 29% * Revert detection of Overwolf Overlay They claim this has since been fixed. We will re-examine in the future if we see additional reports. This reverts commit e7ea6f7. * Bump version to 0.6.0-beta.5 * Bump version to 0.6.0-final * Update render code for chunk status map Fixes CaffeineMC#2881 * Rollup of fixes and improvements for cloud rendering Some changes were made to cloud rendering in newer versions that needed to be replicated in Sodium. - The alpha cutoff for clouds was changed to (a < 10). - Texture loading can now gracefully fail, and it is expected that rendering is skipped when this happens. - The movement/positioning of clouds was slightly changed. - The render pass system now needs to be told about render target usages (fixes CaffeineMC#2883). 
This commit also improves mesh building time by around 35% on a fast processor (AMD Ryzen HX AI 370) through various micro-optimizations. * Fix culling behavior between transparent and opaque blocks Minecraft 1.21.2 changed some of the rules, and this was causing the faces of transparent blocks to be rendered even when they were hidden by full opaque blocks. Fixes CaffeineMC#2850 * Fix precision issues in cloud rendering at far distances * Use alternative workaround for NVIDIA drivers The NVIDIA driver enables a driver feature called "Threaded Optimizations" when it finds Minecraft, which causes severe performance issues and sometimes even crashes. Newer versions of the driver seem to use a slightly different heuristic which our workaround doesn't address. So, instead, use an alternative method that enables GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably disable the functionality *even if* the user has configured it otherwise in their driver settings. Additionally, on Windows, we now always indicate to the driver that Minecraft is running, so that users with hybrid graphics don't see regressed performance. * Sort render lists for regions and sections after traversal (CaffeineMC#2780) Render sections and regions are sorted after the graph traversal is performed. This decouples their ordering from the graph, which isn't entirely correct for draw call sorting. Fixes CaffeineMC#2266 * Add support for new NeoForge fluid overlay API * Ensure depth test is configured when rendering clouds The state of the depth test prior to cloud rendering is undefined. After rendering, it is expected to be disabled again. * Fix rounding error in ColorMixer#mix Rounding of the values now happens after the 16-bit intermediaries are added together. This affected some animated textures, causing them to exhibit flickering behavior. 
* Add additional optimized block models This covers the following additional blocks: - Cauldrons - Brewing Stands - Bells Co-authored-by: JellySquid <[email protected]> * Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886) Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated. * Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879) This fixes some problems where very large block entities in nearby sections may be incorrectly culled. But it does not comprehensively fix the problem for all other sections, since that would require visiting the 27-neighborhood of every section, which is too slow. * Bump version to 0.6.1 * Bump dependency versions * Update mod manifest * Ensure ItemRenderContext.isDefaultTranslucent is initialized * Update compatible mods listing * Update mod manifest to restrict Minecraft versions * Update to Minecraft 1.21.4 * Update NeoForge manifest for Minecraft 1.21.4 * Fix glyph effect orientation * Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907) * Fix detection for specific Intel OpenGL ICDs The OpenGL ICD name now includes the file extension, which the regex expressions were not matching. * Avoid showing the incompatible driver error in some cases For systems with hybrid graphics, it may be the case that an incompatible graphics driver is installed, but that it isn't used for the OpenGL context. 
We can avoid showing errors in this situation by checking the vendor string of the context immediately after creation. This is not the most robust check, but in practice, a single system should not have multiple graphics drivers installed from the same vendor, so checking the string should be relatively safe. * Fix lambda mappings * Bump version and dependency requirements * Use correct coordinates for sorting chunk sections (CaffeineMC#2924) Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin. Closes CaffeineMC#2918 * Update README.md --------- Co-authored-by: douira <[email protected]> Co-authored-by: IMS212 <[email protected]> Co-authored-by: JellySquid <[email protected]> Co-authored-by: douira <[email protected]> Co-authored-by: MeeniMc <[email protected]> Co-authored-by: JellySquid <[email protected]> Co-authored-by: IThundxr <[email protected]> Co-authored-by: muzikbike <[email protected]>
ThatMG393
added a commit
to ThatMG393/sodium
that referenced
this pull request
Jan 3, 2025
* Correctly handle colorization on NeoForge * Combine the vertex position attributes (CaffeineMC#2753) This improves terrain rendering performance significantly on Intel Xe-LP graphics under Linux. * Add option for Fullscreen Resolution (CaffeineMC#2642) The resolution controls would not fit in the allocated space, so the rendering of slider controls was changed to enable rendering the slider bar and the value text on separate lines. Co-authored-by: MeeniMc <[email protected]> * Only enable Fullscreen Resolution option on Windows Additionally, adjust the rendering of the controls to be less confusing when disabled, and provide an explanation as to what the option does. * Use consistent vertex ordering in entity rendering Some core shaders were relying on the model part faces being written out in a specific order. We still don't support core shaders, but the fix here is trivial enough. Fixes CaffeineMC#2745 * Add check for NeoForge per-quad AO flag * Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773) * Fix excessively large allocations in chunk meshing The requested capacity was being multiplied by the vertex stride more than once, which resulted in far too much memory being allocated. Closes CaffeineMC#2792 * Fix some issues with Uint32 representation This increases the maximum size of vertex and index buffers to 4 billion elements, since the Uint32 types stored in memory are now safely represented with Int64. For vertex buffers, this increases their maximum size to 80 GiB, and index buffers have a maximum size of 16 GiB, whereas both were limited to 2 GiB prior. 
* Fix cull bitmask ordering in entity rendering Closes CaffeineMC#2788 * Add support for Maven Local publishing * Fix incorrect warning message when D3DKMT is not supported * Add Flawless Frames handler for NeoForge * Add angle-based section visibility path occlusion (CaffeineMC#2811) This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount. * Disable material downgrading on Intel Gen8 and older Fixes CaffeineMC#2830 * Delay normal face calculation to use This potentially fixes some cases of CaffeineMC#2835. * Skip particle rendering optimizations for incompatible mods Fixes CaffeineMC#2827 * Update project URLs in source and documentation We're no longer a Fabric-exclusive mod, so let's get rid of the suffix. * Add third-party license notice for Fabric API * Optimizations for some block models (CaffeineMC#2508) Co-authored-by: muzikbike <[email protected]> * Fix sorting failures on rotated cuboids (CaffeineMC#2812) Use the accurate vertex positions for unaligned and aligned (but rotated) quads. * Rework the Gradle build scripts for multi-loader * Shared logic is moved into a build plugin where possible * Build time is significantly improved when the Gradle daemon is warmed * Mixins are remapped in-place now, eliminating the need for refmaps at runtime. This also gets rid of some warning messages at startup. 
  * Module relationships are now correctly represented in IDEA for other source sets (fixes a lot of code analysis features)
  * Split Java source and resources into different configurations
  * Run configurations are now consistent between NeoForge/Fabric
  * The common project is no longer remapped unnecessarily
  * Updated Gradle and build plugins
* Restore versioning schema to build script
* Make organization of platform mixin packages consistent
  Fixes CaffeineMC#2688
* Exclude README documentation from processed resources
  These files are only meant to be in the source distribution, and Minecraft doesn't like them.
* Don't try to load a refmap from the mixin plugin
* Enumerate additional PCI classes in the graphics adapter probe
  Some integrated GPUs, such as RDNA3.5, appear to use the PCI_CLASS_DISPLAY_OTHER class.
* Remove KDE and GNOME specific backends for browsing URLs
  The bugs with xdg-open have been resolved upstream and most Linux distributions are shipping the patches. Also, make sure we get a successful exit code from the XDG implementation.
* Remove Minecraft from classpath of the pre-launch source set
  This will help to avoid class-load issues and makes the code more hygienic.
* Update to Minecraft 1.21.3
* Ensure tooltips are constrained to the screen (CaffeineMC#2845)
* Update authors and contributors list
* Remove leftover popPose
* Trust existing fog color
* Block the Overwolf Overlay due to graphical corruption
  The overlay does not correctly restore the texture unit state in OpenGL, which causes problems when Minecraft thinks a texture has already been bound to a slot. Since disabling the OpenGL state cache globally is not an acceptable solution (it would severely hurt performance) and their software doesn't give us any method to detect the problematic version, we block all versions. gep_minecraft.dll is the payload they actually inject, which has no version information or description.
  Fixes CaffeineMC#2862
* Avoid static memory allocation in EntityRenderer
  Just allocate on the stack, since it's a small amount of memory (<1 KiB) and avoids needing complex finalizers.
* Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864)
* Fix memory leak and double free in CloudRenderer
* Improve color mixing functions
  The existing functions did not implement rounding correctly (often leading to off-by-one errors). Additionally, the improved variants are both slightly faster and easier to understand.
* Fixup documentation in ColorMixer
* Unify color mixing/swizzling utilities
  The Fabric integration code was re-implementing a lot of the utilities that already exist in Sodium unnecessarily. Also, improve the documentation so that ABGR and RGBA are not used interchangeably.
* Add optimized function for bi-linear interpolation
  This reduces the number of ALU ops significantly and creates a common utility function in the project.
* Reduce time complexity for box blurs
  Measuring the time spent per box blur in biome blending, the following results were observed.

  | Radius   | Before | After  | % Improvement |
  |----------|--------|--------|---------------|
  | 7 blocks | 9100ns | 3700ns | 59%           |
  | 3 blocks | 5400ns | 3200ns | 41%           |
  | 1 blocks | 3700ns | 2600ns | 29%           |

* Revert detection of Overwolf Overlay
  They claim this has since been fixed. We will re-examine in the future if we see additional reports.
  This reverts commit e7ea6f7.
* Bump version to 0.6.0-beta.5
* Bump version to 0.6.0-final
* Update render code for chunk status map
  Fixes CaffeineMC#2881
* Rollup of fixes and improvements for cloud rendering
  Some changes were made to cloud rendering in newer versions that needed to be replicated in Sodium.
  - The alpha cutoff for clouds was changed to (a < 10).
  - Texture loading can now gracefully fail, and it is expected that rendering is skipped when this happens.
  - The movement/positioning of clouds was slightly changed.
  - The render pass system now needs to be told about render target usages (fixes CaffeineMC#2883).
  This commit also improves mesh building time by around 35% on a fast processor (AMD Ryzen HX AI 370) through various micro-optimizations.
* Fix culling behavior between transparent and opaque blocks
  Minecraft 1.21.2 changed some of the rules, and this was causing the faces of transparent blocks to be rendered even when they were hidden by full opaque blocks.
  Fixes CaffeineMC#2850
* Fix precision issues in cloud rendering at far distances
* Use alternative workaround for NVIDIA drivers
  The NVIDIA driver enables a driver feature called "Threaded Optimizations" when it finds Minecraft, which causes severe performance issues and sometimes even crashes. Newer versions of the driver seem to use a slightly different heuristic which our workaround doesn't address. So, instead, use an alternative method that enables GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably disable the functionality *even if* the user has configured it otherwise in their driver settings. Additionally, on Windows, we now always indicate to the driver that Minecraft is running, so that users with hybrid graphics don't see regressed performance.
* Sort render lists for regions and sections after traversal (CaffeineMC#2780)
  Render sections and regions are sorted after the graph traversal is performed. This decouples their ordering from the graph, which isn't entirely correct for draw call sorting.
  Fixes CaffeineMC#2266
* Add support for new NeoForge fluid overlay API
* Ensure depth test is configured when rendering clouds
  The state of the depth test prior to cloud rendering is undefined. After rendering, it is expected to be disabled again.
* Fix rounding error in ColorMixer#mix
  Rounding of the values now happens after the 16-bit intermediaries are added together. This affected some animated textures, causing them to exhibit flickering behavior.
* Add additional optimized block models
  This covers the following additional blocks:
  - Cauldrons
  - Brewing Stands
  - Bells
  Co-authored-by: JellySquid <[email protected]>
* Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886)
  Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
* Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879)
  This fixes some problems where very large block entities in nearby sections may be incorrectly culled. But it does not comprehensively fix the problem for all other sections, since that would require visiting the 27-neighborhood of every section, which is too slow.
* Bump version to 0.6.1
* Bump dependency versions
* Update mod manifest
* Ensure ItemRenderContext.isDefaultTranslucent is initialized
* Update compatible mods listing
* Update mod manifest to restrict Minecraft versions
* Update to Minecraft 1.21.4
* Update NeoForge manifest for Minecraft 1.21.4
* Fix glyph effect orientation
* Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907)
* Fix detection for specific Intel OpenGL ICDs
  The OpenGL ICD name now includes the file extension, which the regex expressions were not matching.
* Avoid showing the incompatible driver error in some cases
  For systems with hybrid graphics, it may be the case that an incompatible graphics driver is installed, but that it isn't used for the OpenGL context.
  We can avoid showing errors in this situation by checking the vendor string of the context immediately after creation. This is not the most robust check, but in practice, a single system should not have multiple graphics drivers installed from the same vendor, so checking the string should be relatively safe.
* Fix lambda mappings
* Bump version and dependency requirements
* Use correct coordinates for sorting chunk sections (CaffeineMC#2924)
  Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin.
  Closes CaffeineMC#2918
* Update README.md
* Update textures for Leaves variants
  * Added "Pale Oak Leaves" for Minecraft 1.21.4.
  * Reduced the file size of all block textures.
* Clean up BitArray class
* Use link.caffeinemc.net domain for some URLs
* Use ShellExecuteW from message box callbacks
  This fixes a regression caused by 26f4263. The underlying problem is that accessing Java's AWT *after* LWJGL3 has initialized is not possible. Minecraft has a utility class which uses rundll32 internally, but we cannot access that due to classloader restrictions on NeoForge. That leaves us with having to implement the call ourselves, and simply using Shell32 directly (like we do for other Windows APIs) seems easiest.
* Do not bake ambient lighting into cached per-face light data
  Fixes CaffeineMC#2806
* Switch to Parchment mappings 2024.12.07
* Bump version and dependencies
* Do not cache ambient brightness at initialization
  The world may not be assigned to the renderer at initialization, which is the case for non-terrain rendering (i.e. block entities.) Likely, there is no performance benefit to caching this data in the first place, so the easiest solution is to just remove the code.
* Bump version to 0.6.5
* Fix transforms not being run on fast path
* Bump version to 0.6.6
* Rename model texture references to match Vanilla (CaffeineMC#2958)
  Custom models which extended these base models would not properly have their textures applied, as the texture references were accidentally changed.
* Do not apply optimizations to sprites with special tickers
* Aggressively optimize entity rendering
  This is anywhere from 10 to 15% faster depending on what entities are being rendered. Most of the improvements come from more efficiently laying out the cuboid data and coalescing neighboring 32-bit values into 64-bit words. Furthermore, vertex positions are calculated by extracting vectors from the pose matrix and adding them to the origin vertex, which avoids many matrix multiplications.
  Co-authored-by: MoePus <[email protected]>
* Avoid quaternion transforms in particle rendering
  The billboard geometry can be computed using the camera's left and up vectors, saving some cycles. When rendering thousands of billboard particles, this was ~10% faster than baseline in my observation.
  Co-authored-by: MoePus <[email protected]>
* Add shader source line annotations (CaffeineMC#2691)

---------

Co-authored-by: IMS212 <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: MeeniMc <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: IThundxr <[email protected]>
Co-authored-by: muzikbike <[email protected]>
Co-authored-by: MoePus <[email protected]>
ThatMG393
added a commit
to ThatMG393/sodium
that referenced
this pull request
Jan 3, 2025
ThatMG393
pushed a commit
to ThatMG393/sodium
that referenced
this pull request
Jan 4, 2025
…eineMC#2886) Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.
In combination with async frustum culling, where task collection is not bound to the graph search, this is even more effective.
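To make the idea concrete, here is a minimal sketch of the check described above. It is not the code from this PR: `SectionStateSketch`, `SectionGraphSketch`, `applyBuildResult`, and the three state fields are hypothetical names chosen for illustration, and the "graph search" is simulated with a counter.

```java
/**
 * Hypothetical stand-in for a render section. The fields are illustrative
 * assumptions, not Sodium's actual section layout.
 */
final class SectionStateSketch {
    long visibilityData; // assumed: packed visibility/occlusion bits
    boolean built;       // assumed: whether the section currently has a mesh
    int renderFlags;     // assumed: flags that decide render list membership
}

/** Sketch of "only mark the graph dirty when relevant state changed". */
final class SectionGraphSketch {
    private boolean graphDirty = false;
    private int searchCount = 0;

    /** A block update schedules a rebuild; one search is still needed for that. */
    void onBlockUpdate() {
        this.graphDirty = true;
    }

    /**
     * Applies a finished rebuild result. The graph is marked dirty only if
     * the visibility data or build state differs from what was stored.
     */
    boolean applyBuildResult(SectionStateSketch section, long visibilityData,
                             boolean built, int renderFlags) {
        boolean changed = section.visibilityData != visibilityData
                || section.built != built
                || section.renderFlags != renderFlags;

        section.visibilityData = visibilityData;
        section.built = built;
        section.renderFlags = renderFlags;

        if (changed) {
            this.graphDirty = true;
        }
        return changed;
    }

    /** Runs the simulated graph search at most once per frame, only if needed. */
    void onFrame() {
        if (this.graphDirty) {
            this.searchCount++;
            this.graphDirty = false;
        }
    }

    int searchCount() {
        return this.searchCount;
    }
}

/** Small demo: one block update whose rebuild result leaves the state unchanged. */
final class SectionGraphDemo {
    public static void main(String[] args) {
        SectionStateSketch section = new SectionStateSketch();
        SectionGraphSketch graph = new SectionGraphSketch();

        graph.applyBuildResult(section, 0b1011L, true, 1); // initial build changes state
        graph.onFrame();

        graph.onBlockUpdate();                              // search needed to schedule the task
        graph.onFrame();

        graph.applyBuildResult(section, 0b1011L, true, 1); // identical result: no extra search
        graph.onFrame();

        System.out.println("searches = " + graph.searchCount());
    }
}
```

In this sketch, a block update whose rebuild produces identical state costs one search instead of two, which is the "half the graph searches" case from the description. Which fields actually count as relevant to the search is a design decision of the real implementation and is only assumed here.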