Avoid Unnecessary Graph Searches by Checking for State Changes #2886

Merged 1 commit on Dec 2, 2024

Conversation

douira
Collaborator

@douira douira commented Nov 20, 2024

Avoids rebuilding the render lists and doing a graph search more often than necessary by checking if the section actually changed in a way that's relevant to the graph search. For worlds that update their blocks frequently (every tick or every redstone tick) this avoids half the graph searches. Some graph searches are still necessary to schedule rebuild tasks, but when the task results come back, this doesn't do another graph search unless the section's visibility data or build state changed in a way that needs the render list to be updated.

In combination with async frustum culling where task collection is not bound to the graph search, this is even more effective.
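The check described above can be pictured with a minimal sketch. It is not the code from this PR: `SectionState`, `GraphDirtyTracker`, and their fields are hypothetical names, and the assumption is that "relevant to the graph search" boils down to the section's visibility data and its built/unbuilt state.

```java
/** Hypothetical stand-in for a render section; not Sodium's actual class. */
final class SectionState {
    long visibilityData;  // assumed encoding of which faces connect through the section
    boolean built;        // assumed flag: does the section currently have geometry?

    /** Stores a finished task result; returns true only if graph-relevant state changed. */
    boolean applyResult(long newVisibilityData, boolean newBuilt) {
        boolean changed = this.visibilityData != newVisibilityData || this.built != newBuilt;
        this.visibilityData = newVisibilityData;
        this.built = newBuilt;
        return changed;
    }
}

/** Hypothetical owner of the "graph needs another search" flag. */
final class GraphDirtyTracker {
    private boolean dirty;

    void onTaskResult(SectionState section, long visibilityData, boolean built) {
        // Setting the flag unconditionally here would trigger a search for every
        // task result, even when the rebuilt section looks the same to the graph.
        if (section.applyResult(visibilityData, built)) {
            this.dirty = true;
        }
    }

    /** Returns whether a graph search is needed and clears the flag. */
    boolean consumeDirty() {
        boolean wasDirty = this.dirty;
        this.dirty = false;
        return wasDirty;
    }
}
```

In this sketch, a block update that changes a section's mesh but leaves its connectivity and build state untouched never sets the flag, so the render lists from the previous search stay valid.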

…n than necessary by checking if the section actually changed in a way that's relevant to the graph search
@jellysquid3 jellysquid3 merged commit 8f4aaaa into CaffeineMC:dev Dec 2, 2024
1 check passed
jellysquid3 pushed a commit that referenced this pull request Dec 2, 2024
Avoids rebuilding the render lists and doing a graph search
more often than necessary by checking if the section actually
changed in a way that's relevant to the graph search.

For worlds that update their blocks frequently (every tick or
every redstone tick) this avoids half the graph searches. Some
graph searches are still necessary to schedule rebuild tasks,
but when the task results come back, this doesn't do another
graph search unless the section's visibility data or build state
changed in a way that needs the render list to be updated.
@douira douira deleted the only-rebuild-when-necessary branch December 7, 2024 23:50
ThatMG393 added a commit to ThatMG393/sodium that referenced this pull request Dec 15, 2024
* Correctly handle colorization on NeoForge

* Combine the vertex position attributes (CaffeineMC#2753)

This improves terrain rendering performance significantly
on Intel Xe-LP graphics under Linux.

* Add option for Fullscreen Resolution (CaffeineMC#2642)

The resolution controls would not fit in the allocated space, so the
rendering of slider controls was changed to enable rendering the slider
bar and the value text on separate lines.

Co-authored-by: MeeniMc <[email protected]>

* Only enable Fullscreen Resolution option on Windows

Additionally, adjust the rendering of the controls
to be less confusing when disabled, and provide an
explanation as to what the option does.

* Use consistent vertex ordering in entity rendering

Some core shaders were relying on the model part faces being
written out in a specific order. We still don't support
core shaders, but the fix here is trivial enough.

Fixes CaffeineMC#2745

* Add check for NeoForge per-quad AO flag

* Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773)

* Fix excessively large allocations in chunk meshing

The requested capacity was being multiplied by the vertex
stride more than once, which resulted in far too much
memory being allocated.

Closes CaffeineMC#2792

* Fix some issues with Uint32 representation

This increases the maximum size of vertex and index buffers
to 4 billion elements, since the Uint32 types stored in memory are
now safely represented with Int64.

For vertex buffers, this increases their maximum size to 80 GiB,
and index buffers have a maximum size of 16 GiB, whereas both
were limited to 2 GiB prior.
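The figures in this entry can be checked with a little arithmetic. The sketch below is an illustration only: `Uint32Limits` is a hypothetical class, and the 20-byte vertex stride and 4-byte index size are assumptions inferred from the stated 80 GiB and 16 GiB limits, not values taken from the commit.

```java
/** Hypothetical helper illustrating the Uint32 limits; not part of Sodium. */
final class Uint32Limits {
    static final long GIB = 1L << 30;
    static final long MAX_ELEMENTS = 1L << 32; // count of distinct Uint32 values (~4 billion)

    /** Reinterprets the 32 bits of a Java int as an unsigned value held in a long. */
    static long widen(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    /** Largest buffer size in bytes when elements of the given size are counted by a Uint32. */
    static long maxBytes(long elementSizeBytes) {
        return MAX_ELEMENTS * elementSizeBytes;
    }

    public static void main(String[] args) {
        int raw = (int) 3_000_000_000L;          // Uint32 bit pattern above Integer.MAX_VALUE
        System.out.println(raw);                 // negative when read as a signed int
        System.out.println(widen(raw));          // 3000000000 once widened
        System.out.println(maxBytes(20) / GIB);  // 80 (assumed 20-byte vertex stride)
        System.out.println(maxBytes(4) / GIB);   // 16 (assumed 4-byte index)
    }
}
```

The earlier 2 GiB ceiling matches what a signed 32-bit byte count can express, since `Integer.MAX_VALUE` is just under 2^31.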

* Fix cull bitmask ordering in entity rendering

Closes CaffeineMC#2788

* Add support for Maven Local publishing

* Fix incorrect warning message when D3DKMT is not supported

* Add Flawless Frames handler for NeoForge

* Add angle-based section visibility path occlusion (CaffeineMC#2811)

This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount.

* Disable material downgrading on Intel Gen8 and older

Fixes CaffeineMC#2830

* Delay normal face calculation to use

This potentially fixes some cases of CaffeineMC#2835.

* Skip particle rendering optimizations for incompatible mods

Fixes CaffeineMC#2827

* Update project URLs in source and documentation

We're no longer a Fabric-exclusive mod, so let's get rid of
the suffix.

* Add third-party license notice for Fabric API

* Optimizations for some block models (CaffeineMC#2508)

Co-authored-by: muzikbike <[email protected]>

* Fix sorting failures on rotated cuboids (CaffeineMC#2812)

Use the accurate vertex positions for unaligned and aligned (but rotated) quads.

* Rework the Gradle build scripts for multi-loader

* Shared logic is moved into a build plugin where possible
* Build time is significantly improved when the Gradle daemon is warmed
* Mixins are remapped in-place now, eliminating the need for refmaps
  at runtime. This also gets rid of some warning messages at startup.
* Module relationships are now correctly represented in IDEA for other
  source sets (fixes a lot of code analysis features)
* Split Java source and resources into different configurations
* Run configurations are now consistent between NeoForge/Fabric
* The common project is no longer remapped unnecessarily
* Updated Gradle and build plugins

* Restore versioning schema to build script

* Make organization of platform mixin packages consistent

Fixes CaffeineMC#2688

* Exclude README documentation from processed resources

These files are only meant to be in the source distribution, and
Minecraft doesn't like them.

* Don't try to load a refmap from the mixin plugin

* Enumerate additional PCI classes in the graphics adapter probe

Some integrated GPUs, such as RDNA3.5, appear to use
the PCI_CLASS_DISPLAY_OTHER class.

* Remove KDE and GNOME specific backends for browsing URLs

The bugs with xdg-open have been resolved upstream and most
Linux distributions are shipping the patches.

Also, make sure we get a successful exit code from the XDG
implementation.

* Remove Minecraft from classpath of the pre-launch source set

This will help to avoid class-load issues and makes the code more
hygienic.

* Update to Minecraft 1.21.3

* Ensure tooltips are constrained to the screen (CaffeineMC#2845)

* Update authors and contributors list

* Remove leftover popPose

* Trust existing fog color

* Block the Overwolf Overlay due to graphical corruption

The overlay does not correctly restore the texture unit state
in OpenGL, which causes problems when Minecraft thinks a texture
has already been bound to a slot.

Since disabling the OpenGL state cache globally is not an
acceptable solution (it would severely hurt performance) and
their software doesn't give us any method to detect the
problematic version, we block all versions.

gep_minecraft.dll is the payload they actually inject, which
has no version information or description.

Fixes CaffeineMC#2862

* Avoid static memory allocation in EntityRenderer

Just allocate on the stack, since it's a small amount of
memory (<1 KiB) and avoids needing complex finalizers.

* Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864)

* Fix memory leak and double free in CloudRenderer

* Improve color mixing functions

The existing functions did not implement rounding
correctly (often leading to off-by-one errors).

Additionally, the improved variants are both slightly
faster and easier to understand.

* Fixup documentation in ColorMixer

* Unify color mixing/swizzling utilities

The Fabric integration code was re-implementing a lot
of the utilities that already exist in Sodium unnecessarily.

Also, improve the documentation so that ABGR and RGBA are
not used interchangeably.

* Add optimized function for bi-linear interpolation

This reduces the number of ALU ops significantly and
creates a common utility function in the project.

* Reduce time complexity for box blurs

Measuring the time spent per box blur in biome blending,
the following results were observed.

Radius      Before      After       % Improvement
7 blocks    9100ns      3700ns      59%
3 blocks    5400ns      3200ns      41%
1 block     3700ns      2600ns      29%
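The commit does not say how the complexity was reduced. One common technique, shown here purely as an assumption, is a running-sum (sliding window) blur whose cost per sample no longer depends on the radius. `BoxBlur1D` is a hypothetical one-dimensional example with clamped edges, not Sodium's biome blending code.

```java
/** Hypothetical 1D box blur comparison; not Sodium's implementation. */
final class BoxBlur1D {
    /** Naive version: re-sums the whole window for every sample, O(n * radius). */
    static int[] naive(int[] src, int radius) {
        int n = src.length;
        int[] dst = new int[n];
        int window = 2 * radius + 1;
        for (int i = 0; i < n; i++) {
            int sum = 0;
            for (int k = -radius; k <= radius; k++) {
                sum += src[clamp(i + k, n)];
            }
            dst[i] = sum / window;
        }
        return dst;
    }

    /** Sliding-window version: updates a running sum, O(n) regardless of radius. */
    static int[] sliding(int[] src, int radius) {
        int n = src.length;
        int[] dst = new int[n];
        if (n == 0) {
            return dst;
        }
        int window = 2 * radius + 1;
        int sum = 0;
        // Prime the running sum with the window centered on index 0.
        for (int k = -radius; k <= radius; k++) {
            sum += src[clamp(k, n)];
        }
        for (int i = 0; i < n; i++) {
            dst[i] = sum / window;
            // Slide right: add the sample entering the window, drop the one leaving it.
            sum += src[clamp(i + radius + 1, n)] - src[clamp(i - radius, n)];
        }
        return dst;
    }

    /** Clamps an index into [0, n - 1] so edge samples are repeated. */
    private static int clamp(int index, int n) {
        return Math.max(0, Math.min(n - 1, index));
    }
}
```

Both methods produce the same output. A 2D blur can apply the sliding version once along each axis.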

* Revert detection of Overwolf Overlay

They claim this has since been fixed. We will re-examine in
the future if we see additional reports.

This reverts commit e7ea6f7.

* Bump version to 0.6.0-beta.5

* Bump version to 0.6.0-final

* Update render code for chunk status map

Fixes CaffeineMC#2881

* Rollup of fixes and improvements for cloud rendering

Some changes were made to cloud rendering in newer versions
that needed to be replicated in Sodium.

- The alpha cutoff for clouds was changed to (a < 10).
- Texture loading can now gracefully fail, and it is
expected that rendering is skipped when this happens.
- The movement/positioning of clouds was slightly changed.
- The render pass system now needs to be told about
render target usages (fixes CaffeineMC#2883).

This commit also improves mesh building time by around 35% on
a fast processor (AMD Ryzen HX AI 370) through various
micro-optimizations.

* Fix culling behavior between transparent and opaque blocks

Minecraft 1.21.2 changed some of the rules, and this was
causing the faces of transparent blocks to be rendered
even when they were hidden by full opaque blocks.

Fixes CaffeineMC#2850

* Fix precision issues in cloud rendering at far distances

* Use alternative workaround for NVIDIA drivers

The NVIDIA driver enables a driver feature called
"Threaded Optimizations" when it finds Minecraft,
which causes severe performance issues and sometimes
even crashes.

Newer versions of the driver seem to use a slightly
different heuristic which our workaround doesn't
address.

So, instead, use an alternative method that enables
GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably
disable the functionality *even if* the user has
configured it otherwise in their driver settings.

Additionally, on Windows, we now always indicate to
the driver that Minecraft is running, so that users
with hybrid graphics don't see regressed performance.

* Sort render lists for regions and sections after traversal (CaffeineMC#2780)

Render sections and regions are sorted after the graph traversal is performed. This decouples their ordering from the graph, which isn't entirely correct for draw call sorting.

Fixes CaffeineMC#2266

* Add support for new NeoForge fluid overlay API

* Ensure depth test is configured when rendering clouds

The state of the depth test prior to cloud rendering is
undefined. After rendering, it is expected to be
disabled again.

* Fix rounding error in ColorMixer#mix

Rounding of the values now happens after the 16-bit
intermediaries are added together.

This affected some animated textures, causing them to
exhibit flickering behavior.

* Add additional optimized block models

This covers the following additional blocks:
- Cauldrons
- Brewing Stands
- Bells

Co-authored-by: JellySquid <[email protected]>

* Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886)

Avoids rebuilding the render lists and doing a graph search
more often than necessary by checking if the section actually
changed in a way that's relevant to the graph search.

For worlds that update their blocks frequently (every tick or
every redstone tick) this avoids half the graph searches. Some
graph searches are still necessary to schedule rebuild tasks,
but when the task results come back, this doesn't do another
graph search unless the section's visibility data or build state
changed in a way that needs the render list to be updated.

* Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879)

This fixes some problems where very large block entities in
nearby sections may be incorrectly culled. But it does not
comprehensively fix the problem for all other sections,
since that would require visiting the 27-neighborhood of
every section, which is too slow.

* Bump version to 0.6.1

* Bump dependency versions

* Update mod manifest

* Ensure ItemRenderContext.isDefaultTranslucent is initialized

* Update compatible mods listing

* Update mod manifest to restrict Minecraft versions

* Update to Minecraft 1.21.4

* Update NeoForge manifest for Minecraft 1.21.4

* Fix glyph effect orientation

* Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907)

* Fix detection for specific Intel OpenGL ICDs

The OpenGL ICD name now includes the file extension,
which the regex expressions were not matching.

* Avoid showing the incompatible driver error in some cases

For systems with hybrid graphics, it may be the case
that an incompatible graphics driver is installed, but that
it isn't used for the OpenGL context.

We can avoid showing errors in this situation by checking
the vendor string of the context immediately after
creation.

This is not the most robust check, but in practice, a single
system should not have multiple graphics drivers installed
from the same vendor, so checking the string should be
relatively safe.

* Fix lambda mappings

* Bump version and dependency requirements

* Use correct coordinates for sorting chunk sections (CaffeineMC#2924)

Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin.

Closes CaffeineMC#2918
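One plausible reading of why the integer part is wrong near the origin: casting a negative coordinate to `int` truncates toward zero, so positions just below zero land in section 0 instead of section -1. The sketch below illustrates that general pitfall; `SectionCoords` is a hypothetical class, 16-block sections are assumed, and this is not the code changed by the commit.

```java
/** Hypothetical illustration of truncation versus floor; not Sodium's code. */
final class SectionCoords {
    /** Integer part via cast: rounds toward zero, which is wrong for negative positions. */
    static int truncated(double blockPos) {
        return ((int) blockPos) >> 4;
    }

    /** Floor first, then shift: gives the section that actually contains the position. */
    static int floored(double blockPos) {
        return ((int) Math.floor(blockPos)) >> 4;
    }

    public static void main(String[] args) {
        System.out.println(truncated(-0.5)); // 0, although the camera is in section -1
        System.out.println(floored(-0.5));   // -1
        System.out.println(truncated(17.3)); // 1, positive positions agree
        System.out.println(floored(17.3));   // 1
    }
}
```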

* Update README.md

---------

Co-authored-by: IMS212 <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: MeeniMc <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: IThundxr <[email protected]>
Co-authored-by: muzikbike <[email protected]>
ThatMG393 pushed a commit to ThatMG393/sodium that referenced this pull request Dec 15, 2024
…eineMC#2886)

Avoids rebuilding the render lists and doing a graph search
more often than necessary by checking if the section actually
changed in a way that's relevant to the graph search.

For worlds that update their blocks frequently (every tick or
every redstone tick) this avoids half the graph searches. Some
graph searches are still necessary to schedule rebuild tasks,
but when the task results come back, this doesn't do another
graph search unless the section's visibility data or build state
changed in a way that needs the render list to be updated.
ThatMG393 added a commit to ThatMG393/sodium that referenced this pull request Dec 15, 2024
* rename some things for clarity

* fix waterlogged glass panes (once again, but more this time) by avoiding distance sorting through
the detection of primary intersectors when geometry is intersecting and then sorting them in a fixed order

* use Mth.clamp for clarity

* refactor buffer and sort result handling, buffers are now freed immediately instead of keeping them to avoid memory usage
buffer caching would be a better solution but that's complicated and doesn't currently work correctly

* reduce number of unique triggers by around 5 percent without impacting sorting or building performance

* importantly sort a little farther away, sort tasks are fast

* use defer zero frames for important sort tasks by default

* fix build

* clarify authorship of BitArray

* fix bug with radix sort for SNR heuristic in BSP partition generating wrong indexes

* combine draw commands

* correctly reset accumulated element count

* remove draw call combining for indexed rendering as it's broken and hard to fix

* skip heuristic if there's no quads

* refactor primary intersector detection to handle large cases better,
also removed the warning message about unpartitionable geometry as it seems to not be a relevant problem

* fix topo sorting in some situations where the dot product was wrongly not recalculated when the normal is quantized.
also fixed aligned quads not receiving the more accurate center based on the average of the unique vertexes.

* reorder vertex ranges before uploaded to optimize for combined draw commands

* tune primary intersector detection to handle situations where only a small amount of geometry is intersecting

* Correctly handle colorization on NeoForge

* Combine the vertex position attributes (CaffeineMC#2753)

This improves terrain rendering performance significantly
on Intel Xe-LP graphics under Linux.

* Add option for Fullscreen Resolution (CaffeineMC#2642)

The resolution controls would not fit in the allocated space, so the
rendering of slider controls was changed to enable rendering the slider
bar and the value text on separate lines.

Co-authored-by: MeeniMc <[email protected]>

* Only enable Fullscreen Resolution option on Windows

Additionally, adjust the rendering of the controls
to be less confusing when disabled, and provide an
explanation as to what the option does.

* fix draw command combining, remove aggressive non-empty command skipping because it seems broken

* Use consistent vertex ordering in entity rendering

Some core shaders were relying on the model part faces being
written out in a specific order. We still don't support
core shaders, but the fix here is trivial enough.

Fixes CaffeineMC#2745

* Add check for NeoForge per-quad AO flag

* Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773)

* fix graphical corruption when there's a lot of geometry by appropriately picking the size of the required shared index buffer

* cleanup unused and broken code

* cleanup calculation of mask bit and element count

* cleanup meshing, storage, and renderer

* fix translucent rendering by correctly decoding vertex segments

* cleanup misc, remove unused code

* refactor translucent AnyOrderData to not generate its own trivial index buffer and instead share this type of data within regions

* add Index Pool arena size

* add arena content caching

* Fix excessively large allocations in chunk meshing

The requested capacity was being multiplied by the vertex
stride more than once, which resulted in far too much
memory being allocated.

Closes CaffeineMC#2792

* Fix some issues with Uint32 representation

This increases the maximum size of vertex and index buffers
to 4 billion elements, since the Uint32 types stored in memory are
now safely represented with Int64.

For vertex buffers, this increases their maximum size to 80 GiB,
and index buffers have a maximum size of 16 GiB, whereas both
were limited to 2 GiB prior.

* refactor storage to cope with larger amounts of geometry and use less ugly hacks, rename a bunch of methods to be consistent and clearer

* remove debug code

* Fix cull bitmask ordering in entity rendering

Closes CaffeineMC#2788

* Add support for Maven Local publishing

* Fix incorrect warning message when D3DKMT is not supported

* Add Flawless Frames handler for NeoForge

* Add angle-based section visibility path occlusion (CaffeineMC#2811)

This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount.

* Disable material downgrading on Intel Gen8 and older

Fixes CaffeineMC#2830

* Delay normal face calculation to use

This potentially fixes some cases of CaffeineMC#2835.

* Skip particle rendering optimizations for incompatible mods

Fixes CaffeineMC#2827

* Update project URLs in source and documentation

We're no longer a Fabric-exclusive mod, so let's get rid of
the suffix.

* Add third-party license notice for Fabric API

* Optimizations for some block models (CaffeineMC#2508)

Co-authored-by: muzikbike <[email protected]>

* Fix sorting failures on rotated cuboids (CaffeineMC#2812)

Use the accurate vertex positions for unaligned and aligned (but rotated) quads.

* Rework the Gradle build scripts for multi-loader

* Shared logic is moved into a build plugin where possible
* Build time is significantly improved when the Gradle daemon is warmed
* Mixins are remapped in-place now, eliminating the need for refmaps
  at runtime. This also gets rid of some warning messages at startup.
* Module relationships are now correctly represented in IDEA for other
  source sets (fixes a lot of code analysis features)
* Split Java source and resources into different configurations
* Run configurations are now consistent between NeoForge/Fabric
* The common project is no longer remapped unnecessarily
* Updated Gradle and build plugins

* Restore versioning schema to build script

* Make organization of platform mixin packages consistent

Fixes CaffeineMC#2688

* Exclude README documentation from processed resources

These files are only meant to be in the source distribution, and
Minecraft doesn't like them.

* Don't try to load a refmap from the mixin plugin

* Enumerate additional PCI classes in the graphics adapter probe

Some integrated GPUs, such as RDNA3.5, appear to use
the PCI_CLASS_DISPLAY_OTHER class.

* Remove KDE and GNOME specific backends for browsing URLs

The bugs with xdg-open have been resolved upstream and most
Linux distributions are shipping the patches.

Also, make sure we get a successful exit code from the XDG
implementation.

* Remove Minecraft from classpath of the pre-launch source set

This will help to avoid class-load issues and makes the code more
hygienic.

* Update to Minecraft 1.21.3

* Ensure tooltips are constrained to the screen (CaffeineMC#2845)

* Update authors and contributors list

* Remove leftover popPose

* Trust existing fog color

* Block the Overwolf Overlay due to graphical corruption

The overlay does not correctly restore the texture unit state
in OpenGL, which causes problems when Minecraft thinks a texture
has already been bound to a slot.

Since disabling the OpenGL state cache globally is not an
acceptable solution (it would severely hurt performance) and
their software doesn't give us any method to detect the
problematic version, we block all versions.

gep_minecraft.dll is the payload they actually inject, which
has no version information or description.

Fixes CaffeineMC#2862

* Avoid static memory allocation in EntityRenderer

Just allocate on the stack, since it's a small amount of
memory (<1 KiB) and avoids needing complex finalizers.

* Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864)

* Fix memory leak and double free in CloudRenderer

* Improve color mixing functions

The existing functions did not implement rounding
correctly (often leading to off-by-one errors).

Additionally, the improved variants are both slightly
faster and easier to understand.

* Fixup documentation in ColorMixer

* Unify color mixing/swizzling utilities

The Fabric integration code was re-implementing a lot
of the utilities that already exist in Sodium unnecessarily.

Also, improve the documentation so that ABGR and RGBA are
not used interchangeably.

* Add optimized function for bi-linear interpolation

This reduces the number of ALU ops significantly and
creates a common utility function in the project.

* Reduce time complexity for box blurs

Measuring the time spent per box blur in biome blending,
the following results were observed.

Radius      Before      After       % Improvement
7 blocks    9100ns      3700ns      59%
3 blocks    5400ns      3200ns      41%
1 block     3700ns      2600ns      29%

* Revert detection of Overwolf Overlay

They claim this has since been fixed. We will re-examine in
the future if we see additional reports.

This reverts commit e7ea6f7.

* Bump version to 0.6.0-beta.5

* Bump version to 0.6.0-final

* Update render code for chunk status map

Fixes CaffeineMC#2881

* Rollup of fixes and improvements for cloud rendering

Some changes were made to cloud rendering in newer versions
that needed to be replicated in Sodium.

- The alpha cutoff for clouds was changed to (a < 10).
- Texture loading can now gracefully fail, and it is
expected that rendering is skipped when this happens.
- The movement/positioning of clouds was slightly changed.
- The render pass system now needs to be told about
render target usages (fixes CaffeineMC#2883).

This commit also improves mesh building time by around 35% on
a fast processor (AMD Ryzen HX AI 370) through various
micro-optimizations.

* Fix culling behavior between transparent and opaque blocks

Minecraft 1.21.2 changed some of the rules, and this was
causing the faces of transparent blocks to be rendered
even when they were hidden by full opaque blocks.

Fixes CaffeineMC#2850

* Fix precision issues in cloud rendering at far distances

* Use alternative workaround for NVIDIA drivers

The NVIDIA driver enables a driver feature called
"Threaded Optimizations" when it finds Minecraft,
which causes severe performance issues and sometimes
even crashes.

Newer versions of the driver seem to use a slightly
different heuristic which our workaround doesn't
address.

So, instead, use an alternative method that enables
GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably
disable the functionality *even if* the user has
configured it otherwise in their driver settings.

Additionally, on Windows, we now always indicate to
the driver that Minecraft is running, so that users
with hybrid graphics don't see regressed performance.

* Sort render lists for regions and sections after traversal (CaffeineMC#2780)

Render sections and regions are sorted after the graph traversal is performed. This decouples their ordering from the graph, which isn't entirely correct for draw call sorting.

Fixes CaffeineMC#2266

* Add support for new NeoForge fluid overlay API

* Ensure depth test is configured when rendering clouds

The state of the depth test prior to cloud rendering is
undefined. After rendering, it is expected to be
disabled again.

* Fix rounding error in ColorMixer#mix

Rounding of the values now happens after the 16-bit
intermediaries are added together.

This affected some animated textures, causing them to
exhibit flickering behavior.

* Add additional optimized block models

This covers the following additional blocks:
- Cauldrons
- Brewing Stands
- Bells

Co-authored-by: JellySquid <[email protected]>

* Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886)

Avoids rebuilding the render lists and doing a graph search
more often than necessary by checking if the section actually
changed in a way that's relevant to the graph search.

For worlds that update their blocks frequently (every tick or
every redstone tick) this avoids half the graph searches. Some
graph searches are still necessary to schedule rebuild tasks,
but when the task results come back, this doesn't do another
graph search unless the section's visibility data or build state
changed in a way that needs the render list to be updated.

* Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879)

This fixes some problems where very large block entities in
nearby sections may be incorrectly culled. But it does not
comprehensively fix the problem for all other sections,
since that would require visiting the 27-neighborhood of
every section, which is too slow.

* Bump version to 0.6.1

* Bump dependency versions

* Update mod manifest

* Ensure ItemRenderContext.isDefaultTranslucent is initialized

* Update compatible mods listing

* Update mod manifest to restrict Minecraft versions

* Update to Minecraft 1.21.4

* Update NeoForge manifest for Minecraft 1.21.4

* Fix glyph effect orientation

* Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907)

* Fix detection for specific Intel OpenGL ICDs

The OpenGL ICD name now includes the file extension,
which the regex expressions were not matching.

* Avoid showing the incompatible driver error in some cases

For systems with hybrid graphics, it may be the case
that an incompatible graphics driver is installed, but that
it isn't used for the OpenGL context.

We can avoid showing errors in this situation by checking
the vendor string of the context immediately after
creation.

This is not the most robust check, but in practice, a single
system should not have multiple graphics drivers installed
from the same vendor, so checking the string should be
relatively safe.

* Fix lambda mappings

* Bump version and dependency requirements

* Use correct coordinates for sorting chunk sections (CaffeineMC#2924)

Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin.

Closes CaffeineMC#2918

* Update README.md

---------

Co-authored-by: douira <[email protected]>
Co-authored-by: IMS212 <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: MeeniMc <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: IThundxr <[email protected]>
Co-authored-by: muzikbike <[email protected]>
ThatMG393 added a commit to ThatMG393/sodium that referenced this pull request Jan 3, 2025
* Correctly handle colorization on NeoForge

* Combine the vertex position attributes (CaffeineMC#2753)

This improves terrain rendering performance significantly
on Intel Xe-LP graphics under Linux.

* Add option for Fullscreen Resolution (CaffeineMC#2642)

The resolution controls would not fit in the allocated space, so the
rendering of slider controls was changed to enable rendering the slider
bar and the value text on separate lines.

Co-authored-by: MeeniMc <[email protected]>

* Only enable Fullscreen Resolution option on Windows

Additionally, adjust the rendering of the controls
to be less confusing when disabled, and provide an
explanation as to what the option does.

* Use consistent vertex ordering in entity rendering

Some core shaders were relying on the model part faces being
written out in a specific order. We still don't support
core shaders, but the fix here is trivial enough.

Fixes CaffeineMC#2745

* Add check for NeoForge per-quad AO flag

* Disable mod entrypoint on Forge when running on servers (CaffeineMC#2773)

* Fix excessively large allocations in chunk meshing

The requested capacity was being multiplied by the vertex
stride more than once, which resulted in far too much
memory being allocated.

Closes CaffeineMC#2792

* Fix some issues with Uint32 representation

This increases the maximum size of vertex and index buffers
to 4 billion elements, since the Uint32 types stored in memory are
now safely represented with Int64.

For vertex buffers, this increases their maximum size to 80 GiB,
and index buffers have a maximum size of 16 GiB, whereas both
were limited to 2 GiB prior.

* Fix cull bitmask ordering in entity rendering

Closes CaffeineMC#2788

* Add support for Maven Local publishing

* Fix incorrect warning message when D3DKMT is not supported

* Add Flawless Frames handler for NeoForge

* Add angle-based section visibility path occlusion (CaffeineMC#2811)

This eliminates 8-13% of the rendered sections at higher render distances on average in testing, and correspondingly reduces graph search time by a similar amount.

* Disable material downgrading on Intel Gen8 and older

Fixes CaffeineMC#2830

* Delay face normal calculation until use

This potentially fixes some cases of CaffeineMC#2835.

* Skip particle rendering optimizations for incompatible mods

Fixes CaffeineMC#2827

* Update project URLs in source and documentation

We're no longer a Fabric-exclusive mod, so let's get rid of
the suffix.

* Add third-party license notice for Fabric API

* Optimizations for some block models (CaffeineMC#2508)

Co-authored-by: muzikbike <[email protected]>

* Fix sorting failures on rotated cuboids (CaffeineMC#2812)

Use the accurate vertex positions for unaligned and aligned (but rotated) quads.

* Rework the Gradle build scripts for multi-loader

* Shared logic is moved into a build plugin where possible
* Build time is significantly improved when the Gradle daemon is warmed
* Mixins are remapped in-place now, eliminating the need for refmaps
  at runtime. This also gets rid of some warning messages at startup.
* Module relationships are now correctly represented in IDEA for other
  source sets (fixes a lot of code analysis features)
* Split Java source and resources into different configurations
* Run configurations are now consistent between NeoForge/Fabric
* The common project is no longer remapped unnecessarily
* Updated Gradle and build plugins

* Restore versioning schema to build script

* Make organization of platform mixin packages consistent

Fixes CaffeineMC#2688

* Exclude README documentation from processed resources

These files are only meant to be in the source distribution, and
Minecraft doesn't like them.

* Don't try to load a refmap from the mixin plugin

* Enumerate additional PCI classes in the graphics adapter probe

Some integrated GPUs, such as RDNA3.5, appear to use
the PCI_CLASS_DISPLAY_OTHER class.

* Remove KDE and GNOME specific backends for browsing URLs

The bugs with xdg-open have been resolved upstream and most
Linux distributions are shipping the patches.

Also, make sure we get a successful exit code from the XDG
implementation.

* Remove Minecraft from classpath of the pre-launch source set

This will help to avoid class-load issues and makes the code more
hygienic.

* Update to Minecraft 1.21.3

* Ensure tooltips are constrained to the screen (CaffeineMC#2845)

* Update authors and contributors list

* Remove leftover popPose

* Trust existing fog color

* Block the Overwolf Overlay due to graphical corruption

The overlay does not correctly restore the texture unit state
in OpenGL, which causes problems when Minecraft thinks a texture
has already been bound to a slot.

Since disabling the OpenGL state cache globally is not an
acceptable solution (it would severely hurt performance) and
their software doesn't give us any method to detect the
problematic version, we block all versions.

gep_minecraft.dll is the payload they actually inject, which
has no version information or description.

Fixes CaffeineMC#2862

* Avoid static memory allocation in EntityRenderer

Just allocate on the stack, since it's a small amount of
memory (<1 KiB) and avoids needing complex finalizers.

* Fix y-offset calculation for back face culling in cloud rendering (CaffeineMC#2864)

* Fix memory leak and double free in CloudRenderer

* Improve color mixing functions

The existing functions did not implement rounding
correctly (often leading to off-by-one errors).

Additionally, the improved variants are both slightly
faster and easier to understand.

* Fixup documentation in ColorMixer

* Unify color mixing/swizzling utilities

The Fabric integration code was re-implementing a lot
of the utilities that already exist in Sodium unnecessarily.

Also, improve the documentation so that ABGR and RGBA are
not used interchangeably.

* Add optimized function for bi-linear interpolation

This reduces the number of ALU ops significantly and
creates a common utility function in the project.
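
The commit message does not say how the operation count was reduced. One common approach, sketched here on plain floats with hypothetical names, is to nest three linear interpolations instead of computing four separate corner weights.

```java
final class Bilinear {
    static float lerp(float a, float b, float t) {
        return a + t * (b - a);
    }

    // Textbook form: four corner weights, each the product of two factors.
    static float weighted(float c00, float c10, float c01, float c11, float u, float v) {
        return c00 * (1 - u) * (1 - v)
             + c10 * u * (1 - v)
             + c01 * (1 - u) * v
             + c11 * u * v;
    }

    // Nested form: two interpolations along u, then one along v.
    static float nested(float c00, float c10, float c01, float c11, float u, float v) {
        return lerp(lerp(c00, c10, u), lerp(c01, c11, u), v);
    }

    public static void main(String[] args) {
        System.out.println(weighted(0f, 10f, 20f, 30f, 0.5f, 0.5f));
        System.out.println(nested(0f, 10f, 20f, 30f, 0.5f, 0.5f));
    }
}
```

Both forms compute the same value up to floating-point rounding; the nested form simply needs fewer multiplications.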

* Reduce time complexity for box blurs

Measuring the time spent per box blur in biome blending,
the following results were observed.

Radius      Before      After       % Improvement
7 blocks    9100ns      3700ns      59%
3 blocks    5400ns      3200ns      41%
1 block     3700ns      2600ns      29%
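
The commit message does not describe the technique used. As a sketch of one standard way to make a box blur's cost independent of the radius, the example below compares a naive one-dimensional blur with a running-sum version. The names and the clamp-to-edge behavior are assumptions made for illustration.

```java
final class BoxBlur1D {
    private static int clampIndex(int i, int n) {
        return Math.max(0, Math.min(n - 1, i));
    }

    // Naive version: re-sums the whole window for every output, O(n * radius).
    static int[] blurNaive(int[] src, int radius) {
        int n = src.length;
        int[] dst = new int[n];
        int window = 2 * radius + 1;
        for (int i = 0; i < n; i++) {
            long sum = 0;
            for (int k = -radius; k <= radius; k++) {
                sum += src[clampIndex(i + k, n)];
            }
            dst[i] = (int) (sum / window);
        }
        return dst;
    }

    // Running-sum version: each step drops one sample and adds one, O(n).
    static int[] blurRunningSum(int[] src, int radius) {
        int n = src.length;
        int[] dst = new int[n];
        if (n == 0) {
            return dst;
        }
        int window = 2 * radius + 1;
        long sum = 0;
        for (int k = -radius; k <= radius; k++) {
            sum += src[clampIndex(k, n)];
        }
        for (int i = 0; i < n; i++) {
            dst[i] = (int) (sum / window);
            sum -= src[clampIndex(i - radius, n)];     // sample leaving the window
            sum += src[clampIndex(i + radius + 1, n)]; // sample entering the window
        }
        return dst;
    }
}
```

Both functions produce the same output; only the amount of work per element differs.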

* Revert detection of Overwolf Overlay

They claim this has since been fixed. We will re-examine in
the future if we see additional reports.

This reverts commit e7ea6f7.

* Bump version to 0.6.0-beta.5

* Bump version to 0.6.0-final

* Update render code for chunk status map

Fixes CaffeineMC#2881

* Rollup of fixes and improvements for cloud rendering

Some changes were made to cloud rendering in newer versions
that needed to be replicated in Sodium.

- The alpha cutoff for clouds was changed to (a < 10).
- Texture loading can now gracefully fail, and it is
expected that rendering is skipped when this happens.
- The movement/positioning of clouds was slightly changed.
- The render pass system now needs to be told about
render target usages (fixes CaffeineMC#2883).

This commit also improves mesh building time by around 35% on
a fast processor (AMD Ryzen HX AI 370) through various
micro-optimizations.

* Fix culling behavior between transparent and opaque blocks

Minecraft 1.21.2 changed some of the rules, and this was
causing the faces of transparent blocks to be rendered
even when they were hidden by full opaque blocks.

Fixes CaffeineMC#2850

* Fix precision issues in cloud rendering at far distances

* Use alternative workaround for NVIDIA drivers

The NVIDIA driver enables a driver feature called
"Threaded Optimizations" when it finds Minecraft,
which causes severe performance issues and sometimes
even crashes.

Newer versions of the driver seem to use a slightly
different heuristic which our workaround doesn't
address.

So, instead, use an alternative method that enables
GL_DEBUG_OUTPUT_SYNCHRONOUS. This seems to reliably
disable the functionality *even if* the user has
configured it otherwise in their driver settings.

Additionally, on Windows, we now always indicate to
the driver that Minecraft is running, so that users
with hybrid graphics don't see regressed performance.

* Sort render lists for regions and sections after traversal (CaffeineMC#2780)

Render sections and regions are now sorted after the graph traversal is performed. This decouples their ordering from the graph traversal order, which isn't entirely correct for draw call sorting.

Fixes CaffeineMC#2266

* Add support for new NeoForge fluid overlay API

* Ensure depth test is configured when rendering clouds

The state of the depth test prior to cloud rendering is
undefined. After rendering, it is expected to be
disabled again.

* Fix rounding error in ColorMixer#mix

Rounding of the values now happens after the 16-bit
intermediaries are added together.

This affected some animated textures, causing them to
exhibit flickering behavior.
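
To illustrate why the order of rounding matters, here is a sketch for a single 8-bit channel. It is not the actual `ColorMixer` code: the names, the 0..255 weight convention, and the division helper are assumptions. Rounding each weighted term separately can be off by one compared to rounding the 16-bit sum once.

```java
final class ChannelMix {
    // Rounded division by 255, valid for 0 <= x <= 65025 (255 * 255).
    static int div255Round(int x) {
        int t = x + 128;
        return (t + (t >>> 8)) >>> 8;
    }

    // Rounds each weighted term on its own, then adds (error-prone).
    static int mixRoundEach(int a, int b, int weightA) {
        return div255Round(a * weightA) + div255Round(b * (255 - weightA));
    }

    // Adds the 16-bit intermediaries first, then rounds once.
    static int mixRoundSum(int a, int b, int weightA) {
        return div255Round(a * weightA + b * (255 - weightA));
    }

    public static void main(String[] args) {
        System.out.println("round each: " + mixRoundEach(1, 2, 128));
        System.out.println("round sum:  " + mixRoundSum(1, 2, 128));
    }
}
```

For the inputs in `main`, the exact result is 382 / 255, which is just under 1.5, so rounding once gives 1 while rounding each term gives 2.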

* Add additional optimized block models

This covers the following additional blocks:
- Cauldrons
- Brewing Stands
- Bells

Co-authored-by: JellySquid <[email protected]>

* Avoid marking the section graph as dirty if state didn't change (CaffeineMC#2886)

Avoids rebuilding the render lists and doing a graph search
more often than necessary by checking if the section actually
changed in a way that's relevant to the graph search.

For worlds that update their blocks frequently (every tick or
every redstone tick) this avoids half the graph searches. Some
graph searches are still necessary to schedule rebuild tasks,
but when the task results come back, this doesn't do another
graph search unless the section's visibility data or build state
changed in a way that needs the render list to be updated.
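
The check described above can be sketched roughly as follows. All names here (`GraphDirtyTracker`, `SectionState`, `visibilityData`, `buildFlags`) are hypothetical stand-ins, not Sodium's actual classes or fields; the point is only that the graph is marked dirty when a build result changes something the search depends on.

```java
final class GraphDirtyTracker {
    // Hypothetical per-section state that the graph search depends on.
    static final class SectionState {
        long visibilityData;
        int buildFlags;
    }

    // The first frame always needs a search.
    private boolean graphDirty = true;

    // Applies a finished build task's result. The graph is only marked
    // dirty if the visibility data or build flags actually changed.
    boolean applyBuildResult(SectionState section, long newVisibilityData, int newBuildFlags) {
        boolean changed = section.visibilityData != newVisibilityData
                || section.buildFlags != newBuildFlags;
        section.visibilityData = newVisibilityData;
        section.buildFlags = newBuildFlags;
        if (changed) {
            graphDirty = true;
        }
        return changed;
    }

    // Returns whether a graph search is needed and clears the flag.
    boolean consumeDirty() {
        boolean dirty = graphDirty;
        graphDirty = false;
        return dirty;
    }
}
```

In this sketch, a block update that rebuilds a section without altering its visibility data or build flags leaves the flag clear, so the next frame can reuse the existing render lists.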

* Use larger bounding box for nearby sections in frustum check (CaffeineMC#2879)

This fixes some problems where very large block entities in
nearby sections may be incorrectly culled. But it does not
comprehensively fix the problem for all other sections,
since that would require visiting the 27-neighborhood of
every section, which is too slow.

* Bump version to 0.6.1

* Bump dependency versions

* Update mod manifest

* Ensure ItemRenderContext.isDefaultTranslucent is initialized

* Update compatible mods listing

* Update mod manifest to restrict Minecraft versions

* Update to Minecraft 1.21.4

* Update NeoForge manifest for Minecraft 1.21.4

* Fix glyph effect orientation

* Fix hidden surface elimination in fluid rendering for waterlogged blocks (CaffeineMC#2907)

* Fix detection for specific Intel OpenGL ICDs

The OpenGL ICD name now includes the file extension,
which the regular expressions were not matching.

* Avoid showing the incompatible driver error in some cases

For systems with hybrid graphics, it may be the case
that an incompatible graphics driver is installed, but that
it isn't used for the OpenGL context.

We can avoid showing errors in this situation by checking
the vendor string of the context immediately after
creation.

This is not the most robust check, but in practice, a single
system should not have multiple graphics drivers installed
from the same vendor, so checking the string should be
relatively safe.

* Fix lambda mappings

* Bump version and dependency requirements

* Use correct coordinates for sorting chunk sections (CaffeineMC#2924)

Fix section and region sorting by using the correct section coordinate instead of the integer part of the camera transform, which is incorrect near the origin.
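
The commit message does not show the affected code. As one way such a discrepancy can arise, the sketch below assumes that the integer part is obtained by truncation toward zero, which disagrees with the floored section coordinate for small negative positions. The names are hypothetical.

```java
final class SectionCoords {
    // Section coordinate derived from a truncated integer part (assumed behavior).
    static int fromTruncatedInt(double cameraPos) {
        return ((int) cameraPos) >> 4;
    }

    // Section coordinate derived from the floored block position.
    static int fromFlooredPos(double cameraPos) {
        return ((int) Math.floor(cameraPos)) >> 4;
    }

    public static void main(String[] args) {
        double x = -0.5;
        System.out.println("truncated: " + fromTruncatedInt(x));
        System.out.println("floored:   " + fromFlooredPos(x));
    }
}
```

At `x = -0.5` the truncated variant reports section 0, while the camera is actually inside section -1.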

Closes CaffeineMC#2918

* Update README.md

* Update textures for Leaves variants

* Added "Pale Oak Leaves" for Minecraft 1.21.4.
* Reduced the file size of all block textures.

* Clean up BitArray class

* Use link.caffeinemc.net domain for some URLs

* Use ShellExecuteW from message box callbacks

This fixes a regression caused by 26f4263.

The underlying problem is that accessing Java's AWT *after*
LWJGL3 has initialized is not possible.

Minecraft has a utility class which uses rundll32 internally,
but we cannot access that due to classloader restrictions on
NeoForge.

That leaves us with having to implement the call ourselves,
and simply using Shell32 directly (like we do for other
Windows APIs) seems easiest.

* Do not bake ambient lighting into cached per-face light data

Fixes CaffeineMC#2806

* Switch to Parchment mappings 2024.12.07

* Bump version and dependencies

* Do not cache ambient brightness at initialization

The world may not be assigned to the renderer at
initialization, which is the case for non-terrain
rendering (e.g. block entities).

Likely, there is no performance benefit to caching
this data in the first place, so the easiest solution
is to just remove the code.

* Bump version to 0.6.5

* Fix transforms not being run on fast path

* Bump version to 0.6.6

* Rename model texture references to match Vanilla (CaffeineMC#2958)

Custom models which extended these base models would not properly have their textures applied, as the texture references were accidentally changed.

* Do not apply optimizations to sprites with special tickers

* Aggressively optimize entity rendering

This is anywhere from 10 to 15% faster depending
on what entities are being rendered.

Most of the improvements come from more efficiently
laying out the cuboid data and coalescing neighboring
32-bit values into 64-bit words.

Furthermore, vertex positions are calculated by
extracting vectors from the pose matrix and adding
them to the origin vertex, which avoids many
matrix multiplications.
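
A sketch of the vector-extraction idea on plain float arrays, assuming a row-major 3x4 affine matrix. The names and layout are hypothetical and this is not Sodium's implementation. The origin corner is transformed once, and the remaining corners are formed by adding the matrix's column vectors scaled by the cuboid's extents.

```java
final class CuboidCorners {
    // m is a row-major 3x4 affine matrix: element (row, col) is m[row * 4 + col].
    static float[] transform(float[] m, float x, float y, float z) {
        return new float[] {
            m[0] * x + m[1] * y + m[2] * z + m[3],
            m[4] * x + m[5] * y + m[6] * z + m[7],
            m[8] * x + m[9] * y + m[10] * z + m[11]
        };
    }

    // Reference: one full matrix transform per corner.
    static float[][] cornersDirect(float[] m, float[] origin, float[] size) {
        float[][] out = new float[8][];
        for (int i = 0; i < 8; i++) {
            float x = origin[0] + ((i & 1) != 0 ? size[0] : 0f);
            float y = origin[1] + ((i & 2) != 0 ? size[1] : 0f);
            float z = origin[2] + ((i & 4) != 0 ? size[2] : 0f);
            out[i] = transform(m, x, y, z);
        }
        return out;
    }

    // One transform for the origin, then only vector additions.
    static float[][] cornersIncremental(float[] m, float[] origin, float[] size) {
        float[] o = transform(m, origin[0], origin[1], origin[2]);
        float[][] out = new float[8][3];
        for (int i = 0; i < 8; i++) {
            for (int row = 0; row < 3; row++) {
                float v = o[row];
                if ((i & 1) != 0) v += m[row * 4] * size[0];     // scaled X column
                if ((i & 2) != 0) v += m[row * 4 + 1] * size[1]; // scaled Y column
                if ((i & 4) != 0) v += m[row * 4 + 2] * size[2]; // scaled Z column
                out[i][row] = v;
            }
        }
        return out;
    }
}
```

A production version would precompute the three scaled column vectors once per cuboid rather than per corner; they are left inline here to keep the sketch short.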

Co-authored-by: MoePus <[email protected]>

* Avoid quaternion transforms in particle rendering

The billboard geometry can be computed using the
camera's left and up vectors, saving some cycles.

When rendering thousands of billboard particles, this
was ~10% faster than baseline in my observation.
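
A rough sketch of the billboard construction described above, using plain float arrays. The names, corner order, and half-size parameter are assumptions for illustration. Each corner is the particle center offset along the camera's left and up vectors, so no per-particle quaternion rotation is needed.

```java
final class BillboardQuad {
    // Corner signs along (left, up); the winding order is chosen arbitrarily here.
    private static final int[][] SIGNS = {{-1, -1}, {-1, 1}, {1, 1}, {1, -1}};

    // Returns four corners, each as {x, y, z}.
    static float[][] corners(float[] center, float[] left, float[] up, float halfSize) {
        float[][] out = new float[4][3];
        for (int i = 0; i < 4; i++) {
            for (int axis = 0; axis < 3; axis++) {
                out[i][axis] = center[axis]
                        + SIGNS[i][0] * halfSize * left[axis]
                        + SIGNS[i][1] * halfSize * up[axis];
            }
        }
        return out;
    }
}
```

Since the camera vectors are the same for every particle in a frame, they only need to be computed once per frame.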

Co-authored-by: MoePus <[email protected]>

* Add shader source line annotations (CaffeineMC#2691)

---------

Co-authored-by: IMS212 <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: MeeniMc <[email protected]>
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: IThundxr <[email protected]>
Co-authored-by: muzikbike <[email protected]>
Co-authored-by: MoePus <[email protected]>
ThatMG393 added a commit to ThatMG393/sodium that referenced this pull request Jan 3, 2025
Co-authored-by: JellySquid <[email protected]>
Co-authored-by: muzikbike <[email protected]>
Co-authored-by: douira <[email protected]>
Co-authored-by: IMS212 <[email protected]>
Co-authored-by: MoePus <[email protected]>
ThatMG393 pushed a commit to ThatMG393/sodium that referenced this pull request Jan 4, 2025
…eineMC#2886)
Labels
T-enhancement Type: Enhancement