Move output swizzling pass before fusions #1651

Open
wants to merge 4 commits into base: develop

Commits on Sep 13, 2024

  1. Move output swizzling pass before fusions

    * Move the output fusion swizzling before fusions
    * Due to the tension between LDS consolidation and multibuffering,
    remove the reuse-lds call before the output swizzle. As a consequence,
    remove the "increasing total LDS usage" heuristic from output swizzle
    enablement, since it should probably be fine
    * Fix an issue where fusion traversal wasn't working correctly,
    resulting in insufficiently vectorized writes to global memory despite
    previous attempts to fix the issue
    * Fix a test that wasn't using i8 LDS
    * Update the packed arithmetic test to check for vectorized writes
    * Add a guard in case the ExistingOps strictness is still letting LDS
    writes into the output swizzle rewrite
    krzysz00 committed Sep 13, 2024
    c989d45

Commits on Sep 17, 2024

  1. Move output swizzling pass before fusions

    * Move the output fusion swizzling before fusions
    * Due to the tension between LDS consolidation and multibuffering,
    remove the reuse-lds call before the output swizzle. As a consequence,
    remove the "increasing total LDS usage" heuristic from output swizzle
    enablement, since it should probably be fine
    * Fix an issue where fusion traversal wasn't working correctly,
    resulting in insufficiently vectorized writes to global memory despite
    previous attempts to fix the issue
    * Fix a test that wasn't using i8 LDS
    * Update the packed arithmetic test to check for vectorized writes
    * Add a guard in case the ExistingOps strictness is still letting LDS
    writes into the output swizzle rewrite
    krzysz00 authored and dhernandez0 committed Sep 17, 2024
    a4b4ea4
  2. Refactor Reuse LDS to be able to keep the same heuristic and also revert the enableApplicability change
    dhernandez0 committed Sep 17, 2024
    4cc423b
  3. 7f45086