Add equivalence tester

By default, SLOTHY includes a 'selfcheck' verifying that the DFGs of input and output are isomorphic with respect to the rescheduling found by SLOTHY. This selfcheck is very powerful and should catch most modelling errors in SLOTHY. However, some aspects of SLOTHY remain outside of the scope of the DFG-based selfcheck: - Address offset fixup This is a historically rather error prone feature to which the self-check is blind (since it relies on making the DFG blind to address increments). - Loop boundaries If SW pipelining is enabled, the selfcheck checks equivalence of a _bounded unrolling_ of the input and output loops. Therefore, it would not catch issues with (a) the loop boundary, or (b) exceptional iterations for low loop counts. While it is not SLOTHY's goal to provide formal guarantees of correctness of the optimizations -- this should be gauged/confirmed through extensive test and ideally formal verification -- development would be faster if bugs in the aforementioned areas were caught before the developer integrates and tests the SLOTHY output manually. As a step towards catching more potential pitfalls early in the SLOTHY workflow, this commit starts the development of an _equivalence tester_. The idea of the equivalence tester is simple: Just run original and optimized assembly with various random inputs, and check that they indeed lead to equivalent outputs. This commit implements this idea for AArch64 and Armv7-M, leveraging LLVM-MC as an assembler and the unicorn-engine for emulation. An initial limitation is support for loops: For now, the equivalence checker tests bounded unrollings of loops, just like the selfcheck. That is against the point of the equivalenc check, but it will be improved at a later point, after some refactoring of the loop generation code. The checker added in this commit is able to detect issues with address offset fixup, however, which would previously have gone unnoticed by the selfcheck. The new equivalence test is configured via config.selftest, which is enabled by default, but silently skipped for unsupported platforms or target architectures.
slothy-optimizer · Dec 8, 2024 · 87e6043 · 87e6043
1 parent 2ebb278
commit 87e6043
Show file tree

Hide file tree

Showing 9 changed files with 455 additions and 52 deletions.
diff --git a/.github/actions/setup-ubuntu/action.yml b/.github/actions/setup-ubuntu/action.yml
@@ -0,0 +1,33 @@
+name: Setup ubuntu
+description: Setup ubuntu
+
+inputs:
+  packages:
+    description: Space-separated list of additional packages to install
+    required: false
+    default: 'llvm llvm-runtime'
+
+runs:
+  using: composite
+  steps:
+    - name: Update package repository
+      shell: bash
+      run: |
+        sudo apt-get update
+    - name: Install base packages
+      shell: bash
+      run: |
+        sudo apt-get install python3-venv python3-pip make -y
+    - name: Install additional packages
+      if: ${{ inputs.packages != ''}}
+      shell: bash
+      run: |
+        sudo apt-get install ${{ inputs.packages }} -y
+    - name: Setup Python venv
+      shell: bash
+      run: |
+        python3 -m venv venv
+        source venv/bin/activate
+        python3 -m pip install -r requirements.txt
+        deactivate
+        echo "$(pwd)/venv/bin/" >> "$GITHUB_PATH"
diff --git a/.github/workflows/test_basic.yaml b/.github/workflows/test_basic.yaml
@@ -12,16 +12,12 @@ jobs:
             github.event.pull_request.user.login == 'mkannwischer'
             }}
     runs-on: ubuntu-latest
-    strategy: 
-      matrix:      
+    strategy:
+      matrix:
         target: [slothy.targets.arm_v7m.cortex_m7,slothy.targets.arm_v81m.cortex_m55r1, slothy.targets.arm_v81m.cortex_m85r1, slothy.targets.aarch64.cortex_a55, slothy.targets.aarch64.cortex_a72_frontend, slothy.targets.aarch64.apple_m1_firestorm_experimental, slothy.targets.aarch64.apple_m1_icestorm_experimental]
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         python3 example.py --dry-run --only-target=${{ matrix.target }}
@@ -34,11 +30,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run tutorial
       run: |
         (cd tutorial && ./tutorial_all.sh)
@@ -51,11 +43,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         python3 example.py --examples simple0,simple1,simple0_loop,simple1_loop
@@ -68,11 +56,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         python3 example.py --examples ntt_kyber_1_23_45_67_m55,ntt_dilithium_12_34_56_78_m55 --timeout=300
@@ -85,11 +69,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         python3 example.py --examples ntt_kyber_123_4567_a55,ntt_dilithium_123_45678_a55 --timeout=300
@@ -102,11 +82,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         (cd paper/scripts && NO_LOG=Y ./slothy_sqmag.sh)
@@ -119,11 +95,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v3
-    - name: Install python dependencies
-      run: |
-        python3 -m venv venv
-        ./venv/bin/python3 -m pip install -r requirements.txt
-        echo BASH_ENV="./venv/bin/activate" >> $GITHUB_ENV
+    - uses: ./.github/actions/setup-ubuntu
     - name: Run examples
       run: |
         (cd paper/scripts && NO_LOG=Y ./slothy_fft.sh)
diff --git a/requirements.txt b/requirements.txt
@@ -2,3 +2,4 @@ numpy==1.26.4
 ortools==9.7.2996
 pandas==2.1.1
 sympy==1.12
+unicorn=2.1.1
diff --git a/slothy/core/config.py b/slothy/core/config.py
@@ -100,6 +100,52 @@ def reserved_regs_are_locked(self):
         to eliminate uses of particular registers from some piece of assembly."""
         return self._reserved_regs_are_locked
 
+    @property
+    def selftest(self):
+        """Indicates whether SLOTHY performs an empirical equivalence-test on the
+        optimization results.
+
+        When this is set, and if the target architecture and host platform support it,
+        this will run an empirical equivalence checker trying to confirm that the
+        input and output of SLOTHY are likely functionally equivalent.
+
+        The primary purpose of this checker is to detect issue that would presently
+        be overlooked by the selfcheck:
+        - The selfcheck is currently blind to address offset fixup. If something goes
+          wrong, the input and output will not be functionally equivalent, but we would
+          only notice once we actually compile and run the code. The selftest will
+          likely catch issues.
+        - When using software pipelining, the selfcheck reduces to a straightline check
+          for a bounded unrolling of the loop. An unbounded selfcheck is currently not
+          implemented.
+          With the selftest, you still need to fix a loop bound, but at least you can
+          equivalence-check the loop-form (including the compare+branch instructions
+          at the loop boundary) rather than the unrolled code.
+
+        DEPENDENCY: To run this, you need `llvm-mc` the binary in your path or configured
+        as via `llvm_mc_binary`, and `unicorn-engine` Python bindings setup.
+
+        NOTE: This is so far implemented as a repeated randomized test -- nothing clever.
+        """
+        return self._selftest
+
+    @property
+    def selftest_iterations(self):
+        """If selftest is set, indicates the number of random selftest to conduct"""
+        return self._selftest_iterations
+
+    @property
+    def selftest_address_gprs(self):
+        """Dictionary of (reg, sz) items indicating which registers are assumed to be
+        pointers to memory, and if so, of what size."""
+        return self._selftest_address_gprs
+
+    @property
+    def selftest_default_memory_size(self):
+        """Default buffer size to use for registers which are automatically inferred to be
+        used as pointers and for which no memory size has been configured via `address_gprs`."""
+        return self._selftest_default_memory_size
+
     @property
     def selfcheck(self):
         """Indicates whether SLOTHY performs a self-check on the optimization result.
@@ -431,6 +477,13 @@ def llvm_mca_binary(self):
         is set."""
         return self._llvm_mca_binary
 
+    @property
+    def llvm_mc_binary(self):
+        """The llvm-mc binary to be used for assembling output data
+
+        This is only relevant if `selftest` is set."""
+        return self._llvm_mc_binary
+
     @property
     def timeout(self):
         """The timeout in seconds after which the underlying constraint solver stops
@@ -1143,6 +1196,10 @@ def __init__(self, Arch, Target):
         self._reserved_regs = None
         self._reserved_regs_are_locked = True
 
+        self._selftest = True
+        self._selftest_iterations = 10
+        self._selftest_address_gprs = None
+        self._selftest_default_memory_size = 1024
         self._selfcheck = True
         self._selfcheck_failure_logfile = None
         self._allow_useless_instructions = False
@@ -1172,6 +1229,7 @@ def __init__(self, Arch, Target):
         self._compiler_binary = "gcc"
         self._compiler_include_paths = None
         self._llvm_mca_binary = "llvm-mca"
+        self._llvm_mc_binary = "llvm-mc"
 
         self.keep_tags = True
         self.inherit_macro_comments = False
@@ -1260,6 +1318,20 @@ def reserved_regs_are_locked(self,val):
     @variable_size.setter
     def variable_size(self,val):
         self._variable_size = val
+    @selftest.setter
+    def selftest(self,val):
+        if hasattr(self.arch, "Checker") is False:
+            raise InvalidConfig("Trying to enable checker, but architecture model does not seem to support it")
+        self._selftest = val
+    @selftest_iterations.setter
+    def selftest_iterations(self,val):
+        self._selftest_iterations = val
+    @selftest_address_gprs.setter
+    def selftest_address_gprs(self,val):
+        self._selftest_address_gprs = val
+    @selftest_default_memory_size.setter
+    def selftest_default_memory_size(self,val):
+        self._selftest_default_memory_size = val
     @selfcheck.setter
     def selfcheck(self,val):
         self._selfcheck = val
@@ -1308,6 +1380,9 @@ def compiler_include_paths(self, val):
     @llvm_mca_binary.setter
     def llvm_mca_binary(self, val):
         self._llvm_mca_binary = val
+    @llvm_mc_binary.setter
+    def llvm_mc_binary(self, val):
+        self._llvm_mc_binary = val
     @timeout.setter
     def timeout(self, val):
         self._timeout = val