Add equivalence tester #108

hanno-becker · 2024-11-25T06:12:29Z

By default, SLOTHY includes a 'selfcheck' verifying that the DFGs
of input and output are isomorphic with respect to the rescheduling
found by SLOTHY. This selfcheck is very powerful and should catch most
modelling errors in SLOTHY.

However, some aspects of SLOTHY remain outside of the scope of the
DFG-based selfcheck:

Address offset fixup

This is a historically rather error prone feature to which the self-check
is blind (since it relies on making the DFG blind to address increments).
Loop boundaries

If SW pipelining is enabled, the selfcheck checks equivalence of a
bounded unrolling of the input and output loops.

Therefore, it would not catch issues with (a) the loop boundary, or
(b) exceptional iterations for low loop counts.

While it is not SLOTHY's goal to provide formal guarantees of correctness
of the optimizations -- this should be gauged/confirmed through extensive test
and ideally formal verification -- development would be faster if bugs
in the aforementioned areas were caught before the developer integrates
and tests the SLOTHY output manually.

As a step towards catching more potential pitfalls early in the SLOTHY
workflow, this commit starts the development of an equivalence tester.

The idea of the equivalence tester is simple: Just run original and optimized
assembly with various random inputs, and check that they indeed lead to
equivalent outputs.

This commit implements this idea for AArch64 and Armv7-M, leveraging
LLVM-MC as an assembler and the unicorn-engine for emulation.

An initial limitation is support for loops: For now, the equivalence
checker tests bounded unrollings of loops, just like the selfcheck. That is
against the point of the equivalenc check, but it will be improved at a later
point, after some refactoring of the loop generation code.

The checker added in this commit is able to detect issues with address offset
fixup, however, which would previously have gone unnoticed by the selfcheck.

The new equivalence test is configured via config.selftest, which is enabled
by default, but silently skipped for unsupported platforms or target architectures.

dop-amin

I really like how well the feature is integrated with slothy already.

As you write, there's room for improvment regarding the abstraction of the approach. I think things like the number/types of registers etc. could be inferred from the Arch models automatically for example.

I tested the checker with some small examples and also introduced mistakes on purpose that were caught.
Apart from the suggested changes to the code, there have been no issues for me.

Possibly, adding an example in example.py could be nice to make it complete.

tests/aarch64/checker/README.md

tests/aarch64/checker/checker.c

slothy/targets/aarch64/aarch64_neon.py

hanno-becker · 2024-11-28T12:15:08Z

Marking this is as draft because @dop-amin had promising ideas on how to achieve the goals of this PR by reusing the Unicorn Framework.

By default, SLOTHY includes a 'selfcheck' verifying that the DFGs of input and output are isomorphic with respect to the rescheduling found by SLOTHY. This selfcheck is very powerful and should catch most modelling errors in SLOTHY. However, some aspects of SLOTHY remain outside of the scope of the DFG-based selfcheck: - Address offset fixup This is a historically rather error prone feature to which the self-check is blind (since it relies on making the DFG blind to address increments). - Loop boundaries If SW pipelining is enabled, the selfcheck checks equivalence of a _bounded unrolling_ of the input and output loops. Therefore, it would not catch issues with (a) the loop boundary, or (b) exceptional iterations for low loop counts. While it is not SLOTHY's goal to provide formal guarantees of correctness of the optimizations -- this should be gauged/confirmed through extensive test and ideally formal verification -- development would be faster if bugs in the aforementioned areas were caught before the developer integrates and tests the SLOTHY output manually. As a step towards catching more potential pitfalls early in the SLOTHY workflow, this commit starts the development of an _equivalence tester_. The idea of the equivalence tester is simple: Just run original and optimized assembly with various random inputs, and check that they indeed lead to equivalent outputs. This commit implements this idea for AArch64 and Armv7-M, leveraging LLVM-MC as an assembler and the unicorn-engine for emulation. An initial limitation is support for loops: For now, the equivalence checker tests bounded unrollings of loops, just like the selfcheck. That is against the point of the equivalenc check, but it will be improved at a later point, after some refactoring of the loop generation code. The checker added in this commit is able to detect issues with address offset fixup, however, which would previously have gone unnoticed by the selfcheck. The new equivalence test is configured via config.selftest, which is enabled by default, but silently skipped for unsupported platforms or target architectures.

hanno-becker · 2024-12-09T06:46:50Z

@dop-amin Can you pick up the global selftest for Armv7-M? The following should be made work

python3 example.py --example armv7m_simple0_func_m7

At the moment, it passes the local selftest for me, but the global selftest ƒails with unmapped memory or invalid instruction errors -- it looks like it is missing the function return branch. My suspicion is that I am messing up and somehow llvm-mc (assembling) and unicorn (emulation) aren't called in a compatible configuration. This is configured in arch_v7m.py

llvm_mca_arch = "arm"
llvm_mc_arch = "arm" ### TODO: What to put here?
llvm_mc_attr = "armv5te,thumb2,dsp" ### TODO: What to put here?

unicorn_arch = UC_ARCH_ARM
unicorn_mode = UC_MODE_THUMB

This commit prototypes an implementation of a 'global' selftest, building and emulating the entire source code loaded into SLOTHY and checking that input and output behave equivalently. To use, call `slothy.global_selftest(...)` at any point during stateful SLOTHY operations. Then, the global selftest will compare the current state of the code with the original state. See ntt_kyber_123_4567 for an example.

dop-amin reviewed Nov 27, 2024

View reviewed changes

tests/aarch64/checker/README.md Outdated Show resolved Hide resolved

tests/aarch64/checker/checker.c Outdated Show resolved Hide resolved

slothy/targets/aarch64/aarch64_neon.py Outdated Show resolved Hide resolved

hanno-becker marked this pull request as draft November 28, 2024 12:14

hanno-becker force-pushed the equiv_test branch 6 times, most recently from c96bc14 to 87e6043 Compare December 8, 2024 06:19

hanno-becker force-pushed the equiv_test branch from 87e6043 to f6c6404 Compare December 8, 2024 06:20

hanno-becker marked this pull request as ready for review December 8, 2024 06:28

hanno-becker force-pushed the equiv_test branch 3 times, most recently from 495860c to c65d7b9 Compare December 9, 2024 06:44

hanno-becker changed the title ~~AArch64: Add equivalence tester~~ Add equivalence tester Dec 9, 2024

hanno-becker force-pushed the equiv_test branch 3 times, most recently from a359d3f to 76cadea Compare December 9, 2024 10:37

hanno-becker force-pushed the equiv_test branch from 76cadea to 0e04f48 Compare December 9, 2024 10:40

hanno-becker merged commit c31b6b3 into main Dec 9, 2024
13 checks passed

hanno-becker deleted the equiv_test branch December 9, 2024 11:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add equivalence tester #108

Add equivalence tester #108

hanno-becker commented Nov 25, 2024 •

edited

Loading

dop-amin left a comment

hanno-becker commented Nov 28, 2024

hanno-becker commented Dec 9, 2024

Add equivalence tester #108

Add equivalence tester #108

Conversation

hanno-becker commented Nov 25, 2024 • edited Loading

dop-amin left a comment

Choose a reason for hiding this comment

hanno-becker commented Nov 28, 2024

hanno-becker commented Dec 9, 2024

hanno-becker commented Nov 25, 2024 •

edited

Loading