Remove absolute registry paths from panic messages in wasm #1594

Merged
merged 8 commits into stellar:main
Nov 3, 2024

Conversation

brson
Contributor

@brson brson commented Sep 10, 2024

What

This uses the --remap-path-prefix option to rustc to eliminate absolute paths embedded in the wasm.

Fixes #1445

Why

This is generally required to make builds reproducible across different machines.

These absolute paths can appear for both debuginfo and panic messages, though in our case since we are compiling in release mode and stripping, in practice this only affects panic messages.

The comments in the patch are fairly thorough as to why and how the patch works, but there are some things to note.

Known limitations

This relies on setting CARGO_ENCODED_RUSTFLAGS, which is similar to RUSTFLAGS, and which cargo interprets with higher priority. As of now this is the only way to pass the required flag to all dependencies, and crates.io dependencies are where this problem appears specifically (we can't just pass -- --remap-path-prefix to cargo rustc).

By setting one of the RUSTFLAGS vars, cargo ignores any build.rustflags and similar configuration options on the local filesystem. Detecting this situation and adapting to it is challenging and probably involves using cargo's own API as a dependency.
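A minimal sketch of how a value for CARGO_ENCODED_RUSTFLAGS could be assembled, assuming cargo's documented convention that flags in that variable are separated by the 0x1f byte. The function names are illustrative and are not taken from the patch.

```rust
use std::path::{Path, MAIN_SEPARATOR};

/// Separator cargo expects between flags in CARGO_ENCODED_RUSTFLAGS
/// (the ASCII "unit separator", 0x1f).
const FLAG_SEPARATOR: &str = "\x1f";

/// Hypothetical helper: builds a `--remap-path-prefix` flag that maps
/// `<cargo_home>/registry/src/` to the empty string.
fn remap_registry_flag(cargo_home: &Path) -> String {
    let registry_src = cargo_home.join("registry").join("src");
    // The trailing separator makes the remapped path start at the
    // `index.crates.io-...` directory rather than at a leading slash.
    format!(
        "--remap-path-prefix={}{}=",
        registry_src.display(),
        MAIN_SEPARATOR
    )
}

/// Hypothetical helper: joins individual flags into one encoded value.
/// Because the separator is 0x1f, flags may contain spaces.
fn encode_rustflags(flags: &[String]) -> String {
    flags.join(FLAG_SEPARATOR)
}
```

Because the separator is not whitespace, a cargo home such as `C:\Users\First Last\.cargo` would survive this encoding. The cost is the one described above: once this variable is set, cargo ignores `build.rustflags`.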

I do not quite understand the conditions under which the problem this addresses actually occurs, and don't know how to succinctly construct a test case. Panic messages in the local workspace do not exhibit the problem, nor do those in the pre-compiled standard library (which is already compiled in a similar way). Triggering the problem requires using dependencies from crates.io; and possibly crates.io crates that call into the core library where a panic is located - I'm not sure.

I know it occurs for the eth_abi example, which depends on some third-party crates.

The data section of the eth_abi wat before and after this patch:

  (data (;0;) (i32.const 1048576) "/home/brian/.cargo/registry/src/index.crates.io-6f17d22bba15001f/alloy-sol-types-0.6.3/src/types/data_type.rs/home/brian/.cargo/registry/src/index.crates.io-6f17d22bba15001f/soroban-sdk-21.6.0/src/bytes.rs\00\00\00m\00\10\00`\00\00\00>\02\00\00\0d\00\00\00\00\00\10\00m\00\00\00\ed\03\00\00\01\00\00\00/home/brian/.cargo/registry/src/index.crates.io-6f17d22bba15001f/alloy-sol-types-0.6.3/src/abi/decoder.rs\00\00\00\f0\00\10\00i\00\00\00\ad\00\00\00\1b\00\00\00\f0\00\10\00i\00\00\00\ba\00\00\00R\00\00\00\00\00\10\00m\00\00\00\a9\00\00\00\0d\00\00\00\00\00\10\00m\00\00\00\a9\00\00\00\13\00\00\00\00\00\00\00\00\00\00\00\01\00\00\00\06\00\00\00called `Result::unwrap()` on an `Err` valueTryFromSliceErrorsrc/lib.rs\00\00\e8\01\10\00\0a\00\00\00#\00\00\00-\00\00\00capacity overflow\00\00\00\04\02\10\00\11\00\00\00library/alloc/src/raw_vec.rs \02\10\00\1c\00\00\00\19\00\00\00\05\00\00\00()attempt to add with overflow\00\00N\02\10\00\1c\00\00\00attempt to multiply with overflow\00\00\00t\02\10\00!\00\00\00)\00\00\00\01\00\00\00\00\00\00\00explicit panic\00\00\ac\02\10\00\0e\00\00\00: \00\00\01\00\00\00\00\00\00\00\c4\02\10\00\02\00\00\00\00\00\00\00\0c\00\00\00\04\00\00\00\07\00\00\00\08\00\00\00\09\00\00\00    ,\0a((\0a00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899 out of range for slice of length range end index \00\e3\03\10\00\10\00\00\00\c1\03\10\00\22\00\00\00source slice length () does not match destination slice length (\04\04\10\00\15\00\00\00\19\04\10\00+\00\00\00\a0\02\10\00\01\00\00\00/home/brian/.cargo/registry/src/index.crates.io-6f17d22bba15001f/soroban-sdk-21.6.0/src/alloc.rs\5c\04\10\00`\00\00\00\1b\00\00\00\0a\00\00\00\5c\04\10\00`\00\00\00$\00\00\00\1b\00\00\00\5c\04\10\00`\00\00\00?\00\00\00\0d\00\00\00"))
  (data (;0;) (i32.const 1048576) "index.crates.io-6f17d22bba15001f/alloy-sol-types-0.6.3/src/types/data_type.rsindex.crates.io-6f17d22bba15001f/soroban-sdk-21.6.0/src/bytes.rs\00\00\00M\00\10\00@\00\00\00>\02\00\00\0d\00\00\00\00\00\10\00M\00\00\00\ed\03\00\00\01\00\00\00index.crates.io-6f17d22bba15001f/alloy-sol-types-0.6.3/src/abi/decoder.rs\00\00\00\b0\00\10\00I\00\00\00\ad\00\00\00\1b\00\00\00\b0\00\10\00I\00\00\00\ba\00\00\00R\00\00\00\00\00\10\00M\00\00\00\a9\00\00\00\0d\00\00\00\00\00\10\00M\00\00\00\a9\00\00\00\13\00\00\00\00\00\00\00\00\00\00\00\01\00\00\00\06\00\00\00called `Result::unwrap()` on an `Err` valueTryFromSliceErrorsrc/lib.rs\00\00\88\01\10\00\0a\00\00\00#\00\00\00-\00\00\00capacity overflow\00\00\00\a4\01\10\00\11\00\00\00library/alloc/src/raw_vec.rs\c0\01\10\00\1c\00\00\00\19\00\00\00\05\00\00\00()attempt to add with overflow\00\00\ee\01\10\00\1c\00\00\00attempt to multiply with overflow\00\00\00\14\02\10\00!\00\00\00)\00\00\00\01\00\00\00\00\00\00\00explicit panic\00\00L\02\10\00\0e\00\00\00: \00\00\01\00\00\00\00\00\00\00d\02\10\00\02\00\00\00\00\00\00\00\0c\00\00\00\04\00\00\00\07\00\00\00\08\00\00\00\09\00\00\00    ,\0a((\0a00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899 out of range for slice of length range end index \00\83\03\10\00\10\00\00\00a\03\10\00\22\00\00\00source slice length () does not match destination slice length (\a4\03\10\00\15\00\00\00\b9\03\10\00+\00\00\00@\02\10\00\01\00\00\00index.crates.io-6f17d22bba15001f/soroban-sdk-21.6.0/src/alloc.rs\fc\03\10\00@\00\00\00\1b\00\00\00\0a\00\00\00\fc\03\10\00@\00\00\00$\00\00\00\1b\00\00\00\fc\03\10\00@\00\00\00?\00\00\00\0d\00\00\00"))

I have tested this on windows in addition to linux and mac and it appears to work.

@brson brson requested a review from a team as a code owner September 10, 2024 22:17
@leighmcculloch leighmcculloch self-requested a review September 10, 2024 22:18
@brson brson changed the title from "Remove registry paths from panic messages in wasm" to "Remove absolute registry paths from panic messages in wasm" Sep 10, 2024
@brson
Contributor Author

brson commented Sep 11, 2024

One thing to note here is that this affects the builtin file! macro, so if some dependency is doing something funky with that macro it could break in unexpected ways.

@leighmcculloch
Member

is that this affects the builtin file! macro

Do you have any examples of how use of a file! macro would be affected? Am I correct in understanding that it'd only affect use of the builtin file! macro that included the full prefix we're mapping (i.e. {cargo_home}/registry/src/)?

Member

@leighmcculloch leighmcculloch left a comment

👏🏻 This looks great to me and the tradeoffs discussed in the comments seem reasonable. The detailed comments are much appreciated 💯. Two questions inline, although I expect they don't change what we're doing here.

Two inline review comments on cmd/soroban-cli/src/commands/contract/build.rs (outdated, resolved)
@brson
Contributor Author

brson commented Sep 26, 2024

is that this affects the builtin file! macro

Do you have any examples of how use of a file! macro would be affected? Am I correct in understanding that it'd only affect use of the builtin file! macro that included the full prefix we're mapping (i.e. {cargo_home}/registry/src/)?

Yes it will only affect paths to the registry, but those paths will no longer be "valid" (absolute or relative), so if some dependency code is doing something particularly unexpected by trying to use file! to actually locate and open a file, perhaps in a build script, it would break.

I am not aware of any code or use case for actually doing this though - within the rust code base file! is only used for printing debug-style information. And there are already cases where the compiler itself munges this value to remove the absoluteness, while not preserving relative paths - a simple test on the playground of include!(file!()) fails because the relative path returned by file! is wrong.

I did a github code search and it seems file! is mostly used in logging and error messages. So a likely failure scenario might be a subsequent parsing of logs gets confused about where files mentioned in the logs are.

Edit: From a re-read of cargo's rustflags handling, rustflags may not apply to build scripts at all anyway.
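To make the failure mode concrete, here is a small sketch of the kind of dependency code that would be affected: code that treats the string returned by `file!()` as a path that can be opened. The function name is hypothetical, and this is an illustration of the scenario rather than code from any known crate.

```rust
use std::path::Path;

/// Hypothetical helper: reports whether a path produced by `file!()` can
/// still be found on disk, relative to the current working directory.
fn source_is_locatable(reported_path: &str) -> bool {
    Path::new(reported_path).is_file()
}
```

With the prefix remapped, a crates.io dependency would see a value such as `index.crates.io-6f17d22bba15001f/soroban-sdk-21.6.0/src/alloc.rs`, which is neither absolute nor relative to any real directory, so a check like this would be expected to return false. Code that only prints the value in logs or error messages is unaffected.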

@brson
Contributor Author

brson commented Oct 5, 2024

@leighmcculloch I have pushed a commit that does the remapping with CARGO_BUILD_RUSTFLAGS instead of CARGO_ENCODED_RUSTFLAGS. It has tradeoffs:

One nice advantage is that CARGO_BUILD_RUSTFLAGS does get merged by cargo with build.rustflags, so we're not overriding any user-defined rustflags settings.

Disadvantages:

  • The user can override our rustflags with various env vars or target.wasm32-unknown-unknown.rustflags. I have added warnings about this in the env vars case.
  • CARGO_BUILD_RUSTFLAGS is split on whitespace, so paths can't contain spaces. I've added a warning about this.

This patch continues to merge any existing CARGO_BUILD_RUSTFLAGS with our own (note that cargo splits this env var on any whitespace, whereas it splits RUSTFLAGS only on ' '). It could, but does not, attempt to also merge RUSTFLAGS, CARGO_ENCODED_RUSTFLAGS, and CARGO_TARGET_WASM32_UNKNOWN_UNKNOWN_RUSTFLAGS.
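The merge and the whitespace limitation described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the function name and the error text are hypothetical, and the real implementation prints a warning rather than returning an error.

```rust
/// Hypothetical helper: appends `remap_flag` to whatever the user already
/// has in CARGO_BUILD_RUSTFLAGS. Returns an error message if the flag
/// contains whitespace, because cargo splits this variable on whitespace
/// and the flag would be broken into pieces.
fn merge_build_rustflags(existing: Option<&str>, remap_flag: &str) -> Result<String, String> {
    if remap_flag.chars().any(char::is_whitespace) {
        return Err(format!(
            "cannot pass `{remap_flag}` via CARGO_BUILD_RUSTFLAGS: it contains whitespace"
        ));
    }
    // Split the existing value the same way cargo does (on any whitespace),
    // then re-join with single spaces and add our flag at the end.
    let mut flags: Vec<&str> = existing
        .map(|s| s.split_whitespace().collect::<Vec<_>>())
        .unwrap_or_default();
    flags.push(remap_flag);
    Ok(flags.join(" "))
}
```

A caller would read the current value with `std::env::var("CARGO_BUILD_RUSTFLAGS").ok()`, pass it in as `existing`, and set the result on the spawned cargo process.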

I have manually retested the mapping with the eth_abi crate, and tested the warnings.

@leighmcculloch
Member

  • CARGO_BUILD_RUSTFLAGS is split on whitespace, so paths can't contain spaces. I've added a warning about this.

I'm wondering how this will affect Windows users. It's been a while since I was a regular Windows user, but when I last was, it was possible to have spaces in the username and therefore in the home path. 🤔

@leighmcculloch
Member

leighmcculloch commented Oct 18, 2024

  • CARGO_BUILD_RUSTFLAGS is split on whitespace, so paths can't contain spaces. I've added a warning about this.

Can the value be escaped?

@leighmcculloch
Member

leighmcculloch commented Oct 18, 2024

I pushed a change in 240a6b3 to output the env var to stderr, so that the command that's outputted shows the full command in use. For example:

❯ stellar contract build
CARGO_BUILD_RUSTFLAGS=--remap-path-prefix=/Users/leighmcculloch/.cargo/registry/src/= cargo rustc --manifest-path=Cargo.toml --crate-type=cdylib --target=wasm32-unknown-unknown --release
    Finished `release` profile [optimized] target(s) in 0.06s
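A rough sketch of how a command line like the one printed above could be built and rendered with `std::process::Command`. This is an assumption about the general shape, not the code from 240a6b3. The function names are hypothetical, and the argument list is copied from the output shown in this comment.

```rust
use std::process::Command;

/// Hypothetical helper: builds the cargo invocation with the remap flag
/// passed through CARGO_BUILD_RUSTFLAGS.
fn cargo_build_command(rustflags: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.env("CARGO_BUILD_RUSTFLAGS", rustflags).args([
        "rustc",
        "--manifest-path=Cargo.toml",
        "--crate-type=cdylib",
        "--target=wasm32-unknown-unknown",
        "--release",
    ]);
    cmd
}

/// Hypothetical helper: renders the explicitly set env vars, the program,
/// and its arguments as one line suitable for printing to stderr.
/// No shell quoting is attempted here.
fn render(cmd: &Command) -> String {
    let envs = cmd.get_envs().filter_map(|(key, value)| {
        // `value` is None for variables that were explicitly removed.
        Some(format!(
            "{}={}",
            key.to_string_lossy(),
            value?.to_string_lossy()
        ))
    });
    let program = std::iter::once(cmd.get_program().to_string_lossy().into_owned());
    let args = cmd.get_args().map(|a| a.to_string_lossy().into_owned());
    envs.chain(program).chain(args).collect::<Vec<_>>().join(" ")
}
```

Printing `render(&cmd)` with `eprintln!` before spawning gives the user a line they can copy to see exactly which flags are in effect.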

Member

@leighmcculloch leighmcculloch left a comment

I think this looks good and we can run with this.

A question remains about whether we add tests for this logic. We don't currently test the build command because up until now it was rather uninteresting, and not trivial to test. But as we add more logic, feels like we should do that now.

I'm thinking if we have a small test vector contract, and check that soroban contract build produces a valid wasm file, and then have some tests that run on mac, windows, linux, with the same version of rust, should produce the same wasm file? Maybe that's overkill...

Thoughts?

@brson
Contributor Author

brson commented Nov 2, 2024

Can the value be escaped?

I do not believe so. I looked but could not find evidence that this is unescaped anywhere.

@brson
Contributor Author

brson commented Nov 2, 2024

I'm thinking if we have a small test vector contract, and check that soroban contract build produces a valid wasm file, and then have some tests that run on mac, windows, linux, with the same version of rust, should produce the same wasm file? Maybe that's overkill...

Thoughts?

I think this is very testworthy but testing it is tough. The idea of comparing across OSes would probably work for a simple contract, though contracts are still not reproducible across OSes generally after this patch (needs to be fixed in cargo). Examining the CI results of each OS like that seems pretty complicated though. Maybe the expected wasm hash can be hardcoded (and would have to change as compilers etc. change).

My preference for testing this particular patch would be to: build a test wasm that is known to contain paths for panic messages; extract those paths from the wasm; check that they are not absolute paths, and that the first path segment in those paths is as expected. Extracting that metadata is definitely possible, but it's quite a big test setup to test this one thing, which is why I didn't bother.
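One crude way to approximate that check without parsing the wasm data section is to scan the raw bytes for the registry index directory name and look at the byte immediately before each match. The sketch below is only a heuristic under that assumption. The function names are hypothetical, and a binary length field that happens to end in `/` or `\` just before a remapped path would produce a false positive.

```rust
/// Returns the byte offsets at which `needle` occurs in `haystack`.
fn find_all(haystack: &[u8], needle: &[u8]) -> Vec<usize> {
    // `windows(0)` panics, so reject an empty needle up front.
    if needle.is_empty() || haystack.len() < needle.len() {
        return Vec::new();
    }
    haystack
        .windows(needle.len())
        .enumerate()
        .filter(|(_, window)| *window == needle)
        .map(|(offset, _)| offset)
        .collect()
}

/// Counts occurrences of `index_dir` that are directly preceded by a path
/// separator, i.e. that still have a parent directory in front of them
/// and so were presumably not remapped.
fn count_unremapped_registry_paths(wasm: &[u8], index_dir: &str) -> usize {
    find_all(wasm, index_dir.as_bytes())
        .into_iter()
        .filter(|&pos| pos > 0 && matches!(wasm[pos - 1], b'/' | b'\\'))
        .count()
}
```

A test built on this would compile a contract known to embed registry paths, read the resulting `.wasm` file with `std::fs::read`, and assert that the count for `"index.crates.io-"` is zero while `find_all` still reports at least one match.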

I'm happy to go create that test though. Something to hack on.

@brson
Contributor Author

brson commented Nov 2, 2024

I fixed the merge conflicts and manually retested. Building after the merge did change the lockfile significantly - I didn't look closely and assume it's correct.

@brson
Contributor Author

brson commented Nov 2, 2024

Extracting that metadata is definitely possible

While this is possible, I believe these paths end up in a big blob of an rodata section, so parsing them out of there could be ugly.

Member

@leighmcculloch leighmcculloch left a comment

I'm going to approve and merge this and we can get feedback.

I think the change is relatively low risk.

But, we may find out that the assumption around spaces in the filesystem path is a large problem in which case maybe in a future version we flip to using another env var to set it.

@leighmcculloch leighmcculloch merged commit e44b89f into stellar:main Nov 3, 2024
26 checks passed
@leighmcculloch
Member

@brson Thanks for the effort to understand what could be done to improve reproducibility and navigate the options and tradeoffs.

@leighmcculloch
Member

leighmcculloch commented Nov 4, 2024

I guess we should have asked sooner, a quick ask on Discord (ref) raised a comment that spaces in windows home paths are common because the username is often a "first last" name. I might have pushed us towards the wrong tradeoff. 🤔

@brson
Contributor Author

brson commented Nov 5, 2024

I guess we should have asked sooner, a quick ask on Discord (ref) raised a comment that spaces in windows home paths are common because the username is often a "first last" name. I might have pushed us towards the wrong tradeoff. 🤔

Oops. Good news though: the result of this should just be that stellar-cli prints a warning on windows if their paths contain a space - they can still develop fine, but can't create a reproducible build.

I've filed an issue about testing this feature, as previously discussed, and am working on it: #1704

We can also revisit this implementation, dig through cargo again and see if there is any other way to do it while not clobbering rustflags and handling spaces; maybe see if there is any possibility of accelerating cargo feature work to make this more reliable.

In the meantime I am cautiously optimistic that this implementation is not going to cause much heartache.

@leighmcculloch
Member

leighmcculloch commented Nov 6, 2024

Agreed, I think it is very unlikely problems would be created, and it'll be helpful to let this settle and to see how this positively influences things.

A real hope would be that Rust would stabilise the trim paths RFC that you mentioned on the issue and then we could simply engage it.

Development

Successfully merging this pull request may close these issues.

Strip absolute local paths from binaries
2 participants