Merge changes in to support parsing bash scripts #737

Open: wants to merge 35 commits into base: future

BolunThompson
Contributor

Code written by @sethsabar. The tests pass with the changes from binpash/shasta#5 and binpash/libbash#1 (CI will fail until those are merged in).

Signed-off-by: Bolun Thompson <[email protected]>
The bash tests contain scripts which use UTF-8 only characters,
but, by default, Python throws an exception when writing non-ASCII
characters to a file.

Signed-off-by: Bolun Thompson <[email protected]>
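
For context on the commit message above: unless told otherwise, Python's `open()` takes its text encoding from the environment, so a write can fail on non-ASCII characters when that default is ASCII. Below is a minimal sketch of the failure and the usual remedy, which is passing `encoding="utf-8"` explicitly. The sample text and file names are illustrative assumptions, not code from this PR, and `encoding="ascii"` is used only to simulate an ASCII default.

```python
import os
import tempfile

# Hypothetical script contents containing non-ASCII characters.
script_text = "echo 'héllo wörld ✓'\n"

workdir = tempfile.mkdtemp()
ascii_path = os.path.join(workdir, "ascii_out.sh")
utf8_path = os.path.join(workdir, "utf8_out.sh")

# Simulate an environment whose default encoding is ASCII: the write fails.
try:
    with open(ascii_path, "w", encoding="ascii") as f:
        f.write(script_text)
    ascii_write_failed = False
except UnicodeEncodeError:
    ascii_write_failed = True

# Passing the encoding explicitly makes the write independent of the locale.
with open(utf8_path, "w", encoding="utf-8") as f:
    f.write(script_text)

# Read the file back with the same encoding to confirm nothing was lost.
with open(utf8_path, encoding="utf-8") as f:
    round_trip = f.read()

print(ascii_write_failed, round_trip == script_text)
```

Whether the original exception appears on a given machine depends on its locale and Python version, which is why stating the encoding explicitly is the more portable choice.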
@angelhof
Member

I merged the dependent PRs, but I suspect we also need to create a PyPI release of libbash, correct?

@BolunThompson
Contributor Author

BolunThompson commented Dec 17, 2024

You’re correct, thanks for catching that. The PR for it is binpash/libbash#3.

Signed-off-by: Bolun Thompson <[email protected]>
@angelhof
Member

Great, we also need to push this release to PyPI, for which I will wait for Seth to give me access to the libbash PyPI repo.

@angelhof
Member

After merging libbash and pushing the repo to PyPI we should be able to continue working on this :)

@BolunThompson marked this pull request as ready for review on December 21, 2024, 02:08

Hash = 75891c3 (OS, CPU, RAM, and kernel fields are empty; the results table has no rows)


OS:ubuntu-20.04
Sat Dec 21 02:11:33 UTC 2024
intro: 2/2 tests passed.
interface: 42/42 tests passed.
compiler: 52/54 tests passed.
bigrams.sh are not identical
bigrams.sh are not identical

Hash = 441020a (OS, CPU, RAM, and kernel fields are empty; the results table has no rows)


OS:ubuntu-20.04
Sun Dec 22 09:51:39 UTC 2024
intro: 2/2 tests passed.
interface: 42/42 tests passed.
compiler: 54/54 tests passed.

Member

@angelhof left a comment

Great job @BolunThompson and @sethsabar! The only change we need to make is to ensure that the bash tests run in CI (by modifying the test scripts rather than the yml file). In my opinion, it would be better to modify all test files so that every test runs both with and without --bash (avoiding the control flow that currently exists); however, I don't have a strong opinion on that.

@@ -179,6 +192,9 @@ execute_tests() {
export pash_output="${intermediary_dir}/${microbenchmark}_${n_in}_pash_output"
export script_conf=${microbenchmark}_${n_in}
echo '' > "${pash_time}"
if [ "$test_mode" == "bash" ]; then
Member

Do we need this here?

@@ -96,12 +100,21 @@ pipeline_microbenchmarks=(
execute_pash_and_check_diff() {
TIMEFORMAT="%3R" # %3U %3S"
if [ "$DEBUG" -eq 1 ]; then
{ time "$PASH_TOP/pa.sh" $@ ; } 1> "$pash_output" 2> >(tee -a "${pash_time}" >&2) &&
diff -s "$seq_output" "$pash_output" | head | tee -a "${pash_time}" >&2
if [ "$test_mode" == "bash" ]; then
Member

I feel that we can just modify the file to run tests both with --bash and without (modifying the configurations array above).

@@ -0,0 +1,24 @@
#!/usr/bin/env bash

set -x e
Member

We should run this script in CI!

@@ -328,7 +339,7 @@ test_IFS()
}

## We run all tests composed with && to exit on the first that fails
-if [ "$#" -eq 0 ]; then
+if [ "$#" -eq 0 ] || [ "$test_mode" = "bash" ]; then
Member

I would really remove all these control flow checks and make sure that all tests always run both with and without --bash.

Also changes it so that the bash tests run only
in bash mode, which I feel is fair since they
test bash-only features

Hash = 3c2606b (OS, CPU, RAM, and kernel fields are empty; the results table has no rows)

Hash = 89bfc84 (OS, CPU, RAM, and kernel fields are empty; the results table has no rows)


OS:ubuntu-20.04
Thu Dec 26 03:18:06 UTC 2024
intro: 2/2 tests passed.
interface: 212/214 tests passed.
compiler: 98/108 tests passed.
test_histexp7.sub are not identical
test_unicode3.sub are not identical
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh


OS:ubuntu-20.04
Thu Dec 26 03:25:42 UTC 2024
intro: 2/2 tests passed.
interface: 212/214 tests passed.
compiler: 98/108 tests passed.
test_histexp7.sub are not identical
test_unicode3.sub are not identical
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh

Hash = 190f03e (OS, CPU, RAM, and kernel fields are empty; the results table has no rows)


OS:ubuntu-20.04
Sun Dec 29 03:58:56 UTC 2024
intro: 2/2 tests passed.
interface: 212/214 tests passed.
compiler: 98/108 tests passed.
test_histexp7.sub are not identical
test_unicode3.sub are not identical
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
shortest_scripts.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/shortest_scripts.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
deadlock_test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/deadlock_test.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
micro_10.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/micro_10.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
sed-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/sed-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 2 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh
tr-test.sh are not identical with flags -d 1 --assert_all_regions_parallelizable --bash --width 8 --output_time /home/runner/work/pash/pash/evaluation/tests/tr-test.sh

@BolunThompson
Contributor Author

BolunThompson commented Jan 4, 2025

The current problem is that, in bash mode, pash expands words by calling echo {arg} for each argument and replacing the argument with the output. While this usually works, it doesn’t split words. Without a bash argument expander, arguments are currently just split naively on IFS, so a command like tr " " " " is parsed as four quotation marks instead of two string arguments (since every character is a CArgChar).

I’m finishing bash_expand.py, which already sketches out using a bash server to expand ASTs before compilation, as the dash code does. I’ll send (another) PR to sh-expand with it (hopefully soon!).
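
The splitting problem described above can be illustrated outside PaSh. The sketch below contrasts a naive whitespace split with a quote-aware split. Python's `shlex.split` stands in for a real bash expander here; that is an assumption for illustration only, since it is not what PaSh or sh-expand uses and it does not perform bash expansion.

```python
import shlex

command = 'tr " " " "'

# Naive split on whitespace (the default IFS characters): quotes are treated
# as ordinary characters, so each one becomes its own token.
naive_args = command.split()

# Quote-aware split: the quoted spaces survive as two one-character arguments.
quoted_args = shlex.split(command)

print(naive_args)
print(quoted_args)
```

The naive split yields `tr` followed by four lone quotation marks, while the quote-aware split yields `tr` followed by two single-space arguments, which is what the command's author intended.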

@angelhof
Member

angelhof commented Jan 5, 2025

> The current problem is that, in bash mode, pash expands words by calling echo {arg} for each argument and replacing the argument with the output. While this usually works, it doesn’t split words. Without a bash argument expander, arguments are currently just split naively on IFS, so a command like tr " " " " is parsed as four quotation marks instead of two string arguments (since every character is a CArgChar).
>
> I’m finishing bash_expand.py, which already sketches out using a bash server to expand ASTs before compilation, as the dash code does. I’ll send (another) PR to sh-expand with it (hopefully soon!).

Ohh, that is a bummer... Happy to discuss the solution if needed!
