Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid setting DELETE_AT_EXIT multiple times #2619

Merged

Conversation

wyardley
Copy link
Contributor

@wyardley wyardley commented Sep 25, 2024

Fixes #2618

  • Check if DELETE_AT_EXIT is set in setup_trap_handler() before trying to set it, to avoid DELETE_AT_EXIT: readonly variable error
  • Set the --will-cite flag for parallel - in some cases, I've seen the citation warning
  • Add swp to list of file extensions to ignore (for vim)
  • Small shellcheck fixes (not exhaustive)

For the first, not sure if this is the right fix - maybe it would be simpler to avoid having that function get called twice somehow?

Not sure if --will-cite will cause any issues licensing-wise, but I think it's allowed?

Edit finally saw the error again locally. So this is the problem this is designed to solve:

Running terraform validate
Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  Tange, O. (2024, April 22). GNU Parallel 20240422 ('Børsen').
  Zenodo. https://doi.org/10.5281/zenodo.11043435

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

More about funding GNU Parallel and the citation notice:
https://www.gnu.org/software/parallel/parallel_design.html#citation-notice

To silence this citation notice: run 'parallel --citation' once.

Copy link
Collaborator

@apeabody apeabody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wyardley! We should also bump the dev-tools image patch version https://github.com/GoogleCloudPlatform/cloud-foundation-toolkit/blob/master/infra/build/Makefile#L59

(do not merge as-is)

@apeabody
Copy link
Collaborator

/gcbrun

@apeabody
Copy link
Collaborator

@bharathkkb - Thoughts on --will-cite for parallel?

@wyardley wyardley requested a review from apeabody September 25, 2024 16:07
@wyardley
Copy link
Contributor Author

Bumped the tag

@apeabody
Copy link
Collaborator

/gcbrun

1 similar comment
@apeabody
Copy link
Collaborator

/gcbrun

Copy link
Collaborator

@apeabody apeabody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @wyardley!

LGTM, just waiting on @bharathkkb for opinion on --will-cite

@bharathkkb
Copy link
Member

Thanks for the PR @wyardley! My thinking is we can do an initial parallel --citation and then keep the --will-cite for subsequent so citation is printed in the beginning?

if [[ "${ENABLE_PARALLEL:-}" -eq 1 ]]; then
   parallel --citation
    ...
    | parallel --will-cite ..
    ...

Fixes GoogleCloudPlatform#2618

- Check if `DELETE_AT_EXIT` is set in `setup_trap_handler()` before
  trying to set it, to avoid `DELETE_AT_EXIT: readonly variable` error
- Set the `--will-cite` flag for parallel - in some cases, I've seen the
  citation warning
- Add swp to list of file extensions to ignore (for vim)
- Small shellcheck fixes (not exhaustive)
@wyardley wyardley force-pushed the wyardley/misc_helper_fixes branch from 92c4fb2 to bce5c05 Compare September 26, 2024 05:40
@wyardley wyardley changed the title fix: avoid setting DELETE_AT_EXIT twice and set parallel --will-cite fix: avoid setting DELETE_AT_EXIT twice Sep 26, 2024
@wyardley wyardley changed the title fix: avoid setting DELETE_AT_EXIT twice fix: avoid setting DELETE_AT_EXIT multiple times Sep 26, 2024
@wyardley
Copy link
Contributor Author

My thinking is we can do an initial parallel --citation and then keep the --will-cite for subsequent so citation is printed in the beginning?

I'll just remove that part of the change and update the PR title. It only shows up once anyway, IIRC, so no point in making the code more complex for something like this; I guess having it in the beginning might clutter the output marginally less, but not sure it's worth it.

Parallel isn't currently enabled in the GitHub actions check (though it's referenced in a Cloudbuild check - not sure if that runs internally, but it's certainly not visible to most of us).

I'm not a lawyer, but worth looking at the FAQ as well as the wording in the man page and in the citation notice itself:
https://git.savannah.gnu.org/cgit/parallel.git/tree/doc/citation-notice-faq.txt

The link only addresses the license and copyright law. It does not address academic tradition, and the citation notice only refers to academic tradition.
[...]
== I do not write scientific articles. Does the notice apply to me? ==

The notice is only relevant if you write scientific articles.

My analysis is that it's allowed (though possibly discouraged) to include --will-cite when used in a script.

@wyardley
Copy link
Contributor Author

wyardley commented Sep 26, 2024

Also, the output from parallel -citation is even longer, and needs to have will cite entered in, which seems fragile for something non-interactive.

% echo 'will cite' | parallel --citation
Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

@software{tange_2024_13826092,
      author       = {Tange, Ole},
      title        = {GNU Parallel 20240922 ('Gold Apollo AR924')},
      month        = Sep,
      year         = 2024,
      note         = {{GNU Parallel is a general parallelizer to run
                       multiple serial command line programs in parallel
                       without changing them.}},
      publisher    = {Zenodo},
      doi          = {10.5281/zenodo.13826092},
      url          = {https://doi.org/10.5281/zenodo.13826092}
}

(Feel free to use \nocite{tange_2024_13826092})

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

More about funding GNU Parallel and the citation notice:
https://lists.gnu.org/archive/html/parallel/2013-11/msg00006.html
https://www.gnu.org/software/parallel/parallel_design.html#citation-notice
https://git.savannah.gnu.org/cgit/parallel.git/tree/doc/citation-notice-faq.txt

@wyardley
Copy link
Contributor Author

I updated DELETE_AT_EXIT to have the readonly on another line after the assignment

    DELETE_AT_EXIT="$(mktemp -d)"
    readonly DELETE_AT_EXIT

recommended since errexit is set.

@apeabody
Copy link
Collaborator

/gcbrun

@apeabody
Copy link
Collaborator

Parallel isn't currently enabled in the GitHub actions check (though it's referenced in a Cloudbuild check - not sure if that runs internally, but it's certainly not visible to most of us).

Hi @wyardley! A few repos (primarily those with lots of folders) do/will pass ENABLE_PARALLEL=1 in the GitHub action check. Here is a current example: https://github.com/terraform-google-modules/terraform-docs-samples/pull/738/files

The cloudbuild lint are legacy, replaced by the github workflow lint.

@wyardley
Copy link
Contributor Author

wyardley commented Sep 26, 2024

Got it, thanks.

Yeah, I'm just not sure that the possible workarounds are much better (for the reason mentioned above); using one of the alternatives to parallel might work, and I could take a pass at that separately if there's interest.

I've done something along these lines before, where -n can be adjusted to a higher level of parallelism, maybe something along those lines would do the trick?

            xargs -n1 sh -c
            'echo "${0}" ; cd "${0}" && terraform init -backend=false -input=false'
            < /tmp/tests_to_run

Also, I see my old nemesis, that extra newline in validate output is still around 🙃

@apeabody
Copy link
Collaborator

Got it, thanks.

Yeah, I'm just not sure that the possible workarounds are much better (for the reason mentioned above); using one of the alternatives to parallel might work, and I could take a pass at that separately if there's interest.

I've done something along these lines before, where -n can be adjusted to a higher level of parallelism, maybe something along those lines would do the trick?

            xargs -n1 sh -c
            'echo "${0}" ; cd "${0}" && terraform init -backend=false -input=false'
            < /tmp/tests_to_run

Also, I see my old nemesis, that extra newline in validate output is still around 🙃

Yeah, I've heard good things about rush and rust parallel, however they would need to be incorporated into the image and to one of your other requests to potentially run outside of docker aren't common in package manages.

If there was a desire to explore alternates, I'd say xargs is the most compelling given it's ubiquity and speed.

@apeabody apeabody merged commit ac6615a into GoogleCloudPlatform:master Sep 26, 2024
6 checks passed
@wyardley wyardley deleted the wyardley/misc_helper_fixes branch September 26, 2024 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

error when setup_trap_handler() is called twice
3 participants