Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Sentry Spans for Bash execution. #289

Merged
merged 8 commits into from
Mar 14, 2023
Merged

Conversation

mpass99
Copy link
Collaborator

@mpass99 mpass99 commented Feb 17, 2023

Related to #270

Additionally, we want to gain insights of the span nomad.execute.exec that is describing the Nomad execute request. To do so, we produce a special string in the bash command, scan the output for this string and convert it into a Sentry span.

@mpass99 mpass99 requested a review from MrSerth February 17, 2023 13:27
@mpass99 mpass99 self-assigned this Feb 17, 2023
@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch from 103307e to 2754e08 Compare February 17, 2023 15:16
Copy link
Member

@MrSerth MrSerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! It's really nice to have those insights and they look beautiful with CodeOcean run and score requests.

I just have some minor questions, but they should not prevent me from showing my great appreciation! 👏🥳

Most questions are inlined. One that I still have is the difference between nomad.execute.pipe and nomad.execute.bash. The one is for piping the command to the tty and the other for invoking it (should we add some details about the difference to the description)? Why is the "end" command only shown once? And would it make sense to add some more details (like [parts of] the actual command as a span description)?

internal/nomad/sentry_debug_writer.go Outdated Show resolved Hide resolved
internal/nomad/sentry_debug_writer.go Outdated Show resolved Hide resolved
internal/nomad/sentry_debug_writer.go Outdated Show resolved Hide resolved
internal/nomad/sentry_debug_writer.go Show resolved Hide resolved
internal/nomad/nomad.go Outdated Show resolved Hide resolved
internal/nomad/nomad.go Outdated Show resolved Hide resolved
@mpass99
Copy link
Collaborator Author

mpass99 commented Feb 19, 2023

It's really nice to have those insights and they look beautiful with CodeOcean run and score requests.

Those insights show the delay caused by the Container management (Nomad + Docker). Here, we might be ale to save up to 300ms (2 * (50ms + 100ms)) per run.

One that I still have is the difference between nomad.execute.pipe and nomad.execute.bash. The one is for piping the command to the tty and the other for invoking it (should we add some details about the difference to the description)?

👍

Why is the "end" command only shown once?

Because we have no additional information how long the "End"-span should last. Therefore, we just show the span until the "End"-message is received by Poseidon.

And would it make sense to add some more details (like [parts of] the actual command as a span description)?

👍

Copy link
Member

@MrSerth MrSerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for your changes and the detailed explanations! I found them really useful in understanding and reviewing. While I still have some questions (which we could happily talk about), I absolutely find this command execution part difficult and not easy to get. That being said, I am still impressed what you've created there.

During testing, I also noticed that Sentry warns about a dropped span (probably related to the not finished nomad.execute.bash "End"). Should we talk a look there?

[Sentry] 2023/02/19 12:02:32 Dropped unfinished span: Op="nomad.execute.bash" TraceID=11a0c1a2446141a9be27bd86753a48cb SpanID=b2c7dbd192c309f4

internal/nomad/sentry_debug_writer.go Outdated Show resolved Hide resolved
internal/nomad/sentry_debug_writer.go Show resolved Hide resolved
internal/nomad/nomad.go Outdated Show resolved Hide resolved
internal/nomad/nomad.go Outdated Show resolved Hide resolved
@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch 2 times, most recently from e00027f to 994d585 Compare February 26, 2023 20:29
MrSerth
MrSerth previously approved these changes Feb 26, 2023
Copy link
Member

@MrSerth MrSerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This is super nice, and I cannot wait to get insights through the enhanced monitoring 🏎️

I tested the integration and just love it 😻. The latency information looks great within a span (I find the overall execution much easier to get), the status code is added as a tag, we get nested spans with the parent nomad.execute.exec and a clean code base to read (especially with the sentry_debug_writer.go) plus some command tracing.

I was just thinking of the span descriptions, and wondered whether we could add some more details to them. Here's the current state:

Bildschirm­foto 2023-02-26 um 23 46 13

What if.... we could achieve the following span descriptions (this is a mockup)?
Bildschirm­foto 2023-02-26 um 23 53 08

In this screenshot, I was just finding something that is similar to the actual command being executed, but simplified it for better readability. Just a random thought, let's think about it and discuss it tomorrow 🙂.

@mpass99
Copy link
Collaborator Author

mpass99 commented Feb 27, 2023

ToDo:

  • Clean up functions (dynamic nesting of bash commands)
  • (Static) renaming of span descriptions

@MrSerth
Copy link
Member

MrSerth commented Feb 27, 2023

I was quickly searching for the shell escape. For one, a package such as shellescape might be useful (and I would prefer some library if it exists). Alternatively, some manual escaping of all characters (except explicit safe ones) could be the way to go (here are some good replies for that topic).

- Abstracting from the exec form while generating.
- Removal of single quotes (usage of only double-quotes).
- Bash-nesting using escaping of special characters.
Due to the wrapping of the command, the exit code could not have been retrieved correct anymore.
@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch from faebac7 to 28fd264 Compare March 11, 2023 11:47
that was ignoring composed messages including a newline.
Also, add regression test.
@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch 3 times, most recently from bbf9181 to 5776884 Compare March 11, 2023 17:07
@mpass99
Copy link
Collaborator Author

mpass99 commented Mar 11, 2023

For one, a package such as shellescape might be useful (and I would prefer some library if it exists).

This library is not useful in our case as it only surrounds the string with single-quotes [1]
I could not find another library for our use case.

Alternatively, some manual escaping could be the way to go (here are some good replies for that topic).

The proposed set of safe characters does not work in our case. For example, the characters " " (sleep\ 2\ &&\ echo\ $(date)), "&" (echo $date && bash -vxc "sleep 2 \&\& echo \$(date)"), "(", ")" (echo $date && bash -vxc "sleep 2 && echo \$\(date\)"), "=" (echo $date && bash -vxc "ec\=1 && echo \$(date)") are not included in the safe set of characters and therefore would be escaped - (leading to errors when executing).

Instead, the manual of bash is very helpful. In the chapter "Quoting", it is explained, that the only characters with a special meaning inside a double-quoted text are $, backtick, " and \. We can use the go library to escape the backslash and double-quote correctly (fmt.Sprintf("%q", "abc")) and therefore just handle the two remaining characters.

All in all, although we had some complication during these changes, I think the code quality and simplicity has increased by introducing the dynamic nesting of the bash commands.

@MrSerth
Copy link
Member

MrSerth commented Mar 13, 2023

Thanks for having a look here and considering the two misleading suggestions I made; your solution looks fine and having a look at the manual always seems to be a good idea 👍

As discussed, let's shortly investigate:

  • Context: Add full command
  • Shorten span description to "make run", ... (but keep the full "user" command)
  • Span description: Include "unset ${!NOMAD_@}" (and escape)

@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch 2 times, most recently from 8579d7b to a30854e Compare March 13, 2023 23:33
@codeclimate
Copy link

codeclimate bot commented Mar 13, 2023

Code Climate has analyzed commit a30854e and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 83.8% (65% is the threshold).

This pull request will bring the total coverage in the repository to 71.8% (0.5% change).

View more on Code Climate.

@mpass99 mpass99 force-pushed the feature/#270-sentry-performance branch from a30854e to c8169fb Compare March 14, 2023 22:19
Copy link
Member

@MrSerth MrSerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks so much for working on this and iterating on the various feedback and discussion aspects that we had! I really enjoyed reading the code and am convinced that we made many good decisions, also leading to cleaner, better maintainable code! 💪

I cannot wait to continue analyzing spans on Sentry and dig through performance issues 🚀

Once question that came to my mind (and which is related to #219) is the AWS serverless part. I saw you disabled the tests, which is totally fine, but this also indicates we are breaking the functionality with these changes, I assume? Probably, we should revisit the relevant parts once we continued with #219 to keep an active working code base. For now, I would just proceed merging and postpone this to a later date ;)

@MrSerth MrSerth merged commit 0218f91 into main Mar 14, 2023
@MrSerth MrSerth deleted the feature/#270-sentry-performance branch March 14, 2023 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants