feat(logs): add alloy + loki #451
base: main
Had a few hitches getting things up, but seems generally good.
* fix and clean up metrics
* allow prom to be accessed by external services

Co-authored-by: x19 <[email protected]>
cc: @dnut I think https://github.com/orgs/Syndica/projects/2/views/16?pane=issue&itemId=92360295 is solved here.
Looks good, but I just had a few minor nitpicks. I'm not very opinionated about any of this, so feel free to turn them down.
need to also modify the prometheus `target` to point to the different port).
- sig metrics will be published to localhost:12345 (if you change this on the sig cli, you will
need to also modify the prometheus `target` to point to the different port).
- sig gossip metrics are published to localhost:12355
redundant with prior bullet
image: grafana/loki:3.0.0
container_name: loki
ports:
  - "3100:3100"
Is there a reason to expose these ports to the host system? This shouldn't be necessary just to connect from other containers in the same docker network.
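For illustration, a minimal sketch of the Loki service without published host ports. The `grafana` service shown here is a placeholder and is not copied from this PR's compose files. It assumes both services share a Compose network, in which case other containers can reach Loki at `http://loki:3100` by service name:

```yaml
# Sketch only: the service layout is assumed, not taken from the PR.
services:
  loki:
    image: grafana/loki:3.0.0
    container_name: loki
    # No `ports:` entry, so 3100 is not published on the host.
    # `expose` is informational; containers on the same network can
    # reach loki:3100 with or without it.
    expose:
      - "3100"

  grafana:
    image: grafana/grafana   # hypothetical service; tag intentionally omitted
    ports:
      - "3000:3000"          # only the UI is published to the host
```

If any service runs with host networking instead, the port would still have to be reachable on the host, so this only applies to the bridged setup.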
- "./alloy/config.alloy:/etc/alloy/config.alloy"
- "../logs:/var/log/alloy"
Not very important at this point, but it's good practice to mount as read-only if you know the files don't need to be modified by the container. Does `config.alloy` need to be modified by the application?
-      - "./alloy/config.alloy:/etc/alloy/config.alloy"
-      - "../logs:/var/log/alloy"
+      - "./alloy/config.alloy:/etc/alloy/config.alloy:ro"
+      - "../logs:/var/log/alloy:ro"
ports:
  - "3100:3100"
volumes:
  - ./loki:/etc/loki
-      - ./loki:/etc/loki
+      - ./loki:/etc/loki:ro
@@ -41,6 +41,7 @@ pub fn getMetrics(
    _: *httpz.Request,
    response: *httpz.Response,
) !void {
    response.content_type = .TEXT; // expected by prometheus
what happens when you don't include this?
Prometheus doesn't record it at all.
prometheus | time=2025-01-06T15:04:39.954Z level=ERROR source=scrape.go:1590 msg="Failed to determine correct type of scrape target." component="scrape manager" scrape_pool=prometheus target=http://host.docker.internal:12345/metrics content_type="" fallback_media_type="" err="non-compliant scrape target sending blank Content-Type and no fallback_scrape_protocol specified for target"
was this due to a recent change in prometheus?
I have no idea. Worth noting that our prometheus image doesn't have a version tag on it, so I'm pretty sure we're tracking the latest image, which is probably undesirable.
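For context: as far as I know, Prometheus 3.0 stopped assuming the text format when a target sends no `Content-Type`, which would match the error above. Besides setting the header in sig, the scrape config can declare a fallback. A sketch, where only the job name and target come from the log line and the rest of the file layout is assumed:

```yaml
# prometheus.yml (sketch; job name and target taken from the log line above)
scrape_configs:
  - job_name: prometheus
    # Used only when the target's Content-Type is missing or unparseable.
    fallback_scrape_protocol: PrometheusText0.0.4
    static_configs:
      - targets: ["host.docker.internal:12345"]
```

Pinning the prometheus image to an explicit tag in the compose files, rather than relying on the implicit `latest`, would keep behavior changes like this from arriving unannounced.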
@@ -10,6 +10,7 @@
/kcov-output
/results
/validator
/logs
Do we need a logs folder? I feel like it's slightly easier to work with a log file when it's placed directly in the current working directory. Doesn't make much difference but just a small convenience.
Also I don't see where a log file is being produced by the code.
note: alerting requires proper parsing of logs, which requires some changes to our logger that I'll put into a separate PR
logs/sig.log
TL;DR
Added log aggregation capabilities to the metrics stack using Loki and Alloy.

What changed?
- Added `/logs` to gitignore

How to test?
- Run sig and append its output to the log file: `./zig-out/bin/sig gossip -n testnet 2>&1 | tee -a logs/sig.log`
- Start the stack with `docker compose -f compose_mac.yaml up -d` (macOS) or `docker compose -f compose_linux.yaml up -d` (Linux)
- Open `localhost:3000` and verify logs are visible in Explore view