Problem with prometheus metrics #261

Open
lesnikutsa opened this issue Oct 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@lesnikutsa

Description and context

The node's Prometheus metrics endpoint stops serving metrics from time to time.

Steps to reproduce

  1. I enabled Prometheus metrics on the node so that I can monitor it with my own Grafana dashboard. These are the changes I made in config.toml:
prometheus = true
# Address to listen for Prometheus collector(s) connections
prometheus_listen_addr = "0.0.0.0:26660"
  2. Right after startup the metrics are served, and opening the endpoint in a browser shows them as expected. Example:
# HELP cometbft_abci_connection_method_timing_seconds Timing for each ABCI method.
# TYPE cometbft_abci_connection_method_timing_seconds histogram
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.0001"} 0
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.0004"} 0
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.002"} 0
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.009"} 2
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.02"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.1"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="0.65"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="2"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="6"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="25"} 3
cometbft_abci_connection_method_timing_seconds_bucket{chain_id="iliad-0",method="commit",type="sync",le="+Inf"} 3
cometbft_abci_connection_method_timing_seconds_sum{chain_id="iliad-0",method="commit",type="sync"} 0.023611912
  3. After about a day of node operation, I received a Telegram notification that Prometheus metrics on the node were down. Opening IP:26660 in the browser now shows only the following error instead of the metrics:
An error has occurred while serving metrics:

2 error(s) occurred:
* "cosmos__p2p_filter_addr_[2001:41d0:203:e4db::6]:26656" is not a valid metric name
* "cosmos_query__p2p_filter_addr_[2001:41d0:203:e4db::6]:26656" is not a valid metric name
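For reference, both rejected names embed a peer address in square brackets. A minimal Go sketch of the check, assuming the classic metric-name pattern from the Prometheus data model (`[a-zA-Z_:][a-zA-Z0-9_:]*`); the helper name `isValidMetricName` is made up for illustration and is not taken from the Story or CometBFT code:

```go
package main

import (
	"fmt"
	"regexp"
)

// metricNameRE is the classic metric-name pattern documented in the
// Prometheus data model (assumption: the node's exporter enforces it).
var metricNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// isValidMetricName is a hypothetical helper that reports whether name
// matches the pattern above.
func isValidMetricName(name string) bool {
	return metricNameRE.MatchString(name)
}

func main() {
	names := []string{
		// Taken from the healthy output in step 2.
		"cometbft_abci_connection_method_timing_seconds_bucket",
		// Taken from the error page in step 3.
		"cosmos__p2p_filter_addr_[2001:41d0:203:e4db::6]:26656",
		"cosmos_query__p2p_filter_addr_[2001:41d0:203:e4db::6]:26656",
	}
	for _, n := range names {
		fmt.Printf("%-5t %s\n", isValidMetricName(n), n)
	}
}
```

Under this pattern colons and digits are allowed, so it is the `[` and `]` characters around the IPv6 address that make the two names invalid.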

Solution recommendation

Metrics should not disappear while the Story node is running. So far, restarting the Story node has restored them each time, but that is a workaround, not a fix. Something is interrupting metrics delivery, which should be continuous.
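Until the cause is fixed, the failure can at least be detected early. Below is a rough Go sketch of a probe that fetches the endpoint and looks for the error page quoted above. The URL `http://127.0.0.1:26660/metrics`, the function names `classify` and `probe`, and the verdict strings are assumptions for illustration; adjust the address to match `prometheus_listen_addr`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// gatherErrorPrefix is the first line of the error page quoted in this issue.
const gatherErrorPrefix = "An error has occurred while serving metrics"

// classify turns an HTTP status code and response body into a short verdict.
// The verdict strings are hypothetical, chosen only for this sketch.
func classify(status int, body string) string {
	switch {
	case strings.Contains(body, gatherErrorPrefix):
		return "gather-error"
	case status != http.StatusOK:
		return "http-error"
	default:
		return "ok"
	}
}

// probe fetches url with a timeout and classifies the response.
func probe(url string) (string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	// The error page is short, so reading the first 1 MiB is enough.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return "", err
	}
	return classify(resp.StatusCode, string(body)), nil
}

func main() {
	url := "http://127.0.0.1:26660/metrics" // assumed address, see config.toml
	if len(os.Args) > 1 {
		url = os.Args[1]
	}
	verdict, err := probe(url)
	if err != nil {
		fmt.Fprintln(os.Stderr, "probe failed:", err)
		os.Exit(1)
	}
	fmt.Println(verdict)
	if verdict != "ok" {
		os.Exit(2)
	}
}
```

Run from cron or a systemd timer, a non-zero exit code could trigger an alert or a restart, which only automates the current workaround and does not address the invalid metric names.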

@lesnikutsa lesnikutsa added the bug Something isn't working label Oct 21, 2024
@lesnikutsa
Author

The problem recurred today, so it comes back roughly every 2-3 days.
