Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add env to disable metrics operation when it is disabled #96

Closed
wants to merge 40 commits into from

Conversation

boodyvo
Copy link
Contributor

@boodyvo boodyvo commented Sep 6, 2024

  • Added variable to disable dependency for the database. With the new variable METRIC_DATABASE_ENABLED=false we can run proxy without database dependency.
  • Simplified implementation to keep the metrics code as is, as we don't know the future solution for metrics, so right now it can accept db as nil to avoid refactoring in many parts of the code (don't know why it was implemented not as a methods of structure, but allows a param of database 🤷).
  • Added a CI without database to test dependency.
  • Added separate set of E2E tests to validate same E2E test set runs ok, if we remove dependency for metrics checks.
  • Fixed local tests locally. Need also to add them to CI (at the moment, there are some problems with cache test on CI with local env, as one particular cache is not hit, need to debug)
  • An additional PR for the refactoring will be further, where the interface and noop database implementation provided

@boodyvo boodyvo marked this pull request as ready for review September 6, 2024 19:59
main_no_metrics_test.go Outdated Show resolved Hide resolved
Comment on lines +40 to +42
if db == nil {
return nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: any reason we don't want to return an explicit error rather than a silent return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh, this is to avoid refactoring. The implementation we have in not methods-based, but expects that we pass the db. To remove db dependency during initialization we return not fake, but nil db and pass it. As it is expected (and added to the comments), we allow to pass db as nil. In such case we don't need errors.

Another ways to do that:

  • switch to interface and add empty implementation, that does nothing
  • return particulr errors, but in this case we need to refactor processing among the codebase

This helps us to keep the current metrics in the code, while we don't have the decision about about how do we want to implement metrics

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an additional further PR (already created), that we discussed with @pirtleshell and in the task: https://app.shortcut.com/kava-labs/story/14336
This one provide a better refactoring, adding interfaces.

Copy link
Member

@pirtleshell pirtleshell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i recognize the desire to #JustGetItDone® but i am compelled to recommend that we refactor into a MetricsDb interface rather than pretend like we have a postgres db when we do not.

For this case of not having the database, we'd implement a NoopMetricsClient. The existing code would be the PostgresMetricsClient

The refactor would probably include moving the partitioning/compaction/pruning routines into the client so that different implementations could set them up differently.

Benefits include:

  • big code clean up & improved separation of concerns. all the conditional checks for db == nil or !serviceConfig.MetricDatabaseEnabled can be removed
  • our future selves don't need to keep in our minds for every code change that the PostgresClient might actually not exist
  • it will allow a simple transition to using a new backend to store the metrics. if the interface exists and is used, converting the whole service to prometheus becomes the straightforward task of just writing a new MetricsDb implementation.

i'd be ok with this change for now to lock in the db destruction cost-savings, but i'd advocate for continuing to work on this and iteratively refactor the code into a more maintainable state.

.github/workflows/ci-e2e-no-metrics-tests.yml Outdated Show resolved Hide resolved
@@ -0,0 +1,32 @@
---
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of another docker-compose file to maintain, can we make use of multiple env files or just set the env variable directly? seems like the only difference should be setting METRIC_DATABASE_ENABLED to false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pirtleshell I think database container is removed from here, but I guess we can just launch it with:

docker compose up redis, proxy

instead of creating new file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored. As there is some problem with passing env variable into the container passing them as part of .env and no_metric.env files depending on type of tests running. Keep one docker compose for both ci

@@ -208,6 +208,7 @@ func TestE2ETestProxyCreatesRequestMetricForEachRequest(t *testing.T) {

// make request to api and track start / end time of the request to
startTime := time.Now()
time.Sleep(1 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the motivation for sleeping in these tests? i'm wondering if there is a way to make the tests more resilient?

if we need a sleep before every call to waitForMetricsInWindow, let's add it into that method instead of needing to remember to call it every time we define startTime. (include a comment before that call to sleep to explain why it is necessary)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was the problem on local test environment. Looks like the waitForMetricsInWindow tried to wait for metrics in particular time range. To make it work on local env, there was a problem with timing as we need some delay between startTime and actual metric write

clients/database/postgres.go Outdated Show resolved Hide resolved
@boodyvo
Copy link
Contributor Author

boodyvo commented Sep 12, 2024

i recognize the desire to #JustGetItDone® but i am compelled to recommend that we refactor into a MetricsDb interface rather than pretend like we have a postgres db when we do not.

For this case of not having the database, we'd implement a NoopMetricsClient. The existing code would be the PostgresMetricsClient

The refactor would probably include moving the partitioning/compaction/pruning routines into the client so that different implementations could set them up differently.

Benefits include:

  • big code clean up & improved separation of concerns. all the conditional checks for db == nil or !serviceConfig.MetricDatabaseEnabled can be removed
  • our future selves don't need to keep in our minds for every code change that the PostgresClient might actually not exist
  • it will allow a simple transition to using a new backend to store the metrics. if the interface exists and is used, converting the whole service to prometheus becomes the straightforward task of just writing a new MetricsDb implementation.

i'd be ok with this change for now to lock in the db destruction cost-savings, but i'd advocate for continuing to work on this and iteratively refactor the code into a more maintainable state.

The idea here was to make an update with least amount of refactoring and intrusion to the code, as we don't know how exactly we are going to support and create metrics (mentioned in the second paragraph of description). Totally agree with what you say, but don't believe it will worse the effort for refactoring, as eventually we potentially will delete that code or significantly will refactor that.

I started to refactor it according to the comment while reviewing (to introduce interfaces), if you think it make sense at the moment

@boodyvo
Copy link
Contributor Author

boodyvo commented Sep 13, 2024

Extracted refactoring into a separate task: https://app.shortcut.com/kava-labs/story/14336

@boodyvo boodyvo closed this Sep 16, 2024
@boodyvo
Copy link
Contributor Author

boodyvo commented Sep 16, 2024

Was merged already another PR with necessary improvements + refactoring

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants