Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential loss of enqueued metrics #21

Open
Aerdayne opened this issue Dec 20, 2023 · 1 comment
Open

Potential loss of enqueued metrics #21

Aerdayne opened this issue Dec 20, 2023 · 1 comment

Comments

@Aerdayne
Copy link

Aerdayne commented Dec 20, 2023

Hey!

It doesn't look like the repo has been lively lately, so I'm also using this issue in order to gauge whether it has been abandoned or not.

It looks like due to the current Worker design, it's possible to lose already enqueued metrics when the app shuts down due to a deploy, for example. Here's a short reproduction script:

#!/usr/bin/env ruby
# frozen_string_literal: true

require "bundler/setup"
require "yabeda/datadog"

Yabeda::Datadog.config.log_level = Logger::DEBUG
Yabeda::Datadog.config.batch_size = 10
Yabeda::Datadog.config.num_threads = 1

adapter = Yabeda::Datadog.start

Yabeda.configure do
  group :test

  counter :test_counter, tags: [:test_tag]
end

Yabeda.configure!
Yabeda::Datadog.start_exporter

th = Thread.new do
  loop do
    20.times { Yabeda.test.test_counter.increment({ test_tag: '1'}) }

    sleep(1)
  end
ensure
  puts
  puts adapter.instance_variable_get(:@worker).instance_variable_get(:@queue).size # can be non-zero
end

sleep

Pardon me for I didn't spend much time making it 100% deterministic, so there might be a need to execute and ctrl+c it several times to see the expected outcome.

In some cases the worker queue will not be empty when the process gets shut down. I specifically configured the batch size to be smaller than the actual per-second metric ingestion rate, and reduced the thread count to 1 to make the test more reproducible, but I'm certain this behaviour can occur with more realistic setups.

As far as I can tell, most libraries implementing a similar use-case do a flush when a process exits, e.g. InfluxDB client here and DogStatsD itself here. Both of these flushes are called from an at_exit hook. I don't see a similar mechanism in yabeda-datadog.

There's a Worker.close method which is not invoked automatically, and it looks like calling it still wouldn't solve the possibility of metrics loss since it exits a thread instead of joining it like DogStatsD does internally.

@Envek
Copy link
Member

Envek commented Dec 22, 2023

Hey! Thanks for reporting. Implement at_exit hook is a good idea. Pull requests are welcome!


I'm not actively developing this gem and its creator also doesn't seem to be interested in its development, but I would be happy to review and merge a pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants