feat(experiments): New trends calculation methods #26256

danielbachhuber · 2024-11-18T19:25:57Z

Changes

TBD

How did you test this code?

Tests should pass.

posthog-bot · 2024-11-27T07:32:34Z

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanently keep it open, use the waiting label.

posthog-bot · 2024-12-05T07:32:50Z

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanently keep it open, use the waiting label.

andehen

from a first preliminary glance, this looks statistically sound to me!

over the coming days, I'd like to dig deeper into the stats, so I'll likely have more input on the methodology in general then

andehen · 2024-12-05T13:28:56Z

posthog/hogql_queries/experiments/experiment_trends_statistics.py

+    intervals = {}
+
+    for variant in variants:
+        posterior = stats.gamma(a=variant.count + 1, scale=1 / variant.absolute_exposure)


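Need to add the prior beta to be statistically correct: for a Gamma(alpha, beta) prior on a Poisson rate, the posterior rate parameter is beta + exposure, not the exposure alone.
Also, I suggest adding PRIOR_ALPHA and PRIOR_BETA as constants in this file to avoid hard-coding them.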

Suggested change

-        posterior = stats.gamma(a=variant.count + 1, scale=1 / variant.absolute_exposure)
+        posterior = stats.gamma(a=variant.count + 1, scale=1 / (variant.absolute_exposure + 1))

Also, here we use variant.absolute_exposure, whereas calculate_probabilities uses variant.exposure. What's the difference between them, and why is absolute_exposure used here?
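The conjugate update behind this suggestion can be sketched as follows. This is a minimal sketch, not the PR's implementation: it assumes a Gamma(1, 1) prior on a Poisson rate (which is what the `+ 1` terms imply), the values of `PRIOR_ALPHA` and `PRIOR_BETA` are assumptions, and `gamma_posterior` and `credible_interval` are hypothetical helper names.

```python
from scipy import stats

# Assumed prior hyperparameters: Gamma(1, 1), matching the "+ 1" terms in the suggestion.
PRIOR_ALPHA = 1.0
PRIOR_BETA = 1.0


def gamma_posterior(count: float, exposure: float):
    """Posterior over a Poisson rate after `count` events in `exposure` units (hypothetical helper)."""
    # Conjugate update: shape = alpha + count, rate = beta + exposure.
    # scipy parameterizes the gamma by scale, which is 1 / rate.
    return stats.gamma(a=PRIOR_ALPHA + count, scale=1.0 / (PRIOR_BETA + exposure))


def credible_interval(count: float, exposure: float, level: float = 0.95):
    """Equal-tailed credible interval for the rate (hypothetical helper)."""
    posterior = gamma_posterior(count, exposure)
    lower_tail = (1.0 - level) / 2.0
    return posterior.ppf(lower_tail), posterior.ppf(1.0 - lower_tail)
```

With 99 events over an exposure of 9, the posterior is Gamma(100, rate 10), so its mean is 100 / 10 = 10. Dropping the prior beta would instead divide by 9.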

andehen · 2024-12-05T13:33:17Z

posthog/hogql_queries/experiments/experiment_trends_statistics.py

+def calculate_probabilities(
+    control_variant: ExperimentVariantTrendsBaseStats,
+    test_variants: list[ExperimentVariantTrendsBaseStats],
+    simulations: int = 100_000,


I guess this is an arbitrarily large number for now, which is fine. We should do some quick testing to see if we really need such a large number.
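One rough way to size `simulations` before running any tests: a win probability estimated from n independent posterior draws has a binomial standard error of sqrt(p(1 - p) / n), which is largest at p = 0.5. The sketch below only illustrates that arithmetic; `monte_carlo_standard_error` is a hypothetical name, not something in the PR.

```python
import math


def monte_carlo_standard_error(p: float, simulations: int) -> float:
    """Standard error of a probability estimated from independent draws (hypothetical helper)."""
    return math.sqrt(p * (1.0 - p) / simulations)


# Worst case is p = 0.5; compare a few candidate simulation counts.
for n in (1_000, 10_000, 100_000):
    print(n, round(monte_carlo_standard_error(0.5, n), 5))
```

At 10,000 simulations the worst-case standard error is 0.005, and at 100,000 it is roughly 0.0016, so the smaller count may already be enough if probabilities are reported to two decimal places.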

andehen · 2024-12-05T13:38:57Z

posthog/hogql_queries/experiments/experiment_trends_statistics.py

+        wins = sum(
+            target_sample > max(other_variant_samples)
+            for target_sample, other_variant_samples in zip(target_samples, zip(*other_samples))
+        )


This can be vectorized, which is much faster than a Python loop. I haven't tested it, but something like this should work (assuming target_samples is a NumPy array and other_samples holds one equal-length array per competing variant):

wins = (target_samples > np.max(other_samples, axis=0)).sum()
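To keep the loop's semantics (a win means beating the best competing sample in the same simulation), a vectorized version has to take the per-simulation maximum across the other variants before comparing. The following is a sketch on made-up data, assuming each variant's samples are a 1-D NumPy array of equal length. The gamma parameters are arbitrary and only serve to generate draws.

```python
import numpy as np

rng = np.random.default_rng(0)
simulations = 1_000

# Hypothetical posterior draws: one 1-D array per variant, all of equal length.
target_samples = rng.gamma(shape=101, scale=1 / 11, size=simulations)
other_samples = [
    rng.gamma(shape=91, scale=1 / 11, size=simulations),
    rng.gamma(shape=96, scale=1 / 11, size=simulations),
]

# Loop version from the diff: compare each target draw with the best competing draw.
loop_wins = sum(
    target_sample > max(other_variant_samples)
    for target_sample, other_variant_samples in zip(target_samples, zip(*other_samples))
)

# Vectorized version: best competitor per simulation, then an element-wise comparison.
best_other = np.max(np.stack(other_samples), axis=0)
vectorized_wins = int((target_samples > best_other).sum())

assert loop_wins == vectorized_wins
```

Summing pairwise comparisons instead of comparing against the maximum would count how many competitors each draw beats, which only matches the loop when there is a single other variant.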

danielbachhuber · 2024-12-09T15:51:27Z

Closing in favor of #26760, which is based on the new Jupyter notebook.

danielbachhuber added 6 commits November 18, 2024 10:44

New methods for calculating significance

ad836de

Fix calculate_credible_intervals and add tests

dc263cd

Add tests for calculate_probabilities

2d92338

Hook into existing query runner

9a2e132

Merge branch 'master' into experiments/improve-trends-stats

0dfcd56

Import methods

a17390d

posthog-bot added the stale label Nov 27, 2024

danielbachhuber removed the stale label Nov 27, 2024

posthog-bot added the stale label Dec 5, 2024

andehen reviewed Dec 5, 2024

posthog-bot removed the stale label Dec 6, 2024

danielbachhuber closed this Dec 9, 2024
