Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Don't use uuid field to count #23049

Merged
merged 17 commits into from
Jun 18, 2024
Merged

perf: Don't use uuid field to count #23049

merged 17 commits into from
Jun 18, 2024

Conversation

timgl
Copy link
Collaborator

@timgl timgl commented Jun 18, 2024

Problem

For "Total count" queries, we do count(e.uuid). However, this is unnecessary, as you can just do count(). Doing the former also doesn't help with duplicate events, and in general duplicate event sresolve quick enough that it's not worth adding huge memory load at query time to solve for it. This reduces the amount of data we need to read by half.

Changes

count(e.uuid) -> count()

Before:
image

After:

image

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

@timgl timgl requested review from tkaemming and a team June 18, 2024 12:45
@timgl timgl enabled auto-merge (squash) June 18, 2024 12:45
timgl and others added 11 commits June 18, 2024 18:50
#23022)

* perf: Use max_bytes_before_external_group_by for funnels and path queries

* Update query snapshots

* Update query snapshots

* Update query snapshots

* Update query snapshots

* Update query snapshots

* Update query snapshots

* fix test

* also do retention query

* Update query snapshots

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
* Added job cancellation checking into rest source

* Remove chunk size

* fixed mypy

* Fixed test

* fixed mypy

* Fixed test

* WIP

* Get the right zendesk tables set as incremental

* Final fixes and tests for zendesk

* Fixed mypy

* Updated incremental table names
* 🔥 initial commit

* update readme

* Update README.md

* Update README.md

* deploy scripts

* very basic consumer setup

* add some configs and docker-compose

* formatting for testing

* add tailscale

* flip from dev to prod flag

* set default to be not prod

* default for group_id

* tailscale up

* update gitignore

* basic geolocation

* remove unused localServer

* document mmdb

* just make configs an example

* drop raw print

* add a start script (downloads the mmdb)

* add readme and update configs.example

* ts working

* if in start

* update start script

* fix start

* fix start

* fix more

* add sql endpoints for tokenId and Person lookups

* work towards filter

* sub channel

* fix subChan

* hardcode team2 token

* add cors

* only allow get and head

* add atomicbool

* add channel to kafka

* add logs

* verbose logs

* make array

* drop sub ptrs

* more logs

* helps to loop

* drop some logigng

* move sub branch

* logging

* drop log

* hog

* Deal with numeric distinct ids later

* logs

* api_key

* send 1/1000

* remove log

* remove more logs

* change response payload

* set timestamp if needed

* fill in person_id if team_id is set

* require teamid, convert to token

* clean up subs on disconnect

* log

* check for token in another place

* clean up subs on disconnect

* drop modulo and log

* fix no assign

* don't reuse db conn for now

* drop a log

* add back commented out log

* Don't block on send to client channel

* add geo bool

* only geo events

* use wrapper ip

* don't require team in geo mode

* add an endpoint and stats keeper for teams

* remove stats keeper

* start stats keeper

* wire it up

* change the shape of the response

* omit empty error

* omit empty on the stats as well

* enable logging on back pressure

* add jwt endpoint for testing

* support multiple event types

* Get Auth Setup

* jwt team is float so turn that into int

* logs

* add auth for stats endpoint

* remove tailscale and use autoTLS on public endpoints

* default to :443 for auto tls

* remove un-needed endpoints and handlers

* Use compression because... a lot of data (#9)

* add dockerfile and CI/CD (#10)

* add dockerfile and CI/CD

* Use ubuntu not alpine

couldn't build in alpine :'(

* Add MMDB download to Dockerfile (#11)

* Use clearer name for MMDB

* Don't connect to Kafka over SSL in dev

* Fix JWT token in example config

* Add postgres.url to example config

* Add expected scope

* Fix const syntax

* Put scope validation where claims are known

* Fix audience validation

* moves

* ignore livestream for ci

* main -> master

* move GA to root

* docker lint fix

* fix typo

* fixes for docker builds

* test docker build

* livestream build docker

* dang

* Update .github/workflows/livestream-docker-image.yml

Co-authored-by: Neil Kakkar <[email protected]>

* Update .github/workflows/livestream-docker-image.yml

Co-authored-by: Neil Kakkar <[email protected]>

* don't build posthog container when PR is pushed for rust or livestream

* Update .github/workflows/livestream-docker-image.yml

Co-authored-by: Neil Kakkar <[email protected]>

* add a lot of paths-ignore

* Update .github/workflows/livestream-docker-image.yml

Co-authored-by: Neil Kakkar <[email protected]>

* Dorny filters are handling most of what I was trying to do

* remove tailscale to speed up builds

* maybe?

* push container to github.com/posthog/postog

* don't build container on PR

* remove more filters because dorny

---------

Co-authored-by: Brett Hoerner <[email protected]>
Co-authored-by: Zach Waterfield <[email protected]>
Co-authored-by: Frank Hamand <[email protected]>
Co-authored-by: Michael Matloka <[email protected]>
Co-authored-by: Neil Kakkar <[email protected]>
)

The existing Event Select implementation shows all events, both client-sent and autocaptured by the Posthog platform, when selecting an event to activate a Survey popup. Instead, we would like to only show custom events sent by the client in this list and show the user a tooltip about how the events are observed and surveys are activated.
* feat(settings): Allow linking to personal API key preset

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (1)

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (1)

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (1)

* Update UI snapshots for `chromium` (1)

* Update UI snapshots for `chromium` (2)

* Update UI snapshots for `chromium` (1)

* Update UI snapshots for `chromium` (2)

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

github-actions bot commented Jun 18, 2024

Size Change: 0 B

Total Size: 1.06 MB

ℹ️ View Unchanged
Filename Size
frontend/dist/toolbar.js 1.06 MB

compressed-size-action

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

4 snapshot changes in total. 0 added, 4 modified, 0 deleted:

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

3 snapshot changes in total. 0 added, 3 modified, 0 deleted:

  • chromium: 0 added, 3 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@timgl timgl merged commit 3fc2c30 into master Jun 18, 2024
84 checks passed
@timgl timgl deleted the dont-use-uuid-to-count branch June 18, 2024 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.