feat: rust funnel udfs #25326

Merged
merged 9 commits into from
Oct 2, 2024
2 changes: 1 addition & 1 deletion .github/actions/run-backend-tests/action.yml
@@ -210,7 +210,7 @@ runs:

 - name: Upload updated timing data as artifacts
 uses: actions/upload-artifact@v4
-if: ${{ inputs.person-on-events != 'true' && inputs.clickhouse-server-image == 'clickhouse/clickhouse-server:23.12.6.19-alpine' }}
+if: ${{ inputs.person-on-events != 'true' && inputs.clickhouse-server-image == 'clickhouse/clickhouse-server:23.12.6.19' }}
 with:
 name: timing_data-${{ inputs.segment }}-${{ inputs.group }}
 path: .test_durations
4 changes: 2 additions & 2 deletions .github/workflows/ci-backend-update-test-timing.yml
@@ -29,13 +29,13 @@ jobs:
 group: 1
 token: ${{ secrets.POSTHOG_BOT_GITHUB_TOKEN }}
 python-version: '3.11.9'
-clickhouse-server-image: 'clickhouse/clickhouse-server:23.12.6.19-alpine'
+clickhouse-server-image: 'clickhouse/clickhouse-server:23.12.6.19'
 segment: 'FOSS'
 person-on-events: false

 - name: Upload updated timing data as artifacts
 uses: actions/upload-artifact@v4
-if: ${{ inputs.person-on-events != 'true' && inputs.clickhouse-server-image == 'clickhouse/clickhouse-server:23.12.6.19-alpine' }}
+if: ${{ inputs.person-on-events != 'true' && inputs.clickhouse-server-image == 'clickhouse/clickhouse-server:23.12.6.19' }}
 with:
 name: timing_data-${{ inputs.segment }}-${{ inputs.group }}
 path: .test_durations
6 changes: 3 additions & 3 deletions .github/workflows/ci-backend.yml
@@ -240,7 +240,7 @@ jobs:
 fail-fast: false
 matrix:
 python-version: ['3.11.9']
-clickhouse-server-image: ['clickhouse/clickhouse-server:23.12.6.19-alpine']
+clickhouse-server-image: ['clickhouse/clickhouse-server:23.12.6.19']
 segment: ['Core']
 person-on-events: [false, true]
 # :NOTE: Keep concurrency and groups in sync
@@ -249,7 +249,7 @@ jobs:
 include:
 - segment: 'Temporal'
 person-on-events: false
-clickhouse-server-image: 'clickhouse/clickhouse-server:23.12.6.19-alpine'
+clickhouse-server-image: 'clickhouse/clickhouse-server:23.12.6.19'
 python-version: '3.11.9'
 concurrency: 1
 group: 1
@@ -320,7 +320,7 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-clickhouse-server-image: ['clickhouse/clickhouse-server:23.12.6.19-alpine']
+clickhouse-server-image: ['clickhouse/clickhouse-server:23.12.6.19']
 if: needs.changes.outputs.backend == 'true'
 runs-on: ubuntu-latest
 steps:
2 changes: 1 addition & 1 deletion docker-compose.base.yml
@@ -79,7 +79,7 @@ services:
 # Note: please keep the default version in sync across
 # `posthog` and the `charts-clickhouse` repos
 #
-image: ${CLICKHOUSE_SERVER_IMAGE:-clickhouse/clickhouse-server:23.12.6.19-alpine}
+image: ${CLICKHOUSE_SERVER_IMAGE:-clickhouse/clickhouse-server:23.12.6.19}

Member

Moving off Alpine is probably a good idea, since this will more closely reflect what production runs.

Contributor Author

Yeah for sure. In this case, a glibc binary is more than 2x faster than a musl libc binary. Apparently musl has a slow memory allocator (it goes for simplicity and portability over speed), but that's huge lol

Contributor Author

I actually started wondering whether ClickHouse would run faster locally in a normal container than in an Alpine one
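
The glibc vs. musl gap mentioned in this thread could be checked locally with a small allocation-heavy micro-benchmark, built in release mode once for a glibc target and once for a musl target (for example `x86_64-unknown-linux-gnu` and `x86_64-unknown-linux-musl`). The sketch below is only an illustration under that assumption, not the measurement behind the numbers quoted here; `alloc_heavy` is a hypothetical name.

```rust
use std::time::{Duration, Instant};

/// Allocates and drops many small vectors, which stresses the memory allocator.
/// Returns the total number of bytes allocated and the elapsed wall-clock time.
/// (Hypothetical helper, not part of this PR.)
fn alloc_heavy(iterations: usize) -> (usize, Duration) {
    let start = Instant::now();
    let mut total = 0usize;
    for i in 0..iterations {
        // Vary the size slightly so the allocator sees several size classes.
        let v: Vec<u8> = vec![0u8; 16 + (i % 64)];
        // black_box keeps the optimizer from removing the allocation.
        total += std::hint::black_box(v).len();
    }
    (total, start.elapsed())
}
```

Calling `alloc_heavy` with a few million iterations from each build and comparing the returned durations would give a rough picture of the allocator difference; absolute numbers will vary by machine.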

 restart: on-failure

 zookeeper:
8 changes: 7 additions & 1 deletion docker/clickhouse/docker-entrypoint-initdb.d/init-db.sh
@@ -1,5 +1,11 @@
 #!/bin/bash
 set -e

-apk add python3
+apt-get update
+# Necessary because clickhouse runs on Ubuntu 20, which has an old glibc and an old python default
+# Can remove when we upgrade clickhouse, as the new images run on Ubuntu 22
+apt-get -y install python3.9
+ln -s /usr/bin/python3.9 /usr/bin/python3
+wget http://launchpadlibrarian.net/588931980/libc6_2.35-0ubuntu3_amd64.deb
+dpkg --auto-deconfigure -i libc6_2.35-0ubuntu3_amd64.deb
 cp -r /idl/* /var/lib/clickhouse/format_schemas/
14 changes: 11 additions & 3 deletions docker/clickhouse/user_defined_function.xml
@@ -29,7 +29,8 @@
 <name>value</name>
 </argument>
 <format>JSONEachRow</format>
-<command>aggregate_funnel.py</command>
+<command>aggregate_funnel</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -62,7 +63,8 @@
 <name>value</name>
 </argument>
 <format>JSONEachRow</format>
-<command>aggregate_funnel_cohort.py</command>
+<command>aggregate_funnel</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -95,7 +97,8 @@
 <name>value</name>
 </argument>
 <format>JSONEachRow</format>
-<command>aggregate_funnel_array.py</command>
+<command>aggregate_funnel</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -129,6 +132,7 @@
 </argument>
 <format>JSONEachRow</format>
 <command>aggregate_funnel_test.py</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -170,6 +174,7 @@
 </argument>
 <format>JSONEachRow</format>
 <command>aggregate_funnel_trends.py</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -208,6 +213,7 @@
 </argument>
 <format>JSONEachRow</format>
 <command>aggregate_funnel_array_trends.py</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -246,6 +252,7 @@
 </argument>
 <format>JSONEachRow</format>
 <command>aggregate_funnel_cohort_trends.py</command>
+<lifetime>600</lifetime>
 </function>

 <function>
@@ -283,5 +290,6 @@
 </argument>
 <format>JSONEachRow</format>
 <command>aggregate_funnel_array_trends_test.py</command>
+<lifetime>600</lifetime>
 </function>
 </functions>
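
For context: a ClickHouse executable UDF configured with `<format>JSONEachRow</format>` receives one JSON object per line on stdin and is expected to write one JSON object per line to stdout. The sketch below shows roughly what that read/write loop could look like in Rust. It is not the code from this PR: `process_line` and the `result` field are hypothetical, and the sketch uses only the standard library (not `serde_json`), so each input row is treated as an opaque string.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical per-row handler. A real UDF would parse the JSON row and
/// compute the funnel result; this placeholder only reports the row length.
fn process_line(line: &str) -> String {
    format!("{{\"result\":{}}}", line.len())
}

/// Reads rows line by line and writes one output row per non-empty input row.
/// Output is flushed after every row so the caller is not left waiting on
/// buffered data.
fn run<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue; // skip blank lines rather than emitting a row for them
        }
        writeln!(output, "{}", process_line(&line))?;
        output.flush()?;
    }
    Ok(())
}
```

In a binary, `main` would presumably call `run(io::stdin().lock(), io::stdout().lock())`. Keeping the loop generic over `BufRead` and `Write` makes it possible to exercise it with in-memory buffers in tests.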
1 change: 1 addition & 0 deletions funnel-udf/.gitignore
@@ -0,0 +1 @@
+/target
152 changes: 152 additions & 0 deletions funnel-udf/Cargo.lock

Some generated files are not rendered by default.

14 changes: 14 additions & 0 deletions funnel-udf/Cargo.toml
@@ -0,0 +1,14 @@
+[package]
+name = "funnels"
+version = "0.1.0"
+edition = "2021"
+
+[dependencies]
+serde = { version = "1.0.104", features = ["derive"] }
+serde_json = "1.0.48"
+itertools = "0.11"
+uuid = { version = "1.10.0", features = ["v4", "serde"] }
+
+
+[profile.release]
+lto = true