Add Datafusion (#4)
* add databend

* add datafusion

* fix tests

---------

Co-authored-by: root <[email protected]>
lmangani and root authored Nov 14, 2023
1 parent 669ad86 commit 03d1027
Showing 7 changed files with 129 additions and 4 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/benchmarks.yml
@@ -13,7 +13,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        DBNAME: [ "chdb", "duckdb", "glaredb", "databend"]
+        DBNAME: [ "chdb", "duckdb", "glaredb", "databend", "datafusion"]

     steps:
     - uses: actions/checkout@v3
6 changes: 5 additions & 1 deletion benchmark.py
@@ -13,7 +13,7 @@

 DBNAME = os.getenv('DBNAME', '*')
 ITERATIONS = int(os.getenv('ITERATIONS', 3))
-BENCHMARKS = ["version", "count", "groupby", "groupby-local"]
+BENCHMARKS = ["version", "count", "groupby"]

 @contextmanager
 def suppress_stdout():
@@ -80,6 +80,10 @@ def main():
             print("Testing databend")
             databendx = SessionContext()
             benchmark_db("databend", lambda query: databendx.sql(query).collect())
+        case "datafusion":
+            print("Testing datafusion")
+            databendx = SessionContext()
+            benchmark_db("datafusion", lambda query: databendx.sql(query).collect())

 if __name__ == "__main__":
     main()
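
The new case hands benchmark_db a callable that takes a query string and runs it on the engine. The body of benchmark_db lies outside this hunk, so the following is only a rough sketch of that callable-based pattern, not the repository's implementation. The name time_query, its return value, and the stand-in engine are assumptions made for illustration; the default of 3 iterations mirrors the ITERATIONS default shown above.

```python
import time
from statistics import mean
from typing import Any, Callable, List


def time_query(run: Callable[[str], Any], query: str, iterations: int = 3) -> List[float]:
    """Hypothetical helper: call run(query) `iterations` times, return elapsed seconds per call."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run(query)  # engine-specific callable, e.g. lambda q: ctx.sql(q).collect()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in for a real engine so the sketch runs without datafusion installed.
    fake_engine = lambda q: len(q)
    results = time_query(fake_engine, "SELECT version()")
    print(f"mean: {mean(results):.6f}s over {len(results)} runs")
```

Passing a callable keeps the timing loop independent of any one engine's API, which is presumably why each case in main() only has to supply a small lambda.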
112 changes: 111 additions & 1 deletion poetry.lock

Generated file; diff not rendered by default.

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "embedded-olap-benchmarks"
-version = "0.1.2"
+version = "0.1.3"
 description = ""
 authors = ["Lorenzo Mangani <[email protected]>"]

@@ -11,6 +11,7 @@ chdb = "^0.16.0rc2"
 duckdb = "^0.9.1"
 glaredb = "^0.5.1"
 databend = "^1.2.207"
+datafusion = "^32.0.0"

 [tool.poetry.dev-dependencies]

1 change: 1 addition & 0 deletions queries/count.datafusion.sql
@@ -0,0 +1 @@
+SELECT count(*) FROM 'https://shell.duckdb.org/data/tpch/0_01/parquet/lineitem.parquet';
8 changes: 8 additions & 0 deletions queries/groupby.datafusion.sql
@@ -0,0 +1,8 @@
+SELECT TO_DATE(tpep_pickup_datetime::date) as day,
+       PULocationID as location,
+       count(*) as trips,
+       sum(fare_amount) + sum(mta_tax) + sum(tolls_amount) + sum(tip_amount) as revenue
+FROM 'https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet'
+WHERE trip_distance > 5
+GROUP BY tpep_pickup_datetime, location
+ORDER BY day
1 change: 1 addition & 0 deletions queries/version.datafusion.sql
@@ -0,0 +1 @@
+SELECT version()
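
The three files added above follow the naming pattern queries/<benchmark>.<dbname>.sql, matching the entries in BENCHMARKS and the DBNAME matrix. The code that reads them is not part of this diff, so the following is a sketch of one way such a lookup might work. The helper names query_path and load_query are hypothetical, not functions from benchmark.py.

```python
from pathlib import Path


def query_path(benchmark: str, dbname: str, root: Path = Path("queries")) -> Path:
    """Hypothetical helper: build the path queries/<benchmark>.<dbname>.sql."""
    return root / f"{benchmark}.{dbname}.sql"


def load_query(benchmark: str, dbname: str, root: Path = Path("queries")) -> str:
    """Hypothetical helper: read the query text and strip surrounding whitespace."""
    return query_path(benchmark, dbname, root).read_text().strip()


if __name__ == "__main__":
    # Print the file each benchmark would map to for the newly added engine.
    for benchmark in ["version", "count", "groupby"]:
        print(query_path(benchmark, "datafusion"))
```

With a convention like this, adding an engine only requires one SQL file per benchmark plus a case in main(), which is the shape of this commit.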
