Binary Predicates Introduction and Benchmark Notebook #1156

thomcom · 2023-05-26T20:59:02Z

Description

Closes #1138
Closes #1141 here
Depends on #1152
Depends on #1064

Please direct your attention to the notebook since the dependencies and delayed state of CI issues over this week have put a lot of files into this PR.

This notebook demonstrates cuSpatial's new binary predicates on large datasets, benchmarking and comparing against the host implementation on GeoPandas.

In order to support the large inputs for these comparisons I had to reactivate the pairwise_point_in_polygon functionality that I'd previously written off. This is because quadtree doesn't support large N for NxN operations, since it is many-to-many, and brute-force would require a huge number of iterations to support such large dataframes. There are some more optimizations that can be made to speed up pairwise_point_in_polygon, but the algorithm itself isn't sufficiently fast. It is detailed fairly closely in the notebook.

Please take a look and let's have some conversations about steps forward.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

review-notebook-app · 2023-05-26T20:59:06Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

notebooks/binary_predicates.ipynb

isVoid · 2023-05-30T20:32:24Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


Line #8. [display(widgets.HBox(row)) for row in rows]
There's not need to use list comprehension here. Just
for row in rows:
display(widgets.HBox(row))

Reply via ReviewNB

isVoid · 2023-05-30T20:32:25Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


Replace "+" mark with "-", I think "Point-point relationship" is more typical in showing a binary relationship.

Reply via ReviewNB

isVoid · 2023-05-30T20:32:25Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


Is there a broken output in this cell? Or it's just the reviewNB was not able to show it.

Reply via ReviewNB

Ah yeah it looks like ReviewNB can't display the HBoxed output areas that are defined by the import ipywidgets as widgets call. It shows a visualization of the geometries used in test.

isVoid · 2023-05-30T20:32:25Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


There are only two relationships that a point can have with another point: equality or inequality.

I understand what you mean here. Although for those who don't have much context in binpreds, point-point can also be disjoint, or intersects. But they can all be reduced to equal/inequality. Perhaps rephrase the word here for general audience?

Reply via ReviewNB

I gave it another pass.

isVoid · 2023-05-30T20:32:25Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


There are three relationships that a Point can have with a LineString: Disjoint, Point is equal to one of the LineString boundary points, and point falls between the LineString boundary points.
Is there a name for the latter 2 relationships?

Reply via ReviewNB

Not that I'm specifically aware of. I assume that you're looking for one of the binary predicate names since I used disjoint for the first name?

isVoid · 2023-05-30T20:38:48Z

Before I go further, may I ask who's the intended audience of this notebook? The first section is a deep dive into the situations between each possible combination of geometries, and this section makes me feel that it reveals too much of the underlying implementation detail to GIS users. If this is for documentation purposes, it also doesn't seem like it's clear enough that this is how the predicate is implemented.

isVoid · 2023-05-30T20:52:01Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


Line #2. """Times the speed of producing the test data as well as the predicate execution.
Why should you time the data production? Benchmark should only time the kernel.

Reply via ReviewNB

isVoid · 2023-05-30T20:52:01Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


Line #9. 'GeoPandas predicate time',

I have forgotten the difference between "predicate time" and "time". You said at one point you want to remove one of them?

Reply via ReviewNB

isVoid · 2023-05-30T20:52:01Z

notebooks/binary_predicates.ipynb

@@ -0,0 +1,1157 @@
+{


This section seems better to exist in code comments, not in notebook...

Reply via ReviewNB

isVoid · 2023-05-30T20:55:33Z

The notebook is largely benchmark, not binpred introduction.

jarmak-nv · 2023-05-31T17:24:42Z

Just providing a super high level review here - the notebook is incredibly valuable for us as a team, it can alert us to regressions, track improvements, and show where we should put our focus in the future assuming further acceleration is needed.

I think the other comments are correct in that this isn't the best "story" for highlighting the binpreds. Team facing, it's truly awesome, I think we could just also benefit from you making a more story-focused version that highlights some but not all of the binpreds more like the geocoding notebook.

Let's work as a team to figure it out a good story - you've done amazing work to get the binpreds into cuSpatial and I want as many people as possible to see/touch it and have the i get it 💡 moment.

isVoid

Chatted with @thomcom offline and walked through the changes in this PR. In addition to the notebook, there are multiple performance optimization in the binary predicates and the GeoSeries infrastructure that the code base can all benefits.

python/cuspatial/cuspatial/tests/binpreds/test_binpred_test_dispatch.py

harrism · 2023-08-02T22:34:05Z

.devcontainer/devcontainer.json

@@ -86,4 +86,6 @@
      }
    }
  }
+
+  "forwardPorts": [8080]


Weird, I use notebooks in the dev container often and didn't need this.

It was needed to use snakeviz with it, for some reason.

jarmak-nv · 2023-08-03T01:41:52Z

I think CI might be failing on the notebooks step because it doesn't know to look in subdirs like this was moved to.

Does it make sense for this to be in https://github.com/rapidsai/cuspatial/tree/branch-23.08/python/cuspatial/benchmarks/api?

thomcom · 2023-08-03T02:58:33Z

It might pass CI if I put it in that folder? :)

ajschmidt8

It looks like this PR introduces a new benchmarking notebook.

Is this notebook intended to be run in CI?

I think historically we've skipped benchmark notebooks in CI. See cuml and cugraph for reference:

cuspatial has a similar variable to skip notebook below:

cuspatial/ci/test_notebooks.sh

Line 37 in a09cc0d

SKIPNBS=""

harrism · 2023-08-03T21:07:06Z

Oh, we should probably skip the cuproj_benchmark.ipynb too.

thomcom · 2023-08-04T00:38:14Z

@ajschmidt8 your review might get this in for 23.08

ajschmidt8

Ray beat me to it.

approving anyway for good measure 👍

isVoid · 2023-08-04T13:10:32Z

/merge

isVoid · 2023-08-04T13:10:49Z

Thanks @raydouglass @ajschmidt8 ! Congrats @thomcom !

thomcom added 3 - Ready for Review Ready for review by team feature request New feature or request non-breaking Non-breaking change labels May 26, 2023

thomcom added this to the DE-9IM milestone May 26, 2023

thomcom requested review from harrism and isVoid May 26, 2023 20:59

thomcom requested review from a team as code owners May 26, 2023 20:59

thomcom self-assigned this May 26, 2023

github-actions bot added cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library Python Related to Python code labels May 26, 2023

harrism requested changes May 29, 2023

View reviewed changes

notebooks/binary_predicates.ipynb Show resolved Hide resolved

isVoid reviewed May 30, 2023

View reviewed changes

github-actions bot removed the libcuspatial Relates to the cuSpatial C++ library label May 31, 2023

isVoid approved these changes May 31, 2023

View reviewed changes

python/cuspatial/cuspatial/tests/binpreds/test_binpred_test_dispatch.py Outdated Show resolved Hide resolved

This comment was marked as duplicate.

Sign in to view

thomcom changed the base branch from branch-23.06 to branch-23.08 June 6, 2023 14:37

harrism reviewed Aug 2, 2023

View reviewed changes

harrism approved these changes Aug 2, 2023

View reviewed changes

thomcom requested a review from a team as a code owner August 3, 2023 03:34

github-actions bot added the ci label Aug 3, 2023

ajschmidt8 reviewed Aug 3, 2023

View reviewed changes

Skip the two benchmarking notebooks.

ff386c0

thomcom force-pushed the fea/binpred-introduction-notebook branch from 6abf517 to ff386c0 Compare August 3, 2023 22:42

isVoid requested a review from ajschmidt8 August 4, 2023 01:37

raydouglass approved these changes Aug 4, 2023

View reviewed changes

ajschmidt8 approved these changes Aug 4, 2023

View reviewed changes

rapids-bot bot merged commit ae8eb90 into rapidsai:branch-23.08 Aug 4, 2023
55 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Binary Predicates Introduction and Benchmark Notebook #1156

Binary Predicates Introduction and Benchmark Notebook #1156

thomcom commented May 26, 2023 •

edited

Loading

review-notebook-app bot commented May 26, 2023

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

thomcom May 31, 2023

isVoid May 30, 2023 •

edited

Loading

thomcom May 31, 2023

isVoid May 30, 2023 •

edited

Loading

thomcom May 31, 2023

isVoid commented May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid commented May 30, 2023 •

edited

Loading

jarmak-nv commented May 31, 2023

isVoid left a comment

This comment was marked as duplicate.

harrism Aug 2, 2023

thomcom Aug 3, 2023

jarmak-nv commented Aug 3, 2023

thomcom commented Aug 3, 2023

ajschmidt8 left a comment

harrism commented Aug 3, 2023

thomcom commented Aug 4, 2023

ajschmidt8 left a comment

isVoid commented Aug 4, 2023

isVoid commented Aug 4, 2023 •

edited

Loading

Binary Predicates Introduction and Benchmark Notebook #1156

Binary Predicates Introduction and Benchmark Notebook #1156

Conversation

thomcom commented May 26, 2023 • edited Loading

Description

Checklist

review-notebook-app bot commented May 26, 2023

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

thomcom May 31, 2023

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

thomcom May 31, 2023

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

thomcom May 31, 2023

Choose a reason for hiding this comment

isVoid commented May 30, 2023 • edited Loading

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

isVoid May 30, 2023 • edited Loading

Choose a reason for hiding this comment

isVoid commented May 30, 2023 • edited Loading

jarmak-nv commented May 31, 2023

isVoid left a comment

Choose a reason for hiding this comment

This comment was marked as duplicate.

harrism Aug 2, 2023

Choose a reason for hiding this comment

thomcom Aug 3, 2023

Choose a reason for hiding this comment

jarmak-nv commented Aug 3, 2023

thomcom commented Aug 3, 2023

ajschmidt8 left a comment

Choose a reason for hiding this comment

harrism commented Aug 3, 2023

thomcom commented Aug 4, 2023

ajschmidt8 left a comment

Choose a reason for hiding this comment

isVoid commented Aug 4, 2023

isVoid commented Aug 4, 2023 • edited Loading

thomcom commented May 26, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid commented May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid May 30, 2023 •

edited

Loading

isVoid commented May 30, 2023 •

edited

Loading

isVoid commented Aug 4, 2023 •

edited

Loading