Self-fix function for distance calculation parallelization #48

Open
3 tasks
sigmafelix opened this issue Jan 29, 2024 · 1 comment
Labels
enhancement New feature or request function additional function ideas


sigmafelix commented Jan 29, 2024

Parallelizing distance calculations over spatial extents smaller than the full dataset's extent may produce erroneous values when some grids/sub-regions contain no target features, or when edge cases occur near the boundary between adjacent grids/sub-regions. Gradually expanding the grids can fix such edge cases. One challenge is to design a function that determines whether the distance calculated within a grid is shorter or longer than the actual shortest distance to the nearest feature that would have been found using the full dataset.

Problem statement

Given a grid $G_i$, a point or line target feature set $V$, and a point origin feature set $U$, we want to determine, for all $k, l$, whether
$\sup d((U_k \cap G_i), (V_l \cap G_i)) < \sup d((U_k \cap G_i), V)$, or
$\inf d((U_k \cap G_i), (V_l \cap G_i)) > \inf d((U_k \cap G_i), V)$.
The $\inf$ problem is the relevant one here, since we are calculating the shortest distance to the target feature set.
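
One way to see why the $\inf$ case is one-sided, as a sketch: take a single origin $u \in U \cap G_i$ (the symbols $u$ and $v$ are shorthand introduced here, and $\inf \emptyset = +\infty$ is assumed for a grid with no targets). Restricting the targets to a grid can only remove candidates, so the in-grid shortest distance can never be shorter than the full-dataset one.

```math
V \cap G_i \subseteq V
\quad\Longrightarrow\quad
\inf_{v \in V \cap G_i} d(u, v) \;\ge\; \inf_{v \in V} d(u, v)
```

Equality holds exactly when a nearest target of $u$ lies inside $G_i$, so the only error to detect is an in-grid distance that is too long.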

Hypothesis

  • A distance is considered suspicious/sub-optimal when it is longer than the distance from the origin point to the grid boundary.

  • Hypothesis implementation

  • Gradual increment in search window

  • Check the influence on performance
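
The hypothesis and the search-window idea above could be sketched in base R roughly as follows. This is a minimal illustration under stated assumptions, not the function planned for the package: it assumes rectangular grid cells given as `c(xmin, xmax, ymin, ymax)`, point targets only (no lines), planar coordinates, and Euclidean distance. The function names `in_cell`, `dist_to_cell_boundary`, `nearest_dist`, `flag_suspicious`, and `self_fix_distance`, and the `step` and `max_iter` parameters, are all hypothetical.

```r
# Hypothetical helper: which rows of xy fall inside the cell c(xmin, xmax, ymin, ymax)?
in_cell <- function(xy, cell) {
  xy[, 1] >= cell[1] & xy[, 1] <= cell[2] &
    xy[, 2] >= cell[3] & xy[, 2] <= cell[4]
}

# Hypothetical helper: distance from each point (inside the cell) to the nearest cell edge.
dist_to_cell_boundary <- function(xy, cell) {
  pmin(xy[, 1] - cell[1], cell[2] - xy[, 1],
       xy[, 2] - cell[3], cell[4] - xy[, 2])
}

# Hypothetical helper: shortest Euclidean distance from each origin to any target.
# Returns Inf for every origin when there are no targets.
nearest_dist <- function(origins, targets) {
  if (nrow(targets) == 0) return(rep(Inf, nrow(origins)))
  apply(origins, 1, function(p) {
    min(sqrt((targets[, 1] - p[1])^2 + (targets[, 2] - p[2])^2))
  })
}

# Hypothesis: an in-cell distance is suspicious when it is longer than the
# distance from the origin to the cell boundary.
flag_suspicious <- function(origins, targets, cell) {
  d_in <- nearest_dist(origins, targets[in_cell(targets, cell), , drop = FALSE])
  d_in > dist_to_cell_boundary(origins, cell)
}

# Gradual increment in search window: grow the cell by `step` on every side
# until the distance is no longer suspicious, or give up after `max_iter` tries.
self_fix_distance <- function(origin, targets, cell, step = 1, max_iter = 50) {
  origin <- matrix(origin, nrow = 1)
  for (i in seq_len(max_iter)) {
    d <- nearest_dist(origin, targets[in_cell(targets, cell), , drop = FALSE])
    if (d <= dist_to_cell_boundary(origin, cell)) return(d)
    cell <- cell + c(-step, step, -step, step)
  }
  NA_real_  # distance could not be confirmed within max_iter expansions
}

# Toy data: the second origin sits near the right edge, and its true nearest
# target (10.5, 5) lies just outside the cell.
cell    <- c(0, 10, 0, 10)
origins <- rbind(c(5, 5), c(9, 5))
targets <- rbind(c(5, 6), c(10.5, 5))

flags <- flag_suspicious(origins, targets, cell)
fixed <- self_fix_distance(origins[2, ], targets, cell)
```

In this toy case only the second origin is flagged, and one expansion of the window is enough to recover its distance to the target outside the original cell. The stopping rule relies on the fact that any target outside the window is at least as far away as the window boundary.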

sigmafelix added enhancement New feature or request function additional function ideas labels Jan 29, 2024
sigmafelix added this to the Manuscript work milestone Jan 29, 2024
sigmafelix self-assigned this Jan 29, 2024

sigmafelix commented Jun 7, 2024

A function in the next version will:

  • Use alphahull to identify the outermost points to detect erroneous distance calculation.
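library(terra)
library(sf)
library(alphahull)

# load the North Carolina polygons shipped with sf
nc <- vect(system.file("gpkg/nc.gpkg", package = "sf"))
# sample 3,000 random points within the polygons (was spatSample(ncp, ...), which is undefined)
ncp <- spatSample(nc, 3000)
# extract the x/y coordinate matrix
ncpp <- crds(ncp)
# compute the alpha-convex hull with alpha = 1 (in coordinate units)
ncp_ahull <- ahull(ncpp[,1], ncpp[,2], 1)

# get the outermost point row indices
ncp_ahull$ashape.obj$alpha.extremes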
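
As a rough illustration of how such row indices might then be used, the sketch below marks the outermost points as candidates for re-checking. It deliberately swaps in `grDevices::chull()` (a plain convex hull from base R) for the alpha hull so that it runs without extra packages. A convex hull only returns the convex vertices, so it will generally select fewer points than `alpha.extremes`. The toy coordinates and the names `outer_idx` and `recheck` are hypothetical.

```r
# Toy coordinates: four corners of a square plus one interior point.
xy <- rbind(c(0, 0), c(4, 0), c(4, 4), c(0, 4), c(2, 2))

# chull() returns the row indices of the convex hull vertices; it stands in
# here for ncp_ahull$ashape.obj$alpha.extremes.
outer_idx <- sort(grDevices::chull(xy))

# Mark the outermost points as candidates whose distances should be re-checked.
recheck <- seq_len(nrow(xy)) %in% outer_idx
```

Here the four corners are marked and the interior point is not.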
