Merge pull request #50 from sheinbergon/dev-24.0.x
Back port new functionality to 24.0.x
sheinbergon authored Dec 23, 2023
2 parents f97ad09 + 1e4cc2f commit fd3ad80
Showing 5 changed files with 127 additions and 19 deletions.
40 changes: 22 additions & 18 deletions README.MD
@@ -2,21 +2,20 @@
[![Github Workflow Status](https://img.shields.io/github/actions/workflow/status/sheinbergon/dremio-udf-gis/release-ci.yml?branch=23.1.x&logo=githubactions&style=for-the-badge)](https://github.com/sheinbergon/dremio-udf-gis/actions?query=workflow%3Arelease-actions)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/sheinbergon/dremio-udf-gis?logo=github&color=%2340E0D0&style=for-the-badge)](https://github.com/sheinbergon/dremio-udf-gis/releases/latest)
[![Maven Central](https://img.shields.io/maven-central/v/org.sheinbergon/dremio-udf-gis?logo=apachemaven&color=Crimson&style=for-the-badge)](https://search.maven.org/search?q=g:org.sheinbergon%20a:dremio-udf-gis*)
[![Snyk Vulnerabilities for GitHub Repo](https://img.shields.io/snyk/vulnerabilities/github/sheinbergon/dremio-udf-gis?logo=snyk&color=432f95&style=for-the-badge)](https://app.snyk.io/org/sheinbergon/project/94183993-505b-439c-9078-6276fa4c1626)
[![Coveralls](https://img.shields.io/coveralls/github/sheinbergon/dremio-udf-gis?logo=coveralls&style=for-the-badge)](https://coveralls.io/github/sheinbergon/dremio-udf-gis)
[![Liberapay](https://img.shields.io/liberapay/patrons/sheinbergon?logo=liberapay&style=for-the-badge)](https://liberapay.com/sheinbergon/donate)

# Dremio Geo-Spatial Extensions

### What you get

- Widespread OGC implementation for SQL (adheres to PostGIS standards)
- Supported input formats: `WKT`, `WKB (HEX or BINARY)`
- Supported output formats: `WKT`, `WKB`, `GeoJSON`
- Easily installable shaded jar artifacts, available from Maven Central and GitHub
- Dremio CE version compatibility (new versions will be released with each community edition)
- Up-to-date Proj4J & JTS geometry based implementation


### Sponsorship

Enjoying my work? A show of support would be much appreciated :grin:
@@ -28,6 +27,7 @@ Enjoying my work? A show of support would be much obliged :grin:
</a>

### Installation

- Take the shaded jar for the desired version and place it inside your Dremio installation (`$DREMIO_HOME/jars/3rdparty`)
- Restart your Dremio server(s)
- Rejoice! (and see the [WIKI](https://github.com/sheinbergon/dremio-udf-gis/wiki) for detailed usage instructions)
@@ -36,29 +36,30 @@ Enjoying my work? A show of support would be much obliged :grin:

| Library Version | Dremio Version | Status |
|-----------------|----------------|------------|
| 0.2.x | 20.1.x | Legacy |
| 0.3.x | 21.1.x | Legacy |
| 0.4.x | 21.2.x | Legacy |
| 0.5.x | 22.0.x | Maintained |
| 0.6.x | 22.1.x | Maintained |
| 0.7.x | 23.0.x | Maintained |
| 0.8.x | 23.1.x | Maintained |
| 0.9.x | 24.0.x | Maintained |

| 0.2.x | 20.1.0 | Legacy |
| 0.3.x | 21.1.1 | Legacy |
| 0.4.x | 21.2.0 | Legacy |
| 0.5.x | 22.0.0 | Legacy |
| 0.6.x | 22.1.1 | Legacy |
| 0.7.x | 23.0.1 | Legacy |
| 0.8.x | 23.1.0 | Legacy |
| 0.9.x | 24.0.0 | Maintained |

### Usage Notes

Unlike PostGIS, Dremio is only a query engine that operates on existing/projected data sources/lakes.
That means that `Geometry` is not a natively supported data type, and you can only access it if
it's properly projected from the data source (for example, a PostGIS `Geometry` is read as a HEX-encoded `EWKB` string).

In order to successfully use the provided GIS functions, you must first make sure the geometry is in `WKB (BINARY)` format.
If it's not, you need to decode it:

- If the input is in `WKT` format, use `ST_GeomFromText`
- If the input is a HEX-encoded `WKB`, use Dremio's `FROM_HEX`

This library uses Dremio's Arrow buffers (`ArrowBuf`) to keep geometry data in binary (`WKB`) format (for performance and efficiency)
when passing it between GIS functions, which is of course undecipherable to the naked eye. When running queries from the UI,
`WKB` output will always be base64 encoded.

To convert data back to a human-readable format (`WKT` or `GeoJSON`), use `ST_AsText`/`ST_AsGeoJson`.
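
To make the encodings mentioned in these notes concrete, here is a minimal Java sketch using only the standard library. It turns a HEX-encoded `WKB` string into raw bytes (similar in spirit to what `FROM_HEX` yields) and then base64-encodes those bytes, which is the form the UI is described as showing. The class `WkbEncodings`, its method names, and the sample `POINT (0.5 0.5)` value are illustrative assumptions; they are not part of this library or of Dremio.

```java
import java.util.Base64;

// Illustrative helper only; not part of dremio-udf-gis or Dremio.
final class WkbEncodings {

    // Decodes a HEX string into raw bytes (the binary WKB form the GIS functions expect).
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("HEX input must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a HEX digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Base64-encodes raw WKB bytes, mirroring how binary output is rendered as text.
    static String toBase64(byte[] wkb) {
        return Base64.getEncoder().encodeToString(wkb);
    }

    public static void main(String[] args) {
        // Assumed sample: little-endian WKB for POINT (0.5 0.5), HEX-encoded.
        String hex = "01"                  // byte order: little-endian
                + "01000000"               // geometry type: Point
                + "000000000000E03F"       // x = 0.5
                + "000000000000E03F";      // y = 0.5
        byte[] wkb = fromHex(hex);
        System.out.println(wkb.length + " bytes, base64: " + toBase64(wkb));
    }
}
```

Inside a query, the same conversions are done by the SQL functions named above; the sketch only shows what the bytes look like at each stage.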

@@ -73,11 +74,14 @@ SELECT ST_AsText(
```

### Roadmap

- Frequent version/dependency updates
- Add more OGC/PostGIS matching functionality
- Add Geography type support

### Noteworthy Mentions

Work in this repository was originally based on the following sources:

- [Apache Drill GIS Functionality](https://github.com/apache/drill/tree/master/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/gis)
- [Christy Haragan's initial port](https://github.com/christyharagan/dremio-gis)
2 changes: 1 addition & 1 deletion pom.xml
@@ -21,7 +21,7 @@
<carrotsearch.version>0.7.0</carrotsearch.version>
<arrow-memory-netty.version>9.0.0-20221123064031-c39b8a6253-dremio</arrow-memory-netty.version>
</properties>
<version>0.9.2-SNAPSHOT</version>
<version>0.9.4-SNAPSHOT</version>
<name>dremio-udf-gis</name>
<description>GIS UDF extensions for Dremio</description>
<url>https://github.com/sheinbergon/dremio-udf-gis</url>
@@ -0,0 +1,55 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
* <p>
* http://www.apache.org/licenses/LICENSE-2.0
* <p>
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.sheinbergon.dremio.udf.gis;

import com.dremio.exec.expr.SimpleFunction;
import com.dremio.exec.expr.annotations.FunctionTemplate;
import com.dremio.exec.expr.annotations.Output;
import com.dremio.exec.expr.annotations.Param;

import javax.inject.Inject;

@FunctionTemplate(
  name = "ST_GeomFromGeoJSON",
  scope = FunctionTemplate.FunctionScope.SIMPLE,
  nulls = FunctionTemplate.NullHandling.INTERNAL)
public class STGeomFromGeoJson implements SimpleFunction {

  @Param
  org.apache.arrow.vector.holders.NullableVarCharHolder jsonInput;

  @Output
  org.apache.arrow.vector.holders.NullableVarBinaryHolder binaryOutput;

  @Inject
  org.apache.arrow.memory.ArrowBuf buffer;

  public void setup() {
  }

  public void eval() {
    if (org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.isHolderSet(jsonInput)) {
      org.locationtech.jts.geom.Geometry geom = org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.toGeometryFromGeoJson(jsonInput);
      byte[] bytes = org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.toEWKB(geom);
      buffer = buffer.reallocIfNeeded(bytes.length);
      org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.populate(bytes, buffer, binaryOutput);
    } else {
      org.sheinbergon.dremio.udf.gis.util.GeometryHelpers.markHolderNotSet(binaryOutput);
    }
  }
}
@@ -27,6 +27,7 @@
import org.locationtech.jts.algorithm.Angle;
import org.locationtech.jts.geom.*;
import org.locationtech.jts.io.*;
import org.locationtech.jts.io.geojson.GeoJsonReader;
import org.locationtech.jts.io.geojson.GeoJsonWriter;
import org.locationtech.jts.operation.buffer.BufferOp;
import org.locationtech.jts.operation.valid.IsValidOp;
@@ -126,6 +127,17 @@ public static Geometry toGeometry(final @Nonnull NullableVarCharHolder holder) {
    }
  }

  @Nonnull
  public static Geometry toGeometryFromGeoJson(final @Nonnull NullableVarCharHolder holder) {
    try {
      String json = toUTF8String(holder);
      GeoJsonReader reader = new GeoJsonReader();
      return reader.read(json);
    } catch (ParseException x) {
      throw new RuntimeException(x);
    }
  }

  @Nonnull
  public static Geometry toGeometryFromEWKT(final @Nonnull NullableVarCharHolder holder) {
    try {
@@ -0,0 +1,37 @@
package org.sheinbergon.dremio.udf.gis

import org.apache.arrow.vector.holders.NullableVarBinaryHolder
import org.apache.arrow.vector.holders.NullableVarCharHolder
import org.sheinbergon.dremio.udf.gis.spec.GeometryInputFunSpec
import org.sheinbergon.dremio.udf.gis.util.allocateBuffer

internal class STGeomFromGeoJsonTests : GeometryInputFunSpec.NullableVarChar<STGeomFromGeoJson>() {

  init {
    testGeometryInput(
      "Calling ST_GeomFromGeoJSON on a POINT",
      """
        {"type":"Point","coordinates":[0.5,0.5],"crs":{"type":"name","properties":{"name":"EPSG:4326"}}}
      """.trimIndent(),
      byteArrayOf(1, 1, 0, 0, 32, -26, 16, 0, 0, 0, 0, 0, 0, 0, 0, -32, 63, 0, 0, 0, 0, 0, 0, -32, 63)
    )

    testInvalidGeometryInput(
      "Calling ST_GeomFromGeoJSON on rubbish text",
      "42ifon2 fA!@",
    )

    testNullGeometryInput(
      "Calling ST_GeomFromGeoJSON on null input"
    )
  }

  override val function = STGeomFromGeoJson().apply {
    jsonInput = NullableVarCharHolder()
    binaryOutput = NullableVarBinaryHolder()
    buffer = allocateBuffer()
  }

  override val STGeomFromGeoJson.input: NullableVarCharHolder get() = function.jsonInput
  override val STGeomFromGeoJson.output: NullableVarBinaryHolder get() = function.binaryOutput
}
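
The expected byte array in the POINT test above can be read by hand as an EWKB point: a byte-order marker, a geometry type carrying the SRID flag, the SRID itself, and two doubles. The following Java sketch decodes exactly that layout with the standard library so the test expectation is easier to verify. The class `EwkbPointSketch` and its fields are hypothetical names introduced for illustration, and the decoder deliberately handles only 2D points; the library itself relies on JTS for this work.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative decoder for a 2D EWKB point; not part of dremio-udf-gis.
final class EwkbPointSketch {

    // EWKB type-word flags (PostGIS convention).
    static final int SRID_FLAG = 0x20000000;
    static final int ZM_FLAGS = 0xC0000000;
    static final int WKB_POINT = 1;

    final int srid; // 0 when no SRID is embedded
    final double x;
    final double y;

    private EwkbPointSketch(int srid, double x, double y) {
        this.srid = srid;
        this.x = x;
        this.y = y;
    }

    static EwkbPointSketch decode(byte[] ewkb) {
        ByteBuffer buf = ByteBuffer.wrap(ewkb);
        // First byte: 1 = little-endian, 0 = big-endian.
        buf.order(buf.get() == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        int type = buf.getInt();
        if ((type & ZM_FLAGS) != 0 || (type & 0x1FFFFFFF) != WKB_POINT) {
            throw new IllegalArgumentException("Only 2D points are handled by this sketch");
        }
        // The SRID is present only when the flag is set in the type word.
        int srid = (type & SRID_FLAG) != 0 ? buf.getInt() : 0;
        double x = buf.getDouble();
        double y = buf.getDouble();
        return new EwkbPointSketch(srid, x, y);
    }

    public static void main(String[] args) {
        // Byte values copied from the test expectation above.
        byte[] expected = {1, 1, 0, 0, 32, -26, 16, 0, 0,
                0, 0, 0, 0, 0, 0, -32, 63,
                0, 0, 0, 0, 0, 0, -32, 63};
        EwkbPointSketch p = decode(expected);
        System.out.println("SRID=" + p.srid + " x=" + p.x + " y=" + p.y);
    }
}
```

Read this way, the array encodes a little-endian point with SRID 4326 at (0.5, 0.5), which matches the `crs` and `coordinates` members of the GeoJSON input in the test.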
