From af966a06cdf18726627397d78ddd2d672f2049ad Mon Sep 17 00:00:00 2001 From: Kyle Barron Date: Fri, 8 Sep 2023 15:24:29 -0400 Subject: [PATCH 1/3] wip flatgeobuf observable example --- _quarto.yml | 1 + flatgeobuf/flatgeobuf-in-js.qmd | 151 ++++++++++++++++++++++++++++++++ flatgeobuf/intro.qmd | 2 + 3 files changed, 154 insertions(+) create mode 100644 flatgeobuf/flatgeobuf-in-js.qmd diff --git a/_quarto.yml b/_quarto.yml index 864aa3b..3e65bc6 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -59,6 +59,7 @@ website: - flatgeobuf/intro.qmd - flatgeobuf/hilbert-r-tree.qmd - flatgeobuf/flatgeobuf.ipynb + - flatgeobuf/flatgeobuf-in-js.qmd - section: PMTiles contents: - pmtiles/intro.qmd diff --git a/flatgeobuf/flatgeobuf-in-js.qmd b/flatgeobuf/flatgeobuf-in-js.qmd new file mode 100644 index 0000000..0de5f9b --- /dev/null +++ b/flatgeobuf/flatgeobuf-in-js.qmd @@ -0,0 +1,151 @@ +--- +title: FlatGeobuf in JavaScript +subtitle: Example of using FlatGeobuf with Leaflet +--- + +FlatGeobuf is a cloud-native vector data format because it contains a built-in spatial index that allows reading a specific spatial region from within the file without downloading the entire file's content. + +This is very useful from browser applications, because + +This is an example of using FlatGeobuf with cloud-native spatial filtering in JavaScript. + +## Streaming vs Filtering + +FlatGeobuf is amenable to two different manners of getting the data into the browser. + +Streaming is ... + +Filtering is ... + + +## Example + +Load the FlatGeobuf JavaScript library: + +```{ojs} +flatgeobuf = require("flatgeobuf@3.26.2/dist/flatgeobuf-geojson.min.js") +``` + +This library has two functions: `deserialize` to fetch a remote file and parse it to GeoJSON, and `serialize`, which converts GeoJSON to FlatGeobuf. + +```{ojs} +flatgeobuf +``` + +For this demo, we'll use the same data source as in the [FlatGeobuf leaflet example](https://flatgeobuf.org/examples/leaflet/large.html). This data file represents every census block in the USA. + +```{ojs} +url = 'https://flatgeobuf.septima.dk/population_areas.fgb' +``` + +The above is a really big file at 12GB total size, so we don't want to fetch the entire file. In this demo, we'll choose a small bounding box representing an area over Manhattan in New York City. + +Beware: if you make this bounding box too big, FlatGeobuf will try to download a large amount of data into your browser and maybe crash the tab! + +```{ojs} +bbox = { + return { + minX: -74.003802, + minY: 40.725756, + maxX: -73.981481, + maxY: 40.744008, + } +} +``` + +```{ojs} +// leaflet uses lat-lon ordering I think +bboxObjectToArray = (obj) => [[obj.minY, obj.minX], [obj.maxY, obj.maxX]] +``` + +Render the rectangle to the Leaflet map + +```{ojs} +rectangle = L.rectangle(bboxObjectToArray(bbox), { interactive: false, color: "blue", fillOpacity: 0.0, opacity: 1.0 }).addTo(map); +``` + +Update map: +```{ojs} +console.log('hi') +// { +// // let previousResults = L.layerGroup().addTo(map); + +// // remove the old results +// // previousResults.remove(); +// const nextResults = L.layerGroup().addTo(map); +// // previousResults = nextResults; + +// const colorScale = ((d) => { +// return d > 750 ? '#800026' : +// d > 500 ? '#BD0026' : +// d > 250 ? '#E31A1C' : +// d > 100 ? '#FC4E2A' : +// d > 50 ? '#FD8D3C' : +// d > 25 ? '#FEB24C' : +// d > 10 ? '#FED976' : +// '#FFEDA0' +// }); + +// console.log(iter); +// const iter = flatgeobuf.deserialize(url, bbox); +// for await (const feature of iter) { +// // Leaflet styling +// const defaultStyle = { +// color: colorScale(feature.properties["population"]), +// weight: 1, +// fillOpacity: 0.4, +// }; +// L.geoJSON(feature, { +// style: defaultStyle, +// }).on({ +// 'mouseover': function(e) { +// const layer = e.target; +// layer.setStyle({ +// weight: 4, +// fillOpacity: 0.8, +// }); +// }, +// 'mouseout': function(e) { +// const layer = e.target; +// layer.setStyle(defaultStyle); +// } +// }).bindPopup(`${feature.properties["population"]} people live in this census block.`) +// .addTo(nextResults); +// } +// } +``` + +Instantiate the Leaflet map and include a default OSM tile layer. + +```{ojs} +map = { + const container = html`
`; + yield container; + const map = L.map(container).setView([40.7299, -73.9923], 13); + L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", { + attribution: "© OpenStreetMap contributors" + }).addTo(map); +} +``` + +Load the Leaflet JavaScript library and fetch its CSS styling defintions if needed. + +```{ojs} +L = { + const L = await require("leaflet@1/dist/leaflet.js"); + if (!L._style) { + const href = await require.resolve("leaflet@1/dist/leaflet.css"); + document.head.appendChild(L._style = html``); + } + return L; +} +``` + +### References + +This notebook is an amalgam of several resources + +- FlatGeobuf example +- [`@bjornharrtell/streaming-flatgeobuf`](https://observablehq.com/@bjornharrtell/streaming-flatgeobuf) is a useful related resource for an example of a streaming load of FlatGeobuf. +- [`@observablehq/hello-leaflet`](https://observablehq.com/@observablehq/hello-leaflet) for an example of loading and rendering a Leaflet map using Observable. +- diff --git a/flatgeobuf/intro.qmd b/flatgeobuf/intro.qmd index eb180bf..0320709 100644 --- a/flatgeobuf/intro.qmd +++ b/flatgeobuf/intro.qmd @@ -3,6 +3,8 @@ title: FlatGeobuf subtitle: Guidelines for FlatGeobuf --- + + # FlatGeobuf FlatGeobuf is a binary file format for geographic vector data, such as points, lines, and polygons. From 3d56d0baa4748368e3ab79d8f953a283f6d9cf10 Mon Sep 17 00:00:00 2001 From: Kyle Barron Date: Fri, 8 Sep 2023 18:41:47 -0600 Subject: [PATCH 2/3] Finished FlatGeobuf notebook --- flatgeobuf/flatgeobuf-in-js.qmd | 184 +++++++++++++++++++------------- 1 file changed, 111 insertions(+), 73 deletions(-) diff --git a/flatgeobuf/flatgeobuf-in-js.qmd b/flatgeobuf/flatgeobuf-in-js.qmd index 0de5f9b..baa0c3d 100644 --- a/flatgeobuf/flatgeobuf-in-js.qmd +++ b/flatgeobuf/flatgeobuf-in-js.qmd @@ -5,21 +5,24 @@ subtitle: Example of using FlatGeobuf with Leaflet FlatGeobuf is a cloud-native vector data format because it contains a built-in spatial index that allows reading a specific spatial region from within the file without downloading the entire file's content. -This is very useful from browser applications, because +This is very useful for browser-based applications, because it allows them to make use of large files hosted on commodity cloud object storage without maintaining a server. -This is an example of using FlatGeobuf with cloud-native spatial filtering in JavaScript. +This notebook provides an example of using FlatGeobuf with spatial filtering from JavaScript. -## Streaming vs Filtering +## Downloading vs Streaming vs Range reads -FlatGeobuf is amenable to two different manners of getting the data into the browser. +FlatGeobuf supports a few different ways of loading data into the browser. -Streaming is ... +_Downloading_ refers to fetching the entire FlatGeobuf file and parsing it after the full file has finished downloading. This has the downside that the user must wait for the entire download to finish before they see any interaction on the web page. This may lead to a user wondering if the web page is broken if it takes a while to download their data. -Filtering is ... +_Streaming_ refers to making use of a file's contents incrementally as it downloads. This approach still downloads the entire file from beginning to end, but enables e.g. rendering part of the data on a map quickly, before waiting for the full file to finish downloading. This has the benefit of increased responsiveness, but the downside that a large file will be loaded in full. FlatGeobuf supports streaming because the file's metadata is located at the beginning. A good example of this in action is ["Streaming FlatGeobuf"](https://observablehq.com/@bjornharrtell/streaming-flatgeobuf) by Björn Harrtell. +_Range reads_ refers to fetching only specific parts of the file that are required by the user. In the context of FlatGeobuf, this usually means a spatial query. FlatGeobuf enables this through its spatial index at the beginning of the file. Web clients can read the header, and then make requests only for data in a specific location. This has the benefit that very large files can be used in situations where downloading them in full would be impractical. A downside is that it takes more individual HTTP requests to understand which byte range in the file contains the desired data, leading to a longer latency before data starts to display. ## Example +This example uses slightly-modified JavaScript syntax used in [Observable notebooks](https://observablehq.com/). + Load the FlatGeobuf JavaScript library: ```{ojs} @@ -32,13 +35,13 @@ This library has two functions: `deserialize` to fetch a remote file and parse i flatgeobuf ``` -For this demo, we'll use the same data source as in the [FlatGeobuf leaflet example](https://flatgeobuf.org/examples/leaflet/large.html). This data file represents every census block in the USA. +For this demo, we'll use the same data source as in the [FlatGeobuf leaflet example](https://flatgeobuf.org/examples/leaflet/large.html). This data file represents _every census block in the USA_. ```{ojs} url = 'https://flatgeobuf.septima.dk/population_areas.fgb' ``` -The above is a really big file at 12GB total size, so we don't want to fetch the entire file. In this demo, we'll choose a small bounding box representing an area over Manhattan in New York City. +The above is a really big file at almost 12GB total size, so we don't want to fetch the entire file. In this demo, we'll choose a small bounding box representing an area over Manhattan in New York City. Beware: if you make this bounding box too big, FlatGeobuf will try to download a large amount of data into your browser and maybe crash the tab! @@ -53,82 +56,58 @@ bbox = { } ``` +The above `bbox` object represents a bounding box in the format required by the FlatGeobuf API, but Leaflet's API requires an array-formatted bounding box, so we'll define a function to convert between the two: + ```{ojs} -// leaflet uses lat-lon ordering I think -bboxObjectToArray = (obj) => [[obj.minY, obj.minX], [obj.maxY, obj.maxX]] +// leaflet uses lat-lon ordering +bboxObjectToArray = (obj) => [ + [obj.minY, obj.minX], + [obj.maxY, obj.maxX], +]; ``` -Render the rectangle to the Leaflet map +Next we'll fetch all the data from the FlatGeobuf file within this bounding box. Notice how we pass the `bbox` argument into `deserialize`. ```{ojs} -rectangle = L.rectangle(bboxObjectToArray(bbox), { interactive: false, color: "blue", fillOpacity: 0.0, opacity: 1.0 }).addTo(map); +features = { + const iter = flatgeobuf.deserialize(url, bbox); + const features = []; + for await (const feature of iter) { + features.push(feature); + } + return features; +} ``` -Update map: +There are 354 individual features that match this query: + ```{ojs} -console.log('hi') -// { -// // let previousResults = L.layerGroup().addTo(map); - -// // remove the old results -// // previousResults.remove(); -// const nextResults = L.layerGroup().addTo(map); -// // previousResults = nextResults; - -// const colorScale = ((d) => { -// return d > 750 ? '#800026' : -// d > 500 ? '#BD0026' : -// d > 250 ? '#E31A1C' : -// d > 100 ? '#FC4E2A' : -// d > 50 ? '#FD8D3C' : -// d > 25 ? '#FEB24C' : -// d > 10 ? '#FED976' : -// '#FFEDA0' -// }); - -// console.log(iter); -// const iter = flatgeobuf.deserialize(url, bbox); -// for await (const feature of iter) { -// // Leaflet styling -// const defaultStyle = { -// color: colorScale(feature.properties["population"]), -// weight: 1, -// fillOpacity: 0.4, -// }; -// L.geoJSON(feature, { -// style: defaultStyle, -// }).on({ -// 'mouseover': function(e) { -// const layer = e.target; -// layer.setStyle({ -// weight: 4, -// fillOpacity: 0.8, -// }); -// }, -// 'mouseout': function(e) { -// const layer = e.target; -// layer.setStyle(defaultStyle); -// } -// }).bindPopup(`${feature.properties["population"]} people live in this census block.`) -// .addTo(nextResults); -// } -// } +features ``` -Instantiate the Leaflet map and include a default OSM tile layer. +As in the FlatGeobuf example, we'll define a color scale based on how many people live in the census block. ```{ojs} -map = { - const container = html`
`; - yield container; - const map = L.map(container).setView([40.7299, -73.9923], 13); - L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", { - attribution: "© OpenStreetMap contributors" - }).addTo(map); -} +colorScale = (d) => { + return d > 750 + ? "#800026" + : d > 500 + ? "#BD0026" + : d > 250 + ? "#E31A1C" + : d > 100 + ? "#FC4E2A" + : d > 50 + ? "#FD8D3C" + : d > 25 + ? "#FEB24C" + : d > 10 + ? "#FED976" + : "#FFEDA0"; +}; ``` -Load the Leaflet JavaScript library and fetch its CSS styling defintions if needed. +Next we load the Leaflet JavaScript library and fetch its CSS styling defintions if needed. ```{ojs} L = { @@ -141,11 +120,70 @@ L = { } ``` +Next we instantiate the Leaflet map and include multiple layers: + +- An `L.tileLayer` to show basemap tiles on the map for context. +- An `L.rectangle` to show the bounding box of our FlatGeobuf query. +- An `L.layerGroup` to group all the FlatGeobuf features into a single layer. +- An `L.geoJSON` item for each feature in the FlatGeobuf response. + +```{ojs} +map = { + const container = html`
`; + yield container; + const map = L.map(container).setView([40.7299, -73.9923], 13); + L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", { + attribution: + "© OpenStreetMap contributors", + }).addTo(map); + + // Render the bounding box rectangle + L.rectangle(bboxObjectToArray(bbox), { + interactive: false, + color: "blue", + fillOpacity: 0.0, + opacity: 1.0, + }).addTo(map); + + const results = L.layerGroup().addTo(map); + for (const feature of features) { + // Leaflet styling + const defaultStyle = { + color: colorScale(feature.properties["population"]), + weight: 1, + fillOpacity: 0.4, + }; + L.geoJSON(feature, { + style: defaultStyle, + }) + .on({ + mouseover: function (e) { + const layer = e.target; + layer.setStyle({ + weight: 4, + fillOpacity: 0.8, + }); + }, + mouseout: function (e) { + const layer = e.target; + layer.setStyle(defaultStyle); + }, + }) + .bindPopup( + `${feature.properties["population"]} people live in this census block.` + ) + .addTo(results); + } +} +``` + +Voilà! We just fetched data directly from a massive FlatGeobuf file, directly from the client, without a server in between. + ### References -This notebook is an amalgam of several resources +This notebook was created with help from several resources -- FlatGeobuf example +- [FlatGeobuf Leaflet example](https://flatgeobuf.org/examples/leaflet/large.html) ([and its source code](https://github.com/flatgeobuf/flatgeobuf/blob/master/examples/leaflet/large.html)) - [`@bjornharrtell/streaming-flatgeobuf`](https://observablehq.com/@bjornharrtell/streaming-flatgeobuf) is a useful related resource for an example of a streaming load of FlatGeobuf. - [`@observablehq/hello-leaflet`](https://observablehq.com/@observablehq/hello-leaflet) for an example of loading and rendering a Leaflet map using Observable. -- + From 508019ad121bb7c394eca81c41bb02ddd4576d59 Mon Sep 17 00:00:00 2001 From: Kyle Barron Date: Fri, 8 Sep 2023 18:50:57 -0600 Subject: [PATCH 3/3] move note --- flatgeobuf/intro.qmd | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/flatgeobuf/intro.qmd b/flatgeobuf/intro.qmd index 0320709..7b9ac23 100644 --- a/flatgeobuf/intro.qmd +++ b/flatgeobuf/intro.qmd @@ -3,8 +3,6 @@ title: FlatGeobuf subtitle: Guidelines for FlatGeobuf --- - - # FlatGeobuf FlatGeobuf is a binary file format for geographic vector data, such as points, lines, and polygons. @@ -48,6 +46,12 @@ FlatGeobuf is a write-only format, and doesn't support appending, as that would FlatGeobuf optionally supports a spatial index at the beginning of the file, which can speed up reading portions of a file based on a spatial query. For more information on how this spatial index works, refer to the [Hilbert R Tree](./hilbert-r-tree.qmd) page. +::: {.callout-note} + +Note that because FlatGeobuf has no internal chunking, the spatial index references _every single object_ in the file. This means that for datasets with many small geometries, like points, the spatial index will be very large as a proportion of the file size. + +::: +