Documentation updated automatically for version 2022-11 #497

Open: wants to merge 1 commit into base: master
349 changes: 203 additions & 146 deletions app/content/analytics-toolbox-bigquery/release-notes.md

Large diffs are not rendered by default.

8 changes: 6 additions & 2 deletions app/content/analytics-toolbox-bigquery/sql-reference/h3.md
@@ -368,9 +368,13 @@ carto.H3_POLYFILL(geography, resolution)

**Description**

Returns an array with all the H3 cell indexes **with centers** contained in a given polygon. It will return `null` on error (invalid geography type or resolution out of bounds).
Returns an array with all the H3 cell indexes **with centers** contained in a given polygon. It will return `null` on error (invalid geography type or resolution out of bounds). For lines, it will return the indexes of the H3 cells intersecting those lines. For a point, it will return the index of the H3 cell that contains it.

* `geography`: `GEOGRAPHY` **polygon** or **multipolygon** representing the area to cover.
{{% bannerNote type="note" title="warning"%}}
The polyfill of lines is calculated by approximating S2 cells to H3 cells, so in some cases cells might be missing.
{{%/ bannerNote %}}

* `geography`: `GEOGRAPHY` representing the area to cover.
* `resolution`: `INT64` number between 0 and 15 with the [H3 resolution](https://h3geo.org/docs/core-library/restable).
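
As a hedged illustration of the three geometry types described above, the query below calls the function on a polygon, a line and a point. The WKT geometries and the resolution 9 are arbitrary sample values, and the `carto-os` project prefix is assumed from the other core-module examples in this reference. The returned cells are not listed because they depend on the deployed version.

```sql
-- Sample geometries are hypothetical; only the function signature comes from this page.
SELECT
  -- Polygon: cells whose centers fall inside the polygon
  `carto-os`.carto.H3_POLYFILL(
    ST_GEOGFROMTEXT('POLYGON((-3.71 40.41, -3.69 40.41, -3.69 40.42, -3.71 40.42, -3.71 40.41))'),
    9
  ) AS polygon_cells,
  -- Line: cells intersecting the line (approximated, some cells might be missing)
  `carto-os`.carto.H3_POLYFILL(
    ST_GEOGFROMTEXT('LINESTRING(-3.71 40.41, -3.69 40.42)'),
    9
  ) AS line_cells,
  -- Point: the single cell containing the point
  `carto-os`.carto.H3_POLYFILL(
    ST_GEOGPOINT(-3.70, 40.415),
    9
  ) AS point_cells;
```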

**Return type**
@@ -22,9 +22,9 @@ The CARTO Analytics Toolbox's functions are organized in modules based on the fu
| processing | core |<ul style="list-style:none"><li><a href="../processing/#st_delaunaylines">ST_DELAUNAYLINES</a></li><li><a href="../processing/#st_delaunaypolygons">ST_DELAUNAYPOLYGONS</a></li><li><a href="../processing/#st_polygonize">ST_POLYGONIZE</a></li><li><a href="../processing/#st_voronoilines">ST_VORONOILINES</a></li><li><a href="../processing/#st_voronoipolygons">ST_VORONOIPOLYGONS</a></li></ul>|
| quadbin | core |<ul style="list-style:none"><li><a href="../quadbin/#quadbin_bbox">QUADBIN_BBOX</a></li><li><a href="../quadbin/#quadbin_boundary">QUADBIN_BOUNDARY</a></li><li><a href="../quadbin/#quadbin_center">QUADBIN_CENTER</a></li><li><a href="../quadbin/#quadbin_fromgeogpoint">QUADBIN_FROMGEOGPOINT</a></li><li><a href="../quadbin/#quadbin_fromlonglat">QUADBIN_FROMLONGLAT</a></li><li><a href="../quadbin/#quadbin_fromzxy">QUADBIN_FROMZXY</a></li><li><a href="../quadbin/#quadbin_isvalid">QUADBIN_ISVALID</a></li><li><a href="../quadbin/#quadbin_kring">QUADBIN_KRING</a></li><li><a href="../quadbin/#quadbin_kring_distances">QUADBIN_KRING_DISTANCES</a></li><li><a href="../quadbin/#quadbin_polyfill">QUADBIN_POLYFILL</a></li><li><a href="../quadbin/#quadbin_resolution">QUADBIN_RESOLUTION</a></li><li><a href="../quadbin/#quadbin_sibling">QUADBIN_SIBLING</a></li><li><a href="../quadbin/#quadbin_tochildren">QUADBIN_TOCHILDREN</a></li><li><a href="../quadbin/#quadbin_toparent">QUADBIN_TOPARENT</a></li><li><a href="../quadbin/#quadbin_tozxy">QUADBIN_TOZXY</a></li></ul>|
| random | core |<ul style="list-style:none"><li><a href="../random/#st_generatepoints">ST_GENERATEPOINTS</a></li></ul>|
| retail | advanced |<ul style="list-style:none"><li><a href="../retail/#build_revenue_model_data">BUILD_REVENUE_MODEL_DATA</a></li><li><a href="../retail/#build_revenue_model">BUILD_REVENUE_MODEL</a></li><li><a href="../retail/#predict_revenue_average">PREDICT_REVENUE_AVERAGE</a></li><li><a href="../retail/#find_whitespace_areas">FIND_WHITESPACE_AREAS</a></li><li><a href="../retail/#find_twin_areas">FIND_TWIN_AREAS</a></li><li><a href="../retail/#commercial_hotspots">COMMERCIAL_HOTSPOTS</a></li><li><a href="../retail/#build_cannibalization_data">BUILD_CANNIBALIZATION_DATA</a></li><li><a href="../retail/#cannibalization_overlap">CANNIBALIZATION_OVERLAP</a></li></ul>|
| retail | advanced |<ul style="list-style:none"><li><a href="../retail/#build_revenue_model_data">BUILD_REVENUE_MODEL_DATA</a></li><li><a href="../retail/#build_revenue_model">BUILD_REVENUE_MODEL</a></li><li><a href="../retail/#predict_revenue_average">PREDICT_REVENUE_AVERAGE</a></li><li><a href="../retail/#find_whitespace_areas">FIND_WHITESPACE_AREAS</a></li><li><a href="../retail/#find_twin_areas">FIND_TWIN_AREAS</a></li><li><a href="../retail/#commercial_hotspots">COMMERCIAL_HOTSPOTS</a></li><li><a href="../retail/#build_cannibalization_data">BUILD_CANNIBALIZATION_DATA</a></li><li><a href="../retail/#cannibalization_overlap">CANNIBALIZATION_OVERLAP</a></li><li><a href="../retail/#find_twin_areas_weighted">FIND_TWIN_AREAS_WEIGHTED</a></li></ul>|
| routing | advanced |<ul style="list-style:none"><li><a href="../routing/#distance_map">DISTANCE_MAP</a></li><li><a href="../routing/#distance_map_from_network">DISTANCE_MAP_FROM_NETWORK</a></li><li><a href="../routing/#distance_map_from_network_table">DISTANCE_MAP_FROM_NETWORK_TABLE</a></li><li><a href="../routing/#find_shortest_path">FIND_SHORTEST_PATH</a></li><li><a href="../routing/#find_shortest_path_from_network">FIND_SHORTEST_PATH_FROM_NETWORK</a></li><li><a href="../routing/#find_shortest_path_from_network_table">FIND_SHORTEST_PATH_FROM_NETWORK_TABLE</a></li><li><a href="../routing/#generate_network">GENERATE_NETWORK</a></li><li><a href="../routing/#generate_network_table">GENERATE_NETWORK_TABLE</a></li></ul>|
| s2 | core |<ul style="list-style:none"><li><a href="../s2/#s2_boundary">S2_BOUNDARY</a></li><li><a href="../s2/#s2_center">S2_CENTER</a></li><li><a href="../s2/#s2_fromgeogpoint">S2_FROMGEOGPOINT</a></li><li><a href="../s2/#s2_fromhilbertquadkey">S2_FROMHILBERTQUADKEY</a></li><li><a href="../s2/#s2_fromlonglat">S2_FROMLONGLAT</a></li><li><a href="../s2/#s2_fromtoken">S2_FROMTOKEN</a></li><li><a href="../s2/#s2_fromuint64repr">S2_FROMUINT64REPR</a></li><li><a href="../s2/#s2_tohilbertquadkey">S2_TOHILBERTQUADKEY</a></li><li><a href="../s2/#s2_totoken">S2_TOTOKEN</a></li><li><a href="../s2/#s2_touint64repr">S2_TOUINT64REPR</a></li></ul>|
| s2 | core |<ul style="list-style:none"><li><a href="../s2/#s2_boundary">S2_BOUNDARY</a></li><li><a href="../s2/#s2_center">S2_CENTER</a></li><li><a href="../s2/#s2_fromgeogpoint">S2_FROMGEOGPOINT</a></li><li><a href="../s2/#s2_fromhilbertquadkey">S2_FROMHILBERTQUADKEY</a></li><li><a href="../s2/#s2_fromlonglat">S2_FROMLONGLAT</a></li><li><a href="../s2/#s2_fromtoken">S2_FROMTOKEN</a></li><li><a href="../s2/#s2_fromuint64repr">S2_FROMUINT64REPR</a></li><li><a href="../s2/#s2_resolution">S2_RESOLUTION</a></li><li><a href="../s2/#s2_tochildren">S2_TOCHILDREN</a></li><li><a href="../s2/#s2_tohilbertquadkey">S2_TOHILBERTQUADKEY</a></li><li><a href="../s2/#s2_totoken">S2_TOTOKEN</a></li><li><a href="../s2/#s2_touint64repr">S2_TOUINT64REPR</a></li></ul>|
| statistics | advanced |<ul style="list-style:none"><li><a href="../statistics/#getis_ord_h3">GETIS_ORD_H3</a></li><li><a href="../statistics/#getis_ord_quadbin">GETIS_ORD_QUADBIN</a></li><li><a href="../statistics/#gfun">GFUN</a></li><li><a href="../statistics/#gwr_grid">GWR_GRID</a></li><li><a href="../statistics/#knn">KNN</a></li><li><a href="../statistics/#local_morans_i_h3">LOCAL_MORANS_I_H3</a></li><li><a href="../statistics/#local_morans_i_quadbin">LOCAL_MORANS_I_QUADBIN</a></li><li><a href="../statistics/#lof">LOF</a></li><li><a href="../statistics/#lof_table">LOF_TABLE</a></li><li><a href="../statistics/#morans_i_h3">MORANS_I_H3</a></li><li><a href="../statistics/#morans_i_quadbin">MORANS_I_QUADBIN</a></li><li><a href="../statistics/#ordinary_kriging">ORDINARY_KRIGING</a></li><li><a href="../statistics/#ordinary_kriging_table">ORDINARY_KRIGING_TABLE</a></li><li><a href="../statistics/#p_value">P_VALUE</a></li><li><a href="../statistics/#smoothing_mrf_h3">SMOOTHING_MRF_H3</a></li><li><a href="../statistics/#smoothing_mrf_quadbin">SMOOTHING_MRF_QUADBIN</a></li><li><a href="../statistics/#variogram">VARIOGRAM</a></li></ul>|
| tiler | advanced |<ul style="list-style:none"><li><a href="../tiler/#create_point_aggregation_tileset">CREATE_POINT_AGGREGATION_TILESET</a></li><li><a href="../tiler/#create_simple_tileset">CREATE_SIMPLE_TILESET</a></li><li><a href="../tiler/#create_spatial_index_tileset">CREATE_SPATIAL_INDEX_TILESET</a></li><li><a href="../tiler/#create_tileset">CREATE_TILESET</a></li></ul>|
| transformations | core |<ul style="list-style:none"><li><a href="../transformations/#st_buffer">ST_BUFFER</a></li><li><a href="../transformations/#st_centermean">ST_CENTERMEAN</a></li><li><a href="../transformations/#st_centermedian">ST_CENTERMEDIAN</a></li><li><a href="../transformations/#st_centerofmass">ST_CENTEROFMASS</a></li><li><a href="../transformations/#st_concavehull">ST_CONCAVEHULL</a></li><li><a href="../transformations/#st_destination">ST_DESTINATION</a></li><li><a href="../transformations/#st_greatcircle">ST_GREATCIRCLE</a></li><li><a href="../transformations/#st_line_interpolate_point">ST_LINE_INTERPOLATE_POINT</a></li></ul>|
@@ -69,9 +69,11 @@ SELECT `carto-os`.carto.ST_DELAUNAYLINES(
```

{{% bannerNote type="note" title="ADDITIONAL EXAMPLES"%}}

* [A NYC subway connection graph using Delaunay triangulation](/analytics-toolbox-bigquery/examples/a-nyc-subway-connection-graph-using-delaunay-triangulation/)
{{%/ bannerNote %}}


### ST_DELAUNAYPOLYGONS

{{% bannerNote type="code" %}}
@@ -132,7 +134,6 @@ SELECT `carto-os`.carto.ST_DELAUNAYPOLYGONS(
```



### ST_POLYGONIZE

{{% bannerNote type="code" %}}
77 changes: 75 additions & 2 deletions app/content/analytics-toolbox-bigquery/sql-reference/retail.md
@@ -116,8 +116,7 @@ This procedure is the second step of the Revenue Prediction analysis workflow. I
* MAX_ITERATIONS: 50
* DATA_SPLIT_METHOD: 'NO_SPLIT'

This parameter allows using other options compatible with the model used. Models currently supported are [LINEAR_REG](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create-glm) and [BOOSTED_TREE_REGRESSOR](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create-boosted-tree). Check the [model documentation](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create#model_option_list) for more information. Please note that `BOOSTED_TREE_REGRESSOR` is available only on certain regions, as detailed <a href="https://cloud.google.com/bigquery-ml/docs/locations#regional-locations" target="_blank">here</a>.

This parameter allows using other options compatible with the model used. Models currently supported are [LINEAR_REG](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create-glm) and [BOOSTED_TREE_REGRESSOR](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create-boosted-tree). Check the [model documentation](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create#model_option_list) for more information.
* `output_prefix`: `STRING` destination prefix for the output tables. It must contain the project, dataset and prefix. For example `<my-project>.<my-dataset>.<output-prefix>`.

**Output**
@@ -524,4 +523,78 @@ CALL `carto-un`.carto.CANNIBALIZATION_OVERLAP(
{{%/ bannerNote %}}


### FIND_TWIN_AREAS_WEIGHTED

{{% bannerNote type="code" %}}
carto.FIND_TWIN_AREAS_WEIGHTED(origin_query, target_query, index_column, weights, max_results, output_prefix)
{{%/ bannerNote %}}

**Description**

Procedure to obtain the twin areas for a given origin location in a target area. It is similar to [FIND_TWIN_AREAS](#find_twin_areas), whose method, based on Principal Component Analysis (PCA), is fully described [here](https://carto.com/blog/spatial-data-science-site-planning). Unlike that procedure, no PCA is performed here: the features are standardized, and the user can specify a weight for each feature when measuring the similarity between the origin and the target cells. The sum of the weights must be less than or equal to 1. Not all of them need to be defined: the remaining weight, 1 minus the sum of the defined weights, is split evenly among the undefined features.

The output twin areas are those of the target area considered to be the most similar to the origin location, based on the values of a set of variables. Only variables with numerical values are supported. Both origin and target areas should be provided in grid format (H3 or quadbin) at the same resolution. We recommend using the [carto.GRIDIFY_ENRICH](../#gridify_enrich) procedure to prepare the data in the format expected by this procedure.

**Input**

* `origin_query`: `STRING` query to provide the origin cell (`index` column) and its associated data columns. None of the data columns provided should contain NULL values. The cell can be an H3 or a quadbin index. For quadbin, the value should be cast to `STRING` (`CAST(index AS STRING)`). Example origin queries are:

```sql
-- When selecting the origin cell from a dataset of gridified data
SELECT * FROM `<project>.<dataset>.<origin_table>`
WHERE index_column = <cell_id>
```

```sql
-- When the input H3 cell ID is inferred from a (longitude, latitude) pair
SELECT * FROM `<project>.<dataset>.<origin_table>`
WHERE ST_INTERSECTS(`carto-un`.carto.H3_BOUNDARY(index_column), ST_GEOGPOINT(<longitude>, <latitude>))
```

```sql
-- When the input quadbin cell ID is inferred from a (longitude, latitude) pair
SELECT * FROM `<project>.<dataset>.<origin_table>`
WHERE ST_INTERSECTS(`carto-un`.carto.QUADBIN_BOUNDARY(index_column), ST_GEOGPOINT(<longitude>, <latitude>))
```

```sql
-- When the cell ID is a quadbin and needs to be cast
SELECT * EXCEPT(index_column), CAST(index_column AS STRING) AS index_column
FROM `<project>.<dataset>.<origin_table>`
```

* `target_query`: `STRING` query to provide the target area grid cells (`index` column) and their associated data columns, e.g. `SELECT * FROM <project>.<dataset>.<target_table>`. The data columns should be similar to those provided in the `origin_query`, otherwise the procedure will fail. Grid cells with any NULL values will be excluded from the analysis.
* `index_column`: `STRING` name of the index column for both the `origin_query` and the `target_query`.
* `weights`: `ARRAY<STRUCT<name STRING, value FLOAT64>>` the weights on the features. If set to `NULL`, all features are treated equally. The weights are taken into account only if the array contains at least one element. The sum of the weights must be less than or equal to 1. If fewer weights than features are provided, the remaining `1 - sum(weights)` is distributed evenly among the undefined features.
* `max_results`: `INT64` maximum number of twin areas returned. If set to `NULL`, all target cells are returned.
* `output_prefix`: `STRING` destination and prefix for the output tables. It must contain the project, dataset and prefix: `<project>.<dataset>.<prefix>`.
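
The weight completion rule described for `weights` can be sketched in plain BigQuery SQL. This is only an illustration of the arithmetic, not the procedure's internal implementation; the four feature names and the two explicit weights are hypothetical.

```sql
-- Hypothetical features: only two of the four receive an explicit weight.
WITH features AS (
  SELECT name
  FROM UNNEST(['population', 'income', 'footfall', 'poi_count']) AS name
),
provided AS (
  SELECT w.name, w.value
  FROM UNNEST([
    STRUCT('population' AS name, 0.5 AS value),
    STRUCT('income' AS name, 0.2 AS value)
  ]) AS w
)
SELECT
  f.name,
  COALESCE(
    p.value,
    -- Remaining weight split evenly among the features without a weight
    (1 - (SELECT SUM(value) FROM provided))
      / (SELECT COUNT(*) FROM features
         WHERE name NOT IN (SELECT name FROM provided))
  ) AS effective_weight
FROM features AS f
LEFT JOIN provided AS p
  ON f.name = p.name;
```

Under these assumed inputs, `population` and `income` keep 0.5 and 0.2, while `footfall` and `poi_count` each receive roughly (1 - 0.7) / 2 = 0.15.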

**Output**

The procedure outputs the following:

* Twin area model, named `<project>.<dataset>.<prefix>_model`. Please note that the model computation only depends on the `target_query` and therefore the same model can be used if the procedure is re-run for a different `origin_query`. To allow for this scenario in which the model is reused, if the output model already exists, it won't be recomputed. To avoid this behavior, simply choose a different `<prefix>` in the `output_prefix` parameter.

* Results table, named `<project>.<dataset>.<prefix>_<origin_index>_results`, containing in each row the index of a target cell (`index_column`) and its associated `similarity_score` and `similarity_skill_score`. The `similarity_score` corresponds to the distance between the origin and the target cell, taking into account the user-defined weights. The `similarity_skill_score` for a given target cell `t` is computed as `1 - similarity_score(t) / similarity_score(<t>)`, where `<t>` is the average target cell, computed by averaging each feature over all the target cells. This `similarity_skill_score` is a relative measure: the score is positive if and only if the target cell is more similar to the origin than the mean vector data, with a score of 1 meaning perfect matching or zero distance. Therefore, a target cell with a larger score is more similar to the origin under this scoring rule.
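
To make the skill score formula concrete, the following sketch applies it to made-up distances. The cell names, the two `similarity_score` values and the distance of 1.6 to the average target cell are all hypothetical, chosen only to show one positive and one negative score.

```sql
-- Hypothetical distances; 1.6 stands for similarity_score(<t>),
-- the distance between the origin and the average target cell.
WITH scores AS (
  SELECT 'cell_a' AS cell, 0.4 AS similarity_score UNION ALL
  SELECT 'cell_b' AS cell, 2.0 AS similarity_score
)
SELECT
  cell,
  similarity_score,
  1 - similarity_score / 1.6 AS similarity_skill_score
FROM scores;
```

Here `cell_a` is closer to the origin than the average cell and gets a positive score of about 0.75, while `cell_b` is farther away and gets a negative score of about -0.25.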

{{% customSelector %}}
**Example**
{{%/ customSelector %}}

```sql
CALL `carto-un`.carto.FIND_TWIN_AREAS_WEIGHTED(
-- Input queries
'''SELECT * FROM `cartobq.docs.twin_areas_origin_enriched_quadbin` LIMIT 1''',
'''SELECT * FROM `cartobq.docs.twin_areas_target_enriched_quadbin`''',
-- Twin areas model inputs
'quadbin',
NULL,
NULL,
'my-project.my-dataset.my-prefix'
);
-- Table `<my-project>.<my-dataset>.<my-prefix>_{ID}_results` will be created
-- with the columns: quadbin, similarity_score, similarity_skill_score
```
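
A variant of the call with explicit weights might look like the sketch below. The feature names `population` and `income` are hypothetical and would have to match data columns of the input tables; the value 10 for `max_results` is likewise arbitrary. Features without a weight would share the remaining 0.3, following the rule described in the `weights` parameter.

```sql
CALL `carto-un`.carto.FIND_TWIN_AREAS_WEIGHTED(
    -- Input queries
    '''SELECT * FROM `cartobq.docs.twin_areas_origin_enriched_quadbin` LIMIT 1''',
    '''SELECT * FROM `cartobq.docs.twin_areas_target_enriched_quadbin`''',
    -- Index column
    'quadbin',
    -- Weights: hypothetical feature names, sum <= 1
    [STRUCT('population' AS name, 0.5 AS value),
     STRUCT('income' AS name, 0.2 AS value)],
    -- Return at most 10 twin areas
    10,
    'my-project.my-dataset.my-prefix'
);
```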


{{% euFlagFunding %}}
56 changes: 56 additions & 0 deletions app/content/analytics-toolbox-bigquery/sql-reference/s2.md
@@ -198,6 +198,62 @@ SELECT `carto-os`.carto.S2_FROMUINT64REPR('9926595690882924544');
```


### S2_RESOLUTION

{{% bannerNote type="code" %}}
carto.S2_RESOLUTION(index)
{{%/ bannerNote %}}

**Description**

Returns the S2 cell resolution as an integer.

* `index`: `INT64` The S2 cell index.

**Return type**

`INT64`

{{% customSelector %}}
**Example**
{{%/ customSelector %}}

```sql
SELECT `carto-os`.carto.S2_RESOLUTION(-6432928348669739008);
-- 11
```


### S2_TOCHILDREN

{{% bannerNote type="code" %}}
carto.S2_TOCHILDREN(index, resolution)
{{%/ bannerNote %}}

**Description**

Returns an array with the S2 indexes of the children/descendants of the given S2 cell at the given resolution.

* `index`: `INT64` The S2 cell index.
* `resolution`: `INT64` number between 0 and 30 with the [S2 resolution](https://s2geometry.io/resources/s2cell_statistics).

**Return type**

`ARRAY<INT64>`

{{% customSelector %}}
**Example**
{{%/ customSelector %}}

```sql
SELECT `carto-os`.carto.S2_TOCHILDREN(-6432928348669739008, 12);
-- -6432928554828169216
-- -6432928417389215744
-- -6432928279950262272
-- -6432928142511308800
```
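
The two new functions can be combined as a quick consistency check. This is a sketch that assumes both functions take and return `INT64` cell indexes, as the examples above suggest; if they behave as documented, every row should report resolution 12.

```sql
-- Expand a resolution 11 cell into its resolution 12 children
-- and ask for the resolution of each child.
SELECT
  child,
  `carto-os`.carto.S2_RESOLUTION(child) AS resolution
FROM UNNEST(
  `carto-os`.carto.S2_TOCHILDREN(-6432928348669739008, 12)
) AS child;
```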


### S2_TOHILBERTQUADKEY

{{% bannerNote type="code" %}}