Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to publish data belonging to all DB schemas #39

Merged

Conversation

groldan
Copy link
Member

@groldan groldan commented Aug 25, 2024

Configuration properties to define how to handle multiple PostgreSQL schemas to avoid Collection name clashes.

When running with the postgis Spring profile, the default configuration is to serve data
from all tables in all schemas, except for a few hard-coded schema names: pg_toast, pg_catalog, information_schema, topology, and tiger;
and to prefix all feature type names with the schema name and a : delimiter.

For example, a table locations in the public schema will be presented as public:locations.

Newly created database schemas, or removed schemas, will be noticed after a period defined by the postgis.schemas.refresh.interval config property.

If refresh is disabled (i.e. postgis.schemas.refresh.enabled: false), there will be no delay in noticing new or deleted PostgreSQL schemas, but the performance will suffer, as the list of schemas will need to be queried upon each API request.

Default configuration

The default configuration (i.e. if nothing is explicily configured) is equivalent to the following:

postgis:
  schemas:
    delimiter: ":"
    refresh:
      enabled: true
      interval: PT5S
    include:
    - schema: "*"
      prefix-tables: true

Schema aliasing

Individual schema names can be aliased. For example, to change the prefix of the ville_Villeneuve-d'Ascq schema to villeneuve, use the following config:

postgis:
  schemas:
    include:
    - schema: "*"
      prefix-tables: true
    - schema: "ville_Villeneuve-d'Ascq"
      alias: "villeneuve"

This will result in a table locations in that schema to be published as villeneuve:locations instead of ville_Villeneuve-d'Ascq:locations.

A single schema can be configured to be un-prefixed, meaning its tables will be published as Collections without a schema prefix.

For example, to remove the public: prefix for the public schema, use:

postgis:
  schemas:
    include:
    - schema: "public"
      prefix-tables: false

If the * schema wildcard, or more than one schema is configured with prefix-tables: false, the application will fail to start.

Move SchemaUnawarePostGISDialect to package org.geotools.data.postgis
to be able of overridding the package private method
lookupGeometryType()
@groldan groldan force-pushed the GSMEL-334_read_all_postgis_schemas branch 4 times, most recently from 507b82c to c73e5a5 Compare September 3, 2024 16:20
@groldan groldan marked this pull request as ready for review September 3, 2024 16:20
@groldan groldan force-pushed the GSMEL-334_read_all_postgis_schemas branch 2 times, most recently from f640876 to 499e432 Compare September 3, 2024 16:32
@groldan groldan force-pushed the GSMEL-334_read_all_postgis_schemas branch from 499e432 to e9a2caa Compare September 3, 2024 16:34
Copy link
Member

@pmauduit pmauduit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall

But by reading the PR, there is still something I am unsure of: for now, to circumvent geotools targeting only one schema, we had these SchemaUnaware... classes, and we had to set a search_path with every expected schemas on the user being used for connecting to postgresql. And since the classes have not been removed from the codebase (moved to the org.geotools.data.postgis package), I was wondering if the search_path hack was still necessary ?

postgis.md Show resolved Hide resolved
pom.xml Show resolved Hide resolved
@groldan
Copy link
Member Author

groldan commented Sep 4, 2024

But by reading the PR, there is still something I am unsure of: for now, to circumvent geotools targeting only one schema, we had these SchemaUnaware... classes, and we had to set a search_path with every expected schemas on the user being used for connecting to postgresql. And since the classes have not been removed from the codebase (moved to the org.geotools.data.postgis package), I was wondering if the search_path hack was still necessary ?

I don't think those classes have anything to do with SET search_path TO? but as they understand from their docs, in order not to assume which schema postgis is installed on?

/**
 * Extension to {@link PostgisNGDataStoreFactory} creating data stores that
 * don't rely on the public {@literal geometry_columns} table to get the SRID of
 * a geometry column. Get the SRID from the geometry in the table instead.
 */

@pmauduit
Copy link
Member

pmauduit commented Sep 5, 2024

I don't think those classes have anything to do with SET search_path TO?

in the current state of what is being deployed, I think we need the custom classes to relax some geotools behaviour AND to set the search path (so that when GT issues SQL queries, there is a chance for postgresql to get a hand on it). I've the feeling that setting the search path is not necessary anymore with your PR.

@f-necas
Copy link
Collaborator

f-necas commented Sep 5, 2024

Just a small issue about backward compatibility and multiple schemas.

If the * schema wildcard, or more than one schema is configured with prefix-tables: false, the application will fail to start.

For MEL, the tables are not prefixed, and we can't set, prefix-tables false if there's more than 1 schema. I'll discuss with them this afternoon but I think we could not upgrade to this version atm.

EDIT : Ok for those new URLs

@f-necas f-necas self-requested a review September 5, 2024 11:43
@f-necas
Copy link
Collaborator

f-necas commented Sep 9, 2024

LGTM with my small java background either.

I've seen some experimental from lombok @Delegate which are supposely not be added in lombok in future :

Current status: negative - Currently we feel this feature will not move out of experimental status anytime soon, and support for this feature may be dropped if future versions of javac or ecj make it difficult to continue to maintain the feature.
https://projectlombok.org/features/experimental/Delegate

But we will check this in time.

Thanks a lot for this huge evolution !

@pmauduit pmauduit merged commit fee547d into georchestra:main Sep 9, 2024
1 check passed
@groldan groldan deleted the GSMEL-334_read_all_postgis_schemas branch September 9, 2024 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants