Data Storage

Layman stores several types of data in several stores.

Types of data:

  • Users
  • Layers
  • Maps
  • Tasks
  • Data version

Data stores:

  • Redis
  • Filesystem
  • PostgreSQL
  • GeoServer

Types of Data

Users

Information about users includes their names, contacts, and authentication credentials.

When a user reserves their username, their names, contacts, and other relevant metadata are obtained from the authorization provider and saved to the filesystem, Redis, PostgreSQL, and GeoServer. The user's personal workspace is created too.

Layers

Information about layers includes vector data and visualization.

When user publishes new layer

Subsequently, when asynchronous tasks run,

  • vector data file chunks and completed vector data files are saved to the filesystem (if sent asynchronously),
  • vector data files are imported to PostgreSQL,
  • for WFS, the PostgreSQL table with vector data is registered in GeoServer, access rights are synchronized to GeoServer, and the visualization file is saved there,
  • for WMS of layers with SLD style, the PostgreSQL table with vector data is registered in GeoServer, access rights are synchronized to GeoServer, and the visualization file is saved there,
  • for WMS of layers with QGIS style, a QGS file is created on the filesystem and registered in GeoServer through QGIS server as a WMS cascade, and access rights are synchronized,
  • the thumbnail file is saved to the filesystem,
  • and the metadata record is saved to PostgreSQL using Micka's CSW.

When a user patches an existing layer, data is saved in the same way.

Maps

Information about maps includes JSON definition.

When user publishes new map

Subsequently, when asynchronous tasks run,

  • the thumbnail file is saved to the filesystem,
  • and metadata record is saved to PostgreSQL using Micka's CSW.

When a user patches an existing map, data is saved in the same way.

Tasks

Information about asynchronous tasks consists of a few parameters necessary for the Celery task runner. When a layer or map is published or patched, it includes e.g. the task name, owner name, layer/map name, and additional parameters derived from the HTTP POST/PATCH parameters.

Task information is saved to Redis only.
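To make the shape of such task information more concrete, the following is a minimal sketch of a record holding the parameters listed above. The class name, field names, sample values, and the JSON serialization are all assumptions made for illustration; they are not Layman's or Celery's actual payload format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TaskInfo:
    """Hypothetical container for the task parameters named in the text."""
    task_name: str
    owner_name: str
    publication_name: str  # layer name or map name
    # Additional parameters derived from HTTP POST/PATCH parameters.
    extra_params: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize to a string so the record could be stored as a Redis value
        # (an assumption; the real storage format is not described here).
        return json.dumps(asdict(self), sort_keys=True)


# Sample values are invented for the example.
info = TaskInfo("refresh_thumbnail", "alice", "roads", {"title": "Roads"})
restored = json.loads(info.to_json())
```

Because such a record would live only in Redis, it is not recoverable once the Redis database is flushed.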

Data version

Information about the data version, including the migration ID, is stored in PostgreSQL.

Stores

Redis

Data is saved in the database specified by LAYMAN_REDIS_URL. Keys are prefixed with

  • Layman python module name that saved the data, followed by :, e.g. layman.layer.geoserver: or layman:
  • other strings, e.g. celery, _kombu, or unacked in case of Celery task data.
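The prefix convention above can be used to tell who wrote a given key. The sketch below is illustrative only: the function name `key_origin` is hypothetical, the key suffixes in the usage lines are invented, and the Celery prefixes are just the examples named in the list, not an exhaustive set.

```python
# Example prefixes of Celery task data, as named in the text above.
CELERY_PREFIXES = ("celery", "_kombu", "unacked")


def key_origin(key: str) -> str:
    """Classify a Redis key by its prefix (hypothetical helper)."""
    if key.startswith(CELERY_PREFIXES):
        return "celery"
    # Layman keys start with the Python module name followed by ':'.
    module, sep, _rest = key.partition(":")
    if sep and (module == "layman" or module.startswith("layman.")):
        return module
    return "unknown"


origin_a = key_origin("layman.layer.geoserver:some_key")
origin_b = key_origin("layman:some_key")
origin_c = key_origin("_kombu.binding.celery")
```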

Redis is used as a temporary data store. When Layman stops, data persists in Redis; however, on each startup Layman flushes the Redis database and imports user-related and publication-related data from the filesystem. This means that any task-related data is lost on startup. This behavior can be controlled by LAYMAN_SKIP_REDIS_LOADING.
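The startup behavior can be pictured with plain dictionaries standing in for Redis and the filesystem. This is a simulation under stated assumptions, not Layman's code: the function name and keys are hypothetical, and the sketch assumes that setting LAYMAN_SKIP_REDIS_LOADING skips both the flush and the import.

```python
def load_redis_on_startup(redis_data: dict, filesystem_data: dict,
                          skip_loading: bool = False) -> dict:
    """Simulate the startup flush and reload (hypothetical helper)."""
    if skip_loading:
        # Assumed meaning of LAYMAN_SKIP_REDIS_LOADING: leave Redis untouched.
        return redis_data
    redis_data.clear()                  # flush: task-related keys are lost
    redis_data.update(filesystem_data)  # re-import user and publication data
    return redis_data


# Invented keys and values, for illustration only.
redis_before = {"celery-task": "pending", "layman:user": "stale"}
from_filesystem = {"layman:user": "alice"}
redis_after = load_redis_on_startup(dict(redis_before), from_filesystem)
redis_skipped = load_redis_on_startup(dict(redis_before), from_filesystem,
                                      skip_loading=True)
```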

Filesystem

Data is saved to the LAYMAN_DATA_DIR directory.

A workspace directory is created in the LAYMAN_DATA_DIR directory for every created workspace. The name of the workspace directory is the same as the workspace name. User-related information is saved in the root of the personal workspace directory.

A publication directory is created inside the workspace directory for each publication (e.g. map or layer) the user published. The name of the publication directory is the same as the name of the publication (e.g. layername or mapname). Publication-related information is saved in the publication directory.

A QGIS workspace directory is created in the LAYMAN_QGIS_DATA_DIR directory for every created workspace. The name of the QGIS workspace directory is the same as the workspace name.

A QGIS layer directory is created inside the QGIS workspace directory for each layer with QGIS style the user published. The name of the QGIS layer directory is the same as the name of the layer. The QGS project with the style definition is stored in this directory for WMS purposes.

The filesystem is used as a persistent data store, so data survives a Layman restart.
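The directory naming described in this section can be sketched with `pathlib`. The helper names and sample paths are hypothetical, and the sketch assumes that each directory is nested directly inside its parent; a real installation may place intermediate directories in between.

```python
from pathlib import PurePosixPath


def workspace_dir(data_dir: str, workspace: str) -> PurePosixPath:
    # Workspace directory is named after the workspace.
    return PurePosixPath(data_dir) / workspace


def publication_dir(data_dir: str, workspace: str,
                    publication: str) -> PurePosixPath:
    # Publication directory is named after the layer or map.
    return workspace_dir(data_dir, workspace) / publication


def qgis_layer_dir(qgis_data_dir: str, workspace: str,
                   layername: str) -> PurePosixPath:
    # QGIS layer directory lives under the QGIS workspace directory.
    return PurePosixPath(qgis_data_dir) / workspace / layername


# "/layman_data" and "/qgis_data" stand in for LAYMAN_DATA_DIR and
# LAYMAN_QGIS_DATA_DIR; the values are invented for the example.
pub_path = publication_dir("/layman_data", "alice", "roads")
qgis_path = qgis_layer_dir("/qgis_data", "alice", "roads")
```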

PostgreSQL

Layman directly uses one database, specified by LAYMAN_PG_DBNAME, to store data. There are two kinds of schemas in this database:

  • LAYMAN_PRIME_SCHEMA that holds information about
    • users, workspaces, and publications including access rights
    • data version including migration ID
  • Schemas holding vector layer data.
    • One workspace schema is created for every created workspace. Name of workspace schema is always the same as workspace name.
    • One table is created in workspace schema for each published layer. Name of the table is the same as layername. The table contains data from vector data files.
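Since the schema name equals the workspace name and the table name equals the layername, the fully qualified table of a layer can be derived from those two values. The sketch below uses standard SQL identifier quoting; the helper names are hypothetical and this is not how Layman itself necessarily builds its queries.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def layer_table(workspace: str, layername: str) -> str:
    # Schema is named after the workspace, table after the layer.
    return f"{quote_ident(workspace)}.{quote_ident(layername)}"


# Invented workspace and layer names.
qualified = layer_table("alice", "roads")
```

For example, `layer_table("alice", "roads")` yields `"alice"."roads"`, which could then be used in a query against the LAYMAN_PG_DBNAME database.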

A second database is used by Micka to store metadata records. This database, including its structure, is completely managed by Micka. By default, it is named hsrs_micka6.

PostgreSQL is used as a persistent data store, so data survives a Layman restart.

GeoServer

A user and a role are created for every user who reserved a username. The user name on GeoServer is the same as the username on Layman. The role name is composed as USER_<upper-cased username>.

Two workspaces are created, each with one PostgreSQL datastore, for every workspace (both personal and public). The first workspace is meant for WFS and has the same name as the workspace on Layman. The second workspace is meant for WMS and is suffixed with _wms. The name of the datastore is postgresql in both workspaces. All workspace-related information (including the PostgreSQL datastore) is saved inside the workspace. For layers with QGIS style, a WMS store is also created for each layer (pointing to the QGS file of the layer on the QGIS server), together with a WMS layer that cascades the layer's WMS.
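The naming of the two GeoServer workspaces can be summarized in a few lines. The function and constant names are hypothetical; only the `_wms` suffix and the `postgresql` datastore name come from the text.

```python
DATASTORE_NAME = "postgresql"  # same datastore name in both workspaces


def geoserver_workspaces(workspace: str) -> dict:
    """Return the GeoServer workspace names for one Layman workspace."""
    return {
        "wfs": workspace,           # same name as on Layman
        "wms": f"{workspace}_wms",  # suffixed with _wms
    }


# Invented workspace name.
names = geoserver_workspaces("alice")
```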

A feature type and a layer are registered in both workspaces (WMS and WFS), and a style is created in the WMS workspace, for each layer published on Layman. The names of these three models are the same as the layername. The feature type points to the appropriate PostgreSQL table through the PostgreSQL datastore. The style contains the visualization file.

Two access rules are created for each layer in each GeoServer workspace (WFS and WMS): one for the read access right and one for the write access right. Every username in Layman's access rights is represented by the user's role name (i.e. USER_<upper-cased username>). The role EVERYONE is represented as ROLE_ANONYMOUS on GeoServer.

GeoServer is used as a persistent data store, so data survives a Layman restart.
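The mapping from names in Layman's access rights to GeoServer roles can be sketched as follows. The helper names are hypothetical, and the sorted, de-duplicated output is a choice made for this example rather than a documented GeoServer requirement.

```python
def role_for(name: str) -> str:
    """Map a name from Layman's access rights to a GeoServer role name."""
    if name == "EVERYONE":
        return "ROLE_ANONYMOUS"
    return f"USER_{name.upper()}"


def access_rule_roles(layman_names) -> list:
    # De-duplicate and sort so the result is stable for comparison.
    return sorted({role_for(n) for n in layman_names})


# Invented usernames.
read_roles = access_rule_roles(["alice", "EVERYONE"])
write_roles = access_rule_roles(["alice"])
```

One such role list would be needed per access rule, i.e. one for read and one for write, in each of the two GeoServer workspaces.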