Clustering improvements: disk quota

aaime edited this page Aug 14, 2012 · 5 revisions

Introduction

The current disk quota mechanism is based on Berkeley DB Java Edition. While very efficient, it has a few drawbacks in enterprise setups:

  • although possible via streaming replication, clustering it is not easy
  • enterprise systems often already have a database of choice, which is the mandated place to store data. The database in question is usually a well-known relational one (Oracle, PostgreSQL, SQL Server), often centralized and itself clustered.

The purpose of this proposal is to make the disk quota usable in a wider range of cases by extending the implementation options and allowing the subsystem to work off a relational database.

Changes

The current gwc-diskquota module bundles three core elements in a single module:

  • the object model describing disk quotas, annotated for integration with BDB Java Edition
  • the BDB Java Edition based storage engine, implementing the QuotaStore interface
  • the higher-level system that listens to tile cache changes, does the disk quota math and issues requests to the storage subsystem

In order to have a pluggable system with multiple implementations, the following changes are proposed:

  • switch the model classes (LayerQuota, PageStats, Quota, TilePage, TileSet) to interfaces
  • have a parent gwc-diskquota module that contains no code, but only sub-modules
  • have a gwc-diskquota-core module containing the upper level system and the model classes
  • have a gwc-diskquota-bdb module containing the Berkeley DB implementation of the quota subsystem
  • have a gwc-diskquota-jdbc module with the code common to all relational database oriented implementations of the disk quota subsystem
  • have a gwc-diskquota-oracle module and a gwc-diskquota-postgres module providing the specific implementation details for the Oracle and PostgreSQL databases respectively
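To make the intent of the interface switch concrete, here is a minimal sketch of a model interface and a storage contract that different back-ends could implement. The method names (getBytes, add, getUsedQuota, addToQuota), the SketchQuotaStore contract and the in-memory back-end are assumptions made for illustration only; they are not the actual signatures of the GWC classes.

```java
import java.util.HashMap;
import java.util.Map;

// Model type expressed as an interface, so each back-end can supply its own
// implementation. The two methods are hypothetical, not the real Quota API.
interface Quota {
    long getBytes();

    void add(long bytes);
}

// Hypothetical, cut-down storage contract standing in for QuotaStore.
interface SketchQuotaStore {
    Quota getUsedQuota(String layerName);

    void addToQuota(String layerName, long bytes);
}

// Plain implementation of the model interface, free of persistence annotations.
class SimpleQuota implements Quota {
    private long bytes;

    @Override
    public long getBytes() {
        return bytes;
    }

    @Override
    public void add(long delta) {
        bytes += delta;
    }
}

// Stand-in back-end; a BDB or JDBC module would implement the same contract.
class InMemoryQuotaStore implements SketchQuotaStore {
    private final Map<String, Quota> used = new HashMap<>();

    @Override
    public Quota getUsedQuota(String layerName) {
        // Unknown layers start with an empty quota.
        return used.computeIfAbsent(layerName, k -> new SimpleQuota());
    }

    @Override
    public void addToQuota(String layerName, long bytes) {
        getUsedQuota(layerName).add(bytes);
    }
}
```

With this split, the upper-level code only depends on the interfaces, so swapping the storage module should not require touching the quota math.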

Implementation

The JDBC-based disk quota system will follow the core/dialect pattern already used successfully in the GeoTools JDBC data stores, where a core module provides all the core functionality while the database-specific code (if any) is contained in a Dialect class hierarchy, with one subclass per target database.
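The core/dialect split could look roughly like the sketch below: the base class owns the shared SQL template and the subclasses only supply what differs per database. The class names, the TILESET table and its columns are hypothetical and chosen purely for illustration; the column types are just one plausible mapping.

```java
// Base dialect: holds the SQL shared by all databases (hypothetical names).
abstract class QuotaDialect {
    // Type used for string keys of the given length.
    abstract String stringType(int length);

    // Type used for byte counts.
    abstract String bigIntegerType();

    // Shared DDL template; only the type names vary between databases.
    String createTileSetTable() {
        return "CREATE TABLE TILESET (ID " + stringType(320)
                + " PRIMARY KEY, BYTES " + bigIntegerType() + " NOT NULL)";
    }
}

// Oracle flavour: VARCHAR2 for strings, NUMBER for large integers.
class OracleQuotaDialect extends QuotaDialect {
    @Override
    String stringType(int length) {
        return "VARCHAR2(" + length + " CHAR)";
    }

    @Override
    String bigIntegerType() {
        return "NUMBER(20)";
    }
}

// PostgreSQL flavour: standard VARCHAR and BIGINT.
class PostgreSQLQuotaDialect extends QuotaDialect {
    @Override
    String stringType(int length) {
        return "VARCHAR(" + length + ")";
    }

    @Override
    String bigIntegerType() {
        return "BIGINT";
    }
}
```

Supporting another database would then mean adding one more subclass rather than changing the core module.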

The core module will be implemented around Spring's JdbcTemplate and will be powered by a user-provided DataSource, so that users can either set up a simple local connection pool or rely on the connection pooling abilities of the web container.

The jury is still out on the use of prepared statements. The queries used by the disk quota subsystem seem to read small amounts of data, so prepared statements seem to be a good fit.

In case the target database is empty, the code will automatically create the necessary tables, relying on the dialect classes to set up the best possible index layout.
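One way to sketch the automatic setup is a small helper that compares the tables already present with the ones the subsystem needs and returns the DDL still to be run. The table names, columns and the SchemaInitializer class are hypothetical, and the DDL is kept generic here; in the proposal the statements and indexes would come from the dialect classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class SchemaInitializer {
    // Required tables in creation order, mapped to illustrative generic DDL.
    private final Map<String, String> ddlByTable = new LinkedHashMap<>();

    SchemaInitializer() {
        ddlByTable.put("TILESET",
                "CREATE TABLE TILESET (ID VARCHAR(320) PRIMARY KEY, BYTES BIGINT NOT NULL)");
        ddlByTable.put("TILEPAGE",
                "CREATE TABLE TILEPAGE (ID VARCHAR(320) PRIMARY KEY, TILESET_ID VARCHAR(320) NOT NULL)");
    }

    // Returns the DDL for every required table not found in existingTables.
    // Names are compared case-insensitively, since databases differ in how
    // they report identifier case.
    List<String> missingDdl(Set<String> existingTables) {
        Set<String> upper = new HashSet<>();
        for (String table : existingTables) {
            upper.add(table.toUpperCase(Locale.ROOT));
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : ddlByTable.entrySet()) {
            if (!upper.contains(entry.getKey())) {
                result.add(entry.getValue());
            }
        }
        return result;
    }
}
```

In a real setup the existing table names could be read through java.sql.DatabaseMetaData.getTables and the returned statements run one by one, for example with JdbcTemplate.execute(String); on an empty database every statement is returned, on a fully initialized one the list is empty.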

Backward compatibility

The default GWC implementation will keep on using the embedded BDB solution, posing no backward compatibility issues. Systems willing to use the JDBC back-ends will have to start using them from the beginning, possibly re-seeding the tile caches.

Alternatively, a startup procedure could recognize that the database is empty and compute all the quota pages by querying the StorageBroker. It is not clear whether the StorageBroker provides enough information to do so.
