Skip to content

Some notes on database indexes

Billy Charlton edited this page Jun 30, 2017 · 7 revisions

« Back to Recipes for typical tasks


OSGeo.org has a TON of useful information about PostGIS, including indexing.

PostGIS goes much faster if you create indexes on the columns that you perform selects and joins. This isn't mandatory, but will radically improve performance, especially on large tables.

  • Note! You must call VACUUM ANALYZE tablename; after you add your indexes, or the indexes won't do anything.
  • Once your index is in place, you don't have to do anything at all to use it: PostGIS will automatically use indexes when it determines that they will make an operation run more quickly.

Regular indexes

A normal index can efficiently index columns containing integer, text, and float data. Create indexes for as many columns in a table as you like: here we're creating indexes on the taz and mtc_taz columns of our model_output table:

CREATE INDEX taz_ix ON scratch.model_output (taz);
CREATE INDEX mtctaz_ix ON scratch.model_output (mtc_taz);
VACUUM ANALYZE scratch.model_output;

Geospatial Indexes

Regular indexes don't work on geospatial columns, but geospatial data really needs an index to perform well. For geospatial indexes, PostGIS has its own special index type called a GIST index. For reference, this page on OSGeo explains how geospatial indexes work: http://revenant.ca/www/postgis/workshop/indexing.html

The short version is:

  • Create a geospatial GIST index for each geospatial column in your tables;
  • Then cluster (sort) the rows on the most relevant geo column, so that nearby items are adjacent in the database, too;
  • Finally, vacuum the database to compact everything and make things efficient.

Do those steps after you add lots of data. If you append lots of data to an existing table, the index will auto-update but you'll need to re-cluster and re-vacuum.

Some sample code, assuming that a column named geom in each table contains the PostGIS geospatial data:

CREATE INDEX sf_schools_gix ON sf_schools USING GIST (geom);
CREATE INDEX sf_taxlots_gix ON sf_taxlots USING GIST (geom);
CREATE INDEX sf_buildings_gix ON sf_buildings USING GIST (geom);
CLUSTER sf_buildings USING sf_buildings_gix;
CLUSTER sf_taxlots USING sf_taxlots_gix;
VACUUM ANALYZE sf_schools;
VACUUM ANALYZE sf_taxlots ;
VACUUM ANALYZE sf_buildings ;

It will take a few minutes to run, and then your database will be fast and spiffy.

Clone this wiki locally