Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Archive for November, 2008

PostGIS Gripe—Limits to Postgre’s B-tree indexing

Posted by smathermather on November 27, 2008

Is this a gripe about PostGIS or PostgreSQL?  Hard to say.  At my office, we are migrating to a PostGIS—GeoServer—Client-of-your-choice stack.  As usual, I have to try the hard stuff first.  So, rather than keeping the 2ft contour lines for our 7 county area in shapefiles and doing some indexing on them there, I’m attempting to load them into PostGIS all in a single table.  Make any sense?  Maybe not.  Honestly, I really don’t know.  In doing so though, PostGIS has done pretty well.  I loaded all 9.6 GB of 2ft contours from one county into the database, served it up directly in uDig, and then through GeoServer to OpenLayers and Google Earth.  For viewing the whole county, it was slow, but I’d never let a user view the 2ft contour dataset all in one go.  Zoomed in, however, it did pretty well, in fact, I’d say considering there was no caching turned on for OpenLayers, it did splendidly.

First, how I loaded it:

I used shp2pgsql.  My contours were in a bunch of little chunks derived from a state DEM dataset, so I iterated through:

Create the table:

shp2pgsql -s 102722 N2110610 base.contours > N2110610.sql

psql -U postgres -d CM -f N2110610.sql
Then iterate through the appends (note the -a flag)

shp2pgsql -s 102722 N2110615 base.contours -a > N2110615.sql
psql -U postgres -d CM -f N2110615.sql

we can script this—I have to confess to having done this in excel with worksheet functions since my Bash shell scripting skills help me none in the Windows world, but a friend told me about FOR…IN…DO in batch scripting, so next time… .

So, loading was easy, and all worked well.  Viewing was fine.  What is my gripe?  B-tree Indexing.  What if I wanted to ensure I had only unique geometric records in my database?  I might initiate the table like this instead:

CREATE TABLE “base”.”contours” (gid serial PRIMARY KEY,
“objectid” int8,
“type” int8,
“elevation” smallint,
“shape_len” numeric);
SELECT AddGeometryColumn(‘base’,’contours’,’the_geom’,’102722′,’MULTILINESTRING’,2);
alter table base.contours add constraint unique_geom unique (the_geom);

Adding a unique constraint to the geometry creates an implicit B-tree index, and therein lies the problem– B-tree indices are limited in size to around 2700 bytes.  Our index exceeds that manyfold.  According to the Postgre documentation “Currently, only B-tree indexes can be declared unique.”

This probably is fine for most datasets, but it really limits the use of a uniqueness constraint on geometry columns of large size.  Since this dataset won’t require updates, my work-around (I hope) will be to load the data in the database, perform a select distinct and dump it into another table to clean up any possible duplicates, but it sure would be nice if I could constrain the table in the first place… .

Posted in PostGIS | Tagged: , , , , , | 7 Comments »