Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Trigger Speeeeed, or triggers and data maintenance in PostGIS

Posted by smathermather on April 21, 2010

I have completed 4 out of 6 of my triggers for maintaining boundary data in PostGIS.  This is a follow-up to this post.  If you haven’t read my PostGIS trigger post, follow this link and come back.  I’ll wait.  Go on.

Back?  OK.  So I’ve implemented my property boundary triggers.  I need about 6-7 different versions of our boundary for cartographic reasons– some which are dissolved (ST_Union) based on one set of attributes, some based on another, etc..  Needless to say, I just want to modify one version, and have the other versions automagically be created.  This is where PostGIS functions and triggers come in.  Anything you can conceive of doing with the hundreds of PostGIS spatial functions you could run within a trigger.

First, as always, testing.  Each time I update the record (can’t figure out how to do triggers once per transaction in PostgreSQL, so this runs with each record insert) I want to take my boundary and create a dissolved copy of it in another table.  Naively, I tried to remove and recreate the table first:

DROP TRIGGER dissolver ON base.reservation_boundaries_public_private_cm;

CREATE OR REPLACE FUNCTION dissolve_bounds() RETURNS TRIGGER AS $dissolver$

BEGIN
 IF(TG_OP='INSERT') THEN

 DROP TABLE IF EXISTS base.reservation_boundaries_public_private_cm_dissolved;

 CREATE TABLE base.reservation_boundaries_public_private_cm_dissolved
 AS
 SELECT res, ST_Union(the_geom)
 AS the_geom
 FROM base.reservation_boundaries_public_private_cm
 GROUP BY res;

 ALTER TABLE base.reservation_boundaries_public_private_cm_dissolved ADD COLUMN gid BIGSERIAL;
 ALTER TABLE base.reservation_boundaries_public_private_cm_dissolved ADD PRIMARY KEY (gid);
 ALTER TABLE base.reservation_boundaries_public_private_cm_dissolved ADD CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
 ALTER TABLE base.reservation_boundaries_public_private_cm_dissolved ADD CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 102722);
 GRANT SELECT ON TABLE base.reservation_boundaries_public_private_cm_dissolved TO low_level_access;
 COMMENT ON TABLE base.reservation_boundaries_public_private_cm_dissolved IS 'Base boundary layer, with public and private boundaries, dissolved (CM)';

 END IF;
 RETURN NEW;
END;

$dissolver$ LANGUAGE plpgsql;

CREATE TRIGGER dissolver
 AFTER INSERT ON base.reservation_boundaries_public_private_cm
 FOR EACH STATEMENT EXECUTE PROCEDURE dissolve_bounds();

It works, but it is a little slow.  It takes about 50 seconds to complete the transaction.  So, when I wrote this I was a fundamental SQL nube.  Now, still nubish, I realize that dropping and recreating the table is quite inefficient.  It’s much better to remove the contents of the table, and insert new records (here is just the modified code between IF() THEN and END IF;).

 DELETE FROM base.reservation_boundaries_public_private_cm_dissolved;

 INSERT INTO base.reservation_boundaries_public_private_cm_dissolved
  SELECT res, ST_Union(the_geom)
   AS the_geom
   FROM base.reservation_boundaries_public_private_cm
   GROUP BY res;

Moderately faster– runs in about 30 seconds.

To create all the tables, I have a few functions run in the trigger.  Remarkably, this doesn’t take much longer than just modifying one table within the trigger, or about 30 seconds:

DROP TRIGGER dissolver ON base.reservation_boundaries_public_private_cm;

CREATE OR REPLACE FUNCTION dissolve_bounds() RETURNS TRIGGER AS $dissolver$

BEGIN
 IF(TG_OP='INSERT') THEN

 DELETE FROM base.reservation_boundaries_public_private_cm_dissolved;

 INSERT INTO base.reservation_boundaries_public_private_cm_dissolved
  SELECT res, ST_Union(the_geom)
   AS the_geom
   FROM base.reservation_boundaries_public_private_cm
   GROUP BY res;

 DELETE FROM base.reservation_boundaries_cm_dissolved;

 INSERT INTO base.reservation_boundaries_cm_dissolved;
 SELECT res, ST_Union(the_geom)
 AS the_geom
 FROM base.reservation_boundaries_public_private_cm
 WHERE access = 1
 GROUP BY res;

 DELETE FROM base.reservation_boundaries_cm;

 INSERT INTO base.reservation_boundaries_cm
 SELECT res, the_geom
 FROM base.reservation_boundaries_public_private_cm
 WHERE access = 1;

 DELETE FROM base.reservation_boundaries_private_cm;

 INSERT INTO base.reservation_boundaries_private_cm
 SELECT res, the_geom
 FROM base.reservation_boundaries_public_private_cm
 WHERE access = 0;

 END IF;
 RETURN NEW;
END;

$dissolver$ LANGUAGE plpgsql;

CREATE TRIGGER dissolver
 AFTER INSERT ON base.reservation_boundaries_public_private_cm
 FOR EACH STATEMENT EXECUTE PROCEDURE dissolve_bounds();

Now just a few more things to stick in there for masking versions of my boundary, and a script to export this to a shapefile to synchronize with my file system, PostGIS gets to maintain by boundaries, not me or an intern.  Between you and me, my interns are good and very reliable, but PostGIS is much more reliable than I am.

3 Responses to “Trigger Speeeeed, or triggers and data maintenance in PostGIS”

  1. vincent said

    Did you try to use Truncate instead of delete to remove all elements from your table ?

    With Delete, the mvcc mechanism of postgresql makes the on-disk used space bigger and bigger if vacuum is not managed accordingly. Autovacuum should do the job, but it may have to be fine-tuned.

    As for transaction triggers, there are none and probably won’t be any, as it would lead to some very complicated theoritical problems regarding the ACID constraints.

  2. smathermather said

    I’ve never really grok’d the implications of ACID constraints, but I’ve heard that before. This trigger isn’t too inefficient, so I’ll put up with it… . 😀

    With truncate, we shave off a couple seconds in the processing, and take care of the vacuuming problem. Thank you, as I have yet to get the Autovacuum settings to be effective (I keep using the excuse that I haven’t gone live yet, but eventually that excuse will get old). This is great advice, especially as each transaction duplicates and removes the entire dataset. That could get messy quickly.

  3. […] versions of our exterior property boundary layer for our park system for different purposes, e.g. this post.  What I’d like to include in this automatically managed dataset is a multi-ring buffer […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: