PostGIS and LiDAR– oops!

I won’t offer much prelude.  Read this post, and this post first… .

Inadvertently I demonstrated the value of spatial indices, i.e. I meant to use them and didn’t.  In attempting to sort my tables by their spatial index I got the following errors:

ALTER TABLE lidar_ground CLUSTER ON lidar_ground_the_geom_idx;
 ERROR:  "lidar_ground_the_geom_idx" is not an index for table "lidar_ground"
ALTER TABLE lidar_veg CLUSTER ON lidar_veg_the_geom_idx;
 ERROR:  "lidar_veg_the_geom_idx" is not an index for table "lidar_veg"

Looking further, I saw there were no spatial indices on either table.

Returning to my original post, I had typos:

CREATE INDEX lidar_veg_the_geom_idx ON lidar USING gist(the_geom);
CREATE INDEX lidar_ground_the_geom_idx ON lidar USING gist(the_geom);

Do you see it?  It should be

CREATE INDEX lidar_veg_the_geom_idx ON lidar_veg USING gist(the_geom);
CREATE INDEX lidar_ground_the_geom_idx ON lidar_ground USING gist(the_geom);

I created two identical indices on the lidar table, not on the two tables I was interested in.  So, what was the relative performance hit?

Without the indices:

Vinton=# EXPLAIN ANALYZE SELECT pgis_fn_nn(b.the_geom, 10,1,1,'lidar_ground','true','vnum','the_geom') FROM lidar_veg AS b WHERE b.vnum > 576505 AND b.vnum < 576600;
 QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using lidar_veg_pkey on lidar_veg b  (cost=0.00..108.97 rows=31 width=100) (actual time=571.154..28589.303 rows=50 loops=1)
 Index Cond: ((vnum > 576505) AND (vnum < 576600))
 Total runtime: 28589.398 ms
(3 rows)

With the indices:

EXPLAIN ANALYZE SELECT pgis_fn_nn(b.the_geom, 10,1,1,'lidar_ground','true','vnum','the_geom') FROM lidar_veg AS b WHERE b.vnum > 576505 AND b.vnum < 576600;
 QUERY PLAN                                                       

------------------------------------------------------------------------------------------------------------------------------
-------
 Index Scan using lidar_veg_pkey on lidar_veg b  (cost=0.00..108.97 rows=31 width=100) (actual time=86.489..138.579 rows=50 lo
ops=1)
 Index Cond: ((vnum > 576505) AND (vnum < 576600))
 Total runtime: 138.717 ms
(3 rows)

Now with CLUSTERing:

ALTER TABLE lidar_veg CLUSTER ON lidar_veg_the_geom_idx;
ALTER TABLE lidar_ground CLUSTER ON lidar_ground_the_geom_idx;

EXPLAIN ANALYZE SELECT pgis_fn_nn(b.the_geom, 10,1,1,'lidar_ground','true','vnum','the_geom') FROM lidar_veg AS b WHERE b.vnum > 576505 AND b.vnum < 576600;
 QUERY PLAN                                                        

------------------------------------------------------------------------------------------------------------------------------
-----
 Index Scan using lidar_veg_pkey on lidar_veg b  (cost=0.00..108.97 rows=31 width=100) (actual time=1.929..56.285 rows=50 loop
s=1)
 Index Cond: ((vnum > 576505) AND (vnum < 576600))
 Total runtime: 56.378 ms
(3 rows)

Indices sped up the query by a factor of 206 times.  Adding the CLUSTER sped it up another 2.4 times over and above the indices for a total of a query that runs 507 times faster.  Did I say the full query would take a few years?  Ahem, I meant a couple days.  Specifically about 2.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.