Further optimization of the PostGIS LiDAR Vegetation Height Query

There’s much to be said for knowing your data in order to best optimize the analysis of it.  Beyond all other bits of cleverness, having a functional understanding of your problem is the first step toward conceiving an intelligent and efficient solution.

One thing that I didn’t do two posts ago was to spend any time deciding how far out the search for nearby points with ST_DWithin would be.  So I played around a bit visualizing the data to address this question.  Here in green are the ground points shown in Quantum GIS.

And below they are displayed as points two feet in diameter (kind of like buffering them, but here it’s just taking advantage of QGIS’s display properties).

At two feet they almost converge.  Going up to 3 feet closes the remaining gaps, so we can conclude that the point spacing of this LiDAR dataset is about 3 feet.  These points are over a flat residential area (the rectangular gaps are buildings).
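As a rough cross-check on the visual estimate, the spacing could also be measured in the database. This is only a sketch: it assumes the `ground` table from the query in this post (with `gid` and `the_geom`), a spatial index on `the_geom`, and that 10 feet is a generous upper bound for finding a neighbor. Nearest-neighbor distance is not quite the same thing as the spacing at which drawn circles merge, so expect the mean to come out somewhat smaller than the visual figure.

```sql
-- Sketch: distance from each ground point to its nearest other ground point,
-- summarized as a mean and a maximum.
-- Assumption: the 10-foot cap is an arbitrary upper bound chosen for illustration.
SELECT avg(nn_dist) AS mean_spacing,
       max(nn_dist) AS max_spacing
FROM (
    SELECT DISTINCT ON (a.gid)
           a.gid,
           ST_Distance(a.the_geom, b.the_geom) AS nn_dist
    FROM ground AS a, ground AS b
    WHERE a.gid <> b.gid                       -- skip the point itself
      AND ST_DWithin(a.the_geom, b.the_geom, 10)
    ORDER BY a.gid, ST_Distance(a.the_geom, b.the_geom)
) AS nn;
```

A large gap between the mean and the maximum would point to sparse patches like the cliff area, which is the case for padding the search radius a little beyond the typical spacing.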

And below we have the points over a cliff/slump area.  We can see some gaps in this extreme topography.  So I’ll conclude that if I use a search radius of 3.5 feet, I should be able to perform my nearest neighbor search more efficiently and still find the heights of the vegetation and buildings relative to the closest ground point:

```sql
-- For each vegetation point, keep only the nearest ground point within 3.5 feet.
SELECT DISTINCT ON (g1.gid)
       g1.gid AS gid, g2.gid AS gid_ground, g1.x AS x, g1.y AS y, g2.z AS z,
       g1.z - g2.z AS height, g1.the_geom AS geometry
FROM veg AS g1, ground AS g2
WHERE g1.gid <> g2.gid AND ST_DWithin(g1.the_geom, g2.the_geom, 3.5)
ORDER BY g1.gid, ST_Distance(g1.the_geom, g2.the_geom);
```

So, here is the vegetation height shown from lightest green (shortest) to darkest green (tallest):

We can throw in buildings similarly colored in gray (in this case we extend the search window to 30 feet to ensure we find a ground point nearby):

And a closer look over my house.
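For the building heights mentioned above, the same query pattern should work with the wider search window. The sketch below assumes the building returns live in a table named `buildings` with the same columns as `veg`. That table name is hypothetical, since the post does not say how the building points are stored.

```sql
-- Sketch: building height above the nearest ground point within 30 feet.
-- Assumption: "buildings" is a hypothetical table with gid, x, y, z, the_geom.
SELECT DISTINCT ON (b.gid)
       b.gid AS gid, g.gid AS gid_ground, b.x AS x, b.y AS y, g.z AS z,
       b.z - g.z AS height, b.the_geom AS geometry
FROM buildings AS b, ground AS g
WHERE ST_DWithin(b.the_geom, g.the_geom, 30)
ORDER BY b.gid, ST_Distance(b.the_geom, g.the_geom);
```

The larger radius is needed because ground returns are missing under roofs, so a point near the middle of a building can be well over 3.5 feet from the nearest ground point.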

I know I wrote down some numbers for how much faster this is than searching 15 feet, but I can’t remember where I put them.  Needless to say, narrowing the search radius is the most important optimization.
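One way to recover that comparison would be to let PostgreSQL time both versions with `EXPLAIN ANALYZE`. This is a sketch, not the measurement from the post: it assumes the `veg` and `ground` tables used above and a GiST index on each `the_geom` column, and it trims the select list to the columns needed for the timing. No timings are shown because they depend on the data and the machine.

```sql
-- Sketch: run the nearest-ground-point query at two radii and compare
-- the execution times reported in the EXPLAIN ANALYZE output.
EXPLAIN ANALYZE
SELECT DISTINCT ON (g1.gid) g1.gid, g1.z - g2.z AS height
FROM veg AS g1, ground AS g2
WHERE ST_DWithin(g1.the_geom, g2.the_geom, 3.5)   -- narrow search
ORDER BY g1.gid, ST_Distance(g1.the_geom, g2.the_geom);

EXPLAIN ANALYZE
SELECT DISTINCT ON (g1.gid) g1.gid, g1.z - g2.z AS height
FROM veg AS g1, ground AS g2
WHERE ST_DWithin(g1.the_geom, g2.the_geom, 15)    -- wide search
ORDER BY g1.gid, ST_Distance(g1.the_geom, g2.the_geom);
```

The wider radius should return many more candidate ground points per vegetation point, all of which have to be sorted by distance before `DISTINCT ON` discards everything but the nearest one.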