Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Posts Tagged ‘Digital Surface Model’

Landscape Position using GDAL — PT 3

Posted by smathermather on November 25, 2014

More landscape position pictures — just showing riparianess. See also

https://smathermather.wordpress.com/2014/11/22/landscape-position-using-gdal/

and

https://smathermather.wordpress.com/2014/11/24/landscape-position-using-gdal-pt-2/

[Image gallery: valleyz3, valleyz2, valleyz1, valleyz]

Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL

Posted in Analysis, Ecology, GDAL, Landscape Position, Other, POV-Ray

Landscape Position using GDAL — PT 2

Posted by smathermather on November 24, 2014

Just pretty pics today of estimated riparianess. If you prefer a bit of code, see previous post https://smathermather.wordpress.com/2014/11/22/landscape-position-using-gdal/

[Image gallery: valleys_improved through valleys_improved_5]

Posted in Analysis, Ecology, GDAL, Landscape Position, Other, POV-Ray

Landscape Position using GDAL

Posted by smathermather on November 22, 2014

Hat tip again to Seth Fitzsimmons. I’ve been looking for a good, easy to use smoothing algorithm for rasters. Preferably something so easy, I don’t even need to write a little python, and so efficient I can run it on 30GB+ datasets and have it complete before I get distracted again by the next shiny project (a few hours).

Seth’s solution? Downsample to a low resolution using GDAL, then sample back up to a higher resolution in order to smooth the raster. My innovation to his approach? Use Lanczos resampling to keep location static, and get a great smooth model:

Unsmoothed DEM

Smoothed DEM

Code to do this in GDAL follows. “-tr” sets our resampling resolution, “-r lanczos” sets our resampling algorithm, and the “-co” flags are not strictly necessary, but I’ve got a 30GB dataset, so it helps to chop up the inside of the TIFF in little squares to optimize subsequent processing.

gdalwarp -tr 50 50 -srcnodata "0 -32767" -r lanczos  -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" oh_leap_dem.tif oh_leap_dem_50.tif
gdalwarp -tr 10 10 -srcnodata "0 -32767" -r lanczos  -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" oh_leap_dem_50.tif oh_leap_dem_10-50.tif

At first this excited me for cartographic reasons. We can use this to simplify contours, and then use simplified contours at different zoom levels for maps:
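
A minimal sketch of that contouring step, assuming the smoothed raster from above (the 20-unit interval and output name are placeholders, not part of the original workflow):

# Contours traced from the smoothed DEM come out much simpler than contours
# traced from the full-resolution DEM
gdal_contour -a elev -i 20 oh_leap_dem_10-50.tif oh_leap_dem_10-50_contours.shp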

But, we can also use this for analyses. For example, if we difference these smoothed images with our original digital elevation model, we get a measurement of local elevation difference, the first step in establishing where valleys, ridges, and other land forms are.

# Resample to lower resolution
gdalwarp -tr 328.0523587211646 328.0523587211646 -srcnodata "0 -32767" -r lanczos  -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" oh_leap_dem.tif oh_leap_dem_328.tif
# Upsample again to get nicely smoothed data
gdalwarp -tr 3.048293887897243 3.048293887897243 -srcnodata "0 -32767" -r lanczos  -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" oh_leap_dem_328.tif oh_leap_dem_3-328.tif
# Merge two datasets together into single image as separate bands to ensure they are the same dimensions
# (gdal_calc, as a wrapper for numpy requires this)
gdal_merge -separate -o oh_leap_dem_3-328_m.tif oh_leap_dem.tif oh_leap_dem_3-328.tif
# And now we'll use gdal_calc to difference our elevation model with the smoothed one to get relative elevation 
gdal_calc -A oh_leap_dem_3-328_m.tif -B oh_leap_dem_3-328_m.tif --A_band=1 --B_band=2 --outfile=oh_leap_dem_lp_328.tif --calc="A-B"

So, if we want a good proxy for riparian zones, we can use a technique like this, instead of buffering our streams and rivers a fixed distance (in this case, I’ve used 4 different distances):

Map of landscape position estimated valleys in Cuyahoga County, Ohio

Pretty snazzy riparian finder. It seems to also find upland headwater wetlands (most of them historic and drained for Cuyahoga County). I am now running on 4 million acres of Ohio at a 10ft (~3 meter) resolution. It’s that efficient.
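
If you want to pull an explicit valley/riparian mask out of that relative elevation layer, a rough sketch with gdal_calc might look like the following (the -10 cutoff and output name are purely illustrative; the right threshold depends on your terrain and vertical units):

# Hypothetical example: cells sitting well below the smoothed surface
# (here, more than 10 vertical units) are flagged as riparian/valley candidates
gdal_calc -A oh_leap_dem_lp_328.tif --outfile=oh_leap_dem_valleys.tif --calc="A < -10" --type=Byte --NoDataValue=255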

Addendum: It also finds escarpment edges, like the Portage Escarpment in the above, so it is a mix of a few landforms. Darn handy nonetheless.

Posted in Analysis, Ecology, GDAL, Landscape Position, Other, POV-Ray

Proper (ab)use of a database, contour interpolation using #postgresql #postgis #arcgis

Posted by smathermather on August 6, 2012

Anyone who has been following along at home knows I don’t think much like a DBA.  Sometimes that’s good; mostly it’s probably bad.  In this post, I hope it will be interesting.

The problem of the day is how to take engineering contours derived from breaklines, a lidar point cloud, and all the lot, and do a good job interpolating that to a DEM.  This is an interesting problem, as it’s another one of those problems with so many sub-optimal solutions.  In part, it’s an impossible problem– going from a surface model to contours is inherently a loss of information, so going the other way is difficult to do believably, both with respect to believable smoothness and the requisite feature retention.  This is a little like the pugilistic pie puppets problem: ‘flaky vs. tender’ of making good pie crust.  Doing one well seems to preclude doing the other well.

Now, Chris Hormann, of PovRay fame, has a solution; there are solutions in ArcGIS and in GRASS, and plenty of other attempts.  Here’s a decent list, which I haven’t checked for timeliness:  http://vterrain.org/Elevation/contour.html.  Since we have an ArcGIS license, it handles the dataset I have OK, and the results checked out OK, this is what we chose to use for now.  I tried GRASS, but it was too slow for the amount of data being processed.  The command in ArcGIS ends up looking something like this:


TopoToRaster_sa "N2260635.shp elevation Contour;N2260640.shp elevation Contour;N2260645.shp elevation Contour;N2265633.shp elevation Contour;N2265640.shp elevation Contour;N2265645.shp elevation Contour;N2270635.shp elevation Contour;N2270640.shp elevation Contour;N2270645.shp elevation Contour;N2275650.shp elevation Contour" "C:\Temp\TopoToR_N2269.tif" 2 "2265031 645070 2270065 639945" 20 # # ENFORCE CONTOUR 40 # 1 0 # # # # # #

And thus we go from this:

To this:

Now to do it 623 more times… .  And in walks PostgreSQL for some abuse as a spatial engine with string manipulation to boot.

Each tile of contours I run through this approach needs the adjacent 8 tiles of contours loaded in order to avoid dreaded edge effects:

Except that sometimes, there aren’t 8 tiles available, like on the edge of large water bodies:

So, we can’t just use a heuristic that loads the 8 adjacent tiles– we need to enumerate these tiles.  An ST_Touches should do the trick here:

SELECT p.tile, a.tile AS tiles
FROM cuy_contour_tiles AS p, cuy_contour_tiles As a
WHERE ST_Touches(p.geom, a.geom);

But, that does not arrange our table very nicely:

I want something that looks like this:

N2225655, N2225660, N2220650 …

N2225660, N2225665, N220665 … etc.

Thank goodness for Postgres’ String_agg, and moreover for Leo Hsu and Regina Obe’s blog.

Adapting their entry, we can reorganize those entries like this:

SELECT
STRING_AGG(a.tile, ',' ORDER BY a.tile)
As tiles
FROM cuy_contour_tiles AS p, cuy_contour_tiles As a
WHERE ST_Touches(p.geom, a.geom)
GROUP BY p.tile
ORDER BY p.tile;

Ah, now that’s better:

And if we want, we can use an arbitrary delimiter, and build our entire command in one swell foop:

SELECT p.tile,
'TopoToRaster_sa "'
|| 'W:\contours\cuy_contours\shapefiles\' || p.tile || '.shp elevation Contour;'
|| 'W:\contours\cuy_contours\shapefiles\'
|| STRING_AGG(a.tile, '.shp elevation Contour;W:\contours\cuy_contours\shapefiles\' ORDER BY a.tile)
|| '.shp elevation Contour;" '
|| '"v:\svm\dem\cuyahoga\' || p.tile || '.tif" 2'
|| ' "' || replace(replace(replace(ST_Extent(p.geom)::text, 'BOX(', ''), ')', ''), ',', ' ') || '"'
|| ' 20 # # ENFORCE CONTOUR 40 # 1 0 # # # # '
As tiles
FROM cuy_contour_tiles AS p, cuy_contour_tiles As a
WHERE ST_Touches(p.geom, a.geom)
GROUP BY p.tile
ORDER BY p.tile;

Ok.  I have 623 more tiles to do.  Until next time… .

Posted in Analysis, ArcGIS, Database, PostGIS, PostgreSQL, SQL

GDAL Slopes– Local Hydrologic Slope vs. the Standard Approach

Posted by smathermather on December 15, 2011

Open Source software is not, of course just about warm and fuzzies, great support, rapid development cycles, shared costs, etc., it’s also about getting your hands dirty with someone else’s code and implementing stuff more quickly and more intelligently because of it, and hopefully learning something in the process.  You don’t have to poke under the hood to drive the car, but sometimes it’s nice to anyway.

The two normal approaches to calculating slope in GIS raster tools (used in applications from ESRI’s Spatial Analyst to GDAL’s gdaldem) are Horn’s formula and Zevenbergen & Thorne’s formula, both integrated calculations of slope that do a bit of generalization in the process of calculating slope.  This is particularly useful if you are using slope for more than localized calculations, for example in doing estimates of local light conditions.  That said, if you want to match local hydrologic slope (something which will more closely approximate a field measurement of slope), you are better off using a different formula altogether.  An article a colleague sent me, “When GIS Slope is Not What You Think” in this month’s The Forestry Source, tipped me off to this, and I thought I’d do some quick and dirty digging in the GDAL code to see if I could implement something a little different.

Here is the standard Horn approach as currently implemented in GDAL:

float GDALSlopeHornAlg (float* afWin, float fDstNoDataValue, void* pData)
{
    const double radiansToDegrees = 180.0 / M_PI;
    GDALSlopeAlgData* psData = (GDALSlopeAlgData*)pData;
    double dx, dy, key;

    dx = ((afWin[0] + afWin[3] + afWin[3] + afWin[6]) -
          (afWin[2] + afWin[5] + afWin[5] + afWin[8]))/psData->ewres;

    dy = ((afWin[6] + afWin[7] + afWin[7] + afWin[8]) -
          (afWin[0] + afWin[1] + afWin[1] + afWin[2]))/psData->nsres;

    key = (dx * dx + dy * dy);

    if (psData->slopeFormat == 1)
        return (float) (atan(sqrt(key) / (8*psData->scale)) * radiansToDegrees);
    else
        return (float) (100*(sqrt(key) / (8*psData->scale)));
}

Borrowing from gdaldem roughness, we implement this:

float GDALSlopeHydroAlg (float* afWin, float fDstNoDataValue, void* pData)

{
    // Hydrologic Slope is the max
    //  local slope btw center cell and adjacent cells

    const double radiansToDegrees = 180.0 / M_PI;
    GDALSlopeAlgData* psData = (GDALSlopeAlgData*)pData;
    double key;

    float pafLocalMin = afWin[0];
    float pafLocalMax = afWin[0];

    for ( int k = 1; k < 9; k++)
    {
        if (afWin[k] < pafLocalMin)
        {
            pafLocalMin=afWin[k];
        }

        if (afWin[k] > pafLocalMax)
        {
            pafLocalMax=afWin[k];
        }
    }

    key = afWin[4] - pafLocalMin;

    if (psData->slopeFormat == 1)
        return (float) (atan(sqrt(key) / (2*psData->scale)) * radiansToDegrees);
    else
        return (float) (100*(sqrt(key) / (2*psData->scale)));
}

Output is rougher than the original, as expected:

Local Hydrologic Slope

Zevenbergen & Thorne’s Slope

And looking at the stats we see a larger standard deviation for local hydrologic slope:


Z:\>gdalinfo Hydrologic_slope.tif
Driver: GTiff/GeoTIFF
Files: Hydrologic_slope.tif
...

Metadata:
STATISTICS_MINIMUM=0
STATISTICS_MAXIMUM=46.168109893799
STATISTICS_MEAN=10.206419514979
STATISTICS_MEDIAN=2.8041767686283e-266
STATISTICS_MODE=1.4463930749435e-307
STATISTICS_STDDEV=5.3794018830967
LAYER_TYPE=athematic

Z:\>gdalinfo Zevenbergen_Thorne.tif
Driver: GTiff/GeoTIFF
Files: Zevenbergen_Thorne.tif
...
Metadata:
STATISTICS_MINIMUM=0
STATISTICS_MAXIMUM=52.424499511719
STATISTICS_MEAN=3.0666069784504
STATISTICS_MEDIAN=4.331070843283e-184
STATISTICS_MODE=-4
STATISTICS_STDDEV=3.567882216098
LAYER_TYPE=athematic

A couple of additional notes.  First, this is 2.5ft resolution data.  At this resolution, the differences in algorithms may be somewhat academic.  Also, of note– the range for the Zevenbergen-Thorne slope algorithm is greater than that for the local hydrologic slope.  This is an outcome I did not anticipate.

 

Update:

Hmmm, looks like I have more revisions and questions from the GDAL dev board. May have to revisit this… .  Forgot about the corners of my 3×3 window… .

Posted in Analysis, Ecology, GDAL

Landscape Position: Conclusion? (part 2)

Posted by smathermather on December 7, 2011

From earlier post:

“I’ve managed to pilot most of a fast high resolution landscape position workflow with PovRay as my magic tool. The final steps I hope to pipe through PostGIS Raster. In the meantime, a screenshot and description: blues are riparian; raw ochre, etc., upland categories; grey is mostly flat lake plain and mid slopes, all derived from just a high res DEM input (no hydro lines as input, so it works in areas where you don’t know where the streams may be). There will be more categories in the final; in the meantime, welcome to December.”

I didn’t use PostGIS Raster, but the incredibly useful gdal_calc.py, to finish out my landscape position calculations.  I will endeavor to post my code to github soon, but the parts and pieces include using GDAL and PovRay.  PovRay helps with the sampling (really!) of nearby neighbors in the raster DEM, and GDAL does the averaging and differencing of those to get relative landscape position.  I spent some time yesterday scratching my head over how to show all the new landscape position information on a readable and useable map, and after discussion with colleagues, decided to use it to divide the world into two categories– riparian & lowland + headwaters & upland (not reflected yet in the labels).  To find out more about landscape position, follow this link, or better yet this one.  (BTW, the green is park land, blue is riparian/lowland/stream networks, purple is the basin boundary).

Riparian map based on landscape position, calculated using PovRay and GDAL.
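
As a hedged sketch of just the GDAL half (every file name below is a hypothetical stand-in, and the neighbor rasters stand for whatever PovRay produces; this is not the code headed for the repository), the averaging and differencing might look roughly like:

# Hypothetical sketch: average a few PovRay-sampled neighborhood rasters,
# then difference the original DEM against that average for relative position
gdal_calc.py -A neighbors_1.tif -B neighbors_2.tif -C neighbors_3.tif --outfile=neighbors_mean.tif --calc="(A+B+C)/3.0"
gdal_calc.py -A dem.tif -B neighbors_mean.tif --outfile=relative_position.tif --calc="A-B"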

Posted in Analysis, Ecology, Landscape Position, Other, POV-Ray

Landscape Position: Conclusion?

Posted by smathermather on November 30, 2011

I’ve managed to pilot most of a fast high resolution landscape position workflow with PovRay as my magic tool. The final steps I hope to pipe through PostGIS Raster. In the meantime, a screenshot and description: blues are riparian; raw ochre, etc., upland categories; grey is mostly flat lake plain and mid slopes, all derived from just a high res DEM input (no hydro lines as input, so it works in areas where you don’t know where the streams may be). There will be more categories in the final; in the meantime, welcome to December.

[Screenshot: landscape position categories]

Posted in Analysis, Ecology, Landscape Position, Other, POV-Ray

Dialog– What qualifies as a benchmark? Part 1

Posted by smathermather on July 25, 2011

Normally, my blog is a bit of a monologue.  It’s not a bad thing, but it can be a little lonely.  Every now and then I get great (and often doubtful) comments, which enhance things considerably.

What follows is some dialog about the LiDAR shootout series, largely between Etienne and Pierre, posted with their permission:

Pierre:

“Etienne, Stephen,

“I really appreciate the benchmark work you are doing to compare different technologies to solve your lidar data processing problem.

“However, if you want your tests to have as much credibility as possible, and especially if you wish to publish the results on the web, I would recommend that you compare horses with horses, dogs with dogs, general vector technologies with general vector technologies, raster technologies with raster technologies, specialized lidar technologies with specialized lidar technologies. Otherwise the dog will always lose against the horses. You have to be clear on that and explain well the differences and the pros and cons of all the tech you will be using in your benchmark. Are you doing the test in ArcGIS using ESRI lidar technology (or a general vector one)? Same question for R and PostGIS Raster (in the last case it is clear that the answer is no). In your graph you are comparing very different kinds of animals without giving much chance to the dogs.

“For every technology you will also have to test different approaches in order to find the fastest one for that technology, so you can compare it with the fastest one in the other technologies. Comparing a slow approach in one tech with the fastest in the other would be a bit unfair. This is what we did for PostGIS Raster. Did you do the same with R and ArcGIS?

“Basically I think it is just unfair to compare ArcGIS, R and PostGIS Raster with LAStools for working with lidar data. Especially if you are not using the ArcGIS and R packages for dealing with lidar. Tell us if you did not use them. We will all learn something.”

Etienne:

“As you said, the point is to compare different strategies to get the job done. That’s it. It’s not a question of being fair or not, we report facts. Maybe the bottom line is that pgraster shouldn’t be used to extract lidar height. However it can be used for other things. Well, we all learned something.

“It also emphasizes the fact that there is no point_cloud type in PostGIS. I guess this could improve the PostGIS results a lot, as it seems handling point clouds might be really fast. I learned that.

“Each solution has its drawbacks: for ArcGIS I think we’ll all agree about the cost of the licence, and furthermore the cost of the lidar plugin. LAStools requires a license for commercial use (which is really low btw compared to ArcGIS + Spatial Analyst + LP360). PG Raster requires a PostgreSQL installation and some SQL knowledge, etc. R requires learning a new language, etc.

“Now I must acknowledge, the tests we did were on only 500k points. That could be considered a lot, or not. When dealing with lidar, one could easily expect to get more than a thousand times this quantity. But it was the limit imposed by the LP360 version I used, and also a reasonable size for benchmarking. Note that PG Raster was really stable in its timing (whatever the size).

“Finally, please note that Martin’s approach is new and he just released the tool a week ago. It is, to the best of my knowledge, the first one to do it.

“Pierre, explain your point better, I’m not sure I get it (or I disagree, as you can see).”

Pierre:
“My point is just that you have to be clear in your post about the fact that LAStools is a technology developed explicitly for dealing with lidar data. PostGIS Raster is not, and I guess you are not using ESRI lidar technology, and I have no idea how you do it with R. The point is to compare potatoes with potatoes.

“If I would be from the ESRI side I would quickly post a comment in the post saying “Did you read this article?” http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Estimating_forest_canopy_density_and_height/00q8000000v3000000/

“I don’t know how you did it in ArcGIS, but it seems that they are storing lidar data as multipoints.

“A good benchmark should specify all this.

Etienne (Pierre > quoted):

> If I would be from the ESRI side I would quickly post a comment in the post saying “Did you read this article?”

> http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/Estimating_forest_canopy_density_and_height/00q8000000v3000000/
“I would reply: apart from the fact that it uses lidar data, it has nothing to do with the subject of the present article 🙂 But I’d be happy to discover a better way than what I did (for all the tools btw).

“Maybe you should (re-)read the post; it’s written there. For R as well. Maybe, Steve, you should clarify (this is my fault) that for ArcGIS I used LP360 (evaluation) to load the lidar data, which was limited to 500k points. But it wasn’t involved in the computation, as I used (as written in the post) «ExtractValuesToPoints».”

Pierre:

> So it’s a good benchmark ?

“I just think it would be more fair to create three test categories:

“1) Lidar technologies (apparently your ArcGIS test should be in this category)

“2) Vector/raster technologies (why not use non lidar techs if lidar techs are so much better?)

“3) Nearest neighbor technique using vector only technology

“I won’t comment further…

Etienne:

“I personally would not encourage these categories as there is some confusion (at least in my mind) on the definition of what a «LiDAR technology» is. However I think it would be a good idea to nuance the conclusions, which I think will be done in further posts (or not). It’s also pretty clear from the RMSE measures what is the main drawback of each solution (with speed).

“I’m still open for discussion…

Stephen (me):

“Hi Pierre,
     “I agree with you in principle, but ArcGIS aside, there are some very real opportunities for optimization in point-in-raster analyses with tiling, indexing, etc.  In other words, I don’t see any practical reason why PostGIS Raster couldn’t be (nearly) as fast as an NN technique, so long as the tile size is optimized for the density of the analysis in question.  The only thing that makes it a dog is its age as a technology.

(moments later after discovering the lengthy dialog…)

“It looks like I missed all the discussion here…

“The reason such benchmarks are interesting is (in part) that they are not fair.  PostGIS isn’t optimized for LiDAR analyses.  PostGIS Raster isn’t optimized for LiDAR analyses.  PostGIS Raster is very young technology, which really isn’t fair.  The PostGIS development folks know this, and were interested in seeing my initial benchmarks with NN analyses for LiDAR height (last year) because they are aware of the optimizations that are not yet there for handling point clouds.  Vector and raster analyses (as we’ve run them here) should not, in principle or practice, be substantially different in speed, so long as the query analyzer maximizes the efficiency of the techniques.  Add in multipoint optimizations and other voodoo, and well, that’s a different story.  Another aspect to this is that comparing LAStools to the rest is unfair not just because it’s optimized for lidar but because it’s working at a much lower level (file level) than the abstractions a database provides.  The trade-off, of course, is that a database can provide transaction support, standardized language, automated query analyzers, etc. etc. that don’t apply for file-level processing.

“What I see here, categories aside, is a need for the other 2-3 parts of this 3-4 part series– the breakdown of which techniques are used, how they’re implemented, and the rationale behind them.  Don’t worry, we’ll get there.  That’s the danger of starting with the punchline, but we’ll build the nuance in.

 

So, there we are.  Looks like I’m committed to building this out with a little more nuance.  🙂  Stay tuned.

Posted in Database, Database Optimization, LiDAR Shootout, PostGIS, SQL

LiDAR Shootout! — New Chart, Final Results

Posted by smathermather on July 21, 2011

Comparison of lidar height calculations

In reviewing the final numbers and charts from Etienne and Pierre, above are the results we see.  The only revision is a moderate increase in speed for the PG Raster query.

Final results in speed for lastools– ~350,000 points per second.  In other words– off-the-charts fast.  And the initial RMSE of ~25 feet was a mistake– it is probably closer to 0.2 feet.

Stay tuned for detailed reviews of these techniques (with code).

 

Posted in Database, Database Optimization, LiDAR Shootout, PostGIS, SQL

LiDAR Shootout!

Posted by smathermather on July 18, 2011

For a couple of months now I’ve been corresponding with Etienne Racine and Pierre Racine of Laval University in Quebec City.  They decided to take on the problem of finding the speed and accuracy of a number of different techniques for extracting canopy height from LiDAR data.  They have been kind enough to allow me to post the results here.  This will be a multipart series.  We’ll start with the punchline, and then build out the code behind each of the queries/calculations in turn (category will be “LiDAR Shootout“) with separate posts.

In the hopper for testing were essentially 4 different ways to extract canopy height from a LiDAR point cloud and (in three of the cases) a DEM extracted from the LiDAR point cloud.  The 4 techniques were as follows:

  • Use ArcGIS ExtractValuesToPoints.
  • Use R with the raster and rgdal packages, calling the raster::extract() function with in-memory results.
  • Use the PostGIS (2.0 svn) Raster datatype (formerly WKT Raster).
  • And use a simple nearest neighbor approach with ST_DWithin within PostGIS (roughly sketched below).
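
As a rough illustration of the nearest neighbor idea only (the table and column names are invented for this sketch; the real queries, with code, come in the follow-up posts): for each vegetation return, subtract the elevation of the closest ground return found within a small search radius.

-- Hypothetical sketch: canopy height = canopy return z minus the z of the
-- nearest ground return within 5 map units, with ST_DWithin bounding the search
SELECT c.gid,
       c.z - (SELECT g.z
              FROM lidar_ground AS g
              WHERE ST_DWithin(c.geom, g.geom, 5.0)
              ORDER BY ST_Distance(c.geom, g.geom)
              LIMIT 1) AS canopy_height
FROM lidar_canopy AS c;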

The data are from the Ohio Statewide Imagery Program (OSIP), run by the Ohio Geographically Referenced Information Program (OGRIP).  Ohio is blessed with an excellent 2006-2008 LiDAR dataset and statewide DEM derived from the (effectively) 3.5 foot horizontal posting data (specs may say 5-ft, I can’t remember…).

Speeeed:

So, first to speed, initial results.  Who wins in the speed category?  Measured as points per second (on consistently the same hardware), nearest neighbor wins by a landslide (bigger is better here):

Application:  Points per Second

ArcGIS:  4504
R:  3228
PostGIS Raster:  1451 (maybe as high as 4500)
PostGIS NN:  18208

Now, later on, Pierre discovered that changes to indexing and an EXPLAIN query analyzer optimization tripled the PostGIS Raster query speed, making it about the same speed as ArcGIS.  More on that later.

Figure removed– to be replaced in another post

Accuracy:

A fast calculation is always better, unless you trade off accuracy in some detrimental way.  With PostGIS NN, we’re not interpolating our LiDAR ground point cloud before calculating the difference, so relative to the other three techniques, we could be introducing some bias/artifacts here, a point Jeff Evans makes here.  Overall error relative to the interpolated solution introduced by using an NN technique on this dataset: 0.268 feet.  Whew!  Sigh of relief for me!  We may spend some time looking at the geographic distribution of that error to see if it’s random or otherwise, but that’s very near the error level for the input data.

Addendum:

8 days ago, Martin Isenburg let the lasheight tool drop. LAStools is impressively fast.  Preliminary tests by Etienne: 140450 points per second.

Oy!  A factor of 8 faster than PostGIS NN.  And it uses an on-the-fly DEM calculated using Delaunay triangulation.  Preliminary results indicate a very different DEM than the one calculated by the state:

RMSE: 25.4227446600656

More research will need to be done to find out the source of the differences: they are not random.

In other news:

PostGIS will get a true KNN technique in the near future.  Such a project is funded, so stay tuned for faster results:

http://www.postgis.org/pipermail/postgis-devel/2011-June/013931.html

Also, index changes to PostGIS could come down the pike, if funded, that would result in bounding box queries that are 6 times faster, à la:

http://blog.opengeo.org/2011/05/27/pgcon-notes-3/

Between the two optimizations, we could give LAStools a run for its money on speed  🙂 .  In the meantime, preliminary RMSE aside, LAStools (the 5th tool, not in this test) may win hands down.

Posted in Database, Database Optimization, LiDAR Shootout, PostGIS, SQL