Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Taking Slices from LiDAR data: Part VI

Posted by smathermather on March 19, 2016

I finally got PDAL properly compiled with Point Cloud Library (PCL) baked in. Word to the wise — CLANG is what the makers are using to compile. The PDAL crew were kind enough to revert the commit which broke GCC support, but why swim upstream? If you are compiling PDAL yourself, use CLANG. (Side note, the revert to support GCC was really helpful for ensuring we could embed PDAL into OpenDroneMap without any compiler changes for that project.)

With a compiled version of PDAL with the PCL dependencies built in, I can bypass using the docker instance. When I was spawning tens of threads of Docker and then killing them, recovery was a problem (it would often hose my docker install completely). I’m sure there’s some bug to report there, or perhaps spawning 40 docker threads is ill advised for some grander reason, but regardless, running PDAL outside a container has many benefits, including simpler code. If you recall our objectives with this script, we want to:

  • Calculate relative height of LiDAR data
  • Slice that data into bands of heights
  • Load the data into a PostgreSQL/PostGIS/pgPointCloud database.

The control script without docker becomes as follows:

#!/bin/bash 

# readlink gets us the full path to the file. This is necessary for docker
readlinker=`readlink -f $1`
# returns just the directory name
pathname=`dirname $readlinker`
# basename will strip off the directory name and the extension
name=`basename $1 .las`

# PDAL must be built with PCL.
# See http://www.pdal.io/tutorial/calculating-normalized-heights.html

pdal translate "$name".las "$name".bpf height --writers.bpf.output_dims="X,Y,Z,Intensity,ReturnNumber,NumberOfReturns,ScanDirectionFlag,EdgeOfFlightLine,Classification,ScanAngleRank,UserData,PointSourceId,HeightAboveGround"

# Now we split the lidar data into slices of heights, from 0-1.5 ft, etc.
# on up to 200 feet. We're working in the Midwest, so we don't anticipate
# trees much taller than ~190 feet
for START in 0:1.5 1.5:3 3:6 6:15 15:30 30:45 45:60 60:105 105:150 150:200
        do
        # We'll use the height classes to name our output files and tablename.
        # A little cleanup is necessary, so we're removing the colon ":".
        nameend=`echo $START | sed s/:/-/g`

        # Name our output
        bpfname=$name"_"$nameend.bpf

        # Implement the height range filter
        pdal translate $name.bpf $bpfname -f range --filters.range.limits="HeightAboveGround[$START)"

        # Now we put our data in the PostgreSQL database.
        pdal pipeline -i pipeline.xml --writers.pgpointcloud.table='pa_layer_'$nameend --readers.bpf.filename=$bpfname --writers.pgpointcloud.overwrite='false'
done

We still require our pipeline xml in order to set our default options as follows:

<?xml version="1.0" encoding="utf-8"?>
<Pipeline version="1.0">
  <Writer type="writers.pgpointcloud">
    <Option name="connection">
      host='localhost' dbname='user' user='user' password=‘password’
    </Option>
    <Option name="table">54001640PAN_heightasz_0-1.5</Option>
    <Option name="compression">dimensional</Option>
    <Filter type="filters.chipper">
      <Option name="capacity">400</Option>
      <Reader type="readers.bpf">
      <Option name="filename">54001640PAN_heightasz_0-1.5.bpf</Option>
      </Reader>
    </Filter>
  </Writer>
</Pipeline>

And as before, we can use parallel to make this run a little lot faster:

find . -name '*.las' | parallel -j20 ./pdal_processor.sh

For the record, I found out through testing that my underlying host only has 20 processors (though more cores). No point in running more processes than that… .

So, when point clouds get loaded, they’re broken up in to “chips” or collections of points. How many chips do we have so far?:

user=# SELECT COUNT(*) FROM "pa_layer_0-1.5";
  count   
----------
 64413535
(1 row)

Now, how many rows is too many in a PostgreSQL database? Answer:

In other words, your typical state full of LiDAR (Pennsylvania or Ohio for example) are not too large to store, retrieve, and analyze. If you’re in California or Texas, or have super dense stuff that’s been flown recently, you will have to provide some structure in the form of partitioning your data into separate tables based on e.g. geography. You could also modify your “chipper” size in the XML file. I have used the default 400 points per patch (for about 25,765,414,000 points total), which is fine for my use case as then I do not exceed 100 million rows once the points are chipped:

      <Option name="capacity">400</Option>

3 Responses to “Taking Slices from LiDAR data: Part VI”

  1. Finally something very real and clear on PDAL after “years” of search ! Would it be alright and its really to much to ask I guess but it does’nt hurt to ask since I have been on constant struggle trying to make the pipe work for my work for our national mapping project using LiDAR. Can you provide a walk through on your successful compilation? I’m working feature extraction for floodplains – we have plenty of floods year round in the Philippines!

  2. […] Part 7 of N… , see e.g. my previous post on the topic. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: