Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Archive for October, 2009

Loading LiDAR data in PostGIS

Posted by smathermather on October 23, 2009

For a variety of reasons, I want to load my whole LiDAR vegetation dataset into PostGIS. I have it as raw, unclassified LAS files, and as classified, space-delimited text files. The unclassified LAS is nice if I ever want to refine the data, but for now we'll assume the classified data were correctly classified by someone smarter than I am (a likely scenario).

First I wanted to concatenate the text files (here in Windows Command):

copy *.xyz cuy_lidar_veg.txt

Now, I really wanted them in CSV form to make my insert statements easy, so I cheated and used LAStools for the job, assuming it would be more efficient than a stream-based script in awk or sed:

txt2las -i cuy_lidar_veg.txt -o cuy_lidar_veg.las -parse xyz
las2txt -i cuy_lidar_veg.las -o cuy_lidar_veg.csv -sep comma -parse xyz

converting from this:

2114483.860 612559.340 794.720
2114470.510 612558.420 794.420
2114449.970 612565.010 793.900
2114451.540 612569.640 793.800
2114481.010 612571.740 794.390
2114473.230 612572.950 794.130 ...

to this:

2114483.86,612559.34,794.72
2114470.51,612558.42,794.42
2114449.97,612565.01,793.9
2114451.54,612569.64,793.8
2114481.01,612571.74,794.39
2114473.23,612572.95,794.13 ...
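(For comparison, the stream-based version I skipped would have been a one-liner in awk; something like this should produce the same CSV, though it keeps the trailing zeros that las2txt strips, and I haven't benchmarked it:)

awk 'BEGIN { OFS="," } { print $1, $2, $3 }' cuy_lidar_veg.txt > cuy_lidar_veg.csv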

I need a table to accept the data, so in psql:

CREATE TABLE base.cuy_veg_lidar(
 x numeric,
 y numeric,
 z numeric
);

Now to use awk to format a series of insert statements and pass them through psql right into the database:

awk '{ print "INSERT INTO base.cuy_veg_lidar (x,y,z) VALUES (" $0 ");" }' cuy_lidar_veg.csv | psql -U postgres -d CM

This creates a series of INSERT statements which look like this:

INSERT INTO base.cuy_veg_lidar (x,y,z) VALUES (2114120.67,612570.62,791.11);

Oh, and I ran this last step in a Cygwin command prompt so I could easily access my favorite Unix utilities… I'm still a newbie in the Windows world. I need my crutches…

Why so much work when I could just use Postgres' COPY command? Permissions, permissions, permissions… Server-side COPY reads the file as the database server process, and I like my database to have very little file access ability…
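(Worth noting: psql's \copy meta-command reads the file on the client side rather than the server side, so it sidesteps the file-access issue. Something along these lines should load the same CSV directly; just a sketch, assuming the table above:)

\copy base.cuy_veg_lidar (x, y, z) FROM 'cuy_lidar_veg.csv' WITH CSV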

Next we add a geometry column for 3-dimensional points:

SELECT AddGeometryColumn('base','cuy_veg_lidar','the_geom',102722,'POINT', 3);

And populate that with our XYZ information:

UPDATE base.cuy_veg_lidar SET the_geom = ST_SetSRID(ST_MakePoint(x,y,z),102722);
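A quick sanity check that the points landed with the right SRID and all three dimensions (purely illustrative):

SELECT ST_AsEWKT(the_geom) FROM base.cuy_veg_lidar LIMIT 1;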

Posted in Database, PostGIS, SQL

Unique constraint on geometry continued

Posted by smathermather on October 22, 2009

An actual example of using hashed geometry as a unique constraint (nothing really new here, just applying what we learned in the last post to a new problem):

I've got a contours dataset from the county where I work that comes as 20 different shapefiles ranging in size from 500MB to 2GB. I want to put them into a single PostGIS table so that I can use PostGIS to slice and dice them into reasonable chunks for our engineers in AutoDesk products and for ArcGIS users in several divisions, and to improve spatial indexing once they are all in one table in our spatial database (a la Regina Obe/Boston GIS's Map Dicing). This is a detailed LiDAR-based contour dataset which used breaklines to make an accurate product, one which is sadly unusable in its current form because it takes too long to load any given section.

But today, no slicing and dicing just yet. I loaded all the data and discovered duplicate geometries; it seems that some of the blocks the data come in overlap. I don't have a screenshot of it, but when I viewed the data in uDig the duplicates were easy to spot: where contour lines are drawn twice they appear less transparent, i.e. darker against a white background. I could search for and remove the duplicates (a query for that is sketched below), but that seems heavy-handed. Instead, I'm going to recreate the table with a unique constraint applied to the hashed geometry (and hope for no hash collisions).
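(For the record, finding the duplicates directly would look something like this, assuming the initial load went into a table named base.cuy_contours; that table name is illustrative:)

SELECT MD5(ST_AsBinary(the_geom)) AS geom_hash, COUNT(*) AS n
 FROM base.cuy_contours
 GROUP BY 1
 HAVING COUNT(*) > 1;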

First we create the table:

CREATE TABLE base.cuy_contours_2
(
 gid serial NOT NULL,
 elevation smallint,
 shape_leng real,
 the_geom geometry,
 hash character(32),
 CONSTRAINT cuy_contours_2_pkey PRIMARY KEY (gid),
 CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 4),
 CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'LINESTRING'::text OR the_geom IS NULL),
 CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 102722)
)
WITH (OIDS=FALSE);
ALTER TABLE base.cuy_contours_2 OWNER TO postgres;

CREATE INDEX cuy_contours_2_the_geom_idx
 ON base.cuy_contours_2
 USING gist
 (the_geom);

Now we create a trigger that computes a hashed version of the geometry, “the_geom”, and stores it in a column called “hash” as each row is inserted.

CREATE OR REPLACE FUNCTION hash_geom() RETURNS TRIGGER AS $hasher$
BEGIN
 IF (TG_OP = 'INSERT') THEN
  -- hash just the incoming row's geometry, rather than re-hashing the whole table
  NEW.hash := MD5(ST_AsBinary(NEW.the_geom));
 END IF;
 RETURN NEW;
END;
$hasher$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS hasher ON base.cuy_contours_2;
CREATE TRIGGER hasher BEFORE INSERT ON base.cuy_contours_2 FOR EACH ROW EXECUTE PROCEDURE hash_geom();

And finally we add a unique constraint:

ALTER TABLE base.cuy_contours_2 ADD CONSTRAINT geom_uniq UNIQUE (hash);
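With the trigger and constraint in place, a duplicate geometry should bounce off the table. A quick illustration with made-up coordinates (the values are hypothetical, just to show the mechanics):

INSERT INTO base.cuy_contours_2 (elevation, the_geom)
 VALUES (800, ST_GeomFromEWKT('SRID=102722;LINESTRING(0 0 800 0, 10 10 800 0)'));
-- running the identical INSERT a second time fails with a unique violation on geom_uniq,
-- since the trigger computes the same hash for the same geometry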

Now, sadly, I have to use Windows where I work, so the following is Windows Command language to load all my shapefiles into the PostGIS database. That said, I think the command, in this rare case, is more elegant than the equivalent in, say, BASH scripting (a BASH version follows for comparison):

for %f in (*.shp) do shp2pgsql -s 102722 -a -S -N skip %f base.cuy_contours_2 | psql -h bobs_server -d CM -U postgres

using the shp2pgsql -a flag for append, -S to enforce simple geometries, and -N skip to skip any records with null geometries, which compensates for potential errors in the input dataset.
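For comparison, the BASH equivalent might look like this (same flags, same server):

for f in *.shp
do
 shp2pgsql -s 102722 -a -S -N skip "$f" base.cuy_contours_2 | psql -h bobs_server -d CM -U postgres
done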

Posted in PostGIS