KNN with FLANN and laspy, a starting place
Posted by smathermather on August 8, 2014
FLANN is the Fast Library for Approximate Nearest Neighbors, a purportedly wicked fast nearest neighbor library for comparing multi-dimensional points. I say purportedly only because I haven’t verified it myself, but I assume it to be true. I’d like to move some (or all) of my KNN calculations outside the database.
Here is what I’d like to do with FLANN: take a LiDAR point cloud and turn it into a LiDAR height-above-ground point cloud. What follows are my explorations so far.
In a previous series of posts (e.g. https://smathermather.wordpress.com/2014/07/14/lidar-and-pointcloud-extension-pt-6/), I have been using the point cloud extension in PostGIS. I like the 2-D chipping, but I think I should segregate my data into height classes before sending it into the database. That way I can query my data efficiently by height class and by location, taking full advantage of storing all those little points in chips, while still being able to query the data in any dimension I need in the future. Enter FLANN.
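To make the height-class idea concrete, here is a minimal sketch of splitting points into classes before loading them. The class breaks and the function names are made up for illustration; they are not from the pointcloud extension or any library.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical class breaks, in the same units as the heights.
BREAKS = [0.5, 2.0, 5.0, 15.0]


def height_class(height, breaks=BREAKS):
    """Return the index of the height class that `height` falls into.

    Class 0 is everything below the first break; the last class is
    everything at or above the final break.
    """
    return bisect_right(breaks, height)


def split_by_height_class(points, breaks=BREAKS):
    """Group (x, y, height) tuples into a dict keyed by height class."""
    classes = defaultdict(list)
    for x, y, h in points:
        classes[height_class(h, breaks)].append((x, y, h))
    return dict(classes)


# A few made-up points: (x, y, height above ground).
points = [(0.0, 0.0, 0.1), (1.0, 0.0, 1.2), (1.0, 1.0, 7.5), (2.0, 1.0, 30.0)]
by_class = split_by_height_class(points)
for cls in sorted(by_class):
    print(cls, by_class[cls])
```

Each class could then be chipped and loaded separately, so a query for one height class only touches the chips for that class.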
I haven’t gotten far. To use FLANN with LiDAR through Python, I’m also using laspy. There’s a great tutorial here: http://laspy.readthedocs.org/en/latest/tut_part_1.html
I make one change to the tutorial section using FLANN. The code as written is:
    import laspy
    import pyflann as pf
    import numpy as np

    # Open a file in read mode:
    inFile = laspy.file.File("./laspytest/data/simple.las")
    # Grab a numpy dataset of our clustering dimensions:
    dataset = np.vstack([inFile.X, inFile.Y, inFile.Z]).transpose()

    # Find the nearest 5 neighbors of point 100.
    neighbors = flann.nn(dataset, dataset[100,], num_neighbors = 5)
    print("Five nearest neighbors of point 100: ")
    print(neighbors)
    print("Distances: ")
    print(neighbors)
To make this example work with the current version of pyflann, we need to import all of pyflann (or at least the FLANN class) and create an instance with flann = FLANN(), as follows:
    import laspy
    import numpy as np
    from pyflann import *

    # Open a file in read mode:
    inFile = laspy.file.File("simple.las")
    # Grab a numpy dataset of our clustering dimensions:
    dataset = np.vstack([inFile.X, inFile.Y, inFile.Z]).transpose()

    # Find the nearest 5 neighbors of point 100.
    flann = FLANN()
    # nn() returns a tuple: (neighbor indices, distances)
    neighbors = flann.nn(dataset, dataset[100,], num_neighbors = 5)
    print("Five nearest neighbors of point 100: ")
    print(neighbors[0])
    print("Distances: ")
    print(neighbors[1])
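Since I haven’t verified FLANN’s answers myself, one way to sanity-check them on a small sample is an exact brute-force search in plain numpy. This is only a sketch: exact_knn is a name I made up, the random array stands in for the X, Y, Z columns of a real file, and it returns squared distances because, as far as I know, that is what FLANN’s default Euclidean metric reports (worth checking against your own output).

```python
import numpy as np


def exact_knn(dataset, query, k=5):
    """Return (indices, squared_distances) of the k points nearest to query.

    Brute force: computes the distance to every point, so it is only
    practical for small samples.
    """
    data = np.asarray(dataset, dtype=np.float64)
    diff = data - np.asarray(query, dtype=np.float64)
    sq_dist = np.einsum("ij,ij->i", diff, diff)  # row-wise dot product
    order = np.argsort(sq_dist, kind="stable")[:k]
    return order, sq_dist[order]


# Stand-in data: random integer coordinates instead of a real LAS file.
rng = np.random.default_rng(0)
dataset = rng.integers(0, 1000, size=(500, 3))

indices, sq_distances = exact_knn(dataset, dataset[100], k=5)
print("Exact five nearest neighbors of point 100:")
print(indices)
print("Squared distances:")
print(sq_distances)
```

Because the query point is itself in the dataset, the first distance should be zero. If FLANN’s indices differ from these on a small sample, that shows the approximation at work.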
Finally, a small note on installing pyflann on Ubuntu. What I’m about to document is undoubtedly not the recommended way to get pyflann working. But it worked.
Installation for FLANN on Ubuntu can be found here: http://www.pointclouds.org/downloads/linux.html
But this does not seem to install pyflann. That said, it installs all our dependencies + FLANN, so…
I cloned, compiled, and installed the FLANN repo: https://github.com/mariusmuja/flann
    git clone git://github.com/mariusmuja/flann.git
    cd flann
    mkdir BUILD
    cd BUILD
    cmake ../.
    make
    sudo make install
This gets pyflann where it needs to go, and voila! We can now do nearest neighbor searches within Python.
Next step: turn my LiDAR xyz point cloud into an xy-height point cloud, then dump it, height class by height class, into PostgreSQL. Wish me luck!
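For what it’s worth, here is a rough sketch of what that xyz to xy-height step might look like. It assumes the ground points have already been identified somehow (for example, LAS classification 2) and uses a brute-force numpy search in place of FLANN. The function name and the choice to average the k nearest ground points are my own assumptions, not a tested workflow.

```python
import numpy as np


def height_above_ground(points, ground, k=3):
    """Return an (n, 3) array of x, y, height-above-ground.

    The ground elevation under each point is estimated as the mean z of
    its k nearest ground points, with nearness measured in XY only.
    """
    points = np.asarray(points, dtype=np.float64)
    ground = np.asarray(ground, dtype=np.float64)
    k = min(k, len(ground))
    # Pairwise squared XY distances, shape (n_points, n_ground).
    # This is O(n * m) in memory, so it only suits small tiles.
    d2 = ((points[:, None, :2] - ground[None, :, :2]) ** 2).sum(axis=2)
    # Indices of the k closest ground points per point (unordered within k).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    ground_z = ground[nearest, 2].mean(axis=1)
    return np.column_stack([points[:, 0], points[:, 1], points[:, 2] - ground_z])


# Made-up example: two ground points and three points to convert.
ground = [[0.0, 0.0, 10.0], [10.0, 0.0, 20.0]]
points = [[0.0, 0.0, 15.0], [10.0, 0.0, 25.0], [1.0, 0.0, 10.0]]
print(height_above_ground(points, ground, k=1))
```

On a real point cloud, the pairwise distance matrix would be far too large, which is exactly where a FLANN index over the ground points would take over the neighbor search.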