This post was typed into my iPod as an homage. Assisted today by my colleague, J. Stein.
Moderate obfuscation of locations is an important technique for protecting sensitive data, say the nesting locations of the very rare and strange fuzzy-bellied gnat catcher. We still want to display the data, but we want it to be slightly wrong.
A naive approach would place the point a random distance away, i.e. location ± random() * 500 ft. Unfortunately, that distance could be as small as 0 ft, which is not an outcome we want.
So, instead we constrain our placement of random points to a minimum and maximum distance away. Enter ST_Donut. We need to construct a donut, and then place a random, ahem, sprinkle on it, and return a point:
CREATE OR REPLACE FUNCTION ST_Donut (
geom Geometry, dough numeric, nut numeric, geom2 Geometry
)
RETURNS Geometry
AS $$
DECLARE
donut Geometry;
BEGIN
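-- Build the ring (outer buffer minus inner buffer), then keep only the part inside the limiting geometry geom2
donut = ST_SetSRID(ST_Intersection(ST_Difference(ST_Buffer(geom, dough), ST_Buffer(geom, nut)), geom2), ST_SRID(geom));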
RETURN donut;
END;
$$ LANGUAGE plpgsql;
Thanks to sorokine:
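CREATE OR REPLACE FUNCTION RandomPoint (
geom Geometry
)
RETURNS Geometry
AS $$
DECLARE
maxiter INTEGER := 1000;
i INTEGER := 0;
x0 DOUBLE PRECISION;
dx DOUBLE PRECISION;
y0 DOUBLE PRECISION;
dy DOUBLE PRECISION;
xp DOUBLE PRECISION;
yp DOUBLE PRECISION;
rpoint Geometry;
BEGIN
-- find envelope
x0 = ST_XMin(geom);
dx = (ST_XMax(geom) - x0);
y0 = ST_YMin(geom);
dy = (ST_YMax(geom) - y0);
-- draw random points in the envelope until one falls within the geometry
WHILE i < maxiter LOOP
    i = i + 1;
    xp = x0 + dx * random();
    yp = y0 + dy * random();
    rpoint = ST_SetSRID(ST_MakePoint(xp, yp), ST_SRID(geom));
    EXIT WHEN ST_Within(rpoint, geom);
END LOOP;
IF i >= maxiter THEN
    RAISE NOTICE 'number of iterations exceeded max';
END IF;
RETURN rpoint;
END;
$$ LANGUAGE plpgsql;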
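Putting the two together might look like the query below. This is only a sketch: the tables nests and study_area, their columns, and the 500 ft / 100 ft radii are assumptions made for illustration. It also assumes the geometries are stored in a projected coordinate system measured in feet, and that study_area holds a single boundary polygon in the same SRID.

```sql
-- Hypothetical tables:
--   nests(id integer, geom geometry)       -- sensitive point locations
--   study_area(geom geometry)              -- one limiting polygon, same SRID
-- Each nest is moved to a random spot between 100 and 500 units away,
-- restricted to the part of the ring covered by the limiting polygon.
SELECT n.id,
       RandomPoint(ST_Donut(n.geom, 500, 100, b.geom)) AS fuzzed_geom
FROM nests AS n
CROSS JOIN study_area AS b;
```

Because random() is evaluated on every call, each run of the query yields different fuzzed locations; storing the result once is probably wiser than recomputing it on display.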
Mmmm, donut…
Edit: October 8, 11:06
The real function is:
CREATE OR REPLACE FUNCTION ST_Donut (
geom Geometry, dough numeric, nut numeric
)
RETURNS Geometry
AS $$
DECLARE
donut Geometry;
BEGIN
-- outer buffer (dough) minus inner buffer (nut) leaves the ring
donut = ST_SetSRID(ST_Difference(ST_Buffer(geom, dough), ST_Buffer(geom, nut)), ST_SRID(geom));
RETURN donut;
END;
$$ LANGUAGE plpgsql;
The code I had in the block at the top is the overloaded version. It takes two geometries; the second is a limiting geometry, say a boundary outside of which you don’t want to place the randomized point. PostgreSQL lets you overload functions: if two functions share a name but differ in their input arguments, whether by type, number, or both, it runs the function that matches both the name and the inputs. This allows us to create two functions with the same name but slightly different purposes. BTW, this overloaded function should probably check that the second geometry is a polygon, but it does not. Buyer beware…
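A guarded variant of the four-argument overload could look roughly like this. It is a sketch rather than the function as posted: it assumes that "limiting" means clipping the ring to geom2 with ST_Intersection, and it uses ST_Dimension to reject anything that is not polygonal. The SRID 2264 and the coordinates in the example calls are arbitrary placeholders.

```sql
CREATE OR REPLACE FUNCTION ST_Donut (
    geom Geometry, dough numeric, nut numeric, geom2 Geometry
)
RETURNS Geometry
AS $$
BEGIN
    -- ST_Dimension returns 2 for polygons and multipolygons
    IF ST_Dimension(geom2) <> 2 THEN
        RAISE EXCEPTION 'ST_Donut: limiting geometry must be polygonal, got %',
            GeometryType(geom2);
    END IF;

    -- Ring = outer buffer minus inner buffer, clipped to the limiting polygon
    RETURN ST_SetSRID(
        ST_Intersection(
            ST_Difference(ST_Buffer(geom, dough), ST_Buffer(geom, nut)),
            geom2),
        ST_SRID(geom));
END;
$$ LANGUAGE plpgsql;

-- With both versions defined, the argument count picks the function:
-- three arguments give the full ring, four give the clipped ring.
SELECT ST_AsText(ST_Donut(ST_SetSRID(ST_MakePoint(0, 0), 2264), 500, 100));
SELECT ST_AsText(ST_Donut(ST_SetSRID(ST_MakePoint(0, 0), 2264), 500, 100,
                          ST_MakeEnvelope(0, 0, 1000, 1000, 2264)));
```

Raising an exception is one choice; returning NULL or falling back to the unclipped ring would be equally defensible, depending on how forgiving the caller should be.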
Hi,
I think taking random x’, y’ in the bounding box frame around x, testing if the distance to x is inside the donut circles and retrying if not is probably quicker than computing a buffer and 2 differences. Keep us posted if you try it!
Ahh, how embarrassing! I did use ST_Buffer to effectively test distance to a geographic entity. And then I wrote a post about it. Confession: I was so excited with the language games around ST_Donut, PostGIS for breakfast, and sprinkle, I gave no thought to query optimization, and chose the least efficient method. I shall rewrite ASAP.
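For what it is worth, the approach suggested in the comment above might be sketched like this. It is not the promised rewrite, only an illustration under stated assumptions: the function name ST_DonutPoint is made up, the input is assumed to be a single point in a projected coordinate system, and 0 <= nut < dough is assumed to hold.

```sql
-- Hypothetical function: rejection sampling on distance, no buffers involved.
CREATE OR REPLACE FUNCTION ST_DonutPoint (
    geom Geometry, dough numeric, nut numeric
)
RETURNS Geometry
AS $$
DECLARE
    maxiter INTEGER := 1000;
    i INTEGER := 0;
    dx DOUBLE PRECISION;
    dy DOUBLE PRECISION;
    d DOUBLE PRECISION;
BEGIN
    WHILE i < maxiter LOOP
        i := i + 1;
        -- random offset inside the square of half-width dough around the point
        dx := (2 * random() - 1) * dough;
        dy := (2 * random() - 1) * dough;
        d := sqrt(dx * dx + dy * dy);
        -- accept only offsets that land between the inner and outer circles
        IF d >= nut AND d <= dough THEN
            RETURN ST_SetSRID(
                ST_MakePoint(ST_X(geom) + dx, ST_Y(geom) + dy),
                ST_SRID(geom));
        END IF;
    END LOOP;
    RAISE EXCEPTION 'ST_DonutPoint: no point found in % tries', maxiter;
END;
$$ LANGUAGE plpgsql;

-- Example call with placeholder coordinates and SRID
SELECT ST_AsText(ST_DonutPoint(ST_SetSRID(ST_MakePoint(0, 0), 2264), 500, 100));
```

The trade-off is that this only works for point input, whereas the buffer-based donut accepts any geometry.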