Smathermather's Weblog

Remote Sensing, GIS, Ecology, and Oddball Techniques

Posts Tagged ‘ImagePyramid’

Efficient delivery of raster data part 4

Posted by smathermather on May 1, 2016

Guest post from my colleague Patrick Lorch, who wrote up what we did the other day to view a whole bunch of tiled images in a directory in QGIS. (I did some mild editing to his post. Mistakes are mine.) The great thing about the approach is that it generalizes to most tools that use GDAL for their raster API. This is part of a series, which you can read in reverse starting from this post.

Building virtual datasets from a bunch of tiffs
What do you do when someone gives you aerial images stored as tiles in different directories representing different zoom levels? The goal is to make them easy to use as baselayers in QGIS. The answer is to reference them in a virtual data set (VRT).

gdalbuildvrt is the ticket
First make lists of tiffs
If the directory structure is something like this:

total 8047
drwxr-xr-x 7 pdl    4096 Apr 29 14:22 ./
drwxr-xr-x 5 pdl    4096 Apr 27 22:10 ../
drwxr-xr-x 2 pdl 1310720 Apr 22 21:37 0/
drwxr-xr-x 2 pdl  393216 Apr 22 22:54 1/
drwxr-xr-x 2 pdl   98304 Apr 28 14:44 2/
drwxr-xr-x 2 pdl   32768 Apr 28 14:44 3/
drwxr-xr-x 2 pdl    8192 Apr 28 14:44 4/

Then you first need a list file for each directory, naming the tiffs it contains.

ls 0/*.tif > list0.txt
ls 1/*.tif > list1.txt
ls 2/*.tif > list2.txt
ls 3/*.tif > list3.txt
ls 4/*.tif > list4.txt
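
With directories as full as the ones listed above, a shell glob can fail with "Argument list too long". A minimal sketch of an alternative that uses find instead; the function name make_lists is made up for illustration, and -maxdepth assumes GNU or BSD find:

```shell
# Write one list file per zoom-level directory passed as an argument.
# find streams the file names instead of expanding them on a command line,
# and sort keeps the order stable from run to run.
make_lists() {
  for level in "$@"; do
    find "$level" -maxdepth 1 -type f -name '*.tif' | sort > "list${level}.txt"
  done
}
```

Run it from the directory that holds 0/ through 4/ as: make_lists 0 1 2 3 4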

Now make the VRTs

gdalbuildvrt -input_file_list list0.txt aerial_2015_0.vrt
gdalbuildvrt -input_file_list list1.txt aerial_2015_1.vrt
gdalbuildvrt -input_file_list list2.txt aerial_2015_2.vrt
gdalbuildvrt -input_file_list list3.txt aerial_2015_3.vrt
gdalbuildvrt -input_file_list list4.txt aerial_2015_4.vrt
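
The five commands differ only in the level number, so they can be wrapped in a loop. This is a sketch rather than what we actually ran; build_vrts and the DRY_RUN switch are names invented here:

```shell
# Build one VRT per zoom level from list files named list<level>.txt.
# If DRY_RUN is set to a non-empty value, the commands are printed, not run.
build_vrts() {
  prefix=$1
  shift
  for level in "$@"; do
    ${DRY_RUN:+echo} gdalbuildvrt -input_file_list "list${level}.txt" "${prefix}_${level}.vrt"
  done
}

# Preview the five commands from this post without touching any files.
DRY_RUN=1
build_vrts aerial_2015 0 1 2 3 4
```

Unset DRY_RUN (or set it to an empty string) to have the loop call gdalbuildvrt for real.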

Now you can open these in QGIS depending on what zoom level you need.

These VRTs may now be loaded as ordinary rasters in QGIS or whatever you please. In this case, we retiled with multiple resample levels (see this post for more info), so we’ll have to define the minimum and maximum scales at which each image collection is visible.

Thanks for the write-up, Pat!

Posted in GDAL

(Whichever tiler you use) and efficient delivery of raster data (image pyramid layer) (update2)

Posted by smathermather on April 15, 2016

Subdivision of geographic data is a panacea for problems you didn’t know you had.

Maybe you deal with vector data, so you pre-tile your vector data to ship to the browser to render: you’re makin’ smaller data. Maybe you use cutting-edge PostGIS, so you apply ST_Subdivide to keep your data smaller than the database page size, like Paul Ramsey describes here. Smaller’s better… Or perhaps you are forever reprojecting your data in strange ways, across problematic boundaries, or need to buffer in an optimum coordinate system to avoid distortion. Regardless of the reason, smaller is better.

Maybe you aren’t doing vector work this time, but raster. What’s the equivalent tiling process? I wrote about this for GeoServer almost 5 (eep!) years ago now (with a slightly more recent follow-up), and much of what I wrote still applies:

  • Pre-tile your raw data in modest chunks
  • Use geotiff so you can use internal data structures to have even smaller tiles inside your tiles
  • Create pyramids / pre-summarized data as tiles too.

Fortunately, while these posts were written for GeoServer, they apply to any tiler. Pre-process with gdal_retile.

gdal_retile.py -v -r bilinear -levels 4 -ps 6144 6144 -co "TILED=YES" -co "BLOCKXSIZE=256" -co "BLOCKYSIZE=256" -s_srs EPSG:3734 -targetDir aerial_2011 --optfile list.txt

Let’s break this down a little:

First we choose our resampling method for our pyramids (bilinear). Lanczos would also be fine here.

-r bilinear

Next we set the number of resampling levels. This will depend on the size of the dataset.

-levels 4
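
To get a feel for what four levels buys you, here is a rough sketch of the pixel size at each level. It assumes a 6-inch (0.1524 m) base pixel and that each level doubles the pixel size of the one before it; the function name pixel_sizes is made up for this example:

```shell
# Print the pixel size at each pyramid level, level 0 being full resolution.
# Assumption: every level halves the resolution of the previous one.
pixel_sizes() {
  size=$1      # base pixel size, in inches
  levels=$2    # value passed to -levels
  level=0
  while [ "$level" -le "$levels" ]; do
    echo "level $level: $size inches per pixel"
    size=$((size * 2))
    level=$((level + 1))
  done
}

pixel_sizes 6 4
```

Under those assumptions the coarsest level comes out at 96 inches, or 8 feet, per pixel.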

Next we specify the pixel and line size of the output GeoTIFF. This can be pretty large. We probably want to avoid a size that forces the use of BigTIFF (i.e. files larger than 4 GB).

-ps 6144 6144
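
A quick back-of-the-envelope check shows how far a 6144×6144 tile is from that limit. The sketch assumes 3 bands (RGB) of 8-bit data, which may not match your imagery, and tile_bytes is a name made up here:

```shell
# Uncompressed size in bytes of one tile: width * height * bands * bytes per sample.
tile_bytes() {
  echo $(($1 * $2 * $3 * $4))
}

bytes=$(tile_bytes 6144 6144 3 1)   # assumed: 3 bands, 1 byte per sample
echo "$bytes bytes, or $((bytes / 1048576)) MiB"
```

That works out to 108 MiB per tile before compression, a small fraction of 4 GB.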

Now we get into the geotiff data structure — we internally tile the tifs, and make them 256×256 pixels. We could also choose 512. We’re just aiming to have our tile size near to the size that we are going to send to the browser.

-co "TILED=YES" -co "BLOCKXSIZE=256" -co "BLOCKYSIZE=256"
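
For a sense of scale, this sketch counts how many internal blocks each output tile ends up with for a given block size. The name blocks_per_tile is invented for the example, and square tiles and blocks are assumed:

```shell
# Number of internal blocks in a square tile, rounding up for partial blocks.
blocks_per_tile() {
  tile=$1
  block=$2
  per_side=$(( (tile + block - 1) / block ))
  echo $((per_side * per_side))
}

blocks_per_tile 6144 256   # 24 x 24 blocks
blocks_per_tile 6144 512   # 12 x 12 blocks
```

So a request for one 256×256 map tile only needs to touch a handful of the 576 blocks in a file rather than the whole thing.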

Finally, we specify our coordinate system (this is state plane Ohio), our output directory (which needs to be created ahead of time), and our input file list.

-s_srs EPSG:3734 -targetDir aerial_2011 --optfile list.txt

That’s it. Now you have a highly optimized raster dataset that lets a tiler:

  • get the level of detail necessary for a given request,
  • and extract only the data necessary for a given request.

Pretty much any geospatial solution which uses GDAL can leverage this work to make for very fast rendering of raster data to a tile cache. If space is an issue, apply compression options that match your use case.
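
As a sketch of what compression options might look like, here are two common GeoTIFF choices added to the same command. The JPEG settings assume 8-bit, 3-band RGB imagery and a quality of 85 picked purely for illustration; none of this was part of the run described above:

```shell
# Internal tiling options, as in the command above.
tile_opts='-co TILED=YES -co BLOCKXSIZE=256 -co BLOCKYSIZE=256'

# Lossy JPEG-in-TIFF; assumes 8-bit, 3-band RGB imagery.
jpeg_opts='-co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co JPEG_QUALITY=85'

# Lossless alternative for data that must come back unchanged.
deflate_opts='-co COMPRESS=DEFLATE -co PREDICTOR=2'

# Print the full command for review. The option variables are left unquoted
# on purpose so the shell splits them into separate arguments.
# Remove the leading 'echo' to run it for real.
echo gdal_retile.py -v -r bilinear -levels 4 -ps 6144 6144 \
  $tile_opts $jpeg_opts \
  -s_srs EPSG:3734 -targetDir aerial_2011 --optfile list.txt
```

Swap $jpeg_opts for $deflate_opts if the pixels need to survive untouched, at the cost of larger files.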

Posted in GDAL

GeoServer and efficient delivery of raster data (image pyramid layer) (update)

Posted by smathermather on May 11, 2012

A perennial favorite on this blog is “GeoServer and efficient delivery of raster data (image pyramid layer)”. I am neither the last nor the first authority on this topic (check the GeoSolutions blog for authoritative work on GeoServer and raster, also look to the GeoServer documentation), but I’ve had some good experiences with serving rasters in GeoServer, especially using image pyramid layers.

Read the original, as this post just augments it, but here are some targets to hit with the retiling necessary for larger datasets. This is the command I currently use for the retiling:

    
gdal_retile.py -v -r bilinear -levels 4 -ps 6144 6144 -co "TILED=YES" -co "BLOCKXSIZE=256" -co "BLOCKYSIZE=256" -s_srs EPSG:3734 -targetDir aerial_2011 --optfile list.txt
    
    

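Don’t be afraid of big tile sizes (the -ps value here is 6144×6144). Bump your memory up in your application container, stop worrying and learn to love the larger tif. I keep my total number of output tifs to no more than 2000, since beyond that I start to see performance issues in my implementation.

Also, give the image pyramid code a break. After retiling, move the full-resolution tiles, which gdal_retile.py leaves at the top of the target directory, into their own directory named 0: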

mkdir 0
mv *.tif 0/.
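
To see where a retiled directory stands against that 2000-tif ceiling, something like the following sketch would do. The function name check_tile_count is made up, and the limit is just the number that happens to hurt in my implementation:

```shell
# Count every .tif under a directory (all pyramid levels included) and
# report whether the total is within a limit. Returns 1 when over the limit.
check_tile_count() {
  dir=$1
  limit=$2
  total=$(find "$dir" -type f -name '*.tif' | wc -l | tr -d ' ')
  if [ "$total" -gt "$limit" ]; then
    echo "$total tifs in $dir: over the limit of $limit"
    return 1
  fi
  echo "$total tifs in $dir: within the limit of $limit"
}
```

For the command above it would be called as: check_tile_count aerial_2011 2000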
    

Posted in GeoServer, GeoWebCache

GeoServer and efficient delivery of raster data (image pyramid layer)

Posted by smathermather on May 12, 2011

One thing I’ve learned in the last few years is that there is no theoretical reason why (properly indexed and summarized) data cannot be displayed at all scales as quickly as at any one scale. This is the principle at work behind the extraordinary efficiencies of delivering data and imagery through a slippy map interface like Google, Bing, and OpenLayers, as well as efficient thick client interfaces like Google Earth. So, in principle, and largely in practice, serving spatial data for a whole county or the whole world shouldn’t be any more onerous than serving data for a particular site, so long as you have adequate storage for the pre-rendered and summarized data, and time to pre-render the data. As storage tends to be cheaper than processing and network speed, this is a no-brainer.

A number of great Open Source tools exist to help with serving large amounts of data efficiently, not the least of which is my favorite, GeoServer (paired with GeoWebCache). For serving imagery, in our case orthorectified 6-inch (0.1524 meter) aerial imagery, we have a few options. GeoServer does natively support GeoTiff, but for this large an area at this level of detail, we’d have to wade into the realm of BigTiff support through the GDAL extension, because we have 160 GB of imagery to serve. We could use wavelet compressed imagery, e.g. MrSID, ECW, or JPEG 2000, but I don’t have a license to create a lossless version of these, and besides, storage is cheaper than processors. Wavelet compressed imagery may be a good field solution, but for server side work, it doesn’t make a lot of sense unless it’s all you have available. Finally, there are two data source extensions to GeoServer meant for large imagery, the ImageMosaic Plugin and the ImagePyramid Plugin. The ImageMosaic Plugin works well for serving large amounts of images, and has some great flexibility with respect to handling transparency and image overlap. The ImagePyramid extension is tuned for serving imagery at many scales. The latter is what we chose to deploy.

The ImagePyramid extension takes advantage of gdal_retile.py, a utility built as part of GDAL that takes an image or set of images, re-tiles them to a standardized size (e.g. 2048×2048), and creates overviews as separate images in a hierarchy (here shown as outlines of the images):

But here’s the problem: for some reason, I can’t load all the images at once. If I do, only the low resolution pyramids (8-foot pixels and larger) load. If I break the area into smaller chunks, most of them fewer than 2000 images, they load fine.

1" = 50' scale snapshot

1:32,000 scale snapshot

Posted in GeoServer, GeoWebCache