Oracle Spatial Mapping and Map Rendering Performance Tips

Comment: This article was written about 10 years ago. Most of what is written is still relevant though the comments about disk usage and access do not apply if SSD is being used.

Introduction

There are lots of things one can do to improve performance in mapping environments because a lot of the visualisation is based on “background” or read-only data.

Here are 10 “tips” that I find useful:

1. Spatially sort read-only data.

This tip ensures that data that are close to each other in space are also next to each other on disk! Dan gave a good suggestion when he referenced Chapter 14, “Reorganize the Table Data to Minimize I/O”, pp 580-582, Pro Oracle Spatial. But just as easily one can create a table as SELECT… WHERE SDO_FILTER() where the filtering object is an optimized rectangle across the whole of the dataset. (This is quite quick on 10g and above but much slower on earlier releases.)

There are many methods for doing this; I suggest you consider using a Morton space key based sort – ORDER BY Morton(gx,gy) – as per the article Spatial Sorting of Data via Morton Key.
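As a minimal sketch only (assuming a MORTON(col,row) function as per the referenced article, and a hypothetical POINTS_SRC table whose geometries are stored in SDO_GEOMETRY.SDO_POINT; the divisor of 100 sets the sorting grid cell size and should be tuned to your data):

CREATE TABLE points_sorted
AS
SELECT s.*
  FROM points_src s
 ORDER BY morton( FLOOR(s.geom.sdo_point.x / 100),   -- grid column
                  FLOOR(s.geom.sdo_point.y / 100) ); -- grid row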

2. Generalise data

2.1. Simplification

Rendering spatial data can be expensive where the data is geometrically detailed (many vertices), especially where the data is being visualised at smaller scales than it was captured at. So, if your “zoom thresholds” allow 1:10,000 data to be used at 1:100,000 then you are going to have problems. Consider pre-generalising the data (see sdo_util.simplify) before deployment; you can add multiple columns to your base table to hold this data. Be careful with polygon data, because generalising polygons that share boundaries will create gaps and overlaps as the data becomes more generalised. Often it is better to export the data to a GIS which can maintain the boundary relationships when generalising (say via topological relationships).
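A hedged sketch of pre-generalising into an extra column (the PARCELS table, the GEOM_100K column, the 50m threshold and the 5mm tolerance are hypothetical and should be tuned to your own data):

ALTER TABLE parcels ADD (geom_100k SDO_GEOMETRY);

-- Simplify the captured geometry for small-scale display
UPDATE parcels p
   SET p.geom_100k = SDO_UTIL.SIMPLIFY(p.geom, 50, 0.005);
COMMIT;

Remember to register the new column in USER_SDO_GEOM_METADATA and spatially index it if it will be queried directly.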

Oracle’s MapViewer has excellent on-the-fly generalisation but here one needs to be careful. Application tier caching (cf Bryan’s comments below) can help here a lot.

2.2. Coordinate Precision

Oracle does not enforce a coordinate precision on the data it stores. It only applies its sdo_tolerances when processing data in its spatial functions.

Much spatial data loading software loads ordinates at whatever precision it finds them described in. So, if the X ordinate of a point is described as 300,000.1234567 then such software will often load exactly that. But if the actual ordinate’s observed value cannot be any better than 1cm, consider loading 300,000.12 rather than the full number (see the sketch after the list below).

Why?

Because:

  • It reduces the storage of the SDO_GEOMETRY objects, potentially contributing to significant savings in storage space due to the way Oracle stores the NUMBER data type.
  • This reduction in storage space can, in turn, improve query performance because fewer blocks need to be read.
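For point data the reduction can be as simple as the following sketch (SURVEY_POINTS is a hypothetical table; line and polygon data would need each value in the SDO_ORDINATE_ARRAY processed element by element):

-- Round stored ordinates to 2 decimal places (1cm)
UPDATE survey_points p
   SET p.geom.sdo_point.x = ROUND(p.geom.sdo_point.x, 2),
       p.geom.sdo_point.y = ROUND(p.geom.sdo_point.y, 2);
COMMIT;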

3. Consider “denormalising” data

There is an old adage in databases: “normalise for edit, denormalise for performance”. When we load spatial data we often get it from suppliers in a fairly flat or normalised form. In concert with spatial sorting, consider denormalising the data via aggregations based on a rendering attribute and some sort of spatial unit.

For example, if you have 1 million points stored as single points in SDO_GEOMETRY.SDO_POINT which you want to render by a single attribute containing 20 values, consider aggregating the data using this attribute AND some sort of spatial BUCKET or BIN. So, consider using SDO_AGGR_UNION coupled with the Spatial Analysis and Mining package functions to GROUP (the data) BY the rendering attribute and a set of spatial extents, as sketched below.
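A minimal sketch of the idea (the POINTS table, its ATTR column and the 1000m bucket size are hypothetical):

CREATE TABLE points_aggr
AS
SELECT p.attr,
       FLOOR(p.geom.sdo_point.x / 1000) AS cell_x,   -- spatial bucket
       FLOOR(p.geom.sdo_point.y / 1000) AS cell_y,
       SDO_AGGR_UNION( SDOAGGRTYPE(p.geom, 0.005) ) AS geom
  FROM points p
 GROUP BY p.attr,
          FLOOR(p.geom.sdo_point.x / 1000),
          FLOOR(p.geom.sdo_point.y / 1000);

Each resulting row holds a multi-point geometry that can be rendered with a single fetch per attribute value and bucket.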

4. Tablespace use

When creating tables via denormalisation, sorting and/or ordinate precision reduction, it may be useful to create the target table in such a way that its blocks are as full as possible and packed next to each other in the tablespace. (Consider tablespace defragmentation beforehand.) Also, if the data is READ ONLY consider setting the PCTFREE to 0 in order to pack the data up into as small a number of blocks as possible.
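A sketch of rebuilding such a read-only table with fully packed blocks (BASEMAP_ROADS and the MAPDATA_RO tablespace are hypothetical names):

CREATE TABLE basemap_roads_packed
PCTFREE 0                    -- read only: fill each block completely
TABLESPACE mapdata_ro
AS
SELECT * FROM basemap_roads;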

Finally, talk to your DBA in order to find out how the Oracle database’s physical and logical storage is organised. Is a SAN being used, or SAME (Stripe And Mirror Everything) arranged disk arrays? Knowing this, you can organise your spatial data and indexes using more effective and efficient methods that will ensure greater scalability. (See 2.2. Coordinate Precision above.)

5. SDO_TOLERANCE, Clean Data

If you are querying data other than via MBR (eg find all land parcels that touch each other) then make sure that your sdo_tolerance values are appropriate. I have seen sites where data captured to 1cm had an sdo_tolerance value set to a millionth of a meter!

A corollary to this is to make sure that all your data passes validation at the chosen sdo_tolerance value before deploying it for visualisation. Run sdo_geom.validate_geometry()/validate_layer()…
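For example (LAND_PARCELS is a hypothetical table; 0.005 matches data captured to 1cm), the following reports only the failing rows:

SELECT t.id,
       SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT(t.geom, 0.005) AS validation
  FROM land_parcels t
 WHERE SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT(t.geom, 0.005) <> 'TRUE';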

6. RTree Spatial Indexing

At 10g and above lots of great work went into the RTree indexing. So, make sure you are using RTrees and not QuadTrees. Also, many GIS applications create sub-optimal RTrees by not using the additional parameters available at 10g and above.

6.1 If your table/column sdo_geometry data contains only points, lines or polygons then let the RTree indexer know (via layer_gtype), as it can implement certain optimizations based on this knowledge (see the example after 6.2).

6.2 With 10g you can set the RTree’s spatial index data block use via sdo_pct_free. Consider setting this parameter to 0 if the table/column sdo_geometry data is read only.
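Both parameters are supplied when creating the index. A sketch (POINTS_SORTED is a hypothetical point-only, read-only table whose GEOM column is already registered in USER_SDO_GEOM_METADATA):

CREATE INDEX points_sorted_geom_sidx
    ON points_sorted (geom)
       INDEXTYPE IS MDSYS.SPATIAL_INDEX
       PARAMETERS ('layer_gtype=POINT sdo_pct_free=0');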

6.3 If a table/column is in high demand (eg it is the most commonly used table in all visualisations) you can consider loading (a part of) the RTree index into memory. With RTree indexing, the sdo_non_leaf_tbl=true parameter will split the RTree index into its leaf (contains the actual rowid references) and non-leaf (the tree built on the leaves) components. Most RTrees are built without this, so only the MDRT_9AA9A$ secondary tables are created. But if sdo_non_leaf_tbl is set to true you will see the creation of an additional MDNT_9AA9A$ secondary table (for the non-leaf part of the RTree index). Now, if appropriate, the non-leaf table can be kept in memory (via the KEEP buffer pool) as follows:

ALTER TABLE MDNT_9AA9A$ STORAGE ( BUFFER_POOL KEEP );
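For completeness, a sketch of creating an index with the non-leaf table split out (same hypothetical POINTS_SORTED table as in 6.1/6.2):

CREATE INDEX points_sorted_geom_sidx
    ON points_sorted (geom)
       INDEXTYPE IS MDSYS.SPATIAL_INDEX
       PARAMETERS ('layer_gtype=POINT sdo_non_leaf_tbl=true');

Note that the actual MDNT_...$ secondary table name differs per index, so check the index’s secondary tables in your schema before running the ALTER TABLE above.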

This is NOT a general panacea for all performance problems. One should investigate other options before embarking on this (cf Tom Kyte’s books such as Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions).

6.4 (NO LONGER RECOMMENDED – ORACLE’S RTREE INDEXES ARE SELF-MANAGING) Don’t forget to check your spatial index data quality regularly. Because many sites use GIS package GUI tools to create tables, load data and index them, there is a real tendency not to check what they have done or to regularly monitor the objects. Check the SDO_RTREE_QUALITY column in USER_SDO_INDEX_METADATA and look for indexes whose SDO_RTREE_QUALITY value is greater than 2; if so, consider rebuilding or recreating the index.

7. The rendering engine.

Whatever rendering engine you use, make sure you fully understand what it can and cannot do. AutoDesk’s MapGuide is an excellent product but I have seen it simply cache table/column data and never dynamically access it. Also, I have been at one site which was running Deegree and MapViewer, and MapViewer was so fast in comparison to Deegree that I was called in to find out why. I discovered that Deegree was using SDO_RELATE(… ANYINTERACT …) for all MBR queries while MapViewer was using SDO_FILTER. Just this difference was causing some queries to perform at less than 10% of the speed of MapViewer!
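To illustrate the difference (PARCELS is a hypothetical table and :win a query window, eg an optimized rectangle), compare:

-- Primary (index-only) filter: a cheap MBR interaction test
SELECT COUNT(*)
  FROM parcels p
 WHERE SDO_FILTER(p.geom, :win) = 'TRUE';

-- Primary filter PLUS the exact ANYINTERACT geometry test: much more work
SELECT COUNT(*)
  FROM parcels p
 WHERE SDO_RELATE(p.geom, :win, 'mask=ANYINTERACT') = 'TRUE';

For simple map-window retrieval the first form is usually all a rendering engine needs.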

8. Don’t draw data that are sub-pixel.

As one zooms out objects become smaller and smaller until they reach a point where the whole object can be drawn within a single pixel. If you have control over your map visualisation application you might want to consider setting the SDO_FILTER / SDO_RELATE parameters’ “min_resolution” flag dynamically so that its value is the same as the number of meters / pixel (eg min_resolution=10). If this is set Oracle Spatial will only include spatial objects in the returned search set if one side of a geometry’s MBR is greater than or equal to this value. Thus any geometries smaller than a pixel will not be returned. Very useful for large scale data being drawn at small scales and for which *no selection* (eg identify) is required. With Oracle MapViewer this behaviour can be set via the generalized_pixels parameter.
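As a sketch (ROADS is a hypothetical table, :win the current map window, and 10 the metres-per-pixel of the current display):

SELECT r.id, r.geom
  FROM roads r
 WHERE SDO_FILTER(r.geom, :win, 'min_resolution=10') = 'TRUE';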

9. Network fetch

If your rendering engine (app server) and database are on separate machines, you need to investigate what sort of fetch sizes are being used when returning data from queries to the middle tier. Fetch sizes for attribute-only data rows and rows containing spatial data can be, and normally are, radically different. Accepting the default settings for these sizes could be killing you (as could the sort_area_size of the Oracle session the application server has created on the database). For example, I have been informed that MapInfo Pro uses a fixed value of 25 records per fetch when communicating with Oracle. I have done some testing to show that this value can be too small for certain types of spatial data. SQL Developer’s GeoRaptor uses 100, which is generally better (but one can modify this). Most programmers accept defaults for network properties when programming in ADO/ODBC/OLEDB/JDBC: just be careful as to what is being set here. (This is one of the great strengths of ArcSDE: its TCP/IP network transport is well written, tuneable and very efficient.)

10. Physical Format

Finally, while Oracle’s excellent MapViewer requires its spatial data to be in Oracle, other commercial rendering engines do not. So, consider using alternate physical file formats that are more optimal for your rendering engine. For example, Google Earth Enterprise “compiles” all the source data into an optimal format which the server then serves to Google Earth Enterprise clients. Similarly, a shapefile on disk local to the application server (with spatial indexing) may be faster than storing the data back in Oracle on a database server that is being shared with other business databases (eg Oracle Financials). If you don’t like this approach and want to use Oracle only, consider using a dedicated Oracle XE instance on the application server for data that is read only and used in most of your generated maps (eg contour or drainage data).

Just some things to think about.

regards
Simon