Effects of Sdo_Geometry Ordinate Precision on Performance

In a previous article I showed how too many digits of precision for the ordinates of an sdo_geometry object – stored in its sdo_ordinate_array – has an effect on storage volume.

In this article I will revisit the question of storage size while examining the effect that storage size has on performance (search and retrieval).

Approach – MultiPoint Sdo_Geometry Objects

The simplest way to create an experimental framework where one wants to examine sdo_ordinate_array ordinate precision on storage and performance effects, is to base one’s experiments on the generation and use of MultiPoint geometry objects as it is easy to generate valid sdo_geometry object of a specific size.

The generation of the multipoint objects is in two steps:

  • The cells of a standard sized rectangular grid is created in memory
  • For each cell in the grid a single multipoint geometry is created containing a variable number of randomly generated ordinates.

This looks like this in SQL:

 CREATE TABLE &MPTableName. (
   id   NUMBER,
   geom sdo_geometry
 ) nologging;
 -- Now insert points
 SET timing ON
 INSERT INTO &MPTableName.
 WITH grids AS (
 SELECT rownum AS id,
        200000 + ( 100 * c.column_value ) AS minx,
        100 AS rangeX,
        600000 + ( 100 * r.column_value ) AS miny,
        100 AS rangeY
   FROM TABLE(codesys.geom.generate_series(0,99,1)) c,
        TABLE(codesys.geom.generate_series(0,99,1)) r
 )
 SELECT rownum AS id,
        MultiPointFromRandom(&pointCount.,g.minx,g.miny,g.rangex,g.rangey,NULL) AS geom
   FROM grids g;
 SET timing off
 COMMIT;
 ALTER TABLE &MPTableName. LOGGING;

Note: The grids generated by the “grids” CTE are NOT stored in any way.

The function, MultiPointFromRandom(), is as follows:

 CREATE OR REPLACE FUNCTION MultiPointFromRandom(p_pointCount IN pls_integer,
                                                 p_minx       IN NUMBER,
                                                 p_miny       IN NUMBER,
                                                 p_rangex     IN NUMBER,
                                                 p_rangey     IN NUMBER,
                   p_srid       IN NUMBER DEFAULT NULL)
 RETURN MDSYS.SDO_GEOMETRY DETERMINISTIC
 AS
   v_ordinates mdsys.sdo_ordinate_array;
 BEGIN
   IF ( p_pointCount IS NULL ) THEN
      RETURN NULL;
   END IF;
   v_ordinates := NEW mdsys.sdo_ordinate_array();
   v_ordinates.EXTEND(ABS(p_pointCount) * 2);
   FOR i IN 1..p_pointCount LOOP
       v_ordinates((i*2)-1) := round(dbms_random.VALUE(p_minx, p_minx + p_rangex),12);
       v_ordinates((i*2)  ) := round(dbms_random.VALUE(p_miny, p_miny + p_rangey),12);
   END LOOP;
   RETURN mdsys.sdo_geometry(2005,p_srid,NULL,sdo_elem_info_array(1,1,p_pointCount),v_ordinates);
 END MultiPointFromRandom;
 /
 SHOW errors

After creation of the multipoints, the following occurs:

  • Space usage of table and LOBs is computed
  • SDO_GEOM Metadata and an RTree Index is created
  • Performance statistics are generated using a function called RandomSearchByExtent

Next the tests are repeated but against an ordinate rounded version of the MultiPoint geometry data just created via:

 CREATE TABLE &MPTableName.R (
     ID     NUMBER(10,0),
     GEOM   MDSYS.SDO_GEOMETRY
 ) NOLOGGING;
 -- Create points by rounding the ordinates of the first table
 --
 INSERT /*+append*/ INTO &MPTableName.R (ID,GEOM)
 SELECT ID,CODESYS.ROUNDORDINATES(p.GEOM,2)
   FROM &MPTableName. p;
 COMMIT;
 ALTER TABLE &MPTableName.R LOGGING;

All these steps have been put into a single SQL file called MultiPoint.sql.

Running the Tests

A set of tests are created which test multipoint geometries with a variety of sizes. Each test calls the single SQL file described above.

 @multipoint MP_250_O 250
 @multipoint MP_500_O 500
 @multipoint MP_1000_O 1000
 @multipoint MP_2500_O 2500
 @multipoint MP_5000_O 5000
 @multipoint MP_7500_O 7500
 @multipoint MP_10000_O 10000
 @multipoint MP_50000_O 50000
 @multipoint MP_100000_O 100000

After running the tests, the output CSV data was loaded into a spreadsheet and processed to create two graphs.

Graph One: Storage before and After Rounding

As can be seen, there is the potential for large savings in storage.

Graph Two: Percentage Performance Improvement After Rounding

Except for a few “outlying” results (250, 2500 and 5000), the results are encouraging. The effect on performance of ordinate rounding is muted for sdo_geometry objects with small vertex counts, but there appears to be a solid effect for larger objects that indicates that ordinate rounding could be considered “best practice” in spatial data management.

I hope this is of use to someone.