Effects of Sdo_Geometry Ordinate Precision on Storage

In this article I am going look at the effect ordinate precision has on the storage space occupied by tables containing sdo_geometry data. In a related article I will also look at the effect storage precision may have on performance (ie searching and returning data in SQL).


Sdo_Geometry Coordinate Precision

There is a little known side effect of using too many digits of precision when describing and storing spatial data using Oracle’s SDO_GEOMETRY object type. That side effect is that the storage requirements can be significantly greater than expected.

Unlike other spatial storage formats (normally some sort of compressed binary) Oracle storage is open. Very open. Oracle stores the ordinates describing the spatial data (SDO_GEOMETRY) in an array of NUMBER called SDO_ORDINATE_ARRAY.

Note: This is an array of NUMBER and not an array of binary double. Oracle stores a NUMBER in a variable storage format so small numbers take less space than larger NUMBERs. A BINARY_DOUBLE’s storage size is fixed regardless as to the precision of the data being held.

The ramification of this is that your storage costs are in relation to the number of digits used.

Simple Example

The Google Mercator Map coordinates that are created by projection from the original longitude/latitude data are unnecessarily large as can be shown from the following example:

 SELECT sdo_cs.transform(sdo_geometry(2002,8311,NULL,
                                       sdo_elem_info_array(1,2,1),
                                       sdo_ordinate_array(147.123,-32.456,147.672,-33.739)),3785) AS geom
   FROM dual;
 -- Results
 --
 GEOM
 -------------------------------------------------------------------------------------------------------------------
 MDSYS.SDO_GEOMETRY(2002,3785,NULL,
                    MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),
                    MDSYS.SDO_ORDINATE_ARRAY(16377657.4439788,-3800381.82007675,16438771.8444243,-3970070.49100647))

Being that the input data was only specified to 0.001 of a degree, an output – in meters – specified to 8 decimal places seems somewhat excessive.

But if I rounded this data to 1 decimal place what would be the potential saving in storage?

Oracle’s documentation shows how to calculate the size of a NUMBER as stored in the database. As such, an analysis was carried out on a table of Australian land parcels to see what the effect of rounding the ordinates to 1cm and 1mm would have on data size.

Let’s apply a bit of SQL to compute NUMBER size for the above Google Mercator geometry before and after a potential rounding of its ordinates:

 SELECT SUMX,SUMY,
         SUMXR2,SUMYR2,100 - ROUND(((SUMXR2+SUMYR2)/(SUMX+SUMY) * 100),1) AS PCT_Save_cm,
         SUMXR3,SUMYR3,100 - ROUND(((SUMXR3+SUMYR3)/(SUMX+SUMY) * 100),1) AS PCT_Save_mm
   FROM (SELECT SUM(vsize(t.x)) AS SUMX, SUM(vsize(ROUND(T.X,2))) AS SUMXR2, SUM(vsize(ROUND(T.X,3))) AS SUMXR3,
                SUM(vsize(t.y)) AS SUMY, SUM(vsize(ROUND(T.Y,2))) AS SUMYR2, SUM(vsize(ROUND(T.Y,3))) AS SUMYR3
           FROM (SELECT sdo_cs.transform(
                           sdo_geometry(2002,8311,NULL,
                                        sdo_elem_info_array(1,2,1),
                                        sdo_ordinate_array(147.123,-32.456,147.672,-33.739)),3785) AS geom
                   FROM dual) S,
                TABLE(SDO_UTIL.GETVERTICES(S.GEOM)) T
         ) f;
 -- Results
 --
 SUMX SUMY SUMXR2 SUMYR2 PCT_SAVE_CM SUMXR3 SUMYR3 PCT_SAVE_MM
 ---- ---- ------ ------ ----------- ------ ------ -----------
 18   20   12     14     31.6        14     15     23.7

So, trimming the X and Y ordinates to 1cm (2 digits of precision) should produce a 31.6 percent reduction in storage, while trimming to 1mm should produce a 23.7 percent reduction in storage.

That’s a single object, what about geometries stored in tables?

Trimming or Rounding

To execute the trimming of an actual sdo_geometry object requires a relatively simply PL/SQL function. I created just such a function a long time ago which is available in my GEOM PL/SQL package available for free on this website. A standalone version of this function is presented in an associated article on this website called Rounding Coordinates or Ordinates in SDO_GEOMETRY

Baseline

Note that the original parcel data is also over specified:

 SELECT geom
   FROM parcel a
  WHERE sdo_util.getNumVertices(a.geom) < 6
    AND rownum < 2;
 -- Results
 --
 GEOM
 ----------------------------------------------------------------------------------------------------------------
 MDSYS.SDO_GEOMETRY(2003,82469,NULL,
                    MDSYS.SDO_ELEM_INFO_ARRAY(1,1003,1),
                    MDSYS.SDO_ORDINATE_ARRAY(447291.935307683,6470098.06758179,447294.953958522,6470116.87270875,                                            447258.829362703,6470122.5382358,447255.811490383,6470103.73299092,
                                             447291.935307683,6470098.06758179))

So, before using the RoundOrdinates function, we need to get a baseline measure of storage.

 WITH object_storage AS (
 SELECT segment_name AS TABLE_NAME, 'ALL' AS column_name, '' AS segment_name, '' AS segment_subtype, s.bytes, s.blocks, s.extents
   FROM dba_segments s
  WHERE s.owner = 'LAND'
    AND s.segment_name IN ('PARCEL')
 UNION ALL
 SELECT l.TABLE_NAME,l.column_name, s.segment_name, '' AS segment_subtype, s.bytes, s.blocks, s.extents
   FROM user_lobs l,
        dba_segments s
   WHERE l.TABLE_NAME IN ('PARCEL')
     AND s.segment_name = l.segment_name
 )
 SELECT TABLE_NAME,column_name,segment_name,
        to_char(SUM(bytes),'FM9,999,999,999,999') AS bytes,SUM(blocks) AS blocks,SUM(extents) AS extents
   FROM object_storage
  GROUP BY TABLE_NAME,column_name,segment_name;
 .
 TABLE_NAME  COLUMN_NAME            SEGMENT_NAME              BYTES      BLOCKS EXTENTS
 ------ ---------------------- ------------------------- ---------- ------ -------
 PARCEL ALL                                              29,360,128 3584   43
 PARCEL "GEOM"."SDO_ORDINATES" SYS_LOB0000077718C00021$$ 6,291,456  768    21
 PARCEL "GEOM"."SDO_ELEM_INFO" SYS_LOB0000077718C00020$$ 65,536     8      1

Now, let’s make the change to the ordinates.

Changed Ordinates

 DROP   TABLE LM_PARCEL_P;
 .
 CREATE TABLE LM_PARCEL_P (FID,AREA,GID,LOT_NUMBER,SECTION,DP_NUMBER,ID_PARCEL_TYPE,ID_PARCEL_LEGAL_STATUS,ASSET,ADMIN_CODE,ID_ORIGIN,ID_ACCURACY,NARRATIVE,GEOM)
 NOLOGGING
 AS
 SELECT FID,AREA,GID,LOT_NUMBER,SECTION,DP_NUMBER,ID_PARCEL_TYPE,ID_PARCEL_LEGAL_STATUS,ASSET,ADMIN_CODE,ID_ORIGIN,ID_ACCURACY,NARRATIVE,CODESYS.ROUNDORDINATES(p.GEOM,0.0005) AS GEOMETRY
   FROM PARCEL p;
 commit;
 .
 ALTER TABLE LM_PARCEL_P LOGGING;
 .
 -- Calculate space usage of table and LOBs
 WITH object_storage AS (
 SELECT segment_name AS TABLE_NAME, 'ALL' AS column_name, '' AS segment_name, '' AS segment_subtype, s.bytes, s.blocks, s.extents
   FROM dba_segments s
  WHERE s.owner = 'LAND'
    AND s.segment_name IN ('PARCEL_P')
 UNION ALL
 SELECT l.TABLE_NAME,l.column_name, s.segment_name, '' AS segment_subtype, s.bytes, s.blocks, s.extents
   FROM user_lobs l,
        dba_segments s
   WHERE l.TABLE_NAME IN ('PARCEL_P')
     AND s.segment_name = l.segment_name
 )
 SELECT TABLE_NAME,column_name,segment_name,
        to_char(SUM(bytes),'FM9,999,999,999,999') AS bytes,SUM(blocks) AS blocks,SUM(extents) AS extents
   FROM object_storage
  GROUP BY TABLE_NAME,column_name,segment_name;
 .
 TABLE_NAME  COLUMN_NAME            SEGMENT_NAME              BYTES       BLOCKS EXTENTS
 -------- ---------------------- ------------------------- ----------- ------ -------
 PARCEL_P ALL                                              25,165,824  3072   39
 PARCEL_P "GEOM"."SDO_ORDINATES" SYS_LOB0000077747C00021$$ 3,145,728   384    18
 PARCEL_P "GEOM"."SDO_ELEM_INFO" SYS_LOB0000077747C00020$$ 65,536      8      1

Results

The difference in sdo_ordinate storage is:

3,145,728 / 6,291,456 * 100 = 50%.

And of the table’s storage:

25,165,824 / 29,360,128 = 85%

Not bad.

So, the benefits of a decrease in disk storage and traffic should be obvious to all. But can we see any statistical benefits in this change? (These calculations should be ratified before making any decisions with respect to your data.) I will include some numbers in the next few days.

I hope this is useful to someone.

If you found this article informative, try the followup article on the effect rounding has on precision.

Effects of Sdo_Geometry Ordinate Precision on Performance