Top 5 Recent Articles
- Biography (1)
- Blog (41)
- Changes (1)
- Customers (1)
- Data Models (1)
- Education (2)
- General Software (21)
- Georaptor Blog (5)
- Image Catalog (2)
- Licensing (1)
- ManifoldGIS (3)
- MySQL Blog (4)
- MySQL Spatial (3)
- Networking and Routing (including Optimization) (3)
- Oracle Spatial (171)
- Philosophy (1)
- PostGIS (30)
- Press Releases (1)
- Source code (24)
- Space Curves (1)
- Spatial DB comparison (1)
- SQL (1)
- SQL Server Blog (58)
- SQL Server Spatial (General) (15)
- SQL Server Spatial (LRS) (37)
- Stored Procedure (2)
- Training (1)
- XML (5)
Effects of Sdo_Geometry Ordinate Precision on Performance
In a previous article I showed how too many digits of precision for the ordinates of an sdo_geometry object – stored in its sdo_ordinate_array – has an effect on storage volume.
In this article I will revisit the question of storage size while examining the effect that storage size has on performance (search and retrieval).
Approach – MultiPoint Sdo_Geometry Objects
The simplest way to create an experimental framework where one wants to examine sdo_ordinate_array ordinate precision on storage and performance effects, is to base one’s experiments on the generation and use of MultiPoint geometry objects as it is easy to generate valid sdo_geometry object of a specific size.
The generation of the multipoint objects is in two steps:
- The cells of a standard sized rectangular grid is created in memory
- For each cell in the grid a single multipoint geometry is created containing a variable number of randomly generated ordinates.
This looks like this in SQL:
CREATE TABLE &MPTableName. ( id NUMBER, geom sdo_geometry ) nologging; -- Now insert points SET timing ON INSERT INTO &MPTableName. WITH grids AS ( SELECT rownum AS id, 200000 + ( 100 * c.column_value ) AS minx, 100 AS rangeX, 600000 + ( 100 * r.column_value ) AS miny, 100 AS rangeY FROM TABLE(codesys.geom.generate_series(0,99,1)) c, TABLE(codesys.geom.generate_series(0,99,1)) r ) SELECT rownum AS id, MultiPointFromRandom(&pointCount.,g.minx,g.miny,g.rangex,g.rangey,NULL) AS geom FROM grids g; SET timing off COMMIT; ALTER TABLE &MPTableName. LOGGING;
Note: The grids generated by the “grids” CTE are NOT stored in any way.
The function, MultiPointFromRandom(), is as follows:
CREATE OR REPLACE FUNCTION MultiPointFromRandom(p_pointCount IN pls_integer, p_minx IN NUMBER, p_miny IN NUMBER, p_rangex IN NUMBER, p_rangey IN NUMBER, p_srid IN NUMBER DEFAULT NULL) RETURN MDSYS.SDO_GEOMETRY DETERMINISTIC AS v_ordinates mdsys.sdo_ordinate_array; BEGIN IF ( p_pointCount IS NULL ) THEN RETURN NULL; END IF; v_ordinates := NEW mdsys.sdo_ordinate_array(); v_ordinates.EXTEND(ABS(p_pointCount) * 2); FOR i IN 1..p_pointCount LOOP v_ordinates((i*2)-1) := round(dbms_random.VALUE(p_minx, p_minx + p_rangex),12); v_ordinates((i*2) ) := round(dbms_random.VALUE(p_miny, p_miny + p_rangey),12); END LOOP; RETURN mdsys.sdo_geometry(2005,p_srid,NULL,sdo_elem_info_array(1,1,p_pointCount),v_ordinates); END MultiPointFromRandom; / SHOW errors
After creation of the multipoints, the following occurs:
- Space usage of table and LOBs is computed
- SDO_GEOM Metadata and an RTree Index is created
- Performance statistics are generated using a function called RandomSearchByExtent
Next the tests are repeated but against an ordinate rounded version of the MultiPoint geometry data just created via:
CREATE TABLE &MPTableName.R ( ID NUMBER(10,0), GEOM MDSYS.SDO_GEOMETRY ) NOLOGGING; -- Create points by rounding the ordinates of the first table -- INSERT /*+append*/ INTO &MPTableName.R (ID,GEOM) SELECT ID,CODESYS.ROUNDORDINATES(p.GEOM,2) FROM &MPTableName. p; COMMIT; ALTER TABLE &MPTableName.R LOGGING;
All these steps have been put into a single SQL file called MultiPoint.sql.
Running the Tests
A set of tests are created which test multipoint geometries with a variety of sizes. Each test calls the single SQL file described above.
@multipoint MP_250_O 250 @multipoint MP_500_O 500 @multipoint MP_1000_O 1000 @multipoint MP_2500_O 2500 @multipoint MP_5000_O 5000 @multipoint MP_7500_O 7500 @multipoint MP_10000_O 10000 @multipoint MP_50000_O 50000 @multipoint MP_100000_O 100000
After running the tests, the output CSV data was loaded into a spreadsheet and processed to create two graphs.
Graph One: Storage before and After Rounding
As can be seen, there is the potential for large savings in storage.
Graph Two: Percentage Performance Improvement After Rounding
Except for a few “outlying” results (250, 2500 and 5000), the results are encouraging. The effect on performance of ordinate rounding is muted for sdo_geometry objects with small vertex counts, but there appears to be a solid effect for larger objects that indicates that ordinate rounding could be considered “best practice” in spatial data management.
I hope this is of use to someone.