Top 5 Recent Articles
- Algorithms (13)
- All (407)
- Biography (1)
- Blog (44)
- Business Requirements (1)
- Commentary (1)
- Customers (2)
- Data Models (1)
- Education (2)
- GeoRaptor (5)
- Image Processing (2)
- Import Export (5)
- Licensing (2)
- Linear Referencing (3)
- Manifold GIS (3)
- Mapping (1)
- MySQL Spatial (7)
- Networking and Routing (including Optimization) (3)
- Open Source (16)
- Oracle Spatial and Locator (178)
- PostGIS (33)
- Published Articles (1)
- Recommendations (1)
- Services (1)
- Software Change Log (1)
- Source Code (35)
- Space Curves (9)
- Spatial Database Functions (101)
- Spatial DB comparison (1)
- Spatial XML Processing (10)
- SQL Server Spatial (General) (83)
- SQL Server Spatial (LRS) (38)
- Standards (1)
- Stored Procedure (15)
- Tessellation or Gridding (9)
- Tools (2)
- Training (2)
Effects of Sdo_Geometry Ordinate Precision on Performance
In a previous article I showed how too many digits of precision for the ordinates of an sdo_geometry object – stored in its sdo_ordinate_array – has an effect on storage volume.
In this article I will revisit the question of storage size while examining the effect that storage size has on performance (search and retrieval).
Approach – MultiPoint Sdo_Geometry Objects
The simplest way to create an experimental framework where one wants to examine sdo_ordinate_array ordinate precision on storage and performance effects, is to base one’s experiments on the generation and use of MultiPoint geometry objects as it is easy to generate valid sdo_geometry object of a specific size.
The generation of the multipoint objects is in two steps:
- The cells of a standard sized rectangular grid is created in memory
- For each cell in the grid a single multipoint geometry is created containing a variable number of randomly generated ordinates.
This looks like this in SQL:
CREATE TABLE &MPTableName. ( id NUMBER, geom sdo_geometry ) nologging; -- Now insert points SET timing ON INSERT INTO &MPTableName. WITH grids AS ( SELECT rownum AS id, 200000 + ( 100 * c.column_value ) AS minx, 100 AS rangeX, 600000 + ( 100 * r.column_value ) AS miny, 100 AS rangeY FROM TABLE(codesys.geom.generate_series(0,99,1)) c, TABLE(codesys.geom.generate_series(0,99,1)) r ) SELECT rownum AS id, MultiPointFromRandom(&pointCount.,g.minx,g.miny,g.rangex,g.rangey,NULL) AS geom FROM grids g; SET timing off COMMIT; ALTER TABLE &MPTableName. LOGGING;
Note: The grids generated by the “grids” CTE are NOT stored in any way.
The function, MultiPointFromRandom(), is as follows:
CREATE OR REPLACE FUNCTION MultiPointFromRandom(p_pointCount IN pls_integer, p_minx IN NUMBER, p_miny IN NUMBER, p_rangex IN NUMBER, p_rangey IN NUMBER, p_srid IN NUMBER DEFAULT NULL) RETURN MDSYS.SDO_GEOMETRY DETERMINISTIC AS v_ordinates mdsys.sdo_ordinate_array; BEGIN IF ( p_pointCount IS NULL ) THEN RETURN NULL; END IF; v_ordinates := NEW mdsys.sdo_ordinate_array(); v_ordinates.EXTEND(ABS(p_pointCount) * 2); FOR i IN 1..p_pointCount LOOP v_ordinates((i*2)-1) := round(dbms_random.VALUE(p_minx, p_minx + p_rangex),12); v_ordinates((i*2) ) := round(dbms_random.VALUE(p_miny, p_miny + p_rangey),12); END LOOP; RETURN mdsys.sdo_geometry(2005,p_srid,NULL,sdo_elem_info_array(1,1,p_pointCount),v_ordinates); END MultiPointFromRandom; / SHOW errors
After creation of the multipoints, the following occurs:
- Space usage of table and LOBs is computed
- SDO_GEOM Metadata and an RTree Index is created
- Performance statistics are generated using a function called RandomSearchByExtent
Next the tests are repeated but against an ordinate rounded version of the MultiPoint geometry data just created via:
CREATE TABLE &MPTableName.R ( ID NUMBER(10,0), GEOM MDSYS.SDO_GEOMETRY ) NOLOGGING; -- Create points by rounding the ordinates of the first table -- INSERT /*+append*/ INTO &MPTableName.R (ID,GEOM) SELECT ID,CODESYS.ROUNDORDINATES(p.GEOM,2) FROM &MPTableName. p; COMMIT; ALTER TABLE &MPTableName.R LOGGING;
All these steps have been put into a single SQL file called MultiPoint.sql.
Running the Tests
A set of tests are created which test multipoint geometries with a variety of sizes. Each test calls the single SQL file described above.
@multipoint MP_250_O 250 @multipoint MP_500_O 500 @multipoint MP_1000_O 1000 @multipoint MP_2500_O 2500 @multipoint MP_5000_O 5000 @multipoint MP_7500_O 7500 @multipoint MP_10000_O 10000 @multipoint MP_50000_O 50000 @multipoint MP_100000_O 100000
After running the tests, the output CSV data was loaded into a spreadsheet and processed to create two graphs.
Graph One: Storage before and After Rounding
As can be seen, there is the potential for large savings in storage.
Graph Two: Percentage Performance Improvement After Rounding
Except for a few “outlying” results (250, 2500 and 5000), the results are encouraging. The effect on performance of ordinate rounding is muted for sdo_geometry objects with small vertex counts, but there appears to be a solid effect for larger objects that indicates that ordinate rounding could be considered “best practice” in spatial data management.
I hope this is of use to someone.