Top 5 Recent Articles
ARTICLES CATEGORIES
- Algorithms (22)
- All (399)
- Biography (1)
- Blog (44)
- Business Requirements (1)
- Commentary (1)
- Conversion (2)
- Customers (2)
- Data Models (1)
- Education (2)
- GeoRaptor (13)
- GPS (1)
- Image Processing (2)
- Import Export (8)
- Licensing (2)
- LiDAR (1)
- Linear Referencing (4)
- Manifold GIS (3)
- Mapping (1)
- MySQL Spatial (7)
- Networking and Routing (including Optimization) (5)
- Open Source (18)
- Oracle Spatial and Locator (194)
- Partitioning (1)
- PostGIS (36)
- Projections (1)
- Published Articles (1)
- qGIS (1)
- Recommendations (1)
- Services (1)
- Software Change Log (1)
- Source Code (37)
- Space Curves (9)
- Spatial Database Functions (109)
- Spatial DB comparison (1)
- Spatial XML Processing (11)
- SQL Server Spatial (92)
- Standards (3)
- Stored Procedure (17)
- Tessellation or Gridding (10)
- Tools (2)
- Topological Relationships (1)
- Training (2)
- Triangulation (2)
Spatial Database Independence
I often hear asserted in discussions about accessing data in databases that the approach taken should be “database independent”. The assertion is always made as if it were some sort of Scientific Truth (Dogma even). And it is always assumed that this is easy to achieve. Finally, it always seems to be an accepted reality as if it was backed by some sort of consensus in the Computer Science Literature.
(I have almost always heard this when discussing non-Microsoft databases and technologies but rarely when the Microsoft technologies are the principle components of a solution. But, perhaps that is just a selective reading based on my own career!)
This article will provide a simple comparison based on generating “vectors” from the rings of a multi-polygon using Oracle’s ST_Geometry, SQL Server 2008’s STGeometry and PostgreSQL/PostGIS’s ST_Geometry.
Initial Comments on Hierarchy
In my comparison work I use the OGC SFS 1.1 standard with preference for the SQL/MM standard. Since all three databases are “Object” relational databases (ORDBMS) we need to start with a comment about this aspect of the implementation.
Oracle’s ST_Geometry is a true type hierarchy with only those functions defined on specific elements available for geometries of that type. So, for example, you cannot call ST_Area on an ST_CURVE (ST_LINESTRING, ST_MULTILINESTRING etc), whereas in SQL Server 2008 and PostGIS you can.
Oracle
SELECT a.geom.ST_Area() AS area FROM (SELECT TREAT(ST_Geometry.FROM_WKT('LINESTRING(10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0)',NULL) AS st_linestring) AS geom FROM DUAL ) a; -- Result ORA-00904: "A"."GEOM"."ST_AREA": invalid identifier 00904. 00000 - "%s: invalid identifier"
The availability of a true object hierarchy was not always the case. Up to 10g Oracle only had a singly-inherited type library. The original SDO_GEOMETRY approach is still singly inherited: only the SQLMM/OGC implementation from 10g onwards is not.
In the above SQL you will notice a lack of an ST_LineFromText() constructor that returns an ST_LineString object: this is not available in the ST_CURVE or ST_LINESTRING sub-class objects. Given the use of ST_Geometry one would have thought an ST_GeomFromText() method would be available rather than the non-standard FROM_WKT: nope, sorry, Oracle has not implemented one. The only ST*FromText() method available are:
SELECT object_name || '.' || procedure_name AS conversion_method FROM all_procedures WHERE owner = 'MDSYS' AND object_name LIKE 'ST%' AND procedure_name LIKE 'ST%TEXT'; -- Result CONVERSION_METHOD ------------------------------------- ST_ANNOTATION_TEXT.ST_ANNOTATION_TEXT ST_MULTIPOLYGON.ST_BDMPOLYFROMTEXT ST_MULTIPOLYGON.ST_BDMPOLYFROMTEXT ST_POLYGON.ST_BDPOLYFROMTEXT ST_POLYGON.ST_BDPOLYFROMTEXT
Finally, with Oracle, some constructors or accessors return ST_Geometry objects even if they are ST_LineStrings etc. To access the methods of the underlying sub-class object, we need to “cast” them to the required type via the Oracle TREAT function. eg SELECT TREAT (geom as ST_LineString) as line FROM …..
PostGIS
Note that in PostGIS you cannot use the . notation to access the methods of an object.
SELECT ST_Area(a.geom) AS area FROM (SELECT ST_LineFromText('LINESTRING(10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0)',0) AS geom ) a; -- Result area ---- 0
Because PostGIS is “singly inherited”, it does not use a “dot” object.method invocation from a base class. Thus ST_LineFromText() function is an independent function that returns a geometry (ST_Geometry): No casting is required. SQL Server 2008
SELECT a.geom.STArea() AS area FROM (SELECT geometry::STLineFromText('LINESTRING(10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0)',0) AS geom) a; -- Result area ---- 0
SQL Server 2008 is a “singly inherited” type system that uses a “dot” object.method invocation from a base class in object creation or method execution. Thus ST_STLineFromText() function requires the use of “geometry::” that acts as a surrogate for a missing ST_Geometry abstract class.
Comparisons
The following three SQL statements are the result of trying to minimise the differences in SQL between the three databases. Where possible, I avoided use of anything that is specific to a database (eg rownum in Oracle). In addition, I use my SQL Server and Oracle versions of PostgreSQL’s generate_series() set returning function.
Oracle
-- MultiPoly with two elements (polygons) one with an Inner Ring WITH poly AS ( SELECT ROW_NUMBER() OVER (PARTITION BY b.n ORDER BY b.n) AS polyNo, a.geom FROM (SELECT TREAT(VALUE(t) AS st_polygon) AS geom FROM (SELECT ST_MultiPolygon.ST_BdMPolyFromText( 'MULTIPOLYGON(((10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0), (12.0 12.0, 12.0 14.0, 14.0 14.0, 14.0 12.0, 12.0 12.0)), ((100.0 100.0, 150.0 100.0, 150.0 150.0, 100.0 150.0, 100.0 100.0)))',0) AS geom FROM dual) o, TABLE(o.geom.ST_GEOMETRIES()) t ) a, (SELECT 1 AS n FROM dual) b WHERE a.geom.ST_GeometryType() = 'ST_POLYGON' ) SELECT a.polyNo, a.ringNo, ROW_NUMBER() OVER (PARTITION BY a.PolyNo, a.ringNo, b.n ORDER BY b.n) AS vectorNo, a.sp.ST_X() AS sx, a.sp.ST_Y() AS sy, a.ep.ST_X() AS ex, a.ep.ST_Y() AS ey FROM ( SELECT g.polyNo, g.ringNo, g.geom.ST_PointN(p.column_value ) AS sp, g.geom.ST_PointN(p.column_value+1) AS ep FROM (SELECT p.polyNo, 1 AS ringNo, p.geom.ST_ExteriorRing() AS geom FROM poly p UNION ALL SELECT p.polyNo, p.column_value+1 AS ringNo, p.geom.ST_InteriorRingN(p.column_value) AS geom FROM poly p, TABLE(codesys.geom.generate_series(1,p.geom.ST_NumInteriorRing(),1)) p WHERE p.column_value < p.geom.ST_NumInteriorRing() /* Reports 2 but actually only 1: the first */ ) g, TABLE(codesys.geom.generate_series(1,g.geom.ST_NumPoints(),1)) p WHERE p.column_value < g.geom.ST_NumPoints() ) a, (SELECT 1 AS n FROM dual) b ORDER BY a.polyNo, a.ringNo, 3; -- Result POLYNO RINGNO VECTORNO SX SY EX EY ------ ------ -------- ---- --- --- --- 1 1 1 10 10 15 10 1 1 2 15 10 15 15 1 1 3 15 15 10 15 1 1 4 10 15 10 10 1 2 1 12 12 14 12 1 2 2 12 14 12 12 1 2 3 14 14 12 14 1 2 4 14 12 14 14 2 1 1 100 100 150 100 2 1 2 150 100 150 150 2 1 3 150 150 100 150 2 1 4 100 150 100 100 -- Result 12 ROWS selected
Notes:
- Need for “(select 1 as n from dual) b” in order to generate a unique row number as Row_Number() cannot be used outside of a partition or order by (this is the same problem across all three databases).
- generate_series return referenced by database specific “column_value” column name;
- The use of TREAT to ensure the polygons extracted from the MultiPolygon are correctly cast to ST_Polygons so its methods eg ST_ExteriorRing() can be used.
- ST_NumInteriorRing() seems to also report the exterior ring in its count yet it cannot be used (this is not the same as PostGIS and SQL Server 2008).
- “FROM dual” cannot be avoided;
- ST_GeometryType returns “ST_POLYGON” (uppercase);
- TABLE functions cannot be avoided but they can be used with values from other tables in a query;
- TABLE function must be used to disaggregate the individual polygons returned by the non-standard ST_GEOMETRIES() method – Oracle does not implement ST_GeometryN();
- Constructor function name ST_BdMPolyFromText different from other databases.
However, an additional comment is warranted with regards to ROW_NUMBER. In Oracle – but not the other 2 databases – it is possible to generate a row number as follows:
SELECT ROW_NUMBER() OVER (partition BY 1 ORDER BY 1) AS rowno, b.column_value FROM TABLE(codesys.geom.generate_series(1,5,1)) b; -- Result ROWNO COLUMN_VALUE ----- ------------ 1 1 2 2 3 3 4 4 5 5
To ensure commonality across databases in generating row numbers, I had to use the constructs as shown in the different SQL.
SQL Server 2008
-- MultiPoly with two elements (polygons) one with an Inner Ring WITH poly AS ( SELECT ROW_NUMBER() OVER (PARTITION BY B.N ORDER BY B.N) AS polyNo, a.geom FROM (SELECT o.geom.STGeometryN(t.IntValue) AS geom FROM (SELECT geometry::STMPolyFromText( 'MULTIPOLYGON(((10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0), (12.0 12.0, 12.0 14.0, 14.0 14.0, 14.0 12.0, 12.0 12.0)), ((100.0 100.0, 150.0 100.0, 150.0 150.0, 100.0 150.0, 100.0 100.0)))',0) AS geom ) o CROSS APPLY generate_series(1,o.geom.STNumGeometries(),1) t ) a, (SELECT 1 AS n) b WHERE UPPER(a.geom.STGeometryType()) = 'POLYGON' ) SELECT a.polyNo, a.ringNo, ROW_NUMBER() OVER (PARTITION BY a.PolyNo, a.ringNo, b.n ORDER BY b.n) AS vectorNo, a.sp.STX AS sx, a.sp.STY AS sy, a.ep.STX AS ex, a.ep.STY AS ey FROM ( SELECT g.polyNo, g.ringNo, g.geom.STPointN(p.IntValue ) AS sp, g.geom.STPointN(p.IntValue+1) AS ep FROM (SELECT polyNo, 1 AS ringNo, p.geom.STExteriorRing() AS geom FROM poly p UNION ALL SELECT polyNo, t.IntValue+1 AS ringNo, p.geom.STInteriorRingN(t.IntValue) AS geom FROM poly p CROSS APPLY generate_series(1,p.geom.STNumInteriorRing(),1) t ) g CROSS APPLY generate_series(1,g.geom.STNumPoints(),1) p WHERE p.IntValue < g.geom.STNumPoints() ) a, (SELECT 1 AS n) b ORDER BY a.polyNo, a.ringNo, 3; -- Results polyNo ringNo vectorNo sx sy ex ey ------ ------ -------- --- --- --- --- 1 1 1 10 10 15 10 1 1 2 15 10 15 15 1 1 3 15 15 10 15 1 1 4 10 15 10 10 1 2 1 12 12 12 14 1 2 2 12 14 14 14 1 2 3 14 14 14 12 1 2 4 14 12 12 12 2 1 1 100 100 150 100 2 1 2 150 100 150 150 2 1 3 150 150 100 150 2 1 4 100 150 100 100
Notes:
- Use of non-standard “CROSS APPLY” to allow for table values to be used by the generate_series function;
- generate_series return referenced by database specific “IntValue” column name;
- Methods have ST prefix with no underscore;
- Use of “geometry::” cast.
- STGeometryType returns “Polygon” (no ST_ prefix as in other two databases);
- Constructor function name STMPolyFromText different from other databases;
- STNumInteriorRing reports only interior rings;
- STX/STY are not methods.
PostgreSQL/PostGIS
WITH poly AS ( SELECT ROW_NUMBER() OVER (PARTITION BY b.n ORDER BY b.n) AS polyNo, a.geom FROM (SELECT ST_GeometryN(o.geom,generate_series(1,ST_NumGeometries(o.geom),1)) AS geom FROM (SELECT ST_MultiPolygonFromText( 'MULTIPOLYGON(((10.0 10.0, 15.0 10.0, 15.0 15.0, 10.0 15.0, 10.0 10.0), (12.0 12.0, 12.0 14.0, 14.0 14.0, 14.0 12.0, 12.0 12.0)), ((100.0 100.0, 150.0 100.0, 150.0 150.0, 100.0 150.0, 100.0 100.0)))',0) AS geom ) o ) a, (SELECT 1 AS n ) b WHERE UPPER(ST_GeometryType(a.geom)) = 'ST_POLYGON' ) SELECT a.polyNo, a.ringNo, ROW_NUMBER() OVER (PARTITION BY a.PolyNo, a.ringNo, b.n ORDER BY b.n) AS vectorNo, ST_X(a.sp) AS sx,ST_Y(a.sp) AS sy, ST_X(a.ep) AS ex,ST_Y(a.ep) AS ey FROM ( SELECT g.polyNo, g.ringNo, ST_PointN(g.geom,generate_series(1,ST_NumPoints(g.geom)-1,1)) AS sp, ST_PointN(g.geom,generate_series(2,ST_NumPoints(g.geom) ,1)) AS ep FROM (SELECT p.polyNo, 1 AS ringNo, ST_ExteriorRing(p.geom) AS geom FROM poly p UNION ALL SELECT p.polyNo, r.ringNo + 1, ST_InteriorRingN(p.geom,r.ringNo) AS geom FROM poly p, (SELECT generate_series(1,ST_NumInteriorRing(p.geom),1) AS ringNo FROM poly p) AS r ) g ) a, (SELECT 1 AS n) b ORDER BY a.polyNo, a.ringNo, 3; -- Results polyNo ringNo vectorNo sx sy ex ey ------ ------ -------- --- --- --- --- 1 1 1 10 10 15 10 1 1 2 15 10 15 15 1 1 3 15 15 10 15 1 1 4 10 15 10 10 1 2 1 12 12 12 14 1 2 2 12 14 14 14 1 2 3 14 14 14 12 1 2 4 14 12 12 12 2 1 1 100 100 150 100 2 1 2 150 100 150 150 2 1 3 150 150 100 150 2 1 4 100 150 100 100
Notes:
- Constructor function name MultiPolygonFromText different from other databases;
- STNumInteriorRing reports only interior rings.
- No object.method notation;
- ST_GeometryType returns “ST_Polygon” which is upper cased for cross-database comparison purposes;
- set returning function must be implemented as an attribute in a select statement in order to be able to use geometry methods.
Set Returning Functions
PostgreSQL’s inability for set returning functions eg generate_series() to draw their values/parameters from another table’s columns in the SQL is one of the major issues in trying to create cross-database SQL.
Here are some examples that show the issue. The first two show how generate_series can be used reasonably “naturally” with the value from the first table “a” being fed into the generate_series of “b”. The only difference is SQL Server’s keywords “CROSS APPLY” vs Oracle’s “, TABLE”.
-- SQL Server 2008 SELECT b.* FROM (SELECT 5 AS maxValue) a CROSS APPLY generate_series(1,a.maxValue,1) b; -- Result IntValue -------- 1 2 3 4 5
-- Oracle SELECT b.* FROM (SELECT 5 AS maxValue FROM DUAL) a, TABLE(codesys.geom.generate_series(1,a.maxValue,1)) b; -- Result COLUMN_VALUE ------------ 1 2 3 4 5
But with PostgreSQL the above pattern cannot be applied.
-- PostgreSQL - Attempt 1 SELECT b.* FROM (SELECT 5 AS maxValue) a, generate_series(1,a.maxValue,1) b; -- Result ERROR: FUNCTION expression IN FROM cannot refer TO other relations OF same query level LINE 3: generate_series(1,a.maxValue,1) b;
-- PostgreSQL - Attempt 2 SELECT b.* FROM (SELECT 5 AS maxValue) a, (SELECT generate_series(1,a.maxValue,1) AS column_value) b; -- Result ERROR: subquery IN FROM cannot refer TO other relations OF same query level LINE 3: (SELECT generate_series(1,a.maxValue,1) AS column_val...
-- PostgreSQL - Attempt 3 WITH aTable AS ( SELECT 5 AS maxValue ) SELECT b.*, a.maxValue FROM aTable a, (SELECT generate_series(1,a.maxValue,1) AS column_value FROM aTable a) b; -- Result column_value maxValue ------------ -------- 1 5 2 5 3 5 4 5 5 5
Finally, we get a result we can work with with PostgreSQL. However, to use this as the template for “cross database” SQL coding is, I think, pretty unacceptable. It is ugly, wordy and contorted, leading to semantic interpretation problems, never mind the potential for voluminous SQL (with multiple CTEs).
There is a better method in PostgreSQL which is as follows.
-- PostgreSQL - Attempt 4 SELECT generate_series(1,a.maxValue,1) b, a.maxValue FROM (SELECT 5 AS maxValue) a; -- Result column_value maxValue ------------ -------- 1 5 2 5 3 5 4 5 5 5
However, this use of a table function as an attribute is not possible in SQL Server and Oracle, thus it cannot be a basis for a cross-platform SQL “template”.
I am sure there are better methods. Anyone?
Summary
Here is a table comparing the OGC/SQLMM Methods Used:
Standard (SQL/MM ie ISO TC211) | Oracle (11gR2) | SQL Server 2008 | PostGIS 1.5 |
---|---|---|---|
ST_MPolyFromText() | ST_BdMPolyFromText() | STMPolyFromText() | ST_MultiPolygonFromText() |
ST_ExteriorRing() | ST_ExteriorRing() | STExteriorRing() | ST_ExteriorRing() |
ST_GeometryN() | ST_Geometries() | STGeometryN() | ST_GeometryN() |
ST_GeometryType() | ST_GeometryType() | STGeometryType() | ST_GeometryType() |
ST_InteriorRingN() | ST_InteriorRingN() | STInteriorRingN() | ST_InteriorRingN() |
ST_NumGeometries() | none | STNumGeometries() | ST_NumGeometries() |
ST_NumInteriorRing() | ST_NumInteriorRing() | STNumInteriorRing() | ST_NumInteriorRing() |
ST_NumPoints() | ST_NumPoints() | STNumPoints() | ST_NumPoints() |
ST_PointN() | ST_PointN() | STPointN() | ST_PointN() |
ST_X() | ST_X() | STX | ST_X() |
ST_Y() | ST_Y() | STY | ST_Y() |
In short, the OGC/SQLMM side of the database access problem is reasonably straight-forward (if anything ever is). Sure, there are naming differences but they can overcome. The lack of a true object hierarchy does provide some interesting implementation issues but these, I think, can be encapsulated via WITH or CTE expressions.
But, on the whole “database (or any technology) independence” is not easy to achieve for a whole host of non-spatial reasons.
- Implementations of the relevant SQL standards (eg SQL99 Analytics) differ across all databases;
- Non-standard database-specific functionality exists eg ROWNUM/DUAL/TABLE function (Oracle) etc;
- Lack of “common” methods across all database products for handling collections eg PostgreSQL has no similar “CROSS APPLY”/TABLE function handling capability (can do so via methods that the other databases do not provide);
- OGC/SQLMM implementation varies across databases with with methods missing (eg ST_GeometryN in Oracle), and limited or no inheritance;
- Interpretation of OGC/SQLMM standards differ eg SQL Server 2008 + PostGIS has geography datatype which the standard does not mention or require.
- API keywords vary;
- Database architectures differ.
Conclusion
Well, this is one attempt at looking at the cross-database. I have not discussed other architectural/philosophical design patterns like Oracle’s “Writers never block readers”, database ROI etc. That stuff is well covered elsewhere eg Tom Kyte’s views are well represented in this article by Mark Rittman called The Cost Of Database Independence and Comparison of different SQL implementations.
I have probably made some errors as I don’t profess to being a PostgreSQL expert (just an experienced user). If anyone else wishes to contribute to this discussion please do via some constructive comments.
Documentation
- MySQL Spatial General Functions
- Oracle LRS Objects
- Oracle Spatial Exporter (Java + pl/SQL)
- Oracle Spatial Object Functions
- Oracle Spatial Object Functions (Multi Page)
- PostGIS pl/pgSQL Functions
- SC4O Oracle Java Topology Suite (Java + pl/SQL)
- SQL Server Spatial General TSQL Functions
- SQL Server Spatial LRS TSQL Functions