Top 5 Recent Articles
ARTICLES CATEGORIES
- Algorithms (22)
- All (399)
- Biography (1)
- Blog (44)
- Business Requirements (1)
- Commentary (1)
- Conversion (2)
- Customers (2)
- Data Models (1)
- Education (2)
- GeoRaptor (13)
- GPS (1)
- Image Processing (2)
- Import Export (8)
- Licensing (2)
- LiDAR (1)
- Linear Referencing (4)
- Manifold GIS (3)
- Mapping (1)
- MySQL Spatial (7)
- Networking and Routing (including Optimization) (5)
- Open Source (18)
- Oracle Spatial and Locator (194)
- Partitioning (1)
- PostGIS (36)
- Projections (1)
- Published Articles (1)
- qGIS (1)
- Recommendations (1)
- Services (1)
- Software Change Log (1)
- Source Code (37)
- Space Curves (9)
- Spatial Database Functions (109)
- Spatial DB comparison (1)
- Spatial XML Processing (11)
- SQL Server Spatial (92)
- Standards (3)
- Stored Procedure (17)
- Tessellation or Gridding (10)
- Tools (2)
- Topological Relationships (1)
- Training (2)
- Triangulation (2)
Loading Spatial Data from an external CSV file in Oracle
There have been many, many articles written on the web describing how to access external data (data held in operating system files not under the control of Oracle and not stored within its own datafiles and tablespaces) from within Oracle such that one can query the data using SQL.
In short, Oracle allows a user to access external data by wrapping a table definition over the top of the external data files and using its SQL loader and DataPump technologies to convert the data “on the fly” as users query it. As the Oracle documentation notes, this is a read-only table whose metadata is stored in the database but whose data in stored outside the database.
The table “wrapper” is achieved by using the CREATE TABLE statement to define the columns of the table in the normal manner, but the appending to the definition its physical_properties, in particular, the ORGANIZATION EXTERNAL clause.
This clause is made up of two dependent sub-clauses, the external_table_clause and the external_data_properties that allows you to specify the properties so described. For example, the TYPE clause of external_table_clause allows you to indicate the driver to be used to access external table’s data. Oracle database provides two access drivers: ORACLE_LOADER and ORACLE_DATAPUMP. Whereas the ACCESS PARAMETERS clause of “external_data_properties” allows you to describe the physical layout of the file (what type of delimiter, number of fields etc).
The best way to describe the external data loading capability of Oracle is to show you how to use it to load a small sample of LiDAR data (after it has been post-processed).
Firstly, we need to place our data in an operating system directory which Oracle can access, (in this example I have placed a file called ground.csv in C:\Temp\Lidar), and then create an Oracle Directory object and grant the user that will create and load the data (here my standard “CODESYS” user), read access on that directory.
-- Create directory that points to actual data file. CREATE OR REPLACE DIRECTORY EXTERNAL_DATA_DIRECTORY AS 'C:\temp\lidar'; CREATE OR REPLACE DIRECTORY succeeded. -- If create directory done by DBA as SYSTEM then you would need to grant the user doing the processing the ability to read from that directory via: GRANT READ ON DIRECTORY EXTERNAL_DATA_DIRECTORY TO codesys; GRANT READ succeeded.
Now we need to define our table object in Oracle. We also need to describe its external properties as part of the one definition.
DROP TABLE lidar; DROP succeeded. CREATE TABLE lidar( easting NUMBER, northing NUMBER, height NUMBER ) ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY DATA_PUMP_DIR ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED BY ',' MISSING FIELD VALUES ARE NULL (easting, northing, height) ) LOCATION (EXTERNAL_DATA_DIRECTORY:'ground.csv')) REJECT LIMIT UNLIMITED; CREATE TABLE succeeded. --Note that if our external file was TAB delimited we would use the following clause: -- FIELDS TERMINATED BY 0x'09'
Now we execute SQL SELECT statements against the table.
SELECT count(*) FROM lidar; COUNT(*) ---------------------- 40401 SELECT count(*) FROM lidar l WHERE l.easting BETWEEN 478040 AND 478060 AND l.northing BETWEEN 5228050 AND 5228070 AND l.height BETWEEN 200 AND 600; 1 rows selected COUNT(*) ---------------------- 301 1 rows selected
I will now demonstrate that you cannot create spatial indexes on this table. Firstly, note that I have no SDO_GEOMETRY column in this table that I can index. However, I could try creating a function-based index (even though I doubt it would work because Oracle indexes normally contain ROWIDs that link the index to the actual row in the table holding the data. External tables do not have ROWIDs.)
It is worth noting the following from the Oracle documentation.
Restrictions on External Tables
External tables are subject to the following restrictions:
- An external table cannot be a temporary table.
- You cannot specify constraints on an external table.
- An external table cannot have object type, varray, or LONG columns.
(I will look at the last limitation at the end of this article.)
Let’s create the function based index in the common manner, including the metadata.
-- Create Function that returns a point geometry so we can create a function-based index -- create or replace function observation(p_srid in number, p_x in number, p_y in number, p_z in number ) return mdsys.sdo_geometry deterministic Is Begin Return MDSYS.SDO_GEOMETRY(3001,p_srid,mdsys.sdo_point_type(p_x,p_y,p_z),NULL,NULL); End observation; / show errors function observation(p_srid Compiled. No Errors. -- Generate sdo_geom_metadata entry -- DELETE FROM USER_SDO_GEOM_METADATA WHERE table_name = 'LIDAR'; 1 rows deleted COMMIT; COMMIT succeeded. -- Generate USER_SDO_GEOM_METADATA entry based on actual extent of data in external table. DECLARE v_minx Number; v_maxx Number; v_miny Number; v_maxy Number; v_minz Number; v_maxz Number; BEGIN SELECT min(easting),max(easting),min(northing),max(northing),min(height),max(height) INTO v_minx, v_maxx, v_miny, v_maxy, v_minz, v_maxz FROM LIDAR; INSERT INTO USER_SDO_GEOM_METADATA( TABLE_NAME, COLUMN_NAME, DIMINFO, SRID ) VALUES ( 'LIDAR', 'OBSERVATION(28355,EASTING,NORTHING,HEIGHT)', MDSYS.SDO_DIM_ARRAY( MDSYS.SDO_DIM_ELEMENT('X',v_minx,v_maxx,0.005), MDSYS.SDO_DIM_ELEMENT('Y',v_miny,v_maxy,0.005), MDSYS.SDO_DIM_ELEMENT('Z',v_minz,v_maxz,0.005) ), 28355 ); END; / SHOW ERRORS anonymous block completed No Errors. COMMIT; COMMIT succeeded. -- Let's check to make sure the entry has been created SELECT TABLE_NAME,COLUMN_NAME,DIMINFO FROM USER_SDO_GEOM_METADATA WHERE TABLE_NAME = 'LIDAR'; TABLE_NAME COLUMN_NAME DIMINFO ---------- ------------------------------------------ -------------------------------------------------------------------------------------------------------------------------------------------------------------- LIDAR OBSERVATION(28355,EASTING,NORTHING,HEIGHT) MDSYS.SDO_DIM_ARRAY(MDSYS.SDO_DIM_ELEMENT(X,478000,478100,0.005),MDSYS.SDO_DIM_ELEMENT(Y,5228000,5228100,0.005),MDSYS.SDO_DIM_ELEMENT(Z,1.019,1999.959,0.005)) 1 rows selected -- Now try and create an index on this function DROP INDEX LIDAR_GEOM; Error starting at line 84 in command: DROP INDEX LIDAR_GEOM Error report: SQL Error: ORA-01418: specified index does not exist 01418. 00000 - "specified index does not exist" CREATE INDEX LIDAR_GEOM ON LIDAR(OBSERVATION(28355,EASTING,NORTHING,HEIGHT)) INDEXTYPE is mdsys.spatial_index parameters('sdo_indx_dims=2, layer_gtype=point'); Error starting at line 85 in command: CREATE INDEX LIDAR_GEOM ON LIDAR(OBSERVATION(28355,EASTING,NORTHING,HEIGHT)) INDEXTYPE is mdsys.spatial_index parameters('sdo_indx_dims=2, layer_gtype=point') Error at Command Line:86 Column:10 Error report: SQL Error: ORA-30657: operation not supported on external organized table 30657.0000 - "operation not supported on external organized table" *Cause: User attempted on operation on an external table which is not supported. *Action: Don't do that!
Don’t you love the recommended Action!
OK, so our external data of LiDAR points can really on read via full-table scans as one cannot create an index on an externally organized table.
If we want to spatially index we have to bring the data inside the database which we will now do.
-- Rename our existing LiDAR table to LiDAR_EXT DROP TABLE lidar_ext; Error starting at line 90 in command: drop table lidar_ext Error report: SQL Error: ORA-00942: table or view does not exist 00942. 00000 - "table or view does not exist" RENAME lidar TO lidar_ext; rename lidar succeeded. CREATE TABLE lidar AS SELECT sdo_geometry(3001,28355,sdo_point_type(easting,northing,height),null,null) as geom FROM lidar_ext; create table succeeded. -- Change column name in existing USER_SDO_GEOM_METADATA UPDATE user_sdo_geom_metadata SET column_name = 'GEOM' WHERE table_name = 'LIDAR'; 1 rows updated COMMIT; commit succeeded. -- Now, rebuild index. DROP INDEX LIDAR_GEOM; Error starting at line 104 in command: DROP INDEX LIDAR_GEOM Error report: SQL Error: ORA-01418: specified index does not exist 01418. 00000 - "specified index does not exist" CREATE INDEX LIDAR_GEOM ON LIDAR(GEOM) INDEXTYPE is mdsys.spatial_index parameters('sdo_indx_dims=2, layer_gtype=point'); CREATE INDEX succeeded. DROP TABLE LIDAR_EXT; DROP TABLE LIDAR_EXT succeeded.
So, now we can query the data.
SELECT count(*) FROM lidar l WHERE SDO_ANYINTERACT(l.geom,sdo_geometry(2003,28355,null,sdo_elem_info_array(1,3,3),sdo_ordinate_array(478040,5228050,478060,5228070))) = 'TRUE' AND l.geom.sdo_point.z BETWEEN 200 AND 600; COUNT(*) ---------------------- 301 1 rows selected
Data Other than Points
Our example is based on point data that is loaded into three numeric attributes rather than to an SDO_GEOMETRY object directly. As noted above:
- An external table cannot have object type, varray, or LONG columns.
Instead of modifying my existing example to try and define the SDO_GEOMETRY object directly on the X, Y and Z fields in the CSV, I thought I would attempt to load the linear data associated with my article on executing shp2sdo from within the database.
The shp2sdo utility creates a SQL Loader control file with the actual data stored in-line inside the control file or externally in a data file. This is controlled by the “-d” parameter. I re-ran the tool so that I got the shapefile’s data in the separate data file. I the opened the CTL file, grabbed the load parameters and constructed an appropriate CREATE TABLE statement from it inside Oracle database. Here is that statement:
CREATE OR REPLACE DIRECTORY TEMP_DATA_DIRECTORY AS 'C:\temp'; -- GRANT READ ON DIRECTORY TEMP_DATA_DIRECTORY TO codesys; CREATE TABLE GEODATA250K_TAS_ROADS ( GID NUMBER(38,0), CLASS VARCHAR2(20 BYTE), FEATWIDTH VARCHAR2(20 BYTE), NRN VARCHAR2(20 BYTE), PID VARCHAR2(20 BYTE), FEATREL VARCHAR2(20 BYTE), FORMATION VARCHAR2(20 BYTE), ATTRREL VARCHAR2(20 BYTE), RETIRED VARCHAR2(20 BYTE), SRN VARCHAR2(20 BYTE), SOURCE VARCHAR2(20 BYTE), SHAPE_LENG VARCHAR2(20 BYTE), NAME VARCHAR2(20 BYTE), TEXTNOTE VARCHAR2(20 BYTE), FEATTYPE VARCHAR2(20 BYTE), PLANACC VARCHAR2(20 BYTE), SYMBOL VARCHAR2(20 BYTE), CREATED VARCHAR2(20 BYTE), PEANO NUMBER, GEOM MDSYS.SDO_GEOMETRY ) ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY TEMP_DATA_DIRECTORY ACCESS PARAMETERS ( CONTINUEIF NEXT(1:1) = '#' FIELDS TERMINATED BY '|' TRAILING NULLCOLS ( GID INTEGER EXTERNAL, CLASS NULLIF CLASS = BLANKS, FEATWIDTH NULLIF FEATWIDTH = BLANKS, NRN NULLIF NRN = BLANKS, PID NULLIF PID = BLANKS, FEATREL NULLIF FEATREL = BLANKS, FORMATION NULLIF FORMATION = BLANKS, ATTRREL NULLIF ATTRREL = BLANKS, RETIRED NULLIF RETIRED = BLANKS, SRN NULLIF SRN = BLANKS, SOURCE NULLIF SOURCE = BLANKS, SHAPE_LENG NULLIF SHAPE_LENG = BLANKS, NAME NULLIF NAME = BLANKS, TEXTNOTE NULLIF TEXTNOTE = BLANKS, FEATTYPE NULLIF FEATTYPE = BLANKS, PLANACC NULLIF PLANACC = BLANKS, SYMBOL NULLIF SYMBOL = BLANKS, CREATED NULLIF CREATED = BLANKS, PEANO, GEOM COLUMN OBJECT ( SDO_GTYPE INTEGER EXTERNAL, SDO_SRID INTEGER EXTERNAL, SDO_ELEM_INFO VARRAY TERMINATED BY '|/' (X FLOAT EXTERNAL), SDO_ORDINATES VARRAY TERMINATED BY '|/' (X FLOAT EXTERNAL) ) ) ) LOCATION (TEMP_DATA_DIRECTORY:'GeoData250K_Tas_roads.dat')) REJECT LIMIT UNLIMITED;
On execution this is what happened.
Error starting at line 4 in command: CREATE TABLE GEODATA250K_TAS_ROADS ( GID NUMBER(38,0), CLASS VARCHAR2(20 BYTE), FEATWIDTH VARCHAR2(20 BYTE), NRN VARCHAR2(20 BYTE), PID VARCHAR2(20 BYTE), FEATREL VARCHAR2(20 BYTE), FORMATION VARCHAR2(20 BYTE), ATTRREL VARCHAR2(20 BYTE), RETIRED VARCHAR2(20 BYTE), SRN VARCHAR2(20 BYTE), SOURCE VARCHAR2(20 BYTE), SHAPE_LENG VARCHAR2(20 BYTE), NAME VARCHAR2(20 BYTE), TEXTNOTE VARCHAR2(20 BYTE), FEATTYPE VARCHAR2(20 BYTE), PLANACC VARCHAR2(20 BYTE), SYMBOL VARCHAR2(20 BYTE), CREATED VARCHAR2(20 BYTE), PEANO NUMBER, GEOM MDSYS.SDO_GEOMETRY ) ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY TEMP_DATA_DIRECTORY ACCESS PARAMETERS ( CONTINUEIF NEXT(1:1) = '#' FIELDS TERMINATED BY '|' TRAILING NULLCOLS ( GID INTEGER EXTERNAL, CLASS NULLIF CLASS = BLANKS, FEATWIDTH NULLIF FEATWIDTH = BLANKS, NRN NULLIF NRN = BLANKS, PID NULLIF PID = BLANKS, FEATREL NULLIF FEATREL = BLANKS, FORMATION NULLIF FORMATION = BLANKS, ATTRREL NULLIF ATTRREL = BLANKS, RETIRED NULLIF RETIRED = BLANKS, SRN NULLIF SRN = BLANKS, SOURCE NULLIF SOURCE = BLANKS, SHAPE_LENG NULLIF SHAPE_LENG = BLANKS, NAME NULLIF NAME = BLANKS, TEXTNOTE NULLIF TEXTNOTE = BLANKS, FEATTYPE NULLIF FEATTYPE = BLANKS, PLANACC NULLIF PLANACC = BLANKS, SYMBOL NULLIF SYMBOL = BLANKS, CREATED NULLIF CREATED = BLANKS, PEANO, GEOM COLUMN OBJECT ( SDO_GTYPE INTEGER EXTERNAL, SDO_SRID INTEGER EXTERNAL, SDO_ELEM_INFO VARRAY TERMINATED BY '|/' (X FLOAT EXTERNAL), SDO_ORDINATES VARRAY TERMINATED BY '|/' (X FLOAT EXTERNAL) ) ) ) LOCATION (TEMP_DATA_DIRECTORY:'GeoData250K_Tas_roads.dat')) REJECT LIMIT UNLIMITED Error at Command Line:24 Column:1 Error report: SQL Error: ORA-30656: column type not supported on external organized table 30656.0000 - "column type not supported on external organized table" *Cause: Attempt to create an external organized table with a column of type LONG, LOB, BFILE, ADT, or VARRAY. *Action: These column types are not supported, change the DDL.
Note, as per the documentation, “These column types are not supported”. For non-point geometry data you must use other methods for loading.
I hope this is useful to someone.
Documentation
- MySQL Spatial General Functions
- Oracle LRS Objects
- Oracle Spatial Exporter (Java + pl/SQL)
- Oracle Spatial Object Functions
- Oracle Spatial Object Functions (Multi Page)
- PostGIS pl/pgSQL Functions
- SC4O Oracle Java Topology Suite (Java + pl/SQL)
- SQL Server Spatial General TSQL Functions
- SQL Server Spatial LRS TSQL Functions