Loading Point Data from a CSV File in PostGIS

I recently wrote an article on the use of External File with Oracle tables to load point data into Oracle.

I thought I would have a look on how to do this in PostGIS.

A bit of research came up with the PostgreSQL COPY SQL command.

There appears to be no equivalent to Oracle’s Organisation External for a table. In PostgreSQL you simply create your table and then insert data into it from an external file. You don’t seem to need to concern yourself with special permissions to do this: just get in and do it.

Which is what we will do right now.

 -- First off let's create our target table
 DROP TABLE lidar;

 Query returned successfully with no result in 15 ms.

   easting float,
  northing float,
    height float

 Query returned successfully with no result in 16 ms.

 -- Now we can execute the COPY command to load our data (PostgreSQL or perhaps PGAdminIII) gets 
 -- a little upset if we use Windows backslashes between the folders in the filename. 
 -- Yes, you can use a double backslash which will work but PostgreSQL will continue to complain. 
 -- Solution: simply use Linux/Unix forward slash filenames.
 COPY lidar ( easting, northing, height )
     FROM 'c:/temp/lidar/ground.csv'
           DELIMITER AS ','
           CSV HEADER ;

 Query returned successfully: 40401 rows affected, 297 ms execution time. 

 -- Now let's query the loaded data.
 SELECT count(*)
   FROM lidar l
  WHERE l.easting  BETWEEN 478040  AND 478060
    AND l.northing BETWEEN 5228050 AND 5228070
    AND l.height   BETWEEN 200     AND 600;


 -- Now let's add a 3D point geometry column to the table...
 SELECT AddGeometryColumn('public', 'lidar', 'geom', 28355, 'POINT', 3);

 "public.lidar.geom SRID:28355 TYPE:POINT DIMS:3"

 -- Now populate the added column
 update lidar set geom = ST_SetSRID(ST_MakePoint(easting,northing,height),28355);

 Query returned successfully: 40401 rows affected, 359 ms execution time.

 -- Create a spatial index for faster querying
 CREATE INDEX lidar_geom ON lidar USING GIST ( geom );

 Query returned successfully with no result in 1344 ms.

 -- Now, use the newly indexed spatial column in the spatial equivalent of the above query
 SELECT count(*)
   FROM lidar
  WHERE geom && SetSRID('BOX3D(478040 5228050,478060 5228070)'::box3d,28355) 
    AND height BETWEEN 200 AND 600;


Now, that was pretty simple. There are certainly advantages to the PostgreSQL way of loading CSV organised data. I am not sure about the lack of security with regards to external file access though. One thing in Oracle’s favour is that externally organised tables can be used without having to load the data which can be useful in many situations.

Anyway, I hope this article is of use to someone.