Removing Duplicate Vertices in SQL Server Spatial

Introduction

Vertices are used to define, or shape, linestrings and polygon rings. Two adjacent vertices should be distinct, that is not equal to each other.

But what is equal?

One can take two approaches:

  • Two vertices are equal when their ordinate values are the same given a statement of decimal precision.
  • So, POINT(1.1 1.1) is not equal to POINT(1.1001 1.1) at 6 digits of decimal precision but is equal at 1 digit of decimal precision.

  • Two vertices are equal when the distance between them is less than a supplied amount (tolerance or epsilon).
  • The distance between POINT(1.1 1.1) and POINT(1.1001 1.1) is 0.00009999999999999. As long as the distance value (tolerance or epsilon) is less than this, they vertices are equal.

SQL Server provides two functions that check a geometry object’s validity:

  • STIsValid() and
  • ValidDetailed()

Neither of these will detect duplicate vertices in a LineString or a Polygon ring:

with data as (
select 'LineString' as test, geometry::STGeomFromText('LINESTRING(0 0,1 1,1 1,2 2)',0) as geom
union all
select 'Polygon' as test, geometry::STGeomFromText('POLYGON((0 0,1 0,1 0,1 1,0 1,0 0))',0) as geom
)
select test, a.geom.STIsValid() as isValid, a.geom.IsValidDetailed() as isValidReason
  from data as a

test       isValid isValidReason
LineString 1       24400: Valid
Polygon    1       24400: Valid

Note: In Oracle SDO_GEOMETRY LineStrings and Polygons with duplicate vertices are reported as being in error with PostGIS reports duplicate vertices as valid.

Removing Duplicate Vertices

Even though SQL Server Spatial cannot detect duplicate vertices there are many use cases out there that require duplicate vertices to be removed. One is the conversion of GPS observations in to linestrings.

This article introduces two functions which will remove duplicate vertices:

  • STRemoveDuplicatePoints() – Removes duplicate vertices where their ordinates are equal at some statement of precision.
  • STRemoveDuplicatePointsByDistance() – Removes duplicate vertices where they are equal within a supplied distance.

Here are the two functions with examples.

STRemoveDuplicatePoints

Function [dbo].[STRemoveDuplicatePoints] (
  @p_geometry geometry,
  @p_round_xy int = 8,
  @p_round_z  int = 2,
  @p_round_m  int = 2
)
 Returns geometry

Description

This function removes any duplicate vertices in the supplied (and supported) geometry.

Two adjacent points are equal if their ordinates, at the distance decimal digits of precision, are equal.

If geometry has Z or M ordinates these are used in any distance comparison.

If a LineString reduces to a single Point a Point is returned.

If a MultiLineString element reduces to a single point the element is removed. If all elements are removed MULTILINESTRING EMPTY is returned.

If a Polygon or MultiPolygon element (ring) reduces to < 4 points the element is removed. If all elements (rings) are removed POLYGON EMPTY or MULTIPOLYGON EMPTY is returned.

Examples

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('POINT(0 0)',0),3,1,1).AsTextZM() as geom;
GO

geom
POINT (0 0)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('GEOMETRYCOLLECTION(POINT(0 0),POINT(1 1),LINESTRING(0 0,10 10))',0),1,1,1).AsTextZM() as geom;
GO

geom
GEOMETRYCOLLECTION (POINT (0 0), POINT (1 1), LINESTRING (0 0, 10 10))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('CIRCULARSTRING(0 0,10 10,20 0,30 -10, 40 0)',0),1,1,1).AsTextZM() as geom;
GO

geom
CIRCULARSTRING (0 0, 10 10, 20 0, 30 -10, 40 0)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('COMPOUNDCURVE(CIRCULARSTRING(0 0,10 10,20 0,30 -10, 40 0),(40 0,50 0))',0),1,1,1).AsTextZM() as geom;
GO

geom
COMPOUNDCURVE (CIRCULARSTRING (0 0, 10 10, 20 0, 30 -10, 40 0), (40 0, 50 0))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('CURVEPOLYGON(CIRCULARSTRING(1 3, 3 5, 4 7, 7 3, 1 3))',0),1,1,1).AsTextZM() as geom;
go

geom
CURVEPOLYGON (CIRCULARSTRING (1 3, 3 5, 4 7, 7 3, 1 3))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('LINESTRING(1 1.001,1 1)',0),2,1,1).AsTextZM() as geom; 
GO

geom
POINT (1 1)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('LINESTRING(1 1.001,1 1,1 1,2 2)',0),1,1,1).AsTextZM() as geom; 
GO

geom
LINESTRING (1 1.001, 2 2)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('LINESTRING(1 1.001 1,1 1 2,1 1 2,2 2 3)',0),1,1,1).AsTextZM() as geom; 
GO

geom
LINESTRING (1 1.001 1, 1 1 2, 2 2 3)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOINT(1 1.001,1 1)',0),2,1,1).AsTextZM() as geom;
GO

geom
POINT (1 1)

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOINT((1 1.001),(1 1),(1 1),(2 2))',0),1,1,1).AsTextZM() as geom;
GO

geom
MULTIPOINT ((1 1.001), (2 2))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTILINESTRING((0 0,0.01 0.01),(3 3,5 5,7 7))',0),1,1,1).AsTextZM() as geom;
GO

geom
MULTILINESTRING ((3 3, 5 5, 7 7))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTILINESTRING((0 0,1 1),(3 3,4 4))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTILINESTRING ((0 0, 1 1), (3 3, 4 4))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0),(0.5 0.1,1 1,1 0.1,0.5 0.1))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),0,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),0,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOLYGON(((0.5 0.1,1 1,1 0.1,0.5 0.1)),((0 0,10 0,10 10,0 10,0 0)))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)))

select [dbo].[STRemoveDuplicatePoints](geometry::STGeomFromText('MULTIPOLYGON(((10 10,11 10,11 11,10 11,10 10)),((0 0,20 0,20 20,0 20,0 0),(0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),0,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((10 10, 11 10, 11 11, 10 11, 10 10)), ((0 0, 20 0, 20 20, 0 20, 0 0)))

STRemoveDuplicatePointsByDistance

Function [dbo].[STRemoveDuplicatePointsByDistance] (
  @p_geometry geometry,
  @p_distance float,
  @p_round_z  int = 2,
  @p_round_m  int = 3           
)
 Returns geometry

Description

This function removes any duplicate vertices in the supplied (and supported) geometry.

When two adjacent points are compared by using the supplied distance.

If geometry has Z or M ordinates these are used in any distance comparison ie If same 2D distance the they are equal only if z and m ordinates are also equal.

If a LineString reduces to a single Point a Point is returned.

If a MultiLineString element reduces to a single point the element is removed. If all elements are removed MULTILINESTRING EMPTY is returned.

If a Polygon or MultiPolygon element (ring) reduces to < 4 points the element is removed. If all elements (rings) are removed POLYGON EMPTY or MULTIPOLYGON EMPTY is returned.

Examples

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('POINT(0 0)',0),0.5,1,1).AsTextZM() as geom;
GO

geom
POINT (0 0)

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('GEOMETRYCOLLECTION(POINT(0 0),POINT(1 1),LINESTRING(0 0,10 10))',0),0.5,1,1).AsTextZM() as geom;
GO

geom
GEOMETRYCOLLECTION (POINT (0 0), POINT (1 1), LINESTRING (0 0, 10 10))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('CIRCULARSTRING(0 0,10 10,20 0,30 -10, 40 0)',0),0.5,1,1).AsTextZM() as geom;
GO

geom
CIRCULARSTRING (0 0, 10 10, 20 0, 30 -10, 40 0)

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('COMPOUNDCURVE(CIRCULARSTRING(0 0,10 10,20 0,30 -10, 40 0),(40 0,50 0))',0),0.5,1,1).AsTextZM() as geom;
GO

geom
COMPOUNDCURVE (CIRCULARSTRING (0 0, 10 10, 20 0, 30 -10, 40 0), (40 0, 50 0))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('CURVEPOLYGON(CIRCULARSTRING(1 3, 3 5, 4 7, 7 3, 1 3))',0),1.5,1,1).AsTextZM() as geom;
go

geom
CURVEPOLYGON (CIRCULARSTRING (1 3, 3 5, 4 7, 7 3, 1 3))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('LINESTRING(1 1.001,1 1)',0),0.5,1,1).AsTextZM() as geom; 
GO

geom
POINT (1 1)
select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('LINESTRING(1 1.001,1 1,1 1,2 2)',0),0.5,1,1).AsTextZM() as geom; 
GO

geom
LINESTRING (1 1.001, 2 2)

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('LINESTRING(1 1.001 1,1 1 2,1 1 2,2 2 3)',0),0.5,1,1).AsTextZM() as geom; 
GO

geom
LINESTRING (1 1.001 1, 1 1 2, 2 2 3)

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOINT(1 1.001,1 1)',0),2,1,1).AsTextZM() as geom;
GO

geom
POINT (1 1)

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOINT((1 1.001),(1 1),(1 1),(2 2))',0),1.4,1,1).AsTextZM() as geom;
GO

geom
MULTIPOINT ((1 1.001), (2 2))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTILINESTRING((0 0,1 1,2 2),(3 3,4 4))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTILINESTRING ((0 0, 2 2))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTILINESTRING((0 0,1 1),(3 3,5 5,7 7))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTILINESTRING ((3 3, 5 5, 7 7))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTILINESTRING((0 0,1 1),(3 3,4 4))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTILINESTRING EMPTY

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('POLYGON((0.5 0.1,1 0.1,1 1,0.5 0.1))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
POLYGON EMPTY

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0),(0.5 0.1,1 1,1 0.1,0.5 0.1))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),10,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON EMPTY

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOLYGON(((0.5 0.1,1 1,1 0.1,0.5 0.1)),((0 0,10 0,10 10,0 10,0 0)))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 10 0, 10 10, 0 10, 0 0)))

select [dbo].[STRemoveDuplicatePointsByDistance](geometry::STGeomFromText('MULTIPOLYGON(((10 10,11 10,11 11,10 11,10 10)),((0 0,20 0,20 20,0 20,0 0),(0.5 0.1,1 1,1 0.1,0.5 0.1)))',0),1.5,1,1).AsTextZM() as geom;
GO

geom
MULTIPOLYGON (((0 0, 20 0, 20 20, 0 20, 0 0)))

Conclusion

The presented functions offer powerful methods for ensuring high data quality within a SQL Server Spatial database.

The functions can be accessed via a small donation here.

I hope they are of use to someone out there.