# Using R in SQL 2016 to calculate the distance between Cities

6. September 2016 Leave a comment

Since SQL Server 2016, R can be used in T-SQL statements to perform sophisticated calculations. One example I was facing, was to calculate the distance between two cities. Although there are many ways to solve this tasks, R can also be used to perform a exact calculation.

### Prerequisites

R services need to be installed in order to execute R scripts within T-SQL. To calculate the distance between two geo coordinates, the geosphere library is required. The procedure to install additional packages is documented at MSDN.

### Data model

This example contains 2 tables, Cities and DistanceTable. The Cities table contains the name and geo coordinates of a city, while the DistanceTable contains two references FromCity and ToCtiy to the Cities table.

Column |
Datatype |

CityID | int (Primary Key) |

Name | nvarchar(128) |

Longitude | real |

Latitude | real |

Column |
Datatype |

JournalID | int (Primary key) |

FromCity | int (Foreign key) |

ToCity | int (Foreign key) |

For example the two Austrian Cities Linz and Vienna look like this:

An entry in the distance table looks like this:

I’ve added another view to output the geo coordinates from both cities which are referenced in the DistanceTable

CREATE VIEW [dbo].[DistanceViewLonLat]

AS

SELECT DT.JournalID,

FC.Longitude AS FromLon, FC.Latitude AS FromLat,

TC.Longitude AS ToLon, TC.Latitude AS ToLat

FROM

dbo.DistanceTable AS DT

INNER JOIN dbo.Cities AS FC ON DT.FromCity = FC.CityID

INNER JOIN dbo.Cities AS TC ON DT.ToCity = TC.CityID

GO

A record from the view looks like this

### R Script

The following R script takes a record from the view as input and calculates the distance between two points and rounds the result from meter to kilometer.

exec sp_execute_external_script

@language =N’R’,

@script=N’

library(sp)

library(geosphere)

sqlvalues <- as.matrix(InputDataSet);getDistKm <- function(row)

{

p1 <- c(row[1], row[2])

p2 <- c(row[3], row[4])d <- distGeo(p1,p2) / 1000

c(row[1], row[2], row[3], row[4], d)

}km <- apply(sqlvalues,1,getDistKm)

km <- t(km)OutputDataSet <- as.data.frame(km)

‘,

@input_data_1 =N’select FromLon, FromLat , ToLon, ToLat from DistanceViewLonLat where JournalID = 1;’

with result sets (([fromlng] real, [fromlat] real, [tolng] real, [tolat] real, [km] real not null));

go

The result looks like this: