Using R in SQL 2016 to calculate the distance between Cities

Since SQL Server 2016, R can be used in T-SQL statements to perform sophisticated calculations. One example I was facing, was to calculate the distance between two cities. Although there are many ways to solve this tasks, R can also be used to perform a exact calculation.

Prerequisites

R services need to be installed in order to execute R scripts within T-SQL. To calculate the distance between two geo coordinates, the geosphere library is required. The procedure to install additional packages is documented at MSDN.

image

Data model

This example contains 2 tables, Cities and DistanceTable. The Cities table contains the name and geo coordinates of a city, while the DistanceTable contains two references FromCity and ToCtiy to the Cities table.

Column Datatype
CityID int (Primary Key)
Name nvarchar(128)
Longitude real
Latitude real
Column Datatype
JournalID int (Primary key)
FromCity int (Foreign key)
ToCity int (Foreign key)

For example the two Austrian Cities Linz and Vienna look like this:

image

An entry in the distance table looks like this:

image

I’ve added another view to output the geo coordinates from both cities which are referenced in the DistanceTable

CREATE VIEW [dbo].[DistanceViewLonLat]
AS
SELECT DT.JournalID,
FC.Longitude AS FromLon, FC.Latitude AS FromLat,
TC.Longitude AS ToLon, TC.Latitude AS ToLat
FROM
dbo.DistanceTable AS DT
INNER JOIN dbo.Cities AS FC ON DT.FromCity = FC.CityID
INNER JOIN dbo.Cities AS TC ON DT.ToCity = TC.CityID
GO

A record from the view looks like this

image

R Script

The following R script takes a record from the view as input and calculates the distance between two points and rounds the result from meter to kilometer.

exec sp_execute_external_script
@language =N’R’,
@script=N’
library(sp)
library(geosphere)
sqlvalues <- as.matrix(InputDataSet);

getDistKm <- function(row)
{
p1 <- c(row[1], row[2])
p2 <- c(row[3], row[4])

d <- distGeo(p1,p2) / 1000
c(row[1], row[2], row[3], row[4], d)
}

km <- apply(sqlvalues,1,getDistKm)
km <- t(km)

OutputDataSet <- as.data.frame(km)
‘,
@input_data_1 =N’select FromLon, FromLat , ToLon, ToLat from DistanceViewLonLat where JournalID = 1;’
with result sets (([fromlng] real, [fromlat] real, [tolng] real, [tolat] real, [km] real not null));
go

The result looks like this:

image

About erpcoder
Azure Cloud Architect and Dynamics 365 enthusiast working in Research & Development for InsideAx

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: