km process description
Last updated 31-3-2019
Outline
Distance markers indicate a location along a road based on the distance from some reference point. They have existed for thousands of years and were originally mostly stones. They often also indicate a road number and sometimes the distance to one or more destinations.
They are also known as mileposts, milestones, mile(age) markers, kilometer markers, driver location signs, highway location markers.
km 87 on the N247 in Portugal, 2011
km 34 on the N52 in France (1988)
km 0 on section 073 of the L3030 in Hessen (Germany)
1.3 miles on the N5 in Ireland. The leading 0 and dash never appear on directional signs.
km 19 on State Highway 5 in Andhra Pradesh (2010, now Telangana), India with distance to Kalwakurthy
Many roads have several sections partly with the same values. Beside the road number it is also necessary to include some description of the section along the road.
This page describes the process used to create the mileage and km points linked on the Mileage and kilometerage index page.
To compute the location of kilometer or mileage points we need a few known points to begin with (as many as possible to get high accuracy). Then we can interpolate the other values using a list of co-ordinates along the road (a track). Conveniently, openrouteservice.org, a routing service based on OpenStreetMap, provides the option to download the routes between given points.
This diagram illustrates the process:
Collect known points
We could go out and collect points with a GPS receiver. It would be time consuming and costly to obtain reasonable coverage that way. Instead we can use online mapping services such as Google Maps and Mapillary, a crowd sourced streetview site.
In most countries there are signs or stones every km; others have them every 100 m or 500 m. More difficult to handle are countries that only have sporadic signs at irregular locations.
Google Maps
Co-ordinates in Google Maps can simply be read from the URL. For example, this pano shows a km sign right beside the car. We can now copy the co-ordinates 50.0608102,5.2010495 from the URL:
https://www.google.nl/maps/@50.0608102,5.2010495,3a,83.4y,42.41h,78.68t/data=!3m6!1e1!3m4!1s1hwLJHLFkH0owuB_s-KfWA!2e0!7i13312!8i6656 and put it in a spreadsheet. In a second column, add the km value, and, for beginning and end of a section, add a description which will be used to form the section name later.
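Copying the co-ordinates by hand is error-prone when many panos are collected. The following is a minimal sketch, not part of the original process, that pulls the pair out of the `@lat,lng,` part of such a URL. The function name is hypothetical, and it assumes the URL has the same shape as the example above.

```python
import re

def coords_from_google_maps_url(url):
    """Return (lat, lng) from the '@lat,lng,' part of a Google Maps URL, or None."""
    match = re.search(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?),", url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

url = ("https://www.google.nl/maps/@50.0608102,5.2010495,3a,83.4y,42.41h,78.68t/"
       "data=!3m6!1e1!3m4!1s1hwLJHLFkH0owuB_s-KfWA!2e0!7i13312!8i6656")
print(coords_from_google_maps_url(url))
```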
The accuracy of co-ordinates obtained from Google Maps this way is on the order of a few meters. There seems to be a strong bias for the co-ordinates to appear slightly behind the actual location.
This can be observed by comparing opposite carriageways: Go into streetview on one carriageway, and find a pano which shows a recognisable location (not necessarily a km sign, can also be e.g. a point on a bridge) as precisely as possible on the line perpendicular to the direction of travel, and note the co-ordinates. Then go to the other direction and find a streetview pano that shows an image exactly opposite the first one, and copy the co-ordinates as well, then compare the points (e.g. plan a route between them). Chances are that the two points will NOT appear exactly opposite each other. This means that the location at which the picture is shown in streetview is not exactly the location where the picture was actually taken. Usually you will see that the two points appear before the point where they would be opposite each other. This suggests the location that is recorded is slightly off sync (the car had already travelled further down the road). Try e.g. around 40.613555,15.2637171 in Italy.
Therefore it is good to use a streetview image which shows the sign slightly behind the car.
NB This can also be seen by comparing with aerial imagery, but this too can have a small inaccuracy.
Mapillary
Where Google Maps has no streetview, one can often use Mapillary in a similar way.
Example URL: https://www.mapillary.com/app/?lat=47.364531464691595&lng=16.085324182596764
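For a URL of this shape the co-ordinates sit in the `lat` and `lng` query parameters. A small sketch using only the Python standard library (the function name is hypothetical, and other Mapillary URL layouts are not handled):

```python
from urllib.parse import urlparse, parse_qs

def coords_from_mapillary_url(url):
    """Return (lat, lng) from the 'lat' and 'lng' query parameters of the URL."""
    query = parse_qs(urlparse(url).query)
    return float(query["lat"][0]), float(query["lng"][0])

url = "https://www.mapillary.com/app/?lat=47.364531464691595&lng=16.085324182596764"
print(coords_from_mapillary_url(url))
```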
Since this usually only has forward images, the accuracy that can be obtained is much lower. We can estimate this by comparing with Google Streetview where it is available: Compare the co-ordinates from streetview with the co-ordinates of the last Mapillary picture which shows the sign.
NB One should keep in mind that when selecting a location for streetview by dragging onto the blue line, one will get the nearest point that has streetview, not that exact location. Thus one should use an actual streetview co-ordinate pair as basis.
It is also useful to know the frequency for the given picture sequence (this can vary between 1 and 50 m). For example, say we have this picture:
and 10 pictures further we see
Since the signs must be (about) 200 m apart, we can estimate the distance between panos as 20 m. Of course the speed of the car is not constant, and signs may be closer together or further apart than indicated, the GPS receiver may be off etc. but this way one can still aim for 10-20 m accuracy.
Now if we know that we can still see signs at a distance of about 10 m, we can interpolate the location between the given location and that of the next picture.
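The arithmetic of this estimate can be written down as a short sketch. Both function names are hypothetical, the co-ordinates in the usage lines are made up for illustration, and the fraction used for the interpolation is a judgment call by the person collecting the points, not something the pictures provide.

```python
def picture_spacing_m(sign_gap_m, pictures_between):
    """Average distance between consecutive pictures, in metres."""
    return sign_gap_m / pictures_between

def point_between(first, second, fraction):
    """Linear interpolation between two (lat, lng) pairs; fraction 0 gives 'first'."""
    return (first[0] + fraction * (second[0] - first[0]),
            first[1] + fraction * (second[1] - first[1]))

# Signs 200 m apart, seen 10 pictures apart.
spacing = picture_spacing_m(200, 10)
print(spacing)

# Illustrative only: place the sign 10 m beyond the last picture that shows it,
# i.e. at fraction 10 / spacing of the way to the next picture.
last_picture = (47.0000, 16.0000)   # made-up co-ordinates
next_picture = (47.0002, 16.0004)   # made-up co-ordinates
print(point_between(last_picture, next_picture, 10 / spacing))
```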
This is of course much more cumbersome than using Google streetview.
Note: To avoid copyright issues, these pictures are not actually from Mapillary.
YouTube
When there is no streetview from either Google or Mapillary, sometimes a YouTube video can help. This does require an aerial image on which specific locations along the road can be recognised. As an example, see this video of the A23 in Greece. At 4:01, the 7 km sign can be seen but it is hard to find the right location. At the end of the video, it is clear that km 22.3 corresponds to streetview at 41.2647927,25.4294302. Note that streetview is old; the signs were put up later.
This way of getting locations is even more time consuming than using Mapillary.
Download points
In order for the standard scripts to work, there must be an empty line between sections and the road number must appear on a line of its own.
When done, download the points to a text file which must have the following file name syntax: <cc>_points_<date>.txt
Example: nl_points_20190221.txt
Referred to as the 'points file' below.
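A small sketch of building and checking such a file name. The helper names are hypothetical, and the pattern assumes a two-letter lowercase country code and an 8-digit YYYYMMDD date, as in the example.

```python
import re
from datetime import date

# Assumption: <cc> is two lowercase letters, <date> is YYYYMMDD.
POINTS_FILE_PATTERN = re.compile(r"^([a-z]{2})_points_(\d{8})\.txt$")

def points_file_name(cc, day):
    """Build the points file name for a country code and a date."""
    return f"{cc}_points_{day:%Y%m%d}.txt"

def parse_points_file_name(name):
    """Return (cc, date string) if the name follows the syntax, else None."""
    match = POINTS_FILE_PATTERN.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(points_file_name("nl", date(2019, 2, 21)))
print(parse_points_file_name("nl_points_20190221.txt"))
```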
Preparation
The process is semi-automated: a script creates a file with the necessary commands which can then be executed one by one.
Create commands
The script is run as follows:
create_km_commands.sh <cc> <version> <date> <output> [<resolution>]
cc: country code, will be used in directory and file names
version: a two-digit sequential number with leading zero if applicable
date: day in 8-digit format YYYYMMDD
output: name of the file with commands
resolution (optional): if this is "hm", a point will be placed every 100 m instead of every km.
Example for the Netherlands:
create_km_commands.sh nl 03 20190221 commands03.sh hm
Split script
The first section creates the required directory structure and runs a script that splits the points file into one per road section.
Furthermore, it has the following outputs:
An HTML file with links to openrouteservice.org
A shell script that converts all of the required gpx files to csv files with the co-ordinates of all the points along the route (the tracks)
A shell script to interpolate the track for each section and place points with the desired mile or km values
A shell script to combine all the sections and add the right names
An HTML file with automatically generated (and partly incorrect if both carriageways appear) descriptions of the sections
A log file
Download tracks
Use the HTML file with openrouteservice.org links and open them one by one.
Click on 'Export Route' and choose 'GPS eXchange Format (.gpx)'.
Openrouteservice URL
A route URL has the following syntax:
https://openrouteservice.org/directions?n1=0&n2=0&n3=7&a=<start latitude>,<start longitude>,<destination latitude>,<destination longitude>&b=0&c=0&k1=en-US&k2=km
Example:
https://openrouteservice.org/directions?n1=0&n2=0&n3=7&a=52.349576,4.966966,52.232389,6.171115&b=0&c=0&k1=en-US&k2=km
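Such a URL can be assembled from the first and last known point of a section. This is a sketch only: the function name is hypothetical, and the fixed parameters (`n1`, `n2`, `n3`, `b`, `c`, `k1`, `k2`) are simply copied from the syntax above without interpreting them.

```python
def openrouteservice_url(start, destination):
    """Build a directions URL from two (latitude, longitude) pairs."""
    coords = ",".join(str(value) for value in (*start, *destination))
    return ("https://openrouteservice.org/directions?n1=0&n2=0&n3=7"
            f"&a={coords}&b=0&c=0&k1=en-US&k2=km")

print(openrouteservice_url((52.349576, 4.966966), (52.232389, 6.171115)))
```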
Problems
Openrouteservice is inclined to leave the main road and go via an exit, and join the motorway again right away.
For example, the route from Leiden to Vogelenzang leaves the highway four times. These cases must all be fixed manually by moving the route to the highway. It can be adjusted by dragging a point on the route.
Sometimes a segment is blocked for routing. In that case plan two separate routes and manually create a side input with co-ordinate pairs on the blocked part.
Convert tracks
This sed command converts a gpx track to a csv file with one point per line, format lat,lng:
sed -e 's+trkpt+\n&+g' track.gpx | grep trkpt | sed -e 's+.*"\(.*\)".*"\(.*\)".*+\1,\2+' > track.csv
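As an alternative to the sed pipeline, the same conversion can be sketched with an XML parser, which does not depend on the order of the attributes or on how the file is broken into lines. This is not the script used in the process; the function name and the sample track are made up, and it assumes track points are `trkpt` elements with `lat` and `lon` attributes, as in standard GPX.

```python
import xml.etree.ElementTree as ET

def gpx_to_rows(gpx_text):
    """Return one 'lat,lng' string per track point in the GPX document."""
    root = ET.fromstring(gpx_text)
    rows = []
    for element in root.iter():
        # Strip the XML namespace, if any, before comparing the tag name.
        if element.tag.split("}")[-1] == "trkpt":
            rows.append(f'{element.get("lat")},{element.get("lon")}')
    return rows

SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <trk><trkseg>
    <trkpt lat="52.349576" lon="4.966966"><ele>1.0</ele></trkpt>
    <trkpt lat="52.232389" lon="6.171115"></trkpt>
  </trkseg></trk>
</gpx>"""

print("\n".join(gpx_to_rows(SAMPLE_GPX)))
```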
When updating only a few roads or newly split sections, only a few files need to be converted. This is done with a command (provided by the create_km_commands.sh script) like this:
ls *gpx | sed -e 's+\(.*\).gpx+grep "\1" ../../1_split/att07/convert_gpx_07.sh+' | sh | sh
Explanation:
Convert the list of gpx files into a script that selects the given files from the convert script
Pipe through sh first time to actually get these commands
Pipe through sh again to execute them, i.e. convert to csv files
Interpolate
Haversine function
The formula to compute the distance between two lat-lng co-ordinate pairs is explained on Wikipedia.
Implementation in perl: perlmonks
Implementation in python: stackoverflow
Implementation in javascript: stackoverflow
Many other languages: rosettacode
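For reference, a compact sketch of the haversine distance in Python. The Earth radius of 6371 km is a commonly used mean value and an assumption here; the linked implementations may use a slightly different constant.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius in metres

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# One degree of longitude along the equator, roughly 111.2 km.
print(haversine_m(0.0, 0.0, 0.0, 1.0))
```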
Algorithm
A script interpolates a single track to compute the required km or mileage points for one road section.
The algorithm can be summarised as follows:
Read the known points into memory
Read the track into memory
Project and insert known values into the track
Compute the total travelled distance from the start to each of the points on the track
Adjust offset based on known values (see below)
Interpolate distances
Print output file with km values
Print statistics
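Two of these steps, computing the travelled distance to each track point and finding the co-ordinates at a given travelled distance, can be sketched as follows. This is an illustration under simple assumptions (linear interpolation in latitude and longitude within a segment, a track of at least two points), not the actual script; all function names are hypothetical.

```python
from bisect import bisect_right
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres (assumed Earth radius 6371 km)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    a = (sin((phi2 - phi1) / 2) ** 2
         + cos(phi1) * cos(phi2) * sin(radians(lng2 - lng1) / 2) ** 2)
    return 2 * 6371000.0 * asin(sqrt(a))

def cumulative_distances(track):
    """Travelled distance in metres from the start to each track point."""
    total = [0.0]
    for (lat1, lng1), (lat2, lng2) in zip(track, track[1:]):
        total.append(total[-1] + haversine_m(lat1, lng1, lat2, lng2))
    return total

def point_at_distance(track, cumulative, distance):
    """Co-ordinates at a travelled distance; extrapolates beyond either end."""
    i = bisect_right(cumulative, distance) - 1
    i = max(0, min(i, len(track) - 2))          # index of the segment to use
    segment = cumulative[i + 1] - cumulative[i]
    t = 0.0 if segment == 0 else (distance - cumulative[i]) / segment
    return (track[i][0] + t * (track[i + 1][0] - track[i][0]),
            track[i][1] + t * (track[i + 1][1] - track[i][1]))

track = [(0.0, 0.0), (0.0, 0.01), (0.0, 0.02)]   # made-up track
cumulative = cumulative_distances(track)
print(cumulative)
print(point_at_distance(track, cumulative, cumulative[-1] / 2))
```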
Details
Suppose we have found signs at km 0 and at km 101 but the distance found on openrouteservice.org is only 100 km. Therefore we have to 'blow up' the distance by a factor 1.01 to get the 'administrative' distance (though this may deviate considerably from the 'real' distance, see e.g. in France where an administrative kilometer can be more than 1800 meters long in reality).
Since OSM roads are represented by straight segments, in practice the travelled distance using openrouteservice is slightly too small (the same is true for Google Maps). The average correction factor for OSM found in Europe is around 1.002, which means that the deviation is about 0.2 %.
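The correction factor between two consecutive known points is the administrative distance divided by the track distance, and the position of an intermediate km value follows by linear interpolation. A sketch, assuming the known points are given as (track distance in m, km value) pairs sorted by track distance and that consecutive km values differ; the function names are hypothetical.

```python
def correction_factors(known):
    """Administrative distance divided by track distance, per pair of known points."""
    return [abs(km2 - km1) * 1000.0 / (d2 - d1)
            for (d1, km1), (d2, km2) in zip(known, known[1:])]

def track_distance_for_km(known, km):
    """Track distance (m) at which the value km is reached, or None if outside."""
    for (d1, km1), (d2, km2) in zip(known, known[1:]):
        if min(km1, km2) <= km <= max(km1, km2):
            return d1 + (km - km1) / (km2 - km1) * (d2 - d1)
    return None

# Signs at km 0 and km 101, but only 100 km of track between them.
known = [(0.0, 0.0), (100000.0, 101.0)]
print(correction_factors(known))
print(track_distance_for_km(known, 50.5))
```

Because the comparison uses the minimum and maximum of the two km values, the same lookup also works on sections where the values decrease along the track.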
There are a number of edge cases that have to be dealt with, see also checks below:
Projecting onto a line is not enough, at a bend we can have a case where the point lies in a cone outside the strips perpendicular to the two nearest line segments:
Similarly, there can be two segments onto which the point can be projected:
A very bendy road can have two or more segments for which the strip perpendicular to it contains the point to be projected
Two points can be projected onto the same point on the track
This last case should not happen since two known points should not be so close together, and this is currently not dealt with correctly.
Kilometer values can increase on one section and decrease elsewhere on the same road in the same direction
If the section extends beyond the first or last known points, locations not between known points are extrapolated. This will generally give lower accuracy than interpolation and should be avoided as much as possible.
Checks
The log file per section lists all applied correction factors. The main check is that these correction factors are not too extreme. A factor of e.g. 1.2 on a 200 m stretch (i.e. the real distance between 5.0 and 5.2 is 240 m according to openrouteservice.org) is possible, but the same factor on a 100 km stretch means that something must be wrong.
In practice this usually means that there is a discontinuity that was not previously noticed.
A gap of 20 km is usually easy to find because it leads to a very high correction factor but a mismatch of 50 or 100 m is hard to find.
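Since the same factor is harmless on a short stretch and alarming on a long one, one possible way to screen the log is to convert each factor into an absolute deviation in metres. This is a sketch of that idea only; the function name, the input layout and the 100 m threshold are assumptions, not values taken from the actual scripts.

```python
def suspicious_stretches(stretches, max_deviation_m=100.0):
    """Return (name, deviation in m) for stretches whose deviation exceeds the threshold.

    stretches: list of (name, stretch length in m, correction factor).
    """
    flagged = []
    for name, length_m, factor in stretches:
        deviation_m = abs(factor - 1.0) * length_m
        if deviation_m > max_deviation_m:
            flagged.append((name, round(deviation_m)))
    return flagged

stretches = [("5.0 - 5.2", 200.0, 1.2), ("0 - 100", 100000.0, 1.2)]
print(suspicious_stretches(stretches))
```

A threshold like this will still miss the 50 or 100 m mismatches mentioned above; it only catches the large gaps.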
Discontinuities exist for various reasons:
An administrative boundary (for example in France, kilometerage of national routes is per departement).
Realignment of the road made it longer or shorter, requiring new km values on a subsection or a 'gap'. Conversely a discontinuity can be deliberately included to anticipate possible future realignments.
At a TOTSO (where the road turns off) one direction is longer than the other or the distance is calculated via the centerline of the road to the point where centerlines cross (e.g. in the Netherlands and Germany).
Various historical reasons, e.g. in the past there was a different road numbering system and at some junction the road used to turn off while it now continues straight.
In some countries kilometer values are simply stretched or compressed in case of realignments. This is especially frequent in Spain and France.
Discontinuity point on the A3 in Germany: Dreieck Heumar, 140855 m from the border with the Netherlands
Do a sample check of 100 or 200 points and compute the distance between the interpolated point and the real location in streetview (or Mapillary).
As long as the result is not good enough, add known points in the middle of the longer sections.
Suppose we do a sample check of 200 points and find 21 cases with a deviation of more than 25 m. With R we find e.g.
pbinom(21,200,0.2)
[1] 0.0002364216
pbinom(21,200,0.1)
[1] 0.6483522
pbinom(21,200,0.15)
[1] 0.04149666
pbinom(21,200,0.14)
[1] 0.08906897
Iterating we find that with 95 % confidence we can say that at least 85.2 % of points are within 25 m.
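The iteration can be automated. The sketch below re-implements the binomial distribution function in plain Python (named `pbinom` after the R function, but written here from scratch) and uses bisection to find the failure rate at which the observed 21 or fewer deviations would occur with probability 0.05. The bisection helper is hypothetical and requires Python 3.8 or later for `math.comb`.

```python
from math import comb

def pbinom(k, n, p):
    """Probability of at most k successes in n trials with success probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def upper_bound_failure_rate(k, n, alpha=0.05):
    """Largest failure rate still compatible with k failures at level alpha."""
    low, high = 0.0, 1.0
    for _ in range(60):                 # pbinom decreases as p increases
        mid = (low + high) / 2
        if pbinom(k, n, mid) > alpha:
            low = mid
        else:
            high = mid
    return high

rate = upper_bound_failure_rate(21, 200)
print(f"at most {100 * rate:.1f} % of points deviate by more than 25 m")
print(f"at least {100 * (1 - rate):.1f} % of points are within 25 m")
```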
Overview of checks:
Combine files
The interpolated files are combined into a single csv file with header.
Format
The output of the combine script has the following fields:
Sections
When the spreadsheet looks like this: Then the combine script will by default assume that the A1 has two sections, and name these 'Atown - Btown' and 'Ctown - Dtown'.
The split script will also summarise the A1 as going from Atown to Dtown in the description HTML file.
If there are other descriptions like 'A bypass' then the destination will be left blank, but if we also have the opposite direction from Dtown to Ctown further below, because that carriageway is far away from the other direction or there is some other special case, then this will be correct in the combine script but the description HTML file will say that the A1 goes from Atown to Ctown, which is wrong.
Checks
The relevant files are copied to the combine directory.
Before running the combine script, the following checks are run:
Check that all required files are present
Check that all copied files are used by the combine script
Check that the combine script uses every file only once
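These three checks amount to comparing three lists of file names. A sketch of that comparison follows; how the lists are obtained (for example by listing the combine directory and scanning the combine script) is left out, and the function name and file names are made up for illustration.

```python
from collections import Counter

def check_combine_inputs(required, copied, used_by_script):
    """Compare required, copied and used file names and report the problems."""
    copied_set = set(copied)
    counts = Counter(used_by_script)
    return {
        "missing": sorted(set(required) - copied_set),
        "unused": sorted(copied_set - set(counts)),
        "used_more_than_once": sorted(name for name, n in counts.items() if n > 1),
    }

# Made-up file names.
report = check_combine_inputs(
    required=["A1_1.csv", "A1_2.csv", "A2_1.csv"],
    copied=["A1_1.csv", "A1_2.csv", "A4_1.csv"],
    used_by_script=["A1_1.csv", "A1_2.csv", "A1_2.csv"],
)
print(report)
```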
It is nice to show integer km values in a different colour. This command, applied to the output of the combine script, adds a field for colour, also distinguishing between motorways (with A number) and other roads.
perl -e 'open(I, "input.csv"); while (<I>) { chomp; @f = split(/,/); if ($f[7] =~ /A/) { if ($f[2] =~ /\.0/) { $int = 0 } else { $int = 1 } } else { if ($f[2] =~ /\.0/) { $int = 2 } else { $int = 3 } } print "$_,$int\n" }' |\
sed -e 's+unit,3+unit,colour+' > output.csv
Upload
Fusion table
So far all files have been uploaded to Fusion Tables. Google will shut this service down in December 2019.
Beside this the first few data sets were also added to a page in HTML format. This is superfluous since the user can simply click on 'Rows' and download the table as csv.
Maps API
An alternative would be the Google Maps API.
Earth Engine
Alternatively, Earth Engine could be used.
Links
Mileage and kilometerage index page
Haversine formula on Wikipedia
Highway location marker on Wikipedia
Driver Location Signs SABRE
Marcel Monterie