UK24727

Data for the UK pubs TSP.




The Pubs Galore site states: "We have the largest list of pubs in the UK kept up to date by our dedicated members." The list is indeed extensive, totaling more than 80,000 locations.

For our data set, we pruned the full list. First, based on names, we removed establishments that seemed more like restaurants than public houses. So TGI Friday's was out and the The Plough and Horses was in. Second, we took only the first location of pubs that had a number of branches in the same town. Third, we removed any pub that was marked as permanently closed by the members of Pubs Galore. Finally, we removed several pubs that could not be reached by foot, such as those located in airport terminals.

We were left with 24,727 locations, covering England (22,044), the Ilse of Man (19), Northern Ireland (240), Scotland (1,186), and Wales (1,238). With this much data, there are bound to be a few errors, such as unmarked pub closings. But the Pubs Galore team has created a very accurate data set and we are in their debt for providing the material for an ultimate UK pub crawl.


Interactive map

A good way to view the data is to have a look at a Google Maps drawing giving a marker for every pub location. That is a lot of markers to crowd onto a map of the UK, but if you zoom in you can see the distribution of points. There are sparse regions in the forest parks of northern England and in large parts of Scotland, but the big picture is that there are a lot of pubs in the UK!

Screen image of map markers Click image for an interactive map.

If you are unable to view the interactive map, here is a montage of screen shots showing the pub locations, color coded by region.


Latitude and longitude

Having the list of pub names adds spice to the TSP challenge, but when it comes to the mathematics, all we need are the latitude and longitude coordinates for the stops. This is the information that is used to query Google Maps for walking distances and for computing the geometric distance between pairs of pubs.

Here is a link to a text file with the latitude and longitude (in decimal form) for all 24,727 pub locations.


Travel distances

In the pubs problem, we assume the distance to walk from the The Fiddler's Elbow to The Bald Face Stag is the same as the distance to walk back from The Stag to The Elbow. This is quite reasonable for walking, although for innner-city travel it sometimes does not hold for driving (due to one-way streets).

We also make the assumption that the walking distance between two pubs is no shorter than the geometric distance to fly from one pub to the other. To be precise, we define the distance from pub A to pub B to be the maximum of the geometric distance between the latitude and longitude coordinates for A and B and the walking distance from A to B provided by Google Maps.

In our computation, the geometric distance between two pubs is specified by an approximation to the great circle distance on the Earth (treating the Earth as a ball). This is a variation of the TSPLIB GEO-norm, adjusted to handle decimal degrees and scaled to provide the distance in meters rather than in kilometers. The distance function is called the GEOM-norm; a computer code to compute the GEOM distance between two points can be found on the World TSP page.

The UK27427 data set, in TSPLIB format, is given here. This data set allows the geometric distances to be computed, for example, by the Concorde TSP code.

To fully specify the problem we solved, we need also to report the Google distances we used, since Google's algorithms adjust their routes over time, taking into consideration new construction and better data sets for the road networks.

Here is a gzipped list of the 639,164 pairs of points for which we queried Google Maps. Each line gives the indices of two points, using the order specified in the TSPLIB data set, where the points are numbered from 0 up to 24,726. Following the pair of numbers, on each line, is the reported walking distance in meters.

To reproduce the UK24727 problem, use the maximum of the GEOM distance and the walking distance for each of the 639,164 pairs in the file, and use the GEOM distance for each of the remaining 305,060,737 pairs. We did not need to query Google for these additional pairs, since our final tour does not use any of these direct legs. (Since the defined distance is the maximum, having the Google distance can only increase the length of any tour that uses one of the 305,060,737 pairs.)

Please be warned that the travel data was obtained in the fall of 2015. If you need to hit a few UK pubs, be sure to check out Google Maps for up-to-date distances and directions.


Tour data

Last, but not least, here is the optimal tour as a list of indices, using the order specified in the TSPLIB data set.