The korea81998 TSP instance consists of walking times, measured in seconds, between 81,998 bars in South Korea. We will refer to the walking times as "distances", to match the usual TSP language. The distances in korea81998 are symmetric, that is, for any pair of shops A and B, the distance from A to B is the same as the distance from B back to A.
The first step in building an example of the TSP is to collect the locations of the points to visit. In the case of korea81998, we were fortunate to have available a very nice, large data set for a TSP challenge. This is thanks to Dr. Sang-il Oum of the Institute for Basic Science (IBS) in Daejeon, Korea. Sang-il is the Chief Investigator at one the world's leading centers for the study of discrete mathematics.
I was scheduled to give a TSP lecture in March 2024 at KAIST, also located in Daejeon, and I was looking for a data set of locations to build a local Daejeon TSP tour. In December 2023, Sang-il wrote "Do you need the data set for all pubs in Korea, made by the Korean National Police Agency? It's latest file costs 1,000 KRW and has 90,680 entries." Wow. After first verifying that 1000 KRW is less than 1 US dollar (it was good to check that the exchange rate didn't go the other way around), I wrote back "Thanks!"
In early 2024 I extracted from the Sang-il-provided data set a list of 2,104 bars in Daejeon. If you are interested, you can view the spot in the KAIST talk were I present the Daejeon-bar walk. It is also described in the More Tours page.
At the time of the KAIST lecture, I didn't even consider trying to solve the TSP for all of Korea. It was far larger than the Dutch tour we found in 2021, and that example required 97 years of computer time (if run on a single processor core of our Linux servers). But in the fall of 2024, Keld Helsgaun and I made progress on how to combine the LKH and Concorde computer codes to possibly solve large instances of the TSP, so in December 2024 we looked again at the Korean example.
The full data set contains information for 90,680 bars in Korea. But some of the entries are listed as having the same latitude-longitude coordinates. This is possibly due to places located on different floors of the same building. Having no easy way to measure the time spent on an elevator or staircase, we took only a single example from each such cluster of co-located points. This results in a collection of 81,998 distinct points.
The latitude-longitude coordinates for the 81,998 locations are given in the file korea81998.xy.
We adopted the Open Source Routing Machine (OSRM) to obtain point-to-point walking times between the bar locations. A nice set of instructions for running the OSRM can be found on the Project-OSRM GitHub page. In building the korea81998 distance table, we made the following adjustments.
The full data set, in TSPLIB format, is a 22 GByte file. This is too large to store on our university server. But if you would like to carry out experiments on korea81998, please write to bico@uwaterloo.ca and I'll attempt to place the file temporarily at a public location.
Click on the images to see high-resolution snapshots.
Our optimal korea81998 tour is listed in the following TSPLIB-format file.
The TSPLIB format for a tour lists in the "TOUR_SECTION" the indices of the points from the Lat-Lng file as they appear in the tour; so a permutation of the numbers from 1 to n.
The locations of the bars were downloaded from a database maintained by the Korean National Police Agency.
The table of point-to-point walking times was created with the Open Source Routing Machine (OSRM).
The tour map-drawings were created with the Leaflet open-source JavaScript library for mobile-friendly interactive maps and make use of map tiles built by OpenStreetMap, by Carto Basemaps and by Stadia Maps.