Slice of HYG data set



hyg109399 data


The HYG Database, constructed by David Nash, combines the Hipparcos, Yale Bright Star and Gliese catalogues. It contains explicit xyz-coordinates for each of its stars. Just what we need for a TSP challenge.

The full collection has 119,614 entries. That's great, but 10,215 of them have coordinates placing the star on a sphere exactly 100,000 parsecs from Earth, whereas each of the remaining 109,399 stars has distance less than 1,000 parsecs. We therefore consider two examples, one having the full set of 119,614 stars and the other, presented here, having the 109,399 stars with more accurate coordinates.
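The split described above can be sketched in a few lines of Python. This is only an illustration, not the script used to build the data sets: the function name `split_by_distance` and the in-memory list of (x, y, z) triples in parsecs are assumptions, and the 1,000 parsec cutoff is taken from the text.

```python
import math

def split_by_distance(points, cutoff=1000.0):
    """Split (x, y, z) triples in parsecs into (near, far) lists.

    Hypothetical helper: 'near' holds stars closer to the origin (Earth)
    than the cutoff, 'far' holds everything else.
    """
    near, far = [], []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)  # distance from Earth
        (near if r < cutoff else far).append((x, y, z))
    return near, far
```

Comparing against the cutoff, and not testing for a distance of exactly 100,000, keeps the sketch robust to floating-point error in the stored coordinates.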

Data Sets

The data for hyg109399 were extracted from Field 13 in the original hygdata_v3.csv.gz file. The HYG Database provides the following description.

"13. *X,Y,Z: The Cartesian coordinates of the star, in a system based on the equatorial coordinates as seen from Earth. +X is in the direction of the vernal equinox (at epoch 2000), +Z towards the north celestial pole, and +Y in the direction of R.A. 6 hours, declination 0 degrees."

We multiplied the coordinates by 10 to put them in units of 1/10th of a parsec. The resulting 109,399 xyz triples of coordinates are listed in each of the following two files, one star per line.

  • hyg109399.xyz, list of coordinates
  • hyg109399.tsp, the coordinates in TSPLIB format.
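As a rough sketch of how scaled coordinates could be written out in TSPLIB form, the following Python function builds the text of a 3D instance. The function name `to_tsplib` and the exact header lines are assumptions. The keywords `EDGE_WEIGHT_TYPE: EUC_3D` and `NODE_COORD_SECTION` come from the TSPLIB format, but the header of the actual hyg109399.tsp file may differ in detail.

```python
def to_tsplib(name, points_parsec, scale=10.0):
    """Return TSPLIB text for 3D points, after multiplying by `scale`.

    Assumed header layout; the real hyg109399.tsp may differ in detail.
    """
    lines = [
        f"NAME: {name}",
        "TYPE: TSP",
        f"DIMENSION: {len(points_parsec)}",
        "EDGE_WEIGHT_TYPE: EUC_3D",
        "NODE_COORD_SECTION",
    ]
    # TSPLIB numbers nodes from 1.
    for i, (x, y, z) in enumerate(points_parsec, start=1):
        lines.append(f"{i} {x * scale} {y * scale} {z * scale}")
    lines.append("EOF")
    return "\n".join(lines) + "\n"
```

For example, `to_tsplib("demo", [(1.5, -2.25, 0.5)])` produces a one-node instance whose coordinate line reads `1 15.0 -22.5 5.0`.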

To match the points of hyg109399 to the original HYG database, the following file lists, for each entry, its index into hygdata_v3.csv together with its xyz-coordinates.

  • hyg109399_names.txt

The following three files give our optimal tour.

  • hyg109399_tour.txt, a list of the integers from 1 up to 109,399, giving the order in which the stars appear in the tour,
  • hyg109399_order.txt, the points for hyg109399 permuted in optimal order,
  • hyg109399.tour, the tour in TSPLIB format.
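To check a tour against the point list, one can sum the rounded point-to-point distances around the closed cycle. The sketch below assumes the points and the 1-based order have already been read into Python lists. The function name `tour_length` is hypothetical, and the rounding follows the rule described under Computing Distances.

```python
import math

def tour_length(points, order):
    """Length of the closed tour visiting `points` in the given 1-based order."""
    n = len(points)
    if sorted(order) != list(range(1, n + 1)):
        raise ValueError("order must list each of 1..n exactly once")

    def dist(a, b):
        # Straight-line distance rounded to the nearest integer.
        return int(math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b))) + 0.5)

    total = 0
    for k in range(n):
        a = points[order[k] - 1]
        b = points[order[(k + 1) % n] - 1]  # wrap around to close the tour
        total += dist(a, b)
    return total
```

Because the coordinates are in units of 1/10th of a parsec, the returned total is in the same units.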

Computing Distances

To create an instance of the TSP, we need to specify precisely the point-to-point distances we use. For this we adopt the standard TSPLIB norm for 3D Euclidean data. This norm takes the straight-line distance between two points and rounds the resulting value to the nearest integer. In our case, the star-to-star distance is therefore measured to the nearest 1/10th parsec. Here is a simplified version of the computer code used in Concorde for the distance calculation.
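The Concorde listing itself is not reproduced here. As a rough stand-in, the rounding rule just described can be sketched in Python. The function name `euc3d_dist` is our own, and the code illustrates the TSPLIB EUC_3D rule; it is not Concorde's C source.

```python
import math

def euc3d_dist(a, b):
    """Rounded 3D Euclidean distance between points a and b.

    Illustrative sketch of the TSPLIB EUC_3D rule; not Concorde's code.
    """
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    dz = a[2] - b[2]
    # Add 0.5 and truncate, so that halves always round up.
    return int(math.sqrt(dx * dx + dy * dy + dz * dz) + 0.5)
```

The sketch uses `int(d + 0.5)` instead of Python's built-in `round`, since `round` sends exact halves to the nearest even integer. With coordinates in units of 1/10th of a parsec, a returned value of 25 corresponds to 2.5 parsecs.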

Interactive Views

Stars -- Zoom, pan, and rotate the data set to see its 3D structure. Stars are represented by twinkling points.
Light -- In this version, stars are represented as square particles, resulting in an image that is easier to render (in case you have trouble with the twinkling version).

Static Images

HYG Points Full View
HYG Points Zoom 1
HYG Points Zoom 2