Asteroid Taxonomy with Photometric Colors

- applications of machine learning

We present the results of applying machine learning methods to asteroid taxonomy classification. The learning data are three-dimensional colors (e.g., g-i, i-z, and griz) that we define. We test and compare two machine learning methods, both of which model the distribution of asteroid colors as a mixture of Gaussian distributions (i.e., a Gaussian mixture). The first, a model-based clustering method (i.e., an unsupervised learning method), tries to identify dense structures in the multidimensional color space as homogeneous taxonomy groups without explicitly exploiting the known taxonomy samples. Its clustering results therefore require our interpretation to match them to the known taxonomy groups. The second is a semi-supervised learning algorithm, also based on the Gaussian mixture model, which uses the colors of the known taxonomy samples to find the best classification of the target data. See Roh et al. (2020, accepted to A&A) for details in the SDSS color space. On this page, we provide 3D plots of our analysis, downloadable analysis results, and simple Python scripts, with the relevant files, for applying the inferred taxonomy assignment models to newly measured asteroid colors.
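As a rough illustration of how a Gaussian mixture assigns a color vector to a group, the sketch below computes the membership probability of each mixture component for one object. The component weights, means, and covariances are made-up placeholder values, not the fitted parameters from the paper, and the function names are hypothetical; only the general mixture-model arithmetic is shown.

```python
import numpy as np

def gaussian_log_pdf(x, mean, cov):
    """Log density of a multivariate normal distribution at point x."""
    diff = x - mean
    # Solve a linear system instead of inverting the covariance explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (x.shape[0] * np.log(2.0 * np.pi) + logdet + maha)

def membership_probabilities(x, weights, means, covs):
    """Posterior probability of each mixture component given colors x."""
    log_terms = np.array([
        np.log(w) + gaussian_log_pdf(x, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    log_terms -= log_terms.max()  # guard against underflow before exponentiating
    probs = np.exp(log_terms)
    return probs / probs.sum()

# Placeholder two-component mixture in (g-i, i-z, griz); NOT the paper's values.
weights = np.array([0.5, 0.5])
means = np.array([[0.30, -0.05, 0.70],
                  [0.60, 0.05, 1.00]])
covs = np.array([0.01 * np.eye(3), 0.01 * np.eye(3)])

x = np.array([0.28, -0.06, 0.69])  # colors of one object
probs = membership_probabilities(x, weights, means, covs)
best = int(np.argmax(probs))  # index of the most probable component
print(probs, best)
```

Taking the component with the largest membership probability gives a simple hard assignment. How objects end up as "Unassigned" in our results is not reproduced in this sketch.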

Unsupervised learning results

-SDSS g, r, i, and z bands

The assignment of taxonomy can be done in two different ways: inference that combines MCMC samples with a dissimilarity matrix (hereafter, raw) and maximum a posteriori inference (hereafter, MAP).
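For readers unfamiliar with dissimilarity matrices built from MCMC output, the sketch below shows one common generic construction: the dissimilarity between two objects is one minus the fraction of MCMC samples in which they share a cluster label. This is an assumption made for illustration only; the paper should be consulted for the construction actually used in the raw inference. The label array is toy data and the function name is hypothetical.

```python
import numpy as np

def co_clustering_dissimilarity(label_samples):
    """Return D where D[i, j] = 1 - fraction of samples with label_i == label_j.

    label_samples has shape (n_samples, n_objects); each row is one MCMC
    draw of the cluster labels for all objects.
    """
    labels = np.asarray(label_samples)
    # same[s, i, j] is True when objects i and j share a label in sample s.
    same = labels[:, :, None] == labels[:, None, :]
    return 1.0 - same.mean(axis=0)

# Toy example: 4 MCMC draws of cluster labels for 3 objects.
toy_samples = np.array([[0, 0, 1],
                        [0, 0, 1],
                        [0, 1, 1],
                        [1, 1, 0]])
D = co_clustering_dissimilarity(toy_samples)
print(D)
```

Because only label equality within each draw is compared, the matrix does not depend on how the clusters happen to be numbered from one MCMC sample to the next.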

Unsupervised learning membership results from the raw inference method: C type (blue dots), S type (red dots), V type (lime green dots), X type (purple dots), and Unassigned (small black dots).

3D plot for the raw inference (S: red, C: blue, X: violet, V: green), 3D plot for the MAP inference (S: red, C: blue, X: violet, V: green)

Taxonomy assignment tables in the CSV format: Table 1 and Table 3 in the paper.

Mixture distribution parameters (in NumPy file format) (i.e., Table 2 in the paper)

Semi-supervised learning results

-SDSS g, r, i, and z bands

We identify seven taxonomy groups as recognizable outcomes in the semi-supervised learning results.

Membership results: C (blue), D (orange red), K (yellow), L (lime), S (red), V (lime green), X (purple) and Unassigned (small black).

3D plot (S: red, C: blue, X: violet, V: green, D: orange, L: grass, K: yellow)

Taxonomy assignment table in the CSV format (i.e., Table 5 in the paper)

Mixture distribution parameters (in NumPy file format) (i.e., Table 4 in the paper)

You can use our Python script with the above npy files (i.e., the mixture distribution parameters) to infer asteroid taxonomy classes for given colors.
(usage) ./classify_object_with_npy_parameter_SDSS.py -un or -semi (color1: g-i) (color2: i-z) (color3: griz)
Examples:
./classify_object_with_npy_parameter_SDSS.py -semi 0.28 -0.06 0.69
./classify_object_with_npy_parameter_SDSS.py -un 0.28 -0.06 0.69
The -un and -semi options select the unsupervised and semi-supervised learning results, respectively.
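To classify several objects in one go, the script can be called from Python. The sketch below only builds and runs the command line documented above. It assumes the script sits in the current directory, is executable, and prints its result to standard output. The output format is not assumed, so the text is returned unparsed. The helper names build_command and classify are hypothetical.

```python
import subprocess

SCRIPT = "./classify_object_with_npy_parameter_SDSS.py"

def build_command(mode, g_i, i_z, griz):
    """Build the argument list for one object; mode is 'un' or 'semi'."""
    if mode not in ("un", "semi"):
        raise ValueError("mode must be 'un' or 'semi'")
    return [SCRIPT, f"-{mode}", str(g_i), str(i_z), str(griz)]

def classify(mode, g_i, i_z, griz):
    """Run the script for one object and return its raw printed output."""
    result = subprocess.run(
        build_command(mode, g_i, i_z, griz),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Show the command that would be run for the example colors above.
print(build_command("semi", 0.28, -0.06, 0.69))
```

Calling classify("semi", 0.28, -0.06, 0.69) in a loop over a list of color triples then gives one raw output string per object.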