Connecting Optical Morphology, Environment, and HI Mass Fraction for Low-Redshift Galaxies Using Deep Learning
A galaxy's cold gas content can determine its current and future star formation properties. Most of that cold gas in present-day galaxies is in the form of neutral atomic hydrogen (HI), which radiates weakly through a 21-cm emission line. Because this signal is difficult to observe, many heuristics have been developed to estimate the HI mass fraction (a galaxy's HI mass normalized by its stellar mass). This work aims to improve on existing approaches by leveraging all of the information in optical imaging.
We use deep convolutional neural networks to encode SDSS gri images (spanning 224 x 224 pixels, or roughly 100" x 100") of optical counterparts to HI detections in the low-redshift Universe. We can predict HI mass fractions to within 0.23 dex RMSE for the SDSS x ALFALFA data set using imaging alone. When the CNN is also used for pattern recognition, the combined result outperforms all other machine learning regression methods (e.g., as low as 0.20 dex scatter for an independent ALFALFA data set).
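For reference, the HI mass fraction and the RMSE in dex quoted above can be computed along the lines of the minimal sketch below. The function names are illustrative and are not taken from this repository. The sketch assumes both masses are in the same linear units and that predictions and targets are already expressed as log10 mass fractions.

```python
import math


def log_hi_mass_fraction(m_hi, m_star):
    """Return log10(M_HI / M_star).

    Both masses must share the same linear units (e.g. solar masses).
    """
    return math.log10(m_hi / m_star)


def rmse_dex(predicted, true):
    """Root-mean-square error between two sequences of log10 values, in dex."""
    if len(predicted) != len(true) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, true)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the error is measured in log space, an RMSE of 0.23 dex corresponds to a typical multiplicative error of about 10^0.23, or roughly a factor of 1.7, in the linear mass fraction.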
Results can be found in the paper: see ADS or arXiv link.
Download this repository by running
git clone https://github.com/jwuphysics/HI-convnets.git
cd HI-convnets
Note that the most recent results are found in notebook/updates, while previous results can be found in notebook.
PyTorch and fastai >=2.0 are required to run this code. At the time of this writing, the fastai2 library is undergoing large changes, and will eventually supersede the fastai repository.
Imaging data were queried from the SDSS DR14 image cutout service using a download script similar to the one in our metallicity prediction deep convnet. See, for example, src/get_sdss_cutouts.py. Galaxy positions for the training data set were taken from the ALFALFA 40% catalogs (Haynes et al. 2011) crossmatched to the SDSS Main Galaxy Sample. The ALFALFA 100%, xGASS, and NIBLES catalogs were used as test data sets.
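As a rough illustration of how a cutout request can be formed, the sketch below builds a URL for a 224 x 224 pixel image covering roughly 100" on a side. It is not the repository's download script. The endpoint path, the parameter names, and the helper name cutout_url are assumptions based on the public SkyServer cutout interface, so check them against src/get_sdss_cutouts.py before relying on them.

```python
from urllib.parse import urlencode

# Assumed DR14 SkyServer cutout endpoint; verify against the actual download script.
SDSS_CUTOUT_URL = "http://skyserver.sdss.org/dr14/SkyServerWS/ImgCutout/getjpeg"


def cutout_url(ra, dec, size_pix=224, fov_arcsec=100.0):
    """Build a cutout URL for a square image centered on (ra, dec) in degrees.

    The pixel scale (arcsec per pixel) is chosen so that size_pix pixels
    span fov_arcsec arcseconds.
    """
    scale = round(fov_arcsec / size_pix, 4)
    params = {
        "ra": ra,
        "dec": dec,
        "scale": scale,
        "width": size_pix,
        "height": size_pix,
    }
    return f"{SDSS_CUTOUT_URL}?{urlencode(params)}"
```

Calling cutout_url(180.0, 0.5) yields a URL with a pixel scale of about 0.446" per pixel. The function only constructs the request, so fetching and saving the image (and any rate limiting) is left to the caller.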
@ARTICLE{2020ApJ...900..142W,
author = {{Wu}, John F.},
title = "{Connecting Optical Morphology, Environment, and H I Mass Fraction for Low-redshift Galaxies Using Deep Learning}",
journal = {\apj},
keywords = {Galaxies, Galaxy evolution, Galaxy processes, Galaxy environments, Interstellar atomic gas, Interstellar medium, Astronomy data analysis, Astronomy data modeling, Astronomy data visualization, Convolutional neural networks, Neural networks, 573, 594, 614, 2029, 833, 847, 1858, 1859, 1968, 1938, 1933, Astrophysics - Astrophysics of Galaxies, Astrophysics - Instrumentation and Methods for Astrophysics},
year = 2020,
month = sep,
volume = {900},
number = {2},
eid = {142},
pages = {142},
doi = {10.3847/1538-4357/abacbb},
archivePrefix = {arXiv},
eprint = {2001.00018},
primaryClass = {astro-ph.GA},
adsurl = {https://ui.adsabs.harvard.edu/abs/2020ApJ...900..142W},
adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
If you have any questions or comments, please reach out via email!
This work began during the MIAPP Programme on Galaxy Evolution and was inspired by conversations with Mike Jones (IAA) and Luke Leisman (Valparaiso). Conversations with @jegpeek were super helpful. The anonymous ApJ referee also provided lots of useful comments and feedback. Some of this work was also done at the Interstellar Institute meeting, SO-STAR. The Fastai course and software developed by Jeremy Howard et al. have been immensely useful for this work. Likewise, the Grad-CAM implementation by @anhquan0412, and combined RAdam + LookAhead optimizer (aka Ranger) by @lessw2020 are used in this work.