Françoise COMBES — Collège de France & Observatoire de Paris — Deep machine learning in Cosmology


Deep machine learning in Cosmology Chaire Galaxies et Cosmologie Abell 2218

Françoise Combes, November 2018


The exponential increase of data

Area of surveys

Year + IFUS

From Brinkman, Huertas-Company


Also in numerical simulations
Sizes of the simulations: number of particles or resolution elements
+ Physical processes

Illustris project Genel et al 2014


Number of publications on machine learning
NASA-ADS, refereed publications, with keywords « Machine learning neurons »
Recent explosion in the last 4 years

Error(%)

Since 2012, machines separate Cats and dogs! Russakovsky et al 2014 Image Challenge


Powerful telescopes in space and on the ground
Euclid, ESA satellite, launch 2021. Main goal: Dark Energy. 15 000 deg², 12 billion galaxies
WFIRST from NASA: Dark Energy & exoplanets (~2027-30?)
LSST, Large Synoptic Survey Telescope: wide-field 8 m in Chile. 20 TB/night, all sky observed every 3 days, millions of alerts/night
SKA, Square Kilometer Array: antenna arrays in Australia and South Africa. Several frequencies, λ = 2 cm to 6 m. Petabytes/sec, 100 Petaflops machines



Big data management
A huge challenge
SKA: Petabytes/sec, Petaflops machines working continuously (10⁸ PCs). A few exabytes/h: dishes = 10x global internet traffic, phased arrays = 100x global internet traffic!
LSST: more than half of the cost! 1-2 million alerts per night, available in 60 sec. 15 Tbytes/night. Every 3 days, all sky observed (20 000 deg²). 3200 Mpixels, 10 deg², 15 s/exposure
Euclid: 100 Gb/day, but spectroscopy from the ground

LSST





Classification of Hubble-Sandage (few 100)

High-z

Ellipticals … Bulge-dominated … Spirals … Irregulars



From Hubble sequence to red sequence A change of paradigm

Main parameter: SFR
Also SFH, dust, age, metallicity
2 formation mechanisms
Critical mass 3×10¹⁰ M☉

Blue Color

Color-Magnitude diagram 150 000 galaxies in the SDSS

Red

Baldry et al 2004, Schawinski et al 2014


Galaxy Zoo: citizen science

The first part received millions of classifications (since 2007)
SDSS, then CANDELS, DECaLS, computer images, GAMA, KiDS
Now Galaxy Zoo-4 (7/2017): 1 million galaxies to classify

Archives of publications (55)


Angular Momentum
One of the main factors in the fate of a galaxy, in addition to environment (over-density)


Galaxy Classification methods CNN: Convolutional Neural Network

Entering into the black box
First step, first layer: 32 filters of the image
Each filter is contrast-normalized individually
See e.g. filters for edge detection, and how it varies with color
Dieleman 2015


Convolution
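
As a minimal sketch of what one convolutional filter does, the NumPy code below slides a 3x3 kernel over a toy image. It assumes a hand-picked Sobel kernel rather than the learned filters of Dieleman 2015, and the function name `convolve2d_valid` is illustrative only.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Cross-correlate a 2-D image with a kernel ('valid' mode, as in CNN layers)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half (one vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Sobel kernel that responds to horizontal intensity gradients (vertical edges).
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

feature_map = convolve2d_valid(image, sobel_x)
```

The feature map is non-zero only where the patch straddles the edge. In a CNN the kernel values are not fixed by hand but adjusted during training.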


Downsampling, Pooling

Dominguez-Sanchez 2018
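
The downsampling step can be sketched as max pooling over non-overlapping blocks. This is a generic illustration in NumPy, not the specific pooling configuration of Dominguez-Sanchez 2018; the name `max_pool2d` and the block size of 2 are assumptions.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Downsample by taking the maximum over non-overlapping size x size blocks."""
    h, w = feature_map.shape
    h_trim, w_trim = h - h % size, w - w % size  # drop ragged edges, if any
    blocks = feature_map[:h_trim, :w_trim].reshape(
        h_trim // size, size, w_trim // size, size
    )
    return blocks.max(axis=(1, 3))

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(feature_map)  # 4x4 map reduced to 2x2
```

Each output pixel keeps only the strongest response in its block, which halves the resolution while preserving the detected features.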


After several layers..

Dieleman 2015


Results on Galaxy Classification
Application to 600 000 galaxies from SDSS
[Figure panels: Pbulge, Pcigar, Pedge]

Dominguez-Sanchez et al 2018


Test on numerical simulations
Morphology, types, Sersic index, etc.
Comparable results, but the CNN runs 10⁴ times faster than GALFIT: 1 sec instead of 3.5 h for 1000 objects

Tuccillo et al 2018


How big should the sample be?
Normally, the bigger the better, especially for complex shapes to recognize (in everyday life: landscapes, animals, scenes...)
However, galaxies are simple objects!

Huertas-Company 2018


Possible to enlarge training sets
The morphology of the object should not depend on orientation
Rotations and translations are possible, to enlarge the set (+ crops)

Dieleman 2015
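
A minimal sketch of this kind of augmentation, assuming square cutouts and restricting rotations to multiples of 90 degrees for simplicity (the function `augment` and its parameters `crop` and `max_shift` are illustrative, not taken from Dieleman 2015):

```python
import numpy as np

def augment(image, rng, crop=4, max_shift=2):
    """Return one randomly rotated, flipped, shifted and cropped copy of a square image."""
    out = np.rot90(image, k=int(rng.integers(0, 4)))   # rotation by 0, 90, 180 or 270 degrees
    if rng.random() < 0.5:
        out = np.fliplr(out)                            # mirror flip
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))    # small translation (wraps at the edges)
    return out[crop:-crop, crop:-crop]                 # central crop discards the wrapped border

rng = np.random.default_rng(0)
galaxy = rng.random((32, 32))                           # stand-in for a real galaxy cutout
augmented_set = [augment(galaxy, rng) for _ in range(8)]
```

Because `max_shift` is smaller than `crop`, pixels that wrap around during the translation never reach the cropped output.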


Galaxy Zoo 2

Willett et al 2013


Generative Adversarial Network Degrading the image, to train the machine

Schawinski et al 2017
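
The degradation step can be sketched as follows: blur a clean image with a broader PSF and add noise, giving a (degraded, clean) pair for training. This is only an illustration under simple assumptions (Gaussian PSF, Gaussian noise, square image, periodic boundaries); the function names and parameter values are hypothetical and do not reproduce the pipeline of Schawinski et al 2017.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalised 2-D Gaussian centred on pixel (size // 2, size // 2)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade(image, sigma_psf, noise_std, rng):
    """Blur with a Gaussian PSF (periodic FFT convolution) and add Gaussian noise."""
    psf = gaussian_kernel(image.shape[0], sigma_psf)
    psf = np.fft.ifftshift(psf)  # move the PSF centre to pixel (0, 0)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(psf)))
    return blurred + rng.normal(0.0, noise_std, size=image.shape)

rng = np.random.default_rng(0)
clean = 100.0 * gaussian_kernel(32, 1.0)   # compact toy "galaxy"
degraded = degrade(clean, sigma_psf=2.0, noise_std=0.01, rng=rng)
training_pair = (degraded, clean)          # network input and target
```

The blur conserves total flux but lowers the peak, so the network has to learn to restore the sharp structure from the degraded input.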


GAN performance

GAN trained on 4500 SDSS galaxies → promising for billions


Three cases of failures (rarity, noise)


PSF-GAN
To subtract AGN point sources automatically
Much faster than GALFIT and with far fewer errors!
Point sources added artificially for training
Less sensitive to deformation of the PSF
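
Building a training pair by adding an artificial point source can be sketched as below. The Gaussian PSF, the `contrast` parameter (point-source flux in units of galaxy flux) and the function names are assumptions made for illustration, not the actual PSFGAN code.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Normalised 2-D Gaussian centred on the image, a stand-in for a real PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def make_training_pair(galaxy, psf, contrast):
    """Return (image with point source, original image).

    `contrast` is a hypothetical parameter: point-source flux / galaxy flux.
    """
    point_source = contrast * galaxy.sum() * psf
    return galaxy + point_source, galaxy

rng = np.random.default_rng(1)
galaxy = rng.random((33, 33))              # stand-in for a real galaxy cutout
psf = gaussian_psf(33, sigma=1.5)
with_ps, target = make_training_pair(galaxy, psf, contrast=2.0)
```

Since the original galaxy is known, the network can be trained to recover `target` from `with_ps` without any manual subtraction.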


With HST at high resolution, and known PSF, the quasar is subtracted to see the underlying galaxy

Martel et al 2003


GAN versus parametric tool GALFIT

Stark et al 2018


GAN less sensitive to PSF broadening

Stark et al 2018


PSFGAN uses its knowledge of galaxies

Point source added to test, residuals compared
The visual structure of galaxies helps PSFGAN with point-source subtraction


Use of Fader networks Manipulating images with sliding attributes, Lample et al 2017

The image is first encoded to a latent representation. The attributes are selected and a decoder is trained to rebuild the image with other attributes → Many applications in galaxy evolution research


Exploring galaxy evolution with generative models

Schawinski et al 2018


Test models to quench star formation
What are the salient characteristics changing from A to B?
How does the evolution proceed from the blue cloud to the red sequence?
M(halo), environment, sSFR, gas mass, etc.
Schawinski et al 2018

The machine selects sSFR over dust fader to make a satellite


Weak lensing & cosmological parameters
The primordial fluctuations seen in the Cosmic Microwave Background are Gaussian: the average intensity and the dispersion as a function of scale (the power spectrum) are sufficient to describe them
However, lensing produces non-Gaussianities
Also non-Gaussianity index → seed of cosmic structures, inflation
Gupta et al 2018, Deep learning
But with a CNN of different architecture, it is possible to do better
Ribli et al 2018, CNN
Ωm = 0.26, σ8 = 0.8


Weak lensing and machine learning
Cosmic shear on 10⁶ galaxies: CFHTLenS, DES, and KiDS-450
Future: 10⁹ galaxies, Euclid, SKA, LSST, WFIRST...

Not only 2-point correlations! Haiman 2018


Entering the CNN black box
In the learning phase, the convolution kernels and the corresponding weights are progressively changed, to minimise the loss function
The network has learned to use the shapes of lensing peaks, and their gradients
→ Ribli et al 2018 propose a new statistic: the number of peaks with a given gradient, which out-performs all previous statistics
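
To make the notion of a lensing peak concrete, here is a minimal sketch that finds local maxima in a toy convergence map and counts them by height. It does not reproduce the gradient-based statistic of Ribli et al 2018; the toy map, the bin edges and the function name `find_peaks` are assumptions.

```python
import numpy as np

def find_peaks(kappa):
    """Boolean mask of interior pixels strictly higher than their 8 neighbours."""
    h, w = kappa.shape
    centre = kappa[1:-1, 1:-1]
    is_peak = np.ones_like(centre, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = kappa[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
            is_peak &= centre > neighbour
    mask = np.zeros_like(kappa, dtype=bool)
    mask[1:-1, 1:-1] = is_peak
    return mask

# Toy convergence map with two isolated peaks of different height.
kappa = np.zeros((7, 7))
kappa[2, 2] = 1.0
kappa[4, 5] = 0.5

peak_mask = find_peaks(kappa)
heights = kappa[peak_mask]
counts, edges = np.histogram(heights, bins=[0.0, 0.75, 1.5])  # peak counts per height bin
```

The histogram of peak heights is the kind of summary statistic that can be compared between simulated maps with different cosmological parameters.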


Galaxy metallicity
Goal: determine the metallicity Z = 12 + log(O/H) from only 3 bands
Colors gri from SDSS (128x128 images), better with high resolution
Excellent mass-metallicity relation
It seems that the CNN has learned a way to deduce Z from galaxy morphology
96 000 images, CNN with 34 layers

Wu & Boada 2018


Galaxy mass predicted
RMSE: Root Mean Square Error
NMAD: Normalized Median Absolute Deviation (insensitive to outliers)

Wu & Boada 2018
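
The two error measures can be sketched in a few lines of NumPy. The NMAD below uses the common definition 1.4826 × median(|r − median(r)|), which is assumed here rather than taken from Wu & Boada 2018; the residuals are made-up toy numbers with one deliberate outlier.

```python
import numpy as np

def rmse(pred, true):
    """Root mean square error of the residuals."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def nmad(pred, true):
    """Normalised median absolute deviation: 1.4826 * median(|r - median(r)|)."""
    r = pred - true
    return float(1.4826 * np.median(np.abs(r - np.median(r))))

# Toy data: four small residuals and one large outlier.
true = np.zeros(5)
pred = np.array([0.1, -0.1, 0.1, -0.1, 10.0])

print(f"RMSE = {rmse(pred, true):.3f}")
print(f"NMAD = {nmad(pred, true):.3f}")
```

The single outlier dominates the RMSE, while the NMAD stays close to the scatter of the other four points.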


Galaxy evolution model Semi-analytical models of galaxy evolution since z=4, with radial transport of gas and stars within galactic discs (axisym)

Even in the axisymmetric case, there are many parameters: gas content, fH2, SFR, M*, metallicity (Z*, Zg), radius, surface density, velocity, etc. → Neural network
Forbes et al 2018


Too large a number of parameters → The neural network quickly returns the physical parameters of the various models, to be compared to observations

Forbes et al 2018


Modified gravity f(R), + sterile neutrinos
Discrimination between a dozen models, through a CNN
From a series of numerical simulations of the models, and in particular their weak lensing maps
Classical approach: power spectrum, peak counts and Minkowski functionals are combined into a joint feature vector, to make a classical estimator of the statistics
Much better results with the CNN than with a closest-neighbour search
Size counts!

Merten et al 2018
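
The classical baseline can be sketched as a closest-neighbour search on joint feature vectors. All numbers below are invented toy values, and the model labels, function names and Euclidean distance are assumptions for illustration; this is not the estimator of Merten et al 2018.

```python
import numpy as np

def joint_feature_vector(power_spectrum, peak_counts, minkowski):
    """Concatenate the three summary statistics into one feature vector."""
    return np.concatenate([power_spectrum, peak_counts, minkowski])

def nearest_model(observed, templates):
    """Return the label of the template closest (Euclidean distance) to `observed`."""
    labels = list(templates)
    dists = [np.linalg.norm(observed - templates[name]) for name in labels]
    return labels[int(np.argmin(dists))]

# Toy templates for two hypothetical models (values are illustrative only).
templates = {
    "LCDM": joint_feature_vector(np.array([1.0, 0.8]), np.array([12.0, 3.0]), np.array([0.5])),
    "f(R)": joint_feature_vector(np.array([1.1, 0.9]), np.array([14.0, 4.0]), np.array([0.6])),
}

# A toy "observed" map whose statistics lie close to the f(R) template.
observed = joint_feature_vector(np.array([1.08, 0.9]), np.array([13.8, 4.1]), np.array([0.58]))
best = nearest_model(observed, templates)
```

In practice the components of the feature vector have very different scales and would need normalising before distances are compared.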


Modified gravity, sterile neutrinos
Discrimination between a dozen models, through a CNN

Loss function versus training epoch

Peel et al 2018

Noise influence


Summary

Deep machine learning techniques are blooming in many astronomical domains
→ Will be mandatory with future instruments (Euclid, SKA, ...)
Galaxy classification
Weak lensing, cluster finding in cosmology
IFU galaxy kinematics: CALIFA, SAMI (10³), MaNGA (10⁴), Hector (10⁵)
Modified gravity, or Dark Matter, Dark Energy


GalaxyGAN, Space.ml, ETH Zürich, K. Schawinski

