Science Data Challenge paper shares insights (and code)
BY CASSANDRA CAVALLARO (SKAO)
The SKAO’s Science Data Challenge series is yielding valuable results for the science community in how to tackle the volume and complexity of future SKA data. A recent paper, led by the SKAO Science team and published in the Monthly Notices of the Royal Astronomical Society, details the full results and lessons learned from the second Science Data Challenge, which ran in 2021. It asked teams to find and measure the neutral hydrogen content of galaxies in simulated SKA-Mid telescope data. The challenge received generous support from eight international supercomputing facilities, which provided dedicated resources for teams.
The paper features contributions from more than 100 participants, representing over 40 institutions in 18 countries. Along with details of how the SKAO’s Science team simulated SKA-Mid’s view of the neutral hydrogen sky, there are descriptions of the techniques – both new and established – that teams used to tackle the challenge. It also includes links to the source code for the simulations and some of the teams’ methods.
The authors note that a combination of methods, and a collaborative, multidisciplinary approach, will be key to exploiting huge astronomical data sets like those the SKAO will create. The winning strategy combined predictions from two independent machine learning techniques to yield a 20% improvement in overall performance.
ABOVE: The 3D data cube analysed in the challenge contained 2,683 sources. The challenge’s 3D data cube is a series of stacked radio images, each reflecting a different frequency. It shows galaxies across a distance of 4 billion light years. Credit: SKAO
“By sharing the findings we’re aiming to grow our collective knowledge and hopefully inform those beyond our immediate community, in a way that could feed other innovations,” said SKAO Scientist Dr Philippa Hartley, who co-led the challenge. “That’s why it was really important for us to publish the source code as well, as part of our goal to make science more open and more accessible.”
Meanwhile the latest challenge – number three in the series – has now concluded. It tasked participants with the recovery of the most distant neutral hydrogen signatures from a simulation of the SKA-Low view of Cosmic Dawn. Watch out for the results in the next issue of Contact!
SKAO Science Data Challenge 2: map of worldwide participation
Map legend: participants (1–5, 6–10, 11–20, 20+); computing facilities

Computing facilities:
IRIS (STFC), UK
CSCS, Lugano, Switzerland
INAF, Rome, Italy
GENCI-IDRIS, Orsay, France
ENGAGE SKA - UCLCA, Aveiro & Coimbra, Portugal
China SRC-proto, Shanghai, China
IAA-CSIC, Granada, Spain
AusSRC & Pawsey, Perth, Australia

The challenge in numbers:
Teams analysing 1 TB of astronomical data
280 registered participants in 22 countries
8 supercomputing centres
15 million CPU core hours* and 15 TB RAM available for teams
Contact, November 2023