Comparing sampling techniques to chart parameter space of 21 cm global signal with Artificial Neural Networks

Tripathi, Anshuman; Kaur, Gursharanjit; Datta, Abhirup; Majumdar, Suman

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/14991

Title:	Comparing sampling techniques to chart parameter space of 21 cm global signal with Artificial Neural Networks
Authors:	Tripathi, Anshuman; Kaur, Gursharanjit; Datta, Abhirup; Majumdar, Suman
Keywords:	first stars;Machine learning;reionization;Statistical sampling techniques
Issue Date:	2024
Publisher:	Institute of Physics
Citation:	Tripathi, A., Kaur, G., Datta, A., & Majumdar, S. (2024). Comparing sampling techniques to chart parameter space of 21 cm global signal with Artificial Neural Networks. Journal of Cosmology and Astroparticle Physics. Scopus. https://doi.org/10.1088/1475-7516/2024/10/041
Abstract:	Understanding the first billion years of the universe requires studying two critical epochs: the Epoch of Reionization (EoR) and Cosmic Dawn (CD). However, due to limited data, the properties of the Intergalactic Medium (IGM) during these periods remain poorly understood, leading to a vast parameter space for the global 21 cm signal. Training an Artificial Neural Network (ANN) with a narrowly defined parameter space can result in biased inferences. To mitigate this, the training dataset must be uniformly drawn from the entire parameter space to cover all possible signal realizations. However, drawing all possible realizations is computationally challenging, necessitating the sampling of a representative subset of this space. This study aims to identify optimal sampling techniques for the extensive dimensionality and volume of the 21 cm signal parameter space. The optimally sampled training set will be used to train the ANN for inference from global signal experiments. We investigate three sampling techniques: random, Latin hypercube (stratified), and Hammersley sequence (quasi-Monte Carlo) sampling, and compare their outcomes. Our findings reveal that sufficient samples must be drawn for robust and accurate ANN model training, regardless of the sampling technique employed. The required sample size depends primarily on two factors: the complexity of the data and the number of free parameters. More free parameters necessitate drawing more realizations. Among the sampling techniques utilized, we find that ANN models trained with Hammersley sequence sampling demonstrate greater robustness than those trained with Latin hypercube and random sampling. © 2024 IOP Publishing Ltd and Sissa Medialab. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
URI:	https://doi.org/10.1088/1475-7516/2024/10/041 https://dspace.iiti.ac.in/handle/123456789/14991
ISSN:	1475-7516
Type of Material:	Journal Article
Appears in Collections:	Department of Astronomy, Astrophysics and Space Engineering
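
The three sampling schemes named in the abstract (random, Latin hypercube, and Hammersley sequence) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the prime-base table, and the parameter bounds in the usage lines are hypothetical placeholders chosen for illustration, and the record does not state how the paper implements or scales its samples.

```python
import numpy as np

def random_samples(n, d, rng):
    """n independent uniform points in the unit hypercube [0, 1)^d."""
    return rng.random((n, d))

def latin_hypercube(n, d, rng):
    """Stratified sampling: each axis is split into n equal bins and
    every bin on every axis receives exactly one point."""
    # One jittered point per bin along each axis.
    pts = (np.arange(n)[:, None] + rng.random((n, d))) / n
    # Shuffle the bin order independently per axis to decorrelate axes.
    for j in range(d):
        pts[:, j] = rng.permutation(pts[:, j])
    return pts

def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv

def hammersley(n, d):
    """Hammersley point set: first coordinate is i/n, the remaining
    d-1 coordinates are radical inverses in successive prime bases."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]  # enough for d <= 11
    if d - 1 > len(primes):
        raise ValueError("extend the prime table for higher dimensions")
    pts = np.empty((n, d))
    for i in range(n):
        pts[i, 0] = i / n
        for j in range(1, d):
            pts[i, j] = radical_inverse(i, primes[j - 1])
    return pts

def scale(unit_pts, lower, upper):
    """Map points from the unit hypercube onto per-parameter ranges."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + unit_pts * (upper - lower)

# Usage: 1000 draws over three parameters with made-up bounds.
rng = np.random.default_rng(0)
lower, upper = [0.0, 1.0, 10.0], [1.0, 5.0, 50.0]  # hypothetical ranges
train_random = scale(random_samples(1000, 3, rng), lower, upper)
train_lhs = scale(latin_hypercube(1000, 3, rng), lower, upper)
train_hammersley = scale(hammersley(1000, 3), lower, upper)
```

Random and Latin hypercube draws depend on the seed, whereas the Hammersley set is deterministic for a given sample count and dimension; each resulting array holds one parameter vector per row and could serve as the input grid for simulating training signals.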
