Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/11320
Title: Machine Learning Aided Interpretable Approach for Single Nucleotide-Based DNA Sequencing using a Model Nanopore
Authors: Jena, Milan Kumar
Roy, Diptendu Sinha
Pathak, Biswarup
Keywords: DNA;DNA sequences;Gene encoding;Learning algorithms;Machine learning;Mean square error;Nanopores;Quantum theory;DNA nucleotides;DNA sequencing;Electrical detection;High-throughput analysis;Next-generation sequencing;Quantum tunneling;Single nucleotides;Solid-state nanopore;Transmission function;Nucleotides;Nucleotide sequence;Procedures;Base sequence;Sequence analysis, DNA
Issue Date: 2022
Publisher: American Chemical Society
Citation: Jena, M. K., Roy, D., & Pathak, B. (2022). Machine learning aided interpretable approach for single nucleotide-based DNA sequencing using a model nanopore. Journal of Physical Chemistry Letters, 13(50), 11818-11830. doi:10.1021/acs.jpclett.2c02824
Abstract: Solid-state nanopore-based electrical detection of DNA nucleotides using quantum tunneling has emerged as a powerful candidate for next-generation sequencing technology. However, experimental complexity remains a major obstacle to accurate, high-throughput analysis at industrial scale. Here, using the training data set of one nucleotide in a model monolayer gold nanopore, we predict the transmission functions of all other nucleotides with root-mean-square error scores as low as 0.12 using an optimized eXtreme Gradient Boosting Regression (XGBR) model. Further, SHapley Additive exPlanations (SHAP) analysis is used to interpret the XGBR model predictions; through both global and local explanations, it reveals the complex relationship between the molecular properties of nucleotides and their transmission functions. Hence, experimental integration of the proposed machine-learning-assisted transmission function prediction method can offer a new direction for the realization of cheap, accurate, and ultrafast DNA sequencing. © 2022 American Chemical Society. All rights reserved.
URI: https://doi.org/10.1021/acs.jpclett.2c02824
ISSN: 1948-7185
Type of Material: Journal Article
Appears in Collections: Department of Chemistry



