Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4838
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chauhan, Vikas | en_US
dc.contributor.author | Tiwari, Aruna | en_US
dc.contributor.author | Joshi, Niranjan | en_US
dc.contributor.author | Khandelwal, Sahaj | en_US
dc.date.accessioned | 2022-03-17T01:00:00Z | -
dc.date.accessioned | 2022-03-17T15:35:42Z | -
dc.date.available | 2022-03-17T01:00:00Z | -
dc.date.available | 2022-03-17T15:35:42Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Chauhan, V., Tiwari, A., Joshi, N., & Khandelwal, S. (2021). Multi-label classifier for protein sequence using heuristic-based deep convolution neural network. Applied Intelligence, doi:10.1007/s10489-021-02529-6 | en_US
dc.identifier.issn | 0924-669X | -
dc.identifier.other | EID(2-s2.0-85108609849) | -
dc.identifier.uri | https://doi.org/10.1007/s10489-021-02529-6 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/4838 | -
dc.description.abstract | Deep learning techniques have recently proven very useful for classifying sequential data. Protein sequences belong to functional classes based on the structure of their sequences. The task of annotating protein sequences with their corresponding functional classes is multi-label in nature. Far more data is available for the primary structure of proteins than for the secondary, tertiary, and quaternary structures. Clustering-based techniques require expert domain knowledge drawn from extensive data samples. Traditional methods use n-gram features of amino acids while ignoring the relationship between motifs and the amino acid sequence. This paper proposes an efficient method to classify proteins into their functional classes using a convolution neural network based on heuristic rules. The proposed approach works on the primary structure of protein sequences and considers the relationship among motifs and amino acids. It also takes into account the locations of amino acids in the protein sequence and the affinity information between amino acids and motifs. Along with achieving high performance in the classification of protein sequences, we propose a heuristic approach to improve the precision and recall of the individual functional classes. The proposed heuristic approach improves performance and handles the data imbalance problem. The proposed approach is compared with other competitive approaches and provides better performance in terms of precision, recall, AUC, and subset accuracy. The greatest challenge in multi-label classification is handling data imbalance, which arises from variation in the frequencies of the labels in the data. This data imbalance is handled by weight modulation in the loss function to influence the learning process. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. | en_US
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.source | Applied Intelligence | en_US
dc.subject | Amino acids | en_US
dc.subject | Convolution | en_US
dc.subject | Deep learning | en_US
dc.subject | Deep neural networks | en_US
dc.subject | Heuristic methods | en_US
dc.subject | Learning systems | en_US
dc.subject | Neural networks | en_US
dc.subject | Proteins | en_US
dc.subject | Amino acid sequence | en_US
dc.subject | Convolution neural network | en_US
dc.subject | Learning techniques | en_US
dc.subject | Multi label classification | en_US
dc.subject | Performance metrics | en_US
dc.subject | Precision and recall | en_US
dc.subject | Primary structures | en_US
dc.subject | Quaternary structure | en_US
dc.subject | Classification (of information) | en_US
dc.title | Multi-label classifier for protein sequence using heuristic-based deep convolution neural network | en_US
dc.type | Journal Article | en_US
Appears in Collections:Department of Computer Science and Engineering
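
The abstract states that label imbalance is handled by weight modulation in the loss function and that results are reported in terms of subset accuracy, but this record does not give the formulas. The following is only a minimal sketch under stated assumptions: it uses inverse-frequency positive weights in a per-label binary cross-entropy, which is one common scheme and not necessarily the authors' method. The names `label_weights`, `weighted_bce`, and `subset_accuracy` and the toy data are hypothetical.

```python
import math


def label_weights(Y):
    """Assumed scheme: weight each label's positive term by negatives/positives."""
    n = len(Y)
    n_labels = len(Y[0])
    weights = []
    for j in range(n_labels):
        pos = sum(row[j] for row in Y)
        # Fall back to a neutral weight for labels with no positive examples.
        weights.append((n - pos) / pos if pos > 0 else 1.0)
    return weights


def weighted_bce(y_true, y_prob, weights, eps=1e-12):
    """Weighted binary cross-entropy for one sample, averaged over labels."""
    total = 0.0
    for y, p, w in zip(y_true, y_prob, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def subset_accuracy(Y_true, Y_pred):
    """Fraction of samples whose predicted label vector matches exactly."""
    exact = sum(1 for t, p in zip(Y_true, Y_pred) if list(t) == list(p))
    return exact / len(Y_true)


# Toy data (hypothetical): 4 sequences, 3 functional classes; class 2 is rare.
Y = [[1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 1, 1]]
w = label_weights(Y)

# An uncertain prediction costs more when the true label is the rare class.
loss_rare = weighted_bce([0, 0, 1], [0.5, 0.5, 0.5], w)
loss_common = weighted_bce([1, 0, 0], [0.5, 0.5, 0.5], w)

# One of four predictions differs from its true label vector.
acc = subset_accuracy(Y, [[1, 0, 0], [1, 1, 0], [0, 0, 0], [0, 1, 1]])
```

With these weights, a missed positive for a rare class contributes more to the loss than one for a frequent class, which is the general effect the abstract attributes to weight modulation. Subset accuracy is strict: a sample counts as correct only if every label is predicted correctly.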
