Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4708
Title: A novel technique of feature extraction with dual similarity measures for protein sequence classification
Authors: Bharill, Neha
Tiwari, Aruna
Rawat, Anshul
Keywords: Bioinformatics;Computer aided diagnosis;Extraction;Feature extraction;Intelligent computing;Neural networks;Proteins;Classification accuracy;Extracting features;Information resource;Protein Classification;Protein sequence classification;Similarity measure;Superfamily classification;Training algorithms;Classification (of information)
Issue Date: 2015
Publisher: Elsevier B.V.
Citation: Bharill, N., Tiwari, A., & Rawat, A. (2015). A novel technique of feature extraction with dual similarity measures for protein sequence classification. Procedia Computer Science, 48(C), 795-801. doi:10.1016/j.procs.2015.04.217
Abstract: This article proposes a novel approach for extracting features from protein sequences. The approach extracts only six features per protein sequence. The features are computed globally from the probabilities of occurrence of amino acids at different positions within the superfamily, where the amino acids locally belong to the six exchange groups. The features are then used as input to a neural network constructed by the Boolean-Like Training Algorithm (BLTA), which classifies protein sequences obtained from the Protein Information Resource (PIR). To evaluate the proposed feature extraction approach, experiments are performed on two superfamilies, Ras and Globin, using tenfold cross-validation. The highest Classification Accuracy achieved is 100.00±00.00, with a Computational Time of 170.49±70.87 s; both are reported as markedly better than the Classification Accuracies and Computational Times achieved by Mansouri, Bandyopadhyay and Wang. The experimental results indicate that the proposed approach extracts fewer, more significant features for each protein sequence, which yields a considerable improvement in Classification Accuracy and requires less Computational Time than other well-known feature extraction approaches. © 2015 The Authors.
URI: https://doi.org/10.1016/j.procs.2015.04.217
https://dspace.iiti.ac.in/handle/123456789/4708
ISSN: 1877-0509
Type of Material: Conference Paper
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
File: CP8.pdf (Restricted Access)
Size: 564.93 kB
Format: Adobe PDF


