A novel technique of feature extraction based on local and global similarity measure for protein classification

Bharill, Neha; Tiwari, Aruna

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4709

Title:	A novel technique of feature extraction based on local and global similarity measure for protein classification
Authors:	Bharill, Neha; Tiwari, Aruna
Keywords:	Bioinformatics;Biomedical engineering;Biomedical signal processing;Computer aided diagnosis;Extraction;Feature extraction;Feedforward neural networks;Learning algorithms;Proteins;Classification accuracy;Extracting features;Information resource;Neural network learning algorithm;Position-specific information;Protein Classification;Protein sequence classification;Training algorithms;Classification (of information)
Issue Date:	2015
Publisher:	SciTePress
Citation:	Bharill, N., & Tiwari, A. (2015). A novel technique of feature extraction based on local and global similarity measure for protein classification. Paper presented at the BIOINFORMATICS 2015 - 6th International Conference on Bioinformatics Models, Methods and Algorithms, Proceedings; Part of 8th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2015, 219-224. doi:10.5220/0005283702190224
Abstract:	This paper proposes a novel approach for extracting features from protein sequences. The approach extracts only six features per protein sequence. They are computed globally, from the probabilities of occurrence of amino acids at different positions of the sequences within a superfamily, and locally, from the membership of those amino acids in the six exchange groups. The features are then used as input to a neural network learning algorithm, the Boolean-Like Training Algorithm (BLTA). The BLTA classifier is used to classify protein sequences obtained from the Protein Information Resource (PIR). To investigate the efficacy of the proposed feature extraction approach, experiments are performed on two superfamilies, Ras and Globin. Across tenfold cross-validation, the highest classification accuracy achieved by the proposed approach is 94.32±3.52, with a computational time of 6.54±0.10 s, which is markedly better than the classification accuracies achieved by other approaches. The experimental results show that the proposed approach extracts the minimum number of features for each protein sequence; it therefore yields a considerable improvement in classification accuracy and takes less computational time for protein sequence classification than other well-known feature extraction approaches.
URI:	https://doi.org/10.5220/0005283702190224; https://dspace.iiti.ac.in/handle/123456789/4709
ISBN:	9789897580703
Type of Material:	Conference Paper
Appears in Collections:	Department of Computer Science and Engineering
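
The feature extraction idea summarized in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record does not give the paper's formula, so the sketch uses the six amino acid exchange groups common in the protein classification literature, estimates per-position amino acid probabilities across a superfamily (the global step), and then averages those probabilities within each exchange group for one sequence (the local step). The names `position_probabilities` and `six_features` are hypothetical, and the toy sequences are not PIR data.

```python
from collections import Counter

# Six exchange groups as commonly used in the literature (assumed here;
# the abstract does not list them).
EXCHANGE_GROUPS = ["HRK", "DENQ", "C", "STPAG", "MILV", "FYW"]
GROUP_OF = {aa: g for g, members in enumerate(EXCHANGE_GROUPS) for aa in members}


def position_probabilities(family):
    """Global step: per-position amino acid probabilities across a superfamily."""
    longest = max(len(seq) for seq in family)
    tables = []
    for pos in range(longest):
        counts = Counter(seq[pos] for seq in family if pos < len(seq))
        total = sum(counts.values())
        tables.append({aa: n / total for aa, n in counts.items()})
    return tables


def six_features(seq, tables):
    """Local step (hypothetical aggregation, not the authors' formula):
    mean position-specific probability per exchange group."""
    features = [0.0] * len(EXCHANGE_GROUPS)
    for pos, aa in enumerate(seq):
        group = GROUP_OF.get(aa)
        if group is None or pos >= len(tables):
            continue  # skip non-standard residues and positions unseen in the family
        features[group] += tables[pos].get(aa, 0.0)
    return [f / len(seq) for f in features] if seq else features


family = ["HKDC", "HRDC", "HKEC"]  # toy sequences for illustration only
tables = position_probabilities(family)
print(six_features("HKDC", tables))
```

Each sequence is reduced to a fixed-length vector of six values regardless of its length, which is what would let a small feedforward classifier such as the BLTA mentioned in the abstract take it as input.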
