Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/10939
Title: HPC enabled a Novel Deep Fuzzy Scalable Clustering Algorithm and its Application for Protein Data
Authors: Jha, Preeti; Tiwari, Aruna; Anand, Vaibhav K.; Arya, Sudhanshu S.; Singh, Tanmay P.
Keywords: Big data; Cluster computing; Clustering algorithms; Deep neural networks; Fuzzy inference; Fuzzy neural networks; Iterative methods; Proteins; Clusterings; Deep learning; Feature space; High-dimensional; ITS applications; Neural-networks; Performance computing; Protein data; Scalable algorithms; Scalable clustering; Fuzzy clustering
Issue Date: 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Citation: Jha, P., Tiwari, A., Bharill, N., Ratnaparkhe, M., Patel, O. P., Anand, V., . . . Singh, T. (2022). HPC enabled a novel deep fuzzy scalable clustering algorithm and its application for protein data. Paper presented at the 2022 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2022, doi:10.1109/CIBCB55180.2022.9863036 Retrieved from www.scopus.com
Abstract: Fuzzy clustering is a common way to divide data into groups. Although it has been improved considerably, fuzzy clustering still struggles when clustering real high-dimensional Big Data with complicated latent distributions. To address this problem, this study represents the data in a feature space built from a scalable deep neural network using Apache Spark on HPC. In this paper, we propose SDnnRSIO-FCM, a Scalable Deep Neural Network Random Sampling Iterative Optimization-FCM clustering algorithm, and SDnnLFCM, a scalable version of the Deep Neural Network Literal Fuzzy c-Means algorithm. We focus on the design and implementation of the proposed SDnnRSIO-FCM and SDnnLFCM algorithms using an Apache Spark cluster in a High-Performance Computing (HPC) environment, representing the data in a feature space produced by the neural network to handle Big Data. First, data is mapped into a new feature space that provides a good representation and aids reconstruction of the original data. Second, scalable fuzzy clustering is embedded with neural networks to produce deep fuzzy clustering methods. Experiments conducted on two huge benchmark datasets show that the SDnnRSIO-FCM algorithm outperforms the SDnnLFCM algorithm in terms of Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and F-score. Furthermore, the proposed SDnnRSIO-FCM, applied to huge soybean protein sequences, shows a significant improvement over SDnnLFCM in terms of Silhouette index (SI), Davies-Bouldin index (DBI), and Calinski-Harabasz index (CHI). © 2022 IEEE.
URI: https://doi.org/10.1109/CIBCB55180.2022.9863036
https://dspace.iiti.ac.in/handle/123456789/10939
ISBN: 978-1665484626
Type of Material: Conference Paper
Appears in Collections: Department of Computer Science and Engineering



