Scalable incremental fuzzy consensus clustering algorithm for handling big data

Jha, Preeti; Tiwari, Aruna; Bharill, Neha; Mounika, Mukkamalla

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4817

Full metadata record

DC Field	Value	Language
dc.contributor.author	Jha, Preeti	en_US
dc.contributor.author	Tiwari, Aruna	en_US
dc.contributor.author	Bharill, Neha	en_US
dc.contributor.author	Mounika, Mukkamalla	en_US
dc.date.accessioned	2022-03-17T01:00:00Z	-
dc.date.accessioned	2022-03-17T15:35:37Z	-
dc.date.available	2022-03-17T01:00:00Z	-
dc.date.available	2022-03-17T15:35:37Z	-
dc.date.issued	2021	-
dc.identifier.citation	Jha, P., Tiwari, A., Bharill, N., Ratnaparkhe, M., Nagendra, N., & Mounika, M. (2021). Scalable incremental fuzzy consensus clustering algorithm for handling big data. Soft Computing, 25(13), 8703-8719. doi:10.1007/s00500-021-05733-1	en_US
dc.identifier.issn	1432-7643	-
dc.identifier.other	EID(2-s2.0-85103356568)	-
dc.identifier.uri	https://doi.org/10.1007/s00500-021-05733-1	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/4817	-
dc.description.abstract	Consensus clustering can produce novel, stable, and robust clustering results. It aims to merge several existing basic segments into a single unified one, and it is widely regarded as a promising approach to heterogeneous data clustering for big data. Although many clustering algorithms have been proposed, obtaining a good-quality segment efficiently remains an open problem. In this paper, we propose a scalable incremental fuzzy consensus clustering (SIFCC) algorithm for a big data framework. It is implemented on the Apache Spark cluster framework, a distributed data stream environment for handling big data, and treats the data as a set of data subsets that are processed incrementally. Spark is well suited to iterative algorithms because it supports in-memory computation and scales well. SIFCC not only facilitates efficient big data clustering but also improves cluster quality and optimizes storage space and time complexity during clustering. For comparison, we also designed and implemented a scalable version of the existing fuzzy consensus clustering (FCC) algorithm on an Apache Spark cluster, named scalable fuzzy consensus clustering (SFCC). Extensive experiments on real-world datasets show that SIFCC has greater potential for clustering big data than SFCC. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.	en_US
dc.language.iso	en	en_US
dc.publisher	Springer Science and Business Media Deutschland GmbH	en_US
dc.source	Soft Computing	en_US
dc.subject	Cluster analysis	en_US
dc.subject	Data streams	en_US
dc.subject	Digital storage	en_US
dc.subject	Iterative methods	en_US
dc.subject	Large dataset	en_US
dc.subject	Cluster framework	en_US
dc.subject	Consensus clustering	en_US
dc.subject	Distributed data streams	en_US
dc.subject	Heterogeneous data clustering	en_US
dc.subject	Iterative algorithm	en_US
dc.subject	Quality segments	en_US
dc.subject	Real-world datasets	en_US
dc.subject	Robust clustering	en_US
dc.subject	Clustering algorithms	en_US
dc.title	Scalable incremental fuzzy consensus clustering algorithm for handling big data	en_US
dc.type	Journal Article	en_US
Appears in Collections:	Department of Computer Science and Engineering

Files in This Item:

There are no files associated with this item.
