Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/368
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bharill, Neha | en_US
dc.contributor.author | Tiwari, Aruna | en_US
dc.date.accessioned | 2016-10-25T05:38:05Z | -
dc.date.available | 2016-10-25T05:38:05Z | -
dc.date.issued | 2016 | -
dc.identifier.citation | Bharill, N., Tiwari, A., & Malviya, A. (2016). Fuzzy based clustering algorithms to handle big data with implementation on Apache Spark. Paper presented at the Proceedings - 2016 IEEE 2nd International Conference on Big Data Computing Service and Applications, BigDataService 2016, 95-104. doi:10.1109/BigDataService.2016.34 | en_US
dc.identifier.other | EID(2-s2.0-84973650084) | -
dc.identifier.uri | https://doi.org/10.1109/BigDataService.2016.34 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/368 | -
dc.description.abstract | With the advancement of technology, a huge amount of data containing useful information, called Big Data, is generated on a daily basis. Processing such a tremendous volume of data requires Big Data frameworks such as Hadoop MapReduce and Apache Spark. Among these, Apache Spark performs up to 100 times faster than conventional frameworks like Hadoop MapReduce. For the effective analysis and interpretation of this data, scalable Machine Learning methods are required to overcome the space and time bottlenecks. Partitional clustering algorithms are widely adopted by researchers for clustering large datasets due to their low computational requirements. Thus, we focus on the design of a partitional clustering algorithm and its implementation on Apache Spark. In this paper, we propose a partitional clustering algorithm called Scalable Random Sampling with Iterative Optimization Fuzzy c-Means (SRSIO-FCM), which is implemented on Apache Spark to handle the challenges associated with Big Data clustering. Experiments are performed on several big datasets to show the effectiveness of SRSIO-FCM in comparison with a proposed scalable version of the Literal Fuzzy c-Means (LFCM) algorithm, called SLFCM, also implemented on Apache Spark. The comparative results are reported in terms of F-measure, ARI, objective function value, run time, and scalability. The reported results show the great potential of SRSIO-FCM for Big Data clustering. © 2016 IEEE. | en_US
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US
dc.relation.ispartofseries | CP1; | en_US
dc.source | Proceedings - 2016 IEEE 2nd International Conference on Big Data Computing Service and Applications, BigDataService 2016 | en_US
dc.subject | Algorithms | en_US
dc.subject | Artificial intelligence | en_US
dc.subject | Big data | en_US
dc.subject | Cluster analysis | en_US
dc.subject | Copying | en_US
dc.subject | Fuzzy clustering | en_US
dc.subject | Fuzzy systems | en_US
dc.subject | Iterative methods | en_US
dc.subject | Learning systems | en_US
dc.subject | Optimization | en_US
dc.subject | Computational requirements | en_US
dc.subject | Fuzzy C-means algorithms | en_US
dc.subject | Iterative algorithm | en_US
dc.subject | Iterative Optimization | en_US
dc.subject | Objective functions | en_US
dc.subject | Partitional clustering | en_US
dc.subject | Partitional clustering algorithm | en_US
dc.subject | Scalable machine learning | en_US
dc.subject | Clustering algorithms | en_US
dc.title | Fuzzy based clustering algorithms to handle big data with implementation on Apache Spark | en_US
dc.type | Conference Paper | en_US
Appears in Collections: Department of Computer Science and Engineering

Files in This Item:
File | Description | Size | Format
CP1.pdf (Restricted Access) | | 588.61 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
