Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4603
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Nemade, Vishal | en_US |
| dc.contributor.author | Shastri, Aditya A. | en_US |
| dc.contributor.author | Ahuja, Kapil | en_US |
| dc.contributor.author | Tiwari, Aruna | en_US |
| dc.date.accessioned | 2022-03-17T01:00:00Z | - |
| dc.date.accessioned | 2022-03-17T15:34:56Z | - |
| dc.date.available | 2022-03-17T01:00:00Z | - |
| dc.date.available | 2022-03-17T15:34:56Z | - |
| dc.date.issued | 2019 | - |
| dc.identifier.citation | Nemade, V., Shastri, A., Ahuja, K., & Tiwari, A. (2019). Scaled and projected spectral clustering with vector quantization for handling big data. Paper presented at the Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence, SSCI 2018, 2174-2179. doi:10.1109/SSCI.2018.8628915 | en_US |
| dc.identifier.isbn | 9781538692769 | - |
| dc.identifier.other | EID(2-s2.0-85062766825) | - |
| dc.identifier.uri | https://doi.org/10.1109/SSCI.2018.8628915 | - |
| dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/4603 | - |
| dc.description.abstract | In this modern era, the advent of web technologies and social networking websites is generating a significant amount of data every day. In this scenario, where the data size is now reaching zettabytes (i.e., 10^21 bytes), its analysis is very important. Since spectral-based clustering algorithms provide more accurate results than traditional clustering algorithms, we focus on these algorithms. In our work, we propose a modified version of spectral clustering, which we call Projected Spectral Clustering (PSC). As the complexity of the PSC algorithm is O(n^3), where n is the size of the data, we use two variants of vector quantization sampling, namely k-Means (KM) and Bisecting k-Means (BKM). To make our algorithm scalable for handling Big Data, we implement it on Apache Spark using two approaches for computing the Gaussian Kernel matrix, which is the most important step here (i.e., Map Reduce and Map Only). We call this algorithm Scalable PSC (SPSC). We measure the accuracy of SPSC using three evaluation criteria tested on a variety of different datasets. Our new algorithm gives good clustering accuracies. Further, we perform another set of experiments on a different number of cores to demonstrate runtime/scalability efficiency of our algorithm. Finally, we prove this scalability by doing a complexity analysis. © 2018 IEEE. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.source | Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence, SSCI 2018 | en_US |
| dc.subject | Artificial intelligence | en_US |
| dc.subject | Big data | en_US |
| dc.subject | Cluster analysis | en_US |
| dc.subject | Matrix algebra | en_US |
| dc.subject | Sampling | en_US |
| dc.subject | Scalability | en_US |
| dc.subject | Social sciences computing | en_US |
| dc.subject | Vector quantization | en_US |
| dc.subject | Clustering accuracy | en_US |
| dc.subject | Complexity analysis | en_US |
| dc.subject | Evaluation criteria | en_US |
| dc.subject | Gaussian kernels | en_US |
| dc.subject | Map-reduce | en_US |
| dc.subject | Spectral clustering | en_US |
| dc.subject | Traditional clustering | en_US |
| dc.subject | Web technologies | en_US |
| dc.subject | K-means clustering | en_US |
| dc.title | Scaled and Projected Spectral Clustering with Vector Quantization for Handling Big Data | en_US |
| dc.type | Conference Paper | en_US |
Appears in Collections: Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
