Approx-Ch: An Approximate Chameleon Clustering for Large-Scale and High-Dimensional Data

Singh, Priyanshu; Ahuja, Kapil

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/18335

Full metadata record

DC Field	Value	Language
dc.contributor.author	Singh, Priyanshu	en_US
dc.contributor.author	Ahuja, Kapil	en_US
dc.date.accessioned	2026-05-14T12:28:25Z	-
dc.date.available	2026-05-14T12:28:25Z	-
dc.date.issued	2025	-
dc.identifier.citation	Singh, P., Ahuja, K., & Raha, S. (2025). Approx-Ch: An Approximate Chameleon Clustering for Large-Scale and High-Dimensional Data. Proceedings of 2025 IEEE 22nd India Council International Conference, INDICON 2025. https://doi.org/10.1109/INDICON68490.2025.11392902	en_US
dc.identifier.isbn	979-833159031-4	-
dc.identifier.other	EID(2-s2.0-105036205372)	-
dc.identifier.uri	https://dx.doi.org/10.1109/INDICON68490.2025.11392902	-
dc.identifier.uri	https://dspace.iiti.ac.in:8080/jspui/handle/123456789/18335	-
dc.description.abstract	Hierarchical clustering remains a fundamental challenge in data mining, particularly for real-world datasets, where traditional approaches fail to scale once the data become large and high-dimensional. Recent Chameleon clustering algorithms (Chameleon2, M-Chameleon, and INNGS-Chameleon) have proposed advanced strategies to address this challenge, but they still suffer from O(n^2) computational complexity. We address this by introducing Approximate-Chameleon (Approx-Ch), which has O(n log n) complexity. Our algorithm has three parts. First, Graph Generation: we use approximate k-NN search instead of the exact search used by the three earlier algorithms. This speeds up nearest-neighbor computation and significantly reduces graph generation time. Second, Graph Partitioning: we use a multi-level partitioning approach instead of the single-level one mostly used by the three prior works. This makes graph partitioning robust to the errors introduced by approximate graph generation and keeps configuration requirements minimal. Third, Merging: we follow Chameleon2, retaining its flood-fill heuristic and its merging criteria, since it is the cheapest of the three earlier algorithms. On the real-world benchmark datasets used in the three earlier works, Approx-Ch delivers an average improvement of 5% in clustering quality and reduces total run-time by 86%. This demonstrates that algorithmic efficiency and clustering quality can co-exist in large-scale hierarchical clustering. © 2025 IEEE.	en_US
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	en_US
dc.source	Proceedings of 2025 IEEE 22nd India Council International Conference, INDICON 2025	en_US
dc.title	Approx-Ch: An Approximate Chameleon Clustering for Large-Scale and High-Dimensional Data	en_US
dc.type	Conference Paper	en_US
Appears in Collections:	Department of Computer Science and Engineering
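
The Graph Generation step described in the abstract replaces exact k-NN search with an approximate one. The abstract does not say which approximate method Approx-Ch uses, so the sketch below is only an illustration under a stated assumption: it substitutes a small forest of random-projection trees, and compares each point only with the points that share a leaf with it rather than with every other point. The function names and the parameters `n_trees`, `leaf_size`, and `seed` are hypothetical and do not come from the paper.

```python
import math
import random


def rp_tree_leaves(points, indices, leaf_size, rng):
    """Split `indices` at the median of a random projection until leaves are small."""
    if len(indices) <= leaf_size:
        return [indices]
    dim = len(points[0])
    # Assumption: a Gaussian random direction; any random direction would do.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    ordered = sorted(
        indices,
        key=lambda i: sum(d * x for d, x in zip(direction, points[i])),
    )
    mid = len(ordered) // 2
    return (rp_tree_leaves(points, ordered[:mid], leaf_size, rng)
            + rp_tree_leaves(points, ordered[mid:], leaf_size, rng))


def approx_knn_graph(points, k, n_trees=4, leaf_size=16, seed=0):
    """Return {point index: list of up to k approximate nearest neighbours}."""
    if leaf_size < 2 * k + 1:
        # A split can halve a leaf, so this keeps at least k candidates per point.
        raise ValueError("leaf_size must be at least 2 * k + 1")
    if len(points) <= k:
        raise ValueError("need more than k points")
    rng = random.Random(seed)
    n = len(points)
    candidates = [set() for _ in range(n)]
    for _ in range(n_trees):
        for leaf in rp_tree_leaves(points, list(range(n)), leaf_size, rng):
            for i in leaf:
                # Only points sharing a leaf become neighbour candidates.
                candidates[i].update(j for j in leaf if j != i)
    graph = {}
    for i in range(n):
        nearest = sorted(candidates[i],
                         key=lambda j: math.dist(points[i], points[j]))
        graph[i] = nearest[:k]
    return graph
```

When the whole dataset fits in a single leaf, the candidate set is every other point and the result coincides with exact k-NN; with larger inputs, adding trees trades run time for a better chance of recovering the true neighbours. A production system would more likely rely on an established approximate nearest-neighbour library than on a hand-written forest like this one.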
