KNN weighted reduced universum twin SVM for class imbalance learning

Ganaie, M. A.; Tanveer, M.

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/9763

Title:	KNN weighted reduced universum twin SVM for class imbalance learning
Authors:	Ganaie, M. A.; Tanveer, M.
Keywords:	Diagnosis; Geometry; Nearest neighbor search; Neurodegenerative diseases; Class imbalance; Class imbalance learning; Imbalance ratio; KNN weighted; Learning models; Nearest-neighbour; Rectangular kernel; Support vectors machine; Twin support vector machines; Universum; Support vector machines
Issue Date:	2022
Publisher:	Elsevier B.V.
Citation:	Ganaie, M. A., & Tanveer, M. (2022). KNN weighted reduced universum twin SVM for class imbalance learning. Knowledge-Based Systems, 245. doi:10.1016/j.knosys.2022.108578
Abstract:	In real-world problems, imbalanced data pose a major challenge for classification because the samples of one class dominate. Problems such as fault and disease detection involve imbalanced data and hence need attention to avoid bias towards a particular class. Classification models such as support vector machines (SVM) become biased towards the majority-class samples, which results in misclassification of the minority-class samples. SVM suffers because no prior information about the data is involved in generating the hyperplanes. SVM also ignores the local neighbourhood information of the samples and thus treats each sample equally when generating the hyperplanes. However, the data points may be contaminated and may mislead the generation of the hyperplanes. Inspired by the ideas of prior data information and local neighbourhood information, we propose a K-nearest neighbour based weighted reduced universum twin SVM for class imbalance learning (KWRUTSVM-CIL). The proposed KWRUTSVM-CIL embodies the local neighbourhood information and uses universum data to balance the classes in class imbalance problems. Local neighbourhood information is incorporated via a weight matrix in the objective function. In the proposed KWRUTSVM-CIL model, weight vectors are used in the corresponding constraints of the objective functions to exploit the interclass information. Oversampling and undersampling approaches are followed to balance the data in class imbalance problems. Universum data provide prior information about the data. Twin SVM, universum twin SVM, and reduced universum twin SVM for class imbalance implement the empirical risk minimization principle and thus may lead to overfitting. In contrast, the proposed KWRUTSVM-CIL model includes a regularization term to maximize the margin and implements the structural risk minimization principle, which is the core of statistical learning, and thereby overcomes the issue of overfitting. Experimental results and statistical analysis indicate that the generalization ability of the proposed KWRUTSVM-CIL model is superior to that of other twin SVM based models. As an application, we use the proposed KWRUTSVM-CIL model for the diagnosis of Alzheimer's disease and breast cancer. The proposed KWRUTSVM-CIL model showed better generalization performance than other twin SVM based models on biomedical datasets. © 2022 Elsevier B.V.
URI:	https://dspace.iiti.ac.in/handle/123456789/9763 https://doi.org/10.1016/j.knosys.2022.108578
ISSN:	0950-7051
Type of Material:	Journal Article
Appears in Collections:	Department of Mathematics

Files in This Item:

There are no files associated with this item.
