Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4565
Full metadata record
DC Field | Value | Language
dc.contributor.author | Choudhary, Ajay K. | en_US
dc.contributor.author | Jha, Preeti | en_US
dc.contributor.author | Tiwari, Aruna | en_US
dc.contributor.author | Bharill, Neha | en_US
dc.date.accessioned | 2022-03-17T01:00:00Z | -
dc.date.accessioned | 2022-03-17T15:34:51Z | -
dc.date.available | 2022-03-17T01:00:00Z | -
dc.date.available | 2022-03-17T15:34:51Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Choudhary, A., Jha, P., Tiwari, A., Bharill, N., & Ratnaparkhe, M. (2021). Scalable fuzzy clustering-based regression to predict the isoelectric points of the plant protein sequences using Apache Spark. Paper presented at the IEEE International Conference on Fuzzy Systems, 2021-July. doi:10.1109/FUZZ45933.2021.9494447 | en_US
dc.identifier.isbn | 9781665444071 | -
dc.identifier.issn | 1098-7584 | -
dc.identifier.other | EID(2-s2.0-85114687205) | -
dc.identifier.uri | https://doi.org/10.1109/FUZZ45933.2021.9494447 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/4565 | -
dc.description.abstract | Learning in non-stationary environments requires modern tools and algorithms that adapt quickly to new patterns, because concept drift can change the underlying distribution. The usual assumption that the data are independent and identically distributed may therefore be invalid in data stream scenarios. Given the massive volume of high-speed data streams and the concept drift, traditional machine learning algorithms must be self-adapting. One of the difficulties in handling regression tasks is the complexity of the equations for the regression models when they are combined with drift-handling techniques. High-dimensional protein data is a major challenge for bioinformatics researchers analysing the dynamics of the sequences. This paper proposes a Scalable Fuzzy Clustering induced Regression (SFC-R) algorithm to predict the isoelectric point of plant protein sequences using Apache Spark clusters. The SFC-R algorithm uses input features extracted from the plant protein sequences and validates performance in terms of mean squared error (MAE) and root-mean-square error (RMSE). Experiments on plant protein datasets are carried out to validate the high accuracy and robustness of our approach. © 2021 IEEE. | en_US
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US
dc.source | IEEE International Conference on Fuzzy Systems | en_US
dc.subject | Bioinformatics | en_US
dc.subject | Clustering algorithms | en_US
dc.subject | Data streams | en_US
dc.subject | Fuzzy clustering | en_US
dc.subject | Fuzzy systems | en_US
dc.subject | Machine learning | en_US
dc.subject | Mean square error | en_US
dc.subject | Proteins | en_US
dc.subject | Regression analysis | en_US
dc.subject | Handling technique | en_US
dc.subject | High-dimensional | en_US
dc.subject | Iso-electric points | en_US
dc.subject | Mean squared error | en_US
dc.subject | Non-stationary environment | en_US
dc.subject | Regression model | en_US
dc.subject | Root mean square errors | en_US
dc.subject | Underlying distribution | en_US
dc.subject | Learning algorithms | en_US
dc.title | Scalable Fuzzy Clustering-based Regression to Predict the Isoelectric Points of the Plant Protein Sequences using Apache Spark | en_US
dc.type | Conference Paper | en_US
Appears in Collections: Department of Computer Science and Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
