COMPARING ACCURACY OF LOGISTIC REGRESSION, K-NEAREST NEIGHBOR, SUPPORT VECTOR MACHINE, AND NAÏVE BAYES MODELS USING TRACKING ENSEMBLE MACHINE LEARNING
Selecting a model that classifies the target correctly is important. Logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), and Naïve Bayes (NB) are base models for classifying a target. Tracking ensemble is a machine learning method for comparing their accuracy.
Datasets are generated with Python code as recommended by Brownlee [1]. Five sample sizes are used: 1,000, 3,000, 5,000, 7,000, and 10,000. Each dataset has 20 features, of which 15 are informative and 5 are redundant.
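The data generation described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn's make_classification is the generator used; the helper name make_datasets and the random seed are hypothetical and not taken from the article.

```python
from sklearn.datasets import make_classification

# Sample sizes stated in the abstract.
SAMPLE_SIZES = [1000, 3000, 5000, 7000, 10000]


def make_datasets(sample_sizes=SAMPLE_SIZES, seed=1):
    """Return {n: (X, y)} of synthetic binary classification datasets.

    Each dataset has 20 features: 15 informative and 5 redundant.
    The seed is an assumption made only for reproducibility.
    """
    datasets = {}
    for n in sample_sizes:
        X, y = make_classification(
            n_samples=n,
            n_features=20,
            n_informative=15,
            n_redundant=5,
            random_state=seed,
        )
        datasets[n] = (X, y)
    return datasets


datasets = make_datasets()
for n, (X, y) in datasets.items():
    # X has shape (n, 20); y holds one class label per row.
    print(n, X.shape, y.shape)
```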
The results show that SVM has the highest mean accuracy and the lowest coefficient of variation of accuracy at all sample sizes. NB has the lowest mean accuracy and the highest coefficient of variation of accuracy at all sample sizes. SVM is therefore recommended for classifying the target.
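The two summary statistics used in this comparison, mean accuracy and coefficient of variation (standard deviation divided by mean), can be computed as in the sketch below. The accuracy values are made-up placeholders, not results from the article, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_accuracy(scores):
    """Return (mean accuracy, coefficient of variation) for a list of scores.

    The coefficient of variation is the sample standard deviation
    divided by the mean (assumed estimator; not stated in the source).
    """
    m = mean(scores)
    cv = stdev(scores) / m
    return m, cv


# Hypothetical per-replicate accuracies for two unnamed models.
fold_scores = {
    "model_a": [0.95, 0.96, 0.94, 0.95, 0.95],
    "model_b": [0.80, 0.70, 0.85, 0.75, 0.90],
}

summary = {name: summarize_accuracy(s) for name, s in fold_scores.items()}

# Pick the model with the highest mean accuracy.
best = max(summary, key=lambda name: summary[name][0])

for name, (m, cv) in summary.items():
    print(f"{name}: mean={m:.3f}, cv={cv:.3f}")
print("highest mean accuracy:", best)
```

A model is preferable under this criterion when its mean accuracy is high and its coefficient of variation is low, that is, when it is both accurate and stable across replicates.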
Keywords: accuracy, tracking ensemble, feature, target, SVM.
Received: August 13, 2023; Accepted: October 9, 2023; Published: October 26, 2023
How to cite this article: Kuntoro Kuntoro, Comparing accuracy of logistic regression, K-nearest neighbor, support vector machine, and Naïve Bayes models using tracking ensemble machine learning, JP Journal of Biostatistics 24(1) (2024), 1-13. http://dx.doi.org/10.17654/0973514324001
This Open Access Article is Licensed under Creative Commons Attribution 4.0 International License
References:
[1] J. Brownlee, Stacking Ensemble Machine Learning with Python, 2021. https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/. 100823.
[2] S. Greenland, R. Daniel and N. Pearce, Outcome modelling strategies in epidemiology: traditional methods and basic alternatives, International Journal of Epidemiology 45(2) (2016), 565-575.
[3] Java T Point, Regression vs Classification in Machine Learning, 2023. https://www.javatpoint.com/regression-vs-classification-in-machine-learning. 110823.
[4] E. Lutin, Ensemble Methods in Machine Learning: What are They and Why Use Them? 2017. https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f. 110823.
[5] S. Ray, Naïve Bayes Classifier Explained: Application and Practice Problems of Naïve Bayes Classifier, 2017. https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/.