Abstract:
Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has affected hundreds of millions of individuals. Because of its serious complications, detecting diabetes is of great significance. Many studies have been conducted on diabetes prediction datasets. Most of them use data collected from individuals in populations where the onset of diabetes is high, in particular the dataset from a study of females in the Pima Indian population during 1967. Most previous studies concentrated on one or two specific, complicated techniques to test the data, and extensive research on the popular techniques is lacking. In this paper, we conduct a thorough exploration of the most common techniques used for identifying diabetes, such as SVM (Support Vector Machine) and KNN (K-Nearest Neighbors), together with several preprocessing methods. We evaluate these techniques by their cross-validation accuracy on the dataset. We compare the classifiers across several aspects of the analysis and tune their parameters to improve accuracy. The best technique we find achieves 77.86% accuracy using 10-fold cross-validation.
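The evaluation procedure described above, scoring a classifier by 10-fold cross-validation and adjusting a parameter to improve accuracy, can be illustrated with a minimal sketch. This is not the thesis code: it uses only the Python standard library, a hand-written K-Nearest Neighbors classifier, and synthetic two-class data in place of the Pima dataset. All function names (k_fold_indices, knn_predict, cross_val_accuracy, make_synthetic_data) and the candidate values of k are assumptions made for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, query, k):
    """Predict the majority label among the k nearest training points."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k_neighbors, n_folds=10, seed=0):
    """Mean accuracy of k-NN over n_folds held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def make_synthetic_data(n_samples=200, seed=1):
    """Two overlapping Gaussian classes in 2-D (a stand-in, not patient data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic_data()
# Parameter tuning: compare cross-validated accuracy for several values of k.
results = {k: cross_val_accuracy(X, y, k) for k in (1, 5, 15)}
best_k = max(results, key=results.get)
print(best_k, results[best_k])
```

The same loop applies to any classifier: only knn_predict would change, while the fold splitting and the averaging of per-fold accuracy stay the same. Accuracies from this synthetic data say nothing about the 77.86% reported for the real dataset.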
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Information and Communication Engineering of East West University, Dhaka, Bangladesh.