Manjeedan11/diabetes-prediction

Diabetes Prediction using K-Nearest Neighbors (KNN)

This repository contains a Python script for predicting diabetes using the K-Nearest Neighbors (KNN) algorithm. The script uses scikit-learn to preprocess the data, handle missing values, scale the features, and train a KNN classifier.

Dataset

The dataset used in this project is a file named "diabetes.csv". It is assumed to contain diabetes-related measurements, with columns such as Glucose, BloodPressure, SkinThickness, BMI, Insulin, and others.

Code Overview

Data Preprocessing:

  • Replace zero values in specific columns ('Glucose', 'BloodPressure', 'SkinThickness', 'BMI', 'Insulin') with the mean of that column's non-zero values. A zero in these columns is treated as a missing measurement.
  • Split the dataset into input features (X) and output labels (y).
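
The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the repository's script: the small DataFrame stands in for diabetes.csv with made-up values, and the label column name "Outcome" is an assumption, since the README does not name it.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for diabetes.csv; the values are made up for illustration.
# The label column name "Outcome" is an assumption.
df = pd.DataFrame({
    "Glucose": [148, 0, 183, 89],
    "BloodPressure": [72, 66, 0, 66],
    "SkinThickness": [35, 29, 0, 23],
    "BMI": [33.6, 26.6, 23.3, 0.0],
    "Insulin": [0, 0, 0, 94],
    "Outcome": [1, 0, 1, 0],
})

zero_as_missing = ["Glucose", "BloodPressure", "SkinThickness", "BMI", "Insulin"]

for col in zero_as_missing:
    # Mark zeros as missing, then fill them with the mean of the remaining values.
    values = df[col].replace(0, np.nan)
    df[col] = values.fillna(values.mean())

# Input features (X) are every column except the label; y is the label.
X = df.drop(columns="Outcome")
y = df["Outcome"]
```

With the real file, the DataFrame would presumably come from pd.read_csv("diabetes.csv") instead of the inline literal.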

Train-Test Split:

  • Split the dataset into training and testing sets using the train_test_split function from scikit-learn.
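
A short sketch of the split, using random placeholder data. The README does not state the test fraction or a random seed, so test_size=0.2, random_state=0, and stratify=y are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples, 5 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

# test_size, random_state, and stratify are assumed values, not taken from the script.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
```

Fixing random_state makes the split reproducible, and stratify=y keeps the class proportions similar in both sets.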

Feature Scaling:

  • Standardize the features using StandardScaler so that each feature has zero mean and unit variance. KNN relies on distances between samples, so features on larger scales would otherwise dominate.
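
A minimal sketch of the scaling step on placeholder data. Fitting the scaler on the training set only and reusing it for the test set is a common convention and is assumed here; the README does not say how the script does it.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder features that are far from zero mean and unit variance.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=100, scale=20, size=(80, 3))
X_test = rng.normal(loc=100, scale=20, size=(20, 3))

scaler = StandardScaler()
# Learn the mean and standard deviation from the training data only.
X_train_scaled = scaler.fit_transform(X_train)
# Apply the same transformation to the test data without refitting.
X_test_scaled = scaler.transform(X_test)
```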

K-Nearest Neighbors Classification:

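  • Create a KNN classifier with parameters (n_neighbors=11, p=2, metric='euclidean'), so that each prediction is a majority vote among the 11 nearest training points by Euclidean distance.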
  • Train the classifier using the training data.
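
The classifier setup can be sketched like this. The constructor arguments are the ones listed above; the two-cluster training data is synthetic and only serves to make the example runnable.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, well-separated clusters: class 0 around (-3, -3), class 1 around (3, 3).
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(-3, 1, size=(50, 2)),
    rng.normal(3, 1, size=(50, 2)),
])
y_train = np.array([0] * 50 + [1] * 50)

# Parameters as listed in this README.
classifier = KNeighborsClassifier(n_neighbors=11, p=2, metric="euclidean")
classifier.fit(X_train, y_train)

# Each query point takes the majority label of its 11 nearest training points.
pred = classifier.predict([[-3.0, -3.0], [3.0, 3.0]])
```

An odd n_neighbors avoids tied votes in a two-class problem.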

Prediction and Evaluation:

  • Predict the labels for the test set using the trained classifier.
  • Evaluate the model's performance using a confusion matrix and the F1 score.
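
The evaluation can be illustrated with a small hand-made example. The label lists below are invented for illustration and are not results from this project.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Made-up true labels and predictions, not output of the actual model.
y_test = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# Rows are true classes and columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_test, y_pred)

# F1 is the harmonic mean of precision and recall for the positive class.
f1 = f1_score(y_test, y_pred)

print(cm)
print(f1)
```

Here there are 3 true negatives, 1 false positive, 1 false negative, and 3 true positives, so precision and recall are both 0.75 and the F1 score is 0.75.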

Running the Code

To run this code, make sure you have Python installed along with the libraries the script imports. You can install these dependencies using:

pip install pandas numpy scikit-learn
