# Diabetes Prediction using K-Nearest Neighbors (KNN)

This repository contains a Python script that predicts diabetes with the K-Nearest Neighbors (KNN) algorithm. The script uses the scikit-learn machine learning library and covers data preprocessing, missing-value handling, and training a KNN classifier.

## Dataset

The project reads its data from a file named `diabetes.csv`. The file is assumed to contain diabetes-related measurements, with columns such as Glucose, BloodPressure, SkinThickness, BMI, Insulin, and others.

## Code Overview

### Data Preprocessing:

- Treat zero values in specific columns (`Glucose`, `BloodPressure`, `SkinThickness`, `BMI`, `Insulin`) as missing and replace them with the mean of the non-zero values in the same column.
- Split the dataset into input features (X) and output labels (y).
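The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the script itself: the small DataFrame is invented toy data standing in for `diabetes.csv`, and the label column name `Outcome` is an assumption, since the name of the output column is not stated here.

```python
import numpy as np
import pandas as pd

# Toy stand-in for diabetes.csv; all values are invented for illustration.
# The label column name "Outcome" is an assumption.
df = pd.DataFrame({
    "Glucose": [148, 0, 183, 89],
    "BloodPressure": [72, 66, 0, 66],
    "SkinThickness": [35, 29, 0, 23],
    "BMI": [33.6, 26.6, 23.3, 0.0],
    "Insulin": [0, 0, 0, 94],
    "Outcome": [1, 0, 1, 0],
})

zero_not_accepted = ["Glucose", "BloodPressure", "SkinThickness", "BMI", "Insulin"]

for column in zero_not_accepted:
    # Mark zeros as missing, then fill them with the mean of the remaining values.
    df[column] = df[column].replace(0, np.nan)
    df[column] = df[column].fillna(df[column].mean())

# Separate input features (X) from output labels (y).
X = df.drop(columns="Outcome")
y = df["Outcome"]
```

Converting zeros to `NaN` first means `mean()` skips them, so the fill value is the mean of the non-zero entries only.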

### Train-Test Split:

- Split the dataset into training and testing sets using the `train_test_split` function from scikit-learn.
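A minimal sketch of the split, using a small synthetic array in place of the real features. The `test_size=0.2` and `random_state=0` values are assumptions chosen for illustration; the values used by the script are not stated here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 20 samples with 2 features each.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# test_size and random_state are illustrative assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

Fixing `random_state` makes the split reproducible between runs.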

### Feature Scaling:

- Standardize the features with `StandardScaler` so that each feature has zero mean and unit variance. This keeps features with large numeric ranges from dominating the distance calculation in KNN.
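One common way to apply the scaler, shown here as a sketch on invented numbers, is to fit it on the training set only and reuse the fitted statistics for the test set. Whether the script follows exactly this pattern is an assumption.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Invented values standing in for two feature columns.
X_train = np.array([[100.0, 20.0], [150.0, 30.0], [200.0, 40.0]])
X_test = np.array([[150.0, 25.0]])

scaler = StandardScaler()
# Learn the mean and standard deviation from the training data only.
X_train_scaled = scaler.fit_transform(X_train)
# Apply the same statistics to the test data.
X_test_scaled = scaler.transform(X_test)
```

Fitting on the training data alone keeps information about the test set out of the preprocessing step.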

### K-Nearest Neighbors Classification:

- Create a KNN classifier with `n_neighbors=11`, `p=2`, and `metric='euclidean'`.
- Train the classifier using the training data.
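The classifier setup can be sketched like this. The training data below is synthetic (two well-separated clusters generated with a fixed seed) and only serves to make the example runnable; the constructor arguments are the ones listed above.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic training data: 20 points around (0, 0) and 20 around (10, 10).
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 2)),
    rng.normal(10.0, 1.0, size=(20, 2)),
])
y_train = np.array([0] * 20 + [1] * 20)

# Parameters as described in this README.
classifier = KNeighborsClassifier(n_neighbors=11, p=2, metric="euclidean")
classifier.fit(X_train, y_train)

# Classify one point near each cluster centre.
predictions = classifier.predict(np.array([[0.0, 0.0], [10.0, 10.0]]))
```

An odd `n_neighbors` avoids tied votes in a two-class problem.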

### Prediction and Evaluation:

- Predict the labels for the test set using the trained classifier.
- Evaluate the model's performance using a confusion matrix and the F1 score.
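As a small worked example of the two metrics, the snippet below evaluates hand-written labels rather than real model output. The label lists are invented so the numbers can be checked by hand.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Invented true labels and predictions for eight test samples.
y_test = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_test, y_pred)
print(cm)  # [[3 1]
           #  [1 3]]

# F1 is the harmonic mean of precision (3/4) and recall (3/4).
f1 = f1_score(y_test, y_pred)
print(f1)  # 0.75
```

The F1 score is useful alongside the confusion matrix because it summarizes performance on the positive class in a single number.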

## Running the Code

To run the code, make sure Python is installed along with the libraries the script imports. You can install these dependencies with:

```bash
pip install pandas numpy scikit-learn