SMS Spam Classifier

This project is a Streamlit web application that classifies SMS or email messages as Spam or Not Spam using a machine learning model. The application provides a user-friendly interface for real-time classification.

Project Overview

Objective: Build a robust machine learning model to classify text messages as spam or not spam.
Approach:
1. Preprocess the input text data.
2. Train a classifier using a labeled dataset.
3. Deploy the model as a web application using Streamlit.

Dataset Details

Categories:
- Spam: Messages that are unsolicited or promotional in nature.
- Not Spam: Legitimate and personal messages.
Source:
- Public datasets such as the UCI SMS Spam Collection.
Structure:
- Text: The SMS/email content.
- Label: Classification (spam or not spam).
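
As a rough illustration of the structure above, the sketch below loads a few label/text rows into a pandas DataFrame. The inline sample rows, the column names, and the ham/spam to 0/1 mapping are assumptions made for this example. The raw UCI file uses a tab-separated label-then-text layout, but the data file used by this project may be organized differently.

```python
import csv
import io

import pandas as pd

# Made-up sample rows in a label<TAB>text layout (not real dataset entries).
SAMPLE = (
    "ham\tAre we still meeting for lunch today?\n"
    "spam\tWINNER! Claim your free prize now\n"
    "ham\tCall me when you get home\n"
)


def load_messages(source):
    """Read label/text rows and add a numeric target column (spam=1, ham=0)."""
    df = pd.read_csv(
        source,
        sep="\t",
        header=None,
        names=["label", "text"],
        quoting=csv.QUOTE_NONE,  # treat quote characters in messages literally
    )
    df["target"] = df["label"].map({"ham": 0, "spam": 1})
    return df


df = load_messages(io.StringIO(SAMPLE))
print(df["label"].value_counts())
```

Passing a file path instead of the StringIO object should work the same way, provided the file follows the assumed layout.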

Key Features

Text Preprocessing:
- Lowercasing text.
- Tokenization.
- Stopword removal.
- Stemming using the Porter Stemmer.
Vectorization:
- Text is converted into numerical format using TfidfVectorizer.
Machine Learning Model:
- A text classifier (Naive Bayes or a comparable algorithm) trained on the labeled dataset.
- Saved as a serialized model (model.pkl).
Interactive Web Application:
- Built using Streamlit for real-time message classification.
- Users can input any SMS/email text to check if it's spam or not.
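
The preprocessing steps listed under Key Features could look roughly like the sketch below. Only the function name transform_text comes from this README; the body is an assumption about how the steps fit together, not the code in app.py. The two fallback helpers (a regex tokenizer and a tiny stopword set) are hypothetical additions that keep the example runnable when the NLTK data has not been downloaded yet.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()

# Hypothetical fallback, used only when the NLTK stopwords corpus is missing.
FALLBACK_STOPWORDS = {"a", "an", "the", "is", "are", "you", "to", "of", "and"}


def load_stopwords():
    try:
        return set(stopwords.words("english"))
    except LookupError:
        return FALLBACK_STOPWORDS


def tokenize(text):
    try:
        return nltk.word_tokenize(text)
    except LookupError:
        # Hypothetical fallback when the punkt tokenizer data is missing.
        return re.findall(r"[a-z0-9]+", text)


def transform_text(text):
    """Lowercase, tokenize, drop punctuation and stopwords, then stem."""
    tokens = tokenize(text.lower())
    tokens = [t for t in tokens if t.isalnum()]
    stop = load_stopwords()
    tokens = [t for t in tokens if t not in stop]
    return " ".join(stemmer.stem(t) for t in tokens)


print(transform_text("You are WINNING free prizes!"))
```

The function returns a single space-joined string so that its output can be passed directly to a TfidfVectorizer.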

Requirements

Ensure the following are installed:

Python Version: Python 3.7+

Python Libraries (listed in requirements.txt):

streamlit
scikit-learn
nltk
pandas
numpy
matplotlib

Pretrained model files: vectorizer.pkl and model.pkl.

Setup and Installation

1. Clone the Repository

git clone https://github.com/rahulpoojith/SMS-spam-classifier.git
cd SMS-spam-classifier

2. Install Dependencies

Install the required Python libraries using:

pip install -r requirements.txt

3. Download NLTK Data

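Download the NLTK tokenizer models and stopword list used for text preprocessing (NLTK 3.9 and later may additionally require the punkt_tab resource):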

python -m nltk.downloader punkt stopwords
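
The same download can also be triggered from Python, which may be convenient at application startup. The sketch below is one possible approach; the helper names missing_resources and ensure_nltk_data are hypothetical, and the resource paths assume the standard NLTK data layout.

```python
import nltk

# Resource name -> path inside the NLTK data directory.
REQUIRED = {
    "punkt": "tokenizers/punkt",
    "stopwords": "corpora/stopwords",
}


def missing_resources():
    """Return the names of required NLTK resources that are not installed."""
    missing = []
    for name, path in REQUIRED.items():
        try:
            nltk.data.find(path)
        except LookupError:
            missing.append(name)
    return missing


def ensure_nltk_data():
    """Download only the resources that are missing (needs network access)."""
    for name in missing_resources():
        nltk.download(name, quiet=True)
```

Calling ensure_nltk_data() once before the first prediction avoids repeated downloads, since resources that are already present are skipped.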

4. Add Model Files

Ensure the following files are in the project directory:

vectorizer.pkl: Contains the trained TfidfVectorizer.
model.pkl: Contains the trained classification model.
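
If the two files are not available, they can be regenerated by training. The sketch below shows one way this might be done with a TfidfVectorizer and a Multinomial Naive Bayes model. The toy corpus, the 1 = spam label encoding, and the train_and_save helper are assumptions for illustration; the actual training code for this project is not shown in this README.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def train_and_save(texts, labels, out_dir="."):
    """Fit a TF-IDF vectorizer and a Naive Bayes model, then pickle both."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(texts)
    model = MultinomialNB()
    model.fit(features, labels)

    out = Path(out_dir)
    with open(out / "vectorizer.pkl", "wb") as fh:
        pickle.dump(vectorizer, fh)
    with open(out / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    return vectorizer, model


# Toy corpus for illustration only; 1 = spam, 0 = not spam (assumed encoding).
texts = [
    "win free prize now",
    "free cash claim now",
    "see you at lunch",
    "call me when home",
]
labels = [1, 1, 0, 0]

# Write to a temporary directory here; pass the project directory instead
# to place the files next to app.py.
with tempfile.TemporaryDirectory() as tmp:
    train_and_save(texts, labels, tmp)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
    with open(Path(tmp) / "vectorizer.pkl", "rb") as fh:
        loaded_vectorizer = pickle.load(fh)
    with open(Path(tmp) / "model.pkl", "rb") as fh:
        loaded_model = pickle.load(fh)

prediction = loaded_model.predict(loaded_vectorizer.transform(["free prize"]))[0]
print(saved_files, prediction)
```

Pickle files should only be loaded from sources you trust, because unpickling can execute arbitrary code.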

How to Run the Application

Start the Streamlit App

Run the application using:

streamlit run app.py

Interact with the Application

1. Open your browser and navigate to the URL shown in the terminal (usually http://localhost:8501).
2. Enter the SMS or email text in the input box.
3. Click the "Predict" button to classify the message.
4. The application displays the result as Spam or Not Spam.
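
The interaction described above could be wired together roughly as follows. This is a sketch under stated assumptions, not the contents of app.py: predict_label and render_app are hypothetical names, transform_text is reduced to a placeholder, and the label encoding (1 = spam) is assumed.

```python
import pickle


def transform_text(text):
    # Placeholder: the real transform_text in app.py is described as also
    # tokenizing, removing stopwords, and stemming.
    return text.lower()


def predict_label(message, vectorizer, model, preprocess=transform_text):
    """Return "Spam" or "Not Spam" for one message (assumes label 1 = spam)."""
    features = vectorizer.transform([preprocess(message)])
    return "Spam" if model.predict(features)[0] == 1 else "Not Spam"


def render_app():
    # Imported here so predict_label can be reused without Streamlit installed.
    import streamlit as st

    with open("vectorizer.pkl", "rb") as fh:
        vectorizer = pickle.load(fh)
    with open("model.pkl", "rb") as fh:
        model = pickle.load(fh)

    st.title("SMS Spam Classifier")
    message = st.text_area("Enter the message")
    if st.button("Predict"):
        if message.strip():
            st.header(predict_label(message, vectorizer, model))
        else:
            st.warning("Please enter a message first.")
```

The sketch defines render_app() without calling it. A script launched with streamlit run would need to call render_app() at module level. Keeping predict_label separate from the UI makes the classification step easy to check on its own.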

Folder Structure

SMS-spam-classifier/
├── app.py              # Main Streamlit application
├── model.pkl           # Trained classification model
├── vectorizer.pkl      # TfidfVectorizer for text vectorization
├── requirements.txt    # Python dependencies
└── README.md           # Project documentation

Customization

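Model: Replace model.pkl with a newly trained model to experiment. The model must accept the features produced by vectorizer.pkl, so replace both files if the vectorizer changes.
Text Preprocessing: Modify the transform_text function in app.py to add preprocessing steps. Apply the same preprocessing when retraining so that training and prediction inputs match.
Dataset: Retrain the model on a different labeled dataset, then regenerate vectorizer.pkl and model.pkl.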
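
When experimenting with a replacement model, it can help to compare candidates before overwriting model.pkl. The sketch below cross-validates two scikit-learn classifiers on the same TF-IDF features. The compare_models helper, the candidate set, and the toy corpus are assumptions for illustration; a real comparison would use the full labeled dataset and more folds.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def compare_models(texts, labels, candidates, folds=2):
    """Return the mean cross-validated accuracy for each candidate classifier."""
    scores = {}
    for name, classifier in candidates.items():
        # The vectorizer sits inside the pipeline so it is refit on each fold.
        pipeline = make_pipeline(TfidfVectorizer(), classifier)
        scores[name] = cross_val_score(pipeline, texts, labels, cv=folds).mean()
    return scores


# Toy corpus for illustration only; 1 = spam, 0 = not spam (assumed encoding).
texts = [
    "win free prize now",
    "free cash claim now",
    "urgent claim your free reward",
    "free entry win cash today",
    "see you at lunch",
    "call me when home",
    "are you coming to lunch",
    "home late call you soon",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

candidates = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
scores = compare_models(texts, labels, candidates)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Fitting the vectorizer inside the pipeline keeps vocabulary from the held-out fold out of training, which gives a fairer estimate than vectorizing the whole dataset first.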

Acknowledgements

UCI SMS Spam Collection Dataset
Libraries: Streamlit, scikit-learn (including its TfidfVectorizer), NLTK

License

This project is licensed under the MIT License. Feel free to use and modify it as needed.
