Fake News Detection Using Scikit-Learn

This project classifies news articles as real or fake using machine learning. The model is a Passive Aggressive Classifier, an online learning algorithm used here for binary classification.

Installation

The following packages are required to run this project:

newsapi-python
pandas
scikit-learn

To install these packages, run the following command:

pip install newsapi-python pandas scikit-learn

Data

The training data used to train the classifier is taken from Kaggle.

Step 1: Import Necessary Modules

The following modules are required to interact with the News API and create the model:

NewsApiClient from newsapi
random

Step 2: Get News Data from the News API

To interact with the News API, an API key is required. The API key can be obtained by registering at https://newsapi.org/. After obtaining the API key, follow the steps below:

Create a NewsApiClient instance, passing the API key to its constructor.
Define a function that fetches news data from the API using the client's get_everything() method.
Pass the relevant parameters to the get_everything() method:
- sources
- domains
- from_param
- to
- language
- sort_by
- page
After getting the results from the API, collect the returned articles in a list and return that list.
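
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name get_news, the choice of the title and content fields, the default argument values, and the NEWSAPI_KEY environment variable are all assumptions made for the example.

```python
import os


def get_news(client, source_id, domains=None, from_param=None, to=None,
             language="en", sort_by="publishedAt", page=1):
    """Return a list of (title, text) tuples for one news source.

    `client` is expected to be a newsapi.NewsApiClient instance; any object
    with a compatible get_everything() method works as well.
    """
    response = client.get_everything(
        sources=source_id,
        domains=domains,
        from_param=from_param,
        to=to,
        language=language,
        sort_by=sort_by,
        page=page,
    )
    news = []
    for article in response.get("articles", []):
        # Fields can be null in the API response, so fall back to "".
        title = article.get("title") or ""
        text = article.get("content") or article.get("description") or ""
        news.append((title, text))
    return news


if __name__ == "__main__":
    api_key = os.environ.get("NEWSAPI_KEY")  # hypothetical variable name
    if api_key:
        from newsapi import NewsApiClient  # provided by newsapi-python

        client = NewsApiClient(api_key=api_key)
        print(get_news(client, "bbc-news")[:3])
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object, without spending API quota.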

Step 3: Get News Sources

The News API has over 3000 authenticated news sources. To get news from these sources, follow the steps below:

Get all the sources from the News API using the get_sources() method.
Add the ID of each source to a list.
Truncate the list to 10 entries using the del keyword, then get news from those sources.
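
A minimal sketch of these steps, assuming the response of get_sources() is a dictionary whose "sources" entry is a list of dictionaries with an "id" key. The function name get_source_ids and the limit parameter are hypothetical, chosen for this example.

```python
def get_source_ids(client, limit=10):
    """Return the IDs of the first `limit` sources reported by the API.

    `client` is expected to be a newsapi.NewsApiClient instance.
    """
    response = client.get_sources()
    source_list = [source["id"] for source in response.get("sources", [])]
    # Remove everything past the first `limit` entries, in place.
    del source_list[limit:]
    return source_list
```

The del statement with a slice shortens the existing list rather than building a new one; source_list[:limit] would give the same contents as a copy.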

Step 4: Create a DataFrame Using News List

Loop over the sources in sourceList and call the getNews() method for each one.
Add all the returned news to a list.
Use the pandas.DataFrame.from_records() method to create a new DataFrame from the list.
Set the column headings through the DataFrame's columns attribute.
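
As an illustration, the loop and the DataFrame construction might look like the sketch below. The fetch argument stands in for the getNews() function, and the column names "title" and "text" are assumptions; fake_fetch exists only so the example runs without an API key.

```python
import pandas as pd


def build_news_frame(source_list, fetch):
    """Collect news from every source and return it as a DataFrame.

    `fetch` is any callable that takes a source ID and returns a list of
    (title, text) tuples.
    """
    news_list = []
    for source_id in source_list:
        news_list.extend(fetch(source_id))
    df = pd.DataFrame.from_records(news_list)
    # Assigning to .columns requires at least one record in news_list,
    # otherwise the frame has no columns to rename.
    df.columns = ["title", "text"]
    return df


def fake_fetch(source_id):
    # Placeholder data used only for this demonstration.
    return [(f"{source_id} headline", f"{source_id} body text")]


news_df = build_news_frame(["source-a", "source-b"], fake_fetch)
print(news_df)
```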

Step 5: Create a DataFrame of Fake and Real News

Load the labeled data from the news.csv file in the same directory using the read_csv() method from pandas.
Add the column headings to the DataFrame.
Use the concat() method from pandas to concatenate both DataFrames.
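
The sketch below shows one way these steps could fit together. It reads from an in-memory stand-in instead of the real news.csv, and the column names ("title", "text", "label"), the label values, and the choice to label the API news as REAL are all assumptions for the example, not details taken from the dataset.

```python
import io

import pandas as pd

# In-memory stand-in for news.csv; in the project this would be
# pd.read_csv("news.csv").
csv_file = io.StringIO(
    "title,text,label\n"
    "Headline one,Body one,FAKE\n"
    "Headline two,Body two,REAL\n"
)
labeled_df = pd.read_csv(csv_file)
labeled_df.columns = ["title", "text", "label"]

# Stand-in for the DataFrame built from the News API results.
# Labeling these rows as REAL is an assumption made for this sketch.
api_df = pd.DataFrame(
    {"title": ["API headline"], "text": ["API body"], "label": ["REAL"]}
)

# ignore_index=True renumbers the rows so the index has no duplicates.
combined_df = pd.concat([labeled_df, api_df], ignore_index=True)
print(combined_df)
```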

Step 6: Train the Model

To create the training model, the following modules are required:

train_test_split from sklearn.model_selection
CountVectorizer from sklearn.feature_extraction.text
PassiveAggressiveClassifier from sklearn.linear_model
accuracy_score from sklearn.metrics

Follow the steps below to train the model:

Split the DataFrame into training and testing data using the train_test_split() function.
Use 70% of the data for training and 30% for testing.
Pass the combined title and text, together with the news labels, to the *arrays parameter of the train_test_split() function.
Use CountVectorizer to convert the text documents into a matrix of token counts.
Create a PassiveAggressiveClassifier model to separate real news from fake news.
Evaluate the model on the test data and calculate its accuracy using the accuracy_score() function.
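
A minimal end-to-end sketch of the training steps is shown below. The ten-row DataFrame is made-up toy data, and the column names, random_state, and max_iter values are assumptions; accuracy on such a tiny sample says nothing about real performance.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in for the combined DataFrame of fake and real news.
df = pd.DataFrame({
    "title": [
        "Aliens endorse candidate", "Council approves budget",
        "Miracle pill cures everything", "Storm closes coastal roads",
        "Moon landing staged again", "Team wins regional final",
        "Secret lizard government exposed", "Library extends opening hours",
        "Chocolate replaces all medicine", "Bridge repairs start Monday",
    ],
    "text": [
        "shocking secret revealed", "officials voted on spending",
        "doctors hate this trick", "authorities issued a warning",
        "shocking hoax revealed", "players celebrated the result",
        "shocking secret conspiracy", "officials announced the schedule",
        "doctors hate this secret", "authorities announced the closure",
    ],
    "label": ["FAKE", "REAL"] * 5,
})

# Combine title and text into a single document per article.
documents = df["title"] + " " + df["text"]

# 70% of the rows go to training, 30% to testing.
x_train, x_test, y_train, y_test = train_test_split(
    documents, df["label"], test_size=0.3, random_state=7
)

# Learn the vocabulary on the training set only, then reuse it for the test set.
vectorizer = CountVectorizer()
train_counts = vectorizer.fit_transform(x_train)
test_counts = vectorizer.transform(x_test)

model = PassiveAggressiveClassifier(max_iter=1000, random_state=7)
model.fit(train_counts, y_train)

accuracy = accuracy_score(y_test, model.predict(test_counts))
print(f"Accuracy: {accuracy:.2f}")
```

Fitting the vectorizer on the training split alone keeps test-set vocabulary from leaking into the model.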

By following these steps, we can create a machine learning model that distinguishes fake news from real news.

Credits

This project was created by OpenAI and is based on the tutorial available on DataCamp.
