Fake News Detection

Welcome to our GitHub repository! In this project, we use machine learning to identify and classify fake news articles. Our primary goal is to implement and compare two distinct models: a pre-trained BERT model that we fine-tune for the task, and a logistic regression model.

Project Overview

This project uses the WELFake dataset for training our models. A separate file, news.csv, is held back for blind testing, so that we can measure how the models perform on articles they have never seen. BERT uses deep learning to model the context of words in text, while logistic regression is simple yet effective. By comparing the two, we aim to develop a robust system for detecting fake news and to see how much the deep learning approach gains over a simple baseline.
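To make the baseline side of the comparison concrete, here is a minimal sketch of a logistic regression text classifier. It assumes scikit-learn (listed in the references) and a TF-IDF representation of the text. The TF-IDF step, the toy sentences and the label convention (1 = fake, 0 = real) are illustrative assumptions, not necessarily what this repository's code uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the training data (made-up examples, not taken from WELFake).
# Assumed label convention for this sketch only: 1 = fake, 0 = real.
train_texts = [
    "Scientists confirm the moon is made of cheese, officials stunned",
    "Miracle pill cures every disease overnight, doctors furious",
    "Central bank holds interest rates steady after quarterly meeting",
    "City council approves budget for new public library",
]
train_labels = [1, 1, 0, 0]

# TF-IDF turns each article into a sparse vector of weighted word counts;
# logistic regression then fits a linear decision boundary over those vectors.
baseline = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

# Stand-in for unseen articles, playing the role news.csv plays in the project.
unseen_texts = [
    "Miracle cheese cures disease, scientists stunned",
    "Council holds meeting on library budget",
]
predictions = baseline.predict(unseen_texts)          # one label per article
probabilities = baseline.predict_proba(unseen_texts)  # one row per article, one column per class
```

In a real run, the training texts and labels would be read from the WELFake CSV rather than typed inline, and the pipeline would be fitted once and reused for the blind test.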

Objectives

To train and fine-tune a pre-trained BERT model on the WELFake dataset for fake news detection.
To implement a logistic regression model as a comparative baseline to assess its performance against deep learning techniques.
To evaluate both models on a separate dataset (news.csv) to check how well they generalise to unseen data.
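The third objective comes down to scoring each model's predictions against the true labels of the unseen set. The sketch below shows one way to do that in plain Python. The function name `binary_metrics`, the choice of metrics and the example label lists are hypothetical and for illustration only; they are not results from this project.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a model never predicts the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions, purely to show the comparison.
y_true = [1, 0, 1, 1, 0, 0]
bert_pred = [1, 0, 1, 0, 0, 0]
logreg_pred = [1, 1, 0, 1, 0, 0]

for name, pred in [("BERT", bert_pred), ("Logistic regression", logreg_pred)]:
    scores = binary_metrics(y_true, pred)
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Reporting precision and recall alongside accuracy helps when the unseen set is imbalanced, since accuracy alone can hide a model that misses most fake articles.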

Contributors

Brian: Logistic Regression Model and Training
Shan Shan: Data Exploration, Data Processing and Unseen Testing
Magnus: BERT Model and Training

References

https://arxiv.org/abs/1810.04805
https://huggingface.co/docs/transformers/en/index
https://scikit-learn.org/stable/
https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification
https://www.kaggle.com/datasets/antonioskokiantonis/newscsv
https://www.researchgate.net/figure/The-architecture-of-the-Fine-tuned-BERT-base-classifier_fig3_351392408
