Fake News Detection Project
Overview
This project builds a fake news detection system using machine learning. The goal is to classify a news article as "TRUE" or "FAKE" based on its content. Three models, Logistic Regression, Naive Bayes, and Support Vector Machine (SVM), are trained individually and then combined into an ensemble intended to improve accuracy and reliability.
Project Structure
Data Preprocessing: The dataset is preprocessed to remove punctuation, convert text to lowercase, and eliminate stopwords.
Feature Extraction: TF-IDF (Term Frequency-Inverse Document Frequency) is used to convert the text data into numerical features suitable for machine learning models.
Model Training: Three individual models (Logistic Regression, Naive Bayes, and SVM) are trained on the preprocessed data. An ensemble built with a Voting Classifier then combines their predictions to draw on the strengths of each model.
Model Evaluation: The performance of the individual models and the ensemble model is evaluated using accuracy scores and classification reports.
Prediction Function: A function is provided to predict the authenticity of new news articles.
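The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the toy articles and labels are made up, the names preprocess and predict_news are hypothetical, the stopword list is scikit-learn's built-in English list, and hard (majority) voting is assumed because the README does not say which voting mode is used.

```python
import re
import string

from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def preprocess(text):
    """Lowercase, replace punctuation with spaces, and drop English stopwords."""
    text = text.lower()
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    return " ".join(w for w in text.split() if w not in ENGLISH_STOP_WORDS)


# Toy data for illustration only (assumption: the real project loads a dataset).
train_texts = [
    "Parliament approves the annual budget after a long debate",
    "Central bank publishes its quarterly inflation report",
    "City council confirms new schedule for road repairs",
    "Miracle fruit cures every disease overnight, doctors stunned",
    "Secret moon base discovered by anonymous blogger",
    "Celebrity reveals shocking trick that banks hate",
]
train_labels = ["TRUE", "TRUE", "TRUE", "FAKE", "FAKE", "FAKE"]
test_texts = [
    "Council publishes report on the city budget",
    "Shocking miracle trick stuns doctors",
]
test_labels = ["TRUE", "FAKE"]

# The three individual models named in the README.
base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", MultinomialNB()),
    ("svm", SVC(kernel="linear")),
]
candidates = dict(base_models)
# Hard voting: each model casts one vote and the majority label wins.
candidates["ensemble"] = VotingClassifier(estimators=base_models, voting="hard")

fitted = {}
for name, clf in candidates.items():
    # TF-IDF applies preprocess() to each article before tokenizing it.
    pipe = make_pipeline(TfidfVectorizer(preprocessor=preprocess), clf)
    pipe.fit(train_texts, train_labels)
    preds = pipe.predict(test_texts)
    print(name, "accuracy:", accuracy_score(test_labels, preds))
    fitted[name] = pipe

print(classification_report(
    test_labels, fitted["ensemble"].predict(test_texts), zero_division=0))


def predict_news(model, article):
    """Return the predicted label ("TRUE" or "FAKE") for one raw article."""
    return str(model.predict([article])[0])


print(predict_news(fitted["ensemble"], "Bank confirms new inflation report"))
```

Because the vectorizer and the classifier live in one pipeline, predict_news can take raw text and the same preprocessing is applied at training and prediction time. A fitted pipeline could also be saved with pickle for later reuse.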
Requirements
Python 3.x
pandas
scikit-learn
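json (Python standard library, no installation needed)
re (Python standard library, no installation needed)
pickle (Python standard library, no installation needed)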
Conclusion
This project shows how an ensemble model can be applied to fake news detection, combining Logistic Regression, Naive Bayes, and SVM with the aim of better accuracy and reliability than any single model.
For more detail, see the GitHub link: