Abstract:
The popularity and usage of online media platforms are increasing day by day, and the volume of data being disseminated is growing rapidly. The rise of social networks has accelerated the spread of rumors, satire, and false information, leading to an increase in the distribution of fake news. Identifying such news as real or fake is therefore an important task in digital life.
Fake news may appear in different domains such as politics, entertainment, and sports. Various studies applying machine learning and deep learning algorithms to this problem are found in the literature. Generalizing a learning model by identifying patterns in text helps to differentiate fake news from real news. Fake news detection using BERT and LSTM techniques is currently an active and competitive area of research. A model is proposed using BERT and DistilBERT to detect fake news across multiple domains, and its performance is compared with Naive Bayes, Decision Tree, Random Forest, Logistic Regression, and SVM classifiers. It is evaluated on four datasets: the Twitter dataset, the ISOT dataset, the LIAR dataset, and the Kaggle dataset.
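A minimal sketch of how such a classical baseline comparison could be set up is given below; the TF-IDF features, file name, column names, and train/test split are illustrative assumptions, not the exact configuration used in this study.

```python
# Sketch of the classical baseline comparison (assumed TF-IDF features;
# the dataset file, column names, and split ratio are illustrative only).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("news.csv")   # hypothetical file with "text" and "label" columns
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

vectorizer = TfidfVectorizer(max_features=50_000, stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

baselines = {
    "Naive Bayes": MultinomialNB(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": LinearSVC(),
}
for name, clf in baselines.items():
    clf.fit(X_train_vec, y_train)
    preds = clf.predict(X_test_vec)
    print(f"{name}: accuracy = {accuracy_score(y_test, preds):.4f}")
```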
BERT is a widely used pre-trained transformer model for various Natural Language Processing applications. Pre-training and fine-tuning are the two stages of training BERT. Pre-training consists of two tasks, Masked Language Modeling (MLM) and Next Sentence Prediction (NSP), which are trained simultaneously. These pre-training tasks improve the performance of the BERT model.
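The sketch below illustrates how inputs for these two pre-training tasks can be prepared with the Hugging Face transformers library; the example sentences, the bert-base-uncased checkpoint, and the 15% masking ratio are library defaults used here as assumptions rather than details taken from this study.

```python
# Illustration of the two pre-training inputs (sentences and masking ratio are illustrative).
from transformers import BertTokenizer, DataCollatorForLanguageModeling

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# NSP-style input: two segments packed as [CLS] A [SEP] B [SEP];
# the model predicts whether segment B actually follows segment A.
encoded = tokenizer("The match ended late last night.",
                    "Fans celebrated the victory downtown.",
                    return_tensors="pt")

# MLM-style input: the collator randomly replaces ~15% of tokens with [MASK],
# and the model is trained to recover the original tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
batch = collator([{"input_ids": encoded["input_ids"][0]}])
print(tokenizer.decode(batch["input_ids"][0]))
```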
BERT is a stack of encoders, so its outputs are vectors. The output vectors are fed to a fully connected layer whose number of neurons equals the number of tokens in the vocabulary, and a softmax activation converts each output vector into a probability distribution over the vocabulary.
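A minimal PyTorch sketch of this output head is shown below; the hidden size and vocabulary size correspond to bert-base-uncased, and the random tensor stands in for actual encoder outputs.

```python
# Minimal sketch of the output head described above: encoder output vectors
# are projected to vocabulary size and turned into a distribution with softmax.
import torch
import torch.nn as nn

hidden_size = 768    # dimensionality of each BERT-base output vector
vocab_size = 30522   # number of tokens in the BERT-base vocabulary

head = nn.Linear(hidden_size, vocab_size)          # one neuron per vocabulary token

encoder_outputs = torch.randn(1, 12, hidden_size)  # stand-in for encoder outputs (batch, seq_len, hidden)
logits = head(encoder_outputs)
probs = torch.softmax(logits, dim=-1)              # distribution over the vocabulary at each position
print(probs.shape)                                 # torch.Size([1, 12, 30522])
```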
DistilBERT is a distilled version of BERT that is used to reduce training time and memory footprint. The BERT model obtained
an accuracy of 94.8%, 100%, and 99.89% on the Twitter, ISOT, and Kaggle datasets, respectively; DistilBERT obtained an accuracy of 78.68% on the LIAR dataset.
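As a rough illustration of the fine-tuning setup described above, the sketch below fine-tunes DistilBERT as a binary fake/real classifier using the Hugging Face transformers library; the checkpoint, hyperparameters, and placeholder texts are assumptions and do not reproduce the reported results.

```python
# Sketch of fine-tuning DistilBERT for fake/real news classification
# (checkpoint, learning rate, epoch count, and texts are illustrative only).
import torch
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)   # 0 = real, 1 = fake

texts = ["example real headline", "example fake headline"]   # placeholder training texts
labels = torch.tensor([0, 1])

enc = tokenizer(texts, truncation=True, padding=True, max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                     # illustrative epoch count
    outputs = model(**enc, labels=labels)  # cross-entropy loss computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"epoch {epoch}: loss = {outputs.loss.item():.4f}")
```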