Hoax News Detection in Indonesian Political Headlines Using Multinomial Naive Bayes
Abstract
Social media is a means of online social interaction on the Internet, where users can freely share information. Because of this freedom, some users inevitably misuse social media as a place to spread false news. A 2018 DailySocial.id survey of 2,032 respondents concluded that the majority of Indonesians are unable to detect hoax news. This research therefore aims to design and build an Android-based hoax news detection application that uses the Multinomial Naive Bayes algorithm. The application is designed to receive textual political news headlines as input. It then applies the Multinomial Naive Bayes algorithm to detect hoaxes by comparing the processed text against the dataset. In the testing phase, the algorithm is evaluated with a confusion matrix to measure how well it detects hoaxes. The hoax detection achieves an accuracy of 88.9%, a precision of 93.33%, a recall of 84%, and an F1 score of 88.4%. It is hoped that this hoax news detection application will contribute to a healthier online environment for the Indonesian people by helping them verify information before sharing it on social media.
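The classification and evaluation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's actual implementation: the whitespace tokenizer, the class name `MultinomialNaiveBayes`, the Laplace smoothing value, and the toy English headlines are all hypothetical. The confusion-matrix counts are likewise hypothetical; they were chosen only because they reproduce the reported percentages, and the abstract does not state the real counts or the test-set size.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplified preprocessing step)."""
    return text.lower().split()


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes over word counts with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        # Per-class word frequencies gathered from the labeled headlines.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        doc_counts = Counter(labels)
        self.log_prior = {c: math.log(doc_counts[c] / len(labels)) for c in self.classes}
        self.total_words = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def log_scores(self, text):
        """Return log P(class) + sum of log P(word | class) for each class."""
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    numerator = self.word_counts[c][word] + self.alpha
                    denominator = self.total_words[c] + self.alpha * vocab_size
                    score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Toy training data: hypothetical headlines, not the study's dataset.
train_texts = [
    "shocking secret plot revealed share now",
    "viral claim minister secretly arrested",
    "shocking viral claim about election fraud",
    "parliament passes annual budget bill",
    "minister opens new regional hospital",
    "election commission announces official schedule",
]
train_labels = ["hoax", "hoax", "hoax", "valid", "valid", "valid"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)
print(model.predict("shocking viral secret claim"))
print(model.predict("parliament passes budget"))

# F1 recomputed from the reported precision and recall only.
reported_precision, reported_recall = 0.9333, 0.84
reported_f1 = 2 * reported_precision * reported_recall / (reported_precision + reported_recall)
print(round(reported_f1 * 100, 1))

# HYPOTHETICAL counts (not from the source) that happen to match the reported figures.
accuracy, precision, recall, f1 = binary_metrics(tp=42, fp=3, fn=8, tn=46)
print(round(accuracy, 4), round(precision, 4), round(recall, 4), round(f1, 4))
```

The F1 check uses only the two reported values: 2 × 0.9333 × 0.84 / (0.9333 + 0.84) ≈ 0.884, which agrees with the stated 88.4%.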
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.