Hoax News Detection in Indonesian Political Headlines Using Multinomial Naive Bayes

Bertrand Baldomero Ferguson
Wirawan Istiono

Abstract

Social media is a venue for online social interaction where users can share information freely. Because of this freedom, some irresponsible users inevitably misuse social media to spread hoax news. A survey of 2,032 respondents conducted by DailySocial.id in 2018 concluded that most Indonesians are not yet able to detect hoax news. This study therefore aims to design and build an Android-based hoax news detection application that uses the Multinomial Naive Bayes algorithm. In the design stage, the application was designed to accept the text of a political news headline as input. The Multinomial Naive Bayes algorithm then detects hoax news by comparing the entered text against the dataset. In the testing stage, the model was evaluated with a confusion matrix and achieved a hoax detection accuracy of 88.9%, a precision of 93.33%, a recall of 84%, and an F1-score of 88.4%. The application is expected to contribute to Indonesia's online environment by helping users verify information before they share it on social media.
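As a rough illustration of the approach described in the abstract, the following is a minimal sketch of a Multinomial Naive Bayes headline classifier together with the four confusion-matrix metrics. It is not the authors' implementation. The function names, the whitespace tokenizer, the use of raw word counts with Laplace smoothing, and the four toy headlines are all assumptions made for this example. The confusion-matrix counts are likewise not taken from the paper; they are illustrative integers chosen to be consistent with the reported percentages.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; the paper's preprocessing is not stated here.
    return text.lower().split()


def train_mnb(headlines, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on raw word counts with Laplace smoothing."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for text, label in zip(headlines, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {"log_prior": {}, "log_likelihood": {}}
    for c in class_docs:
        model["log_prior"][c] = math.log(class_docs[c] / len(labels))
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_likelihood"][c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][c]
        # Words never seen during training are skipped.
        scores[c] = log_prior + sum(ll[w] for w in tokenize(text) if w in ll)
    return max(scores, key=scores.get)


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical toy training data (not from the paper's dataset).
train_headlines = [
    "shocking secret plot exposed share before deleted",
    "minister secretly sells island viral shocking",
    "parliament passes annual budget bill",
    "election commission announces official vote schedule",
]
train_labels = ["hoax", "hoax", "valid", "valid"]

model = train_mnb(train_headlines, train_labels)
print(predict(model, "shocking viral secret about election"))
print(predict(model, "parliament announces budget schedule"))

# Illustrative counts only (TP, FP, FN, TN); not reported in the source.
acc, prec, rec, f1 = binary_metrics(tp=42, fp=3, fn=8, tn=46)
print(f"accuracy={acc:.3f} precision={prec:.4f} recall={rec:.2f} f1={f1:.3f}")
```

The metric function also shows how the reported figures relate to one another: F1 is the harmonic mean of precision and recall, so a precision of 93.33% and a recall of 84% imply an F1-score of about 88.4%, which matches the abstract.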

Article Details

Section
Articles
