Sentiment Analysis of Movie Reviews on IMDb Using Logistic Regression Algorithm

Muhammad Azka Faridi
Fauzia Tuzzahra
Adnan Al-Qadri
Rhaudy Nahavira
Putri Amelia Az-Zahrah
Cindi Apriliani
Abdiansah

Abstract

Sentiment analysis is a machine learning task that focuses on analyzing opinions expressed in text. IMDb is a widely used platform where movie enthusiasts worldwide share their thoughts and reviews, and this user feedback can serve as a benchmark for evaluating a film's success. This research classifies IMDb reviews as positive or negative using the Logistic Regression algorithm, both on its own and combined with Grid Search and Active Learning. The highest accuracy, 90.90%, was achieved by Logistic Regression combined with both Grid Search and Active Learning. Logistic Regression with Active Learning alone achieved 90.58%, Logistic Regression with Grid Search alone reached 90.17%, and the baseline Logistic Regression model achieved 89.81%.
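
To make the combination described in the abstract concrete, the following is a minimal sketch of how logistic regression, a grid search over one hyperparameter, and an active learning loop can fit together. It is not the authors' implementation. The toy reviews, the hand-written gradient descent, the single holdout split used for the grid search, the L2 strength as the tuned hyperparameter, and uncertainty sampling as the query strategy are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

# Hypothetical toy reviews (NOT the IMDb dataset): 1 = positive, 0 = negative.
DATA = [
    ("great movie loved the acting", 1),
    ("wonderful film great story", 1),
    ("loved it wonderful cast", 1),
    ("brilliant acting great film", 1),
    ("enjoyable story loved the cast", 1),
    ("brilliant and enjoyable movie", 1),
    ("terrible movie hated the acting", 0),
    ("boring film terrible story", 0),
    ("hated it boring cast", 0),
    ("awful acting terrible film", 0),
    ("dull story hated the cast", 0),
    ("awful and dull movie", 0),
]

def featurize(text, vocab):
    """Binary bag-of-words vector: 1.0 if the vocabulary word occurs in the text."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

def predict_proba(x, w, b):
    """Probability of the positive class under the logistic model."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, l2, epochs=300, lr=0.5):
    """Batch gradient descent on log loss with an L2 penalty on the weights."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errs = [predict_proba(x, w, b) - t for x, t in zip(X, y)]
        w = [wj - lr * (sum(e * x[j] for e, x in zip(errs, X)) / n + l2 * wj)
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b

def accuracy(X, y, w, b):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum((predict_proba(x, w, b) >= 0.5) == (t == 1) for x, t in zip(X, y))
    return hits / len(X)

def grid_search(X_tr, y_tr, X_val, y_val, l2_grid):
    """Return the L2 strength with the best holdout accuracy (first wins ties)."""
    scored = [(accuracy(X_val, y_val, *train(X_tr, y_tr, l2)), l2) for l2 in l2_grid]
    best_acc = max(acc for acc, _ in scored)
    return next(l2 for acc, l2 in scored if acc == best_acc)

def active_learning(X_pool, y_pool, seed_idx, rounds, l2):
    """Uncertainty sampling: each round, label the pool item closest to p = 0.5."""
    labeled = list(seed_idx)
    for _ in range(rounds):
        w, b = train([X_pool[i] for i in labeled], [y_pool[i] for i in labeled], l2)
        unlabeled = [i for i in range(len(X_pool)) if i not in labeled]
        query = min(unlabeled, key=lambda i: abs(predict_proba(X_pool[i], w, b) - 0.5))
        labeled.append(query)  # the "oracle" reveals y_pool[query]
    w, b = train([X_pool[i] for i in labeled], [y_pool[i] for i in labeled], l2)
    return labeled, (w, b)

vocab = sorted({word for text, _ in DATA for word in text.split()})
X = [featurize(text, vocab) for text, _ in DATA]
y = [label for _, label in DATA]

# Arbitrary holdout: the last review of each class is used for validation.
val_idx = [5, 11]
train_idx = [i for i in range(len(DATA)) if i not in val_idx]
X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_val, y_val = [X[i] for i in val_idx], [y[i] for i in val_idx]

l2_grid = [0.001, 0.01, 0.1, 1.0]
best_l2 = grid_search(X_tr, y_tr, X_val, y_val, l2_grid)

# Seed with one positive and one negative review, then query four more labels.
labeled, (w, b) = active_learning(X_tr, y_tr, seed_idx=[0, 5], rounds=4, l2=best_l2)
val_acc = accuracy(X_val, y_val, w, b)
print(f"best l2: {best_l2}, labeled examples: {len(labeled)}, holdout accuracy: {val_acc:.2f}")
```

On a corpus as small as this one, several grid candidates may tie on the holdout set, so the sketch only shows the control flow. A realistic experiment would typically use cross-validation for the grid search and a much larger unlabeled pool for the active learning loop.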

How to Cite
Faridi, M. A., Tuzzahra, F., Al-Qadri, A., Nahavira, R., Az-Zahrah, P. A., Apriliani, C., & Abdiansah. (2025). Sentiment Analysis of Movie Reviews on IMDb Using Logistic Regression Algorithm. JITSI: Jurnal Ilmiah Teknologi Sistem Informasi, 6(2), 132–139. https://doi.org/10.62527/jitsi.6.2.422