Sentiment Analysis of Movie Reviews on IMDb Using Logistic Regression Algorithm
Abstract
Sentiment analysis is a text classification task, commonly approached with machine learning, that identifies the opinions expressed in textual data. IMDb is a widely used platform where movie enthusiasts worldwide share their thoughts and reviews, and this user feedback can serve as one indicator of a film's success. This research classifies IMDb reviews as positive or negative using the Logistic Regression algorithm, both on its own and combined with Grid Search and Active Learning. The highest accuracy, 90.90%, was achieved by Logistic Regression combined with both Grid Search and Active Learning. Logistic Regression with Active Learning alone achieved 90.58%, Logistic Regression with Grid Search achieved 90.17%, and the baseline Logistic Regression model achieved 89.81%.
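The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which library, features, hyperparameter grid, or query strategy were used. The sketch assumes a tiny synthetic corpus in place of IMDb data, bag-of-words counts as features, a grid over a single L2 penalty value, and uncertainty sampling as the active learning strategy. All function names here are hypothetical, and everything is written in plain Python so the mechanics stay visible.

```python
import math
import random

# Hypothetical toy vocabulary standing in for real review text (not IMDb data).
POS = ["great", "wonderful", "moving", "brilliant", "enjoyable"]
NEG = ["boring", "awful", "dull", "terrible", "weak"]
NEUTRAL = ["film", "plot", "acting", "story", "scene"]
VOCAB = {w: i for i, w in enumerate(POS + NEG + NEUTRAL)}

def make_reviews(n, seed):
    """Generate n synthetic (text, label) pairs; label 1 = positive, 0 = negative."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        words = rng.sample(POS if label else NEG, 2) + rng.sample(NEUTRAL, 2)
        data.append((" ".join(words), label))
    return data

def featurize(text):
    """Bag-of-words count vector over VOCAB."""
    vec = [0.0] * len(VOCAB)
    for w in text.split():
        if w in VOCAB:
            vec[VOCAB[w]] += 1.0
    return vec

def predict_proba(weights, bias, x):
    """Probability of the positive class under the logistic model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def fit(X, y, l2, epochs=200, lr=0.5):
    """Batch gradient descent on the L2-regularized log loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, t in zip(X, y):
            err = predict_proba(weights, bias, x) - t
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def accuracy(model, X, y):
    weights, bias = model
    hits = sum(int(predict_proba(weights, bias, x) >= 0.5) == t for x, t in zip(X, y))
    return hits / len(y)

def grid_search(X, y, X_val, y_val, grid):
    """Return the L2 strength with the best validation accuracy (first wins on ties)."""
    return max(grid, key=lambda l2: accuracy(fit(X, y, l2), X_val, y_val))

def active_learning(pool_X, pool_y, X_val, y_val, grid=(0.0, 0.01, 0.1),
                    seed_size=4, rounds=5, batch=2):
    """Uncertainty sampling: each round, label the pool items closest to p = 0.5."""
    labeled = list(range(seed_size))
    unlabeled = list(range(seed_size, len(pool_X)))
    for _ in range(rounds):
        Xl = [pool_X[i] for i in labeled]
        yl = [pool_y[i] for i in labeled]
        l2 = grid_search(Xl, yl, X_val, y_val, grid)
        weights, bias = fit(Xl, yl, l2)
        unlabeled.sort(key=lambda i: abs(predict_proba(weights, bias, pool_X[i]) - 0.5))
        labeled += unlabeled[:batch]      # "query the oracle" for the least certain items
        unlabeled = unlabeled[batch:]
    Xl = [pool_X[i] for i in labeled]
    yl = [pool_y[i] for i in labeled]
    best_l2 = grid_search(Xl, yl, X_val, y_val, grid)
    return fit(Xl, yl, best_l2), best_l2, len(labeled)

def split(data):
    return [featurize(text) for text, _ in data], [label for _, label in data]

pool_X, pool_y = split(make_reviews(40, seed=0))
X_val, y_val = split(make_reviews(20, seed=1))
X_test, y_test = split(make_reviews(20, seed=2))

model, best_l2, n_labeled = active_learning(pool_X, pool_y, X_val, y_val)
test_acc = accuracy(model, X_test, y_test)
print(f"labeled={n_labeled} best_l2={best_l2} test_accuracy={test_acc:.2f}")
```

In this sketch the grid search is rerun inside every active learning round, so the penalty is re-tuned as the labeled set grows. Whether the study tuned once or per round is not stated in the abstract, so that ordering is also an assumption.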
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.