Sentiment Analysis Comparison of Two E-Commerce Platforms Using Random Forest, Support Vector Machine, Logistic Regression, and IndoBERT
DOI: https://doi.org/10.59261/jequi.v8i2.324
Keywords: Sentiment Analysis, IndoBERT, Machine Learning, Deep Learning, E-Commerce, VADER-Based
Abstract
Background: The rapid growth of e-commerce in Indonesia has generated massive volumes of user-generated reviews. A critical mismatch exists between numerical star ratings and the actual sentiment expressed in review texts, creating unreliable signals for platform management and highlighting the need for text-based sentiment analysis.
Objective: This study aims to analyze user review sentiments towards two leading e-commerce platforms in Indonesia using Machine Learning and Deep Learning approaches.
Methods: The analysis followed the CRISP-DM stages, including data cleaning, labeling, model training, and performance evaluation. Two labeling schemes, Rating-Based and VADER-Based, were used to compare sentiment classification accuracy. Four models were applied: Random Forest, Support Vector Machine (SVM), Logistic Regression, and IndoBERT. Because VADER is an English-language lexicon, VADER labeling was adapted for Indonesian by adding an Indonesian-English translation layer during preprocessing.
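The two labeling schemes can be illustrated with a minimal Python sketch. It is not the authors' code. The star-rating split (1-2 negative, 3 neutral, 4-5 positive) is an assumption, since the abstract does not state the mapping. The ±0.05 cut-off on the compound score is the threshold conventionally recommended for VADER. The compound scores below are hand-written placeholders rather than real VADER output, and the function names are hypothetical.

```python
from typing import List


def label_from_rating(stars: int) -> str:
    """Rating-based label. The 1-2 / 3 / 4-5 split is an assumption."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """VADER-style label from a compound score in [-1, 1]."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def agreement_rate(ratings: List[int], compounds: List[float]) -> float:
    """Fraction of reviews where both labeling schemes give the same label."""
    pairs = list(zip(ratings, compounds))
    matches = sum(
        label_from_rating(r) == label_from_compound(c) for r, c in pairs
    )
    return matches / len(pairs)


# Hypothetical reviews: star ratings and placeholder compound scores.
# The second review is a 5-star rating attached to negative text.
ratings = [5, 5, 1, 3, 4]
compounds = [0.82, -0.61, -0.74, 0.02, 0.40]

print(agreement_rate(ratings, compounds))  # 4 of 5 labels agree -> 0.8
```

An agreement rate below 1.0 is one simple way to quantify how often a star rating and the review text point in different directions.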
Results: IndoBERT performed best on e-commerce X, with an accuracy of 0.96 using VADER-based labeling. On e-commerce Y, Random Forest achieved an accuracy of 0.81 using VADER-based labeling. These results indicate that IndoBERT's Transformer architecture with contextual embeddings enabled superior understanding of Indonesian semantic nuances. Random Forest's advantage on e-commerce Y (627 samples per label) reflects a lower overfitting risk compared to deep learning models on small datasets.
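The accuracy figures above are the fraction of reviews whose predicted label matches the reference label. A minimal Python sketch of that calculation follows, with per-label recall added as a companion check. The labels are made-up placeholders, not the study's data, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of predictions that equal the reference label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_recall(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, float]:
    """Recall per label: correct predictions divided by reference count."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Placeholder labels for six hypothetical reviews, two per class.
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]

print(accuracy(y_true, y_pred))
print(per_label_recall(y_true, y_pred))
```

With an equal number of samples per label, as reported for e-commerce Y, overall accuracy equals the unweighted mean of the per-label recalls, so neither number is dominated by a majority class.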
Conclusion: This study demonstrates the effectiveness of combining IndoBERT and VADER in Indonesian sentiment analysis and can serve as a reference for the e-commerce industry to improve service quality and customer satisfaction strategies.
License
Copyright (c) 2026 Theresia Vania Davita Suyana, Sfenrianto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.



