Sentiment Analysis Comparison of Two E-Commerce Platforms Using Random Forest, Support Vector Machine, Logistic Regression, and IndoBERT

Authors

  • Theresia Vania Davita Suyana, Universitas Bina Nusantara
  • Sfenrianto, Universitas Bina Nusantara

DOI:

https://doi.org/10.59261/jequi.v8i2.324

Keywords:

Sentiment Analysis, IndoBERT, Machine Learning, Deep Learning, E-Commerce, VADER-Based

Abstract

Background: The rapid growth of e-commerce in Indonesia has generated massive volumes of user-generated reviews. A critical mismatch exists between numerical star ratings and the actual sentiment expressed in review texts, creating unreliable signals for platform management and highlighting the need for text-based sentiment analysis.

Objective: This study aims to analyze user review sentiments towards two leading e-commerce platforms in Indonesia using Machine Learning and Deep Learning approaches.

Methods: The analysis followed the CRISP-DM stages, including data cleaning, labeling, model training, and performance evaluation. Two labeling schemes, Rating-Based and VADER-Based, were used to compare sentiment classification accuracy. Four models were applied: Random Forest, Support Vector Machine (SVM), Logistic Regression, and IndoBERT. Because VADER is an English-language lexicon, it was adapted for Indonesian by adding an Indonesian-English translation layer in preprocessing.
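
The two labeling schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the star-to-label mapping (1-2 negative, 3 neutral, 4-5 positive) is an assumption, the function names are hypothetical, and the sample data are invented for illustration. The ±0.05 cutoffs on the compound score are the conventional thresholds suggested in the VADER documentation. In practice the compound score would come from running VADER on the translated review text; here it is passed in directly so the sketch needs no external library.

```python
from typing import List, Tuple


def label_from_rating(stars: int) -> str:
    """Rating-based label. Assumed mapping: 1-2 negative, 3 neutral, 4-5 positive."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Text-based label from a VADER compound score in [-1, 1]."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def mismatch_rate(reviews: List[Tuple[int, float]]) -> float:
    """Share of reviews whose rating-based and text-based labels disagree."""
    if not reviews:
        return 0.0
    disagreements = sum(
        1
        for stars, compound in reviews
        if label_from_rating(stars) != label_from_compound(compound)
    )
    return disagreements / len(reviews)


# Hypothetical (stars, compound score) pairs, invented for illustration only.
sample = [(5, 0.82), (5, -0.40), (1, -0.65), (3, 0.02), (4, 0.00)]
rate = mismatch_rate(sample)
print(f"Label mismatch rate: {rate:.2f}")
```

In this invented sample, a five-star review with negative text and a four-star review with neutral text are the two disagreements, which is the kind of rating-versus-text mismatch the Background describes.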

Results: IndoBERT showed the best performance on e-commerce X, with an accuracy of 0.96 using VADER-based labeling. For e-commerce Y, Random Forest achieved an accuracy of 0.81, also using VADER-based labeling. These results suggest that IndoBERT's Transformer architecture with contextual embeddings enabled a superior understanding of Indonesian semantic nuances. Random Forest's advantage on e-commerce Y (627 samples per label) reflects a lower overfitting risk than deep learning models face on small datasets.

Conclusion: This study demonstrates the effectiveness of combining IndoBERT and VADER in Indonesian sentiment analysis and can serve as a reference for the e-commerce industry to improve service quality and customer satisfaction strategies.

Published

2026-06-15