Indonesian Sentiment Analysis towards MyPertamina Application Reviews by Utilizing Machine Learning Algorithms
This paper reports an experiment on sentiment analysis of application reviews, covering both the methods and the data. Application reviews contain a large amount of raw data published by users in the form of text, images, audio, and video. Sentiment analysis can convert this data into valuable information. In this work, around 5,000 Indonesian-language reviews of the MyPertamina application on Google Play are analyzed. The goal of the study was to investigate how effectively sentiment analysis can extract valuable insights from application reviews. The work comprised data collection, pre-processing, feature extraction, TF-IDF text representation, machine learning modelling, and evaluation. The machine learning algorithms used are Linear Support Vector Classification (Linear SVC) and Multinomial Naïve Bayes (Multinomial NB). Both models perform well on this data: Multinomial NB reaches an accuracy of 95%, while Linear SVC reaches 96%. These results suggest that both Linear SVC and Multinomial NB are well suited to sentiment analysis tasks on Indonesian-language data. In addition, a word cloud analysis was performed in this experiment. The word clouds show the words that appear most frequently in positive and in negative reviews. Future work could expand the dataset to include reviews from a broader range of applications, or evaluate the performance of additional machine learning algorithms. It would also be interesting to analyze the word cloud results in more depth to identify common themes and trends in the positive and negative sentiments expressed in the reviews.
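The TF-IDF representation and Multinomial NB classification steps named above can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' implementation: the four toy reviews, the function names, the smoothed IDF formula, and the smoothing constant `alpha` are all assumptions made for the example, and Linear SVC is left out to keep the sketch short.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF weights; terms unseen during fitting are dropped."""
    weights = {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    return {t: w / norm for t, w in weights.items()}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    term_sums = defaultdict(Counter)
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            term_sums[label][term] += weight
    model = {}
    for label, count in Counter(labels).items():
        total = sum(term_sums[label].values()) + alpha * len(vocab)
        log_prior = math.log(count / len(labels))
        log_like = {t: math.log((term_sums[label][t] + alpha) / total)
                    for t in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict(model, vec):
    """Return the label with the highest log posterior score."""
    scores = {label: prior + sum(w * like[t] for t, w in vec.items())
              for label, (prior, like) in model.items()}
    return max(scores, key=scores.get)


# Hypothetical toy reviews, invented for this sketch (not from the dataset).
train = [
    ("aplikasi bagus dan mudah digunakan", "positive"),
    ("sangat membantu dan cepat", "positive"),
    ("aplikasi error terus tidak bisa login", "negative"),
    ("susah daftar dan sering gagal", "negative"),
]
docs = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab=set(idf))


def classify(text):
    """Tokenize, vectorise, and classify a single review."""
    return predict(model, tfidf(tokenize(text), idf))


print(classify("aplikasi mudah dan cepat"))
print(classify("gagal login terus error"))
```

On this toy data the first review should be labelled positive and the second negative. With only four training examples the sketch says nothing about the accuracy figures reported above; a real experiment would also need a held-out test split and the pre-processing steps the abstract lists.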