Ensemble Machine Learning to Detect Sarcasm in English on Twitter Social Media

Mochamad Alfan Rosid
Muhammad Arginanta Kafi Sambada
Suhendro Busono
Fajar Muharram

Abstract

Detecting sarcasm in English tweets on social media platforms like Twitter is a complex task because sarcasm is subtle and ambiguous. This study explores an ensemble machine learning approach to sarcasm detection that draws on four classifiers: Logistic Regression, Naive Bayes, Decision Tree, and Support Vector Machine (SVM). A dataset of sarcastic and non-sarcastic English tweets was collected and preprocessed, and features representing lexical, syntactic, and semantic information were extracted to train and evaluate the models. Among the methods employed, SVM performed best, achieving an accuracy of 80% and an F1-score of 80% for sarcasm detection, which highlights its efficacy in capturing complex patterns and differentiating between sarcastic and non-sarcastic tweets. By leveraging the strengths of multiple machine learning algorithms, the ensemble approach enhances the overall performance of the sarcasm detection system and provides more robust and accurate detection, thereby improving the understanding of user sentiments and opinions in online conversations. This research contributes to sentiment analysis and natural language processing, and its findings have practical implications for interpreting user-generated content on platforms like Twitter and for facilitating more meaningful interactions.
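
As an illustration of the kind of ensemble evaluation the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes the four classifiers are combined by hard (majority) voting, which the abstract does not specify, and it uses made-up toy labels rather than the study's tweets. The function names and the tie-breaking rule are hypothetical choices for the example.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by hard (majority) voting.

    Assumption: ties go to the label predicted by the earliest model in the list.
    """
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in model order) whose label reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1-score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = sarcastic, 0 = non-sarcastic (not from the paper).
y_true = [1, 0, 1, 1, 0, 0]
model_predictions = [
    [1, 0, 1, 0, 0, 1],  # stand-in for Logistic Regression output
    [1, 1, 1, 0, 0, 0],  # stand-in for Naive Bayes output
    [0, 0, 1, 1, 0, 1],  # stand-in for Decision Tree output
    [1, 0, 0, 1, 0, 0],  # stand-in for SVM output
]

y_pred = majority_vote(model_predictions)
print("ensemble predictions:", y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
```

With an even number of voters, ties are possible, so the tie-breaking rule matters. A weighted or probability-based (soft) vote would be another reasonable design, but nothing in the abstract indicates which was used.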

Article Details

How to Cite
Rosid, M. A., Sambada, M. A. K., Busono, S., & Muharram, F. (2023). Ensemble Machine Learning to Detect Sarcasm in English on Twitter Social Media. Journal of Informatics Information System Software Engineering and Applications (INISTA), 6(1), 11-20. https://doi.org/10.20895/inista.v6i1.1073
