Ensemble Machine Learning to Detect Sarcasm in English on Twitter Social Media

Mochamad Alfan Rosid; Muhammad Arginanta Kafi Sambada; Suhendro Busono; Fajar Muharram

doi:10.20895/inista.v6i1.1073


Published Oct 13, 2023


Mochamad Alfan Rosid

Universitas Muhammadiyah Sidoarjo

Muhammad Arginanta Kafi Sambada

Universitas Muhammadiyah Sidoarjo

Suhendro Busono

Universitas Muhammadiyah Sidoarjo

Fajar Muharram

Universitas Muhammadiyah Sidoarjo

Abstract

Detecting sarcasm in English tweets on social media platforms like Twitter is a complex task because sarcasm is subtle and often ambiguous. This study explores ensemble machine learning for sarcasm detection, combining Logistic Regression, Naive Bayes, Decision Tree, and Support Vector Machine (SVM) classifiers. A dataset of sarcastic and non-sarcastic English tweets was collected and preprocessed, and features representing lexical, syntactic, and semantic information were extracted to train and evaluate the models. Among the techniques employed, the SVM performed best, achieving an accuracy of 80% and an F1-score of 80% for sarcasm detection. This highlights the ability of SVMs to capture complex patterns and to differentiate between sarcastic and non-sarcastic tweets. By leveraging the strengths of multiple machine learning algorithms, the ensemble approach enhances the overall performance of the sarcasm detection system and provides more robust and accurate detection, which in turn improves the understanding of user sentiments and opinions in online conversations. This research contributes to sentiment analysis and natural language processing, and its findings have practical implications for interpreting user-generated content on platforms like Twitter.
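As an illustration of the ideas named in the abstract, the following minimal sketch shows one common way to combine several classifiers (hard majority voting) and how accuracy and F1-score are computed from the combined predictions. The abstract does not state how the four models were combined, so majority voting is an assumption here, and the labels, the per-model predictions, and the helper names `majority_vote`, `accuracy`, and `f1_score` are hypothetical toy values and names, not the study's data or code.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample (hard voting).

    Ties go to the label predicted by the earliest model in the list.
    """
    combined = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Pick the first vote (in model order) whose label reaches the top count.
        combined.append(next(v for v in sample_votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 1 = sarcastic, 0 = non-sarcastic.
y_true = [1, 0, 1, 1, 0, 0]
model_outputs = [
    [1, 0, 0, 1, 0, 1],  # stand-in for a Logistic Regression model
    [1, 1, 1, 0, 0, 1],  # stand-in for a Naive Bayes model
    [0, 0, 1, 1, 1, 1],  # stand-in for a Decision Tree model
    [1, 0, 1, 1, 1, 0],  # stand-in for an SVM model
]

y_pred = majority_vote(model_outputs)
print("ensemble prediction:", y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
```

In practice each row of `model_outputs` would come from a classifier trained on the extracted tweet features; a weighted or probability-based (soft) vote is an equally plausible design when the base models differ in reliability.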

How to Cite

Rosid, M. A., Sambada, M. A. K., Busono, S., & Muharram, F. (2023). Ensemble Machine Learning to Detect Sarcasm in English on Twitter Social Media. Journal of Informatics Information System Software Engineering and Applications (INISTA), 6(1), 11-20. https://doi.org/10.20895/inista.v6i1.1073

Issue

Vol 6 No 1 (2023): November 2023

Section

Articles

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.

