Extracting Post‑Disaster Health Impact Information from News Reports Using Named Entity Recognition
Abstract
Natural disasters have a significant impact on public health and give rise to a range of post-disaster illnesses. This study presents an automated information-extraction framework based on Named Entity Recognition (NER) that uses the IndoBERT model to identify disaster types, health impacts, and affected locations in online news reports. Data were gathered by web scraping from multiple reputable news portals and then preprocessed through tokenization, stop-word removal, and lemmatization. Extracted entities were visualized with bar charts and word clouds to reveal disease patterns associated with each disaster type. The results indicate that floods have a significant public health impact, with skin diseases the most prevalent, followed by diarrhea, fever, influenza, and acute respiratory infections (ARI). Volcanic eruptions are linked to ARI, hypertension, diarrhea, and influenza, whereas earthquakes are strongly associated with diarrhea, ARI, skin diseases, and fever. Droughts and landslides are closely associated with diarrheal outbreaks because limited access to clean water compromises sanitation. Although less frequently reported, tsunamis also show a notable association with diarrhea. The proposed method achieves 90% accuracy and an 88% F1-score. These findings support the effectiveness of the NER-based approach in detecting associations between disasters and health outcomes, providing valuable insights for policymakers and healthcare professionals designing targeted post-disaster mitigation and response strategies.
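To make the aggregation step in the abstract concrete, the following is a minimal sketch of how NER output could be turned into per-disaster disease counts of the kind shown in a bar chart. It is not the authors' implementation: the BIO tag names (`DISASTER`, `DISEASE`, `LOCATION`), the helper functions, and the two toy Indonesian sentences are all assumptions made for illustration, and the tags are written by hand rather than predicted by IndoBERT.

```python
from collections import Counter, defaultdict


def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    entities, label, span = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity still open.
            if label:
                entities.append((label, " ".join(span)))
            label, span = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open entity.
            span.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close the open entity.
            if label:
                entities.append((label, " ".join(span)))
            label, span = None, []
    if label:
        entities.append((label, " ".join(span)))
    return entities


def count_cooccurrences(tagged_articles):
    """Count, per disaster type, how many articles mention each disease."""
    counts = defaultdict(Counter)
    for tokens, tags in tagged_articles:
        entities = decode_bio(tokens, tags)
        disasters = {text.lower() for lab, text in entities if lab == "DISASTER"}
        diseases = {text.lower() for lab, text in entities if lab == "DISEASE"}
        for disaster in disasters:
            counts[disaster].update(diseases)
    return counts


# Hypothetical, hand-tagged examples (banjir = flood, diare = diarrhea,
# penyakit kulit = skin disease). Real tags would come from the NER model.
articles = [
    (["Banjir", "di", "Demak", "picu", "penyakit", "kulit", "dan", "diare"],
     ["B-DISASTER", "O", "B-LOCATION", "O", "B-DISEASE", "I-DISEASE", "O", "B-DISEASE"]),
    (["Korban", "banjir", "terserang", "diare"],
     ["O", "B-DISASTER", "O", "B-DISEASE"]),
]

counts = count_cooccurrences(articles)
for disaster, diseases in counts.items():
    print(disaster, diseases.most_common())
```

Counting each disease at most once per article keeps a single long report from dominating the totals. Such counts reflect co-occurrence in news text only and would still need expert review before being read as health outcomes.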
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.