Predicting malaria outbreaks in Somaliland using an XGBoost machine learning framework
2026 (English). In: Discover public health, ISSN 3005-0774, Vol. 23, no 1, article id 718. Article in journal (Refereed), Published
Abstract [en]

Background: Malaria remains a significant public health challenge in Somaliland. This study evaluates a preliminary machine learning approach, rather than a full operational system, to predict malaria outbreak years in a data-scarce environment using a limited historical dataset (2002–2021).

Methods: A retrospective study was conducted using annual data. An Extreme Gradient Boosting (XGBoost) model performed binary classification of malaria incidence into ‘Outbreak’ and ‘Non-Outbreak’ years. To address the methodological constraints of the small sample size (N = 20) and mitigate the risk of overfitting, a Leave-One-Year-Out Cross-Validation (LOYOCV) strategy was employed, and results were compared against a Logistic Regression baseline. Predictor variables included temperature, rainfall, 1-year lagged rainfall, urbanization, and land-use patterns.

Results: The XGBoost model achieved an AUC of 0.880, significantly outperforming the baseline (AUC 0.710). At the optimal threshold, the model yielded a sensitivity of 0.750 and a precision of 0.600. However, the discrete staircase appearance of the resulting ROC curve reflects the model’s high sensitivity to individual data points within the small sample, indicating that these performance metrics should be interpreted with caution.

Conclusion: While promising, these results are preliminary. The small sample size and the temporal clustering of outbreaks in the early 2000s suggest that this work serves as a proof-of-concept for data-scarce regions rather than a definitive surveillance tool. Further prospective validation with higher-resolution temporal data is required to ensure the reliability and generalizability of these associations for operational early warning. © The Author(s) 2026.
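The validation scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the nearest-centroid scorer (a stand-in used here instead of XGBoost so the example needs only the standard library) are all assumptions made for illustration. It shows how leave-one-year-out cross-validation yields one out-of-fold score per year, and how AUC is then computed from the pooled scores.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Scorer = Callable[[List[Row], List[int], Row], float]


def leave_one_year_out(X: List[Row], y: List[int], fit_predict: Scorer) -> List[float]:
    """Hold out each year in turn, fit on the rest, and score the held-out year."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        scores.append(fit_predict(train_X, train_y, X[i]))
    return scores


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (outbreak, non-outbreak) pairs ranked correctly.

    Ties count as half a win. With few years there are few pairs, so the
    value moves in coarse steps of 1 / (n_pos * n_neg).
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one year of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_score(train_X: List[Row], train_y: List[int], x: Row) -> float:
    """Stand-in model (NOT XGBoost): higher score means closer to the outbreak centroid.

    Assumes both classes are still present after one year is held out.
    """
    def centroid(label: int) -> List[float]:
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a: Row, b: Row) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return dist2(x, centroid(0)) - dist2(x, centroid(1))


# Synthetic toy data (hypothetical, not from the study): one feature per year.
X_toy = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_toy = [0, 0, 0, 1, 1, 1]

oof_scores = leave_one_year_out(X_toy, y_toy, centroid_score)
print("out-of-fold AUC:", auc(y_toy, oof_scores))
```

Because every year is scored by a model that never saw it, the pooled AUC is an out-of-sample estimate. The pairwise definition also makes the abstract's caveat concrete: with N = 20 and only a handful of outbreak years, reclassifying a single year shifts the AUC by a visible step, which is why the ROC curve looks like a staircase.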

Place, publisher, year, edition, pages
BioMed Central Ltd, 2026. Vol. 23, no 1, article id 718
Keywords [en]
Data-scarce modeling, Machine learning, Malaria, Outbreak prediction, Preliminary approach, Somaliland, XGBoost
National Category
Public Health, Global Health and Social Medicine
Identifiers
URN: urn:nbn:se:du-53764
DOI: 10.1186/s12982-026-02070-2
ISI: 001768011300001
Scopus ID: 2-s2.0-105039112375
OAI: oai:DiVA.org:du-53764
DiVA, id: diva2:2064134
Available from: 2026-06-01 Created: 2026-06-01 Last updated: 2026-06-01
Bibliographically approved

Open Access in DiVA

fulltext (1131 kB), 14 downloads
File information
File name: FULLTEXT01.pdf
File size: 1131 kB
Checksum (SHA-512): 356a81f3ecc071718adcbecdfb3e9c48ab3156c00ad8252712fb7603a9b97b505ad361ddb1e35a63e7fbf44a1940b49811cc897feab7708368aa391be84d3217
Type: fulltext
Mimetype: application/pdf

Authority records

Hared, Yusuf Abdi

The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
