Effects of non-landslide sampling strategies on machine learning models in landslide susceptibility mapping

© 2024. The Author(s).

This study aims to explore the effects of different non-landslide sampling strategies on machine learning models in landslide susceptibility mapping. Non-landslide samples are inherently uncertain, and the selection of non-landslide samples may suffer from issues such as noisy or insufficient regional representations, which can affect the accuracy of the results. In this study, a positive-unlabeled (PU) bagging semi-supervised learning method was introduced for non-landslide sample selection. In addition, buffer control sampling (BCS) and K-means (KM) clustering were applied for comparative analysis. Based on landslide data from Qiaojia County, Yunnan Province, China, collected in 2014, three machine learning models, namely, random forest, support vector machine, and CatBoost, were used for landslide susceptibility mapping. The results show that the quality of samples selected using different non-landslide sampling strategies varies significantly. Overall, the quality of non-landslide samples selected using the PU bagging method is superior, and this method performs best when combined with CatBoost (AUC = 0.897) for predicting landslides in the very high and high susceptibility zones (82.14%). Additionally, the KM results indicated overfitting, displaying high accuracy for validation but poor statistical outcomes for zoning. The BCS results were the worst.
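
The abstract names PU bagging as the non-landslide selection method but does not describe its implementation. As a rough illustration of the general idea only, the sketch below repeatedly bootstraps unlabeled cells as provisional negatives, fits a base learner against the known landslides, and averages out-of-bag scores; unlabeled cells with the lowest average score are then taken as non-landslide samples. Everything here is an assumption made for illustration and not the authors' code: the nearest-centroid base learner, the function names `pu_bagging_scores` and `select_non_landslides`, the bag size (equal to the number of positives), and the number of rounds.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def positive_score(x, pos_center, neg_center):
    """Score in [0, 1]; values near 1 mean x lies closer to the positive centroid."""
    d_pos = math.dist(x, pos_center)
    d_neg = math.dist(x, neg_center)
    total = d_pos + d_neg
    return 0.5 if total == 0 else d_neg / total


def pu_bagging_scores(positives, unlabeled, n_rounds=200, seed=0):
    """Average out-of-bag positive score for every unlabeled sample.

    Hypothetical base learner: a nearest-centroid scorer (an assumption,
    chosen only to keep the sketch dependency-free).
    """
    rng = random.Random(seed)
    sums = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    pos_center = centroid(positives)
    for _ in range(n_rounds):
        # Bootstrap as many unlabeled samples as there are positives
        # and treat them as provisional negatives for this round.
        bag = [rng.randrange(len(unlabeled)) for _ in positives]
        neg_center = centroid([unlabeled[i] for i in bag])
        in_bag = set(bag)
        # Score only the samples that were left out of the bag.
        for i, x in enumerate(unlabeled):
            if i not in in_bag:
                sums[i] += positive_score(x, pos_center, neg_center)
                counts[i] += 1
    # None marks a sample that was never out of bag.
    return [s / c if c else None for s, c in zip(sums, counts)]


def select_non_landslides(unlabeled, scores, k):
    """Return the k unlabeled samples with the lowest average positive score."""
    scored = [i for i, s in enumerate(scores) if s is not None]
    ranked = sorted(scored, key=lambda i: scores[i])
    return [unlabeled[i] for i in ranked[:k]]
```

In practice the feature vectors would be the conditioning factors of each mapping unit, and the selected samples would be paired with the landslide inventory to train a classifier such as random forest, support vector machine, or CatBoost.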

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Scientific reports - 14 (2024), no. 1, 26 March, page 7201

Language:

English

Contributors:

Gu, Tengfei [Author]
Duan, Ping [Author]
Wang, Mingguo [Author]
Li, Jia [Author]
Zhang, Yanke [Author]

Subjects:

CatBoost
Journal Article
Landslide susceptibility
Machine learning models
Non-landslide sample
PU bagging

Notes:

Date Revised 29.03.2024

published: Electronic

Citation Status PubMed-not-MEDLINE

doi:

10.1038/s41598-024-57964-5

PPN (catalog ID):

NLM370216938