J 2021

Machine Learning-Based Processing Proof-of-Concept Pipeline for Semi-Automatic Sentinel-2 Imagery Download, Cloudiness Filtering, Classifications, and Updates of Open Land Use/Land Cover Datasets

ŘEZNÍK, Tomáš; Jan CHYTRÝ and Kateřina TROJANOVÁ

Basic information

Original name

Machine Learning-Based Processing Proof-of-Concept Pipeline for Semi-Automatic Sentinel-2 Imagery Download, Cloudiness Filtering, Classifications, and Updates of Open Land Use/Land Cover Datasets

Authors

ŘEZNÍK, Tomáš (203 Czech Republic, belonging to the institution); Jan CHYTRÝ (203 Czech Republic, belonging to the institution) and Kateřina TROJANOVÁ (203 Czech Republic, belonging to the institution)

Edition

ISPRS International Journal of Geo-Information, Basel, MDPI, 2021, 2220-9964

Other information

Language

English

Type of outcome

Article in a journal

Field of Study

10508 Physical geography

Country of publisher

Switzerland

Confidentiality degree

is not subject to a state or trade secret

Impact factor

Impact factor: 3.099

RIV identification code

RIV/00216224:14310/21:00121282

Organization unit

Faculty of Science

UT WoS

000622576200001

EID Scopus

2-s2.0-85104546426

Keywords in English

machine learning; land use; land cover; satellite imagery; Sentinel 2; image classification; cloud masking; LightGBM estimator

Tags

International impact, Reviewed
Changed: 16/5/2022 11:06, Mgr. Marie Novosadová Šípková, DiS.

Abstract

In the original language

Land use and land cover are continuously changing in today's world. Both domains, therefore, have to rely on updates of external information sources from which the relevant land use/land cover (classification) is extracted. Satellite images are frequent candidates due to their temporal and spatial resolution. However, the extraction of relevant land use/land cover information is demanding in terms of knowledge base and time. The presented approach offers a proof-of-concept machine-learning pipeline that takes care of the entire complex process in the following manner. The relevant Sentinel-2 images are obtained through the pipeline. Later, cloud masking is performed, including the linear interpolation of merged-feature time frames. Subsequently, four-dimensional arrays are created with all potential training data to become a basis for estimators from the scikit-learn library; the LightGBM estimator is then used. Finally, the classified content is applied to the open land use and open land cover databases. The verification of the provided experiment was conducted against detailed cadastral data, to which Shannon's entropy was applied since the number of cadaster information classes was naturally consistent. The experiment showed a good overall accuracy (OA) of 85.9%. It yielded a classified land use/land cover map of the study area, covering 7188 km² in the southern part of the South Moravian Region in the Czech Republic. The developed proof-of-concept machine-learning pipeline is replicable to any other area of interest, provided the requirements for input data are met.
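
Three of the steps named in the abstract can be illustrated with a minimal sketch in plain Python: linear interpolation over cloud-masked time frames, overall accuracy, and Shannon's entropy of a set of class labels. This is not the authors' implementation. The function names, the per-pixel list layout, and the nearest-value fallback at the ends of a series are assumptions made for illustration, and the abstract does not specify how the entropy was applied to the cadastral classes.

```python
import math


def interpolate_masked(series, cloudy):
    """Fill cloud-masked values of one pixel's time series (hypothetical helper).

    series: reflectance values ordered by acquisition time.
    cloudy: booleans of the same length, True where the frame is cloud-masked.
    Assumes equally spaced frames; edge gaps take the nearest clear value.
    """
    clear = [i for i, c in enumerate(cloudy) if not c]
    if not clear:
        raise ValueError("no cloud-free observation to interpolate from")
    filled = list(series)
    for i, is_cloudy in enumerate(cloudy):
        if not is_cloudy:
            continue
        before = [j for j in clear if j < i]
        after = [j for j in clear if j > i]
        if before and after:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = series[lo] + weight * (series[hi] - series[lo])
        else:
            # Assumption: no extrapolation, reuse the nearest clear frame.
            filled[i] = series[before[-1]] if before else series[after[0]]
    return filled


def overall_accuracy(reference, predicted):
    """Share of samples whose predicted class equals the reference class."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(r == p for r, p in zip(reference, predicted))
    return hits / len(reference)


def shannon_entropy(labels):
    """Shannon's entropy (in bits) of the class distribution in labels."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

In a full pipeline these per-pixel operations would run over whole image arrays, and the classification itself would be delegated to an estimator such as LightGBM, which the abstract names but this sketch does not reproduce.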

Links

MUNI/A/1356/2019, internal MU code
Name: Research on changes in geographical processes and relationships in space and time (Acronym: Progeo)
Investor: Masaryk University, Category A
818346, internal MU code
Name: Si-EU-Soil (Acronym: SIEUSOIL)
Investor: European Union, Food security, sustainable agriculture and forestry, marine and maritime and inland water research (Societal Challenges)