Semi-supervised classification of vegetation: preserving the good old units and searching for new ones

TICHÝ, Lubomír, Milan CHYTRÝ and Zoltán BOTTA-DUKÁT

Basic information

Original name

Semi-supervised classification of vegetation: preserving the good old units and searching for new ones

Authors

TICHÝ, Lubomír (203 Czech Republic, guarantor, belonging to the institution), Milan CHYTRÝ (203 Czech Republic, belonging to the institution) and Zoltán BOTTA-DUKÁT (348 Hungary)

Edition

Journal of Vegetation Science, Wiley-Blackwell, 2014, ISSN 1100-9233

Other information

Language

English

Type of outcome

Article in a professional journal

Field of Study

10600 1.6 Biological sciences

Country of publisher

United States of America

Confidentiality degree

is not subject to a state or trade secret

Impact factor

3.709

RIV identification code

RIV/00216224:14310/14:00074360

Organization unit

Faculty of Science

DOI

http://dx.doi.org/10.1111/jvs.12193

UT WoS

000343867500019

Keywords in English

Classification stability; Clustering; Data analysis; k-means; Partitioning around medoids; Phytosociology; Plant community ecology; Vegetation type

Abstract

In the original

Aim: The unsupervised nature of traditional numerical methods used to classify vegetation hinders the development of comprehensive vegetation classification systems. Each new unsupervised classification yields partitions that are partly inconsistent with previous classifications and change group membership for some sites. In contrast, supervised methods account for previously established vegetation units, but cannot define new ones. Therefore, we introduce the concept of semi-supervised classification to community ecology and vegetation science. Semi-supervised classification formally reproduces the existing units in a supervised mode and simultaneously identifies new units among unassigned sites in an unsupervised mode. We discuss the concept of semi-supervised clustering, introduce semi-supervised variants of two clustering algorithms that produce groups with crisp boundaries, k-means and partitioning around medoids (PAM), provide a free software tool to perform these classifications and demonstrate the advantages using example data sets of vegetation plots.
Methods: Semi-supervised methods use a priori information about group membership for some sites to define centroids (k-means) or medoids (PAM) of site groups that represent previously established vegetation units. They identify these groups in a species hyperspace and assign new sites to them. At the same time, they search for a user-defined number of new groups. We compared the unsupervised, supervised and semi-supervised methods using an example of a forest vegetation data set that was previously classified using expert knowledge, and assessed how well these methods reproduced vegetation units defined by experts. Then we compared supervised and semi-supervised methods in a task in which a grassland vegetation classification established in one country was extended to two neighbouring countries.
Results and conclusions: Example analyses of vegetation plot data sets demonstrated that semi-supervised variants of k-means and PAM are extremely valuable tools for extending existing vegetation classifications while preserving previously defined vegetation units. They can be used both for identifying so far unrecognized vegetation types in the regions where a vegetation classification already exists and for extending a vegetation classification from a particular region to neighbouring regions with partly identical but partly different vegetation types. Both k-means and PAM provide site groups with crisp boundaries, which makes them a simpler alternative to fuzzy clustering methods.
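
Illustrative sketch (not part of the original record): the semi-supervised k-means idea described under Methods can be outlined in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' software. It assumes Euclidean distance, assumes that the centroids of the a priori groups are computed once from the pre-classified sites and then kept fixed, and assumes a simple farthest-site rule for starting the new centroids. The function names (centroid, nearest, semi_supervised_kmeans) and the labels "new_1", "new_2", ... are hypothetical.

```python
import math


def centroid(rows):
    """Mean vector of a non-empty list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def semi_supervised_kmeans(sites, labels, n_new, n_iter=50):
    """Assign unlabelled sites to an a priori group or to one of n_new new groups.

    sites  : list of numeric vectors (e.g. species abundances per plot)
    labels : group id for pre-classified sites, None for unassigned sites
    Assumption: a priori centroids stay fixed; only new centroids are updated.
    """
    known_ids = sorted({g for g in labels if g is not None})
    fixed = [centroid([s for s, g in zip(sites, labels) if g == k])
             for k in known_ids]
    unlabelled = [i for i, g in enumerate(labels) if g is None]

    # Assumed initialisation: start each new centroid at the unlabelled site
    # that lies farthest from all centroids chosen so far.
    new = []
    for _ in range(n_new):
        centers = fixed + new
        if centers:
            pick = max(unlabelled,
                       key=lambda i: min(math.dist(sites[i], c) for c in centers))
        else:
            pick = unlabelled[0]
        new.append(list(sites[pick]))

    assign = {}
    for _ in range(n_iter):
        centers = fixed + new
        updated = {i: nearest(sites[i], centers) for i in unlabelled}
        if updated == assign:  # assignments are stable: converged
            break
        assign = updated
        for j in range(n_new):
            members = [sites[i] for i, c in assign.items()
                       if c == len(fixed) + j]
            if members:  # keep the previous centroid if a group is empty
                new[j] = centroid(members)

    names = known_ids + [f"new_{j + 1}" for j in range(n_new)]
    return [labels[i] if labels[i] is not None else names[assign[i]]
            for i in range(len(sites))]
```

With n_new set to 0 the sketch reduces to a purely supervised assignment to the nearest a priori centroid, and with no labelled sites it behaves like ordinary k-means with a deterministic start. A PAM variant would follow the same outline with medoids (actual sites) in place of centroids.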

Links

GAP505/11/0732, research and development project

Name: Generalized supervised classification in community ecology (Zobecněná řízená klasifikace v ekologii společenstev)

Investor: Czech Science Foundation

Cite

TICHÝ, Lubomír, Milan CHYTRÝ and Zoltán BOTTA-DUKÁT. Semi-supervised classification of vegetation: preserving the good old units and searching for new ones. Journal of Vegetation Science. Wiley-Blackwell, 2014, vol. 25, No 6, p. 1504-1512. ISSN 1100-9233. Available from: https://dx.doi.org/10.1111/jvs.12193.

@article{1218660,
   author = {Tichý, Lubomír and Chytrý, Milan and Botta-Dukát, Zoltán},
   number = {6},
   pages = {1504--1512},
   doi = {10.1111/jvs.12193},
   keywords = {Classification stability; Clustering; Data analysis; k-means; Partitioning around medoids; Phytosociology; Plant community ecology; Vegetation type},
   language = {eng},
   issn = {1100-9233},
   journal = {Journal of Vegetation Science},
   title = {Semi-supervised classification of vegetation: preserving the good old units and searching for new ones},
   url = {http://onlinelibrary.wiley.com/doi/10.1111/jvs.12193/abstract},
   volume = {25},
   year = {2014}
}

TY  - JOUR
ID  - 1218660
AU  - Tichý, Lubomír
AU  - Chytrý, Milan
AU  - Botta-Dukát, Zoltán
PY  - 2014
TI  - Semi-supervised classification of vegetation: preserving the good old units and searching for new ones
JF  - Journal of Vegetation Science
VL  - 25
IS  - 6
SP  - 1504
EP  - 1512
PB  - Wiley-Blackwell
SN  - 1100-9233
KW  - Classification stability
KW  - Clustering
KW  - Data analysis
KW  - k-means
KW  - Partitioning around medoids
KW  - Phytosociology
KW  - Plant community ecology
KW  - Vegetation type
UR  - http://onlinelibrary.wiley.com/doi/10.1111/jvs.12193/abstract
N2  - Aim: The unsupervised nature of traditional numerical methods used to classify vegetation hinders the development of comprehensive vegetation classification systems. Each new unsupervised classification yields partitions that are partly inconsistent with previous classifications and change group membership for some sites. In contrast, supervised methods account for previously established vegetation units, but cannot define new ones. Therefore, we introduce the concept of semi-supervised classification to community ecology and vegetation science. Semi-supervised classification formally reproduces the existing units in a supervised mode and simultaneously identifies new units among unassigned sites in an unsupervised mode. We discuss the concept of semi-supervised clustering, introduce semi-supervised variants of two clustering algorithms that produce groups with crisp boundaries, k-means and partitioning around medoids (PAM), provide a free software tool to perform these classifications and demonstrate the advantages using example data sets of vegetation plots. Methods: Semi-supervised methods use a priori information about group membership for some sites to define centroids (k-means) or medoids (PAM) of site groups that represent previously established vegetation units. They identify these groups in a species hyperspace and assign new sites to them. At the same time, they search for a user-defined number of new groups. We compared the unsupervised, supervised and semi-supervised methods using an example of a forest vegetation data set that was previously classified using expert knowledge, and assessed how well these methods reproduced vegetation units defined by experts. Then we compared supervised and semi-supervised methods in a task when a grassland vegetation classification established in one country was extended to two neighbouring countries. 
Results and conclusions: Example analyses of vegetation plot data sets demonstrated that semi-supervised variants of k-means and PAM are extremely valuable tools for extending existing vegetation classifications while preserving previously defined vegetation units. They can be used both for identifying so far unrecognized vegetation types in the regions where a vegetation classification already exists and for extending a vegetation classification from a particular region to neighbouring regions with partly identical but partly different vegetation types. Both k-means and PAM provide site groups with crisp boundaries, which makes them a simpler alternative to fuzzy clustering methods.
ER  -

TICHÝ, Lubomír, Milan CHYTRÝ and Zoltán BOTTA-DUKÁT. Semi-supervised classification of vegetation: preserving the good old units and searching for new ones. \textit{Journal of Vegetation Science}. Wiley-Blackwell, 2014, vol.~25, No~6, p.~1504-1512. ISSN~1100-9233. Available from: https://dx.doi.org/10.1111/jvs.12193.
