Robust and complex approach of pathological speech signal analysis

J 2015

Robust and complex approach of pathological speech signal analysis

MEKYSKA, Jiří, Eva JANOUŠOVÁ, Pedro GOMEZ-VILDA, Zdeněk SMÉKAL, Irena REKTOROVÁ et al.

Basic information

Original name

Robust and complex approach of pathological speech signal analysis

Authors

MEKYSKA, Jiří (203 Czech Republic), Eva JANOUŠOVÁ (203 Czech Republic, belonging to the institution), Pedro GOMEZ-VILDA (724 Spain), Zdeněk SMÉKAL (203 Czech Republic), Irena REKTOROVÁ (203 Czech Republic, guarantor, belonging to the institution), Ilona ELIÁŠOVÁ (203 Czech Republic, belonging to the institution), Milena KOŠŤÁLOVÁ (203 Czech Republic, belonging to the institution), Martina MRAČKOVÁ (203 Czech Republic, belonging to the institution), Jesus B. ALONSO-HERNANDEZ (724 Spain), Marcos FAUNDEZ-ZANUY (724 Spain) and Karmele LOPEZ-DE-IPINA (724 Spain)

Edition

Neurocomputing, AMSTERDAM, ELSEVIER SCIENCE BV, 2015, 0925-2312

Other information

Language

English

Type of outcome

Article in a professional journal

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Netherlands

Confidentiality degree

is not subject to a state or trade secret

Impact factor

2.392

RIV identification code

RIV/00216224:14740/15:00085054

Organization unit

Central European Institute of Technology

DOI

http://dx.doi.org/10.1016/j.neucom.2015.02.085

UT WoS

000358808500012

Keywords in English

Pathological speech; Disordered voice; Dysarthria; Speech processing; Bicepstrum; Non-linear dynamic features

Abstract

In the original

This paper presents a study of the approaches in the state-of-the-art in the field of pathological speech signal analysis with a special focus on parametrization techniques. It provides a description of 92 speech features where some of them are already widely used in this field of science and some of them have not been tried yet (they come from different areas of speech signal processing like speech recognition or coding). As an original contribution, this work introduces 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy and empirical mode decomposition. The significance of these features was tested on 3 (English, Spanish and Czech) pathological voice databases with respect to classification accuracy, sensitivity and specificity. To our best knowledge the introduced approach based on complex feature extraction and robust testing outperformed all works that have been published already in this field. The results (accuracy, sensitivity and specificity equal to 100.0 +/- 0.0%) are discussable in the case of Massachusetts Eye and Ear Infirmary (MEEI) database because of its limitation related to a length of sustained vowels, however in the case of Principe de Asturias (PdA) Hospital in Alcala de Henares of Madrid database we made improvements in classification accuracy (82.1 +/- 3.3%) and specificity (83.8 +/- 5.1%) when considering a single-classifier approach. Hopefully, large improvements may be achieved in the case of Czech Parkinsonian Speech Database (PARCZ), which are discussed in this work as well. All the features introduced in this work were identified by Mann-Whitney U test as significant (p < 0.05) when processing at least one of the mentioned databases. 
The largest discriminative power from these proposed features has a cepstral peak prominence extracted from the first intrinsic mode function (p = 6.9443 × 10^(-32)) which means, that among all newly designed features those that quantify especially hoarseness or breathiness are good candidates for pathological speech identification. The paper also mentions some ideas for the future work in the field of pathological speech signal analysis that can be valuable especially under the clinical point of view. (C) 2015 Elsevier B.V. All rights reserved.

Links

ED1.1.00/02.0068, research and development project

Name: CEITEC - Central European Institute of Technology

NT13499, research and development project

Name: Speech, its disorders and cognitive functions in Parkinson's disease (Řeč, její poruchy a kognitivní funkce u Parkinsonovy nemoci)

Files attached

ZVV_2015_140_1319700_Robust_and_complex.pdf

Cite

MEKYSKA, Jiří, Eva JANOUŠOVÁ, Pedro GOMEZ-VILDA, Zdeněk SMÉKAL, Irena REKTOROVÁ, Ilona ELIÁŠOVÁ, Milena KOŠŤÁLOVÁ, Martina MRAČKOVÁ, Jesus B. ALONSO-HERNANDEZ, Marcos FAUNDEZ-ZANUY and Karmele LOPEZ-DE-IPINA. Robust and complex approach of pathological speech signal analysis. Neurocomputing. AMSTERDAM: ELSEVIER SCIENCE BV, 2015, vol. 167, November, p. 94-111. ISSN 0925-2312. Available from: https://dx.doi.org/10.1016/j.neucom.2015.02.085.

@article{1319700,
   author = {Mekyska, Jiří and Janoušová, Eva and Gomez-Vilda, Pedro and Smékal, Zdeněk and Rektorová, Irena and Eliášová, Ilona and Košťálová, Milena and Mračková, Martina and Alonso-Hernandez, Jesus B. and Faundez-Zanuy, Marcos and Lopez-de-Ipina, Karmele},
   article_location = {AMSTERDAM},
   article_number = {November},
   doi = {10.1016/j.neucom.2015.02.085},
   keywords = {Pathological speech; Disordered voice; Dysarthria; Speech processing; Bicepstrum; Non-linear dynamic features},
   language = {eng},
   issn = {0925-2312},
   journal = {Neurocomputing},
   title = {Robust and complex approach of pathological speech signal analysis},
   url = {http://ac.els-cdn.com/S0925231215007304/1-s2.0-S0925231215007304-main.pdf?_tid=f6afcc20-9991-11e5-a498-00000aacb35f&acdnat=1449128969_feb07c43c67d9cd575899e67b07d63bb},
   volume = {167},
   year = {2015}
}

TY  - JOUR
ID  - 1319700
AU  - Mekyska, Jiří
AU  - Janoušová, Eva
AU  - Gomez-Vilda, Pedro
AU  - Smékal, Zdeněk
AU  - Rektorová, Irena
AU  - Eliášová, Ilona
AU  - Košťálová, Milena
AU  - Mračková, Martina
AU  - Alonso-Hernandez, Jesus B.
AU  - Faundez-Zanuy, Marcos
AU  - Lopez-de-Ipina, Karmele
PY  - 2015
TI  - Robust and complex approach of pathological speech signal analysis
JF  - Neurocomputing
VL  - 167
IS  - November
SP  - 94
EP  - 111
PB  - ELSEVIER SCIENCE BV
SN  - 09252312
KW  - Pathological speech
KW  - Disordered voice
KW  - Dysarthria
KW  - Speech processing
KW  - Bicepstrum
KW  - Non-linear dynamic features
UR  - http://ac.els-cdn.com/S0925231215007304/1-s2.0-S0925231215007304-main.pdf?_tid=f6afcc20-9991-11e5-a498-00000aacb35f&acdnat=1449128969_feb07c43c67d9cd575899e67b07d63bb
L2  - http://ac.els-cdn.com/S0925231215007304/1-s2.0-S0925231215007304-main.pdf?_tid=f6afcc20-9991-11e5-a498-00000aacb35f&acdnat=1449128969_feb07c43c67d9cd575899e67b07d63bb
N2  - This paper presents a study of the approaches in the state-of-the-art in the field of pathological speech signal analysis with a special focus on parametrization techniques. It provides a description of 92 speech features where some of them are already widely used in this field of science and some of them have not been tried yet (they come from different areas of speech signal processing like speech recognition or coding). As an original contribution, this work introduces 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy and empirical mode decomposition. The significance of these features was tested on 3 (English, Spanish and Czech) pathological voice databases with respect to classification accuracy, sensitivity and specificity. To our best knowledge the introduced approach based on complex feature extraction and robust testing outperformed all works that have been published already in this field. The results (accuracy, sensitivity and specificity equal to 100.0 +/- 0.0%) are discussable in the case of Massachusetts Eye and Ear Infirmary (MEEI) database because of its limitation related to a length of sustained vowels, however in the case of Principe de Asturias (PdA) Hospital in Alcala de Henares of Madrid database we made improvements in classification accuracy (82.1 +/- 3.3%) and specificity (83.8 +/- 5.1%) when considering a single-classifier approach. Hopefully, large improvements may be achieved in the case of Czech Parkinsonian Speech Database (PARCZ), which are discussed in this work as well. All the features introduced in this work were identified by Mann-Whitney U test as significant (p < 0.05) when processing at least one of the mentioned databases. 
The largest discriminative power from these proposed features has a cepstral peak prominence extracted from the first intrinsic mode function (p = 6.9443 × 10^(-32)) which means, that among all newly designed features those that quantify especially hoarseness or breathiness are good candidates for pathological speech identification. The paper also mentions some ideas for the future work in the field of pathological speech signal analysis that can be valuable especially under the clinical point of view. (C) 2015 Elsevier B.V. All rights reserved.
ER  -

MEKYSKA, Jiří, Eva JANOUŠOVÁ, Pedro GOMEZ-VILDA, Zdeněk SMÉKAL, Irena REKTOROVÁ, Ilona ELIÁŠOVÁ, Milena KOŠŤÁLOVÁ, Martina MRAČKOVÁ, Jesus B. ALONSO-HERNANDEZ, Marcos FAUNDEZ-ZANUY and Karmele LOPEZ-DE-IPINA. Robust and complex approach of pathological speech signal analysis. \textit{Neurocomputing}. AMSTERDAM: ELSEVIER SCIENCE BV, 2015, vol.~167, November, p.~94-111. ISSN~0925-2312. Available from: https://dx.doi.org/10.1016/j.neucom.2015.02.085.
