D 2006

Estimation Procedures for the False Discovery Rate: A Systematic Comparison for Microarray Data

SCHIMEK, Michael and Tomáš PAVLÍK

Basic information

Original name

Estimation Procedures for the False Discovery Rate: A Systematic Comparison for Microarray Data

Name in Czech

Procedury pro odhad FDR: studie s použitím dat genové exprese

Authors

SCHIMEK, Michael (40 Austria) and Tomáš PAVLÍK (203 Czech Republic, guarantor)

Edition

17th ed. Rome, Italy, COMPSTAT 2006 - Proceedings in Computational Statistics, p. 67-79, 12 pp. 2006

Publisher

Springer Verlag

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10103 Statistics and probability

Country of publisher

Italy

Confidentiality degree

is not subject to a state or trade secret

RIV identification code

RIV/00216224:14110/06:00031811

Organization unit

Faculty of Medicine

ISBN

3-7908-1708-2

UT WoS

000242170000006

Keywords in English

False Discovery Rate; permutation algorithms; Significance Analysis

Tags

International impact, Reviewed
Changed: 1/4/2010 09:18, RNDr. Tomáš Pavlík, Ph.D.

Abstract

In the original

Microarray technology, developed in recent years, allows the expression levels of thousands of genes to be measured simultaneously. In most microarray experiments the measurements are taken under two experimental conditions. Statistical procedures for identifying differentially expressed genes involve a serious multiple comparison problem, as we have to carry out as many hypothesis tests as there are candidate genes in the experiment. If we apply the usual type I error rate alpha in each test, then the probability of rejecting at least one true null hypothesis will greatly exceed the intended overall alpha level. We focus on the recent error control concept of the false discovery rate (FDR), for which an increasing number of competing estimates and algorithms are available. However, there is little comparative evidence. For parametric as well as nonparametric test statistics, relevant FDR procedures and typical parameter settings are discussed, including the use of correcting constants in the estimation of the pooled variance. An in-depth simulation study is performed that addresses these points with respect to sound statistical inference for microarray gene expression data. Finally, the famous Hedenfalk data set is analyzed in a similar fashion and conclusions are drawn for practical microarray analysis.
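
The multiple comparison problem described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes independent tests, and it uses the Benjamini-Hochberg step-up rule as one widely known FDR-controlling procedure, not necessarily one of the estimators the authors compare. The function names `familywise_error_rate` and `benjamini_hochberg` are hypothetical and chosen only for this example.

```python
from typing import List


def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false rejection when m true null
    hypotheses are each tested at level alpha.
    Assumption: the m tests are independent."""
    return 1.0 - (1.0 - alpha) ** m


def benjamini_hochberg(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up rule at target FDR level q.
    Returns one reject/keep flag per p-value, in input order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at most the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # With many genes, testing each at alpha = 0.05 almost guarantees
    # at least one false positive under the independence assumption.
    for m in (1, 10, 100, 1000):
        print(m, round(familywise_error_rate(0.05, m), 4))

    # Made-up p-values, for illustration only.
    print(benjamini_hochberg([0.02, 0.03, 0.035, 0.5], q=0.05))
```

The step-up behaviour matters in the last call: no single p-value passes its own rank threshold until rank 3, yet all three smallest p-values are then rejected together.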

In Czech (translated)

The paper compares procedures for controlling the FDR in the R environment on simulated DNA microarray data.