Robust estimation of the relationship between DNA copy number and gene expression

Statistics and Modeling for Complex Data

Looking for genes whose DNA copy number is "associated" with their expression level in a cancer study can help pinpoint candidate genes implicated in the disease and improve our understanding of its molecular bases. DNA methylation is an important factor to account for in this setting, as it can down-regulate gene expression. We translate the biological question of interest into a well-defined statistical parameter whose relevance goes beyond the specific example considered here. We estimate this parameter using the targeted maximum likelihood estimation methodology. I will explain the method and describe its robustness properties. I will show the results of a simulation study inspired by a dataset from The Cancer Genome Atlas (TCGA). This is joint work with Pierre Neuvial and Mark van der Laan.