Theme:

Statistics and Modeling for Complex Data
Speaker:

BORDES Laurent

We consider a simple two-component mixture model in which one component is entirely known while the other is entirely unknown. In other words, we observe $(X,Y) \in \R^2$, where $Y=a_Z+b_ZX+\varepsilon_Z$. In this model, $Z$ follows a Bernoulli distribution with parameter $\pi \in [0,1]$; the regression parameters $(a_0,b_0)\in\R^2$ and the cumulative distribution function (cdf) $F_0$ of $\varepsilon_0$ are known, whereas the regression parameters $(a_1,b_1)\in\R^2$ and the cdf $F_1$ of $\varepsilon_1$ are unknown. The unknown parameter of the model is thus $\vartheta=(\pi,a_1,b_1,F_1)$, whose identifiability is proved under weak moment conditions. The same conditions allow us to propose consistent estimators of $\vartheta$ based on an i.i.d. sample $(X_i,Y_i)_{i=1,\dots,n}$ of $(X,Y)$. The asymptotic behavior of these estimators is studied, as well as their finite-sample behavior through various simulation studies. The covariance of the limit processes is approximated using a weighted bootstrap method. This work offers an alternative to the estimation methods proposed in [3,4] and extends the results of [1,2] to the regression model.
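
To make the sampling scheme concrete, the following minimal Python sketch draws data from the model $Y=a_Z+b_ZX+\varepsilon_Z$. It is an illustration only, not the estimation procedure of the talk: the function name `simulate_mixture`, the standard Gaussian design for $X$, the centered Gaussian errors, and all numerical values are assumptions made for this example. The label $Z$ is returned for checking purposes, although in the model only $(X,Y)$ is observed.

```python
import random


def simulate_mixture(n, pi, known, unknown, seed=None):
    """Draw n i.i.d. triples (x, y, z) from the two-component regression mixture.

    `known` and `unknown` are (intercept, slope, noise_sd) tuples for the
    components Z = 0 and Z = 1 respectively. The Gaussian design for X and
    the Gaussian errors are illustrative assumptions, not part of the abstract.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # Z ~ Bernoulli(pi): Z = 1 selects the unknown component.
        z = 1 if rng.random() < pi else 0
        a, b, sd = unknown if z == 1 else known
        x = rng.gauss(0.0, 1.0)             # assumed design distribution
        y = a + b * x + rng.gauss(0.0, sd)  # Y = a_Z + b_Z X + eps_Z
        sample.append((x, y, z))
    return sample


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for the demonstration.
    data = simulate_mixture(
        1000, pi=0.3, known=(0.0, 1.0, 1.0), unknown=(2.0, -1.0, 0.5), seed=42
    )
    share = sum(z for _, _, z in data) / len(data)
    print(f"empirical share of component 1: {share:.3f}")
```

A sample generated this way, with the $Z$ column discarded, is the kind of input on which estimators of $\vartheta$ could be compared in a simulation study.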

[1] L. Bordes, C. Delmas and P. Vandekerkhove (2006). Estimating a two-component mixture model when a component is known. Scand. J. Statist., 33(4), 733–752.

[2] L. Bordes and P. Vandekerkhove (2010). Semiparametric two-component mixture model when a component is known: an asymptotically normal estimator. Mathematical Methods of Statistics, 19(1), 22–41.

[3] D.R. Hunter and D.S. Young (2011). Semiparametric mixtures of regressions. Penn State Department of Statistics Technical Report #11-02.

[4] P. Vandekerkhove (2010). Estimation of a semiparametric contaminated regression model. Preprint.