## Reinforcement learning in signaling game

Theme:
Statistics and Modeling for Complex Data
Speaker:
TARRÈS Pierre

We consider a signaling game originally introduced by Skyrms, which models how two interacting players learn to signal to each other and thus create a common language. The first rigorous analysis was done by Argiento, Pemantle, Skyrms and Volkov (2009) with 2 states, 2 signals and 2 acts. We study the case of $M_1$ states, $M_2$ signals and $M_1$ acts for general $M_1, M_2 \in \mathbb{N}$. We prove that the expected payoff increases on average and thus converges a.s., and that a limit bipartite graph emerges, such that no signal-state correspondence is associated with both a synonym and an informational bottleneck. Finally, we show that any graph correspondence with the above property is a limit configuration with positive probability.
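The learning dynamics described above can be illustrated with a small simulation. The sketch below assumes a standard urn-type reinforcement rule (uniformly drawn states, unit initial weights, a reward of 1 added to the chosen signal and act when the act matches the state); these details and the function names are illustrative assumptions, not taken from the abstract.

```python
import random


def simulate_signaling(num_states=2, num_signals=2, steps=20000, seed=0):
    """Urn-type reinforcement learning in a sender/receiver signaling game."""
    rng = random.Random(seed)
    # sender[s][m]: weight the sender puts on signal m when the state is s
    sender = [[1.0] * num_signals for _ in range(num_states)]
    # receiver[m][a]: weight the receiver puts on act a after seeing signal m
    receiver = [[1.0] * num_states for _ in range(num_signals)]
    for _ in range(steps):
        state = rng.randrange(num_states)  # nature draws a state uniformly
        signal = rng.choices(range(num_signals), weights=sender[state])[0]
        act = rng.choices(range(num_states), weights=receiver[signal])[0]
        if act == state:  # success: both players reinforce their choice
            sender[state][signal] += 1.0
            receiver[signal][act] += 1.0
    return sender, receiver


def expected_payoff(sender, receiver):
    """Probability that the act matches a uniformly drawn state."""
    total = 0.0
    for s, row in enumerate(sender):
        row_sum = sum(row)
        for m, weight in enumerate(row):
            total += (weight / row_sum) * (receiver[m][s] / sum(receiver[m]))
    return total / len(sender)


if __name__ == "__main__":
    sender, receiver = simulate_signaling()
    print(expected_payoff(sender, receiver))
```

With more states than signals, or more signals than states, the same code can be used to look at the synonyms and bottlenecks that appear in the learned weights.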

## Sparsity with sign-coherent groups of variables via the cooperative-Lasso

Theme:
Statistics and Modeling for Complex Data
Speaker:
CHIQUET Julien

We consider the problems of estimation and selection of parameters endowed with a known group structure, when the groups are assumed to be sign-coherent, that is, gathering either non-negative, non-positive or null parameters. To tackle this problem we propose a new penalty that we call the cooperative-Lasso penalty. We derive the optimality conditions defining the cooperative-Lasso estimate for generalized linear models and propose an efficient active set algorithm suited to high-dimensional problems. We study the asymptotic consistency of the estimator in the linear regression setup and derive its irrepresentable conditions, which are milder than those of the group-Lasso regarding the matching of groups with the sparsity pattern of the true parameters. We also address the problem of model selection in linear regression by deriving an approximation of the degrees of freedom of the cooperative-Lasso estimator. Simulations comparing the proposed estimator to the group-Lasso agree with our theoretical results, showing consistent improvements in support recovery for sign-coherent groups. We finally propose an approach widely applicable to the processing of genomic data, where the set of differentially expressed probes is enriched by incorporating all the probes of the microarray that are related to the corresponding genes. In an application to the estimation of chemotherapy pathologic response in breast cancer, the cooperative-Lasso demonstrates much better performance than its competitors.
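The abstract does not spell out the penalty. As a hedged illustration, the sketch below assumes the form in which, for each group, the Euclidean norms of the positive part and of the negative part of the coefficients are added (with unit group weights, also an assumption). Under this form, a sign-coherent group is penalized exactly like in the group-Lasso, while a group mixing signs is penalized more.

```python
import math


def group_lasso_penalty(beta, groups):
    """Sum over groups of the Euclidean norm of the group's coefficients."""
    return sum(math.sqrt(sum(beta[j] ** 2 for j in g)) for g in groups)


def coop_lasso_penalty(beta, groups):
    """Assumed cooperative-Lasso form: norm of positive part + norm of negative part."""
    total = 0.0
    for g in groups:
        pos = math.sqrt(sum(max(beta[j], 0.0) ** 2 for j in g))
        neg = math.sqrt(sum(max(-beta[j], 0.0) ** 2 for j in g))
        total += pos + neg
    return total


if __name__ == "__main__":
    beta = [3.0, 4.0, 3.0, -4.0]
    groups = [[0, 1], [2, 3]]  # first group sign-coherent, second group not
    print(group_lasso_penalty(beta, groups), coop_lasso_penalty(beta, groups))
```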

## Robust estimation of the relationship between DNA copy number and gene expression

Theme:
Statistics and Modeling for Complex Data
Speaker:
CHAMBAZ Antoine

Looking for genes whose DNA copy number is "associated" with their expression level in a cancer study can help pinpoint candidates implicated in the disease and improve our understanding of its molecular bases. DNA methylation is an important player to account for in this setting, as it can down-regulate gene expression. We translate the biological question of interest into a well-defined statistical parameter whose relevance goes beyond the specific example considered here. We carry out its estimation following the targeted maximum likelihood estimation methodology. I will explain the method and describe its robustness properties. I will show the results of a simulation study inspired by a dataset from the Cancer Genome Atlas (TCGA). This is joint work with Pierre Neuvial and Mark van der Laan.

## Statistical inference for structured populations alimented by fragmentation-transport

Theme:
Statistics and Modeling for Complex Data
Speaker:
HOFFMANN Marc

We investigate inference in simple models that describe the evolution (in size or age) of a population of bacteria across scales. The size of the system evolves according to a transport-fragmentation equation: each individual grows with a given transport rate, and splits into two offspring, according to a binary fragmentation process with unknown division rate that depends on its size. Macroscopically, the system is well approximated by a PDE and statistical inference becomes a nonlinear inverse problem. Microscopically, a more accurate description is given by a stochastic piecewise deterministic Markov process, which allows for other methods of inference but introduces stochastic dependences. We will discuss and present some very simple results on the inference of the parameters of the system across scales. Real data analysis is conducted on E. coli experiments. This is joint (ongoing) work with M. Doumic (INRIA and Paris 6), N. Krell (Rennes 1) and L. Robert (ENS).

## Parameter estimation of a two-dimensional stochastic differential equation partially observed with application to neuronal data analysis

Theme:
Statistics and Modeling for Complex Data
Speaker:

Stochastic differential systems have been widely developed to describe neuronal activity by taking into account the random behavior of neurons. We focus on the stochastic two-dimensional Morris-Lecar model, whose drift and volatility functions are nonlinear functions of the process and depend on unknown physiological parameters. Statistical estimation of these parameters from neuronal data is very difficult. Indeed, neuronal measurements correspond to discrete observations of only the first coordinate of the system. Furthermore, the SDE has no explicit solution. We propose an estimation method based on a stochastic version of the EM algorithm, the SAEM algorithm, which requires the simulation of the hidden coordinate conditionally on the observations. We propose to perform this simulation step with a particle filter based on the Euler approximation of the SDE. We prove the almost sure convergence of the obtained estimator towards the maximum of the 'exact' likelihood, without Euler approximation. We illustrate the performance of our estimation method on simulated and real data. This is joint work with Susanne Ditlevsen (Copenhagen).
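The Euler approximation underlying the simulation step can be sketched as follows. This is a generic Euler-Maruyama scheme for a two-dimensional SDE with diagonal noise; the drift and diffusion functions passed in the usage example are hypothetical toy choices, not the Morris-Lecar coefficients, and the function name is illustrative.

```python
import math
import random


def euler_maruyama_2d(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate (V, U) on a grid of step dt with the Euler-Maruyama scheme.

    drift(v, u) and diffusion(v, u) each return a pair of floats; the two
    coordinates are driven by independent Brownian motions (an assumption).
    """
    v, u = x0
    path = [(v, u)]
    for _ in range(n_steps):
        b_v, b_u = drift(v, u)
        s_v, s_u = diffusion(v, u)
        dw_v = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, first coordinate
        dw_u = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, second coordinate
        v, u = v + b_v * dt + s_v * dw_v, u + b_u * dt + s_u * dw_u
        path.append((v, u))
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical linear toy model, for illustration only.
    path = euler_maruyama_2d(
        drift=lambda v, u: (-v + u, -u),
        diffusion=lambda v, u: (0.1, 0.2),
        x0=(1.0, 0.5), dt=0.01, n_steps=1000, rng=rng,
    )
    observed = [v for v, _ in path]  # only the first coordinate is observed
    print(len(observed))
```

In the partially observed setting of the abstract, only the first coordinate of such a path would be available, and a particle filter would propagate candidate values of the hidden coordinate with this kind of one-step transition.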

## A lower bound for the Lasso

Theme:
Statistics and Modeling for Complex Data
Speaker:
VAN DE GEER Sara

In numerical experiments, it has been observed that the Lasso generally selects too many variables, i.e. it yields a large number of false positives. We will confirm this theoretically by showing in an example that with positive probability, the Lasso satisfies a sparsity oracle inequality while the number of false positives is of larger order than the sparsity index of the underlying true regression. To arrive at this result, we will apply refined concentration inequalities and extreme value theory.

## Estimation for Lévy processes from high frequency data within a long time interval

Theme:
Statistics and Modeling for Complex Data
Speaker:
COMTE Fabienne

This talk is semi-parametric in the sense that we simultaneously estimate a function and two parameters. More precisely, we study nonparametric estimation of the Lévy density for Lévy processes, first without then with Brownian component. We consider $2n$ (resp. $3n$) discrete time observations with step $\Delta$. The asymptotic framework is: $n$ tends to infinity, $\Delta=\Delta_n$ tends to zero while $n\Delta_n$ tends to infinity. We use a Fourier approach to construct an adaptive nonparametric estimator and to provide a bound for the global $L^2$-risk.

More precisely, we consider $(L_t, t \ge 0)$ a real-valued Lévy process, i.e. a process with stationary independent increments and càdlàg sample paths, with associated Lévy measure admitting a density $n(x)$. We assume that the Lévy density satisfies $\int_{\mathbb{R}} x^2 n(x)\,dx < \infty$. For statistical purposes, this assumption, which was proposed in Neumann and Reiss (2009), has several useful consequences. First, for all $t$, $E L_t^2<+\infty$ and, as $\int_{\mathbb{R}}(e^{iux} -1-iux)n(x)\,dx$ is well defined, we get: $$\psi_{t}(u)= E(\exp(i u L_t))=\exp\Big(t\Big(iu b-\frac 12 u^2\sigma^2 +\int_{\mathbb{R}}(e^{iux} -1-iux)\, n(x)\,dx\Big)\Big), \tag{1}$$ where $b=E(L_1)$. Formula (1) is the starting point of the nonparametric part of the estimation strategy and provides, by differentiating twice (if $\sigma=0$) or three times (in the general case), an estimator of $h(x)=x^2n(x)$ (if $\sigma=0$) or $p(x)=x^3n(x)$ (in the general case). These estimators are defined by Fourier inversion of the type $$\hat p_m(x)= (2\pi)^{-1} \int_{-\pi m}^{\pi m} e^{-iux}\hat p^*(u)\,du,$$ and the cutoff parameter $m$ is chosen by a penalization device to obtain the best possible squared bias-variance compromise. Risk bounds are provided for the integrated mean square risk of the adaptive estimators. We discuss rates of convergence and give examples of processes fitting in our framework, for which we can compute the explicit rates of convergence.
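The differentiation step can be sketched as follows, assuming that differentiation under the integral sign is justified (which, for the third derivative, requires the additional assumption $\int_{\mathbb{R}} |x|^3 n(x)\,dx<\infty$) and writing $f^*(u)=\int_{\mathbb{R}} e^{iux}f(x)\,dx$ for the Fourier transform. Taking logarithms in the expression of $\psi_t$ and differentiating in $u$ gives

$$\frac{\psi_t'(u)}{\psi_t(u)} = t\Big(ib-u\sigma^2+i\int_{\mathbb{R}} x\,(e^{iux}-1)\,n(x)\,dx\Big),$$

$$\Big(\frac{\psi_t'}{\psi_t}\Big)'(u) = -t\Big(\sigma^2+\int_{\mathbb{R}} e^{iux}x^2 n(x)\,dx\Big) = -t\big(\sigma^2+h^*(u)\big),$$

$$\Big(\frac{\psi_t'}{\psi_t}\Big)''(u) = -it\int_{\mathbb{R}} e^{iux}x^3 n(x)\,dx = -it\,p^*(u).$$

When $\sigma=0$ the second line isolates $h^*$, and in the general case the third line isolates $p^*$ because the constant $\sigma^2$ disappears. Replacing $\psi_t$ and its derivatives by empirical counterparts then suggests estimators of $h^*$ or $p^*$, to which the truncated Fourier inversion is applied.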

Estimators of the drift and of the variance of the Gaussian component are also studied, based for $\sigma^2$ on power variations in the spirit of Aït-Sahalia and Jacod (2007). Simulation results for such processes are also given, for functions $g(x)=xn(x)$, $h(x)=x^2n(x)$ and $p(x)=x^3n(x)$ and also for the parameter estimates of $b$ and $\sigma^2$.

#### Keywords:

Adaptive nonparametric estimation; High frequency data; Lévy processes; Projection estimators; Power variation.

#### References:

Comte, F. and Genon-Catalot, V. (2011). Estimation for Lévy processes from high frequency data within a long time interval. Ann. Statist., to appear.
Neumann, M. and Reiss, M. (2009). Nonparametric estimation for Lévy processes from low-frequency observations. Bernoulli 15, 223-248.
Aït-Sahalia, Y. and Jacod, J. (2007). Volatility estimators for discretely sampled Lévy processes. Ann. Statist. 35, 355-392.

## System Identification for quantum Markov chains

Theme:
Statistics and Modeling for Complex Data
Speaker:

Quantum Markov processes are quantum dynamical systems with a large range of applications in quantum optics and cavity QED.  The paradigmatic example is that of an atom maser, in which atoms pass successively through an optical cavity, interact with the cavity field, such that the outgoing atoms carry information about the interaction.

We consider the problem of system identification for quantum Markov chains in the asymptotic set-up where the experimentalist has access to the output of the Markov chain and the number of 'atoms' is large. In the special case of a one-parameter model, we derive two asymptotic normality results, the first with respect to the quantum state of the output, the second with respect to the statistics of averages of simple measurements performed on the output. In particular we provide simple estimators whose Fisher information can be optimized over different choices of measured observables. These results can be extended to multiple parameter estimation and continuous time dynamics, opening up a new area of research in quantum statistics with direct relevance for quantum engineering.

## An asymptotic error bound for discriminating between several quantum states

Theme:
Statistics and Modeling for Complex Data
Speaker:
NUSSBAUM Michael

Given a K-tuple of density matrices, we consider the problem of detecting the true state of a quantum system on the basis of measurements performed on N  copies.  We investigate the exponential rate of decay of the averaged error probability of the optimal quantum detector. In the classical case of several  probability measures on a finite sample space,  it is known that the optimal error exponent is given by the worst case binary Chernoff bound between any possible pair from the K distributions, and attained by the MLE. In the quantum case, the analogous worst case binary quantum Chernoff bound may be called the multiple quantum Chernoff bound. Recently it has been shown that this bound is generally unimprovable, and also attainable in the case of K pure states. We extend the attainability result to a larger class of K-tuples of states which are possibly mixed, but are all singular (nonfaithful). For arbitrary finite dimensional states, we construct a detector which attains the multiple quantum Chernoff bound up to a factor 1/3.
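For the classical case mentioned above, the error exponent can be computed numerically. The sketch below evaluates the binary Chernoff bound $-\min_{0\le s\le 1}\log\sum_x p(x)^s q(x)^{1-s}$ by a grid search and takes the worst case over all pairs. The function names and the grid resolution are illustrative assumptions, and all probabilities are assumed strictly positive.

```python
import math
from itertools import combinations


def chernoff_bound(p, q, grid=1000):
    """Binary Chernoff bound between two strictly positive discrete distributions."""
    best = float("inf")
    for k in range(grid + 1):
        s = k / grid
        # log of sum_x p(x)^s q(x)^(1-s), minimized over s on a grid
        value = math.log(sum((pi ** s) * (qi ** (1.0 - s)) for pi, qi in zip(p, q)))
        best = min(best, value)
    return -best


def multiple_chernoff_bound(distributions, grid=1000):
    """Worst case (smallest) binary Chernoff bound over all pairs."""
    return min(chernoff_bound(p, q, grid) for p, q in combinations(distributions, 2))


if __name__ == "__main__":
    dists = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]
    print(multiple_chernoff_bound(dists))
```

The quantum analogue replaces the sum by a trace of products of matrix powers of the density matrices; that case is not covered by this sketch.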

## Unconstrained recursive importance sampling

Theme:
Statistics and Modeling for Complex Data
Speaker:
LEMAIRE Vincent
Speaker:
PAGÈS Gilles

We propose an unconstrained stochastic approximation method for finding the optimal change of measure (in an a priori parametric family) to reduce the variance of a Monte Carlo simulation. We consider different parametric families based on the Girsanov theorem and the Esscher transform (exponential tilting). In [Monte Carlo Methods Appl. 10 (2004) 1–24], a projected Robbins–Monro procedure was described for selecting the parameter minimizing the variance in a multidimensional Gaussian framework. In our approach, the parameter (scalar or process) is selected by a classical Robbins–Monro procedure without projection or truncation. To obtain this unconstrained algorithm, we extensively use the regularity of the density of the law without assuming smoothness of the payoff. We prove the convergence for a large class of multidimensional distributions as well as for diffusion processes.

We illustrate the efficiency of our algorithm on several pricing problems: a basket payoff under a multidimensional NIG distribution and barrier options in different markets.
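A minimal one-dimensional illustration of an unconstrained recursion of this kind is sketched below. It assumes a standard Gaussian $X$, a mean-shift family, and a hypothetical toy payoff $F(x)=\max(x-1,0)$. For this family, the second moment of the importance sampling estimator is $V(\theta)=E[F^2(X)e^{-\theta X+\theta^2/2}]$, and the change of variable $x\mapsto x-\theta$ gives $V'(\theta)=e^{\theta^2}E[F^2(X-\theta)(2\theta-X)]$, so the recursion can be driven by $F^2(X-\theta)(2\theta-X)$, which contains no exponential term. The step sizes and function names are illustrative choices.

```python
import math
import random


def toy_payoff(x):
    """Hypothetical payoff used only for illustration."""
    return max(x - 1.0, 0.0)


def robbins_monro_shift(payoff, n_iter=20000, seed=0):
    """Robbins-Monro search for the variance-minimizing mean shift, no projection."""
    rng = random.Random(seed)
    theta = 0.0
    for n in range(1, n_iter + 1):
        x = rng.gauss(0.0, 1.0)
        gamma = 1.0 / (n + 10.0)  # decreasing step sequence (illustrative choice)
        theta -= gamma * payoff(x - theta) ** 2 * (2.0 * theta - x)
    return theta


def importance_sampling_estimate(payoff, theta, n_samples=20000, seed=1):
    """Estimate E[payoff(X)] by sampling X + theta and reweighting."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        total += payoff(x + theta) * math.exp(-theta * x - 0.5 * theta * theta)
    return total / n_samples


if __name__ == "__main__":
    theta = robbins_monro_shift(toy_payoff)
    print(theta, importance_sampling_estimate(toy_payoff, theta))
```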

Source: Vincent Lemaire and Gilles Pagès, Ann. Appl. Probab. Volume 20, Number 3 (2010), 1029-1067.

## Estimation and detection of high-variable functions.

Theme:
Statistics and Modeling for Complex Data
Speaker:
INGSTER Yuri

We give a survey of some recent results on the minimax estimation and detection of a multivariate function in the white Gaussian noise model. We study the "curse of dimensionality" phenomenon for a function belonging to a ball in various functional spaces: Sobolev spaces, tensor product Sobolev spaces, and spaces of analytic functions. Typically, the rates of the quadratic risk in the estimation problem and the separation rates in the detection problem become catastrophically bad when the number of variables is larger than log(1/ε), where ε is the noise intensity.

We show that the curse of dimensionality is "lifted" for the balls in anisotropic Sobolev spaces and in weighted tensor product spaces. The spaces of the last type were introduced by Sloan and Woźniakowski (1998) in the context of numerical integration problems.

The methods are based on some new probabilistic tools for studying approximation characteristics of the balls in the spaces under consideration.

## Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons

Theme:
Statistics and Modeling for Complex Data
Speaker:
VERZELEN Nicolas

Consider the standard Gaussian linear regression model $Y = X\theta + \varepsilon$, where $Y \in \mathbb{R}^n$ is a response vector and $X \in \mathbb{R}^{n \times p}$ is a design matrix.
Numerous works have been devoted to building efficient estimators of θ when p is much larger than n. In such a situation, a classical approach amounts to assuming that θ is approximately sparse. In this talk, I study the minimax risks of estimation and testing over classes of k-sparse vectors θ. These bounds shed light on the limitations due to high-dimensionality.

The results encompass the problem of prediction (estimation of Xθ), the inverse problem (estimation of θ) and linear testing (testing θ=0). Interestingly, an elbow effect occurs when the number of variables k log(p) becomes large compared to n. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high dimensional setting. In fact, even dimension reduction techniques cannot provide satisfying results in such an ultra-high dimensional setting.

## On the Robustness of the Snell envelope

Theme:
Statistics and Modeling for Complex Data
Speaker:

We analyze the robustness properties of the Snell envelope backward evolution equation for the discrete time optimal stopping problem. We consider a series of approximation schemes, including cut-off type approximations, Euler discretization schemes, interpolation models, quantization tree models, and the Stochastic Mesh method of Broadie-Glasserman. In each situation, we provide non-asymptotic convergence estimates, including $\mathbb{L}_p$-mean error bounds and exponential concentration inequalities. We deduce these estimates from a single and general robustness property of Snell envelope semigroups. In particular, this analysis allows us to recover existing convergence results for the quantization tree method and to improve significantly the rates of convergence obtained for the Stochastic Mesh estimator of Broadie-Glasserman. In the second part of the article, we propose a new approach using a genealogical tree approximation of the reference Markov process in terms of a neutral type genetic model. In contrast to Broadie-Glasserman Monte Carlo models, the computational cost of this new stochastic particle approximation is linear in the number of sampled points. Some simulation results are provided and confirm the interest of this new algorithm.
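In its simplest form, the Snell envelope backward equation reads $U_N=f_N$ and $U_n=\max\big(f_n, E[U_{n+1}\mid \mathcal{F}_n]\big)$ for $n<N$. The sketch below applies this recursion on a binomial tree to a discounted put payoff. The tree model, the payoff, and all parameter names are assumptions made for illustration; this is not one of the approximation schemes analyzed in the talk.

```python
import math


def snell_envelope_binomial(s0, strike, rate, sigma, maturity, n_steps):
    """Value at time 0 of the Snell envelope of a put payoff on a binomial tree."""
    dt = maturity / n_steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-rate * dt)
    prob_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    def payoff(n, j):
        # Payoff at time step n after j up-moves and n - j down-moves.
        return max(strike - s0 * up ** j * down ** (n - j), 0.0)

    # Terminal condition: U_N = f_N.
    values = [payoff(n_steps, j) for j in range(n_steps + 1)]
    # Backward recursion: U_n = max(f_n, discounted conditional expectation of U_{n+1}).
    for n in range(n_steps - 1, -1, -1):
        values = [
            max(payoff(n, j),
                disc * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j]))
            for j in range(n + 1)
        ]
    return values[0]


if __name__ == "__main__":
    print(snell_envelope_binomial(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```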

## The local geometry of mixtures

Theme:
Statistics and Modeling for Complex Data
Speaker:
GASSIAT Élisabeth

Mixture modeling of distributions uses a set $\mathcal{F}$ of probability densities and, for any positive integer $q$, the set $\mathcal{M}_q$ of convex combinations (mixtures) of $q$ elements of $\mathcal{F}$. To derive statistical results for estimation procedures, one needs a quantitative description of complexity for the sequence of models $(\mathcal{M}_q)_{q\in\mathbb{N}}$. Mixture models possess a notoriously complicated geometric structure. In this talk, we will give results about the local entropy of the square root of likelihood ratios (scores) in Hellinger distance. We will explain how they may be derived from the global entropy for the set of normalized scores.

As an application, we will give the precise rate of penalties that lead to almost sure identification of the model order without a prior upper bound: the minimal penalty that yields strong consistency in the absence of an a priori upper bound on the model order is of order $\eta(q)\log\log n$, where $\eta(q)$ is a dimensional quantity. The proof is based on a (general) precise characterization of the pathwise fluctuations of the generalized likelihood ratio statistics in terms of the geometry of the underlying model.

This talk is based on joint work with Ramon van Handel, Princeton University.

## Moment estimation method for a two-component mixture regression model when a component is known

Theme:
Statistics and Modeling for Complex Data
Speaker:
BORDES Laurent

We consider a simple two-component mixture model where one component is entirely known while the other one is entirely unknown. In other words, we observe $(X,Y) \in \mathbb{R}^2$ where $Y=a_Z+b_ZX+\varepsilon_Z$. In this model $Z$ is distributed according to a Bernoulli distribution with parameter $\pi \in [0,1]$, the regression parameters $(a_0,b_0)\in\mathbb{R}^2$ and the cumulative distribution function (cdf) $F_0$ associated with $\varepsilon_0$ are known, while the regression parameters $(a_1,b_1)\in\mathbb{R}^2$ and the cdf $F_1$ associated with $\varepsilon_1$ are unknown. The unknown parameter of the model is thus $\vartheta=(\pi,a_1,b_1,F_1)$, whose identifiability is proved under weak moment conditions. The same conditions allow us to propose consistent estimators of $\vartheta$ based on an i.i.d. sample $(X_i,Y_i)_{i=1,\dots,n}$ of $(X,Y)$. The asymptotic behavior of these estimators is studied, as well as their finite sample size behavior through various simulation studies. The covariance of the limit processes is approximated by using a weighted bootstrap method. These works propose an alternative to the estimation methods proposed in [4,3] and extend the results in [1,2] to the regression model.

#### References:

[1] L. Bordes, C. Delmas and P. Vandekerkhove (2006). Estimating a two-component mixture model when a component is known. Scand. J. Statist., 33(4), 733–752.
[2] L. Bordes and P. Vandekerkhove (2010). Semiparametric two-component mixture model when a component is known: an asymptotically normal estimator. Mathematical Methods of Statistics, 19(1), 22–41.
[3] D.R. Hunter and D.S. Young (2011). Semiparametric Mixtures of Regressions, Penn State Department of Statistics Technical Report #11-02.
[4] P. Vandekerkhove (2010). Estimation of a semiparametric contaminated regression model. Preprint.

## Semiparametric finite mixture model estimation algorithms

Theme:
Statistics and Modeling for Complex Data
Speaker:
HUNTER David

We present ideas for estimation algorithms in finite mixture models where components are not assumed to come from a particular parametric family. The algorithms, which combine elements of standard EM algorithms and kernel density estimation, are iterative and may be shown to be monotonic in the sense that they guarantee an increase in a smoothed loglikelihood objective function at each iteration. We consider applications of these ideas to multivariate psychological measurements and mixtures of regressions.

## Conditional alphas and realized betas

Theme:
Statistics and Modeling for Complex Data
Speaker:
DISTASO Walter

This paper proposes a two-step procedure to back out the conditional alpha of a given stock from high-frequency returns. We first estimate the realized factor loadings of the stock, and then retrieve the conditional alpha by estimating the conditional expectation of the stock return in excess over the realized risk premia. The estimation method is fully nonparametric, in stark contrast with the literature on conditional alphas and betas. Apart from the methodological contribution, we employ NYSE data to determine the main drivers of conditional alphas as well as to track mispricing over time. In addition, we assess the economic relevance of our conditional alpha estimates by means of a market-neutral trading strategy that longs stocks with positive alphas and shorts stocks with negative alphas. The preliminary results are very promising.

## Optimal discretization of hedging strategies with jumps

Theme:
Statistics and Modeling for Complex Data
Speaker:
ROSENBAUM Mathieu

In this work, we consider the hedging error due to discrete trading in models with jumps. We propose a framework that makes it possible to (asymptotically) optimize the discretization times. More precisely, a strategy is said to be optimal if, for a given cost function, no strategy has (asymptotically) a lower mean square error for a smaller cost. We focus on strategies based on hitting times and give explicit expressions for the optimal strategies. This is joint work with Peter Tankov.

## Optimality properties for the estimation of jumps in stochastic processes

Theme:
Statistics and Modeling for Complex Data
Speaker:
GLOTER Arnaud

In this paper, we study the problem of optimal estimation of the size of jumps for diffusion processes.
We prove a LAMN property in the case where the instants of the jumps are random and the sizes of the jumps are deterministic.
We also study the situation where both the instants and the sizes of the jumps are random, and prove a convolution theorem of Hájek type in this case.

This is joint work with E. Clément (Université Paris Est) and Sylvain Delattre (Université Paris 7).

## A Central Limit Theorem for Interacting Markov Chains

Theme:
Statistics and Modeling for Complex Data
Speaker:
MOULINES Éric

## Central limit theorems for the nonparametric estimation of time-changed Lévy models

Theme:
Statistics and Modeling for Complex Data
Speaker:
FIGUEROA-LÓPEZ José

We consider a time-changed Lévy model of the form X(t)=Z(T(t)), where Z is a Lévy process with Lévy measure F and T(t) is an independent random clock with speed process driven by an ergodic diffusion r(t). By modeling the log return process of a financial asset in terms of X(t), one can incorporate several important stylized features of asset prices, such as leptokurtic return distributions and volatility clustering. In this talk, we propose an estimator for the integral F(g) of a test function g with respect to the Lévy measure F and prove central limit theorems (CLTs) for our estimators. The functional parameters F(g) can in turn be used as the building blocks of several nonparametric estimation methods such as sieve-based estimation and kernel estimation. The CLTs are valid when both the sampling frequency and the time horizon of observations get larger. Our results combine the long-run ergodic properties of the diffusion process r(t) with the short-term ergodic properties of the Lévy process Z via central limit theorems for martingale differences.

## Reconstructing quantum states by certified compressed sensing

Theme:
Statistics and Modeling for Complex Data
Speaker:
OHLIGER Matthias

We investigate a method which allows us to reconstruct low-rank quantum states from few measured expectation values of observables which can be taken from arbitrary, even continuous, generalized bases. Building on earlier work on matrix completion, we present an algorithm which not only succeeds with very high probability but also allows for a certification of its success. One does not need any assumptions on the state but can check whether the reconstruction was successful based on the available measured data only. The algorithm is fast both in an asymptotic sense and for problem sizes of practical importance. We also discuss the issue of robustness, which is vital for any real-world application, and show how the performance is affected by noise and decoherence.

Joint work with David Gross, Vincent Nesme, Jens Eisert.

## Trace-preserving property of quantum Markov process in matrix product state

Theme:
Statistics and Modeling for Complex Data
Speaker:
MORIMAE Tomoyuki

Matrix product representation is an efficient way of representing quantum states in exponentially large Hilbert spaces, and it has been used in various fields such as quantum computation, quantum state estimation, and condensed matter physics. In this talk, after briefly reviewing the basics of matrix product representation, I will show our recent result about the trace-preserving property of the quantum Markov process associated with the matrix used in matrix product representation. I will also explain an application of this result to measurement-based quantum computation.

Joint work with Dr. Keisuke Fujii (Osaka University, Japan)

## Sparse Analysis of Multichannel and Spherical Astronomical Data

Theme:
Statistics and Modeling for Complex Data
Speaker:
STARCK Jean-Luc

The satellite PLANCK, launched in 2009, maps the anisotropies in the cosmic microwave background (CMB), the radiation emitted 13.7 billion years ago, at the time when the Universe became transparent to light. The goal of the experiment is to measure, with a precision better than 1%, the parameters of the Standard Model of cosmology, also known as the "Big Bang model."
PLANCK provides us with full-sky temperature and polarization maps in nine frequencies, from 27 to 850 GHz. We present sparse representations on the sphere and show how these decompositions can be used for recovering the CMB.