### Chapter 2 - Convergences and error estimates - Exercises

Exercise 2.1 (central limit theorem with various rates)
Let $(X_m)_{m\geq 1}$ be a sequence of i.i.d. random variables having the symmetric Pareto distribution with parameter $\alpha>0$, whose density is
$$\frac{\alpha}{2|z|^{\alpha+1}} 1_{|z|\geq 1}.$$
Set $\overline{X}_M:=\frac{1}{M}\sum_{m=1}^M X_m$.
1. For $\alpha>1$, justify the a.s. convergence of $\overline{X}_M$ to 0 as $M\to +\infty$.
2. For $\alpha>2$, prove a central limit theorem for $\overline{X}_M$ at rate $\sqrt M$.
3. For $\alpha\in (1,2]$, determine the rate $u_M\to +\infty$ such that $u_M \overline{X}_M$ converges in distribution to a limit.
Hint: to distinguish the cases $\alpha=2$ and $\alpha\in (1,2)$, use the Lévy criterion (Theorem A.1.3) with the representation of the characteristic function $$\mathbb{E}(e^{iu X})=1 + \alpha |u|^\alpha\int_{|u|}^{+\infty} \frac{\cos(t)-1}{t^{\alpha+1}} {\mathrm d}t.$$
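A quick numerical companion to question 1 (a sketch; the choice $\alpha=3$ and the sample size are arbitrary). The symmetric Pareto variable can be simulated by inverse transform, since $|Z|=U^{-1/\alpha}$ has the one-sided Pareto density $\alpha z^{-\alpha-1}1_{z\geq 1}$ for $U$ uniform on $(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_symmetric_pareto(alpha, size, rng):
    # Inverse transform: |Z| = U^(-1/alpha) has the Pareto(alpha) density
    # alpha * z^(-alpha-1) on [1, +inf); an independent random sign symmetrizes it.
    u = rng.uniform(size=size)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * u ** (-1.0 / alpha)

# For alpha = 3 > 2 the variance is finite (alpha/(alpha-2) = 3), so both the
# strong law of question 1 and the sqrt(M)-CLT of question 2 apply.
alpha, M = 3.0, 200_000
x = sample_symmetric_pareto(alpha, M, rng)
print(abs(x.mean()))  # close to 0 for large M
```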

Exercise 2.2 (substitution method)
We aim to estimate the Laplace transform $$\phi(u):=\mathbb{E}(e^{uX})=e^{\sigma^2u^2/2}$$ of a Gaussian random variable $X\overset d= {\cal N}(0,\sigma^2)$, using a sample of $M$ i.i.d. copies of $X$. The parameter $\sigma^2$ is unknown.
Which procedure is the most accurate?
1.  Computing the empirical mean $\phi_{1,M}(u):= \frac{ 1 }{M }\sum_{m=1}^M e^{u X_m}$;

or
2. Estimating $\sigma^2$ by the empirical variance $\sigma^2_M$, then estimating $\phi(u)$ by $\phi_{2,M}(u):= e^{\frac{ u^2 }{2 }\sigma^2_M}$.
We will compare the estimators using the associated confidence intervals.
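A minimal numerical comparison of the two procedures (a sketch; the asymptotic-variance comparison via confidence intervals is the actual exercise, and the values $\sigma = u = 1$ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, u, M = 1.0, 1.0, 100_000
phi_true = np.exp(sigma**2 * u**2 / 2)

x = sigma * rng.standard_normal(M)   # i.i.d. sample of X ~ N(0, sigma^2)

# Procedure 1: plain empirical mean of exp(u X).
phi1 = np.mean(np.exp(u * x))

# Procedure 2: plug the empirical variance into the closed-form expression.
sigma2_M = np.var(x, ddof=1)
phi2 = np.exp(u**2 * sigma2_M / 2)

print(phi_true, phi1, phi2)
```

Repeating the experiment over many independent samples would give an empirical view of the two variances.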

Exercise 2.3 (central limit theorem, substitution method)
Consider the setting of Proposition 2.2.6, with the estimation of $\max(\mathbb{E}(X),a)$. Assume $M$ is even and let $\overline X_{1,M}=\frac{ 2 }{ M}\sum_{m=1}^{M/2} X_m$ and $\overline X_{2,M}=\frac{ 2 }{ M}\sum_{m=M/2+1}^{M} X_m$. Set
$$\overline f_M=\max\Big(\frac{ 1 }{ M}\sum_{m=1}^{M} X_m,a\Big),\qquad \underline f_M=\mathbb{1}_{\overline X_{1,M}\geq a} \overline X_{2,M}+\mathbb{1}_{\overline X_{1,M}< a}\, a.$$
1. Assume  $a> \mathbb{E}(X)$. Prove that  both  the upper estimator $\overline f_M$ and the lower estimator  $\underline f_M$ converge to $\max(\mathbb{E}(X),a)$ in $L_1$ at the rate $M$.
2. Assume  $a< \mathbb{E}(X)$. Establish a central limit theorem at rate $\sqrt M$ for  $\overline f_M-\max(\mathbb{E}(X),a)$,  for $\underline f_M-\max(\mathbb{E}(X),a)$, and for the pair $(\overline f_M-\max(\mathbb{E}(X),a),\underline f_M-\max(\mathbb{E}(X),a))$.
3. Investigate the case $a=\mathbb{E}(X)$.
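The two estimators are easy to set up numerically (a sketch in the regime $a<\mathbb{E}(X)$ of question 2; the uniform distribution for $X$ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 100_000
a = 0.0
x = rng.uniform(size=M)   # i.i.d. X_m uniform on [0,1], so E(X) = 0.5 > a
target = max(0.5, a)

# Upper estimator: max of the full-sample mean and a.
f_upper = max(x.mean(), a)

# Lower estimator: the first half of the sample decides the regime,
# the second half estimates the mean.
x1, x2 = x[: M // 2], x[M // 2:]
f_lower = x2.mean() if x1.mean() >= a else a

print(target, f_upper, f_lower)
```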
Exercise 2.4 (sensitivity formulas, exponential distribution)
Let $Y\overset d={\cal E}xp(\lambda)$ with $\lambda > 0$ and set $F (\lambda) = \mathbb{E}(f (Y))$ for a given bounded function  $f$.
1. Use the likelihood ratio method to represent $F'$ as an expectation.
2.  Use the pathwise differentiation method to get another representation for smooth $f$ (use the representation $Y=-\frac{1}{\lambda}\log(U)$ with $U$ uniform on $(0,1)$).
3.  By integrating by parts, show that both formulas coincide.
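Both representations can be checked numerically (a sketch; the test function $f(y)=e^{-y}$ is an arbitrary choice, made so that $F(\lambda)=\lambda/(\lambda+1)$ is available in closed form):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, M = 2.0, 500_000
u = rng.uniform(size=M)
y = -np.log(u) / lam              # Y ~ Exp(lam) by inverse transform

f  = lambda y: np.exp(-y)         # bounded smooth test function
df = lambda y: -np.exp(-y)

# Likelihood ratio: the score of the density lam*exp(-lam*y) is 1/lam - y,
# so F'(lam) = E[ f(Y) (1/lam - Y) ].
lr = np.mean(f(y) * (1.0 / lam - y))

# Pathwise: d/d lam of Y(U) = -log(U)/lam is -Y/lam,
# so F'(lam) = E[ f'(Y) * (-Y/lam) ].
pw = np.mean(df(y) * (-y / lam))

exact = 1.0 / (lam + 1.0) ** 2    # F'(lam) for this particular f
print(exact, lr, pw)
```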

Exercise 2.5 (sensitivity formulas, multidimensional Gaussian distribution)
In dimension 1, Example 2.2.11 gives, for $Y^{m,\sigma}\overset d= {\cal N}(m,\sigma^2)$,
\begin{align*}
\partial_m \mathbb{E}(f(Y^{m,\sigma}))&=\mathbb{E}\Big(f(Y^{m,\sigma})\frac{(Y^{m,\sigma}-m)}{\sigma^2}
\Big),\\
\partial_\sigma \mathbb{E}(f(Y^{m,\sigma}))&=\mathbb{E}\Big(f(Y^{m,\sigma}){\frac1 \sigma}\big[\frac{(Y^{m,\sigma}-m)^2}{\sigma^2}-1\big]
\Big).
\end{align*}
Extend this to the multidimensional case for the sensitivity with respect to the mean $\mathbb{E}(Y)$ and to the covariance matrix $\mathbb{E}(Y Y^*)$ (assumed to be invertible).
Hint: for the sensitivity w.r.t. elements of $\mathbb{E}(Y Y^*)$, one has to perturb the matrix $\mathbb{E}(Y Y^*)$ in a symmetric way to keep it symmetric.
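For the mean sensitivity, the natural multidimensional analogue of the first formula (which the exercise asks to establish) is $\nabla_m \mathbb{E}(f(Y)) = \mathbb{E}\big(f(Y)\,\Sigma^{-1}(Y-m)\big)$ for $Y\overset d={\cal N}(m,\Sigma)$. A sketch checking it against common-random-numbers finite differences (the test function, dimension, and parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
d, M = 2, 400_000
m = np.array([0.3, -0.5])
Sigma = np.array([[1.0, 0.4], [0.4, 2.0]])
L = np.linalg.cholesky(Sigma)
Sinv = np.linalg.inv(Sigma)

f = lambda y: np.sin(y[:, 0]) + y[:, 1] ** 2   # arbitrary smooth test function

g = rng.standard_normal((M, d))
y = m + g @ L.T                                # Y ~ N(m, Sigma)

# Likelihood-ratio gradient w.r.t. the mean:
# grad_m E f(Y) = E[ f(Y) * Sigma^{-1} (Y - m) ]
grad_lr = (f(y)[:, None] * ((y - m) @ Sinv)).mean(axis=0)

# Central finite differences in m, reusing the same Gaussian draws.
eps = 1e-2
grad_fd = np.empty(d)
for i in range(d):
    e = np.zeros(d); e[i] = eps
    grad_fd[i] = (f(y + e).mean() - f(y - e).mean()) / (2 * eps)

print(grad_lr, grad_fd)
```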

Exercise 2.6 (sensitivity formulas, resimulation method)
We aim to illustrate the benefit of using Common Random Numbers (CRN) in the evaluation of $\partial_\theta \mathbb{E}(f(Y^\theta))$ and the impact of the smoothness of $f$ on the estimator variance. We consider the Gaussian model $Y^\theta\overset d={\cal N}(\theta,1)$.
1. Denote by $(G_1,\dots,G_M,G'_1,\dots,G'_M)$ i.i.d. copies of ${\cal N}(0,1)$ and consider a smooth function $f$, bounded with bounded derivatives. Compute the variance of the estimator with different random numbers $$\frac 1M\sum_{m=1}^M \frac{f(\theta+\varepsilon+G_m)-f(\theta-\varepsilon+G'_m)}{2\varepsilon}$$ as $\varepsilon\to0$ ($M$ fixed).
2.  Compare it with that of the CRN estimator $$\frac 1M\sum_{m=1}^M \frac{f(\theta+\varepsilon+G_m)-f(\theta-\varepsilon+G_m)}{2\varepsilon}.$$
3. Analyze the variance of the CRN estimator when $f(x)=\mathbf{1}_{x\geq 0}$. What is its dependency w.r.t. $\varepsilon\to0$?
Write a simulation program illustrating these features.
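The requested program might look as follows (a sketch; $f=\tanh$ is an arbitrary smooth bounded choice, and $\theta=0$):

```python
import numpy as np

rng = np.random.default_rng(5)
theta, M, n_trials = 0.0, 1_000, 2_000
f = np.tanh   # smooth bounded test function with bounded derivatives

def fd_estimator(eps, crn, rng):
    # Central finite difference of theta -> E f(theta + G); with crn=True the
    # same Gaussians are reused in both terms (Common Random Numbers).
    g = rng.standard_normal(M)
    g2 = g if crn else rng.standard_normal(M)
    return np.mean((f(theta + eps + g) - f(theta - eps + g2)) / (2 * eps))

results = {}
for eps in (0.5, 0.1, 0.02):
    results[eps] = (np.var([fd_estimator(eps, False, rng) for _ in range(n_trials)]),
                    np.var([fd_estimator(eps, True, rng) for _ in range(n_trials)]))
    print(f"eps={eps}: var (independent) ~ {results[eps][0]:.2e},"
          f" var (CRN) ~ {results[eps][1]:.2e}")

# Question 3: indicator f. The CRN difference vanishes unless G falls in an
# interval of length 2*eps, so the CRN variance now grows like 1/eps.
ind = lambda x: (x >= 0).astype(float)

def fd_indicator(eps, rng):
    g = rng.standard_normal(M)
    return np.mean((ind(theta + eps + g) - ind(theta - eps + g)) / (2 * eps))

ind_var = {}
for eps in (0.1, 0.02):
    ind_var[eps] = np.var([fd_indicator(eps, rng) for _ in range(n_trials)])
    print(f"indicator, eps={eps}: var (CRN) ~ {ind_var[eps]:.2e}")
```

For smooth $f$, the independent-numbers variance blows up like $1/\varepsilon^2$ while the CRN variance stays bounded; with the indicator, the CRN variance degrades to order $1/\varepsilon$.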

Exercise 2.7 (concentration inequality, maximum of Gaussian variables)
Corollary 2.4.1 states that if $Y$ is a random vector in $\mathbb{R}^d$ whose distribution $\mu$ satisfies a logarithmic Sobolev inequality with constant $c_\mu>0$, then for any Lipschitz function $f:\mathbb{R}^d\to \mathbb{R}$, we have
$$\mathbf{P}(|f(Y)-\mathbf{E}(f(Y))|>\varepsilon)\leq 2\exp\left(-\frac{ \varepsilon^2 }{c_\mu |f|_{\rm Lip}^2}\right), \qquad \forall \varepsilon\geq 0.$$
1. Use the above concentration inequality in the Gaussian case to establish the Borell inequality (1975): for any centered $d$-dimensional Gaussian vector $Y=(Y_1,\dots,Y_d)$, we have $$\mathbb{P}\left(|\max_{1\leq i\leq d} Y_i-\mathbb{E}(\max_{1\leq i\leq d} Y_i)|>\varepsilon\right)\leq 2\exp\Big(-\frac{ \varepsilon^2 }{2\sigma^2}\Big), \qquad \forall \varepsilon\geq 0$$ where $\sigma^2=\max_{1\leq i\leq d}\mathbb{E}(Y^2_i)$ (Observe that the constants do not depend much on the dimension, thus passing to infinite dimension is possible).
Hint: first assume that $(Y_i)_i$ are i.i.d. standard Gaussian random variables. To prove  the general case, use the representation of Proposition 1.4.1.
2. We consider the case $d\to+\infty$ and assume that $\sigma^2=\max_{1\leq i\leq d}\mathbb{E}(Y^2_i)$ remains bounded as $d\to+\infty$. Assuming that $\mathbb{E}(\max_{1\leq i\leq d} Y_i)\to+\infty$, deduce that $$\frac{\max_{1\leq i\leq d} Y_i }{\mathbb{E}(\max_{1\leq i\leq d} Y_i)}\underset{d\to+\infty}{\overset{\rm Prob.}{\longrightarrow}} 1.$$
Application: in the standard i.i.d. case, since $\mathbb{E}(\max_{1\leq i\leq d} Y_i)\sim\sqrt{2\log(d)}$ as $d\to+\infty$ (see [J. Galambos. The Asymptotic Theory of Extreme Order Statistics. R.E. Kreiger, Malabar, FL, 1987]), we obtain a nice deterministic equivalent (in probability) of $\max_{1\leq i\leq d} Y_i$.
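A numerical illustration of question 2 and of the application, in the standard i.i.d. case (a sketch; note that the ratio approaches 1 only logarithmically in $d$, while the fluctuations of the max remain of order 1, which is the concentration phenomenon):

```python
import numpy as np

rng = np.random.default_rng(6)
n_trials = 500
stats = {}
for d in (100, 10_000):
    # max of d i.i.d. standard Gaussians, over n_trials independent replications
    mx = rng.standard_normal((n_trials, d)).max(axis=1)
    stats[d] = (mx.mean(), mx.std())
    print(f"d={d}: mean of max ~ {mx.mean():.3f}"
          f"  (sqrt(2 log d) = {np.sqrt(2 * np.log(d)):.3f}),"
          f"  std of max ~ {mx.std():.3f}")
```

The mean grows like $\sqrt{2\log(d)}$ while the standard deviation stays bounded, in line with the Borell inequality.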