A note on the maximum of iid random variables

Seeking an optimal-ish rescaling of random maxima in a special case

What can we say about the maximum of $n$ iid random variables? In this note, which supports my TDA work with Veronica Ciocanel, we will look at a special case where the tail of the distribution takes on a form that we saw in our numerical experiments.

Namely, suppose $X, X_1, \ldots, X_n$ are iid with common cumulative distribution function (cdf) $F(x) = P(X \leq x)$. We will assume that

\begin{equation} \label{eq:cdf-form} F(x) = 1 - e^{-\phi(x)}, \end{equation}

where $\phi(x)$ is a smooth, increasing function with at most polynomial growth of degree $d \geq 1$.

It is a well-known result that, when properly rescaled, the maximum $X_n^{*} := \max(X_1, \ldots, X_n)$ of a sequence of iid random variables with this type of distribution converges in distribution to the Gumbel distribution; the standard Gumbel random variable has cdf $G(x) := e^{-e^{-x}}$. The rest of the Gumbel family is expressed in terms of a location parameter $\mu$ and a scale parameter $\beta$: a non-standard Gumbel random variable can be written $\widetilde{G} = \beta G + \mu$. The main principle of the analysis that follows is that a "good" rescaling scheme for the sample maxima $X_n^{*}$ can be written in terms of pairs $(\mu_n, \beta_n)$, which are estimates of the location and scale parameters under the assumption that $X_n^{*}$ is actually Gumbel. That is to say, using the method of moments to generate our approximation, we define

\begin{equation} \begin{aligned} \label{eq:rescaling-defn} \beta_n &:= \frac{\sqrt{6}} {\pi} SD\big(X_n^{*}\big), \\ \mu_n &:= E\big(X_n^{*}\big) - \beta_n \gamma \end{aligned} \end{equation}

where $\gamma$ is the Euler-Mascheroni constant. In this note, we will take as given that

\begin{equation} \label{eq:converge-to-G} \frac{X_n^{*} - \mu_n}{\beta_n} \Rightarrow G \end{equation} where $\Rightarrow$ denotes convergence in distribution and $G$ is a standard Gumbel random variable.
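As a quick sanity check of this rescaling, here is a minimal Monte Carlo sketch (assuming NumPy, and taking $\phi(x) = x$, i.e. standard exponential variables; the exact moments in \eqref{eq:rescaling-defn} are replaced by their sample counterparts):

```python
import numpy as np

rng = np.random.default_rng(0)

n, trials = 200, 50_000
# Assumed example: phi(x) = x, i.e. standard exponential variables.
maxima = rng.exponential(scale=1.0, size=(trials, n)).max(axis=1)

# Method-of-moments stand-ins for the scale and location in eq. (rescaling-defn),
# with the exact moments of X_n^* replaced by sample moments.
beta_n = np.sqrt(6) / np.pi * maxima.std()
mu_n = maxima.mean() - beta_n * np.euler_gamma

rescaled = (maxima - mu_n) / beta_n
# Compare a few quantiles of the rescaled maxima with the standard Gumbel,
# whose quantile function is -log(-log(q)).
for q in (0.1, 0.5, 0.9):
    print(f"q={q}: empirical {np.quantile(rescaled, q):.3f}  Gumbel {-np.log(-np.log(q)):.3f}")
```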

Our goal is to understand $\mu_n$ and $\beta_n$ in terms of the log-survival tail function, $\phi$.


Moments of maximal RVs

The density of the maximal random variable is quick to derive from its cdf. Let $F_n(x) = P\big(X_n^{*} \leq x\big)$. Then, by independence, $F_n(x) = \prod_{i = 1}^n P(X_i \leq x) = F(x)^n$, and it follows that the density has the form

\begin{equation} \label{eq:max-density-F} f_n(x) = F_n'(x) = n F(x)^{n-1} F'(x). \end{equation}

Under the assumption \eqref{eq:cdf-form} this takes the form \begin{equation} \label{eq:max-density-phi} f_n(x) = n (1-e^{-\phi(x)})^{n-1} e^{-\phi(x)} \phi'(x). \end{equation}

Assuming, as in our setting, that the $X_i$ are supported on $[0, \infty)$ (equivalently, $\phi(0) = 0$), it follows that the moments of the maximal random variable have the form

\begin{equation} \label{eq:maximal-moments} E\big((X_n^{*})^p\big) = \int_0^\infty x^p n (1-e^{-\phi(x)})^{n-1} e^{-\phi(x)} \phi'(x) dx. \end{equation}
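For a concrete check of \eqref{eq:maximal-moments}, here is a sketch that evaluates the integral by quadrature and compares it against a Monte Carlo estimate; it assumes Rayleigh$(1)$ variables, i.e. $\phi(x) = x^2/2$, and the availability of NumPy and SciPy.

```python
import numpy as np
from scipy.integrate import quad

# Assumed example: Rayleigh(1) variables, phi(x) = x^2 / 2, phi'(x) = x.
phi = lambda x: x**2 / 2
dphi = lambda x: x

def moment_quad(n, p):
    """E[(X_n^*)^p] by numerical quadrature of eq. (maximal-moments)."""
    integrand = lambda x: x**p * n * (1 - np.exp(-phi(x)))**(n - 1) * np.exp(-phi(x)) * dphi(x)
    return quad(integrand, 0, np.inf)[0]

# Cross-check against a Monte Carlo estimate of the same moment.
rng = np.random.default_rng(1)
mc = rng.rayleigh(scale=1.0, size=(200_000, 50)).max(axis=1).mean()
print(moment_quad(50, 1), mc)
```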


Asymptotic analysis

Under the substitution $y = \phi(x)$, and defining $\psi(y) = \phi^{-1}(y)$, we have \begin{equation*} M_{n,p} := E\big((X_n^{*})^p\big) = \int_0^\infty (\psi(y))^p n (1-e^{-y})^{n-1} e^{-y} dy. \end{equation*}

Now, we expand the binomial term \begin{equation} \begin{aligned} n \, (1-e^{-y})^{n-1} e^{-y} &= n \, \sum_{j=0}^{n-1} {n-1 \choose j} (-1)^j e^{-(j+1) y} \\ &= \sum_{j=0}^{n-1} (j+1) {n \choose j+1} (-1)^j e^{-(j+1) y} \\ &= \sum_{k=1}^{n} k {n \choose k} (-1)^{k-1} e^{-k y} \end{aligned} \end{equation} where we have set $k = j+1$ in the last line.
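A quick numerical check of this expansion at a single pair $(n, y)$ (a sketch, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.special import comb

# Check the alternating-sum expansion of n (1 - e^{-y})^{n-1} e^{-y} at one point.
n, y = 7, 0.8
direct = n * (1 - np.exp(-y)) ** (n - 1) * np.exp(-y)
expanded = sum(k * comb(n, k) * (-1) ** (k - 1) * np.exp(-k * y) for k in range(1, n + 1))
print(direct, expanded)  # the two values agree to machine precision
```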

So, bringing back the full integral, \begin{equation*} M_{n,p} = \sum_{k=1}^{n} {n \choose k} (-1)^{k-1} \int_0^\infty \psi(y)^p k e^{-k y} dy. \end{equation*}

There are multiple ways to handle the $k$ that appears in the integral. A first approach is to express the answer in terms of the $\Gamma$ function. To this end, substitute $t = ky$. Then \begin{equation} \label{eq:Mn-p} M_{n,p} = \sum_{k=1}^n {n \choose k} (-1)^{k-1} \int_0^\infty \psi\big(\frac{t}{k}\big)^p e^{-t} dt. \end{equation}

When $\psi(y)$ admits a series expansion in powers of $y$, each integral can be evaluated term by term in terms of the Gamma function, $\Gamma(\alpha) = \int_0^\infty t^{\alpha - 1} e^{-t}\, dt$.
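Before specializing, here is a sketch that checks \eqref{eq:Mn-p} against direct quadrature of \eqref{eq:maximal-moments} for an assumed non-monomial example, $\phi(x) = x + x^2$ (so $\psi(y) = \frac{-1 + \sqrt{1+4y}}{2}$); NumPy and SciPy are assumed.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import comb

# Assumed non-monomial example: phi(x) = x + x^2, so psi(y) = (-1 + sqrt(1 + 4y)) / 2.
phi = lambda x: x + x**2
dphi = lambda x: 1 + 2 * x
psi = lambda y: (-1 + np.sqrt(1 + 4 * y)) / 2

def M_density(n, p):
    """E[(X_n^*)^p] by quadrature of the density form, eq. (maximal-moments)."""
    f = lambda x: x**p * n * (1 - np.exp(-phi(x)))**(n - 1) * np.exp(-phi(x)) * dphi(x)
    return quad(f, 0, np.inf)[0]

def M_expansion(n, p):
    """The same moment via the alternating sum of integrals in eq. (Mn-p)."""
    return sum(
        comb(n, k) * (-1) ** (k - 1) * quad(lambda t, k=k: psi(t / k) ** p * np.exp(-t), 0, np.inf)[0]
        for k in range(1, n + 1)
    )

print(M_density(10, 2), M_expansion(10, 2))  # should agree up to quadrature error
```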


Solution when $\phi(x)$ is a monomial

Let $c > 0$ and $d \geq 1$ and suppose that $\phi(x) = c x^d$. This is exactly the case, for example, for Rayleigh$(\sigma)$ distributed random variables, with $d=2$ and $c = 1/(2\sigma^2)$. It follows that $\psi(y) = (y/c)^{1/d}$, and the integral in \eqref{eq:Mn-p} can be evaluated as follows: \begin{equation} \int_0^\infty \Big(\frac{t}{ck}\Big)^{p/d} e^{-t} dt = \frac{1}{(ck)^{p/d}} \Gamma\Big(1 + \frac{p}{d}\Big) = \frac{1}{(ck)^{p/d}} \frac{p}{d} \Gamma\Big(\frac{p}{d}\Big) \end{equation} where we have used the Gamma function property that $\Gamma(1+z) = z \Gamma(z)$.

Altogether we have \begin{equation} \label{eq:Mn-p-monomial} M_{n,p} = \frac{p}{d\, c^{p/d}} \Gamma\Big(\frac{p}{d}\Big) \sum_{k=1}^n (-1)^{k-1} {n \choose k} \frac{1}{k^{p/d}}. \end{equation}

So, in the end, we are at the mercy of whether or not we can evaluate this sum explicitly as a function of $n$.
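As a numerical sanity check of \eqref{eq:Mn-p-monomial}, here is a sketch (assuming NumPy and SciPy): for Rayleigh$(1)$ variables ($c = 1/2$, $d = 2$) the formula should match a Monte Carlo estimate of $E(X_n^{*})$, and in the exponential case ($c = d = p = 1$) the alternating sum is the harmonic number $H_n$, recovering the classical fact $E(X_n^{*}) = H_n$.

```python
import numpy as np
from scipy.special import comb, gamma

def M_monomial(n, p, c, d):
    """Moment E[(X_n^*)^p] from eq. (Mn-p-monomial) when phi(x) = c x^d."""
    s = p / d
    # Note: the alternating sum is numerically delicate for large n.
    alt_sum = sum((-1) ** (k - 1) * comb(n, k) / k**s for k in range(1, n + 1))
    return (p / d) * gamma(s) / c**s * alt_sum

# Rayleigh(1): c = 1/2, d = 2.  Compare E[X_30^*] with a Monte Carlo estimate.
rng = np.random.default_rng(2)
mc = rng.rayleigh(scale=1.0, size=(200_000, 30)).max(axis=1).mean()
print(M_monomial(30, 1, 0.5, 2), mc)

# Exponential case (c = d = p = 1): the alternating sum is the harmonic number H_n,
# recovering the classical E[X_n^*] = H_n.
print(M_monomial(30, 1, 1.0, 1), sum(1 / k for k in range(1, 31)))
```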
