Asymptotic Statistics in Non-Sparse Networks
Exchangeable arrays have been studied since the late 1970s (Aldous (1983), Kallenberg (2005)). Eagleson and Weber (1978) and Silverman (1976) establish strong laws of large numbers and central limit theorems for such arrays. Because non-sparse networks and multiway clustering are closely related to exchangeable arrays, these arrays have received renewed attention in statistics and econometrics (Davezies, D’Haultfœuille, and Guyonvarch (2018), Davezies, D’Haultfœuille, and Guyonvarch (2021), Menzel (2018)). We focus below on non-sparse networks and present uniformity results underlying the asymptotic normality of many nonlinear estimators. We also show the general validity of a bootstrap scheme adapted to such data.
Non-sparse networks, dyadic data and exchangeability
Dyadic data are random variables (or vectors) \(Y_{i,j}\) indexed by \(i\) and \(j\), two units from the same population. For instance, \(Y_{i,j}\) could be exports from country \(i\) to country \(j\). Another example comes from digital networks, where \(Y_{i,j}\) could be, e.g., the number of messages from \(i\) to \(j\). It is useful to represent such data as an array with an undefined diagonal (no \(Y_{i,i}\) is observed): \[\begin{pmatrix} & Y_{1,2} & \cdots & Y_{1,n}\\ Y_{2,1} & & \cdots & Y_{2,n}\\ \vdots & \vdots & \ddots & \vdots\\ Y_{n,1} & Y_{n,2} & \cdots & \end{pmatrix}\]
In this set-up, the \(n(n-1)\) variables are potentially dependent because \(Y_{i,j}\) is likely correlated with \(Y_{i',j'}\) if \(\{i,j\}\cap\{i',j'\}\neq \emptyset\). In the first example, exports from \(i\) to \(j\) are correlated with exports from \(j\) to \(i\), but also with other exports or imports of \(i\) or \(j\). The following assumptions allow for such correlations, while keeping important aspects of iid sampling: \[\begin{align*} \textbf{Joint exchangeability: } & (Y_{i,j})_{(i,j)\in \mathbb{N}^{\ast 2}, i\neq j}\text{ has the same distribution as }(Y_{\pi(i),\pi(j)})_{(i,j)\in \mathbb{N}^{\ast 2}, i\neq j}, \\ & \text{ for any permutation }\pi \text{ of }\mathbb{N}^{\ast}, \\ \textbf{Dissociation: } & (Y_{i,j})_{(i,j)\in \{1,…,k\}^2, i\neq j}\text{ and } (Y_{i,j})_{(i,j)\in \{k+1,k+2,…\}^2, i\neq j} \text{ are independent,} \\ & \text{for any }k \in \mathbb{N}^{\ast}. \end{align*}\]
These notions generalize “iidness”: if \((X_i)_{i\geq 1}\) are i.i.d., then \(Y_{i,j}=X_i\), for any \(i\neq j\), defines a jointly exchangeable and dissociated array.
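To make this concrete, here is a minimal R sketch (ours, not from the original post) that simulates such an array from iid latent factors, in the spirit of the Aldous–Hoover representation \(Y_{i,j}=f(U_i,U_j,U_{i,j})\); the function name and the choice of \(f\) are arbitrary.

```r
# Minimal sketch: simulate a jointly exchangeable and dissociated dyadic array
# from iid latent factors, Y_{i,j} = f(U_i, U_j, U_{i,j}).
simulate_dyadic <- function(n) {
  U   <- runif(n)                    # one latent factor per vertex
  Uij <- matrix(runif(n^2), n, n)    # one idiosyncratic shock per ordered pair
  Y   <- outer(U, U, "+") + Uij      # any measurable f(U_i, U_j, U_{i,j}) works
  diag(Y) <- NA                      # Y_{i,i} is not defined for dyadic data
  Y
}
Y <- simulate_dyadic(50)             # a 50 x 50 array with NA diagonal
```

Relabeling the vertices permutes the iid \(U_i\) and \(U_{i,j}\), so the distribution of the array is unchanged (joint exchangeability), and disjoint sets of vertices involve independent factors (dissociation).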
Simple law of large numbers (LLN) and central limit theorems (CLT)
The following results generalize the usual LLN and CLT for iid data to jointly exchangeable and dissociated arrays. If \(\mathbb{E}(|Y_{1,2}|)<\infty\), then \[\frac{1}{n(n-1)}\sum_{\substack{1\leq i,j\leq n\\ i\neq j}}Y_{i,j}\xrightarrow{a.s.}\mathbb{E}(Y_{1,2}).\] If, moreover, \(\mathbb{E}\left(|Y_{1,2}|^2\right)<\infty\), then \[\sqrt{n}\left(\frac{1}{n(n-1)}\sum_{\substack{1\leq i,j\leq n\\ i\neq j}}Y_{i,j}-\mathbb{E}(Y_{1,2})\right)\stackrel{d}{\longrightarrow}\mathcal{N}\left(0,V\right), \text{ with } V=Cov\left(Y_{1,2}+Y_{2,1},Y_{1,3}+Y_{3,1}\right).\]
These results actually hold for “multiadic” data, viz. data indexed by \(k\)-tuples instead of pairs. This has a nice consequence. If the variables \((Y_{i,j})_{i\neq j}\) are jointly exchangeable and dissociated, the variables \(Z_{i,j,k}=(Y_{i,j}+Y_{j,i})(Y_{i,k}+Y_{k,i})'\) (with \(i, j\) and \(k\) all distinct) also are. Then by the LLN above, we have, if \(\mathbb{E}\left(|Y_{1,2}|^2\right)<\infty\), \[\frac{1}{n(n-1)(n-2)}\sum_{1\leq i,j,k\leq n}Z_{i,j,k}\xrightarrow{a.s.}\mathbb{E} \left((Y_{1,2}+Y_{2,1})(Y_{1,3}+Y_{3,1})'\right).\] Next, if we let \(\overline{Y}_i=\frac{1}{n-1}\sum_{\substack{1\leq j \leq n\\ j\neq i}}(Y_{i,j}+Y_{j,i})\) and \(\overline{Y}=\frac{1}{n}\sum_{i=1}^{n}\overline{Y}_{i}\), we obtain \[\widehat{V}=\frac{1}{n}\sum_{1\leq i \leq n}(\overline{Y}_i-\overline{Y})(\overline{Y}_i-\overline{Y})'\xrightarrow{a.s.} V.\] Hence, t-tests or F-tests of the hypothesis \(\mathbb{E}(Y_{1,2})=\theta_0\) using \(\widehat{V}\) are asymptotically valid as soon as \(V\) is non-singular.
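In R, \(\widehat{V}\) and the associated t-test can be computed along the following lines for scalar \(Y_{i,j}\) stored in an \(n\times n\) matrix with an NA diagonal (a hedged sketch; the function and variable names are ours):

```r
# Sketch: variance estimator V_hat and t-test of H0: E(Y_{1,2}) = theta0.
dyadic_ttest <- function(Y, theta0 = 0) {
  n      <- nrow(Y)
  # Ybar_i = (1/(n-1)) * sum_{j != i} (Y_ij + Y_ji)
  Ybar_i <- sapply(1:n, function(i) mean(Y[i, -i] + Y[-i, i]))
  Ybar   <- mean(Ybar_i)
  V_hat  <- mean((Ybar_i - Ybar)^2)            # consistent for V
  est    <- mean(Y[row(Y) != col(Y)])          # mean over the n(n-1) dyads
  tstat  <- sqrt(n) * (est - theta0) / sqrt(V_hat)
  c(estimate = est, t = tstat, p.value = 2 * pnorm(-abs(tstat)))
}
dyadic_ttest(Y, theta0 = 1.5)   # 1.5 is the true E(Y_{1,2}) in the simulation above
```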
Uniform LLN and CLT
The previous results are nonetheless insufficient in many cases, especially with nonlinear (e.g., M- or Z-) estimators. A common way to handle such problems is then to render the “simple” LLN and CLT uniform over suitable classes of functions. Specifically, for \(\mathcal{F}\) a class of bounded functions with values in \(\mathbb{R}^k\) and \(f\in \mathcal{F}\), let \(\mathbb{P}_n(f)=\frac{1}{n(n-1)}\sum_{1\leq i,j\leq n}f(Y_{i,j})\) and \(\mathbb{G}_n(f)=\sqrt{n}\left(\mathbb{P}_n(f)-\mathbb{E}(f(Y_{1,2}))\right)\). The class \(\mathcal{F}\) is called Glivenko-Cantelli if \[\begin{align}\sup_{f\in \mathcal{F}}\left|\mathbb{P}_n(f)-\mathbb{E}(f(Y_{1,2}))\right|\xrightarrow{a.s.} 0.\label{eq1}\tag{1}\end{align}\] Similarly, the class \(\mathcal{F}\) is Donsker for the distribution of \((Y_{i,j})_{i\neq j}\) if \[\begin{align}\mathbb{G}_n\stackrel{d}{\longrightarrow}\mathbb{G},\label{eq2}\tag{2}\end{align}\] with \(\mathbb{G}\) a Gaussian process indexed by \(\mathcal{F}\).
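As an illustration, take the class of indicators \(f_t(y)=\mathbb{1}\{y\leq t\}\), so that \(\mathbb{P}_n(f_t)\) is the empirical CDF of the \(n(n-1)\) dyads. A minimal R sketch of \(\mathbb{G}_n(f_t)\), where `F0` stands for the CDF of \(Y_{1,2}\) (assumed known here purely for illustration):

```r
# Sketch: the dyadic empirical process G_n(f_t) for indicators f_t(y) = 1{y <= t}.
dyadic_Gn <- function(Y, t, F0) {
  n  <- nrow(Y)
  y  <- Y[row(Y) != col(Y)]                  # the n(n-1) observed dyads
  Pn <- sapply(t, function(s) mean(y <= s))  # P_n(f_t): dyadic empirical CDF
  sqrt(n) * (Pn - F0(t))                     # G_n(f_t): note the sqrt(n) scaling,
}                                            # not sqrt(n(n-1)) as with iid dyads
```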
Results \(\eqref{eq1}\) and \(\eqref{eq2}\) have been shown for iid data under various conditions. We consider two standard ones involving the so-called covering numbers of \(\mathcal{F}\). First, we introduce additional notation. For any \(\eta > 0\) and any seminorm \(||\cdot||\) on a space including \(\mathcal{F}\), \(N(\eta, \mathcal{F}, ||\cdot||)\) denotes the minimal number of \(||\cdot||\)-closed balls of radius \(\eta\) with centers in \(\mathcal{F}\) needed to cover \(\mathcal{F}\). The seminorms we consider hereafter are \(|f|_{\mu,r} = \left(\int |f|^rd\mu\right)^{1/r}\) for any \(r \geq 1\) and probability measure \(\mu\). An envelope of \(\mathcal{F}\) is a measurable function \(F\) satisfying \(F(u) \geq \sup_{f\in \mathcal{F}} |f(u)|\) for all \(u\). We let \(\mathcal{Q}\) denote the set of probability measures with finite support. Finally, \(P\) denotes the distribution of \(Y_{1,2}\).
The conditions for \(\eqref{eq1}\) and \(\eqref{eq2}\) to hold with iid data are then:
Condition \(CG(\mathcal{F})\): \(\forall \eta>0,\ \sup_{Q\in \mathcal{Q}}N\left(\eta|F|_{Q,1},\mathcal{F},|\cdot|_{Q,1}\right)<\infty,\) with \(|F|_{P,1}<\infty\).
Condition \(D(\mathcal{F})\): \(\int_0^{+\infty}\sup_{Q\in \mathcal{Q}}\sqrt{\log N\left(\eta|F|_{Q,2},\mathcal{F},|\cdot|_{Q,2}\right)}d\eta<+\infty,\) with \(|F|_{P,2}<\infty\). For instance, the class of indicators \(\mathcal{F}=\{y\mapsto \mathbb{1}\{y\leq t\}: t\in\mathbb{R}\}\) used above is a VC class with envelope \(F=1\), and satisfies both conditions.
Remarkably, these results directly extend to jointly exchangeable and dissociated arrays:
If Condition \(CG(\mathcal{F})\) holds then \(\eqref{eq1}\) holds.
If Condition \(D(\mathcal{F})\) holds then \(\eqref{eq2}\) holds, with \(\mathbb{G}\) having a covariance kernel \(K\) defined by \[K(f_1,f_2)=Cov\left(f_1(Y_{1,2})+f_1(Y_{2,1}),f_2(Y_{1,3})+f_2(Y_{3,1})\right).\]
Uniform LLN and CLT also hold using standard conditions on bracketing numbers, instead of covering numbers as above.
Take-away: the main asymptotic results used to establish the properties of nonlinear estimators with iid data also hold with jointly exchangeable and dissociated arrays.
Hence, asymptotically normal estimators with iid data are also asymptotically normal with jointly exchangeable and dissociated arrays. The only difference lies in their rate of convergence and their asymptotic variance \(V\). If \(V\) is not singular, inference can be based on consistent estimation of \(V\), as explained above. But a simple bootstrap scheme can also be used.
Bootstrapping exchangeable arrays
The main message here is that even if the parameters of interest depend on the distribution of the “edges”, we have to bootstrap vertices! Specifically, let \((1^{\ast}, 2^{\ast},…,n^{\ast})\) be an iid sample drawn uniformly from \(\{1,…,n\}\), and consider the bootstrap sample \((Y_{i^{\ast},j^{\ast}})_{i\neq j,\ i^{\ast}\neq j^{\ast}}\).
To illustrate the bootstrap scheme, imagine a network with 5 vertices and a bootstrap sample of vertices \((1^{\ast},2^{\ast},3^{\ast},4^{\ast},5^{\ast})=(2,1,2,5,4)\). The corresponding bootstrapped network is shown in Figure 1.
Figure 1: Bootstrapped network for \((1^{\ast},2^{\ast},3^{\ast},4^{\ast},5^{\ast})=(2,1,2,5,4)\)
The bootstrapped average is \(\mathbb{P}_n^{\ast}(f)=\frac{1}{n(n-1)}\sum_{1\leq i,j\leq n}f(Y_{i^{\ast},j^{\ast}})\mathbb{1}_{\{i^{\ast}\neq j^{\ast}\}}\), while the bootstrapped process is \(\mathbb{G}_n^{\ast}(f)=\sqrt{n}\left(\mathbb{P}_n^{\ast}(f)-\mathbb{P}_n(f)\right)\).
As with iid data, the bootstrap scheme above leads to consistent inference if Condition \(D(\mathcal{F})\) holds. Specifically, we then have, conditional on \((Y_{i,j})_{(i,j)\in \mathbb{N}^{\ast 2}, i\neq j}\) and almost surely, \[\begin{align}\mathbb{G}_n^{\ast}\stackrel{d}{\longrightarrow}\mathbb{G},\end{align}\] where \(\mathbb{G}\) is the same Gaussian process as above. This ensures the validity of the bootstrap in many nonlinear contexts.
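For concreteness, here is a hedged R sketch (ours, not the authors’ code) of the vertex bootstrap for the dyadic mean; it resamples vertices with replacement and recomputes the average over the induced network, exactly as in the definition of \(\mathbb{P}_n^{\ast}\):

```r
# Sketch: vertex bootstrap for the mean of a dyadic array Y (NA diagonal).
dyadic_bootstrap <- function(Y, B = 999) {
  n  <- nrow(Y)
  Pn <- mean(Y[row(Y) != col(Y)])                 # P_n: mean over the n(n-1) dyads
  Gstar <- replicate(B, {
    star  <- sample(n, n, replace = TRUE)         # bootstrap vertices, not edges
    Ystar <- Y[star, star]                        # induced bootstrapped network
    keep  <- !is.na(Ystar)                        # drops pairs with i* = j*
    sqrt(n) * (sum(Ystar[keep]) / (n * (n - 1)) - Pn)   # G_n^* for the identity
  })
  list(estimate = Pn, Gstar = Gstar)
}
# Basic bootstrap 95% confidence interval for E(Y_{1,2}):
out <- dyadic_bootstrap(Y)
out$estimate - rev(quantile(out$Gstar, c(0.025, 0.975))) / sqrt(nrow(Y))
```

Note that the division is by \(n(n-1)\) whatever the number of surviving pairs, matching the indicator \(\mathbb{1}_{\{i^{\ast}\neq j^{\ast}\}}\) in the definition of \(\mathbb{P}_n^{\ast}(f)\).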
Bibliography

Aldous, D. J. (1983). Exchangeability and Related Topics. École d’Été de Probabilités de Saint-Flour XIII. Lecture Notes in Mathematics, Springer.

Davezies, L., D’Haultfœuille, X., and Guyonvarch, Y. (2018). Asymptotic Results Under Multiway Clustering. arXiv preprint.

Davezies, L., D’Haultfœuille, X., and Guyonvarch, Y. (2021). Empirical Process Results for Exchangeable Arrays. The Annals of Statistics, 49.

Eagleson, G. K., and Weber, N. C. (1978). Limit Theorems for Weakly Exchangeable Arrays. Mathematical Proceedings of the Cambridge Philosophical Society, 84.

Kallenberg, O. (2005). Probabilistic Symmetries and Invariance Principles. Springer.

Menzel, K. (2018). Bootstrap with Cluster-Dependence in Two or More Dimensions. Working paper, arXiv.

Silverman, B. W. (1976). Limit Theorems for Dissociated Random Variables. Advances in Applied Probability, 8.