
2.6. \(\qproc\): A stochastic process on quantile functions


When replication is non-stationary, each replicate dataset will produce a different loss PPF. A distribution over replications (like the previously defined epistemic distribution \(Ω\)) therefore corresponds to a distribution over PPFs. This leads us to the final formulation of our EMD assumption, which we now operationalize by assuming a linear relation between empirical model discrepancy and epistemic uncertainty:

EMD principle

For a given model \(\M_A\), differences between its PPFs on two replicate datasets should be proportional to \(δ^{\EMD}_A\).

Formalising this idea is made considerably simpler by the fact that PPFs are always scalar functions, irrespective of the dimensionality of the model or dataset. We do this as follows, going through the steps schematically represented by the downward-facing arrows on the right of Fig. 1.1:

For a given model \(\M_A\), we treat the PPFs of different replicates as realisations of a stochastic process \(\qproc_A\) on the interval \([0, 1]\); a realisation of \(\qproc_A\) (i.e. a PPF) is a function \(\qh_A(Φ)\) with \(Φ \in [0, 1]\). The stochastic process \(\qproc_A\) is defined to satisfy the desiderata listed below, which ensure that it is centered on the observed PPF \(\qs_A\) and that its variance at each \(Φ\) is governed by \(δ^{\EMD}_A\).

Obtaining the \(R\)-distributions shown in Fig. 2.2 is then relatively straightforward: we simply sample an ensemble of \(\qh_A\) from \(\qproc_A\) and integrate each to obtain an ensemble of risks.

Working with PPFs has the important advantage that we can express our fundamental assumption as a simple linear proportionality relation between empirical model discrepancy and epistemic uncertainty. While this is certainly an approximation, it simplifies the interpretation of the resulting \(R\)-distribution. As we show in the Supplementary Discussion, the resulting criterion is also more robust than the alternatives. The simplicity of the assumption, however, belies the complexity of defining a stochastic process \(\qproc\) directly on PPFs.

2.6.1. Desiderata for \(\qproc\)

For realisations of \({\qh_A \sim \qproc_A}\) to be interpreted as PPFs, they must themselves be valid PPFs. This places quite strong constraints on those realisations, for example:

  • All realisations \({\qh_A(Φ) \sim \qproc_A}\) must be monotone.

  • All realisations \({\qh_A(Φ) \sim \qproc_A}\) must be integrable.

Monotonicity follows immediately from definitions: a CDF is always monotone because it is the integral of a non-negative function (Eq. 2.15), and therefore its inverse must also be monotone.

Integrability simply means that the integral in Eq. 2.17 exists and is finite. Concretely this is enforced by ensuring that the process \(\qproc_A\) is self-consistent [42], a property which we explain in the Methods.

Interpreting the realisations \(\qh\) as PPFs also imposes a third constraint, more subtle but equally important:

  • The process \(\qproc_A\) must be non-accumulating.

A process which is accumulating would start at one end of the domain, say \(Φ=0\), and sequentially accumulate increments until it reaches the other end. Brownian motion over the interval \([0, T]\) is an example of such a process. In contrast, consider the process of constructing a PPF for the data in Fig. 2.1: initially we have few data points and the PPF of their loss is very coarse. As the number of points increases, the PPF gets refined, but since loss values occur in no particular order, this happens simultaneously across the entire interval.

The accumulation of increments strongly influences the statistics of a process; most notably, the variance is usually larger further along the domain. This would not make sense for a PPF: if \(\VV[\qproc_A(Φ)]\) is smaller than \(\VV[\qproc_A(Φ')]\), that should be a consequence of \(δ^{\EMD}(Φ)\) being smaller than \(δ^{\EMD}(Φ')\) — not of \(Φ\) occurring “before” \(Φ'\).

This idea that a realisation \(\qh_A(Φ)\) is generated simultaneously across the interval led us to define \(\qproc_A\) as a sequence of refinements: starting from an initial increment \(Δ\qh_1(0) = \qh_A(1) - \qh_A(0)\) for the entire \(Φ\) interval \([0, 1]\), we partition \([0, 1]\) into \(n\) subintervals, and sample a set of \(n\) subincrements in such a way that they sum to \(Δ\qh_1(0)\). This type of distribution, where \(n\) random variables are drawn under the constraint of a fixed sum, is called a compositional distribution [43]. Note that the constraint reduces the number of dimensions by one, so a pair of increments would be drawn from a 1-d compositional distribution. A typical 1-d example is the beta distribution for \(x_1 \in [0, 1]\), with \(x_2 = (1-x_1)\) and \(α,β > 0\):

(2.21) \[\begin{aligned} \text{if }\; & x_1 \sim \Beta(α, β) \,, & \text{ then }\; & p(x_1) \propto x_1^{α-1} (1-x_1)^{β-1}\,. \end{aligned}\]
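To make the compositional sampling concrete, the sketch below (in Python, assuming NumPy and SciPy; the shape parameters are purely illustrative, not the calibrated values of the Methods) draws pairs \((x_1, x_2)\) subject to the fixed-sum constraint:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Draw pairs (x1, x2) with x1 + x2 = 1, i.e. samples from the
# 1-d compositional (beta) distribution of Eq. 2.21.
a, b = 2.0, 5.0                   # illustrative shape parameters
x1 = stats.beta.rvs(a, b, size=1000, random_state=rng)
x2 = 1.0 - x1                     # the fixed-sum constraint removes one dimension

assert np.allclose(x1 + x2, 1.0)  # every pair sums to the fixed total
```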

Interestingly, the most natural statistics for compositional distributions are not the mean and variance, but analogous notions of centre and metric variance [43, 44]; for the beta distribution defined above, these are

(2.22) \[\begin{split}\begin{aligned} \EE_a[(x_1, x_2)] &= \frac{1}{e^{ψ(α)} + e^{ψ(β)}} \bigl(e^{ψ(α)}, e^{ψ(β)}\bigr) \\ \Mvar[(x_1, x_2)] &= \frac{1}{2} \bigl(ψ_1(α) + ψ_1(β)\bigr) \,, \end{aligned}\end{split}\]

where \(ψ\) and \(ψ_1\) are the digamma and trigamma functions respectively, and \(\EE_a\) denotes expectation with respect to the Aitchison measure [43, 45]. In essence, the expressions in Eq. 2.22 are obtained by mapping \(x_1\) and \(x_2\) to the unbounded domain \(\RR\) via a logistic transformation, then evaluating moments of the unbounded variables. Of particular relevance is that, in contrast to the variance, the metric variance \(\Mvar\) of a compositional distribution is unbounded, which simplifies the selection of \(α\) and \(β\) (see Choosing beta distribution parameters in the Methods).
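Both statistics are inexpensive to evaluate. A minimal sketch of Eq. 2.22, assuming SciPy (whose `polygamma(1, ·)` is the trigamma function \(ψ_1\)):

```python
import numpy as np
from scipy.special import digamma, polygamma

def aitchison_centre(a, b):
    """Centre E_a[(x1, x2)] for x1 ~ Beta(a, b)  (first line of Eq. 2.22)."""
    w = np.exp(digamma(np.array([a, b])))
    return w / w.sum()

def metric_variance(a, b):
    """Metric variance Mvar[(x1, x2)]  (second line of Eq. 2.22)."""
    return 0.5 * (polygamma(1, a) + polygamma(1, b))

print(aitchison_centre(2.0, 5.0))  # centre weights grow with the shape parameters
print(metric_variance(2.0, 5.0))   # grows without bound as a, b -> 0
```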

Of course, we not only want the \(\qh_A\) to be valid PPFs, but also descriptive of the model and data. We express this as two additional constraints, which together define a notion of closeness to \(\qs_A\):

  • At each intermediate point \(Φ \in (0, 1)\), the centre of the process is given by \(\qs_A(Φ)\):

(2.23) \[\EE_a\bigl[ \qh(Φ) \bigr] = \qs_A(Φ) \,.\]
  • At each intermediate point \(Φ \in (0, 1)\), the metric variance is proportional to the square of \(δ^{\EMD}_A(Φ)\):

(2.24) \[\Mvar\bigl[ \qh(Φ) \bigr] = c \, δ^{\EMD}_A(Φ)^2 \,.\]

Note that Eq. 2.24 formalises the EMD principle stated at the top of this section, with the sensitivity factor \(c > 0\) determining the conversion from empirical model discrepancy (which we interpret as misspecification) to epistemic uncertainty. Note also that Eq. 2.23 and Eq. 2.24 are akin to a variational approximation of the stochastic process around \(q^*_A\). Correspondingly, we should expect these equations (and therefore \(\qproc\)) to work best when \(c\) is not too large. We describe how one might determine an appropriate value for \(c\) in the later section on calibration.

In addition to the above desiderata, for reasons of convenience, we also ask that

  • The end points are sampled from Gaussian distributions:

(2.25) \[\begin{split}\begin{aligned} \qh(0) \sim \nN(\qs_A(0), c \, δ^{\EMD}_A(0)^2) \,, \\ \qh(1) \sim \nN(\qs_A(1), c \, δ^{\EMD}_A(1)^2) \,. \end{aligned}\end{split}\]

Thus the process \(\qproc_{A;c}\) is parametrised by two functions and a scalar: \(\qs_A\), \(δ^{\EMD}_A\) and \(c\). It is moulded to produce realisations \(\qh\) which as a whole track \(\qs_A\), with more variability between realisations at points \(Φ\) where \(δ^{\EMD}_A(Φ)\) is larger (middle panel of Fig. 2.4a).

To the best of our knowledge the current literature does not provide a process satisfying all of these constraints. To remedy this situation, we propose a new hierarchical beta (HB) process, which we illustrate in Fig. 2.4. A few example realisations of \({\qh \sim \qproc_A}\) are drawn as grey lines in Fig. 2.4a. The mixed PPF \(\qs_A\) (Eq. 2.23) is drawn as a green line, while the region corresponding to \(\qs_A(Φ) \pm \sqrt{c}\, δ^{\EMD}_A(Φ)\) is shaded in yellow. The process is well-defined for \(c \in [0, \infty]\): the \(c \to 0\) limit is simply \(\qs_A\), while the \(c \to \infty\) limit is a process which samples uniformly at every step, irrespective of \(\qs_A\) and \(δ^{\EMD}_A\). As \(c\) increases, the ability of each realisation \(\qh_A\) to track \(\qs_A\) becomes increasingly limited by the constraints of monotonicity and non-accumulating increments; this translates into step-like PPFs, as seen in the lower panel of Fig. 2.4a.

Fig. 2.4b shows distributions of \(\qh(Φ)\) at three different values of \(Φ\). The value of \(\qs_A(Φ)\) is indicated by the green vertical bar and agrees well with \(\EE_a\bigl[ \qh(Φ) \bigr]\); the desideratum of Eq. 2.23 is therefore satisfied. The scaling of these distributions with \(δ^{\EMD}_A\) (Eq. 2.24) is however only approximate, as evidenced by the yellow shading not having the same width in each panel. This is a result of the tension with the other constraints: the HB process ensures that the monotonicity and integrability constraints are satisfied exactly, but allows deviations in the statistical constraints.

A realisation of an HB process is obtained by a sequence of refinements; we illustrate four such refinement steps in Fig. 2.4d. The basic idea is to refine an increment \(Δ\qh_{ΔΦ}(Φ) := \qh(Φ+ΔΦ) - \qh(Φ)\) into two subincrements \(Δ\qh_{\tfrac{ΔΦ}{2}}(Φ)\) and \(Δ\qh_{\tfrac{ΔΦ}{2}}(Φ+\tfrac{ΔΦ}{2})\), with \(Δ\qh_{\tfrac{ΔΦ}{2}}(Φ) + Δ\qh_{\tfrac{ΔΦ}{2}}(Φ+\tfrac{ΔΦ}{2}) = Δ\qh_{ΔΦ}(Φ)\). To do this, we first determine appropriate parameters \(α\) and \(β\), draw \(x_1\) from the corresponding beta distribution, and assign

(2.26) \[\begin{split}\begin{aligned} Δ\qh_{\tfrac{ΔΦ}{2}}\bigl(Φ\bigr) &\leftarrow x_1 Δ\qh_{ΔΦ}\bigl(Φ\bigr) \,, \\ Δ\qh_{\tfrac{ΔΦ}{2}}\bigl(Φ+\tfrac{ΔΦ}{2}\bigr) &\leftarrow x_2 Δ\qh_{ΔΦ}\bigl(Φ\bigr) \,, \end{aligned}\end{split}\]

where again \(x_2 = (1-x_1)\). Fig. 2.4c shows distributions for the first (orange) and second (blue) subincrements at the fourth refinement step, where we divide an increment over an interval of length \(ΔΦ=2^{-3}\) into two subincrements over intervals of length \(2^{-4}\). Each pair of subincrements is drawn from a different distribution, which depends on the particular realisation \(\qh\), but we can nevertheless see the PPF reflected in the aggregate distribution: the PPF has positive curvature, so the second subincrement tends to be larger than the first. Both subincrements are also bounded below by 0, which ensures monotonicity.

A complete description of the HB process, including a procedure for choosing the beta parameters \(α\) and \(β\) such that our desiderata are satisfied, is given in the Methods.
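To make the refinement recursion concrete, the following sketch generates one realisation end to end (in Python with NumPy). It assumes vectorised callables `qstar` and `delta_emd` for \(\qs_A\) and \(δ^{\EMD}_A\); crucially, the rule used here to choose \(α\) and \(β\) (centring each beta split on the fraction of the increment predicted by \(\qs_A\), with a concentration that loosens as \(c\) grows) is a simple stand-in, not the calibrated procedure of the Methods:

```python
import numpy as np

def sample_hb_ppf(qstar, delta_emd, c, refinements=8, rng=None):
    """Draw one realisation of a hierarchical-beta-style process (sketch).

    qstar, delta_emd : vectorised callables on [0, 1];  c : sensitivity factor.
    """
    rng = np.random.default_rng(rng)
    # End points: Gaussian draws (Eq. 2.25), sorted so that q(0) <= q(1).
    q = np.sort([rng.normal(qstar(p), np.sqrt(c) * delta_emd(p)) for p in (0.0, 1.0)])
    for _ in range(refinements):
        phi = np.linspace(0.0, 1.0, q.size)
        mid = 0.5 * (phi[:-1] + phi[1:])     # midpoints to be inserted
        incr = np.diff(q)                    # current increments, all >= 0
        # Stand-in choice of beta parameters: match the beta mean to the
        # fraction of each increment that q* assigns to its first half.
        frac = (qstar(mid) - q[:-1]) / np.maximum(incr, 1e-12)
        frac = np.clip(frac, 1e-3, 1.0 - 1e-3)
        kappa = 2.0 / c                      # looser splits for larger c
        x1 = rng.beta(kappa * frac, kappa * (1.0 - frac))  # split of Eq. 2.26
        new_q = q[:-1] + x1 * incr           # first subincrement fixes the new point
        merged = np.empty(2 * q.size - 1)
        merged[0::2], merged[1::2] = q, new_q   # interleave old and new points
        q = merged
    return np.linspace(0.0, 1.0, q.size), q  # (Φ grid, realisation q̂(Φ))
```

Because each new point is a convex combination of its two neighbours, monotonicity holds by construction; and since every refinement redistributes, rather than accumulates, the initial increment, the process is non-accumulating in the sense above.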

2.6.2. Using \(\qproc\) to compare models

Having constructed a process \(\qproc_{A;c}\) for a candidate model \(\M_A\), we can use it to induce a distribution on risks. We do this by generating a sequence of PPFs \(\qh_{A,1}, \qh_{A,2}, \dotsc, \qh_{A,M_A}\), where \(M_A \in \NN\) and each \(\qh_{A,i}\) is drawn from \(\qproc_{A;c}\) (see Fig. 2.4 for examples of sampled \(\qh_{A,i}\), and the Methods for more details on how we evaluate \(\Bemd{}\)). As we explain in the next section, the sensitivity parameter \(c\) is a property of the experiment; it is the same for all candidate models.

For each generated PPF, we evaluate the risk functional (using the integral form of Eq. 2.17), thus obtaining a sequence of scalars \(R\bigl[\qh_{A,1}\bigr], R\bigl[\qh_{A,2}\bigr], \dotsc, R\bigl[\qh_{A,M_A}\bigr]\) which follows \(p(R_A \mid \qproc_A)\). With \(M_A\) sufficiently large, these samples accurately characterise the distribution \(p(R_A \mid \qproc_A)\) (we use \(\leftrightarrow\) to relate equivalent descriptions):

(2.27) \[\begin{split}\begin{multline} R_A \sim p(R_A \mid \D_\test, \M_A; c) \\ \begin{aligned} &\leftrightarrow \Bigl\{R\bigl[\qh_{A,1}\bigr], R\bigl[\qh_{A,2}\bigr], \dotsc, R\bigl[\qh_{A,M_A}\bigr] \Bigm| \qh_A \sim \qproc_A \Bigr\} \\ &\leftrightarrow \Bigl\{R_{A,1}\,,\;\; R_{A,2}\,,\;\; \dotsc, R_{A,M_A}\Bigr\} \,. \end{aligned} \end{multline}\end{split}\]
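In code, and assuming the risk functional reduces to integrating the PPF over \([0, 1]\) (the integral form of Eq. 2.17), the ensemble in Eq. 2.27 is a plain Monte Carlo loop over the sketch above; `qstar_A` and `delta_emd_A` are assumed defined for model \(\M_A\):

```python
# Monte Carlo ensemble of risks for model A (Eq. 2.27).
M_A = 1000
R_A = np.empty(M_A)
for i in range(M_A):
    phi, qhat = sample_hb_ppf(qstar_A, delta_emd_A, c=0.5, rng=i)
    R_A[i] = np.trapz(qhat, phi)   # risk of this realisation (Eq. 2.17)
```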

Repeating this procedure for a different model \(\M_B\) yields a different distribution for the risk:

\[R_B \sim p(R_B \mid \D_\test, \M_B; c) \;\leftrightarrow\; \Bigl\{R_{B,1}\,, R_{B,2}\,, \dotsc, R_{B,M_B}\Bigr\} \,.\]

The \(\Bemd{AB;c}\) criterion Eq. 2.13 then reduces to a double sum:

(2.28) \[\begin{split}\begin{aligned} \Bemd{AB;c} &\coloneqq P(R_A < R_B \mid c) \\ &\approx \frac{1}{M_A M_B} \sum_{i=1}^{M_A} \sum_{j=1}^{M_B} \boldsymbol{1}_{R_{A,i} < R_{B,j}}\,. \end{aligned}\end{split}\]

In Eq. 2.28, the term within the sum is one when \({R_{A,i} < R_{B,j}}\) and zero otherwise. A value of \(\Bemd{AB;c}\) greater (less) than 0.5 indicates evidence for (against) model \(\M_A\).
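Numerically, the double sum in Eq. 2.28 amounts to a single broadcasted comparison between the two risk ensembles; a sketch:

```python
def bemd(R_A, R_B):
    """Estimate B^EMD_{AB;c} = P(R_A < R_B) from risk samples (Eq. 2.28)."""
    # Broadcasting compares every R_{A,i} against every R_{B,j}.
    return np.mean(R_A[:, None] < R_B[None, :])
```

For large ensembles the \(M_A \times M_B\) comparison matrix can be avoided by sorting one ensemble and counting with `np.searchsorted`, but the direct form above mirrors Eq. 2.28 exactly.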

The undetermined parameter \(c\) (which converts discrepancy into epistemic uncertainty; cf. Eq. 2.24) can be viewed as a way to adjust the sensitivity of the criterion: larger values of \(c\) typically lead to broader distributions of \(R_A\), and therefore to a more conservative criterion (i.e. one which is more likely to result in an equivocal outcome). We give some guidelines on choosing \(c\) in the next section.


Fig. 2.4 Sampling a hierarchical beta (HB) process. (a) PPF samples (grey lines) drawn from a hierarchical beta process \(\qproc\). Mixed (green) and synthetic (red) PPFs are those for the Planck model of Fig. 2.7f (respectively \(\qs\) from Eq. 2.16 and \(\qt\) from Eq. 2.19). Lower panels are enlarged portions of the top panel, corresponding to the black rectangles in the latter. At each \(Φ\), the variance between realisations is controlled by \(\sqrt{c}\, δ^{\EMD}\) (yellow shading), per Eq. 2.24; here we use \(c=0.5\). (b) Marginals of \(\qproc\) at three values of \(Φ\), obtained as histograms of 10,000 realisations \(\qh\). As in (a), the green line indicates the value of \(\qs(Φ)\) and the yellow shading describes the range \(\qs(Φ) \pm \sqrt{c}δ^{\EMD}(Φ)\). Values of \(Φ\) are given alongside the panels and drawn as cyan vertical lines in (a). (c) Distributions of the subincrements drawn at the same three \(Φ\) positions as in (b), obtained as histograms from the same 10,000 realisations. Subincrements are for the fourth refinement step, corresponding to the fourth panel of (d). Notation in the legend corresponds to Eq. 2.26. (d) Illustration of the refinement process. This \(\qproc\) is parametrised by the same PPFs as the one in (a–c), but we used a larger value \(c = 16\) to facilitate visualisation. Each refinement step halves the width \(ΔΦ\) of an increment; here we show four refinement steps, while in most calculations we use eight. New points at each step are coloured green; the beta distribution from which each new point is drawn is shown in grey. The domain of each of these distributions spans the vertical distance between the two neighbouring points and is shown as a grey vertical line.