Sampling quantile paths#
__all__ = ["generate_quantile_paths", "generate_path_hierarchical_beta"]
We want to generate paths \(\lnLh\) for the quantile function \(l(Φ)\), with \(Φ \in [0, 1]\), from a stochastic process \(\pathP\) determined by \(\lnLtt(Φ)\) and \(\emdstd(Φ)\). This process must satisfy the following requirements:
It must generate only monotone paths, since quantile functions are monotone.
The process must be heteroscedastic, with variability at \(Φ\) given by \(\emdstd(Φ)\).
Generated paths should track the function \(\lnLtt(Φ)\), and track it more tightly when variability is low.
Paths should be “temporally uncorrelated”: each stop \(Φ\) corresponds to a different ensemble of data points, which can be drawn in any order. So we don’t expect any special correlation between \(\lnLh(Φ)\) and \(\lnLh(Φ')\), beyond the requirement of monotonicity.
In particular, we want to avoid defining a stochastic process which starts at one point and accumulates variance, like the \(\sqrt{t}\) envelope characteristic of a Gaussian white noise.
Concretely, we require the process to be “\(Φ\)-symmetric”: replacing \(\lnLtt(Φ) \to \lnLtt(1-Φ)\) and \(\emdstd(Φ) \to \emdstd(1-Φ)\) should define the same process, just inverted along the \(Φ\) axis.
Hierarchical beta process#
Because the joint requirements of monotonicity, non-stationarity and \(Φ\)-symmetry are uncommon for a stochastic process, some care is required to define an appropriate \(\pathP\). The approach we choose here is to first select the end points \(\lnLh(0)\) and \(\lnLh(1)\), then fill the interval by successive binary partitions: first \(\bigl\{\lnLh\bigl(\tfrac{1}{2}\bigr)\bigr\}\), then \(\bigl\{\lnLh\bigl(\tfrac{1}{4}\bigr), \lnLh\bigl(\tfrac{3}{4}\bigr)\bigr\}\), \(\bigl\{\lnLh\bigl(\tfrac{1}{8}\bigr), \lnLh\bigl(\tfrac{3}{8}\bigr), \lnLh\bigl(\tfrac{5}{8}\bigr), \lnLh\bigl(\tfrac{7}{8}\bigr)\bigr\}\), etc. (Below we will denote these ensembles \(\{\lnLh\}^{(1)}\), \(\{\lnLh\}^{(2)}\), \(\{\lnLh\}^{(3)}\), etc.) Thus integrating over paths becomes akin to a path integral with variable end points. Moreover, instead of drawing quantile values directly, we draw increments

\[Δ q_{2^{-n}}(Φ) := \lnLh\bigl(Φ + 2^{-n}\bigr) - \lnLh(Φ)\,.\]
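To make the refinement scheme concrete, the following small sketch (illustrative only, plain NumPy) lists the abscissae added at each refinement level, i.e. the ensembles \(\{\lnLh\}^{(n)}\) above:

import numpy as np

# Level n adds the odd multiples of 2**-n, i.e. the ensemble {lnLh}^(n).
for n in range(1, 4):
    new_points = np.arange(1, 2**n, 2) / 2**n   # odd k / 2**n
    print(f"n={n}: {new_points}")
# n=1: [0.5]
# n=2: [0.25 0.75]
# n=3: [0.125 0.375 0.625 0.875]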
Given two initial end points \(\lnLh(0)\) and \(\lnLh(1)\), we therefore first draw the pair \(\bigl\{Δ q_{2^{-1}}(0),\; Δ q_{2^{-1}}\bigl(2^{-1}\bigr)\bigr\}\), which gives us

\[\lnLh\bigl(\tfrac{1}{2}\bigr) = \lnLh(0) + Δ q_{2^{-1}}(0) = \lnLh(1) - Δ q_{2^{-1}}\bigl(2^{-1}\bigr)\,.\]
Then \(\bigl\{\lnLh(0), \lnLh\bigl(\frac{1}{2}\bigr) \bigr\}\) and \(\bigl\{ \lnLh\bigl(\frac{1}{2}\bigr), \lnLh(1) \bigr\}\) serve as end points to draw \(\bigl\{Δ q_{2^{-2}}\bigl(0\bigr),\; Δ q_{2^{-2}}\bigl(2^{-2}\bigr) \bigr\}\) and \(\bigl\{Δ q_{2^{-2}}\bigl(2^{-1}\bigr),\; Δ q_{2^{-2}}\bigl(2^{-1} + 2^{-2}\bigr) \bigr\}\). We repeat the procedure as needed, sampling smaller and smaller increments, until the path has the desired resolution. Since the increments are constrained to be non-negative,

\[Δ q_{2^{-n}}(Φ) \ge 0\,,\]
the path thus sampled is always monotone. Note also that increments must be drawn in pairs (or more generally, in combinations) of values constrained by their sum:

\[Δ q_{2^{-n}}(Φ) + Δ q_{2^{-n}}\bigl(Φ+2^{-n}\bigr) = Δ q_{2^{-n+1}}(Φ)\,.\]
The possible increments therefore lie on a 1-simplex, for which a natural choice is a beta distribution[1], with the random variable \(x_1\) corresponding to the first increment \(Δ q_{2^{-n}}(Φ)\). The density function of a beta random variable has the form

\[p(x_1) = \frac{x_1^{α-1}\,(1-x_1)^{β-1}}{B(α, β)}\,,\]
with \(α\) and \(β\) parameters to be determined.
Important
An essential property of a stochastic process is consistency: it must not matter exactly how we discretize the interval [1]. Let \(\{\lnLh\}^{(n)}\) denote the steps which are added when we refine the discretization from steps of \(2^{-n+1}\) to steps of \(2^{-n}\):

\[\{\lnLh\}^{(n)} := \bigl\{\, \lnLh\bigl(k\,2^{-n}\bigr) \,\big|\, k \text{ odd},\; 0 < k < 2^n \,\bigr\}\,.\]
A necessary condition for consistency is that coarsening the discretization from steps of \(2^{-n}\) to steps of \(2^{-n+1}\) (i.e. marginalizing over the points in \(\{\lnLh\}^{(n)}\)) does not substantially change the probability law:

\[p\bigl(\{\lnLh\}^{(n)}\bigr) = \int p\bigl(\{\lnLh\}^{(n)} \,\big|\, \{\lnLh\}^{(n+1)}\bigr)\, p\bigl(\{\lnLh\}^{(n+1)}\bigr)\, d\{\lnLh\}^{(n+1)} + ε\,,\]
with \(ε\) vanishing as \(n\) is increased to infinity.
This requirement is necessary in order to compute integrals over \(\lnLh\): without it, they do not converge as \(ΔΦ\) is reduced.
Conditions for choosing the beta parameters#
To draw an increment \(Δ q_{2^{-n}}\), we need to convert \(\lnLtt(Φ)\) and \(\emdstd(Φ)\) into beta distribution parameters \(α\) and \(β\). If \(x_1\) follows a beta distribution, then its first two cumulants are given by

\[\EE[x_1] = \frac{α}{α+β}\,, \qquad \VV[x_1] = \frac{αβ}{(α+β)^2\,(α+β+1)}\,.\]
However, as discussed by Mateu-Figueras et al. (2021, 2011), to properly account for the geometry of a simplex, one should instead use statistics with respect to the Aitchison measure, sometimes referred to as the center and metric variance. Defining \(x_2 = 1-x_1\), these can be written (Mateu-Figueras et al., 2021)

\[\EE_a[x_1] = \frac{e^{ψ(α)}}{e^{ψ(α)} + e^{ψ(β)}}\,, \qquad \Mvar[x_1, x_2] = \frac{1}{2}\bigl(ψ_1(α) + ψ_1(β)\bigr)\,. \tag{6}\]
Here \(ψ\) and \(ψ_1\) are the digamma and trigamma functions respectively. (In addition to being more appropriate, the center and metric variance are also better suited for defining a consistent stochastic process. For example, since the metric variance is unbounded, we can always scale it with \(\emdstd(Φ)\) without exceeding its domain.)
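These expressions can be checked by Monte Carlo. The sketch below (illustrative; the values of \(α\) and \(β\) are arbitrary) computes the center as the closure of componentwise geometric means, and the metric variance as half the variance of the log-ratio \(\ln(x_1/x_2)\):

import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(0)
α, β = 2.5, 4.0
x1 = rng.beta(α, β, size=200_000)
x2 = 1 - x1

# Center: closure of the componentwise geometric means
g1, g2 = np.exp(np.log(x1).mean()), np.exp(np.log(x2).mean())
center_mc = g1 / (g1 + g2)
center_th = np.exp(digamma(α)) / (np.exp(digamma(α)) + np.exp(digamma(β)))

# Metric variance: half the variance of the log-ratio ln(x1/x2)
mvar_mc = 0.5 * np.log(x1/x2).var()
mvar_th = 0.5 * (polygamma(1, α) + polygamma(1, β))

print(center_mc, center_th)   # both ≈ 0.365
print(mvar_mc, mvar_th)       # both ≈ 0.39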
Since we want the sum to be \(d := \lnLh(Φ+2^{-n+1}) - \lnLh(Φ)\), we define

\[x_1 := \frac{Δ q_{2^{-n}}(Φ)}{d}\,, \qquad x_2 := \frac{Δ q_{2^{-n}}\bigl(Φ+2^{-n}\bigr)}{d}\,.\]
Then

\[Δ q_{2^{-n}}(Φ) = d\,x_1\,, \qquad Δ q_{2^{-n}}\bigl(Φ+2^{-n}\bigr) = d\,x_2\,, \qquad x_1 + x_2 = 1\,.\]
We now choose to define the parameters \(α\) and \(β\) via the following relations:

\[\begin{aligned}
d\,\EE_a[x_1] &=^* \lnLtt\bigl(Φ+2^{-n}\bigr) - \lnLtt(Φ)\,, \\
d\,\EE_a[x_2] &=^* \lnLtt\bigl(Φ+2^{-n+1}\bigr) - \lnLtt\bigl(Φ+2^{-n}\bigr)\,, \\
\Mvar[x_1, x_2] &=^* c\,\emdstd\bigl(Φ+2^{-n}\bigr)^2\,.
\end{aligned}\]
These follow from interpreting \(\lnLtt\) and \(\emdstd\) as estimators for the center and square root of the metric variance. We use \(=^*\) to indicate equality in spirit rather than true equality, since strictly speaking these are three equations for two unknowns. To reduce the two \(\EE_a\) equations to one, we use instead their ratio:

\[\frac{\EE_a[x_1]}{\EE_a[x_2]} =^* \frac{\lnLtt\bigl(Φ+2^{-n}\bigr) - \lnLtt(Φ)}{\lnLtt\bigl(Φ+2^{-n+1}\bigr) - \lnLtt\bigl(Φ+2^{-n}\bigr)}\,.\]
Remarks
We satisfy the necessary condition for consistency by construction:
\[p\bigl(\{l\}^{(n)}\bigr) = \int p\bigl(\{l\}^{(n)} \,\big|\, \{l\}^{(n+1)}\bigr)\, p\bigl(\{l\}^{(n+1)}\bigr) \,d\{l\}^{(n+1)}\,.\]The stochastic process is not Markovian, so successive increments are not independent. The variance of a larger increment therefore need not equal the sum of the variances of its constituent smaller ones; in other words,
\[Δ q_{2^{-n+1}}\bigl(Φ\bigr) = Δ q_{2^{-n}}\bigl(Φ\bigr) + Δ q_{2^{-n}}\bigl(Φ+2^{-n}\bigr)\]does not imply
\[\VV\bigl[Δ q_{2^{-n+1}}\bigl(Φ\bigr)\bigr] = \VV\bigl[Δ q_{2^{-n}}\bigl(Φ\bigr)\bigr] + \VV\bigl[Δ q_{2^{-n}}\bigl(Φ+2^{-n}\bigr)\bigr]\,.\]Our defining equations make equivalent use of the pre (\(Δ q_{2^{-n}}(Φ)\)) and post (\(Δ q_{2^{-n}}(Φ+2^{-n})\)) increments, thus preserving symmetry in \(Φ\).
Step sizes of the form \(2^{-n}\) have exact representations in binary. Thus even small step sizes should not introduce additional numerical errors.
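A quick numerical illustration of this remark (a sketch, not part of the module):

import numpy as np

# Powers of two and their sums are exact in binary floating point,
# so refining the grid introduces no rounding error in the abscissae.
assert 2**-3 + 2**-3 == 2**-2
assert np.linspace(0, 1, 2**6 + 1)[2**5] == 0.5   # exact midpoint of the Φ grid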
Formulation of the parameter equations as a root-finding problem#
Define

\[r := \frac{\lnLtt\bigl(Φ+2^{-n}\bigr) - \lnLtt(Φ)}{\lnLtt\bigl(Φ+2^{-n+1}\bigr) - \lnLtt\bigl(Φ+2^{-n}\bigr)}\,, \qquad v := 2\,c\,\emdstd\bigl(Φ+2^{-n}\bigr)^2\,.\]
The first value, \(r\), is the ratio of the two subincrements within \(Δ q_{2^{-n+1}}(Φ)\). Setting \(\frac{e^{ψ(α)}}{e^{ψ(β)}} = r\), the two equations we need to solve for \(α\) and \(β\) can be written

\[\begin{aligned}
ψ(α) - ψ(β) &= \ln r\,, \\
\ln\bigl(ψ_1(α) + ψ_1(β)\bigr) &= \ln v\,.
\end{aligned} \tag{11}\]
Note that these equations preserve the \(Φ\)-symmetry: inverting the path maps \(r \to 1/r\) and swaps \(α \leftrightarrow β\), which simply changes the sign on both sides of the first equation. The use of the logarithm in the equation for \(v\) helps to stabilize the numerics.
def f(lnαβ, lnr_v, _array=np.array, _exp=np.exp, _log=np.log,
digamma=scipy.special.digamma, polygamma=scipy.special.polygamma):
α, β = _exp(lnαβ).reshape(2, -1) # scipy's `root` always flattens inputs
lnr, v = lnr_v
return _array((
digamma(α) - digamma(β) - lnr,
_log(polygamma(1, α) + polygamma(1, β)) - _log(v)
)).flat
def f_mid(lnα, v, _exp=np.exp, _log=np.log, polygamma=scipy.special.polygamma):
"1-d equation, for the special case α=β (equiv to r=1)"
return _log(2*polygamma(1, _exp(lnα))) - _log(v)
def jac(lnαβ, lnr_v, _exp=np.exp, polygamma=scipy.special.polygamma):
    α, β = _exp(lnαβ).reshape(2, -1)  # scipy's `root` always flattens inputs
    ψ1α, ψ1β = polygamma(1, α), polygamma(1, β)
    # Chain rule: d/d lnα = α d/dα (and likewise for β)
    j = np.block([[np.diagflat(ψ1α*α),                       np.diagflat(-ψ1β*β)],
                  [np.diagflat(polygamma(2, α)*α/(ψ1α+ψ1β)), np.diagflat(polygamma(2, β)*β/(ψ1α+ψ1β))]])
    return j
The functions \(ψ\) and \(ψ_1\) diverge at zero, so \(α\) and \(β\) should remain positive. Therefore it makes sense to fit their logarithm: this enforces the lower bound, and improves the resolution where the derivative is highest. The two objective functions (up to scalar shift) are plotted below: the region for low \(\ln α\) and \(\ln β\) shows sharp variation around \(α=β\), suggesting that this area may be most challenging for a numerical optimizer. In practice this is indeed what we observed.
We found however that we can make the fits much more reliable by first choosing a suitable initialization point along the \(\ln α = \ln β\) diagonal. In practice this means setting \(α_0 = α = β\) and solving the second equation of Eqs. (11) for \(α_0\). (We use the implementation of Brent’s method in SciPy.) Then we solve the full 2-d problem of Eqs. (11), with \((α_0, α_0)\) as initial value. This procedure was successful for all values of \(r\) and \(v\) we encountered in our experiments.
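As a concrete illustration of this two-stage procedure, the following sketch assumes the definitions of `f` and `f_mid` above; the test values of `r` and `v` are arbitrary:

import math
import numpy as np
import scipy.optimize
from scipy.special import digamma, polygamma

r, v = 3.0, 0.5                                        # arbitrary test condition
# Stage 1: solve the 1-d problem along the α=β diagonal
lnα0 = scipy.optimize.brentq(f_mid, -5, 20, args=(v,))
# Stage 2: solve the full 2-d problem, initialized at (α0, α0)
res = scipy.optimize.root(f, (lnα0, lnα0), args=[math.log(r), v])
α, β = np.exp(res.x)
# At the root, the fitted parameters reproduce r and v
assert np.isclose(np.exp(digamma(α) - digamma(β)), r)
assert np.isclose(polygamma(1, α) + polygamma(1, β), v)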
Fig. 1 Characterization of the digamma (\(ψ\)) and trigamma (\(ψ_1\)) functions, and of the metric variance \(\Mvar\).#
Plotting the eigenvalues of the Jacobian (specifically, the real part of its dominant eigenvalue) in fact highlights three regions with a center at roughly \((\ln α, \ln β) = (0, 0)\). (The Jacobian does not depend on \(r\) or \(v\), so this is true for all fit conditions).
Note that the color scale is clipped, to better resolve values near zero. Eigenvalues quickly increase by multiple orders of magnitude away from \((0,0)\).
Note
Empirically we found that initializing fits at \((0, 0)\) resulted in robust fits for a large number of \((r,v)\) tuples, even when \(r > 100\). We hypothesize that this is because it is difficult for the fit to move from one region to another; by initializing where the Jacobian is small, fits are able to find the desired values before getting stuck in the wrong region.
That said, one case where \((0, 0)\) is not a good initial vector for the root solver is when \(\boldsymbol{r \approx 1}\). This can be resolved by choosing a better initial value along the \((α_0, α_0)\) diagonal, as described above. In practice we found no detriment to always using the 1-d problem to select the initial vector, so we use that approach in all cases.
Fig. 2 The objective function has a saddle point around (0,0). After rewriting Eqs. (11) in terms of \(\ln α\) and \(\ln β\), we compute the Jacobian \(J\). Plotted is the real part of the eigenvalue \(λ\) of \(J\) for which \(\lvert\mathop{\mathrm{Re}}(λ)\rvert\) is largest; this gives an indication of how quickly the fit moves away from a given point. In most cases, a root-finding algorithm initialized at (0,0) will find a solution.#
Special cases for extreme values#
For extreme values of \(r\) or \(v\), the beta distribution becomes degenerate and numerical optimization may break. We identified four cases requiring special treatment.
\(\boldsymbol{r \to 0}\)
This corresponds to stating that \(Δ q_{2^{-n}}(Φ)\) is infinitely smaller than \(Δ q_{2^{-n}}(Φ+2^{-n})\). Thus we set \(x_1 = 0\), which is equivalent to setting

\[p(x_1) = δ(x_1)\,.\]
\(\boldsymbol{r \to \infty}\)
The converse of the previous case: \(Δ q_{2^{-n}}(Φ)\) is infinitely larger than \(Δ q_{2^{-n}}(Φ+2^{-n})\). We set \(x_1 = 1\).
\(\boldsymbol{v \to 0}\)
As \(v\) vanishes, the distribution for \(x_1\) approaches a Dirac delta centered on \(\tfrac{r}{r+1}\). In our implementation, we replace \(x_1\) by a constant when \(v < 10^{-8}\).
\(\boldsymbol{v \to \infty}\)
Having \(v\) go to infinity requires that \(α\) and/or \(β\) go to \(0\) (see Eq. (11) and Fig. 1). The probability density of \(x_1\) is then a Dirac delta: placed at \(x_1=0\) if \(α \to 0\), or placed at \(x_1 = 1\) if \(β \to 0\). If both \(α\) and \(β\) go to \(0\), the PDF must be the sum of two weighted deltas:

\[p(x_1) = w_1\,δ(x_1) + w_2\,δ(x_1 - 1)\,.\]
The weights \(w_i\) can be determined by requiring that

\[\frac{\EE[x_1]}{\EE[x_2]} = r\,,\]
which yields

\[w_1 = \frac{1}{r+1}\,, \qquad w_2 = \frac{r}{r+1}\,.\]
(For this special case, we revert to writing the condition in terms of a standard (Lebesgue) expectation, since the center (Eq. (6)) is undefined when \(α, β \to 0\).)
Since we have already considered the special cases \(r \to 0\) and \(r \to \infty\), we can assume \(0 < r < \infty\). Then both \(α\) and \(β\) go to zero, and \(x_1\) should be a Bernoulli random variable with success probability \(p = w_2 = \frac{r}{r+1}\).
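A quick numerical sanity check of this limit (illustrative only; it verifies that Bernoulli draws with \(p = \frac{r}{r+1}\) reproduce the required ratio of expectations):

import numpy as np

rng = np.random.default_rng(0)
r = 5.0
x1 = rng.binomial(1, r/(r+1), size=100_000)   # Bernoulli draws for the v → ∞ limit
x2 = 1 - x1
print(x1.mean() / x2.mean())                  # ≈ r = 5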
def get_beta_rv(r: Real, v: Real):
    """
    Return a frozen scipy random variable (a beta, or a Bernoulli in the
    degenerate cases) corresponding to `r` and `v`.
    This function is not exported: it is used only for illustrative purposes
    within the notebook.
    """
    # Special cases for extreme values of r
    if r < 1e-12:
        return scipy.stats.bernoulli(0)  # Dirac delta at 0
    elif r > 1e12:
        return scipy.stats.bernoulli(1)  # Dirac delta at 1
    # Special cases for extreme values of v
    elif v < 1e-8:
        return get_beta_rv(r, 1e-8)  # Approximate the near-deterministic case with a tight beta
    elif v > 1e4:
        # (The actual draw function replaces the beta by a Bernoulli in this case)
        return scipy.stats.bernoulli(r/(r+1))
    # (A previous workaround rounded pathological values like
    #  r=2.2715995006941436, v=6.278153793994013e-08 to binary-friendly ones;
    #  the 1-d initialization below made this unnecessary.)
    # Improve initialization by first solving the 1-d problem α=β (equiv. to r=1)
    x0 = scipy.optimize.brentq(f_mid, -5, 20, args=(v,))
    x0 = (x0, x0)
    res = scipy.optimize.root(f, x0, args=[math.log(r), v])
    if not res.success:
        logger.error("Failed to determine α & β parameters for beta distribution. "
                     f"Conditions were:\n {r=}\n {v=}")
    α, β = np.exp(res.x)
    return scipy.stats.beta(α, β)
def scipy_mvroot_solver(fun, x0, args, method, root=scipy.optimize.root):
res = root(fun, x0, args, method)
return res.x, res.success
Examples of different fitted beta distributions#
Plotted below are the beta distributions for different values of \(r\) and \(v\).
%%opts Curve [title="Fitted beta distributions", ylim=(None,7)]
%%opts Table [title="Empirical statistics (4000 samples)"]
%%opts Layout [sublabel_format=""]

curves = {}
stats = {}
xarr = np.linspace(0, 1, 400)
for (r, v), c in zip(
    [(0.2, 1e-32), (0.2, 1e-16), (0.2, 1e-8), (0.2, 1e-4), (0.2, 1e-2), (0.2, 0.5),
     (0.5, 0.5), (0.5, 0.1), (0.5, 1),
     (1, 0.5), (1, 1e1), (1, 30), (1, 50), (1, 70), (1, 1e2), (1, 1e3), (1, 1e5), (1, 1e6),
     (5, 0.5), (5, 8), (5, 1e4), (5, 2e4), (5, 4e4), (5, 1e6), (5, 1e8), (5, 1e16), (5, 1e32),
     (6.24122778821756, 414.7130462762959),
     (2.2715995006941436, 6.278153793994013e-08),
     (2.271457193328191, 6.075242708902806e-08),
     (2.269182419251242, 6.794061846449025e-08),
     (2.691033486949275e-17, 0.02930210834055045)
    ],
    itertools.cycle(config.viz.colors.bright.cycle)):
    rv = get_beta_rv(r, v)
    if isinstance(rv.dist, scipy.stats.rv_discrete):
        # Dirac delta distribution
        p, = rv.args
        if p == 0:
            α, β = 0, np.inf   # delta at x1 = 0
        elif p == 1:
            α, β = np.inf, 0   # delta at x1 = 1
        else:
            α, β = 0, 0
    else:
        # rv is a beta random variable
        α, β = rv.args
    x = draw_from_beta(r, v, n_samples=4000)
    pdf = rv.pmf(xarr) if isinstance(rv.dist, scipy.stats.rv_discrete) else rv.pdf(xarr)
    curves[(r,v)] = hv.Curve(zip(xarr, pdf), label=f"{r=}, {v=}",
                             kdims=[hv.Dimension("x1", label="$x_1$")],
                             vdims=[hv.Dimension("px1", label="p($x_1$)")]  # non-TeX brackets to avoid legend issue
                            ).opts(color=c)
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", "invalid value encountered", category=RuntimeWarning)
        stats[(r,v)] = tuple(f"{y:.3f}" for y in
                             (α, β, x.mean(), x.std(),
                              np.exp(digamma(α)) / (np.exp(digamma(α)) + np.exp(digamma(β))),
                              0.5 * (polygamma(1, α) + polygamma(1, β))
                             ))

hmap = hv.HoloMap(curves, kdims=["r", "v"])
dists = hmap.overlay()
dists.opts(legend_position="right")
dists.opts(width=500, backend="bokeh")
dists.opts(fig_inches=7, aspect=2.5, legend_cols=2, backend="matplotlib")
Statistics for the fitted beta distributions. \(\mathbb{E}[x_1]\) and \(\mathrm{std}[x_1]\) are computed from 4000 samples. \(\mathbb{E}_a[x_1]\) and \(\mathrm{Mvar}[x_1,x_2]\) are computed using the expressions above.
The table below reports timing benchmarks for the different implementations (columns) under various \(r\), \(v\) conditions (rows).

| L | r | v | No JIT, NumPy | No JIT, JAX (imperative) | JIT-ed *gamma, NumPy | JIT-ed *gamma, JAX (imperative) | JIT-ed bisec, JAX (functional) |
|---|---|---|---|---|---|---|---|
| 1 | 0.2 | 1e-32 | 1.14 μs ± 16.2 ns (±2%) | 508 ns ± 0.614 ns (±2%) | 1.1 μs ± 0.677 ns (±4%) | 501 ns ± 0.388 ns (±1%) | 1.57 ms ± 59.6 µs |
| 1 | 0.2 | 1e-16 | 1.11 μs ± 2.95 ns (±2%) | 507 ns ± 0.56 ns (±0%) | 1.1 μs ± 5.78 ns (±1%) | 501 ns ± 0.651 ns (±0%) | 1.55 ms ± 13.9 µs |
| 1 | 0.2 | 1e-08 | 550 μs ± 6.47 μs (±1%) | 2.52 ms ± 12.4 μs (±1%) | 560 μs ± 782 ns (±3%) | 2.07 ms ± 16.8 μs (±5%) | 62.3 ms ± 489 µs |
| 1 | 0.2 | 0.0001 | 717 μs ± 12 μs (±1%) | 2.84 ms ± 7.44 μs (±3%) | 698 μs ± 2.97 μs (±1%) | 2.22 ms ± 17.4 μs (±2%) | 75.6 ms ± 556 µs |
| 1 | 0.2 | 0.01 | 737 μs ± 8.1 μs (±1%) | 2.91 ms ± 8.94 μs (±1%) | 735 μs ± 5.79 μs (±3%) | 2.23 ms ± 12.4 μs (±3%) | 76.5 ms ± 1.42 ms |
| 1 | 0.2 | 0.5 | 804 μs ± 6.82 μs (±1%) | 3.02 ms ± 16.4 μs (±1%) | 805 μs ± 16.5 μs (±2%) | 2.31 ms ± 19.4 μs (±1%) | 83.3 ms ± 1.88 ms |
| 1 | 0.5 | 0.5 | 757 μs ± 8.25 μs (±1%) | 2.88 ms ± 10.5 μs (±7%) | 764 μs ± 857 ns (±1%) | 2.26 ms ± 12 μs (±1%) | 76.1 ms ± 886 µs |
| 1 | 0.5 | 0.1 | 735 μs ± 10.2 μs (±2%) | 2.86 ms ± 5.1 μs (±2%) | 754 μs ± 3.06 μs (±2%) | 2.28 ms ± 42 μs (±3%) | 75.7 ms ± 496 µs |
| 1 | 0.5 | 1.0 | 736 μs ± 8.01 μs (±1%) | 2.86 ms ± 11.1 μs (±3%) | 727 μs ± 3.3 μs (±4%) | 2.26 ms ± 21.9 μs (±0%) | 75.6 ms ± 406 µs |
| 1 | 1.0 | 0.5 | 512 μs ± 6.09 μs (±2%) | 2.36 ms ± 9 μs (±2%) | 559 μs ± 2 μs (±2%) | 2.06 ms ± 13 μs (±2%) | 49.2 ms ± 517 µs |
| 1 | 1.0 | 10.0 | 516 μs ± 7.56 μs (±2%) | 2.36 ms ± 7.87 μs (±2%) | 557 μs ± 2.22 μs (±3%) | 2.05 ms ± 19.1 μs (±2%) | 49.1 ms ± 301 µs |
| 1 | 1.0 | 100.0 | 479 μs ± 7.3 μs (±1%) | 2.36 ms ± 15.4 μs (±1%) | 498 μs ± 1.66 μs (±2%) | 1.99 ms ± 17.1 μs (±1%) | 48.9 ms ± 252 µs |
| 1 | 1.0 | 1000.0 | 461 μs ± 10.3 μs (±2%) | 2.31 ms ± 2.92 μs (±2%) | 483 μs ± 3.23 μs (±1%) | 1.98 ms ± 8.52 μs (±1%) | 49 ms ± 312 µs |
| 1 | 1.0 | 100000.0 | 2.11 μs ± 62.1 ns (±8%) | 810 μs ± 1.51 μs (±1%) | 2.07 μs ± 33.3 ns (±4%) | 810 μs ± 2.65 μs (±2%) | 1.29 ms ± 19.6 µs |
| 1 | 1.0 | 1000000.0 | 2.07 μs ± 18.5 ns (±4%) | 811 μs ± 2.22 μs (±1%) | 2.05 μs ± 15.6 ns (±3%) | 809 μs ± 1.35 μs (±0%) | 1.28 ms ± 5.65 µs |
| 1 | 5.0 | 0.5 | 805 μs ± 10 μs (±1%) | 3.02 ms ± 4.95 μs (±2%) | 790 μs ± 628 ns (±5%) | 2.27 ms ± 8.23 μs (±2%) | 85.4 ms ± 2.6 ms |
| 1 | 5.0 | 8.0 | 853 μs ± 11.5 μs (±0%) | 3.16 ms ± 34.4 μs (±1%) | 829 μs ± 3.2 μs (±2%) | 2.31 ms ± 17.3 μs (±1%) | 90.6 ms ± 1.18 ms |
| 1 | 5.0 | 10000.0 | 587 μs ± 9.9 μs (±3%) | 2.67 ms ± 28.2 μs (±0%) | 573 μs ± 990 ns (±1%) | 2.07 ms ± 18.3 μs (±1%) | 62 ms ± 490 µs |
| 1 | 5.0 | 20000.0 | 2.06 μs ± 19.8 ns (±2%) | 813 μs ± 5.46 μs (±2%) | 2.05 μs ± 16.6 ns (±1%) | 808 μs ± 1.38 μs (±2%) | 1.28 ms ± 6 µs |
| 1 | 5.0 | 40000.0 | 2.16 μs ± 10.2 ns (±3%) | 819 μs ± 7.11 μs (±1%) | 2.07 μs ± 9.88 ns (±3%) | 806 μs ± 1.95 μs (±0%) | 1.28 ms ± 11.1 µs |
| 1 | 5.0 | 1000000.0 | 2.06 μs ± 13.3 ns (±3%) | 834 μs ± 1.26 μs (±2%) | 2.06 μs ± 15.5 ns (±2%) | 814 μs ± 2.21 μs (±1%) | 1.28 ms ± 13.6 µs |
| 1 | 5.0 | 100000000.0 | 2.08 μs ± 13 ns (±3%) | 835 μs ± 1.88 μs (±2%) | 2.05 μs ± 12 ns (±2%) | 813 μs ± 1.25 μs (±1%) | 1.27 ms ± 10.6 µs |
| 1 | 5.0 | 1e+16 | 2.06 μs ± 24 ns (±3%) | 827 μs ± 2.65 μs (±5%) | 2.05 μs ± 15.4 ns (±4%) | 812 μs ± 2.74 μs (±0%) | 1.27 ms ± 5.78 µs |
| 1 | 5.0 | 1e+32 | 2.11 μs ± 38.9 ns (±2%) | 822 μs ± 9.73 μs (±1%) | 2.06 μs ± 13.8 ns (±6%) | 807 μs ± 2.03 μs (±1%) | 1.28 ms ± 33.4 µs |
Implementation#
Generate a single path#
Now that we know how to construct a sampling distribution for the increments, sampling an entire path is just a matter of applying the process recursively until we reach the desired resolution.
def generate_path_hierarchical_beta(
qstar: Callable, deltaEMD: Callable, c: float,
qstart: float, qend: float, res: int=8, rng=None,
*, Phistart: float=0., Phiend: float=1.
) -> Tuple[Array[float,1], Array[float,1]]:
"""
Returned path has length ``2**res + 1``.
`qstar` and `deltaEMD` are evaluated at the abscissae of the returned array
`Φhat`, so any callables defined on [0, 1] are acceptable.
Parameters
----------
qstar: Empirical (mixed) quantile function. The generated path will
be centered on this trace. Although any callable mapping [0, 1] to R
is accepted, in all practical use cases this will be the output of
:func:`emd.make_empirical_risk_ppf`.
deltaEMD: Callable returning discrepancy values (i.e. {math}`δ^\mathrm{EMD}`).
Like `qstar`, this must map [0, 1] to R+.
c: Proportionality constant relating δEMD to
the square root of the metric variance.
qstart, qend: The end point ordinates. The hierarchical beta process "fills in"
the space between a start and an end point, but it needs the value of _q_
to be given at those points. These end points are by default Φ=0 and Φ=1,
which correspond to the bounds of a quantile function.
res: Returned paths have length ``2**res+1``, and therefore ``2**res`` increments.
Typical values of `res` are 6, 7 and 8, corresponding to paths of length
64, 128 and 256. Smaller may be useful to accelerate debugging. Larger values
increase the computation cost with typically negligible improvements in accuracy.
Must be at least 1.
rng: Any argument accepted by `numpy.random.default_rng` to initialize an RNG.
Phistart, Phiend: (Keyword only) The abscissa corresponding to the ordinates
`qstart` and `qend`. The default values are 0 and 1, appropriate for generating
a quantile function.
Returns
-------
The pair Φhat, qhat.
Φhat: Array of equally spaced values between 0 and 1, with step size ``2**-res``.
qhat: The generated path, evaluated at the values listed in `Φhat`.
"""
# Validation
res = int(res) # sourcery skip: remove-unnecessary-cast
if Phistart >= Phiend:
raise ValueError("`Phistart` must be strictly smaller than `Phiend`. "
f"Received:\n {Phistart=}\n {Phiend=}")
if res < 1:
raise ValueError("`res` must be greater or equal to 1.")
rng = np.random.default_rng(rng) # No-op if `rng` is already a Generator
# Interpolation
N = 2**res + 1
Φarr = np.linspace(Phistart, Phiend, N)
qsarr = qstar(Φarr)
# Pre-computations
Mvar = c * deltaEMD(Φarr)**2
# Algorithm
qhat = np.empty(N)
qhat[0] = qstart
qhat[-1] = qend
for n in range(1, res+1):
Δi = 2**(res-n)
i = np.arange(Δi, N, 2*Δi) # Indices for the new values to insert
d = qhat[i+Δi] - qhat[i-Δi] # Each pair of increments must sum to `d`
r = (qsarr[i] - qsarr[i-Δi]) / (qsarr[i+Δi]-qsarr[i]) # Ratio of first/second increments
v = 2*Mvar[i]
# Prevent failure in pathological cases where `q` is flat in some places
if ((qsarr[i+Δi] - qsarr[i-Δi]) == 0).any(): # If both increments are zero, then the calculation for `r` gives 0/0 => NaN
logger.warning("The quantile function is not strictly increasing. Non-increasing intervals were skipped. The sampling of EMD paths may be less accurate.")
keep = ((qsarr[i+Δi] - qsarr[i-Δi]) > 0) # `draw_from_beta` can deal with `r=0` and `r=inf`, but not `r=NaN`.
x1 = np.zeros_like(r)
x1[keep] = draw_from_beta(r[keep], v[keep], rng=rng)
else:
x1 = draw_from_beta(r, v, rng=rng)
qhat[i] = qhat[i-Δi] + d * x1
return Φarr, qhat
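A minimal usage sketch (the linear `qstar` and constant `deltaEMD` below are placeholder toy inputs, and the module's `draw_from_beta` must be in scope):

import numpy as np

qstar    = lambda Φ: 2*Φ - 1                   # toy increasing quantile function on [0, 1]
deltaEMD = lambda Φ: 0.1 * np.ones_like(Φ)     # constant discrepancy, for illustration

Φhat, qhat = generate_path_hierarchical_beta(
    qstar, deltaEMD, c=1.0, qstart=-1.0, qend=1.0, res=6, rng=42)
assert len(qhat) == 2**6 + 1
assert np.all(np.diff(qhat) >= 0)              # the generated path is monotone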
Generate ensemble of sample paths#
Note
generate_quantile_paths and generate_path_hierarchical_beta are the only two public functions exposed by this module.

To generate \(M\) paths, we repeat the following \(M\) times:

1. Select the end points by sampling \(\lnLh(0) \sim \nN\bigl(\lnLtt(0),\, c\,\emdstd(0)^2\bigr)\) and \(\lnLh(1) \sim \nN\bigl(\lnLtt(1),\, c\,\emdstd(1)^2\bigr)\), redrawing as needed so that \(\lnLh(0) < \lnLh(1)\).
2. Call generate_path_hierarchical_beta.
def generate_quantile_paths(qstar: Callable, deltaEMD: Callable, c: float,
M: int, res: int=8, rng=None,
*, Phistart: float=0., Phiend: float=1,
progbar: Union[Literal["auto"],None,tqdm,"mp.queues.Queue"]="auto",
previous_M: int=0
) -> Generator[Tuple[Array[float,1], Array[float,1]], None, None]:
r"""
Generate `M` distinct quantile paths, with trajectory and variability determined
by `qstar` and `deltaEMD`.
Paths are generated using the hierarchical beta algorithm, with normal distributions
for the end points and beta distributions for the increments.
Returned paths have length ``2**res + 1``.
Typical values of `res` are 6, 7 and 8, corresponding to paths of length
64, 128 and 256. Smaller values may be useful to accelerate debugging. Larger values
increase the computation cost with (typically) negligible improvements in accuracy.
.. Note:: When using multiprocessing to call this function multiple times,
use either a `multiprocessing.Queue` or `None` for the `progbar` argument.
Parameters
----------
qstar: Empirical (mixed) quantile function. The generated paths will
be centered on this trace. Although any callable mapping [0, 1] to R
is accepted, in all practical use cases this will be the output of
:func:`emd.make_empirical_risk_ppf`.
deltaEMD: Callable returning scaled discrepancy values
(i.e. {math}`δ^\mathrm{EMD}`). Like `qstar`, this must map [0, 1] to R+.
c: Proportionality constant relating δEMD to
the square root of the metric variance.
M: Number of paths to generate.
res: Returned paths have length ``2**res + 1``.
Typical values of `res` are 6, 7 and 8, corresponding to paths of length
64, 128 and 256. Smaller values may be useful to accelerate debugging, but larger
values are unlikely to be useful.
rng: Any argument accepted by `numpy.random.default_rng` to initialize an RNG.
progbar: Control whether to create a progress bar or use an existing one.
- With the default value 'auto', a new tqdm progress is created.
This is convenient, however it can lead to many bars being created &
destroyed if this function is called within a loop.
- To prevent this, a tqdm progress bar can be created externally (e.g. with
``tqdm(desc="Generating paths")``) and passed as argument.
Its counter will be reset to zero, and its total set to `M` + `previous_M`.
- (Multiprocessing): To support updating progress bars within child processes,
a `multiprocessing.Queue` object can be passed, in which case no
progress bar is created or updated. Instead, each time a quantile path
is sampled, a value is added to the queue with ``put``. This way, the
parent process can update a progress bar by consuming the queue; e.g.
``while not q.empty(): progbar.update()``.
The value added to the queue is `M`+`previous_M`, which can be
used to update the total value of the progress bar.
- A value of `None` prevents displaying any progress bar.
previous_M: Used only to improve the display of the progress bar:
the indicated total on the progress bar will be `M` + `previous_M`.
Use this when adding paths to a preexisting ensemble.
Yields
------
Tuples of two 1-d arrays: (Φhat, qhat).
Notes
-----
Returned paths always have an odd number of points (an even number of increments),
which as a side benefit makes them suitable for integration with Simpson's rule.
"""
rng = np.random.default_rng(rng)
total = M + previous_M
progbar_is_queue = ("multiprocessing.queues.Queue" in str(type(progbar).mro())) # Not using `isinstance` avoids having to import multiprocessing & multiprocessing.queues
if isinstance(progbar, str) and progbar == "auto":
progbar = tqdm(desc="Sampling quantile paths", leave=False,
total=total)
elif progbar is not None and not progbar_is_queue:
progbar.reset(total)
if previous_M:
# Dynamic miniters don’t work well with a restarted prog bar: use whatever miniter was determined on the first run (or 1)
progbar.miniters = max(progbar.miniters, 1)
progbar.dynamic_miniters = False
progbar.n = previous_M
progbar.refresh()
for r in range(M): # sourcery skip: for-index-underscore
for _ in range(100): # In practice, this should almost always work on the first try; 100 failures would mean a really pathological probability
qstart = rng.normal(qstar(Phistart) , math.sqrt(c)*deltaEMD(Phistart))
qend = rng.normal(qstar(Phiend), math.sqrt(c)*deltaEMD(Phiend))
if qstart < qend:
break
else:
raise RuntimeError("Unable to generate start and end points such that "
"start < end. Are you sure `qstar` is compatible "
"with monotone paths ?")
Φhat, qhat = generate_path_hierarchical_beta(
qstar, deltaEMD, c, qstart=qstart, qend=qend,
res=res, rng=rng, Phistart=Phistart, Phiend=Phiend)
yield Φhat, qhat
if progbar_is_queue:
progbar.put(total)
elif progbar is not None:
progbar.update()
time.sleep(0.05) # Without a small wait, the progbar might not update
Usage example#
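The original example output is not preserved here; below is a minimal sketch of how the public functions might be called. The placeholder data, the constant `c=1.0` and the constant `deltaEMD` are assumptions for illustration; in practice `qstar` would come from `emd.make_empirical_risk_ppf`.

import numpy as np

rng = np.random.default_rng(123)
samples = np.sort(rng.lognormal(mean=0.0, sigma=1.0, size=400))   # placeholder data

# Simple empirical PPF and a constant discrepancy function, for illustration
qstar    = lambda Φ: np.quantile(samples, Φ)
deltaEMD = lambda Φ: 0.05 * np.ones_like(np.asarray(Φ, dtype=float))

for Φhat, qhat in generate_quantile_paths(qstar, deltaEMD, c=1.0,
                                          M=10, res=6, rng=rng, progbar=None):
    assert np.all(np.diff(qhat) >= 0)   # each path is a monotone quantile function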