Exponential histograms

Exponential histograms are histograms that, instead of encoding explicit bucket boundaries, encode a scale factor used to define the boundaries with an exponential function. This document uses OpenTelemetry's definition and encoding for exponential histograms but the general concepts should be broadly applicable.

Exponential histograms encode a scale factor $s$, a sample count for the zero bucket $Z$, lists of sample counts for the positive $P$ and negative $N$ buckets, and optionally positive $o_P$ and negative $o_N$ offsets. From these values we can derive a set of functions that define the bucket boundaries.

$$\begin{align} b &= 2^{2^{(-s)}} \\[8pt] f(i, o) &= b^{i+o} \\ f_P(i) &= f(i ,\: o_P) \\ f_N(i) &= -f(i ,\: o_N) \\[8pt] Z &= \left[\: f_N(0) ,\: f_P(0) \:\right] \\ P(i) &= \left(\: f_P(i) ,\: f_P(i+1) \:\right] \\ N(i) &= \left[\: f_N(i+1) ,\: f_N(i) \:\right) \end{align}$$

  1. Derive the base $b$ of the exponential function
  2. Derive the bucket boundary $f$ for index $i$ and offset $o$
  3. Derive the positive bucket boundary $f_P$ for index $i$
  4. Derive the negative bucket boundary $f_N$ for index $i$
  5. Derive the zero bucket $Z$
  6. Derive the positive bucket $P$ for index $i$
  7. Derive the negative bucket $N$ for index $i$
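
To make the definitions concrete, here's a minimal Python sketch of the boundary functions above, restricted to positive buckets; the function name is illustrative and not part of any OpenTelemetry API.

```python
def bucket_boundaries(scale, index, offset=0):
    """Boundaries of the positive bucket P(i) = (f_P(i), f_P(i + 1)]."""
    base = 2 ** (2 ** -scale)            # b = 2^(2^-s)
    lower = base ** (index + offset)     # f(i, o) = b^(i + o)
    upper = base ** (index + offset + 1)
    return lower, upper

print(bucket_boundaries(scale=0, index=0))  # (1.0, 2.0)
print(bucket_boundaries(scale=1, index=0))  # (1.0, 1.4142...)
print(bucket_boundaries(scale=1, index=1))  # (1.4142..., 2.0)
```

The negative buckets $N(i)$ mirror this through $f_N$, and the zero bucket $Z$ covers everything between $f_N(0)$ and $f_P(0)$.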

Perfect subsetting

Here's a table, stolen from this excellent blog post by Daniel Dyla, showing the value of $f$ for different scales assuming $o = 0$; it highlights an essential property of exponential histograms.

| $i$ | $s = -1$ | $s = 0$ | $s = 1$ | $s = 3$ |
|----:|---------:|--------:|--------:|--------:|
| -1  | 0.25     | 0.5     | 0.7071  | 0.9170  |
| 0   | 1        | 1       | 1       | 1       |
| 1   | 4        | 2       | 1.4142  | 1.0905  |
| 2   | 16       | 4       | 2       | 1.1892  |
| 3   | 64       | 8       | 2.8284  | 1.2968  |
| 4   | 256      | 16      | 4       | 1.4142  |
| 5   | 1024     | 32      | 5.6569  | 1.5422  |
| 6   | 4096     | 64      | 8       | 1.6818  |
| 7   | 16384    | 128     | 11.3137 | 1.8340  |
| 8   | 65536    | 256     | 16      | 2       |
| 9   | 262144   | 512     | 22.6274 | 2.1810  |

Every time the scale is incremented by 1, a new boundary is inserted between each pair of adjacent existing ones. To generalise, given two histograms $H_1$ and $H_2$ with scales $s_1 \le s_2$, each bucket from $H_1$ exactly contains $2^{s_2 - s_1}$ buckets from $H_2$.
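
This is easy to check in code. The sketch below (a hypothetical helper, same assumptions as before) maps a bucket index at a finer scale $s_2$ down to the enclosing bucket at a coarser scale $s_1$; Python's arithmetic right shift floors toward negative infinity, which is exactly the mapping required.

```python
def downscale_index(index, s2, s1):
    # Bucket `index` at scale s2 falls inside bucket `index >> (s2 - s1)`
    # at scale s1, because the boundaries at s1 are every
    # 2^(s2 - s1)-th boundary at s2.
    return index >> (s2 - s1)

# Each scale-1 bucket contains 2^(3 - 1) = 4 scale-3 buckets:
print([downscale_index(i, s2=3, s1=1) for i in range(8)])
# [0, 0, 0, 0, 1, 1, 1, 1]
```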

If $s$ is encoded as an integer, any two histograms can be merged, as the buckets from the histogram with the larger scale will always be perfect subsets of the buckets from the other. The relative error of the result will also be at most the relative error of the histogram with the smaller scale.

In cases where the offsets $o_1 \neq 0$ or $o_2 \neq 0$, perfect subsetting still holds but some extra steps are required to balance the offsets: either increment the smaller offset and merge the buckets that fall below its new value into the zero bucket, or, if its zero bucket is empty, decrement the other offset and insert virtual empty buckets in the range previously covered by the zero bucket.
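
As a rough illustration of the whole procedure, here's a Python sketch that merges the positive buckets of two histograms. The representation (`scale`, `offset`, `counts`, `zero` keys) is made up for the example and is not OTLP, and the sparse dict used while rescaling makes the offset balancing implicit (the final offset is simply re-derived); a dense in-place merge would need the explicit balancing steps described above.

```python
def merge_positive(h1, h2):
    """Merge the positive buckets of two exponential histograms.

    `h1` and `h2` are dicts with "scale", "offset", "counts" (a dense
    list whose first element is the bucket at index "offset") and
    "zero" (the zero-bucket count). Negative buckets are symmetrical.
    """
    scale = min(h1["scale"], h2["scale"])

    def rescaled(h):
        # Downscale to the common scale: absolute bucket index i maps
        # to i >> (h.scale - scale), thanks to perfect subsetting.
        shift = h["scale"] - scale
        out = {}
        for i, count in enumerate(h["counts"]):
            j = (i + h["offset"]) >> shift
            out[j] = out.get(j, 0) + count
        return out

    b1, b2 = rescaled(h1), rescaled(h2)
    merged = {i: b1.get(i, 0) + b2.get(i, 0) for i in b1.keys() | b2.keys()}
    offset = min(merged, default=0)
    top = max(merged, default=-1)
    return {
        "scale": scale,
        "offset": offset,
        "counts": [merged.get(i, 0) for i in range(offset, top + 1)],
        "zero": h1["zero"] + h2["zero"],
    }
```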

Instead of describing the process of merging exponential histograms, this document provides a live interactive view of two of them in OTLP format. The third view lists the steps required to merge them and the final result. The implementation itself can also serve as a useful reference.

Advantages over explicit histograms

There's no need to explain why the ability to merge any two histograms without a loss in precision is a good thing. But exponential histograms are also superior to explicit ones for other, less obvious reasons, both during data ingestion and analysis and during instrumentation and data collection.

Quantile estimation for exponential histograms is more accurate. Some level of interpolation is always required when estimating quantiles, and for explicit histograms this has to be done on a best-effort basis. Exponential histograms, on the other hand, encode the exact function to be used for interpolation.
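
For instance, here's a simplified sketch of quantile estimation over positive buckets only (the function name and histogram representation are made up for the example): it walks the cumulative counts to find the bucket containing the target sample, then interpolates inside it with the same exponential function that defines the boundaries.

```python
def estimate_quantile(scale, offset, counts, q):
    base = 2 ** (2 ** -scale)
    target = q * sum(counts)
    seen = 0
    for i, count in enumerate(counts):
        if count > 0 and seen + count >= target:
            frac = (target - seen) / count    # position inside the bucket
            lower = base ** (i + offset)      # f_P(i)
            upper = base ** (i + offset + 1)  # f_P(i + 1)
            # Interpolate exponentially, i.e. linearly in log space.
            return lower * (upper / lower) ** frac
        seen += count
    return base ** (offset + len(counts))     # q = 1: the top boundary

# Median of 8 samples spread over three buckets at scale 0:
print(estimate_quantile(scale=0, offset=0, counts=[4, 2, 2], q=0.5))  # 2.0
```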

At instrumentation time, explicit histograms need their bucket boundaries specified up front. There's no good one-size-fits-all choice, as the range of values produced can vary wildly between applications, and it's not very likely that most users will explicitly specify buckets per application either.

With exponential histograms, the instrumentation library can simply start by picking a high initial scale (which produces smaller buckets). If a recorded sample does not fit within the current range of buckets, it can either add new buckets or dynamically lower the scale (which produces bigger buckets). Outside the most extreme cases, the resulting histogram will provide a good summary of the sampled data.
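
A toy recorder along those lines might look like the sketch below; the class, its parameters, and the default values are hypothetical and not the OpenTelemetry SDK's behaviour. It handles positive samples only, starts at a high scale, and halves the resolution whenever the occupied range of buckets grows past a fixed limit.

```python
import math

class ToyRecorder:
    def __init__(self, scale=20, max_buckets=160):
        self.scale = scale
        self.max_buckets = max_buckets
        self.counts = {}  # sparse map: bucket index -> sample count

    def _index(self, value):
        # Bucket i holds (b^i, b^(i+1)], so i = ceil(2^s * log2(v)) - 1.
        return math.ceil(math.log2(value) * 2 ** self.scale) - 1

    def record(self, value):
        i = self._index(value)
        self.counts[i] = self.counts.get(i, 0) + 1
        # If the occupied range grew too wide, lower the scale: each
        # bucket at scale s - 1 absorbs two adjacent buckets from scale s.
        while max(self.counts) - min(self.counts) + 1 > self.max_buckets:
            merged = {}
            for j, count in self.counts.items():
                merged[j >> 1] = merged.get(j >> 1, 0) + count
            self.counts = merged
            self.scale -= 1
```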


Detailed steps