Uniform draws of partitions and cycle-partitions, with thinning.
A Haskell library for efficient uniform random sampling of cycle partition graphs on sets of vertices, and partitions of lists or vectors. Selection can be subject to conditions.
Summary
A Haskell library for efficient uniform random sampling of cycle partition graphs on sets of vertices, and partitions of lists or vectors. Selection can be subject to conditions.
Cycle partitions
A cycle partition graph on a set of vertices V is a directed graph G = (E, V) such that for each i in V there exists a unique j in V such that (i, j) in E. In other words, it is a partition of V into a graph with disjoint cycle graphs.
Define C(V) to be the set of cycle partitions graphs of V. uniformCyclePartition
samples from the uniform distribution on C(V), in O(|V|) time.
To do so, it relies on the fact that
σ -> (i, σ(i)) , for i = 1..|V|
defines a bijective map between the permutations σ on |V| distinct elements and the edge sets of C(V). Therefore, sampling a uniform cycle partition without conditions is as simple as sampling a uniform permutation.
Note self-loops are allowed in the possible configurations.
uniformCyclePartitionThin
samples uniformly from the set of cycle partition graphs satisfying a predicate, in O(n/p) time on average, where p is the proportion of cycle partition graphs satisfying the predicate. It works by rejection sampling, so the user is asked to provide a maximum number of sampling attempts to guarantee termination.
List or vector partitions
This package provides functions to draw uniform samples from all 2^(n-1) possible partitions of an ordered list (or vector). uniformPartition
selects a single element uniformly across all possible partitions in O(n) time, and uniformPartitionThin
samples uniformly conditional on a predicate in O(n/p) time on average, where p
is the proportion of elements for which the predicate is True
.
Only the partitioning is randomized: Input list order is preserved.
The samplers randomize the placement of each breakpoint in the partition. In other words the sample space is viewed as a perfect binary tree, and random selection is a random walk from root to leaf. The implementation samples a bit array to determine the walk path instead of creating an actual tree structure, for efficiency.
At the moment, the uniformPartitionThin
is implemented only for lists.
Short-circuiting
The predicate provided to uniformPartitionThin
checks each partition element, a chunk of elements in the original list, in turn. Since partitions are built lazily, the sampler will short-circuit and start sampling a new partition after the first partition element for which the predicate is False
. This is just a consequence of the short-circuiting in base
package function all
.
Similarly, if the predicate itself is short-circuiting, the sampler will short-circuit.
Contributing
Tickets
Send by email, without need for an account, to ~brendanrbrown/[email protected]
Man pages for tickets on SourceHut, particularly the "Email access" section.
Patches
Man pages for sending patches upstream.