Sum product representation for configurations

SumProductTree can use polynomial memory to store exponential number of configurations. It is a sum-product expression tree to store ConfigEnumerator in a lazy style, where configurations can be extracted by depth first searching the tree with the Base.collect method. Although it is space efficient, it is in general not easy to extract information from it due to the exponential large configuration space. Directed sampling is one of its most important operations, with which one can get some statistic properties from it with an intermediate effort. For example, if we want to check some property of an intermediate scale graph, one can type

julia> using GenericTensorNetworks
julia> graph = random_regular_graph(70, 3){70, 105} undirected simple Int64 graph
julia> problem = GenericTensorNetwork(IndependentSet(graph); optimizer=TreeSA());
julia> tree = solve(problem, ConfigsAll(; tree_storage=true))[]+ (count = 1.6596660707628e13) ├─ + (count = 1.2857846065087e13) │ ├─ + (count = 8.768592637867e12) │ │ ├─ + (count = 8.039148342641e12) │ │ │ ├─ + (count = 6.251068167932e12) │ │ │ │ ├─ * (count = 5.024507050782e12) │ │ │ │ │ ⋮ │ │ │ │ │ │ │ │ │ └─ * (count = 1.22656111715e12) │ │ │ │ ⋮ │ │ │ │ │ │ │ └─ * (count = 1.788080174709e12) │ │ │ ├─ * (count = 1.0) │ │ │ │ ⋮ │ │ │ │ │ │ │ └─ * (count = 1.788080174709e12) │ │ │ ⋮ │ │ │ │ │ └─ * (count = 7.29444295226e11) │ │ ├─ * (count = 1.0) │ │ │ ├─ * (count = 1.0) │ │ │ │ ⋮ │ │ │ │ │ │ │ └─ OnehotVec{70, 2}(42, 1) │ │ └─ * (count = 7.29444295226e11) │ │ ├─ + (count = 7.29444295226e11) │ │ │ ⋮ │ │ │ │ │ └─ * (count = 1.0) │ │ ⋮ │ │ │ └─ * (count = 4.08925342722e12) │ ├─ OnehotVec{70, 2}(36, 1) │ └─ + (count = 4.08925342722e12) │ ├─ + (count = 3.753620872253e12) │ │ ├─ + (count = 2.917468454497e12) │ │ │ ⋮ │ │ │ │ │ └─ * (count = 8.36152417756e11) │ │ ⋮ │ │ │ └─ * (count = 3.35632554967e11) │ ├─ * (count = 1.0) │ │ ⋮ │ │ │ └─ * (count = 3.35632554967e11) │ ⋮ │ └─ * (count = 3.738814642541e12) ├─ OnehotVec{70, 2}(62, 1) └─ + (count = 3.738814642541e12) ├─ * (count = 2.512253525391e12) │ ├─ * (count = 1.0) │ │ ├─ * (count = 1.0) │ │ │ ⋮ │ │ │ │ │ └─ OnehotVec{70, 2}(62, 1) │ └─ + (count = 2.512253525391e12) │ ├─ + (count = 1.788080174709e12) │ │ ⋮ │ │ │ └─ * (count = 7.24173350682e11) │ ⋮ │ └─ * (count = 1.22656111715e12) ├─ * (count = 1.0) │ ├─ * (count = 1.0) │ │ ⋮ │ │ │ └─ OnehotVec{70, 2}(62, 1) └─ + (count = 1.22656111715e12) ├─ + (count = 7.29444295226e11) │ ⋮ │ └─ * (count = 4.97116821924e11) ⋮

If one wants to store these configurations, he will need a hard disk of size 256 TB! However, this sum-product binary tree structure supports efficient and unbiased direct sampling.

julia> samples = generate_samples(tree, 1000)1000-element Vector{StaticBitVector{70, 2}}:
 0000100000000100000000010010000001000010100101011101001010001000000001
 0100000100000100110011000111010000000101000001000000000000001000100100
 0011000100010000100010101000000000000110000100010000000001010010001100
 1001000010010001000000000000010001001000100010000001010100000000010000
 0000001000000000000000011000000001000010000101110000010010100000010001
 0000110000000100000011000011110000000101001001010000000000000010000100
 0010000000000000100100000100000000100000100010101011101000100000000010
 0000000000010000011000010000001000000000001000110000000101100000000100
 1001000000000000000010000100110000000000001011010001010001001000110000
 0100000100010011000000010010010000100011000000001000101000100000011000
 ⋮
 0000011001010000100001000000000010100001000000100000000000000101100010
 0000110010000110000000100001010111101010100000001100000010100101000010
 1100010100000000100000001000101000000000000000010100000000100101000100
 0000111000000110000000000011001000000100000000010000000110000111001010
 0100010000000011010000100011011000100011000001001100001000000101000100
 1000010000100001000000010100001000000000000000010000000001000101000100
 0010011010000100100010001000101001000000000000001011010001000101000000
 1010110001000000010011000000101000000100000001001010010001000101000000
 0010011000000000000100010000001111100010000000001001000001101101000100

With these samples, one can already compute useful properties like Hamming distance (see hamming_distribution) distribution. The following code visualizes this distribution with CairoMakie.

using CairoMakie
dist = hamming_distribution(samples, samples)
# bar plot
fig = Figure()
ax = Axis(fig[1, 1]; xlabel="Hamming distance", ylabel="Frequency")
barplot!(ax, 0:length(dist)-1, dist)
fig
Example block output