Sum product representation for configurations

SumProductTree can use polynomial memory to store exponential number of configurations. It is a sum-product expression tree to store ConfigEnumerator in a lazy style, where configurations can be extracted by depth first searching the tree with the Base.collect method. Although it is space efficient, it is in general not easy to extract information from it due to the exponential large configuration space. Directed sampling is one of its most important operations, with which one can get some statistic properties from it with an intermediate effort. For example, if we want to check some property of an intermediate scale graph, one can type

julia> using GenericTensorNetworks
julia> graph = random_regular_graph(70, 3){70, 105} undirected simple Int64 graph
julia> problem = GenericTensorNetwork(IndependentSet(graph); optimizer=TreeSA());
julia> tree = solve(problem, ConfigsAll(; tree_storage=true))[]+ (count = 1.7042933526212e13) ├─ + (count = 1.2943469325281e13) │ ├─ + (count = 8.849769392095e12) │ │ ├─ * (count = 9.47476020478e11) │ │ │ ├─ * (count = 1.0) │ │ │ │ ├─ OnehotVec{70, 2}(56, 1) │ │ │ │ └─ OnehotVec{70, 2}(2, 1) │ │ │ └─ + (count = 9.47476020478e11) │ │ │ ├─ * (count = 1.30951955306e11) │ │ │ │ ⋮ │ │ │ │ │ │ │ └─ + (count = 8.16524065172e11) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 7.902293371617e12) │ │ ├─ * (count = 1.879319842489e12) │ │ │ ├─ OnehotVec{70, 2}(2, 1) │ │ │ └─ + (count = 1.879319842489e12) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 6.022973529128e12) │ │ ├─ * (count = 1.929273595942e12) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 4.093699933186e12) │ │ ⋮ │ │ │ └─ * (count = 4.093699933186e12) │ ├─ * (count = 4.093699933186e12) │ │ ├─ * (count = 1.0) │ │ │ ├─ OnehotVec{70, 2}(47, 1) │ │ │ └─ * (count = 1.0) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 4.093699933186e12) │ │ ├─ * (count = 8.0585461904e10) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 4.013114471282e12) │ │ ⋮ │ │ │ └─ OnehotVec{70, 2}(47, 1) └─ * (count = 4.099464200931e12) ├─ + (count = 4.099464200931e12) │ ├─ * (count = 4.32196039729e11) │ │ ├─ * (count = 1.0) │ │ │ ├─ OnehotVec{70, 2}(56, 1) │ │ │ └─ OnehotVec{70, 2}(2, 1) │ │ └─ + (count = 4.32196039729e11) │ │ ├─ * (count = 5.84493145e10) │ │ │ ⋮ │ │ │ │ │ └─ + (count = 3.73746725229e11) │ │ ⋮ │ │ │ └─ + (count = 3.667268161202e12) │ ├─ * (count = 8.59078393352e11) │ │ ├─ OnehotVec{70, 2}(2, 1) │ │ └─ + (count = 8.59078393352e11) │ │ ⋮ │ │ │ └─ + (count = 2.80818976785e12) │ ├─ * (count = 8.98174172841e11) │ │ ⋮ │ │ │ └─ + (count = 1.910015595009e12) │ ⋮ │ └─ OnehotVec{70, 2}(66, 1)

If one wants to store these configurations, he will need a hard disk of size 256 TB! However, this sum-product binary tree structure supports efficient and unbiased direct sampling.

julia> samples = generate_samples(tree, 1000)1000-element Vector{StaticBitVector{70, 2}}:
 1100001010100010000001101000011100000000010000000000100100001010100110
 1100000010010010001001111000010011011001010100000010110100000100100100
 0100100100101010000001111001000010100001011000001100001100000000100100
 0100001010110010000001001001000000000100010010000010010100001100101100
 0100000010001000011000101000010010010000010100000000001100110000100010
 0100000010110100000000011000010000000101010100000000110100011001100010
 0110001000010000000000101100010100001100010110000000000100100000101100
 0100000001010100010001001110001000010000010000000100001100000000100000
 0100100000001110100000000000000101000010010000000001000101001000000010
 1100001100101010010000100000010010010010010100001000100100010100100000
 ⋮
 0001001110000100100011000000100000100000001000000001001001100010110000
 0001000110101000000011000000100000000001000011000000000000001100010101
 0000000100101000000010000010100000010001001001001001001001010000010000
 0000000000110000100010010001100010100001000001000000000001001001010100
 0000001010110001110010000101100000001000001000000000010001000110011100
 0011000010000000000000000011001010010010000011000101000001000000110000
 0000000000010001100010010001000001011000001000010110001000000000011000
 0001000100001000010001010000001010000010000000000001000000100010010001
 0001010000010100100001000000000001010001001001000011001001000000010001

With these samples, one can already compute useful properties like Hamming distance (see hamming_distribution) distribution. The following code visualizes this distribution with CairoMakie.

using CairoMakie
dist = hamming_distribution(samples, samples)
# bar plot
fig = Figure()
ax = Axis(fig[1, 1]; xlabel="Hamming distance", ylabel="Frequency")
barplot!(ax, 0:length(dist)-1, dist)
fig
Example block output