CPU Acceleration
Bloqade has built-in multithreading support for faster simulation on multi-core CPUs.
Backends
This is accomplished through separate sparse-matrix dense-vector multiplication (SpMV) backends, which you can choose between explicitly to fine-tune performance. These backends target the CSC-format sparse matrices that are generated from Hamiltonian problems for simulation. Bloqade also uses other sparse matrix formats to maximize memory efficiency, but with any backend that performs multithreading, the same parallelized functions are used.
There are three backends to select from:
- BloqadeExpr: the default that Bloqade starts with on first installation. Even if you start Julia with multiple threads, when Bloqade has this backend it will not perform any multithreading.
- ThreadedSparseCSR: converts CSC matrices to CSR format and uses a simple nested for-loop to perform the SpMV multiplication. The code for this was taken from the ThreadedSparseCSR package and migrated into Bloqade to use a more up-to-date version of the Polyester multithreading library, hence the name.
- ParallelMergeCSR: takes the conjugate transpose (adjoint) of CSC matrices and uses QuEra's ParallelMergeCSR SpMV package.
ParallelMergeCSR is currently a Linux-only package and may be unreliable or fail to run on other operating systems. On those systems you can still take advantage of multithreading through the ThreadedSparseCSR backend.
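The CSC-to-CSR conversion and nested for-loop described for the ThreadedSparseCSR backend can be sketched in plain Julia. This is only an illustration of the general technique, not Bloqade's implementation: the function name csr_spmv_sketch is hypothetical, and Base.Threads is used here in place of Polyester. It relies on the fact that the CSC arrays of the transpose of A are exactly the CSR arrays of A.

```julia
using SparseArrays
using LinearAlgebra
using Base.Threads

# Hypothetical helper: row-parallel y = A * x, where `At` is the CSC storage
# of transpose(A). The CSC arrays of transpose(A) double as the CSR arrays
# of A: colptr acts as the row pointer and rowvals holds the column indices.
function csr_spmv_sketch(At::SparseMatrixCSC, x::AbstractVector)
    rowptr = At.colptr
    colidx = rowvals(At)
    vals = nonzeros(At)
    nrows = size(At, 2)
    y = zeros(promote_type(eltype(At), eltype(x)), nrows)
    @threads for i in 1:nrows                      # outer loop: rows, split across threads
        acc = zero(eltype(y))
        for k in rowptr[i]:(rowptr[i + 1] - 1)     # inner loop: non-zeros of row i
            acc += vals[k] * x[colidx[k]]
        end
        y[i] = acc                                 # each thread writes distinct rows
    end
    return y
end

A = sprand(ComplexF64, 200, 200, 0.05)   # random CSC test matrix
x = rand(ComplexF64, 200)
At = sparse(transpose(A))                # materialize the transpose as CSC
y = csr_spmv_sketch(At, x)
```

Because each row is written by exactly one thread, the loop needs no locking. The cost is that rows are split evenly by count, so a few very dense rows can leave some threads with much more work than others.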
ThreadedSparseCSR is ideal for smaller system sizes or if you are performing simulations with the full Hilbert space (in which case the Hamiltonian matrix has a rather even distribution of non-zero elements per row).
ParallelMergeCSR is ideal for very large system sizes or if you are performing simulations using the Blockade subspace, where the imbalance in the number of non-zero entries per row may be larger than in a full Hilbert space simulation. Before the actual matrix-vector multiplication, ParallelMergeCSR performs some calculations to find the ideal distribution of work across threads. For smaller system sizes this overhead means ThreadedSparseCSR may give more benefit.
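To get a rough feel for how uneven the rows of a given matrix are, you could count the stored entries per row. The sketch below is an assumption-laden illustration: nnz_per_row and row_imbalance are hypothetical helpers, and the ratio of the largest row to the mean row is just one possible measure, not a criterion that Bloqade itself uses.

```julia
using SparseArrays

# Hypothetical helper: number of stored non-zeros in each row of a CSC matrix.
function nnz_per_row(A::SparseMatrixCSC)
    counts = zeros(Int, size(A, 1))
    for r in rowvals(A)          # rowvals lists the row index of every stored entry
        counts[r] += 1
    end
    return counts
end

# Hypothetical imbalance measure: largest row count divided by the mean row count.
function row_imbalance(A::SparseMatrixCSC)
    c = nnz_per_row(A)
    return maximum(c) / (sum(c) / length(c))
end

# Evenly filled matrix: every row stores exactly 2 entries.
even = sparse([1, 2, 3, 4, 1, 2, 3, 4], [1, 2, 3, 4, 2, 3, 4, 1], ones(8), 4, 4)

# Skewed matrix: row 1 stores 4 of the 7 entries.
skewed = sparse([1, 1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 2, 3, 4], ones(7), 4, 4)

println(row_imbalance(even))     # ratio of 1 means perfectly even rows
println(row_imbalance(skewed))   # larger ratio means more uneven rows
```

A ratio close to 1 corresponds to the evenly distributed case described for ThreadedSparseCSR, while a large ratio suggests the kind of imbalance that ParallelMergeCSR is meant to handle. Where the crossover lies depends on the system and is best determined by benchmarking both backends.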
Using Multithreading
To enable multithreading, first import BloqadeExpr.
julia> using BloqadeExpr
You can verify which backend you currently have enabled through:
BloqadeExpr.backend
To set a new backend, pass its name as a string to the set_backend function (you can pass in "BloqadeExpr", "ThreadedSparseCSR", or "ParallelMergeCSR"):
BloqadeExpr.set_backend("ThreadedSparseCSR")
You will be prompted to restart the Julia session. When you restart, launch Julia with the desired number of threads:
julia -t num_threads
So long as the backend is not explicitly changed again through BloqadeExpr.set_backend, the selected backend will persist.
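After restarting, it can be worth confirming that the session actually has more than one thread, since a threaded backend cannot speed anything up in a single-threaded session. A minimal check using only Julia's standard Base.Threads module (nothing here is Bloqade-specific):

```julia
using Base.Threads

# Number of threads this Julia session was started with
# (set by `julia -t N` or the JULIA_NUM_THREADS environment variable).
n = nthreads()

if n == 1
    println("Running single-threaded; restart with `julia -t N` to use a threaded backend.")
else
    println("Running with $n threads.")
end
```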