Reference
run() options
QXel adds a few keyword arguments to the standard Braket run() method. They are all optional; the defaults give a correct CPU run. Tune them once a circuit is too slow or too large to fit in memory.
compute_type
Selects which gate kernels run. Start with the default 'cpu' to get a result, then switch to 'cuda' for GPU acceleration once you confirm the circuit is correct.
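One way to confirm that the 'cuda' run agrees with the 'cpu' run is to compare their measurement counts. The sketch below is a minimal illustration, not part of QXel: total_variation_distance is a hypothetical helper, and the counts are made-up sample data standing in for the measurement counts that a Braket result object typically exposes. Shot noise means two correct runs will rarely match exactly, so the comparison uses a distance rather than equality.

```python
from collections import Counter


def total_variation_distance(counts_a, counts_b):
    """Return the total variation distance between two count histograms.

    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    outcomes = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(o, 0) / total_a - counts_b.get(o, 0) / total_b)
        for o in outcomes
    )


# Made-up counts for a 2-qubit Bell circuit at 4096 shots (illustration only).
cpu_counts = Counter({"00": 2051, "11": 2045})
gpu_counts = Counter({"00": 2030, "11": 2066})

distance = total_variation_distance(cpu_counts, gpu_counts)
print(f"total variation distance: {distance:.4f}")
```

A small distance suggests the two backends agree up to sampling noise; what counts as small depends on the shot count and the number of distinct outcomes.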
offload_type and path
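These two arguments control where the statevector lives. Set them only when the statevector no longer fits in GPU or CPU memory. offload_type='storage' requires path, a list of one or more storage devices; when more than one device is given, the statevector is striped across them to increase bandwidth.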
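To judge whether offloading is needed, it helps to estimate the statevector size first. A full statevector holds 2^n complex amplitudes for n qubits. The sketch below assumes 16 bytes per amplitude (double-precision complex); the precision QXel actually uses is not stated here, so treat bytes_per_amplitude as an assumption to adjust. Both function names are hypothetical helpers, not QXel API.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Bytes needed for a dense statevector of num_qubits qubits.

    Assumes 16 bytes per amplitude (complex128) unless told otherwise.
    """
    return (2 ** num_qubits) * bytes_per_amplitude


def fits_in_memory(num_qubits, available_bytes, bytes_per_amplitude=16):
    """True if the dense statevector fits in available_bytes."""
    return statevector_bytes(num_qubits, bytes_per_amplitude) <= available_bytes


GIB = 2 ** 30

for n in (28, 30, 32, 34):
    size_gib = statevector_bytes(n) / GIB
    print(f"{n} qubits: {size_gib:.0f} GiB")

# Example: does a 34-qubit statevector fit in 80 GiB of device memory?
print(fits_in_memory(34, 80 * GIB))
```

Each added qubit doubles the size, so a circuit that fits comfortably at one width can exceed memory only a few qubits later.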
max_fusion
Sets the maximum number of qubits a fused gate may act on. Fusing adjacent gates means fewer, larger kernel launches and less memory traffic. The default of 1 is safe; raise it (2 to 4 is typical) on deep circuits to trade extra setup time for faster execution.
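The effect of the qubit limit can be illustrated with a toy grouping pass. This is a simplified greedy sketch for intuition only, not QXel's actual fusion algorithm: it merges each gate into the previous group as long as the merged group would touch at most max_fusion qubits. The function name, the gate tuple format, and the sample circuit are all hypothetical.

```python
def fuse_gates(gates, max_fusion):
    """Greedily group adjacent gates into fused groups.

    gates is a list of (name, qubits) tuples in circuit order. A gate joins
    the previous group only if the combined qubit set stays within
    max_fusion; otherwise it starts a new group. A gate wider than
    max_fusion simply forms a group of its own.
    """
    groups = []
    for name, qubits in gates:
        qubits = set(qubits)
        if groups and len(groups[-1]["qubits"] | qubits) <= max_fusion:
            groups[-1]["names"].append(name)
            groups[-1]["qubits"] |= qubits
        else:
            groups.append({"names": [name], "qubits": qubits})
    return groups


# Hypothetical 3-qubit circuit, listed in execution order.
circuit = [
    ("h", (0,)),
    ("cnot", (0, 1)),
    ("rz", (1,)),
    ("h", (2,)),
    ("cnot", (1, 2)),
]

for limit in (1, 2, 3):
    groups = fuse_gates(circuit, limit)
    print(f"max_fusion={limit}: {len(groups)} kernel launches")
```

In this toy model a larger limit means fewer groups, and each group corresponds to one larger kernel launch, which is the trade-off described above.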
A fully tuned run
This call combines all of the options: CUDA kernels, the statevector striped across two NVMe drives, and 2-qubit gate fusion.
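result = qxel.run(
    circuit,
    shots=4096,
    compute_type="cuda",                      # run gate kernels on the GPU
    offload_type="storage",                   # keep the statevector on storage
    path=["/dev/nvme0n1", "/dev/nvme1n1"],    # stripe across two NVMe drives
    max_fusion=2,                             # fuse gates acting on up to 2 qubits
).result()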