Efficient Learning of Readout Noise Cross-Talk Models in Near-Term Quantum Devices
I. INTRODUCTION
Motivation
Imperfections in qubit measurements are a significant source of errors in near-term quantum devices. Characterizing those errors is crucial for improving the quality of computation, since it enables error mitigation. However, the cost of completely characterizing a quantum detector scales exponentially with the number of qubits and becomes infeasible for large devices. As a consequence, characterizing a generic noise model is impossible in practice. There is therefore a need for models that accurately capture correlations in the readout process and can be efficiently reconstructed using appropriately tailored tomographic techniques. In this contribution, we show that the correlated clusters noise model considered, e.g., in [1], can be learned efficiently by post-processing data from detector overlapping tomography (DOT) [2], an extension of overlapping tomography of quantum states [3]. The clusters model assumes that correlations in the measurement process are restricted to subsets of qubits (clusters) of bounded size $k$ (this number is called the locality). To the best of our knowledge, we report here the first experimental analysis of cross-talk effects in the readout noise of 56- and 109-qubit systems on Rigetti's Aspen-M and IBM's Washington devices, respectively (the devices contain 80 and 127 qubits, but not all of them were used in the analysis).
Preliminaries
In this work, we are mostly interested in characterizing a simplified correlated readout noise model [1], described by a stochastic map $\Lambda$ that relates noisy and ideal measurement statistics and has a tensor product structure
$\Lambda = \bigotimes_{\chi} \Lambda_{\chi}, \; \; (1)$
where $\chi$ labels clusters, i.e., groups of qubits that exhibit strong correlations in the readout noise. One objective of this contribution is to develop an efficient method for finding the cluster structure that best describes correlations in the experimental data. In general, measurement noise is modeled by a generic quantum channel (as opposed to a classical stochastic map). Here we show that DOT experiments make it possible both to validate the classicality of the noise and to learn its cluster structure (cf. Eq. (1)).
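The tensor product structure of Eq. (1) can be illustrated with a minimal NumPy sketch. The two cluster matrices below are hypothetical, hand-picked column-stochastic matrices (not measured data), and the convention that entry [y, x] holds the probability of reading y given ideal outcome x is an assumption made for this example.

```python
import numpy as np
from functools import reduce

def global_noise_matrix(cluster_matrices):
    """Kronecker product of column-stochastic cluster matrices, as in Eq. (1)."""
    return reduce(np.kron, cluster_matrices)

# Hypothetical one-qubit cluster: entry [y, x] = p(read y | ideal x).
lam_a = np.array([[0.98, 0.05],
                  [0.02, 0.95]])

# Hypothetical two-qubit cluster with correlated errors (each column sums to 1).
lam_b = np.array([[0.96, 0.03, 0.04, 0.01],
                  [0.02, 0.94, 0.01, 0.05],
                  [0.01, 0.01, 0.93, 0.04],
                  [0.01, 0.02, 0.02, 0.90]])

lam_global = global_noise_matrix([lam_a, lam_b])  # acts on 3 qubits

# Ideal distribution concentrated on bitstring 000, then its noisy counterpart.
p_ideal = np.zeros(8)
p_ideal[0] = 1.0
p_noisy = lam_global @ p_ideal
```

Storing the clusters separately requires 4 + 16 entries here instead of the 64 entries of the full matrix; the saving grows exponentially with the number of qubits as long as the cluster size stays bounded.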
II. RESULTS OVERVIEW
1. Definition of reduced noise channels
We introduce a precise definition of reduced measurement channels. In contrast to other common approaches [4-6], we do not require the non-signaling assumption to hold, i.e., a reduced channel may depend on the state of the subsystems over which the reduction is performed (the complement). The maximally mixed state is chosen as the input to the complement to ensure that no input state is preferred. This can be thought of as taking the reduced channel averaged over all pure input states of the complement (cf. a similar approach in [7]).
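For the special case of classical noise, the reduction described above can be sketched as follows. This is an illustrative simplification, not the general quantum definition: a maximally mixed input on the complement is taken to mean a uniform average over the complement's input bitstrings, and the outputs of the complement are marginalized. The function name, the qubit ordering (qubit 0 is the most significant bit), and the example matrices are assumptions of this sketch.

```python
import numpy as np

def reduced_noise_matrix(lam, n_qubits, subset):
    """Reduced classical noise matrix on `subset` of an n-qubit stochastic map.

    Outputs of the complement are summed out; inputs of the complement are
    averaged uniformly (the classical analogue of a maximally mixed input).
    """
    subset = sorted(subset)
    complement = [q for q in range(n_qubits) if q not in subset]
    # Tensor axes: output bits 0..n-1, then input bits n..2n-1.
    t = lam.reshape((2,) * (2 * n_qubits))
    axes = tuple(complement) + tuple(n_qubits + q for q in complement)
    if axes:
        t = t.sum(axis=axes) / 2 ** len(complement)
    dim = 2 ** len(subset)
    return t.reshape(dim, dim)

# Hypothetical signaling example: qubit 0 is read perfectly, while the noise
# on qubit 1 depends on the input state of qubit 0 (m0 if 0, m1 if 1).
m0 = np.array([[0.98, 0.04], [0.02, 0.96]])
m1 = np.array([[0.90, 0.04], [0.10, 0.96]])
lam = np.zeros((4, 4))
lam[0:2, 0:2] = m0
lam[2:4, 2:4] = m1

lam_q1 = reduced_noise_matrix(lam, 2, [1])  # average of m0 and m1
lam_q0 = reduced_noise_matrix(lam, 2, [0])  # identity: qubit 0 is noiseless
```

In this example the reduced matrix on qubit 1 is well defined even though the non-signaling assumption fails, because the dependence on qubit 0 is averaged out.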
2. Efficient experiments' design
We provide a rigorous statistical analysis of the sample complexity of reconstructing reduced readout noise channels using detector overlapping tomography (DOT) [1,3]. This technique takes advantage of a parallel measurement strategy and requires only the preparation of Pauli eigenstates, so the circuits involved are single layers of simple one-qubit gates. To fully characterize, up to precision $\varepsilon$, all $k$-qubit subsets of an $N$-qubit device, the number of required circuits scales as $O(6^k \log (N) /\varepsilon^2)$, where $k$ can be understood as the maximal locality of correlations in the noise. Note that $k$ corresponds to the maximal size of the qubit clusters (recall Eq. (1)) and is not necessarily related to the device's connectivity.
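The parallel strategy can be illustrated with a small sketch. It assumes, purely for illustration, that circuits are drawn uniformly at random from the six single-qubit Pauli eigenstates on every qubit, and it only checks coverage: every $k$-qubit subset should see all $6^k$ input combinations at least once. It does not model the $1/\varepsilon^2$ statistical cost, and the function names are hypothetical.

```python
import itertools
import random

def covers_all_subsets(circuits, n_qubits, k):
    """True if every k-qubit subset sees all 6**k Pauli-eigenstate inputs."""
    for subset in itertools.combinations(range(n_qubits), k):
        seen = {tuple(c[q] for q in subset) for c in circuits}
        if len(seen) < 6 ** k:
            return False
    return True

def sample_covering_collection(n_qubits, k, seed=0, batch=20):
    """Add random product-state circuits in batches until coverage holds."""
    rng = random.Random(seed)
    circuits = []
    while not covers_all_subsets(circuits, n_qubits, k):
        for _ in range(batch):
            # Each circuit assigns one of six eigenstate labels to every qubit.
            circuits.append(tuple(rng.randrange(6) for _ in range(n_qubits)))
    return circuits

collection = sample_covering_collection(n_qubits=8, k=2)
n_circuits = len(collection)
```

Because each circuit probes all $k$-qubit subsets at once, the required number of circuits grows only slowly with the number of qubits, in contrast to characterizing each subset with its own set of $6^k$ circuits.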
3. Operational measure of correlations in the readout noise
To quantify correlations in the readout noise, we generalize the correlation coefficients introduced in [1]. A coefficient $c_{i\rightarrow j}$ has a sound operational interpretation: it quantifies how much the marginal measurement statistics on qubit $j$ can differ (as measured by the total variation distance, TVD) depending on the input state of qubit $i$, which can be arbitrary. For example, $c_{i\rightarrow j}=10\%$ means that, depending on the input state of qubit $i$, the marginal distributions on qubit $j$ can differ by as much as $0.1$ in TVD. The correlation coefficients are used as input data for the algorithm that searches for the cluster assignment.
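A minimal sketch of such a coefficient, restricted to classical noise and computational-basis inputs, is given below. This restriction is an assumption of the example: the generalized coefficient described above allows arbitrary input states of qubit $i$, which this sketch does not capture. The two-qubit noise matrix is hypothetical and uses the index convention 2*bit_i + bit_j.

```python
import numpy as np

def correlation_coefficient(lam_ij):
    """Classical c_{i->j} for a 4x4 noise matrix, index = 2*bit_i + bit_j.

    Returns the largest TVD between marginal distributions on qubit j
    obtained for input states 0 and 1 of qubit i.
    """
    t = lam_ij.reshape(2, 2, 2, 2)            # axes: y_i, y_j, x_i, x_j
    marg_j = t.sum(axis=0)                    # p(y_j | x_i, x_j)
    diff = marg_j[:, 0, :] - marg_j[:, 1, :]  # axes: y_j, x_j
    return 0.5 * np.abs(diff).sum(axis=0).max()

def swap_qubits(lam_ij):
    """Exchange the roles of the two qubits in a 4x4 noise matrix."""
    return lam_ij.reshape(2, 2, 2, 2).transpose(1, 0, 3, 2).reshape(4, 4)

# Hypothetical noise: qubit i is read perfectly; qubit j has noise m0 or m1
# depending on whether qubit i was prepared in 0 or 1.
m0 = np.array([[0.98, 0.04], [0.02, 0.96]])
m1 = np.array([[0.90, 0.04], [0.10, 0.96]])
lam = np.zeros((4, 4))
lam[0:2, 0:2] = m0
lam[2:4, 2:4] = m1

c_i_to_j = correlation_coefficient(lam)               # qubit i affects j
c_j_to_i = correlation_coefficient(swap_qubits(lam))  # qubit j does not affect i
```

The example shows that the coefficients are directional: here $c_{i\rightarrow j}$ is 0.08 while $c_{j\rightarrow i}$ vanishes.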
4. Efficient learning of clusters' structure
To learn the noise structure (recall Eq. (1)) from experimental data, we introduce a heuristic, randomized optimization algorithm that assigns strongly correlated qubits to clusters. The algorithm is based on the maximization of an ansatz cost function that depends on the correlation coefficients $c_{i\rightarrow j}$. Contrary to common algorithms used in community detection problems, here the maximal cluster size is constrained to prevent the formation of clusters that are too large to be feasibly characterized and analyzed.
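The general idea of size-constrained clustering can be sketched with a simple randomized local search. Everything specific below is a stand-in chosen for illustration and is not the algorithm or cost function of this work: the cost (sum of intra-cluster coefficients minus a quadratic size penalty with weight `alpha`), the move set, and the example matrix are all hypothetical.

```python
import random
import numpy as np

def to_clusters(labels):
    """Group qubit indices by their cluster label."""
    groups = {}
    for qubit, label in enumerate(labels):
        groups.setdefault(label, []).append(qubit)
    return list(groups.values())

def cost(clusters, c, alpha):
    """Hypothetical cost: intra-cluster correlations minus a size penalty."""
    total = 0.0
    for members in clusters:
        total += sum(c[i, j] for i in members for j in members if i != j)
        total -= alpha * len(members) ** 2
    return total

def find_clusters(c, max_size, alpha=0.01, n_steps=2000, seed=0):
    """Randomized local search over cluster labels with a hard size limit."""
    rng = random.Random(seed)
    n = c.shape[0]
    labels = list(range(n))  # start with every qubit in its own cluster
    best = cost(to_clusters(labels), c, alpha)
    for _ in range(n_steps):
        qubit, new_label = rng.randrange(n), rng.randrange(n)
        if new_label == labels[qubit]:
            continue
        trial = labels.copy()
        trial[qubit] = new_label
        if trial.count(new_label) > max_size:
            continue  # enforce the maximal cluster size
        value = cost(to_clusters(trial), c, alpha)
        if value >= best:
            labels, best = trial, value
    return sorted(sorted(members) for members in to_clusters(labels))

# Hypothetical coefficients: pairs (0, 1) and (2, 3) are strongly correlated.
c = np.full((4, 4), 0.001)
c[0, 1] = c[1, 0] = 0.10
c[2, 3] = c[3, 2] = 0.08
clusters = find_clusters(c, max_size=2)
```

The hard limit `max_size` plays the role of the locality $k$: moves that would create a larger cluster are rejected before the cost is evaluated.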
5. Noise model benchmarks
To verify the accuracy of the reconstructed noise model, we propose two benchmarks. The first is based on energy predictions for $t$-local classical Hamiltonians (i.e., Hamiltonians containing only Pauli $Z$ operators acting on up to $t$ qubits) that, for the considered problem sizes, can be solved classically. The reconstructed noise model is used to simulate the noisy energy expectation value, which is compared to experimental results. The second benchmark uses the reconstructed noise model to perform error mitigation, and the error-mitigated values are compared to theoretical (noiseless) values. We use a standard error-mitigation strategy, i.e., noise-inversion post-processing at the level of marginals (see [1,2] for details). For both benchmarks we also test an uncorrelated model, in which noise acts on each qubit independently (this corresponds to clusters of size 1).
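Both benchmarks can be illustrated on a single two-qubit marginal. The sketch below assumes a hypothetical column-stochastic noise matrix and a hypothetical ideal distribution, simulates the noisy expectation value of one $Z \otimes Z$ term, and then applies noise inversion to the noisy marginal. With exact probabilities the inversion recovers the ideal value; with finite statistics the mitigated vector need not be a valid probability distribution, a case this sketch does not handle.

```python
import numpy as np

# Hypothetical two-qubit cluster noise matrix, entry [y, x] = p(read y | ideal x).
lam = np.array([[0.96, 0.03, 0.04, 0.01],
                [0.02, 0.94, 0.01, 0.05],
                [0.01, 0.01, 0.93, 0.04],
                [0.01, 0.02, 0.02, 0.90]])

# Eigenvalues of Z (x) Z on bitstrings 00, 01, 10, 11.
zz = np.array([1.0, -1.0, -1.0, 1.0])

# Hypothetical ideal marginal: equal weight on 00 and 11.
p_ideal = np.array([0.5, 0.0, 0.0, 0.5])

# Benchmark 1: use the noise model to predict the noisy expectation value.
p_noisy = lam @ p_ideal
e_noisy = zz @ p_noisy

# Benchmark 2: mitigate by inverting the noise matrix on the marginal.
p_mitigated = np.linalg.solve(lam, p_noisy)
e_mitigated = zz @ p_mitigated

e_ideal = zz @ p_ideal
```

For a Hamiltonian with many terms, the same steps would be repeated for the marginal of each term and the weighted expectation values summed.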
6. Experimental results
We report results of experiments on 109- and 56-qubit systems of IBM's Washington and Rigetti's Aspen-M-1 processors, respectively. The maximal cluster size (locality) was set to $k=5$. Our main findings are as follows.
1. Mitigation based on the correlated clusters model can improve the median error obtained in error-mitigated estimation of $2$-local Hamiltonians by as much as $\approx 80\%$, or by a factor as high as $\approx 2.4$, compared to the uncorrelated noise model.
2. The correlations in the readout noise do not necessarily correspond to the physical topology of a device.
3. Although most of the reported correlation coefficients are below statistical significance, there is still a considerable number of strongly correlated qubit pairs in both devices.
4. Genuine quantum effects (not shown in any plot here) are of low statistical significance. Therefore, the classical noise model is a good approximation of the measurement noise in both devices.
Bibliography
[1] F. B. Maciejewski, F. Baccari, Z. Zimborás, and M. Oszmaniec, Quantum 5, 464 (2021).
[2] F. B. Maciejewski, Z. Zimborás, and M. Oszmaniec, Quantum 4, 257 (2020).
[3] J. Cotler and F. Wilczek, Physical Review Letters 124, 100401 (2020).
[4] C.-Y. Hsieh, M. Lostaglio, and A. Acín, arXiv:2102.10926 [quant-ph] (2021).
[5] R. Levy, D. Luo, and B. K. Clark, arXiv:2110.02965 [quant-ph] (2021).
[6] J. Kunjummen, M. C. Tran, D. Carney, and J. M. Taylor, arXiv:2110.03629 [quant-ph] (2021).
[7] L. C. G. Govia, G. J. Ribeill, D. Ristè, M. Ware, and H. Krovi, Nature Communications 11, 1084 (2020).
This is joint work with F. B. Maciejewski, O. Slowik, and M. Oszmaniec.