Proceeding Paper

Supervised Quantum State Discrimination †

NEST, Scuola Normale Superiore, Istituto Nanoscienze-CNR, I-56126 Pisa, Italy
Author to whom correspondence should be addressed.
Presented at the 11th Italian Quantum Information Science conference (IQIS2018), Catania, Italy, 17–20 September 2018.
Proceedings 2019, 12(1), 21.
Published: 19 July 2019
(This article belongs to the Proceedings of 11th Italian Quantum Information Science conference (IQIS2018))

1. Introduction

Combining machine learning and quantum information ideas has been a fruitful line of research in the last few years [1,2,3]. These proposals can be divided into three main categories: using machine learning to enhance numerical methods and experimental techniques, devising quantum algorithms to attack classical machine learning problems, and studying quantum generalisations of learning tasks. In this contribution we address the latter category with an example of supervised learning with quantum data. For the sake of our discussion we use a very broad characterisation of quantum learning tasks. First of all, we consider a genuine quantum information theory setting, where an agent receives a quantum source (of quantum or classical information) and can operate on it with any processing device allowed by the rules of quantum mechanics. Secondly, we refer to tasks in which a machine should be trained to perform a certain quantum operation, and this training can be done through quantum processing, that is, with quantum training data and quantum operations.
In a classical supervised learning classification problem a machine is given a set of labelled data and uses this information to produce a classifier, which can then be used to assign a label to new unlabelled data. In a probabilistic setting the data $x \in X$ and the labels $y \in Y$ are distributed according to a probability distribution $P : X \times Y \to [0,1]$, and a classifier is a conditional probability distribution $C(Y|X)$. For each $P$ there exists an optimal classifier which minimises the probability of misclassification, and a good learning algorithm is expected to give a good approximation of the optimal classifier, at least when the training dataset becomes large. Moreover, if the agent has some prior information on the possible distributions, one can define the optimal training and test algorithm: the one with the lowest error probability on average, averaging over all the possible distributions according to the prior.
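As a toy illustration of this probabilistic picture, the sketch below computes the optimal classifier and its misclassification probability for a small made-up joint distribution (the numbers are arbitrary, not taken from the paper): for each datum it simply picks the label with the largest posterior.

```python
import numpy as np

# Toy joint distribution P(x, y) over 3 data values and 2 labels
# (rows: x, columns: y); an arbitrary example for illustration only.
P = np.array([[0.30, 0.05],
              [0.10, 0.20],
              [0.05, 0.30]])
assert np.isclose(P.sum(), 1.0)

# The optimal classifier assigns to each x the label with the largest
# posterior P(y | x), equivalently the largest joint probability P(x, y).
optimal_label = P.argmax(axis=1)

# Its misclassification probability is the joint probability mass of the
# labels it does not choose.
p_err = P.sum() - P.max(axis=1).sum()
print(optimal_label)       # [0 1 1]
print(round(p_err, 10))    # 0.2
```

Any other classifier loses at least as much probability mass on some $x$, which is why this rule is optimal for a known $P$.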
Since classical learning tasks can be studied in a probabilistic formulation, a straightforward generalisation of a learning task can be obtained by reinterpreting the task on quantum states rather than probability distributions.
The analogy is clear and fundamental in quantum information theory, as most information-theoretic tasks that can be defined using probability distributions, like compression and communication, can be generalised to quantum states.

2. The Model

On the basis of these observations, we consider the following problem: an agent is asked to correctly guess the state of a qubit system $X$ initialised with equal probability in the state $\rho_1$ or $\rho_2$, but $\rho_1$ and $\rho_2$ are unknown to the agent. Instead, the agent receives as a training set for the task a system $A$ made of $n$ qubits known to be in the state $\rho_1$ and a system $B$ of $n$ other qubits known to be in the state $\rho_2$. The agent may have some kind of prior information on $\rho_1$ and $\rho_2$, like their purity or their overlap. The question we ask is to find the two-outcome measurement $\hat{M} = \{\hat{\Pi}_1, \hat{\Pi}_2\}$ on the joint input made of the training and test data which minimises the probability of misclassification, calculated by averaging over all the possible couples $\rho_1$ and $\rho_2$ according to the prior information. Since the input state of the machine can be one of the two alternatives $\tau_1 = \rho_1^{\otimes n} \otimes \rho_1 \otimes \rho_2^{\otimes n}$ (if $X$ is in $\rho_1$) and $\tau_2 = \rho_1^{\otimes n} \otimes \rho_2 \otimes \rho_2^{\otimes n}$ (if $X$ is in $\rho_2$), the average error probability reads

$$P_{\mathrm{err}}(n) = \int d\mu(\rho_1, \rho_2)\, \frac{\mathrm{Tr}[\tau_1 \hat{\Pi}_2] + \mathrm{Tr}[\tau_2 \hat{\Pi}_1]}{2}, \tag{1}$$
where $d\mu(\rho_1, \rho_2)$ is a classical probability distribution on the states $\rho_1, \rho_2$ which encodes the prior information of the agent. This problem can be translated into a binary state discrimination task for the two known effective states

$$\alpha(n) \equiv \int d\mu(\rho_1, \rho_2)\, \rho_1^{\otimes n} \otimes \rho_1 \otimes \rho_2^{\otimes n}, \qquad \beta(n) \equiv \int d\mu(\rho_1, \rho_2)\, \rho_1^{\otimes n} \otimes \rho_2 \otimes \rho_2^{\otimes n}. \tag{2}$$
Therefore, defining $\Theta \equiv \alpha(n) - \beta(n)$, our figure of merit can be written as

$$P_{\mathrm{err,min}}(n) = \frac{1}{2} - \frac{1}{4}\,\|\Theta\|_1, \tag{3}$$

with $\|\cdot\|_1$ indicating the trace norm.
In the limit $n \to \infty$ one expects this quantity to converge to the average of the error probability of the Helstrom measurement for known $\rho_1$ and $\rho_2$, since the classical description of the template states can be recovered exactly, for example using tomography. In fact, this limit is always a lower bound on the probability of error at finite $n$. We are interested in the finite-size corrections to this value.
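For the special case of a Haar-uniform prior on *pure* qubit states, the effective states can be written in closed form: by Schur's lemma, the Haar average of $\psi^{\otimes k}$ over pure qubit states is the projector onto the symmetric subspace divided by its dimension, $S_k/(k+1)$. The sketch below (an illustration restricted to this prior, not the general computation of the paper) uses this to evaluate $P_{\mathrm{err,min}}(n)$ exactly for small $n$, and shows it decreasing towards the averaged Helstrom limit, which for Haar-random pure qubits is $1/6$.

```python
import numpy as np
from itertools import combinations

def sym_projector(k):
    """Projector onto the symmetric subspace of k qubits,
    built from the k+1 Dicke states."""
    dim = 2 ** k
    P = np.zeros((dim, dim))
    for m in range(k + 1):
        v = np.zeros(dim)
        for ones in combinations(range(k), m):
            idx = sum(1 << (k - 1 - q) for q in ones)
            v[idx] = 1.0
        v /= np.linalg.norm(v)
        P += np.outer(v, v)
    return P

def p_err_min(n):
    """Exact P_err,min(n) for Haar-uniform pure rho_1, rho_2.
    With qubit ordering A (n qubits), X (1 qubit), B (n qubits),
    alpha averages rho_1 over A and X jointly, beta over X and B jointly;
    each Haar average gives a normalised symmetric projector S_k/(k+1)."""
    alpha = np.kron(sym_projector(n + 1) / (n + 2), sym_projector(n) / (n + 1))
    beta = np.kron(sym_projector(n) / (n + 1), sym_projector(n + 1) / (n + 2))
    trace_norm = np.abs(np.linalg.eigvalsh(alpha - beta)).sum()
    return 0.5 - trace_norm / 4

# The error decreases with the number of training copies n,
# staying above the n -> infinity averaged-Helstrom limit 1/6.
for n in (1, 2, 3):
    print(n, p_err_min(n))
```

The limit $1/6$ follows from averaging the Helstrom error $\frac{1}{2} - \frac{1}{4}|\vec r_1 - \vec r_2|$ over independent uniform unit Bloch vectors, whose mean distance is $4/3$.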

3. Results

Our contribution consists in calculating $P_{\mathrm{err,min}}(n)$ for a number of priors, generalising previous results [4,5]. As detailed in the preprint [6], we extensively use the symmetry and covariance properties of $\Theta$ in order to reduce the problem to a simple analytic diagonalisation of $2 \times 2$ matrices. Moreover, we perform asymptotic expansions of the sums of eigenvalues arising from Equation (3) in order to obtain finite-size corrections to the asymptotic limit. In particular, we consider the following cases:
$\rho_1, \rho_2$ have assigned purities, the moduli of their Bloch vectors being respectively $r_1$ and $r_2$, with a uniform prior on the Bloch vectors' directions:

$$P_{\mathrm{err,min}}(n \gg 1) = \frac{1}{2} - \frac{(r_1+r_2)^3 - |r_1-r_2|^3}{24\, r_1 r_2} + \frac{5}{24 n}\, \frac{(r_1+r_2)^3 + |r_1-r_2|^3}{r_1^2 r_2^2} - \frac{1}{24 n}\, \frac{(r_1+r_2)^5 - |r_1-r_2|^5}{r_1^3 r_2^3} + o\!\left(\frac{1}{n}\right),$$

where the notation $o\!\left(\frac{1}{n}\right)$ indicates terms that go to zero faster than $\frac{1}{n}$.
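The $n$-independent term admits a quick consistency check: in the $n \to \infty$ limit the agent effectively knows $\rho_1$ and $\rho_2$, so the error is the Helstrom value $\frac{1}{2} - \frac{1}{4}|\vec r_1 - \vec r_2|$ averaged over uniform directions at fixed moduli. The Monte Carlo sketch below (the values $r_1 = 0.9$, $r_2 = 0.6$ are an arbitrary choice for illustration) reproduces the constant term of the expansion.

```python
import numpy as np

rng = np.random.default_rng(42)

def uniform_direction(size):
    """Uniformly random unit vectors in R^3 (normalised Gaussians)."""
    v = rng.normal(size=(size, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def helstrom_average(r1, r2, samples=200_000):
    """Monte Carlo average of the Helstrom error 1/2 - |r1_vec - r2_vec|/4
    for fixed Bloch-vector moduli r1, r2 and uniform directions
    (the n -> infinity limit of the expansion)."""
    d = r1 * uniform_direction(samples) - r2 * uniform_direction(samples)
    return 0.5 - np.linalg.norm(d, axis=1).mean() / 4

def leading_term(r1, r2):
    """Constant term: 1/2 - ((r1+r2)^3 - |r1-r2|^3) / (24 r1 r2)."""
    return 0.5 - ((r1 + r2) ** 3 - abs(r1 - r2) ** 3) / (24 * r1 * r2)

r1, r2 = 0.9, 0.6
# The two values agree up to Monte Carlo noise (~1e-3).
print(helstrom_average(r1, r2), leading_term(r1, r2))
```

The agreement follows analytically as well: integrating $\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos\varphi}$ over a uniform $\cos\varphi \in [-1,1]$ gives exactly $\frac{(r_1+r_2)^3 - |r_1-r_2|^3}{6\, r_1 r_2}$.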
$\rho_1$ and $\rho_2$ are generic mixed qubit states, with a constant-density prior on the Bloch sphere:

$$P_{\mathrm{err,min}}(n \gg 1) = \frac{17}{70} + \frac{18}{35 n} + o\!\left(\frac{1}{n}\right).$$
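The constant term here has a neat geometric reading: with a constant-density prior, the $n \to \infty$ error is $\frac{1}{2} - \frac{1}{4}\,\mathbb{E}|\vec r_1 - \vec r_2|$, and the mean Euclidean distance between two independent uniform points in the unit ball is $36/35$, giving exactly $\frac{1}{2} - \frac{9}{35} = \frac{17}{70}$. A Monte Carlo sketch of this check:

```python
import numpy as np

rng = np.random.default_rng(7)

def uniform_ball(size):
    """Uniform points in the unit ball via rejection sampling from the cube."""
    pts = []
    while len(pts) < size:
        v = rng.uniform(-1, 1, size=(size, 3))
        pts.extend(v[np.einsum('ij,ij->i', v, v) <= 1])
    return np.array(pts[:size])

# Mean distance between two uniform points in the unit ball is 36/35,
# so the averaged Helstrom error (the n -> infinity limit) is
# 1/2 - (1/4) * 36/35 = 17/70, matching the constant term above.
d = np.linalg.norm(uniform_ball(100_000) - uniform_ball(100_000), axis=1)
print(d.mean())            # ≈ 36/35 ≈ 1.0286
print(0.5 - d.mean() / 4)  # ≈ 17/70 ≈ 0.2429
```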
$\rho_1$ and $\rho_2$ are pure and have a fixed overlap $\mathrm{Tr}[\rho_1 \rho_2] = \sin^2\frac{\theta}{2}$, with a uniform prior on the global orientation:

$$P_{\mathrm{err,min}}(n \gg 1) = \frac{1}{2}\left(1 - \left|\cos\frac{\theta}{2}\right|\right) + \frac{3 + \cos\theta}{8\sqrt{2}\,\sqrt{1+\cos\theta}}\, \frac{1}{n} + \frac{1 - 60\cos\theta - 5\cos 2\theta}{128\sqrt{2}\,(1+\cos\theta)^{3/2}}\, \frac{1}{n^2} + o\!\left(\frac{1}{n^2}\right).$$
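Again the constant term matches the known-state limit: for two equiprobable pure states with $\mathrm{Tr}[\rho_1 \rho_2] = \sin^2\frac{\theta}{2}$, the Helstrom error is $\frac{1}{2}\left(1 - \left|\cos\frac{\theta}{2}\right|\right)$. A small numerical check, with an explicit (arbitrarily chosen) parametrisation of two states with the given overlap:

```python
import numpy as np

def helstrom_pure(theta):
    """Helstrom error 1/2 - ||rho1 - rho2||_1 / 4 for two equiprobable
    pure qubit states with overlap Tr[rho1 rho2] = sin^2(theta/2)."""
    psi1 = np.array([1.0, 0.0])
    # <psi1|psi2> = sin(theta/2), so Tr[rho1 rho2] = sin^2(theta/2).
    psi2 = np.array([np.sin(theta / 2), np.cos(theta / 2)])
    rho1, rho2 = np.outer(psi1, psi1), np.outer(psi2, psi2)
    trace_norm = np.abs(np.linalg.eigvalsh(rho1 - rho2)).sum()
    return 0.5 - trace_norm / 4

theta = 1.0
# The two printed values coincide.
print(helstrom_pure(theta), 0.5 * (1 - abs(np.cos(theta / 2))))
```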


References

1. Biamonte, J.; Wittek, P.; Pancotti, N.; Rebentrost, P.; Wiebe, N.; Lloyd, S. Quantum machine learning. Nature 2017, 549, 195–202.
2. Dunjko, V.; Briegel, H.J. Machine learning & artificial intelligence in the quantum domain: A review of recent progress. Rep. Prog. Phys. 2018, 81, 074001.
3. Schuld, M.; Petruccione, F. Supervised Learning with Quantum Computers; Springer: Cham, Switzerland, 2018.
4. Hayashi, A.; Horibe, M.; Hashimoto, T. Quantum pure-state identification. Phys. Rev. A 2005, 72, 052306.
5. Sentís, G.; Bagan, E.; Calsamiglia, J.; Muñoz-Tapia, R. Multicopy programmable discrimination of general qubit states. Phys. Rev. A 2010, 82, 042312.
6. Fanizza, M.; Mari, A.; Giovannetti, V. Optimal universal learning machines for quantum state discrimination. arXiv 2018, arXiv:1805.03477.
Fanizza, M.; Mari, A.; Giovannetti, V. Supervised Quantum State Discrimination. Proceedings 2019, 12, 21.
