1. Introduction
A formal theory $\mathsf{T}$ of contextuality is defined by a class $\mathfrak{R}$ of possible systems of random variables and a rule by which these systems are divided into noncontextual and contextual ones. In the original theory of contextuality (a term in which we include both the Kochen–Specker contextuality and the contextuality in distributed systems, referred to as nonlocality [1,2,3,4,5,6,7,8]), the class $\mathfrak{R}$ is confined to consistently connected systems, or a subclass thereof. These are the systems with no “disturbance” or “signaling,” which means that the variables representing the same property (answering the same question) in different contexts are identically distributed. The Contextuality-by-Default theory (CbD) extends the notion of contextuality to all systems of random variables, including those with disturbance [9,10], and it has been applied to several experimental and theoretical situations [11,12,13,14,15,16,17,18]. A recent workshop on contextuality [19] exhibited a renewed interest in studying contextuality in inconsistently connected systems, including approaches that are distinctly non-CbD-like [20,21,22], and some work directly critical of CbD ([23], responded to in Ref. [24]).
The present paper is not about CbD specifically. Rather, it is about a broad class of all possible CbD-like theories, as defined below. The plan and the main message of the paper are as follows. In Section 2, we present the terminology and notation to be used and define the notion of a system of random variables modeling (representing or describing) another system. In Section 3, we define the traditional notion of contextuality in the language of probabilistic couplings [25], and we introduce the notion of $\mathsf{C}$-contextuality as a very broad generalization of both traditional contextuality and CbD-contextuality. In Section 4, we introduce the notion of consistification of a system and show that any theory $\mathsf{T}$, irrespective of its class $\mathfrak{R}$ of systems and the specific version of $\mathsf{C}$-contextuality it uses, can be redefined as a theory $\mathsf{T}^{\ddagger}$ whose systems are consistently connected and that uses the traditional notion of contextuality. Because of this, we conclude that there can be no set of substantive requirements $\mathsf{X}$ for the notion of contextuality that are satisfied by all consistently connected systems but contravened by some inconsistently connected ones. Indeed, if such a set of requirements existed, one could form a theory $\mathsf{T}$ whose class $\mathfrak{R}$ includes some systems contravening $\mathsf{X}$. However, $\mathsf{X}$ would then be satisfied by the theory $\mathsf{T}^{\ddagger}$ that is contextually equivalent to $\mathsf{T}$ and a mere reformulation thereof. Consequently, the requirements $\mathsf{X}$ cannot be substantive: they address the form rather than the substance of the notion of contextuality. In Section 5, we discuss some issues related to the consistified systems (the term used for the consistently connected systems in $\mathsf{T}^{\ddagger}$), including the representability thereof by hidden variable models. We also briefly discuss there a still more general (in fact, maximally general) version of $\mathsf{C}$-contextuality, one that does not have the existence-and-uniqueness property postulated for $\mathsf{C}$-contextuality in Section 3. In the final analysis, this does not alter the main conclusion of the paper.
The idea that consistification precludes the possibility of rejecting extended contextuality while accepting the traditional one was previously mentioned in Ref. [24]. However, it was confined to CbD only and mentioned without elaboration. The consistification procedure was first described in Ref. [13] for an older version of the CbD approach, and it was elaborated and adapted to the current version of CbD in Ref. [26]. Finally, the $\mathsf{C}$-contextuality in our paper generalizes a more limited version of this notion that was used in Ref. [27] as a generalization of the CbD approach.
2. Basic Notions
A system of random variables is a set of double-indexed random variables
$$\mathcal{R}=\{R_q^c : c\in C,\ q\in Q,\ q\prec c\},\qquad (1)$$
where $q\in Q$ identifies what the random variable $R_q^c$ represents (measures, responds to, or describes); $c\in C$ identifies the circumstances under which $R_q^c$ is recorded (including what other random variables are recorded together with $R_q^c$); q and c are referred to as, respectively, the content and the context of the random variable $R_q^c$; and the relation $q\prec c$ indicates that a variable with content q is recorded in context
c. As an example, this is a system with
and
:
The subset $R^c$ of random variables recorded in the same context c (a row in the matrix above) is termed a bunch, and the subset $\mathcal{R}_q$ of random variables sharing a content q (a column in the matrix above) is termed a connection. The difference in font ($R^c$ vs. $\mathcal{R}_q$) reflects the fact that $R^c$ is a random variable in its own right (i.e., all its components are jointly distributed), whereas the components of $\mathcal{R}_q$ are not jointly distributed. In fact, no two random variables $R_q^c$ and $R_{q'}^{c'}$ are jointly distributed unless they are in the same bunch, $c=c'$. The measurable space on which $R_q^c$ is distributed is assumed to be the same for all elements of a connection and can be denoted $(E_q,\Sigma_q)$.
The triple $(Q,C,\prec)$ is called the format of the system. It is essentially the mathematical depiction of “what the system is about,” what kind of empirical or theoretical situation it represents. Thus, the format of the system in (2) can be presented as
where 🟉 indicates the elements of the relation ≺. To define a system of a given format, one has to specify the distributions of its bunches.
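As an illustration only, a format can be written down in a few lines of Python. The sketch below assumes a small hypothetical format (two contents, two contexts, every content recorded in every context); the names `relation`, `bunches`, and `connections` are ours and are not part of the paper's notation. It shows how the index pairs $(q,c)$ split into bunches (by context) and connections (by content).

```python
from collections import defaultdict

# Illustrative format (an assumption, not taken from the text): two contents,
# two contexts, and every content recorded in every context.
Q = {1, 2}
C = {1, 2}
relation = {(q, c) for q in Q for c in C}  # pairs (q, c) with q recorded in c


def bunches(relation):
    """Group the index pairs (q, c) by context c: one bunch per context."""
    grouped = defaultdict(list)
    for q, c in sorted(relation):
        grouped[c].append((q, c))
    return dict(grouped)


def connections(relation):
    """Group the index pairs (q, c) by content q: one connection per content."""
    grouped = defaultdict(list)
    for q, c in sorted(relation):
        grouped[q].append((q, c))
    return dict(grouped)
```

Specifying a system of this format would additionally require a joint distribution for each bunch; nothing is specified across bunches.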
As should be clear from the Abstract and Introduction, in this paper we use the notion of one system of random variables, $\mathcal{R}'$, being a “mere reformulation” of another, $\mathcal{R}$. Intuitively, this means that regardless of what empirical or theoretical situation is modeled (described, represented) by $\mathcal{R}$, it is also modeled by $\mathcal{R}'$. The relation between a system and the situation it depicts is difficult to formalize directly, as one would then have to impose some formal structure on the situation being represented before it is represented (as in the representational theory of measurement [28,29]). However, it is sufficient for our purposes to formalize a simpler relationship: between a system $\mathcal{R}$ and another system that models (describes, represents) the system $\mathcal{R}$. Moreover, rather than presenting this relationship in the most general possible way, it will suffice to describe one special, universally applicable construction of the modeling systems $\mathcal{R}'$. We will refer to this construction as canonical modeling.
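Consider two classes of systems, $\mathfrak{R}$ and $\mathfrak{R}'$, in a bijective correspondence to each other, about which we say that any system in $\mathfrak{R}$ is canonically modeled by the corresponding system in $\mathfrak{R}'$. The following definition gives a precise meaning to this relation.
Definition 1. We say that a system $\mathcal{R}\in\mathfrak{R}$ with format $(Q,C,\prec)$ is canonically modeled by a system $\mathcal{R}'\in\mathfrak{R}'$ with format $(Q',C',\prec')$ if
- (canonical contents) $Q'=\{(q,c): q\in Q,\ c\in C,\ q\prec c\}$,
- (canonical contexts) $C'=\{(\cdot,c): c\in C\}\cup\{(q,\cdot): q\in Q\}$,
- (canonical relation) $(q,c)\prec'(\cdot,c')$ if and only if $c=c'$, and $(q,c)\prec'(q',\cdot)$ if and only if $q=q'$, and
- (main bunches) $R^{(\cdot,c)}=\{R_{(q,c)}^{(\cdot,c)}: q\in Q,\ q\prec c\}\overset{d}{=}R^c$ for every $c\in C$,
- (auxiliary bunches) for every $q\in Q$, the distribution of $R^{(q,\cdot)}=\{R_{(q,c)}^{(q,\cdot)}: c\in C,\ q\prec c\}$ is uniquely determined by the distributions of the corresponding variables in $\{R_{(q,c)}^{(\cdot,c)}: c\in C,\ q\prec c\}$.
Here, the symbol $\overset{d}{=}$ stands for “has the same distribution as”. The dot symbol in $(\cdot,c)$ and $(q,\cdot)$ should be taken as part of the names of these contexts. We choose this notation to emphasize that every random variable $R_q^c$ of the system $\mathcal{R}$ is placed in $\mathcal{R}'$ within two contexts, $(\cdot,c)$ and $(q,\cdot)$, whose names are derived from the indices of the variable. Note that the variables in the set $\{R_{(q,c)}^{(\cdot,c)}: c\in C,\ q\prec c\}$ defined here have the same distributions as the corresponding variables in $\mathcal{R}_q$. We use the former set, however, to emphasize that the auxiliary bunches are uniquely determined by the corresponding variables in the main bunches. Note that the variables in $\{R_{(q,c)}^{(\cdot,c)}: c\in C,\ q\prec c\}$ are not jointly distributed, so $R^{(q,\cdot)}$ depends on their individual distributions only.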
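To give an example, consider the systems
$$
\begin{array}{cc|c}
R_1^1 & R_2^1 & c=1\\
R_1^2 & R_2^2 & c=2\\
\hline
q=1 & q=2 & \mathcal{R}
\end{array}
\qquad
\begin{array}{cccc|c}
R_{(1,1)}^{(\cdot,1)} & R_{(2,1)}^{(\cdot,1)} & & & (\cdot,1)\\
 & & R_{(1,2)}^{(\cdot,2)} & R_{(2,2)}^{(\cdot,2)} & (\cdot,2)\\
R_{(1,1)}^{(1,\cdot)} & & R_{(1,2)}^{(1,\cdot)} & & (1,\cdot)\\
 & R_{(2,1)}^{(2,\cdot)} & & R_{(2,2)}^{(2,\cdot)} & (2,\cdot)\\
\hline
(1,1) & (2,1) & (1,2) & (2,2) & \mathcal{R}'
\end{array}
\qquad (4)
$$
Observe that in system $\mathcal{R}'$ the contents, contexts, and the relation between them are constructed in accordance with Definition 1. System $\mathcal{R}'$ canonically models system $\mathcal{R}$ if
$$(R_{(1,1)}^{(\cdot,1)},R_{(2,1)}^{(\cdot,1)})\overset{d}{=}(R_1^1,R_2^1),\qquad (R_{(1,2)}^{(\cdot,2)},R_{(2,2)}^{(\cdot,2)})\overset{d}{=}(R_1^2,R_2^2),$$
and if there is a rule by which the distribution of each auxiliary bunch $(R_{(q,1)}^{(q,\cdot)},R_{(q,2)}^{(q,\cdot)})$, $q=1,2$, is uniquely determined by the distributions of the corresponding variables in the main bunches, $R_{(q,1)}^{(\cdot,1)}$ and $R_{(q,2)}^{(\cdot,2)}$.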
Observe the following properties of canonical modeling.
- 1. The formats of $\mathcal{R}$ and $\mathcal{R}'$ are reconstructible from each other, and so are the bunches of the two systems. Moreover, $\mathcal{R}'$ faithfully replicates the bunches of $\mathcal{R}$. This allows one to say that $\mathcal{R}$ and $\mathcal{R}'$ describe the same empirical or theoretical situation.
- 2. One might wonder why we need the auxiliary contexts at all, and they are indeed unnecessary if all one wants is a system modeling another system, e.g., a system that retains only the main bunches of the modeling system. However, we will see the utility of the auxiliary contexts when we introduce consistifications and contextual equivalence, in Section 4.
- 3. The contents in the modeling system are “contextualized”. For instance, system $\mathcal{R}$ in (4) may be describing an experiment in which two questions, $q=1$ and $q=2$, are asked in two orders, $c=1$ indicating “1 then 2” and $c=2$ indicating “2 then 1” [30,31]. In this case, in the modeling system, the content $(1,2)$ should be interpreted as “question 1 asked second”, and $(1,1)$ should be interpreted as “question 1 asked first”. We return to the issue of interpretation in Section 5.1.
- 4. The indexation of the variables in a canonical model is clearly redundant, and it can be simplified. It is more important, however, to maintain the general logic of indexing the variables by their contents and contexts.
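The format part of canonical modeling (contextualized contents, and contexts named with a dot) is mechanical, and a minimal sketch may help. The function name `canonical_format` and the encoding of the dot as the string `"."` are our own assumptions; the sketch builds only the format of the modeling system, not the distributions of its bunches.

```python
def canonical_format(relation):
    """Build the format of a canonically modeling system.

    relation: set of pairs (q, c) meaning "content q is recorded in context c".
    Contents of the modeling system are the pairs (q, c) themselves; its
    contexts are named (".", c) for main bunches and (q, ".") for auxiliary
    bunches, with "." standing for the dot in the context names.
    """
    contents = sorted(relation)
    main = sorted({(".", c) for _, c in relation})
    auxiliary = sorted({(q, ".") for q, _ in relation})
    new_relation = set()
    for q, c in relation:
        new_relation.add(((q, c), (".", c)))  # placed in its main bunch
        new_relation.add(((q, c), (q, ".")))  # and in its auxiliary bunch
    return contents, main + auxiliary, new_relation


# Hypothetical 2x2 format: two questions, each asked in two orders.
contents, contexts, new_relation = canonical_format({(1, 1), (2, 1), (1, 2), (2, 2)})
```

In the resulting format every content $(q,c)$ is recorded in exactly two contexts, which is why every connection of a canonically modeling system consists of precisely two variables.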
3. Traditional and Extended Contextuality
A system is consistently connected if in every connection all its constituent variables have one and the same distribution. Otherwise, the system is inconsistently connected. (The latter term is also used to designate arbitrary systems, i.e., in the meaning of “not necessarily consistently connected”.)
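For discrete variables, consistent connectedness can be checked directly from the marginal distributions. The following is a minimal sketch under our own assumptions: the marginals are given as a dictionary from index pairs $(q,c)$ to dictionaries of value probabilities, and equality of distributions is tested up to a numerical tolerance. The function name is hypothetical.

```python
from math import isclose


def is_consistently_connected(marginals, tol=1e-12):
    """Check that variables sharing a content have one and the same distribution.

    marginals: {(q, c): {value: probability}} for every variable of the system.
    """
    by_content = {}
    for (q, _c), dist in marginals.items():
        by_content.setdefault(q, []).append(dist)
    for dists in by_content.values():
        first = dists[0]
        for other in dists[1:]:
            values = set(first) | set(other)
            if any(
                not isclose(first.get(v, 0.0), other.get(v, 0.0), abs_tol=tol)
                for v in values
            ):
                return False
    return True


# Illustrative marginals for a single content recorded in two contexts.
consistent = {(1, 1): {0: 0.5, 1: 0.5}, (1, 2): {0: 0.5, 1: 0.5}}
disturbed = {(1, 1): {0: 0.5, 1: 0.5}, (1, 2): {0: 0.3, 1: 0.7}}
```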
An overall coupling of a system $\mathcal{R}$ in (1) is an identically labeled system
$$S=\{S_q^c : c\in C,\ q\in Q,\ q\prec c\}$$
of jointly distributed random variables such that its bunches $S^c=\{S_q^c: q\in Q,\ q\prec c\}$ are distributed as the corresponding bunches $R^c$,
$$S^c\overset{d}{=}R^c\quad\text{for all } c\in C.$$
Clearly, S has the same format as $\mathcal{R}$. A coupling of a connection $\mathcal{R}_q$ is a set $S_q=\{S_q^c: c\in C,\ q\prec c\}$ of jointly distributed random variables such that $S_q^c\overset{d}{=}R_q^c$ for all its elements. A connection coupling $S_q$ is said to be an identity coupling if $\Pr[S_q^c=S_q^{c'}]=1$ for any two of its elements. Obviously, such a coupling exists if and only if all of its elements (equivalently, all elements of the connection $\mathcal{R}_q$) have one and the same distribution. Moreover, the identity coupling is unique if it exists. (The uniqueness of a coupling should always be understood as the uniqueness of its distribution. In other words, it is irrelevant on what domain probability space the coupling is defined as a random variable.)
The traditional notion of contextuality is confined to consistently connected systems, and it can be rigorously defined in our terminology as follows.
Definition 2. A consistently connected system $\mathcal{R}$ is noncontextual if it has a coupling S in which any connection $S_q$ is the identity coupling of the connection $\mathcal{R}_q$. Otherwise, the system is contextual.
The class of all possible systems in a theory is denoted $\mathfrak{R}$. For instance, $\mathfrak{R}$ may contain only the systems with finite sets Q and C, or only the systems with dichotomous random variables. By constraining the class $\mathfrak{R}$, one induces constraints on all possible random variables, $R_q^c$, on bunches of random variables, $R^c$, and on possible connections, $\mathcal{R}_q$.
In CbD, contextuality of a system is defined by considering its couplings S and determining if, in some of them, the couplings of the system’s connections satisfy a certain statement. To generalize this definition to all possible CbD-like theories, all one has to do is to replace this specific statement with one that is (almost) arbitrary. Let $\mathsf{C}$ be any statement of the form “the coupling $S_q$ of connection $\mathcal{R}_q$ has the following properties: …”. The only constraints we impose on $\mathsf{C}$ are as follows.
Definition 3. $\mathsf{C}$ is considered well-fitting if (1) for any connection $\mathcal{R}_q$, there is one and only one coupling $S_q$ of $\mathcal{R}_q$ that satisfies $\mathsf{C}$, and (2) if $\mathcal{R}_q$ consists of identically distributed random variables, then the coupling $S_q$ that satisfies $\mathsf{C}$ is the identity coupling. We denote such a coupling of $\mathcal{R}_q$ as $\mathsf{C}[\mathcal{R}_q]$.
To give an example of a well-fitting statement $\mathsf{C}$: in CbD, if the class $\mathfrak{R}$ of all possible systems is confined to the systems with dichotomous variables, the well-fitting statement is “for any two random variables $S_q^c$ and $S_q^{c'}$ in the coupling $S_q$ of connection $\mathcal{R}_q$, the probability of $S_q^c=S_q^{c'}$ is maximal possible”. Another example: if the class $\mathfrak{R}$ of all possible systems is confined to the systems with real-valued (or more generally, linearly ordered) variables, then a well-fitting statement can be “for any two random variables $S_q^c$ and $S_q^{c'}$ in the coupling $S_q$ of connection $\mathcal{R}_q$, $S_q^c$ and $S_q^{c'}$ have the same quantile rank”. In Section 5.3, we discuss the possibility of dropping the first of the two defining properties of a well-fitting statement $\mathsf{C}$.
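For dichotomous variables, the coupling singled out by the first statement can be computed explicitly. The sketch below is our own illustration, not a construction given in the text: it assumes variables with values 0/1 and $p_i=\Pr[R_i=1]$, and couples them through a single uniform variable $U$ by setting $S_i=1$ when $U<p_i$. For this coupling $\Pr[S_i=S_j]=1-|p_i-p_j|$, which is the largest value compatible with the marginals, and it becomes the identity coupling when all $p_i$ coincide. The function names are hypothetical.

```python
def multimaximal_coupling(ps):
    """Joint distribution of S_i = 1 if U < p_i else 0, for one uniform U on [0, 1).

    ps: probabilities P(R_i = 1), one per variable of the connection.
    Returns {tuple of 0/1 values: probability}.
    """
    cuts = sorted(set([0.0, 1.0] + list(ps)))
    joint = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # On the interval [lo, hi), U < p_i holds exactly when p_i > lo.
        outcome = tuple(1 if p > lo else 0 for p in ps)
        joint[outcome] = joint.get(outcome, 0.0) + (hi - lo)
    return joint


def prob_equal(joint, i, j):
    """Probability that components i and j of the coupling coincide."""
    return sum(p for outcome, p in joint.items() if outcome[i] == outcome[j])


def marginal_one(joint, i):
    """Probability that component i equals 1."""
    return sum(p for outcome, p in joint.items() if outcome[i] == 1)


joint = multimaximal_coupling([0.3, 0.7, 0.5])  # illustrative marginals
```

The same single-uniform construction is one way to realize the quantile-rank statement for linearly ordered variables, of which the dichotomous case is a special instance.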
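Definition 4. Given a well-fitting $\mathsf{C}$, a system $\mathcal{R}$ is $\mathsf{C}$-noncontextual if it has a coupling S such that, for any connection $\mathcal{R}_q$ of the system, the connection coupling $S_q$ coincides with $\mathsf{C}[\mathcal{R}_q]$, the unique coupling of $\mathcal{R}_q$ that satisfies $\mathsf{C}$. Otherwise, the system is $\mathsf{C}$-contextual.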
4. Equivalence and Impossibility Theorems
It follows from the last two definitions that, for a well-fitting $\mathsf{C}$, a consistently connected system is $\mathsf{C}$-noncontextual if and only if it is noncontextual in the traditional sense (i.e., in the sense of Definition 2). In other words, any extension of the notion of contextuality using a well-fitting $\mathsf{C}$ properly reduces to the traditional notion when confined to consistently connected systems. This is obviously not sufficient, however, to consider the extension of contextuality by means of $\mathsf{C}$ well-constructed. There may be other desiderata for a well-constructed notion of contextuality, and a specific choice of $\mathsf{C}$ may not satisfy them. The question we pose now is as follows:
- Q*: Is it possible to formulate a set $\mathsf{X}$ of such desiderata/requirements for the notion of contextuality that, for some choice of $\mathsf{C}$, (1) $\mathsf{X}$ is satisfied for any consistently connected system, but (2) $\mathsf{X}$ is not satisfied for some inconsistently connected systems?
Note that we impose no constraints on what $\mathsf{X}$ may entail, except for its being related to contextuality. It may, e.g., for some relation B between systems, have the form “if system $\mathcal{R}$ is (non)contextual, then any system $\mathcal{R}^{*}$ related to $\mathcal{R}$ by B is (non)contextual” [24].
To answer the question Q*, we need the following result.
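Theorem 1. For any well-fitting $\mathsf{C}$ and system $\mathcal{R}$, there is a consistently connected system $\mathcal{R}^{\ddagger}$ that canonically models it (Definition 1), such that $\mathcal{R}$ is $\mathsf{C}$-contextual (Definition 4) if and only if $\mathcal{R}^{\ddagger}$ is contextual in the traditional sense (Definition 2).
Proof. Let $\mathcal{R}^{\ddagger}$ be a canonically modeling system for $\mathcal{R}$, with
$$R^{(q,\cdot)}\overset{d}{=}\mathsf{C}[\mathcal{R}_q]\quad\text{for every } q\in Q,\qquad (*)$$
where $R^{(q,\cdot)}=\{R_{(q,c)}^{(q,\cdot)}: c\in C,\ q\prec c\}$ is an auxiliary bunch and $\mathsf{C}[\mathcal{R}_q]$ is the unique coupling of the connection $\mathcal{R}_q$ that satisfies $\mathsf{C}$.
One can check that $\mathcal{R}^{\ddagger}$ is consistently connected: every connection of $\mathcal{R}^{\ddagger}$ consists of precisely two variables, $R_{(q,c)}^{(\cdot,c)}$ and $R_{(q,c)}^{(q,\cdot)}$, where $q\prec c$. Indeed, $R_{(q,c)}^{(\cdot,c)}\overset{d}{=}R_{(q,c)}^{(q,\cdot)}$, because $R_{(q,c)}^{(\cdot,c)}\overset{d}{=}R_q^c$ in any canonically modeling system and because we know from (*) that $R_{(q,c)}^{(q,\cdot)}\overset{d}{=}S_q^c$, where $S_q^c$ is the corresponding element of $\mathsf{C}[\mathcal{R}_q]$, so that $S_q^c\overset{d}{=}R_q^c$.
The system $\mathcal{R}^{\ddagger}$ thus constructed is referred to as a consistification of $\mathcal{R}$. We can now define the consistification $S^{\ddagger}$ of a coupling S of a system in precisely the same way as for the system itself (with the main bunches $S^{(\cdot,c)}=S^c$), except that (*) is replaced with the straightforward
$$S^{(q,\cdot)}=S_q,$$
with the obvious correspondence between the different indexations within the two random vectors. Clearly, $S^{\ddagger}$ is a coupling of $\mathcal{R}^{\ddagger}$.
Assume now that $\mathcal{R}$ is $\mathsf{C}$-noncontextual. This means that it has a coupling S such that (a) $S^c\overset{d}{=}R^c$ for every $c\in C$ and (b) $S_q\overset{d}{=}\mathsf{C}[\mathcal{R}_q]$ for every $q\in Q$. Then, in the coupling $S^{\ddagger}$ of system $\mathcal{R}^{\ddagger}$, we have (a’) $S^{(\cdot,c)}\overset{d}{=}R^{(\cdot,c)}$ for every $c\in C$, and (b’) $S^{(q,\cdot)}\overset{d}{=}R^{(q,\cdot)}$ for every $q\in Q$. Moreover, since both $S_{(q,c)}^{(\cdot,c)}$ and $S_{(q,c)}^{(q,\cdot)}$ equal $S_q^c$, we have (c’) $\Pr[S_{(q,c)}^{(\cdot,c)}=S_{(q,c)}^{(q,\cdot)}]=1$. However, (a’)-(b’)-(c’) mean that $\mathcal{R}^{\ddagger}$ is noncontextual in the traditional sense. The implication here is easily seen to be reversible, and we conclude that $\mathcal{R}$ is $\mathsf{C}$-noncontextual if and only if $\mathcal{R}^{\ddagger}$ is noncontextual. □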
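In our example (4), $\mathcal{R}'$ is a consistification of $\mathcal{R}$ if we specify the rule for the auxiliary bunches as follows: the distribution of $(R_{(1,1)}^{(1,\cdot)},R_{(1,2)}^{(1,\cdot)})$ is the same as that of $\mathsf{C}[\mathcal{R}_1]$, and the distribution of $(R_{(2,1)}^{(2,\cdot)},R_{(2,2)}^{(2,\cdot)})$ is the same as that of $\mathsf{C}[\mathcal{R}_2]$, where $\mathsf{C}[\mathcal{R}_q]$ is the unique coupling of connection $\mathcal{R}_q$ satisfying $\mathsf{C}$. If $\mathsf{C}$ is chosen as in CbD, the consistification of the system $\mathcal{R}$ in (2) is the system below (omitting, for simplicity, the parentheses and commas in the contents $(q,c)$ and in the contexts $(\cdot,c)$ and $(q,\cdot)$):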
where all variables are assumed to be dichotomous, and in each of the auxiliary bunches, the variables are pairwise equal with maximal possible probability.
For the purposes of contextuality analysis, $\mathcal{R}^{\ddagger}$ can be viewed as a mere reformulation of $\mathcal{R}$, a different form of the same substance. We express this fact by saying that $\mathcal{R}$ and $\mathcal{R}^{\ddagger}$ are contextually equivalent. (In Refs. [24,26], contextual equivalence is defined more narrowly, requiring also the numerical coincidence of certain measures of contextuality, such as contextual fraction [32]. In this paper, however, the level of abstraction is higher, and we only consider the notion of contextuality rather than its quantifications.)
Consider now a theory of (generally, extended) contextuality $\mathsf{T}=(\mathfrak{R},\mathsf{C})$. In accordance with Theorem 1, we can form the class $\mathfrak{R}^{\ddagger}$ of the consistifications of the elements of $\mathfrak{R}$, in a bijective correspondence with $\mathfrak{R}$. By extension of the term, we can say that $\mathsf{T}=(\mathfrak{R},\mathsf{C})$ and $\mathsf{T}^{\ddagger}=(\mathfrak{R}^{\ddagger},\mathsf{I})$ are contextually equivalent. $\mathsf{I}$ here denotes the statement “the connection has an identity coupling” that underlies the traditional notion of contextuality; by definition, it can be viewed as a special case of any well-fitting statement $\mathsf{C}$. We now have everything we need to demonstrate our main conclusion. Let there be a set $\mathsf{X}$ of requirements of the notion of contextuality that are satisfied by all consistently connected systems (using the traditional contextuality) and contravened by some inconsistently connected ones, using some version of $\mathsf{C}$-contextuality. Let $\mathfrak{R}$ include some of the inconsistently connected systems contravening $\mathsf{X}$. Clearly then, requirements $\mathsf{X}$ contradict theory $\mathsf{T}$, but they are satisfied by the contextually equivalent theory $\mathsf{T}^{\ddagger}$. Therefore, $\mathsf{X}$ is not a set of substantive requirements. We can summarize this as a formal theorem.
Theorem 2. For any well-fitting $\mathsf{C}$, there can be no set $\mathsf{X}$ of substantive requirements of the notion of contextuality that are satisfied by all consistently connected systems (using the traditional contextuality) and contravened by some inconsistently connected ones, using $\mathsf{C}$-contextuality.
Of course, a set of requirements satisfied by $\mathsf{T}^{\ddagger}$ but not $\mathsf{T}$ can be readily formulated. The theorem says, however, that all it can do is lead one to prefer one of two equivalent representations of contextuality, without affecting the substance of the notion.
Note also that in the theorem just formulated, we assume no relationship between the set of requirements $\mathsf{X}$ and the bijective correspondence relating $\mathfrak{R}$ to $\mathfrak{R}^{\ddagger}$. In particular, let $\mathsf{X}$ have the form “if system $\mathcal{R}$ is contextual, then any system $\mathcal{R}^{*}$ related to $\mathcal{R}$ by relation B is contextual.” It is not necessary then, although not excluded either, that $(\mathcal{R}^{*})^{\ddagger}$ is also related to $\mathcal{R}^{\ddagger}$ by relation B. All that is stated in the theorem above is that if one wishes to use this $\mathsf{X}$ as a substantive principle in testing competing theories, then the failure of a theory $\mathsf{T}$ to satisfy it cannot be selectively attributed to the fact that its $\mathfrak{R}$ contains inconsistently connected systems.
5. Miscellaneous Remarks
Here, we consider a few issues related to the main point of this paper.
5.1. Interpretation of Contents and Contexts
Dealing with consistified systems $\mathcal{R}^{\ddagger}$, one needs to get used to a new interpretation of contents and contexts of the random variables: as mentioned previously, in $\mathcal{R}^{\ddagger}$, contents are “contextualized,” with $(q,c)$ in place of just q, and the contexts are simply marginalized contents, $(\cdot,c)$ and $(q,\cdot)$. Consider as an example the EPR/Bohm experiment, the most widely investigated paradigm in contextuality/nonlocality research [1,33,34]. In the usual CbD notation, the system representing it is
$$
\begin{array}{cccc|c}
R_1^1 & R_2^1 & & & c=1\\
 & R_2^2 & R_3^2 & & c=2\\
 & & R_3^3 & R_4^3 & c=3\\
R_1^4 & & & R_4^4 & c=4\\
\hline
q=1 & q=2 & q=3 & q=4 & \mathcal{R}
\end{array}
\qquad (12)
$$
where $q=1$ and $q=3$ denote the two settings (axes) between which Alice chooses, $q=2$ and $q=4$ are the two settings between which Bob chooses, c indicates the combination of their choices, and $R_q^c$ are the dichotomous (spin-up/spin-down) variables. The consistified representation of the same experiment is (again, omitting the parentheses and commas in the indexation)
The interpretation of, say, the content $(3,2)$ here is as follows: it is the choice of axis 3 (that we know to be made by Alice) when Bob’s choice of his axis forms combination 2 with Alice’s choice (which we know to mean that Bob chooses axis 2). The interpretation of context $(\cdot,2)$ is that it is simply the set of contents whose second component is 2. Similarly, $(3,\cdot)$ is the set of contents whose first component is 3. The random variables within context $(\cdot,2)$ are jointly distributed by observation, whereas the random variables within context $(3,\cdot)$ are jointly distributed by computation (that, in turn, is uniquely determined by the observations). If $\mathsf{C}$ is defined in accordance with CbD, the bunch $R^{(3,\cdot)}$ is computed so that $R_{(3,2)}^{(3,\cdot)}\overset{d}{=}R_{(3,2)}^{(\cdot,2)}$, $R_{(3,3)}^{(3,\cdot)}\overset{d}{=}R_{(3,3)}^{(\cdot,3)}$ (consistent connectedness), and the probability of $R_{(3,2)}^{(3,\cdot)}=R_{(3,3)}^{(3,\cdot)}$ is maximal possible. In particular, if $R_3^2\overset{d}{=}R_3^3$, then $\Pr[R_{(3,2)}^{(3,\cdot)}=R_{(3,3)}^{(3,\cdot)}]=1$.
5.2. Hidden Variable Models
One possible argument against contextuality in inconsistently connected systems is that it is not distinguishable from inconsistent connectedness itself in the language of hidden variable models (HVMs). If, the argument goes, a consistently connected system $\mathcal{R}$ in (1) is noncontextual, it has a coupling S in which all random variables can be presented as
$$S_q^c=f(q,\Lambda),\qquad (14)$$
where $\Lambda$ is a “hidden” random variable [35]. If $\mathcal{R}$ is contextual, then all its couplings can only be presented as
$$S_q^c=f(q,c,\Lambda),\qquad (15)$$
with ineliminable c. However, the latter HVM representation is also required for all inconsistently connected systems, irrespective of whether they are $\mathsf{C}$-contextual or $\mathsf{C}$-noncontextual. We would argue in response that this only means that on this general level (merely showing the arguments of the functions), the language of HVMs is too crude to capture the subtler properties of the couplings, such as contextuality under inconsistent connectedness. However, even if one takes this issue as a matter of concern, it is eliminated by the consistification procedure. The system $\mathcal{R}^{\ddagger}$ corresponding to $\mathcal{R}$ is noncontextual if and only if it has a coupling $S^{\ddagger}$ such that, for all $q\prec c$,
$$S_{(q,c)}^{(\cdot,c)}=S_{(q,c)}^{(q,\cdot)}=g((q,c),\Lambda)\qquad (16)$$
for some random variable $\Lambda$. If $\mathcal{R}^{\ddagger}$ is contextual, then in all its couplings, $\Pr[S_{(q,c)}^{(\cdot,c)}\neq S_{(q,c)}^{(q,\cdot)}]>0$ for some $q\prec c$, which means that their HVM representations can only be different functions,
or, equivalently, the same function but with differently distributed hidden variables,
It is instructive to apply this to the EPR/Bohm systems $\mathcal{R}$ and $\mathcal{R}^{\ddagger}$ in (12) and (13). Here, contextuality is traditionally referred to as nonlocality because for the contextual system $\mathcal{R}$, all its couplings are represented in the form of (15): the ineliminable dependence on c here is interpreted as the dependence of a measurement on a remote setting. However, if one models the EPR/Bohm experiment by system $\mathcal{R}^{\ddagger}$ instead, the HVM representations (16) and (17) both contain the contextualized content $(q,c)$ as an argument. Following the logic above, they should both be considered nonlocal, even though one of them represents a noncontextual system and is equivalent to (14), while the other represents a contextual system and is equivalent to (15). It seems to us, in agreement with other authors [36], that this demonstration speaks against a naturalistic interpretation of the HVMs in terms of physical dependences.
5.3. The Existence and Uniqueness Constraint
In the definition of $\mathsf{C}$-couplings, their reducibility to identity couplings when applied to identically distributed variables is indispensable because without it, $\mathsf{C}$-contextuality would not be an extension of traditional contextuality. How critical, however, is the other constraint imposed on a well-fitting $\mathsf{C}$ (the first one in Definition 3), that the $\mathsf{C}$-coupling always exists and is unique? What if one considers a statement $\mathsf{C}$ for which the $\mathsf{C}$-couplings of a connection $\mathcal{R}_q$ form a set that may be empty or contain more than one coupling? This complicates matters conceptually because then, in the consistification procedure, the $(q,\cdot)$-type bunches, those filled with the $\mathsf{C}$-couplings, cannot be formed uniquely or cannot be formed at all. However, the main point of this paper can still be made, with some qualifications.
We can agree that the consistification of an inconsistently connected system $\mathcal{R}$ is not a single system but a cluster of systems, the elements of which are obtained by filling the $(q,\cdot)$-type bunches in the consistification of $\mathcal{R}$ by all possible $\mathsf{C}$-couplings of the connections of $\mathcal{R}$. We can further agree that the cluster is considered noncontextual if it contains a noncontextual system. In particular, if the cluster is empty (which means that a $\mathsf{C}$-coupling does not exist for at least one of the connections of $\mathcal{R}$), the latter definition is not satisfied, and the cluster should be considered contextual. Once again, we have a theory dealing with consistently connected systems only, except that the empirical or theoretical situations they depict are represented by clusters of systems sharing a format and the $(\cdot,c)$-bunches.
It might seem that dealing with an infinity of possible couplings or proving that the cluster is empty is a significantly more difficult mathematical task than when $\mathsf{C}$ is well-fitting. The complication, however, is not necessarily major. Mathematically, the problem of finding whether a system $\mathcal{R}$ is contextual consists in determining whether the variable S having the same format as $\mathcal{R}$ can be assigned a probability measure subject to certain constraints on its marginals. The constraints are imposed by the distributions of the bunches $R^c$ (which the bunches $S^c$ have to match) and by the statement $\mathsf{C}$ that has to be satisfied by the couplings $S_q$ of the connections $\mathcal{R}_q$. For discrete random variables and finite sets Q and C, this is a linear programming task, provided that the compliance with $\mathsf{C}$ can be presented in terms of linear inequalities on the probabilities in the distribution of $S_q$. For the consistification the problem is precisely the same, except that in place of connection couplings, one deals with $(q,\cdot)$-type bunches.
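To make the linear programming formulation concrete, the following sketch assembles the linear equalities for the well-fitting case, in which each connection coupling is prescribed as a single distribution. It is an illustration under our own assumptions rather than a procedure given in the text: all variables take values in a common finite set, the function names `coupling_constraints` and `satisfies` are hypothetical, and the 2×2 example data are made up. The unknown vector x holds the probabilities of all joint outcomes of S.

```python
from itertools import product


def coupling_constraints(index, bunches, couplings, values=(0, 1)):
    """Linear equalities that a coupling S of the system must satisfy.

    index:     list of pairs (q, c) labelling the variables of the system.
    bunches:   {c: {value tuple: probability}}, values ordered as the pairs
               with context c appear in index.
    couplings: {q: {value tuple: probability}}, the prescribed connection
               couplings, ordered as the pairs with content q appear in index.
    Returns (outcomes, rows); each row is (coefficients, right-hand side) over
    the unknown probabilities x of the joint outcomes, with x >= 0.
    """
    outcomes = list(product(values, repeat=len(index)))
    rows = []
    for axis, groups in ((1, bunches), (0, couplings)):
        for key, dist in groups.items():
            pos = [i for i, pair in enumerate(index) if pair[axis] == key]
            for vals in product(values, repeat=len(pos)):
                coeffs = [
                    1.0 if tuple(o[i] for i in pos) == vals else 0.0
                    for o in outcomes
                ]
                rows.append((coeffs, dist.get(vals, 0.0)))
    return outcomes, rows


def satisfies(x, rows, tol=1e-9):
    """Check whether a candidate nonnegative vector x meets every equality."""
    return all(
        abs(sum(a * xi for a, xi in zip(coeffs, x)) - rhs) <= tol
        for coeffs, rhs in rows
    )


# Made-up consistently connected example: perfectly correlated fair bits in
# both contexts, with identity couplings prescribed for both connections.
index = [(1, 1), (2, 1), (1, 2), (2, 2)]
half = {(0, 0): 0.5, (1, 1): 0.5}
outcomes, rows = coupling_constraints(index, {1: half, 2: half}, {1: half, 2: half})
x = [0.5 if o in ((0, 0, 0, 0), (1, 1, 1, 1)) else 0.0 for o in outcomes]
```

Deciding (non)contextuality then amounts to asking whether some nonnegative x satisfies all rows; the rows can be passed as equality constraints to any linear programming solver. When $\mathsf{C}$ admits several couplings per connection, the equalities for the connections would be replaced by whatever linear (in)equalities express compliance with $\mathsf{C}$.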