
On the Non-Adaptive Zero-Error Capacity of the Discrete Memoryless Two-Way Channel †

1 Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka 819-0395, Japan
2 Department of EE—Systems, Tel Aviv University, Tel Aviv 69978, Israel
* Authors to whom correspondence should be addressed.
† This work was presented in part at the 2019 IEEE International Symposium on Information Theory.
Entropy 2021, 23(11), 1518; https://doi.org/10.3390/e23111518
Submission received: 13 September 2021 / Revised: 21 October 2021 / Accepted: 22 October 2021 / Published: 15 November 2021
(This article belongs to the Special Issue Combinatorial Aspects of Shannon Theory)

Abstract:
We study the problem of communicating over a discrete memoryless two-way channel using non-adaptive schemes, under a zero probability of error criterion. We derive single-letter inner and outer bounds for the zero-error capacity region, based on random coding, linear programming, linear codes, and the asymptotic spectrum of graphs. Among other results, we provide a single-letter outer bound based on a combination of Shannon’s vanishing-error capacity region and a two-way analogue of the linear programming bound for point-to-point channels, which, in contrast to the one-way case, is generally better than both. Moreover, we establish an outer bound for the zero-error capacity region of a two-way channel via the asymptotic spectrum of graphs, and show that this bound can be achieved in certain cases.

1. Introduction

The problem of reliable communication over a discrete memoryless two-way channel (DM-TWC) was originally introduced and investigated by Shannon [1] in a seminal paper that marked the inception of multi-user information theory. A DM-TWC is characterized by a quadruple of finite input and output alphabets X_1, X_2, Y_1, Y_2, and a conditional probability distribution P_{Y_1,Y_2|X_1,X_2}(y_1, y_2 | x_1, x_2), where x_1 ∈ X_1, x_2 ∈ X_2, y_1 ∈ Y_1, y_2 ∈ Y_2. The channel is memoryless in the sense that channel uses are independent, that is, for any i,
$$P_{Y_{1i},Y_{2i} \mid X_1^i, X_2^i, Y_1^{i-1}, Y_2^{i-1}}(y_{1i}, y_{2i} \mid x_1^i, x_2^i, y_1^{i-1}, y_2^{i-1}) = P_{Y_1,Y_2 \mid X_1,X_2}(y_{1i}, y_{2i} \mid x_{1i}, x_{2i}).$$
In [1], Shannon provided inner and outer bounds for the vanishing-error capacity region of the DM-TWC, in the general setting where the users are allowed to adapt their transmissions on the fly based on past observations. We note that Shannon’s inner bound is tight for non-adaptive schemes, namely when the users map their messages to codewords in advance. The non-adaptive DM-TWC is also sometimes called the restricted DM-TWC [2]. Shannon’s inner and outer bounds have since been improved by utilizing auxiliary random variable techniques [3,4,5], and sufficient conditions under which his bounds coincide have been obtained [6,7]. However, despite much effort, the capacity region of a general DM-TWC under the vanishing-error criterion remains elusive. In fact, a strong indicator of the inherent difficulty of the problem is Blackwell’s binary multiplying channel, a simple, deterministic, common-output channel whose capacity remains unknown to this day [4,5,8,9,10].
In yet another seminal work, Shannon proposed and studied the zero-error capacity of the point-to-point discrete memoryless channel [2], also known as the Shannon capacity of a graph. This problem has been extensively studied by others, most notably in [11,12], yet remains generally unsolved. In this paper, we consider the problem of zero-error communication over a DM-TWC. We limit our discussion to non-adaptive schemes, for which the capacity region is known in the vanishing-error case [1]. Despite the evident difficulty of the problem (the point-to-point zero-error capacity is a special case), its two-way nature adds a new combinatorial dimension that renders it interesting to study. To the best of our knowledge, this problem has not been addressed before, except in the special case of the binary multiplying channel, where upper and lower bounds on the non-adaptive zero-error sum-capacity have been obtained [13,14,15]. Our bounds are partially based on generalizations of these ideas; an earlier short version of this work appeared in [16].
The problem of non-adaptive communication over a DM-TWC can be formulated as follows. Alice and Bob would like to simultaneously convey messages m_1 ∈ [2^{nR_1}] and m_2 ∈ [2^{nR_2}], respectively, to each other over n uses of the DM-TWC P_{Y_1,Y_2|X_1,X_2}. To that end, Alice maps her message to an input sequence (codeword) x_1^n ∈ X_1^n using an encoding function f_1 : [2^{nR_1}] → X_1^n, and Bob maps his message to an input sequence (codeword) x_2^n ∈ X_2^n using an encoding function f_2 : [2^{nR_2}] → X_2^n. We call the pair of codeword collections (f_1([2^{nR_1}]), f_2([2^{nR_2}])) a codebook pair. Note that the encoding functions depend only on the messages, and not on the outputs observed during transmission, hence the name non-adaptive. When the transmissions end, Alice and Bob observe the resulting (random) channel output sequences Y_1^n and Y_2^n, respectively, and attempt to decode the message sent by their counterpart, without error. When this is possible, that is, when there exist decoding functions ϕ_1 : [2^{nR_1}] × Y_1^n → [2^{nR_2}] and ϕ_2 : [2^{nR_2}] × Y_2^n → [2^{nR_1}] such that m_2 = ϕ_1(m_1, Y_1^n) and m_1 = ϕ_2(m_2, Y_2^n), for all m_1, m_2, with probability one, then the codebook pair (or the pair of encoding functions) is called (n, R_1, R_2) uniquely decodable. A rate pair (R_1, R_2) is achievable for the DM-TWC if an (n, R_1, R_2) uniquely decodable code exists for some n. The non-adaptive zero-error capacity region of a DM-TWC P_{Y_1,Y_2|X_1,X_2} is the closure of the set of all achievable rate pairs, and is denoted here by C_ze(P_{Y_1,Y_2|X_1,X_2}). Moreover, the non-adaptive zero-error sum-capacity of a DM-TWC P_{Y_1,Y_2|X_1,X_2}, denoted by C_ze^sum(P_{Y_1,Y_2|X_1,X_2}), is the supremum of the sum-rate R_1 + R_2 over all achievable rate pairs.
The main objective of this paper is to provide several single-letter outer and inner bounds on the non-adaptive zero-error capacity region of the DM-TWC. The remainder of the paper is organized as follows. In Section 2, we provide the necessary mathematical preliminaries, discussing in particular the characterization of the zero-error DM-TWC capacity via confusion graphs, its behavior under graph homomorphisms, and one-shot zero-error communication. Section 3 is devoted to three general outer bounds for the zero-error capacity region of the DM-TWC, based on Shannon’s vanishing-error non-adaptive capacity region, a two-way analogue of the linear programming bound for point-to-point channels, and the Shannon capacity of a graph. In Section 4, we provide two general inner bounds, using random coding and random linear codes, respectively. In Section 5, we establish outer bounds for certain types of DM-TWCs via the asymptotic spectra of graphs, and explicitly construct uniquely decodable codebook pairs achieving the outer bound. Some concluding remarks appear in Section 6.

2. Preliminaries

2.1. Shannon Capacity of a Graph

Let G = (V, E) be a graph with vertex set V and edge set E. Two vertices v_1, v_2 are adjacent, denoted v_1 ∼ v_2, if there is an edge between them, that is, {v_1, v_2} ∈ E. An independent set in G is a subset of pairwise non-adjacent vertices. A maximum independent set is an independent set with the largest possible number of vertices; its size is called the independence number of G, denoted α(G). The complement of a graph G, denoted G̅, is a graph with the same vertex set, in which two distinct vertices are adjacent if and only if they are not adjacent in G. We write K_n and K̅_n for the complete graph (containing all possible edges) and the empty graph (containing no edges) on n vertices, respectively.
Let G = (V(G), E(G)) and H = (V(H), E(H)) be two graphs. The strong product (or normal product) G ⊠ H of the graphs G and H is a graph such that
(1)
the vertex set of G ⊠ H is the Cartesian product V(G) × V(H);
(2)
two vertices (u, u′) and (v, v′) are adjacent if and only if one of the following holds: (a) u = v and u′ ∼ v′; (b) u ∼ v and u′ = v′; (c) u ∼ v and u′ ∼ v′.
The n-fold strong product of a graph G with itself is denoted by G^{⊠n}. The Shannon capacity of a graph G was defined in [2] to be:
$$\Theta(G) \triangleq \sup_{n} \frac{1}{n} \log \alpha\!\left(G^{\boxtimes n}\right) = \lim_{n \to \infty} \frac{1}{n} \log \alpha\!\left(G^{\boxtimes n}\right),$$
where the limit exists by Fekete’s lemma. We note that throughout the paper all logarithms are taken to base 2.
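To make these definitions concrete, here is a small Python sketch (our own illustration, not part of the paper; the graph encoding and helper names are ours) that computes α(C_5) and α(C_5 ⊠ C_5) for the pentagon by brute force, recovering the classical lower bound Θ(C_5) ≥ (1/2) log 5 that reappears in Example 3.

```python
from itertools import combinations, product

# Pentagon C5: vertices 0..4, edges between cyclically consecutive vertices.
C5 = {frozenset((i, (i + 1) % 5)) for i in range(5)}

def adjacent(x, y, edges):
    # Adjacency in the strong product: distinct tuples are adjacent iff every
    # coordinate pair is equal or adjacent in the base graph.
    return x != y and all(a == b or frozenset((a, b)) in edges for a, b in zip(x, y))

def alpha(edges, n_base, power, k_max):
    # Brute-force independence number of G^{boxtimes power} (toy sizes only).
    verts = list(product(range(n_base), repeat=power))
    for k in range(k_max, 0, -1):
        for cand in combinations(verts, k):
            if all(not adjacent(x, y, edges) for x, y in combinations(cand, 2)):
                return k
    return 0

print(alpha(C5, 5, 1, 3))  # 2 = alpha(C5)
print(alpha(C5, 5, 2, 6))  # 5 = alpha(C5 x C5), so Theta(C5) >= (1/2) log 5
```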
The disjoint union G ⊔ H of the graphs G and H is a graph such that V(G ⊔ H) = V(G) ⊔ V(H) and E(G ⊔ H) = E(G) ⊔ E(H). A graph homomorphism from G to H, denoted by G → H, is a mapping φ : V(G) → V(H) such that if g_1 ∼ g_2 in G, then φ(g_1) ∼ φ(g_2) in H. We write G ⩽ H if there exists a graph homomorphism G̅ → H̅ from the complement of G to the complement of H.
In [17], Zuiddam introduced the notion of the asymptotic spectrum of graphs and, by applying Strassen’s theory of asymptotic spectra, provided a dual characterisation of the Shannon capacity of a graph. The asymptotic spectrum includes the Lovász theta number [12], the fractional clique cover number, the fractional orthogonal rank of the complement [18], and the fractional Haemers bound over any field [11,19,20] as specific elements (also called spectral points).
Theorem 1
([17]). Let 𝒢 be a collection of graphs that is closed under the disjoint union ⊔ and the strong product ⊠, and that contains the single-vertex graph K_1. Define the asymptotic spectrum Δ(𝒢) as the set of all mappings η : 𝒢 → R_{≥0} such that for all G, H ∈ 𝒢:
(1) 
if G ⩽ H, then η(G) ≤ η(H);
(2) 
η(G ⊔ H) = η(G) + η(H);
(3) 
η(G ⊠ H) = η(G) · η(H);
(4) 
η(K_1) = 1.
Then Θ(G) = inf_{η ∈ Δ(𝒢)} log η(G). In other words, inf_{η ∈ Δ(𝒢)} η(G) = 2^{Θ(G)} and α(G) ≤ inf_{η ∈ Δ(𝒢)} η(G).
As remarked in [17], 2^{Θ(·)} is in general not an element of Δ(𝒢). In fact, 2^{Θ(·)} is not additive under ⊔ by a result of Alon [21], and not multiplicative under ⊠ by a result of Haemers [11]. In Section 3.3, to derive an outer bound for the zero-error capacity of a DM-TWC, we will employ the multiplicativity of η(G) under the ⊠ operation for η ∈ Δ(𝒢).

2.2. Confusion Graphs of Channels

In this subsection, we characterize the zero-error capacity of a discrete memoryless point-to-point channel, as well as the zero-error capacity region of a DM-TWC, in terms of suitably defined graphs. The point-to-point characterization is well known and goes back to Shannon [2], and the DM-TWC case is a natural generalization thereof.
A discrete memoryless point-to-point channel consists of a finite input alphabet X, a finite output alphabet Y, and a conditional probability distribution P_{Y|X}(y|x), where x ∈ X, y ∈ Y. The channel is memoryless in the sense that P_{Y_i|X^i,Y^{i−1}}(y_i | x^i, y^{i−1}) = P_{Y|X}(y_i | x_i) for the ith channel use. Suppose that a transmitter would like to convey a message m ∈ [2^{nR}] to a receiver over the channel. To that end, the transmitter sends an input sequence x^n ∈ X^n using an encoding function f : [2^{nR}] → X^n, and the receiver, after observing the corresponding channel outputs y^n ∈ Y^n, guesses the message using a decoding function ϕ : Y^n → [2^{nR}]. The pair (f, ϕ) is called an (n, R) code, and such a code is called uniquely decodable if m = ϕ(y^n) holds for any m ∈ [2^{nR}] and any correspondingly possible y^n. A rate R is called achievable if an (n, R) uniquely decodable code exists for some n. The zero-error capacity of the channel is defined as the supremum of all achievable rates.
A channel P_{Y|X} is associated with a confusion graph G, whose vertex set is the input alphabet X, and in which two vertices x, x′ ∈ X are adjacent, denoted x ∼ x′, if and only if there exists y ∈ Y that is possible under both of them, that is, such that P_{Y|X}(y|x) > 0 and P_{Y|X}(y|x′) > 0. It is easy to verify that C is an (n, R) uniquely decodable code if and only if C is an independent set of the graph G^{⊠n}, the n-fold strong product of G. Consequently, the zero-error capacity of a point-to-point channel is equal to the Shannon capacity of its confusion graph G. Note that there are infinitely many distinct channels with the same confusion graph, and all of these channels have the same zero-error capacity.
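As an illustration (our own sketch, not from the paper), the confusion graph can be computed directly from a channel matrix; the noisy-typewriter channel below is a standard example whose confusion graph is the pentagon C_5.

```python
import numpy as np

def confusion_graph(P):
    """Edges of the confusion graph of a point-to-point channel.

    P is an |X| x |Y| row-stochastic matrix with P[x, y] = P(y|x); inputs
    x, x' are adjacent iff some output is possible under both."""
    edges = set()
    for x in range(P.shape[0]):
        for xp in range(x + 1, P.shape[0]):
            if np.any((P[x] > 0) & (P[xp] > 0)):
                edges.add((x, xp))
    return edges

# Noisy typewriter on 5 inputs: input x hits outputs x and x+1 (mod 5).
P = np.zeros((5, 5))
for x in range(5):
    P[x, x] = P[x, (x + 1) % 5] = 0.5
print(sorted(confusion_graph(P)))  # [(0, 1), (0, 4), (1, 2), (2, 3), (3, 4)]
```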
We now proceed to similarly associate a DM-TWC with a collection of confusion graphs, which will then be shown to characterize its zero-error capacity region. To that end, note that when Alice sends a letter x_1 ∈ X_1, the resulting channel from Bob back to Alice at that same instant is the point-to-point channel P_{Y_1 | X_1 = x_1, X_2}. This channel is associated with a confusion graph G_{x_1}, whose vertex set is X_2 and in which two vertices x_2, x_2′ ∈ X_2 are adjacent, denoted in this case by x_2 ∼_{x_1} x_2′, if and only if there exists some y_1 ∈ Y_1 such that both
$$P_{Y_1|X_1,X_2}(y_1 \mid x_1, x_2) > 0, \qquad P_{Y_1|X_1,X_2}(y_1 \mid x_1, x_2') > 0,$$
where
$$P_{Y_1|X_1,X_2}(y_1 \mid x_1, x_2) \triangleq \sum_{y_2 \in Y_2} P_{Y_1,Y_2|X_1,X_2}(y_1, y_2 \mid x_1, x_2).$$
Symmetrically, when Bob sends a letter x_2 ∈ X_2, the resulting channel from Alice to Bob at that same instant is associated with a confusion graph H_{x_2}, whose vertex set is X_1 and in which two vertices x_1, x_1′ ∈ X_1 are adjacent, denoted in this case by x_1 ∼_{x_2} x_1′, if and only if there exists some y_2 ∈ Y_2 such that both
$$P_{Y_2|X_1,X_2}(y_2 \mid x_1, x_2) > 0, \qquad P_{Y_2|X_1,X_2}(y_2 \mid x_1', x_2) > 0,$$
where
$$P_{Y_2|X_1,X_2}(y_2 \mid x_1, x_2) \triangleq \sum_{y_1 \in Y_1} P_{Y_1,Y_2|X_1,X_2}(y_1, y_2 \mid x_1, x_2).$$
Based on the foregoing discussion, a DM-TWC P_{Y_1,Y_2|X_1,X_2} can be decomposed into a collection of discrete memoryless point-to-point channels, and is hence associated with a corresponding collection of confusion graphs, denoted by [G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}], where V(G_1) = ⋯ = V(G_{|X_1|}) = X_2 and V(H_1) = ⋯ = V(H_{|X_2|}) = X_1. The following useful observation is immediate, and in particular shows that the zero-error capacity region of a DM-TWC is a function of its confusion graphs only. Thus, from here on, we will sometimes identify the channel with its collection of confusion graphs.
Proposition 1.
Consider a DM-TWC P_{Y_1,Y_2|X_1,X_2} associated with the collection of confusion graphs [G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}]. A codebook pair (A, B) is uniquely decodable for P_{Y_1,Y_2|X_1,X_2} if and only if, for every a^n = (a_1, …, a_n) ∈ A, the set B is an independent set of G_{a_1} ⊠ ⋯ ⊠ G_{a_n}, and for every b^n = (b_1, …, b_n) ∈ B, the set A is an independent set of H_{b_1} ⊠ ⋯ ⊠ H_{b_n}.
In particular, we see that the capacity region C_ze(P_{Y_1,Y_2|X_1,X_2}) depends only on the corresponding confusion graphs [G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}]. Hence, in the sequel, we will write C_ze([G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}]) and C_ze^sum([G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}]) for C_ze(P_{Y_1,Y_2|X_1,X_2}) and C_ze^sum(P_{Y_1,Y_2|X_1,X_2}), respectively. We will also often identify the channel with its confusion graphs and refer to it as [{G_i}; {H_j}] when this is clear from the context. This leads to the following immediate observation, analogous to the point-to-point case.
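Proposition 1 translates directly into a finite check. The following Python sketch (ours; the data layout with per-symbol edge sets is an assumption, not the paper’s notation) verifies unique decodability of a codebook pair by testing the two independence conditions. It is exponential in the codebook sizes but sufficient for small examples such as Example 2 below.

```python
from itertools import combinations

def separated(u, v, graphs):
    # u, v are non-adjacent in the strong product of `graphs` iff there is a
    # coordinate where they differ and are not adjacent in that coordinate's graph.
    return any(x != y and frozenset((x, y)) not in g for x, y, g in zip(u, v, graphs))

def uniquely_decodable(A, B, G, H):
    """Proposition 1: B independent in G_{a_1} x ... x G_{a_n} for every a in A,
    and A independent in H_{b_1} x ... x H_{b_n} for every b in B.
    G[x1], H[x2] are edge sets (frozensets of endpoint pairs)."""
    return (all(separated(b, bp, [G[x1] for x1 in a])
                for a in A for b, bp in combinations(B, 2)) and
            all(separated(a, ap, [H[x2] for x2 in b])
                for b in B for a, ap in combinations(A, 2)))
```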
Proposition 2.
If P_{Y_1,Y_2|X_1,X_2} and Q_{Y_1,Y_2|X_1,X_2} have the same confusion graphs, up to a relabeling of input symbols, then C_ze(P_{Y_1,Y_2|X_1,X_2}) = C_ze(Q_{Y_1,Y_2|X_1,X_2}).
This further immediately implies:
Proposition 3.
C_ze(P_{Y_1,Y_2|X_1,X_2}) depends only on the conditional marginal distributions P_{Y_1|X_1,X_2} and P_{Y_2|X_1,X_2}.
The strong product of two DM-TWCs [G_1, …, G_{|X_1|}; H_1, …, H_{|X_2|}] and [G′_1, …, G′_{|X′_1|}; H′_1, …, H′_{|X′_2|}], denoted by [{G_i}; {H_j}] ⊠ [{G′_i}; {H′_j}], refers to a DM-TWC having input alphabets X_1 × X′_1 and X_2 × X′_2, as well as confusion graphs
$$\left[\, \{ G_i \boxtimes G'_{i'} : i \in X_1,\ i' \in X'_1 \};\ \{ H_j \boxtimes H'_{j'} : j \in X_2,\ j' \in X'_2 \} \,\right].$$
Considering the zero-error sum-capacity with respect to the strong product, we have the lemma below.
Lemma 1.
$$C_{ze}^{sum}\!\left( [\{G_i\}; \{H_j\}] \boxtimes [\{G'_i\}; \{H'_j\}] \right) \ge C_{ze}^{sum}\!\left( [\{G_i\}; \{H_j\}] \right) + C_{ze}^{sum}\!\left( [\{G'_i\}; \{H'_j\}] \right).$$
Proof. 
To prove this lemma, it is sufficient to prove that, for any (n, R_1, R_2) (resp. (n, R′_1, R′_2)) uniquely decodable codebook pair (A, B) (resp. (A′, B′)) for the channel [{G_i}; {H_j}] (resp. [{G′_i}; {H′_j}]), there exists an (n, R_1 + R′_1, R_2 + R′_2) uniquely decodable codebook pair for the associated product channel [{G_i}; {H_j}] ⊠ [{G′_i}; {H′_j}]. To that end, let
$$A^* = \{ ((a_1, a'_1), \ldots, (a_n, a'_n)) : a^n \in A,\ a'^n \in A' \}, \qquad B^* = \{ ((b_1, b'_1), \ldots, (b_n, b'_n)) : b^n \in B,\ b'^n \in B' \}.$$
It is easy to verify that (A*, B*) is uniquely decodable for the product channel. Moreover, |A*| = |A||A′| = 2^{n(R_1 + R′_1)} and |B*| = |B||B′| = 2^{n(R_2 + R′_2)}. The lemma follows. □
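The coordinate-interleaving used in this proof is easy to state in code; here is a minimal sketch (ours) of the construction of (A*, B*).

```python
def product_codebooks(A, Ap, B, Bp):
    # Interleave per coordinate: the i-th symbol of a product codeword is the
    # pair (a_i, a'_i), exactly as in the proof of Lemma 1.
    A_star = [tuple(zip(a, ap)) for a in A for ap in Ap]
    B_star = [tuple(zip(b, bp)) for b in B for bp in Bp]
    return A_star, B_star

# |A*| = |A||A'| and |B*| = |B||B'|, so the sum-rates add up.
```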

2.3. Dual Graph Homomorphisms

In this subsection, we study the behavior of the zero-error capacity region of a DM-TWC under graph homomorphisms, generalizing a similar analysis for the point-to-point channel [2]. Let [{G_i}; {H_j}] and [{G′_i}; {H′_j}] be two collections of confusion graphs corresponding to two DM-TWCs such that V(G_i) = V(G), V(H_j) = V(H), V(G′_i) = V(G′) and V(H′_j) = V(H′). A dual graph homomorphism from [{G_i}; {H_j}] to [{G′_i}; {H′_j}], denoted by [{G_i}; {H_j}] → [{G′_i}; {H′_j}], is a pair of mappings (φ, ψ), where φ : V(H) → V(H′) and ψ : V(G) → V(G′), such that
(1)
if v_1 ∼ v_2 in G_i, then ψ(v_1) ∼ ψ(v_2) in G′_{φ(i)}; and
(2)
if u_1 ∼ u_2 in H_j, then φ(u_1) ∼ φ(u_2) in H′_{ψ(j)}.
It is easy to see that the dual graph homomorphism is a natural generalization of the standard graph homomorphism between two graphs, in the sense that both are adjacency preserving. We write [{G_i}; {H_j}] ⩽ [{G′_i}; {H′_j}] if there exists a dual graph homomorphism from [{G̅_i}; {H̅_j}] to [{G̅′_i}; {H̅′_j}]. Then:
Lemma 2.
If [{G_i}; {H_j}] ⩽ [{G′_i}; {H′_j}], and the complements G̅′_i and H̅′_j do not have self-loops, then
$$C_{ze}\!\left( [\{G_i\}; \{H_j\}] \right) \subseteq C_{ze}\!\left( [\{G'_i\}; \{H'_j\}] \right).$$
Proof. 
Suppose (φ, ψ) : [{G̅_i}; {H̅_j}] → [{G̅′_i}; {H̅′_j}] and (A, B) is a uniquely decodable codebook pair of length n for the DM-TWC [{G_i}; {H_j}]. Let
$$\Phi(A) = \{ \varphi(a^n) = (\varphi(a_1), \ldots, \varphi(a_n)) : a^n \in A \}, \qquad \Psi(B) = \{ \psi(b^n) = (\psi(b_1), \ldots, \psi(b_n)) : b^n \in B \}.$$
We now show that (Φ(A), Ψ(B)) is a uniquely decodable codebook pair for the DM-TWC [{G′_i}; {H′_j}]. To that end, it suffices to show that for any distinct a^n, ã^n ∈ A and distinct b^n, b̃^n ∈ B, we have
$$\varphi(a^n) \not\sim \varphi(\tilde{a}^n) \ \text{in}\ H'_{\psi(b_1)} \boxtimes \cdots \boxtimes H'_{\psi(b_n)}, \qquad \psi(b^n) \not\sim \psi(\tilde{b}^n) \ \text{in}\ G'_{\varphi(a_1)} \boxtimes \cdots \boxtimes G'_{\varphi(a_n)}. \qquad (1)$$
Indeed, since (A, B) is a uniquely decodable codebook pair, there exist coordinates i, j ∈ [n] such that a_i ∼ ã_i in H̅_{b_i} and b_j ∼ b̃_j in G̅_{a_j}. By the definition of (φ, ψ), we have that φ(a_i) ∼ φ(ã_i) in H̅′_{ψ(b_i)} and ψ(b_j) ∼ ψ(b̃_j) in G̅′_{φ(a_j)}, implying (1). It is also evident that |Φ(A)| = |A| and |Ψ(B)| = |B|. The lemma now follows by taking the union over all uniquely decodable codebook pairs (A, B) for [{G_i}; {H_j}]. □

2.4. One-Shot Zero-Error Communication

In this subsection, we consider the problem of zero-error communication over a DM-TWC with only a single channel use by the two parties (i.e., n = 1 ). We refer to the associated set of achievable rate pairs as the one-shot zero-error capacity region, and the associated sum-rate as the one-shot zero-error sum-capacity. Recall that the one-shot zero-error capacity of a point-to-point channel is simply the logarithm of the independence number of its confusion graph; this quantity yields a lower bound on the zero-error capacity of the channel, and also provides an infinite-letter expression for the capacity when evaluated over the product graph. It is therefore interesting to study the analogue of the independence number in the two-way case, which in particular would yield an inner bound on the zero-error capacity region of the DM-TWC. For simplicity of exposition, we will focus here on the one-shot zero-error sum-capacity only.
For convenience, we first define some notions. Let [{G_i}; {H_j}] be a DM-TWC such that V(G_i) = X_2 and V(H_j) = X_1. A pair (S, T) of subsets S ⊆ X_1 and T ⊆ X_2 is called a dual clique pair of the DM-TWC if t ∼_s t′ for every s ∈ S and all distinct t, t′ ∈ T, and s ∼_t s′ for every t ∈ T and all distinct s, s′ ∈ S; that is, S is a clique in each H_t for t ∈ T, and T is a clique in each G_s for s ∈ S. A pair (S, T) of subsets S ⊆ X_1 and T ⊆ X_2 is called a dual independent pair of the DM-TWC if T is an independent set of the graph G_s for each s ∈ S, and S is an independent set of the graph H_t for each t ∈ T. A maximum dual independent pair is a dual independent pair (S, T) with the largest possible product of sizes |S||T|; this product is called the independence product of [{G_i}; {H_j}], denoted by π({G_i}; {H_j}). By definition, the one-shot zero-error sum-capacity of the DM-TWC is log π({G_i}; {H_j}). It is also readily seen that if two channels have the same confusion graphs up to a relabeling of input symbols, then they have the same collections of dual clique pairs and dual independent pairs, and hence the same one-shot zero-error sum-capacity.
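Since the one-shot sum-capacity is log π, the independence product of a small channel can be found by exhaustive search. The sketch below (ours; exponential in the alphabet sizes) enumerates all candidate pairs (S, T).

```python
from itertools import chain, combinations

def nonempty_subsets(X):
    return chain.from_iterable(combinations(X, k) for k in range(1, len(X) + 1))

def independence_product(X1, X2, G, H):
    # pi({G_i};{H_j}): max |S||T| over dual independent pairs, i.e. T independent
    # in every G_s (s in S) and S independent in every H_t (t in T).
    best = 0
    for S in nonempty_subsets(X1):
        for T in nonempty_subsets(X2):
            if all(frozenset(e) not in G[s] for s in S for e in combinations(T, 2)) \
               and all(frozenset(e) not in H[t] for t in T for e in combinations(S, 2)):
                best = max(best, len(S) * len(T))
    return best
```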
For two graphs G_1 and G_2, let G_1 ∪ G_2 denote the union of G_1 and G_2, with V(G_1 ∪ G_2) = V(G_1) ∪ V(G_2) and E(G_1 ∪ G_2) = E(G_1) ∪ E(G_2). Notice that the disjoint union ⊔ of Section 2.1 is the special case of the union ∪ in which the vertex sets of G_1 and G_2 are disjoint. For notational convenience, in the rest of this subsection we let |X_1| = m_1 and |X_2| = m_2. The following simple observations are now in order.
Proposition 4.
Suppose (S, T) is a dual independent pair of [G_1, …, G_{m_1}; H_1, …, H_{m_2}]. Then:
(1) 
If |S| = 1, then |T| ≤ max_{1≤i≤m_1} α(G_i). Equality holds by taking S = {s} and T a maximum independent set of G_s, where s ∈ argmax_{1≤i≤m_1} α(G_i).
(2) 
|S| ≤ min_{t∈T} α(H_t).
(3) 
S is an independent set of ∪_{t∈T} H_t.
Proof. 
The results follow directly from the definition of dual independent pairs. □
Lemma 3.
Let [G_1, …, G_{m_1}; H_1, …, H_{m_2}] be a DM-TWC, and let G, H be graphs such that V(G) = X_2 and V(H) = X_1. Then:
(1) 
max{ max_{1≤i≤m_1} α(G_i), max_{1≤j≤m_2} α(H_j) } ≤ π(G_1, …, G_{m_1}; H_1, …, H_{m_2}) ≤ max_{1≤i≤m_1} α(G_i) · max_{1≤j≤m_2} α(H_j).
(2) 
π(G, …, G; H, …, H) = α(G) α(H).
(3) 
π(K̅_{m_2}, G, …, G; K̅_{m_1}, H, …, H) = max{ α(G) α(H), m_1, m_2 }.
(4) 
π(G_1, …, G_{m_1}; K_{m_1}, …, K_{m_1}) = max_{1≤i≤m_1} α(G_i).
Proof. 
(1) The lower bound follows from Proposition 4 (1) and the symmetry of S and T. From Proposition 4 (2), we have
$$|S| \le \min_{t \in T} \alpha(H_t) \le \max_{1 \le j \le m_2} \alpha(H_j), \qquad |T| \le \min_{s \in S} \alpha(G_s) \le \max_{1 \le i \le m_1} \alpha(G_i),$$
yielding the upper bound.
(2) From claim (1) above, we have π(G, …, G; H, …, H) ≤ α(G) α(H). Equality holds by taking S and T to be maximum independent sets of H and G, respectively.
(3) From claims (1) and (2) above, we have
$$\pi(\bar{K}_{m_2}, G, \ldots, G;\ \bar{K}_{m_1}, H, \ldots, H) \ge \max\{ \alpha(G)\alpha(H),\ m_1,\ m_2 \}.$$
On the other hand, suppose (S, T) is a dual independent pair. There are three cases: (i) If |S| = 1, then by Proposition 4, claim (1), we have |S||T| ≤ m_2. (ii) If |T| = 1, then similarly to case (i), we have |S||T| ≤ m_1. (iii) If |S| ≥ 2 and |T| ≥ 2, then by Proposition 4, claim (2), we obtain |S||T| ≤ α(G) α(H). Thus π(K̅_{m_2}, G, …, G; K̅_{m_1}, H, …, H) ≤ max{ α(G)α(H), m_1, m_2 }.
(4) is a direct consequence of claim (1) above. The lemma follows. □
By graph homomorphisms we immediately have:
Proposition 5.
If [{G_i}; {H_j}] ⩽ [{G′_i}; {H′_j}], then
$$\pi(\{G_i\}; \{H_j\}) \le \pi(\{G'_i\}; \{H'_j\}).$$
Next, we shall provide an upper bound on π({G_i}; {H_j}) via a generalization of the Lovász theta number [12]. Let Γ be an arbitrary (m_1 + m_2) × (m_1 + m_2) positive semi-definite matrix (i.e., Γ ⪰ 0), with (i, j)th entry Γ_{i,j}. Let J be the m_1 × m_2 all-one matrix, and I_n the n × n identity matrix. For matrices A and B, denote ⟨A, B⟩ = trace(AB), and let A^T denote the transpose of A. Now define ρ({G_i}, {H_j}) as the value of the program
$$\begin{aligned} \text{maximize}\quad & \left\langle \begin{pmatrix} 0 & J \\ J^T & 0 \end{pmatrix}, \Gamma \right\rangle \\ \text{subject to}\quad & \left\langle \begin{pmatrix} I_{m_1} & 0 \\ 0 & 0 \end{pmatrix}, \Gamma \right\rangle = 1, \qquad \left\langle \begin{pmatrix} 0 & 0 \\ 0 & I_{m_2} \end{pmatrix}, \Gamma \right\rangle = 1, \\ & \Gamma_{i,\,j+m_1} \cdot \Gamma_{i,\,k+m_1} = 0, \quad i \in X_1,\ j,k \in X_2,\ j \neq k,\ j \sim k \text{ in } G_i, \\ & \Gamma_{i+m_1,\,j} \cdot \Gamma_{i+m_1,\,k} = 0, \quad i \in X_2,\ j,k \in X_1,\ j \neq k,\ j \sim k \text{ in } H_i, \\ & \Gamma_{i,\,k+m_1} \cdot \Gamma_{j,\,k+m_1} = 0, \quad i,j \in X_1,\ k \in X_2,\ i \neq j,\ i \sim j \text{ in } H_k, \\ & \Gamma_{i+m_1,\,k} \cdot \Gamma_{j+m_1,\,k} = 0, \quad i,j \in X_2,\ k \in X_1,\ i \neq j,\ i \sim j \text{ in } G_k, \\ & \Gamma \succeq 0. \end{aligned} \qquad (2)$$
Lemma 4.
$$\pi(\{G_i\}, \{H_j\}) \le \left( \tfrac{1}{2}\, \rho(\{G_i\}, \{H_j\}) \right)^{2}.$$
Proof. 
Suppose (S, T), with S ⊆ X_1 and T ⊆ X_2, is a maximum dual independent pair, so that |S||T| = π({G_i}, {H_j}). For a number m and a set S, denote m + S = {m + s : s ∈ S}. Let Γ be the (m_1 + m_2) × (m_1 + m_2) matrix with
$$\Gamma_{i,j} = \begin{cases} \dfrac{1}{|S|}, & \text{if } i \in S,\ j \in S, \\[4pt] \dfrac{1}{\sqrt{|S||T|}}, & \text{if } i \in S,\ j \in m_1 + T, \ \text{or}\ i \in m_1 + T,\ j \in S, \\[4pt] \dfrac{1}{|T|}, & \text{if } i \in m_1 + T,\ j \in m_1 + T, \\[4pt] 0, & \text{otherwise}. \end{cases}$$
Notice that for any vector x^{m_1+m_2} = (x_1, …, x_{m_1+m_2}) we have
$$x^{m_1+m_2} \cdot \Gamma \cdot \left( x^{m_1+m_2} \right)^{T} = \left( \frac{1}{\sqrt{|S|}} \sum_{i \in S} x_i + \frac{1}{\sqrt{|T|}} \sum_{j \in T} x_{m_1 + j} \right)^{2} \ge 0.$$
This shows that Γ is a positive semi-definite matrix satisfying the equality constraints in (2). Accordingly, Γ is a feasible solution of program (2), and
$$\rho(\{G_i\}, \{H_j\}) \ge \left\langle \begin{pmatrix} 0 & J \\ J^T & 0 \end{pmatrix}, \Gamma \right\rangle = 2\sqrt{|S||T|} = 2\sqrt{\pi(\{G_i\}, \{H_j\})},$$
implying the result. This completes the proof. □

2.5. Information-Theoretic Notations

We recall some standard information-theoretic quantities that will be used in the sequel. Let X, Y be two discrete random variables taking values in sets X, Y according to a joint probability distribution P_{XY}. Let P_X denote the marginal distribution of X, where P_X(x) = Σ_{y∈Y} P_{XY}(x, y), and let P_Y denote the marginal distribution of Y, defined similarly. The Shannon entropy of X is H(X) = −Σ_{x∈X} P_X(x) log P_X(x). In particular, the binary entropy function is h(x) = −x log x − (1 − x) log (1 − x), where 0 ≤ x ≤ 1. The conditional entropy of X given Y is H(X|Y) = −Σ_{x,y} P_{XY}(x, y) log [ P_{XY}(x, y) / P_Y(y) ]. The mutual information between X and Y is I(X; Y) = H(X) − H(X|Y), and the conditional mutual information of X, Y given another random variable Z is I(X; Y | Z) = H(X|Z) − H(X|Y, Z). The following basic properties will be used in the arguments that follow.
Proposition 6.
(1) H(X) ≥ 0, I(X; Y) ≥ 0. (Non-negativity)
(2) H(X|Y) ≤ H(X), I(X; Y | Z) ≤ I(X; Y). (Conditioning reduces entropy)
(3) H(X_1, X_2, …, X_n) = Σ_{i=1}^{n} H(X_i | X_1, …, X_{i−1}). (Entropy chain rule)
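For the numerical examples later on, the quantities above are straightforward to compute from a joint distribution; the helper sketch below (ours) does so in base 2.

```python
import numpy as np

def entropy(p):
    # H(X) = -sum p log2 p, with the convention 0 log 0 = 0.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(Pxy):
    # I(X;Y) = H(X) + H(Y) - H(X,Y), equivalent to H(X) - H(X|Y).
    return entropy(Pxy.sum(axis=1)) + entropy(Pxy.sum(axis=0)) - entropy(Pxy)
```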

3. Outer Bounds

In this section, we provide single-letter outer bounds for the non-adaptive zero-error capacity region of the DM-TWC. First in Section 3.1, we present two simple outer bounds, one based on Shannon’s vanishing-error non-adaptive capacity region and the other on a two-way analogue of the linear programming bound for point-to-point channels. Next in Section 3.2, we combine the two bounds given in Section 3.1 and obtain an outer bound that is generally better than both. Finally, in Section 3.3 we derive another single-letter outer bound via the asymptotic spectra of graphs.

3.1. Simple Bounds

It is trivial to see that Shannon’s vanishing-error non-adaptive capacity region of the DM-TWC ([1], Theorem 3) contains its zero-error counterpart. First recall Shannon’s bound in [1].
Lemma 5
([1]). The vanishing-error non-adaptive capacity region of a DM-TWC P Y 1 , Y 2 | X 1 , X 2 is the convex hull of the set:
$$\bigcup_{P_{X_1}, P_{X_2}} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ R_1 \le I(X_1; Y_2 \mid X_2),\ R_2 \le I(X_2; Y_1 \mid X_1) \right\},$$
where the union is taken over all product input probability distributions P X 1 × P X 2 .
Together with Proposition 2, this immediately yields the following outer bound.
Lemma 6.
C_ze(P_{Y_1,Y_2|X_1,X_2}) is contained in
$$\bigcap_{Q_{Y_1,Y_2|X_1,X_2}}\ \bigcap_{0 \le \lambda \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \lambda R_1 + (1-\lambda) R_2 \le \max_{P_{X_1}, P_{X_2}} \epsilon(\lambda) \right\}, \qquad (3)$$
where
$$\epsilon(\lambda) \triangleq \lambda\, I(X_1; Y_2 \mid X_2) + (1-\lambda)\, I(X_2; Y_1 \mid X_1). \qquad (4)$$
The first intersection is taken over all DM-TWCs Q_{Y_1,Y_2|X_1,X_2} with the same adjacency as P_{Y_1,Y_2|X_1,X_2}, and the maximum is taken over all product input probability distributions P_{X_1} × P_{X_2}.
Remark 1.
The bound (3) can also be written in the standard form
$$\bigcap_{Q_{Y_1,Y_2|X_1,X_2}}\ \bigcup_{P_{X_1}, P_{X_2}} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ R_1 \le I(X_1; Y_2 \mid X_2),\ R_2 \le I(X_2; Y_1 \mid X_1) \right\}.$$
We prefer, however, to use the form (3) here, for ease of comparison with the forthcoming bounds.
We now proceed to obtain a combinatorial outer bound. Recall that a dual clique pair of a DM-TWC is a pair (S, T) of subsets S ⊆ X_1 and T ⊆ X_2 such that t ∼_s t′ and s ∼_t s′ for all s ∈ S, t ∈ T and all distinct s, s′ ∈ S, t, t′ ∈ T. In the sequel, we adopt the convention that 0^0 = 1.
Lemma 7.
C_ze(P_{Y_1,Y_2|X_1,X_2}) is contained in:
$$\bigcap_{0 \le \lambda \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \lambda R_1 + (1-\lambda) R_2 \le \max_{P_{X_1}, P_{X_2}} \log \frac{1}{l(\lambda)} \right\}, \qquad (5)$$
where
$$l(\lambda) \triangleq \max_{S,T} \left( \sum_{x_1 \in S} P_{X_1}(x_1) \right)^{\lambda} \left( \sum_{x_2 \in T} P_{X_2}(x_2) \right)^{1-\lambda}, \qquad (6)$$
the maximum in (5) is taken over all input probability distributions P_{X_1} and P_{X_2}, and the maximum in (6) is taken over all dual clique pairs (S, T) of P_{Y_1,Y_2|X_1,X_2}.
Proof. 
Let (A, B) be a uniquely decodable codebook pair of length n. We will show that:
$$|A|^{\lambda} |B|^{1-\lambda} \le \kappa \cdot \left( \frac{1}{l(\lambda)} \right)^{n} \qquad (7)$$
by induction on n, where κ is a constant independent of n.
Indeed, for the base case n = 1, one can take subsets A ⊆ X_1 and B ⊆ X_2 such that for any distinct a, a′ ∈ A and distinct b, b′ ∈ B, we have a ≁_b a′ and b ≁_a b′. Clearly, |A||B| ≤ |X_1||X_2|, and (7) follows by taking κ sufficiently large.
Assume that (7) holds for every length n′ ≤ n − 1, and let us proceed to prove it for length n. Suppose (A, B) ⊆ X_1^n × X_2^n is a uniquely decodable codebook pair of length n. For a vector x^n, let x^{n∖i} ≜ (x_1, …, x_{i−1}, x_{i+1}, …, x_n) be its projection onto all coordinates other than i. For each coordinate 1 ≤ i ≤ n and each x_1 ∈ X_1, x_2 ∈ X_2, let
$$A_i(x_1) \triangleq \{ a^{n\setminus i} : a^n \in A,\ a_i = x_1 \}, \qquad B_i(x_2) \triangleq \{ b^{n\setminus i} : b^n \in B,\ b_i = x_2 \} \qquad (8)$$
be the projections of each codebook obtained by fixing the ith coordinate. Define the distributions induced by these projections over X_1 and X_2, respectively, to be
$$P^i_{X_1}(x_1) \triangleq \frac{|A_i(x_1)|}{|A|}, \qquad P^i_{X_2}(x_2) \triangleq \frac{|B_i(x_2)|}{|B|}. \qquad (9)$$
Furthermore, for any two subsets S ⊆ X_1 and T ⊆ X_2, define the codebooks induced by the unions over S and T of the respective projected codebooks to be
$$A_i(S) \triangleq \bigcup_{x_1 \in S} A_i(x_1), \qquad B_i(T) \triangleq \bigcup_{x_2 \in T} B_i(x_2). \qquad (10)$$
Note that if (S, T) is a dual clique pair such that A_i(S) ≠ ∅ and B_i(T) ≠ ∅, then the unions in (10) are disjoint, as otherwise this would contradict the assumption that (A, B) is uniquely decodable. Hence
$$|A_i(S)| = \sum_{x_1 \in S} |A_i(x_1)|, \qquad |B_i(T)| = \sum_{x_2 \in T} |B_i(x_2)|, \qquad (11)$$
and also, for any i ∈ [n], it must hold that (A_i(S), B_i(T)) is a uniquely decodable codebook pair of length n − 1. Combining (8), (9) and (11) gives
$$|A|^{\lambda} |B|^{1-\lambda} = \left( \frac{|A_i(S)|}{\sum_{s \in S} P^i_{X_1}(s)} \right)^{\lambda} \cdot \left( \frac{|B_i(T)|}{\sum_{t \in T} P^i_{X_2}(t)} \right)^{1-\lambda}. \qquad (12)$$
By the inductive hypothesis, we obtain
$$|A|^{\lambda} |B|^{1-\lambda} \le \kappa \left( \frac{1}{l(\lambda)} \right)^{n-1} \frac{1}{\left( \sum_{s \in S} P^i_{X_1}(s) \right)^{\lambda} \left( \sum_{t \in T} P^i_{X_2}(t) \right)^{1-\lambda}} \le \kappa \left( \frac{1}{l(\lambda)} \right)^{n-1} \cdot \frac{1}{l(\lambda)} = \kappa \left( \frac{1}{l(\lambda)} \right)^{n}, \qquad (13)$$
where the second inequality follows from the definition of l(λ) in (6). This completes the proof. □
where the second inequality follows from the definition of l ( λ ) in (6). This completes the proof. □
The following is a trivial corollary of Lemmas 6 and 7.
Corollary 1.
C_ze(P_{Y_1,Y_2|X_1,X_2}) is contained in
$$\bigcap_{Q_{Y_1,Y_2|X_1,X_2}}\ \bigcap_{0 \le \lambda \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \lambda R_1 + (1-\lambda) R_2 \le t(\lambda) \right\},$$
where
$$t(\lambda) \triangleq \min\left\{ \max_{P_{X_1}, P_{X_2}} \epsilon(\lambda),\ \max_{P_{X_1}, P_{X_2}} \log \frac{1}{l(\lambda)} \right\}. \qquad (14)$$

3.2. An Improved Bound

We now provide a single-letter outer bound, in which the order of the minimum and the maximum in (14) is swapped. This generally yields a tighter outer bound due to the max–min inequality. In fact, our bound can be seen as a generalization of the one obtained by Holzman and Körner for the binary multiplying channel [13], in which case the max–min is indeed strictly tighter than the min–max.
Theorem 2.
C_ze(P_{Y_1,Y_2|X_1,X_2}) is contained in
$$\bigcap_{Q_{Y_1,Y_2|X_1,X_2}}\ \bigcap_{0 \le \lambda \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \lambda R_1 + (1-\lambda) R_2 \le \theta(\lambda) \right\}, \qquad (15)$$
where
$$\theta(\lambda) \triangleq \max_{P_{X_1}, P_{X_2}} \min\left\{ \epsilon(\lambda),\ \log \frac{1}{l(\lambda)} \right\}. \qquad (16)$$
The first intersection is taken over all DM-TWCs Q_{Y_1,Y_2|X_1,X_2} with the same adjacency as P_{Y_1,Y_2|X_1,X_2}, and the maximum is taken over all product input probability distributions P_{X_1} × P_{X_2}.
Proof. 
The intersection over all Q_{Y_1,Y_2|X_1,X_2} follows from Proposition 2. Hence, without loss of generality, it suffices to prove that for P_{Y_1,Y_2|X_1,X_2}, every achievable rate pair (R_1, R_2) satisfies λR_1 + (1−λ)R_2 ≤ θ(λ) for all 0 ≤ λ ≤ 1.
To that end, for each uniquely decodable codebook pair (A, B) of length n, we will show that:
$$|A|^{\lambda} |B|^{1-\lambda} \le \kappa \cdot 2^{n \theta(\lambda)} \qquad (17)$$
by induction on n, where κ is a constant independent of n. The base case n = 1 follows in the same way as the base case in the proof of Lemma 7. Assume that (17) holds for all lengths n′ ≤ n − 1, and let us prove that it also holds for length n. Suppose that (A, B) ⊆ X_1^n × X_2^n is a uniquely decodable codebook pair of length n. Following the same steps (8)–(12) as in the argument of Lemma 7, we also have:
$$|A|^{\lambda} |B|^{1-\lambda} = \left( \frac{|A_i(S)|}{\sum_{s \in S} P^i_{X_1}(s)} \right)^{\lambda} \cdot \left( \frac{|B_i(T)|}{\sum_{t \in T} P^i_{X_2}(t)} \right)^{1-\lambda}. \qquad (18)$$
Now, if there exist a dual clique pair (S, T) and a coordinate 1 ≤ i ≤ n such that
$$\left( \sum_{s \in S} P^i_{X_1}(s) \right)^{\lambda} \left( \sum_{t \in T} P^i_{X_2}(t) \right)^{1-\lambda} \ge 2^{-\theta(\lambda)}, \qquad (19)$$
then (18) implies
$$|A|^{\lambda} |B|^{1-\lambda} \le \kappa\, 2^{(n-1)\theta(\lambda)} \cdot 2^{\theta(\lambda)} = \kappa\, 2^{n\theta(\lambda)},$$
where the inequality follows from the inductive hypothesis and (19). Therefore, we conclude that (17) holds under condition (19).
Assume now that condition (19) is not satisfied, that is,
$$\max_{i \in [n]}\ \max_{S,T} \left( \sum_{s \in S} P^i_{X_1}(s) \right)^{\lambda} \left( \sum_{t \in T} P^i_{X_2}(t) \right)^{1-\lambda} < 2^{-\theta(\lambda)}. \qquad (20)$$
Let A n and B n be codewords chosen from A and B respectively, uniformly at random, and let Y 1 n , Y 2 n be the corresponding channel outputs. Since ( A , B ) is a uniquely decodable codebook pair of length n, it must be that:
$$\log |A| = I(Y_2^n; A^n \mid B^n), \qquad \log |B| = I(Y_1^n; B^n \mid A^n). \qquad (21)$$
On the other hand, we have:
$$\begin{aligned} I(Y_1^n; B^n \mid A^n) &= H(Y_1^n \mid A^n) - H(Y_1^n \mid A^n, B^n) & (22) \\ &= \sum_{i=1}^{n} H(Y_{1,i} \mid Y_{1,1}, \ldots, Y_{1,i-1}, A^n) - \sum_{i=1}^{n} H(Y_{1,i} \mid A_i, B_i) & (23) \\ &\le \sum_{i=1}^{n} H(Y_{1,i} \mid A_i) - \sum_{i=1}^{n} H(Y_{1,i} \mid A_i, B_i) & (24) \\ &= \sum_{i=1}^{n} I(Y_{1,i}; B_i \mid A_i), & (25) \end{aligned}$$
where (23) follows from the entropy chain rule and the memorylessness of the channel, and (24) follows from the fact that conditioning reduces entropy. Similarly,
$$I(Y_2^n; A^n \mid B^n) \le \sum_{i=1}^{n} I(Y_{2,i}; A_i \mid B_i). \qquad (26)$$
Combining (20)–(26), we obtain
$$\begin{aligned} \log |A|^{\lambda} |B|^{1-\lambda} &= \lambda \log |A| + (1-\lambda) \log |B| \\ &\le \sum_{i=1}^{n} \left[ \lambda\, I(Y_{2,i}; A_i \mid B_i) + (1-\lambda)\, I(Y_{1,i}; B_i \mid A_i) \right] \\ &\le \max_{P_{X_1}, P_{X_2}:\ l(\lambda) < 2^{-\theta(\lambda)}} n \left[ \lambda\, I(Y_2; X_1 \mid X_2) + (1-\lambda)\, I(Y_1; X_2 \mid X_1) \right] \\ &= \max_{P_{X_1}, P_{X_2}:\ l(\lambda) < 2^{-\theta(\lambda)}} n \cdot \epsilon(\lambda), \end{aligned} \qquad (27)$$
where ϵ(λ) and l(λ) are defined in (4) and (6), respectively, and the maximum is taken over all product input probability distributions P_{X_1} × P_{X_2} such that l(λ) < 2^{−θ(λ)}, in accordance with condition (20).
By the definition of θ ( λ ) , we have:
$$\theta(\lambda) = \max_{P_{X_1}, P_{X_2}} \min\left\{ \epsilon(\lambda),\ \log \frac{1}{l(\lambda)} \right\} \qquad (28)$$
$$\ge \max_{P_{X_1}, P_{X_2}:\ l(\lambda) < 2^{-\theta(\lambda)}} \min\left\{ \epsilon(\lambda),\ \log \frac{1}{l(\lambda)} \right\}. \qquad (29)$$
Note that for any input distributions P_{X_1}, P_{X_2} such that l(λ) < 2^{−θ(λ)}, we have
$$\log \frac{1}{l(\lambda)} > \theta(\lambda). \qquad (30)$$
Combining (29) and (30), we obtain
$$\max_{P_{X_1}, P_{X_2}:\ l(\lambda) < 2^{-\theta(\lambda)}} \epsilon(\lambda) \le \theta(\lambda). \qquad (31)$$
Substituting (31) into (27), we have log |A|^λ |B|^{1−λ} ≤ nθ(λ), completing the proof. □
We remark that Theorem 2 immediately implies, in particular, the following upper bound on the zero-error capacity of the point-to-point discrete memoryless channel.
Corollary 2.
The zero-error capacity of the discrete memoryless channel P Y | X is upper bounded by
$$\min_{Q_{Y|X}}\ \max_{P_X}\ \min\left\{ I(X;Y),\ \log \frac{1}{\max_{C} \sum_{x \in C} P_X(x)} \right\}.$$
The outer minimum is taken over all Q_{Y|X} having the same confusion graph as P_{Y|X}, the outer maximum is taken over all input distributions P_X, and the inner maximum is taken over all cliques C of the confusion graph of the channel.
As it turns out, the upper bound in Corollary 2 coincides with the linear programming bound on the zero-error capacity of a point-to-point discrete memoryless channel in [2]. Namely,
$$\min_{Q_{Y|X}}\ \max_{P_X}\ I(X;Y) = \max_{P_X}\ \log \frac{1}{\max_{C} \sum_{x \in C} P_X(x)}$$
for any point-to-point discrete memoryless channel P Y | X . This fact was originally conjectured by Shannon [2] and later proved by Ahlswede [22]. In other words, this means that in the point-to-point case, Corollary 1 yields exactly the same bound as Theorem 2. However, this is not the case in general for the DM-TWC. For example, recall that Holzman and Körner [13] derived the bound in Theorem 2 in the special case of the (deterministic) binary multiplying channel (using λ = 0.5 ) and numerically showed that it is strictly better than what can be obtained from Corollary 1. Next we give another example showing that Theorem 2 outperforms Corollary 1 for a noisy (i.e., non-deterministic) DM-TWC as well.
Example 1.
Let X_1 = {0, 1, 2}, X_2 = Y_1 = Y_2 = {0, 1}, and let the conditional probability distribution P_{Y_1,Y_2|X_1,X_2}(y_1 y_2 | x_1 x_2) be given by the following table (rows indexed by the output pair y_1 y_2, columns by the input pair x_1 x_2):

y1y2 \ x1x2 |  00    01    10    11    20    21
00          |   1     1     0     0     0     0
01          |   0     0     0     0     1     0
10          |   0     0     δ     0     0     1
11          |   0     0   1−δ     1     0     0
where δ ∈ (0, 1). Corollary 1 gives the upper bound
$$R_1 + R_2 \le \min\left\{ \max_{P_{X_1}, P_{X_2}} \epsilon^*,\ \max_{P_{X_1}, P_{X_2}} \log \frac{1}{l^*} \right\} \approx 1.2933,$$
where
$$\epsilon^* = I(X_1; Y_2 \mid X_2) + I(X_2; Y_1 \mid X_1) = P_{X_1}(2)\, h(P_{X_2}(0)) + P_{X_2}(0)\, h(P_{X_1}(0) + \delta P_{X_1}(1)) - P_{X_1}(1) P_{X_2}(0)\, h(\delta) + P_{X_2}(1)\, h(P_{X_1}(1)),$$
$$l^* = \max_{S,T} \sum_{x_1 \in S} P_{X_1}(x_1) \sum_{x_2 \in T} P_{X_2}(x_2) = \max\Big\{ P_{X_1}(0),\ P_{X_1}(1),\ P_{X_2}(0)\left( P_{X_1}(0) + P_{X_1}(1) \right),\ P_{X_2}(0)\left( P_{X_1}(1) + P_{X_1}(2) \right),\ P_{X_2}(1)\left( P_{X_1}(0) + P_{X_1}(2) \right) \Big\},$$
and h(x) = −x log x − (1 − x) log (1 − x). In contrast, Theorem 2 yields the tighter upper bound
$$R_1 + R_2 \le \max_{P_{X_1}, P_{X_2}} \min\left\{ \epsilon^*,\ \log \frac{1}{l^*} \right\} \approx 1.2910.$$
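The two numerical bounds can be reproduced by direct optimization over product input distributions. The sketch below (ours) grid-searches the max–min bound of Theorem 2 using the expressions for ϵ* and l* above; the value δ = 0.5 is an assumption of ours (the bound varies with δ), and a finer grid or a proper optimizer sharpens the constant.

```python
import numpy as np

def h(x):
    return 0.0 if x <= 0 or x >= 1 else -x*np.log2(x) - (1 - x)*np.log2(1 - x)

def eps_star(p1, p2, delta):
    # epsilon* for the channel of Example 1 (formula above).
    return (p1[2]*h(p2[0]) + p2[0]*h(p1[0] + delta*p1[1])
            - p1[1]*p2[0]*h(delta) + p2[1]*h(p1[1]))

def l_star(p1, p2):
    # Maximum over the dual clique pairs listed above.
    return max(p1[0], p1[1], p2[0]*(p1[0] + p1[1]),
               p2[0]*(p1[1] + p1[2]), p2[1]*(p1[0] + p1[2]))

delta, grid, best = 0.5, np.linspace(0, 1, 101), 0.0
for a in grid:                          # P_X1 = (a, b, 1 - a - b)
    for b in grid[grid <= 1 - a + 1e-12]:
        for c in grid:                  # P_X2 = (c, 1 - c)
            p1, p2 = (a, b, 1 - a - b), (c, 1 - c)
            best = max(best, min(eps_star(p1, p2, delta),
                                 np.log2(1.0 / l_star(p1, p2))))
print(best)   # max-min bound of Theorem 2, roughly 1.29 on this grid
```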

3.3. An Outer Bound via Shannon Capacity of a Graph

Based on Lemma 3 and the Shannon capacity of a graph, we immediately have the following bound.
Lemma 8.
$$C_{ze}^{sum}\!\left( [G_1, \ldots, G_{|X_1|};\ H_1, \ldots, H_{|X_2|}] \right) \le \max_{x_1 \in X_1,\ x_2 \in X_2} \left[ \Theta(G_{x_1}) + \Theta(H_{x_2}) \right].$$
It is worth noting that the above bound can be tight, in the sense that when all G_i = G and all H_j = H, it is easily verified that C_ze^sum([G, …, G; H, …, H]) = Θ(G) + Θ(H). However, the bound in Lemma 8 is not tight in general. Later, in Section 5, we will improve the bound of Lemma 8 in certain scenarios, and show that the improved bound (Theorem 5) can outperform Theorem 2 (see Example 3) and can be achieved in special cases (see Theorem 7).

4. Inner Bounds

In this section, we present two inner bounds for the non-adaptive zero-error capacity region of the DM-TWC, one based on random coding and the other on linear codes.

4.1. Random Coding

The random coding argument for the DM-TWC is standard and generalizes a known bound by Shannon for the one-way case [2]. To obtain the random coding inner bound, we need the following lemma from [1].
Lemma 9
([1]). Let X be a random variable taking values in [N], and let {f_i : [N] → R_+}_{i∈[d]} be a collection of nonnegative functions. Then there exists x ∈ [N] such that f_i(x) ≤ d · E[f_i(X)] for all i ∈ [d].
Theorem 3.
C_ze(P_{Y_1,Y_2|X_1,X_2}) contains the region:
$$\bigcup_{P_{X_1}, P_{X_2}} \left\{ (R_1, R_2) :\ R_1 \ge 0,\ R_2 \ge 0,\ R_1 \le -\frac{1}{2} \log \sum_{\substack{x_1, x_1' \in X_1,\ x_2 \in X_2:\\ x_1 \sim_{x_2} x_1' \text{ or } x_1 = x_1'}} P_{X_1}(x_1) P_{X_1}(x_1') P_{X_2}(x_2),\ R_2 \le -\frac{1}{2} \log \sum_{\substack{x_1 \in X_1,\ x_2, x_2' \in X_2:\\ x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right\}, \qquad (32)$$
where the union is taken over all input distributions P X 1 , P X 2 .
Proof. 
We randomly draw a codebook pair (A, B), such that A (resp. B) consists of M_1 (resp. M_2) statistically independent words, where each word is generated i.i.d. according to a probability distribution P_{X_1} (resp. P_{X_2}). A word a^n ∈ A is called bad if there exist two words b^n, b̃^n ∈ B that are either equal or adjacent in G_{a_1} ⊠ ⋯ ⊠ G_{a_n}. For any particular words a^n ∈ A, b^n, b̃^n ∈ B and any coordinate i ∈ [n], the probability that b_i and b̃_i are equal or adjacent in G_{a_i} is upper bounded by:
$$\sum_{\substack{x_1 \in X_1,\ x_2, x_2' \in X_2:\\ x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2'). \qquad (33)$$
Since all the coordinates are independent, the probability that b^n and b̃^n are equal or adjacent in G_{a_1} ⊠ ⋯ ⊠ G_{a_n} is at most:
$$\left( \sum_{\substack{x_1 \in X_1,\ x_2, x_2' \in X_2:\\ x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right)^{n}.$$
Denote by Bad(a^n) the number of 2-subsets {b^n, b̃^n} ⊆ B such that b^n and b̃^n are equal or adjacent in G_{a_1} ⊠ ⋯ ⊠ G_{a_n}. Then,
$$\Pr\{ a^n \text{ is bad} \} = \Pr\{ \mathrm{Bad}(a^n) \ge 1 \} \le \mathbb{E}[\mathrm{Bad}(a^n)] \le \binom{M_2}{2} \left( \sum_{x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right)^{n},$$
where the first inequality is Markov’s inequality, and the second inequality follows from (33) and the linearity of expectation. Similarly, a word b^n ∈ B is called bad if there exist two words a^n, ã^n ∈ A that are equal or adjacent in H_{b_1} ⊠ ⋯ ⊠ H_{b_n}, and we have
$$\Pr\{ b^n \text{ is bad} \} \le \binom{M_1}{2} \left( \sum_{\substack{x_1, x_1' \in X_1,\ x_2 \in X_2:\\ x_1 \sim_{x_2} x_1' \text{ or } x_1 = x_1'}} P_{X_1}(x_1) P_{X_1}(x_1') P_{X_2}(x_2) \right)^{n}.$$
Let f_1(A, B) and f_2(A, B) be the numbers of bad words in A and B, respectively. Then we have:
$$\mathbb{E}[f_1(A, B)] \le M_1 \binom{M_2}{2} \left( \sum_{\substack{x_1 \in X_1,\ x_2, x_2' \in X_2:\\ x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right)^{n}, \qquad (34)$$
$$\mathbb{E}[f_2(A, B)] \le M_2 \binom{M_1}{2} \left( \sum_{\substack{x_1, x_1' \in X_1,\ x_2 \in X_2:\\ x_1 \sim_{x_2} x_1' \text{ or } x_1 = x_1'}} P_{X_1}(x_1) P_{X_1}(x_1') P_{X_2}(x_2) \right)^{n}. \qquad (35)$$
By Lemma 9, there exists a pair (A*, B*) such that
$$f_1(A^*, B^*) \le 2\, \mathbb{E}[f_1(A, B)], \qquad f_2(A^*, B^*) \le 2\, \mathbb{E}[f_2(A, B)]. \qquad (36)$$
Removing all the bad words from A* and B*, respectively, yields a codebook pair (A′, B′) such that:
$$|A'| = M_1 - f_1(A^*, B^*) \quad \text{and} \quad |B'| = M_2 - f_2(A^*, B^*). \qquad (37)$$
It is readily seen that (A′, B′) is a uniquely decodable codebook pair.
Now let
$$M_1 = (1 - \xi_1)^{n/2} \left( \sum_{\substack{x_1, x_1' \in X_1,\ x_2 \in X_2:\\ x_1 \sim_{x_2} x_1' \text{ or } x_1 = x_1'}} P_{X_1}(x_1) P_{X_1}(x_1') P_{X_2}(x_2) \right)^{-n/2}, \qquad (38)$$
$$M_2 = (1 - \xi_2)^{n/2} \left( \sum_{\substack{x_1 \in X_1,\ x_2, x_2' \in X_2:\\ x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right)^{-n/2}, \qquad (39)$$
where ξ_1, ξ_2 are arbitrarily small positive numbers. Combining (34)–(39), we obtain:
$$|A'| \ge \left( 1 - (1 - \xi_2)^{n} \right) (1 - \xi_1)^{n/2} \left( \sum_{\substack{x_1 \sim_{x_2} x_1' \text{ or } x_1 = x_1'}} P_{X_1}(x_1) P_{X_1}(x_1') P_{X_2}(x_2) \right)^{-n/2}, \qquad |B'| \ge \left( 1 - (1 - \xi_1)^{n} \right) (1 - \xi_2)^{n/2} \left( \sum_{\substack{x_2 \sim_{x_1} x_2' \text{ or } x_2 = x_2'}} P_{X_1}(x_1) P_{X_2}(x_2) P_{X_2}(x_2') \right)^{-n/2}.$$
Since ξ_1, ξ_2 are arbitrarily small, by taking n sufficiently large we can find an (n, R_1, R_2) uniquely decodable codebook pair with rates arbitrarily close to the region (32), as desired. □
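For a concrete channel, the rates promised by Theorem 3 are simple finite sums. The sketch below (ours; the adjacency oracles are an assumed data layout, not the paper’s notation) evaluates the two bounds in (32) for fixed input distributions.

```python
from math import log2

def theorem3_rates(P1, P2, adj_G, adj_H):
    """Inner-bound rates of Theorem 3 for fixed P_X1, P_X2.

    adj_G(x1, x2, x2p): True iff x2 ~ x2p in G_{x1};
    adj_H(x2, x1, x1p): True iff x1 ~ x1p in H_{x2}."""
    s1 = sum(P1[x1] * P1[x1p] * P2[x2]
             for x1 in range(len(P1)) for x1p in range(len(P1))
             for x2 in range(len(P2))
             if x1 == x1p or adj_H(x2, x1, x1p))
    s2 = sum(P1[x1] * P2[x2] * P2[x2p]
             for x1 in range(len(P1))
             for x2 in range(len(P2)) for x2p in range(len(P2))
             if x2 == x2p or adj_G(x1, x2, x2p))
    return -0.5 * log2(s1), -0.5 * log2(s2)   # (R1, R2)
```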

4.2. Linear Codes

In this subsection, we present a construction of uniquely decodable codes via linear codes, which generalizes a known result for the binary multiplying channel [15]. Let us first introduce some notation. Suppose D is a set of letters, x^n and y^n are vectors of length n, and C is a collection of vectors of length n. Let
$$\mathrm{ind}_D(x^n) \triangleq \{ 1 \le i \le n : x_i \in D \} \qquad (40)$$
denote the collection of indices i for which x_i ∈ D. For I ⊆ [n], let y^n|_I denote the vector obtained from y^n by projecting onto the coordinates in I, and denote
$$C|_I \triangleq \{ c^n|_I : c^n \in C \}.$$
Let P_{Y_1,Y_2|X_1,X_2} be a DM-TWC. We say that x_1 ∈ X_1 is a detecting symbol if x_2 ≁_{x_1} x_2′ for any distinct x_2, x_2′ ∈ X_2. A detecting symbol x_2 ∈ X_2 is defined analogously. Let D_1 ⊆ X_1 and D_2 ⊆ X_2 denote the sets of all detecting symbols in X_1 and X_2, respectively. A vector a^n ∈ X_1^n is called a detecting vector for B ⊆ X_2^n if
$$\left| B|_{\mathrm{ind}_{D_1}(a^n)} \right| = |B|. \qquad (41)$$
Similarly, a vector b^n ∈ X_2^n is a detecting vector for A ⊆ X_1^n if
$$\left| A|_{\mathrm{ind}_{D_2}(b^n)} \right| = |A|. \qquad (42)$$
The following claim is immediate.
Proposition 7.
Let A ⊆ X_1^n and B ⊆ X_2^n. If each a^n ∈ A is a detecting vector for B and each b^n ∈ B is a detecting vector for A, then (A, B) is a uniquely decodable codebook pair.
Proposition 7 provides a sufficient condition for a uniquely decodable code, which is not always necessary (see Example 2). Nevertheless, this sufficient condition furnishes us with a way of constructing uniquely decodable codes by employing linear codes.
Example 2.
Suppose that X_1 = {a_0, a_1, a_2} and X_2 = {b_0, b_1}, with D_1 = {a_0, a_1, a_2}, D_2 = {b_1}, and a_0 ∼_{b_0} a_1, a_0 ∼_{b_0} a_2, a_1 ≁_{b_0} a_2. Let A = {a_0 a_0 a_0, a_1 a_1 a_1, a_0 a_1 a_2} and B = {b_0 b_1 b_0}. It is easy to verify that (A, B) is a uniquely decodable codebook pair. However, ind_{D_2}(b_0 b_1 b_0) = {2} and |A|_{{2}}| = |{a_0, a_1}| = 2 < |A| = 3, implying that b_0 b_1 b_0 is not a detecting vector for A.
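The detecting-vector condition of Proposition 7 is easy to test mechanically. The sketch below (ours) checks it for Example 2 and confirms that b_0 b_1 b_0 fails to be a detecting vector for A.

```python
def ind_D(x, D):
    # Coordinates of x carrying a detecting symbol, as in (40).
    return [i for i, xi in enumerate(x) if xi in D]

def is_detecting_vector(x, D, C):
    # x detects C iff the projection of C onto ind_D(x) stays injective.
    I = ind_D(x, D)
    return len({tuple(c[i] for i in I) for c in C}) == len(C)

A = [("a0", "a0", "a0"), ("a1", "a1", "a1"), ("a0", "a1", "a2")]
print(is_detecting_vector(("b0", "b1", "b0"), {"b1"}, A))  # False, as claimed
```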
Assume that |X_1| = q_1 and |X_2| = q_2, where q_1, q_2 are prime powers, and let us identify the alphabets with F_{q_1} and F_{q_2}, respectively. The following theorem gives an inner bound on the capacity region, generalizing Tolhuizen’s construction for Blackwell’s multiplying channel [15].
Theorem 4.
Let P_{Y_1,Y_2|X_1,X_2} be a DM-TWC with input alphabet sizes |X_1| = q_1 and |X_2| = q_2, where q_1, q_2 are prime powers. If X_1 and X_2 contain τ_1 and τ_2 detecting symbols, respectively, then C_ze(P_{Y_1,Y_2|X_1,X_2}) contains the region
$$\bigcup_{0 \le \alpha, \beta \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \begin{aligned} R_1 &\le h(\alpha) + \alpha \log \tau_2 + (1-\alpha) \log (q_2 - \tau_2) - (1-\beta) \log q_2, \\ R_2 &\le h(\beta) + \beta \log \tau_1 + (1-\beta) \log (q_1 - \tau_1) - (1-\alpha) \log q_1 \end{aligned} \right\}, \qquad (43)$$
where h(x) ≜ −x log x − (1 − x) log (1 − x) is the binary entropy function.
To prove this theorem, we need the following lemma. The case q_1 = q_2 = 2 and τ = 1 was proved in ([15], Theorem 3); Lemma 10 follows from a similar argument.
Lemma 10.
Let q, q′ be prime powers, let n, k be positive integers with 1 ≤ k ≤ n, and let D ⊆ F_{q′} with cardinality |D| = τ. Then there exists a pair (C, Y(C)) satisfying:
(1) 
C is a q-ary [n, k] linear code;
(2) 
Y(C) ⊆ F_{q′}^n is such that
$$|Y(C)| \ge \binom{n}{k} \tau^{k} (q' - \tau)^{n-k} \prod_{i=1}^{\infty} \left( 1 - q^{-i} \right);$$
(3) 
for each x^n ∈ Y(C), we have |ind_D(x^n)| = k and |C|_{ind_D(x^n)}| = |C|.
Proof. 
Let A be a k × n matrix of full rank over F_q; then C(A) ≜ {y^k A : y^k ∈ F_q^k} is a q-ary [n, k] linear code generated by A. Recall that for every x^n ∈ F_{q′}^n, ind_D(x^n) = {i ∈ [n] : x_i ∈ D}, as in (40). Denote:
$$Y(C(A)) \triangleq \left\{ x^n \in F_{q'}^n :\ |\mathrm{ind}_D(x^n)| = k,\ \left| C|_{\mathrm{ind}_D(x^n)} \right| = |C| \right\}.$$
Let A|_{ind_D(x^n)} denote the k × |ind_D(x^n)| submatrix of A with columns indexed by ind_D(x^n). It is easy to see that |C|_{ind_D(x^n)}| = |C| is equivalent to rank(A|_{ind_D(x^n)}) = k. Denote:
$$\mathcal{P} \triangleq \left\{ (A, x^n) :\ A \in F_q^{k \times n},\ x^n \in F_{q'}^n,\ |\mathrm{ind}_D(x^n)| = k,\ \mathrm{rank}\!\left( A|_{\mathrm{ind}_D(x^n)} \right) = k \right\},$$
and let us proceed by double counting the cardinality of 𝒫.
On the one hand, the number of vectors x^n ∈ F_{q′}^n such that |ind_D(x^n)| = k is \binom{n}{k} τ^k (q′ − τ)^{n−k}. For each such x^n, there are q^{k(n−k)} I_q(k) corresponding k × n matrices A ∈ F_q^{k×n} such that rank(A|_{ind_D(x^n)}) = k, where I_q(k) = ∏_{i=0}^{k−1} (q^k − q^i) is the number of k × k invertible matrices over F_q; see ([15], Lemma 3). Hence, we have:
$$|\mathcal{P}| = \binom{n}{k} \tau^{k} (q' - \tau)^{n-k}\, q^{k(n-k)}\, I_q(k). \qquad (44)$$
On the other hand, the number of matrices A ∈ F_q^{k×n} is q^{nk}. By (44) and the pigeonhole principle, there exists a matrix A* ∈ F_q^{k×n} with a corresponding code C(A*) such that |Y(C(A*))| ≥ |𝒫| / q^{nk}. Letting C = C(A*), the lemma follows. □
Proof of Theorem 4.
For i = 1, 2, let us identify X_i with F_{q_i}, and let the respective sets of all detecting symbols be D_i ⊆ F_{q_i} with |D_i| = τ_i.
To prove the existence of a uniquely decodable codebook pair based on Proposition 7, we first use Lemma 10 to find two “one-sided” uniquely decodable linear codebook pairs, and then combine them into the desired codebook pair by employing their cosets in F_{q_1}^n and F_{q_2}^n.
First, letting q = q_1, q′ = q_2, D = D_2 and τ = τ_2 in Lemma 10, we obtain a pair (C_1, Y(C_1)) such that C_1 is a q_1-ary [n, k_1] linear code and Y(C_1) ⊆ F_{q_2}^n with
$$|Y(C_1)| \ge \binom{n}{k_1} \tau_2^{k_1} (q_2 - \tau_2)^{n-k_1} \prod_{i=1}^{\infty} \left( 1 - q_1^{-i} \right). \qquad (45)$$
Similarly, letting q = q_2, q′ = q_1, D = D_1 and τ = τ_1 in Lemma 10, we obtain a pair (C_2, Y(C_2)) such that C_2 is a q_2-ary [n, k_2] linear code and Y(C_2) ⊆ F_{q_1}^n with
$$|Y(C_2)| \ge \binom{n}{k_2} \tau_1^{k_2} (q_1 - \tau_1)^{n-k_2} \prod_{i=1}^{\infty} \left( 1 - q_2^{-i} \right). \qquad (46)$$
Property (3) in Lemma 10 implies that each x^n ∈ Y(C_i) is a detecting vector for C_i, for i = 1, 2. Note that if Ξ(C_i) ⊆ F_{q_i}^n is a coset of C_i, then each x^n ∈ Y(C_i) is also a detecting vector for Ξ(C_i).
We now combine the two pairs (C_1, Y(C_1)) and (C_2, Y(C_2)). Since C_i has q_i^{n−k_i} cosets, by the pigeonhole principle there exist cosets Ξ(C_i) of C_i such that:
$$A \triangleq Y(C_1) \cap \Xi(C_2), \quad |A| \ge \frac{|Y(C_1)|}{q_2^{\,n-k_2}}, \qquad B \triangleq Y(C_2) \cap \Xi(C_1), \quad |B| \ge \frac{|Y(C_2)|}{q_1^{\,n-k_1}}. \qquad (47)$$
We now notice that each vector in A (resp. B) is a detecting vector for B (resp. A); hence, by Proposition 7, (A, B) is a uniquely decodable codebook pair. Moreover, for fixed q_1, q_2, we have:
$$\frac{\log |A|}{n} \ge h\!\left( \frac{k_1}{n} \right) + \frac{k_1}{n} \log \tau_2 + \left( 1 - \frac{k_1}{n} \right) \log (q_2 - \tau_2) - \left( 1 - \frac{k_2}{n} \right) \log q_2 - O\!\left( \frac{1}{n} \right),$$
$$\frac{\log |B|}{n} \ge h\!\left( \frac{k_2}{n} \right) + \frac{k_2}{n} \log \tau_1 + \left( 1 - \frac{k_2}{n} \right) \log (q_1 - \tau_1) - \left( 1 - \frac{k_1}{n} \right) \log q_1 - O\!\left( \frac{1}{n} \right),$$
which follows from (45)–(47). Letting α = k_1/n ∈ [0, 1] and β = k_2/n ∈ [0, 1], we obtain:
$$R_1 = \lim_{n \to \infty} \frac{\log |A|}{n} \ge h(\alpha) + \alpha \log \tau_2 + (1-\alpha) \log (q_2 - \tau_2) - (1-\beta) \log q_2, \qquad R_2 = \lim_{n \to \infty} \frac{\log |B|}{n} \ge h(\beta) + \beta \log \tau_1 + (1-\beta) \log (q_1 - \tau_1) - (1-\alpha) \log q_1.$$
Therefore, (43) follows, as desired. □
We note that for any DM-TWC, one may exploit only subsets of the input symbols, X_1′ ⊆ X_1 and X_2′ ⊆ X_2, to meet the requirements of Theorem 4. Hence, we in fact have the following more general bound.
Corollary 3.
Let P Y 1 , Y 2 | X 1 , X 2 be a DM-TWC with input alphabets X 1 , X 2 . Then C z e ( P Y 1 , Y 2 | X 1 , X 2 ) contains the region:
$$\bigcup_{X_1' \subseteq X_1,\ X_2' \subseteq X_2}\ \bigcup_{0 \le \alpha, \beta \le 1} \left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \begin{aligned} R_1 &\le h(\alpha) + \alpha \log \tau_2 + (1-\alpha) \log (q_2 - \tau_2) - (1-\beta) \log q_2, \\ R_2 &\le h(\beta) + \beta \log \tau_1 + (1-\beta) \log (q_1 - \tau_1) - (1-\alpha) \log q_1 \end{aligned} \right\},$$
where the first union is taken over all X_1′ ⊆ X_1 and X_2′ ⊆ X_2 such that q_1 = |X_1′| and q_2 = |X_2′| are prime powers, and X_1′ and X_2′ contain τ_1 and τ_2 detecting symbols, respectively.
Notice that the region (43) depends on the numbers q_1, q_2 of input symbols being used and on the corresponding numbers τ_1, τ_2 of detecting symbols. It is thus possible that using only a smaller subset of channel inputs yields higher achievable rates (under our linear coding strategy) than those obtained by using larger subsets. For Example 1, Corollary 3 shows that the maximum sum-rate R_1 + R_2 is lower bounded by 1.17, which is better than the random coding lower bound of 1.0907.
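The trade-off in Corollary 3 is easy to explore numerically. The sketch below (ours) sweeps (α, β) over a grid for given (q, τ) pairs, assuming 0 < τ < q so the logarithms stay finite; with q_1 = q_2 = 2 and τ_1 = τ_2 = 1 it attains the sum-rate 2 log 3 − 2 ≈ 1.1699, consistent with the 1.17 quoted above.

```python
import numpy as np

def h(x):
    return 0.0 if x <= 0 or x >= 1 else -x*np.log2(x) - (1 - x)*np.log2(1 - x)

def best_sum_rate(q1, tau1, q2, tau2, steps=400):
    # Maximize R1 + R2 over the region (43); assumes 0 < tau < q.
    def r(a, tau, q, b):
        return h(a) + a*np.log2(tau) + (1 - a)*np.log2(q - tau) - (1 - b)*np.log2(q)
    best = 0.0
    for a in np.linspace(0, 1, steps + 1):
        for b in np.linspace(0, 1, steps + 1):
            r1, r2 = r(a, tau2, q2, b), r(b, tau1, q1, a)
            if r1 >= 0 and r2 >= 0:
                best = max(best, r1 + r2)
    return best

print(best_sum_rate(2, 1, 2, 1))  # ~1.1699 = 2 log2(3) - 2
```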

5. Certain Types of DM-TWC

In this section, we consider the DM-TWC in the scenario where the communication in one direction is stable (in particular, noiseless). First, we briefly review the probabilistic refinement of the Shannon capacity of a graph in Section 5.1. Then, in Section 5.2, we provide an outer bound on the zero-error capacity region via the asymptotic spectrum of graphs. In Section 5.3, we present explicit constructions that attain our outer bound in certain special cases.

5.1. Probabilistic Refinement of the Shannon Capacity of a Graph

We first recall some basic notions and results from the method of types. Let x^n ∈ X^n be a sequence, and let N(a|x^n) be the number of times that a ∈ X appears in x^n. The type P_{x^n} of x^n is the relative proportion of each symbol of X, that is, P_{x^n}(a) ≜ N(a|x^n)/n for all a ∈ X. Let 𝒫_n denote the collection of all possible types of sequences of length n. For every P ∈ 𝒫_n, the type class T^n(P) of P is the set of sequences of type P, that is, T^n(P) ≜ {x^n : P_{x^n} = P}. The ε-typical set of P is
$$T_{\epsilon}^{n}(P) \triangleq \left\{ x^n \in X^n : |P_{x^n}(a) - P(a)| < \epsilon,\ \forall a \in X \right\}.$$
The joint type P_{x^n, y^n} of a pair of sequences (x^n, y^n) is the relative proportion of occurrences of each pair of symbols of X × Y, that is, P_{x^n, y^n}(a, b) ≜ N(a, b | x^n, y^n)/n for all a ∈ X and b ∈ Y. By Bayes’ rule, the conditional type P_{x^n | y^n} is defined as:
$$P_{x^n | y^n}(a, b) \triangleq \frac{N(a, b \mid x^n, y^n)}{N(b \mid y^n)} = \frac{P_{x^n, y^n}(a, b)}{P_{y^n}(b)}.$$
Lemma 11
([23]). |𝒫_n| ≤ (n + 1)^{|X|}.
Lemma 12
([23]). For every P ∈ 𝒫_n, we have $\frac{2^{nH(X)}}{(n+1)^{|X|}} \le |T^n(P)| \le 2^{nH(X)}$, where X ∼ P.
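Both lemmas are easy to check numerically for a binary alphabet; the sketch below (ours) verifies the type-class bounds of Lemma 12.

```python
from math import comb, log2

def h(x):
    return 0.0 if x in (0, 1) else -x*log2(x) - (1 - x)*log2(1 - x)

n = 20  # binary alphabet: exactly n + 1 types, within the (n+1)^2 bound of Lemma 11
for k in (0, 5, 10):
    size = comb(n, k)                   # |T^n(P)| for the type P = (k/n, 1-k/n)
    upper = 2 ** (n * h(k / n))         # Lemma 12, upper bound
    lower = upper / (n + 1) ** 2        # Lemma 12, lower bound (|X| = 2)
    print(k, lower <= size <= upper)    # True for every k
```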
In [24], Csiszár and Körner introduced the probabilistic refinement of the Shannon capacity of a graph, requiring the independent set to consist of sequences of (asymptotically) the same type. Let G_ε^n[P] denote the subgraph of G^{⊠n} induced by T_ε^n(P). The Shannon capacity of the graph G relative to P is defined as
$$\Theta(G, P) \triangleq \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log \alpha\!\left( G_{\epsilon}^{n}[P] \right).$$
Let G^n[P] denote the subgraph of G^{⊠n} induced by T^n(P). Then, it is readily seen that:
$$\limsup_{n \to \infty} \frac{1}{n} \log \alpha\!\left( G^{n}[P] \right) \le \Theta(G, P).$$
For each η ∈ Δ(𝒢), define
$$\hat{\eta}(G, P) \triangleq \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log \eta\!\left( G_{\epsilon}^{n}[P] \right). \qquad (48)$$
If G = K̅_n, then according to Lemma 12, we have
$$\hat{\eta}(\bar{K}_n, P) = H(X)$$
for any η ∈ Δ(𝒢), where X ∼ P. Very recently, Vrana [25] proved the following results on η̂(G, P).
Lemma 13
([25]). The limit in (48) exists, and:
(1) 
Θ(G, P) = min_{η ∈ Δ(𝒢)} η̂(G, P);
(2) 
log η(G) = max_P η̂(G, P) for every η ∈ Δ(𝒢).
According to Lemma 11, it is easily seen that:
$$\Theta(G) = \max_{P} \Theta(G, P).$$
Here, we would like to mention that the probabilistic refinement of the Lovász theta number was introduced and investigated by Marton in [26] via a non-asymptotic formula, and the probabilistic refinement of the fractional clique cover number was studied in relation to the graph entropy in [27].

5.2. An Outer Bound via the Asymptotic Spectrum of Graphs

In this subsection, we derive an outer bound for the case where all the {H_j} are identical, namely, H_j = H for all j ∈ X_2.
Theorem 5.
C_ze([G_1, …, G_{|X_1|}; H, …, H]) is contained in the region
$$\left\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ R_1 + R_2 \le \max_{P_{X_1}} \left[ \sum_{x_1 \in X_1} P_{X_1}(x_1)\, \Theta(G_{x_1}) + \Theta(H, P_{X_1}) \right] \right\}.$$
Proof. 
Suppose that ( A , B ) X 1 n × X 2 n is a uniquely decodable codebook pair of length n. For any a n A and b n B , let P a n , b n denote the joint type of the pair ( a n , b n ) and
J n ( P X 1 , X 2 ) { ( a n , b n ) : a n A , b n B , P a n , b n = P X 1 , X 2 } .
By Lemma 11, there are at most ( n + 1 ) | X 1 | | X 2 | different joint types over ( A , B ) . Thus by the pigeonhole principle, there exists one joint type P X 1 , X 2 * such that:
| J n ( P X 1 , X 2 * ) | | A | | B | ( n + 1 ) | X 1 | | X 2 | .
Notice that for each ( a n , b n ) J n ( P X 1 , X 2 * ) , we have:
P a n = P X 1 * = x 2 X 2 P X 1 , X 2 = x 2 * , P b n = P X 2 * = x 1 X 1 P X 1 = x 1 , X 2 * .
Now we are going to upper bound the cardinality of J n ( P X 1 , X 2 * ) . Let A * (resp. B * ) denote the collection of a n A (resp. b n B ) that appears in J n ( P X 1 , X 2 * ) , that is, there exists b n B (resp. a n A ) such that P a n , b n = P X 1 , X 2 * . Then we immediately have
| J n ( P X 1 , X 2 * ) | | A * | | B * | .
Let us now upper bound the cardinalities of $A^*$ and $B^*$. Since $(A, B)$ is uniquely decodable, Proposition 1 implies that for any $a^n \in A^*$, the set $B^*$ is an independent set of $G_{a_1} \boxtimes G_{a_2} \boxtimes \cdots \boxtimes G_{a_n}$. Since every $a^n \in A^*$ has type $P^*_{X_1}$, it follows that
$$|B^*| \le \alpha\left( G_1^{n P^*_{X_1}(1)} \boxtimes G_2^{n P^*_{X_1}(2)} \boxtimes \cdots \boxtimes G_{|\mathcal{X}_1|}^{n P^*_{X_1}(|\mathcal{X}_1|)} \right). \tag{52}$$
Similarly, for any $b^n \in B^*$, the set $A^*$ is an independent set of $H^n$ all of whose elements have type $P^*_{X_1}$. To be precise, we have
$$|A^*| \le \alpha\left( H^n[P^*_{X_1}] \right). \tag{53}$$
Therefore, we have
$$\begin{aligned}
\limsup_{n\to\infty} \frac{1}{n} \log \left( |A||B| \right)
&\le \limsup_{n\to\infty} \frac{1}{n} \log \left( (n+1)^{|\mathcal{X}_1||\mathcal{X}_2|} \, |J_n(P^*_{X_1,X_2})| \right) \qquad (54) \\
&= \limsup_{n\to\infty} \frac{1}{n} \log |J_n(P^*_{X_1,X_2})| \qquad (55) \\
&\le \limsup_{n\to\infty} \frac{1}{n} \log \left( |A^*||B^*| \right) \qquad (56) \\
&\le \limsup_{n\to\infty} \frac{1}{n} \left[ \log \alpha\!\left( G_1^{n P^*_{X_1}(1)} \boxtimes \cdots \boxtimes G_{|\mathcal{X}_1|}^{n P^*_{X_1}(|\mathcal{X}_1|)} \right) + \log \alpha\!\left( H^n[P^*_{X_1}] \right) \right] \qquad (57) \\
&\le \limsup_{n\to\infty} \min_{\eta, \eta' \in \Delta(G)} \frac{1}{n} \left[ \log \eta\!\left( G_1^{n P^*_{X_1}(1)} \boxtimes \cdots \boxtimes G_{|\mathcal{X}_1|}^{n P^*_{X_1}(|\mathcal{X}_1|)} \right) + \log \eta'\!\left( H^n[P^*_{X_1}] \right) \right] \qquad (58) \\
&= \limsup_{n\to\infty} \min_{\eta, \eta' \in \Delta(G)} \frac{1}{n} \left[ \sum_{x_1 \in \mathcal{X}_1} n P^*_{X_1}(x_1) \log \eta(G_{x_1}) + \log \eta'\!\left( H^n[P^*_{X_1}] \right) \right] \qquad (59) \\
&\le \max_{P_{X_1}} \left[ \sum_{x_1 \in \mathcal{X}_1} P_{X_1}(x_1)\, \Theta(G_{x_1}) + \Theta(H, P_{X_1}) \right], \qquad (60)
\end{aligned}$$
where (54) follows from (50); (55) follows from the fact that $|\mathcal{X}_1|$ and $|\mathcal{X}_2|$ are fixed as $n$ tends to infinity; (56) follows from (51); (57) follows from (52) and (53); (58) follows from Theorem 1, which gives $\alpha(G) \le \min_{\eta \in \Delta(G)} \eta(G)$ for any graph $G$; (59) follows from Theorem 1, since each $\eta \in \Delta(G)$ is multiplicative with respect to the strong product; and (60) follows from Theorem 1 and Lemma 13.
This completes the proof. □
In particular, specializing to the DM-TWC with $|\mathcal{X}_1| = 2$, $H = \bar{K}_2$, $G_1 = \bar{K}_{|\mathcal{X}_2|}$, and $G_2 = G$ a general graph, we have the following result.
Theorem 6.
$C_{\mathrm{ze}}([\bar{K}_{|\mathcal{X}_2|}, G; \bar{K}_2, \ldots, \bar{K}_2])$ is contained in the region
$$\left\{ (R_1, R_2) :\ R_1 \ge 0,\ R_2 \ge 0,\ R_1 + R_2 \le \log\left( |\mathcal{X}_2| + 2^{\Theta(G)} \right) \right\}.$$
Proof. 
Recall that $\Theta(\bar{K}_n) = \log n$ and $\Theta(\bar{K}_n, P) = H(X)$ with $X \sim P$. According to Theorem 5, we have
$$\begin{aligned}
R_1 + R_2 &\le \max_{P_{X_1}} \left[ \sum_{x_1 \in \mathcal{X}_1} P_{X_1}(x_1)\, \Theta(G_{x_1}) + \Theta(\bar{K}_2, P_{X_1}) \right] \\
&= \max_{P_{X_1}} \left[ P_{X_1}(1) \log |\mathcal{X}_2| + P_{X_1}(2)\, \Theta(G) + H(X_1) \right] \\
&= \log\left( |\mathcal{X}_2| + 2^{\Theta(G)} \right),
\end{aligned}$$
where the last equality is achieved by taking $P_{X_1}(1) = \frac{|\mathcal{X}_2|}{|\mathcal{X}_2| + 2^{\Theta(G)}}$ and $P_{X_1}(2) = \frac{2^{\Theta(G)}}{|\mathcal{X}_2| + 2^{\Theta(G)}}$. □
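The closing maximization is an instance of the standard identity $\max_{p \in [0,1]} \{ pa + (1-p)b + h(p) \} = \log(2^a + 2^b)$, where $h$ is the binary entropy function. The following numerical sketch (ours; the values of $a$ and $b$ are arbitrary stand-ins for $\log|\mathcal{X}_2|$ and $\Theta(G)$) confirms both the maximizer and the maximum:

```python
# Sketch: check max_p [p*a + (1-p)*b + h(p)] = log2(2^a + 2^b) at p = 2^a/(2^a + 2^b).
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

a, b = 2.0, 1.0                                  # stand-ins for log|X2| and Theta(G)
f = lambda p: p * a + (1 - p) * b + h(p)

p_star = 2**a / (2**a + 2**b)                    # the choice of P_X1 in the proof
closed_form = math.log2(2**a + 2**b)             # here log2(4 + 2) = log2(6)
assert abs(f(p_star) - closed_form) < 1e-9       # maximum attained at p_star
assert max(f(i / 100000) for i in range(100001)) <= closed_form + 1e-9
```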
We remark that Theorem 6 (and hence also Theorem 5) can outperform Theorem 2, as the following example shows.
Example 3.
Consider the channel $[\bar{K}_5, C_5; \bar{K}_2, \ldots, \bar{K}_2]$, where $C_5$ is the pentagon graph. It is well known from [2,12] that $\Theta(C_5) = \frac{1}{2} \log 5$. Then Theorem 6 gives the sum-rate upper bound $R_1 + R_2 \le \log(5 + \sqrt{5}) \approx 2.8552$, while Theorem 2 only gives $R_1 + R_2 \le 2.9069$.
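Both ingredients of Example 3 are easy to reproduce. The sketch below (our own check; the graph encodings are ad hoc) verifies Shannon's classical size-5 independent set in $C_5 \boxtimes C_5$, which underlies $\Theta(C_5) = \frac{1}{2}\log 5$, and evaluates the resulting bound $\log(5 + \sqrt{5})$:

```python
# Sketch for Example 3: a size-5 independent set in C5 (strong product) C5.
import math
from itertools import combinations

def adj_c5(u, v):
    """Pentagon adjacency: u ~ v iff they differ by 1 mod 5."""
    return (u - v) % 5 in (1, 4)

def adj_strong(p, q):
    """Strong-product adjacency: distinct, and equal-or-adjacent in every coordinate."""
    return p != q and all(x == y or adj_c5(x, y) for x, y in zip(p, q))

S = [(i, (2 * i) % 5) for i in range(5)]   # Shannon's independent set in C5^2
assert all(not adj_strong(p, q) for p, q in combinations(S, 2))

print("R1 + R2 <=", math.log2(5 + math.sqrt(5)))  # ~2.8552, as stated
```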

5.3. Explicit Constructions

In this subsection, we present explicit constructions of uniquely decodable codebook pairs that attain the outer bound of Theorem 6 in certain special cases.
Theorem 7.
Let $m$ be a prime power, $|\mathcal{X}_2| = q = ms$, and let $G = K_m \sqcup \cdots \sqcup K_m$ be the disjoint union of $s$ cliques of size $m$. Then $C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2] = \log(q + s)$.
Proof. 
First, by Theorem 6, we have an upper bound on the sum-capacity given by
$$C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2] \le \log\left( |\mathcal{X}_2| + 2^{\Theta(G)} \right) = \log(q + s), \tag{61}$$
since $\Theta(G) = \log s$ for the disjoint union of $s$ cliques.
Next, we consider the lower bound. Notice that $G = K_m \boxtimes \bar{K}_s$. We can accordingly factor the channel as
$$[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2] = [\bar{K}_m, K_m; \bar{K}_2, \ldots, \bar{K}_2] \otimes [\bar{K}_s; K_1, \ldots, K_1],$$
where the first factor $[\bar{K}_m, K_m; \bar{K}_2, \ldots, \bar{K}_2]$ corresponds to a channel with input alphabets $\mathcal{X}_1^{(1)} = \{1, 2\}$ and $\mathcal{X}_2^{(1)} = \{1, \ldots, m\}$, and the second factor $[\bar{K}_s; K_1, \ldots, K_1]$ has input alphabets $\mathcal{X}_1^{(2)} = \{1\}$ and $\mathcal{X}_2^{(2)} = \{1, \ldots, s\}$. Together with Lemma 1, we have
$$C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2] \ge C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_m, K_m; \bar{K}_2, \ldots, \bar{K}_2] + C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_s; K_1, \ldots, K_1]. \tag{62}$$
On the one hand, it is easy to see that
$$C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_s; K_1, \ldots, K_1] = \log s, \tag{63}$$
since this is a clean channel over which Alice and Bob can always communicate without error. On the other hand, by Lemma 10, we obtain
$$C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_m, K_m; \bar{K}_2, \ldots, \bar{K}_2] \ge \log(m + 1). \tag{64}$$
In fact, letting $q = m$, $q' = 2$, and $\tau = 1$ in Lemma 10, we obtain a pair $(C, Y(C))$ such that $C$ is an $m$-ary $[n, k]$ linear code and $Y(C) \subseteq \mathbb{F}_2^n$ satisfies
$$|Y(C)| \ge \binom{n}{k} \prod_{i=1}^{\infty} \left( 1 - m^{-i} \right).$$
Now let $A = Y(C)$ and $B = C$; then it is easy to see that $(A, B)$ is a uniquely decodable codebook pair for the channel $[\bar{K}_m, K_m; \bar{K}_2, \ldots, \bar{K}_2]$. The corresponding sum-rate is
$$\lim_{n\to\infty} \frac{1}{n} \log\left( |A||B| \right) = \lim_{n\to\infty} \frac{1}{n} \left[ \log m^k + \log\left( \binom{n}{k} \prod_{i=1}^{\infty} (1 - m^{-i}) \right) \right] = \lim_{n\to\infty} \left[ \frac{k}{n} \log m + h\!\left( \frac{k}{n} \right) \right],$$
where $h(\cdot)$ denotes the binary entropy function.
Taking $k/n \to m/(m+1)$, we obtain the lower bound $\log(m+1)$ on the best achievable sum-rate, which is exactly (64).
Combining (62)–(64), we have $C^{\mathrm{sum}}_{\mathrm{ze}}[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2] \ge \log(m+1) + \log s = \log(q+s)$, which also yields an explicit uniquely decodable codebook pair for the channel $[\bar{K}_q, G; \bar{K}_2, \ldots, \bar{K}_2]$ via the argument of Lemma 1. Together with (61), the proof is complete. □
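To see the unique-decodability mechanism behind the pair $(A, B) = (Y(C), C)$ in miniature, the following sketch builds a toy instance; the specific choice of $Y(C)$ as indicator vectors of information sets is our reading of the counting above, and is only an assumption about the form of Lemma 10's construction. The point being checked is that Alice, who observes Bob's symbol exactly at the positions where she transmits symbol 1 (graph $\bar{K}_m$) and learns nothing where she transmits symbol 2 (graph $K_m$), can always distinguish any two of Bob's codewords:

```python
# Toy sketch (m = 2, n = 3, k = 2): A = indicator vectors of information sets
# (an assumed form of Y(C)), B = the [3,2] even-weight binary code.
from itertools import combinations, product

n, k, m = 3, 2, 2
C = [c for c in product(range(m), repeat=n) if sum(c) % 2 == 0]   # [3,2] code

# Alice's symbol 1 is "transparent" (G_1 = complement of K_m), symbol 2 is "opaque" (G_2 = K_m).
A = [tuple(1 if i in S else 2 for i in range(n))
     for S in combinations(range(n), k)
     if len({tuple(c[i] for i in S) for c in C}) == len(C)]       # information sets

for a in A:
    for b1, b2 in combinations(C, 2):
        # Knowing a, Alice must see a difference between b1 and b2 somewhere.
        assert any(a[i] == 1 and b1[i] != b2[i] for i in range(n))

print(len(A), "x", len(C), "uniquely decodable toy codebook pair")
```

In this toy case all three 2-subsets of coordinates are information sets, so the pair achieves the one-shot sum-rate $\frac{1}{3}\log(3 \cdot 4)$; the asymptotic optimization over $k/n$ in the proof is what pushes this up to $\log(m+1)$.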

6. Concluding Remarks

In this paper, we investigated the non-adaptive zero-error capacity region of the DM-TWC and provided several single-letter inner and outer bounds, some of which coincide in certain special cases. Determining the exact zero-error capacity region of a general DM-TWC remains an open problem, and clearly a difficult one, since it includes the notorious Shannon capacity of a graph as a special case. Despite this inherent difficulty, the problem is richer than the graph capacity setting, and we believe it deserves further study in order to obtain tighter bounds and smarter constructions.
One appealing direction is to extend Lovász's semi-definite relaxation approach in order to obtain tighter outer bounds, mimicking the graph capacity case. This, however, does not seem to be a simple task. In particular, one may ask whether the natural quantity $\rho(\{G_i\}, \{H_j\})$ defined in (2), which upper-bounds the one-shot zero-error sum-capacity, is sub-multiplicative with respect to the graph strong product, in which case it would also serve as an upper bound for the zero-error sum-capacity. This is, however, not evident, in part since the problem (2) is not a semi-definite program. We have also considered other variations of the program (2). In particular, we have attempted to modify the non-linear constraints $\langle E_{i,j}, \Gamma \rangle \cdot \langle E_{i,m}, \Gamma \rangle = 0$ into a linear form $\langle A, \Gamma \rangle = 0$ for some suitable symmetric matrix $A$. We have also looked at some variants of the orthonormal representation. For example, we considered the case where each graph vertex $a$ is labelled by a unit vector $v_a$, and if two vertices $a$ and $a'$ are nonadjacent, with $a \sim b \sim a'$ if and only if $b \in F$ for some set $F$, then the projections of $v_a$ and $v_{a'}$ onto the subspace spanned by the vectors $\{v_b : b \in F\}$ are orthogonal. However, proving sub-multiplicativity in any of these settings has so far resisted our best efforts.
It would also be of much interest to consider the adaptive zero-error capacity of the DM-TWC. Allowing Alice and Bob to adapt their transmissions on the fly can in general enlarge the zero-error capacity region. As a simple example, note that a point-to-point channel with noiseless feedback is a special case of the DM-TWC (where Bob has no information to send). In [2], Shannon explicitly derived the zero-error capacity with feedback for the point-to-point channel, and pointed out that for the channel corresponding to the pentagon graph this capacity is given by $\log(5/2) \approx 1.32$. This is strictly larger than the zero-error capacity without feedback, $\frac{1}{2}\log 5 \approx 1.16$, which can be thought of in this case as the non-adaptive zero-error capacity of the channel. Exploring the differences between the adaptive and non-adaptive zero-error capacity regions of a general DM-TWC remains a challenging direction for future work.

Author Contributions

Conceptualization, Y.G. and O.S.; methodology, Y.G. and O.S.; investigation, Y.G. and O.S.; writing—original draft preparation, Y.G. and O.S.; writing—review and editing, Y.G. and O.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by ERC grant no. 639573, ISF grant no. 1495/18, and JSPS grant no. 21K13830.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank Sihuang Hu and Lele Wang for helpful discussions on the generalization of the Lovász theta number.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Shannon, C.E. Two-way communication channels. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, Oakland, CA, USA, 20 June–30 July 1961; pp. 611–644.
2. Shannon, C.E. The zero error capacity of a noisy channel. IRE Trans. Inf. Theory 1956, 2, 8–19.
3. Han, T. A general coding scheme for the two-way channel. IEEE Trans. Inf. Theory 1984, 30, 35–44.
4. Hekstra, A.P.; Willems, F.J. Dependence balance bounds for single-output two-way channels. IEEE Trans. Inf. Theory 1989, 35, 44–53.
5. Zhang, Z.; Berger, T.; Schalkwijk, J. New outer bounds to capacity regions of two-way channels. IEEE Trans. Inf. Theory 1986, 32, 383–386.
6. Weng, J.; Song, L.; Alajaji, F.; Linder, T. Capacity of two-way channels with symmetry properties. IEEE Trans. Inf. Theory 2019, 65, 6290–6313.
7. Weng, J.; Song, L.; Alajaji, F.; Linder, T. Sufficient conditions for the tightness of Shannon’s capacity bounds for two-way channels. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 1410–1414.
8. Sabag, O.; Permuter, H.H. An achievable rate region for the two-way channel with common output. In Proceedings of the 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2–5 October 2018; pp. 527–531.
9. Schalkwijk, J. The binary multiplying channel—A coding scheme that operates beyond Shannon’s inner bound region. IEEE Trans. Inf. Theory 1982, 28, 107–110.
10. Schalkwijk, J. On an extension of an achievable rate region for the binary multiplying channel. IEEE Trans. Inf. Theory 1983, 29, 445–448.
11. Haemers, W. On some problems of Lovász concerning the Shannon capacity of a graph. IEEE Trans. Inf. Theory 1979, 25, 231–232.
12. Lovász, L. On the Shannon capacity of a graph. IEEE Trans. Inf. Theory 1979, 25, 1–7.
13. Holzman, R.; Körner, J. Cancellative pairs of families of sets. Eur. J. Combin. 1995, 16, 263–266.
14. Janzer, B. A new upper bound for cancellative pairs. Electron. J. Combin. 2018, 25, 2–13.
15. Tolhuizen, L.M. New rate pairs in the zero-error capacity region of the binary multiplying channel without feedback. IEEE Trans. Inf. Theory 2000, 46, 1043–1046.
16. Gu, Y.; Shayevitz, O. On the non-adaptive zero-error capacity of the discrete memoryless two-way channel. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 3107–3111.
17. Zuiddam, J. The asymptotic spectrum of graphs and the Shannon capacity. Combinatorica 2019, 39, 1173–1184.
18. Cubitt, T.; Mancinska, L.; Roberson, D.E.; Severini, S.; Stahlke, D.; Winter, A. Bounds on entanglement-assisted source-channel coding via the Lovász theta number and its variants. IEEE Trans. Inf. Theory 2014, 60, 7330–7344.
19. Blasiak, A. A Graph-Theoretic Approach to Network Coding. Ph.D. Thesis, Cornell University, Ithaca, NY, USA, 2013.
20. Bukh, B.; Cox, C. On a fractional version of Haemers’ bound. IEEE Trans. Inf. Theory 2019, 65, 3340–3348.
21. Alon, N. The Shannon capacity of a union. Combinatorica 1998, 18, 301–310.
22. Ahlswede, R. Channels with arbitrarily varying channel probability functions in the presence of noiseless feedback. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1973, 25, 239–252.
23. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: Cambridge, UK, 1981.
24. Csiszár, I.; Körner, J. On the capacity of the arbitrarily varying channel for maximum probability of error. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1981, 57, 87–101.
25. Vrana, P. Probabilistic refinement of the asymptotic spectrum of graphs. Combinatorica 2021.
26. Marton, K. On the Shannon capacity of probabilistic graphs. J. Comb. Theory Ser. B 1993, 57, 183–195.
27. Körner, J. Coding of an information source having ambiguous alphabet and the entropy of graphs. In Proceedings of the 6th Prague Conference on Information Theory, Prague, Czech Republic, 1 January 1973; pp. 411–425.