Article

On Decoder Ties for the Binary Symmetric Channel with Arbitrarily Distributed Input

Ling-Hua Chang 1, Po-Ning Chen 2 and Fady Alajaji 3,*
1 Department of Electrical Engineering, Yuan Ze University, Taoyuan 32003, Taiwan
2 Institute of Communications Engineering, National Yang-Ming Chiao-Tung University, Taipei 112304, Taiwan
3 Department of Mathematics and Statistics, Queen's University, Kingston, ON K7L 3N6, Canada
* Author to whom correspondence should be addressed.
Entropy 2023, 25(4), 668; https://doi.org/10.3390/e25040668
Submission received: 20 March 2023 / Revised: 13 April 2023 / Accepted: 15 April 2023 / Published: 16 April 2023
(This article belongs to the Special Issue Advances in Information and Coding Theory)

Abstract: The error probability of block codes sent under a non-uniform input distribution over the memoryless binary symmetric channel (BSC) and decoded via the maximum a posteriori (MAP) decoding rule is investigated. It is proved that the ratio of the probability of MAP decoder ties to the probability of error when no MAP decoding ties occur grows at most linearly in the blocklength, thus showing that decoder ties do not affect the code's error exponent. This result generalizes a similar recent result shown for the case of block codes transmitted over the BSC under a uniform input distribution.

1. Introduction

Consider the classical channel coding context, where we send a block code through the memoryless binary symmetric channel (BSC) with crossover probability $0 < p < 1/2$. Given a sequence of binary codes $\{\mathcal{C}_n\}_{n\ge 1}$ with $n$ being the blocklength, we denote the sequence of corresponding minimal probabilities of decoding error under maximum a posteriori (MAP) decoding by $\{a_n\}_{n\ge 1}$. The following result was recently shown in [1] when the channel input selects codewords from $\mathcal{C}_n$ according to a uniform distribution.
Theorem 1
([1]). For any sequence of codes $\{\mathcal{C}_n\}_{n\ge 1}$ of blocklength $n$ and size $|\mathcal{C}_n| = M$ with $\mathcal{C}_n \subseteq \{0,1\}^n$, sent over the BSC with crossover probability $0 < p < 1/2$ under a uniform channel input distribution over $\mathcal{C}_n$, its minimum probability of decoding error $a_n$ satisfies
$b_n \le a_n \le \Big(1 + \frac{(1-p)}{p}\,n\Big)\, b_n$, (1)
where
$b_n = P_{X^n,Y^n}\Big(\Big\{(x^n,y^n) \in \mathcal{X}^n \times \mathcal{Y}^n : P_{X^n|Y^n}(x^n|y^n) < \max_{u^n \in \mathcal{C}_n \setminus \{x^n\}} P_{X^n|Y^n}(u^n|y^n)\Big\}\Big)$, (2)
and $P_{X^n,Y^n}$ is the joint input–output distribution when $X^n = (X_1, X_2, \dots, X_n) \in \mathcal{X}^n = \{0,1\}^n$ is sent over the BSC (via $n$ uses) and $Y^n = (Y_1, Y_2, \dots, Y_n) \in \mathcal{Y}^n = \{0,1\}^n$ is received.
Noting that $b_n$ in (2) is the probability that a decoding error occurs without inducing decoder ties (ties occur when two or more codewords in $\mathcal{C}_n$ are identified by the decoder as candidates for the estimated transmitted codeword, i.e., when more than one codeword in $\mathcal{C}_n$ maximizes $P_{X^n|Y^n}(\cdot|y^n)$ for a given received word $y^n$), the above result in (1) directly implies that decoder ties do not affect the error exponent of $a_n$. The error exponent or reliability function of a block coding communication system represents the largest rate of exponential decay of the system's probability of decoding error as the coding blocklength grows to infinity (e.g., see [2,3,4,5,6,7,8,9,10,11,12,13,14]).
It is known that uniformly distributed data achieve the largest entropy rate and leave no room for data compression. Thus, ideally compressed data should exhibit a uniform distribution for all blocklengths $n$. However, this setting is often impractical due to the sub-optimality of the implemented data compression schemes. Instead, we generally have non-uniformly distributed data after compression in the form of residual redundancy, such as in speech or image coding (e.g., [15,16]). Furthermore, one may have a compressed source that can be divided into several groups, within each of which the symbols are equally probable. Decoder ties can thus occur with respect to two (or more) codewords corresponding to symbols within the same group.
In this paper, we consider a non-uniform prior distribution over $\mathcal{C}_n$ and prove that decoder ties, under optimal MAP decoding, still have an at most linear, and hence sub-exponential, impact on the error probability $a_n$, thus extending Theorem 1 established for the case of a uniform prior distribution over $\mathcal{C}_n$. Since our problem falls within the general framework of joint source-channel coding for point-to-point communication systems, we refer the reader to [14,15,16,17,18,19,20], [21] (Section 4.6), and the references therein for theoretical studies on this subject as well as practical designs that outperform separate source and channel coding under complexity or delay constraints.
The proof technique used in [1] to show (1) above is based on the observation that there are two types of decoding errors. One is that the received tuple at the channel output induces no decoder ties but the corresponding decoder decision is wrong. The other is that the received tuple at the channel output causes a decoder tie, but the tie-breaking decoder picks the wrong codeword. As a result, the MAP error probability $a_n$ can be upper bounded by the sum of two terms, $b_n$ and $\delta_n$, where $b_n$ is the probability of the first type of decoding error as given in (2), and $\delta_n$ is the probability of decoder ties, regardless of whether the tie-breaker misses the correct codeword or not. Under the assumption that the channel input is uniformly distributed over the block code $\mathcal{C}_n$ for each blocklength $n$ and an arbitrary sequence of codes $\{\mathcal{C}_n\}_{n\ge 1}$, it was shown in [1] that flipping a properly selected bit component of a channel output that causes a decoder tie produces a unique channel output that leads to the first type of decoding error. An analysis of this bit-flipping manipulation shows that the ratio $\delta_n/b_n$ grows at most linearly in $n$ and hence yields the upper bound in (1). However, this flipping technique no longer works when non-uniform channel inputs are considered. To tackle this problem, we judiciously separate the channel output tuples that induce decoder ties into two groups: one group consisting of the output tuples that do not fulfill the above flipping manipulation property, and the other group composed of the remaining output tuples (i.e., the complement group). We then show that the probability of the former group is upper bounded by that of the latter group, and therefore $\delta_n/b_n$ still grows at most linearly in the blocklength $n$ under arbitrary channel input statistics. Note that the group that fails the flipping property is empty when the channel input is uniformly distributed over $\mathcal{C}_n$, thereby making the result of Theorem 1 a special case of the extended result in this paper.
The rest of the paper is organized as follows. Section 2 presents the main result and highlights the key steps of the proof to facilitate its understanding. The proof is then provided in full detail, along with illustrative examples, in Section 3 and Appendices A and B. Finally, conclusions and future directions are given in Section 4.
Throughout the paper, we denote $[M] \triangleq \{1, 2, \dots, M\}$ for a positive integer $M$ and let $d(x^n, y^n|S)$ be the Hamming distance between the $n$-tuples $x^n = (x_1, x_2, \dots, x_n)$ and $y^n = (y_1, y_2, \dots, y_n)$ with the indices of the tuples restricted to $S \subseteq [n]$. By convention, we set $d(x^n, y^n|S) = 0$ when $S = \emptyset$ and use $d(x^n, y^n)$ to represent $d(x^n, y^n|[n])$.

2. Main Result

Consider a binary code $\mathcal{C}_n \subseteq \{0,1\}^n$ with fixed blocklength $n$ and size $M$, to be used over the memoryless BSC with crossover probability $0 < p < \frac{1}{2}$. Denote the prior probability distribution on $\mathcal{C}_n$ by $P_{X^n}$, and hence $P_{X^n}(\mathcal{C}_n) = 1$. Without loss of generality, we assume that all codewords in $\mathcal{C}_n$ occur with positive probability, i.e., $P_{X^n}(x^n) > 0$ for all $x^n \in \mathcal{C}_n$; hence, $\mathcal{C}_n$ is the support of $P_{X^n}$.
It is known that the minimal probability of decoding error is achieved by the MAP decoder, which upon the reception of the channel output $y^n \in \{0,1\}^n$ estimates the transmitted codeword $x^n \in \mathcal{C}_n$ according to
$e(y^n) = \arg\max_{u^n \in \mathcal{C}_n} P_{X^n|Y^n}(u^n|y^n)$, (3)
where $P_{X^n|Y^n}$ is the posterior conditional distribution of $X^n$ given $Y^n$. We can see from (3) that if more than one $u^n \in \mathcal{C}_n$ achieves the maximum value of $P_{X^n|Y^n}(u^n|y^n)$ for a given $y^n$, a decoder tie occurs, in which case the set of these $u^n$, conveniently denoted as $\{e(y^n)\}$, contains more than one element. As a result, an erroneous MAP decision is made if one of the following two situations occurs: (i) the transmitted codeword does not belong to $\{e(y^n)\}$; (ii) the transmitted codeword belongs to $\{e(y^n)\}$ and $|\{e(y^n)\}| > 1$, but the tie-breaker picks the wrong codeword from $\{e(y^n)\}$. By conveniently denoting
$\mathcal{C}_n = \{c_1, c_2, \dots, c_M\}$, (4)
the probability of the first situation acts as a lower bound $b_n$ for $a_n$ (i.e., $b_n \le a_n$), where $b_n$ is given in (2) and can be written as
$b_n = \sum_{i=1}^{M} P_{X^n}(c_i)\, P_{Y^n|X^n}\Big(\Big\{y^n \in \{0,1\}^n : P_{X^n|Y^n}(c_i|y^n) < \max_{r \in [M] \setminus \{i\}} P_{X^n|Y^n}(c_r|y^n)\Big\}\,\Big|\, c_i\Big)$. (5)
It is shown in [22] that $b_n$ exactly equals the generalized Poor–Verdú (lower) bound [23,24] as its tilting parameter approaches infinity. The probability of the second situation is bounded above by the probability that the transmitted codeword belongs to $\{e(y^n)\}$ and $|\{e(y^n)\}| > 1$, disregarding whether the tie-breaker picks the wrong codeword or not, and this upper bound can be expressed as
$\delta_n \triangleq \sum_{i=1}^{M} P_{X^n}(c_i)\, P_{Y^n|X^n}\Big(\Big\{y^n \in \{0,1\}^n : P_{X^n|Y^n}(c_i|y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n|Y^n}(c_r|y^n)\Big\}\,\Big|\, c_i\Big)$. (6)
We thus have
$b_n \le a_n \le b_n + \delta_n$. (7)
By proving the inequality
$\delta_n \le 2qn\, b_n$, (8)
where
$q \triangleq \frac{1-p}{p} > 1$, (9)
we obtain our main result as follows.
Theorem 2.
For any sequence of binary codes $\{\mathcal{C}_n\}_{n\ge 1}$ and prior probabilities $\{P_{X^n}\}_{n\ge 1}$ used over the BSC, we have
$b_n \le a_n \le (1 + 2qn)\, b_n$. (10)
Remark 1.
Theorem 2 implies that the relative deviation of $a_n$ from $b_n$ is at most linear in the blocklength $n$, so that the impact of the decoder ties in (6) on $a_n$ is only sub-exponential. Consequently, $a_n$ and $b_n$ must have the same error exponent. Note also that the upper bound in (10) differs from the result in Theorem 1 by an additional multiplicative factor of 2 in the $qn$ term. As explained in the introduction, this is a consequence of the fact that the probability of the group of channel output tuples that cause decoder ties but fail the flipping manipulation property is upper bounded by that of the remaining tie-inducing channel outputs. The full technical details are provided in Section 3.2. Finally, we emphasize that Theorem 2 holds for arbitrary binary codes, including "bad" codes for which high-probability codewords are at small Hamming distance from each other. Hence, tightening the upper bound in (10) by restricting the analysis to "sufficiently good" codes, in the sense that their most likely codewords sit "sufficiently" far apart in $\{0,1\}^n$, is an interesting future direction.
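As a concrete sanity check of Theorem 2, the short brute-force sketch below (in Python) enumerates all $2^n$ channel outputs for a small toy instance and verifies (7), (8) and (10) directly. The numerical crossover probability $p = 1/5$ is an arbitrary admissible choice, and the code and prior are those of Example 1 in Section 3; the sketch is an illustrative aid, not part of the paper's derivations.

```python
# Brute-force check of (7), (8) and (10) on a small toy instance.
# Exact rational arithmetic keeps the tie comparisons exact.
from fractions import Fraction
from itertools import product

p = Fraction(1, 5)                 # assumed BSC crossover probability, 0 < p < 1/2
q = (1 - p) / p                    # q = (1-p)/p > 1, as in (9)
n = 4
code = ["0000", "0101", "0110", "0111"]
Z = 2 + q**2 + q**-2               # normalizer of the Example 1 prior
prior = {"0000": q**2 / Z, "0101": 1 / Z, "0110": 1 / Z, "0111": q**-2 / Z}

def dist(x, y):                    # Hamming distance of equal-length bit strings
    return sum(a != b for a, b in zip(x, y))

def joint(x, y):                   # P_{X^n,Y^n}(x,y) = P_X(x) p^n q^(n-d(x,y)), cf. (25)
    return prior[x] * p**n * q**(n - dist(x, y))

a_n = b_n = delta_n = Fraction(0)
for y in ("".join(bits) for bits in product("01", repeat=n)):
    m = max(joint(u, y) for u in code)
    ties = [u for u in code if joint(u, y) == m]
    for x in code:
        if joint(x, y) < m:        # tie-free error event: y lies in N_i of (15)
            b_n += joint(x, y)
            a_n += joint(x, y)     # the decoder never outputs x here
        elif len(ties) > 1:        # decoder-tie event: y lies in T_i of (12)
            delta_n += joint(x, y)
            a_n += joint(x, y) * (len(ties) - 1) / len(ties)  # uniform tie-breaking

assert b_n <= a_n <= b_n + delta_n      # the sandwich in (7)
assert delta_n <= 2 * q * n * b_n       # the key inequality (8)
assert a_n <= (1 + 2 * q * n) * b_n     # the bounds in (10)
print(float(b_n), float(delta_n), float(a_n))
```

A deterministic tie-breaker yields the same error probability, since all tied codewords have equal posterior probability given $y^n$.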
List of Main Symbols: Before providing an overview of the main steps of the proof of Theorem 2 (which is presented in full detail in the next section), we describe in Table 1 the main symbols used in the paper and indicate the equation where they are first introduced. We emphasize that the sets $\mathcal{T}_{j|i}$, $\mathcal{N}_{j|i}$ and $S_{1,j}^{(m)}$ are defined differently from their counterparts in [1] that use the same notation.
We also visually illustrate in Figure 1 some of the main sets defined in Table 1 under the setting of Example 1, which is presented in Section 3 below for a non-uniformly distributed binary code with $M = 4$ codewords and blocklength $n = 4$ given by $\mathcal{C}_4 = \{c_1, c_2, c_3, c_4\} = \{0000, 0101, 0110, 0111\}$. More specifically, we only show the non-empty component subsets of $\mathcal{Y}^4 = \{0,1\}^4$ corresponding to codewords $c_1$ and $c_2$; refer to Table A2 in Appendix A for a detailed listing of all component subsets of $\{0,1\}^4$ (including empty ones).
Overview of the Proof: Given that codeword $c_i$ is sent over the channel, $i \in [M]$, let $\mathcal{T}_i$ denote the set of output tuples $y^n$ that result in MAP decoding ties:
$\mathcal{T}_i \triangleq \{y^n \in \{0,1\}^n : P_{X^n|Y^n}(c_i|y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n|Y^n}(c_r|y^n)\}$ (11)
$= \{y^n \in \{0,1\}^n : P_{X^n,Y^n}(c_i,y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_r,y^n)\}$, (12)
where (12) holds because $P_{X^n|Y^n}(x^n|y^n) = P_{X^n,Y^n}(x^n,y^n)/P_{Y^n}(y^n)$. Then, $\delta_n$ in (6) can be rewritten as
$\delta_n = \sum_{i \in [M]} P_{X^n}(c_i)\, P_{Y^n|X^n}(\mathcal{T}_i|c_i) = \sum_{i \in [M]} P_{X^n,Y^n}(c_i,\mathcal{T}_i)$. (13)
Similarly, let $\mathcal{N}_i$ denote the set of output tuples $y^n$ that guarantee a tie-free MAP decoding error when $c_i$ is transmitted over the channel:
$\mathcal{N}_i \triangleq \{y^n \in \{0,1\}^n : P_{X^n|Y^n}(c_i|y^n) < \max_{r \in [M] \setminus \{i\}} P_{X^n|Y^n}(c_r|y^n)\}$ (14)
$= \{y^n \in \{0,1\}^n : P_{X^n,Y^n}(c_i,y^n) < \max_{r \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_r,y^n)\}$. (15)
Hence, $b_n$ in (5) can be rewritten as:
$b_n = \sum_{i \in [M]} P_{X^n}(c_i)\, P_{Y^n|X^n}(\mathcal{N}_i|c_i) = \sum_{i \in [M]} P_{X^n,Y^n}(c_i,\mathcal{N}_i)$. (16)
Note that if $\delta_n = 0$, then (7) is tight and (10) holds trivially; so, without loss of generality, we assume in the proof that $\delta_n > 0$, which implies that there exists at least one non-empty $\mathcal{T}_i$, $i \in [M]$. Then, according to (13) and (16), we have that
$\frac{\delta_n}{b_n} = \frac{\sum_{i \in [M]} P_{X^n,Y^n}(c_i,\mathcal{T}_i)}{\sum_{i \in [M]} P_{X^n,Y^n}(c_i,\mathcal{N}_i)} \le \frac{\sum_{i \in [M] : \mathcal{T}_i \ne \emptyset} P_{X^n,Y^n}(c_i,\mathcal{T}_i)}{\sum_{i \in [M] : \mathcal{T}_i \ne \emptyset} P_{X^n,Y^n}(c_i,\mathcal{N}_i)}$. (17)
We can upper-bound (17) by
$\frac{\sum_{i \in [M] : \mathcal{T}_i \ne \emptyset} P_{X^n,Y^n}(c_i,\mathcal{T}_i)}{\sum_{i \in [M] : \mathcal{T}_i \ne \emptyset} P_{X^n,Y^n}(c_i,\mathcal{N}_i)} \le \max_{i \in [M] : \mathcal{T}_i \ne \emptyset} \frac{P_{X^n,Y^n}(c_i,\mathcal{T}_i)}{P_{X^n,Y^n}(c_i,\mathcal{N}_i)}$, (18)
where for convenience we will refer to an inequality of the form given in (18) as the ratio-sum inequality; it holds since each numerator term satisfies $P_{X^n,Y^n}(c_i,\mathcal{T}_i) \le P_{X^n,Y^n}(c_i,\mathcal{N}_i) \max_{h} P_{X^n,Y^n}(c_h,\mathcal{T}_h)/P_{X^n,Y^n}(c_h,\mathcal{N}_h)$, and summing over $i$ yields the bound. As a result, Theorem 2 holds if we can substantiate that $2qn$ is an upper bound for (18). To this end, we will find a proper partition of $\mathcal{T}_i$ and an equal number of disjoint subsets of $\mathcal{N}_i$, whose individual probabilities can be evaluated. For ease of notation, we denote the individual probabilities corresponding to the $K$-partition of $\mathcal{T}_i$ and the $K$ disjoint subsets of $\mathcal{N}_i$ by $\{\alpha_k\}_{k=1}^{K}$ and $\{\beta_k\}_{k=1}^{K}$, respectively. Then, we obtain that
$\frac{P_{X^n,Y^n}(c_i,\mathcal{T}_i)}{P_{X^n,Y^n}(c_i,\mathcal{N}_i)} \le \frac{\sum_{k=1}^{K} \alpha_k}{\sum_{k=1}^{K} \beta_k}$. (19)
By showing that each individual ratio $\alpha_k/\beta_k$, $k \in [K]$, is bounded above by $2qn$, the ratio-sum inequality can again be applied to complete the proof.

3. Proof of Theorem 2

In [1], where a uniformly distributed prior probability $P_{X^n}$ over $\mathcal{C}_n$ is assumed, one can flip a properly selected bit of the output $y^n \in \mathcal{T}_i$ to convert it into a corresponding element of $\mathcal{N}_i$. In light of this connection, one can evaluate the ratio $P_{Y^n|X^n}(\mathcal{T}_i|c_i)/P_{Y^n|X^n}(\mathcal{N}_i|c_i)$. This approach, however, no longer works when a non-uniformly distributed prior probability is considered. Therefore, we have to devise a more judicious approach to extend the result in [1] to a general prior distribution.

3.1. A Partition of Non-Empty $\mathcal{T}_i$ and Corresponding Disjoint Subsets of $\mathcal{N}_i$

In this section, instead of finding a disjoint covering of the set of decoder ties $\mathcal{T}_i$ as in [1], we establish a proper partition of $\mathcal{T}_i$ via Definitions 1 and 2. This is one of the key differences with respect to the techniques used in [1]. Example 1 is given after Proposition 1 to illustrate Definitions 1 and 2.
Given $y^n \in \mathcal{T}_i$ defined in (12), there exists at least one $m \in [M] \setminus \{i\}$ such that
$P_{X^n,Y^n}(c_i,y^n) = P_{X^n,Y^n}(c_m,y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_r,y^n)$. (20)
We collect the indices $m$ satisfying (20) in $\mathcal{I}_i(y^n)$ as follows:
$\mathcal{I}_i(y^n) \triangleq \Big\{h \in [M] \setminus \{i\} : P_{X^n,Y^n}(c_i,y^n) = P_{X^n,Y^n}(c_h,y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_r,y^n)\Big\}$. (21)
Remark 2.
First, we note that $\mathcal{I}_i(y^n)$ is non-empty as long as $y^n \in \mathcal{T}_i$. Also, for any $y^n \in \mathcal{T}_i$, we can infer from (21) that $h \in \mathcal{I}_i(y^n)$ if and only if $y^n \in \mathcal{T}_h$.
In Definitions 1 and 2 below, we will assign each $y^n \in \mathcal{T}_i$ to a subset indexed by $j \in \mathcal{I}_i(y^n)$. These subsets will form a partition of $\mathcal{T}_i$, as stated in Proposition 1.
Definition 1.
For $j \in [M] \setminus \{i\}$, denoting by $S_{i,j}$ the set of indices where the bit components of $c_i$ and $c_j$ differ, we define
$\mathcal{T}_{j|i} \triangleq \Big\{y^n \in \mathcal{T}_i : j = \min\big\{r \in \mathcal{I}_i(y^n) : d(c_i,y^n|S_{i,r}) < |S_{i,r}|\big\}\Big\}$; (22a)
$\mathcal{N}_{j|i} \triangleq \Big\{y^n \in \mathcal{N}_i : P_{X^n,Y^n}(c_i,y^n) \cdot q = P_{X^n,Y^n}(c_j,y^n) \cdot \frac{1}{q}$ and $P_{X^n,Y^n}(c_j,y^n) \ne P_{X^n,Y^n}(c_r,y^n)$ for $r \in [j-1] \setminus \{i\}\Big\}$. (22b)
Since there may exist $y^n \in \mathcal{T}_i$ satisfying $d(c_i,y^n|S_{i,r}) = |S_{i,r}|$ for all $r \in \mathcal{I}_i(y^n)$, the collection of all elements in $\bigcup_{j \in [M] \setminus \{i\}} \mathcal{T}_{j|i}$ may not exhaust the elements of $\mathcal{T}_i$ (see Example 1). We thus go on to collect the remaining elements in $\mathcal{T}_i \setminus \bigcup_{j \in [M] \setminus \{i\}} \mathcal{T}_{j|i}$ as follows.
Definition 2.
Define for $j \in [M] \setminus \{i\}$,
$\overline{\mathcal{T}}_{j|i} \triangleq \Big\{y^n \in \mathcal{T}_i \setminus \bigcup_{h \in [M] \setminus \{i\}} \mathcal{T}_{h|i} : j = \min\{r \in \mathcal{I}_i(y^n)\}\Big\}$. (23)
With the sets defined in Definitions 1 and 2, a partition of $\mathcal{T}_i$ and disjoint subsets of $\mathcal{N}_i$ are constructed, as proven in the following proposition.
Proposition 1.
For non-empty $\mathcal{T}_i$, the following two properties hold.
(i) 
The collection $\{\mathcal{T}_{j|i} \cup \overline{\mathcal{T}}_{j|i}\}_{j \in [M] \setminus \{i\}}$ forms a (disjoint) partition of $\mathcal{T}_i$.
(ii) 
$\{\mathcal{N}_{j|i}\}_{j \in [M] \setminus \{i\}}$ is a collection of disjoint subsets of $\mathcal{N}_i$.
Before proving Proposition 1, we provide the following example to illustrate the above sets.
Example 1.
This example illustrates the necessity of introducing $\overline{\mathcal{T}}_{j|i}$ as a companion to $\mathcal{T}_{j|i}$. Suppose $\mathcal{C}_4 = \{c_1, c_2, c_3, c_4\} = \{0000, 0101, 0110, 0111\}$. Let $P_{X^4}(c_1) = \frac{q^2}{2+q^2+q^{-2}}$, $P_{X^4}(c_2) = P_{X^4}(c_3) = \frac{1}{2+q^2+q^{-2}}$ and $P_{X^4}(c_4) = \frac{q^{-2}}{2+q^2+q^{-2}}$. Then, $y^4 = 0111$ satisfies
$P_{X^4,Y^4}(c_1,y^4) = \frac{q^2}{2+q^2+q^{-2}}\, p^4 q = \max_{r \in [4] \setminus \{1\}} P_{X^4,Y^4}(c_r,y^4) = P_{X^4,Y^4}(c_2,y^4) = \frac{1}{2+q^2+q^{-2}}\, p^4 q^3 = P_{X^4,Y^4}(c_3,y^4) = \frac{1}{2+q^2+q^{-2}}\, p^4 q^3 > P_{X^4,Y^4}(c_4,y^4) = \frac{q^{-2}}{2+q^2+q^{-2}}\, p^4 q^4$, (24)
where the probabilities $P_{X^n,Y^n}(x^n,y^n)$ are written in the form
$P_{X^n,Y^n}(x^n,y^n) = P_{X^n}(x^n)\, P_{Y^n|X^n}(y^n|x^n) = P_{X^n}(x^n)\, p^n q^{\,n-d(x^n,y^n)}$. (25)
Note that the first equality in (24) indicates $0111 \in \mathcal{T}_1$, and the last two equalities together with the right-most inequality jointly imply $\mathcal{I}_1(0111) = \{2,3\}$. In light of Proposition 1, 0111 must lie in one and only one of $\{\mathcal{T}_{j|1} \cup \overline{\mathcal{T}}_{j|1}\}_{j \in [4] \setminus \{1\}}$, as shown in Table A1 and Table A2 of Appendix A. Since there exists no integer $h \in \mathcal{I}_1(0111)$ fulfilling $d(c_1, 0111|S_{1,h}) < |S_{1,h}|$, 0111 belongs to $\overline{\mathcal{T}}_{j|1}$ with $j = \min\{r \in \mathcal{I}_1(0111)\} = 2$. Recall that in [1], an element of $\mathcal{N}_{j|1}$ can be obtained if flipping a zero of $y^n \in \mathcal{T}_{j|1}$ can move it further away from $c_1$ but closer to $c_j$. However, for $y^4 = 0111$ in this example, if we flip the only zero to one, the result is further away from both $c_1$ and $c_h$ for every $h = 2, 3, 4$. Therefore, the bit-flipping manipulation fails.
With $y^4 = 0111$, we also have
$P_{X^4,Y^4}(c_2,y^4) = \frac{1}{2+q^2+q^{-2}}\, p^4 q^3 = \max_{r \in [4] \setminus \{2\}} P_{X^4,Y^4}(c_r,y^4) = P_{X^4,Y^4}(c_1,y^4) = \frac{q^2}{2+q^2+q^{-2}}\, p^4 q = P_{X^4,Y^4}(c_3,y^4) = \frac{1}{2+q^2+q^{-2}}\, p^4 q^3 > P_{X^4,Y^4}(c_4,y^4) = \frac{q^{-2}}{2+q^2+q^{-2}}\, p^4 q^4$, (26)
where the first equality indicates $0111 \in \mathcal{T}_2$ and the remaining parts of (26) jointly imply that $\mathcal{I}_2(0111) = \{1,3\}$. Proposition 1 then states that 0111 lies in one and only one of $\{\mathcal{T}_{j|2} \cup \overline{\mathcal{T}}_{j|2}\}_{j \in [4] \setminus \{2\}}$. Since $1 \in \mathcal{I}_2(0111) = \{1,3\}$ and $d(c_2, 0111|S_{2,1}) = 0 < |S_{2,1}| = 2$, we have $0111 \in \mathcal{T}_{1|2}$ according to (22a). Thus, we can flip a bit of 0111 to move further away from $c_2$ and closer to $c_1$ simultaneously. More specifically, the bit-flipping manipulation produces either 0110 or 0011, which lies in $\mathcal{N}_{1|2}$ as $y^4 = 0111$ is in $\mathcal{T}_{1|2}$. Therefore, we can associate the element of $\mathcal{T}_{1|2}$ with an element of $\mathcal{N}_{1|2}$ via a single flipping operation. For completeness, a full list of the sets $\mathcal{T}_i$, $\mathcal{N}_i$, $\mathcal{T}_{j|i}$, $\overline{\mathcal{T}}_{j|i}$ and $\mathcal{N}_{j|i}$ for $i \in [4]$ and $j \in [4] \setminus \{i\}$ is given in Appendix A.
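The memberships claimed in this example can be reproduced mechanically. The following sketch (illustrative only; the numerical value $p = 1/5$ is an arbitrary admissible choice) recomputes $\mathcal{I}_1(0111)$ and $\mathcal{I}_2(0111)$ from (21) and confirms that no $h \in \mathcal{I}_1(0111)$ satisfies $d(c_1, 0111|S_{1,h}) < |S_{1,h}|$, which is precisely why the bit-flipping manipulation fails for $i = 1$.

```python
# Mechanical verification of Example 1 with exact rational arithmetic.
from fractions import Fraction

p = Fraction(1, 5)                 # assumed admissible crossover probability
q = (1 - p) / p
n = 4
c = {1: "0000", 2: "0101", 3: "0110", 4: "0111"}
Z = 2 + q**2 + q**-2
prior = {1: q**2 / Z, 2: 1 / Z, 3: 1 / Z, 4: q**-2 / Z}

def dist(x, y, S=None):            # Hamming distance, optionally restricted to index set S
    idx = range(n) if S is None else S
    return sum(x[t] != y[t] for t in idx)

def joint(i, y):                   # P_{X^4,Y^4}(c_i, y), cf. (25)
    return prior[i] * p**n * q**(n - dist(c[i], y))

def I(i, y):                       # the index set I_i(y) of (21)
    m = max(joint(r, y) for r in c if r != i)
    return {h for h in c if h != i and joint(i, y) == joint(h, y) == m}

y = "0111"
print(I(1, y), I(2, y))            # prints {2, 3} and {1, 3}, as claimed

# For i = 1, every h in I_1(0111) has d(c_1, y | S_{1,h}) = |S_{1,h}|,
# so 0111 falls into the companion set of Definition 2 and flipping fails.
for h in I(1, y):
    S = [t for t in range(n) if c[1][t] != c[h][t]]    # S_{1,h}
    print(h, dist(c[1], y, S), len(S))                 # distance equals |S_{1,h}|
```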
Proof of Proposition 1.
First, we note that by the definitions in (22a) and (23), the sets $\{\mathcal{T}_{j|i}\}_{j \in [M] \setminus \{i\}}$ are disjoint, and so are $\{\overline{\mathcal{T}}_{j|i}\}_{j \in [M] \setminus \{i\}}$. Additionally, (23) implies $\overline{\mathcal{T}}_{j|i} \cap \mathcal{T}_{h|i} = \emptyset$ for arbitrary $j, h \in [M] \setminus \{i\}$. Furthermore, according to Definitions 1 and 2, for any $y^n \in \mathcal{T}_i$, we have either $y^n \in \mathcal{T}_{h|i}$ or $y^n \in \overline{\mathcal{T}}_{h|i}$ for some $h \in [M] \setminus \{i\}$. Consequently, $\{\mathcal{T}_{j|i} \cup \overline{\mathcal{T}}_{j|i}\}_{j \in [M] \setminus \{i\}}$ forms a partition of $\mathcal{T}_i$.
On the other hand, the inequality in (22b) prevents multiple inclusions of an element in the previous collections. Therefore, $\{\mathcal{N}_{j|i}\}_{j \in [M] \setminus \{i\}}$ is a collection of disjoint subsets of $\mathcal{N}_i$. □
Remark 3.
When channel inputs are uniformly distributed as considered in [1], it follows that
$\mathcal{I}_i(y^n) = \Big\{h \in [M] \setminus \{i\} : d(c_i,y^n) = d(c_h,y^n) = \min_{r \in [M] \setminus \{i\}} d(c_r,y^n)\Big\}$, (27)
and $d(c_i,y^n|S_{i,j}) = \frac{|S_{i,j}|}{2} < |S_{i,j}|$ for every $j \in \mathcal{I}_i(y^n)$. Therefore, (22a) reduces to
$\mathcal{T}_{j|i} = \big\{y^n \in \mathcal{T}_i : j = \min\{r \in \mathcal{I}_i(y^n)\}\big\}$, (28)
and
$\overline{\mathcal{T}}_{j|i} = \emptyset$. (29)
We then have the following two remarks. First, we note that the $\mathcal{T}_{j|i}$ newly defined via (22a), which reduces to (28) in the regime considered in [1], is more restrictive than the $\mathcal{T}_{j|i}$ introduced in [1] (Equation (16a)). As a consequence, $\{\mathcal{T}_{j|i}\}_{j \in [M] \setminus \{i\}}$ forms a partition of $\mathcal{T}_i$ in this paper, while the sets introduced in [1] (Equation (16a)) form a disjoint covering of $\mathcal{T}_i$ under uniform channel inputs. Second, (29) shows that [1] does not need to consider a companion $\overline{\mathcal{T}}_{j|i}$ to $\mathcal{T}_{j|i}$, but this paper does.
Based on Proposition 1, we continue the derivation from (17) and obtain:
$\frac{\delta_n}{b_n} \le \frac{\sum_{i \in [M]} P_{X^n,Y^n}\big(c_i, \bigcup_{j \in [M] \setminus \{i\}} (\mathcal{T}_{j|i} \cup \overline{\mathcal{T}}_{j|i})\big)}{\sum_{i \in [M]} P_{X^n,Y^n}\big(c_i, \bigcup_{j \in [M] \setminus \{i\}} \mathcal{N}_{j|i}\big)}$ (30)
$= \frac{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i}) + \sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \overline{\mathcal{T}}_{j|i})}{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i})}$, (31)
where (31) holds because $\{\mathcal{T}_{j|i}\}_{j \in [M] \setminus \{i\}}$ and $\{\overline{\mathcal{T}}_{j|i}\}_{j \in [M] \setminus \{i\}}$ are disjoint, and the same applies to $\{\mathcal{N}_{j|i}\}_{j \in [M] \setminus \{i\}}$. An additional upper bound for (31) requires the verification of the inequality
$\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}\big(c_i, \overline{\mathcal{T}}_{j|i}\big) \le \sum_{j \in [M]} \sum_{i \in [M] \setminus \{j\}} P_{X^n,Y^n}\big(c_j, \mathcal{T}_{i|j}\big)$, (32)
which is an immediate consequence of the proposition to be proven in the next section (Proposition 2), stating that
$y^n \in \overline{\mathcal{T}}_{j|i}$ and $h \in \mathcal{I}_i(y^n)$ $\Longrightarrow$ $y^n \in \mathcal{T}_{\ell|h}$ for some $\ell \in \mathcal{I}_h(y^n)$ and $P_{X^n,Y^n}(c_i,y^n) = P_{X^n,Y^n}(c_h,y^n)$. (33)

3.2. Verification of (32)

Recall that the main technique used in [1] is to associate every element of $\mathcal{T}_i$ with a corresponding element of $\mathcal{N}_i$ via the bit-flipping manipulation. By this bit-flipping association, the probability ratio of the elements and corresponding elements in $\mathcal{T}_i$ and $\mathcal{N}_i$, respectively, can be evaluated. However, as Example 1 indicates, for an element of $\overline{\mathcal{T}}_{j|i}$, the bit-flipping association no longer works. This reveals the challenge of generalizing the results in [1] from uniform channel inputs to arbitrarily distributed channel inputs. A solution is to subdivide the elements of $\mathcal{T}_i$ into two groups, $\{\mathcal{T}_{j|i}\}_{j \in [M] \setminus \{i\}}$ and $\{\overline{\mathcal{T}}_{j|i}\}_{j \in [M] \setminus \{i\}}$, where the bit-flipping association to $\{\mathcal{N}_{j|i}\}_{j \in [M] \setminus \{i\}}$ works for the former group but not for the latter. The inequality in (32) can then be used to exclude the latter group with an upper bound:
$\frac{\delta_n}{b_n} \le \frac{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i}) + \sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \overline{\mathcal{T}}_{j|i})}{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i})}$ (34)
$\le \frac{2 \sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i})}{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i})}$. (35)
Since uniform channel inputs as considered in [1] guarantee (29), it can be seen from (35) that the multiplicative factor of 2 can be reduced to 1, as observed in Remark 1. For general arbitrary channel inputs, we have the factor of 2 since the sets $\overline{\mathcal{T}}_{j|i}$ may not be empty. The validity of (32) is confirmed by the next proposition.
Proposition 2.
Suppose $y^n \in \overline{\mathcal{T}}_{j|i}$. Then, for every $h \in \mathcal{I}_i(y^n)$, we have
$y^n \in \mathcal{T}_{\ell|h}$ for some $\ell \in \mathcal{I}_h(y^n)$ and $P_{X^n,Y^n}(c_i,y^n) = P_{X^n,Y^n}(c_h,y^n)$. (36)
Proof. 
Suppose $y^n \in \overline{\mathcal{T}}_{j|i}$. Then, $d(c_i,y^n|S_{i,h}) = |S_{i,h}|$ for every $h \in \mathcal{I}_i(y^n)$. We therefore have:
$P_{X^n,Y^n}(c_i,y^n) = P_{X^n,Y^n}(c_h,y^n) = \max_{r \in [M] \setminus \{i\}} P_{X^n,Y^n}(c_r,y^n)$. (37)
We can rewrite (37) as
$P_{X^n,Y^n}(c_h,y^n) = P_{X^n,Y^n}(c_i,y^n) = \max_{r \in [M] \setminus \{h\}} P_{X^n,Y^n}(c_r,y^n)$, (38)
implying $y^n \in \mathcal{T}_h$ and $i \in \mathcal{I}_h(y^n)$. Noting that $d(c_h,y^n|S_{h,i}) = 0 < |S_{h,i}|$ because $d(c_i,y^n|S_{i,h}) = |S_{i,h}|$ and $S_{h,i} = S_{i,h}$, we conclude that the smallest integer $\ell \in \mathcal{I}_h(y^n)$ satisfying $d(c_h,y^n|S_{h,\ell}) < |S_{h,\ell}|$ exists, and therefore $y^n \in \mathcal{T}_{\ell|h}$. □
Remark 4.
Two observations can be made based on Proposition 2. First, Proposition 2 indicates that every $y^n \in \overline{\mathcal{T}}_{j|i}$ must appear at least once in the sum $\sum_{h \in [M]} \sum_{\ell \in [M] \setminus \{h\}} P_{X^n,Y^n}(c_h, \mathcal{T}_{\ell|h})$, contributing the same probability mass $P_{X^n,Y^n}(c_h,y^n)$ as $P_{X^n,Y^n}(c_i,y^n)$. Second, Proposition 2 also implies that every $y^n \in \overline{\mathcal{T}}_{j|i}$ cannot be contained in $\bigcup_{h \in [M]} \bigcup_{r \in [M] \setminus \{h\}} \overline{\mathcal{T}}_{r|h} \setminus \overline{\mathcal{T}}_{j|i}$. This observation can be substantiated as follows. For every $h \in \mathcal{I}_i(y^n)$, Proposition 2 implies $y^n \in \mathcal{T}_{\ell|h}$ for some $\ell \in \mathcal{I}_h(y^n)$, and hence Definition 2 immediately gives $y^n \notin \overline{\mathcal{T}}_{r|h}$ for all $r \in [M] \setminus \{h\}$. For $h \notin \mathcal{I}_i(y^n)$ with $h \ne i$, we have $y^n \notin \mathcal{T}_h$ and therefore $y^n \notin \overline{\mathcal{T}}_{r|h} \subseteq \mathcal{T}_h$ for all $r \in [M] \setminus \{h\}$, as pointed out in Remark 2. As a result, every $y^n \in \overline{\mathcal{T}}_{j|i}$ appears exactly once in the sum $\sum_{h \in [M]} \sum_{r \in [M] \setminus \{h\}} P_{X^n,Y^n}(c_h, \overline{\mathcal{T}}_{r|h})$. Combining the two observations leads to:
$\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\}} \sum_{y^n \in \overline{\mathcal{T}}_{j|i}} P_{X^n,Y^n}(c_i, y^n) \le \sum_{j \in [M]} \sum_{\ell \in [M] \setminus \{j\}} \sum_{y^n \in \mathcal{T}_{\ell|j}} P_{X^n,Y^n}(c_j, y^n)$. (39)
To flesh out the above inequality, we give the next example.
Example 2.
Proceeding from Example 1, we observe from Table A1 and Table A2 in Appendix A that 0111 is contained in $\overline{\mathcal{T}}_{2|1}$, $\mathcal{T}_{1|2}$ and $\mathcal{T}_{1|3}$. Hence, it appears once in the sum $\sum_{j \in [4]} \sum_{i \in [4] \setminus \{j\}} P_{X^4,Y^4}(c_j, \overline{\mathcal{T}}_{i|j})$ while it contributes twice to the sum $\sum_{j \in [4]} \sum_{i \in [4] \setminus \{j\}} P_{X^4,Y^4}(c_j, \mathcal{T}_{i|j})$. We then confirm from (A35) that:
$\sum_{i \in [4]} \sum_{j \in [4] \setminus \{i\}} P_{X^4,Y^4}\big(c_i, \overline{\mathcal{T}}_{j|i}\big) \le \sum_{i \in [4]} \sum_{j \in [4] \setminus \{i\}} P_{X^4,Y^4}\big(c_i, \mathcal{T}_{j|i}\big)$. (40)
We continue the derivation from (35) and obtain
$\frac{\delta_n}{b_n} \le \frac{2 \sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\} : \mathcal{T}_{j|i} \ne \emptyset} P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i})}{\sum_{i \in [M]} \sum_{j \in [M] \setminus \{i\} : \mathcal{T}_{j|i} \ne \emptyset} P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i})}$ (41)
$\le 2 \max_{i \in [M] \text{ and } j \in [M] \setminus \{i\} : \mathcal{T}_{j|i} \ne \emptyset} \frac{P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i})}{P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i})}$, (42)
where we add the restriction $\mathcal{T}_{j|i} \ne \emptyset$ in (41) to exclude the cases of zero divided by zero in (42), and (42) follows from the ratio-sum inequality in (18).
In the next section, we introduce a number of delicate decompositions of non-empty T j | i and an equal number of disjoint subsets of N j | i to facilitate the bit-flipping association of the pairs.

3.3. Atomic Decomposition of Non-Empty $\mathcal{T}_{j|i}$ and the Corresponding Disjoint Subsets of $\mathcal{N}_{j|i}$

To simplify the exposition, we assume without loss of generality that $c_1$ is the all-zero codeword. (It is known that we can simultaneously flip the same position of all codewords to yield a new code of equal performance over the BSC. Thus, via a number of such flipping manipulations, we can transform any code into a code of equal performance whose first codeword is all-zero.) Below we present the proof for $i = 1$; the proof for general $i > 1$ follows analogously.
Since $c_1$ is the all-zero codeword, $S_{1,j}$ is the set containing the indices of the non-zero components of $c_j$. To facilitate the investigation of the structure of $c_j$ relative to the remaining codewords $\{c_r\}_{r \in [M] \setminus \{1,j\}}$, we first partition $S_{1,j}$ into $2^{M-2}$ subsets according to whether each index in $S_{1,j}$ lies in $S_{1,2}, \dots, S_{1,j-1}, S_{1,j+1}, \dots, S_{1,M}$ or not, as follows:
$S_{1,j}^{(m)} \triangleq \Big(\bigcap_{r=2}^{j-1} S_{r;\lambda_r}\Big) \cap \Big(\bigcap_{r=j+1}^{M} S_{r;\lambda_r}\Big) \cap S_{1,j}$ for $m \triangleq 1 + \sum_{r=2}^{j-1} \lambda_r \cdot 2^{r-2} + \sum_{r=j+1}^{M} \lambda_r \cdot 2^{r-3}$, (43)
where $S_{r;1} \triangleq S_{1,r}$, $S_{r;0} \triangleq [n] \setminus S_{1,r} = S_{1,r}^c$, and each $\lambda_r \in \{0,1\}$. An example of the partition is given below.
Example 3.
Suppose $\mathcal{C}_4 = \{00000, 11001, 01111, 01101\}$. For $j = 3$ and $S_{1,3} = \{2,3,4,5\}$, we obtain $2^{4-2} = 4$ subsets as
$S_{1,3}^{(m)} = \begin{cases} S_{1,3}^{(1)} = S_{1,2}^c \cap S_{1,4}^c \cap S_{1,3} = \{4\}, & \text{if } (\lambda_4, \lambda_2) = (0,0); \\ S_{1,3}^{(2)} = S_{1,2} \cap S_{1,4}^c \cap S_{1,3} = \emptyset, & \text{if } (\lambda_4, \lambda_2) = (0,1); \\ S_{1,3}^{(3)} = S_{1,2}^c \cap S_{1,4} \cap S_{1,3} = \{3\}, & \text{if } (\lambda_4, \lambda_2) = (1,0); \\ S_{1,3}^{(4)} = S_{1,2} \cap S_{1,4} \cap S_{1,3} = \{2,5\}, & \text{if } (\lambda_4, \lambda_2) = (1,1). \end{cases}$ (44)
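The partition (43) is straightforward to compute mechanically. The sketch below (an illustrative aid; the 0-based indexing is an implementation choice, with outputs converted back to the 1-based indices of the example) reproduces the four subsets of (44).

```python
# Computing the partition (43) of S_{1,j} for the code of Example 3.
from itertools import product

code = ["00000", "11001", "01111", "01101"]   # c_1, ..., c_4 of Example 3
n, M, j = 5, 4, 3                             # partition S_{1,j} for j = 3

# S_{1,r}: index sets where c_1 (all-zero) and c_r differ, r = 2, ..., M
S1 = {r: {t for t in range(n) if code[0][t] != code[r - 1][t]}
      for r in range(2, M + 1)}

others = [r for r in range(2, M + 1) if r != j]   # here: r in {2, 4}
for lambdas in product((0, 1), repeat=len(others)):
    subset = set(S1[j])                       # start from S_{1,j} and intersect
    for r, lam in zip(others, lambdas):
        subset &= S1[r] if lam else set(range(n)) - S1[r]
    # enumeration order here differs from the index m assigned by (43)
    print(dict(zip(others, lambdas)), sorted(t + 1 for t in subset))
```

The four printed subsets are $\{4\}$, $\{3\}$, $\emptyset$ and $\{2,5\}$, matching (44) up to enumeration order.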
As $c_1$ is the all-zero codeword, the components of $c_r$ with indices in $S_{1,j}^{(m)}$ can now be unambiguously identified and must all equal $\lambda_r$. As a result,
$d\big(c_1, c_r \big| S_{1,j}^{(m)}\big) = \begin{cases} |S_{1,j}^{(m)}|, & \lambda_r = 1; \\ 0, & \lambda_r = 0. \end{cases}$ (45)
Example 4.
Proceeding from Example 3, we have
$d\big(c_1, c_2 \big| S_{1,3}^{(1)}\big) = 0$ because $\lambda_2 = 0$; $d\big(c_1, c_2 \big| S_{1,3}^{(2)}\big) = |S_{1,3}^{(2)}| = 0$ because $\lambda_2 = 1$; $d\big(c_1, c_2 \big| S_{1,3}^{(3)}\big) = 0$ because $\lambda_2 = 0$; $d\big(c_1, c_2 \big| S_{1,3}^{(4)}\big) = |S_{1,3}^{(4)}| = 2$ because $\lambda_2 = 1$, (46)
and
$d\big(c_1, c_4 \big| S_{1,3}^{(1)}\big) = 0$ because $\lambda_4 = 0$; $d\big(c_1, c_4 \big| S_{1,3}^{(2)}\big) = 0$ because $\lambda_4 = 0$; $d\big(c_1, c_4 \big| S_{1,3}^{(3)}\big) = |S_{1,3}^{(3)}| = 1$ because $\lambda_4 = 1$; $d\big(c_1, c_4 \big| S_{1,3}^{(4)}\big) = |S_{1,3}^{(4)}| = 2$ because $\lambda_4 = 1$. (47)
It should be emphasized that $S_{1,j}^{(m)}$ in this paper is defined differently from its counterpart in [1]. While the partition in [1] splits $S_{1,j}$ only according to the codewords with indices less than $j$, the one defined here considers all other $M-2$ codewords in the partitioning, and hence the order of the codewords becomes irrelevant.
Next, to decompose $\mathcal{T}_{j|1}$, we further define a sequence of incremental sets:
$\bar{S}_{1,j}^{(m)} \triangleq \bigcup_{h=1}^{m} S_{1,j}^{(h)}, \quad m \in [2^{M-2}]$, (48)
and set $\bar{S}_{1,j}^{(0)} \triangleq \emptyset$. Let $\ell_{1,j} \triangleq |S_{1,j}|$ and $\bar{\ell}_{1,j}^{(m)} \triangleq |\bar{S}_{1,j}^{(m)}|$ respectively denote the sizes of $S_{1,j}$ and $\bar{S}_{1,j}^{(m)}$, and note that $0 = \bar{\ell}_{1,j}^{(0)} \le \bar{\ell}_{1,j}^{(1)} \le \bar{\ell}_{1,j}^{(2)} \le \cdots \le \bar{\ell}_{1,j}^{(2^{M-2})} = \ell_{1,j}$.
The idea behind the partition of $\mathcal{T}_{j|1}$ into $\ell_{1,j}$ subsets, indexed by $k \in [\ell_{1,j}-1] \cup \{0\}$, is as follows. Pick one $y^n \in \mathcal{T}_{j|1}$. We start by examining whether $d(c_1,y^n|\bar{S}_{1,j}^{(1)})$ is strictly less than $\bar{\ell}_{1,j}^{(1)} - 1$. If the answer is negative, we continue by examining whether $d(c_1,y^n|\bar{S}_{1,j}^{(2)})$ is strictly less than $\bar{\ell}_{1,j}^{(2)} - 1$. We proceed until we reach the smallest $m$ such that $d(c_1,y^n|\bar{S}_{1,j}^{(m)}) < \bar{\ell}_{1,j}^{(m)} - 1$ holds. Setting $k = d(c_1,y^n|\bar{S}_{1,j}^{(m)})$, we assign this $y^n$ to the subset $\mathcal{T}_{j|1}(k)$. Notably, there exists no $m \in [2^{M-2}]$ satisfying $d(c_1,y^n|\bar{S}_{1,j}^{(m)}) < \bar{\ell}_{1,j}^{(m)} - 1$ if and only if $d(c_1,y^n|S_{1,j}) = \ell_{1,j} - 1$; in this case, we find the smallest $m$ satisfying $\bar{S}_{1,j}^{(m)} = S_{1,j}$ and assign this element to $\mathcal{T}_{j|1}(\ell_{1,j}-1)$, as $d(c_1,y^n|\bar{S}_{1,j}^{(m)}) = \ell_{1,j} - 1$. For ease of describing the above algorithmic partition process, we introduce a mapping from $k \in [\ell_{1,j}-1] \cup \{0\}$ to $m \in [2^{M-2}]$ as follows:
$\eta_k \triangleq \begin{cases} \min\big\{m \in [2^{M-2}] : k < \bar{\ell}_{1,j}^{(m)} - 1\big\}, & 0 \le k < \ell_{1,j} - 1; \\ \min\big\{m \in [2^{M-2}] : k = \bar{\ell}_{1,j}^{(m)} - 1\big\}, & k = \ell_{1,j} - 1. \end{cases}$ (49)
We can see that for $0 \le k < \ell_{1,j} - 1$, we have $\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1 \le k < \bar{\ell}_{1,j}^{(\eta_k)} - 1$. Therefore, if $y^n \in \mathcal{T}_{j|1}$ is assigned to $\mathcal{T}_{j|1}(k)$ for some $k < \ell_{1,j} - 1$, we must have
$\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1 \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big) \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = k < \bar{\ell}_{1,j}^{(\eta_k)} - 1$. (50)
On the other hand, if $y^n \in \mathcal{T}_{j|1}$ is collected in $\mathcal{T}_{j|1}(\ell_{1,j}-1)$, then $\bar{S}_{1,j}^{(\eta_k)} = S_{1,j}$ and
$\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1 \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big) \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = \ell_{1,j} - 1$. (51)
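A minimal sketch of the mapping $\eta_k$ in (49) is given below; the cumulative sizes in ell_bar are hypothetical sample values chosen only to exercise both branches of the definition, and the sandwich relations (50) and (51) are checked along the way.

```python
# The index map eta_k of (49), with ell_bar[m] = |bar S_{1,j}^{(m)}| and ell_bar[0] = 0.
def eta(k, ell_bar):
    ell = ell_bar[-1]                       # ell_{1,j} = |S_{1,j}|
    for m in range(1, len(ell_bar)):
        if k < ell - 1 and k < ell_bar[m] - 1:      # first branch of (49)
            return m
        if k == ell - 1 and k == ell_bar[m] - 1:    # second branch of (49)
            return m
    raise ValueError("k out of range")

ell_bar = [0, 1, 1, 3, 6]                   # hypothetical cumulative sizes
for k in range(ell_bar[-1]):                # k in [ell_{1,j} - 1] U {0}
    m = eta(k, ell_bar)
    # sanity check of (50)/(51): ell_bar[m-1] - 1 <= k <= ell_bar[m] - 1
    assert ell_bar[m - 1] - 1 <= k <= ell_bar[m] - 1
    print(k, m)
```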
A formal definition of $\mathcal{T}_{j|1}(k)$ is given next, where the corresponding subsets $\mathcal{N}_{j|1}(k)$ of $\mathcal{N}_{j|1}$ are also introduced.
Definition 3.
Define for $k = 0, 1, \dots, \ell_{1,j} - 1$,
$\mathcal{T}_{j|1}(k) \triangleq \Big\{y^n \in \mathcal{T}_{j|1} : \bar{\ell}_{1,j}^{(\eta_k - 1)} - 1 \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big)$ and $d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = k\Big\}$; (52a)
$\mathcal{N}_{j|1}(k) \triangleq \Big\{y^n \in \mathcal{N}_{j|1} : \bar{\ell}_{1,j}^{(\eta_k - 1)} = d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big)$ and $d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = k + 1\Big\}$, (52b)
where $\eta_k$ is defined in (49).
With Definition 3, we have the following proposition.
Proposition 3.
For non-empty $\mathcal{T}_{j|1}$, the following two properties hold.
(i) 
$\{\mathcal{T}_{j|1}(k)\}_{k \in [\ell_{1,j}-1] \cup \{0\}}$ forms a partition of $\mathcal{T}_{j|1}$;
(ii) 
$\{\mathcal{N}_{j|1}(k)\}_{k \in [\ell_{1,j}-1] \cup \{0\}}$ is a collection of disjoint subsets of $\mathcal{N}_{j|1}$.
Proof. 
It can be seen from the definitions of $\{\mathcal{T}_{j|1}(k)\}_{k \in [\ell_{1,j}-1] \cup \{0\}}$ and $\{\mathcal{N}_{j|1}(k)\}_{k \in [\ell_{1,j}-1] \cup \{0\}}$ that they are collections of mutually disjoint subsets of $\mathcal{T}_{j|1}$ and $\mathcal{N}_{j|1}$, respectively. It remains to argue that every element of $\mathcal{T}_{j|1}$ belongs to $\mathcal{T}_{j|1}(k)$ for some $k \in [\ell_{1,j}-1] \cup \{0\}$. Noting that each element $y^n$ of $\mathcal{T}_{j|1}$ satisfies $d(c_1,y^n|S_{1,j}) \le \ell_{1,j} - 1$, we differentiate two cases: $d(c_1,y^n|S_{1,j}) \le \ell_{1,j} - 2$ and $d(c_1,y^n|S_{1,j}) = \ell_{1,j} - 1$. In the former case, $d(c_1,y^n|\bar{S}_{1,j}^{(m)}) < \bar{\ell}_{1,j}^{(m)} - 1$ must hold for some $m$ (in particular for the smallest $m$ with $\bar{S}_{1,j}^{(m)} = S_{1,j}$), so taking the smallest such $m$ and $k = d(c_1,y^n|\bar{S}_{1,j}^{(m)})$ gives $m = \eta_k$ and hence $y^n \in \mathcal{T}_{j|1}(k)$. In the latter case, $y^n$ is included in $\mathcal{T}_{j|1}(\ell_{1,j}-1)$. The proof is thus completed. □
In light of Proposition 3, we can apply the ratio-sum inequality to obtain
$\frac{P_{X^n,Y^n}(c_1, \mathcal{T}_{j|1})}{P_{X^n,Y^n}(c_1, \mathcal{N}_{j|1})} \le \frac{\sum_{k=0 : \mathcal{T}_{j|1}(k) \ne \emptyset}^{\ell_{1,j}-1} P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(k)\big)}{\sum_{k=0 : \mathcal{T}_{j|1}(k) \ne \emptyset}^{\ell_{1,j}-1} P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(k)\big)}$ (53)
$\le \max_{k \in [\ell_{1,j}-1] \cup \{0\} : \mathcal{T}_{j|1}(k) \ne \emptyset} \frac{P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(k)\big)}{P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(k)\big)}$. (54)
We continue with the construction of a finer partition of $\mathcal{T}_{j|1}(k)$ and of the corresponding disjoint subsets of $\mathcal{N}_{j|1}(k)$ in Proposition 4, after giving the next definition.
Definition 4.
Define for $u^n \in \mathcal{T}_{j|1}(k)$,
$\mathcal{T}_{j|1}(u^n;k) \triangleq \Big\{y^n \in \mathcal{T}_{j|1}(k) : d\big(u^n,y^n \big| (\bar{S}_{1,j}^{(\eta_k)})^c\big) = 0\Big\}$; (55a)
$\mathcal{N}_{j|1}(u^n;k) \triangleq \Big\{y^n \in \mathcal{N}_{j|1}(k) : d\big(u^n,y^n \big| (\bar{S}_{1,j}^{(\eta_k)})^c\big) = 0\Big\}$, (55b)
where $\eta_k$ is given in (49).
Note from Definition 4 that for an element $u^n$ in non-empty $\mathcal{T}_{j|1}(k)$, we can find the group of elements that have bit components identical to those of $u^n$ at the indices in $(\bar{S}_{1,j}^{(\eta_k)})^c$. We denote this group by $\mathcal{T}_{j|1}(u^n;k)$. We continue this grouping manipulation until all elements in $\mathcal{T}_{j|1}(k)$ are exhausted, as summarized below.
Proposition 4.
For non-empty $\mathcal{T}_{j|1}(k)$, there exists a representative subset $\mathcal{U}_{j|1}(k) \subseteq \mathcal{T}_{j|1}(k)$ such that the following two properties hold.
(i) 
$\{\mathcal{T}_{j|1}(u^n;k)\}_{u^n \in \mathcal{U}_{j|1}(k)}$ forms a (non-empty) partition of $\mathcal{T}_{j|1}(k)$;
(ii) 
$\{\mathcal{N}_{j|1}(u^n;k)\}_{u^n \in \mathcal{U}_{j|1}(k)}$ is a collection of (non-empty) disjoint subsets of $\mathcal{N}_{j|1}(k)$.
Since the above proposition can be directly verified via the sequential selection of each representative $u^n$ from $\mathcal{T}_{j|1}(k)$, we omit its proof. Interested readers can find the details in [1] (Section III-C).
From Proposition 4, using again the ratio-sum inequality, we obtain that for non-empty $\mathcal{T}_{j|1}(k)$,
$\frac{P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(k)\big)}{P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(k)\big)} \le \frac{\sum_{u^n \in \mathcal{U}_{j|1}(k)} P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(u^n;k)\big)}{\sum_{u^n \in \mathcal{U}_{j|1}(k)} P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(u^n;k)\big)}$ (56)
$\le \max_{u^n \in \mathcal{U}_{j|1}(k)} \frac{P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(u^n;k)\big)}{P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(u^n;k)\big)}$. (57)
Noting that the above derivation can be similarly conducted for general $i > 1$, we combine (42), (54) and (57) to conclude that
$\frac{\delta_n}{b_n} \le 2 \max_{i \in [M] \text{ and } j \in [M] \setminus \{i\} : \mathcal{T}_{j|i} \ne \emptyset}\ \max_{k \in [\ell_{i,j}-1] \cup \{0\} : \mathcal{T}_{j|i}(k) \ne \emptyset}\ \max_{u^n \in \mathcal{U}_{j|i}(k)} \frac{P_{X^n,Y^n}\big(c_i, \mathcal{T}_{j|i}(u^n;k)\big)}{P_{X^n,Y^n}\big(c_i, \mathcal{N}_{j|i}(u^n;k)\big)}$. (58)
The final task is to evaluate $P_{X^n,Y^n}(c_i, \mathcal{T}_{j|i}(u^n;k)) / P_{X^n,Y^n}(c_i, \mathcal{N}_{j|i}(u^n;k))$ in order to characterize a linear upper bound for $\delta_n/b_n$.

3.4. Characterization of a Linear Upper Bound for $\delta_n/b_n$

We again focus on $i = 1$ with $c_1$ being the all-zero codeword for simplicity. The definitions of $\mathcal{T}_{j|1}(u^n;k)$ in (55a) and $\mathcal{N}_{j|1}(u^n;k)$ in (55b) indicate that, when dealing with the ratio $P_{X^n,Y^n}(c_1, \mathcal{T}_{j|1}(u^n;k)) / P_{X^n,Y^n}(c_1, \mathcal{N}_{j|1}(u^n;k))$, we only need to consider the bits with indices in $\bar{S}_{1,j}^{(\eta_k)}$, because the remaining bits of all tuples in $\mathcal{T}_{j|1}(u^n;k)$ and $\mathcal{N}_{j|1}(u^n;k)$ take values identical to those of $u^n$. Since all $|\mathcal{T}_{j|1}(u^n;k)|$ elements of $\mathcal{T}_{j|1}(u^n;k)$ have exactly $k$ ones with indices in $\bar{S}_{1,j}^{(\eta_k)}$, and all $|\mathcal{N}_{j|1}(u^n;k)|$ elements of $\mathcal{N}_{j|1}(u^n;k)$ have exactly $k+1$ ones there, we can immediately infer that
$\frac{P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(u^n;k)\big)}{P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(u^n;k)\big)} = \frac{P_{X^n}(c_1) \cdot P_{Y^n|X^n}\big(\mathcal{T}_{j|1}(u^n;k) \big| c_1\big)}{P_{X^n}(c_1) \cdot P_{Y^n|X^n}\big(\mathcal{N}_{j|1}(u^n;k) \big| c_1\big)} = \frac{(1-p)}{p} \cdot \frac{|\mathcal{T}_{j|1}(u^n;k)|}{|\mathcal{N}_{j|1}(u^n;k)|}$. (59)
The cardinalities of $\mathcal{T}_{j|1}(u^n;k)$ and $\mathcal{N}_{j|1}(u^n;k)$ then determine the ratio in (59), as verified in the next proposition, based on which the proof of Theorem 2 can be completed from (58).
Proposition 5.
For $u^n \in \mathcal{T}_{j|1}(k)$, we have
$\frac{P_{X^n,Y^n}\big(c_1, \mathcal{T}_{j|1}(u^n;k)\big)}{P_{X^n,Y^n}\big(c_1, \mathcal{N}_{j|1}(u^n;k)\big)} \le \frac{(1-p)}{p}\, n$. (60)
Proof. 
Recall from (22a), (52a) and (55a) that $y^n \in \mathcal{T}_{j|1}(u^n;k) \subseteq \mathcal{T}_{j|1}(k) \subseteq \mathcal{T}_{j|1}$ if and only if
$P_{X^n,Y^n}(c_1,y^n) = P_{X^n,Y^n}(c_j,y^n) = \max_{h \in [M] \setminus \{1\}} P_{X^n,Y^n}(c_h,y^n)$ and $d(c_1,y^n|S_{1,j}) < |S_{1,j}|$; (61a)
$\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1 \le d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big)$ and $d\big(c_1,y^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = k$; (61b)
$d\big(u^n,y^n \big| (\bar{S}_{1,j}^{(\eta_k)})^c\big) = 0$. (61c)
Thus, the number of elements in $\mathcal{T}_{j|1}(u^n;k)$ is exactly the number of channel outputs $y^n$ fulfilling the above three conditions. We then examine the number of $y^n$ satisfying (61b) and (61c). Noting that such $y^n$ have either $\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1$ ones or $\bar{\ell}_{1,j}^{(\eta_k - 1)}$ ones with indices in $\bar{S}_{1,j}^{(\eta_k - 1)}$, we know that there are at most
$\binom{\bar{\ell}_{1,j}^{(\eta_k - 1)}}{\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1} \binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k - (\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1)} + \binom{\bar{\ell}_{1,j}^{(\eta_k - 1)}}{\bar{\ell}_{1,j}^{(\eta_k - 1)}} \binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k - \bar{\ell}_{1,j}^{(\eta_k - 1)}}$ (62)
such $y^n$ tuples. Disregarding (61a), we get that the number of elements in $\mathcal{T}_{j|1}(u^n;k)$ is upper-bounded by (62).
On the other hand, from (22b), (52b) and (55b), we obtain that $w^n \in \mathcal{N}_{j|1}(u^n;k) \subseteq \mathcal{N}_{j|1}(k) \subseteq \mathcal{N}_{j|1}$ if and only if
$P_{X^n,Y^n}(c_1,w^n) \cdot q^2 = P_{X^n,Y^n}(c_j,w^n)$; (63a)
$P_{X^n,Y^n}(c_1,w^n) \cdot q^2 \ne P_{X^n,Y^n}(c_r,w^n)$ for $r \in [j-1] \setminus \{1\}$; (63b)
$\bar{\ell}_{1,j}^{(\eta_k - 1)} = d\big(c_1,w^n \big| \bar{S}_{1,j}^{(\eta_k - 1)}\big)$ and $d\big(c_1,w^n \big| \bar{S}_{1,j}^{(\eta_k)}\big) = k + 1$; (63c)
$d\big(u^n,w^n \big| (\bar{S}_{1,j}^{(\eta_k)})^c\big) = 0$. (63d)
We then claim that any $w^n$ satisfying (63c) and (63d) directly validates (63a) and (63b). The validity of this claim, which we prove in Appendix B, immediately implies that the number of elements in $\mathcal{N}_{j|1}(u^n;k)$ is determined by (63c) and (63d) alone, and hence
$|\mathcal{N}_{j|1}(u^n;k)| = \binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k + 1 - \bar{\ell}_{1,j}^{(\eta_k - 1)}}$. (64)
Under this claim, (62) and (64) result in
$\frac{|\mathcal{T}_{j|1}(u^n;k)|}{|\mathcal{N}_{j|1}(u^n;k)|} \le \frac{\binom{\bar{\ell}_{1,j}^{(\eta_k - 1)}}{\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1} \binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k - (\bar{\ell}_{1,j}^{(\eta_k - 1)} - 1)} + \binom{\bar{\ell}_{1,j}^{(\eta_k - 1)}}{\bar{\ell}_{1,j}^{(\eta_k - 1)}} \binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k - \bar{\ell}_{1,j}^{(\eta_k - 1)}}}{\binom{\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{k + 1 - \bar{\ell}_{1,j}^{(\eta_k - 1)}}}$ (65)
$= \bar{\ell}_{1,j}^{(\eta_k - 1)} + \frac{k + 1 - \bar{\ell}_{1,j}^{(\eta_k - 1)}}{\bar{\ell}_{1,j}^{(\eta_k)} - k}$ (66)
$\le \bar{\ell}_{1,j}^{(\eta_k - 1)} + \big(\bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k - 1)}\big)$ (67)
$\le n$, (68)
where (67) holds because $k \le \bar{\ell}_{1,j}^{(\eta_k)} - 1$ by (49), and (68) follows from $\bar{\ell}_{1,j}^{(\eta_k)} \le \ell_{1,j} \le n$. The proof of the proposition is thus completed by combining (59) and (68). □
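The chain (65)-(68) can also be spot-checked numerically: the sketch below sweeps all small admissible triples of cumulative sizes and confirms that the ratio of the count in (62) to the count in (64) never exceeds $n$. The swept ranges are illustrative choices.

```python
# Numerical spot-check of the counting bound chain (65)-(68).
from math import comb

def C(a, b):
    """Binomial coefficient, extended by 0 outside the range 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def t_over_n_ratio(ell_prev, ell_cur, k):
    """Bound (62) on |T_{j|1}(u^n;k)| divided by |N_{j|1}(u^n;k)| from (64)."""
    L = ell_cur - ell_prev                    # |S_{1,j}^{(eta_k)}|
    num = (C(ell_prev, ell_prev - 1) * C(L, k - (ell_prev - 1))
           + C(ell_prev, ell_prev) * C(L, k - ell_prev))
    den = C(L, k + 1 - ell_prev)              # always >= 1 in the swept range
    return num / den

for n in range(1, 9):
    for ell_cur in range(1, n + 1):           # ell_bar^(eta_k) <= ell_{1,j} <= n
        for ell_prev in range(ell_cur):       # ell_bar^(eta_k - 1) < ell_bar^(eta_k)
            for k in range(max(ell_prev - 1, 0), ell_cur):
                assert t_over_n_ratio(ell_prev, ell_cur, k) <= n
print("counting bound verified on all small cases")
```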

4. Conclusions

In this paper, we analyzed the error probability of block codes sent over the memoryless BSC under an arbitrary (not necessarily uniform) input distribution and used in conjunction with (optimal) MAP decoding. We showed that decoder ties do not affect the error exponent of the probability of error, thus extending a similar result recently established in [1] for uniformly distributed channel inputs. This result was obtained by proving that the ratio of the probability of MAP decoder ties to the probability of tie-free decoding error grows no more than linearly in blocklength, directly implying that decoder ties have only a sub-exponential effect on the error probability as the blocklength grows without bound. Future work includes extending this result to more general channels used under arbitrary input statistics, such as non-binary symmetric channels (note that the result of Theorem 1 has been extended to non-binary ($q$-ary, $q > 2$) codes sent over $q$-ary symmetric memoryless channels under a uniform input distribution; see [25] (Theorem 2)) and binary non-symmetric channels. Studying how to sharpen the upper bound derived in (10) for "sufficiently good" codes, as highlighted in Remark 1, and for codes with small blocklengths are other worthwhile future directions.

Author Contributions

Conceptualization, L.-H.C., P.-N.C. and F.A.; Formal analysis, L.-H.C.; Writing—original draft, L.-H.C.; Writing—review & editing, P.-N.C. and F.A. All authors have read and agreed to the published version of the manuscript.

Funding

The work of Ling-Hua Chang is supported by the Ministry of Science and Technology, Taiwan under Grant MOST 109-2221-E-155-035-MY3. The work of Po-Ning Chen is supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2221-E-A49-024-MY3. The work of Fady Alajaji is supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Supplement to Example 1

Under the distribution
$P_{X^4}(c_1) = \frac{q^2}{2 + q^2 + q^{-2}}$, (A1)
$P_{X^4}(c_2) = P_{X^4}(c_3) = \frac{1}{2 + q^2 + q^{-2}}$, (A2)
$P_{X^4}(c_4) = \frac{q^{-2}}{2 + q^2 + q^{-2}}$ (A3)
over the code $\mathcal{C}_4 = \{c_1, c_2, c_3, c_4\} = \{0000, 0101, 0110, 0111\}$, we obtain:
$\mathcal{T}_1 = \Big\{y^4 \in \{0,1\}^4 : P_{X^4,Y^4}(c_1,y^4) = \max_{r \in [4] \setminus \{1\}} P_{X^4,Y^4}(c_r,y^4)\Big\}$ (A4)
$= \Big\{y^4 \in \{0,1\}^4 : P_{X^4}(c_1)\, q^{-d(c_1,y^4)} = \max\big\{P_{X^4}(c_2)\, q^{-d(c_2,y^4)}, P_{X^4}(c_3)\, q^{-d(c_3,y^4)}, P_{X^4}(c_4)\, q^{-d(c_4,y^4)}\big\}\Big\}$ (A5)
$= \Big\{y^4 \in \{0,1\}^4 : d(c_1,y^4) - 2 = \min\big\{d(c_2,y^4), d(c_3,y^4), d(c_4,y^4) + 2\big\}\Big\}$ (A6)
$= \big\{0101, 0110, 0111, 1101, 1110, 1111\big\}$, (A7)
where (A5) follows from (25), and
$\mathcal{T}_{j|1} \triangleq \Big\{y^4 \in \mathcal{T}_1 : j = \min\big\{r \in \mathcal{I}_1(y^4) : d(c_1,y^4|S_{1,r}) < |S_{1,r}|\big\}\Big\} = \emptyset$ for $j = 2, 3, 4$, (A8)
and
$\overline{\mathcal{T}}_{j|1} \triangleq \Big\{y^4 \in \mathcal{T}_1 \setminus \bigcup_{h \in [4] \setminus \{1\}} \mathcal{T}_{h|1} : j = \min\{r \in \mathcal{I}_1(y^4)\}\Big\}$ (A9)
$= \begin{cases} \{0101, 0111, 1101, 1111\}, & j = 2; \\ \{0110, 1110\}, & j = 3; \\ \emptyset, & j = 4. \end{cases}$ (A10)
The above derivations are verified via Table A1. Continuing with the same setting, we obtain
$\mathcal{T}_2 = \Big\{y^4 \in \{0,1\}^4 : P_{X^4,Y^4}(c_2,y^4) = \max_{r \in [4] \setminus \{2\}} P_{X^4,Y^4}(c_r,y^4)\Big\}$ (A11)
$= \Big\{y^4 \in \{0,1\}^4 : P_{X^4}(c_2)\, q^{-d(c_2,y^4)} = \max\big\{P_{X^4}(c_1)\, q^{-d(c_1,y^4)}, P_{X^4}(c_3)\, q^{-d(c_3,y^4)}, P_{X^4}(c_4)\, q^{-d(c_4,y^4)}\big\}\Big\}$ (A12)
$= \Big\{y^4 \in \{0,1\}^4 : d(c_2,y^4) = \min\big\{d(c_1,y^4) - 2, d(c_3,y^4), d(c_4,y^4) + 2\big\}\Big\}$ (A13)
$= \big\{0101, 0111, 1101, 1111\big\}$, (A14)
$\mathcal{T}_{j|2} \triangleq \Big\{y^4 \in \mathcal{T}_2 : j = \min\big\{r \in \mathcal{I}_2(y^4) : d(c_2,y^4|S_{2,r}) < |S_{2,r}|\big\}\Big\} = \begin{cases} \mathcal{T}_2, & j = 1; \\ \emptyset, & j = 3, 4, \end{cases}$ (A15)
and
$\overline{\mathcal{T}}_{j|2} \triangleq \Big\{y^4 \in \mathcal{T}_2 \setminus \bigcup_{h \in [4] \setminus \{2\}} \mathcal{T}_{h|2} : j = \min\{r \in \mathcal{I}_2(y^4)\}\Big\} = \emptyset$ for $j = 1, 3, 4$, (A16)
where the above derivations are also confirmed via Table A1. Based on Table A1, we further have
$\mathcal{T}_3 = \Big\{y^4 \in \{0,1\}^4 : P_{X^4,Y^4}(c_3,y^4) = \max_{r \in [4] \setminus \{3\}} P_{X^4,Y^4}(c_r,y^4)\Big\}$ (A17)
$= \Big\{y^4 \in \{0,1\}^4 : P_{X^4}(c_3)\, q^{-d(c_3,y^4)} = \max\big\{P_{X^4}(c_1)\, q^{-d(c_1,y^4)}, P_{X^4}(c_2)\, q^{-d(c_2,y^4)}, P_{X^4}(c_4)\, q^{-d(c_4,y^4)}\big\}\Big\}$ (A18)
$= \Big\{y^4 \in \{0,1\}^4 : d(c_3,y^4) = \min\big\{d(c_1,y^4) - 2, d(c_2,y^4), d(c_4,y^4) + 2\big\}\Big\}$ (A19)
$= \big\{0110, 0111, 1110, 1111\big\}$, (A20)
$\mathcal{T}_{j|3} \triangleq \Big\{y^4 \in \mathcal{T}_3 : j = \min\big\{r \in \mathcal{I}_3(y^4) : d(c_3,y^4|S_{3,r}) < |S_{3,r}|\big\}\Big\} = \begin{cases} \mathcal{T}_3, & j = 1; \\ \emptyset, & j = 2, 4, \end{cases}$ (A21)
and
$\overline{\mathcal{T}}_{j|3} \triangleq \Big\{y^4 \in \mathcal{T}_3 \setminus \bigcup_{h \in [4] \setminus \{3\}} \mathcal{T}_{h|3} : j = \min\{r \in \mathcal{I}_3(y^4)\}\Big\} = \emptyset$ for $j = 1, 2, 4$. (A22)
Furthermore, we establish from Table A1 that
$\mathcal{T}_4 = \Big\{y^4 \in \{0,1\}^4 : P_{X^4,Y^4}(c_4,y^4) = \max_{r \in [4] \setminus \{4\}} P_{X^4,Y^4}(c_r,y^4)\Big\}$ (A23)
$= \Big\{y^4 \in \{0,1\}^4 : P_{X^4}(c_4)\, q^{-d(c_4,y^4)} = \max\big\{P_{X^4}(c_1)\, q^{-d(c_1,y^4)}, P_{X^4}(c_2)\, q^{-d(c_2,y^4)}, P_{X^4}(c_3)\, q^{-d(c_3,y^4)}\big\}\Big\}$ (A24)
$= \Big\{y^4 \in \{0,1\}^4 : d(c_4,y^4) + 2 = \min\big\{d(c_1,y^4) - 2, d(c_2,y^4), d(c_3,y^4)\big\}\Big\}$ (A25)
$= \emptyset$, (A26)
$\mathcal{T}_{j|4} \triangleq \Big\{y^4 \in \mathcal{T}_4 : j = \min\big\{r \in \mathcal{I}_4(y^4) : d(c_4,y^4|S_{4,r}) < |S_{4,r}|\big\}\Big\} = \emptyset$ for $j = 1, 2, 3$, (A27)
and
$\overline{\mathcal{T}}_{j|4} \triangleq \Big\{y^4 \in \mathcal{T}_4 \setminus \bigcup_{h \in [4] \setminus \{4\}} \mathcal{T}_{h|4} : j = \min\{r \in \mathcal{I}_4(y^4)\}\Big\} = \emptyset$ for $j = 1, 2, 3$. (A28)
After summarizing all the sets derived above in Table A2, we remark that
$\overline{\mathcal{T}}_{2|1} \subseteq \mathcal{T}_{1|2}$ with $P_{X^4,Y^4}(c_1,y^4) = P_{X^4,Y^4}(c_2,y^4)$ for every $y^4 \in \overline{\mathcal{T}}_{2|1}$, and $\overline{\mathcal{T}}_{3|1} \subseteq \mathcal{T}_{1|3}$ with $P_{X^4,Y^4}(c_1,y^4) = P_{X^4,Y^4}(c_3,y^4)$ for every $y^4 \in \overline{\mathcal{T}}_{3|1}$. (A29)
Note that $\{\overline{\mathcal{T}}_{j|i}\}_{i \in [4], j \in [4] \setminus \{i\}}$ are disjoint, as confirmed in Remark 4, so that every element $y^4 \in \overline{\mathcal{T}}_{j|i}$ appears only once in the following summation:
$\sum_{i \in [4]} \sum_{j \in [4] \setminus \{i\}} P_{X^4,Y^4}\big(c_i, \overline{\mathcal{T}}_{j|i}\big) = P_{X^4,Y^4}(c_1, 0101) + P_{X^4,Y^4}(c_1, 0110) + P_{X^4,Y^4}(c_1, 0111) + P_{X^4,Y^4}(c_1, 1101) + P_{X^4,Y^4}(c_1, 1110) + P_{X^4,Y^4}(c_1, 1111)$ (A30)
$= \frac{p^4 q^2}{2 + q^2 + q^{-2}}\big(q^2 + q^2 + q + q + q + 1\big)$ (A31)
$= \frac{p^4 q^2}{2 + q^2 + q^{-2}}\big(2q^2 + 3q + 1\big)$. (A32)
Additionally,
$\sum_{i \in [4]} \sum_{j \in [4] \setminus \{i\}} P_{X^4,Y^4}\big(c_i, \mathcal{T}_{j|i}\big) = P_{X^4,Y^4}(c_2, 0101) + P_{X^4,Y^4}(c_2, 0111) + P_{X^4,Y^4}(c_2, 1101) + P_{X^4,Y^4}(c_2, 1111) + P_{X^4,Y^4}(c_3, 0110) + P_{X^4,Y^4}(c_3, 0111) + P_{X^4,Y^4}(c_3, 1110) + P_{X^4,Y^4}(c_3, 1111)$ (A33)
$= \frac{p^4}{2 + q^2 + q^{-2}}\big(q^4 + q^3 + q^3 + q^2 + q^4 + q^3 + q^3 + q^2\big)$ (A34)
$= \sum_{i \in [4]} \sum_{j \in [4] \setminus \{i\}} P_{X^4,Y^4}\big(c_i, \overline{\mathcal{T}}_{j|i}\big) + \frac{p^4 q^2}{2 + q^2 + q^{-2}}\big(q + 1\big)$. (A35)
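The two sums above can be re-derived mechanically; the following sketch confirms the equality in (A35) and the inequality (40) of Example 2 using exact rational arithmetic, with the arbitrary admissible choice $p = 1/5$.

```python
# Verification of the Appendix A sums (A30)-(A35) with exact rationals.
from fractions import Fraction

p = Fraction(1, 5)                 # assumed admissible crossover probability
q = (1 - p) / p
Z = 2 + q**2 + q**-2

companion_sum = p**4 * q**2 / Z * (2 * q**2 + 3 * q + 1)        # (A32)
main_sum = p**4 / Z * (2 * q**4 + 4 * q**3 + 2 * q**2)          # (A34)
assert main_sum == companion_sum + p**4 * q**2 / Z * (q + 1)    # (A35)
assert companion_sum <= main_sum                                # inequality (40)
print(float(companion_sum), float(main_sum))
```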
Finally, we have
$\mathcal{N}_1 = \Big\{y^4 \in \{0,1\}^4 : P_{X^4}(c_1)\, q^{-d(c_1,y^4)} < \max\big\{P_{X^4}(c_2)\, q^{-d(c_2,y^4)}, P_{X^4}(c_3)\, q^{-d(c_3,y^4)}, P_{X^4}(c_4)\, q^{-d(c_4,y^4)}\big\}\Big\}$ (A36)
$= \Big\{y^4 \in \{0,1\}^4 : d(c_1,y^4) - 2 > \min\big\{d(c_2,y^4), d(c_3,y^4), d(c_4,y^4) + 2\big\}\Big\}$ (A37)
$= \emptyset$, (A38)
$\mathcal{N}_2 = \Big\{y^4 \in \{0,1\}^4 : d(c_2,y^4) > \min\big\{d(c_1,y^4) - 2, d(c_3,y^4), d(c_4,y^4) + 2\big\}\Big\} = \{0,1\}^4 \setminus \mathcal{T}_2$, (A39)
$\mathcal{N}_3 = \Big\{y^4 \in \{0,1\}^4 : d(c_3,y^4) > \min\big\{d(c_1,y^4) - 2, d(c_2,y^4), d(c_4,y^4) + 2\big\}\Big\} = \{0,1\}^4 \setminus \mathcal{T}_3$, (A40)
$\mathcal{N}_4 = \Big\{y^4 \in \{0,1\}^4 : d(c_4,y^4) + 2 > \min\big\{d(c_1,y^4) - 2, d(c_2,y^4), d(c_3,y^4)\big\}\Big\} = \{0,1\}^4$, (A41)
$\mathcal{N}_{j|1} = \emptyset$ for $j = 2, 3, 4$, (A42)
$\mathcal{N}_{j|2} = \Big\{y^4 \in \mathcal{N}_2 : P_{X^4,Y^4}(c_2,y^4) \cdot q = P_{X^4,Y^4}(c_j,y^4) \cdot \frac{1}{q}$ and $P_{X^4,Y^4}(c_j,y^4) \ne P_{X^4,Y^4}(c_r,y^4)$ for $r \in [j-1] \setminus \{2\}\Big\}$ (A43)
$= \Big\{y^4 \in \mathcal{N}_2 : P_{X^4}(c_2)\, q^{-(d(c_2,y^4)-1)} = P_{X^4}(c_j)\, q^{-(d(c_j,y^4)+1)}$ and $P_{X^4}(c_j)\, q^{-d(c_j,y^4)} \ne P_{X^4}(c_r)\, q^{-d(c_r,y^4)}$ for $r \in [j-1] \setminus \{2\}\Big\}$ (A44)
$= \begin{cases} \big\{y^4 \in \mathcal{N}_2 : P_{X^4}(c_2)\, q^{-(d(c_2,y^4)-1)} = P_{X^4}(c_1)\, q^{-(d(c_1,y^4)+1)}\big\}, & j = 1; \\ \big\{y^4 \in \mathcal{N}_2 : P_{X^4}(c_2)\, q^{-(d(c_2,y^4)-1)} = P_{X^4}(c_3)\, q^{-(d(c_3,y^4)+1)} \text{ and } P_{X^4}(c_3)\, q^{-d(c_3,y^4)} \ne P_{X^4}(c_1)\, q^{-d(c_1,y^4)}\big\}, & j = 3; \\ \big\{y^4 \in \mathcal{N}_2 : P_{X^4}(c_2)\, q^{-(d(c_2,y^4)-1)} = P_{X^4}(c_4)\, q^{-(d(c_4,y^4)+1)} \text{ and } P_{X^4}(c_4)\, q^{-d(c_4,y^4)} \ne P_{X^4}(c_r)\, q^{-d(c_r,y^4)} \text{ for } r \in [3] \setminus \{2\}\big\}, & j = 4 \end{cases}$ (A45)
$= \begin{cases} \big\{y^4 \in \mathcal{N}_2 : d(c_2,y^4) = d(c_1,y^4)\big\}, & j = 1; \\ \big\{y^4 \in \mathcal{N}_2 : d(c_2,y^4) = d(c_3,y^4) + 2 \text{ and } d(c_2,y^4) \ne d(c_1,y^4)\big\}, & j = 3; \\ \big\{y^4 \in \mathcal{N}_2 : d(c_2,y^4) = d(c_4,y^4) + 4,\ d(c_2,y^4) \ne d(c_1,y^4) \text{ and } d(c_2,y^4) \ne d(c_3,y^4) + 2\big\}, & j = 4 \end{cases}$ (A46)
$= \begin{cases} \big\{0001, 0100, 0011, 0110, 1001, 1100, 1011, 1110\big\}, & j = 1; \\ \big\{0010, 1010\big\}, & j = 3; \\ \emptyset, & j = 4, \end{cases}$ (A47)
$\mathcal{N}_{j|3} = \Big\{y^4 \in \mathcal{N}_3 : P_{X^4,Y^4}(c_3,y^4) \cdot q = P_{X^4,Y^4}(c_j,y^4) \cdot \frac{1}{q}$ and $P_{X^4,Y^4}(c_j,y^4) \ne P_{X^4,Y^4}(c_r,y^4)$ for $r \in [j-1] \setminus \{3\}\Big\}$ (A48)
$= \begin{cases} \big\{y^4 \in \mathcal{N}_3 : P_{X^4}(c_3)\, q^{-(d(c_3,y^4)-1)} = P_{X^4}(c_1)\, q^{-(d(c_1,y^4)+1)}\big\}, & j = 1; \\ \big\{y^4 \in \mathcal{N}_3 : P_{X^4}(c_3)\, q^{-(d(c_3,y^4)-1)} = P_{X^4}(c_2)\, q^{-(d(c_2,y^4)+1)} \text{ and } P_{X^4}(c_2)\, q^{-d(c_2,y^4)} \ne P_{X^4}(c_1)\, q^{-d(c_1,y^4)}\big\}, & j = 2; \\ \big\{y^4 \in \mathcal{N}_3 : P_{X^4}(c_3)\, q^{-(d(c_3,y^4)-1)} = P_{X^4}(c_4)\, q^{-(d(c_4,y^4)+1)} \text{ and } P_{X^4}(c_4)\, q^{-d(c_4,y^4)} \ne P_{X^4}(c_r)\, q^{-d(c_r,y^4)} \text{ for } r \in [3] \setminus \{3\}\big\}, & j = 4 \end{cases}$ (A49)
$= \begin{cases} \big\{y^4 \in \mathcal{N}_3 : d(c_3,y^4) = d(c_1,y^4)\big\}, & j = 1; \\ \big\{y^4 \in \mathcal{N}_3 : d(c_3,y^4) = d(c_2,y^4) + 2 \text{ and } d(c_3,y^4) \ne d(c_1,y^4)\big\}, & j = 2; \\ \big\{y^4 \in \mathcal{N}_3 : d(c_3,y^4) = d(c_4,y^4) + 4,\ d(c_3,y^4) \ne d(c_1,y^4) \text{ and } d(c_3,y^4) \ne d(c_2,y^4) + 2\big\}, & j = 4 \end{cases}$ (A50)
$= \begin{cases} \big\{0010, 0100, 0011, 0101, 1010, 1100, 1011, 1101\big\}, & j = 1; \\ \big\{0001, 1001\big\}, & j = 2; \\ \emptyset, & j = 4, \end{cases}$ (A51)
$\mathcal{N}_{j|4} = \Big\{y^4 \in \mathcal{N}_4 : P_{X^4,Y^4}(c_4,y^4) \cdot q = P_{X^4,Y^4}(c_j,y^4) \cdot \frac{1}{q}$ and $P_{X^4,Y^4}(c_j,y^4) \ne P_{X^4,Y^4}(c_r,y^4)$ for $r \in [j-1]\Big\}$ (A52)
$= \begin{cases} \big\{y^4 \in \mathcal{N}_4 : d(c_4,y^4) = d(c_1,y^4) - 2\big\}, & j = 1; \\ \big\{y^4 \in \mathcal{N}_4 : d(c_4,y^4) = d(c_2,y^4) \text{ and } d(c_4,y^4) \ne d(c_1,y^4) - 2\big\}, & j = 2; \\ \big\{y^4 \in \mathcal{N}_4 : d(c_4,y^4) = d(c_3,y^4),\ d(c_4,y^4) \ne d(c_1,y^4) - 2 \text{ and } d(c_4,y^4) \ne d(c_2,y^4)\big\}, & j = 3 \end{cases}$ (A53)
$= \emptyset$ for $j = 1, 2, 3$. (A54)
Table A1. Measures used in Example 1.

y^4    | d(0000,y^4)-2 | d(0101,y^4) | d(0110,y^4) | d(0111,y^4)+2 | I_1(y^4) | I_2(y^4) | I_3(y^4) | I_4(y^4)
0000   | -2            | 2           | 2           | 5             |          |          |          |
0001   | -1            | 1           | 3           | 4             |          |          |          |
0010   | -1            | 3           | 1           | 4             |          |          |          |
0100   | -1            | 1           | 1           | 4             |          |          |          |
1000   | -1            | 3           | 3           | 6             |          |          |          |
0011   |  0            | 2           | 2           | 3             |          |          |          |
0101   |  0            | 0           | 2           | 3             | {2}      | {1}      |          |
0110   |  0            | 2           | 0           | 3             | {3}      |          | {1}      |
1001   |  0            | 2           | 4           | 5             |          |          |          |
1010   |  0            | 4           | 2           | 5             |          |          |          |
1100   |  0            | 2           | 2           | 5             |          |          |          |
0111   |  1            | 1           | 1           | 2             | {2,3}    | {1,3}    | {1,2}    |
1011   |  1            | 3           | 3           | 4             |          |          |          |
1101   |  1            | 1           | 3           | 4             | {2}      | {1}      |          |
1110   |  1            | 3           | 1           | 4             | {3}      |          | {1}      |
1111   |  2            | 2           | 2           | 3             | {2,3}    | {1,3}    | {1,2}    |
Table A2. List of $\mathcal{T}_i$, $\mathcal{N}_i$, $\mathcal{T}_{j|i}$, $\overline{\mathcal{T}}_{j|i}$ and $\mathcal{N}_{j|i}$ for $i \in [4]$ and $j \in [4] \setminus \{i\}$ in Example 1.

$\mathcal{T}_1 = \{0101, 0110, 0111, 1101, 1110, 1111\}$ | $\mathcal{N}_1 = \emptyset$
$\mathcal{T}_2 = \{0101, 0111, 1101, 1111\}$ | $\mathcal{N}_2 = \{0,1\}^4 \setminus \mathcal{T}_2$
$\mathcal{T}_3 = \{0110, 0111, 1110, 1111\}$ | $\mathcal{N}_3 = \{0,1\}^4 \setminus \mathcal{T}_3$
$\mathcal{T}_4 = \emptyset$ | $\mathcal{N}_4 = \{0,1\}^4$
$\mathcal{T}_{2|1} = \emptyset$ | $\overline{\mathcal{T}}_{2|1} = \{0101, 0111, 1101, 1111\}$ | $\mathcal{N}_{2|1} = \emptyset$
$\mathcal{T}_{3|1} = \emptyset$ | $\overline{\mathcal{T}}_{3|1} = \{0110, 1110\}$ | $\mathcal{N}_{3|1} = \emptyset$
$\mathcal{T}_{4|1} = \emptyset$ | $\overline{\mathcal{T}}_{4|1} = \emptyset$ | $\mathcal{N}_{4|1} = \emptyset$
$\mathcal{T}_{1|2} = \{0101, 0111, 1101, 1111\}$ | $\overline{\mathcal{T}}_{1|2} = \emptyset$ | $\mathcal{N}_{1|2} = \mathcal{N}_2 \setminus \{0000, 0010, 1000, 1010\}$
$\mathcal{T}_{3|2} = \emptyset$ | $\overline{\mathcal{T}}_{3|2} = \emptyset$ | $\mathcal{N}_{3|2} = \{0010, 1010\}$
$\mathcal{T}_{4|2} = \emptyset$ | $\overline{\mathcal{T}}_{4|2} = \emptyset$ | $\mathcal{N}_{4|2} = \emptyset$
$\mathcal{T}_{1|3} = \{0110, 0111, 1110, 1111\}$ | $\overline{\mathcal{T}}_{1|3} = \emptyset$ | $\mathcal{N}_{1|3} = \mathcal{N}_3 \setminus \{0000, 0001, 1000, 1001\}$
$\mathcal{T}_{2|3} = \emptyset$ | $\overline{\mathcal{T}}_{2|3} = \emptyset$ | $\mathcal{N}_{2|3} = \{0001, 1001\}$
$\mathcal{T}_{4|3} = \emptyset$ | $\overline{\mathcal{T}}_{4|3} = \emptyset$ | $\mathcal{N}_{4|3} = \emptyset$
$\mathcal{T}_{1|4} = \emptyset$ | $\overline{\mathcal{T}}_{1|4} = \emptyset$ | $\mathcal{N}_{1|4} = \emptyset$
$\mathcal{T}_{2|4} = \emptyset$ | $\overline{\mathcal{T}}_{2|4} = \emptyset$ | $\mathcal{N}_{2|4} = \emptyset$
$\mathcal{T}_{3|4} = \emptyset$ | $\overline{\mathcal{T}}_{3|4} = \emptyset$ | $\mathcal{N}_{3|4} = \emptyset$

Appendix B. The Proof of the Claim Supporting Proposition 5

We validate the claim that (63c) and (63d) imply (63a) and (63b) via the construction of an auxiliary $v^n \in \mathcal{N}_{j|1}(u^n;k)$ from $u^n \in \mathcal{T}_{j|1}(u^n;k)$. This auxiliary $v^n$ is defined differently according to whether $d(c_1,u^n|\bar{S}_{1,j}^{(\eta_k-1)})$ equals $\bar{\ell}_{1,j}^{(\eta_k-1)}$ or $\bar{\ell}_{1,j}^{(\eta_k-1)} - 1$, as follows.
(i) 
$d(c_1,u^n|\bar{S}_{1,j}^{(\eta_k-1)}) = \bar{\ell}_{1,j}^{(\eta_k-1)}$: In this case, $u^n$ has no zero components with indices in $\bar{S}_{1,j}^{(\eta_k-1)}$. Moreover, $d(c_1,u^n|\bar{S}_{1,j}^{(\eta_k)}) = k \le \bar{\ell}_{1,j}^{(\eta_k)} - 1$ indicates that
$u^n$ has at least one zero component with its index in $\bar{S}_{1,j}^{(\eta_k)} \setminus \bar{S}_{1,j}^{(\eta_k-1)} = S_{1,j}^{(\eta_k)}$. (A55)
Therefore, we arbitrarily flip one zero component of $u^n$ with index in $S_{1,j}^{(\eta_k)}$ to construct a $v^n$ such that
$d(c_1,v^n) = d(c_1,u^n) + 1$ and $d(c_j,v^n) = d(c_j,u^n) - 1$, (A56)
which implies
$P_{X^n,Y^n}(c_1,v^n) = P_{X^n,Y^n}(c_1,u^n) \cdot \frac{1}{q}$ and $P_{X^n,Y^n}(c_j,v^n) = P_{X^n,Y^n}(c_j,u^n) \cdot q$. (A57)
Then, $v^n$ must fulfill (63a), (63c) and (63d) (with $w^n$ replaced by $v^n$), as $u^n$ satisfies (61a), (61b) and (61c). We next declare that $v^n$ also fulfills (63b) and prove this declaration by contradiction.
Proof of the declaration: Suppose there exists an $r \in [j-1] \setminus \{1\}$ satisfying
$P_{X^n,Y^n}(c_1,v^n) \cdot q^2 = P_{X^n,Y^n}(c_r,v^n)$. (A58)
We then recall from (45) that $d(c_1,c_r|S_{1,j}^{(\eta_k)})$ is either $0$ or $|S_{1,j}^{(\eta_k)}|$. Thus, (A58) can be disproved by differentiating two subcases: (1) $d(c_1,c_r|S_{1,j}^{(\eta_k)}) = 0$, and (2) $d(c_1,c_r|S_{1,j}^{(\eta_k)}) = |S_{1,j}^{(\eta_k)}|$. (Since $\bar{\ell}_{1,j}^{(\eta_k-1)} < \bar{\ell}_{1,j}^{(\eta_k)}$, as can be seen from (50) and (51), we have $|S_{1,j}^{(\eta_k)}| = \bar{\ell}_{1,j}^{(\eta_k)} - \bar{\ell}_{1,j}^{(\eta_k-1)} > 0$; i.e., $S_{1,j}^{(\eta_k)}$ is non-empty.)
In Subcase (1), the $v^n$ obtained by flipping a zero component of $u^n$ with index in $S_{1,j}^{(\eta_k)}$ must satisfy $d(c_1,v^n) = d(c_1,u^n) + 1$ and $d(c_r,v^n) = d(c_r,u^n) + 1$, which is equivalent to
$P_{X^n,Y^n}(c_1,v^n) \cdot q = P_{X^n,Y^n}(c_1,u^n)$ and $P_{X^n,Y^n}(c_r,v^n) \cdot q = P_{X^n,Y^n}(c_r,u^n)$. (A59)
Then, (A58) implies
$P_{X^n,Y^n}(c_1,u^n) \cdot q^2 = P_{X^n,Y^n}(c_r,u^n)$. (A60)
Hence,
$P_{X^n,Y^n}(c_1,u^n) < P_{X^n,Y^n}(c_r,u^n) \le \max_{h \in [M] \setminus \{1\}} P_{X^n,Y^n}(c_h,u^n)$, (A61)
and a contradiction to the fact that $u^n \in \mathcal{T}_{j|1}(u^n;k)$ satisfies (61a) (with $y^n$ replaced by $u^n$) is obtained.
In Subcase (2), we note that $d(c_1,c_r|S_{1,j}^{(\eta_k)}) = |S_{1,j}^{(\eta_k)}|$ implies $S_{1,j}^{(\eta_k)} \subseteq S_{1,r}$. Therefore, (A55) leads to
$d(c_1,u^n|S_{1,r}) < |S_{1,r}|$. (A62)
The flipping manipulation on $u^n$ results in $d(c_1,v^n) = d(c_1,u^n) + 1$ and $d(c_r,v^n) = d(c_r,u^n) - 1$, which is equivalent to
$P_{X^n,Y^n}(c_1,v^n) \cdot q = P_{X^n,Y^n}(c_1,u^n)$ and $P_{X^n,Y^n}(c_r,v^n) = P_{X^n,Y^n}(c_r,u^n) \cdot q$. (A63)
Therefore, (A58) implies
$P_{X^n,Y^n}(c_1,u^n) = P_{X^n,Y^n}(c_r,u^n)$, (A64)
which together with $\max_{h \in [M] \setminus \{1\}} P_{X^n,Y^n}(c_h,u^n) = P_{X^n,Y^n}(c_1,u^n)$ and (A62) results in $u^n \in \mathcal{T}_{r|1}$ with $r < j$. This contradicts $u^n \in \mathcal{T}_{j|1}$. Accordingly, $v^n$ must also fulfill (63b); hence, $v^n \in \mathcal{N}_{j|1}(u^n;k)$. This completes the proof of the declaration.
With this auxiliary $v^n$, we are ready to prove that every $w^n$ satisfying (63c) and (63d) also validates (63a) and (63b). Toward this end, we need to prove that
$P_{X^n,Y^n}(c_r,w^n) = P_{X^n,Y^n}(c_r,v^n)$ for all $r \in [M]$. (A65)
Note that
$d\big(w^n,v^n \big| \bar{S}_{1,j}^{(\eta_k-1)}\big) = 0$; (A66a)
$d\big(c_r,w^n \big| S_{1,j}^{(\eta_k)}\big) = d\big(c_r,v^n \big| S_{1,j}^{(\eta_k)}\big)$ for all $r \in [M]$; (A66b)
$d\big(w^n,v^n \big| (\bar{S}_{1,j}^{(\eta_k)})^c\big) = 0$, (A66c)
where (A66a) holds because both $v^n$ and $w^n$ satisfy (63c), implying that all components of $v^n$ and $w^n$ with indices in $\bar{S}_{1,j}^{(\eta_k-1)}$ are equal to one; (A66b) holds because, when considering only those portions with indices in (non-empty) $S_{1,j}^{(\eta_k)}$, $c_r$ has either all ones or all zeros according to (45), and both $w^n$ and $v^n$ have exactly $k + 1 - \bar{\ell}_{1,j}^{(\eta_k-1)}$ ones there according to (63c); and (A66c) is valid since both $v^n$ and $w^n$ satisfy (63d). Based on (A66a)-(A66c), we conclude that $d(c_r,w^n) = d(c_r,v^n)$ for all $r \in [M]$, which implies $P_{Y^n|X^n}(w^n|c_r) = P_{Y^n|X^n}(v^n|c_r)$ (equivalently, $P_{X^n,Y^n}(c_r,w^n) = P_{X^n,Y^n}(c_r,v^n)$) for all $r \in [M]$.
(ii) 
$d(c_1,u^n|\bar{S}_{1,j}^{(\eta_k-1)}) = \bar{\ell}_{1,j}^{(\eta_k-1)} - 1$: In this case, there is exactly one zero component of $u^n$ with its index in $\bar{S}_{1,j}^{(\eta_k-1)}$. Suppose the index of this zero component lies in $S_{1,j}^{(h)} \subseteq \bar{S}_{1,j}^{(\eta_k-1)}$, where $h \le \eta_k - 1$. The flipping manipulation applied to this component leads to a $v^n$ whose components with indices in $\bar{S}_{1,j}^{(\eta_k-1)}$ are all ones. Then, $v^n$ must fulfill (63a), (63c) and (63d), as $u^n$ satisfies (61a), (61b) and (61c). With the components of $c_r$ with indices in (non-empty) $S_{1,j}^{(h)}$ being either all zeros or all ones, the same contradiction argument between (A58) and (A64), with $\eta_k$ replaced by $h$, disproves the validity of (A58) for this $v^n$ and for any $r \in [j-1] \setminus \{1\}$. Therefore, $v^n$ also fulfills (63b), implying $v^n \in \mathcal{N}_{j|1}(u^n;k)$. With this auxiliary $v^n$, we can again verify (A66a)-(A66c) via the same argument. The claim that any $w^n$ satisfying (63c) and (63d) also validates (63a) and (63b) is thus confirmed.

References

  1. Chang, L.H.; Chen, P.N.; Alajaji, F.; Han, Y.S. Decoder Ties Do Not Affect the Error Exponent of the Memoryless Binary Symmetric Channel. IEEE Trans. Inf. Theory 2022, 68, 3501–3510.
  2. Shannon, C.E.; Gallager, R.G.; Berlekamp, E.R. Lower bounds to error probability for coding on discrete memoryless channels—I. Inf. Control 1967, 10, 65–103.
  3. Shannon, C.E.; Gallager, R.G.; Berlekamp, E.R. Lower bounds to error probability for coding on discrete memoryless channels—II. Inf. Control 1967, 10, 522–552.
  4. McEliece, R.J.; Omura, J.K. An improved upper bound on the block coding error exponent for binary-input discrete memoryless channels. IEEE Trans. Inf. Theory 1977, 23, 611–613.
  5. Gallager, R.G. Information Theory and Reliable Communication; Wiley: New York, NY, USA, 1968.
  6. Viterbi, A.J.; Omura, J.K. Principles of Digital Communication and Coding; McGraw-Hill: New York, NY, USA, 1979.
  7. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic Press: New York, NY, USA, 1981.
  8. Blahut, R. Principles and Practice of Information Theory; Addison-Wesley Longman Publishing Co., Inc.: Albany, NY, USA, 1988.
  9. Barg, A.; McGregor, A. Distance distribution of binary codes and the error probability of decoding. IEEE Trans. Inf. Theory 2005, 51, 4237–4246.
  10. Haroutunian, E.A.; Haroutunian, M.E.; Harutyunyan, A.N. Reliability Criteria in Information Theory and in Statistical Hypothesis Testing. In Foundations and Trends in Communications and Information Theory; Now Publishers Inc.: Delft, The Netherlands, 2007; Volume 4, pp. 97–263.
  11. Dalai, M. Lower bounds on the probability of error for classical and classical-quantum channels. IEEE Trans. Inf. Theory 2013, 59, 8027–8056.
  12. Burnashev, M.V. On the BSC reliability function: Expanding the region where it is known exactly. Probl. Inf. Transm. 2015, 51, 307–325.
  13. Csiszár, I. Joint source-channel error exponent. Probl. Control Inf. Theory 1980, 9, 315–328.
  14. Zhong, Y.; Alajaji, F.; Campbell, L. On the joint source-channel coding error exponent for discrete memoryless systems. IEEE Trans. Inf. Theory 2006, 52, 1450–1468.
  15. Alajaji, F.; Phamdo, N.; Fuja, T. Channel codes that exploit the residual redundancy in CELP-encoded speech. IEEE Trans. Speech Audio Process. 1996, 4, 325–336.
  16. Xu, W.; Hagenauer, J.; Hollmann, J. Joint source-channel decoding using the residual redundancy in compressed images. In Proceedings of the International Conference on Communications, Washington, DC, USA, 25–28 February 1996; Volume 1, pp. 142–148.
  17. Hagenauer, J. Source-controlled channel decoding. IEEE Trans. Commun. 1995, 43, 2449–2457.
  18. Goertz, N. Joint Source-Channel Coding of Discrete-Time Signals with Continuous Amplitudes; World Scientific: Singapore, 2007.
  19. Duhamel, P.; Kieffer, M. Joint Source-Channel Decoding: A Cross-Layer Perspective with Applications in Video Broadcasting; Academic Press: Cambridge, MA, USA, 2009.
  20. Fresia, M.; Pérez-Cruz, F.; Poor, H.V.; Verdú, S. Joint source and channel coding. IEEE Signal Process. Mag. 2010, 27, 104–113.
  21. Alajaji, F.; Chen, P.N. An Introduction to Single-User Information Theory; Springer: Berlin/Heidelberg, Germany, 2018.
  22. Chang, L.H.; Chen, P.N.; Alajaji, F.; Han, Y.S. The asymptotic generalized Poor-Verdú bound achieves the BSC error exponent at zero rate. In Proceedings of the IEEE International Symposium on Information Theory, Los Angeles, CA, USA, 21–26 June 2020.
  23. Chen, P.N.; Alajaji, F. A generalized Poor-Verdú error bound for multihypothesis testing. IEEE Trans. Inf. Theory 2012, 58, 311–316.
  24. Poor, H.V.; Verdú, S. A lower bound on the probability of error in multihypothesis testing. IEEE Trans. Inf. Theory 1995, 41, 1992–1994.
  25. Chang, L.H.; Chen, P.N.; Alajaji, F.; Han, Y.S. Tightness of the asymptotic generalized Poor-Verdú error bound for the memoryless symmetric channel. arXiv 2020, arXiv:2007.04080v1.
Figure 1. An illustration, based on the setting in Example 1 for a non-uniformly distributed binary code (with $M = n = 4$) given by $\mathcal{C}_4 = \{c_1, c_2, c_3, c_4\} = \{0000, 0101, 0110, 0111\}$, of the non-empty component subsets of $\mathcal{Y}^n$ defined in Table 1, corresponding to codewords $c_1 = 0000$ (left figure) and $c_2 = 0101$ (right figure).
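To make these component subsets concrete, the short Python sketch below (our illustration, not code from the paper) lists the index sets $S_{i,j}$, i.e., the positions at which codewords $c_i$ and $c_j$ of the Example 1 code differ, together with their sizes $\ell_{i,j}$; indices are reported 1-based to match the paper's convention.

```python
# Example 1 code from Figure 1: C_4 = {0000, 0101, 0110, 0111}.
codebook = ["0000", "0101", "0110", "0111"]
n = len(codebook[0])

# S_{i,j}: indices (1-based) where c_i and c_j differ; l_{i,j} = |S_{i,j}|.
for i in range(len(codebook)):
    for j in range(i + 1, len(codebook)):
        S_ij = [t + 1 for t in range(n) if codebook[i][t] != codebook[j][t]]
        print(f"S_{i+1},{j+1} = {S_ij}, l_{i+1},{j+1} = {len(S_ij)}")
```

For instance, this reports $S_{1,2} = \{2, 4\}$ and $S_{2,4} = \{3\}$, the subsets underlying the left and right panels of Figure 1.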
Table 1. Summary of the main symbols used in this paper.

| Symbol | Description | Defined in |
| --- | --- | --- |
| $[M]$ | A shorthand for $\{1, 2, \ldots, M\}$ | |
| $\mathcal{C}_n$ | The code $\{c_1, c_2, \ldots, c_M\}$, with $c_1$ being the all-zero codeword | |
| $d(u^n, v^n \mid S)$ | The Hamming distance between the portions of $u^n$ and $v^n$ with indices in $S$ | |
| *All terms below are functions of $\mathcal{C}_n$ (this dependence is not explicitly shown to simplify notation).* | | |
| $T_i$ | The set of channel outputs $y^n$ inducing a decoder tie when $c_i$ is sent | (12) |
| $N_i$ | The set of channel outputs $y^n$ leading to a tie-free decoder decision error when $c_i$ is sent | (15) |
| $I_i(y^n)$ | The set $\{m \in [M] \setminus \{i\} : y^n \in T_m\}$ for $y^n \in T_i$ | (21) |
| $S_{i,j}$ | The set of indices at which the components of $c_i$ and $c_j$ differ | |
| $\ell_{i,j}$ | The size of $S_{i,j}$, i.e., $\lvert S_{i,j} \rvert$ | |
| $T_{j \mid i}$ | The subset of $T_i$ consisting of channel outputs $y^n$ such that $j$ is the minimal number $r$ in $I_i(y^n)$ satisfying $d(c_i, y^n \mid S_{i,r}) < \ell_{i,r}$ | (22a) |
| $N_{j \mid i}$ | The subset of $N_i$ consisting of channel outputs $y^n$ that satisfy $P_{X^n,Y^n}(c_i, y^n) \cdot q = P_{X^n,Y^n}(c_j, y^n) \cdot \frac{1}{q}$ and that are not included in $N_{r \mid i}$ for $r \in [j-1] \setminus \{i\}$ | (22b) |
| $\overline{T}_{j \mid i}$ | The subset of $T_i \setminus \bigcup_{h \in [M] \setminus \{i\}} T_{h \mid i}$ consisting of channel outputs $y^n$ such that $j$ is the minimal number in $I_i(y^n)$ | (23) |
| $S_{1,j}(m)$ | The subset of $S_{1,j}$ defined according to whether each index in $S_{1,j}$ is in each of $S_{1,2}, \ldots, S_{1,j-1}, S_{1,j+1}, \ldots, S_{1,M}$ | (43) |
| $\overline{S}_{1,j}(m)$ | The union of $S_{1,j}(1), S_{1,j}(2), \ldots, S_{1,j}(m)$ | (48) |
| $\overline{\ell}_{1,j}(m)$ | The size of $\overline{S}_{1,j}(m)$, i.e., $\lvert \overline{S}_{1,j}(m) \rvert$ | |
| $\eta_k$ | The mapping from $k \in \{0, 1, \ldots, \ell_{1,j} - 1\}$ to $[2^{M-2}]$ used for partitioning $T_{j \mid 1}$ into $\ell_{1,j}$ subsets $\{T_{j \mid 1}(k)\}_{0 \le k < \ell_{1,j}}$ | (49) |
| $T_{j \mid 1}(k)$ | The $k$th part of the partition of $T_{j \mid 1}$, for $k = 0, 1, \ldots, \ell_{1,j} - 1$ | (52a) |
| $N_{j \mid 1}(k)$ | The $k$th subset of $N_{j \mid 1}$, for $k = 0, 1, \ldots, \ell_{1,j} - 1$ | (52b) |
| $U_{j \mid 1}(k)$ | The set of representative elements in $T_{j \mid 1}(k)$ used for partitioning $T_{j \mid 1}(k)$ | |
| $T_{j \mid 1}(u^n; k)$ | The subset of $T_{j \mid 1}(k)$ associated with $u^n \in U_{j \mid 1}(k)$ | (55a) |
| $N_{j \mid 1}(u^n; k)$ | The subset of $N_{j \mid 1}(k)$ associated with $u^n \in U_{j \mid 1}(k)$ | (55b) |
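As a sanity check on the central objects above, the following Python sketch enumerates the channel outputs at which the MAP decoder ties for the Example 1 code. The crossover probability $p$ and the non-uniform prior $P_X$ are hypothetical values chosen so that ties actually occur; exact rational arithmetic sidesteps floating-point comparisons.

```python
import itertools
from fractions import Fraction

# Hypothetical parameters (not from the paper): a BSC with p = 1/3 and a
# non-uniform prior on the Example 1 code C_4 = {0000, 0101, 0110, 0111}.
p = Fraction(1, 3)
codebook = ["0000", "0101", "0110", "0111"]
P_X = [Fraction(4, 9), Fraction(2, 9), Fraction(2, 9), Fraction(1, 9)]
n = len(codebook[0])

def joint(i, y):
    """P_{X^n,Y^n}(c_i, y^n) = P_X(c_i) * p^d * (1-p)^(n-d) for the memoryless BSC."""
    d = sum(a != b for a, b in zip(codebook[i], y))
    return P_X[i] * p**d * (1 - p) ** (n - d)

# A MAP-decoder tie occurs at y^n when two or more codewords attain
# max_m P_{X^n,Y^n}(c_m, y^n).
for y in ("".join(bits) for bits in itertools.product("01", repeat=n)):
    probs = [joint(m, y) for m in range(len(codebook))]
    winners = [m + 1 for m, pr in enumerate(probs) if pr == max(probs)]
    if len(winners) >= 2:
        print(f"y = {y}: tie among codewords {winners}")
```

With these particular choices, ties arise exactly at $y^n \in \{0111, 1111\}$, where codewords $c_2$, $c_3$, and $c_4$ attain the same joint probability.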