# Coded Parallel Transmission for Half-Duplex Distributed Computing


## Abstract


## 1. Introduction

#### 1.1. Related Work

#### 1.2. Our Contribution

## 2. System Model and Problem Definitions

#### 2.1. Network Model

#### 2.2. MapReduce Process Description

#### 2.2.1. Map Phase

#### 2.2.2. Shuffle Phase

**Definition 1.** Define the communication load $L$ as the total number of bits communicated by the $K$ nodes during the Shuffle phase, normalized by $NQT$. Define the communication delay $D$ as the time (in seconds) required in the Shuffle phase for all required contents to be successfully delivered.
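As a small numerical illustration of Definition 1 (a minimal sketch; the helper name is ours, not the paper's):

```python
# Hedged sketch of Definition 1; the function name is an illustrative assumption.

def communication_load(total_bits: float, N: int, Q: int, T: float) -> float:
    """L: total Shuffle-phase bits sent by the K nodes, normalized by N*Q*T."""
    return total_bits / (N * Q * T)

# Example: 1.2e11 bits shuffled with N = Q = 50 files/functions and
# T = 100 Mbits per intermediate value.
L = communication_load(1.2e11, N=50, Q=50, T=100e6)
print(L)  # 0.48
```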

#### 2.2.3. Reduce Phase

**Definition 2.** The achievable execution time of a MapReduce task with parameters $(K,N,Q,r)$, denoted by ${T}_{sum}$, is defined as

#### 2.3. Examples: The Uncoded Scheme and the CDC Scheme

## 3. Main Results

**Theorem 1.**

**Proof.**

## 4. The Proposed Coded Distributed Computing Scheme

**Definition 3.** A set $\mathcal{S}\subseteq \{1,\dots ,K\}$ is called a broadcast group if all nodes in $\mathcal{S}$ can only exchange information with nodes within $\mathcal{S}$. Given an integer $\alpha \ge 1$ and multiple broadcast groups $\{{\mathcal{S}}_{1},{\mathcal{S}}_{2},\dots ,{\mathcal{S}}_{\alpha}\}$ with ${\mathcal{S}}_{i}\cap {\mathcal{S}}_{j}=\emptyset$ for all $i\ne j$, we define $\mathcal{B}=\{{\mathcal{S}}_{1},{\mathcal{S}}_{2},\dots ,{\mathcal{S}}_{\alpha}\}$ as the broadcast set.
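To make the definition concrete, a minimal sketch (our own illustration, not the paper's group-partitioning strategy of Section 4.2) that splits the node set into disjoint broadcast groups of size $r+1$:

```python
# Hedged sketch: form a broadcast set B of disjoint groups of size r + 1.
# This greedy partition only illustrates Definition 3; the paper's actual
# group-partitioning strategy may differ.

def broadcast_set(K: int, r: int) -> list[set[int]]:
    """Partition {1, ..., K} into disjoint broadcast groups of size r + 1
    (leftover nodes, when r + 1 does not divide K, are left ungrouped)."""
    nodes = list(range(1, K + 1))
    size = r + 1
    return [set(nodes[i:i + size]) for i in range(0, K - size + 1, size)]

B = broadcast_set(K=6, r=2)
print(B)  # [{1, 2, 3}, {4, 5, 6}]
# Groups are pairwise disjoint, as Definition 3 requires.
assert all(not (S1 & S2) for i, S1 in enumerate(B) for S2 in B[i + 1:])
```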

#### 4.1. Map Phase

#### 4.2. Shuffle Phase

#### Group Partitioning Strategy

#### 4.3. Data Shuffle Strategy

**Algorithm 1** Distributed computing process of parallel coding in lossless scenarios.

1: ${\pi}_{i}=\left\{{\mathcal{S}}_{i1},{\mathcal{S}}_{i2},\dots ,{\mathcal{S}}_{i{\alpha}_{i}}\right\},\ i\in \{1,\dots ,\beta \}$, with ${\mathcal{S}}_{ij}\subseteq \{1,\dots ,K\}$, ${\mathcal{S}}_{ij}\cap {\mathcal{S}}_{i{j}^{\prime}}=\emptyset$ if $j\ne {j}^{\prime}$, and ${\bigcup}_{j\in \{1,\dots ,{\alpha}_{i}\}}{\mathcal{S}}_{ij}\subseteq \{1,\dots ,K\}$.

2: **for** $i=1,\dots ,\beta$ **do**

3: **for** $\mathcal{S}\in {\pi}_{i}$ **do**

4: Each node $k\in \mathcal{S}$ has computed the intermediate values $\{{v}_{q,n}:q\in \left[Q\right],{w}_{n}\in {\mathcal{M}}_{k}\}$ in the Map phase

5: For each $j\in \mathcal{S}$, let ${v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}=\left\{{v}_{q,n}:q\in {\mathcal{W}}_{j},{w}_{n}\in {\bigcap}_{k\in \mathcal{S}\backslash \left\{j\right\}}{\mathcal{M}}_{k},{w}_{n}\notin {\mathcal{M}}_{j}\right\}$

6: Split ${v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}$ as ${v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}=\left({v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}^{i,k}:i\in {\Gamma}_{\mathcal{S}},k\in \mathcal{S}\backslash \left\{j\right\}\right)$

7: **for** $k\in \mathcal{S}$ **do**

8: Node $k$ sends ${X}_{k,\mathcal{S}}^{i}={\oplus}_{j\in \mathcal{S}\backslash \left\{k\right\}}{v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}^{i,k}$ to the nodes in $\mathcal{S}\backslash \left\{k\right\}$

9: **end for**

10: **for** $j\in \mathcal{S}$ **do**

11: Node $j$ decodes its desired parts $\{{v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}^{i,k}:k\in \mathcal{S}\backslash \left\{j\right\}\}$ as follows: ${v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \left\{j\right\}}^{i,k}=\left({\oplus}_{y\in \mathcal{S}\backslash \{j,k\}}{v}_{{\mathcal{W}}_{y},\mathcal{S}\backslash \left\{y\right\}}^{i,k}\right)\oplus {X}_{k,\mathcal{S}}^{i},\ \forall k\in \mathcal{S}\backslash \left\{j\right\}$

12: **end for**

13: **end for**

14: **end for**

15: **for** $k\in \mathcal{S}$, $\mathcal{S}\subseteq \{1,\dots ,K\}:\left|\mathcal{S}\right|=r+1$ **do**

16: Concatenate $\left({v}_{{\mathcal{W}}_{k},\mathcal{S}\backslash \left\{k\right\}}^{i,j}:i\in {\Gamma}_{\mathcal{S}},j\in \mathcal{S}\backslash \left\{k\right\}\right)\to {v}_{{\mathcal{W}}_{k},\mathcal{S}\backslash \left\{k\right\}}$

17: **end for**
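The encode/decode steps (lines 7–12 of Algorithm 1) can be sketched for a single broadcast group as follows. This is a minimal illustration with our own variable names, assuming equal-length byte-string pieces; `pieces[j][k]` stands in for the piece of ${v}_{{\mathcal{W}}_{j},\mathcal{S}\backslash \{j\}}$ transmitted by node $k$:

```python
# Hedged sketch of the XOR multicast within one broadcast group S (|S| = r + 1).
# pieces[j][k]: the piece needed by node j and sent by node k; all pieces in a
# group must have equal length. Names are illustrative, not from the paper.
import os
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(S, pieces):
    """Line 8: node k multicasts X[k], the XOR of the pieces it holds
    for all the other nodes in S."""
    return {k: reduce(xor, [pieces[j][k] for j in S if j != k]) for k in S}

def decode(S, pieces, X):
    """Line 11: node j cancels the pieces it already knows (those destined
    to other nodes y, which it computed locally) from each received X[k]."""
    out = {}
    for j in S:
        for k in S:
            if k == j:
                continue
            known = [pieces[y][k] for y in S if y not in (j, k)]
            out[(j, k)] = reduce(xor, known + [X[k]])
    return out

# Example with r = 2 (|S| = 3) and random 4-byte pieces.
S = [1, 2, 3]
pieces = {j: {k: os.urandom(4) for k in S if k != j} for j in S}
recovered = decode(S, pieces, encode(S, pieces))
# Every node recovers every piece it needs from the coded multicasts.
assert all(recovered[(j, k)] == pieces[j][k] for j in S for k in S if k != j)
```

Each coded message serves $r$ nodes at once, which is the source of the multicast gain over uncoded unicast shuffling.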

#### 4.4. Reduce Phase

#### 4.5. Analysis of Communication Delay

#### 4.6. Illustrative Examples

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References


**Figure 2.** Variation trend of communication delay $D$ with computation load $r$. The network has $K=50$ computing nodes, and the number of files $N$ and output functions $Q$ are both 50. The transmission rate $C$ is 100 Mbps, and the length of each intermediate value $T$ is 100 Mbits. The figure compares how the communication delay varies with computation load for the uncoded, CDC, and proposed schemes.

**Figure 3.** Variation trend of communication delay $D$ with the number of computing nodes $K$. The computation load is $r=2$, and the number of files $N$ and output functions $Q$ are both 50. The transmission rate $C$ is 100 Mbps, and the length of each intermediate value $T$ is 100 Mbits. The figure shows how the communication delay of uncoded computing, coded distributed computing, and parallel coded computing changes with $K$.

**Figure 4.** Variation trend of communication delay $D$ with the number of computing nodes $K$ and computation load $r$. The number of files $N$ and output functions $Q$ are both 50, the transmission rate $C$ is 100 Mbps, and the length of each intermediate value $T$ is 100 Mbits. The figure shows how the communication delay of uncoded computing, coded distributed computing, and parallel coded computing changes with $K$ and $r$.
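The baseline curves in Figures 2–4 can be reproduced with a short script; this is a minimal sketch in which the load expressions are the standard ones from the CDC literature (Li et al., 2018), the serial half-duplex delay model $D = L \cdot NQT/C$ is our assumption, and the proposed scheme's curve is not reproduced:

```python
# Hedged sketch: delays of the uncoded and CDC baselines under the parameter
# settings of Figures 2-4. Load formulas follow Li et al. (2018); the serial
# delay model D = L*N*Q*T/C is an assumption; the proposed scheme is omitted.

N, Q, T, C = 50, 50, 100e6, 100e6  # files, functions, bits per value, bps

def delay(L: float) -> float:
    return L * N * Q * T / C  # seconds

def L_uncoded(K: int, r: int) -> float:
    return 1 - r / K           # each node misses a (1 - r/K) fraction

def L_cdc(K: int, r: int) -> float:
    return (1 - r / K) / r     # coded multicasting gain of r

K, r = 50, 2
print(delay(L_uncoded(K, r)))  # 2400.0 (uncoded)
print(delay(L_cdc(K, r)))      # 1200.0 (CDC)
```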

**Figure 5.** Example of the coding strategy for simultaneous transmission of two broadcast groups: a distributed computing network of $K=6$ nodes with $N=15$ input files, where each file is stored by $r=2$ nodes and a total of $Q=6$ output functions need to be processed.

**Figure 6.** An example of the coding strategy for the simultaneous transmission of two broadcast groups. Two nodes can send messages at the same time without interference. Solid lines originating from the same point represent a multicast message.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zai, Q.; Yuan, K.; Wu, Y.
Coded Parallel Transmission for Half-Duplex Distributed Computing. *Information* **2022**, *13*, 342.
https://doi.org/10.3390/info13070342

**AMA Style**

Zai Q, Yuan K, Wu Y.
Coded Parallel Transmission for Half-Duplex Distributed Computing. *Information*. 2022; 13(7):342.
https://doi.org/10.3390/info13070342

**Chicago/Turabian Style**

Zai, Qixuan, Kai Yuan, and Youlong Wu.
2022. "Coded Parallel Transmission for Half-Duplex Distributed Computing" *Information* 13, no. 7: 342.
https://doi.org/10.3390/info13070342