# A Shannon-Theoretic Approach to the Storage–Retrieval Trade-Off in PIR Systems

^{1}

^{2}

^{3}

^{*}

## Abstract

**:**

## 1. Introduction

## 2. Preliminaries

#### 2.1. Problem Definition

**Definition**

**1.**

**Lemma**

**1.**

**Lemma**

**2.**

#### 2.2. Some Relevant Known Results

#### 2.3. Multiple Description Source Coding

## 3. A Special Case: Slepian–Wolf Coding for Minimum Retrieval Rate

- At database-1, compress and store $({X}_{1}^{L},{X}_{2}^{L})$ losslessly;
- At database-2, encode ${Y}_{1}^{L}$ using a Slepian–Wolf code (or more precisely Sgarro’s code with uncertainty side information [29]), with either ${X}_{1}^{L}$ or ${X}_{2}^{L}$ at the decoder, whose resulting code index is denoted as ${C}_{{Y}_{1}}$; encode ${Y}_{2}^{L}$ in the same manner, independent of ${Y}_{1}^{L}$, whose code index is denoted as ${C}_{{Y}_{2}}$.

## 4. Main Result

#### 4.1. A General Inner Bound

- There exist deterministic functions ${f}_{1,1}$, ${f}_{1,2}$, ${f}_{2,1}$, and ${f}_{2,2}$ such that$$\begin{array}{c}{V}_{1}={f}_{1,1}({X}_{0},{X}_{1},{Y}_{1})={f}_{2,2}({X}_{0},{X}_{2},{Y}_{2}),\phantom{\rule{1.em}{0ex}}{V}_{2}={f}_{1,2}({X}_{0},{X}_{1},{Y}_{2})={f}_{2,1}({X}_{0},{X}_{2},{Y}_{1});\end{array}$$
- There exist non-negative coding rates$$\begin{array}{c}\phantom{\rule{2.em}{0ex}}({\beta}_{1}^{\left(0\right)},{\beta}_{1}^{\left(1\right)},{\beta}_{1}^{\left(2\right)},{\beta}_{2}^{\left(1\right)},{\beta}_{2}^{\left(2\right)},{\gamma}_{1}^{\left(0\right)},{\gamma}_{1}^{\left(1\right)},{\gamma}_{1}^{\left(2\right)},{\gamma}_{2}^{\left(1\right)},{\gamma}_{2}^{\left(2\right)})\hfill \\ \phantom{\rule{2.em}{0ex}}\phantom{\rule{2.em}{0ex}}\in {\mathcal{R}}_{MD}^{\ast}\left((({V}_{1},{V}_{2}),{X}_{0},{X}_{1},{X}_{2},{Y}_{1},{Y}_{2}),\right.\hfill \\ \phantom{\rule{2.em}{0ex}}\phantom{\rule{2.em}{0ex}}\phantom{\rule{2.em}{0ex}}\phantom{\rule{2.em}{0ex}}\left.\left(\{{X}_{0},{X}_{1},{Y}_{1}\},\{{X}_{0},{X}_{1},{Y}_{2}\},\{{X}_{0},{X}_{2},{Y}_{1}\},\{{X}_{0},{X}_{2},{Y}_{2}\}\right)\right);\hfill \end{array}$$
- There exist non-negative storage rates $({\alpha}_{1}^{\left(0\right)},{\alpha}_{1}^{\left(1\right)},{\alpha}_{1}^{\left(2\right)},{\alpha}_{2}^{\left(1\right)},{\alpha}_{2}^{\left(2\right)})$ such that$$\begin{array}{c}{\alpha}_{1}^{\left(0\right)}\le {\beta}_{1}^{\left(0\right)},{\alpha}_{1}^{\left(1\right)}\le {\beta}_{1}^{\left(1\right)},{\alpha}_{1}^{\left(2\right)}\le {\beta}_{1}^{\left(2\right)},{\alpha}_{2}^{\left(1\right)}\le {\beta}_{2}^{\left(1\right)},{\alpha}_{2}^{\left(2\right)}\le {\beta}_{2}^{\left(2\right)},\end{array}$$$$\begin{array}{c}\hfill {\gamma}_{1}^{\left(0\right)}-{\beta}_{1}^{\left(0\right)}+{\gamma}_{1}^{\left(1\right)}-{\beta}_{1}^{\left(1\right)}+{\gamma}_{1}^{\left(2\right)}-{\beta}_{1}^{\left(2\right)}<H\left({X}_{1}\right)+H\left({X}_{2}\right)+H\left({X}_{3}\right)-H({X}_{0},{X}_{1},{X}_{2}),\end{array}$$$$\begin{array}{c}\hfill ({\alpha}_{1}^{\left(0\right)},{\alpha}_{1}^{\left(1\right)},{\alpha}_{1}^{\left(2\right)},{\gamma}_{1}^{\left(0\right)},{\gamma}_{1}^{\left(1\right)},{\gamma}_{1}^{\left(2\right)})\in {\mathcal{R}}_{MD}^{\ast}\left((({V}_{1},{V}_{2}),{X}_{0},{X}_{1},{X}_{2}),\left(\{{X}_{0},{X}_{1},{X}_{2}\}\right)\right);\end{array}$$$$\begin{array}{c}\hfill {\gamma}_{2}^{\left(1\right)}-{\beta}_{2}^{\left(1\right)}+{\gamma}_{2}^{\left(2\right)}-{\beta}_{2}^{\left(2\right)}<I({Y}_{1};{Y}_{2}),\end{array}$$$$\begin{array}{c}\hfill ({\alpha}_{2}^{\left(1\right)},{\alpha}_{2}^{\left(2\right)},{\gamma}_{2}^{\left(1\right)},{\gamma}_{2}^{\left(2\right)})\in {\mathcal{R}}_{MD}^{\ast}\left((({V}_{1},{V}_{2}),{Y}_{1},{Y}_{2}),\left(\{{Y}_{1},{Y}_{2}\}\right)\right),\end{array}$$
- The normalized average retrieval and storage rates$$\begin{array}{cc}\hfill \phantom{\rule{1.em}{0ex}}& 2t\overline{\alpha}\ge {\alpha}_{1}^{\left(0\right)}+{\alpha}_{1}^{\left(1\right)}+{\alpha}_{1}^{\left(2\right)}+{\alpha}_{2}^{\left(1\right)}+{\alpha}_{2}^{\left(2\right)},\hfill \end{array}$$$$\begin{array}{cc}\hfill \phantom{\rule{1.em}{0ex}}& 4t\overline{\beta}\ge 2{\beta}_{1}^{\left(0\right)}+{\beta}_{1}^{\left(1\right)}+{\beta}_{1}^{\left(2\right)}+{\beta}_{2}^{\left(1\right)}+{\beta}_{2}^{\left(2\right)}.\hfill \end{array}$$

**Theorem**

**1.**

**Theorem**

**2.**

**Proof**

**of Theorem 2.**

**Proof**

**of Theorem 1.**

#### 4.2. Outer Bounds

**Theorem**

**3.**

**Proof**

**of Theorem 3.**

#### 4.3. Specialization of the Inner Bound

- The distribution factorizes as follows$${P}_{{V}_{1},{V}_{2},{X}_{0},{X}_{1},{X}_{2},{Y}_{1},{Y}_{2}}={P}_{{V}_{1},{V}_{2}}{P}_{{X}_{0}|{V}_{1},{V}_{2}}{P}_{{X}_{1}|{V}_{1},{V}_{2}}{P}_{{X}_{2}|{V}_{1},{V}_{2}}{P}_{{Y}_{1}|{V}_{1},{V}_{2}}{P}_{{Y}_{2}|{V}_{1},{V}_{2}};$$
- There exist deterministic functions ${f}_{1,1}$, ${f}_{1,2}$, ${f}_{2,1}$, and ${f}_{2,2}$ such that$$\begin{array}{c}{V}_{1}={f}_{1,1}({X}_{0},{X}_{1},{Y}_{1})={f}_{2,2}({X}_{0},{X}_{2},{Y}_{2}),\end{array}$$$$\begin{array}{c}{V}_{2}={f}_{1,2}({X}_{0},{X}_{1},{Y}_{2})={f}_{2,1}({X}_{0},{X}_{2},{Y}_{1});\end{array}$$
- A set of rates$$\begin{array}{c}{\gamma}_{1}^{\left(0\right)}=I({V}_{1},{V}_{2};{X}_{0}),\phantom{\rule{0.166667em}{0ex}}{\gamma}_{1}^{\left(1\right)}=I({V}_{1},{V}_{2};{X}_{1}),\phantom{\rule{0.166667em}{0ex}}{\gamma}_{1}^{\left(2\right)}=I({V}_{1},{V}_{2};{X}_{2}),\end{array}$$$$\begin{array}{c}{\gamma}_{2}^{\left(1\right)}=I({V}_{1},{V}_{2};{Y}_{1}),\phantom{\rule{0.166667em}{0ex}}{\gamma}_{2}^{\left(2\right)}=I({V}_{1},{V}_{2};{Y}_{2}),\end{array}$$$$\begin{array}{c}{\beta}_{1}^{\left(0\right)}={\gamma}_{1}^{\left(0\right)},\phantom{\rule{0.166667em}{0ex}}{\beta}_{1}^{\left(1\right)}=I({V}_{1},{V}_{2};{X}_{1}|{X}_{0}),\phantom{\rule{0.166667em}{0ex}}{\beta}_{1}^{\left(2\right)}=I({V}_{1},{V}_{2};{X}_{2}|{X}_{0}),\end{array}$$$$\begin{array}{c}{\beta}_{2}^{\left(1\right)}=max(I({V}_{1},{V}_{2};{Y}_{1}|{X}_{0},{X}_{1}),I({V}_{1},{V}_{2};{Y}_{1}|{X}_{0},{X}_{2})),\end{array}$$$$\begin{array}{c}{\beta}_{2}^{\left(2\right)}=max(I({V}_{1},{V}_{2};{Y}_{2}|{X}_{0},{X}_{1}),I({V}_{1},{V}_{2};{Y}_{2}|{X}_{0},{X}_{2})),\end{array}$$
- The normalized average retrieval and storage rates$$\begin{array}{cc}\hfill \phantom{\rule{1.em}{0ex}}& 2t\overline{\alpha}\ge {\alpha}_{1}^{\left(0\right)}+{\alpha}_{1}^{\left(1\right)}+{\alpha}_{1}^{\left(2\right)}+{\alpha}_{2}^{\left(1\right)}+{\alpha}_{2}^{\left(2\right)},\hfill \end{array}$$$$\begin{array}{cc}\hfill \phantom{\rule{1.em}{0ex}}& 4t\overline{\beta}\ge 2{\beta}_{1}^{\left(0\right)}+{\beta}_{1}^{\left(1\right)}+{\beta}_{1}^{\left(2\right)}+{\beta}_{2}^{\left(1\right)}+{\beta}_{2}^{\left(2\right)}.\hfill \end{array}$$

**Corollary**

**1.**

**Proof.**

**Corollary**

**2.**

**Proof.**

## 5. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Chor, B.; Goldreich, O.; Kushilevitz, E.; Sudan, M. Private information retrieval. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, Milwaukee, WI, USA, 23–25 October 1995; pp. 41–50. [Google Scholar]
- Shah, N.; Rashmi, K.; Ramchandran, K. One extra bit of download ensures perfectly private information retrieval. In Proceedings of the 2014 IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, 29 June–4 July 2014; pp. 856–860. [Google Scholar]
- Fazeli, A.; Vardy, A.; Yaakobi, E. Codes for distributed PIR with low storage overhead. In Proceedings of the 2015 Proceedings of IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015; pp. 2852–2856. [Google Scholar]
- Rao, S.; Vardy, A. Lower bound on the redundancy of PIR codes. arXiv
**2016**, arXiv:1605.01869. [Google Scholar] - Blackburn, S.R.; Etzion, T. PIR array codes with optimal virtual server rate. IEEE Trans. Inf. Theory
**2019**, 65, 6136–6145. [Google Scholar] [CrossRef] [Green Version] - Blackburn, S.R.; Etzion, T.; Paterson, M.B. PIR schemes with small download complexity and low storage requirements. IEEE Trans. Inf. Theory
**2019**, 66, 557–571. [Google Scholar] [CrossRef] [Green Version] - Zhang, Y.; Wang, X.; Wei, H.; Ge, G. On private information retrieval array codes. IEEE Trans. Inf. Theory
**2019**, 65, 5565–5573. [Google Scholar] [CrossRef] [Green Version] - Vajha, M.; Ramkumar, V.; Kumar, P.V. Binary, shortened projective Reed Muller codes for coded private information retrieval. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 2648–2652. [Google Scholar]
- Asi, H.; Yaakobi, E. Nearly optimal constructions of PIR and batch codes. IEEE Trans. Inf. Theory
**2018**, 65, 947–964. [Google Scholar] [CrossRef] [Green Version] - Chan, T.H.; Ho, S.W.; Yamamoto, H. Private information retrieval for coded storage. In Proceedings of the 2015 IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015; pp. 2842–2846. [Google Scholar]
- Sun, H.; Jafar, S.A. The capacity of private information retrieval. IEEE Trans. Inf. Theory
**2017**, 63, 4075–4088. [Google Scholar] [CrossRef] - Tajeddine, R.; Gnilke, O.W.; El Rouayheb, S. Private information retrieval from MDS coded data in distributed storage systems. IEEE Trans. Inf. Theory
**2018**, 64, 7081–7093. [Google Scholar] [CrossRef] [Green Version] - Banawan, K.; Ulukus, S. The capacity of private information retrieval from coded databases. IEEE Trans. Inf. Theory
**2018**, 64, 1945–1956. [Google Scholar] [CrossRef] - Tian, C.; Sun, H.; Chen, J. Capacity-achieving private information retrieval codes with optimal message size and upload cost. IEEE Trans. Inf. Theory
**2019**, 65, 7613–7627. [Google Scholar] [CrossRef] [Green Version] - Zhou, R.; Tian, C.; Sun, H.; Liu, T. Capacity-achieving private information retrieval codes from MDS-coded databases with minimum message size. IEEE Trans. Inf. Theory
**2020**, 66, 4904–4916. [Google Scholar] [CrossRef] [Green Version] - Sun, H.; Jafar, S.A. The capacity of robust private information retrieval with colluding databases. IEEE Trans. Inf. Theory
**2018**, 64, 2361–2370. [Google Scholar] [CrossRef] - Ulukus, S.; Avestimehr, S.; Gastpar, M.; Jafar, S.; Tandon, R.; Tian, C. Private retrieval, computing and learning: Recent progress and future challenges. IEEE J. Sel. Areas Commun.
**2022**, 40, 729–748. [Google Scholar] [CrossRef] - Attia, M.A.; Kumar, D.; Tandon, R. The capacity of private information retrieval from uncoded storage constrained databases. IEEE Trans. Inf. Theory
**2020**, 66, 6617–6634. [Google Scholar] [CrossRef] - Sun, H.; Jafar, S.A. Multiround private information retrieval: Capacity and storage overhead. IEEE Trans. Inf. Theory
**2018**, 64, 5743–5754. [Google Scholar] [CrossRef] [Green Version] - Sun, H.; Tian, C. Breaking the MDS-PIR capacity barrier via joint storage coding. Information
**2019**, 10, 265. [Google Scholar] [CrossRef] [Green Version] - Guo, T.; Zhou, R.; Tian, C. New results on the storage-retrieval tradeoff in private information retrieval systems. IEEE J. Sel. Areas Inf. Theory
**2021**, 2, 403–414. [Google Scholar] [CrossRef] - Tian, C.; Sun, H.; Chen, J. A Shannon-theoretic approach to the storage-retrieval tradeoff in PIR systems. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–20 June 2018; pp. 1904–1908. [Google Scholar]
- Gamal, A.; Cover, T. Achievable rates for multiple descriptions. IEEE Trans. Inf. Theory
**1982**, 28, 851–857. [Google Scholar] [CrossRef] [Green Version] - Tian, C. On the storage cost of private information retrieval. IEEE Trans. Inf. Theory
**2020**, 66, 7539–7549. [Google Scholar] [CrossRef] - Venkataramani, R.; Kramer, G.; Goyal, V.K. Multiple description coding with many channels. IEEE Trans. Inf. Theory
**2003**, 49, 2106–2114. [Google Scholar] [CrossRef] - Wyner, A.; Ziv, J. The rate-distortion function for source coding with side information at the decoder. IEEE Trans. Inf. Theory
**1976**, 22, 1–10. [Google Scholar] [CrossRef] - Pradhan, S.S.; Puri, R.; Ramchandran, K. n-channel symmetric multiple descriptions-part I: (n,k) source-channel erasure codes. IEEE Trans. Inf. Theory
**2004**, 50, 47–61. [Google Scholar] [CrossRef] - Tian, C.; Chen, J. New coding schemes for the symmetric K-description problem. IEEE Trans. Inf. Theory
**2010**, 56, 5344–5365. [Google Scholar] [CrossRef] - Sgarro, A. Source coding with side information at several decoders. IEEE Trans. Inf. Theory
**1977**, 23, 179–182. [Google Scholar] [CrossRef]

$({\mathit{w}}_{1},{\mathit{w}}_{2})$ | ${\mathit{x}}_{0}=\left(00\right)$ | ${\mathit{x}}_{0}=\left(01\right)$ | ${\mathit{x}}_{0}=\left(10\right)$ | ${\mathit{x}}_{0}=\left(11\right)$ |
---|---|---|---|---|

(00) | $1/2$ | 1/2 | ||

(10) | $(1-p)/2$ | p | $(1-p)/2$ | |

(01) | $(1-p)/2$ | p | $(1-p)/2$ | |

(11) | 1/2 | 1/2 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Tian, C.; Sun, H.; Chen, J.
A Shannon-Theoretic Approach to the Storage–Retrieval Trade-Off in PIR Systems. *Information* **2023**, *14*, 44.
https://doi.org/10.3390/info14010044

**AMA Style**

Tian C, Sun H, Chen J.
A Shannon-Theoretic Approach to the Storage–Retrieval Trade-Off in PIR Systems. *Information*. 2023; 14(1):44.
https://doi.org/10.3390/info14010044

**Chicago/Turabian Style**

Tian, Chao, Hua Sun, and Jun Chen.
2023. "A Shannon-Theoretic Approach to the Storage–Retrieval Trade-Off in PIR Systems" *Information* 14, no. 1: 44.
https://doi.org/10.3390/info14010044