1. Introduction
Over the last decade, the field of machine learning (ML), particularly deep learning (DL), has witnessed a tremendous rise, achieving superhuman-level performance in many areas, such as image recognition [1,2], natural language processing [3,4], and speech recognition [5,6]. The success of deep neural networks (DNNs) relies on training large-scale models encompassing billions of parameters and fed with extensive datasets. For instance, ChatGPT [7], a famous large language model, was trained on a diverse and wide-ranging dataset derived from books, websites, and other texts, and contains a total of 175 billion parameters. Training DNNs of this scale is therefore time-consuming and requires huge numbers of training samples, making it challenging for individual users to build such models. In response to this challenge, Machine Learning as a Service (MLaaS) has been proposed, enabling individual users to perform inference on their data without the necessity of local model training.
The MLaaS paradigm can be described in more detail as follows. A pre-trained neural network (NN), denoted by F, is deployed on a remote cloud server. In this setup, a client transmits a data sample, denoted by x, to the server for inference. After receiving x, the cloud server performs neural network inference on x to obtain the output $F\left(x\right)$, which is returned to the client as the prediction result. Although MLaaS has clear benefits, it also raises serious privacy concerns for both the client and the server. Firstly, from the client's perspective, the data x and the prediction result $F\left(x\right)$ may contain sensitive personal information; e.g., in healthcare applications, x can be a medical record and the prediction $F\left(x\right)$ whether the client has contracted a certain disease. Therefore, the client may hesitate to share x with the server and may want to hide $F\left(x\right)$ from the server. Secondly, from the server's perspective, the pre-trained NN, F, represents the service provider's intellectual property and has great commercial value. Hence, the server does not want to share F with clients. Furthermore, the NN may leak information about the training data [8]. Consequently, it is highly desirable and of great practical importance to design secure MLaaS frameworks, where the client learns $F\left(x\right)$ but nothing else about the server's model F, while the server learns nothing about the client's input x or the prediction result $F\left(x\right)$. Throughout this paper, we refer to this goal as secure neural network inference (SNNI) (other names sometimes used in the literature include privacy-preserving inference and oblivious inference).
Existing approaches to the SNNI problem naturally rely on privacy-preserving techniques of modern cryptography, in particular, multiparty computation (MPC) and fully homomorphic encryption (HE). In secure two-party computation (2PC), a special case of MPC with two parties, the client and the server interact with each other to securely compute the prediction result without revealing any individual values. After three decades of theoretical and applied work improving and optimizing 2PC protocols, we now have very efficient implementations, e.g., refs. [9,10,11,12,13]. The main advantage of 2PC protocols is that they are computationally inexpensive and can evaluate arbitrary operators. However, 2PC protocols require the structure (e.g., the Boolean circuit) of the NN to be public and involve multiple rounds of interaction between the client and the server. On the other hand, HE-based SNNI is a non-interactive approach (the client and the server do not need to communicate during the inference process) and keeps all of the NN's information secret from the client. The main bottlenecks of HE-based SNNI frameworks are their computational overhead and their inability to evaluate non-linear operators, e.g., the comparison operator. If accuracy and inference time are the priority, 2PC is the better candidate; if security and communication cost are the priority, HE is the better candidate. Our motivation is to design a novel framework that achieves the best of both worlds: the security and low communication cost characteristic of the HE-based approach, coupled with the computational efficiency and ability to evaluate non-linear functions inherent in the 2PC-based approach.
In this paper, we focus on the HE-based approach to SNNI. HE is a cryptographic technique that allows computations to be performed on encrypted data without decrypting it. HE-based SNNI can be described as follows. The client encrypts x and sends the encrypted data $\left[x\right]$ to the server. The server evaluates F on $\left[x\right]$ and sends the result $F\left(\left[x\right]\right)$ to the client, where $F\left(\left[x\right]\right)=\left[F\left(x\right)\right]$ (the homomorphic property). Finally, the client decrypts the ciphertext and obtains $F\left(x\right)$. The main contribution of our paper is a novel interactive approach to SNNI based solely on HE that overcomes the aforementioned issues of 2PC and HE. The novelty of our approach is that it relies only on HE, whereas existing interactive approaches combine HE with 2PC protocols. Hence, we reduce the communication complexity caused by 2PC protocols.
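To make the homomorphic property concrete, the following self-contained sketch uses textbook Paillier encryption with toy-sized primes (insecure and purely illustrative; the framework in this paper uses CKKS, not Paillier). Multiplying two Paillier ciphertexts yields an encryption of the sum of the plaintexts, so a linear map F can be evaluated entirely on ciphertexts.

```python
from math import gcd
import random

# Textbook Paillier with toy primes (insecure; for illustration only).
p, q = 251, 241
n = p * q                                       # public modulus
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)
mu = pow(lam, -1, n)                            # requires gcd(lam, n) == 1

def encrypt(m: int) -> int:
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    # With g = n + 1: Enc(m) = (1 + n)^m * r^n mod n^2
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    u = pow(c, lam, n2)
    return ((u - 1) // n) * mu % n

# Homomorphic property: multiplying ciphertexts adds plaintexts,
# and exponentiation by a plaintext scalar multiplies the plaintext.
a, b = encrypt(12), encrypt(30)
assert decrypt(a * b % n2) == 42     # [12] "+" [30] = [42]
assert decrypt(pow(a, 5, n2)) == 60  # 5 * 12 under encryption
```

The server never sees 12, 30, or 42 in the clear; it manipulates only the ciphertexts, exactly as in the SNNI flow described above.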
Existing (non-interactive) HE-based SNNI frameworks face three inherent drawbacks, which our approach addresses.
Firstly, existing HE schemes only support addition and multiplication, while other operators, such as comparison, are not readily available and are usually replaced by evaluating expensive (high-degree) polynomial approximations [9,10]. Hence, current HE-based SNNI frameworks cannot practically evaluate non-linear activation functions, e.g., the $\mathsf{ReLU}$ function, which are widely adopted in NNs.
A common workaround is to replace these activation functions with (low-degree) polynomial approximations, e.g., the square function. However, this replacement degrades the accuracy of the NN [11] and requires a costly retraining of the NN. Moreover, training NNs with the square activation function can lead to erratic behavior of the gradient descent algorithm, especially for deep NNs; for example, gradient explosion or overfitting may occur [12]. This is because the derivative of the square function, unlike that of the $\mathsf{ReLU}$ function, is unbounded. This paper proposes a novel protocol, $\mathsf{HeReLU}$, to evaluate the $\mathsf{ReLU}$ function, which addresses the challenge of evaluating $\mathsf{ReLU}$ in current HE-based SNNI frameworks.
Secondly, in all existing HE schemes, ciphertexts contain noise, which accumulates as homomorphic operations are performed; at some point, the noise in a ciphertext may become too large, and the ciphertext is no longer decryptable. The family of HE schemes supporting a predetermined number of homomorphic operations is called leveled HE (LHE). LHE is widely adopted in SNNI frameworks (LHE-based SNNI frameworks). Therefore, an inherent drawback of existing LHE-based SNNI frameworks is that they limit the depth of NNs.
The bootstrapping technique, an innovative technique proposed by Gentry [13], can address this issue. Bootstrapping produces a new ciphertext that encrypts the same message with a lower noise level so that more homomorphic operations can be performed on it. Roughly speaking, bootstrapping homomorphically decrypts the ciphertext using an encryption of the secret key, called the bootstrapping key, thereby (re)encrypting the message under the bootstrapping key. This requires an extra security prerequisite, termed the circular security assumption, which presumes that disclosing an encryption of the secret key is secure.
The circular security assumption remains poorly studied and poorly understood; proving circular security for HE schemes remains an open problem. Furthermore, bootstrapping is generally considered an expensive operation, so one usually tries to avoid it as much as possible [14]. This paper proposes a novel protocol, $\mathsf{HeRefresh}$, to refresh ciphertexts in HE schemes and thereby address the noise growth issue. Our protocol is much faster than bootstrapping (the empirical comparison results are shown in Section 5.2) and does not require circular security.
Thirdly, LHE-based SNNI suffers from huge computational overhead. The computational cost of LHE grows dramatically with the number of multiplication levels the scheme needs to support. This makes LHE-based SNNI impractical for deep NNs (which require a very large multiplication level). Interestingly, $\mathsf{HeRefresh}$ naturally reduces the size of the LHE scheme's parameters, hence significantly reducing the computational overhead (details in Section 5.3).
The non-interactive LHE-based SNNI approach cannot efficiently address the three challenges above. A natural approach that appears in the literature is the interactive LHE-based approach [11,15,16], which has two branches. The first branch is based solely on LHE, which we call iLHE, and the second branch combines HE with 2PC protocols, known in the literature as the HE-MPC hybrid approach.
nGraph-HE2 [11] is the only iLHE-based SNNI framework. nGraph-HE2 adopts a client-aided architecture: the server sends the encrypted values of intermediate layers (before the non-linear activation function) to the client; the client decrypts them, evaluates the activation function, re-encrypts the results, and sends them back to the server. It is worth noting that this decryption-then-encryption process can be considered a noise refresher. This solution addresses the three aforementioned issues of non-interactive LHE-based SNNI. However, nGraph-HE2 leaks the network's information, namely the NN's architecture and all intermediate layers' values. Recent studies have illuminated the potential risks of this information leakage; a malicious client could feasibly reconstruct the entire set of neural network parameters by exploiting these intermediate values [17,18]. nGraph-HE2 is an ad hoc solution to the issues of LHE-based SNNI frameworks and is not considered a separate approach to SNNI in the literature. This paper proposes a novel iLHE-based SNNI framework that addresses the drawbacks of existing non-interactive LHE-based SNNI frameworks while hiding the NN's architecture and leaking no information about intermediate layers' values. We argue that the iLHE-based approach deserves more attention from the research community.
The second branch of interactive LHE-based SNNI is the HE-MPC hybrid approach, which wisely combines HE and MPC. In particular, HE schemes are used to compute the linear layers, i.e., fully connected and convolution layers, while 2PC protocols, e.g., garbled circuits (GC) [19] and ABY [20], are used to compute an exact non-linear activation function. The major bottleneck of this approach is the communication complexity caused by the 2PC protocols. In contrast, our framework is an iLHE-based approach, based solely on HE; hence, we naturally avoid expensive 2PC protocols.
To address the three challenges faced by non-interactive LHE-based SNNI, this paper designs two novel iLHE-based protocols. Firstly, we design the $\mathsf{HeReLU}$ protocol to evaluate the $\mathsf{ReLU}$ activation function, an essential non-linear function widely adopted in NNs. This protocol addresses the first challenge of non-linear function evaluation in HE-based SNNI. Compared to current LHE-based SNNI frameworks, our protocol can exactly evaluate the $\mathsf{ReLU}$ function. Compared to existing HE-MPC hybrid frameworks, i.e., Gazelle [15] and MP2ML [16], our protocol achieves fewer communication rounds (we detail this in Section 5.1). Secondly, we design $\mathsf{HeRefresh}$ to refresh HE ciphertexts, which enables further homomorphic operations. This protocol addresses the second challenge of noise accumulation in HE-based SNNI. By using $\mathsf{HeRefresh}$, rather than selecting HE parameters large enough to support the entire NN, the HE parameters must now only be large enough to support the computation on the linear layers between $\mathsf{ReLU}$ activation function layers. For instance, to perform secure inference on a seven-layer NN, which contains convolution layers and fully connected layers followed by $\mathsf{ReLU}$ activation function layers, our framework (with $\mathsf{HeRefresh}$) requires $L=3\ll 7$, while current non-interactive LHE-based SNNI frameworks (without $\mathsf{HeRefresh}$) require $L=7$ (details in Section 5.3). Therefore, $\mathsf{HeRefresh}$ addresses the third challenge of computational overhead due to large multiplication depth in LHE-based SNNI. In the HE scheme, $\mathsf{HeRefresh}$ plays the role of a bootstrapping procedure, which aims to reduce the noise in ciphertexts. Our experiments showed that $\mathsf{HeRefresh}$ outperforms bootstrapping by $300\times$ in running time (we detail this in Section 5.2). Interestingly, $\mathsf{HeRefresh}$ and $\mathsf{HeReLU}$ can run in parallel; hence, we can save communication rounds.
Our contribution. In this paper, we first design two novel protocols, i.e., $\mathsf{HeReLU}$ and $\mathsf{HeRefresh}$, to address the drawbacks of existing LHE-based SNNI frameworks. Then, we leverage $\mathsf{HeReLU}$ and $\mathsf{HeRefresh}$ to build a new framework for SNNI called $\mathsf{HeFUN}$. The idea of $\mathsf{HeFUN}$ is to use $\mathsf{HeReLU}$ to evaluate the $\mathsf{ReLU}$ activation function and $\mathsf{HeRefresh}$ to refresh intermediate neural network layers. The benefits of our proposed method are twofold. Firstly, compared to current HE-based frameworks, our approach significantly accelerates inference while achieving superior accuracy. Secondly, compared to current 2PC-based frameworks, our methodology reduces communication rounds and safeguards circuit privacy for service providers. We highlight the superiority of our approach compared to current approaches in Section 2. A comparison summary is shown in Table 1. The contributions of this paper are as follows:
We propose a novel iLHE-based protocol, $\mathsf{HeReLU}$, that can exactly evaluate the $\mathsf{ReLU}$ function based solely on HE and achieves fewer communication rounds than current HE-MPC hybrid frameworks. The analysis is presented in Section 5.1.
We propose a novel iLHE-based protocol, $\mathsf{HeRefresh}$, that refreshes the noise in ciphertexts to enable circuits of arbitrary depth. $\mathsf{HeRefresh}$ also reduces the size of the HE parameters, hence significantly reducing computation time. For the purpose of ciphertext refreshing, $\mathsf{HeRefresh}$ outperforms bootstrapping by $300\times$ in computation time (we detail this in Section 5.2). Furthermore, our protocol deviates from Gentry's bootstrapping technique, so we bypass the circular security requirement.
We build a new iLHE-based framework for SNNI, named $\mathsf{HeFUN}$, which uses $\mathsf{HeReLU}$ to evaluate the $\mathsf{ReLU}$ activation function and $\mathsf{HeRefresh}$ to refresh intermediate neural network layers.
We provide security proofs for the proposed protocols. All proposed protocols, i.e., $\mathsf{HeReLU}$ and $\mathsf{HeRefresh}$, are proven secure in the semi-honest model using the simulation paradigm [21]. Our framework, $\mathsf{HeFUN}$, is created by composing the protocols sequentially. The security proof of the $\mathsf{HeFUN}$ framework is based on modular sequential composition [22].
Experiments show that $\mathsf{HeFUN}$ outperforms previous HE-based SNNI frameworks. In particular, we achieve higher accuracy and better inference time.
Table 1. Comparison of SNNI frameworks.

| Approach | Framework | Circuit Privacy | Comparable Accuracy ^a | Non-Linear | Unbounded | Non-Interactive ^b | SIMD | Small HE Params ^c |
|---|---|---|---|---|---|---|---|---|
| LHE | CryptoNets | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | CryptoDL | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | BNormCrypt | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | FasterCryptoNets | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | HCNN | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | E2DM | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | LoLa | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | CHET | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | SEALion | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| | nGraph-HE | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| FHE | FHE-DiNN | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ |
| | TAPAS | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ |
| | SHE | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ |
| MPC | DeepSecure | ✗ | ✗ | ✓ | ✓ | ✗ | – | – |
| | XONN | ✗ | ✗ | ✓ | ✓ | ✗ | – | – |
| | GarbledNN | ✗ | ✗ | ✓ | ✓ | ✗ | – | – |
| | QUOTIENT | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | Chameleon | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | ABY3 | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | SecureNN | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | FalconN | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | CrypTFlow | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| | Crypten | ✗ | ✓ | ✓ | ✓ | ✗ | – | – |
| HE-MPC hybrid | Gazelle | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| | MP2ML | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| iLHE | nGraph-HE2 | ✗ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| | HeFUN (this work) | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
The rest of the paper is organized as follows. Section 2 reviews current SNNI frameworks, Section 3 provides the necessary cryptographic background, Section 4 presents our proposed approach, Section 5 presents our implementation and evaluation results, Section 6 provides some discussion and potential future work, and Section 7 concludes our work.
4. HeFUN: An iLHE-Based SNNI Framework
This section provides a detailed description of our proposed framework. We first state the problem we solve throughout this section (Section 4.1). Secondly, we present a protocol to homomorphically evaluate the $\mathsf{ReLU}$ function, i.e., $\mathsf{HeReLU}$ (Section 4.2). Thirdly, we present a protocol to refresh ciphertexts, i.e., $\mathsf{HeRefresh}$ (Section 4.3). Fourthly, we present a naive framework for SNNI, i.e., ${\mathsf{HeFUN}}_{\mathsf{naive}}$, that straightforwardly applies $\mathsf{HeReLU}$ and $\mathsf{HeRefresh}$ but leaks some unexpected information to the client (Section 4.4). Then, we revise ${\mathsf{HeFUN}}_{\mathsf{naive}}$ to prevent this potential information leakage, which results in our “ultimate” framework, i.e., $\mathsf{HeFUN}$ (Section 4.5). Throughout this section, we denote the client by $\mathcal{C}$ and the server by $\mathcal{S}$.
4.1. Problem Statement
We consider a standard scenario for cloud-based prediction services. In this context, $\mathcal{S}$ possesses an NN, and $\mathcal{C}$ sends data to $\mathcal{S}$ and subsequently obtains the corresponding prediction. The problem is formally defined as:
$\mathbf{z}:={a}_{L}\left({f}_{L}\left({\mathit{W}}^{\left(L\right)},{a}_{L-1}\left({f}_{L-1}\left(\dots {a}_{1}\left({f}_{1}\left({\mathit{W}}^{\left(1\right)},\mathit{x},{\mathit{b}}^{\left(1\right)}\right)\right)\dots \right)\right),{\mathit{b}}^{\left(L\right)}\right)\right)$, where $\mathit{x}$ is the input data, ${f}_{i}$ is the linear transformation applied in the $i$th layer, ${a}_{i}$ is the (usually non-linear) activation function, and ${\mathit{W}}^{\left(i\right)}$ and ${\mathit{b}}^{\left(i\right)}$ are the weight and bias of the $i$th layer of the NN. The problem we address is SNNI: after each prediction, $\mathcal{C}$ obtains the prediction $\mathbf{z}$, while $\mathcal{S}$ learns nothing about $\mathit{x}$ and $\mathbf{z}$, and $\mathcal{C}$ learns nothing about ${\mathit{W}}^{\left(i\right)}$ and ${\mathit{b}}^{\left(i\right)}$, $i=\overline{1,L}$, except $\mathbf{z}$. In this paper, we focus on the $\mathsf{ReLU}$ function, i.e., ${a}_{i}=\mathsf{ReLU}\left(\right)$. The final activation function, ${a}_{L}$, can be omitted in the inference stage without compromising the accuracy of predictions: NN predictions rely on the index of the maximum value in the output vector, and since ${a}_{L}$ is monotone increasing, applying it does not affect the prediction. The following sections elaborate on how layers of NNs are homomorphically evaluated in the $\mathsf{HeFUN}$ framework.
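In plaintext, the computation being protected is an ordinary forward pass. The sketch below (NumPy, with hypothetical layer sizes) implements the layered formula above, applying $\mathsf{ReLU}$ after every linear layer and omitting the final activation as just discussed.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def predict(x, weights, biases):
    """Plaintext forward pass: linear layers f_i interleaved with ReLU;
    the final activation a_L is omitted, as the argmax is unaffected."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b                 # linear layer f_i(W^(i), h, b^(i))
        if i < len(weights) - 1:      # ReLU on all but the last layer
            h = relu(h)
    return h

# Toy 2-layer network with hypothetical sizes (4 -> 3 -> 2).
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((4, 3)), rng.standard_normal((3, 2))]
bs = [np.zeros(3), np.zeros(2)]
z = predict(rng.standard_normal(4), Ws, bs)
prediction = int(np.argmax(z))        # predicted class index
```

SNNI requires computing exactly this function while the server sees only encryptions of x and the intermediate values.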
Linear layers (${f}_{i}$). These layers apply a linear transformation to the input. In NNs, there are two common linear transformations, i.e., fully connected layers and convolution layers.
Fully connected layer. The fully connected layer can be considered as the multiplication of an encrypted vector ($\left[\mathit{x}\right]$) with a plain weight matrix ($\mathit{W}$) and the addition of a bias vector ($\mathit{b}$). Based on the operators supported by the CKKS scheme (as described in Section 3.2), the fully connected layer can be efficiently evaluated in the ciphertext domain. In particular, $\left[\mathit{x}\right]\odot \mathit{W}\oplus \mathit{b}=\left[\mathit{x}\cdot \mathit{W}\right]\oplus \mathit{b}=\left[\mathit{x}\cdot \mathit{W}+\mathit{b}\right]$.
Convolution layer. A convolutional layer consists of filters that act on the input values and aims to extract features from the given image. Every filter is an $n\times n$ square that moves with a certain stride. Convolving the image means computing, for each pixel, the dot product between the filter values and the values adjacent to that pixel. Similar to the fully connected layer, the convolution layer can be efficiently evaluated in the ciphertext domain, i.e., $\left[\mathit{x}\right]\circledast \mathit{W}\oplus \mathit{b}=\left[\mathit{x}\ast \mathit{W}+\mathit{b}\right]$.
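As a self-contained illustration of evaluating a fully connected layer on encrypted input, the sketch below uses textbook Paillier with toy primes as a stand-in additively homomorphic scheme (our framework uses CKKS; Paillier is restricted here to plaintext weights and non-negative integers). Raising a ciphertext to a plaintext exponent scales the plaintext, and multiplying ciphertexts adds plaintexts, which together yield $\left[\mathit{x}\cdot \mathit{W}+\mathit{b}\right]$.

```python
from math import gcd
import random

# Toy Paillier (insecure parameters) as a stand-in additive-HE scheme.
p, q = 251, 241
n, n2 = p * q, (p * q) ** 2
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
mu = pow(lam, -1, n)

def enc(m):
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

def fc_layer(enc_x, W, b):
    """[x]·W ⊕ b with encrypted x and plaintext W, b.
    Enc(x_j)^W[j][k] scales plaintexts; the ciphertext product adds them."""
    out = []
    for k in range(len(W[0])):
        acc = enc(b[k])                          # start from encrypted bias
        for j, ex in enumerate(enc_x):
            acc = acc * pow(ex, W[j][k], n2) % n2
        out.append(acc)
    return out

x = [3, 1, 4]
W = [[2, 0], [1, 5], [0, 3]]                     # 3x2 plaintext weights
b = [7, 1]
y = [dec(c) for c in fc_layer([enc(v) for v in x], W, b)]
assert y == [14, 18]                             # x·W + b in the clear
```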
After linear layers, a non-linear activation function (${a}_{i}$) is usually applied. Unfortunately, the activation function cannot be evaluated directly in the ciphertext domain. The next section presents our method to address this problem.
4.2. $\mathsf{HeReLU}$: Homomorphic $\mathsf{ReLU}$ Evaluation Protocol
After a linear layer, a non-linear activation function is applied to introduce non-linearity into the NN and allow it to capture complex patterns and relationships in the data. In this paper, we focus on the $\mathsf{ReLU}$ function.
Evaluating $\mathsf{ReLU}$ requires the comparison operator, which is not natively supported by existing HE schemes. Hence, the $\mathsf{ReLU}$ activation function layer cannot be directly computed in the ciphertext domain. To address this issue, we propose a protocol, $\mathsf{HeReLU}$, in which $\mathcal{S}$ interacts with $\mathcal{C}$ to perform the $\mathsf{ReLU}$ computation.
Homomorphic $\mathsf{ReLU}$ evaluation problem. $\mathcal{C}$ holds the secret key $sk$; the server holds the public key $pk$ but cannot access $sk$. $\mathcal{S}$ holds a ciphertext $\left[\mathit{x}\right]$ ($\mathit{x}\in {\mathbb{R}}^{n}$) encrypted under $sk$. $\mathcal{S}$ must obtain $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$ without $\mathit{x}$ being leaked to either $\mathcal{C}$ or $\mathcal{S}$.
We now present our protocol that solves the homomorphic $\mathsf{ReLU}$ evaluation problem described above. Here, $\mathsf{ReLU}\left(\mathit{x}\right)$ means applying the $\mathsf{ReLU}$ function to each element of $\mathit{x}$, i.e., $\mathsf{ReLU}\left(\mathit{x}\right)=\left(\mathsf{ReLU}\left({\mathit{x}}_{1}\right),\dots ,\mathsf{ReLU}\left({\mathit{x}}_{n}\right)\right)$. Our protocol, $\mathsf{HeReLU}$, is described in Protocol 1. We first give the intuition behind the protocol and then present the proof of its correctness and security (as illustrated in Lemma 1).
Intuition. We rely on the fact that evaluating the $\mathsf{ReLU}$ function is essentially equivalent to evaluating the $\mathsf{sign}$ function. In particular, given $\mathsf{sign}\left(x\right)$, $\mathsf{ReLU}\left(x\right)$ can be computed as follows:
$\mathsf{ReLU}\left(x\right)=\frac{1}{2}\left(x+x\cdot \mathsf{sign}\left(x\right)\right)$, (2)
where $\mathsf{sign}\left(x\right)=1$ if $x>0$, $0$ if $x=0$, and $-1$ if $x<0$. It is noteworthy that the $\mathsf{sign}$ function as defined in this paper is consistent with the standard interpretation found in the literature [9,10,63,64]. This $\mathsf{sign}$ is widely employed in various real-world applications, including machine learning algorithms such as support vector machines [65], cluster analysis [66], and gradient boosting [67].
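A quick numerical check of the ReLU-from-sign identity, $\mathsf{ReLU}\left(x\right)=\frac{1}{2}\left(x+x\cdot \mathsf{sign}\left(x\right)\right)$, using NumPy, whose sign function matches the definition used here (1 for positive, 0 at zero, −1 for negative):

```python
import numpy as np

def relu_via_sign(x):
    # ReLU(x) = (x + x * sign(x)) / 2; np.sign follows the same convention.
    return 0.5 * (x + x * np.sign(x))

x = np.array([-3.5, -1.0, 0.0, 2.0, 7.25])
assert np.array_equal(relu_via_sign(x), np.maximum(x, 0.0))
```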
Therefore, the fundamental question now is as follows: How can $\mathcal{S}$ (holding $\left[\mathit{x}\right]$) securely evaluate $\left[\mathsf{sign}\left(\mathit{x}\right)\right]$? We start from a trivial (non-secure) solution: $\mathcal{S}$ sends $\left[\mathit{x}\right]$ to $\mathcal{C}$; $\mathcal{C}$ decrypts the ciphertext (using $sk$), obtains $\mathit{x}$, and computes $\mathsf{sign}\left(\mathit{x}\right)$; $\mathcal{C}$ encrypts the result and sends the ciphertext $\left[\mathsf{sign}\left(\mathit{x}\right)\right]$ to $\mathcal{S}$. Obviously, this trivial solution reveals the entire $\mathit{x}$ to $\mathcal{C}$. To hide $\mathit{x}$ from $\mathcal{C}$, $\mathcal{S}$ first homomorphically multiplies $\left[\mathit{x}\right]$ by a random vector $\mathit{r}\ne \mathbf{0}$ (line 2), then sends $\left[\mathit{x}\times \mathit{r}\right]$ to $\mathcal{C}$ (line 3). After that, $\mathcal{C}$ can decrypt $\left[\mathit{x}\times \mathit{r}\right]$ and obtain $\mathit{x}\times \mathit{r}$, which hides $\mathit{x}$ from $\mathcal{C}$ in an information-theoretic way (it is a one-time pad). On $\mathcal{S}$'s side, the signs of $\mathit{x}\times \mathit{r}$ and $\mathit{r}$ together yield the sign of $\mathit{x}$: since $\mathit{r}\ne \mathbf{0}$, $\mathsf{sign}\left(\mathit{x}\right)=\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\times \mathsf{sign}\left(\mathit{r}\right)$ (illustrated in Table 2). Using $\left[\mathsf{sign}\left(\mathit{x}\right)\right]$, $\mathcal{S}$ can homomorphically compute $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$ based on Formula (2). The correctness and security of Protocol 1 are shown in Lemma 1.
Protocol 1 $\mathsf{HeReLU}$: Homomorphic $\mathsf{ReLU}$ evaluation protocol
Input $\mathcal{C}$: secret key $sk$
Input $\mathcal{S}$: public key $pk$, ciphertext $\left[\mathit{x}\right]$
Output $\mathcal{S}$: $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$
$\mathcal{S}$ runs the following steps:
1: Picks $\mathit{r}\leftarrow {\mathbb{R}}^{n}/\left\{\mathbf{0}\right\}$
2: $\left[\mathit{x}\times \mathit{r}\right]\leftarrow \left[\mathit{x}\right]\otimes \mathit{r}$
3: Sends $\left[\mathit{x}\times \mathit{r}\right]$ to $\mathcal{C}$
$\mathcal{C}$ runs the following steps:
4: $\mathit{x}\times \mathit{r}\leftarrow \mathsf{Decrypt}\left(\left[\mathit{x}\times \mathit{r}\right],sk\right)$
5: Sends $\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\right]$ to $\mathcal{S}$
$\mathcal{S}$ runs the following steps:
6: $\left[\mathsf{sign}\left(\mathit{x}\right)\right]=\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\right]\otimes \mathsf{sign}\left(\mathit{r}\right)$
7: $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]=\frac{1}{2}\otimes \left(\left[\mathit{x}\right]\oplus \left[\mathit{x}\right]\otimes \left[\mathsf{sign}\left(\mathit{x}\right)\right]\right)$
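The masking logic of Protocol 1 can be simulated in plaintext. In the sketch below, encryption is mocked away entirely, so it exercises only the arithmetic of lines 1–7 (the unmasking of the sign and identity (2)), not any security property.

```python
import numpy as np

rng = np.random.default_rng(42)

def herelu_simulation(x):
    """Plaintext simulation of Protocol 1; [.] is treated as the identity,
    so only the masking arithmetic is checked, not real encryption."""
    # Server, lines 1-3: blind x with a slot-wise non-zero random mask r.
    r = rng.standard_normal(x.shape)
    r[r == 0] = 1.0                    # enforce r != 0 in every slot
    masked = x * r                     # what the client would decrypt
    # Client, lines 4-5: sees only x*r, returns sign(x*r).
    s_masked = np.sign(masked)
    # Server, lines 6-7: unmask the sign, then apply identity (2).
    s = s_masked * np.sign(r)          # sign(x) = sign(x*r) * sign(r)
    return 0.5 * (x + x * s)

x = np.array([-2.0, 0.0, 3.0, -0.5])
assert np.array_equal(herelu_simulation(x), np.maximum(x, 0.0))
```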
Lemma 1. The $\mathsf{HeReLU}$ protocol is secure in the semi-honest model.
Proof. We prove that the security of our protocol follows Definition 1. We first prove correctness and then analyze the security of the protocol.
Correctness. The correctness of the protocol is straightforward. We first prove the correctness of computing $\left[\mathsf{sign}\left(\mathit{x}\right)\right]$. At line 6, we have $\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\right]\otimes \mathsf{sign}\left(\mathit{r}\right)=\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\times \mathsf{sign}\left(\mathit{r}\right)\right]=\left[\mathsf{sign}\left(\mathit{x}\right)\right]$. Now, we prove the correctness of computing $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$. At line 7, we have $\frac{1}{2}\otimes \left(\left[\mathit{x}\right]\oplus \left[\mathit{x}\right]\otimes \left[\mathsf{sign}\left(\mathit{x}\right)\right]\right)=\left[\frac{1}{2}\left(\mathit{x}+\mathit{x}\times \mathsf{sign}\left(\mathit{x}\right)\right)\right]=\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$ (Formula (2)).
Security. Following Definition 1, we consider two cases, i.e., corrupted $\mathcal{C}$ and corrupted $\mathcal{S}$.
Corrupted $\mathcal{C}$: $\mathcal{C}$'s view is ${\mathsf{View}}_{\mathcal{C}}=\left(sk,\left[\mathit{x}\times \mathit{r}\right]\right)$. Given $sk$, we build a simulator ${\mathsf{Sim}}_{\mathcal{C}}$ as follows:
1. Pick ${\mathit{x}}^{\prime}\leftarrow {\mathbb{R}}^{n}$;
2. Output $\left(sk,\left[{\mathit{x}}^{\prime}\right]\right)$.
Even though $\mathcal{C}$ can decrypt $\left[\mathit{x}\times \mathit{r}\right]$ and obtain $\mathit{x}\times \mathit{r}$, because $\mathit{r}$ was randomly chosen by $\mathcal{S}$, from the view of $\mathcal{C}$: $\mathit{x}\times \mathit{r}{\approx}_{c}{\mathit{x}}^{\prime}\iff \left(sk,\left[\mathit{x}\times \mathit{r}\right]\right){\approx}_{c}\left(sk,\left[{\mathit{x}}^{\prime}\right]\right)\iff {\mathsf{View}}_{\mathcal{C}}{\approx}_{c}{\mathsf{Sim}}_{\mathcal{C}}$.
Corrupted $\mathcal{S}$: $\mathcal{S}$'s view is ${\mathsf{View}}_{\mathcal{S}}=\left(pk,\left[\mathit{x}\right],\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\right]\right)$. Given $\left(pk,\left[\mathit{x}\right],\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]\right)$, we build a simulator ${\mathsf{Sim}}_{\mathcal{S}}$ as follows:
1. Pick $\mathit{s}\leftarrow {\left\{-1,0,1\right\}}^{n}$;
2. Output $\left(pk,\left[\mathit{x}\right],\left[\mathit{s}\right]\right)$.
Based on the semantic security of HE schemes, it is obvious that $\left(pk,\left[\mathit{x}\right],\left[\mathsf{sign}\left(\mathit{x}\times \mathit{r}\right)\right]\right){\approx}_{c}\left(pk,\left[\mathit{x}\right],\left[\mathit{s}\right]\right)\iff {\mathsf{View}}_{\mathcal{S}}{\approx}_{c}{\mathsf{Sim}}_{\mathcal{S}}$. □
The idea of computing the $\mathsf{ReLU}$ function is based on Equation (2), i.e., $\mathsf{ReLU}\left(x\right)=\frac{1}{2}\left(x+x\cdot \mathsf{sign}\left(x\right)\right)$. This equation requires a division by 2, which is equivalent to multiplying by $0.5$ in CKKS. However, division is not supported by most other schemes, which work on integers in the ring ${\mathbb{Z}}_{n}$, such as BFV [23] and BGV [68]. In those schemes, the division by 2 can be replaced by multiplying by ${2}^{-1}$, the inverse of 2 in ${\mathbb{Z}}_{n}$. By doing so, our protocol is compatible with both CKKS and integer HE schemes.
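A small check of this replacement: in $\mathbb{Z}_n$ with odd $n$ (a hypothetical plaintext modulus below), multiplying by the inverse of 2 exactly halves $x+x\cdot \mathsf{sign}\left(x\right)$, which equals $2\,\mathsf{ReLU}\left(x\right)$ and is therefore always even.

```python
# In Z_n (odd n), division by 2 is multiplication by inv2 = 2^{-1} mod n.
n = 60491                      # hypothetical odd plaintext modulus
inv2 = pow(2, -1, n)           # modular inverse (Python 3.8+)

def halve_mod_n(y):
    return y * inv2 % n

# For x >= 0, x + x*sign(x) = 2x (even), so halving recovers ReLU(x) = x:
for x in [0, 1, 17, 2024]:
    y = (x + x * (1 if x > 0 else 0)) % n
    assert halve_mod_n(y) == x
assert (2 * inv2) % n == 1     # sanity check: inv2 really inverts 2
```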
$\mathsf{HeReLU}$ allows $\mathcal{S}$ to obtain $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$ without leaking $\mathit{x}$ to $\mathcal{S}$ or $\mathcal{C}$. However, it leaks some unexpected information about $\mathit{x}$ to $\mathcal{C}$: namely, whether the $i$th slot in $\mathit{x}$ is zero or not (we denote this by $\mathit{x}\stackrel{?}{=}0$). This leakage occurs at line 4 in Protocol 1. In particular, $\mathcal{C}$ learns that a slot ${\mathit{x}}_{i}=0$ if the $i$th slot in $\mathit{x}\times \mathit{r}$ is 0 (as $\mathit{r}\ne 0$). We address this information leakage by proposing a novel NN permutation technique, detailed in Section 4.5.
4.3. $\mathsf{HeRefresh}$: Refreshing Ciphertexts
LHE schemes limit the depth of the circuit that can be evaluated. Furthermore, in the case of deep NNs, the multiplicative depth of the HE scheme must be large, which causes computational overhead (as mentioned in Section 3.2). To address these issues, we design a simple but efficient protocol, $\mathsf{HeRefresh}$, to regularly refresh intermediate layers in NNs. In this section, we use $\left[\mathit{x},\ell \right]$ to denote that the ciphertext $\left[\mathit{x}\right]$ is at level $\ell$ (i.e., $\ell$ more multiplications can be performed on this ciphertext).
Ciphertext refreshing problem. $\mathcal{C}$ holds the secret key $sk$; the server holds the public key $pk$ but cannot access $sk$. $\mathcal{S}$ holds a ciphertext $\left[\mathit{x},\ell \right]$ ($\mathit{x}\in {\mathbb{R}}^{n}$) encrypted under $sk$. $\mathcal{S}$ must obtain $\left[\mathit{x},L\right]$, s.t. $L>\ell$, without $\mathit{x}$ being leaked to either $\mathcal{C}$ or $\mathcal{S}$.
We now present our protocol that solves the ciphertext refreshing problem described above, shown in Protocol 2. We first give the intuition behind the protocol and then prove its correctness and security (as stated in Lemma 2).
Protocol 2 $\mathsf{HeRefresh}$: Ciphertext refreshing protocol
Input $\mathcal{C}$: Secret key $sk$. Input $\mathcal{S}$: Public key $pk$, $\left[\mathit{x},\ell \right]$. Output $\mathcal{S}$: $\left[\mathit{x},L\right]$.
$\mathcal{S}$ runs the following steps:
1: Picks $\mathit{r}\leftarrow {\mathbb{R}}^{n}$
2: $\left[\mathit{x}+\mathit{r},\ell \right]\leftarrow \left[\mathit{x},\ell \right]\oplus \mathit{r}$
3: Sends $\left[\mathit{x}+\mathit{r},\ell \right]$ to $\mathcal{C}$
$\mathcal{C}$ runs the following steps:
4: $\mathit{x}+\mathit{r}\leftarrow Decrypt(\left[\mathit{x}+\mathit{r},\ell \right],sk)$
5: Encrypts $\mathit{x}+\mathit{r}$ and sends $\left[\mathit{x}+\mathit{r},L\right]$ to $\mathcal{S}$
$\mathcal{S}$ runs the following steps:
6: $\left[\mathit{x},L\right]\leftarrow \left[\mathit{x}+\mathit{r},L\right]\ominus \mathit{r}$

Lemma 2. The $\mathsf{HeRefresh}$ protocol is secure in the semi-honest model.
Intuition. The idea of the protocol is as follows. First, $\mathcal{S}$ additively blinds $\left[\mathit{x}\right]$ with a random mask $\mathit{r}$ (line 2) and sends the masked ciphertext $\left[\mathit{x}+\mathit{r}\right]$ to $\mathcal{C}$ (line 3). Then, $\mathcal{C}$ decrypts the ciphertext and obtains the masked message $\mathit{x}+\mathit{r}$, which perfectly hides $\mathit{x}$ from $\mathcal{C}$ in an information-theoretic way (it is a one-time pad). $\mathcal{C}$ then re-encrypts the masked message into $\left[\mathit{x}+\mathit{r}\right]$. This ciphertext is at the highest level L, as it has not undergone any multiplication; the decryption–encryption procedure thus acts as the ciphertext refresher. Finally, $\mathcal{S}$ homomorphically subtracts $\mathit{r}$ and obtains a new ciphertext at level L. $\mathsf{HeRefresh}$ is essential to enable continued computation without increasing the encryption parameters: rather than selecting encryption parameters large enough to support the entire NN, they need only be large enough to support the linear layers and the computation in the $\mathsf{HeReLU}$ protocol.
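In plaintext arithmetic (ignoring the actual encryption and levels), the masking in $\mathsf{HeRefresh}$ can be sketched as follows; the values are illustrative:

```python
import random

# Plaintext sketch of HeRefresh (Protocol 2): the server blinds x with a
# one-time additive mask r, the client decrypts and re-encrypts (modelled
# here as the identity on the plaintext), and the server unblinds.
# Encryption itself is omitted; only the masking arithmetic is shown.
x = [1.5, -2.0, 3.25]                       # server-side (encrypted) values
r = [random.uniform(-100, 100) for _ in x]  # server's random mask
masked = [xi + ri for xi, ri in zip(x, r)]  # line 2: [x + r]
# the client decrypts `masked`, sees only x + r, re-encrypts at level L
refreshed = [m - ri for m, ri in zip(masked, r)]  # line 6: subtract r
assert all(abs(a - b) < 1e-9 for a, b in zip(refreshed, x))
```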
Proof. We first prove the correctness and then analyze the security of the protocol based on Definition 1.
Correctness. The correctness of the protocol is straightforward. We first consider the level of the ciphertext. At line 5, $\mathcal{C}$ encrypts the plain message $\mathit{x}+\mathit{r}$ into $\left[\mathit{x}+\mathit{r}\right]$. This new ciphertext has not undergone any computation; hence, it is at the highest level of the HE scheme, i.e., L. In line 6, the ciphertext involves one homomorphic subtraction; hence, the resulting ciphertext remains at the level L.
From now on, for the sake of simplicity, we drop the level notation in ciphertexts. At line 6 of the protocol, $\left[\mathit{x}+\mathit{r}\right]\ominus \mathit{r}=\left[\mathit{x}+\mathit{r}-\mathit{r}\right]=\left[\mathit{x}\right]$.
Security. We now consider two cases, following Definition 1, i.e., corrupted $\mathcal{C}$ and corrupted $\mathcal{S}$.
Corrupted $\mathcal{C}$: $\mathcal{C}$’s view is ${\mathsf{View}}_{\mathcal{C}}=\left(sk,\left[\mathit{x}+\mathit{r}\right]\right)$. Given $\left(sk\right)$, we build a simulator ${\mathsf{Sim}}_{\mathcal{C}}$ as below:
 1.
Pick ${\mathit{x}}^{\prime}\leftarrow {\mathbb{R}}^{n}$;
 2.
Output $(sk,\left[{\mathit{x}}^{\prime}\right])$.
Even though $\mathcal{C}$ can decrypt $\left[\mathit{x}+\mathit{r}\right]$ and obtain $\mathit{x}+\mathit{r}$, because $\mathit{r}$ was chosen uniformly at random by $\mathcal{S}$, from the view of $\mathcal{C}$: $\mathit{x}+\mathit{r}{\approx}_{c}{\mathit{x}}^{\prime}\iff \left(sk,\left[\mathit{x}+\mathit{r}\right]\right){\approx}_{c}(sk,\left[{\mathit{x}}^{\prime}\right])\iff {\mathsf{View}}_{\mathcal{C}}{\approx}_{c}{\mathsf{Sim}}_{\mathcal{C}}$.
Corrupted $\mathcal{S}$: $\mathcal{S}$’s view is ${\mathsf{View}}_{\mathcal{S}}=\left(pk,\left[\mathit{x},\ell \right],\left[\mathit{x}+\mathit{r},L\right]\right)$. Given $(pk,\left[\mathit{x},\ell \right],\left[\mathit{x},L\right])$, we build a simulator ${\mathsf{Sim}}_{\mathcal{S}}$ as below:
 1.
Pick $\mathit{s}\leftarrow {\mathbb{R}}^{n}$;
 2.
Output $\left(pk,\left[\mathit{x},\ell \right],\left[\mathit{s},L\right]\right)$.
Based on the semantic security of HE schemes, it is obvious that $\left(pk,\left[\mathit{x},\ell \right],\left[\mathit{x}+\mathit{r},L\right]\right){\approx}_{c}\left(pk,\left[\mathit{x},\ell \right],\left[\mathit{s},L\right]\right)\iff {\mathsf{View}}_{\mathcal{S}}{\approx}_{c}{\mathsf{Sim}}_{\mathcal{S}}$. □
We now leverage $\mathsf{HeReLU}$ and $\mathsf{HeRefresh}$ to design an SNNI framework. We start from a simple framework, ${\mathsf{HeFUN}}_{\mathsf{naive}}$, which straightforwardly applies $\mathsf{HeReLU}$ to SNNI (Section 4.4). ${\mathsf{HeFUN}}_{\mathsf{naive}}$ satisfies almost all the requirements of SNNI, except that it leaks ${\mathit{x}}^{\left(i\right)}\stackrel{?}{=}0$ to $\mathcal{C}$. We then improve ${\mathsf{HeFUN}}_{\mathsf{naive}}$ to prevent such information leakage, which results in the $\mathsf{HeFUN}$ protocol (Section 4.5).
4.4. ${\mathsf{HeFUN}}_{\mathsf{naive}}$ Protocol
This section presents our method to solve the SNNI problem, which uses HE to evaluate linear layers (as described in Section 4.1), the $\mathsf{HeReLU}$ protocol to evaluate the $\mathsf{ReLU}$ activation function, and the $\mathsf{HeRefresh}$ protocol to refresh intermediate layers. Interestingly, $\mathsf{HeRefresh}$ and $\mathsf{HeReLU}$ can be run in parallel on the same input ${\mathit{x}}^{\left(i\right)}$, which saves communication rounds. The protocol is shown in Protocol 3. ${\mathsf{HeFUN}}_{naive}$ straightforwardly applies the $\mathsf{HeReLU}$ protocol to compute the $\mathsf{ReLU}$ activation function. As shown in Section 4.2, this leaks ${\mathit{x}}^{\left(i\right)}\stackrel{?}{=}0$ to $\mathcal{C}$ when $\mathcal{C}$ and $\mathcal{S}$ run $\mathsf{HeReLU}$ (line 4). In the next section, we present an improved version of ${\mathsf{HeFUN}}_{naive}$, i.e., $\mathsf{HeFUN}$, which overcomes this data leakage.
Protocol 3 ${\mathsf{HeFUN}}_{naive}$ framework
Input $\mathcal{C}$: Data $\mathit{x}$, secret key $sk$. Input $\mathcal{S}$: Public key $pk$, trained weights and biases $({\mathit{W}}_{i},{\mathit{b}}_{i})$ with $i=\overline{1,L}$. Output $\mathcal{C}$: $\mathbf{z}:={f}_{L}\left({\mathit{W}}_{L},{a}_{L-1}\left({f}_{L-1}\left(\dots {a}_{1}\left({f}_{1}\left({\mathit{W}}_{1},\mathit{x},{\mathit{b}}_{1}\right)\right)\right)\right),{\mathit{b}}_{L}\right)$, where ${f}_{i}$ is the $i$th linear layer and ${a}_{i}$ is the $\mathsf{ReLU}$ activation function.
1: $\mathcal{C}$ encrypts data $\mathit{x}$ and sends $\left[\mathit{x}\right]$ to the server
2: $\mathcal{S}$ computes $\left[{\mathit{x}}^{\left(1\right)}\right]={f}_{1}(\left[\mathit{x}\right],{\mathit{W}}^{\left(1\right)},{\mathit{b}}^{\left(1\right)})$
3: for $i=1$ to $L-1$ do
4: $\mathcal{S}$ obtains refreshed $\left[{\mathit{x}}^{\left(i\right)}\right]$ ▹ $\mathcal{C}$ and $\mathcal{S}$ run Protocol 2 on $\left[{\mathit{x}}^{\left(i\right)}\right]$
5: $\mathcal{S}$ obtains $\left[{\mathit{h}}^{\left(i\right)}\right]=\left[\mathsf{ReLU}\left({\mathit{x}}^{\left(i\right)}\right)\right]$ ▹ $\mathcal{C}$ and $\mathcal{S}$ run Protocol 1 on $\left[{\mathit{x}}^{\left(i\right)}\right]$
6: $\mathcal{S}$ computes $\left[{\mathit{x}}^{(i+1)}\right]={f}_{i+1}(\left[{\mathit{h}}^{\left(i\right)}\right],{\mathit{W}}^{(i+1)},{\mathit{b}}^{(i+1)})$
7: $\mathcal{S}$ sends $\left[{\mathit{x}}^{\left(L\right)}\right]$ to $\mathcal{C}$
8: $\mathcal{C}$ obtains $\mathbf{z}\leftarrow Decrypt(\left[{\mathit{x}}^{\left(L\right)}\right],sk)$

4.5. $\mathsf{HeFUN}$
We first present the intuition behind how we can hide ${\mathit{x}}^{\left(i\right)}\stackrel{?}{=}0$ from $\mathcal{C}$.
Intuition. To tackle the issue of leaking ${\mathit{x}}^{\left(i\right)}\stackrel{?}{=}0$ to $\mathcal{C}$, $\mathcal{S}$ applies a random permutation to ${\mathit{x}}^{\left(i\right)}$ prior to executing the $\mathsf{HeReLU}$ protocol. Then, $\mathcal{C}$ cannot determine the original position of a value, concealing whether the actual value of a particular slot is zero. Thus, $\mathcal{C}$ only learns the number of zero values, which can be obscured by adding dummy elements. However, in SNNI, ${\mathit{x}}^{\left(i\right)}$ is in encrypted form, i.e., $\left[{\mathit{x}}^{\left(i\right)}\right]$, and permuting $\left[{\mathit{x}}^{\left(i\right)}\right]$ is a challenging task. This is because the permutation requires swapping slots in the ciphertext, which requires numerous homomorphic rotations, a costly operation [11,15]. This approach is not practical when the number of neurons in NN layers is large. The section below describes how we overcome this challenge.
Given a vector $\mathit{x}\in {\mathbb{R}}^{n}$ and a permutation $\pi$ over $\left\{1,\dots ,n\right\}$, let $\pi \left(\mathit{x}\right)$ denote the permutation of $\mathit{x}$’s slots according to $\pi$, i.e., the $i$th slot in the permuted vector $\pi \left(\mathit{x}\right)$ is ${\mathit{x}}_{\pi \left(i\right)}$ (instead of ${\mathit{x}}_{i}$). Given a matrix $\mathit{W}$, we use $\pi \left(\mathit{W}\right)$ to denote the permutation of $\mathit{W}$ by columns, i.e., the $i$th column in the permuted matrix $\pi \left(\mathit{W}\right)$ is ${\mathit{W}}_{\pi \left(i\right)}$ (instead of ${\mathit{W}}_{i}$). Here are some obvious facts:
Permuting two vectors with the same permutation preserves element-wise addition: $\pi \left(\mathit{x}\right)+\pi \left(\mathit{y}\right)=\pi (\mathit{x}+\mathit{y})$ (3).
Permuting two vectors with the same permutation preserves their dot product: $\pi \left(\mathit{x}\right)\cdot \pi \left(\mathit{y}\right)=\mathit{x}\cdot \mathit{y}$ (4).
Permuting a vector and every column in a matrix with the same permutation preserves the vector–matrix product: $\pi \left(\mathit{x}\right)\cdot \pi (\mathit{W},col)=\mathit{x}\cdot \mathit{W}$ (5), where $\pi (\mathit{W},col)$ denotes that we apply the permutation $\pi$ to every column of $\mathit{W}$.
In vector–matrix multiplication, permuting (only) the columns of a matrix leads to the same permutation on the result: $\mathit{x}\cdot \pi \left(\mathit{W}\right)=\pi (\mathit{x}\cdot \mathit{W})$ (6).
Proof. Proof of Property (3): $\pi \left(\mathit{x}\right)+\pi \left(\mathit{y}\right)=({\mathit{x}}_{\pi \left(1\right)}+{\mathit{y}}_{\pi \left(1\right)},\dots ,{\mathit{x}}_{\pi \left(n\right)}+{\mathit{y}}_{\pi \left(n\right)})=\pi (\mathit{x}+\mathit{y})$. The other properties follow similarly. □
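These facts can be checked mechanically. The following plain-Python sketch (toy vectors and an assumed $3\times 2$ matrix) verifies Properties (3), (5), and (6) under the slot convention $\pi(\mathit{x})_i = {\mathit{x}}_{\pi(i)}$ used above:

```python
# Plain-Python check of the permutation facts used in the text, with
# pi(x)_i = x_{pi(i)}. All values are illustrative.
def perm_vec(x, pi):            # pi(x)
    return [x[pi[i]] for i in range(len(x))]

def perm_rows(W, pi):           # pi(W, col): apply pi to every column
    return [W[pi[i]] for i in range(len(W))]

def perm_cols(W, pi):           # pi(W): reorder the columns by pi
    return [[row[pi[j]] for j in range(len(row))] for row in W]

def vecmat(x, W):               # (x . W)_j = sum_i x_i W_ij
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pi, pi2 = [2, 0, 1], [1, 0]
# (3): element-wise addition commutes with permutation
assert perm_vec([a + b for a, b in zip(x, y)], pi) == \
       [a + b for a, b in zip(perm_vec(x, pi), perm_vec(y, pi))]
# (5): permuting x and every column of W preserves x . W
assert vecmat(perm_vec(x, pi), perm_rows(W, pi)) == vecmat(x, W)
# (6): permuting only the columns of W permutes the result
assert vecmat(x, perm_cols(W, pi2)) == perm_vec(vecmat(x, W), pi2)
```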
Based on Property (6) and the fact that $\mathit{x}=\mathit{x}\cdot \mathit{I}$, where $\mathit{I}$ is the identity matrix, we can permute $\mathit{x}$ by simply permuting the identity matrix $\mathit{I}$, namely $\mathit{x}\cdot \pi \left(\mathit{I}\right)=\pi (\mathit{x}\cdot \mathit{I})=\pi \left(\mathit{x}\right)$. Based on this observation, we propose a novel algorithm, $\mathsf{HePerm}$, to permute slots inside a ciphertext without the rotation operator. The algorithm is shown in Protocol 4; its correctness is straightforward, as below.
Protocol 4 $\mathsf{HePerm}$: Homomorphic permutation
Input: ciphertext $\left[\mathit{x}\right]$, permutation $\pi$. Output: $\left[\pi \left(\mathit{x}\right)\right]$.
1: $\left[\pi \left(\mathit{x}\right)\right]=\left[\mathit{x}\right]\odot \pi \left(\mathit{I}\right)$
2: return $\left[\pi \left(\mathit{x}\right)\right]$

The $\mathsf{HePerm}$ algorithm allows us to permute any (encrypted) intermediate layer, i.e., $\left[{\mathit{x}}^{\left(i\right)}\right]$, prior to executing the $\mathsf{HeReLU}$ protocol. Hence, we can hide ${\mathit{x}}^{\left(i\right)}\stackrel{?}{=}0$ from $\mathcal{C}$. Suppose each intermediate layer $\left[{\mathit{x}}^{\left(i\right)}\right]$ (before the $\mathsf{ReLU}$ layer) is permuted by a permutation ${\pi}^{\left(i\right)}$. Then, the value obtained by $\mathcal{S}$ (after running $\mathsf{HeReLU}$) is $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]$ (not $\left[{\mathit{h}}^{\left(i\right)}\right]$). This value will be the input for the next linear layer. Therefore, to compute the next linear layer, $\mathcal{S}$ first needs to unpermute $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]$. To do so, $\mathcal{S}$ runs $\mathsf{HePerm}$ on $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]$ and ${{\pi}^{\left(i\right)}}^{-1}$, which results in $\left[{{\pi}^{\left(i\right)}}^{-1}\left({\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right)\right]=\left[{\mathit{h}}^{\left(i\right)}\right]$. Then, $\mathcal{S}$ can homomorphically evaluate the next linear layer on $\left[{\mathit{h}}^{\left(i\right)}\right]$, which results in $\left[{\mathit{x}}^{(i+1)}\right]$. Finally, $\mathcal{S}$ can run $\mathsf{HePerm}$ to permute $\left[{\mathit{x}}^{(i+1)}\right]$ using permutation ${\pi}^{(i+1)}$. Interestingly, for a fully connected layer, we can achieve permutation without the need for $\mathsf{HePerm}$. This is because the fully connected layer can be homomorphically evaluated directly on $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]$ by permuting each column in ${\mathit{W}}^{(i+1)}$ by ${\pi}^{\left(i\right)}$ (based on Property (5)), namely $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]\odot {\pi}^{\left(i\right)}({\mathit{W}}^{(i+1)},col)\oplus {\mathit{b}}^{(i+1)}=\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\cdot {\pi}^{\left(i\right)}({\mathit{W}}^{(i+1)},col)+{\mathit{b}}^{(i+1)}\right]=\left[{\mathit{h}}^{\left(i\right)}\cdot {\mathit{W}}^{(i+1)}+{\mathit{b}}^{(i+1)}\right]$. Therefore, we achieve permutation for free in fully connected layers.
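To illustrate the "free" permutation in fully connected layers, the sketch below (toy values, plain arithmetic standing in for the homomorphic operations) checks that evaluating a permuted layer this way yields exactly the permutation of the usual output:

```python
# Plain-Python check (toy sizes) that a fully connected layer evaluated
# on a permuted input, with its weight rows permuted by the same pi_i and
# its columns permuted by the next-layer pi_next, outputs the pi_next
# permutation of the normal result.
def perm_vec(x, pi):  return [x[pi[i]] for i in range(len(x))]
def perm_rows(W, pi): return [W[pi[i]] for i in range(len(W))]   # pi(W, col)
def perm_cols(W, pi): return [[r[pi[j]] for j in range(len(r))] for r in W]
def vecmat(x, W):
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]

h = [1.0, -2.0, 3.0]
W = [[1.0, 0.0], [2.0, 1.0], [-1.0, 4.0]]
b = [0.5, -0.5]
pi_i, pi_next = [2, 0, 1], [1, 0]

lhs = [v + bv for v, bv in zip(
    vecmat(perm_vec(h, pi_i), perm_cols(perm_rows(W, pi_i), pi_next)),
    perm_vec(b, pi_next))]
x_next = [v + bv for v, bv in zip(vecmat(h, W), b)]   # h . W + b
assert lhs == perm_vec(x_next, pi_next)               # permutation "for free"
```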
Assume that $\mathcal{S}$ chooses L random permutations ${\pi}^{\left(1\right)},\dots ,{\pi}^{\left(L\right)}$ corresponding to layers 1 to L, respectively. Notably, ${\pi}^{\left(1\right)},\dots ,{\pi}^{\left(L\right)}$ can be chosen offline by $\mathcal{S}$ before the inference process, which saves inference time. The details of $\mathsf{HeFUN}$ are shown in Protocol 5, in which $FC$ denotes a fully connected layer and $Conv$ denotes a convolution layer. $\mathsf{HeFUN}$’s security follows from Lemma 3. We now provide an analysis of the correctness and security of the protocol.
Protocol 5 $\mathsf{HeFUN}$ framework
Input $\mathcal{C}$: Data $\mathit{x}$, secret key $sk$. Input $\mathcal{S}$: Public key $pk$, trained weights and biases $({\mathit{W}}_{i},{\mathit{b}}_{i})$ with $i=\overline{1,L}$. Output $\mathcal{C}$: $\mathbf{z}={f}_{L}\left({\mathit{W}}_{L},{a}_{L-1}\left({f}_{L-1}\left(\dots {a}_{1}\left({f}_{1}\left({\mathit{W}}_{1},\mathit{x},{\mathit{b}}_{1}\right)\right)\right)\right),{\mathit{b}}_{L}\right)$, where ${f}_{i}$ is the $i$th linear layer and ${a}_{i}$ is the $\mathsf{ReLU}$ activation function.
1: $\mathcal{C}$ encrypts data $\mathit{x}$ and sends $\left[\mathit{x}\right]$ to the server
$\mathcal{S}$ runs the following steps:
2: $\mathcal{S}$ chooses ${\pi}^{\left(1\right)},\dots ,{\pi}^{\left(L\right)}$ ▹ Can be done offline
3: if ${f}_{1}=FC$ then
4: $\left[{\pi}^{\left(1\right)}\left({\mathit{x}}^{\left(1\right)}\right)\right]=\left[\mathit{x}\right]\odot {\pi}^{\left(1\right)}\left({\mathit{W}}^{\left(1\right)}\right)\oplus {\pi}^{\left(1\right)}\left({\mathit{b}}^{\left(1\right)}\right)$
5: else if ${f}_{1}=Conv$ then
6: $\left[{\mathit{x}}^{\left(1\right)}\right]=\left[\mathit{x}\right]\circledast {\mathit{W}}^{\left(1\right)}\oplus {\mathit{b}}^{\left(1\right)}$
7: $\left[{\pi}^{\left(1\right)}\left({\mathit{x}}^{\left(1\right)}\right)\right]\leftarrow \mathsf{HePerm}(\left[{\mathit{x}}^{\left(1\right)}\right],{\pi}^{\left(1\right)})$
8: for $i=1$ to $L-1$ do
9: $\mathcal{S}$ obtains refreshed $\left[{\pi}^{\left(i\right)}\left({\mathit{x}}^{\left(i\right)}\right)\right]$ ▹ $\mathcal{C}$ and $\mathcal{S}$ run Protocol 2 on $\left[{\pi}^{\left(i\right)}\left({\mathit{x}}^{\left(i\right)}\right)\right]$
10: $\mathcal{S}$ obtains $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]=\left[\mathsf{ReLU}\left({\pi}^{\left(i\right)}\left({\mathit{x}}^{\left(i\right)}\right)\right)\right]$ ▹ $\mathcal{C}$ and $\mathcal{S}$ run Protocol 1 on $\left[{\pi}^{\left(i\right)}\left({\mathit{x}}^{\left(i\right)}\right)\right]$
11: if ${f}_{i+1}=FC$ then
12: $\left[{\pi}^{(i+1)}\left({\mathit{x}}^{(i+1)}\right)\right]=\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]\odot {\pi}^{(i+1)}\left({\pi}^{\left(i\right)}({\mathit{W}}^{(i+1)},col)\right)\oplus {\pi}^{(i+1)}\left({\mathit{b}}^{(i+1)}\right)$
13: else if ${f}_{i+1}=Conv$ then
14: $\left[{\mathit{h}}^{\left(i\right)}\right]\leftarrow \mathsf{HePerm}(\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right],{{\pi}^{\left(i\right)}}^{-1})$
15: $\left[{\mathit{x}}^{(i+1)}\right]=\left[{\mathit{h}}^{\left(i\right)}\right]\circledast {\mathit{W}}^{(i+1)}\oplus {\mathit{b}}^{(i+1)}$
16: $\left[{\pi}^{(i+1)}\left({\mathit{x}}^{(i+1)}\right)\right]\leftarrow \mathsf{HePerm}(\left[{\mathit{x}}^{(i+1)}\right],{\pi}^{(i+1)})$
17: $\mathcal{S}$ sends ${\pi}^{\left(L\right)}$ and $\left[{\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)\right]$ to $\mathcal{C}$
$\mathcal{C}$ runs the following steps:
18: ${\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)\leftarrow Decrypt(\left[{\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)\right],sk)$
19: $\mathcal{C}$ obtains ${\mathit{x}}^{\left(L\right)}={{\pi}^{\left(L\right)}}^{-1}\left({\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)\right)$
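The permuted pipeline above can be checked end-to-end in plaintext. The sketch below (assumed toy layer sizes, plain arithmetic in place of ciphertexts, fully connected layers only) runs a two-layer network with per-layer permutations and shows that removing only the final permutation recovers the standard inference result:

```python
import random

# Plaintext sketch of the HeFUN pipeline for two FC layers: the server
# evaluates every layer on permuted data; the client removes only the
# last permutation. Sizes and values are illustrative.
def perm_vec(x, pi):  return [x[pi[i]] for i in range(len(x))]
def perm_rows(W, pi): return [W[pi[i]] for i in range(len(W))]   # pi(W, col)
def perm_cols(W, pi): return [[r[pi[j]] for j in range(len(r))] for r in W]
def inv(pi):
    out = [0] * len(pi)
    for i, p in enumerate(pi):
        out[p] = i
    return out
def fc(x, W, b):  # x . W + b
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]
def relu(v): return [max(t, 0.0) for t in v]

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
b1 = [random.uniform(-1, 1) for _ in range(4)]
W2 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
b2 = [random.uniform(-1, 1) for _ in range(2)]
x = [0.3, -1.2, 0.7]
z_ref = fc(relu(fc(x, W1, b1)), W2, b2)      # unpermuted reference

pi1, pi2 = [2, 0, 3, 1], [1, 0]              # chosen offline by the server
x1p = fc(x, perm_cols(W1, pi1), perm_vec(b1, pi1))                    # line 4
h1p = relu(x1p)              # ReLU is slot-wise, so it commutes with pi1
x2p = fc(h1p, perm_cols(perm_rows(W2, pi1), pi2), perm_vec(b2, pi2))  # line 12
z = perm_vec(x2p, inv(pi2))                  # client unpermutes (line 19)
assert all(abs(a - b) < 1e-9 for a, b in zip(z, z_ref))
```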

Lemma 3. The $\mathsf{HeFUN}$ protocol is secure in the semi-honest model.
Proof. Correctness. We first consider the fully connected layers. At line 12, by Properties (3), (5), and (6), we have: $\left[{\pi}^{\left(i\right)}\left({\mathit{h}}^{\left(i\right)}\right)\right]\odot {\pi}^{(i+1)}\left({\pi}^{\left(i\right)}({\mathit{W}}^{(i+1)},col)\right)\oplus {\pi}^{(i+1)}\left({\mathit{b}}^{(i+1)}\right)=\left[{\pi}^{(i+1)}\left({\mathit{h}}^{\left(i\right)}\cdot {\mathit{W}}^{(i+1)}\right)+{\pi}^{(i+1)}\left({\mathit{b}}^{(i+1)}\right)\right]=\left[{\pi}^{(i+1)}\left({\mathit{h}}^{\left(i\right)}\cdot {\mathit{W}}^{(i+1)}+{\mathit{b}}^{(i+1)}\right)\right]=\left[{\pi}^{(i+1)}\left({\mathit{x}}^{(i+1)}\right)\right]$.
We now consider convolution layers, i.e., lines 14, 15, and 16. The correctness of convolution layers is straightforward based on the correctness of the $\mathsf{HePerm}$ algorithm and the homomorphic property.
The above process, i.e., computing the linear layer and then the $\mathsf{ReLU}$ activation function, is repeated through the layers. At the end, $\mathcal{C}$ obtains ${\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)$ (line 18). Because $\mathcal{S}$ sends $\mathcal{C}$ the permutation of the last layer, i.e., ${\pi}^{\left(L\right)}$ (line 17), $\mathcal{C}$ can unpermute ${\pi}^{\left(L\right)}\left({\mathit{x}}^{\left(L\right)}\right)$ and obtain ${\mathit{x}}^{\left(L\right)}=\mathbf{z}$.
Security. We prove the security using the modular sequential composition introduced in [22]. We first construct a framework ${\mathsf{HeFUN}}^{\prime}$, in which $\mathcal{C}$ and $\mathcal{S}$ call ideal functionalities ${\mathcal{F}}_{Refresh}$ and ${\mathcal{F}}_{ReLU}$ to refresh ciphertexts and evaluate the $\mathsf{ReLU}$ function. $\mathsf{HeRefresh}$ and $\mathsf{HeReLU}$ are already secure in the semi-honest model (according to Lemmas 2 and 1, respectively). Hence, we only need to prove that ${\mathsf{HeFUN}}^{\prime}$ (with the calls to ${\mathcal{F}}_{Refresh}$ and ${\mathcal{F}}_{ReLU}$) is secure in the semi-honest model. Then, we can replace the calls to the ideal functionalities ${\mathcal{F}}_{Refresh}$ and ${\mathcal{F}}_{ReLU}$ with the protocols $\mathsf{HeRefresh}$ and $\mathsf{HeReLU}$, respectively, and conclude the security of $\mathsf{HeFUN}$ by invoking modular sequential composition (as presented in Section 3.3). We now consider two cases following Definition 1, i.e., corrupted $\mathcal{C}$ and corrupted $\mathcal{S}$. For a corrupted $\mathcal{C}$, it is obvious that $\mathcal{C}$ receives nothing from the ideal calls ${\mathcal{F}}_{Refresh}$ and ${\mathcal{F}}_{ReLU}$. Hence, we only need to consider the case of a corrupted $\mathcal{S}$, as below:
Corrupted $\mathcal{S}$: $\mathcal{S}$’s view is ${\mathsf{View}}_{\mathcal{S}}=\left(pk,{f}_{1},\dots ,{f}_{L},{\mathit{W}}^{\left(1\right)},\dots ,{\mathit{W}}^{\left(L\right)},{\mathit{b}}^{\left(1\right)},\dots ,{\mathit{b}}^{\left(L\right)};\left[{\mathit{x}}^{\left(1\right)}\right],\dots ,\left[{\mathit{x}}^{\left(L\right)}\right]\right)$.
Given $(pk,{f}_{1},\dots ,{f}_{L},{\mathit{W}}^{\left(1\right)},\dots ,{\mathit{W}}^{\left(L\right)},{\mathit{b}}^{\left(1\right)},\dots ,{\mathit{b}}^{\left(L\right)})$, we build a simulator ${\mathsf{Sim}}_{\mathcal{S}}$ as below:
 1.
Pick ${\mathit{s}}_{i}\leftarrow {\left\{-1,0,1\right\}}^{n}$ with $i=\overline{1,L}$;
 2.
Output $\left(pk,{f}_{1},\dots ,{f}_{L},{\mathit{W}}^{\left(1\right)},\dots ,{\mathit{W}}^{\left(L\right)},{\mathit{b}}^{\left(1\right)},\dots ,{\mathit{b}}^{\left(L\right)};\left[{\mathit{s}}_{1}\right],\dots ,\left[{\mathit{s}}_{L}\right]\right)$.
Given the semantic security of the HE scheme, coupled with the uncertainty introduced by our homomorphic permutation algorithm, it follows that $\left[{\mathit{x}}^{\left(i\right)}\right]{\approx}_{c}\left[{\mathit{s}}_{i}\right]$. Hence, ${\mathsf{View}}_{\mathcal{S}}{\approx}_{c}{\mathsf{Sim}}_{\mathcal{S}}$. □
5. Experiments
In the evaluation of SNNI, we consider three critical criteria: security, accuracy, and efficiency. The comparative analysis outlined in Table 1 leads to several observations: (1) MPC-based frameworks fail to meet the security criterion, as they disclose the entire architecture of NNs; (2) TFHE-based frameworks fall short on accuracy, since they are limited to BNNs; and (3) LHE-based frameworks, HE-MPC hybrid frameworks, and our proposed $\mathsf{HeFUN}$ framework appear to fulfill all three criteria (it should be noted that the fulfillment of these requirements is not absolute, and some issues still persist, as outlined in Table 1). Hence, in this section, we compare our proposed approach, $\mathsf{HeFUN}$, with HE-MPC hybrid-based frameworks and LHE-based frameworks. The core architecture of both $\mathsf{HeFUN}$ and prevailing HE-MPC hybrid frameworks encompasses two fundamental components: linear layer evaluation and the evaluation of the nonlinear function, specifically the $\mathsf{ReLU}$ function. While the linear layer evaluation in these frameworks uniformly employs homomorphic encryption, the difference arises in the treatment of nonlinear layer evaluation: $\mathsf{HeFUN}$ implements $\mathsf{HeReLU}$, whereas HE-MPC hybrid-based frameworks opt for either the GC or ABY protocol. This section examines the communication complexity of the $\mathsf{ReLU}$ evaluation within $\mathsf{HeFUN}$ as compared to that within existing HE-MPC hybrid frameworks (Section 5.1). $\mathsf{HeReLU}$ serves as a potential alternative to the GC/ABY protocols in HE-MPC hybrid-based frameworks, reducing the communication complexity in SNNI. To compare $\mathsf{HeFUN}$ with LHE-based frameworks, we first compare the ciphertext refreshing component; in particular, we compare $\mathsf{HeRefresh}$ with the bootstrapping technique in existing HE schemes (Section 5.2). Subsequently, we benchmark $\mathsf{HeFUN}$ alongside prevalent LHE-based frameworks using real-world datasets (Section 5.3). All experiments are run on a PC with a single Intel i9-13900K running at 5.80 GHz and 64 GB of RAM, running Ubuntu 22.04. Overall, the results presented below show that:
Compared to current HE-MPC frameworks, $\mathsf{HeFUN}$ requires fewer communication rounds to evaluate the $\mathsf{ReLU}$ activation function (Section 5.1).
Compared to existing bootstrapping procedures, $\mathsf{HeRefresh}$ is much faster (Section 5.2).
Compared to current LHE-based frameworks, $\mathsf{HeFUN}$ achieves better accuracy and inference time (Section 5.3).
5.1. Comparison with the Hybrid HE-MPC Approach
The current state-of-the-art HE-MPC hybrid-based approach is Gazelle [15]. Within Gazelle’s framework, the server directly computes linear layers using the HE operators of the LHE scheme in an offline phase. To evaluate nonlinear layers, Gazelle leverages GC [19] to handle the bitwise operations required by $\mathsf{ReLU}$ (online phase). Finally, because each layer in an NN consists of alternating linear and nonlinear layers, Gazelle also elaborates an efficient method to switch between the two aforementioned primitives using a novel technique based on additive secret sharing. Gazelle’s primary bottleneck is the cost of evaluating GC for the $\mathsf{ReLU}$ activation function [15,69]. Compared to Gazelle, $\mathsf{HeFUN}$ can evaluate the $\mathsf{ReLU}$ function solely based on an HE scheme, thereby avoiding the expensive GC protocol and significantly reducing communication rounds and communication cost. We now compare the procedure for evaluating the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$ and Gazelle regarding communication rounds and communication cost. In both frameworks, after computing the linear layers homomorphically or receiving the client’s encrypted input, the server holds an encrypted vector $\left[\mathit{x}\right]$. The objective for both $\mathsf{HeFUN}$ and Gazelle is to enable the server to obtain $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$. In Gazelle, the server and the client perform the following steps:
Conversion from HE to MPC: The server additively blinds $\left[\mathit{x}\right]$ with a random mask ${\mathit{r}}_{\mathbf{1}}$ and sends the masked ciphertext $\left[\mathit{x}+{\mathit{r}}_{\mathbf{1}}\right]$ to the client.
MPC circuit evaluation: The client decrypts $\left[\mathit{x}+{\mathit{r}}_{\mathbf{1}}\right]$. Then, the client (holding $\mathit{x}+{\mathit{r}}_{\mathbf{1}}$) and the server (holding ${\mathit{r}}_{\mathbf{1}}$ and a random ${\mathit{r}}_{\mathbf{2}}$) run GC to compute $\mathsf{ReLU}(\mathit{x}+{\mathit{r}}_{\mathbf{1}}-{\mathit{r}}_{\mathbf{1}})+{\mathit{r}}_{\mathbf{2}}=\mathsf{ReLU}\left(\mathit{x}\right)+{\mathit{r}}_{\mathbf{2}}$ without leaking $\mathit{x}+{\mathit{r}}_{\mathbf{1}}$ to the server or ${\mathit{r}}_{\mathbf{1}}$ and ${\mathit{r}}_{\mathbf{2}}$ to the client. Finally, the client sends $\left[\mathsf{ReLU}\left(\mathit{x}\right)+{\mathit{r}}_{\mathbf{2}}\right]$ to the server.
Conversion from MPC to HE: The server homomorphically subtracts ${\mathit{r}}_{\mathbf{2}}$ and obtains $\left[\mathsf{ReLU}\left(\mathit{x}\right)\right]$.
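The mask arithmetic of these three steps can be checked in plaintext (toy scalar values, encryption and the actual GC evaluation omitted):

```python
import random

# Plaintext sketch of Gazelle's HE<->MPC conversion around ReLU: the
# homomorphic operations are replaced by arithmetic on the underlying
# values, to show that the masks r1 and r2 cancel out.
relu = lambda v: max(v, 0.0)
x = 1.7
r1, r2 = random.uniform(-10, 10), random.uniform(-10, 10)
masked = x + r1                      # step 1: server sends [x + r1]
gc_out = relu(masked - r1) + r2      # step 2: GC computes ReLU(x) + r2
result = gc_out - r2                 # step 3: server subtracts r2
assert abs(result - relu(x)) < 1e-9
```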
The main bottleneck of Gazelle is step 2, i.e., the client and the server run GC, which requires huge communication [15]. The comparison shown in Figure 1 illustrates that $\mathsf{HeFUN}$ surpasses Gazelle by reducing both communication rounds and costs. Specifically, $\mathsf{HeFUN}$ performs the $\mathsf{ReLU}$ evaluation in just two rounds, whereas Gazelle requires more: two rounds (to convert between HE and MPC and back) plus the communication rounds required by GC. In $\mathsf{HeFUN}$, two ciphertexts are transmitted between the client and the server, while the number of messages exchanged in Gazelle is larger: two ciphertexts ($\left[\mathit{x}+{\mathit{r}}_{\mathbf{1}}\right]$ and $\left[\mathsf{ReLU}\left(\mathit{x}\right)+{\mathit{r}}_{\mathbf{2}}\right]$) plus a huge number of messages exchanged in the GC protocol.
In Gazelle, the client needs to take part in the GC procedure, which is undesirable in SNNI. In contrast, in $\mathsf{HeFUN}$, the client only needs to perform decryption and encryption, which are simple algorithms in the HE scheme, while the other heavy computation takes place on the server.
One should acknowledge that in Gazelle, the server’s operations include homomorphic addition and subtraction; conversely, in $\mathsf{HeFUN}$, the server performs homomorphic multiplication (as delineated in line 7, Protocol 1), which necessitates an LHE scheme with larger parameters due to the additional multiplications required. Where Gazelle operates with a multiplicative depth of 1, $\mathsf{HeFUN}$ requires a depth of 3.
Another HE-MPC hybrid-based SNNI framework is MP2ML [16], which incorporates the ABY protocol [20] for the evaluation of the $\mathsf{ReLU}$ function, as opposed to GC. In the comparative analysis (see Figure 1), the GC component is substituted with an ABY component. It is clear that $\mathsf{HeFUN}$ demonstrates superior performance over MP2ML in terms of both the number of communication rounds and communication costs. Overall, $\mathsf{HeFUN}$ outperforms the current HE-MPC hybrid-based approaches in evaluating $\mathsf{ReLU}$ in terms of communication complexity. This is primarily attributed to $\mathsf{HeFUN}$’s elimination of the communication-intensive GC or ABY protocols. However, $\mathsf{HeFUN}$ requires an elevated multiplicative depth of 3, in contrast to the multiplicative depth of 1 required by Gazelle and MP2ML.
5.2. $\mathsf{HeRefresh}$ Experimental Results
In an HE scheme, $\mathsf{HeRefresh}$ plays the same role as the bootstrapping procedure: both aim to reduce the noise in ciphertexts. We compare $\mathsf{HeRefresh}$ with the bootstrapping technique for CKKS [70], implemented in OpenFHE. We report the result for refreshing the noise of one ciphertext. To ensure the reliability of the results, we repeat the experiment 1000 times and report the average. The comparison results are shown in Table 3. The depth is the number of homomorphic multiplications expected to be performed before noise refreshing, N is the degree of the polynomial used in the CKKS scheme, and the level is the multiplicative level required for noise refreshing. Bootstrapping homomorphically computes the decryption equation in the encrypted domain. This procedure consumes a number of homomorphic multiplications, so the level of the CKKS scheme needs to be large enough to accommodate it. The details of the bootstrapping procedure are out of the scope of this work; please refer to [70] for details. Because $\mathsf{HeRefresh}$ (Protocol 2) only requires simple computation, i.e., encryption, decryption, addition, and subtraction, we do not need to increase the level of CKKS. A larger level of CKKS leads to a bigger value of N to meet the security requirement. With simpler operators and smaller parameters, $\mathsf{HeRefresh}$ is much faster than bootstrapping. For example, with a depth of 1, $\mathsf{HeRefresh}$ is nearly 300 times faster than bootstrapping; for a depth of 4, with the same N, $\mathsf{HeRefresh}$ is nearly 225 times faster.
5.3. Comparison between $\mathsf{HeFUN}$ and LHE (CryptoNets)
This section compares $\mathsf{HeFUN}$ with the LHE-based approaches. The foundational concept for LHE-based approaches originates from the landmark study CryptoNets [12], which utilizes an LHE scheme with a predetermined depth that aligns with the NN’s architecture. For the purposes of this analysis, the term ‘CryptoNets’ will be used to refer to the conventional LHE-based approach.
5.3.1. Experimental Setup
The NNs in this section were trained using PyTorch [71]. To implement HE, we employed TenSEAL [60], with CKKS as the instantiation. All parameters in the following sections are chosen to comply with the recommendations of the HE standard [72], which satisfies 128 bits of security.
5.3.2. Dataset and NN Setup
We evaluated the performance of NN inference on two distinct datasets, i.e., the MNIST [73] and AT&T faces [74] datasets. The MNIST dataset has a standard split of 50,000 training images and 10,000 test images, containing $28\times 28$ grayscale images of the Arabic numerals 0 to 9. MNIST is the standard benchmark for homomorphic inference tasks [11,12,25,33]. The AT&T faces dataset includes $92\times 112$ grayscale images of 40 individuals, with 10 different images of each individual. The dataset is considered a classic computer vision dataset for experimenting with face recognition techniques. It offers a more realistic scenario for SNNI, allowing for the recognition of faces while maintaining the confidentiality of the individual images.
The neural networks (NNs) under consideration are composed of convolutional layers, activation functions, and fully connected layers. The $\mathsf{HeFUN}$ framework employs the $\mathsf{ReLU}$ activation function, while LHE-based secure neural network inference (SNNI) typically utilizes the square function as the activation mechanism, as extensively documented in the literature [12,25,26]. To introduce diversity into our experimental evaluation, we selected two distinct NN architectures, i.e., a small NN and a large NN (the term “large NN” refers to an NN with more complexity than our “small NN”, but its size is tailored to stay within the operational limits of LHE-based frameworks to allow meaningful comparisons). The NN architectures for the MNIST dataset are detailed below.
Small NN:
 − Convolution layer: The input image is $28\times 28$. This layer applies 5 kernels, each $3\times 3$ in size, with a stride of 2 and no padding. The output is a $5\times 13\times 13$ tensor.
 − Activation function: This layer applies the approximate activation function to each input value. It is the square function in CryptoNets, and the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$.
 − Fully connected layer: It connects the 845 incoming nodes to 100 outgoing nodes.
 − Activation function: It is the square function in CryptoNets, and the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$.
 − Fully connected layer: It connects the 100 incoming nodes to 10 outgoing nodes (corresponding to the 10 classes in the MNIST dataset).
 − Activation function: It is the sigmoid activation function.
Large NN:
 − Convolution layer: It contains 5 kernels, each $3\times 3$ in size, with a stride of 2 and no padding. The output is a $5\times 13\times 13$ tensor.
 − Activation function: It is the square function in CryptoNets, and the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$.
 − Fully connected layer: It connects the 845 incoming nodes to 300 outgoing nodes.
 − Activation function: It is the square function in CryptoNets, and the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$.
 − Fully connected layer: It connects the 300 incoming nodes to 100 outgoing nodes.
 − Activation function: It is the square function in CryptoNets, and the $\mathsf{ReLU}$ function in $\mathsf{HeFUN}$.
 − Fully connected layer: It connects the 100 incoming nodes to 10 outgoing nodes.
 − Activation function: It is the sigmoid activation function.
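As a sanity check on the architectures above, the convolution output shape (which yields the 845 incoming nodes) and the per-layer parameter counts can be verified with a few lines of plain Python. The parameter totals assume single-channel kernels and biased layers, which the text does not state explicitly:

```python
def conv_out(size, kernel, stride, padding=0):
    # output side length of a convolution: floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

def fc_params(n_in, n_out):
    # weight matrix plus one bias per output node
    return n_in * n_out + n_out

side = conv_out(28, 3, 2)            # 3x3 kernel, stride 2, no padding -> 13
flattened = 5 * side * side          # 5 feature maps of 13x13 -> 845 incoming nodes

conv_params = 5 * (3 * 3 + 1)        # 5 single-channel 3x3 kernels, one bias each (assumption)
small_nn = conv_params + fc_params(845, 100) + fc_params(100, 10)
large_nn = conv_params + fc_params(845, 300) + fc_params(300, 100) + fc_params(100, 10)
print(side, flattened, small_nn, large_nn)
```

The check confirms that the large NN has roughly 3.3× the parameters of the small NN, while both remain shallow enough for LHE-based evaluation.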
In the case of the AT&T faces dataset, the neural network retains the same layer structure as used with the MNIST dataset. However, the neuron count per layer is adjusted to accommodate the dataset’s structure. For instance, the output layer features 40 neurons, corresponding to the 40 distinct classes in the dataset, in contrast to the 10-neuron configuration used for MNIST.
As mentioned in
Section 4.1, the last sigmoid activation function can be removed in the inference phase without affecting prediction accuracy.
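The final sigmoid can be dropped at inference time because sigmoid is strictly increasing, so it never changes which output node attains the maximum. A minimal illustration, using hypothetical raw scores:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

logits = [0.3, -1.2, 2.5]  # hypothetical raw scores for a 3-class output layer
pred_raw = max(range(len(logits)), key=lambda i: logits[i])
pred_sig = max(range(len(logits)), key=lambda i: sigmoid(logits[i]))
print(pred_raw, pred_sig)  # both select class 2
```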
5.3.3. HE Parameters
The main parameters defining the CKKS scheme [
24] are the degree
N of the polynomial modulus
${X}^{N}+1$, the coefficient modulus
q, and the multiplicative depth
L of the CKKS scheme was chosen to align with the NN’s depth. Subsequently, the modulus
q is carefully chosen to meet the security criterion of 128 bits, taking into account the specified multiplicative depth
L. Within the
$\mathsf{HeFUN}$ framework, the
$\mathsf{HeRefresh}$ protocol is employed to refresh the outputs of intermediate layers, thereby facilitating additional multiplications.
Table 4 details the parameters. CryptoNets
${}_{s}$ (
${\mathsf{HeFUN}}_{s}$) and CryptoNets
${}_{l}$ (
${\mathsf{HeFUN}}_{l}$) stand for CryptoNets (
$\mathsf{HeFUN}$) on small NN and large NN, respectively. We choose
N = 16,384 in both frameworks. Notably, in the case of CryptoNets, an escalation in NN depth from a small to a large model necessitates an increase in
L (from 5 to 7), consequently requiring an increase in modulus
q to preserve the desired security level. Conversely,
$\mathsf{HeFUN}$ maintains a constant and reduced multiplicative depth regardless of NN depth increments—specifically 3 as per
Table 4—thanks to its intermediate ciphertext refreshing mechanism. Note that larger values of
L and
q slow down inference.
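To see how the multiplicative depth L drives the size of q, consider a common CKKS parameterization in which the coefficient modulus is a chain of primes: two 60-bit “special” primes plus one 40-bit prime per multiplicative level. This split is an illustrative assumption, not the paper’s exact modulus chain; the security caps on log2(q) are taken from the HE standard [72]:

```python
# Max log2(q) at 128-bit security for each polynomial degree N (per the HE standard)
MAX_LOG_Q = {8192: 218, 16384: 438, 32768: 881}

def log_q(depth, scale_bits=40, special_bits=60):
    # two special primes (first and last in the chain) plus one
    # scale-sized prime per multiplicative level -- an assumed split
    return 2 * special_bits + depth * scale_bits

for depth in (3, 5, 7):  # HeFUN, CryptoNets_s, CryptoNets_l
    print(depth, log_q(depth), log_q(depth) <= MAX_LOG_Q[16384])
```

Under this split, depth 7 already consumes 400 of the 438 available bits at N = 16,384, which illustrates why deeper NNs push CryptoNets toward larger, slower parameter sets, while HeFUN’s fixed depth of 3 leaves ample headroom.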
5.3.4. Experimental Results
Table 5 and
Table 6 present the experimental results on the MNIST and AT&T faces datasets, respectively, upon which we base our analysis of the accuracy and inference time associated with
$\mathsf{HeFUN}$ and CryptoNets.
Accuracy. It can be seen that $\mathsf{HeFUN}$ outperforms CryptoNets in accuracy. For the MNIST dataset, ${\mathsf{HeFUN}}_{l}$ and ${\mathsf{HeFUN}}_{s}$ achieve accuracies of $99.16\%$ and $98.31\%$, respectively, while CryptoNets${}_{l}$ and CryptoNets${}_{s}$ achieve accuracies of $98.52\%$ and $98.15\%$, respectively. The difference stems from the activation functions used: $\mathsf{HeFUN}$ uses the $\mathsf{ReLU}$ activation function, whereas CryptoNets uses the square function. Similarly, for the AT&T faces dataset, ${\mathsf{HeFUN}}_{l}$ and ${\mathsf{HeFUN}}_{s}$ achieve accuracies of $97.43\%$ and $96.66\%$, respectively, while CryptoNets${}_{l}$ and CryptoNets${}_{s}$ achieve accuracies of $96.87\%$ and $95.19\%$, respectively.
Inference. The timing results reported in
Table 5 and
Table 6 are the total running time at the client and server. As shown in
Table 5 and
Table 6,
$\mathsf{HeFUN}$ is faster than CryptoNets for both the small NN and the large NN. This acceleration is because the LHE parameters in
$\mathsf{HeFUN}$ are smaller than those in CryptoNets. For instance, in
Table 5, with an increase in NN depth from 5 to 7, CryptoNets shows a significant jump in inference time—from
$1.715$ s to
$3.497$ s. This increase is due to the escalated complexity of LHE parameters necessitated by a deeper network. Conversely,
$\mathsf{HeFUN}$ benefits from the
$\mathsf{HeRefresh}$ protocol, which mitigates noise accumulation at intermediate layers and allows for consistent LHE parameters regardless of NN depth. Consequently, in
$\mathsf{HeFUN}$, a similar increase in NN depth results in a modest increase in inference time, from
$1.374$ s to
$1.501$ s. Interestingly, in
$\mathsf{HeFUN}$, the computation time at the same layers almost remains unchanged, regardless of the depth of NN, due to its invariable LHE parameter requirements. On the other hand, in CryptoNets, deeper NNs require an increase in the LHE’s parameters, which slows down the computation time. For instance, the layer Fully connected 1 takes
$0.812$ s in both
${\mathsf{HeFUN}}_{s}$ and
${\mathsf{HeFUN}}_{l}$, while it takes
$1.121$ s in CryptoNets${}_{s}$ and
$2.138$ s in CryptoNets${}_{l}$. This observation reinforces the scalability of
$\mathsf{HeFUN}$ in terms of computation time when compared to CryptoNets as NN complexity increases.
We now analyze the running time at the activation function layers. The timings reported in
Table 5 and Table 6 reveal that
$\mathsf{HeFUN}$ exhibits slower performance compared to CryptoNets. This difference stems from the intrinsic complexity of the activation functions utilized; CryptoNets employs a square function as the activation, which is a single homomorphic multiplication of two ciphertexts, whereas
$\mathsf{HeFUN}$ utilizes the more complex
$\mathsf{HeReLU}$ protocol for evaluating the
$\mathsf{ReLU}$ function, as detailed in
Table 7.
Despite this, $\mathsf{HeFUN}$ has an advantage in terms of its independence from the neural network’s (NN) depth. Specifically, $\mathsf{HeFUN}$ maintains a constant multiplicative depth of 3, regardless of the NN’s depth, which avoids the escalation of LHE parameters—and consequently, inference times—that is seen with CryptoNets as the NN becomes deeper. To illustrate, the Activation 1 layer in $\mathsf{HeFUN}$ consistently requires $0.028$ s for both small and large NNs (${\mathsf{HeFUN}}_{s}$ and ${\mathsf{HeFUN}}_{l}$), in contrast to CryptoNets, which records $0.008$ s for a small NN (CryptoNets${}_{s}$) and increases to $0.012$ s for a large NN (CryptoNets${}_{l}$). This constancy at the activation layers further underscores how well $\mathsf{HeFUN}$ scales as NN complexity increases.