Lightweight and Privacy-Preserving Multi-Keyword Search over Outsourced Data

Zhao, Meng; Liu, Lingang; Ding, Yong; Deng, Hua; Liang, Hai; Wang, Huiyong; Wang, Yujue

doi:10.3390/app13052847

by Meng Zhao ¹, Lingang Liu ¹, Yong Ding ¹,²,*, Hua Deng ³, Hai Liang ¹, Huiyong Wang ⁴ and Yujue Wang ¹

¹ Guangxi Key Laboratory of Cryptography and Information Security, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
² Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen 518055, China
³ College of Computer Science and Engineering, Changsha University, Changsha 410022, China
⁴ School of Mathematics and Computing Science, Guilin University of Electronic Technology, Guilin 541004, China
* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(5), 2847; https://doi.org/10.3390/app13052847

Submission received: 21 January 2023 / Revised: 16 February 2023 / Accepted: 21 February 2023 / Published: 22 February 2023

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

In cloud computing, documents can be outsourced to the cloud server to achieve flexible access control and efficient sharing among multiple users. The outsourced documents can be intelligently searched according to some keywords with the help of the cloud server. During the search process, some private information of outsourced documents may be leaked, since the keywords may contain sensitive information of users. However, existing privacy-preserving keyword search schemes have high computational complexity, which makes them unsuitable for resource-constrained end devices; that is, the data processing and search trapdoor generation procedures require users to perform resource-intensive computations, e.g., high-dimensional matrix operations, exponentiations and bilinear pairings, which resource-constrained devices cannot afford. To address the issues of efficiency and privacy in realizing sorted multi-keyword search over outsourced data in clouds, this paper proposes a lightweight privacy-preserving ranked multi-keyword search (PRMS) scheme, which is further extended to allow each outsourced document to be associated with multiple types of keywords. The searched documents can be sorted according to the similarity scores between the search query and the keyword index of each document, and a searched document is returned only when its similarity score exceeds a given threshold. The security analysis demonstrates that the proposed PRMS schemes guarantee the privacy of outsourced documents and keywords and provide unlinkability for search trapdoors. Performance analysis and comparison show the practicality of the proposed PRMS schemes.

Keywords: cloud computing; outsourced data; privacy-preserving search; keyword search; privacy protection

1. Introduction

With cloud computing technology, users with limited local resources do not need to purchase expensive hardware to support massive data storage [1]. Thus, for economic savings, individuals and enterprises can engage cloud servers to maintain big data, for example, videos, animations, and images. However, some sensitive information of outsourced data may be leaked, since the data owner would lose control of these data [2,3]. Therefore, to protect the privacy of user data, they should be properly processed before being outsourced, so that only the data in ciphertext format are maintained at the cloud server side.

When retrieving the data of interest, users can request the cloud server to search over the outsourced dataset with some specific keywords [4,5,6,7]. However, there are two issues to be addressed in keyword search scenarios for outsourced data: data privacy and efficiency. For example, in the Internet of Things (IoT) scenario, the keywords of the collected data may contain sensitive information of outsourced data, which means the search keywords cannot be submitted to the cloud server in plaintext format. Otherwise, the users’ private information could be deduced by the cloud server from outsourced documents. To support privacy-preserving data search, many single- and multi-keyword search schemes have been proposed [8,9,10,11].

However, existing keyword search solutions in private/public key settings require heavy computations in both the data processing and search trapdoor generation phases at the user side. For example, in [12], users need to perform high-dimensional matrix operations, such as multiplication and inversion, where the matrix dimension is determined by the cardinality of the keyword set. Furthermore, in [13,14], users have to compute heavy exponentiation operations for generating searchable indexes and search trapdoors. These resource-intensive operations cannot be afforded by resource-constrained devices; thus, the existing keyword search constructions are not applicable to IoT-related application scenarios.

1.1. Our Contributions

This paper aims to provide a solution to protect the privacy of outsourced data, which supports privacy-preserving keyword search and does not require resource-intensive computations. Specifically, this paper proposes lightweight privacy-preserving ranked multi-keyword intelligent search (PRMS) schemes over outsourced documents, where the search results are determined by the similarity score between the search query and the keyword index of each document. The searched documents can be sorted by the cloud server according to their similarity scores. Only the top-ranked documents whose similarity scores exceed the target threshold should be returned.

With the basic PRMS scheme, which was previously presented in the preliminary version [15], each outsourced document can be processed with a group of keywords so that it can be searched according to multiple keywords. An outsourced document is returned only when the number of search keywords it contains exceeds the given search threshold. With the extended PRMS scheme, each outsourced document can have multiple different types of keywords, so that the document can be searched and retrieved based on any type of keywords.

Security analysis shows that the proposed PRMS schemes can guarantee the privacy of searchable index of outsourced documents and queries. Specifically, the cloud server cannot deduce any private information of outsourced documents from the encrypted index and queries. Furthermore, the proposed PRMS schemes offer the unlinkability of search trapdoors—that is, the cloud server is unable to identify whether two search trapdoors are generated for the same query.

In both PRMS schemes, the data processing and query generation phases only involve lightweight computing operations at the user side. Performance analysis demonstrates that our PRMS schemes are much more efficient than existing technologies; thus, they can be deployed in an IoT setting to support resource-constrained devices.

Compared with the preliminary version [15], this paper provides an extended PRMS scheme that supports multiple types of keywords for each outsourced document, and analyzes its security and performance. Moreover, the experiments on the basic PRMS scheme are rerun on the same new platform that is used to evaluate the extended scheme.

1.2. Related Works

The first single keyword searchable encryption scheme over outsourced encrypted data was proposed by Song et al. [16] in the symmetric key setting. Subsequently, many schemes supporting single keyword searching [17,18,19,20] were designed. Cash et al. [21] proposed a scheme supporting single-keyword boolean search over large outsourced datasets. However, the single keyword search mechanism cannot provide accurate search results. Since the cloud server usually stores massive amounts of data, there would be many pieces of matched data satisfying the search condition of a single keyword, and most of the search results may have no relation to the expected data.

To support more sophisticated outsourcing search methods, many multi-keyword search schemes have been proposed [12,22,23,24]. Such a multi-keyword search mechanism has many advantages over single keyword search. With the multi-keyword search method, users are able to implement more complicated search conditions on outsourced data, so that the accuracy of search results can be greatly improved. Furthermore, the efficiency can be enhanced, since the users do not need to carry out multiple rounds of single keyword search to achieve the same effect as a multi-keyword search. Thus, multi-keyword search schemes allow the cloud server to return the most relevant data, and are more practical than the single keyword search mechanism in supporting real-world applications. For example, the works [25,26] support multi-keyword search with fully homomorphic encryption, refs. [22,23,27] support conjunctive keyword search, ref. [24] supports multi-keyword fuzzy search, and ref. [12] supports ranked multi-keyword search.

In the public-key setting, the first searchable encryption scheme was proposed by Boneh et al. [13], where anyone can outsource encrypted data to the cloud server, but only the user holding the private key can issue search queries. Xu et al. [28] constructed a searchable public-key ciphertexts scheme with hidden structures to achieve fast search. Hu et al. [29] presented a public-key encryption scheme with keyword search from obfuscation, where the cloud server is given an obfuscated simple decrypt-then-compare circuit with the secret key to perform keyword search. Xu et al. [30] designed a public-key multi-keyword searchable encryption scheme with a hidden structures model, which also supports boolean search over encrypted e-mails. Wang et al. [31] proposed a tree-based public-key multi-dimensional range searchable encryption scheme from the predicate encryption method and leakage function. Olakanmi and Odeyemi’s certificateless lightweight keyword searchable encryption scheme [32] was proved to be secure against inside keyword guessing attacks in the Industrial Internet of Things setting.

To enrich the functionality of searching over remote data, various practical schemes have been designed. He and Ma [33] proposed a fuzzy search scheme over encrypted data using a Bloom filter. Zhang et al. [34] noticed that He and Ma’s proposal [33] cannot resist attacks based on sparse non-negative matrix factorization, and further presented a multi-keyword fuzzy search scheme using a random redundancy method. Fu et al. [11] designed a semantic-aware search scheme, where both the index and the search trapdoor contain two vectors. Yang, Liu and Deng [35] proposed a multi-keyword ranked searchable encryption scheme in the multi-user setting, which does not require the keyword set to be predefined and supports keywords in arbitrary languages and flexible search authorization. Ding et al. [36] constructed a multi-keyword search scheme in wireless body area networks, which also supports access control on electronic health records. Deebak et al. [37] designed a privacy-preservation phrase with multi-keyword ranked searching scheme, which employs optimized filtering and a binary tree index structure to improve the search efficiency.

Cao et al. [12] proposed an efficient multi-keyword ranked search scheme over encrypted cloud data, where coordinate matching was introduced to capture the relevance between data documents and the search query. In Raghavendra et al.’s solution [38], the index for keywords was generated using a split factor, and to save computation overheads, an index tree was constructed to store keywords. Ren et al. [14] studied multi-keyword ranked search, where the search trapdoor is generated using a polynomial function. Ding et al. [39] constructed a keyword set using k-grams and the Jaccard coefficient, and also built a searchable index of small size. Zhao et al. [40] developed a privacy-preserving ranked multi-keyword search scheme over outsourced data, which supports the verifiability of the search results. In Liu et al.’s scheme [8], the user is allowed to update the outsourced data and verify the search result. However, these existing schemes require resource-intensive matrix operations or cryptographic operations, which cannot be afforded by the end devices in IoT application scenarios. The comparison of these related schemes is shown in Table 1.

1.3. Paper Organization

The remainder of this paper is organized as follows. Section 2 describes the system model, threat model and design goals of PRMS. Section 3 presents the basic PRMS construction, which is extended to support multiple types of keywords in Section 4. The security and performance of the proposed PRMS schemes are evaluated and compared in Section 5. Section 6 concludes the paper.

2. System Model and Security Requirements

2.1. System Model

As shown in Figure 1, a PRMS system consists of three types of entities, namely, data owner, data user and cloud server. There is a secure communication channel between data owner and data user for key sharing. The data owner outsources a collection of documents to the cloud server. Since the documents may contain sensitive information, they cannot be directly uploaded to the cloud server. Thus, to protect the privacy of outsourced documents, they should be outsourced in ciphertext format.

To facilitate data searching, the outsourced documents should be attached with a list of keywords or multiple types of keywords. The keywords of the same type constitute a keyword dictionary. To guarantee that the keywords cannot leak the privacy of outsourced documents, the data processing phase produces an encrypted searchable index for each document. The searchable indexes are outsourced to the cloud server together with the documents.

In the search phase, data user can generate a search trapdoor of its query vector with multiple keywords to enable the cloud server to search over outsourced documents. The keywords in the query vector are also contained in the keyword dictionary, which should be transformed into search trapdoor to protect the privacy of outsourced data. Upon receiving the search trapdoor, the cloud server computes the similarity score between each encrypted searchable index and the search trapdoor, and returns the document if its similarity score satisfies the given search threshold.
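Before any encryption is applied, the matching described above amounts to an inner product of binary vectors. The following minimal Python sketch illustrates that plaintext view only; the dictionary, document names, keywords and threshold are hypothetical values chosen for illustration, and the comparison uses `>=` as in the search condition stated later in Equation (7).

```python
from typing import Iterable, List, Sequence


def binary_vector(dictionary: Sequence[str], keywords: Iterable[str]) -> List[int]:
    """Entry i is 1 if the i-th dictionary keyword occurs in `keywords`, else 0."""
    present = set(keywords)
    return [1 if word in present else 0 for word in dictionary]


def similarity_score(index: Sequence[int], query: Sequence[int]) -> int:
    """Inner product of two binary vectors, i.e. the number of shared keywords."""
    return sum(i * q for i, q in zip(index, query))


# Hypothetical dictionary and documents, for illustration only.
dictionary = ["cloud", "privacy", "search", "iot", "index"]
doc_keywords = {
    "doc_a": ["cloud", "privacy", "search"],
    "doc_b": ["iot", "index"],
}

query = binary_vector(dictionary, ["privacy", "search", "iot"])
scores = {
    name: similarity_score(binary_vector(dictionary, kws), query)
    for name, kws in doc_keywords.items()
}

# Keep documents whose score meets the threshold, highest score first.
threshold = 2
matches = sorted(
    (name for name, s in scores.items() if s >= threshold),
    key=lambda name: -scores[name],
)
```

In the actual scheme the cloud server never sees these plaintext vectors; it works on their encrypted counterparts and recovers only the score.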

2.2. Security Requirements

The cloud server holds the encrypted documents and ciphertext indexes of users. In the honest-but-curious model, the cloud server can perform multi-keyword search according to the user’s request, but it may be curious about the sensitive information of outsourced documents. That is, the cloud server may try to deduce some information from the outsourced documents, ciphertext indexes, and search trapdoors.

A secure PRMS scheme needs to satisfy the following requirements.

Data privacy: The documents should be outsourced in ciphertext format, so that the cloud server cannot infer any sensitive information about outsourced documents.
Keyword privacy: The cloud server should not be able to determine whether a specific keyword is relevant to an outsourced document according to encrypted document, encrypted index and search trapdoors.
Trapdoor unlinkability: The cloud server should not be able to identify whether two search trapdoors are generated from the same query.
Multiple types of keywords: Each document can be associated with multiple types of keywords, and can be searched according to each type of keywords.
Efficiency: Due to the limited computation capability of data owner and data user, the data processing and query generation phases cannot contain resource-intensive computations.

2.3. Framework

A PRMS scheme consists of four efficient procedures, namely, Setup, Index, Trapdoor, and Search.

Setup ( $λ_{1}, λ_{2}, λ_{3}, λ_{4}$ ): With input security parameters $λ_{1}, λ_{2}, λ_{3}, λ_{4}$, the data owner generates the public parameter $para$ and the secret key S.
Index ( $F, para, D$ ): For each document $F_{i}$ $(i = 1, \dots, m)$, the data owner generates the ciphertext document ${\bar{F}}_{i}$, constructs the plaintext index vector ${\bar{I}}_{i}$, and produces the ciphertext index vector ${\hat{I}}_{i}$. The data owner uploads the ciphertext document set $\bar{F} = \{{\bar{F}}_{i} : i = 1, \dots, m\}$ and the ciphertext index vector set $\hat{I} = \{{\hat{I}}_{i} : i = 1, \dots, m\}$ to the cloud server.
Trapdoor ( $W, para, D$ ): From the given search keyword set W, the data user generates the encrypted search trapdoor $\hat{Q}$ and sets a search threshold $τ$. The search trapdoor $\hat{Q}$ and the search threshold $τ$ are sent to the cloud server.
Search ( $\hat{I}, \hat{Q}, τ, para$ ): With the received search trapdoor $\hat{Q}$, the cloud server computes the similarity score with each ciphertext index vector in $\hat{I}$ and returns the document if the similarity score is larger than $τ$.

3. Basic PRMS Construction

This section introduces a basic PRMS scheme based on the inner product similarity computing technology [41], where the procedures are summarized in Figure 2. The frequently used notations are summarized in Table 2.

System setup: With input security parameters $λ_{1}, λ_{2}, λ_{3}, λ_{4}$, the data owner constructs a dictionary D, which contains n keywords. The data owner randomly picks a large prime p such that $| p | = λ_{2}$, an element $s \in_{R} Z_{p}^{*}$, and a cryptographic one-way hash function $H : \{0, 1\}^{*} \to \{0, 1\}^{λ_{1}}$. Thus, the system public parameters are $para = (λ_{1}, λ_{2}, λ_{3}, λ_{4}, p, n, H)$. The data owner keeps D and s secret.
Index generation: For each document F, the data owner encrypts it as ciphertext document $\bar{F}$ using some secure symmetric encryption algorithm, randomly picks a unique file name N, and calculates the length d of document F. The data owner computes $γ = H (N, d)$ and constructs the index vector $\vec{I}$ such that if the document F contains the i-th keyword in the dictionary D, then $I_{i} = 1$, otherwise $I_{i} = 0$. The data owner further sets $I_{n + 1} = I_{n + 2} = 0$, chooses $n + 2$ random numbers $m_{i}$ such that $| m_{i} | = λ_{3}$ for $1 \leq i \leq n + 2$, and encrypts each $I_{i}$ as follows:

${\hat{I}}_{i} = s \cdot (I_{i} \cdot γ + m_{i}) \bmod p$

(1)

Then for document F, the data owner outsources the ciphertext index vector $\hat{I} = ({\hat{I}}_{1}, {\hat{I}}_{2}, \dots, {\hat{I}}_{n + 2})$ and the processed file $\hat{F} = (\bar{F}, γ)$ to the cloud server, and keeps $(N, d)$ locally.
Trapdoor generation: The data user picks a large random number $δ$ such that $| δ | = λ_{1}$, and computes $s^{- 1} \bmod p$. From the query keyword set W, the data user constructs the query vector $\vec{Q}$, where $Q_{i} = 1$ if the query keyword set W contains the i-th keyword in the dictionary D, otherwise $Q_{i} = 0$. The data user then sets $Q_{n + 1} = Q_{n + 2} = 0$, and randomly chooses $n + 2$ numbers $t_{i}$ such that $| t_{i} | = λ_{4}$ for $1 \leq i \leq n + 2$. The data user constructs the search trapdoor $\hat{Q} = ({\hat{Q}}_{1}, {\hat{Q}}_{2}, \dots, {\hat{Q}}_{n + 2})$, where

${\hat{Q}}_{i} = s^{- 1} \cdot (Q_{i} \cdot δ + t_{i}) \bmod p$

(2)

The data user sets the search threshold $τ$ and submits the search trapdoor $\hat{Q}$ and $(τ, δ)$ to the cloud server.
Search: Once it has received the encrypted search trapdoor $\hat{Q}$, the cloud server computes the similarity score $Score (\vec{I}, \vec{Q})$ for the ciphertext index vector $\hat{I}$ of each outsourced document as follows. The cloud server computes

$E = Score (\hat{I}, \hat{Q}) = \hat{I} ⊙ \hat{Q} \bmod p$

(3)

where “⊙” denotes the modular vector inner product operation. By properly choosing the elements under the given security parameters $λ_{1}, λ_{2}, λ_{3}, λ_{4}$ , both the following conditions can be satisfied

$\hat{I} ⊙ \hat{Q} < p$

(4)

and

$\begin{aligned} ρ &= \sum_{i = 1, I_{i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ t_{i} I_{i} + m_{i} δ Q_{i} + m_{i} t_{i}) + \sum_{i = 1, I_{i} = 0, Q_{i} \neq 0}^{n + 2} (m_{i} δ Q_{i} + m_{i} t_{i}) \\ &\quad + \sum_{i = 1, I_{i} \neq 0, Q_{i} = 0}^{n + 2} (γ t_{i} I_{i} + m_{i} t_{i}) + \sum_{i = 1, I_{i} = 0, Q_{i} = 0}^{n + 2} m_{i} t_{i} \\ &< γ δ . \end{aligned}$

(5)

Then, the cloud server computes

$Score (\vec{I}, \vec{Q}) = \sum_{i = 1}^{n} I_{i} \cdot Q_{i} = \frac{E - (E \bmod (δ \cdot γ))}{δ \cdot γ}$

(6)

If the following search condition is satisfied

$Score (\vec{I}, \vec{Q}) \geq τ$

(7)

then the corresponding document $\bar{F}$ is returned. The searched documents can be sorted according to their similarity scores $Score (\vec{I}, \vec{Q})$. In this way, exactly those documents whose similarity scores satisfy the search condition in Equation (7) are returned.

A flow chart of the proposed basic PRMS scheme is depicted in Figure 3.
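The four procedures of the basic scheme can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the fixed Mersenne prime $2^{521} - 1$ stands in for the randomly chosen prime p, H is instantiated with truncated SHA-256, the bit lengths (128, 64, 64) are arbitrary illustrative choices, and all function names are hypothetical. Document encryption with a symmetric cipher is omitted.

```python
import hashlib
import secrets
from typing import List, Sequence, Tuple

# Illustrative parameters (assumptions, not values taken from the paper).
LAMBDA1, LAMBDA3, LAMBDA4 = 128, 64, 64
P = 2**521 - 1  # fixed Mersenne prime standing in for the random prime p


def rand_exact_bits(bits: int) -> int:
    """Random integer whose bit length is exactly `bits`."""
    return secrets.randbits(bits) | (1 << (bits - 1))


def hash_to_int(name: str, length: int) -> int:
    """gamma = H(N, d): SHA-256 truncated to LAMBDA1 bits (assumed instantiation)."""
    digest = hashlib.sha256(f"{name}|{length}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - LAMBDA1)


def setup() -> int:
    """Return the secret s, drawn at random from Z_p^*."""
    return secrets.randbelow(P - 1) + 1


def build_index(s: int, plain: Sequence[int], name: str, length: int) -> Tuple[List[int], int]:
    """Equation (1): I_hat_i = s * (I_i * gamma + m_i) mod p, after two zero pads."""
    gamma = hash_to_int(name, length)
    padded = list(plain) + [0, 0]
    enc = [s * (bit * gamma + rand_exact_bits(LAMBDA3)) % P for bit in padded]
    return enc, gamma


def build_trapdoor(s: int, query: Sequence[int]) -> Tuple[List[int], int]:
    """Equation (2): Q_hat_i = s^-1 * (Q_i * delta + t_i) mod p, after two zero pads."""
    delta = rand_exact_bits(LAMBDA1)
    s_inv = pow(s, -1, P)  # modular inverse, Python 3.8+
    padded = list(query) + [0, 0]
    enc = [s_inv * (bit * delta + rand_exact_bits(LAMBDA4)) % P for bit in padded]
    return enc, delta


def search_score(enc_index: Sequence[int], gamma: int,
                 trapdoor: Sequence[int], delta: int) -> int:
    """Equations (3) and (6): recover sum_i I_i * Q_i from the encrypted vectors."""
    e = sum(i * q for i, q in zip(enc_index, trapdoor)) % P
    return (e - e % (delta * gamma)) // (delta * gamma)


# Toy run over a 5-keyword dictionary; the vectors share positions 0 and 2.
s = setup()
index_plain = [1, 0, 1, 1, 0]
query_plain = [1, 1, 1, 0, 0]
enc_index, gamma = build_index(s, index_plain, name=secrets.token_hex(8), length=4096)
trapdoor, delta = build_trapdoor(s, query_plain)
score = search_score(enc_index, gamma, trapdoor, delta)
```

With these sizes the two conditions in Equations (4) and (5) hold except with negligible probability, so `score` equals the plaintext inner product of `index_plain` and `query_plain`. A truncated hash can in principle yield a very small γ, which the sketch does not guard against.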

Theorem 1.

The proposed basic PRMS scheme is correct.

Proof.

To compute the similarity score $Score (\vec{I}, \vec{Q})$, it is required that both $I_{i} \neq 0$ and $Q_{i} \neq 0$ are satisfied for $1 \leq i \leq n$. Let

$E^{'} = \sum_{i = 1, I_{i} \neq 0, Q_{i} \neq 0}^{n + 2} γ δ I_{i} Q_{i} \bmod p$

Note that

$\begin{aligned} E &= \hat{I} ⊙ \hat{Q} \\ &= \sum_{i = 1, I_{i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ δ I_{i} Q_{i} + γ t_{i} I_{i} + m_{i} δ Q_{i} + m_{i} t_{i}) + \sum_{i = 1, I_{i} = 0, Q_{i} \neq 0}^{n + 2} (m_{i} δ Q_{i} + m_{i} t_{i}) \\ &\quad + \sum_{i = 1, I_{i} \neq 0, Q_{i} = 0}^{n + 2} (γ t_{i} I_{i} + m_{i} t_{i}) + \sum_{i = 1, I_{i} = 0, Q_{i} = 0}^{n + 2} m_{i} t_{i} \\ &= E^{'} + ρ \bmod p \end{aligned}$

If $E < p$ and $ρ < γ δ$ hold, then we have

$\begin{aligned} Score (\vec{I}, \vec{Q}) &= \frac{E - (E \bmod (δ \cdot γ))}{δ \cdot γ} \\ &= \frac{E - ρ}{δ \cdot γ} \\ &= \frac{\sum_{i = 1, I_{i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ δ I_{i} Q_{i})}{δ \cdot γ} \\ &= \sum_{i = 1, I_{i} \neq 0, Q_{i} \neq 0}^{n + 2} (I_{i} Q_{i}) \\ &= \vec{I} ⊙ \vec{Q} \bmod p \end{aligned}$

Thus, the proposed basic PRMS scheme is correct. □
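The proof relies on $E < p$ and $ρ < γ δ$, which constrain how the security parameters may be chosen. The following Python sketch gives a sufficient (not necessary) bit-length check based on worst-case bounds. It assumes that γ and δ both have exactly $λ_{1}$ bits, each $m_{i}$ exactly $λ_{3}$ bits, each $t_{i}$ exactly $λ_{4}$ bits and p exactly $λ_{2}$ bits; since γ is a hash output it may in fact be shorter, so the check is an approximation. The function name `params_ok` is hypothetical.

```python
def params_ok(n: int, l1: int, l2: int, l3: int, l4: int) -> bool:
    """Sufficient bit-length check for E < p and rho < gamma * delta.

    n  : number of keywords in the dictionary
    l1 : bit length of gamma and delta (assumed exact)
    l2 : bit length of the prime p
    l3 : bit length of each m_i
    l4 : bit length of each t_i
    """
    # Worst case of rho: all n + 2 positions contribute the three cross terms
    # gamma * t_i, m_i * delta and m_i * t_i (strict upper bound).
    rho_max = (n + 2) * (2 ** (l1 + l4) + 2 ** (l1 + l3) + 2 ** (l3 + l4))

    # Smallest possible product of two numbers with exactly l1 bits.
    gamma_delta_min = 2 ** (2 * (l1 - 1))

    # Strict upper bound on the integer E: at most n matching keywords.
    e_max = n * 2 ** (2 * l1) + rho_max

    # Smallest number with exactly l2 bits.
    p_min = 2 ** (l2 - 1)

    return rho_max <= gamma_delta_min and e_max <= p_min


# Example parameter sets (illustrative values only).
roomy = params_ok(n=1000, l1=128, l2=521, l3=64, l4=64)
noise_too_large = params_ok(n=10, l1=64, l2=521, l3=64, l4=64)
prime_too_small = params_ok(n=1000, l1=128, l2=256, l3=64, l4=64)
```

The check makes the trade-off visible: $λ_{1}$ has to dominate $λ_{3}$ and $λ_{4}$ so that the noise ρ stays below γδ, and $λ_{2}$ has to exceed roughly $2 λ_{1} + \log_{2} n$ so that E does not wrap around modulo p.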

As shown in Equation (2), each element $Q_{i}$ in the query vector $\vec{Q}$ is randomized with a randomly chosen value $t_{i}$. Therefore, even when the i-th keyword of the dictionary D does not appear in the query keyword set W, the privacy of the query keyword set at the i-th position can still be guaranteed, which means $Q_{i} = 0$ would not be leaked from the search trapdoor element ${\hat{Q}}_{i}$.

4. Extension

This section presents an extended PRMS construction, which allows the outsourced document to be attached with multiple types of keywords.

System setup: With input security parameters $λ_{1}, λ_{2}, λ_{3}, λ_{4}$, the data owner constructs a dictionary set $D = \{D_{1}, D_{2}, \dots, D_{z}\}$, where $D_{ℓ}$ $(1 \leq ℓ \leq z)$ represents a dictionary of some type of keywords. Without loss of generality, it is assumed that each dictionary $D_{ℓ}$ contains n keywords. The data owner randomly picks a large prime p such that $| p | = λ_{2}$, z elements $s_{ℓ} \in_{R} Z_{p}^{*}$ $(1 \leq ℓ \leq z)$, and a cryptographic one-way hash function $H : \{0, 1\}^{*} \to \{0, 1\}^{λ_{1}}$. Thus, the public parameters are $para = (λ_{1}, λ_{2}, λ_{3}, λ_{4}, p, n, H)$, and the data owner keeps $D$ and $\{s_{ℓ} : 1 \leq ℓ \leq z\}$ secret.
Index generation: For each document F, the data owner encrypts it as ciphertext document $\bar{F}$ using some secure symmetric encryption algorithm, randomly picks a unique file name N, and calculates the length d of document F.
For each dictionary $D_{ℓ}$ $(1 \leq ℓ \leq z)$, the document F is processed as follows. The data owner computes $γ_{ℓ} = H (N, d, ℓ)$ and constructs an index vector ${\vec{I}}_{ℓ}$ such that if the document F contains the i-th keyword in dictionary $D_{ℓ}$, then $I_{ℓ, i} = 1$, otherwise $I_{ℓ, i} = 0$. The data owner further sets $I_{ℓ, n + 1} = 0$ and $I_{ℓ, n + 2} = 0$, chooses $n + 2$ random numbers $m_{ℓ, i}$ such that $| m_{ℓ, i} | = λ_{3}$ for $1 \leq i \leq n + 2$, and encrypts each $I_{ℓ, i}$ as follows:

${\hat{I}}_{ℓ, i} = s_{ℓ} \cdot (I_{ℓ, i} \cdot γ_{ℓ} + m_{ℓ, i}) \bmod p$

(8)

Finally, the data owner outsources the ciphertext index set $\hat{\vec{I}} = \{{\hat{I}}_{1}, {\hat{I}}_{2}, \dots, {\hat{I}}_{z}\}$ and the processed document $\hat{F} = (\bar{F}, γ_{1}, γ_{2}, \dots, γ_{z})$ to the cloud server, where ${\hat{I}}_{ℓ} = ({\hat{I}}_{ℓ, 1}, {\hat{I}}_{ℓ, 2}, \dots, {\hat{I}}_{ℓ, n + 2})$ is the ciphertext index vector corresponding to dictionary $D_{ℓ}$, and keeps $(N, d)$ locally.
Trapdoor generation: Suppose the data user would like to search the outsourced documents with the keywords in dictionary $D_{ℓ} \in D$. The data user picks a large random number $δ$ such that $| δ | = λ_{1}$, and computes $s_{ℓ}^{- 1} \bmod p$. From the query keyword set W, the data user constructs the query vector $\vec{Q}$, where $Q_{i} = 1$ if the query keyword set W contains the i-th keyword in the dictionary $D_{ℓ}$, otherwise $Q_{i} = 0$. The data user then sets $Q_{n + 1} = Q_{n + 2} = 0$, and randomly chooses $n + 2$ numbers $t_{i}$ such that $| t_{i} | = λ_{4}$ for $1 \leq i \leq n + 2$. The data user constructs the search trapdoor $\hat{Q} = ({\hat{Q}}_{1}, {\hat{Q}}_{2}, \dots, {\hat{Q}}_{n + 2})$, where

${\hat{Q}}_{i} = s_{ℓ}^{- 1} \cdot (Q_{i} \cdot δ + t_{i}) mod p$

(9)

Data user sets the search threshold $τ$ , and submits the search trapdoor $\hat{Q}$ and $(ℓ, τ, δ)$ to the cloud server, where ℓ denotes the type of keywords in searching documents.
Search: Once it has received the encrypted search trapdoor $\hat{Q}$, the cloud server computes the similarity score $Score ({\vec{I}}_{ℓ}, \vec{Q})$ for the ciphertext index vector ${\hat{I}}_{ℓ}$ of each outsourced document $\bar{F}$ as follows. The cloud server computes

$\begin{matrix} E = Score ({\hat{I}}_{ℓ}, \hat{Q}) = {\hat{I}}_{ℓ} ⊙ \hat{Q} mod p \end{matrix}$

(10)

By properly choosing the elements under the given security parameters $λ_{1}, λ_{2}, λ_{3}, λ_{4}$ , it is assumed that both the following conditions

${\hat{I}}_{ℓ} ⊙ \hat{Q} < p$

(11)

and

$\begin{aligned} ρ &= \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ_{ℓ} t_{i} I_{ℓ, i} + m_{ℓ, i} δ Q_{i} + m_{ℓ, i} t_{i}) + \sum_{i = 1, I_{ℓ, i} = 0, Q_{i} \neq 0}^{n + 2} (m_{ℓ, i} δ Q_{i} + m_{ℓ, i} t_{i}) \\ &\quad + \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} = 0}^{n + 2} (γ_{ℓ} t_{i} I_{ℓ, i} + m_{ℓ, i} t_{i}) + \sum_{i = 1, I_{ℓ, i} = 0, Q_{i} = 0}^{n + 2} m_{ℓ, i} t_{i} \\ &< γ_{ℓ} δ \end{aligned}$

(12)

hold for $1 \leq ℓ \leq z$ . Then, the cloud server computes

$Score ({\vec{I}}_{ℓ}, \vec{Q}) = \sum_{i = 1}^{n} I_{ℓ, i} \cdot Q_{i} = \frac{E - (E \bmod (δ \cdot γ_{ℓ}))}{δ \cdot γ_{ℓ}}$

(13)

Note that these similarity scores can be sorted according to their values. If the following search condition is satisfied

$Score ({\vec{I}}_{ℓ}, \vec{Q}) \geq τ$

(14)

then the corresponding document $\bar{F}$ would be returned.

A flow chart of the extended PRMS scheme is depicted in Figure 4 and Figure 5.

Theorem 2.

The proposed extended PRMS scheme is correct.

Proof.

Suppose the outsourced documents are searched according to the ℓ-th type of keywords. To compute the similarity score $Score ({\vec{I}}_{ℓ}, \vec{Q})$, it is required that both $I_{ℓ, i} \neq 0$ and $Q_{i} \neq 0$ are satisfied for $1 \leq i \leq n$. Let

$E^{'} = \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} \neq 0}^{n + 2} γ_{ℓ} δ I_{ℓ, i} Q_{i} \bmod p$

Note that

$\begin{aligned} E &= {\hat{I}}_{ℓ} ⊙ \hat{Q} \\ &= \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ_{ℓ} δ I_{ℓ, i} Q_{i} + γ_{ℓ} t_{i} I_{ℓ, i} + m_{ℓ, i} δ Q_{i} + m_{ℓ, i} t_{i}) + \sum_{i = 1, I_{ℓ, i} = 0, Q_{i} \neq 0}^{n + 2} (m_{ℓ, i} δ Q_{i} + m_{ℓ, i} t_{i}) \\ &\quad + \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} = 0}^{n + 2} (γ_{ℓ} t_{i} I_{ℓ, i} + m_{ℓ, i} t_{i}) + \sum_{i = 1, I_{ℓ, i} = 0, Q_{i} = 0}^{n + 2} m_{ℓ, i} t_{i} \\ &= E^{'} + ρ \bmod p \end{aligned}$

If $E < p$ and $ρ < γ_{ℓ} δ$ hold, then we have

$\begin{aligned} Score ({\vec{I}}_{ℓ}, \vec{Q}) &= \frac{E - (E \bmod (δ \cdot γ_{ℓ}))}{δ \cdot γ_{ℓ}} \\ &= \frac{E - ρ}{δ \cdot γ_{ℓ}} \\ &= \frac{\sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} \neq 0}^{n + 2} (γ_{ℓ} δ I_{ℓ, i} Q_{i})}{δ \cdot γ_{ℓ}} \\ &= \sum_{i = 1, I_{ℓ, i} \neq 0, Q_{i} \neq 0}^{n + 2} I_{ℓ, i} Q_{i} \\ &= {\vec{I}}_{ℓ} ⊙ \vec{Q} \bmod p \end{aligned}$

Thus, the proposed extended PRMS scheme is correct. □
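The expansion of E used in the proof follows from a single coordinate of the inner product, where $s_{ℓ}$ and $s_{ℓ}^{-1}$ cancel modulo p. Written out from Equations (8) and (9), as a worked step rather than part of the original proof:

```latex
\begin{aligned}
\hat{I}_{\ell,i} \cdot \hat{Q}_{i}
  &\equiv s_{\ell}\,(I_{\ell,i}\,\gamma_{\ell} + m_{\ell,i}) \cdot s_{\ell}^{-1}\,(Q_{i}\,\delta + t_{i}) \pmod{p} \\
  &\equiv (I_{\ell,i}\,\gamma_{\ell} + m_{\ell,i})\,(Q_{i}\,\delta + t_{i}) \pmod{p} \\
  &\equiv \gamma_{\ell}\,\delta\, I_{\ell,i}\, Q_{i}
        + \gamma_{\ell}\, t_{i}\, I_{\ell,i}
        + m_{\ell,i}\,\delta\, Q_{i}
        + m_{\ell,i}\, t_{i} \pmod{p}
\end{aligned}
```

Summing over $1 \leq i \leq n + 2$ and grouping the terms by whether $I_{ℓ, i}$ and $Q_{i}$ are zero gives $E = E^{'} + ρ \bmod p$. The cancellation depends on the trapdoor and the index vector using the same type ℓ: if a trapdoor built with $s_{ℓ}^{-1}$ were combined with an index vector encrypted under a different $s_{ℓ'}$, the factor $s_{ℓ'} s_{ℓ}^{-1}$ would remain and the score computation would not yield the inner product.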

5. Analysis and Comparison

5.1. Security Analysis

Theorem 3.

If the symmetric encryption algorithm chosen by data owner is secure, then the proposed basic and extended PRMS schemes guarantee the privacy of outsourced documents against the server.

Proof.

As shown in the proposed PRMS schemes of Section 3 and Section 4, the outsourced documents are encrypted with some symmetric encryption algorithm. Thus, if such symmetric encryption algorithm is secure, then the encrypted documents would not leak the contents of these documents. □

Theorem 4.

The proposed basic and extended PRMS schemes guarantee the privacy of keywords for outsourced documents against the server.

Proof.

In the index generation phase of the basic PRMS scheme, each element $I_{i}$ in the index vector $\vec{I}$ is randomized using a one-time random value $m_{i}$. Thus, if all these elements $m_{i}$ $(1 \leq i \leq n + 2)$ are uniformly distributed and independently chosen from $Z_{λ_{3}}$, then all encrypted indexes ${\hat{I}}_{i}$ $(1 \leq i \leq n + 2)$ would have the same distribution, which means $Pr [{\hat{I}}_{i_{1}}] = Pr [{\hat{I}}_{i_{2}}]$ for different $i_{1}$ and $i_{2}$. Therefore, the cloud server cannot deduce the content of the index vector. For the index generation phase of the extended PRMS scheme, the elements in each index vector ${\vec{I}}_{ℓ}$ are processed in a similar way. Thus, the encrypted index vectors would not leak the private information of outsourced documents. □

Theorem 5.

The proposed basic and extended PRMS schemes guarantee the privacy of trapdoors against the server.

Proof.

In the trapdoor generation phase of the basic PRMS scheme, each element $Q_{i}$ in the query vector $\vec{Q}$ is randomized using a one-time random value $t_{i}$. Furthermore, for the whole query vector $\vec{Q}$, $δ$ is also randomly chosen. Thus, if all these elements $t_{i}$ $(1 \leq i \leq n + 2)$ and $δ$ are uniformly distributed and independently chosen from $Z_{λ_{4}}$ and $Z_{λ_{1}}$, respectively, then all encrypted trapdoor entries ${\hat{Q}}_{i}$ $(1 \leq i \leq n + 2)$ would have the same distribution, which means $Pr [{\hat{Q}}_{i_{1}}] = Pr [{\hat{Q}}_{i_{2}}]$ for different $i_{1}$ and $i_{2}$. Therefore, the cloud server cannot deduce the content of the query vector. For the trapdoor generation phase of the extended PRMS scheme, the elements in the query vector $\vec{Q}$ with regard to each keyword dictionary $D_{ℓ}$ are processed in a similar way. Thus, the encrypted search trapdoor vectors would not leak the private information of trapdoors and outsourced documents. □

Theorem 6.

The proposed basic and extended PRMS schemes offer unlinkability of trapdoors against the server.

Proof.

As analyzed in the proof of Theorem 5, if all elements $t_{i}$ $(1 \leq i \leq n + 2)$ and $δ$ are uniformly distributed and independently chosen from $Z_{λ_{3}}$ and $Z_{λ_{1}}$, respectively, then all entries ${\hat{Q}}_{i}$ $(1 \leq i \leq n + 2)$ in the encrypted trapdoor have the same distribution, which means $\Pr [{\hat{Q}}_{i_{1}}] = \Pr [{\hat{Q}}_{i_{2}}]$ for different $i_{1}$ and $i_{2}$. Hence, the cloud server is not able to determine whether two search trapdoors were generated for the same query. Furthermore, in the trapdoor generation phase of the extended PRMS scheme, the elements in the query vector $\vec{Q}$ with regard to each keyword dictionary $D_{ℓ}$ are processed in a similar way. Thus, the encrypted search trapdoor vectors enjoy unlinkability. □
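The effect of fresh randomness on trapdoors can be illustrated with a small Python sketch. The masking rule used here, $\hat{Q}_{i} = s (Q_{i} δ + t_{i}) \bmod p$, is a hypothetical stand-in chosen only for illustration and is not claimed to be the trapdoor equation of the PRMS schemes. The modulus (the Mersenne prime $2^{521} - 1$), the function name `make_trapdoor`, and the reading of $λ_{1}$ and $λ_{3}$ as bit lengths are likewise assumptions. The sketch only shows that two trapdoors generated for the same query differ when $t_{i}$ and $δ$ are drawn afresh.

```python
import secrets

P = 2**521 - 1   # assumed modulus: a known Mersenne prime standing in for p
LAMBDA_1 = 200   # assumed bit length of delta
LAMBDA_3 = 128   # assumed bit length of each t_i


def make_trapdoor(query, s):
    """Mask a 0/1 query vector with fresh randomness (hypothetical rule)."""
    delta = secrets.randbits(LAMBDA_1) | 1     # fresh per trapdoor, never zero
    trapdoor = []
    for q_i in query:
        t_i = secrets.randbits(LAMBDA_3)       # fresh one-time value per entry
        trapdoor.append(s * (q_i * delta + t_i) % P)
    return trapdoor


if __name__ == "__main__":
    s = secrets.randbelow(P - 1) + 1           # toy secret key in [1, P - 1]
    query = [1, 0, 1, 1, 0]
    first = make_trapdoor(query, s)
    second = make_trapdoor(query, s)
    # The same query yields different-looking trapdoors on each call.
    print(first != second)
```

Because every call draws new values, an observer comparing `first` and `second` entry by entry sees unrelated residues, which is the intuition behind the unlinkability claim.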

5.2. Theoretical Analysis

As shown in Table 3, the performance of the proposed basic and extended PRMS schemes is theoretically compared with that of related schemes [9,12,13,42] in three phases, namely, ciphertext index generation, trapdoor generation and search process. Note that among these schemes, Boneh et al.’s scheme [13] was designed in the public key setting. Let $μ$ denote the number of documents. In the system setup phase, only the keyword search scheme proposed in the public key setting has to perform resource-intensive operations: Boneh et al.’s scheme [13] requires one exponentiation in a bilinear group in this phase, and an exponentiation takes much more computing time than an addition or a multiplication.

In the process of ciphertext index generation for a document, both proposed PRMS schemes only require the data owner to perform modular multiplication and addition operations. Thus, the time complexities of this phase in the proposed basic and extended PRMS schemes are $O (μ n)$ and $O (μ z n)$, respectively, which depend on the length n of the index vector and the number $μ$ of documents. Since each document is attached with z types of keywords, the time complexity of the extended PRMS scheme also relies on z. In Cao et al.’s scheme [12], the generation of the ciphertext index involves two matrix multiplications between the split index vectors and the transposed matrices of size $(n + 2) \times (n + 2)$, which means the time complexity is $O (μ n^{2})$.
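To make the gap between $O (μ n)$ and $O (μ n^{2})$ concrete, the following Python sketch counts multiplications under a simplified cost model. The constant factors (one multiplication per vector entry for element-wise masking, and $(n + 2)^{2}$ multiplications per matrix-vector product) and the function names are assumptions made for illustration; only the growth rates come from the analysis above.

```python
def prms_index_mults(mu, n, z=1):
    """Element-wise masking: roughly one modular multiplication per entry,
    repeated for each of the z keyword types and each of the mu documents."""
    return mu * z * n


def matrix_index_mults(mu, n):
    """Two matrix-vector products with (n + 2) x (n + 2) matrices per document."""
    return mu * 2 * (n + 2) ** 2


if __name__ == "__main__":
    mu = 200
    for n in (10, 50, 100):
        print(n,
              prms_index_mults(mu, n),          # basic scheme, z = 1
              prms_index_mults(mu, n, z=5),     # extended scheme with z = 5
              matrix_index_mults(mu, n))        # matrix-based encryption
```

Under this model, with 200 documents and n = 100, element-wise masking needs 20,000 multiplications (100,000 for z = 5), while the matrix-based approach needs 4,161,600.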

In Ding et al.’s scheme [9], the time cost of ciphertext index generation includes building a tree-based index group, constructing indexes for documents, and encrypting all nodes in the index tree. There are in total $O (α μ b)$ nodes in the index tree, where $α$ is a decimal and b represents the number of invertible matrices for each group. The encryption of each node requires two multiplications with $n \times n$ matrices. Thus, the time complexity of this phase in [9] is $O (α μ b n^{2})$. In Xia et al.’s scheme [42], the data owner needs to construct a KBB index tree with $O (n)$ nodes before performing encryption, where the encryption requires multiplications with $n \times n$ matrices. Hence, the total time complexity is $O (μ n^{3})$. In Boneh et al.’s scheme [13], each keyword in the index vector of each document is encrypted separately, which requires two exponentiations and one bilinear pairing operation. Thus, their scheme has a time complexity of $O (μ n)$.

For trapdoor generation, the proposed PRMS schemes only involve modular multiplication and addition operations. Thus, the time complexity of both proposed PRMS schemes is $O (n)$. In Cao et al.’s scheme [12], trapdoor generation includes splitting and encrypting the query vector, where the encryption is realized by matrix multiplications with $(n + 2) \times (n + 2)$ inverse matrices. Thus, the time complexity of their scheme is $O (n^{2})$. Both Ding et al.’s scheme [9] and Xia et al.’s scheme [42] mainly involve the split process and matrix multiplication operations, and thus their time complexity is also $O (n^{2})$. In Boneh et al.’s scheme [13], each keyword in the trapdoor is processed separately, which requires one map-to-point hash evaluation and one exponentiation. Thus, the time complexity of trapdoor generation for Boneh et al.’s scheme [13] is $O (n)$.

In the search process of our PRMS schemes and Cao et al.’s scheme [12], the cloud server needs to perform inner product operations to compute the similarity score with each document. Thus, the total time complexity is $O (μ n)$. In contrast, Xia et al.’s scheme [42] and Ding et al.’s scheme [9] need to generate $θ$ leaf nodes, where the height of the index tree is $\log n$. Hence, their total time complexity is $O (θ μ \log n)$. In the search process of Boneh et al.’s scheme [13], each keyword in the trapdoor is individually compared with each encrypted keyword in the index vector of every document, which requires one bilinear pairing operation per comparison. Thus, in the worst case, the time complexity of searching in Boneh et al.’s scheme [13] is $O (μ n^{2})$. Therefore, all phases of the proposed PRMS schemes only require lightweight computations, which makes them more efficient than existing proposals.
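The $O (μ n)$ search cost follows from computing one length-n inner product per document. The Python sketch below illustrates this ranking step on plaintext 0/1 vectors, which stand in for the encrypted index and trapdoor vectors; the function name `ranked_search`, the sample data and the threshold value are assumptions made for illustration, not part of the PRMS schemes.

```python
def ranked_search(indexes, query, threshold):
    """Return (doc_id, score) pairs whose score exceeds the threshold,
    sorted from the highest to the lowest similarity score."""
    results = []
    for doc_id, index in indexes.items():                   # mu documents
        score = sum(i * q for i, q in zip(index, query))    # n multiplications
        if score > threshold:
            results.append((doc_id, score))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


if __name__ == "__main__":
    indexes = {
        "F1": [1, 0, 1, 1],
        "F2": [0, 1, 0, 0],
        "F3": [1, 1, 1, 0],
    }
    query = [1, 0, 1, 1]
    print(ranked_search(indexes, query, threshold=1))
    # prints [('F1', 3), ('F3', 2)]
```

Document F2 shares no keyword with the query, so its score of 0 does not exceed the threshold and it is not returned.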

5.3. Performance Evaluation

We conducted an experimental evaluation of the proposed PRMS constructions and compared them with Cao et al.’s scheme [12]. The experiments were implemented in the Java programming language on a platform with the Windows 10 operating system, an Intel(R) Core(TM) i5-7500 CPU at 3.71 GHz and 8 GB of memory. All procedures were compared in the experiments, that is, ciphertext index construction, search trapdoor generation and cloud search. Note that the cost of encrypting documents in the proposed PRMS constructions is determined only by the employed symmetric encryption scheme. Thus, the performance of document encryption was not considered in the experiments. The chosen parameters satisfy $n \leq 2^{32}$, $|γ| = |γ_{ℓ}| = |δ| = λ_{1} = 200$, $|p| = λ_{2} = 512$, $|m_{i}| = λ_{3} = 128$, and $|t_{i}| = λ_{4} = 128$.
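As a rough illustration of these parameter sizes, the following Python sketch samples random values with the stated lengths, assuming that $|x|$ denotes the bit length of $x$. The helper name `random_with_bits` and the dictionary `BIT_LENGTHS` are hypothetical, and the generation of the 512-bit prime p is omitted because it would require a primality test.

```python
import secrets

# Assumed bit lengths, read from the parameter list above.
BIT_LENGTHS = {"gamma": 200, "delta": 200, "m_i": 128, "t_i": 128}


def random_with_bits(bits):
    """Return a random integer with exactly `bits` bits (top bit forced to 1)."""
    return secrets.randbits(bits) | (1 << (bits - 1))


if __name__ == "__main__":
    for name, bits in BIT_LENGTHS.items():
        value = random_with_bits(bits)
        print(name, value.bit_length())
```

Forcing the top bit keeps every sampled value at the full stated length, at the cost of one bit of entropy; whether the original implementation does this is not stated in the text.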

As shown in Figure 6, the size of the index vector was varied from 10 to 100 to evaluate the performance of ciphertext index generation. The time costs of the proposed basic PRMS construction are less than 1 ms in all cases for processing one document, while the costs of Cao et al.’s scheme [12] increase rapidly as the number of keywords increases.

For processing a document with multiple types of keywords using the proposed extended PRMS scheme, different numbers of keyword types were considered, that is, $z = 5, 10, \dots, 30$. The performance of the extended PRMS construction for processing one document is shown in Figure 7. It can be seen from Figure 6 and Figure 7 that the cost of processing a document with the extended PRMS construction is roughly z times that with the basic PRMS construction.

For the query generation and search phases, the basic and extended PRMS constructions have the same performance. As shown in Figure 8, the time costs of all these schemes grow with the number of keywords in the query. Note that Cao et al.’s scheme [12] needs to perform matrix multiplications to generate a search trapdoor. Thus, the proposed PRMS constructions are more efficient than their scheme [12] in all cases.

For the search by the cloud server, the proposed PRMS constructions do not involve complicated computation operations. In the experiments, the cloud server searched over $μ = 200$ outsourced documents. The performance of the proposed PRMS constructions and Cao et al.’s scheme [12] is shown in Figure 9. It can be seen that the search time of all these schemes is roughly linear in the number of keywords in the query in all cases. Furthermore, the performance of Cao et al.’s scheme [12] degrades greatly as the number of keywords in the query vector increases. Thus, the search procedure of the proposed PRMS schemes is more efficient than that of Cao et al.’s scheme [12].

6. Conclusions

Existing privacy-preserving multi-keyword search schemes cannot be deployed on resource-constrained devices due to the complicated computations required at the user side. To address this issue, this paper presented lightweight privacy-preserving ranked multi-keyword search (PRMS) constructions that allow resource-constrained devices to process data and generate search trapdoors for outsourced documents. The extended PRMS construction allows each outsourced document to be attached with different types of keywords and to be searched according to any type of keyword. Security analysis showed that the proposed PRMS constructions can protect the privacy of outsourced documents, indexes and search trapdoors, and guarantee unlinkability of search trapdoors. Performance analysis demonstrated that the proposed PRMS constructions are more efficient than existing proposals and can be deployed on weak devices. Note that in the private key setting, it is difficult to realize data sharing among multiple users without leaking any private parameters. Thus, it is necessary to develop lightweight public key encryption schemes supporting multi-keyword search in future work, especially without using bilinear groups.

Author Contributions

Conceptualization, M.Z., L.L. and Y.W.; methodology, L.L. and Y.W.; software, H.L.; validation, H.L.; formal analysis, M.Z. and H.D.; writing—original draft preparation, M.Z., L.L., H.W. and Y.W.; writing—review and editing, M.Z., Y.D., H.D. and Y.W.; supervision, Y.D.; project administration, Y.D. and Y.W.; funding acquisition, Y.D., H.D., H.W. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This article is supported in part by the Guangxi Natural Science Foundation under grants 2019GXNSFFA245015 and 2019GXNSFGA245004, the National Natural Science Foundation of China under projects 62162017, 61962012 and 61902123, the Scientific Research Project of Hunan Provincial Department of Education through grant 22B0822, the special fund of the High-level Innovation Team and Outstanding Scholar Program for universities of Guangxi, and the Peng Cheng Laboratory Projects of Guangdong Province PCL2021A09, PCL2021A02, and PCL2022A03.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kumar, P.; Kumar, R. Issues and Challenges of Load Balancing Techniques in Cloud Computing: A Survey. ACM Comput. Surv. 2019, 51, 120.
2. Tabrizchi, H.; Rafsanjani, M.K. A survey on security challenges in cloud computing: Issues, threats, and solutions. J. Supercomput. 2020, 76, 9493–9532.
3. Zhao, M.; Ding, Y.; Wang, Y.; Wang, H.; Han, B. Verifiable and Privacy-Preserving Outsourcing of Matrix Multiplications. In Proceedings of the 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chongqing, China, 29–30 October 2020; pp. 72–75.
4. Mei, T.; Rui, Y.; Li, S.; Tian, Q. Multimedia Search Reranking: A Literature Survey. ACM Comput. Surv. 2014, 46, 38.
5. Kuang, N.L.; Leung, C.H.C. Performance Effectiveness of Multimedia Information Search Using the Epsilon-Greedy Algorithm. In Proceedings of the 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), Boca Raton, FL, USA, 6–19 December 2019; pp. 929–936.
6. Oliveira, A.; Rocha, A. Score-based Learning for Relevance Prediction in Image Similarity Search. In Proceedings of the ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 2087–2091.
7. Oliveira, A.; Oakley, E.; da Silva Torres, R.; Rocha, A. Relevance prediction in similarity-search systems using extreme value theory. J. Vis. Commun. Image Represent. 2019, 60, 236–249.
8. Liu, Q.; Nie, X.; Liu, X.; Peng, T.; Wu, J. Verifiable Ranked Search over dynamic encrypted data in cloud computing. In Proceedings of the 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), Vilanova i la Geltru, Spain, 14–16 June 2017; pp. 1–6.
9. Ding, X.; Liu, P.; Jin, H. Privacy-Preserving Multi-Keyword Top-k Similarity Search Over Encrypted Data. IEEE Trans. Dependable Secur. Comput. 2019, 16, 344–357.
10. Zhang, L.; Zhang, Y.; Ma, H. Privacy-Preserving and Dynamic Multi-Attribute Conjunctive Keyword Search Over Encrypted Cloud Data. IEEE Access 2018, 6, 34214–34225.
11. Fu, Z.; Xia, L.; Sun, X.; Liu, A.X.; Xie, G. Semantic-Aware Searching Over Encrypted Data for Cloud Computing. IEEE Trans. Inf. Forensics Secur. 2018, 13, 2359–2371.
12. Cao, N.; Wang, C.; Li, M.; Ren, K.; Lou, W. Privacy-Preserving Multi-Keyword Ranked Search over Encrypted Cloud Data. IEEE Trans. Parallel Distrib. Syst. 2014, 25, 222–233.
13. Boneh, D.; Di Crescenzo, G.; Ostrovsky, R.; Persiano, G. Public Key Encryption with Keyword Search. In Proceedings of the Advances in Cryptology – EUROCRYPT 2004, Interlaken, Switzerland, 2–6 May 2004; Cachin, C., Camenisch, J.L., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 506–522.
14. Ren, Y.; Chen, Y.; Yang, J.; Xie, B. Privacy-preserving ranked multi-keyword search leveraging polynomial function in cloud computing. In Proceedings of the 2014 IEEE Global Communications Conference, Austin, TX, USA, 8–12 December 2014; pp. 594–600.
15. Liu, L.G.; Zhao, M.; Ding, Y.; Wang, Y.; Deng, H.; Wang, H. Privacy-Preserving Multi-keyword Search over Outsourced Data for Resource-Constrained Devices. In Proceedings of the Blockchain and Trustworthy Systems, Dali, China, 6–7 August 2020; Zheng, Z., Dai, H.N., Fu, X., Chen, B., Eds.; Springer: Singapore, 2020; pp. 282–294.
16. Song, D.X.; Wagner, D.; Perrig, A. Practical techniques for searches on encrypted data. In Proceedings of the 2000 IEEE Symposium on Security and Privacy, SP 2000, Berkeley, CA, USA, 14–17 May 2000; pp. 44–55.
17. Chang, Y.C.; Mitzenmacher, M. Privacy Preserving Keyword Searches on Remote Encrypted Data. In Proceedings of the Applied Cryptography and Network Security, New York, NY, USA, 7–10 June 2005; Ioannidis, J., Keromytis, A., Yung, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 442–455.
18. Bellare, M.; Boldyreva, A.; O’Neill, A. Deterministic and Efficiently Searchable Encryption. In Proceedings of the Advances in Cryptology – CRYPTO 2007, Santa Barbara, CA, USA, 19–23 August 2007; Menezes, A., Ed.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 535–552.
19. Li, J.; Wang, Q.; Wang, C.; Cao, N.; Ren, K.; Lou, W. Fuzzy Keyword Search over Encrypted Data in Cloud Computing. In Proceedings of the 2010 Proceedings IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–5.
20. Wang, C.; Cao, N.; Li, J.; Ren, K.; Lou, W. Secure Ranked Keyword Search over Encrypted Cloud Data. In Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems, Genoa, Italy, 21–25 June 2010; pp. 253–262.
21. Cash, D.; Jarecki, S.; Jutla, C.; Krawczyk, H.; Roşu, M.C.; Steiner, M. Highly-Scalable Searchable Symmetric Encryption with Support for Boolean Queries. In Proceedings of the Advances in Cryptology – CRYPTO 2013, Santa Barbara, CA, USA, 18–22 August 2013; Canetti, R., Garay, J.A., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 353–373.
22. Golle, P.; Staddon, J.; Waters, B. Secure Conjunctive Keyword Search over Encrypted Data. In Proceedings of the Applied Cryptography and Network Security, Yellow Mountain, China, 8–11 June 2004; Jakobsson, M., Yung, M., Zhou, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 31–45.
23. Hwang, Y.H.; Lee, P.J. Public Key Encryption with Conjunctive Keyword Search and Its Extension to a Multi-user System. In Proceedings of the Pairing-Based Cryptography – Pairing 2007, Tokyo, Japan, 2–4 July 2007; Takagi, T., Okamoto, T., Okamoto, E., Okamoto, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 2–22.
24. Fu, Z.; Wu, X.; Guan, C.; Sun, X.; Ren, K. Toward Efficient Multi-Keyword Fuzzy Search Over Encrypted Outsourced Data with Accuracy Improvement. IEEE Trans. Inf. Forensics Secur. 2016, 11, 2706–2716.
25. Liu, J.; Han, J.; Wang, Z. Searchable Encryption Scheme on the Cloud via Fully Homomorphic Encryption. In Proceedings of the 2016 Sixth International Conference on Instrumentation Measurement, Computer, Communication and Control (IMCCC), Harbin, China, 21–23 July 2016; pp. 108–111.
26. Anand, V.; Satapathy, S.C. Homomorphic encryption for secure information retrieval from the cloud. In Proceedings of the 2016 International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS), Pudukkottai, India, 24–26 February 2016; pp. 1–5.
27. Chenam, V.B.; Ali, S.T. A designated tester-based certificateless public key encryption with conjunctive keyword search for cloud-based MIoT in dynamic multi-user environment. J. Inf. Secur. Appl. 2023, 72, 103377.
28. Xu, P.; Wu, Q.; Wang, W.; Susilo, W.; Domingo-Ferrer, J.; Jin, H. Generating Searchable Public-Key Ciphertexts with Hidden Structures for Fast Keyword Search. IEEE Trans. Inf. Forensics Secur. 2015, 10, 1993–2006.
29. Hu, C.; Liu, P.; Yang, R.; Xu, Y. Public-Key Encryption with Keyword Search via Obfuscation. IEEE Access 2019, 7, 37394–37405.
30. Xu, P.; Tang, S.; Xu, P.; Wu, Q.; Hu, H.; Susilo, W. Practical Multi-Keyword and Boolean Search Over Encrypted E-mail in Cloud Server. IEEE Trans. Serv. Comput. 2019, 14, 1877–1889.
31. Wang, B.; Hou, Y.; Li, M.; Wang, H.; Li, H. Maple: Scalable multi-dimensional range search over encrypted cloud data with tree-based index. In Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, ASIA CCS ’14, Kyoto, Japan, 4–6 June 2014.
32. Olakanmi, O.O.; Odeyemi, K.O. A certificateless keyword searchable encryption scheme in multi-user setting for fog-enhanced Industrial Internet of Things. Trans. Emerg. Telecommun. Technol. 2022, 33, e4257.
33. He, T.; Ma, W. An Effective Fuzzy Keyword Search Scheme in Cloud Computing. In Proceedings of the 2013 5th International Conference on Intelligent Networking and Collaborative Systems, Xi’an, China, 9–11 September 2013; pp. 786–789.
34. Zhang, Q.; Fu, S.; Jia, N.; Tang, J.; Xu, M. Secure Multi-keyword Fuzzy Search Supporting Logic Query over Encrypted Cloud Data. In Proceedings of the Security and Privacy in New Computing Environments, Tianjin, China, 13–14 April 2019; Li, J., Liu, Z., Peng, H., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 210–225.
35. Yang, Y.; Liu, X.; Deng, R.H. Multi-User Multi-Keyword Rank Search Over Encrypted Data in Arbitrary Language. IEEE Trans. Dependable Secur. Comput. 2020, 17, 320–334.
36. Ding, Y.; Xu, H.; Wang, Y.; Yuan, F.; Liang, H. Secure Multi-Keyword Search and Access Control over Electronic Health Records in Wireless Body Area Networks. Secur. Commun. Netw. 2021, 2021, 9520941.
37. Deebak, B.D.; Memon, F.H.; Dev, K.; Khowaja, S.A.; Qureshi, N.M.F. AI-enabled privacy-preservation phrase with multi-keyword ranked searching for sustainable edge-cloud networks in the era of industrial IoT. Ad Hoc Netw. 2022, 125, 102740.
38. Raghavendra, S.; Girish, S.; Geeta C, M.; Buyya, R.; Venugopal K, R.; Iyengar, S.S.; Patnaik, L.M. IGSK: Index Generation on Split Keyword for search over cloud data. In Proceedings of the 2015 International Conference on Computing and Network Communications (CoCoNet), Trivandrum, India, 16–19 December 2015; pp. 374–380.
39. Ding, S.; Li, Y.; Zhang, J.; Chen, L.; Wang, Z.; Xu, Q. An efficient and privacy-preserving ranked fuzzy keywords search over encrypted cloud data. In Proceedings of the 2016 International Conference on Behavioral, Economic and Socio-cultural Computing (BESC), Durham, NC, USA, 11–13 November 2016; pp. 1–6.
40. Zhao, M.; Liu, L.G.; Ding, Y.; Wang, Y.; Liang, H.; Tang, S.; Wen, B.; Liang, W. Verifiable and Privacy-Preserving Ranked Multi-Keyword Search over Outsourced Data in Clouds. In Proceedings of the 2021 IEEE 15th International Conference on Big Data Science and Engineering (BigDataSE), Shenyang, China, 20–22 October 2021; pp. 95–102.
41. Lu, R.; Zhu, H.; Liu, X.; Liu, J.K.; Shao, J. Toward efficient and privacy-preserving computing in big data era. IEEE Netw. 2014, 28, 46–50.
42. Xia, Z.; Wang, X.; Sun, X.; Wang, Q. A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 340–352.

Figure 1. System model.

Figure 2. The procedures of basic PRMS scheme.

Figure 3. The flow chart of basic PRMS scheme.

Figure 4. The procedures of extended PRMS scheme.

Figure 5. The flow chart of the extended PRMS scheme.

Figure 6. Time cost on ciphertext index generation.

Figure 7. Time cost on ciphertext index generation for multiple types of keywords by the extended PRMS scheme.

Figure 8. Time cost on search trapdoor generation.

Figure 9. Time cost on the search process over 200 documents.

Table 1. Comparison with related works.

Scheme	Search Mechanism	Setting	Application	Advantage
Cao et al. [12]	Multi-keyword	Private key	Cloud computing	Ranked search; coordinate matching
Boneh et al. [13]	Single keyword	Public key	Email gateway	Semantically secure
Ren et al. [14]	Multi-keyword	Private key	Cloud computing	Ranked search; lightweight
Liu et al. [15]	Multi-keyword	Private key	Cloud computing; IoT	Ranked search; lightweight
Song et al. [16]	Single keyword	Private key	Cloud storage	Provably secure; controlled searching; hidden queries
Chang and Mitzenmacher [17]	Single keyword	Private key	Distributed file system	File update
Bellare, Boldyreva and O’Neill [18]	Single keyword	Public key	Database	Deterministic; CCA security
Li et al. [19]	Multi-keyword	Private key	Cloud computing	Fuzzy keyword search
Wang et al. [20]	Multi-keyword	Private key	Cloud computing	Ranked search
Cash et al. [21]	Single keyword	Private key	Large database; arbitrarily-structured data	Conjunctive search; general Boolean queries
Golle, Staddon and Waters [22]	Multi-keyword	Private key	Email system	Conjunctive keyword search
Hwang and Lee [23]	Multi-keyword	Public key	Remote storage system	Conjunctive keyword search; Multi-user
Fu et al. [24]	Multi-keyword	Private key	Cloud computing	Fuzzy search; ranked search
Liu, Han and Wang [25]	Multi-keyword	Multikey	Cloud computing	Multiple data source
Anand and Satapathy [26]	Single keyword	Public key	Cloud computing	Ranked search
Chenam and Ali [27]	Single keyword	Public key	Medical Internet of Things	Designated tester; conjunctive keyword search; certificateless; dynamical
Xu et al. [28]	Single keyword	Public key	Large-scale databases	Hidden structures; semantic security
Hu et al. [29]	Single keyword	Public key	Cloud computing	Resists off-line keyword guessing attacks
Xu et al. [30]	Multi-keyword	Public key	Large encrypted email database	Boolean search
Wang et al. [31]	Multi-keyword	Private key	Cloud computing	Scalable; multi-dimensional range search
Olakanmi and Odeyemi [32]	Single keyword	Public key	Industrial Internet of Things	Certificateless; resist inside keyword guessing attacks
Zhang et al. [34]	Multi-keyword	Private key	Cloud storage	Fuzzy search; resist sparse non-negative matrix factorization based attacks
Yang, Liu and Deng [35]	Multi-keyword	Public key	Cloud storage	Ranked search; multi-user; time-controlled revocation
Deebak et al. [37]	Multi-keyword	Private key	Sustainable edge-cloud networks	Ranked search; conjunctive search
Raghavendra et al. [38]	Multi-keyword	Private key	Cloud computing	Fuzzy search; synonym based search
Zhao et al. [40]	Multi-keyword	Private key	Cloud computing	Ranked search; verifiability

Table 2. Notations.

Notations	Descriptions
$λ_{i}$ $(i = 1, 2, 3, 4)$	Security parameters
s	Secret key of data owner
p	Large prime
H	One-way hash function
N	The filename of document F
d	The size of document F
F	Document
$\bar{F}$	Encrypted document
D	Keyword dictionary
n	Number of keywords in D
$\mathcal{D}$	Dictionary set $\mathcal{D} = \{D_{1}, D_{2}, \dots, D_{z}\}$
W	A set of search keywords
$\vec{I}$	A plaintext index vector $\vec{I} = (I_{1}, I_{2}, \dots)$
$\hat{I}$	A ciphertext index vector $\hat{I} = ({\hat{I}}_{1}, {\hat{I}}_{2}, \dots)$
$\vec{Q}$	Query vector $\vec{Q} = (Q_{1}, Q_{2}, \dots)$ constructed from W
$\hat{Q}$	Search trapdoor in ciphertext format
$γ$	The hash value with regard to document F
$m_{i}, t_{i}$	Random numbers for $1 \leq i \leq n + 2$
$δ$	Random number
$τ$	Search threshold

Table 3. Theoretical comparison.

Scheme	Setup	Index Generation	Trapdoor Generation	Search Process	Setting
Cao et al.’s scheme [12]	-	$O (μ n^{2})$	$O (n^{2})$	$O (μ n)$	Private key
Ding et al.’s scheme [9]	-	$O (α b μ n^{2})$	$O (n^{2})$	$O (θ μ \log n)$	Private key
Xia et al.’s scheme [42]	-	$O (μ n^{3})$	$O (n^{2})$	$O (θ μ \log n)$	Private key
Boneh et al.’s scheme [13]	1 exponentiation	$O (μ n)$	$O (n)$	$O (μ n^{2})$	Public key
Basic PRMS scheme (Section 3 and [15])	-	$O (μ n)$	$O (n)$	$O (μ n)$	Private key
Extended PRMS scheme (Section 4)	-	$O (μ z n)$	$O (n)$	$O (μ n)$	Private key

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

