1. Introduction
Data are now within everyone’s reach, and more and more people are becoming aware of the ownership and usage of their data. What is “data confirmation”? Its purpose is to legally establish ownership of the data and the right of the data owner to determine who can access them. Data confirmation requires determining the type of rights, how they are acquired, and how they are distributed. With the popularity of cloud computing, people have begun to share data. There are many ways to share data, such as uploading to third-party trading platforms, cloud servers, GitHub [1], etc., none of which can ensure the privacy of users. Many scholars have begun to study data sharing, privacy issues [2,3], and how data are uploaded. In addition, data spread extremely fast. Meng et al. [4] modeled network public opinion data to predict public opinion crisis warnings. Cao et al. [5] proposed a more comprehensive recommendation scheme based on real-world shared mobile data. Ciphertext-policy attribute-based encryption (CP-ABE) provides a good answer to these problems: it allows the data owner to specify that only those who conform to the access policy can access the data.
Figure 1 shows the execution process of CP-ABE. The data owner Alice has some data (labeled “Data” in the figure). She chooses an access policy ($(A\wedge B)\vee (C\wedge D)$ in the figure) according to her intended users, encrypts $Data$ under that policy, and uploads the result to the server. Suppose there are two users, $Use{r}_{1}$ and $Use{r}_{2}$, in the system; $Use{r}_{1}$ has attributes A and B, and $Use{r}_{2}$ has attributes A and C. According to the access policy in the ciphertext, $Use{r}_{1}$ can decrypt and access the data, but $Use{r}_{2}$ fails to decrypt. However, because data can be copied, ownership of the data cannot be determined. The development of blockchain [6,7,8] has made data rights confirmation possible. Due to its immutability and traceability, the information of the data owner cannot be tampered with once it is on the chain, and the source and destination of the data can be easily traced. However, as a result of the decentralized and public nature of blockchain technology, the privacy of the data owner cannot be guaranteed. Traditional third-party hosting or the issuing of ownership certificates does not guarantee that the data owner’s source data will not be leaked. Therefore, a scheme that solves the above problems is needed.
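The access decision in the Figure 1 example can be sketched in a few lines of Python. This is only a plain, non-cryptographic model of the policy check, assuming the policy is written in disjunctive normal form; the names `POLICY` and `satisfies` are illustrative and do not come from the scheme. In CP-ABE itself the check is enforced by the mathematics of decryption rather than by an explicit test.

```python
from typing import FrozenSet, List, Set

# Policy (A AND B) OR (C AND D), stored as a list of AND-clauses.
# Satisfying any single clause is enough to satisfy the whole policy.
POLICY: List[FrozenSet[str]] = [frozenset({"A", "B"}), frozenset({"C", "D"})]


def satisfies(attributes: Set[str], policy: List[FrozenSet[str]]) -> bool:
    """Return True if the attribute set contains every attribute of some clause."""
    return any(clause.issubset(attributes) for clause in policy)


user1 = {"A", "B"}  # attributes of User_1 in Figure 1
user2 = {"A", "C"}  # attributes of User_2 in Figure 1

print(satisfies(user1, POLICY))  # True: User_1 matches the clause (A AND B)
print(satisfies(user2, POLICY))  # False: User_2 matches neither clause
```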
Consider the following scenario: Alice wants to store her data on the cloud and share them with others, but she only wants a specific group of people to access the data. Therefore, she specifies an access policy and encrypts the data using CPABE before uploading it. Bob is a member of Alice’s designated group, and he retrieves and decrypts the data from the cloud. Smith is Bob’s friend, but he is not a part of the designated group. Smith contacts Bob and obtains a copy of the data. One day, Alice discovers that Smith is using her data and wants to seek compensation. However, Smith claims that the data are his own. How can Alice prove that the data belong to her?
In 2005, Sahai and Waters [9] introduced the notion of fuzzy identity-based encryption, which was later extended to attribute-based encryption (ABE). In an ABE system, the ciphertext and key are associated with an attribute set and an access structure, and decryption succeeds only when the attribute set satisfies the access structure. In 2006, Goyal et al. [10] proposed associating the access policy with either the ciphertext or the key, and for the first time divided ABE into ciphertext-policy attribute-based encryption (CP-ABE) and key-policy attribute-based encryption (KP-ABE). Bethencourt et al. [11] introduced the first CP-ABE system in 2007, embedding the access tree structure within the ciphertext; however, it is challenging to deploy in practice. Waters et al. [12] built on Bethencourt et al.’s work in the following year and proposed a CP-ABE system with an efficient general access structure, while also proving selective security in the standard model. In 2012, Lewko and Waters [13] developed a general strategy for converting selective security in the standard model into adaptive security. CP-ABE has a significant influence on today’s cloud computing. In 2015, Ning et al. [14] proposed a traceable and auditable CP-ABE scheme for cloud computing to address key abuse by dishonest users in the cloud storage environment [15], but it does not provide key revocation. Yu et al. [16] proposed a traceable and undeniable CP-ABE scheme based on Ning’s work to solve the problem of semi-honest institutions illegally selling keys.
In the big data environment [17], data can be used to verify the validity of a protocol [18] and to train robots [19]. However, these web data have no real ownership. Yun Peng et al. [20] investigated the basic challenges surrounding data confirmation in 2016. Bing Guo et al. [21] presented a service system to defend the property rights of personal data in 2017. In the same year, Shuaiyu Wang et al. [22] suggested a big data right confirmation technique based on blockchain technology, but the data source cannot be verified. In 2018, Hailong Wang and his colleagues [23] introduced an approach to big data right confirmation using blockchain and digital watermarking technology, but the authority agency can access the data owner’s source data. Although this solution can be applied to cloud storage, it confirms rights over plaintext, so the privacy of users cannot be guaranteed. Zhao et al. [24] developed a smart-contract-based big data property right confirmation system the following year. In 2021, Zhou et al. [25] proposed a data ownership confirmation scheme based on consortium blockchain in IoT environments [26], with a focus on controlling the flow of data. However, the scheme cannot be applied to one-to-many environments such as cloud storage. Professors Jintai Ding and Ke Tang from Tsinghua University announced plans to develop a solution for managing large-scale data transactions that combines cryptographic techniques with economic mechanism design, with the aim of securing data transactions while also increasing transaction efficiency. In 2022, Liu et al. [27] proposed a data ownership confirmation scheme based on the Ethereum blockchain and smart contracts. The parties authenticate their identities through a protocol for generating data fingerprints based on smart contracts. However, the article did not address the issue of user privacy protection on the public blockchain.
Based on the research status above, we propose a new data confirmation scheme in the cloud storage environment, focusing on user privacy protection and preventing the leakage of original plaintext data. The scheme can effectively protect the privacy of data owners while ensuring data confirmation, and in the process of confirmation, no one can access plaintext, thus reducing the risk of data leakage. We embed the data owner’s identification information into CPABE using Paillier encryption and change the plaintext confirmation form to the ciphertext confirmation form. An audit phase is introduced at the end of the confirmation process.
Our contributions are as follows:
(1) User privacy protection. We propose a new data confirmation scheme based on CP-ABE in the cloud storage environment. Users only need to encrypt their identity information with Paillier encryption, embed it into the ciphertext, and upload the ciphertext to the cloud; they do not need to worry about revealing their identity.
(2) Prevention of original plaintext data leakage. During the entire right confirmation process, the authority $AT$ can only access the ciphertext and only needs to process the ciphertext. This greatly reduces the risk of plaintext data leakage during the right confirmation process.
(3) Security and efficiency. We reduce the security of the scheme to the three-prime subgroup decision problem and prove that the scheme is secure. Experimental analysis shows that our scheme is almost as efficient as the scheme proposed by Allison et al. [21] in terms of the system setup, key generation, encryption, and decryption algorithms.
Table 1 shows the comparison between our scheme and other data confirmation schemes.
Section 2 presents formal definitions and explanations of several fundamental concepts. Section 3 constructs the scheme, including the security requirements, the implementation, and the security proof. In Section 4, we conduct experiments and analysis to evaluate the effectiveness of the scheme. Finally, Section 5 summarizes the scheme and its contributions.
3. Construction
3.1. Membership
Our system consists of five parties (as shown in Figure 2). The data owner $\left(Do\right)$ is in charge of data encryption and uploading. The data user $\left(Du\right)$ is responsible for retrieving from the cloud, and decrypting, the data submitted by the data owner. The authority $\left(AT\right)$ is in charge of giving decryption keys to data users, participating in the ciphertext’s signature, and storing the credentials of the data owner. The public auditor $\left(PA\right)$ is in charge of publicly auditing the ciphertext and extracting the information of the ciphertext owner from the credentials. Finally, the cloud server $\left(Cloud\right)$ is responsible for storing the ciphertext uploaded by the data owner.
3.2. Security
3.2.1. IND-CPA Security
The $IND\text{-}CPA$ security game for our proposed scheme, which is equivalent to the one proposed by Allison et al. [21], proceeds as follows:
– $Setup$: The challenger $\mathcal{B}$ calls the $Setup({1}^{\phi},U)$ algorithm and gives the public parameter $PK$ to the adversary $\mathcal{A}$.
– $Phase\ 1$: Adversary $\mathcal{A}$ can adaptively request the decryption keys $S{k}_{i}$ associated with attribute sets ${S}_{1},\dots ,{S}_{{q}_{r}}$ from the challenger $\mathcal{B}$. In response, $\mathcal{B}$ executes the key generation algorithm to generate $S{k}_{i}$ and sends it to $\mathcal{A}$.
– $Challenge$: Adversary $\mathcal{A}$ provides the challenger $\mathcal{B}$ with two equal-length messages ${M}_{0}$ and ${M}_{1}$ and a generator matrix ${A}^{*}$ corresponding to an access structure ${\mathbb{A}}^{*}$ that is not satisfied by any of ${S}_{1},\dots ,{S}_{{q}_{r}}$. Then $\mathcal{B}$ randomly chooses a bit $\sigma \in \{0,1\}$ and generates the ciphertext $C{T}_{{A}^{*},T}$ by calling the encryption algorithm with $s{k}_{Do}$, $\langle {A}^{*},\rho \rangle $, $PK$, and ${M}_{\sigma}$. Finally, $\mathcal{B}$ sends $C{T}_{{A}^{*},T}$ to adversary $\mathcal{A}$.
– $Phase\ 2$: Adversary $\mathcal{A}$ keeps asking $\mathcal{B}$ for decryption keys $S{k}_{i}$ corresponding to attribute sets ${S}_{{q}_{r}+1},\dots ,{S}_{q}$, none of which may satisfy the access structure ${\mathbb{A}}^{*}$. Upon each request, $\mathcal{B}$ calls the key generation algorithm and sends $S{k}_{i}$ to adversary $\mathcal{A}$.
– $Guess$: $\mathcal{A}$ outputs a guess ${\sigma}^{\prime}\in \{0,1\}$ and wins if ${\sigma}^{\prime}=\sigma $.
The advantage of the adversary in this game is defined as $Ad{v}_{\mathcal{A}}=\left|\Pr [{\sigma}^{\prime}=\sigma ]-\frac{1}{2}\right|$.
Definition 7. Our scheme is $IND\text{-}CPA$ secure if every polynomial-time adversary has at most a negligible advantage in the above game.
3.2.2. Dishonest User Game (Non-Replicability of Ciphertext)
The dishonest user game of this scheme is defined as follows: A user attempts to confuse the auditor by forging the authority’s signature and republishing a ciphertext. The game is played by a challenger and an adversary.
– $Setup$: Challenger $\mathcal{B}$ runs the $Setup({1}^{\phi},U)$ algorithm and sends the public parameters $PK$ and $s{k}_{Do}$ to the attacker $\mathcal{A}$.
– $Ciphertext\ Generation$: Challenger $\mathcal{B}$ generates the ciphertext $C{T}_{A,T}$ through the $Encrypt$ algorithm and sends it to $\mathcal{A}$; $\mathcal{A}$ generates a new ciphertext $C{T}_{A,{T}^{\prime}}^{\prime}$ from the initial ciphertext $C{T}_{A,T}$.
– $Output$: If $Decrypt(S{k}_{Du},C{T}_{A,T})=Decrypt(S{k}_{Du},C{T}_{A,{T}^{\prime}}^{\prime})$ and ${C}_{0}^{\prime}=M\cdot e({g}^{\alpha \beta},{g}^{s{T}^{\prime}})$, then we say that the attacker successfully copies the ciphertext.
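The adversary’s advantage in the dishonest user game is defined as the probability that $\mathcal{A}$ successfully copies the ciphertext, $Ad{v}_{\mathcal{A}}=\Pr [\mathcal{A}\ \text{wins}]$.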
Definition 8. If the probability of a polynomial-time adversary winning the game described above is negligible, then we consider the ciphertext of our scheme to be secure and non-replicable.
The conventional CP-ABE scheme is insufficient for data confirmation. To support auditing in our CP-ABE scheme, we incorporate the data owner’s unique identifier (such as an address or ID number) into the ciphertext using Paillier encryption. The process of our scheme is illustrated in Figure 3.
3.3. Implementation
$\mathbf{Setup}\phantom{\rule{3.33333pt}{0ex}}({1}^{\phi},U)\to PK,MSK,P{k}_{AT},S{k}_{AT}$: The setup algorithm takes the security parameter $\phi $ and the user attribute universe U as inputs. It generates a group G of order $N={p}_{1}{p}_{2}{p}_{3}$, a bilinear map e, the integer group ${\mathbb{Z}}_{N}$, and a hash function H that maps its input x to ${\mathbb{Z}}_{N}$. The system then selects random parameters $\alpha ,a\in {\mathbb{Z}}_{N}$ and a generator $g\in {G}_{{p}_{1}}$. For each attribute $i\in U$, the system randomly selects a corresponding value ${u}_{i}\in {\mathbb{Z}}_{N}$. The system global parameter is set as $MSK=(\alpha ,{g}_{3})$, where ${g}_{3}$ is a generator of ${G}_{{p}_{3}}$, and $MSK$ is sent to the authority $AT$. $AT$ then performs the following steps locally: it randomly selects two large safe primes p and q that satisfy $gcd(pq,(p-1)(q-1))=1$, calculates $n=pq$ and $\lambda =lcm(p-1,q-1)$, and randomly selects a positive integer ${g}_{1}$ less than ${n}^{2}$. Next, $AT$ computes $\mu ={\left(L\left({g}_{1}^{\lambda}\bmod {n}^{2}\right)\right)}^{-1}\bmod n$, where $L(x)=(x-1)/n$, and randomly selects a value $\beta \in {\mathbb{Z}}_{N}$. The public parameter $P{k}_{AT}=(n,{g}_{1},{g}^{\beta})$ is published, whereas the private key $S{k}_{AT}=(\lambda ,\mu ,\beta )$ is stored locally.
$\mathbf{Encrypt}\phantom{\rule{3.33333pt}{0ex}}(P{k}_{AT},<A,\rho >,PK,M)\to C{T}_{A,T}:$
$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{1}$: The data owner $Do$ hashes a unique identifier (e.g., ID number, address, or mailbox) and maps it to an integer in ${\mathbb{Z}}_{N}$, denoted as ${t}_{id}$. $Do$ then chooses a value $r\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{{n}^{2}}^{*}$ and employs Paillier encryption to generate the encrypted output $T={g}_{1}^{{t}_{id}}{r}^{n}\bmod {n}^{2}$. Algorithm 1 is as follows (here we assume the unique identifier string is “address”):
Algorithm 1 Encrypt ${t}_{id}$
Input: String “address”
1: $addrHash$ ← hash “address” and convert the digest to a byte array;
2: ${t}_{id}$ ← map $addrHash$ into ${\mathbb{Z}}_{N}$;
3: $r$ ← randomly pick an element from ${\mathbb{Z}}_{{n}^{2}}^{*}$;
4: $T$ ← ${g}_{1}^{{t}_{id}}{r}^{n}\bmod {n}^{2}$ (Paillier encryption of the identity information);
Output:
${t}_{id}$ = 79847630022358710946125273965671104052858 065717629025639108307113838327353
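The identifier encryption of Step 1 can be sketched in Python as follows. This is a toy illustration under several assumptions that are not taken from the paper: SHA-256 is used as the hash, the primes are tiny, ${g}_{1}=n+1$ is chosen as the Paillier generator, and ${t}_{id}$ is reduced modulo the Paillier modulus n (rather than N) so that it fits the Paillier message space. The function names are hypothetical.

```python
import hashlib
import math
import secrets

# Toy Paillier parameters (assumption: far too small for real use).
p, q = 104729, 1299709          # two small known primes
n = p * q
n_sq = n * n
g1 = n + 1                      # a common simple choice of Paillier generator
assert math.gcd(n, (p - 1) * (q - 1)) == 1


def hash_to_int(identifier: str, modulus: int) -> int:
    """Hash the identifier with SHA-256 and reduce the digest modulo `modulus`."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % modulus


def paillier_encrypt(t_id: int) -> int:
    """Compute T = g1^t_id * r^n mod n^2 for a fresh random r coprime to n."""
    while True:
        r = secrets.randbelow(n_sq - 1) + 1   # r in [1, n^2 - 1]
        if math.gcd(r, n) == 1:               # r must be a unit modulo n^2
            break
    return (pow(g1, t_id, n_sq) * pow(r, n, n_sq)) % n_sq


t_id = hash_to_int("address", n)   # reduced mod n to fit the toy message space
T = paillier_encrypt(t_id)
print("T =", T)
```

Because r is fresh on every call, encrypting the same identifier twice yields different values of T, so the stored credential does not reveal whether two ciphertexts belong to the same owner.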

$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{2}$: To encode the access structure for the data, the data owner $Do$ creates a shared generator matrix A with dimensions l by n using the LSSS. First, a secret number $s\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{n}$ is randomly selected. Then, $n-1$ random numbers ${y}_{2},\dots ,{y}_{n}\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{n}$ are selected to form the vector $\mathbf{y}=(s,{y}_{2},\dots ,{y}_{n})$. Finally, a random number ${r}_{i}\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$ is chosen for each row ${A}_{i}$, $i\in \left[l\right]$, of matrix A (where $\left[l\right]$ denotes the set $\{1,2,\dots ,l\}$). $Do$ hashes the plaintext M to obtain $H\left(M\right)$ and maps it to ${G}_{T}$, written ${G}_{T}\left(H\left(M\right)\right)$, to generate the ciphertext component $\overline{{C}_{0}}={G}_{T}\left(H\left(M\right)\right)\cdot e{(g,g)}^{\alpha s}$.
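The secret-sharing part of Step 2 can be illustrated with a small Python sketch. It assumes a hand-written LSSS matrix for the policy $(A\wedge B)\vee (C\wedge D)$ from Figure 1 and a prime toy modulus in place of the group order; the matrix, the modulus, and the function names are assumptions made for illustration, not values from the scheme.

```python
import secrets

MODULUS = 2**61 - 1  # a Mersenne prime, used here as a toy stand-in for the group order

# One row per attribute; RHO[i] names the attribute attached to row i.
# Rows for A and B sum to (1, 0), and so do the rows for C and D.
MATRIX = [(1, 1), (0, -1), (1, 1), (0, -1)]
RHO = ["A", "B", "C", "D"]


def make_shares(secret: int) -> dict:
    """Share `secret` as lambda_i = A_i . y with y = (secret, y_2), y_2 random."""
    y = (secret, secrets.randbelow(MODULUS))
    return {
        attr: sum(a * b for a, b in zip(row, y)) % MODULUS
        for row, attr in zip(MATRIX, RHO)
    }


def reconstruct(shares: dict, coefficients: dict) -> int:
    """Recombine shares with constants w_i chosen so that sum_i w_i * A_i = (1, 0)."""
    return sum(w * shares[attr] for attr, w in coefficients.items()) % MODULUS


s = secrets.randbelow(MODULUS)
shares = make_shares(s)

# Authorized sets {A, B} and {C, D} both recover the secret with w = (1, 1).
print(reconstruct(shares, {"A": 1, "B": 1}) == s)  # True
print(reconstruct(shares, {"C": 1, "D": 1}) == s)  # True
```

An unauthorized set such as {A, C} holds two copies of the same share $s+{y}_{2}$, which carries no information about s on its own because ${y}_{2}$ is uniformly random.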
$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{3}$: $\overline{{C}_{0}}$, ${g}^{s}$, and T are sent to the authority $AT$. $AT$ first recovers ${G}_{T}\left(H\left(M\right)\right)$ by dividing out the blinding factor: ${G}_{T}\left(H\left(M\right)\right)=\overline{{C}_{0}}/e({g}^{\alpha},{g}^{s})$.
After recovering ${G}_{T}\left(H\left(M\right)\right)$, $AT$ checks whether it already has a record of ${G}_{T}\left(H\left(M\right)\right)$ in its database. If a record already exists, the application is rejected; otherwise, $AT$ uses its private key $\beta $ to sign the message, generating ${C}_{0}^{\prime}={G}_{T}\left(H\left(M\right)\right)\cdot e{(g,g)}^{\alpha s\beta}$, and stores the data credentials of $Do$ in the local database in the form $T:{G}_{T}\left(H\left(M\right)\right):timeStamp$. This process guarantees that only one legitimate owner is associated with the original data source, and it safeguards against malicious actors who produce a ciphertext and assert false ownership over the data. Because $AT$ only ever handles ${G}_{T}\left(H\left(M\right)\right)$, it never directly accesses the plaintext, which enhances the security of the system. Finally, ${C}_{0}^{\prime}$ is sent back to the data owner $Do$ for further processing. The user credential storage procedure is given in Algorithm 2:
Algorithm 2 Store user credentials
Input: $\overline{{C}_{0}}$, ${g}^{s}$, T
1: Divide $\overline{{C}_{0}}$ by $e({g}^{\alpha},{g}^{s})$ to get ${G}_{T}\left(H\left(M\right)\right)$;
2: if no local record of ${G}_{T}\left(H\left(M\right)\right)$ is found then
3:   Element P = $e{(g,g)}^{\alpha s\beta}$;
4:   Date $date$ = current time obtained from the time function;
5:   $Recordlist\left[\right]$ ← (T, ${G}_{T}\left(H\left(M\right)\right)$, $date$);
6: end if
7: return $P*{G}_{T}\left(H\left(M\right)\right)$;
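The bookkeeping performed by $AT$ in Algorithm 2 can be sketched in Python. Since no pairing library is assumed here, the sketch replaces the bilinear group with an insecure stand-in: elements of G are represented directly by their exponents, and $e({g}^{a},{g}^{b})$ is modeled as a fixed base raised to $ab$ modulo a prime. The class `Authority`, the function `pairing`, and all constants are hypothetical, and the sketch rejects duplicate registrations explicitly, as the text of Step 3 describes.

```python
import time
from typing import Dict, Optional, Tuple

P_MOD = 2**61 - 1   # prime modulus of the toy target group (assumption)
GT_GEN = 3          # fixed base standing in for e(g, g) (assumption)


def pairing(a_exp: int, b_exp: int) -> int:
    """Toy model of e(g^a, g^b) = e(g, g)^(a*b); G elements are given as exponents."""
    return pow(GT_GEN, a_exp * b_exp, P_MOD)


class Authority:
    """Holds alpha and beta and keeps one credential record per data hash."""

    def __init__(self, alpha: int, beta: int) -> None:
        self.alpha = alpha
        self.beta = beta
        self.records: Dict[int, Tuple[int, float]] = {}

    def register(self, c0_bar: int, g_s: int, T: int) -> Optional[int]:
        blind = pairing(self.alpha, g_s)               # e(g^alpha, g^s)
        h_m = c0_bar * pow(blind, -1, P_MOD) % P_MOD   # recover G_T(H(M))
        if h_m in self.records:
            return None                                # data already registered: reject
        signature = pow(blind, self.beta, P_MOD)       # e(g, g)^(alpha * s * beta)
        self.records[h_m] = (T, time.time())           # store T : G_T(H(M)) : timestamp
        return signature * h_m % P_MOD                 # C_0' sent back to the data owner


at = Authority(alpha=5, beta=7)
h_m = 123456789                                        # stand-in for G_T(H(M))
first = at.register(h_m * pairing(5, 11) % P_MOD, g_s=11, T=1001)
second = at.register(h_m * pairing(5, 13) % P_MOD, g_s=13, T=2002)
print(first is not None, second is None)               # True True
```

The second request blinds the same data with a different s and a different T, yet $AT$ still recognizes it, because the blinding factor is removed before the lookup.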

$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{4}$:
$Do$ first calculates
after receiving
${C}_{0}^{{}^{\prime}}$, afterwards, the ciphertext is assigned the value
and uploads
$C{T}_{A,T}$ to the cloud.
Note: A notable characteristic of this scheme is that a given $data$ item can have multiple owners, which is made feasible by the additive homomorphism of Paillier encryption. For example, when the data are jointly owned by two parties, $D{o}_{1}$ and $D{o}_{2}$, each of them hashes its unique identifier and generates a separate Paillier ciphertext using its own random number: ${T}_{1}={g}_{1}^{{t}_{i{d}_{1}}}{r}_{1}^{n}\bmod {n}^{2}$ and ${T}_{2}={g}_{1}^{{t}_{i{d}_{2}}}{r}_{2}^{n}\bmod {n}^{2}$. They then set $T={T}_{1}\cdot {T}_{2}$.
Data confirmation is thus realized during the encryption stage. $Do$ hashes the plaintext, maps it to ${G}_{T}$, blinds it ($\overline{{C}_{0}}={G}_{T}\left(H\left(M\right)\right)\cdot e{(g,g)}^{\alpha s}$), and sends it to $AT$; $AT$ only needs to perform division and signature operations on $\overline{{C}_{0}}$ and store the encrypted user ID T locally as a certificate. Therefore, $AT$ never touches the plaintext.
 3. $\mathbf{KeyGen}\phantom{\rule{3.33333pt}{0ex}}(MSK,S,PK)\to S{k}_{Du}:$ The decryption key in this scheme is generated collaboratively by $Du$ and $AT$. $Du$ first chooses a random number $t\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$ as a parameter. Next, $Du$ forwards its attribute set S and the value ${g}^{t}$ to $AT$ as part of its key generation request. Then, $AT$ selects a random number $h\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$ and random elements ${R}_{0},{R}_{1},{R}_{2},{R}_{3},{R}_{i}\in {G}_{{p}_{3}}$ to generate part of the decryption key
Finally, $AT$ transmits the partial decryption key $S{k}_{pri}$ and the values $\{{R}_{0},{R}_{1},{R}_{2},{R}_{3},{R}_{i}\}$ to $Du$, and $Du$ generates the decryption key locally using these values:
 4. $\mathbf{Decrypt}\phantom{\rule{3.33333pt}{0ex}}(S{k}_{Du},C{T}_{A,T})\to M:$ The decryption key allows $Du$ to decrypt the ciphertext and obtain access to the data. The decryption algorithm searches for a vector $\mathbf{w}$ such that ${A}_{i}^{T}\cdot \mathbf{w}={(1,0,\dots ,0)}^{T}(i\in S)$. If the attributes of $Du$ do not satisfy the access policy, no such $\mathbf{w}$ can be found; instead, there exists a vector $\left\{{\kappa}_{i}\right\}$ such that ${A}_{i}^{T}\cdot \left\{{\kappa}_{i}\right\}={(0,0,\dots ,0)}^{T}(i\in S)$ and ${\kappa}_{1}=1$. When the policy is satisfied, the plaintext M is obtained by the following formula:
 5. $\mathbf{Audit}\phantom{\rule{3.33333pt}{0ex}}(PK,M,{M}^{*},P{k}_{AT},S{k}_{AT},MSK)\to {t}_{id}$: If the data owner $Do$ suspects that the data have been infringed upon or abused, $Do$ can prove ownership by interacting with the public auditor $PA$ and the authority $AT$. This interaction serves two purposes:
 (a) To demonstrate that $Do$ was the first to upload the data;
 (b) To prove that the ciphertext corresponding to the data was indeed generated by $Do$.
$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{1}$: To prove that $Do$ is the first to upload the data, the source data M and $C{T}_{A,T}$ are sent by $Do$ to the public auditor $PA$. $PA$ obtains the hash value of the source data M by applying the hash function $H\left(x\right)$ and sends it to the authority $AT$ to identify the owner of the plaintext.
$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{2}$: First, PA carries out a comparison:
If they are equal, $PA$ enters the ${t}_{id}$ extraction process using n and $\lambda $: it defines $L\left(x\right)=(x-1)/n$, calculates $\mu ={\left(L\left({g}_{1}^{\lambda}\bmod {n}^{2}\right)\right)}^{-1}\bmod n$, and then computes $L\left({T}^{\prime \lambda}\bmod {n}^{2}\right)\times \mu \bmod n$ to extract $Do$’s ${t}_{id}$.
$\mathbf{Step}\phantom{\rule{4pt}{0ex}}\mathbf{3}$: $PA$ verifies whether hashing the claimed identifier reproduces the extracted value, that is, whether $H\left(identity\right)\to {\mathbb{Z}}_{N}\stackrel{?}{=}{t}_{id}$ holds.
If the equation is satisfied, we can conclude that the data belong to the user $Do$. Taking the unique identifier “address” used during encryption as an example, the decryption procedure (Algorithm 3) and the decrypted ${t}_{id}$ are as follows:
Algorithm 3 Decrypt ${t}_{id}$
1: $\lambda \leftarrow (p-1)\ast (q-1)$;
2: $\mu \leftarrow {\left(L\left({g}_{1}^{\lambda}\bmod {n}^{2}\right)\right)}^{-1}\bmod n$;
3: ${t}_{id}^{\prime}\leftarrow L\left({T}^{\prime \lambda}\bmod {n}^{2}\right)\times \mu \bmod n$ (Paillier decryption of the ciphertext ${T}^{\prime}$);
Output:
${t}_{id}^{\prime}$ = 79847630022358710946125273965671104052858 065717629025639108307113838327353
If the data are generated by multiple users, then ${t}_{i{d}_{1}}+{t}_{i{d}_{2}}\equiv L\left({g}_{1}^{\lambda ({t}_{i{d}_{1}}+{t}_{i{d}_{2}})}\cdot {({r}_{1}\cdot {r}_{2})}^{\lambda n}\bmod {n}^{2}\right)\cdot \mu \bmod n$, and $PA$ verifies $H\left(identit{y}_{1}\right)\to {\mathbb{Z}}_{N}+H\left(identit{y}_{2}\right)\to {\mathbb{Z}}_{N}\stackrel{?}{=}{t}_{i{d}_{1}}+{t}_{i{d}_{2}}$.
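The extraction of ${t}_{id}$ and the joint-ownership check can be sketched in Python. As before, this is a toy under stated assumptions rather than the authors' implementation: the primes are tiny, ${g}_{1}=n+1$, identifiers are small integers standing in for hashed values, and $\lambda =(p-1)(q-1)$ as in Algorithm 3. The function names are hypothetical.

```python
import math
import secrets

# Toy Paillier key (assumption: tiny primes, g1 = n + 1).
p, q = 104729, 1299709
n = p * q
n_sq = n * n
g1 = n + 1
lam = (p - 1) * (q - 1)                 # lambda as in Algorithm 3


def L(x: int) -> int:
    """L(x) = (x - 1) / n, defined for x congruent to 1 modulo n."""
    return (x - 1) // n


mu = pow(L(pow(g1, lam, n_sq)), -1, n)  # mu = L(g1^lambda mod n^2)^(-1) mod n


def encrypt(m: int) -> int:
    """T = g1^m * r^n mod n^2 with a fresh random r coprime to n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return pow(g1, m, n_sq) * pow(r, n, n_sq) % n_sq


def decrypt(T: int) -> int:
    """t_id = L(T^lambda mod n^2) * mu mod n."""
    return L(pow(T, lam, n_sq)) * mu % n


# Single owner: the auditor recovers exactly the embedded identifier.
t_id = 424242
assert decrypt(encrypt(t_id)) == t_id

# Two owners: multiplying ciphertexts adds the identifiers under encryption.
t_id1, t_id2 = 123456, 654321
T = encrypt(t_id1) * encrypt(t_id2) % n_sq
assert decrypt(T) == (t_id1 + t_id2) % n
print("joint identifier sum recovered:", decrypt(T))
```

The recovered value is the sum of the identifiers modulo n, so in this toy the check only identifies the pair of owners jointly; it does not reveal the individual identifiers from T alone.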
During the entire audit phase, $PA$ needs to do two things:
 (1) Compare whether the leaked plaintext is the same as that owned by $Do$, and check whether the ciphertext was generated by $Do$ through the formula ${C}_{0}\stackrel{?}{=}M\cdot e({g}^{\alpha \beta},{C}_{1})$;
 (2) Obtain the user credential T corresponding to the plaintext from $AT$’s database and identify the owner of the plaintext through Paillier decryption.
3.5. IND-CPA Security
Suppose there is an adversary $\mathcal{A}$ who can eavesdrop on the channel between the user and the data owner and can obtain the ciphertext corresponding to a plaintext within a limited time, with the goal of cracking the key and gaining unlimited access to the ciphertext. The $IND\text{-}CPA$ security of our scheme is analogous to the $IND\text{-}CPA$ security of the CP-ABE scheme proposed by Allison and his colleagues in [33], and we only give the proof of the lemma based on Assumption 1 here. To begin with, we create a semi-functional ciphertext (SFC) and a semi-functional key (SFK) in the following format:
SFC: We define ${g}_{2}$ as the generator of the group ${G}_{{p}_{2}}$. We randomly select $f\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$, select ${z}_{i}\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$ for each attribute, select ${\gamma}_{i}\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$ for each row of the shared generator matrix, and select two random vectors $\mathbf{u},\mathbf{w}\in {\mathbb{Z}}_{N}^{n}$. SFC is defined as follows: ${C}_{1}={g}^{sT}\cdot {g}_{2}^{f},{C}_{2}={g}^{\beta sT}\cdot {g}_{2}^{\beta f},{C}_{i,1}={g}^{aT{A}_{i}\cdot \mathit{v}}{\mathcal{U}}_{\rho \left(i\right)}^{{r}_{i}}{g}_{2}^{{A}_{i}\cdot \mathbf{u}+{\gamma}_{i}{z}_{\rho \left(i\right)}},{C}_{i,2}={g}^{{r}_{i}}{g}_{2}^{{\gamma}_{i}},{C}_{i,3}={g}^{\beta T{A}_{i}\cdot \mathit{v}}\cdot {g}_{2}^{{A}_{i}\cdot \mathbf{w}}$
SFK: We can create two types of SFK by randomly selecting the parameters $d,h,c\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N},{R}_{0}^{\prime},{R}_{1}^{\prime},{R}_{2}^{\prime},{R}_{3}^{\prime},{R}_{i}^{\prime}\in {G}_{{p}_{3}}$ as follows:
$\mathbf{Type}\phantom{\rule{4pt}{0ex}}\mathbf{1}:D={g}^{\beta \alpha}{g}_{2}^{d}{R}_{0}^{\prime},{D}_{1}={g}^{aht}{g}_{2}^{d}{R}_{1}^{\prime},{D}_{2}={g}^{\beta ht}{g}_{2}^{d}{R}_{2}^{\prime},{D}_{3}={g}^{ht}{g}_{2}^{c}{R}_{3}^{{}^{\prime}},\{{D}_{i}={\mathcal{U}}_{i}^{ht}{R}_{i}^{{}^{\prime}}{g}_{2}^{c{z}_{i}}\}$
$\mathbf{Type}\phantom{\rule{4pt}{0ex}}\mathbf{2}:D={g}^{\beta \alpha}{g}_{2}^{d}{R}_{0}^{\prime},{D}_{1}={g}^{aht}{g}_{2}^{d}{R}_{1}^{\prime},{D}_{2}={g}^{\beta ht}{g}_{2}^{d}{R}_{2}^{\prime},{D}_{3}={g}^{ht}{R}_{3}^{{}^{\prime}},\{{D}_{i}={\mathcal{U}}_{i}^{ht}{R}_{i}^{{}^{\prime}}\}$(let $c=0$)
Upon decrypting an SFC with an SFK, an additional term is introduced into the plaintext due to the semifunctional properties of the key and ciphertext:
where
${u}_{1}$ represents the first item of the vector
$\mathbf{u}$. We will now introduce a series of games to analyze the security of our proposed scheme:
$Gam{e}_{Real}$: In this game, both the ciphertext and the decryption key are valid, and the security of the scheme is not compromised.
$Gam{e}_{0}$: We define a game where all keys are normal, but the challenge ciphertext is SFC. Let q be the number of times the attacker requests the key. For $k\in [1,q]$, we define:
$Gam{e}_{k,1}$: The challenge ciphertext is SFC, the first $k-1$ keys requested by the adversary are SFK of Type 2, the kth key is SFK of Type 1, and the rest are normal.
$Gam{e}_{k,2}$: The challenge ciphertext is SFC, the first k keys are SFK of Type 2, and the remaining keys are normal.
The last game is $Gam{e}_{final}$: all the keys are Type 2 SFK, and the ciphertext is a semi-functional encryption of a random message rather than of either of the two messages supplied by the adversary.
Lemma 1. Assume the existence of a polynomial-time algorithm $\mathcal{A}$ such that $Gam{e}_{Real}Ad{v}_{A}-Gam{e}_{0}Ad{v}_{A}=\epsilon $, where ϵ is a non-negligible value. Then we can construct a polynomial-time algorithm $\mathcal{B}$ that breaks Assumption 1 with advantage ϵ.
Proof. $\mathcal{B}$ is given $g,{g}_{3},X$ and simulates $Gam{e}_{Real}$ or $Gam{e}_{0}$ with adversary $\mathcal{A}$. $\mathcal{B}$ randomly selects $\alpha ,a,\beta \in {\mathbb{Z}}_{N}$ and a random exponent ${u}_{i}\in {\mathbb{Z}}_{N}$ for each attribute in the system. It then randomly selects two large safe primes $p,q$ such that $gcd(pq,(p-1)(q-1))=1$, calculates $n=pq$ and $\lambda =lcm(p-1,q-1)$, randomly selects a positive integer ${g}_{1}$ less than ${n}^{2}$, and computes $\mu ={\left(L\left({g}_{1}^{\lambda}\bmod {n}^{2}\right)\right)}^{-1}\bmod n$. The public parameter
and public key
are sent to
$\mathcal{A}$. □
Next, $\mathcal{A}$ sends two equal-length messages ${M}_{0},{M}_{1}$, the value T generated from its own unique identity, and a shared generator matrix $({A}^{*},\rho )$ to $\mathcal{B}$. $\mathcal{B}$ implicitly sets ${g}^{s}$ to be the ${G}_{{p}_{1}}$ part of X (where X is an element of ${G}_{{p}_{1}}$ or possibly of ${G}_{{p}_{1}{p}_{2}}$). Then, $\mathcal{B}$ flips a coin to pick $\sigma \in \{0,1\}$ and sets:
$\mathcal{B}$ then randomly selects ${y}_{2}^{\prime},\dots ,{y}_{n}^{\prime},{r}_{i}^{\prime}\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{N}$, sets the vector ${\mathit{v}}^{\prime}=(1,{y}_{2}^{\prime},\dots ,{y}_{n}^{\prime})$, and sets ${C}_{i,1}={X}^{aT{A}_{i}\cdot {\mathit{v}}^{\prime}}{X}^{{r}_{i}^{\prime}{u}_{\rho \left(i\right)}},{C}_{i,2}={X}^{{r}_{i}^{\prime}},{C}_{i,3}={X}^{\beta T{A}_{i}\cdot {\mathit{v}}^{\prime}}$. This implicitly sets $\mathit{v}$ to $(s,s{y}_{2}^{\prime},\dots ,s{y}_{n}^{\prime})$ and ${r}_{i}=s{r}_{i}^{\prime}$, so when $X\in {G}_{{p}_{1}}$, the result is a correctly distributed normal ciphertext.
If $X\in {G}_{{p}_{1}{p}_{2}}$, let ${g}_{2}^{{f}^{\prime}}$ be the ${G}_{{p}_{2}}$ part of X ($X={g}^{s}{g}_{2}^{{f}^{\prime}}$), so ${C}_{1}={g}^{sT}{g}_{2}^{{f}^{\prime}T},{C}_{2}={g}^{s\beta T}{g}_{2}^{\beta {f}^{\prime}T},{C}_{i,1}={g}^{saT{A}_{i}\cdot {\mathit{v}}^{\prime}}{g}^{s{r}_{i}^{\prime}{u}_{\rho \left(i\right)}}\cdot {g}_{2}^{{f}^{\prime}aT{A}_{i}\cdot {\mathit{v}}^{\prime}+{f}^{\prime}{r}_{i}^{\prime}{u}_{\rho \left(i\right)}},{C}_{i,2}={g}^{s{r}_{i}^{\prime}}{g}_{2}^{{f}^{\prime}{r}_{i}^{\prime}},{C}_{i,3}={g}^{s\beta T{A}_{i}\cdot {\mathit{v}}^{\prime}}{g}_{2}^{{f}^{\prime}\beta T{A}_{i}\cdot {\mathit{v}}^{\prime}}$. Let
this is a correctly distributed semi-functional ciphertext. We simulated and ran local experiments to test our scheme against a chosen-plaintext attack, and in both the $X\in {G}_{{p}_{1}}$ and $X\in {G}_{{p}_{1}{p}_{2}}$ scenarios, the attacker $\mathcal{A}$ was unable to decrypt the data.
Figure 4 illustrates the process of a chosen-plaintext attack and the experimental results obtained by the attacker. Therefore, if $\mathcal{A}$ can distinguish the two cases with advantage $\epsilon$, $\mathcal{B}$ can break Assumption 1 with the same advantage $\epsilon$.
The proofs under Assumptions 2 and 3 follow from constructions similar to the one above; see Allison’s scheme [33] for details.
3.6. Ciphertext Non-Replicability
Suppose there is an adversary $\mathcal{A}$ who can eavesdrop on the channel between the data owner and $AT$. The purpose of $\mathcal{A}$ in this game is to obtain the signature of $AT$ and embed its own ${T}^{\prime}$ in the ciphertext and replace the identity of the data owner in the ciphertext data with its own identity, so as to obtain the ownership of the data. We assume that the adversary will not send his identity information to $AT$ without being able to copy the ciphertext (even if sent, it does not pass authentication).
Lemma 2. Assume that there is a polynomial-time algorithm $\mathcal{A}$ that can break the CDH Assumption with advantage ϵ; then we can construct a polynomial-time algorithm $\mathcal{B}$ that falsifies ciphertext with advantage ϵ.
Proof. $\mathcal{B}$ first runs the $Setup({1}^{\phi},U)\to (PK,MSK,P{k}_{AT},S{k}_{AT})$ algorithm, and $PK,P{k}_{AT}$ are sent to the adversary $\mathcal{A}$.
Ciphertext generation: $\mathcal{B}$ first interacts with $AT$ to generate the ciphertext $C{T}_{A,T}$, and sends $C{T}_{A,T}$ to $\mathcal{A}$. The adversary has two ways to generate its own ciphertext:
Case 1: After the adversary (a dishonest user) decrypts the ciphertext and obtains the plaintext M, it regenerates a ciphertext $C{T}^{\prime}$ by itself. This approach does not work: the decryption keys generated for the original ciphertext cannot decrypt $C{T}^{\prime}$, and ${G}_{T}\left(H\left(M\right)\right)$ has already been stored locally by the authority.
Case 2: The adversary tries to obtain the signature and generate the ciphertext by eavesdropping on the channel between the data owner and $AT$ and sending information that is beneficial to $\mathcal{A}$ to $AT$. $\mathcal{B}$ randomly selects $s\stackrel{R}{\longleftarrow}{\mathbb{Z}}_{n}$, hashes the plaintext M and maps it to ${G}_{T}$, computes $\overline{{C}_{0}}={G}_{T}\left(H\left(M\right)\right)\cdot e{(g,g)}^{\alpha s}$, and sends $\overline{{C}_{0}},{g}^{s}$ to $\mathcal{A}$.
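The blinding used in Case 2 can be illustrated with a toy stand-in. The Java sketch below is only an assumption-laden illustration: it replaces the pairing target group ${G}_{T}$ with a small subgroup of integers modulo a prime, lets `y` play the role of $e{(g,g)}^{\alpha}$, and uses insecure toy parameters; the class, method, and variable names are hypothetical and do not come from the scheme's code.

```java
import java.math.BigInteger;

/** Toy stand-in for the blinding in Case 2; all names and parameters are illustrative. */
final class BlindingSketch {
    // Small safe prime p = 2q + 1 with q = 233; g = 4 generates the subgroup of order q.
    static final BigInteger P = BigInteger.valueOf(467);
    static final BigInteger G = BigInteger.valueOf(4);

    /** C0 = h * y^s mod p, where y = g^alpha stands in for e(g,g)^alpha. */
    static BigInteger blind(BigInteger h, BigInteger y, BigInteger s) {
        return h.multiply(y.modPow(s, P)).mod(P);
    }

    /** Removes the blinding factor using alpha and g^s, since (g^s)^alpha = y^s. */
    static BigInteger unblind(BigInteger c0, BigInteger gs, BigInteger alpha) {
        BigInteger factor = gs.modPow(alpha, P);
        return c0.multiply(factor.modInverse(P)).mod(P);
    }

    public static void main(String[] args) {
        BigInteger alpha = BigInteger.valueOf(57);          // master secret (toy value)
        BigInteger s = BigInteger.valueOf(101);             // encryption randomness (toy value)
        BigInteger y = G.modPow(alpha, P);                  // public, stands in for e(g,g)^alpha
        BigInteger gs = G.modPow(s, P);                     // value an eavesdropper can observe
        BigInteger h = G.modPow(BigInteger.valueOf(5), P);  // stands in for G_T(H(M))

        BigInteger c0 = blind(h, y, s);
        System.out.println("recovered with alpha: " + unblind(c0, gs, alpha).equals(h));

        // A different exponent s' produces a different blinding factor.
        BigInteger sPrime = BigInteger.valueOf(102);
        System.out.println("same factor for s': " + y.modPow(sPrime, P).equals(y.modPow(s, P)));
    }
}
```

Without $\alpha$ or $s$, reproducing the factor ${y}^{s}$ from $y$ and ${g}^{s}$ alone is exactly the Diffie-Hellman computation the argument relies on; the toy group here is far too small to make that hard.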
The adversary $\mathcal{A}$ attempts to generate a random number ${s}^{\prime}$ such that $e{(g,g)}^{\alpha {s}^{\prime}}=e{(g,g)}^{\alpha s}$, so that he can send $\overline{{C}_{0}^{\prime}}={G}_{T}\left(H\left(M\right)\right)\cdot e{(g,g)}^{\alpha {s}^{\prime}},{g}^{{s}^{\prime}}$ and his own identity ${T}^{\prime}$ to obtain the signature of $AT$, and then generate and publish a ciphertext with the $Encrypt(P{k}_{AT},\langle A,\rho \rangle ,PK,M)\to C{T}_{A,T}$ algorithm. By eavesdropping on the channel, $\mathcal{A}$ can obtain ${g}^{s}$ and, by calculating $e(g,{g}^{s})$, get $e{(g,g)}^{s}$. That is, the adversary $\mathcal{A}$ knows $e{(g,g)}^{\alpha}$ and $e{(g,g)}^{s}$ and wants to calculate $e{(g,g)}^{\alpha s}$. This is a $CDH$ problem, which no polynomial-time algorithm can solve under the CDH assumption, so $\mathcal{A}$ cannot compute $e{(g,g)}^{\alpha s}$ in polynomial time.
4. Experiments and Analysis
In this section, we analyze the efficiency of our scheme and compare it with the fully secure CP-ABE scheme proposed by Allison et al. [33] in terms of setup, key generation, encryption, decryption, and memory consumption. The experiments were run on a Windows 10 platform with 16 GB of memory and an AMD Ryzen 5 R2600 six-core 3.40 GHz processor. We use the Java JPBC library to build the environment and generate a composite-order group with a size of 512 bits and an integer cyclic group with a size of 258 bits through an 83-bit elliptic curve. The data were obtained by running the experiments in a locally set up environment and were saved in a file in “.xlsx” format. The figures were generated from these data using MATLAB.
Figure 5 shows the setup comparison between our scheme and Allison et al.’s scheme. Since the complexity of the setup is $O\left(N\right)$ ($N$ represents the number of attributes in the attribute universe), the running times are almost the same, apart from measurement noise.
The master key is in the form of a key–value pair:
MSK: alpha:210810353108659863024409106247517618452769941479846636980134442864523125 
95818033429445987282464226795828802774079330 

g3:507123706182628610741111547764849270218939290666858116887669480037266931236 
6558956079248951460266712797586740186721012,3848333923503209101783513197106326 
835379604308979589246948604032780415086659341351001892795149863403912326337017 
537761,0 

beta:3971351897302668818568847385425920497495147741445579066512932780617650627 
246730432329467425793820791111050407589402 
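One way to read such a key–value listing back into memory is sketched below in Java. This is a hypothetical helper, not part of the scheme's code: it assumes each entry is written as `name:value`, that a long value may wrap across several lines, and that blank lines may separate entries; the class and method names are invented for illustration, and a leading label such as `MSK:` is not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for "name:value" listings whose values may wrap across lines. */
final class KeyValueSketch {

    /** Parses the text into an insertion-ordered map from entry name to its joined value. */
    static Map<String, String> parse(String text) {
        Map<String, String> out = new LinkedHashMap<>();
        String current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // blank lines only separate entries
            }
            int idx = line.indexOf(':');
            if (idx > 0) {
                // A new entry starts: everything before ':' is the name.
                current = line.substring(0, idx).trim();
                out.put(current, line.substring(idx + 1).trim());
            } else if (current != null) {
                // A wrapped line: append it to the value of the current entry.
                out.merge(current, line, String::concat);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "alpha:2108\n1035\n\nbeta:3971\n";
        Map<String, String> msk = parse(sample);
        System.out.println(msk.keySet());
    }
}
```

Values are kept as strings so that multi-coordinate entries (such as a comma-separated group element) can be split later by whatever convention the stored format actually uses.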
Figure 6 shows a comparison of the key generation time between our scheme and the scheme proposed by Allison et al. Our scheme involves interactive key generation, resulting in higher overhead compared to Allison’s scheme when the attribute space is small. However, as the attribute space grows, the performance gap between the two schemes decreases.
Figure 7 illustrates a comparison of the encryption time between our scheme and the scheme proposed by Allison et al. The ciphertext complexity of Allison et al.’s scheme is
$O(C+N)$, and the complexity of our scheme is also
$O(C+N)$, where
C represents the length of the ciphertext and
N represents the number of attributes in the attribute space. In fact, Allison et al.’s scheme involves
$C+2N$ terms, whereas ours involves
$C+3N$ terms, which results in a small difference in overhead.
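To make the constant-factor gap concrete, here is a small worked count. The values $C=5$ and $N=10$ are illustrative assumptions, not measurements from the experiments:

```latex
\[
(C+2N)\big|_{C=5,\,N=10} = 25, \qquad
(C+3N)\big|_{C=5,\,N=10} = 35, \qquad
(C+3N)-(C+2N) = N = 10 .
\]
```

Both counts grow linearly in $N$, and for fixed $C$ their ratio approaches $3/2$ as $N$ grows, which is consistent with the shared $O(C+N)$ bound.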
The generated ciphertext is also stored in the form of key–value pairs. Taking the plaintext “hello” as an example, the encrypted ${C}_{0}$ has the following format:
CT_AT: 
C0_0:{x=18512619911661450195327750867443546794232624419671207238173221146535506 
95552872207298169171836981828215064641158984234,y=70592147653957222146120442588 
2624722810944092606551932879619145643069907176528665887436471753185725971617607 
9123508797} 

C0_1:{x=10305743198129175737055428555819461127421625147813759967731163938926816 
650740214215836427503578249751830703585598710737,y=9864469019751328170985532932 
3260154136345304295214862269824474212174107344885351182673330718902818130451275 
75927761095} 

C0_2:{x=85627632985010802758634299337508008509965737271381026168608890332725357 
51769387446114503063186080075196481905182611275,y=69157043429120428487195645886 
4787393452354723338419697486692137357841812076088296305720924512144895097136125 
4706491542} 

C0_3:{x=85627632985010802758634299337508008509965737271381026168608890332725357 
51769387446114503063186080075196481905182611275,y=69157043429120428487195645886 
4787393452354723338419697486692137357841812076088296305720924512144895097136125 
4706491542} 

C0_4:{x=10828209153804955834077646467569440299821102129146337294704720899926979 
6582045437576244731444812151580842960742884772,y=411045008855643689234607591514 
8105748998457729928230076680665408791706458037634503664240890763024397642409757 
902239244} 
Figure 8 presents a comparison of the decryption time between our scheme and the scheme proposed by Allison et al. The time overhead is dominated by computing the secret $s$, so, apart from measurement noise, there is no difference in overhead.
The plaintext obtained by decrypting the above ciphertext is as follows:
ourScheme.Decrypt("file/Key","file/CT_AT");
The plaintext after decryption is:hello 
Finally, Figure 9 shows a comparison of the memory overhead between our scheme and the scheme proposed by Allison et al. Because our scheme involves additional interactive functions, such as $sendToAT\left(\right)$ and $KeyGenAT\left(\right)$, as well as the $id$ extraction function $extractID\left(\right)$, it incurs a higher memory overhead than Allison et al.’s scheme.