Privacy-Preserving Public Route Planning Based on Passenger Capacity

Zhang, Xin; Zhang, Hua; Li, Kaixuan; Wen, Qiaoyan

doi:10.3390/math11061546

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Mathematics 2023, 11(6), 1546; https://doi.org/10.3390/math11061546

Submission received: 8 March 2023 / Revised: 18 March 2023 / Accepted: 20 March 2023 / Published: 22 March 2023

(This article belongs to the Special Issue Advanced Mathematical Methods in Intelligent Multimedia: Security and Applications)

Abstract:

Precise route planning requires large amounts of trajectory data recorded by multimedia devices. Because these data reveal each user’s locations, they are stored as cipher text, so the ability to plan routes on an encrypted trajectory database is urgently needed. In this paper, in order to plan a public route while protecting privacy, we design a hybrid encrypted random Bloom filter (RBF) tree on encrypted databases, named the encrypted random Bloom filter (eRBF) tree, which supports pruning and a secure, fast k nearest neighbor search. Based on the eRBF tree and the secure computation of distance, we first propose a reverse k nearest neighbor trajectory search on encrypted databases (RkNNToE). It returns all transitions that take the query trajectory as one of their k nearest neighbor trajectories on the encrypted database. The results can serve as an indicator of a new route’s capacity in route planning. The security of the trajectory and query is proven via the simulation proof technique. When the numbers of points in the trajectory database and the transition database are 1174 and 18,670, respectively, the time cost of an R2NNToE query is about 1200 s.

Keywords:

public route planning; reverse trajectory query; encrypted trajectory database

MSC:

68P27

1. Introduction

Public route planning is used to find a new route that can cover a large area and carry a greater amount of passengers. The operation of a new public route can ease traffic congestion as well as reduce fuel consumption and pollution. Public route planning requires a lot of trajectory data recorded in various GPS-equipped multimedia devices and online location-based services (Bikely, Didi, Twitter, and Facebook) [1]. Since trajectory data include locations, data owners encrypt the trajectory data to preserve their locations’ privacy. Public route planning on an encrypted database is necessary.

In a typical scenario of planning a bus route, a passenger’s transition includes two points: the source and the destination. The passenger prefers to take the bus, which has stations close to the two points. If a bus company wants to develop a new route (trajectory) that provides services to more passengers, it is necessary to predict the passenger flow of the new route. Note that passengers do not want to leak their location privacy. The new route should not be published until it is applied. Basically, it is a reverse k nearest neighbor trajectory (RkNNT) search on an encrypted trajectory database. The transition data and trajectory data are collected by online location-based service providers; they outsource their encrypted data to the cloud server to release their storage space. In a secure RkNNT search on an encrypted trajectory database, the operations of computing and comparing the distances between different trajectories are frequent, which leads to repeated access to the online location-based service providers. A proxy cloud can represent all the online location-based service providers to cooperate with the server cloud, which can reduce the online computational burden of the online location-based service providers. The details of the two-cloud model are introduced in Section 3.3.

Various kinds of queries on encrypted points have been proposed, such as k nearest neighbor (kNN) point queries, reverse kNN point queries, range queries, skyline queries and linear range queries. However, none of these schemes can be applied to an encrypted trajectory query, because the similarity measure of trajectories is based on a more complex aggregation of distances and order between trajectory points, such as dynamic time warping [2], longest common subsequence [3], and edit distance on a real sequence [4]. Some schemes also study the reverse kNN trajectory query [5,6]. However, they only return single points that take the query trajectory as one of their kNN trajectories. In addition, the locations are not protected, which leaks the locations of users and of the points in trajectories. These problems motivate us to investigate the RkNNT search on encrypted databases.

There are two challenges in searching the RkNNT on encrypted databases. One is to reduce the search space, since computation on large encrypted data is time-consuming. The other is to search a given space without leaking location privacy. To overcome these two challenges, our main contributions are as follows:

In this paper, we first design a hybrid tree, eRBFtree. It divides the search space into subspaces according to the distribution of trajectory points, and each subspace is further divided according to the distribution of transition points. The eRBFtree supports spatial pruning and fast kNN search on ciphertext.
We propose a reverse kNN trajectory search on the encrypted database, RkNNToE. We use the eRBFtree to prune the space of encrypted transitions. Then, we give a distance list (DList), which helps to refine the transitions and reduce the number of kNN searches. To ensure the correctness of the results, we apply the fast kNN search to every transition in the result.
Theoretical analysis proves that clouds and users cannot know the locations of data and the distance between two locations at the same time. The experiment results confirm that our scheme is practicable in the GeoLife project in Beijing and the bus lines dataset in Beijing.

2. Related Work

In this section, we present an overview of the existing protocols in terms of trajectory search on plain text [7] and secure RkNN search [8], which are related to our work in this paper. The comparison between related schemes and RkNNToE is listed in Table 1. Note that a trajectory can degrade into a point, so the search method in RkNNT can deal with the RkNNP search, and a two-type database can degrade into a one-type database.

RkNNT Search. In [12,13], an RkNN points search was studied, which is the foundation of RkNNT search. Refs. [5,6] investigated the problem to find the single points—that is, the kNN points—for the query trajectory. In 2018, Wang et al. [9] proposed an RkNN trajectory search, which studies transitions with multiple points. It does not include any semantic information [10]. In [14,15], the reverse spatial–keyword nearest neighbor queries were studied. Pan et al. [10] introduced the geo-textual object sequences to achieve an RkNN semantic trajectories search. None of the above schemes focus on the privacy of both the query and data.

Privacy-Preserving RkNN Search. In [16], private information retrieval was used to protect the query and achieve a privacy-preserving RkNN search. It does not protect the database stored in the cloud [17]. Li et al. [17] designed a reference-locked order-preserving based RNN query, which protects the database, but it is only used for two-dimensional data. In [11], an RkNN query over encrypted multi-dimensional data was proposed, which only supports point data and cannot support trajectory data. In 2023, Zheng et al. [8] proposed a privacy-preserving set reverse kNN query, which is not suitable for the two-type trajectory database.

3. Problem Formulation

The notations are shown in Table 2.

3.1. RkNNT Problem and Definitions

The RkNNT on the plain-text database is introduced in [9]. In this paper, we follow their definitions.

Definition 1.

(Transition) A transition of an object $O = (s, d)$ is a pair of points describing the moving object’s source and destination. $D_{o}$ is the set of transitions.

Definition 2.

(Trajectory) A trajectory (route) τ of length l is a sequence of points $\langle p_{1}, p_{2}, \dots, p_{N_{p}} \rangle$, where $N_{p}$ is the number of points in the trajectory, and $D_{\tau}$ is the set of trajectories.

Definition 3.

(Point-to-trajectory distance) The distance between a point $p_{i}$ and a trajectory $\tau_{j}$ is defined as:

$Dist(p_{i}, \tau_{j}) = \max_{p_{j} \in \tau_{j}} dist(p_{i}, p_{j})$ (1)

Definition 4.

(RkNNT) Given a transition set $D_{o}$, a trajectory set $D_{\tau}$ and a query trajectory Q, RkNNT(Q) returns all the transitions in a set $D_{1} \subseteq D_{o}$. For each $O = (s, d) \in D_{1}$, all trajectories $\tau \in D_{\tau}$ that meet $Dist(s, \tau) \leq Dist(s, Q)$ and $Dist(d, \tau) \leq Dist(d, Q)$ are stored in a set $D_{2}$, whose size is less than k.
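Definitions 3 and 4 can be read as a brute-force procedure on plain text. The following is a minimal sketch under stated assumptions, not the scheme proposed in this paper: the function names, the dictionary layout of the databases and the `agg` parameter are hypothetical. The default `agg=max` mirrors Equation (1) as printed; passing `agg=min` yields the distance to the closest trajectory point instead.

```python
from math import dist


def point_to_trajectory(p, traj, agg=max):
    """Aggregate the point-to-point distances between p and the points of traj.

    agg=max follows Equation (1) as printed (assumption: switchable to min).
    """
    return agg(dist(p, q) for q in traj)


def rknnt(query, transitions, trajectories, k, agg=max):
    """Return the identities of transitions for which fewer than k
    trajectories satisfy both inequalities of Definition 4."""
    result = []
    for oid, (s, d) in transitions.items():
        ds_q = point_to_trajectory(s, query, agg)
        dd_q = point_to_trajectory(d, query, agg)
        closer = [
            tid
            for tid, t in trajectories.items()
            if point_to_trajectory(s, t, agg) <= ds_q
            and point_to_trajectory(d, t, agg) <= dd_q
        ]
        if len(closer) < k:
            result.append(oid)
    return result
```

The sketch scans every transition against every trajectory, which is exactly the cost that the pruning and refinement steps of Section 4 are designed to avoid.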

3.2. Basic Security Primitives

3.2.1. CKKS Encryption

CKKS encryption [18] is a fully homomorphic encryption scheme. It can directly encrypt a vector and supports computing the inner product on cipher text. In this paper, $CKKS_{enc}(\cdot)$, $CKKS_{dec}(\cdot)$, $CKKS_{sub}(\cdot, \cdot)$ and $CKKS_{dot}(\cdot, \cdot)$ represent the operations of encryption, decryption, subtraction and inner product, respectively. If $CKKS_{enc}(v_{1}) = c_{1}$, $CKKS_{enc}(v_{2}) = c_{2}$, $v_{1} = (x_{1}, y_{1})$ and $v_{2} = (x_{2}, y_{2})$, then $CKKS_{dec}(CKKS_{dot}(c_{1}, c_{2})) = v_{1} \cdot v_{2}$, $CKKS_{dec}(CKKS_{sub}(c_{1}, c_{2})) = (x_{1} - x_{2}, y_{1} - y_{2})$ and $CKKS_{dec}(CKKS_{dot}(CKKS_{sub}(c_{1}, c_{2}), CKKS_{sub}(c_{1}, c_{2}))) = (x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}$. In this paper, we use the above operations to obtain the squared distance between two points and denote the new operation as $CKKS_{dis^{2}}(c_{1}, c_{2}) = CKKS_{dot}(CKKS_{sub}(c_{1}, c_{2}), CKKS_{sub}(c_{1}, c_{2}))$.
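The composition of subtraction and inner product can be illustrated with plaintext stand-ins. In the sketch below, the functions `ckks_enc`, `ckks_dec`, `ckks_sub`, `ckks_dot` and `ckks_dis2` are hypothetical placeholders that operate on ordinary tuples; they perform no encryption and are not the API of any CKKS library. They only show how the squared distance is assembled from the two primitive operations.

```python
def ckks_enc(v):
    # Stand-in for encryption: keeps the vector in the clear (assumption).
    return tuple(float(x) for x in v)


def ckks_dec(c):
    # Stand-in for decryption: returns the value unchanged (assumption).
    return c


def ckks_sub(c1, c2):
    # Component-wise subtraction, mirroring CKKS_sub.
    return tuple(a - b for a, b in zip(c1, c2))


def ckks_dot(c1, c2):
    # Inner product, mirroring CKKS_dot.
    return sum(a * b for a, b in zip(c1, c2))


def ckks_dis2(c1, c2):
    # Squared distance: dot(sub(c1, c2), sub(c1, c2)).
    diff = ckks_sub(c1, c2)
    return ckks_dot(diff, diff)
```

For example, `ckks_dec(ckks_dis2(ckks_enc((1, 2)), ckks_enc((4, 6))))` evaluates to 25.0. A real CKKS implementation returns approximate values after decryption, so comparisons on decrypted distances would need a tolerance.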

3.2.2. Security kNN

In our algorithm, a secure kNN point search is based on the Fast and Secure kNN query (FSknn [19]). In this paper, we will briefly give the main changes compared to the FSknn.

Index-building. In this phase, a data owner (DO) first randomly generates two vectors $v_{1} \perp v_{2}$. The method of computing every point’s prefix families is the same as in FSknn. However, in this paper, the DO treats all prefixes of all points in the subspace of a node as keywords $kw$ to embed in one RBF, rather than only the prefixes of a single point. As shown in Figure 1, an empty RBF is initialized as a two-row, m-column random binary array in which the two elements in the same column are different. $RBF[i][j]$ is the element in the i-th row and j-th column of the RBF. For every keyword, the DO sets $RBF[H(h(h_{k}(kw)) \oplus r_{k})][h_{k}(kw)] = 1$ and $RBF[1 - H(h(h_{k}(kw)) \oplus r_{k})][h_{k}(kw)] = 0$, where $h(\cdot) = HMAC(\cdot) \bmod 2$, $h_{k}(\cdot) = HMAC(\cdot)$, $H(\cdot) = SHA256(\cdot) \bmod 2$ and k is the number of hash functions for the RBF. Every RBF represents a node rather than a point. An example of inserting a keyword is shown in Figure 1. An RBF tree is generated based on $RBF_{p}[H(h(h_{l}(kw)) \oplus r_{p})][i] = RBF_{l}[H(h(h_{l}(kw)) \oplus r_{l})][i] \lor RBF_{r}[H(h(h_{r}(kw)) \oplus r_{r})][i]$, where $RBF_{p}$ is the parent RBF of the children $RBF_{i}, i \in (1, 4)$. An example of constructing an RBF tree is shown in Figure 2.
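The insertion rule for a single RBF can be sketched as follows. This is an illustrative sketch under several assumptions, not the construction of FSknn or of this paper: the class name, the key handling, reducing the column hash modulo m, and feeding the full HMAC output into the masking step are all choices made for the example. What it keeps from the description above is the two-row layout with complementary columns and the rule that the chosen row is set to 1 while the other row is set to 0.

```python
import hashlib
import hmac
import secrets


class RandomBloomFilter:
    def __init__(self, m, hash_keys, mask_key):
        self.m = m
        self.hash_keys = hash_keys      # one HMAC key per hash function
        self.mask_key = mask_key        # key of the extra keyed hash h(.)
        self.r = secrets.randbits(256)  # per-filter random number r
        top = [secrets.randbelow(2) for _ in range(m)]
        # Two rows; the two cells of every column always differ.
        self.cells = [top, [1 - bit for bit in top]]

    def _column(self, key, keyword):
        digest = hmac.new(key, keyword.encode(), hashlib.sha256).digest()
        return int.from_bytes(digest, "big") % self.m

    def _row(self, column):
        inner = hmac.new(self.mask_key, column.to_bytes(8, "big"),
                         hashlib.sha256).digest()
        masked = int.from_bytes(inner, "big") ^ self.r
        # H(.) = SHA256(.) mod 2, i.e. the parity of the last digest byte.
        return hashlib.sha256(masked.to_bytes(32, "big")).digest()[-1] % 2

    def insert(self, keyword):
        for key in self.hash_keys:
            col = self._column(key, keyword)
            row = self._row(col)
            self.cells[row][col] = 1
            self.cells[1 - row][col] = 0

    def might_contain(self, keyword):
        # True for every inserted keyword; false positives are possible.
        for key in self.hash_keys:
            col = self._column(key, keyword)
            if self.cells[self._row(col)][col] != 1:
                return False
        return True
```

Because the row depends on the column and on the secret r, an observer who sees only the array cannot tell which row of a column carries the membership bit. Membership tests can still return false positives, as in an ordinary Bloom filter.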

Token generation. When a data user (DU) wants to find the kNN in the database, the DU needs to generate k pairs of hashes and locations that serve as the search token, following the same method as in FSknn. However, when the token is generated by the DO, it only needs to generate the token based on one radius $dis_{ref}$ rather than L radii.

Query processing. The method of the cloud is the same as in FSknn. However, the stop condition is to find the kNN trajectories in the kNN point sets of all query points, rather than to find more than k NN points for every query point.

Post-processing. If the kNN point sets of all query points do not contain the kNN trajectories, the DU needs to expand the search radius and repeat the kNN point search, following the same method as in FSknn. However, if the token is generated by the DO, it does not need to expand the search radius or repeat the search for kNN points.

3.3. The System and Threat Models

As shown in Figure 3, there are four entities: a data owner (DO), two clouds ($cloud_{1}$ and $cloud_{2}$) and a data user (DU). The details are described as follows.

The DO is a data owner. The data include the transition data and the trajectory data. The DO wants to upload the encrypted trajectory data and transition data to $cloud_{2}$ to release its storage space.

The DU is a user who wants to run an RkNNT search on the database stored in $cloud_{2}$. The DU sends a query to trigger the service; the query includes the encrypted information of the data user’s trajectory.

$cloud_{1}$ (proxy cloud) is the proxy of the DU and DO. It is responsible for directing $cloud_{2}$ to filter and refine the transitions, and for calling the DO to construct the token for every point in the refined transitions.

$cloud_{2}$ (server cloud) provides storage space for data owners. $cloud_{2}$ is responsible for searching the nearest neighbor points for every point in a query trajectory and in the refined transitions, computing the distance between points, or between points and nodes, with the help of $cloud_{1}$, and sending the encrypted transition points to the DU.

Overview: As shown in Figure 3, the DO sends the index and encrypted points to $cloud_{2}$ and a distance list (DList) to $cloud_{1}$ to complete data outsourcing. If the DU wants to conduct an RkNN trajectory search, it sends the encrypted request to $cloud_{2}$. $cloud_{2}$ cooperates with $cloud_{1}$ to prune and refine the transitions, removing those that cannot be an RkNN transition of the query trajectory. $cloud_{1}$ obtains the refined transitions and sends the DO a request for an NN-point token for every point in the refined transitions. The DO generates the tokens and sends them to $cloud_{2}$. $cloud_{2}$ cooperates with $cloud_{1}$ to find the NN trajectories of all refined transitions based on the NN points. $cloud_{1}$ obtains the transitions that take the query trajectory as one of their kNN trajectories and returns the results to the DU.

3.4. Secure Requirements for MTS

Our scheme assumes that the two clouds follow the search protocol and neither actively attack the system nor collude with each other (honest-but-curious). The DO and DU do not collude with any cloud, but either of them may be a malicious attacker. Note that we mainly focus on the location privacy of points; identities are kept in plain text.

Data Security. The location of every point in the transitions and trajectories should not be learned by either cloud. An attacker cannot learn the points’ locations in the encrypted database.

Index Security. The index is secure, which means that $cloud_{2}$ cannot learn the specific point referenced by any leaf node of the index, and no node reveals the locations of trajectories or transitions.

Query Security. The encrypted requests cannot reveal the location of any point in the query trajectory. Neither cloud can learn the specific locations.

4. The Proposed Scheme

In this section, we first outline the main idea of the search on plain text, where neither the index nor the trajectories and transitions are protected. Then, we propose a secure scheme with an encrypted index and encrypted data, which is processed in a two-cloud model. It satisfies the security requirements and counters the threat model.

4.1. Main Idea of RkNNT Search

The reverse trajectory search is divided into four steps: building a hybrid quad tree, generating a filter set and pruning transitions, refining transitions and returning results. The whole process is shown in Algorithm 1.

Algorithm 1: Reverse Trajectory Search $(Q, DB_{p})$

4.1.1. Building Hybrid Quad Tree

On the plain-text trajectory database, we build a hybrid quad tree based on the quad tree [20] over $DB_{p}$. $DB_{p}$ includes all the points in $DB_{\tau}$ and $DB_{o}$. The space in a node is partitioned into four equal subspaces, each of which is stored in a child node. The partitioning does not stop until there are fewer than n points in the subspace. First, the partitioning is based on $DB_{\tau}$; it does not stop until there are fewer than $N_{\tau}$ points in the subspace. The quad tree in this phase is called the father tree. The trajectory points are stored in the leaf nodes of the father tree. Then, every subspace in a leaf node of the father tree is partitioned further. This partitioning is based on all points in the subspace; it does not stop until there are fewer than $N_{o}$ points in one leaf node. A quad tree that takes a leaf node of the father tree as its root node is called a child tree. Figure 4 shows the structure of a hybrid quad tree. The bold tree is the father tree. The others are child trees. Every non-leaf node of the hybrid quad tree stores the location vectors of its four vertices. Every leaf node stores the identities and location vectors of the points in this leaf node. This is shown in line 1 of Algorithm 1.
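The two-level partitioning can be sketched on plain text as follows. The dictionary-based node layout, the function names and the `max_depth` guard (which stops the recursion when many points coincide) are assumptions made for this example; identities are omitted and only coordinates are stored.

```python
def quadrant(box, p):
    """Index (0..3) of the quadrant of box that contains point p."""
    xmin, ymin, xmax, ymax = box
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    return (p[0] >= xm) + 2 * (p[1] >= ym)


def build_quad_tree(points, box, capacity, max_depth=16):
    """Split box into four equal parts until a node holds fewer than capacity points."""
    if len(points) < capacity or max_depth == 0:
        return {"box": box, "points": list(points), "children": None}
    xmin, ymin, xmax, ymax = box
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    boxes = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
             (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
    buckets = [[], [], [], []]
    for p in points:
        buckets[quadrant(box, p)].append(p)
    children = [build_quad_tree(b, q, capacity, max_depth - 1)
                for b, q in zip(buckets, boxes)]
    return {"box": box, "points": None, "children": children}


def leaves(node):
    """Yield every leaf node below (and including) node."""
    if node["children"] is None:
        yield node
    else:
        for child in node["children"]:
            yield from leaves(child)


def build_hybrid_tree(traj_points, trans_points, box, n_tau, n_o):
    """Father tree over trajectory points, then one child tree per father leaf."""
    father = build_quad_tree(traj_points, box, n_tau)
    extra = {}
    for p in trans_points:  # route each transition point to its father leaf
        node = father
        while node["children"] is not None:
            node = node["children"][quadrant(node["box"], p)]
        extra.setdefault(id(node), []).append(p)
    for leaf in leaves(father):
        members = leaf["points"] + extra.get(id(leaf), [])
        leaf["child_tree"] = build_quad_tree(members, leaf["box"], n_o)
    return father
```

Here `n_tau` and `n_o` play the roles of $N_{\tau}$ and $N_{o}$: the father tree is shaped only by the trajectory points, while each child tree is shaped by all points that fall into the corresponding father leaf.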

4.1.2. Generating Filter Points and Pruning Transitions

If a reverse trajectory search is needed, we find the NN trajectory points for every query trajectory point and construct a table. In Figure 5, the NN points of the query points ($q_{1}, q_{2}, q_{3}$) are in $node(1)$, $node(2)$, $node(3)$ and $node(4)$. Then, we find the trajectories that have more than two points in the table, such as $T1$. All points in these trajectories are called filter points. In Figure 6, the filter points are $T11$ and $T12$. We form a polyline based on the perpendicular bisectors between the points from one trajectory and the query points. The polyline divides the space into two subspaces. If a node is intersected by the polyline, we check whether its child nodes meet the same condition. In Figure 5, $node(1)$ and $node(3)$ are intersected by the polyline, so their child nodes need to be checked. If the node is a leaf node of the child tree, we list the identities of all its transitions and compute the distance between the transition points and the filter points. If there are more than k trajectories closer to the transition than the query trajectory, the transition is pruned. In Figure 6, leaf $node(3, 2, 3)$ is intersected by the polyline, and we compare the distance $dist(O_{2}, Q)$ with the distance $dist(O_{2}, T1)$. Since $dist(O_{2}, Q) > dist(O_{2}, T1)$, transition $O_{2}$ can be pruned.

If a node is in the subspaces of two filter points with the same trajectory identity, the node is closer to that trajectory than to the query trajectory. If more than k polylines make a node meet this condition, more than k trajectories are closer to the node than the query trajectory is. All transitions in such nodes are closer to these trajectories than to the query trajectory, so they can be pruned. In Figure 5, $node(3, 1)$ is in the subspace of $T11$ and $T12$, so all points in $node(3, 1)$ are closer to trajectory $T1$ than to query trajectory Q. Since transition $O_{1} = (s_{1}, d_{1})$ is in $node(3, 1)$, it can be pruned. The remaining transitions are called candidate transitions. The candidate transitions in Figure 5 are $O_{0}$ and $O_{3}$.
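The test of whether a node lies entirely in the subspace of a filter point can be sketched with vertex distances only, which is also the information that the encrypted scheme later computes for every node. The function name and argument layout below are hypothetical. The sketch relies on a geometric fact: the side of a perpendicular bisector that is closer to the filter point is a half-plane, so a rectangle lies inside it exactly when all four vertices do.

```python
from math import dist


def node_in_filter_subspace(vertices, filter_point, query_points):
    """True if every vertex of the node is closer to filter_point than to
    every query point, so the whole node lies on the filter point's side of
    each perpendicular bisector."""
    return all(
        dist(v, filter_point) < dist(v, q)
        for v in vertices
        for q in query_points
    )
```

A node for which the function returns False is not necessarily on the query side: it may be cut by the polyline, in which case its child nodes are examined as described above.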

4.1.3. Refining Transitions and Returning Results

For every candidate transition, we compute the distance between every point in the transition and the query trajectory. We check the nodes in the quad tree by using a circle whose radius is this distance and whose center is the transition point. For the nodes inside the circles of one transition, we record the identities of the trajectories in these nodes. For the nodes that intersect a circle, the child nodes need to be checked further. If the node is a leaf node, we compute the distance between every transition point and every trajectory in the leaf node, and record the identities of the trajectories that are closer to the transition than the query trajectory. If the total number of recorded identities is more than k, the candidate transition is deleted. In Figure 6, the circles of points $s_{3}$ and $d_{3}$ are drawn. The nodes $(2, 1, 4)$ and $(4, 4, 1)$ are in the respective circles. The trajectory $T2$ is closer to the transition $O_{3}$, so $O_{3}$ can be deleted from the candidate transitions. The rest of the candidate transitions are called refined transitions. For every point of the refined transitions, we find the NN points in the quad tree and check whether there are two points of the query trajectory among them. If two points of the query trajectory are in the NN points of one transition, the transition is inserted into the set $RkNNT(Q)$, which is the search result. In Figure 5, the NN point of $s_{0}$ is $q_{2}$ and the NN point of $d_{0}$ is $q_{3}$. The $RkNNT(Q)$ in Figure 5 is $O_{0}$.
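The circle check used during refinement distinguishes nodes that lie inside the circle, nodes that intersect it and nodes that can be skipped. A minimal sketch, assuming axis-aligned nodes given as `(xmin, ymin, xmax, ymax)` and a hypothetical function name:

```python
from math import dist


def classify_node(box, center, radius):
    """Classify an axis-aligned node against a circle.

    Returns "inside" if the whole node is within the circle, "intersects" if
    only part of it is, and "outside" otherwise.
    """
    xmin, ymin, xmax, ymax = box
    vertices = [(xmin, ymin), (xmax, ymin), (xmin, ymax), (xmax, ymax)]
    farthest = max(dist(center, v) for v in vertices)
    # Closest point of the rectangle to the center (clamping per coordinate).
    nearest_point = (min(max(center[0], xmin), xmax),
                     min(max(center[1], ymin), ymax))
    nearest = dist(center, nearest_point)
    if farthest <= radius:
        return "inside"
    if nearest <= radius:
        return "intersects"
    return "outside"
```

Nodes classified as "inside" contribute all their trajectory identities directly, nodes classified as "intersects" are descended into, and "outside" nodes are ignored.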

4.2. Reverse Search on Encrypted Trajectory Database

In this section, the points of transitions and trajectories are encrypted, and the hybrid quad tree is replaced by an encrypted RBF tree (eRBFtree) and the distance list (DList). This section consists of four phases: setup, eRBFtree building, query encryption and search.

4.2.1. Setup

The data owner generates the parameters of CKKS and the RBF tree as shown in Section 3.2. It encrypts all the location vectors of the points in database $DB_{p}$. For a point with identity $ID$ and location $loc$, its item is $\{ID, CKKS_{enc}(loc)\}$. $cloud_{2}$ generates its private key $sk_{2}$ and public key $pk_{2}$, and publishes the public key $pk_{2}$ to the DO and DU.

4.2.2. eRBFtree and DList Building

As shown in Figure 7, building an eRBFtree includes two steps. The first step is building the RBF tree on the database $DB_{\tau}$; the partitioning of space is the same as the partitioning of the father tree in Section 4.1.1. Every leaf node of the RBF tree stores the encrypted items of trajectory points. Every non-leaf node stores an RBF and four encrypted points $\{CKKS_{enc}(V_{1}), \dots, CKKS_{enc}(V_{4})\}$, where $V_{i}, i \in (1, 4)$ are the four vertices of the node. The second step is building the child trees on the database $DB_{o}$. Every leaf node of a child tree stores the encrypted items of transition points. Every non-leaf node stores four encrypted points $\{CKKS_{enc}(V_{1}), \dots, CKKS_{enc}(V_{4})\}$. The DList is a table in which every row records $(ID_{o}, p) : \{dis_{1}, SID_{\tau}^{1}\}, \{dis_{2}, SID_{\tau}^{2}\}, \dots$. The keyword $(ID_{o}, p)$ is the identity of a transition and one point in the transition. The value $dis_{i}$ is the maximum distance from the point p to its nearby nodes. The value $SID_{\tau}^{i}$ is the set of trajectories’ identities in these nodes. The values are listed in increasing order of $dis_{i}$. The eRBFtree and the DList are constructed by the data owner. The DO encrypts the eRBFtree with all the items by the public key $pk_{2}$ and sends the cipher text to $cloud_{2}$. The DO sends the DList and the secret key of CKKS to $cloud_{1}$.
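A DList row and the lookup that the search phase performs on it can be sketched as follows. The representation of a row as a list of `(dis_i, set_of_trajectory_ids)` pairs, the function names, and the assumption that each set already accumulates the identities of all nodes within $dis_{i}$ are choices made for this example. The pruning rule (more than k trajectory identities common to the sets selected for the two transition points) follows the description of the search phase in Section 4.2.4.

```python
from bisect import bisect_right


def nearby_trajectories(row, bound):
    """Return the identity set attached to the largest dis_i <= bound.

    row is sorted by increasing dis_i; an empty set is returned when every
    dis_i exceeds the bound.
    """
    distances = [d for d, _ in row]
    h = bisect_right(distances, bound)
    return row[h - 1][1] if h else set()


def can_prune(row_s, row_d, dist_s_q, dist_d_q, k):
    """True if more than k trajectories appear in both selected sets."""
    common = (nearby_trajectories(row_s, dist_s_q)
              & nearby_trajectories(row_d, dist_d_q))
    return len(common) > k
```

Because the rows are sorted, the lookup is a binary search, and no distance between a transition point and a trajectory has to be computed on cipher text for the transitions pruned at this stage.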

4.2.3. Query Encryption

The query includes tokens and items for the points $q_{j}, j \in (1, N_{p})$ in the query trajectory, where $N_{p}$ is the number of points in the query trajectory. The token $Token(q_{j})$ is for a secure kNN search in the eRBFtree and is constructed as shown in Section 3.2.2. The center point is the point of the query trajectory, and the search radius is set by the DO. The item $\{CKKS_{enc}(q_{j})\}$ is the encrypted location vector of the point $q_{j}$. The query $Q = \{(Token(q_{1}), CKKS_{enc}(q_{1})), \dots, (Token(q_{N_{p}}), CKKS_{enc}(q_{N_{p}}))\}$ is encrypted by the public key of $cloud_{2}$; then, it is sent to $cloud_{2}$ to start a reverse search.

4.2.4. Search

In this phase, $cloud_{2}$ decrypts the query with the private key $sk_{2}$. Then, $cloud_{2}$ uses the tokens $Token(q_{j}), j \in (1, N_{p})$ to search the eRBFtree, obtains the NN trajectory points for every point in the query trajectory, checks the identities of the points and constructs the filter set. An item in the filter set is $\{ID_{\tau}, CKKS_{enc}(loc_{1}), CKKS_{enc}(loc_{2}), \dots, CKKS_{enc}(loc_{N_{p}})\}$, where $CKKS_{enc}(loc_{1}), CKKS_{enc}(loc_{2}), \dots, CKKS_{enc}(loc_{N_{p}})$ are NN points of the $N_{p}$ query points. They have the same trajectory identity $ID_{\tau}$.

$cloud_{2}$ computes the distances between every vertex of the node and the filter points by $DIS_{1} = CKKS_{dis^{2}}(CKKS_{enc}(loc_{i}), CKKS_{enc}(V_{j})), i \in (1, 2), j \in (1, 4)$. Then, $cloud_{2}$ computes the distances between every vertex of the node and the query trajectory by $DIS_{2} = CKKS_{dis^{2}}(CKKS_{enc}(q_{i}), CKKS_{enc}(V_{j})), i \in (1, N_{p}), j \in (1, 4)$. Afterwards, $cloud_{2}$ sends $DIS_{1}, DIS_{2}$ to $cloud_{1}$.

$cloud_{1}$ decrypts them and obtains the distances between every vertex and the filter points, $dist(loc_{i}, V_{j}), i \in (1, 2), j \in (1, 4)$, and the distances between every vertex of the node and the query trajectory, $dist(q_{i}, V_{j}), i \in (1, N_{p}), j \in (1, 4)$. The process runs from the root node to the leaf nodes, using the transition pruning of Section 4.1.2. If a node is filtered, $cloud_{1}$ notifies $cloud_{2}$. Then, $cloud_{2}$ stops computing the distances of its child nodes. If the node is a leaf node, $cloud_{2}$ and $cloud_{1}$ compute the distance between every transition point in the node and the query trajectory, $dist(loc_{i}, q_{j}), i \in (1, 2), j \in (1, N_{p})$.

After filtering the transitions, $cloud_{2}$ cooperates with $cloud_{1}$ to compute the distance between the candidate transitions and the query trajectory. The identities of the transitions and the cipher text of the distances are sent to $cloud_{1}$. Then, $cloud_{1}$ decrypts the cipher text and obtains the distances $d = dist(loc_{i}^{can}, q_{j}), i \in (1, 2), j \in (1, N_{p})$. For every candidate point $loc_{i}^{can}, i \in (1, 2)$ in one transition, $cloud_{1}$ refers to the DList, locates the row of keyword $loc_{i}^{can}$ and finds the maximum value $dis_{h_{i}}$ that meets $dis_{h_{i}} \leq \min \{dist(loc_{i}^{can}, q_{j}), j \in (1, N_{p})\}$. Then, $cloud_{1}$ counts the number of trajectories whose identities appear in both sets $SID_{\tau}^{h_{1}}$ and $SID_{\tau}^{h_{2}}$. If the number of such trajectories is more than k, the transition ($loc_{1}^{can}, loc_{2}^{can}$) can be pruned.

Then, $cloud_{1}$ sends the identities of the refined transitions $S_{ref}$ to $cloud_{2}$. For every refined transition ($s, d$) with identity in $S_{ref}$, $cloud_{1}$ sends the identity and a distance $dis_{ref} = dist(s, Q) + dist(d, Q)$ to the DO. Then, $cloud_{2}$ sends the encrypted transition points $(CKKS_{enc}(loc_{s}), CKKS_{enc}(loc_{d}))$ to the DO. The tokens $Token_{ref}$ for every point in the refined transition are constructed after the DO obtains the request $\{ID_{O}, dis_{ref}\}, ID_{O} \in S_{ref}$ from $cloud_{1}$ and decrypts $(CKKS_{enc}(loc_{s}), CKKS_{enc}(loc_{d}))$. The DO constructs two tokens for every transition, as shown in Section 3.2.2. The center points are the points at locations $loc_{s}$ and $loc_{d}$, respectively. The radius is $dis_{ref}$. The DO sends the set of tokens $Token_{ref}$ to $cloud_{2}$.

$cloud_{2}$ searches the NN points and checks whether there are fewer than k trajectories in the NN points. If so, the transition is one of the reverse k transitions for the query trajectory. Otherwise, $cloud_{2}$ computes the distance between the trajectories and the transitions with the help of $cloud_{1}$. $cloud_{1}$ compares the distance between each trajectory and the transition with the distance between the query trajectory and the transition. If more than k trajectories are closer to a transition, the transition is deleted. The rest of the refined transitions are the results. Then, $cloud_{1}$ returns the identities to the DU and $cloud_{2}$ returns the encrypted locations to the DU.

5. Theoretical Analysis

5.1. Correctness Analysis

In this section, we show that the returned results are all the reverse transitions of the query trajectory. The discussion is divided into three steps.

(1) In the first step, we find the filter set $S_{\tau}$ and prune every $O = (s, d) \in DB_{o}$ for which $\exists D_{\tau} = \{\tau_{1}, \dots, \tau_{k}\}$ such that $dist(s, \tau_{i}) < dist(s, Q)$ and $dist(d, \tau_{i}) < dist(d, Q)$, $i \in (1, \dots, k)$. According to Definition 4, such a transition cannot be in RkNNT(Q). We call a transition that is not in RkNNT(Q) a negative transition and a transition that is in RkNNT(Q) a positive transition. In this step, we only prune a part of the negative transitions. There are still many negative transitions in the set $S_{can}$.

(2) In the second step, we use the candidate set $S_{can}$ and the DList to delete every $O = (s, d)$, $s \in S_{can}$ or $d \in S_{can}$, for which there exist $\{dis_{s} < dist(s, Q), SID_{\tau}^{s}\}$ in the row of point s and $\{dis_{d} < dist(d, Q), SID_{\tau}^{d}\}$ in the row of point d such that $SID_{\tau}^{s} \cap SID_{\tau}^{d}$ has more than k trajectory identities. It also means that $\exists D_{\tau'} = \{\tau_{1}', \dots, \tau_{k}'\} \subset (SID_{\tau}^{s} \cap SID_{\tau}^{d})$ such that $dist(s, \tau_{i}) < dist(s, Q)$ and $dist(d, \tau_{i}) < dist(d, Q)$, $i \in (1', \dots, k')$. In this step, we also delete only a part of the negative transitions. It is unclear whether any negative transitions remain in the set $S_{ref}$.

(3) In the third step, we know that if a transition takes the query trajectory as one of its kNN trajectories, the transition must be in the RkNNT of the query trajectory. For every transition $O = (s, d) \in S_{ref}$, we find all trajectory points whose distance to s or d is less than $dis_{ref} = dist(s, Q) + dist(d, Q)$. If a trajectory $\tau$ has only one point whose distance to s or d is less than $dis_{ref}$, then $dist(s, \tau) + dist(d, \tau) > dis_{ref}$. If a trajectory $\tau'$ has no point whose distance to s or d is less than $dis_{ref}$, then $dist(s, \tau') + dist(d, \tau') > 2 dis_{ref}$. So only if a trajectory has one point whose distance to s is less than $dis_{ref}$ and another point whose distance to d is less than $dis_{ref}$ is it possibly closer to the transition $O = (s, d)$ than the query trajectory Q. For every transition $O = (s, d) \in S_{ref}$, we list all NN trajectories $\tau_{i}$ that meet $dist(s, \tau_{i}) + dist(d, \tau_{i}) \leq 2 dis_{ref}$ and check the size of $D_{\tau} = \{\tau_{1}, \dots, \tau_{j}\}$ such that $dist(s, \tau_{i}) + dist(d, \tau_{i}) < dist(s, Q) + dist(d, Q)$, $i \in (1, \dots, j)$. If the size of $D_{\tau}$ is not more than k, the transition must be a positive transition; otherwise, the transition must be a negative transition.

5.2. Security Definitions and Analysis

The two clouds are honest-but-curious, and RkNNToE is processed in two phases. We define the leakage functions [21] of the two phases and give a formal proof, which shows that RkNNToE is secure in the honest-but-curious clouds model.

Definition 5.

In an honest-but-curious clouds model, there are two participants $C_{i}$, $i \in (1, 2)$, in a protocol $P$. For $C_{i}$, $f_{i}$ and $O_{i}$ are the execution function and its output, while $view_{i}$ is the view during an execution of $P$. The protocol $P$ is secure against a probabilistic polynomial time (PPT) honest-but-curious adversary if there exist simulators $S_{1}$ and $S_{2}$ such that:

$$(S_{1}(f_{1}, L_{1}), f_{2}) \equiv (view_{1}, O_{2}) \tag{2}$$

$$(f_{1}, S_{2}(f_{2}, L_{2})) \equiv (O_{1}, view_{2}) \tag{3}$$

where ≡ means computational indistinguishability.

$L_{i}^{j}$ is the leakage function of cloud $i \in (1, 2)$ in phase $j \in \{setup, search\}$. Given a collection of points $DB_{p}$ from the DO and a query trajectory Q from the DU,

$$L_{1}^{setup}(DB_{p}) = \{DL, [EI, [id, p]]\}$$

$$L_{2}^{setup}(DB_{p}) = \{EI, \{(OID, [loc])\}_{i}, \{(TID, [loc])\}_{j}, |DB_{p}|, |DB_{τ}|, |DB_{o}|\}$$

$$L_{1}^{search}(DB_{p}, Q) = \{D(Q), DL, S_{can}, S_{ref}\}$$

$$L_{2}^{search}(DB_{p}, Q) = \{Token_{i}, Token_{j}, |Q|, |S_{ref}|, |S_{can}|, S(Q), A(Q), \{(OID, [loc])\}_{i}, \{(TID, [loc])\}_{j}\},$$

where $DL$ is the distance list, $EI$ is the eRBF tree, $id$ is the identity of point p, $[\cdot]$ is the cipher text of ·, $|\cdot|$ is the size of ·, $OID_{i}$ is the identity of transition i and $TID_{j}$ is the identity of trajectory j.

Definition 6.

(Search Pattern $S$) The search pattern leakage reveals whether the keywords in the token of every query point have appeared before.

Definition 7.

(Access Pattern $A$) Given a search query Q, the access pattern is defined as the identifiers of the trajectory points that are the nearest neighbors of the query points.

Definition 8.

(Distance Pattern $D$) Given a search query Q, $D(Q) = \{dist(p_{i}, q_{j}) : q_{j} \in Q, p_{i} \in S_{can}\}$. Informally, this leakage can be derived from the query: $D$ leaks the distances between the points in candidate transitions and the query points.
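To make Definitions 6 and 8 concrete, the following plain-text sketch shows what an observer would learn from the two patterns. It is only an illustration under assumptions: points are 2D tuples, tokens are opaque byte strings, and the names `distance_pattern` and `SearchPatternLog` are hypothetical rather than part of the scheme.

```python
import hashlib
import math
from typing import Dict, Sequence, Set, Tuple

Point = Tuple[float, float]


def distance_pattern(candidate_points: Sequence[Point],
                     query_points: Sequence[Point]) -> Dict[Tuple[int, int], float]:
    """D(Q): distance between every candidate point p_i and every query point q_j."""
    return {
        (i, j): math.dist(p, q)
        for i, p in enumerate(candidate_points)
        for j, q in enumerate(query_points)
    }


class SearchPatternLog:
    """Records token digests and reports whether a token has appeared before."""

    def __init__(self) -> None:
        self._seen: Set[bytes] = set()

    def observe(self, token: bytes) -> bool:
        digest = hashlib.sha256(token).digest()
        repeated = digest in self._seen
        self._seen.add(digest)
        return repeated


log = SearchPatternLog()
print(log.observe(b"token-for-q1"))   # first occurrence
print(log.observe(b"token-for-q1"))   # repeated query point
print(distance_pattern([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0)]))
```

The search pattern reveals only repetition, not the content of the token, and the distance pattern reveals pairwise distances but not the locations themselves.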

Theorem 1.

Under the permitted leakage functions $L_{1}^{setup}$, $L_{2}^{setup}$, $L_{1}^{search}$ and $L_{2}^{search}$, if CKKS and the FSknn [19] are secure in the two honest-but-curious clouds model, then RkNNToE is secure in the two honest-but-curious clouds model.

Proof.

We introduce the leakage functions into Definition 5 and prove that for any PPT adversary, there exist simulators $S_{1}$ and $S_{2}$ such that:

$$(S_{1}(f_{1}, L_{1}^{setup}), f_{2}) \equiv (view_{1}, O_{2}) \tag{4}$$

$$(S_{1}(f_{1}, L_{1}^{search}), f_{2}) \equiv (view_{1}, O_{2}) \tag{5}$$

[Simulating Setup] Given $L_{1}^{setup}(DB_{p}) = \{DL, [EI, [id, p]]\}$, $S_{1}$ randomly generates a message as the plain text m and encrypts it with a CPA-secure encryption scheme to obtain $[m]$. $S_{1}$ randomly generates the identities of trajectories and transitions. The number of these trajectories is the same as the number listed in $DL$. $S_{1}$ randomly generates increasing arrays to represent the distances between the transition points and the vertices of each node. Since the PPT adversary does not know the real distribution of points, and the encryption in the above simulation is secure, a PPT adversary cannot distinguish between the simulated view and the real view.

[Simulating Search] Given $L_{1}^{search}(DB_{p}, Q) = \{D(Q), DL, S_{can}, S_{ref}\}$, $S_{1}$ knows the identities of the transitions that are deleted in the phase of refining transitions, $S_{del} = S_{can} - S_{ref}$. From $D(Q)$, $S_{1}$ knows the distances between the points in $S_{can}$ and the query points. In the simulated $DL^{'}$, if a transition is in $S_{del}$, it must have kNN trajectories closer than the query. A PPT adversary does not know the location of any point; it only knows the distances and the identities of the deleted transitions. It cannot distinguish between the simulated $DL^{'}$ and the real $DL$.

$$(f_{1}, S_{2}(f_{2}, L_{2}^{setup})) \equiv (O_{1}, view_{2}) \tag{6}$$

$$(f_{1}, S_{2}(f_{2}, L_{2}^{search})) \equiv (O_{1}, view_{2}) \tag{7}$$

[Simulating Setup] Given $L_{2}^{setup}(DB_{p}) = \{EI, \{(OID, [loc])\}_{i}, \{(TID, [loc])\}_{j}, |DB_{p}|, |DB_{τ}|, |DB_{o}|\}$, $S_{2}$ randomly chooses $|DB_{p}|$ points, encrypts them with CKKS to obtain $\{[loc]\}^{'}$ and assigns identities to these points. Then, $S_{2}$ constructs an eRBF tree $EI^{'}$, which has the same structure as $EI$. For each node, $S_{2}$ randomly generates four vectors $V_{1}^{'}, \dots, V_{4}^{'}$ and encrypts them with CKKS. $S_{2}$ associates the encrypted $\{[loc]\}^{'}$ with its corresponding $OID$ or $TID$ in $EI$. According to the security analysis in [19], a PPT adversary cannot distinguish between the simulated view and the real view.

[Simulating Search] Given $L_{2}^{search}(DB_{p}, Q) = \{Token_{i}, Token_{j}, |Q|, S_{can}, S_{ref}, S(Q), A(Q), \{(OID, [loc])\}_{i}, \{(TID, [loc])\}_{j}\}$, $S_{2}$ randomly generates plain text $loc^{'}$ and encrypts it with CKKS to get $\{[loc]\}^{'}$. From $S(Q)$, $S_{2}$ knows whether a point in the query has been searched before. From $A(Q)$, $S_{2}$ knows the identifiers of the points that are NN points of a query point. If comparing the token of $q_{i} \in Q$ with the previous tokens shows that $q_{i}$ has been searched before, $S_{2}$ reuses the previously simulated token and returns the previous NN points as search results. Otherwise, $S_{2}$ simulates a new search token $Token^{'}$, which is the token of one point and includes k hashes $h(kw)$ and a location. Since $S_{2}$ knows which leaf node of the eRBF tree matches the search token $Token_{j}$, $S_{2}$ randomly generates a k-bit string as the search token $Token^{'}$. The string has the same size as $h(kw)$ and matches the same leaf node of the eRBF tree. A PPT adversary cannot distinguish between the simulated $Token^{'}$ and the real $Token$. □
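The token handling of $S_{2}$ in the search simulation can be sketched as follows. This is a simplified illustration, not the simulator of the proof: it only models the reuse of a simulated token for a repeated query point and the generation of a fresh random bit string of fixed length otherwise. It does not model the matching against a leaf node of the eRBF tree. The class name `TokenSimulator` and the integer `pattern_id` (standing for the information given by the search pattern) are assumptions.

```python
import secrets
from typing import Dict


class TokenSimulator:
    """Reuses a simulated token for repeated query points, else draws a random one."""

    def __init__(self, token_bits: int) -> None:
        self.token_bits = token_bits
        # Maps a search-pattern identifier to the token simulated for it.
        self._tokens: Dict[int, str] = {}

    def simulate(self, pattern_id: int) -> str:
        if pattern_id not in self._tokens:
            value = secrets.randbits(self.token_bits)
            # Zero-padded binary string so every token has the same size.
            self._tokens[pattern_id] = format(value, f"0{self.token_bits}b")
        return self._tokens[pattern_id]


sim = TokenSimulator(token_bits=16)
first = sim.simulate(pattern_id=0)
again = sim.simulate(pattern_id=0)    # same query point seen before
other = sim.simulate(pattern_id=1)    # new query point
print(first == again, len(other))
```

Because the simulated token has the same length as the real one and repeats exactly when the search pattern says so, the observable behavior is consistent with the leakage.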

5.3. Computational Complexity Analysis

In this section, we analyze the time complexity of RkNNToE, which is dominated by the secure computation of the distance between two points. The complexity of kNN is shown in [19]. To generate the set $S_{can}$, every query point is checked against nodes, which costs at most $O(|Q| \cdot (N_{vis}(eRBFtree) + N_{vis}(O_{leaf})))$, where $N_{vis}(eRBFtree)$ is the number of vertexes in the visited nodes and $N_{vis}(O_{leaf})$ is the number of transition points in the leaf nodes that are intersected by the polyline. All filter points are checked against nodes, and the cost of computing the distance is at most $O(k \cdot |Q| \cdot (N_{vis}(eRBFtree) + N_{vis}(O_{leaf})))$. After obtaining $S_{can}$, the cost of computing the distances between all transitions in $S_{can}$ and the query trajectory is $O(|Q| \cdot |S_{can}|)$. After obtaining $S_{ref}$, the cost of computing the distances between all transitions in $S_{ref}$ and their kNN trajectories is $O(2 |S_{τ^{'}}| \cdot |S_{ref}|)$, where $S_{τ^{'}}$ is the set of all kNN trajectories of a transition. The total complexity is

$$O(RkNNToE) = O((k + 1) \cdot |Q| \cdot (N_{vis}(eRBFtree) + N_{vis}(O_{leaf}))) + O(|Q| \cdot |S_{can}|) + O(2 |S_{τ^{'}}| \cdot |S_{ref}|).$$

According to [9], the visited nodes are proportional to the number of points in $DB_{p}$, f is the fanout of the eRBFtree, and $|DB_{τ}| \ll |DB_{o}|$. The complexity is therefore

$$O(RkNNToE) = O((k + 1) \cdot |Q| \cdot (N_{vis}(eRBFtree) + N_{vis}(O_{leaf}))) + O(|Q| \cdot |S_{can}|) + O(2 |S_{τ^{'}}| \cdot |S_{ref}|) = O((k + 2) \cdot |Q| \cdot (|DB_{o}| / f)).$$
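The three terms of the cost expression can be evaluated for concrete sizes to see which one dominates. The helper below is only a counting aid: it treats each secure distance computation as one unit of cost, and the function name `rknntoe_cost` and all input values are hypothetical, not measurements from the paper.

```python
from typing import Dict


def rknntoe_cost(k: int, q_len: int, n_vis_tree: int, n_vis_leaf: int,
                 s_can: int, s_tau: int, s_ref: int) -> Dict[str, int]:
    """Count distance computations per term of the complexity expression."""
    # (k + 1) * |Q| * (N_vis(eRBFtree) + N_vis(O_leaf)): query and filter points vs. nodes.
    pruning = (k + 1) * q_len * (n_vis_tree + n_vis_leaf)
    # |Q| * |S_can|: candidate transitions vs. the query trajectory.
    candidate = q_len * s_can
    # 2 * |S_tau'| * |S_ref|: refined transitions vs. their kNN trajectories.
    refine = 2 * s_tau * s_ref
    return {
        "pruning": pruning,
        "candidate": candidate,
        "refine": refine,
        "total": pruning + candidate + refine,
    }


# Hypothetical sizes chosen only to illustrate the arithmetic.
print(rknntoe_cost(k=2, q_len=3, n_vis_tree=100, n_vis_leaf=50,
                   s_can=40, s_tau=5, s_ref=10))
```

With these made-up sizes the pruning term is the largest, which matches the observation that the bound simplifies to a term proportional to the visited nodes.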

6. Performance Evaluation

In this section, we conduct experiments on two databases: the GPS trajectory dataset (transition dataset) collected in the Geolife project in Beijing [22,23,24] and the bus lines dataset (trajectory dataset) in Beijing [25]. There are 18,670 transitions in the transition database. The bus lines dataset has 1891 trajectories and 1174 bus stations. All algorithms are implemented in Python on Windows 10 and evaluated on a computer with an Intel(R) Core(TM) i5-10505 CPU and 16.00 GB of RAM. We generate a query trajectory by randomly selecting an ordered sequence of points from the trajectory database, since randomly generated points do not preserve the spatial continuity of a trajectory. In the experiment, the kNN trajectories do not share any point with the query trajectory. A trajectory that is shared by multiple bus lines is recorded as a single trajectory.

6.1. Constructing eRBF Tree and DList

Before outsourcing the data, the DO needs to build the eRBF tree. The time cost of constructing the eRBF index includes two parts: the time of constructing the RBF tree in database $D_{τ}$ and the time of constructing the encrypted quad tree in database $D_{p}$. The first part is related to the maximum number ($N_{τ}$) of trajectory points in a leaf node of the father tree. Table 3 shows the time cost of constructing the RBFs in the father tree with different $N_{τ}$. The second part is related to the maximum number ($N_{o}$) of transition points in a leaf node of the child tree. The total time of constructing the eRBF tree is shown in Figure 8; the main cost is encrypting the four vertexes in every node of the eRBF tree. As $N_{τ}$ or $N_{o}$ increases, the cost of constructing the eRBF tree decreases. The DList is constructed on plain text, so the DO only needs to compute the distance between every transition point and the vertexes of its nearby nodes. Here, we consider the nodes within a range of 25 to 200 steps, and the mean time of constructing the DList is shown in Table 3.

6.2. Generating Query

A query for one point includes the encrypted location and an NN search token. Encrypting a location vector with CKKS takes about 0.004516 s. The cost of generating a token is related to the search radius. Here, we denote the minimum range of a leaf node in the father tree as a step length and use the number of steps to determine the search radius. The step length does not decrease as $N_{τ}$ increases, as shown in Table 3. In Figure 9, the line labeled “Enc.” is the time of encrypting a location. As the number of steps increases, the time of generating a token increases. So, the total time to generate a trajectory query depends on the number of points in the trajectory and the search radius for every point in the query.
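As a small worked example of how a step count translates into a search radius, the sketch below multiplies the step length by the number of steps in each coordinate. The multiplication rule is an assumption made for illustration, since the text only states that the number of steps determines the radius. The function name `search_radius` is hypothetical; the step length is the Table 3 value for $N_{τ} = 2$.

```python
from typing import Tuple


def search_radius(step_length: Tuple[float, float], steps: int) -> Tuple[float, float]:
    """Assumed rule: radius in (latitude, longitude) = steps * step length."""
    lat_step, lon_step = step_length
    return (steps * lat_step, steps * lon_step)


# Step length for N_tau = 2 from Table 3, with 20 steps (lower end of the tested range).
radius = search_radius((0.000230, 0.001237), steps=20)
print(radius)
```

Under this assumption, a larger step count covers more leaf nodes of the father tree, which is consistent with the token generation time growing with the number of steps.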

6.3. Search

In this section, we first report the time cost of the kNN search for a point. Then we show the total time of the two clouds after receiving an RkNNToE request.

6.3.1. NN Trajectories Search

Since the DO needs to search NN trajectories for the refined transitions, it is necessary to illustrate the efficiency of the kNN search for every transition point. As shown in Figure 10, as the number of trajectory points in a leaf node of the father tree increases, the time of searching the NN points increases. As the number of steps in the search radius increases, the time cost of searching NN points also increases. Searching NN trajectories for a transition takes twice as long as the NN points search in Figure 10, since a transition consists of two points.

6.3.2. RkNNToE Search

In this section, we simulate the whole search process in two clouds, which includes finding NN points for every query point, constructing a filter set and pruning transitions, refining transitions, and finding NN trajectories for every refined transition. In the simulated search, the eRBF tree is built with $N_{τ} = 2$ and $N_{o} = 2$. The experiment settings are as follows:

The number of points in a query ($N_{p}$): 2 to 5, default 3.
The k in RkNNToE: 1 to 4, default 2.
The number of steps in the NN points search: 20 to 200.

The time cost varies randomly because the query trajectory is generated randomly, and the effect of pruning differs widely between queries. According to Section 5.3, the complexity is mainly affected by $S_{can}$ and $(|DB_{o}| / f)$ rather than by the kNN trajectory search operations. $S_{can}$ and $(|DB_{o}| / f)$ are the outputs of pruning, and the size of the filter set does not increase linearly as k increases. In most cases, when $k = 2$, the points in the filter set are $a, b, c$, and $\{a, c\}$ and $\{a, b\}$ can form two trajectories. This also contributes to the random behavior of the time cost. So, we use the median of the time cost to analyze the trend of the results. As shown in Figure 11, when the number of points in a query is 3, the median time cost decreases as k increases. As k increases, the number of trajectories in the filter set increases, and more transitions are filtered out. In the refining phase, the number of candidate transitions therefore decreases, which reduces the time. As shown in Figure 12, when $k = 2$, the median time cost increases as the number of points in a query increases. As the number of points in a query increases, the number of points in the trajectories in the filter set increases, which increases the number of distance computations. It also shrinks the pruned space, which means the number of refined transitions increases. Both effects increase the time cost.
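The choice of the median over the mean can be illustrated with a few made-up timings. The values below are purely hypothetical and are not results from the experiments; they only show that one query with poor pruning shifts the mean far more than the median.

```python
import statistics

# Hypothetical per-query time costs in seconds; the fourth query prunes badly.
time_costs = [1.0, 1.2, 1.1, 9.5, 1.3]

median_cost = statistics.median(time_costs)
mean_cost = statistics.mean(time_costs)

print(f"median = {median_cost:.2f} s, mean = {mean_cost:.2f} s")
```

Here the median stays close to the typical query, while the mean is pulled upward by the single outlier.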

7. Conclusions

In this paper, we studied a method of route planning on an encrypted trajectory database, RkNNToE, which securely returns all transitions that are the reverse k nearest neighbor trajectories of the query trajectory. We designed a hybrid encrypted bloom filter tree (eRBFtree) for searching the encrypted trajectory database, which supports space pruning and fast kNN search. Combined with the eRBFtree, we gave pruning strategies that prune as many transitions as possible and improve the search efficiency. The security analysis showed that the query, data and index are secure during RkNNToE. The experiments showed that RkNNToE finds the results of the RkNNT search efficiently and correctly.

Author Contributions

Conceptualization, X.Z., H.Z. and Q.W.; methodology, X.Z.; software, X.Z.; validation, H.Z., X.Z. and K.L.; formal analysis, X.Z. and H.Z.; investigation, H.Z.; resources, X.Z.; writing—original draft preparation, X.Z.; writing—review and editing, H.Z.; visualization, X.Z. and K.L.; supervision, Q.W. and K.L.; project administration, H.Z.; funding acquisition, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62072051, 61976024, 61972048, 62272056).

Data Availability Statement

The experiments use the GPS trajectory dataset (transition dataset) collected in the Geolife project in Beijing [22,23,24] and the bus lines dataset (trajectory dataset) in Beijing [25].

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chen, Z.; Shen, H.T.; Zhou, X.; Zheng, Y.; Xie, X. Searching trajectories by locations: An efficiency study. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Indianapolis, IN, USA, 6–10 June 2010; pp. 255–266.
2. Keogh, E.J. Exact indexing of dynamic time warping. In Proceedings of the VLDB, Hong Kong, China, 20–23 August 2002; pp. 406–417.
3. Vlachos, M.; Gunopulos, D.; Kollios, G. Discovering similar multidimensional trajectories. In Proceedings of the ICDE, San Jose, CA, USA, 26 February–1 March 2002; pp. 673–684.
4. Chen, L.; Özsu, M.T.; Oria, V. Robust and fast similarity search for moving object trajectories. In Proceedings of the SIGMOD, Baltimore, MD, USA, 14–16 June 2005; pp. 491–502.
5. Cheema, M.A.; Zhang, W.; Lin, X.; Zhang, Y.; Li, X. Continuous reverse k nearest neighbors queries in Euclidean space and in spatial networks. VLDB J. 2012, 21, 69–95.
6. Emrich, T.; Kriegel, H.P.; Mamoulis, N.; Niedermayer, J.; Renz, M.; Zufle, A. Reverse-nearest neighbor queries on uncertain moving object trajectories. In Database Systems for Advanced Applications; Springer: Berlin/Heidelberg, Germany, 2014; pp. 92–107.
7. Feng, Z.; Zhu, Y. A Survey on Trajectory Data Mining: Techniques and Applications. IEEE Access 2017, 4, 2056–2067.
8. Zheng, Y.; Lu, R.; Zhu, H.; Zhang, S.; Guan, Y.; Shao, J.; Wang, F.; Li, H. SetRkNN: Efficient and Privacy-Preserving Set Reverse kNN Query in Cloud. IEEE Trans. Inf. Forensics Secur. 2023, 18, 888–903.
9. Wang, S.; Bao, Z.; Culpepper, J.S.; Sellis, T.; Cong, G. Reverse k nearest neighbor search over trajectories. IEEE Trans. Knowl. Data Eng. 2018, 30, 757–771.
10. Pan, X.; Nie, S.; Hu, H.; Yu, P.S.; Guo, J. Reverse Nearest Neighbor Search in Semantic Trajectories for Location-Based Services. IEEE Trans. Serv. Comput. 2022, 15, 986–999.
11. Tzouramanis, T.; Manolopoulos, Y. Secure reverse k-nearest neighbors search over encrypted multi-dimensional databases. In Proceedings of the 22nd International Database Engineering & Applications Symposium (IDEAS), Villa San Giovanni, Italy, 18–20 June 2018; pp. 84–94.
12. Tao, Y.; Papadias, D.; Lian, X. Reverse kNN search in arbitrary dimensionality. In Proceedings of the 30th International Conference on Very Large Data Bases, Toronto, ON, Canada, 31 August–3 September 2004; pp. 744–755.
13. Wu, W.; Yang, F.; Chan, C.-Y.; Tan, K.-L. FINCH: Evaluating reverse k-nearest-neighbor queries on location data. Proc. VLDB Endow. 2008, 1, 1056–1067.
14. Lu, J.; Lu, Y.; Cong, G. Reverse spatial and textual k nearest neighbor search. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Athens, Greece, 12–16 June 2011; pp. 349–360.
15. Lu, Y.; Cong, G.; Lu, J.; Shahabi, C. Efficient algorithms for answering reverse spatial-keyword nearest neighbor queries. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, 3–6 November 2015; pp. 1–4.
16. Pournajaf, L.; Tahmasebian, F.; Xiong, L.; Sunderam, V.; Shahabi, C. Privacy preserving reverse k-nearest neighbor queries. In Proceedings of the 19th IEEE International Conference on Mobile Data Management (MDM), Aalborg, Denmark, 25–28 June 2018; pp. 177–186.
17. Li, X.; Xiang, T.; Guo, S.; Li, H.; Mu, Y. Privacy-preserving reverse nearest neighbor query over encrypted spatial data. IEEE Trans. Serv. Comput. 2022, 15, 2954–2968.
18. Wang, Q.; He, M.; Du, M.; Chow, S.S.; Lai, R.W.; Zou, Q. Searchable encryption over feature-rich data. IEEE Trans. Dependable Secur. Comput. 2016, 15, 496–510.
19. Lei, X.; Tu, G.H.; Xie, A.X.L.T. Fast and Secure kNN Query Processing in Cloud Computing. In Proceedings of the 2020 IEEE Conference on Communications and Network Security (CNS), Avignon, France, 29 June–1 July 2020.
20. Finkel, R.A.; Bentley, J.L. Quad trees a data structure for retrieval on composite keys. Acta Inform. 1974, 4, 1–9.
21. Lindell, Y. How to simulate it – A tutorial on the simulation proof technique. In Tutorials on the Foundations of Cryptography; Springer: Berlin/Heidelberg, Germany, 2017; pp. 277–346.
22. Zheng, Y.; Zhang, L.; Xie, X.; Ma, W. Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of the International Conference on World Wide Web (WWW 2009), Madrid, Spain, 20–24 April 2009; ACM Press: New York, NY, USA, 2009; pp. 791–800.
23. Zheng, Y.; Li, Q.; Chen, Y.; Xie, X.; Ma, W. Understanding Mobility Based on GPS Data. In Proceedings of the ACM Conference on Ubiquitous Computing (UbiComp 2008), Seoul, Republic of Korea, 21–24 September 2008; ACM Press: New York, NY, USA, 2008; pp. 312–321.
24. Zheng, Y.; Xie, X.; Ma, W. GeoLife: A Collaborative Social Networking Service among User, location and trajectory. IEEE Data Eng. Bull. 2010, 33, 32–40.
25. BeiJIngBusStation. Available online: https://github.com/FFGF/BeiJIngBusStation (accessed on 7 March 2023).

Figure 1. Inserting a keyword on an empty RBF.

Figure 2. Index structure: RBF tree.

Figure 3. The model of RkNNToE search.

Figure 4. The partition for all points.

Figure 5. The quad tree structure for all points.

Figure 6. The example of RkNN search.

Figure 7. The structure of eRBFtree. $[\cdot]$ is the encryption of ·.

Figure 8. The time cost of constructing the eRBF tree.

Figure 9. The time cost of generating a query.

Figure 10. The time cost of searching NN trajectory points.

Figure 11. The effect of k in RkNNToE search ($N_{P} = 3$).

Figure 12. The effect of the number of points in query ($k = 2$).

Table 1. Comparison with related works.

Schemes	[5,6] (plaintext)	[9] (plaintext)	[10] (plaintext)	[11] (ciphertext)	[8] (ciphertext)	RkNNToE (ciphertext)
Search Type	RkNNT	RkNNT	RkNNT	RkNNP	RkNNS	RkNNT
Query Type	T	T	P	P	S	T
Result Type	P	T	T	P	S	T
Database Type	P and T	T and T	P and T	P	S	T and T

P: point; T: trajectory; S: set.

Table 2. Notations.

Notation	Definition
dist(a,b)	The distance between a and b
$D B_{p}$	The database of all points
$D B_{τ}$	The database of points in all trajectories
$D B_{o}$	The database of points in all transitions
$S_{τ}, S_{c a n}$	The set of trajectories and the set of candidate transitions
$S_{r e f}, S_{r e s}$	The set of refined transitions and the set of results
$n o d e (\cdot)$	The node with identity $(\cdot)$
$l o c$	The vector of location
$N_{τ}$	The max number of trajectory points in a leaf node of the father tree
$N_{o}$	The max number of transition points in a leaf node of the child tree
$i \in (a, b)$	$i \in (a, \dots, b)$

Table 3. The cost of constructing father tree and DList.

DB	$N_{τ}$	Step Length in Latitude and Longitude	Time Cost of RBF Tree (s)	Time Cost of DList (s)
$D B_{τ}$	2	[0.000230, 0.001237]	5.600787	3.559773
$D B_{τ}$	3	[0.000460, 0.002474]	3.796525	3.468612
$D B_{τ}$	4	[0.001840, 0.009896]	2.861059	3.470150
$D B_{τ}$	5	[0.001840, 0.009896]	2.698736	3.444898


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
