Article

A Management Method of Multi-Granularity Dimensions for Spatiotemporal Data

1
School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China
2
School of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
3
Zhongke Yungu Technology Co., Ltd., Changsha 410000, China
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2023, 12(4), 148; https://doi.org/10.3390/ijgi12040148
Submission received: 17 December 2022 / Revised: 9 March 2023 / Accepted: 28 March 2023 / Published: 30 March 2023

Abstract

To understand complex phenomena in social space and monitor dynamic changes in people’s tracks, we need more cross-scale data. However, data retrieval often ignores the effect of multiple scales, which leads to incomplete results. To solve this problem, we propose a management method of multi-granularity dimensions for spatiotemporal data. The method systematically describes dimension granularity and the fuzziness it causes, uses multi-scale integer coding to organize and manage multi-granularity dimensions, and achieves complete query results by exploiting the correlation between codes at different scales. We simulated time and band data for the experiments. The experimental results show that: (1) the method effectively solves the problem of incomplete results in the intersection query method; (2) compared with traditional string encoding, the query efficiency of multi-scale integer encoding is twice as high; (3) the proportion of data at different dimension granularities affects the query performance of multi-scale integer coding, and the higher the proportion of fine-grained data, the greater its advantage.

1. Introduction

Spatiotemporal data reflect the quantitative and qualitative characteristics, spatial structure, and spatial relations of elements and phenomena in the geographical world, together with their changes over time, and are the basis for human cognition of that world. Most of the problems we face must be addressed with data-driven approaches to achieve better understanding and more efficient, optimal decisions [1,2]. In recent years, with the development of Internet technology and sensor equipment, the production mode of spatiotemporal data has changed from passive and active production to automatic production, which makes the spatiotemporal data resources we obtain more abundant [3,4,5]. We can use surveillance video to analyze vehicle operation and the movement patterns of people on the road, track and predict traffic conditions in real time, and avoid traffic congestion. We can also improve the accuracy of weather forecasting by building models on many consecutive years of observation data. However, many social phenomena are complex, and revealing their essence requires more cross-scale data [6,7]. For example, the recent outbreak of COVID-19 has seriously affected the development of society. Many scholars have contributed to its prevention by analyzing the correlation between relevant indicators at different scales and confirmed cases of COVID-19 [8,9]. Data at different scales are therefore important for data mining. However, when we query data, we often ignore their multi-scale nature, which results in incomplete data acquisition. There is thus an urgent need for a multi-granularity dimension management method for spatiotemporal data. The rest of this paper is organized as follows: “Related work” introduces the content related to this study. “Method” introduces the dimension granularity fuzzy query method (DGFQM) and the multi-scale dimension integer coding method. “Results” verifies the effectiveness of the method and analyzes its results. “Discussion” covers issues related to the results, including shortcomings and prospects.
Multi-granularity dimensions of spatiotemporal data mainly include time and space. We first review how time and space are managed in spatiotemporal data management. A popular direction in spatial information management is the grid-model-based approach [10,11], which uses a space-filling curve to build grid codes and improves the indexing and query efficiency of multi-scale spatial data. For example, Guo et al. [12] proposed an adaptive Hilbert–Geohash geogrid and coding method, and Cao et al. [13] used a Hilbert curve to store and retrieve spatiotemporal data. However, Geohash encoding lacks cross-scale spatial relations, which results in low indexing efficiency. Zhai et al. [14] proposed a level-by-level space-filling curve, which improves the correlation between multiple levels by connecting adjacent levels. However, the clustering between levels in this method is poor, and the existing spatial retrieval strategy considers only intersecting data and ignores contained data [15]. To solve the above problems, Lei et al. [16] proposed a global multi-scale spatial grid coding model and designed a strategy based on this model to ensure the integrity of spatial queries. Multi-scale time is mostly handled in simple auxiliary forms, such as timestamps [17] and strings [18], which are scattered across file systems [19,20,21], databases [22,23,24], and programming languages. This way of managing time does not retain the multi-scale information of time, which makes it difficult to manage multi-scale time uniformly. To solve this problem, Tong et al. [25] proposed a multi-scale time segment integer coding method, which uses integers to represent both the scale information and the location information of time. However, the management of multi-granularity dimensions still faces the following challenges:
First, intersection query results based on time and other one-dimensional dimensions are incomplete because the impact of multiple time scales is not considered. Current fuzzy query methods use fuzzy set theory [26] to handle fuzzy words in time descriptions, such as “around” and “probably”, but they cannot solve the new fuzziness problem caused by multi-granularity [27,28].
Secondly, current research on multi-granularity dimensions is mainly about time and space, and other multi-scale dimensions are not discussed. As a kind of spatiotemporal data, remote sensing data cover a wide range of the spectrum. The spectral bands span from visible light and thermal infrared to microwave, and the resolution ranges from multispectral to hyperspectral. The band is an important indicator for distinguishing the physical properties of ground objects in remote sensing data [29,30]. Recently, the application of radio wave imaging technology in daily life has made the span of band information even larger [31,32,33]. Zhang et al. [34] designed five integrated spatiotemporal-spectral storage formats for long-term remote sensing data with time, space, and spectral information. However, there are few studies on multi-scale bands. At present, band information is represented by unique identifiers in database systems, which is not conducive to the unified storage of multi-source information.
Based on the above analysis, we propose a multi-granularity dimension management method for spatiotemporal data, which consists of two parts: the dimension granularity fuzzy query method (DGFQM) and multi-scale dimension integer coding. The DGFQM divides the query results into fuzzy data and fine data according to the dimension granularities, and obtains complete query results according to the correlation between codes at different scales. Multi-scale dimension integer coding mainly applies the multi-scale integer coding method to the band. We designed an association method between bands of arbitrary scale and multi-scale integer codes to improve the efficiency of data retrieval.

2. Materials and Methods

2.1. Dimension

Dimension refers to the inherent and measurable physical properties of physical quantities. Internationally, the seven basic dimensions, such as time and length, are often used to represent other physical quantities. Physical quantities under the same dimension can be ordered and are therefore often used as retrieval conditions. However, dimensions have multi-granularity characteristics, and multi-granularity dimensions easily lead to data loss in queries. We therefore defined the concept of dimension granularity fuzziness and described the fuzziness problem.

2.1.1. Dimension Granularity

Granularity refers to the size of particles and is measured by particle diameter (usually the long or medium diameter). We expressed the measure of a physical quantity with dimension granularities. Database systems usually use existing units to represent dimension granularities, such as standard time units (year, month, day, etc.) and length units (meter, decimeter, millimeter, etc.). Managing multi-granularity dimensions requires a simple, effective, and easy-to-use multi-granularity dimension system. Therefore, we defined the relevant concepts as follows:
Definition 1.
Dimension Domain, D is a set of completely ordered points that satisfy a sorting relation. D = {d1,d2,d3,…,dn}, where d1 < d2 < d3 < … < dn.
Definition 2.
Dimension Particle, G is a set of finite continuous points in a dimension domain. G = {d1, d2, …, dk}, where k is the number of aggregate particles.
Definition 3.
Dimension Granularity, R is a set of nonoverlapping dimension particles, R = {G1, G2, G3,…, Gn}.
Definition 4.
Dimension Granularity Relations refer to the correlation between different dimension granularities. According to the size of the particles that make up a dimension granularity, the relation between two dimension granularities is an equal relation, a finer relation, or a coarser relation. Assume that R1 and R2 are two granularities, and that k1 and k2 are the numbers of points contained in a particle of R1 and of R2, respectively. If k1 = k2, the granularity of R1 is equal to that of R2, written R1 = R2; if k1 < k2, the granularity of R1 is smaller (finer) than that of R2, written R1 ≮ R2; if k1 > k2, the granularity of R1 is larger (coarser) than that of R2, written R1 ≯ R2.
Definition 5.
Inclusion Relation: if a point at one granularity can be expressed as a finite set of points at another granularity, an inclusion relation exists between the two granularities. Assume that R1 and R2 are two different granularities with R1 finer than R2 (R1 ≮ R2). For any point at the granularity of R2, there is always a finite number of corresponding points at the granularity of R1, i.e., x1 = {y1, y2, …, yn}, where x1 is the point at the granularity of R2 and yi is a point at the granularity of R1.
There are specific conversion rules between these units, such as 60 min for an hour. However, real data contain not only these fixed granularities but also other granularities, so the limited set of existing granularities must be able to represent them. Dimensions have two representations: the point type and the segment type. The point type represents a position on a dimension domain and is given by a value at a certain granularity. The segment type represents an interval on the dimension domain and is given by two points. In this way, arbitrary granularities can be represented with existing units.

2.1.2. Dimension Granularity Fuzziness

At present, fuzziness problems are handled with fuzzy set theory, which calculates the degree to which fuzzy data occur through a membership function. In this approach, a fuzzy point is represented as a two-tuple (d1, δ1), where d1 is the point and δ1 is its membership degree. A fuzzy segment is represented as a four-tuple (d1, δ1, d2, δ2), where d1 and d2 are the start point and endpoint, and δ1 and δ2 are the membership degrees of the start point and endpoint, respectively. Fuzzy set theory presupposes a fuzzy data set, and such data sets are obtained through semantic computing or empirical knowledge. These methods therefore cannot solve the fuzziness caused by multi-granularity dimensions, so we described point fuzziness and segment fuzziness separately.
Compared with fine-grained data, coarse-grained data with multi-granularity dimensions has uncertainty. Therefore, different granularity choices for the same event produce different results. We defined the fuzziness induced by multi-granularity as the dimension granularity fuzziness, describing the fuzziness problem of point type and segment type, respectively.

Point

At present, most database systems use a point of a certain granularity to represent the state of an object, which is usually an index value. There are different granularities in practical applications, so granularity conversion is needed. We defined the granularity transformation function T:
T(dR,H) =h,
where dR is a point at the granularity of R, H is a granularity of transformation, and h is a point at the granularity of H.
Assume that d1 is a point at the granularity of R1, and R2 is a granularity different from R1. The conversion of d1 from the granularity of R1 to that of R2 involves the following two cases. If R1 is finer than R2 (R1 ≮ R2), there is a unique dimension point d2 at the granularity of R2, i.e., d2 = h. If R1 is coarser than R2 (R1 ≯ R2), h is a set of points, {d2 | l < d2 < u} = h, where l~u is a point set of R2. Because a constant is generally used as the retrieval condition, we divided the transformation function T into Ts and Tl.
Ts = min(T(dR,H)),
Tl = max(T(dR,H)),
where Ts is the minimum value converted to H granularity, Tl is the maximum value converted to H granularity.
Because of the multi-granularity characteristic of dimensions, describing the same event at different granularities produces different results. When describing the same event, coarse-grained points are fuzzier than fine-grained points. For example, the Wenchuan earthquake occurred on 12 May 2008 (China Standard Time), and the time under the annual granularity is 2008. The time information at annual granularity is fuzzier than that at daily granularity. We may miss this fuzzy information when retrieving data.
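The two bounds Ts and Tl can be illustrated for calendar time. The following is a minimal sketch, assuming the target granularity H is the day and the source granularity is a year or a year-month; the function names `t_s` and `t_l` are hypothetical and are not taken from the paper.

```python
import calendar
from datetime import date


def t_s(year, month=None):
    """Ts: smallest day-level point covered by a year or a year-month."""
    return date(year, month or 1, 1)


def t_l(year, month=None):
    """Tl: largest day-level point covered by a year or a year-month."""
    if month is None:
        return date(year, 12, 31)
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


# A year-level point corresponds to a whole range of day-level points.
print(t_s(2014), t_l(2014))
# A month-level point corresponds to a shorter range.
print(t_s(2014, 11), t_l(2014, 11))
```

A coarse point is thus replaced by the two day-level constants that bound it, which can then be used as retrieval conditions.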

Segment

A segment is a pair [d1, d2] that represents all points between d1 and d2. The granularities of d1 and d2 are R1 and R2. In the ideal case, the segment can be represented by one index value. However, the length of a segment generally does not correspond to an existing granularity. Currently, dimension segments are represented by two fields, which is inefficient when querying. With the introduction of multi-scale integer coding, we designed the following rules to represent segments with a reasonably small number of index values. According to the granularity relationship between d1 and d2, there are two cases.
Case 1: The granularity of d1 is equal to that of d2, i.e., R1 = R2. Assume the interval length is L. If there is a granularity Rx with Rx = L, select the value at the granularity of Rx to represent this interval, as shown in Figure 1a. If Rx ≠ L, there are two filling methods. One is to fill from coarse-grained to fine-grained. The following three situations may exist depending on the position covered by the index value. (1) As shown in Figure 1b, the index value covers the middle of the interval. (2) As shown in Figure 1c, the coverage of the index value starts at the start point d1. (3) As shown in Figure 1d, the coverage of the index value ends at the endpoint d2. The length of the remaining sections is then recalculated, and the above steps are repeated until the whole segment [d1, d2] is covered. The other method is to fill from fine-grained to coarse-grained. We can choose to start filling from the start point d1 or from the endpoint d2. This method only needs to determine the granularity of the starting index value and does not need to perform multiple calculations. Therefore, we chose this method to study the band.
Case 2: The granularity of d1 is not equal to that of d2, i.e., R1 ≠ R2. First, the coarse-grained point is converted to a fine-grained point. If R1 ≯ R2, we obtain the point of d1 at R2 granularity through the transformation function Ts. If R1 ≮ R2, we convert d2 to a point at R1 granularity through the transformation function Tl. After the transformation, the start point and endpoint of the segment have the same granularity. Secondly, the index values are designed according to Case 1. The fuzziness problem of the segment type is similar to that of the point type. Let the segment D consist of several segments, D = {D1, D2, …, Dn}. Then D is fuzzy relative to each Di.

2.2. DGFQM

There are two main ways to retrieve data through dimensions. One is to query through a point, and the other is to query through the start point and end point, also known as the intersection query. Due to dimension granularity fuzziness, data are easily lost when querying, such as in the following example:
Data record 1: MODIS blue-band image (450–530 nm) of Beiyuan Road, Chaoyang District, Beijing, at 14:00 on 15 November 2014.
Data record 2: MODIS visible-band image (380–780 nm) of Chaoyang District, Beijing, 15 November 2014.
Data record 3: MODIS panchromatic image (350–900 nm), Beijing, November 2014.
The above examples show that the same data can be described differently because of the multi-granularity characteristics of temporal, spatial, and spectral attributes. Data record 1 is more precise than data record 2, and data record 3 is fuzzier than data record 2. As a result, important data may be missing from query results.
There are two kinds of missing data caused by dimension granularity fuzziness: coarse-grained missing data and fine-grained missing data. Therefore, we divided the query results into fuzzy and exact data according to scale. Assume O(p1, p2, …, pi) is an object with multiple attributes, where pi represents the i-th attribute. Take the intersection query as an example. Let the query interval be $[p_i^1, p_i^2]$, with corresponding scales N1 and N2. We divided the query result S into S1, S2, and S3, i.e., $S = S_1 \cup S_2 \cup S_3$:
$$S_1 = \{\, O \mid O(p_i) > \max(O(p_i^1), O(p_i^2)) \,\}$$
$$S_2 = \{\, O \mid \min(O(p_i^1), O(p_i^2)) \le O(p_i) \le \max(O(p_i^1), O(p_i^2)) \,\}$$
$$S_3 = \{\, O \mid O(p_i) < \min(O(p_i^1), O(p_i^2)) \,\}$$
where S1 is the set of objects whose scales are larger than those of $p_i^1$ and $p_i^2$; S2 is the set of objects whose scales are between N1 and N2; and S3 is the set of objects whose scales are smaller than those of $p_i^1$ and $p_i^2$.
When N1 = N2, S1 is the fuzzy data set, and S2 and S3 are the exact data sets. When N1 ≠ N2, S1 and S2 are the fuzzy data sets, and S3 is the exact data set. The DGFQM aims to retrieve the missing exact data and fuzzy data by analyzing the relationships between different dimension granularities. Since the specific steps of this method depend on the dimension coding method, we introduce them in Section 3.
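The partition of a result set into S1, S2, and S3 can be sketched as follows. This is an illustration only: the scale ranking `SCALE_RANK` (a larger rank means a coarser scale) and the record layout `(name, scale)` are assumptions made for the example, not part of the method described above.

```python
# Assumed ranking of time scales: a larger rank means a coarser scale.
SCALE_RANK = {"second": 0, "minute": 1, "hour": 2, "day": 3, "month": 4, "year": 5}


def partition_by_scale(records, scale1, scale2):
    """Split (name, scale) records into S1, S2, S3 relative to the query scales."""
    lo, hi = sorted((SCALE_RANK[scale1], SCALE_RANK[scale2]))
    s1 = [r for r in records if SCALE_RANK[r[1]] > hi]         # coarser than both
    s2 = [r for r in records if lo <= SCALE_RANK[r[1]] <= hi]  # between the two
    s3 = [r for r in records if SCALE_RANK[r[1]] < lo]         # finer than both
    return s1, s2, s3


records = [("record 1", "hour"), ("record 2", "day"), ("record 3", "month")]
s1, s2, s3 = partition_by_scale(records, "day", "day")
print(s1)  # fuzzy data when N1 = N2
print(s2, s3)  # exact data when N1 = N2
```

With a day-level query interval, the month-level record falls into S1 and the hour-level record into S3, mirroring data records 3 and 1 in the example above.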
In practical application, the dimension granularity fuzzy query method must satisfy the following conditions:
Condition 1: The dimension has a multi-scale characteristic in the concrete application.
Condition 2: Inclusion relationships exist between adjacent levels.
Condition 1 means that a dimension domain can be represented by sets of points with different dimension granularities, or a point can be represented by multiple granularities. Condition 2 means that there is an inclusion relationship between adjacent levels, and a point on a certain scale includes all points on the next fine scale. As shown in Figure 2a, d1 is expressed as one index value at R1 granularity, two index values at R2 granularity, and three index values at R3 granularity, respectively. However, R2 and R3 do not satisfy condition 2. As shown in Figure 2b, d1 can be represented as one index value at R1 granularity, two index values at R2 granularity, and four index values at R3 granularity. Therefore, there is an inclusion relationship between adjacent granularities, which satisfies condition 2.
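Condition 2 can be checked mechanically if each granularity is described by the boundaries of its particles. The sketch below is one possible reading, assuming a dimension domain [0, 12] split into equal parts as in Figure 2; the boundary lists and function names are hypothetical.

```python
def is_refinement(coarse_bounds, fine_bounds):
    """True if every boundary of the coarse level is also a boundary of the fine level."""
    return set(coarse_bounds) <= set(fine_bounds)


def satisfies_condition_2(levels):
    """Check the inclusion relation between every pair of adjacent levels."""
    return all(is_refinement(a, b) for a, b in zip(levels, levels[1:]))


r1 = [0, 12]                    # one index value
r2 = [0, 6, 12]                 # two index values
r3_thirds = [0, 4, 8, 12]       # three index values, as in Figure 2a
r3_quarters = [0, 3, 6, 9, 12]  # four index values, as in Figure 2b

print(satisfies_condition_2([r1, r2, r3_thirds]))    # False
print(satisfies_condition_2([r1, r2, r3_quarters]))  # True
```

In the three-part split, the boundary at 6 is not a boundary of the finer level, so a point at R2 cannot be written as a set of whole points at R3.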

2.3. Dimension Coding Method

At present, dimensions are expressed in two ways: single-scale dimension coding and multi-scale dimension coding. Single-scale dimension coding represents multi-granularity dimensions on a fixed scale. Multi-scale dimension coding represents multi-granularity dimensions by coding at different scales. The existing coding methods are string coding and multi-scale integer coding. Multi-scale integer coding has been used for time segments (multi-scale time segment integer coding, MTSIC). For time, MTSIC has certain advantages over string coding. We extended it to multi-granularity dimensions, and the implementation method is as follows:
Assume the dimension is $dim(\alpha_1, \alpha_2, \ldots, \alpha_{n-1}, \alpha_n)$, where $\alpha_i$ is the i-th dimension component and n is the number of dimension components. Figure 3 shows the principle of multi-scale dimension integer coding. First, the components of the dimension are expressed in binary, and the single-scale dimension integer code is formed by bit operations. Then, the multi-scale dimension integer code is obtained based on the level information N. Since bands usually exist in the form of dimension segments, we used multi-scale integer coding to manage the bands and designed the association method between multi-scale integer coding and the band.

2.3.1. Multi-Scale Band Integer Coding

The band is encoded as an integer in two forms: single-scale band integer coding (SBIC) and multi-scale band integer coding (MBIC). The main idea of MBIC is to transform the band information into an SBIC, and then transform the SBIC into an MBIC using the level information. Assume that the band is $b(l_1, l_2, \ldots, l_{n-1}, l_n)$, where $l_1, l_2, \ldots, l_{n-1}, l_n$ are the different components of the band. An m-bit integer SC is used to represent a fixed-scale band (the integer types in computers are 32-bit and 64-bit). The SC is transformed into the integer code MC of different levels according to the level information.
Since the band spans from kilometers to picometers, a 64-bit integer is used to represent the single-scale band code. Let the band be $b(l_1, l_2, l_3, l_4, l_5, l_6, l_7, l_8)$, where the memory usage of the components of the band is as follows:
  • The range of l8 (pm) is 0–999, represented by a 10-bit binary number, where 1000–1023 is null;
  • The range of l7 (nm) is 0–999, represented by a 10-bit binary number, where 1000–1023 is null;
  • The range of l6 (μm) is 0–999, represented by a 10-bit binary number, where 1000–1023 is null;
  • The range of l5 (mm) is 0–9, represented by a 4-bit binary number, where 10–15 is null;
  • The range of l4 (cm) is 0–9, represented by a 4-bit binary number, where 10–15 is null;
  • The range of l3 (dm) is 0–9, represented by a 4-bit binary number, where 10–15 is null;
  • The range of l2 (m) is 0–999, represented by a 10-bit binary number, where 1000–1023 is null;
  • l1 (km) is represented by a 12-bit binary number.
For example, let 1 pm be the fixed scale. The SC is made up of l1 (12-bit), l2 (10-bit), l3 (4-bit), l4 (4-bit), l5 (4-bit), l6 (10-bit), l7 (10-bit), and l8 (10-bit) in memory. As shown in Figure 4, the band range is 0–4096 km, denoted by integers ranging from 0 to $2^{64}-1$. Since the ratios between the commonly used scales (km, m, dm, …, nm, pm) are not powers of 2, SC is not continuous.
Since SBIC already occupies almost all 64-bit integers, it is necessary to select some of these integers to represent bands at other scales. We chose 1 bit of the 64 bits to store the multi-scale band integer encoding. In this way, the single-scale band integers at the 1 pm scale changed from 0~$2^{64}-1$ to 0~$2^{63}-1$, so the range became 0~2048 km, and the remaining $2^{63}$ integers were used to store bands of other scales. The $2^{64}$ integers were divided into 64 levels according to the structure of a binary tree, effectively including the commonly used units of length (km, m, dm, …, nm, pm), where level 63 consisted of $2^{63}$ integers, level 62 consisted of $2^{62}$ integers, …, and level 0 consisted of 1 integer. The minimum scale level was 63, and adjacent scales differed by a factor of 2. The correspondence between levels and scales is shown in Table 1.
As shown in Table 1, 64 scales are represented by 64-bit integers, namely: 1 pm, 2 pm, …, 1 nm, 2 nm, …, 1 μm, 2 μm, …, 1 mm, 2 mm, …, 1 cm, 2 cm, …, 1 dm, 2 dm, …, 1 m, 2 m, …, 1 km, 2 km, …, 2048 km, with scales ranging from 1 pm to 2048 km. To include the common scale of the band, 1 nm is extended to 1024 pm, 1 μm to 1024 nm, 1 mm to 1024 μm, 1 cm to 16 mm, 1 dm to 16 cm, 1 m to 16 dm, 1 km to 1024 m. As shown in Figure 5, a 64-layer binary tree structure was obtained.
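The level of each common unit follows from the bit widths listed above. The sketch below assumes that a unit stored at bit offset k of the single-scale code corresponds to level 63 − k once one bit has been reserved; this relation is derived here from the binary-tree description, not copied from Table 1, and the names `FIELDS` and `unit_levels` are hypothetical.

```python
# (unit, bit width) from the finest to the coarsest component, as listed in the text.
FIELDS = [("pm", 10), ("nm", 10), ("um", 10), ("mm", 4),
          ("cm", 4), ("dm", 4), ("m", 10), ("km", 12)]


def unit_levels(fields=FIELDS):
    """Map each unit to the level whose scale equals one unit (assumed: 63 - bit offset)."""
    levels = {}
    offset = 0
    for unit, width in fields:
        levels[unit] = 63 - offset
        offset += width
    return levels


for unit, level in unit_levels().items():
    print(unit, level)
```

Under this assumption, 1 pm sits at level 63 and 1 km at level 11, and the intermediate values match the grade boundaries (33~43, 29~33, 25~29, 21~25, 11~21, 0~11) used in Section 2.3.3.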
The MBIC is obtained from the level N and $b(l_1, l_2, l_3, l_4, l_5, l_6, l_7, l_8)$, and the specific method is as follows:
1. Single-scale band integer coding calculation: SC is calculated according to Formula (7):
$$SC = (l_1 \ll 52) \oplus (l_2 \ll 42) \oplus (l_3 \ll 38) \oplus (l_4 \ll 34) \oplus (l_5 \ll 30) \oplus (l_6 \ll 20) \oplus (l_7 \ll 10) \oplus l_8 \tag{7}$$
2. Multi-scale band integer coding calculation: according to Formulas (8)–(10), the multi-scale band integer code MC is obtained by using the level N:
$$SC = SC \ll 1 \tag{8}$$
$$Deta_0 = 1 \ll (63 - N) \tag{9}$$
$$MC = ((SC \gg (64 - N)) \ll (64 - N)) + Deta_0 - 1 \tag{10}$$
where $\ll$ and $\gg$ are left and right bit shifts, $\oplus$ is bitwise XOR (written ^ in code), and $Deta_0 - 1$ is the smallest code in the N-th level.
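Formulas (7)–(10) translate almost directly into integer operations. The following is a minimal sketch in Python; the function names `sbic` and `mbic` and the 64-bit mask are choices made for this illustration. The values used below are the band components and levels of the worked example in Section 2.3.3.

```python
MASK64 = (1 << 64) - 1
# Bit offsets of l1 (km) ... l8 (pm) in the single-scale code, as in Formula (7).
SHIFTS = (52, 42, 38, 34, 30, 20, 10, 0)


def sbic(components):
    """Formula (7): pack the eight band components into one integer."""
    code = 0
    for value, shift in zip(components, SHIFTS):
        code ^= value << shift
    return code


def mbic(components, level):
    """Formulas (8)-(10): multi-scale code of the band at the given level."""
    sc = (sbic(components) << 1) & MASK64   # Formula (8), kept within 64 bits
    deta0 = 1 << (63 - level)               # Formula (9)
    return ((sc >> (64 - level)) << (64 - level)) + deta0 - 1  # Formula (10)


# 6 km 626 m 4 dm 5 cm 1 mm at level 33
print(mbic((6, 626, 4, 5, 1, 0, 0, 0), 33))
# 6 km 626 m 4 dm 5 cm 2 mm at level 32
print(mbic((6, 626, 4, 5, 2, 0, 0, 0), 32))
```

The first call yields 59,551,923,803,521,023 and the second 59,551,927,024,746,495, which agree with MC1 and MC2 in the worked example.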

2.3.2. MBIC Related Operations

Since MBIC represents band data by integers, the related operations in MBIC mainly involve the addition and subtraction of integers and bit operations. This section introduces the level calculation and relationship calculation method of MBIC in detail.

Level Calculation

The multi-scale band integer code is a 64-bit integer, so its level cannot be read directly from the integer and must be calculated. According to the parity of MC, the specific methods are as follows:
  • If MC is an even number, its level N is 63;
  • If MC is an odd number, first determine how many high-order bits the binary representations of MC − 1 and MC + 1 have in common, i.e., Mid = (MC − 1) ^ (MC + 1), where ^ is bitwise XOR. Secondly, the level is the number of consecutive zeros on the left side of the binary representation of Mid. Because MBIC is a 64-bit integer, the number of leading zeros can be found efficiently by binary search (bisection) over the 64 bits.
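The level calculation can be sketched as follows. Instead of a hand-written bisection, this illustration uses Python's built-in `int.bit_length()` to count the leading zeros of a 64-bit value; that substitution, and the function name `mbic_level`, are choices of this sketch.

```python
def mbic_level(mc):
    """Return the level N of a 64-bit multi-scale band integer code."""
    if mc % 2 == 0:
        return 63
    mid = (mc - 1) ^ (mc + 1)
    # Leading zeros of a 64-bit value = 64 - (position of highest set bit + 1).
    return 64 - mid.bit_length()


print(mbic_level(59551923803521023))  # code of a 1 mm cell
print(mbic_level(59551927024746495))  # code of a 2 mm cell
print(mbic_level(2**63 - 1))          # the single code of level 0
```

For the two codes of the worked example in Section 2.3.3, the function returns 33 and 32, the levels stated there.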

Level Relationship Calculation

The multi-scale band integer encoding has a containment relationship and a contained relationship. The child coding set can be obtained by using the containment relationship, and the parent coding set can be obtained by using the contained relationship.
1. Child coding set: Given a multi-scale band integer code MC whose level is N, the integer codes MC′ at the levels N′ ($N' > N$) that MC contains form the child coding set. Let the interval of the child coding set be [C1, C2], where C1 and C2 are calculated by Formulas (11) and (12):
$$C_1 = MC - (1 \ll (63 - N)) + 1 \tag{11}$$
$$C_2 = MC + (1 \ll (63 - N)) + 1 \tag{12}$$
2. Parent coding set: Let the level of MC be N and the parent encoding level be N′. The integer codes MC′ at the levels N′ ($N' < N$) that contain MC form the parent coding set. According to Formulas (13) and (14), the parent coding set of MC is obtained by looping the variable N′ from N − 1 to 0:
$$Deta_0 = 1 \ll (63 - N') \tag{13}$$
$$FMC = ((MC \gg (64 - N')) \ll (64 - N')) + Deta_0 - 1 \tag{14}$$
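The parent calculation of Formulas (13) and (14) can be sketched as below. The function names are hypothetical, and the ancestor test at the end is an addition of this sketch that follows from the same formulas: a code a contains a code b if lifting b to the level of a gives a.

```python
def parent_code(mc, parent_level):
    """Formulas (13)-(14): the code at parent_level that contains mc."""
    deta0 = 1 << (63 - parent_level)
    return ((mc >> (64 - parent_level)) << (64 - parent_level)) + deta0 - 1


def parent_codes(mc, level):
    """Parent coding set of a code at the given level, for N' = level-1 ... 0."""
    return [parent_code(mc, n) for n in range(level - 1, -1, -1)]


def contains(a, level_a, b, level_b):
    """True if code a (coarser or equal level) contains code b."""
    return level_a <= level_b and parent_code(b, level_a) == a


mc1 = 59551923803521023            # level-33 code (1 mm cell)
print(parent_code(mc1, 32))        # the 2 mm cell that contains it
print(len(parent_codes(mc1, 33)))  # one parent per coarser level
```

Every code has exactly one parent per coarser level, so a code at level N has N parents, ending with the single level-0 code.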

2.3.3. The Association Method between MBIC and Band

The bands often exist in the form of an interval, and establishing the association between band intervals and MBIC is crucial for data retrieval. Since MBIC is designed according to the binary tree rules based on common granularity units, the following rules are designed to establish the association between band and MBIC:
  • Rule 1: The maximum level Nmax of MBIC is not larger than the maximum level Nmax of the start and end point of the band.
  • Rule 2: First, the bands are padded with fine-grained to coarse-grained integer encoding, then the bands are padded with coarse-grained to fine-grained integer encoding until the band interval is filled. The specific filling method is shown in Figure 6, where L represents the band, and A, B, C, and D represent multi-scale integer coding at different levels.
The steps to associate the band with the multi-scale band integer coding are as follows:
1. Convert the start and end points of the band to the same granularity. Analyze the levels of the start point b1(li) and the end point b2(lj) of the band. If i ≠ j, use the conversion function to convert the coarse-grained point to the fine granularity. When the granularity of b1 is coarser than that of b2, the Ts conversion function is used; when the granularity of b1 is finer than that of b2, the Tl conversion function is used.
2. Gradually divide and determine the level scope. Assuming that both b1(li) and b2(lj) are data at the micrometer scale, i.e., i = j = 6, the levels are divided into 6 grades according to the components (33~43, 29~33, 25~29, 21~25, 11~21, 0~11). The minimum level Nmin and the maximum level Nmax of the MBIC are determined grade by grade. The maximum level is the maximum level of the grade, i.e., Nmax = 43 at the (33~43) grade. The minimum level calculation is divided into two cases:
  • Case 1: If $l_{i-1} = l_{j-1}$, calculate the band length l, i.e., $l = l_j - l_i + 1$, and convert l to a sum of powers of 2; the level corresponding to the largest addend is Nmin;
  • Case 2: If $l_{i-1} \ne l_{j-1}$, calculate the band length l, $l = max_j - l_i + 1$, where $max_j$ is the maximum value of the j-th component; for example, if j = 6, then $max_j$ = 1000. Then convert l to a sum of powers of 2; the level corresponding to the largest addend is Nmin.
3. Accurate filling step by step. According to Table 1, obtain the level N of the corresponding component for each grade. If N ≤ Nmin, convert the l of this grade to a sum of powers of 2, obtain the level corresponding to each addend, and finally calculate the multi-scale band integer codes according to the level information. If N > Nmin, execute the loop body until l = 0. Assuming that the scale corresponding to N is v, the loop body is as follows: l = l − v. If l > 0, multi-scale integer encoding is performed on the data of the current level, and N = N − 1, li = li + v. If l < 0, N = N + 1 and l = l + v. If l = 0, multi-scale integer encoding is performed on the data at the current level, and the loop is exited.
For example, the band range is (6 km 626 m 4 dm 5 cm 1 mm~6 km 626 m 4 dm 5 cm 4 mm).
  • Step 1: Calculate the corresponding levels of b1 and b2: N1 = 33, N2 = 33;
  • Step 2: According to the components of b1 and b2, the levels are divided into 5 grades (29~33, 25~29, 21~25, 11~21, 0~11). It is only necessary to calculate the band length l at the (29~33) grade: l = 4 mm, the level corresponding to 4 mm is Nmin = 27, and Nmax = 33;
  • Step 3: The level corresponding to l5 = 1 mm is N = 33, so N > Nmin, and the multi-scale integer codes are obtained: MC1 = 59,551,923,803,521,023 (N = 33), MC2 = 59,551,927,024,746,495 (N = 32), MC3 = 59,551,930,245,971,967 (N = 33).
As shown in Figure 7, the relationship between MBIC and band is many-to-many.
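The worked example can be reproduced with a short sketch. Note that the code below does not implement the loop body of step 3 literally. It swaps in a standard greedy cover with aligned power-of-two blocks (the same idea used to split an address range into prefixes), restricted to the millimeter component, with 1 mm assumed to sit at level 33. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
SHIFTS = (52, 42, 38, 34, 30, 20, 10, 0)  # l1 (km) ... l8 (pm)
MM_LEVEL = 33                             # assumed level of a 1 mm cell


def mbic(components, level):
    """Multi-scale band integer code, following Formulas (7)-(10)."""
    sc = 0
    for value, shift in zip(components, SHIFTS):
        sc ^= value << shift
    sc = (sc << 1) & MASK64
    return ((sc >> (64 - level)) << (64 - level)) + (1 << (63 - level)) - 1


def cover_mm(prefix, first_mm, last_mm):
    """Cover the inclusive range [first_mm, last_mm] with aligned 2^k mm blocks.

    prefix holds (km, m, dm, cm); returns a list of (code, level) pairs.
    """
    result = []
    start = first_mm
    while start <= last_mm:
        size = 1
        # Double the block while it stays aligned and inside the range.
        while start % (size * 2) == 0 and start + size * 2 - 1 <= last_mm:
            size *= 2
        level = MM_LEVEL - (size.bit_length() - 1)  # 1 mm -> 33, 2 mm -> 32, ...
        result.append((mbic(prefix + (start, 0, 0, 0), level), level))
        start += size
    return result


for code, level in cover_mm((6, 626, 4, 5), 1, 4):
    print(code, level)
```

For the band 1 mm~4 mm under 6 km 626 m 4 dm 5 cm, the sketch returns three codes at levels 33, 32, and 33, which coincide with MC1, MC2, and MC3 above.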

3. Results

To verify the effectiveness of the design method in this paper, we conducted related experiments on multi-granularity dimensions (time, band) that satisfy the fuzziness of dimension granularity. The verification content mainly includes the following three points: the effectiveness of the DGFQM, the relevant factors that affect the query efficiency of MTSIC and string coding, and the influence of the association method between MBIC and band on data retrieval. In response to the above contents, we designed the experiments as follows:
Experiment 1: To verify the effectiveness of the DGFQM, we simulated time data, and then compared the query results of the DGFQM and the intersection query method.
Experiment 2: We designed time data sets with different proportions using string encoding and MTSIC methods and compared the retrieval efficiency of the two ways.
Experiment 3: We used the string coding method and the association method between MBIC and band to build an index table for the simulated band data, respectively, and then compared the query efficiency of the two ways.
Experimental environment: Windows, Intel(R) Core(TM) i5-8500 CPU @ 3.00 GHz, 64-bit, 8 GB, Visual Studio 2019, C++, MySQL 5.7.19.

3.1. DGFQM

At present, data queries mostly use the intersection query method. We used string coding and MTSIC to store time data, respectively, and then compared the results of the DGFQM and the intersection query method. First, we randomly generated n time values at different scales (year, month, day, hour, minute, second, millisecond, microsecond) and applied string coding and multi-scale integer coding to them. Finally, we built a B-tree for the intersection query method and the DGFQM.

3.1.1. The DGFQM Based on String Coding

Dimension granularity fuzziness query steps based on string coding:
  • Perform string encoding on the query interval [t1, t2] to obtain the string interval [s1, s2];
  • Decode the strings s1 and s2 to obtain their levels N1 and N2;
  • Parse the string s1, and then obtain the parent data set Cf1 of s1 by coding;
  • Parse s2, and then obtain the child set Cs2 of s2 through string coding;
  • Obtain query results through set operations and query statements;
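The steps above can be sketched in Python on ISO-like time strings. This is a minimal illustration under stated assumptions, not the authors' implementation: a sorted list with `bisect` stands in for the B-tree, granularity is assumed to be expressed by truncating `YYYY-MM-DDTHH:MM:SS.ffffff` at fixed prefix lengths, and all function names are hypothetical. The sample values are taken from Table 2, plus two made-up codes that lie outside the query interval.

```python
from bisect import bisect_left, bisect_right

# Assumed prefix lengths for year, month, day, hour, minute, second,
# millisecond and microsecond in 'YYYY-MM-DDTHH:MM:SS.ffffff'.
PREFIX_LENGTHS = (4, 7, 10, 13, 16, 19, 23, 26)


def intersection_query(sorted_codes, s1, s2):
    """Plain range scan: every code x with s1 <= x <= s2 in string order."""
    lo = bisect_left(sorted_codes, s1)
    hi = bisect_right(sorted_codes, s2)
    return set(sorted_codes[lo:hi])


def parents(code):
    """Coarser codes containing `code`, e.g. '2014' for '2014-11-15'."""
    return {code[:n] for n in PREFIX_LENGTHS if n < len(code)}


def children(sorted_codes, code):
    """Stored finer codes inside `code`; they extend it as a prefix and
    therefore sit directly after it in sorted order."""
    found = set()
    for candidate in sorted_codes[bisect_right(sorted_codes, code):]:
        if not candidate.startswith(code):
            break
        found.add(candidate)
    return found


def fuzzy_query(sorted_codes, s1, s2):
    """Range scan plus stored parents of s1 and stored children of s2."""
    result = intersection_query(sorted_codes, s1, s2)
    result |= parents(s1) & set(sorted_codes)
    result |= children(sorted_codes, s2)
    return result


codes = sorted([
    "2013-05", "2014", "2014-11", "2014-11-15",
    "2014-11-15T00:08:08.216495", "2014-11-15T01:25",
    "2015-02-15T00:10:09.460989", "2015-02-15T00:21:15.373",
    "2015-03-01",
])
plain = intersection_query(codes, "2014-11-15", "2015-02-15")
fuzzy = fuzzy_query(codes, "2014-11-15", "2015-02-15")
missing = fuzzy - plain
```

In this toy data set, `missing` holds the coarser codes `'2014'` and `'2014-11'` and the two finer codes on 15 February 2015, the same kinds of entries that Table 2 lists as missing from the intersection query.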
Set n to be 10,000, 100,000, 500,000, 1,000,000, 5,000,000, 10,000,000, and select various query intervals to perform the intersection query and the DGFQM, respectively. The query intervals are the annual scale, the daily scale, and the second scale. The query results are shown in Figure 8. The intersection query method does not take into account the dimension granularity fuzziness, but only relies on the size sorting function of the code to obtain the data. Therefore, the number of results obtained by DGFQM is higher than that of the intersecting query method. From Figure 8, it can be seen that the amount of missing data in the intersection query is affected by the amount of data and the query interval. The amount of missing data is proportional to the query interval and the total amount of data.
To verify the correctness of the query results under dimension granularity fuzziness, we took the query interval (15 November 2014, 15 February 2015) as an example and compared the query results of both methods on the 1 million data set. The DGFQM returned 5727 results, of which 5564 are distinct. The intersection query method returned 5564 results, of which 5408 are unique. As shown in Table 2, the query results of the DGFQM are more complete than those of the intersection query.

3.1.2. The DGFQM Based on MTSIC

The DGFQM steps based on MTSIC:
  • According to the multi-scale time segment integer encoding method, obtain the integer codes MTC1 and MTC2 of t1 and t2, so that the integer coding interval is Cb = [MTC1, MTC2];
  • Calculate the levels N1 and N2 of MTC1 and MTC2 through level operations;
  • Obtain the parent data sets Cf1 and Cf2 through the containment relationship operation, and obtain the missing fuzzy data set C1 according to Formula (15):
    $C_1 = \{\, x \mid (x \in C_{f1} \lor x \in C_{f2}) \land x \notin C_b \,\}$  (15)
  • Obtain the child data sets Cs1 and Cs2 of MTC1 and MTC2 through the containment relationship operation, and obtain the missing precise data set C2 according to Formula (16):
    $C_2 = \{\, x \mid (x \in C_{s1} \lor x \in C_{s2}) \land x \notin C_b \,\}$  (16)
  • Obtain query results through set operations and query statements;
Set n to be 10,000, 100,000, 500,000, 1,000,000, 5,000,000, 10,000,000, and select various query intervals to perform the intersection query and the DGFQM respectively. The query intervals are the annual scale, the daily scale, and the second scale. The query results were consistent with the query result based on string coding, as shown in Figure 8.

3.2. The Influence of the Proportion of Different Time Scales on Retrieval Efficiency

MTSIC uses an integer type to store time data, which occupies less memory and is more computationally efficient than a string type. Therefore, the proportion of different scales in the time data may have an impact on the query efficiency. We designed different temporal data sets to compare the query efficiency of temporal string encoding and MTSIC using DGFQM. The experimental design process was as follows:
  • Randomly generate n time data (year, month, day, hour, minute, second, millisecond, microsecond) according to equal and unequal proportions. The unequal proportions follow the ratio 1: 2: 4: 8: 16: 32: 64: 128. Assigning this ratio to the eight scales in every possible order would give 8! combinations, so we instead divided the scales into fine scales (hour, minute, second, millisecond, microsecond) and coarse scales (year, month, day) and weighted the ratio toward one group or the other. The specific design is shown in Table 3.
  • Establish a B-tree index. Perform string encoding and MTSIC on time data, and then build B-trees, respectively.
  • Dimension granularity fuzzy query. According to Section 3.1, we performed the DGFQM on string coding and MTSIC, respectively, and counted the results.
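The data generation step can be sketched as follows. This is a hedged illustration rather than the authors' generator: the weights come from Table 3, while the function name `generate_times`, the date range, the seed handling, and the choice to express granularity by truncating an ISO timestamp are assumptions made for this example.

```python
import random
from datetime import datetime, timedelta

# Assumed prefix length of 'YYYY-MM-DDTHH:MM:SS.ffffff' for each scale.
PREFIX_LEN = {
    "year": 4, "month": 7, "day": 10, "hour": 13,
    "minute": 16, "second": 19, "millisecond": 23, "microsecond": 26,
}

# Proportion designs from Table 3, ordered from year to microsecond.
DESIGNS = {
    "dbl": [1, 1, 1, 1, 1, 1, 1, 1],
    "bdbl": [1, 2, 4, 8, 16, 32, 64, 128],
    "fdbl": [128, 64, 32, 16, 8, 4, 2, 1],
}


def generate_times(n, design, seed=0):
    """Return n time strings whose scales follow the chosen design."""
    rng = random.Random(seed)
    scales = list(PREFIX_LEN)
    picks = rng.choices(scales, weights=DESIGNS[design], k=n)
    base = datetime(2010, 1, 1)          # hypothetical start of the range
    span_seconds = 10 * 365 * 24 * 3600  # roughly ten years
    times = []
    for scale in picks:
        moment = base + timedelta(
            seconds=rng.randrange(span_seconds),
            microseconds=rng.randrange(1_000_000),
        )
        full = moment.isoformat(timespec="microseconds")
        times.append(full[:PREFIX_LEN[scale]])  # truncate to the scale
    return times


fine_heavy = generate_times(2000, "bdbl", seed=1)
coarse_heavy = generate_times(2000, "fdbl", seed=1)
```

With the `bdbl` weights about 97% of the generated values fall on the fine scales (hour and finer), and with `fdbl` about 88% fall on the coarse scales, which follows directly from the ratios in Table 3.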
Set n to 10,000, 100,000, 1,000,000, 5,000,000, and 10,000,000, and select the query ranges “2014 to 2015” and “15 November 2014 to 15 February 2015”. Each query was run ten times and the query time was recorded. The results are shown in Figure 9.
The red marks in Figure 9a–e are all lower than the blue marks, so the query time of MTSIC is less than that of string coding. At a data volume of 10,000, string coding took 1.2 times (dbl1), 1.5 times (bdbl1), 1.1 times (dbl2), and 1.2 times (bdbl2) as long as MTSIC, while for fdbl the two methods took roughly the same time. At a data volume of 10 million, string coding took 1.2 times (fdbl1), 1.7 times (dbl1), 2.1 times (bdbl1), 1.1 times (fdbl2), 1.5 times (dbl2), and 2.1 times (bdbl2) as long as MTSIC. We can therefore draw the following conclusions: for the same proportion design, the advantage of MTSIC over string coding grows as the total amount of data increases or the query range expands; for the same amount of data, the advantage of MTSIC grows as the share of fine-scale data increases.

3.3. Comparing the Retrieval Efficiency of MBIC and String Encoding

Randomly generate n bands and manage them in two ways. One was the string coding method, which was stored and indexed through two fields of string type. The other was to use the association method between MBIC and band to store and index. Table 4 is a comparison of the expressions of the two codes. Let the band be [b1, b2], and retrieve data according to the DGFQM. The steps for the DGFQM of bands were as follows:
The steps of the DGFQM based on string coding:
  • Perform string coding on the query interval [b1, b2] to obtain the string interval [s1, s2];
  • Obtain the exact data set Cx in the query interval. Let the storage fields be field1 and field2, respectively, and obtain the exact data set Cx according to Formula (17):
    $C_x = \{\, \mathit{field1} \le s_1 \le \mathit{field2} \;\lor\; \mathit{field1} \le s_2 \le \mathit{field2} \,\}$  (17)
  • Obtain the fuzzy data set Cm in the query interval according to Formula (18):
    $C_m = \{\, s_1 \le \mathit{field1} \;\land\; s_2 \ge \mathit{field2} \,\}$  (18)
  • Obtain query results through the set union operation;
The steps of the DGFQM based on MBIC:
  • According to the association method between MBIC and band, obtain the corresponding MBIC set B = {MC1, MC2, ..., MCn};
  • Obtain the exact data set Cx in the query interval. Obtain the child interval xi of the i-th code in B, i.e., B(i), through the containment relationship operation, and repeat until all codes in B have been traversed. The specific process is shown in Figure 10a;
  • Obtain the fuzzy data set Cm in the query interval. Obtain the parent interval mi of the i-th code in B, i.e., B(i), through the containment relationship operation, and repeat until all codes in B have been traversed. The specific process is shown in Figure 10b;
  • Obtain query results through set operations;
Set n to 500,000, 1,000,000, 5,000,000, and 10,000,000, and run multiple queries. As examples, we chose query intervals at four different scales, expressed in both the string coding method and the multi-scale integer coding method; the specific design is shown in Table 5. We then queried according to the DGFQM under each coding. Finally, each query was run ten times and the query time was recorded.
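The string codes in Table 5 appear to split a band value into kilometer, meter, decimeter, centimeter, and millimeter components and to truncate the code at the scale of the value. The sketch below reproduces that layout under this assumption. The function name `band_string` and the zero-padding widths (two digits for km, three for m) are inferred from the Table 5 examples and are not stated in the text.

```python
UNITS = ["km", "m", "dm", "cm", "mm"]
MM_PER_UNIT = {"km": 1_000_000, "m": 1_000, "dm": 100, "cm": 10, "mm": 1}


def band_string(value, unit):
    """Format an integer band value given in `unit` as a string code
    truncated at that unit, e.g. (4003612, 'mm') -> '04-003-6-1-2'."""
    total_mm = value * MM_PER_UNIT[unit]
    km, rest = divmod(total_mm, 1_000_000)
    m, rest = divmod(rest, 1_000)
    dm, rest = divmod(rest, 100)
    cm, mm = divmod(rest, 10)
    # Padding widths are assumptions read off Table 5.
    parts = [f"{km:02d}", f"{m:03d}", str(dm), str(cm), str(mm)]
    depth = UNITS.index(unit) + 1
    return "-".join(parts[:depth])


query1 = (band_string(4_003_612, "mm"), band_string(4_003_619, "mm"))
query4 = (band_string(2004, "m"), band_string(2060, "m"))
```

Fixed-width padding of the kilometer and meter components keeps the string order of codes at the same scale consistent with their numeric order, which a range comparison on string fields relies on.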
The statistical results are shown in Figure 11. The association method between MBIC and band proposed in this paper performs better than the traditional string representation. The query time of both methods increases with the amount of data. For the same amount of data, the query time of the proposed method gradually increases as the band range expands; as Figure 11 shows, the time taken by queries 1–3 is close to zero. The string coding method, however, has to traverse all the data to retrieve a band range, which takes a long time, so the size of the query band range has little effect on its query time.

3.4. Discussion

To address the problem of multi-granularity dimensions in spatiotemporal data, we proposed a management method of multi-granularity dimensions for spatiotemporal data. We mainly studied the fuzziness and the organization methods of multi-granularity dimensions. First, according to the inclusion relationship between granularities, we proposed the DGFQM, which solves the problem of data loss caused by the multi-granularity characteristic of dimensions. Second, we discussed the encoding method of bands and designed the association method between multi-scale integer coding and bands. The experiments were carried out by simulating time and band data. The experimental results are as follows:
(1) Whether the string coding method or MTSIC, the DGFQM can obtain more complete data than the intersection query method;
(2) Although the query efficiency of MTSIC is higher than that of the string coding method, its effect is affected by the proportion of different scales in the data. With the increase in the amount of fine-scale data, the query effect of multi-scale time integer coding is better;
(3) Compared with the string coding method, the association method between MBIC and band designed in this paper effectively improves the data retrieval efficiency. The retrieval efficiency of this method is related to the range of the query band, and the query effect is better as the range of the band decreases. Especially when the band range is small, the query time is about 0.

4. Conclusions

4.1. DGFQM

Few studies have discussed the fuzziness caused by the multi-granularity of dimensions. Although a cross-scale space-filling curve was proposed in reference [16] to provide a query method for multi-scale spatial data, it did not address the theory and methods of granularity fuzziness for other dimensions such as time. In this paper, we discussed the fuzziness of multi-granularity dimensions from the point and segment perspectives and proposed the DGFQM. To verify the effectiveness of the DGFQM, we simulated temporal data and compared the query results of the intersection query method [25] and the DGFQM.

4.2. Multi-Scale Integer Coding

At present, multi-scale integer coding has achieved good results in time and space. However, there were few studies on other multi-granularity dimensions. The concept of time-spectrum was proposed in reference [34], which put our focus on spectral information. We extended multi-scale integer coding to multi-scale dimension and took the band as an example to describe the application of multi-scale integer coding in a band in detail. We used the scale information contained in multiscale integer coding to design the correlation method between multiscale integer coding and band. The band was converted into a one-dimensional array by filling. The experiment showed that the association method proposed in this paper improved the efficiency of data retrieval compared with the traditional binary form.
We studied the multi-granularity dimensions of spatiotemporal data from the two aspects above. The results were generally good, but some limitations and open problems remain.
(1) This method aims to solve the problem of incomplete query results for time and other multi-scale dimensions, which requires the queried data to cover as many areas as possible. In addition, the method uses multi-scale integers to fill multi-scale dimensions. A complex extent such as one year, three months, one day, and five hours has to be filled with many multi-scale integer codes, which affects the efficiency of data retrieval.
(2) We analyzed the fuzziness of spatiotemporal data at the level of multi-scale dimensions, which provides a new perspective for the study of spatiotemporal data fuzziness. Through the DGFQM we obtained fuzzy data with hidden value, which helps to better understand and analyze trends of change in fields such as economy and culture. Next, we will further study the query results, analyze the potential information in the fuzzy data, and build the corresponding knowledge map.
(3) We applied multi-scale integer coding to the band and discussed its applicability. Multi-scale integer coding has certain advantages in memory occupation and query efficiency. At present, it has been applied to time, space, and band separately. Next, we will consider building a space-time and spatiotemporal-spectrum coding based on multi-scale integer coding.

Author Contributions

Conceptualization, Wen Cao and Wenhao Liu; methodology, Wen Cao, Wenhao Liu and Xiaochong Tong; software, Wenhao Liu; validation, Wenhao Liu, Jianfei Wang, Feilin Peng, Yuzhen Tian, Jingwen Zhu; formal analysis, Wenhao Liu, Feilin Peng, Yuzhen Tian and Jingwen Zhu; investigation, Wenhao Liu; resources, Wenhao Liu; data curation, Wenhao Liu; writing—original draft preparation, Wen Cao and Wenhao Liu; writing—review and editing, Wen Cao, Wenhao Liu and Jianfei Wang; visualization, Wenhao Liu; supervision, Wen Cao; project administration, Wen Cao and Wenhao Liu; funding acquisition, Wen Cao. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by The Excellent Youth Foundation of Henan Municipal Natural Science Foundation (212300410096), Program of Song Shan Laboratory (Included in the Management of Major Science and Technology Program of Henan Province) under Grant number 221100211000-03, and The National Key R&D Plan of China (2018YFB0505304).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Huang, Y.; Chen, Z.-x.; Yu, T.; Huang, X.-z.; Gu, X.-f. Agricultural remote sensing big data: Management and applications. J. Integr. Agric. 2018, 17, 1915–1931.
  2. Saralioglu, E.; Gungor, O. Crowdsourcing in Remote Sensing: A Review of Applications and Future Directions. IEEE Geosci. Remote Sens. Mag. 2020, 8, 89–110.
  3. Lynch, C. Big data: How do your data grow? Nature 2008, 455, 28–29.
  4. Spitzbart, B.D.; Lynch, H.J.; Turilli, M.; Jha, S. ICEBERG: Imagery Cyber-infrastructure and Extensible Building blocks to Enhance Research in the Geosciences. (A Research Programmer’s Perspective). In Practice and Experience in Advanced Research Computing; ACM: New York, NY, USA, 2020.
  5. Hidalgo, C. Why Information Grows; Penguin UK: London, UK, 2015.
  6. Balsa-Barreiro, J.; Menendez, M.; Morales, A.J. Scale, context, and heterogeneity: The complexity of the social space. Sci. Rep. 2022, 12, 9037.
  7. Alessandretti, L.; Aslak, U.; Lehmann, S. The scales of human mobility. Nature 2020, 587, 402–407.
  8. Ren, H.; Zhao, L.; Zhang, A.; Song, L.; Liao, Y.; Lu, W.; Cui, C. Early forecasting of the potential risk zones of COVID-19 in China’s megacities. Sci. Total Environ. 2020, 729, 138995.
  9. Sugg, M.M.; Spaulding, T.J.; Lane, S.J.; Runkle, J.D.; Harden, S.R.; Hege, A.; Iyer, L.S. Mapping community-level determinants of COVID-19 transmission in nursing homes: A multi-scale approach. Sci. Total Environ. 2021, 752, 141946.
  10. Ben, J.; Li, Y.; Zhou, C.; Wang, R.; Du, L. Algebraic encoding scheme for aperture 3 hexagonal discrete global grid system. Sci. China Earth Sci. 2018, 61, 215–227.
  11. Li, Q.; Chen, X.; Tong, X.; Zhang, X.; Cheng, C. An Information Fusion Model between GeoSOT Grid and Global Hexagonal Equal Area Grid. ISPRS Int. J. Geo-Inf. 2022, 11, 265.
  12. Guo, N.; Xiong, W.; Wu, Y.; Chen, L.; Jing, N. A Geographic Meshing and Coding Method Based on Adaptive Hilbert-Geohash. IEEE Access 2019, 7, 39815–39825.
  13. Cao, B.; Feng, H.; Liang, J.; Li, X. Hilbert Curve and Cassandra Based Indexing and Storing Approach for Large-Scale Spatiotemporal Data. Geomat. Inf. Sci. Wuhan Univ. 2021, 46, 620–629.
  14. Zhai, W.; Chen, B.; Tong, X.; Cheng, C. Research on Continuity of Multi-Scale Space-Filling Curves. Acta Sci. Nat. Univ. Pekin. 2018, 54, 331–335.
  15. Huang, K.; Li, G.; Wang, J. Rapid retrieval strategy for massive remote sensing metadata based on GeoHash coding. Remote Sens. Lett. 2019, 10, 111–119.
  16. Lei, Y.; Tong, X.; Zhang, Y.; Qiu, C.; Wu, X.; Lai, G.; Li, H.; Guo, C.; Zhang, Y. Global multi-scale grid integer coding and spatial indexing: A novel approach for big earth observation data. ISPRS J. Photogramm. 2020, 163, 202–213.
  17. Fairbanks, K.D. An analysis of Ext4 for digital forensics. Digit. Invest. 2012, 9, S118–S130.
  18. Brumm, B. Beginning Oracle SQL for Oracle Database 18c: From Novice to Professional; Apress: New York, NY, USA, 2019.
  19. Zhu, L.; Su, X.; Tai, X. A High-Dimensional Indexing Model for Multi-Source Remote Sensing Big Data. Remote Sens. 2021, 13, 1314.
  20. Wu, H.; Cheng, H.; Zheng, J.; Qi, K.; Yang, H.; Li, X. RS-ODMS: An Online Distributed Management and Service Framework for Remote Sensing Data. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 11.
  21. Xu, C.; Du, X.; Yan, Z.; Fan, X. ScienceEarth: A Big Data Platform for Remote Sensing Data Processing. Remote Sens. 2020, 12, 607.
  22. Isomura, A.; Iida, Y.; Naito, I.; Nakamura, T. Axispot: A Distributed Spatiotemporal Data Management System for Digital Twins of Moving Objects. IEEE Softw. 2022, 39, 33–38.
  23. Akakba, A.; Filali, A. Object-Relational Modelling and Establishment of a Generic Database for the Management and Monitoring of Urban Planning Permissions in the City of El-Eulma (Algeria). J. Settl. Spat. Plan. 2017, 8, 139–146.
  24. Zheng, Y.; Liu, J.; Li, J.; Xu, Y.; Pei, Y. Design of Fine Management System for Civil Aviation Airspace Resources Based on Spatiotemporal Grid Model. In Proceedings of the 2019 IEEE 1st International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Kunming, China, 17–19 October 2019.
  25. Tong, X.; Wang, R.; Wang, L.; Lai, G.; Ding, L. An Efficient Integer Coding and Computing Method for Multiscale Time Segment. Acta Geod. Et Cartogr. Sin. 2016, 45, 66–76.
  26. Zadeh, L.A. Fuzzy sets versus probability. Proc. IEEE 1980, 68, 421.
  27. Deng, L.; Liang, Z.; Zhang, Y. A Fuzzy Temporal Model and Query Language for FTER Databases. In Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications, Kaohsiung, Taiwan, 26–28 November 2008; Volume 3, pp. 77–82.
  28. Ďuračiová, R.; Faixová Chalachanová, J. Fuzzy Spatio-Temporal Querying the PostgreSQL/PostGIS Database for Multiple Criteria Decision Making. In Dynamics in GIscience; Springer: Cham, Switzerland, 2018; pp. 81–97.
  29. Liu, Y.; Wu, H.; Wang, S.; Chen, X.; Kimball, J.S.; Zhang, C.; Gao, H.; Guo, P. Evaluation of trophic state for inland waters through combining Forel-Ule Index and inherent optical properties. Sci. Total Environ. 2022, 820, 153316.
  30. Duan, M.; Duan, L. High Spatial Resolution Remote Sensing Data Classification Method Based on Spectrum Sharing. Sci. Program. 2021, 2021, 4356957.
  31. Fan, L.; Li, T.; Yuan, Y.; Katabi, D. In-Home Daily-Life Captioning Using Radio Signals. arXiv 2020, arXiv:2008.10966.
  32. Fan, L.; Li, T.; Fang, R.; Hristov, R.; Yuan, Y.; Katabi, D. Learning Longterm Representations for Person Re-Identification Using Radio Signals. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
  33. Yonemoto, N.; Kohmura, A.; Futatsumori, S.; Morioka, K.; Makita, Y. Passive Radio Imaging of Hybrid Radar System for Security Inspections. In Proceedings of the 2020 17th European Radar Conference (EuRAD), Utrecht, The Netherlands, 10–15 January 2021; pp. 378–381.
  34. Zhang, L.; Wang, S.; Liu, H.; Lin, Y.; Zhu, M.; Gao, L.; Tong, Q. From Spectrum to Temporal Spectrum—Research on Change Detection of Remote Sensing Time Series. Geomat. Inf. Sci. Wuhan Univ. 2021, 46, 18.
Figure 1. Different cases of building the index from coarse to fine-grained. (a) the index value covers the middle position of the interval. (b) the index value covers the middle position of the interval. (c) the coverage of the index value starts at the starting position d1. (d) the coverage of the index value ends at the endpoint d2.
Figure 2. Different representations of points on dimension domains. (a) d1 is represented as one index value at R1 granularity, two index values at R2 granularity, and three index values at R3 granularity, respectively. (b) d1 is represented as one index value at R1 granularity, two index values at R2 granularity, and four index values at R3 granularity.
Figure 3. The principle of multi-scale dimension integer coding.
Figure 4. Integer coding at 1 pm scale.
Figure 5. Multi-scale band integer coding.
Figure 6. An example of padding from fine-grained to coarse-grained.
Figure 7. Correlation between MBIC and bands.
Figure 8. The number of query results for both methods. (a) the number of query results for the annual scale query interval. (b) the number of query results for the daily scale query interval. (c) the number of query results for the second scale query interval.
Figure 9. The query time of the two coding methods under different time data (1: “2014–2015”; 2: “15 November 2014–15 February 2015”).
Figure 10. Data acquisition process.
Figure 11. The query time of the two coding methods under different band data.
Table 1. Corresponding levels of different scales.
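| Level | Scale | Level | Scale | Level | Scale | Level | Scale |
|---|---|---|---|---|---|---|---|
| 63 | 1 pm | 47 | 64 | 31 | 4 | 15 | 64 |
| 62 | 2 | 46 | 128 | 30 | 8 | 14 | 128 |
| 61 | 4 | 45 | 256 | 29 | 1 cm | 13 | 256 |
| 60 | 8 | 44 | 512 | 28 | 2 | 12 | 512 |
| 59 | 16 | 43 | 1 μm | 27 | 4 | 11 | 1 km |
| 58 | 32 | 42 | 2 | 26 | 8 | 10 | 2 |
| 57 | 64 | 41 | 4 | 25 | 1 dm | 9 | 4 |
| 56 | 128 | 40 | 8 | 24 | 2 | 8 | 8 |
| 55 | 256 | 39 | 16 | 23 | 4 | 7 | 16 |
| 54 | 512 | 38 | 32 | 22 | 8 | 6 | 32 |
| 53 | 1 nm | 37 | 64 | 21 | 1 m | 5 | 64 |
| 52 | 2 | 36 | 128 | 20 | 2 | 4 | 128 |
| 51 | 4 | 35 | 256 | 19 | 4 | 3 | 256 |
| 50 | 8 | 34 | 512 | 18 | 8 | 2 | 512 |
| 49 | 16 | 33 | 1 mm | 17 | 16 | 1 | 1024 |
| 48 | 32 | 32 | 2 | 16 | 32 | 0 | 2048 |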
Table 2. Two query results based on the string encoding.
| Partial Results of Granular Fuzzy Queries | Partial Results of an Intersect Query | Partially Missing Data for Intersecting Queries |
|---|---|---|
| ‘2014’ | ‘2014-11-15’ | ‘2014’ |
| ‘2014-11’ | ‘2014-11-15T00:08:08.216495’ | ‘2014-11’ |
| ‘2014-11-15’ | ‘2014-11-15T01:25’ | ‘2015-02-15T00:10:09.460989’ |
| ‘2014-11-15T00:08:08.216495’ | ‘2014-11-15T01:59:09.074094’ | ‘2015-02-15T00:21:15.373’ |
| ‘2014-11-15T01:25’ | ‘2014-11-15T03:08:31.252138’ | |
| ‘2014-11-15T01:59:09.074094’ | | |
| ‘2014-11-15T03:08:31.252138’ | | |
| ‘2015-02-15T00:10:09.460989’ | | |
| ‘2015-02-15T00:21:15.373’ | | |
Table 3. Proportion designs in the temporal data set.
| Proportional Way | Representation Symbols | Proportional Design |
|---|---|---|
| y: m: d: h: m: s: ms: μs | dbl (equal proportion) | 1: 1: 1: 1: 1: 1: 1: 1 |
| | bdbl (unequal proportion) | 1: 2: 4: 8: 16: 32: 64: 128 |
| | fdbl (unequal proportion) | 128: 64: 32: 16: 8: 4: 2: 1 |
Table 4. Comparison of two coding methods.
| Storage Method | Method Description | Example |
|---|---|---|
| string | Use two fields to store bands | “6-626-4-5-1”–“6-626-4-5-4” |
| MBIC | Store bands with a column of integers | The multi-scale integer encoding of “6-626-4-5-1”–“6-626-4-5-4” is: 59,551,923,803,521,023; 59,551,927,024,746,495; 59,551,930,245,971,967 |
Table 5. Corresponding codes for different queries.
| Query | Interval | MBIC | String Coding |
|---|---|---|---|
| query1 | 4,003,612~4,003,619 mm | 36,058,524,635,103,231; 36,058,531,077,554,175; 36,058,537,520,005,119 | “04-003-6-1-2”–“04-003-6-1-9” |
| query2 | 400,362~400,367 cm | 36,058,586,912,129,023; 36,058,689,991,344,127 | “04-003-6-2”–“04-003-6-7” |
| query3 | 40,032~40,039 dm | 36,056,834,565,472,255; 36,058,483,832,913,919; 36,060,133,100,355,583 | “04-003-2”–“04-003-9” |
| query4 | 2004~2060 m | 18,067,175,067,615,231; 18,119,951,625,748,479; 18,225,504,742,014,975; 18,366,242,230,370,303; 18,471,795,346,636,799; 18,524,571,904,770,047; 18,546,562,137,325,567 | “02-004”–“02-060” |
| query5 | 4003~4230 m | 36,059,583,344,541,695; 36,072,777,484,075,007; 36,081,573,577,097,215; 36,134,350,135,230,463; 36,239,903,251,496,959; 36,451,009,484,029,951; 36,873,221,949,095,935; 37,436,171,902,517,247; 37,858,384,367,583,231; 38,016,714,041,982,975; 38,056,296,460,582,911 | “04-003”–“04-230” |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cao, W.; Liu, W.; Tong, X.; Wang, J.; Peng, F.; Tian, Y.; Zhu, J. A Management Method of Multi-Granularity Dimensions for Spatiotemporal Data. ISPRS Int. J. Geo-Inf. 2023, 12, 148. https://doi.org/10.3390/ijgi12040148
