Article

Forensic Analysis of wxSQLite3-Encrypted Databases and Its Application

1
Department of Financial Information Security, Kookmin University, 77 Jeongneung-ro, Seongbuk-gu, Seoul 02707, Republic of Korea
2
Department of Information Security, Cryptology, and Mathematics, Kookmin University, 77 Jeongneung-ro, Seongbuk-gu, Seoul 02707, Republic of Korea
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2024, 13(7), 1325; https://doi.org/10.3390/electronics13071325
Submission received: 22 February 2024 / Revised: 17 March 2024 / Accepted: 28 March 2024 / Published: 1 April 2024
(This article belongs to the Section Computer Science & Engineering)

Abstract

This study focuses on digital forensic investigations of the databases used in an instant messenger application. Instant messengers store and manage user data in databases, which can be encrypted for privacy protection. We propose a method to identify and decrypt an SQLite version 3.40.0 database encrypted using wxSQLite3 version 4.9.1, and we examine the LINE instant messenger application to validate the proposed method. As a result, we successfully acquired the wxSQLite3 passphrase, which was used to decrypt the database of the LINE messenger application. We also performed artifact analysis to enumerate the data from a digital forensics perspective. To the best of our knowledge, this study is the first to propose a method to identify and decrypt wxSQLite3-encrypted databases and to demonstrate its application.

1. Introduction

In recent years, the forensic investigation of data from instant messenger applications has gained increasing attention in the digital forensics field. Typically, instant messengers store user data in databases to facilitate data management, and these databases are encrypted for privacy. SQLite [1] is a commonly used library for managing databases, including smartphone application data. Depending on the messenger application, the data in the database may be encrypted using various cryptographic algorithms, e.g., Advanced Encryption Standard (AES), which is known to be secure. Alternatively, the entire database can be encrypted. When encrypting a database, instant messengers can simply encrypt the files themselves or utilize an exclusive module for encryption. SQLite recommends the use of journal mode to preserve user data in an abnormal operating system (OS) shutdown environment. When using an exclusive module for SQLite encryption, journal files created in journal mode are also protected. Among the exclusive modules for SQLite encryption, there are three representative modules, i.e., SQLCipher, which is an open-source extension program [2], SQLite encryption extension (SEE), which is a commercial library [3], and wxSQLite3, which is an open-source library for SQLite version 3 database encryption [4].
Encrypting a database that stores personal user information is critical in terms of protecting user privacy. However, from a digital forensic perspective, encrypted databases occasionally make it difficult for investigators to gather user information. To solve this problem, various studies have been conducted on instant messengers. As there are many instant messengers using SQLCipher, there are various studies related to this [5,6,7,8,9,10,11]. These studies have identified elements for decryption and demonstrated how databases can be decrypted by reverse engineering. While reverse engineering enables efficient analysis of the operation process, it is time-consuming. As various messengers are developed and updated, more efficient analysis is also required. To make the reverse engineering process less time-consuming and laborious, we first identify the encryption method applied to a database by looking at the encrypted database. Then, the analysis becomes efficient by simply using reverse engineering to verify the encryption method.
In this study, we conducted an in-depth investigation of wxSQLite3 in a Windows environment, and this paper proposes an effective decryption method for databases encrypted using wxSQLite3. Specifically, we analyzed the Windows version of the LINE instant messenger application (v7.14.0), which is a popular instant messenger application that was developed in Japan. LINE uses wxSQLite3 to encrypt and store user data. We confirmed that the LINE database is encrypted using wxSQLite3 in the Windows environment, and we decrypted it successfully using a memory analysis technique, which allowed us to analyze user artifacts.

Contributions

Our primary contributions to the digital forensics field are summarized as follows.
  • We propose a method to identify wxSQLite3 databases. We present a method for determining whether an encrypted database is encrypted with wxSQLite3 from both a structural and code perspective. First, we present a framework for determining if a target database is encrypted using wxSQLite3 by analyzing the structure of the encrypted database. Furthermore, we propose an efficient method to find wxSQLite3’s encryption process in decompiled code while reverse engineering an instant messenger application. Based on the two perspectives we proposed, the time needed to analyze the encrypted database of any instant messenger can be reduced to constant time.
  • We propose a detailed decryption method for the LINE instant messenger application. Here, the LINE messenger application is analyzed using the proposed method. Based on this analysis, we discovered that the LINE messenger encrypts its database, including user chat histories, using wxSQLite3. From the results of our analysis, we found that the encryption key generation element used for database encryption is received through communication with the LINE messenger’s server. Thus, we also propose a method to obtain an encryption key generation element from memory. Once the encryption key generation element is obtained, we demonstrate that the encrypted database of the LINE messenger can be decrypted using the proposed wxSQLite3 database decryption method.

2. Related Work

2.1. SQLite Database Research

Various studies have analyzed the SQLite databases that instant messengers use to manage their data. Anglano et al. succeeded in decrypting the database of the ChatSecure instant messenger on an Android smartphone by identifying the generation process of passphrases used in SQLCipher encryption [5]. Zhang et al. analyzed the key generation factors and encryption process to decrypt an encrypted SQLite database containing text messages from WeChat v4.5, and successfully decrypted the database [6]. Furthermore, Songyang et al. conducted an experiment to decrypt the SQLite database of WeChat versions 5.0 to 6.3 on different brands of Android smartphones [7]. Rathi et al. analyzed the WeChat, Telegram, Viber, and WhatsApp instant messengers and found that their databases are SQLite. They also determined whether each database was encrypted, and if so, proposed a decryption method [8]. Kim et al. analyzed Telegram X, Unigram, and BBME in mobile and PC environments and proposed a method to decrypt each messenger’s SQLite3 database [9]. Shin et al. analyzed two SNS applications that use the SQLCipher module to encrypt their databases, proposed a passphrase generation method, and successfully decrypted the databases [10]. Kim et al. found a method to generate the passphrase used for the SQLCipher-encrypted database of the Wickr messenger in Android 9.0 and iOS 13.2 environments and decrypted the database [11].

2.2. Instant Messenger Research

Instant messenger applications have been studied extensively [12]. For example, Anglano et al. analyzed WhatsApp Messenger on an Android smartphone to demonstrate that the contact list and messages exchanged by users can be reconstructed, and that user activity can be estimated based on the analysis results [12]. In addition, Choi et al. presented a method to decrypt the encrypted data of the three most popular instant messengers in China and Korea, i.e., KakaoTalk, NateOn, and QQ. In the case of the KakaoTalk and NateOn applications, the encrypted database files can be recovered successfully without a user password [13]. Afzal et al. proposed a method to extract user artifacts by analyzing the encrypted network traffic of the Signal messenger, which supports end-to-end encryption [14].

2.3. LINE Instant Messenger Research

Several studies have conducted forensic analyses of LINE messenger. For example, Jain et al. used the LINE messenger on a jailbroken iPhone and extracted user conversation history data [15]. In addition, Chang et al. analyzed the LINE application (v5.8.1) using the BlueStacks application to emulate the Android OS [16]. They obtained user information present in memory by extracting and analyzing the data generated according to various user environments. Riadi et al. conducted a study to acquire and analyze data on the LINE messenger (v7.14.0) using the MOBILedit forensic tool to analyze the difference between data collected through application analysis extraction and full content extraction [17]. Note that the above studies focused on the LINE messenger; however, they did not propose a method to decrypt the database of the LINE messenger or obtain data via decryption. Thus, in this study, we clarify the encryption logic and propose a method to obtain elements for decryption. We decrypt the encrypted LINE messenger data such that they can be used for digital forensics investigations.

3. Methodology

We analyzed a target application that uses wxSQLite3 in order to identify databases encrypted with wxSQLite3 and to understand the relevant encryption methods. In the following, we describe the overall application analysis process and explain how to utilize program behavior for analysis. In addition, we describe the reverse engineering process we performed, which combines static and dynamic analysis. We propose a methodology based on both processes.

3.1. Application Operation Process Analysis

Various experiments were conducted to identify the libraries used by the application and the application operation process. These experiments included various uses of the application (e.g., deleting data, logging in with multiple accounts, and logging in from different personal computer (PC) environments), and we observed changes in the data during each process. Here, we generated scenarios for specific observations, and we identified how the application data are formed in each scenario. Generally, instant messengers can be used with multiple accounts in various environments, and it is possible to configure and experiment with various specific scenarios. For example, when data are deleted and the same account is logged in to under the same environment, the differences in the data created can be identified. Alternatively, the user can log in with one account to use the application and then log out and log back in under a different account to check if additional data have been created or deleted. We can also determine whether a user can log in to a PC in a different environment with the same account simultaneously, as well as differences in the data generated. Through these various scenarios, the data generated and managed by the application can be characterized and organized, and this information can be utilized to understand how the data are encrypted. For example, when newly generated encrypted data and previously existing encrypted data are compared, if some parts are the same, there is a possibility that the same value was used in the data encryption process.

3.2. Reverse Engineering

Dynamic analysis is required to analyze specific applications in practice. To analyze the target application, we employed the Interactive DisAssembler (IDA) [18], a disassembler for analyzing computer applications in the Windows environment. First, we checked the code by decompiling the target application using the IDA tool. Here, the purpose was to look for the characteristic features of wxSQLite3. The wxSQLite3 encryption process utilizes MD5 hash functions repeatedly (Section 4). Thus, we performed a static analysis to determine where the MD5 algorithm is used. The MD5 algorithm has a fixed initial state and fixed values used in its rotation process, so it is easy to locate where well-known algorithms such as MD5 are used in the code and to trace back where they are called from. After identifying the location of the encryption process, e.g., the encryption key or initialization vector (IV) generation process, the actual operation process should be clarified. Thus, through dynamic analysis, we tracked the memory during operation by setting a breakpoint at the code location estimated for either encryption or key generation. This process allows us to confirm that the target application uses wxSQLite3.

3.3. Verification

In this process, the encrypted data are decrypted using the encryption key and encryption algorithm estimated from the analysis results. The analysis results can be verified by decrypting the encrypted data using the correct encryption key and encryption algorithm. If the data are decrypted correctly, the file signature can be identified visually when viewed in binary. In the case of databases, artifact analysis is required to analyze the data in the database after decryption to further infer specific user behavior. However, if the verification process through data decryption fails, a detailed analysis of the operation process and reverse engineering of the application are repeated.

4. Analysis of wxSQLite3

wxSQLite3 is an open-source C++ library used to handle SQLite databases [4]. wxSQLite3 is designed to use SQLite databases in wxWidgets, which is a C++ library that allows users to create applications for Windows, macOS, Linux, and other platforms. With wxSQLite3, users can perform data processing, e.g., creating, modifying, and deleting SQLite databases and the data in those databases. Since the release of version 3.5.0, it is possible to compile using only the wxSQLite3 library without a separate SQLite3 Dynamic Link Library (DLL). In addition, wxSQLite3 contains a process to encrypt SQLite3 databases using AES-128-CBC and AES-256-CBC. In this section, we cover the structure of an SQLite3 database encrypted using wxSQLite3, and we describe the encryption process in detail.

4.1. Structure of wxSQLite3 Database

wxSQLite3 defines encryption information in the header of an encrypted database, and the encryption is performed in units of pages defined in the header. Only the first page contains a header; the second through last pages have no header and are encrypted in full. Table 1 shows the structure of data encrypted with wxSQLite3.
A notable feature of data encrypted using wxSQLite3 is that the 8 bytes at offsets 16–23 are plaintext containing information about the encryption. Therefore, unlike other SQLite database encryption techniques, this part can be identified in plaintext form.

4.2. wxSQLite3 Database Encryption and Decryption Process

When wxSQLite3 encrypts a database, it performs encryption in units of pages defined in the header. To generate a different ciphertext for each page, wxSQLite3 utilizes a different encryption key (i.e., the page key) and IV (i.e., page IV) for each page. Note that the page key is generated from the base key created by wxSQLite3, and the page IV is generated based on the page number. We divide the overall encryption process into three subprocesses, i.e., page IV generation, base key generation, and database encryption. Each process is described in detail in the following subsections.

4.2.1. Page IV Generation Process

wxSQLite3 uses a different IV for each page to be encrypted. The IV used to encrypt each page is generated by a linear congruential generator (LCG) that combines the number of the page to be encrypted with a modular multiplication operation. Generally, the LCG algorithm generates random numbers using multiplication, addition, and modular operations. The LCG also exists in the form of multiplicative LCGs (MLCGs), which combine two or more LCGs [19]. wxSQLite3 employs MODMULT, a function that combines modular and multiplication operations and is one of the building blocks of MLCGs. Algorithm 1 shows the pseudocode for the MODMULT function.
Algorithm 1: MODMULT
Input: a, b, c, m, page number
Output: s
 1: s ← page number
 2: q ← ⌊s / a⌋
 3: s ← b · (s − a · q) − c · q
 4: if s < 0 then
 5:     s ← s + m
 6: end if
 7: return s
wxSQLite3 uses the fixed values 52774, 40692, 3791, and 2147483399 [19] for the MODMULT parameters a, b, c, and m, respectively. In addition, s denotes the seed, and the number of the page to be encrypted is used as the seed in the first operation. A single operation produces a 32-bit result, and the internal state is updated by reusing the result as the next seed. The first output and three updates yield four 32-bit random numbers. The generated random numbers are concatenated sequentially to produce 128-bit data, which are then hashed using the MD5 hash function, and the hash value is used as the page IV. Algorithm 2 describes the process of generating a page IV from the number of the page to be encrypted. Here, the page number begins from 1 and is expressed as a 4-byte little-endian integer.
Algorithm 2: Page IV generation process
Input: page number
Output: page IV for each page encryption
 1: a ← 52774
 2: b ← 40692
 3: c ← 3791
 4: m ← 2147483399
 5: s ← page number
 6: next seed ← null
 7: value ← null
 8: for i ← 0 ⋯ 3 do
 9:     next seed ← MODMULT(a, b, c, m, s) // Algorithm 1
10:     s ← next seed
11:     value ← value || next seed // ||: concatenate
12: end for
13: page IV ← MD5(value)
14: return page IV
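Algorithms 1 and 2 can be illustrated with a minimal Python sketch. It follows the pseudocode as printed, with one assumption that the text does not state explicitly: each 32-bit output is serialized in little-endian byte order before hashing (the text specifies little-endian only for the page number). The function names `modmult` and `page_iv` are chosen here for illustration.

```python
import hashlib
import struct

# MLCG parameters a, b, c, m reported in the text for wxSQLite3.
A, B, C, M = 52774, 40692, 3791, 2147483399


def modmult(a: int, b: int, c: int, m: int, s: int) -> int:
    """One MODMULT step (Algorithm 1): returns the updated seed."""
    q = s // a                     # integer quotient; s is non-negative here
    s = b * (s - a * q) - c * q
    if s < 0:
        s += m
    return s


def page_iv(page_number: int) -> bytes:
    """Derive the 16-byte page IV from a page number (Algorithm 2)."""
    s = page_number
    value = b""
    for _ in range(4):
        s = modmult(A, B, C, M, s)
        # Assumption: every 32-bit output is packed little-endian.
        value += struct.pack("<I", s)
    return hashlib.md5(value).digest()


if __name__ == "__main__":
    print(page_iv(1).hex())
```

Because the IV depends only on the page number, the same page number always yields the same IV, which is the property exploited later for identification.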

4.2.2. Base Key Generation Process

wxSQLite3 supports page unit encryption using 128-bit and 256-bit keys. The page key used for page encryption is generated based on the base key, which is generated using the input passphrase. The process of generating the base key depends on the key length. Here, the MD5 hash function and RC4 stream cipher are used to generate a 128-bit base key, and the SHA256 hash function is used to generate a 256-bit base key. If the passphrase is shorter than 32 bytes, it is concatenated with the predefined padding value and the first 32 bytes of the result are kept; a passphrase longer than 32 bytes is truncated to its first 32 bytes. The predefined padding value is as follows:
0x28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A
Algorithm 3 describes the base key generation process for each encryption algorithm key length.
The 128-bit base key is generated by encrypting the padded passphrase using the RC4 stream cipher and the MD5 hash function. This works in three steps, each of which is described below.
In Step 1, an intermediate key (inter_k) to be used for the RC4 stream cipher is created. The inter_k is a fixed value generated using a predefined 32-byte padding value. First, the predefined padding value is hashed using the MD5 hash function, and then the hash value is hashed again using the MD5 hash function 50 additional times. This generates the following fixed value inter_k.
inter_k = 0x5A00344F40D0A5C52B160B830E6E086E
In Step 2, an intermediate value (inter_v) to generate the final key is created by encrypting the padded passphrase. The padded passphrase is encrypted with the RC4 stream cipher 20 times in succession, and the RC4 key for each round is derived from inter_k: every byte of inter_k is XORed with the current round number, and the result becomes the RC4 key for that round (lines 28–30). The padded passphrase, after being encrypted 20 times, becomes inter_v.
In Step 3, the base key is generated using the padded passphrase and inter_v. First, the padded passphrase and inter_v are concatenated and hashed using the MD5 hash function, and then the hashed value is hashed 50 additional times using the MD5 hash function. The final generated hashed value becomes the base key.
Algorithm 3: Base key generation process
Input: passphrase, key length
Output: base key
 1: msg ← passphrase
 2: RC4_key ← null
 3: padding ← 0x28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A
 4: j ← 0
 5: p ← 0
 6: m ← pswdlen
 7: if m > 32 then
 8:     m ← 32
 9: end if
10: for j = 0 to m − 1 do
11:     padded_pwd[p++] ← (unsigned char) passphrase[j]
12: end for
13: for j = 0 to 32 − 1 do
14:     if p < 32 then
15:         padded_pwd[p++] ← padding[j]
16:     else
17:         break
18:     end if
19: end for
20: if key length == 128 then
21:     data1 = padding // Step 1
22:     for i ← 0 ⋯ 50 do
23:         inter_k ← MD5(data1)
24:         data1 = inter_k
25:     end for
26:     msg = padded_pwd // Step 2
27:     for i ← 0 ⋯ 19 do
28:         for k ← 0 ⋯ 15 do
29:             RC4_key[k] ← inter_k[k] ⊕ i
30:         end for
31:         inter_v ← RC4encrypt(RC4_key, msg)
32:         msg = inter_v
33:     end for
34:     new_h ← padded_pwd || inter_v // Step 3
35:     for i ← 0 ⋯ 50 do
36:         out ← MD5(new_h)
37:         new_h ← out
38:     end for
39:     base key ← out
40: else
41:     data1 ← padded_pwd
42:     for i ← 0 ⋯ 4001 do
43:         out ← SHA256(data1)
44:         data1 ← out
45:     end for
46:     base key ← out
47: end if
48: return base key
The process of generating a 256-bit base key is described as follows. First, the padded passphrase is hashed using the SHA256 hash function, and then the hashed value is hashed again using the SHA256 hash function 4001 times. The final generated hashed value becomes the base key.
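The base key derivation of Algorithm 3 can be sketched in Python as follows. This is an illustrative reading of the pseudocode rather than the library's source: the helper names (`rc4`, `pad_passphrase`, `iterated_hash`, `base_key`) are introduced here, the passphrase is assumed to be supplied as raw bytes, and the loop bounds 0 ⋯ 50 and 0 ⋯ 4001 are interpreted as 51 and 4002 chained hash calls, respectively.

```python
import hashlib

# Predefined 32-byte padding value given in the text.
PADDING = bytes.fromhex(
    "28BF4E5E4E758A4164004E56FFFA0108"
    "2E2E00B6D0683E802F0CA9FE6453697A"
)


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def pad_passphrase(passphrase: bytes) -> bytes:
    """Truncate or pad the passphrase to exactly 32 bytes (lines 6-19)."""
    return (passphrase[:32] + PADDING)[:32]


def iterated_hash(name: str, data: bytes, rounds: int) -> bytes:
    """Apply the named hashlib digest `rounds` times in a chain."""
    for _ in range(rounds):
        data = hashlib.new(name, data).digest()
    return data


def base_key(passphrase: bytes, key_length: int) -> bytes:
    """Derive the base key for a 128-bit or 256-bit configuration."""
    padded = pad_passphrase(passphrase)
    if key_length == 128:
        inter_k = iterated_hash("md5", PADDING, 51)        # Step 1
        inter_v = padded
        for i in range(20):                                # Step 2
            round_key = bytes(b ^ i for b in inter_k)
            inter_v = rc4(round_key, inter_v)
        return iterated_hash("md5", padded + inter_v, 51)  # Step 3
    return iterated_hash("sha256", padded, 4002)           # 256-bit branch
```

One way to sanity-check this reading is to compare `iterated_hash("md5", PADDING, 51)` with the inter_k value reported in Step 1; a mismatch would indicate that the iteration count or the hash input is interpreted differently from the original implementation.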

4.2.3. Database Encryption Process

The database is divided into pages and is encrypted using the AES-128-CBC or AES-256-CBC algorithm. Each page is encrypted using a different page key and page IV. The page key generation method depends on the key length; however, the page IV is generated in the same manner. Algorithm 4 describes a page key generation method according to the key length.
Algorithm 4: Page key generation process
Input: page number, passphrase, key length, base key
Output: page key
 1: base key ← base key generation process(passphrase, key length) // See Algorithm 3
 2: tmp ← base key || page number || sAlT // ||: concatenate
 3: if key length == 128 then
 4:     hash ← MD5
 5: else
 6:     hash ← SHA256
 7: end if
 8: page key ← hash(tmp)
 9: return page key
The page key is generated by concatenating the base key, the page number, and the fixed string “sAlT”, and then hashing the result using the hash function that matches the given key length. Here, the page number is expressed as a 4-byte little-endian integer. The MD5 hash function is used for key lengths of 128 bits, and the SHA256 hash function is used for key lengths of 256 bits.
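The page key derivation of Algorithm 4 is short enough to sketch directly. The function name `page_key` and its argument order are illustrative choices; the base key is assumed to have been derived beforehand as described in Algorithm 3.

```python
import hashlib
import struct


def page_key(base_key: bytes, page_number: int, key_length: int) -> bytes:
    """Hash base key || page number (4-byte little-endian) || b"sAlT"."""
    tmp = base_key + struct.pack("<I", page_number) + b"sAlT"
    if key_length == 128:
        return hashlib.md5(tmp).digest()     # 16-byte key for AES-128
    return hashlib.sha256(tmp).digest()      # 32-byte key for AES-256
```

Since the digest sizes of MD5 and SHA256 are 16 and 32 bytes, the output can be used directly as an AES-128 or AES-256 key.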
When the page key and page IV are generated, encryption is performed by page using the AES-CBC algorithm. However, the first page requires additional work to verify the encryption key, whereas the remaining pages are encrypted using the AES-CBC algorithm with the respective page key and page IV values. Figure 1 shows the detailed encryption process for the first page.
Database information (SQLite info) is stored at offsets 16–23 of the first page. Thus, the 8 bytes at offsets 16–23 of the first page are backed up first. The first page is then encrypted using the AES-CBC algorithm with the first page key and page IV. Then, the ciphertext at offsets 16–23 of the encrypted page is copied over offsets 8–15. Finally, the backed-up 8 bytes are written back to offsets 16–23 of the encrypted page, which completes the encryption of the first page. This layout enables wxSQLite3 to verify the passphrase and decrypt the database.
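The byte rearrangement applied to the first page can be sketched as follows. The sketch does not perform AES-CBC itself: `encrypted_page` is assumed to be the ciphertext of `plain_page` produced elsewhere, and the function name `finalize_first_page` is hypothetical.

```python
def finalize_first_page(plain_page: bytes, encrypted_page: bytes) -> bytes:
    """Rearrange the encrypted first page as described in the text.

    `encrypted_page` is assumed to be the AES-CBC ciphertext of
    `plain_page`; no cipher is invoked here.
    """
    if len(plain_page) != len(encrypted_page) or len(plain_page) < 24:
        raise ValueError("pages must have equal length of at least 24 bytes")
    backup = plain_page[16:24]             # plaintext SQLite info
    out = bytearray(encrypted_page)
    out[8:16] = encrypted_page[16:24]      # ciphertext 16-23 copied to 8-15
    out[16:24] = backup                    # plaintext restored at 16-23
    return bytes(out)
```

After this step, offsets 0–7 still hold the ciphertext of the first half of the SQLite signature, offsets 8–15 hold the relocated ciphertext, and offsets 16–23 are readable plaintext.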

4.2.4. Database Decryption Process

The encrypted database is decrypted page by page using the AES-128-CBC or AES-256-CBC algorithm. Decrypting each page requires a page key, which is generated from the passphrase (Algorithm 4). At this point, a verification process can be performed to determine whether the passphrase is correct. Passphrase verification is performed by comparing the backed-up SQLite data with the decryption result of the concatenation of offsets 8–15 and 24–31 of the database. If the correct passphrase is used, the SQLite data will be decrypted correctly, which can be verified by comparing the backed-up values with the decryption result. After passphrase verification succeeds, all pages are decrypted. However, offsets 8–15 of the first page were overwritten with ciphertext during encryption (Figure 1). Thus, the first page is restored by overwriting offsets 0–15 with the fixed signature value that indicates an SQLite3 database. As a result, we can verify the passphrase used for wxSQLite3 encryption and decrypt the database encrypted with wxSQLite3.
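The final restoration step can be illustrated with a small sketch. It assumes that the first page has already been decrypted and that only offsets 0–15 still need to be replaced; the function name `restore_first_page` is hypothetical.

```python
# 16-byte signature at the start of every SQLite3 database file.
SQLITE_SIGNATURE = b"SQLite format 3\x00"


def restore_first_page(decrypted_page: bytes) -> bytes:
    """Overwrite offsets 0-15 of the decrypted first page with the signature."""
    if len(decrypted_page) < 16:
        raise ValueError("first page is shorter than the signature")
    return SQLITE_SIGNATURE + decrypted_page[16:]
```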

4.3. wxSQLite3 Database Identification Method

Whether wxSQLite3 is used can be checked based on the structure of the encrypted data and the fixed value in the program. Each identification technique is described in the following subsections.

4.3.1. Confirming wxSQLite3 Based on Encrypted Data

wxSQLite3 employs a method of backing up the main data used for decryption when encrypting the first page and restoring it after encryption (Figure 1). In this process, the 8 bytes at offsets 16–23 of the encrypted database remain as plaintext. Thus, if the 8 bytes at offsets 16–23 of the encrypted database are plaintext, it is possible that the target database was encrypted using wxSQLite3. If an application encrypts and manages the database, we can further determine whether the database was encrypted using wxSQLite3 by deleting the database. If a database created by the application is deleted, the application will recreate the encrypted database so that it can continue to operate. During this process, the structure of the encrypted database can differ depending on which passphrase the application uses to encrypt the database. The application can encrypt the recreated database using the same passphrase that was used to encrypt the deleted database (Case 1). Otherwise, when the application encrypts the recreated database, it can generate a new passphrase for encryption that is different from the passphrase used to encrypt the deleted database (Case 2). The features of the encrypted database for each case are as follows.
  • (Case 1) Database encryption with the same passphrase
    The first 8 bytes of the recreated database have the same ciphertext
  • (Case 2) Database encryption with a new passphrase
    The first 8 bytes of the recreated database have a different ciphertext
If the application encrypted the database using the same passphrase, the created database is the same as the one that existed prior to deletion. However, if the application encrypts the database by creating a new passphrase, the created database will differ from the previously encrypted database. In the former case, the IV and key used for database encryption are the same. Note that other known SQLite3 encryption modules generate random IVs each time. Thus, we can assume with high probability that wxSQLite3 was used in this case. However, in the latter case, additional analysis is required because the encryption module may differ or the encryption key may have been altered.
First, we consider the case where the encryption module differs. The known SQLite3 encryption modules generate random IVs and associate the IV with the database to store it. Thus, we can verify this by simply backing up the encrypted database, deleting it, and comparing it with the new database created by the application. We then delete the new database, replace it with the backed-up database, and run the application again. If the application recognizes the content of the database without issue, the encryption key is the same, and only the IV has changed. Thus, in this case, wxSQLite3 was not used to encrypt the database.
Next, we consider the case where the encryption key has been changed. Here, we delete the new database, replace it with the backed-up database, and run the application again. If the application does not recognize the database and creates a new one, it can be assumed that the encryption key was changed. In other words, the encryption key may be changed each time or a completely different encryption module may be used. If the encryption key is changed randomly each time, the application should store relevant information about the client for later database use. Such relevant information can take many forms, e.g., a random value, a seed for generation, or a value that can be requested from the server. Therefore, it is necessary to observe all values created or changed during the database creation process. If the encryption key can be obtained through observation or if a fixed encryption key can be generated, it is possible to infer wxSQLite3 to some extent through the previous experiment. However, if the wxSQLite3 guess is unclear, dynamic analysis via reverse engineering is required.
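The structural checks described in this subsection can be approximated with a short sketch. The plaintext test relies on the standard SQLite header layout, in which offsets 16–17 hold the page size (a power of two between 512 and 32768, or 1 for 65536) and offsets 21–23 hold the fixed payload fractions 64, 32, and 32. The function names are illustrative, and a positive result is only a hint that wxSQLite3 may be in use, not proof.

```python
def read_header(path: str, size: int = 24) -> bytes:
    """Read the first `size` bytes of a database file."""
    with open(path, "rb") as handle:
        return handle.read(size)


def has_plaintext_sqlite_info(header: bytes) -> bool:
    """Heuristic: do offsets 16-23 look like an unencrypted SQLite header?"""
    if len(header) < 24:
        return False
    page_size = int.from_bytes(header[16:18], "big")
    valid_size = page_size == 1 or (
        512 <= page_size <= 32768 and page_size & (page_size - 1) == 0
    )
    # Offsets 21-23 hold the payload fractions 64, 32 and 32 in SQLite.
    return valid_size and header[21:24] == bytes([64, 32, 32])


def same_leading_ciphertext(old_header: bytes, new_header: bytes) -> bool:
    """True for Case 1 (first 8 bytes match), False for Case 2."""
    return old_header[:8] == new_header[:8]
```

A typical use would be to call `read_header` on the backed-up database and on the recreated one, then apply both checks to decide which of the two cases applies.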

4.3.2. Confirming wxSQLite3 Based on Reverse Engineering

When wxSQLite3 encrypts a database, IV and encryption key generation are performed first. The IV and encryption key generation processes can therefore be located through reverse engineering. First, for IV generation, three main fixed values are used for the calculation: 40692, 2147483399, and 52774 for parameters b, m, and a, respectively (Algorithm 1). If these values are found in the decompiled code and the function containing them is executed prior to the encryption process, it can be assumed that the function is the IV generation process. Figure 2 shows part of the MODMULT function in the code. In addition, when wxSQLite3 encrypts the database, the MD5 hash function is computed repeatedly over three processes. Specifically, there are 51 MD5 hash function operations, 20 MD5 hash function operations, and 51 MD5 hash function operations. Thus, this can also be confirmed in the decompiled code. First, as the MD5 hash function uses a fixed initial state, we can find the MD5 hash function. If the MD5 hash function is called 51, 20, and 51 times, we can assume that this is a wxSQLite3 database encryption process (Figure 3).
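As a rough first pass before opening a disassembler, the fixed values mentioned above can be searched for in the raw bytes of a binary. The sketch below assumes that the constants appear as little-endian 32-bit words, which is common on x86 but not guaranteed, since a compiler may fold or transform them. The MD5 initial state words are the standard ones, and all function and dictionary names are illustrative.

```python
import struct

# MODMULT parameters a, b, m named in the text, and the standard MD5
# initial state words.
MODMULT_CONSTANTS = {"a": 52774, "b": 40692, "m": 2147483399}
MD5_INIT_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def find_u32(blob: bytes, value: int) -> list[int]:
    """Return every offset where `value` occurs as a little-endian word."""
    needle = struct.pack("<I", value)
    hits = []
    start = blob.find(needle)
    while start != -1:
        hits.append(start)
        start = blob.find(needle, start + 1)
    return hits


def scan_binary(blob: bytes) -> dict[str, list[int]]:
    """Map each constant of interest to the offsets where it was found."""
    report = {
        f"modmult_{name}": find_u32(blob, value)
        for name, value in MODMULT_CONSTANTS.items()
    }
    for index, word in enumerate(MD5_INIT_STATE):
        report[f"md5_init_{index}"] = find_u32(blob, word)
    return report
```

Hits from such a scan only indicate candidate locations; confirming that a function is the IV or key generation routine still requires the static and dynamic analysis described in the text.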

5. Analysis of LINE Messenger

The LINE messenger is an instant messenger application developed by the Japanese branch of the South Korean company Naver. Since its launch in 2011, the LINE messenger has been updated continuously to implement new features, apply bug fixes, and mitigate protocol vulnerabilities. The LINE messenger also encrypts and manages its entire database to protect user data.

5.1. Data Structure and Main Data

When a user logs in using a previously authenticated mobile device to sign into a PC device, user data are created in %LocalAppData%\LINE on the PC (Figure 4).
The executable files required to run the LINE messenger, update programs, and log files that record program usage are stored in the bin directory. When users send and receive chats using the LINE messenger, files with the “.edb” extension in the db directory under the Data directory are modified. Thus, we assume that user data are stored in those files and that the databases with the “.edb” extension are encrypted. There are a total of three databases with the “.edb” extension. The name of one file comprises the “qw” string and a random 15-byte hex string for each account. The other two files have the same name prefixed with “keep_” and “chatStats_”, respectively. If a user changes their login account, the database of the existing account is not deleted, and a new database is created in the same path. When users send and receive images using the LINE messenger, files with the “.eimg” extension are created in the Cache directory. These files are smaller than the original files, and we have verified that they are thumbnails of the images.

5.2. Identification of wxSQLite3 Database in LINE Messenger

Figure 5 shows three encrypted databases. We checked the encrypted databases of the LINE messenger and confirmed that the 8 bytes at offsets 0–7 and the 8 bytes at offsets 16–23 are identical across the files. First, the data at offsets 16–23 have sufficiently low entropy, which suggests that they are not ciphertext. If these values are plaintext, we can suspect a wxSQLite3 structure. Next, we check the data at offsets 0–7 and 8–15. The 8 bytes starting at offset 0 are the same in every file, whereas the next 8 bytes differ. SQLite3 databases use the 16-byte string “SQLite format 3\x00” as a signature. If the same plaintext were encrypted using three different encryption keys, the probability that 8 bytes of the ciphertext match while the rest differ would be close to zero. Therefore, we can infer that the signatures are encrypted using the same encryption key and IV, and that part of the encrypted signature is then overwritten with other 8-byte data. These structures are typical of wxSQLite3; thus, we can infer that the LINE messenger’s databases are encrypted using wxSQLite3. Next, we reverse engineered the LINE messenger to verify that wxSQLite3 was used for encryption. The LINE messenger library contains a “QtCipherSQLite” plug-in that supports the wxSQLite3 module, and we confirmed that the MD5 hash function was used in the same manner as in the AES-128 scheme of wxSQLite3 (Figure 3). As a result, we concluded that the LINE messenger database is encrypted using wxSQLite3.
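The structural checks described above can be sketched as a short Python heuristic. It assumes the plaintext layout of offsets 16–23 listed in Table 1 and the standard SQLite page-size encoding; the function names are hypothetical, and a positive result is only an indication, not proof, that wxSQLite3 was used.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_wxsqlite3(header: bytes) -> bool:
    """Heuristic check of the first 24 bytes of a database file."""
    if len(header) < 24 or header[:16] == SQLITE_MAGIC:
        return False  # too short, or an unencrypted SQLite file
    page_size = int.from_bytes(header[16:18], "big")
    # SQLite page sizes are powers of two from 512 to 32768; the value 1 encodes 65536.
    valid_page = page_size == 1 or (
        512 <= page_size <= 32768 and page_size & (page_size - 1) == 0
    )
    versions_ok = header[18] in (1, 2) and header[19] in (1, 2)
    fractions_ok = header[21:24] == bytes([64, 32, 32])
    return valid_page and versions_ok and fractions_ok


def shared_prefix(headers: list, length: int = 8) -> bool:
    """True if all headers start with the same bytes (same key and IV suspected)."""
    return len({h[:length] for h in headers}) == 1
```

Running shared_prefix over the first bytes of the three “.edb” files corresponds to the comparison of offsets 0–7 described in the text.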
We conducted an additional experiment to determine whether the wxSQLite3 passphrase used for database encryption varies. After extracting the existing “.edb” file, we created a new “.edb” file by logging in again. This process was performed multiple times with the same account and with different accounts. We also collected data from four other PCs to check for any dependency on the PC. We then compared the first 8 bytes of the collected databases to examine whether the passphrase was variable or dependent on the PC. As a result, regardless of the PC environment, the same account always produced a database with the same first 8 bytes, and different accounts produced different values. This means that the LINE messenger encrypts its database using a specific value tied to the given account as the passphrase.

5.3. Memory Analysis and Data Decryption

Data sent and received by users are stored on the user’s devices and on the LINE messenger server. The LINE messenger performs the authentication process between the server and users through the mobile application (Figure 6). One way to authenticate is to enter a phone number or email account and then have a verification code sent to an environment in which the user is already logged in. Alternatively, user authentication can be performed using a quick response code on a previously authenticated mobile device. In this study, we found that the server sends the encrypted database and passphrase to the local device only when the correct user logs in and is authenticated successfully.
To encrypt and store continuously added data, the encryption key must remain in memory until the application’s connection to the database is closed. This means that the passphrase remains in memory after the user authentication process. Based on this observation, the methods we studied for obtaining the passphrase from memory and using it to decrypt the database are as follows.

5.3.1. Passphrase Acquisition through Memory Analysis

To acquire the passphrase from memory, we performed memory dumps in various states. Here, we focused on three states, i.e., immediately after login, while using the application with the application in the system tray, and after logging out. As a result, we confirmed that the passphrase remains in memory in all states until the user logs out. Figure 7 shows the passphrase that remains in memory. We observed two patterns in which the passphrase is stored in memory.
Our analysis showed that the passphrase exists in memory together with a fixed string, i.e., the “encryption_key=” string or the “encryptionkey=” string. In both cases, the string “mse” commonly appears at the end. We can use these strings as headers and footers to identify and extract the database passphrase from memory. Alternatively, because the passphrase is a 32-character hex string, it is possible to search with the regular expression [0-9a-f]{32}.
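A header-based search of this kind can be sketched in a few lines of Python. The sketch assumes that the 32 hex characters directly follow the header as single-byte ASCII text, which this section does not state explicitly; the footer is not enforced for the same reason, and find_passphrases is a hypothetical name.

```python
import re

# Matches either header form followed by a 32-character lowercase hex string.
PASSPHRASE_RE = re.compile(rb"encryption_?key=([0-9a-f]{32})")


def find_passphrases(dump: bytes) -> list:
    """Return the distinct passphrase candidates found in a raw memory dump."""
    seen = []
    for match in PASSPHRASE_RE.finditer(dump):
        candidate = match.group(1).decode("ascii")
        if candidate not in seen:
            seen.append(candidate)
    return seen
```

If the dump stores strings in a wide-character encoding instead, the pattern would need to be adapted; candidates can then be validated by attempting decryption.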

5.3.2. Database Decryption

The encrypted data of the LINE messenger include the database and thumbnails. Our analysis showed that the LINE messenger database is encrypted using wxSQLite3. The passphrase used for database encryption is a 32-character hex string that is received from the server after successful login. The LINE messenger generates a 128-bit encryption key for database decryption according to Algorithm 3.
The LINE messenger deviates from Algorithm 3 in one respect: wherever the algorithm uses the padded passphrase, the LINE messenger uses the passphrase received from the server. In Step 2, it encrypts this passphrase by applying the RC4 stream cipher 20 times. In Step 3, it concatenates this passphrase with the result of Step 2 to generate the 128-bit key for database encryption. Thus, the database can be decrypted using the 32-character hex string obtained from memory and wxSQLite3’s decryption process. Figure 8 shows the decrypted LINE messenger database. We confirmed that all conversation history exchanged between users exists in plaintext in the decrypted database.
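The derivation steps described above can be sketched in Python as follows. This is an illustrative reconstruction, not a verified reimplementation: the 32-byte padding constant, the use of an empty padded password as the Step 1 input, and the per-round XOR of the RC4 key are assumptions borrowed from the PDF-style derivation that this scheme resembles, and none of them is stated in this section. Only the round counts (51, 20, 51) and the use of the unpadded passphrase come from the text.

```python
import hashlib

# Assumption: PDF-style 32-byte padding string (not quoted in the text).
PADDING = bytes.fromhex(
    "28bf4e5e4e758a4164004e56fffa0108"
    "2e2e00b6d0683e802f0ca9fe6453697a"
)


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def md5_rounds(data: bytes, rounds: int = 51) -> bytes:
    """Hash once, then re-hash the digest until `rounds` MD5 calls are done."""
    digest = hashlib.md5(data).digest()
    for _ in range(rounds - 1):
        digest = hashlib.md5(digest).digest()
    return digest


def derive_key(passphrase: str) -> bytes:
    """Sketch of the 128-bit key derivation from a 32-character passphrase."""
    user = passphrase.encode("ascii")
    if len(user) != 32:
        raise ValueError("expected a 32-character passphrase")
    # Step 1 (assumed input): 51 MD5 rounds over the padding of an empty password.
    digest = md5_rounds(PADDING)
    # Step 2: RC4 applied 20 times to the passphrase received from the server.
    block = user
    for i in range(20):
        round_key = bytes(b ^ i for b in digest)  # assumed per-round key tweak
        block = rc4(round_key, block)
    # Step 3: 51 MD5 rounds over passphrase || Step 2 result.
    return md5_rounds(user + block)
```

Because a 32-character passphrase already fills the 32-byte buffer, padding it would not change its value under this assumed scheme. Any key produced by this sketch should be validated against a known database before being relied upon.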
In the LINE messenger, thumbnails are encrypted using the AES-128-ECB algorithm. However, our analysis demonstrated that the key used for this encryption was received from the server each time and that a different key was used for each file. We captured the memory while a thumbnail was being decrypted and found that the encryption key was not accompanied by any separator and was deleted from memory after decryption was complete. Thus, it was virtually impossible to obtain the attached image unless the application was logged in and downloaded the decrypted image.

5.4. User Artifact Analysis

Three encrypted databases are created when using the LINE messenger. Among them, the databases starting with “keep_” and “chatStats_” do not store important information. The user’s chat history, including user information, is stored in the “.edb” database whose name comprises the “qw” string and a random 15-byte hex string. Table 2 shows the main user artifacts in the decrypted database.
When a user sends and receives chats with other users, their nicknames are stored in the “_contact” table, and each user’s unique value is stored in the “_mid” field of that table. In addition, the user’s chat history is stored in the “_chats” table. The “_from” and “_to” fields in the “_chats” table store the unique identifiers of the sender and receiver of the chat, respectively. Note that these are the same as the values stored in “_mid” of the “_contact” table. The “_chats” table also stores the time at which the chats were sent and received. This allowed us to identify the users who sent and received chats and to check the chat history of a specific user at a given time.
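As a sketch of how these artifacts can be correlated, the following Python example joins message records with contact names and converts the 13-digit Unix time to a readable timestamp. It assumes the decrypted file can be opened as an ordinary SQLite database and follows the table and column layout listed in Table 2, where the “_from”, “_to”, “_createdTime”, and “_text” columns appear under the “_message” table; the function name chat_history is hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def chat_history(conn: sqlite3.Connection, mid: str) -> list:
    """Return (time, sender name, receiver id, text) for messages involving `mid`."""
    rows = conn.execute(
        """
        SELECT m._createdTime, c._displayName, m._to, m._text
        FROM _message AS m
        LEFT JOIN _contact AS c ON c._mid = m._from
        WHERE m._from = ? OR m._to = ?
        ORDER BY m._createdTime
        """,
        (mid, mid),
    ).fetchall()
    # 13-digit Unix time is in milliseconds; convert to seconds for datetime.
    return [
        (datetime.fromtimestamp(ts / 1000, tz=timezone.utc), sender, to, text)
        for ts, sender, to, text in rows
    ]
```

The sender name is empty when the sender is not present in “_contact” (for example, the account owner), in which case the “_profile” table may be consulted instead.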

6. Conclusions

In this paper, we have proposed a method to identify and decrypt SQLite databases encrypted using wxSQLite3. We first examined the database structure to identify databases encrypted with wxSQLite3, which provided a basis to assume that wxSQLite3 was used. In addition, we proposed a method to identify the wxSQLite3 encryption process within the decompiled code during reverse engineering of the target application, which allowed us to confirm that assumption. By applying both methods, we analyzed the LINE messenger application and confirmed that this application’s databases are encrypted using wxSQLite3. We found that the encryption key generation element used by the LINE messenger is received from the LINE messenger’s server, and we proposed a method to acquire this element from memory. Based on the analysis results, we decrypted the database of the LINE messenger. The proposed method to identify and decrypt wxSQLite3-encrypted databases can be applied to other instant messenger applications and is expected to provide a new direction for database analysis. In the future, it would be interesting to apply our methodology to various instant messenger applications and to investigate methods to identify and decrypt databases encrypted using the SQLite Encryption Extension.

Author Contributions

Conceptualization, S.K. and G.K.; Methodology, S.K. and G.K.; Software, U.H.; Validation, S.K., G.K. and U.H.; Formal analysis, U.H.; Investigation, S.K. and G.K.; Writing—original draft, S.K. and G.K.; Writing—review & editing, J.K.; Visualization, U.H.; Project administration, J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the BK21 FOUR (Fostering Outstanding Universities for Research) funded by the National Research Foundation of Korea (NRF).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. DB-Engines. Available online: https://db-engines.com/en/ranking (accessed on 10 April 2023).
  2. SQLCipher. Available online: https://www.zetetic.net/sqlcipher/ (accessed on 10 April 2023).
  3. SQLite Encryption Extension (SEE). Available online: https://www.SQLite.org/see/doc/release/www/index.wiki (accessed on 10 April 2023).
  4. wxSQLite3—A Lightweight Wrapper for SQLite. Available online: https://github.com/utelle/wxSQLite3 (accessed on 10 April 2023).
  5. Anglano, C.; Canonico, M.; Guazzone, M. Forensic analysis of the ChatSecure instant messaging application on android smartphones. Digit. Investig. 2016, 19, 44–59.
  6. Zhang, L.; Yu, F.; Ji, Q. The forensic analysis of WeChat message. In Proceedings of the 2016 6th International Conference on Instrumentation & Measurement, Computer, Communication and Control (IMCCC), Harbin, China, 21–23 July 2016; pp. 500–503.
  7. Wu, S.; Zhang, Y.; Wang, X.; Xiong, X.; Du, L. Forensic analysis of WeChat on Android smartphones. Digit. Investig. 2017, 21, 3–10.
  8. Rathi, K.; Karabiyik, U.; Aderibigbe, T.; Chi, H. Forensic analysis of encrypted instant messaging applications on Android. In Proceedings of the 2018 6th International Symposium on Digital Forensic and Security (ISDFS), Antalya, Turkey, 22–25 March 2018; pp. 1–6.
  9. Kim, G.; Park, M.; Lee, S.; Park, Y.; Lee, I.; Kim, J. A study on the decryption methods of telegram X and BBM-Enterprise databases in mobile and PC. Forensic Sci. Int. Digit. Investig. 2020, 35, 300998.
  10. Shin, S.; Kang, S.; Kim, G.; Kim, J. Study on SNS Application Data Decryption and Artifact. J. Korea Inst. Inf. Secur. Cryptol. 2020, 30, 583–592.
  11. Kim, G.; Kim, S.; Park, M.; Park, Y.; Lee, I.; Kim, J. Forensic analysis of instant messaging apps: Decrypting Wickr and private text messaging data. Forensic Sci. Int. Digit. Investig. 2021, 37, 301138.
  12. Anglano, C. Forensic analysis of WhatsApp Messenger on Android smartphones. Digit. Investig. 2014, 11, 201–213.
  13. Choi, J.; Yu, J.; Hyun, S.; Kim, H. Digital forensic analysis of encrypted database files in instant messaging applications on Windows operating systems: Case study with KakaoTalk, NateOn and QQ messenger. Digit. Investig. 2019, 28, S50–S59.
  14. Afzal, A.; Hussain, M.; Saleem, S.; Shahzad, M.K.; Ho, A.T.; Jung, K.H. Encrypted network traffic analysis of secure instant messaging application: A case study of signal messenger app. Appl. Sci. 2021, 11, 7789.
  15. Jain, V.; Sahu, D.R.; Tomar, D.S. Evidence gathering of LINE messenger on iPhones. Int. J. Innov. Eng. Manag. 2015, 4, 1–9.
  16. Chang, M.S.; Chang, C.Y. Forensic analysis of LINE messenger on android. J. Comput. 2018, 29, 11–20.
  17. Riadi, I.; Fadlil, A.; Fauzan, A. Evidence gathering and identification of line messenger on android device. Int. J. Comput. Sci. Inf. Secur. (IJCSIS) 2018, 16, 201–205.
  18. IDA (Interactive DisAssembler). Available online: https://hex-rays.com/ida-free/ (accessed on 10 April 2023).
  19. L’Ecuyer, P. Efficient and portable combined random number generators. Commun. ACM 1988, 31, 742–751.
Figure 1. The Detailed Encryption Process for the First Page.
Figure 2. MODMULT function in code.
Figure 3. Iterative operation process of MD5.
Figure 4. LINE messenger main data path.
Figure 5. Features of LINE messenger’s encrypted database.
Figure 6. User authentication process of LINE messenger.
Figure 7. The passphrase of the LINE messenger database stored in memory.
Figure 8. Conversation history in the decrypted database of LINE messenger.
Table 1. Database structure encrypted with wxSQLite3.

| Offset | Size | Data | Note |
|---|---|---|---|
| 0–15 | 16 | Encrypted SQLite 3.x database headers | |
| 16–17 | 2 | Database page size | |
| 18 | 1 | File format write version | 1: Legacy; 2: WAL |
| 19 | 1 | File format read version | 1: Legacy; 2: WAL |
| 20 | 1 | Amount of unused ‘reserved’ space at the end of each page | usually 0 |
| 21 | 1 | Maximum embedded payload fraction | fixed at 64 |
| 22 | 1 | Minimum embedded payload fraction | fixed at 32 |
| 23 | 1 | Leaf payload fraction | fixed at 32 |
| 24–N | | Database encrypted except for headers | |
Table 2. The main user artifacts in the decrypted database.

| Table Name | Column Name | Data | Remarks |
|---|---|---|---|
| _chats | _id | Chat room unique value | 33 random alphanumeric characters |
| _chats | _lastMessage | Last conversation in the chat room and its related information | JSON format |
| _chats | _lastUpdatedTime | Time at which the last chat room conversation was sent | 13-digit Unix Time |
| _groupchat | _chatMid | Group chat room unique value | 33 random alphanumeric characters |
| _groupchat | _createdTime | Group chat room creation time | 13-digit Unix Time |
| _groupchat | _chatName | Group chat room title | |
| _contact | _mid | User unique value | 33 random alphanumeric characters |
| _contact | _createdTime | Time at which the friend was added | 13-digit Unix Time |
| _contact | _displayName | User name | |
| _contact | _statusMessage | User profile status message | |
| _contact | _favoriteTime | Time at which the favorite was added | 13-digit Unix Time |
| _profile | _mid | User unique value | 33 random alphanumeric characters |
| _message | _from | Unique value of the user sending the message | Same as _mid in the _contact table and _profile table |
| _message | _to | Unique value of the user receiving the message | Same as _mid in the _contact table and _profile table |
| _message | _createdTime | Message sent time | 13-digit Unix Time |
| _message | _text | Message content | For attachments, the filename |
| _message | _chatId | Chat room unique value | Same as _id in the _chats table |
| _message | _contentInfo | Image thumbnail information | JSON format |
