Android Malware Detection Based on Hypergraph Neural Networks

Zhang, Dehua; Wu, Xiangbo; He, Erlu; Guo, Xiaobo; Yang, Xiaopeng; Li, Ruibo; Li, Hao

doi:10.3390/app132312629

by Dehua Zhang ¹, Xiangbo Wu ², Erlu He ², Xiaobo Guo ², Xiaopeng Yang ², Ruibo Li ³ and Hao Li ²,*

¹ Big Data Center of Hebei Province, Shijiazhuang 050066, China

² Science and Technology on Communication Networks Laboratory, Academy for Network & Communications of CETC, Shijiazhuang 050081, China

³ College of Letters and Science, University of California, Davis, CA 95616, USA

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(23), 12629; https://doi.org/10.3390/app132312629

Submission received: 18 October 2023 / Revised: 17 November 2023 / Accepted: 22 November 2023 / Published: 23 November 2023

(This article belongs to the Special Issue State-of-the-Art of Network Attack Detection and Situation Awareness Analysis)

Abstract:

Android has been the most widely used operating system for mobile phones over the past few years. Malicious attacks against Android are a major privacy and security concern, so malware detection techniques for Android applications are important. A class of methods that use Function Call Graphs (FCGs) for Android malware detection has shown great potential. In these methods, the relationships between functions are limited to simple binary relationships (i.e., graph edges). However, one function often calls several other functions to produce a specific effect in an Android application, and FCGs cannot capture this multi-way relationship. In this paper, we propose to formalize Android malware detection as a hypergraph-level classification task. A hypergraph is a topology capable of portraying complex relationships among multiple vertices, which can better characterize the functional behavior of Android applications. We model Android applications as hypergraphs and use hypergraph neural networks, which encode high-order data correlations in a hypergraph structure for representation learning, to extract embedded features that represent the functional behavior of the applications. In experiments, we validate the performance gain that hypergraphs bring to detection on two open-source Android application datasets. In particular, HGNNP obtains the best classification accuracy of 91.10% on the Malnet-Tiny dataset and 97.1% on the Drebin dataset, outperforming all baseline methods.

Keywords:

malware detection; hypergraph neural network; hypergraph classification

1. Introduction

Due to its open-source nature, Android is vulnerable to malicious programs [1]. Therefore, detecting malware or malicious code is important for the data security and privacy protection of Android users. Traditional anti-malware approaches are signature-based detection techniques [2], which compare the attack signature of a sample against a list of pre-identified malicious signatures. The problem with these approaches is that they cannot identify malware with unknown signatures. Machine learning-based malware detection methods can instead use static or dynamic features such as calls and permissions [2,3], providing a new solution for Android malware detection.

In recent years, Function Call Graph (FCG)-based approaches [4,5,6] have been able to capture the rich semantic information of call relationships, showing both high detection accuracy and resistance to code obfuscation. In FCG-based approaches, each vertex represents a function, and the directed edges between vertices represent the caller–callee relationship between two functions. In the field of deep learning, graph neural networks (GNNs) can effectively capture topological features of graph data and are widely used in social networks [7], biology [8], and communication [9,10]. Thus, GNNs are directly applicable to extracting embedding features from the FCGs of Android applications, and a number of GNN-based malware detection methods have emerged.

Some studies use graph neural networks to extract graph embedding features of Android applications from FCGs and combine them with downstream classifiers for malware detection [11,12,13]. Other studies instead feed the FCG together with extracted function features directly to a graph convolutional neural network for representation learning [14,15].

However, considering only the caller–callee relationship between functions is still insufficient to exploit the rich semantic information between functions. This paper therefore uses hypergraphs to model the multiple forms of topological relationships in Android applications. Inspired by the construction of hypergraphs in hypergraph neural networks, this paper proposes two ways of constructing hyperedges: (1) treating the K-order call neighbors of the current function as a hyperedge and (2) treating functions that share the same permission as a hyperedge.

In the remainder of this paper, we first introduce FCGs and methods for extracting function features. Then, we provide a feasible solution for constructing hyperedges. Finally, we establish a task flow for Android malware detection based on hypergraph-level classification. In summary, the contributions of this paper are as follows:

(1) This paper proposes for the first time the use of hypergraphs instead of graphs to model higher-order relations among the functions of Android applications. We propose two ways of constructing hyperedges, common call hyperedges and common permission hyperedges, and use hypergraph convolutional neural networks to extract higher-order semantic information from the constructed hypergraph.

(2) This paper abstracts Android malware detection as a hypergraph-level classification task and builds a framework for this task. The performance gain brought by hypergraphs and hypergraph convolutional neural networks is verified on two open-source datasets. In particular, HGNNP obtains the best classification accuracy of 91.10% on the Malnet-Tiny dataset and 97.1% on the Drebin dataset, outperforming all baseline methods.

The rest of this paper is organized as follows: Section 2 reviews current research on Android malware detection; Section 3 introduces the concept of a hypergraph and four types of hypergraph neural networks; Section 4 presents our model and describes its principles; Section 5 introduces the datasets and baseline models used in this paper and evaluates the experimental results; and Section 6 summarizes the research of this paper.

2. Related Research

Traditional Android malware detection techniques fall into three main categories [16]: static analysis, dynamic analysis, and hybrid analysis. Several typical examples follow. Among static analysis techniques, Feng et al. [17] use a combination of static taint analysis and a new form of program representation to efficiently detect Android applications that have certain control- and data-flow properties. Faruki et al. [18] present a robust approach that generates a signature by extracting statistically improbable features to detect malicious Android apps. The proposed method is effective against code obfuscation and repackaging, which are widely used to evade antivirus signatures and to propagate unseen variants of known malware. Among dynamic analysis techniques, Xiao et al. [19] distinguish malware using system call sequences. Two different feature models, the frequency vector and the co-occurrence matrix, are employed to extract features from the system call sequence. Hybrid analysis methods combine static and dynamic analysis to observe malware from a more comprehensive perspective. Feng et al. [20] propose a two-layer method to detect malware in Android apps. The first layer is a static malware detection model based on permission, intent, and component information. In the second layer, a new method, CACNN, which cascades a CNN and an AutoEncoder, detects malware through the network traffic features of apps. Hybrid analysis methods usually achieve better detection and recognition performance than purely static or dynamic analysis, but their time, hardware, and model complexity are also higher.

Static methods are generally more efficient than dynamic methods but are vulnerable to obfuscation techniques, such as changing function names in code. Therefore, the FCG plays an important role in the static characterization of Android applications, reflecting the contextual dependencies and semantic meaning of functions at runtime rather than just remembering the form of the malware code [4,5,6]. With the rise of machine learning, an increasing number of researchers are focusing on how to apply machine learning to Android malware detection. Qiao et al. [21] use different machine learning methods, including Support Vector Machines, Random Forest, and Neural Networks, to detect malware by mining the patterns of permissions and API function calls acquired and used by Android apps. Zhao et al. [22] use decision tree and kNN classifiers to build an Android malware detection scheme based on sensitive API calls. They employ mutual information to measure the correlation between specific API calls and malware and generate a set of sensitive API calls. MalScan [4] treats FCGs as social networks and uses centrality analysis to obtain structural features of APK samples, which are then handed over to downstream classifiers for classification. Because FCGs contain a very large number of vertices, it is difficult for traditional machine learning methods to directly exploit the structural information of FCGs.

The deep graph neural network approach has shown great advantages for large-scale graph-level classification tasks and is naturally suited to Android malware detection based on FCGs. Cai et al. [11] generated embedding features of functions (vertices) using the Continuous Bag of Words (CBOW) algorithm from natural language processing. Then, they used graph neural networks to extract graph embedding features of Android applications from FCGs, combined with downstream classifiers for malware detection. Gdroid [12] first constructed a heterogeneous graph between APKs and APIs and then obtained the neighborhood relationships between APIs using word2vec and clustering algorithms [13]. It then used graph convolutional neural networks (GCN) to learn the representations of the APKs, which were handed over to a downstream classifier for detection and classification. However, the construction of heterogeneous graphs still required domain knowledge.

Compared to the approach in [12], a more straightforward approach is to directly combine the FCG and the extracted function features with the graph neural network. Lo et al. [1] used FCGs to extract centrality features of functions (vertices), directly combined them with popular graph neural networks such as GCN, GIN, and GraphSAGE for graph-level classification, and enhanced the model representation with the Jumping Knowledge technique to achieve excellent detection accuracy. Vinayaka et al. [14] extracted the properties of the functions in the FCG itself, such as whether they are declared with public or static descriptors, whether the function is an Android API, the size of the function bytecode, and the opcodes contained in the function. These properties reflect the essential functional characteristics of the function. The FCG with features is provided to a graph convolutional neural network for representation learning, and the graph embedding features obtained from the Readout function are provided to a downstream classifier for classification. MsDroid [15] separated local subgraphs around sensitive APIs from FCGs and used a GNN to classify the subgraphs. If a local subgraph of an APK was identified as malicious, the APK was treated as a malicious application.

The above research has achieved some success in the field of Android malware detection; however, considering only the caller–callee relationship between functions is still insufficient to exploit the rich semantic information. Using hypergraphs to reconstruct FCGs, or constructing hypergraphs from properties such as common permissions and common descriptors between functions, and then using hypergraph neural networks for representation learning, is a meaningful research direction. In this paper, we use hypergraphs to model the multiple forms of topological relationships in Android applications and thereby capture higher-order semantic relationships between functions.

3. Hypergraph Neural Networks

3.1. Preliminaries

A hypergraph is a more general topology than a graph that is capable of modeling multiple relationships between discrete data. A hyperedge in a hypergraph can contain multiple vertices, and thus, hypergraphs are capable of modeling higher-order relationships among vertices. In terms of the definition of hyperedges and edges, graphs are a special case of hypergraphs. There is a large amount of discrete data with complex relationships in the real world, but modeling binary relationships between discrete data via graphs can result in a loss of information, whereas hypergraphs are a more direct and natural way of modeling [23]. Figure 1 shows the topology difference between graphs and hypergraphs.

Let $\mathcal{H} = \langle V, E \rangle$ be a hypergraph, where $V = \{v_1, v_2, \dots, v_n\}$ denotes the vertex set containing $|V| = n$ vertices and $E = \{e = (v_{k_1}, \dots, v_{k_i}) \mid e \subseteq V\}$ denotes the set of hyperedges. The vertex-hyperedge adjacency (incidence) matrix is $H \in \mathbb{R}^{|V| \times |E|}$, with $H(v, e) = 1$ when vertex $v$ belongs to hyperedge $e$ and $H(v, e) = 0$ otherwise. A priori weights of the hyperedges can be represented by a diagonal matrix $W \in \mathbb{R}^{|E| \times |E|}$, where $W(i, i) = w(e_i)$. The degree of a vertex is $d_V(v) = \sum_{e} H(v, e)\, w(e)$, and the degree of a hyperedge is $d_E(e) = \sum_{v} H(v, e)$. The corresponding degree matrices, $D_V$ and $D_E$, are both diagonal. The Laplacian matrix of $\mathcal{H}$ is $L = I - D_V^{-1/2} H W D_E^{-1} H^{T} D_V^{-1/2}$, the normalized form whose propagation term appears in Equation (1). These concepts are clarified here in order to better explain hypergraph neural networks later in the paper.
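As a small illustration of these definitions, the following sketch builds the incidence matrix and both degree vectors for a toy hypergraph. The four vertices, two hyperedges, and unit weights are made-up example data, not taken from the paper's datasets.

```python
# Toy hypergraph with 4 vertices and 2 hyperedges (illustrative data only).
n = 4
hyperedges = [[0, 1, 2], [2, 3]]
weights = [1.0, 1.0]  # a priori hyperedge weights w(e)

# Incidence matrix: H[v][j] = 1 when vertex v belongs to hyperedge j.
H = [[1 if v in e else 0 for e in hyperedges] for v in range(n)]

# d_V(v) = sum_e H(v, e) w(e) and d_E(e) = sum_v H(v, e).
d_V = [sum(H[v][j] * weights[j] for j in range(len(hyperedges))) for v in range(n)]
d_E = [sum(H[v][j] for v in range(n)) for j in range(len(hyperedges))]

print(H)    # [[1, 0], [1, 0], [1, 1], [0, 1]]
print(d_V)  # [1.0, 1.0, 2.0, 1.0]
print(d_E)  # [3, 2]
```

Vertex 2 belongs to both hyperedges, so its degree is 2, while the first hyperedge connects three vertices at once, which a single graph edge cannot express.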

3.2. Hypergraph Neural Networks

HGNN [24] approximates the spectral convolution kernel with a two-term ($K = 1$) Chebyshev polynomial expansion based on the hypergraph Laplacian proposed by Zhou et al. [25], taking $\lambda_{max} \approx 2$, as in Equation (1):

$$g * x \approx \sum_{k=0}^{K} \theta_k T_k(\tilde{\Lambda})\, x \approx \theta_0 x - \theta_1 D_V^{-1/2} H W D_E^{-1} H^{T} D_V^{-1/2} x, \tag{1}$$

where $\tilde{\Lambda} = \frac{2\Lambda}{\lambda_{max}} - I_n$ and $\Lambda$ is the eigenvalue matrix of $L$. The convolution layer of HGNN is represented in Equation (2):

$$X_{out} = \delta\left(D_V^{-1/2} H W D_E^{-1} H^{T} D_V^{-1/2} X_{in} \Theta\right), \tag{2}$$

where $\delta$ is a nonlinear activation function and $\Theta$ is the learnable parameter matrix.
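The propagation in Equation (2) can be sketched in plain Python as follows. This is a minimal dense version for tiny examples only: ReLU is assumed for the activation, each hyperedge is assumed to list a vertex at most once, and the helper names (`hgnn_propagation`, `matmul`, `hgnn_layer`) as well as the toy features and weights are hypothetical. A real implementation would use sparse tensors in a deep learning framework.

```python
import math

def hgnn_propagation(hyperedges, weights, n):
    """Dense P = D_V^{-1/2} H W D_E^{-1} H^T D_V^{-1/2} for n vertices."""
    d_V = [0.0] * n
    for e, w in zip(hyperedges, weights):
        for v in e:
            d_V[v] += w
    P = [[0.0] * n for _ in range(n)]
    for e, w in zip(hyperedges, weights):
        for u in e:
            for v in e:
                # Each hyperedge contributes w / |e|, scaled by both vertex degrees.
                P[u][v] += w / (len(e) * math.sqrt(d_V[u] * d_V[v]))
    return P

def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hgnn_layer(P, X, Theta):
    """One layer of Equation (2); ReLU is an assumed choice for the activation."""
    Z = matmul(matmul(P, X), Theta)
    return [[max(0.0, z) for z in row] for row in Z]

hyperedges = [[0, 1, 2], [2, 3]]
P = hgnn_propagation(hyperedges, [1.0, 1.0], n=4)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]  # toy vertex features
Theta = [[1.0], [-1.0]]                                # hypothetical 2 -> 1 projection
X_out = hgnn_layer(P, X, Theta)                        # 4 rows, 1 column
```

Because the normalization is symmetric, `P` is a symmetric matrix, and the vector of square-rooted vertex degrees is left unchanged by it, which is the hypergraph analogue of the symmetric normalization used in GCN.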

HyperGCN [26] simplifies the hypergraph by defining a non-linear Laplacian matrix that generates only one simple edge for each hyperedge, namely the edge between the two vertices with the largest signal difference. Specifically, for a hyperedge $e$, the generated simple edge $(v_e, u_e)$ is as follows:

$$(v_e, u_e) := \operatorname*{arg\,max}_{v, u \in e} \lvert X_v - X_u \rvert, \tag{3}$$

where $X_v$ and $X_u$ are the initial features of vertex $v$ and vertex $u$, respectively. This generates the adjacency matrix $A_S$ of a simple graph, from which the convolution of HyperGCN is derived, as in Equation (4):

$$X_{out} = \delta\left((I_n + A_S) X_{in} \Theta\right). \tag{4}$$
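The edge selection in Equation (3) can be illustrated with a short sketch. It assumes the Euclidean distance as the measure of signal difference for vector-valued features and covers only the reduction from hyperedges to simple edges, not the full HyperGCN model; the function names and toy features are hypothetical.

```python
from itertools import combinations
import math

def hypergcn_simple_edges(hyperedges, X):
    """For each hyperedge, keep the vertex pair whose features are farthest apart."""
    edges = []
    for e in hyperedges:
        if len(e) < 2:
            continue  # a single-vertex hyperedge yields no simple edge
        u, v = max(combinations(e, 2), key=lambda p: math.dist(X[p[0]], X[p[1]]))
        edges.append((u, v))
    return edges

def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix A_S of the resulting simple graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A

X = [[0.0], [1.0], [5.0], [4.0]]  # toy one-dimensional vertex features
edges = hypergcn_simple_edges([[0, 1, 2], [2, 3]], X)
A_S = adjacency(edges, n=4)
print(edges)  # [(0, 2), (2, 3)]
```

In the first hyperedge, vertices 0 and 2 differ most (distance 5), so vertex 1 is left without an edge from that hyperedge.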

UniGNN [27] provides an integrated framework for the representation learning of graphs and hypergraphs by taking various approaches from well-established graph neural networks and extending them to hypergraphs with minimal cost. It includes UniGCN, UniGIN, UniGAT, etc. UniGNN proposes a two-stage message-passing approach as shown in Equation (5):

$$h_e = f_1\left(\{x_j \mid v_j \in e\}\right), \qquad \tilde{x}_i = f_2\left(x_i, \{h_e \mid e \in E(v_i)\}\right), \tag{5}$$

where $x_i$ is the initial feature of vertex $v_i$, $h_e$ represents the information aggregated onto hyperedge $e$, $E(v_i) = \{e \in E \mid v_i \in e\}$ is the set of hyperedges containing vertex $v_i$, and $f_1$, $f_2$ are aggregation functions. In UniGNN, the difference between UniGCN, UniGIN, UniGAT, and the other variants lies in the choice of the two aggregation functions [27].
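A minimal sketch of the two-stage scheme in Equation (5) follows. The concrete aggregators are assumptions chosen for illustration: $f_1$ is the mean over the vertices of a hyperedge and $f_2$ adds the incident hyperedge messages to $(1 + \epsilon)\,x_i$, roughly in the spirit of UniGIN, with the learnable weight matrix omitted. The function name and toy data are hypothetical.

```python
def two_stage_message_passing(hyperedges, X, eps=0.0):
    """Stage 1: vertices -> hyperedges (mean). Stage 2: hyperedges -> vertices (sum)."""
    dim = len(X[0])
    # Stage 1: h_e = mean of the features of the vertices in e.
    h = [[sum(X[v][d] for v in e) / len(e) for d in range(dim)] for e in hyperedges]
    # Stage 2: x~_i = (1 + eps) * x_i + sum of h_e over hyperedges containing v_i.
    out = [[(1.0 + eps) * x for x in row] for row in X]
    for j, e in enumerate(hyperedges):
        for v in e:
            for d in range(dim):
                out[v][d] += h[j][d]
    return h, out

X = [[0.0], [3.0], [6.0], [2.0]]
h, X_new = two_stage_message_passing([[0, 1, 2], [2, 3]], X)
print(h)      # [[3.0], [4.0]]
print(X_new)  # [[3.0], [6.0], [13.0], [6.0]]
```

Vertex 2 receives messages from both hyperedges, which is why its updated value combines its own feature with both hyperedge means.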

HGNNP [28] provides a convolution operator that differs slightly from that of HGNN. It uses a random-walk-based probability transition matrix for feature propagation:

$$X_{out} = \delta\left(D_V^{-1} H W D_E^{-1} H^{T} X_{in} \Theta\right). \tag{6}$$

The four typical hypergraph neural networks above can be summarized as follows. HGNN learns hidden-layer representations that account for the high-order data structure and serves as a general framework for complex data correlations. HyperGCN is a GCN for semi-supervised learning (SSL) on attributed hypergraphs. UniGNN is a unified framework for interpreting the message-passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. HGNNP introduces a general high-order multi-modal/multi-type data correlation modeling framework to learn an optimal representation in a single hypergraph-based framework. This paper abstracts Android malware detection as a hypergraph-level classification task and uses these four hypergraph networks to implement the classification task.
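To make the difference between Equations (2) and (6) concrete, the sketch below builds the HGNNP propagation matrix for a toy hypergraph. It is a dense, plain-Python illustration with a hypothetical function name, not the DHG implementation, and it assumes each hyperedge lists a vertex at most once.

```python
def hgnnp_propagation(hyperedges, weights, n):
    """Dense P = D_V^{-1} H W D_E^{-1} H^T for n vertices."""
    d_V = [0.0] * n
    for e, w in zip(hyperedges, weights):
        for v in e:
            d_V[v] += w
    P = [[0.0] * n for _ in range(n)]
    for e, w in zip(hyperedges, weights):
        for u in e:
            for v in e:
                # Step from u to hyperedge e with probability w / d_V(u),
                # then to a vertex of e chosen uniformly (1 / |e|).
                P[u][v] += w / (d_V[u] * len(e))
    return P

P_rw = hgnnp_propagation([[0, 1, 2], [2, 3]], [1.0, 1.0], n=4)
row_sums = [sum(row) for row in P_rw]
```

Each row of `P_rw` sums to 1 for every vertex that belongs to at least one hyperedge, so the operator averages features along a two-step random walk (vertex to hyperedge to vertex), whereas the HGNN operator in Equation (2) uses a symmetric normalization.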

4. Methodology

4.1. Feature Extraction

The most important element in graph neural network-based Android malware detection is the FCG, which describes the call relationships between functions. There should be a clear difference between the FCGs of benign and malicious applications, as malware usually requires frequent calls to risky and sensitive interfaces. This subsection focuses on the decompilation process for Android applications, the acquisition of FCGs and their form, and the feature extraction for functions.

Function Call Graphs: A function call graph is a directed graph $G = \langle V, E \rangle$, where $V$ is the set of vertices and each vertex represents a function, either internal or external. Internal functions are functions whose code appears in the DEX file and are generally written by the application developer. For an external function, by contrast, the DEX file contains only its declaration and not the code that performs the specific function; external functions include the API interfaces provided by Android. $E = \{(u, v) \mid u \to v,\ u, v \in V\}$ is the edge set, representing the inter-procedural call relationship between functions.

Features of Functions: The FCG contains a wealth of graph structure information that GNNs use for passing messages between vertices. However, the extracted FCG has no vertex features; thus, the next step is to find initial features that represent the functionality of each function. The vertex features extracted in this paper are divided into two categories: centrality features and functional features.

Centrality features: Inspired by [29], the following centrality measures of vertices are selected as the first part of the features in this paper: PageRank, Degree Centrality, Betweenness Centrality [30], Closeness Centrality [31], Katz Centrality [32], and Harmonic Centrality [33]. Due to space limitations, detailed definitions of these centrality metrics are left to the references [30,31,32,33].

Functional Features: We extracted the functional features following the methods from [34]. To describe the semantic function of a function more specifically, this paper extracts functional features from the original code of the function in the DEX file. For internal functions (Internal Method), Androguard can parse the code block of the function to obtain the opcode classes that appear in the code block.

Each opcode has a specific 8-bit binary representation, so there are 256 possible opcodes, of which 230 are given a specific meaning; the remaining opcodes are undefined and kept in reserve. Many of these opcodes have the same functional properties; for example, invoke-direct, invoke-virtual, and invoke-static are all essentially invoke operations. The work in [35] combines opcodes with the same functional properties to obtain a total of 21 opcode functional groups. For any internal function $v$ in the FCG, its functional identity can be expressed as a 21-dimensional 0–1 vector $opc(v)$, where $opc(v)[i] = 1$ if $v$ contains an opcode of the $i$-th functional group. For any external function $u$ in the FCG, since the DEX file does not contain its code, $opc(u)[i] = 0$ for all $i$.

Then, a 226-dimensional one-hot vector $api(u)$ is obtained for any external function $u$, where 226 is the number of Android API packages: $api(u)[i] = 1$ when the external function $u$ matches the $i$-th API package under the longest-prefix algorithm. Since an internal function $v$ does not match any API package, $api(v) = \mathbf{0}$ for every internal function $v$. In summary, for any function $v$ in the FCG, the functional identity $func(v)$ of the function is obtained as shown in Equation (7):

$$func(v) = [\,opc(v) \,\|\, api(v)\,], \tag{7}$$

where $\|$ indicates the concatenation operation.
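The construction of $func(v)$ in Equation (7) can be sketched as below. The three-entry package list, the opcode group indices, and the function names are placeholders introduced for illustration; the paper uses 21 opcode groups from [35] and all 226 Android API packages, and obtains opcodes through Androguard, which is not invoked here.

```python
N_OPCODE_GROUPS = 21
# Hypothetical three-package list; the paper uses all 226 Android API packages.
API_PACKAGES = ["android.app", "android.telephony", "android.telephony.gsm"]

def opc(opcode_group_ids):
    """0-1 vector marking which opcode functional groups occur in a function."""
    vec = [0] * N_OPCODE_GROUPS
    for g in opcode_group_ids:
        vec[g] = 1
    return vec

def api(qualified_name):
    """One-hot vector of the API package with the longest matching prefix."""
    vec = [0] * len(API_PACKAGES)
    matches = [i for i, pkg in enumerate(API_PACKAGES)
               if qualified_name.startswith(pkg + ".")]
    if matches:
        vec[max(matches, key=lambda i: len(API_PACKAGES[i]))] = 1
    return vec

def func(qualified_name, opcode_group_ids, is_internal):
    """func(v) = [opc(v) || api(v)]; one half is all zeros for each function kind."""
    if is_internal:
        return opc(opcode_group_ids) + [0] * len(API_PACKAGES)
    return [0] * N_OPCODE_GROUPS + api(qualified_name)

external = func("android.telephony.gsm.SmsManager.sendTextMessage", [], False)
internal = func("com.example.Main.run", [0, 5], True)
```

With the full package list, each vector would have 21 + 226 = 247 entries; in this toy setting it has 24. The external function above matches both `android.telephony` and `android.telephony.gsm`, and the longest prefix wins.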

4.2. Hypergraph Construction

After obtaining the FCG and vertex features of the Android application, the next step is the construction of the hypergraph, which mainly consists of constructing hyperedges. We propose two schemes for the construction of hyperedges:

K-order Common Call Hyperedges: It is common in Android applications for one function to call multiple other functions, and such co-calls reflect dependencies or similar functional relationships among those functions. Therefore, this paper proposes to construct a hyperedge from the K-order nearest neighbors of a function. When $K > 1$, the hyperedge is able to model a complete calling relationship, as shown in Figure 2. Intuitively, a longer call chain may be better at passing relevant information about the underlying functionality to the central function, such as using the underlying interface to retrieve user read and write permissions or to monitor user actions. Here, $K$ is a hyperparameter of the method. A larger $K$ may involve some unimportant neighborhood vertices and bring noise into the information transfer, whereas a smaller $K$ can only model local information and cannot reflect the advantages of hypergraph modeling; thus, $K$ needs to be tuned for the actual task.

Common Permission Hyperedges: Inspired by APIGraph [6], this paper constructs a set of hyperedges for functions that share permissions. First, Androguard is used to parse the APK manifest file to obtain the system permissions required by each function. From the resulting set of system permissions, each permission is treated as a hyperedge, and the functions that require this permission belong to this hyperedge. In Android, common permission requests usually express high-level semantic information about a function’s functionality; thus, using common permissions to build hyperedges is useful for discovering similar functional features between functions.
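Both constructions can be sketched in a few lines. The sketch makes several assumptions that the text does not fix: K-order neighbors are taken to be the functions reachable within K call steps in the caller-to-callee direction, the central function is included in its own hyperedge, and functions that call nothing produce no hyperedge. The input dictionaries and function names are hypothetical stand-ins for what a tool such as Androguard would provide.

```python
from collections import deque

def k_order_call_hyperedges(calls, K):
    """calls maps each function to the functions it calls (directed FCG)."""
    hyperedges = []
    for root in calls:
        seen = {root}
        frontier = deque([(root, 0)])
        while frontier:
            f, depth = frontier.popleft()
            if depth == K:
                continue  # do not expand beyond K call steps
            for callee in calls.get(f, []):
                if callee not in seen:
                    seen.add(callee)
                    frontier.append((callee, depth + 1))
        if len(seen) > 1:  # skip functions that call nothing
            hyperedges.append(sorted(seen))
    return hyperedges

def permission_hyperedges(required):
    """required maps each function to the permissions it needs."""
    groups = {}
    for f, perms in required.items():
        for p in perms:
            groups.setdefault(p, []).append(f)
    return {p: sorted(fs) for p, fs in groups.items()}

calls = {0: [1, 2], 1: [3], 2: [], 3: [4], 4: []}
print(k_order_call_hyperedges(calls, K=1))  # [[0, 1, 2], [1, 3], [3, 4]]
print(k_order_call_hyperedges(calls, K=2))  # [[0, 1, 2, 3], [1, 3, 4], [3, 4]]

required = {"sendSms": ["SEND_SMS"], "readInbox": ["READ_SMS", "SEND_SMS"], "draw": []}
print(permission_hyperedges(required))
# {'SEND_SMS': ['readInbox', 'sendSms'], 'READ_SMS': ['readInbox']}
```

Raising K from 1 to 2 enlarges the hyperedge of function 0 from its direct callees to the whole two-step call chain, which illustrates the trade-off between context and noise described above.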

4.3. Hypergraph Level Classification

In this part, we build a framework for hypergraph-level classification, as shown in Figure 3. The framework is developed on top of the open-source hypergraph algorithm library DHG [28], which targets vertex classification. Its two key components are the data loader for hypergraph-level classification and the pooling layer. In graph neural networks, for the graph classification task, the adjacency matrices of the graphs in a Mini-Batch need to be diagonally spliced, and the corresponding vertex indices are increased sequentially. A hypergraph is represented by its vertex-hyperedge adjacency matrix; thus, this paper adopts the same idea of diagonally splicing the vertex-hyperedge adjacency matrices of the hypergraphs in a Mini-Batch, with the vertex and hyperedge indices increasing sequentially.

In practice, in order to save memory, hypergraphs are generally represented as a collection of hyperedges, and the vertex-hyperedge adjacency matrix is stored as a sparse matrix; thus, it is not necessary to perform the diagonal splicing directly at the matrix level. Instead, each vertex index in the hyperedge list of a hypergraph in the Mini-Batch is simply offset by the total number of vertices in all the hypergraphs preceding it. Figure 4 shows the concatenation of Mini-Batch hypergraphs.

For the $i$-th hypergraph in the Mini-Batch, $\mathcal{H}_i = \langle V_i, E_i \rangle$, every vertex index $v_k \in e_j$ of every hyperedge $e_j \in E_i$ is shifted as $\hat{v}_k = v_k + \sum_{t=1}^{i-1} |V_t|$, which gives $\hat{\mathcal{H}}_i = \langle \hat{V}_i, \hat{E}_i \rangle$. The vertex sets and the hyperedge sets of these hypergraphs are merged to obtain $\mathcal{H}_{Batch} = \langle \hat{V}_1 \cup \hat{V}_2 \cup \dots \cup \hat{V}_m,\ \hat{E}_1 \cup \hat{E}_2 \cup \dots \cup \hat{E}_m \rangle$. Once $\mathcal{H}_{Batch}$ is obtained, the original features of the vertices are stacked by row in the order of the shifted vertex indices: $X_{Batch} = \mathrm{Stack}(X_1, X_2, \dots, X_m)$. The pair $\langle \mathcal{H}_{Batch}, X_{Batch} \rangle$ is provided to the hypergraph convolution module to extract the embedding features of all vertices, and after the average pooling layer, the embedding features of each hypergraph are obtained.
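The index shifting and the average pooling can be sketched as follows. This is an illustrative stand-alone version with hypothetical function names, not the DHG data loader: each hypergraph is assumed to be given as a triple of vertex count, hyperedge list with local indices, and feature rows, and every hypergraph is assumed to have at least one vertex.

```python
def batch_hypergraphs(graphs):
    """graphs: list of (num_vertices, hyperedges, X) triples with local vertex ids."""
    edges, X_batch, graph_id, offset = [], [], [], 0
    for g, (n, hyperedges, X) in enumerate(graphs):
        # Shift every vertex id by the number of vertices in the preceding graphs.
        edges.extend([v + offset for v in e] for e in hyperedges)
        X_batch.extend(X)            # stack features by row
        graph_id.extend([g] * n)     # remember which graph each row belongs to
        offset += n
    return edges, X_batch, graph_id

def mean_pool(Z, graph_id, num_graphs):
    """Average the vertex embeddings of each hypergraph in the batch."""
    dim = len(Z[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for z, g in zip(Z, graph_id):
        counts[g] += 1
        for d in range(dim):
            sums[g][d] += z[d]
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]

g1 = (3, [[0, 1, 2]], [[1.0], [2.0], [3.0]])
g2 = (2, [[0, 1]], [[10.0], [20.0]])
edges, X_batch, graph_id = batch_hypergraphs([g1, g2])
print(edges)                            # [[0, 1, 2], [3, 4]]
print(mean_pool(X_batch, graph_id, 2))  # [[2.0], [15.0]]
```

Because no hyperedge spans two hypergraphs after the shift, one convolution over the merged structure processes all samples in the batch without mixing information between them, and the `graph_id` list lets the pooling step separate them again.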

For the binary classification problem, the Binary Cross-Entropy Loss function is chosen as shown in Equation (8):

$$BCELoss(x, y) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log p(x_i) + (1 - y_i) \log\left(1 - p(x_i)\right) \right], \tag{8}$$

where $x_i$ represents a hypergraph sample, $p(x_i)$ is the predicted probability output by the classifier, $y_i \in \{0, 1\}$ is the true label corresponding to $x_i$, and $m$ is the number of samples in the Mini-Batch.
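As a quick numerical check of Equation (8), the following sketch evaluates the loss for a two-sample batch. The probabilities and labels are made-up values, and the function name is hypothetical.

```python
import math

def bce_loss(p, y):
    """Binary cross-entropy averaged over a Mini-Batch of predicted probabilities p."""
    m = len(y)
    total = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for pi, yi in zip(p, y))
    return -total / m

# One malicious sample predicted at 0.9 and one benign sample predicted at 0.2.
loss = bce_loss([0.9, 0.2], [1, 0])
```

The leading minus sign makes the loss non-negative, and it approaches zero as the predicted probabilities approach the true labels.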

For the malware family multi-classification problem, the Multi-Class Cross-Entropy Loss function is chosen as shown in Equation (9):

$$MCELoss(x, y) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{c} y_{ij} \log p(x_i)_j, \tag{9}$$

where $c$ is the number of categories, $y_i$ denotes the one-hot category vector of the $i$-th sample (if the category of the $i$-th sample is $j$, then $y_{ij} = 1$), and $p(x_i)_j$ is the predicted probability that the $i$-th sample belongs to the $j$-th category.
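Equation (9) can be checked in the same way. The sketch below uses made-up probability vectors and one-hot labels with a hypothetical function name, and it also confirms that with two categories the loss coincides with the binary cross-entropy of Equation (8).

```python
import math

def mce_loss(P, Y):
    """Multi-class cross-entropy for probability rows P and one-hot label rows Y."""
    m = len(Y)
    total = sum(y * math.log(p)
                for probs, onehot in zip(P, Y)
                for p, y in zip(probs, onehot) if y)  # only the true class contributes
    return -total / m

# Three categories: sample 0 belongs to class 0, sample 1 to class 1.
loss3 = mce_loss([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], [[1, 0, 0], [0, 1, 0]])

# Two categories: identical to BCE with p = [0.9, 0.2] and y = [1, 0].
loss2 = mce_loss([[0.1, 0.9], [0.8, 0.2]], [[0, 1], [1, 0]])
```

Only the predicted probability of the true category enters the sum, so the loss penalizes low confidence in the correct malware family regardless of how the remaining probability mass is distributed.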

5. Evaluation

In this section, two open-source datasets, Malnet-Tiny and Drebin, are used to experimentally validate the hypergraph-based method proposed in this paper.

5.1. Datasets Description

Malnet-Tiny [36]: Malnet-Tiny contains FCGs for 5000 Android apps, where every FCG has fewer than 5000 vertices, hence the name “Tiny”. As shown in Table 1, the entire dataset is divided into five categories, namely Benign, Adware, Addisplay, Downloader, and Trojan, with the latter four being malicious app families. The five classes in the dataset are balanced with 1000 samples each, which allows the effect of sample imbalance on model performance to be excluded. As can be seen from Table 1, the average number of vertices and the average number of edges differ significantly between categories, indicating that the FCGs of the different categories differ significantly in structure. Malnet-Tiny contains only the edge list of the FCG of each Android application and no information about specific functions. Therefore, in the feature extraction stage, only the centrality features of the samples can be extracted, and in the hypergraph construction stage, only K-order common call hyperedges can be constructed. Such an experimental setup can verify the gain brought by transforming the original FCG into a hypergraph and validate the advantages of hypergraph modeling and its convolution operator. In the experiments of this paper, the training set and test set comprise 80% and 20% of the dataset, respectively.

Drebin [37]: Drebin is a dataset containing only malicious Android apps, with a total of 5560 samples divided into 179 malicious families. In this paper, we downloaded the original APK files, and the task flow described in the previous section allows us to extract the full initial vertex features from the original APKs, as well as to construct both types of hyperedges. The task designed for Drebin in this paper is malicious family classification. Since many families in Drebin have a small sample size, the top 20 malicious families in terms of sample size are selected from the 179 families for experimentation, as shown in Table 2. In this task, the training and test sets comprise 80% and 20% of the dataset, respectively.

5.2. Evaluation Results

Malnet-Tiny: Since the samples of different categories in Malnet-Tiny are balanced, accuracy is chosen as the metric for comparing the methods in this paper. In this task, the baseline methods include mapping-based methods, SLaq-VNGE [36] and SLaq-LSD [36]; feature-based methods, Feather [36], NoG [36], and LDP [36]; and graph neural network-based methods, GCN [36] and GIN [36]. In the study [36], GCN is implemented based on the PYG framework. For a fair comparison, this paper reproduces the GCN and GIN methods in the hypergraph-level classification framework described above, in line with the study [36], using five-layer GCN and GIN with a learning rate of 0.0001 and a hidden layer dimension of 64. The GCN and GIN methods reproduced in this paper have accuracies of 75% and 83%, respectively. For the hypergraph-based methods, this paper evaluates five-layer HGNN, HyperGCN, UniGNN, and HGNNP as hypergraph convolution modules to extract hypergraph features, and the pooling layer uniformly uses the average pooling operator. The learning rate is set to 0.001, and the hidden layer dimension is 64. As shown in Table 3, HGNNP obtains the best classification performance of 91.10%, which outperforms all the feature-based, graph-spectral-based, and GNN-based methods. Compared to the GNN-based methods GCN and GIN, all hypergraph-based methods achieve a superior classification performance. Among the hypergraph approaches, HGNN, HyperGCN, and HGNNP are spectral-based convolutional neural networks, with improvements of 12.2%, 3.9%, and 15.7%, respectively, compared to the spectral-based GCN. The relatively poor results of HyperGCN among the hypergraph-based methods may stem from the high number of isolated vertices in the constructed hypergraph dataset, as HyperGCN needs to discard isolated vertices [26]. For UniGNN, we choose UniGIN to compare with the GNN-based GIN. The results show that UniGIN gains a 5.9% performance improvement over GIN. It can therefore be concluded that modeling function call relations with hypergraphs benefits Android malware detection, reflecting the advantages of hypergraph modeling of higher-order function call relations.

Drebin: Due to the unbalanced sample of categories in Drebin, in this task, in addition to accuracy, we add precision, recall, and F1 score as metrics for comparison. As mentioned in Section 5.1, the top 20 categories in terms of sample size were selected for malicious family classification, with a total of 4664 malicious samples. This experiment was able to fully extract the centrality features of the function as well as the functional features, and the construction of two forms of hyperedges. In this experiment, graph neural network-based methods are used as baseline algorithms in this paper, including GCN, GIN, and Graph-SAGE. Specifically, the following settings are adopted for the above graph-based methods: the number of convolutional layers is 3, the learning rate is 0.001, the hidden layer dimension is 300, and the batch size is 64. For the hypergraph-based methods, HyperGCN, UniGNN, HGNN, and HGNNP, with specific settings of 3 convolutional layers, the learning rate of 0.0001, the hidden layer dimension of 300, and the batch size of 64. All methods use the Adam optimizer as well as the average pooling layer.

As seen in Table 4, HGNNP obtained the best accuracy of 97.1%, while HGNN obtained the best precision, recall, and F1 score of 97.2%, 96.8%, and 0.9699, respectively. Compared to GCN, the hypergraph-based HGNN and HGNNP improved accuracy by 3.2 and 3.3 percentage points, respectively, and F1 score by 0.0410 and 0.0315, respectively. This suggests that, for spectral-based convolutional networks, hypergraph modeling extracts richer higher-order information for classifying malicious applications than the GNN approach that directly uses FCGs. For UniGNN, we again chose UniGIN as the hypergraph method being evaluated. Compared to GIN, UniGIN shows no advantage in accuracy but gains 0.004 in F1 score. GraphSAGE is generally considered to generalize better because it samples neighboring vertices. Notably, GraphSAGE reaches an F1 score of 0.9639, outperforming HyperGCN and UniGIN, which may be because its aggregation of K-order neighbor information is comparable to directly constructing K-order neighbor hyperedges. Nevertheless, compared to GraphSAGE, HGNNP improves accuracy by 0.6 percentage points and HGNN improves the F1 score by 0.006, which illustrates the benefits of introducing hypergraphs into Android malware detection.
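The pairwise comparisons in this paragraph can be recomputed directly from the accuracy and F1-Score columns of Table 4. The short sketch below does so; the dictionary `TABLE_4` copies those two columns, and the helper name `gain` is an assumption introduced only for this illustration.

```python
from typing import Dict, Tuple

# (accuracy in %, F1-Score) copied from Table 4.
TABLE_4: Dict[str, Tuple[float, float]] = {
    "GCN": (93.8, 0.9289),
    "GIN": (96.7, 0.9559),
    "GraphSAGE": (96.5, 0.9639),
    "HyperGCN": (93.1, 0.9254),
    "UniGNN": (96.2, 0.9599),
    "HGNNP": (97.1, 0.9604),
    "HGNN": (97.0, 0.9699),
}


def gain(method: str, baseline: str) -> Tuple[float, float]:
    """Return (accuracy gain in percentage points, F1 gain) of method over baseline."""
    acc_m, f1_m = TABLE_4[method]
    acc_b, f1_b = TABLE_4[baseline]
    # Rounding removes floating-point noise from the subtraction.
    return round(acc_m - acc_b, 1), round(f1_m - f1_b, 4)


comparisons = [
    ("HGNN", "GCN"),
    ("HGNNP", "GCN"),
    ("UniGNN", "GIN"),
    ("HGNNP", "GraphSAGE"),
    ("HGNN", "GraphSAGE"),
]
for method, baseline in comparisons:
    acc_gain, f1_gain = gain(method, baseline)
    print(f"{method} vs {baseline}: accuracy {acc_gain:+.1f} pp, F1 {f1_gain:+.4f}")
```

Note that accuracy gains are differences in percentage points, while F1 gains are differences on the 0 to 1 scale used in Table 4.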

6. Conclusions and Future Work

In this paper, hypergraph convolutional neural networks are introduced to Android malware detection. First, this study extracts function centrality features and function functionality features from the FCGs and the function code blocks of Android applications, respectively. Second, it proposes two types of hyperedge construction: K-order common call hyperedges and common permission hyperedges. Third, it establishes the overall task flow of hypergraph classification and designs a Mini-Batch reconstruction method to obtain the reconstructed hypergraph. It then extracts hypergraph embedding features through the hypergraph convolutional network and the pooling layer, and finally uses a classifier to detect malicious Android applications. In the experiments, this study validates the gain from modeling Android applications with hypergraphs on two datasets, Malnet-Tiny and Drebin, and also verifies the detection performance of hypergraph convolutional neural networks.

The method proposed in this study has achieved good experimental results, but further research is needed. We will focus mainly on the following two aspects. (1) The model should be tested on other available datasets. Android applications have numerous features and there are many types of malicious software, so a dataset that is not large enough may lead to poor model performance. (2) The current model only uses static analysis features; future research can consider adding dynamic analysis features, since combining static and dynamic features can offer complementary advantages.

Author Contributions

Conceptualization, D.Z. and X.W.; methodology, D.Z.; software, E.H.; validation, H.L., X.G., and X.Y.; formal analysis, D.Z.; investigation, R.L.; resources, X.W.; data curation, X.W.; writing—original draft preparation, D.Z.; writing—review and editing, H.L.; visualization, E.H.; supervision, X.Y.; project administration, X.G.; funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by Science and Technology on Communication Networks Laboratory Fund Project (FFX22641X017, FFX22641X009, HHX21641X010, HHX23641X003).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. Two open-source datasets, Malnet-Tiny and Drebin, were used in this paper. For Drebin, please visit https://drebin.mlsec.org/ (accessed on 15 October 2023). For Malnet-Tiny, please visit https://mal-net.org/ (accessed on 15 October 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lo, W.W.; Layeghy, S.; Sarhan, M.; Gallagher, M.; Portmann, M. Graph Neural Network-Based Android Malware Classification with Jumping Knowledge. In Proceedings of the 2022 IEEE Conference on Dependable and Secure Computing (DSC), Edinburgh, UK, 22–24 June 2022; pp. 1–9.
2. Gandotra, E.; Bansal, D.; Sofat, S. Malware Analysis and Classification: A Survey. JIS 2014, 5, 56–64.
3. Liang, X.; Gu, Z.; Xie, Y.; Wang, L.; Tian, Z. MUSEDA: Multilingual Unsupervised and Supervised Embedding for Domain Adaption. Knowl.-Based Syst. 2023, 273, 110560.
4. Wu, Y.; Li, X.; Zou, D.; Yang, W.; Zhang, X.; Jin, H. MalScan: Fast Market-Wide Mobile Malware Scanning by Social-Network Centrality Analysis. In Proceedings of the 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), San Diego, CA, USA, 11–15 November 2019; pp. 139–150.
5. Mariconti, E.; Onwuzurike, L.; Andriotis, P.; De Cristofaro, E.; Ross, G.; Stringhini, G. MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models. In Proceedings of the 2017 Network and Distributed System Security Symposium, San Diego, CA, USA, 26 February–1 March 2017; Internet Society: San Diego, CA, USA, 2017.
6. Zhang, X.; Zhang, Y.; Zhong, M.; Ding, D.; Cao, Y.; Zhang, Y.; Zhang, M.; Yang, M. Enhancing State-of-the-Art Classifiers with API Semantics to Detect Evolved Android Malware. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; ACM: New York, NY, USA, 2020; pp. 757–770.
7. Ni, S.; Li, J.; Kao, H.-Y. MVAN: Multi-View Attention Networks for Fake News Detection on Social Media. IEEE Access 2021, 9, 106907–106917.
8. Li, R.; Yuan, X.; Radfar, M.; Marendy, P.; Ni, W.; O’Brien, T.J.; Casillas-Espinosa, P. Graph Signal Processing, Graph Neural Network and Graph Learning on Biological Data: A Systematic Review. IEEE Rev. Biomed. Eng. 2023, 16, 109–135.
9. He, J.; Zhao, H. Fault Diagnosis and Location Based on Graph Neural Network in Telecom Networks. In Proceedings of the 2020 International Conference on Networking and Network Applications (NaNA), Haikou, China, 10–13 December 2020; pp. 304–309.
10. Jia, Y.; Gu, Z.; Du, L.; Long, Y.; Wang, Y.; Li, J.; Zhang, Y. Artificial Intelligence Enabled Cyber Security Defense for Smart Cities: A Novel Attack Detection Framework Based on the MDATA Model. Knowl.-Based Syst. 2023, 276, 110781.
11. Cai, M.; Jiang, Y.; Gao, C.; Li, H.; Yuan, W. Learning Features from Enhanced Function Call Graphs for Android Malware Detection. Neurocomputing 2021, 423, 301–307.
12. Gao, H.; Cheng, S.; Zhang, W. GDroid: Android Malware Detection and Classification with Graph Convolutional Network. Comput. Secur. 2021, 106, 102264.
13. Zhang, H.; Gu, Z.; Tan, H.; Wang, L.; Zhu, Z.; Xie, Y.; Li, J. Masking and Purifying Inputs for Blocking Textual Adversarial Attacks. Inf. Sci. 2023, 648, 119501.
14. Vinayaka, V.K.; Jaidhar, C.D. Android Malware Detection Using Function Call Graph with Graph Convolutional Networks. In Proceedings of the 2021 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC), Jalandhar, India, 21–23 May 2021; pp. 279–287.
15. He, Y.; Li, Y.; Wu, L.; Yang, Z.; Ren, K.; Qin, Z. MsDroid: Identifying Malicious Snippets for Android Malware Detection. IEEE Trans. Dependable Secur. Comput. 2023, 20, 2025–2039.
16. Liu, K.; Xu, S.; Xu, G.; Zhang, M.; Sun, D.; Liu, H. A Review of Android Malware Detection Approaches Based on Machine Learning. IEEE Access 2020, 8, 124579–124607.
17. Feng, Y.; Anand, S.; Dillig, I.; Aiken, A. Apposcopy: Semantics-Based Detection of Android Malware through Static Analysis. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong SAR, China, 16–21 November 2014; ACM: New York, NY, USA, 2014; pp. 576–587.
18. Faruki, P.; Ganmoor, V.; Laxmi, V.; Gaur, M.S.; Bharmal, A. AndroSimilar: Robust Statistical Feature Signature for Android Malware Detection. In Proceedings of the 6th International Conference on Security of Information and Networks, Aksaray, Turkey, 26–28 November 2013; ACM: New York, NY, USA, 2013; pp. 152–159.
19. Xiao, X.; Xiao, X.; Jiang, Y.; Liu, X.; Ye, R. Identifying Android Malware with System Call Co-occurrence Matrices. Trans. Emerg. Tel. Tech. 2016, 27, 675–684.
20. Feng, J.; Shen, L.; Chen, Z.; Wang, Y.; Li, H. A Two-Layer Deep Learning Method for Android Malware Detection Using Network Traffic. IEEE Access 2020, 8, 125786–125796.
21. Qiao, M.; Sung, A.H.; Liu, Q. Merging Permission and API Features for Android Malware Detection. In Proceedings of the 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 10–14 July 2016; pp. 566–571.
22. Zhao, C.; Zheng, W.; Gong, L.; Zhang, M.; Wang, C. Quick and Accurate Android Malware Detection Based on Sensitive APIs. In Proceedings of the 2018 IEEE International Conference on Smart Internet of Things (SmartIoT), Xi’an, China, 17–19 August 2018; pp. 143–148.
23. Jia, Y.; Gu, Z.; Jiang, Z.; Gao, C.; Yang, J. Persistent Graph Stream Summarization for Real-Time Graph Analytics. World Wide Web 2023, 26, 2647–2667.
24. Feng, Y.; You, H.; Zhang, Z.; Ji, R.; Gao, Y. Hypergraph Neural Networks. AAAI 2019, 33, 3558–3565.
25. Zhou, D.; Huang, J.; Schölkopf, B. Learning with Hypergraphs: Clustering, Classification, and Embedding. In Advances in Neural Information Processing Systems 19; Schölkopf, B., Platt, J., Hofmann, T., Eds.; The MIT Press: Cambridge, MA, USA, 2007; pp. 1601–1608. ISBN 978-0-262-25691-9.
26. Yadati, N.; Nimishakavi, M.; Yadav, P.; Nitin, V.; Louis, A.; Talukdar, P. HyperGCN: A New Method For Training Graph Convolutional Networks on Hypergraphs. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32.
27. Huang, J.; Yang, J. UniGNN: A Unified Framework for Graph and Hypergraph Neural Networks. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 19–27 August 2021; International Joint Conferences on Artificial Intelligence Organization: Montreal, QU, Canada, 2021; pp. 2563–2569.
28. Gao, Y.; Feng, Y.; Ji, S.; Ji, R. HGNN+: General Hypergraph Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 1–18.
29. Cui, H.; Lu, Z.; Li, P.; Yang, C. On Positional and Structural Node Features for Graph Neural Networks on Non-Attributed Graphs. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; ACM: New York, NY, USA, 2022; pp. 3898–3902.
30. Freeman, L.C. A Set of Measures of Centrality Based on Betweenness. Sociometry 1977, 40, 35.
31. Freeman, L.C. Centrality in Social Networks Conceptual Clarification. Soc. Netw. 1978, 1, 215–239.
32. Katz, L. A New Status Index Derived from Sociometric Analysis. Psychometrika 1953, 18, 39–43.
33. Marchiori, M.; Latora, V. Harmony in the Small-World. Phys. A Stat. Mech. Its Appl. 2000, 285, 539–546.
34. Gibert, D.; Mateu, C.; Planes, J. HYDRA: A Multimodal Deep Learning Framework for Malware Classification. Comput. Secur. 2020, 95, 101873.
35. Liu, Y.; Zhang, L.; Huang, X. Using G Features to Improve the Efficiency of Function Call Graph Based Android Malware Detection. Wirel. Pers. Commun. 2018, 103, 2947–2955.
36. Freitas, S.; Dong, Y.; Neil, J.; Chau, D.H. A Large-Scale Database for Graph Representation Learning. arXiv 2021, arXiv:2011.07682.
37. Arp, D.; Spreitzenbarth, M.; Hübner, M.; Gascon, H.; Rieck, K. Drebin: Effective and Explainable Detection of Android Malware in Your Pocket. In Proceedings of the 2014 Network and Distributed System Security Symposium, San Diego, CA, USA, 23–26 February 2014; Internet Society: San Diego, CA, USA, 2014.

Figure 1. Examples of graph and hypergraph structures: (a) graph and (b) hypergraph.

Figure 2. K-order common hyperedge construction.

Figure 3. Framework of hypergraph malware detection.

Figure 4. Concatenation of Mini-Batch hypergraphs.

Table 1. Statistics on Malnet-Tiny dataset.

Categories	Samples	Ave Vertices	Ave Edges
Benign	1000	2064.3	3886.7
Adware	1000	2506.7	5597.2
Addisplay	1000	1631.2	2920
Downloader	1000	49.3	55.3
Trojan	1000	799.8	1840.3
Total	5000	1410.3	2859.9

Table 2. Top 20 malicious families by the number of samples of Drebin.

Family	Samples	Family	Samples
FakeInstaller	925	Adrd	91
DroidKungFu	667	DroidDream	81
Plankton	625	LinuxLotoor	70
Opfake	613	GoldDream	69
GingerMaster	339	MobileTx	69
BaseBridge	330	FakeRun	61
Iconosys	152	SendPay	59
Kmin	147	Gappusin	58
FakeDoc	132	Imlog	43
Geinimi	92	SMSreg	41
Total	-	-	4664

Table 3. Accuracy results of MalNet-Tiny.

Methods	Category	Accuracy
Feather	Feature-Based	86.00%
NoG	Feature-Based	77.00%
LDP	Feature-Based	86.00%
SLaq-LSD	Graph-Spectral	76.00%
SLaq-VNGE	Graph-Spectral	53.00%
GCN	GNN-Based	81.00%
GCN (reproduced)	GNN-Based	75.40%
GIN (reproduced)	GNN-Based	83.10%
HGNN (ours)	Hypergraph-Based	87.60%
HyperGCN (ours)	Hypergraph-Based	79.30%
UniGNN (ours)	Hypergraph-Based	89.00%
HGNNP (ours)	Hypergraph-Based	91.10%

Table 4. Classification results on Drebin.

Methods	Category	Accuracy (%)	Precision (%)	Recall (%)	F1-Score
GCN	Graph-Based	93.8	93.0	92.8	0.9289
GIN	Graph-Based	96.7	95.1	96.1	0.9559
GraphSAGE	Graph-Based	96.5	96.5	96.3	0.9639
HyperGCN	Hypergraph-Based	93.1	92.8	92.3	0.9254
UniGNN	Hypergraph-Based	96.2	96.7	95.3	0.9599
HGNNP	Hypergraph-Based	97.1	96.9	95.2	0.9604
HGNN	Hypergraph-Based	97.0	97.2	96.8	0.9699

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

