Research on the Intelligent Construction of UAV Knowledge Graph Based on Attentive Semantic Representation

Fan, Yi; Mi, Baigang; Sun, Yu; Yin, Li

doi:10.3390/drones7060360


¹ School of Aeronautics, Northwestern Polytechnical University, Xi’an 710072, China

² School of Foreign Studies, Northwestern Polytechnical University, Xi’an 710072, China

* Author to whom correspondence should be addressed.

Drones 2023, 7(6), 360; https://doi.org/10.3390/drones7060360

Submission received: 29 April 2023 / Revised: 22 May 2023 / Accepted: 27 May 2023 / Published: 30 May 2023

(This article belongs to the Special Issue Intelligent Recognition and Detection for Unmanned Systems)


Abstract:

Accurate target recognition of unmanned aerial vehicles (UAVs) in the intelligent warfare mode relies on a highly standardized UAV knowledge base, and thus it is crucial to construct a knowledge graph suitable for UAV multi-source information fusion. However, due to the lack of domain knowledge and cumbersome, inefficient construction techniques, intelligent construction approaches for UAV knowledge graphs lag behind. To this end, this paper proposes a framework for the construction and application of a standardized knowledge graph from large-scale unstructured UAV data. First, UAV concept classes and relations are defined to form a specialized ontology, and UAV knowledge extraction triples are labeled. Then, a two-stage knowledge extraction model using relation attention-based contextual semantic representation (UASR) is designed according to the characteristics of the UAV knowledge extraction corpus. The contextual semantic representation is applied to the downstream task as a key feature through the Multilayer Perceptron (MLP) attention method, while the relation attention mechanism is used to calculate the relation-aware contextual representation in the subject–object entity extraction stage. Extensive experiments were carried out on the final annotated dataset, and the model F1 score reached 70.23%. On this basis, the UAV knowledge graph is presented visually, which lays the foundation for the back-end application of the UAV knowledge graph intelligent construction technology.

Keywords:

UAV; knowledge graph; knowledge extraction; attention mechanism; semantic representation

1. Introduction

With the penetration of artificial intelligence technology into the total factor of warfare, intelligent warfare is gradually becoming the mainstream combat style [1]. UAVs play the role of both a material foundation and a main support of combat capability in the context of intelligent warfare, which can realize more accurate perception, swifter decision-making, and more efficient action on the battlefield. In the rapidly changing intelligent warfare environment, the generation of large-scale UAV knowledge data leads to changes in knowledge processing, storage, query, and integration technologies, bringing new challenges and opportunities for the application of UAVs in practical mission scenarios [2].

As a new paradigm of knowledge processing, the development of knowledge graphs brings new possibilities for UAV knowledge management [3]. Objectively, knowledge graphs have more powerful capabilities for integrated data governance: large-scale, multi-source, and heterogeneous UAV knowledge data can be deeply mined to represent the semantic relations and knowledge systems they contain. Subjectively, the knowledge graph is part of the development process of expert systems. However, knowledge graph construction research in specific domains mainly focuses on the medical [4], financial [5], and energy [6] fields. Compared with other domains, there are two main challenges for UAV knowledge graph construction:

1. At the level of UAV knowledge and data, several challenges need to be addressed. Firstly, the authority of UAV data is often lacking. These data primarily originate from encyclopedias and news pages, where a direct binding connection between data producers and data credibility is absent. Secondly, the accuracy of UAV knowledge suffers from deficiencies, leading to conflicts and noticeable errors among different UAV data sources. Thirdly, there is a scarcity of systematic data within the UAV domain, and publicly available UAV data are in unstructured text format. Extracting fine-grained knowledge directly from such data is challenging, which hinders comprehensive system research in this field.

2. At the level of UAV knowledge graph construction and application, knowledge extraction is the core task in the construction process. The classical knowledge extraction methods perform well in the general domain. However, for UAV domain data, it is necessary to develop an extraction model adapted to the domain properties, to improve the efficiency and accuracy of extraction and thus to enhance the generalization ability of the traditional methods. For downstream UAV task applications, there is no direct reference case for developing domain-oriented system engineering that covers user requirements and visual interaction. The overall architecture of the system needs to be explored and developed under the guidance of the requirements, in combination with application models from other domains.

Based on the above problems, this paper proposes a knowledge graph construction and application framework for the UAV domain. Firstly, a fine-grained ontology knowledge system model of UAV classes and their relations is defined, and the UAV knowledge extraction dataset is labeled according to the ontology. Then, a two-step UASR knowledge extraction model is designed according to the characteristics of the UAV triple corpus, divided into two subtasks of relation prediction and subject–object entity identification. The main contributions of this paper are as follows:

1. A fine-grained knowledge ontology is formed by defining the concept and relation attributes of UAVs based on a collection of unstructured UAV data of a significant scale. From this ontology, a UAV knowledge extraction dataset is created by selecting high-quality texts that align with predefined UAV ontology entities and relation annotations.

2. A UASR knowledge extraction model is proposed, taking into account the characteristics of UAV knowledge extraction data. The BERT pre-trained language model is utilized to generate character feature encoding. In the decoder stage, the model incorporates the MLP attention mechanism to enhance the representation of relation types in the text for relation prediction. Additionally, a relation-aware attention approach is employed to assign higher weights to tokens closely associated with relation classification and entity recognition tasks, thus enhancing the contextual semantic representation of subject–object entities.

3. The UASR knowledge extraction model undergoes extensive comparison and ablation experiments using a self-built dataset. The experimental results demonstrate the model’s effectiveness in solving knowledge extraction challenges within the UAV corpus. Furthermore, the knowledge graph generated by the UASR model’s extraction enables visual storage applications. These quantitative and qualitative experiments substantiate the efficacy and validity of the UASR framework.

2. Related Works

In this section, we mainly introduce two types of related work: the results of macro-construction of UAV knowledge graphs and micro-knowledge extraction methods.

2.1. Construction of UAV Knowledge Graph

The knowledge graph is a network composed of entities and the relations between them. Its elements include entity types, entity type properties, relation types, and relation type properties, all of which require a unified semantic specification, known as the knowledge graph schema [7]. In the context of the Semantic Web, the knowledge graph schema is often referred to as an ontology, primarily aimed at abstracting, semantifying, and conceptualizing the content of knowledge graphs. However, there is no mature ontology in the UAV domain, whereas more mature ontologies have been constructed in the aviation domain, to which UAVs belong. For aircraft maintenance faults, Wang et al. [8] proposed an ontology for aircraft faults to address the syntactic and semantic conflicts caused by multi-source aviation maintenance data, which limit the integration and sharing of aircraft fault information; the ontology provides a unified and specific description of multi-source data and eliminates semantic heterogeneity. For aviation intelligence and operational safety, Mi et al. [9] used natural language processing techniques and clustering algorithms to intelligently extract ontologies for navigational announcement information.

Based on the top-level ontology model, the knowledge graph can be formed by combining the underlying data instances. At present, the research on the construction and application of knowledge graphs in the UAV domain is basically in the initial stage, and there are limitations in the specialty and the scale is far from the practical application requirements. Qiu Ling et al. [10] constructed a knowledge graph with more than 900 entities and 1800 relations for UAV faults, but its ontology construction is too simple and lacks certain authority. Nie Tongpan et al. [11] constructed a knowledge graph containing 74 entities and 98 relations for UAV power system fault diagnosis, and although the original fault manual documents are available on the data to ensure professionalism, the graph storage capacity is too small.

Although the above two typical UAV knowledge graphs have the form of a knowledge graph in their results, their methods essentially replicate general-domain approaches directly and cannot meet the needs of the domain, whereas knowledge graph construction techniques in other specific domains are relatively mature. Li et al. [12] applied the BERT-BiLSTM-CRF model for information extraction in the military domain to achieve extraction of multi-source military intelligence information. In the geographic domain, Molina-Villegas et al. [13] selected the Mexican geological exploration news text as the research object and applied the general word-embedding approach to accomplish information extraction and disambiguation at the same time. In the energy domain, Wang et al. [14] proposed a recognition method applicable to electric power text, which improves the performance of the BiLSTM-CRF decoding recognition model by fusing character-level pre-training models, left-neighbor entropy, and lexical feature encoding. In the aviation domain, Bao et al. [15] used the traditional BiLSTM-CRF framework and incorporated an attention mechanism to extract named entities related to aviation design. Wang Hong et al. [16] used a self-attention mechanism and BiLSTM to extract the triples of the accident occurrence process and used scenario reproduction to analyze the causality of aviation safety accidents.

2.2. UAV Knowledge Extraction Approach

The UAV knowledge extraction task is essentially an automated process of extracting triples from large-scale unstructured texts. Currently, it is primarily tackled using deep learning methods, which involve training deep neural networks to learn relevant features for knowledge classification. This process can be divided into three main components: an embedding layer, a network layer, and classification.

Since sentences cannot be input directly into the neural network for computation, token-embedding representations of the text are required. Common approaches for the input-embedding layer include the static word representations Word2vec [17,18] and GloVe [19], and the dynamic pre-trained models ELMo [20], BERT [21], and GPT [22]. The main approaches for the network layer include the convolutional neural network (CNN) [23], first used for text classification and later extended to knowledge extraction, mainly implemented using multiple convolutional kernels and multiple windows. Based on this, Zeng et al. [24] proposed a piecewise max-pooling convolution based on position information, which can capture more structured information about entity pairs. Similarly, recurrent neural networks (RNNs) can also extract relations, such as the SDP-LSTM framework [25], which applies a multi-channel LSTM to the shortest dependency path between entity pairs. For independent sentence representation, graph convolutional networks (GCNs) [26] can be utilized to encode sentences with relational knowledge, with a multi-head attention mechanism selecting weights corresponding to different relational edges; this line of work further led to a knowledge-adaptive coarse- and fine-grained attention mechanism combined with an information-filtering relation extraction model. The output layer of the model is processed with softmax or sigmoid for the single-label and multi-label classification cases, respectively.

According to the model structure, there are pipeline and joint extraction models [27]. The biggest drawback of pipeline models, such as the PURE model proposed by Chen et al. [28], is that they cannot solve the relation overlap problem, which manifests as one entity corresponding to multiple relations, two relations holding between the same subject and object entities, and multiple nesting within the subject and object entities. These problems are commonly found in UAV knowledge extraction tasks. To address the relation overlap problem, Yu et al. [29] first proposed the ETL model, in which the subject entity is extracted first as a priori information, and then the object entities and relations are classified by sequence annotation. However, this method cannot handle the case in which two relations hold between the same subject and object entities. The CasRel model, proposed by Wei et al. [30] for this purpose, first identifies all possible subject entities in the sentence as a priori information and then extracts both object entities and relations using sequence annotation. However, the method requires judging a large number of redundant relations and can only handle one subject entity at a time, which is less efficient in engineering practice. Wang et al. [31] proposed the TPLinker model, which constructs a global information matrix for extracting all the subject and object entities and, for each relation, two additional matrices for the beginning and end positions of the subject and object entities. Finally, the alignment of entities and relations is achieved by using the corresponding annotation method to decode the triples. The relation extraction model based on potential relations and global correspondence (PRGC), proposed by Zheng et al. [32], not only solves the overlap phenomenon in the corpus, but also significantly reduces the number of relation judgments thanks to a priori potential relation prediction, which significantly improves training and computational efficiency.

3. Construction of UAV Knowledge Graph Based on UASR

3.1. UAV Ontology Definition

Currently, ontology construction usually adopts a seven-step approach [33], in which the core aspects include listing important terms in the domain, defining classes and their hierarchical relation, defining class attributes and their relation, etc. As for UAV ontology construction, it is more oriented to unstructured text data, and there is currently no ontology that can be directly reused to learn from. Therefore, combined with the specific UAV knowledge graph construction and application scenario requirements, the UAV ontology definition is simplified into the following two steps:

1. Analyze the core knowledge concepts of the UAV domain and sort out the knowledge system of the UAV system. The UAV ontology construction in this paper was completed under the guidance of domain expert knowledge, which defines UAV-related concepts, attributes, and relations, and prepares “raw materials” for the subsequent steps. This process does not require a completely clear and conflict-free classification of the above elements, but only a list of as many desired elements as possible; for example, the UAV system components can be enumerated as the flight control, weaponry, and landing systems, etc. Although there is no UAV ontology that can be directly reused, the ontology model of the military domain [34] can be borrowed, as it is strongly relevant to the UAVs studied in this paper and can play a complementary role in forming the knowledge system. It contains the basic information of aircraft and knowledge of derived systems, such as warfare, equipment systems, design manufacturers, and facilities and equipment, which are basically the same concepts as those for UAVs.

2. Define the UAV classes and their hierarchical and relational attributes. After determining the UAV-related elements, the next step is to filter out the representative elements and further constitute the framework of the UAV classification system with its hierarchical relation, in which the concepts represented by the upper-level categories must fully encompass the concepts represented by the lower-level categories. Relations generally correspond to the categories that need to be interacted with, and attributes refer to the inherent qualities of the categories themselves. However, the UAV relation and attribute constraints can be transformed into each other to some extent. For example, the UAV production manufacturer can be defined as an attribute inherent to the UAV in the description metrics, but often the manufacturer latently has a strong interaction with categories such as the location country. Therefore, this paper provides a unified definition of relation and attribute constraints. According to the above ontology construction method, the first level of the UAV classification system includes two subsystems, UAV equipment and UAV events. Three third-level subsystems, of the composition system, description attributes, and technical indicators, are included under UAV equipment, while UAV events include eleven third-level subsystems, such as design, manufacturing, and operation. The fourth-level subsystems include 70 subsystems, such as appearance, the control system, and duties. The layers of the UAV composition system are shown in Figure 1.

3.2. UASR Knowledge Extraction Model

3.2.1. Problem Formulation

The task of the UAV knowledge extraction model is to obtain the UAV knowledge triples $T(S) = \{(s, r, o) \mid s, o \in E, r \in R\}$ from a given sentence of $n$ tokens, $S = \{x_1, x_2, \dots, x_n\}$, where $E$ and $R$ denote the set of entities and the set of relations, respectively. In the encoding part, to obtain better token and relation embeddings, the BERT [35] large-scale pre-trained language model encodes the input sentence and produces a contextual representation of each token: $Y_{enc}(S) = \{h_1, h_2, \dots, h_n \mid h_i \in \mathbb{R}^{d \times 1}\}$. The decoding part is split into two stages: relation prediction and subject–object entity identification. For a given sentence $S$, the goal of the relation prediction subtask is to predict the potential relations contained in the sentence, and the output is $Y_r(S) = \{r_1, r_2, \dots, r_m \mid r_i \in R\}$, where $m$ denotes the number of relations in the subset of potential relations. The entity recognition subtask takes a given sentence $S$ and a predicted potential relation $r_i$, and annotates each token according to the predefined scheme and BIO rules; the output is $Y_e(S, r_i \mid r_i \in R) = \{t_1, t_2, \dots, t_n\}$, where $t_i$ denotes the label assigned to the $i$-th token. For subject–object alignment, a given sentence $S$ produces a correspondence score for each pair of subject and object start tokens, with higher scores for the start tokens of a true triple and lower scores for all other token pairs. The output is $Y_s(S) = M \in \mathbb{R}^{n \times n}$, where $M$ represents the global correspondence matrix.

Figure 2 illustrates this in the specific UAV knowledge extraction context. For the input text “RQ-4 is deployed in the Western Pacific”, the final output triple is (RQ-4, Position (Deployment), Western Pacific). The relation prediction subtask determines the potential relation, such as “Position (Deployment)”, from the five predefined relations. The entity recognition subtask predicts “RQ-4” and “Western Pacific” with the labels “B-SUB”, “I-SUB”, “B-OBJ”, and “I-OBJ”. The global correspondence matrix of the subject–object alignment subtask predicts a higher correspondence score for the token pair beginning “RQ-4” and “Western Pacific”, and lower correspondence scores for other token pairs.
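The BIO labeling in this example can be sketched in a few lines of Python. This is only an illustration: the word-level tokenization and the helper name `bio_tags` are assumptions made here for readability (the model itself works on BERT character-level encodings), and the function simply marks the first occurrence of each span.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def bio_tags(tokens: List[str], subject: List[str], obj: List[str]) -> List[str]:
    """Tag tokens with B-SUB/I-SUB/B-OBJ/I-OBJ/O for one potential relation.

    Hypothetical helper: marks the first occurrence of each entity span.
    """
    tags = ["O"] * len(tokens)
    for span, role in ((subject, "SUB"), (obj, "OBJ")):
        for start in range(len(tokens) - len(span) + 1):
            if tokens[start:start + len(span)] == span:
                tags[start] = f"B-{role}"
                for k in range(start + 1, start + len(span)):
                    tags[k] = f"I-{role}"
                break
    return tags


# Word-level tokenization is an assumption made for this illustration.
tokens = ["RQ-4", "is", "deployed", "in", "the", "Western", "Pacific"]
triple: Triple = ("RQ-4", "Position (Deployment)", "Western Pacific")
tags = bio_tags(tokens, ["RQ-4"], ["Western", "Pacific"])
print(list(zip(tokens, tags)))
```

With this tokenization, the subject start token is at index 0 and the object start token at index 5, which is the cell of the global correspondence matrix that should receive the high score.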

3.2.2. Relation Prediction

For the relation prediction stage, MLP attention [36] is used to generate contextual semantic representations from the token and relation representations and to further extract key information, as shown in Equation (1), where the weight of each token in the sentence is generated by multi-layer perceptron attention. If token $x_k$ has a tight semantic association with the subject–object entity extraction or relation classification tasks, $x_k$ receives a higher weight.

$$\begin{aligned} V_k &= \mathrm{MLP}_k(x_k), \quad k \in [1, n] \\ \alpha_k &= \frac{\exp(V_k)}{\sum_{m=1}^{n} \exp(V_m)} \\ F_s &= \sum_{m=1}^{n} \alpha_m x_m \end{aligned} \tag{1}$$

With the MLP attention method, the sentence semantic representation obtained in the relation prediction stage is $F_s$, and relation classification is simplified into a multi-label binary classification task:

$$P_{rel} = \sigma(W_r F_s + b_r) \tag{2}$$

Here, $\sigma$ denotes the sigmoid function, $W_r \in \mathbb{R}^{d \times 1}$ denotes the trainable weights (with bias $b_r$), and $P_{rel}$ denotes the probability of the sentence relation classification. A relation is labeled 1 when its probability exceeds the relation threshold; otherwise, it is labeled 0. Therefore, the subsequent tasks only need to label sequences for the potential relations, rather than for all relations.
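A minimal pure-Python sketch of Equations (1) and (2) may help make the data flow concrete. Several details are assumptions made here rather than taken from the paper: the MLP is a single tanh hidden layer, each relation has its own weight vector, the threshold is 0.5, and all numeric values are toy inputs.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def mlp_score(x: Vector, w_hidden: List[Vector], w_out: Vector) -> float:
    """Scalar score V_k; the one-hidden-layer tanh MLP is an assumption."""
    hidden = [math.tanh(dot(row, x)) for row in w_hidden]
    return dot(w_out, hidden)


def attention_pool(tokens: List[Vector], w_hidden: List[Vector],
                   w_out: Vector) -> Tuple[Vector, Vector]:
    """Equation (1): softmax the scores, then take the weighted sum F_s."""
    alphas = softmax([mlp_score(x, w_hidden, w_out) for x in tokens])
    dim = len(tokens[0])
    f_s = [sum(a * x[i] for a, x in zip(alphas, tokens)) for i in range(dim)]
    return alphas, f_s


def predict_relations(f_s: Vector, rel_weights: List[Vector],
                      rel_bias: Vector, threshold: float = 0.5):
    """Equation (2) per relation; keep relations above the threshold."""
    probs = [sigmoid(dot(w, f_s) + b) for w, b in zip(rel_weights, rel_bias)]
    return [j for j, p in enumerate(probs) if p > threshold], probs


# Toy inputs (hypothetical values, d = 2, three tokens, two relations).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, f_s = attention_pool(tokens, [[0.5, -0.5], [0.3, 0.8]], [1.0, 1.0])
potential, probs = predict_relations(f_s, [[2.0, 2.0], [-2.0, -2.0]], [0.0, 0.0])
print(alphas, f_s, potential)
```

Only the relations returned in `potential` are passed on to the entity tagging stage, which is what keeps the later sequence labeling from running once per relation in the full relation set.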

3.2.3. Subject and Object Extraction

Different from the conventional knowledge extraction model, this paper further incorporates the attention mechanism [37] to obtain the contextual semantic representation. For the general entity–relation extraction method, the subject and object entities are first extracted, and then the relation vectors are spliced and fused with the subject and object entity vectors. Thus, the extraction of entities and relations is inseparable from the semantics of the context, where the relation embedding is represented by a learnable network, and the specific relation attention mechanism is formulated as follows:

$$\begin{aligned} a_{ij} &= \alpha^{T}[W_1 x_i + b_1; W_2 r_j + b_2] \\ \lambda_{ij} &= \mathrm{softmax}(a_{ij}) \\ h_s &= r_j + \sum_{i=1}^{n} \lambda_{ij}(W_3 x_i + b_3) \end{aligned} \tag{3}$$
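The relation attention in Equation (3) can be sketched as follows. The sketch reads the softmax and the sum as running over the token index $i$ for a fixed relation $r_j$, which is an interpretation made here; the matrices are stored as lists of rows, and all numeric values are toy inputs.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W x + b with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_attention(tokens: List[Vector], r_j: Vector,
                       W1: Matrix, b1: Vector, W2: Matrix, b2: Vector,
                       W3: Matrix, b3: Vector,
                       alpha: Vector) -> Tuple[List[float], Vector]:
    """Return the token weights lambda_ij and the relation-aware vector h_s."""
    rel_part = affine(W2, r_j, b2)
    scores = []
    for x_i in tokens:
        concat = affine(W1, x_i, b1) + rel_part  # [W1 x_i + b1 ; W2 r_j + b2]
        scores.append(sum(a * c for a, c in zip(alpha, concat)))
    lambdas = softmax(scores)  # normalized over the tokens of the sentence
    values = [affine(W3, x_i, b3) for x_i in tokens]
    h_s = [r + sum(lam * v[k] for lam, v in zip(lambdas, values))
           for k, r in enumerate(r_j)]
    return lambdas, h_s


# Toy inputs (hypothetical values, d = 2, two tokens).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
tokens = [[2.0, 0.0], [0.0, 0.0]]
r_j = [1.0, 1.0]
alpha = [1.0, 0.0, 0.0, 0.0]  # length 2d, scores the concatenated vector
lambdas, h_s = relation_attention(tokens, r_j, identity, zeros,
                                  identity, zeros, identity, zeros, alpha)
print(lambdas, h_s)
```

In this toy setting the first token scores higher under the relation and therefore dominates the weighted sum that is added to the relation embedding.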

To overcome the overlap phenomenon in the UAV knowledge extraction corpus, this paper utilizes the relation-embedding approach to label object entities. Based on the semantic association of entities and relations, the main entity and relational embeddings are input to a deep neural network model to predict the object entities. In order to enhance the effect of relational embedding in entity classification, the weight of each token in the sentence is calculated based on the relational attention approach under each specific class of relations. In the subject–object entity recognition task, different relational representations for each sentence are generated through relation-aware contextual semantics with the following prediction formula for each token, specifically:

$$\begin{aligned} P_{i,j}^{sub} &= \mathrm{softmax}(W_{sub}(h_i + u_j) + b_{sub}) \\ P_{i,j}^{obj} &= \mathrm{softmax}(W_{obj}(h_i + u_j) + b_{obj}) \end{aligned} \tag{4}$$

Here, $u_j \in \mathbb{R}^{d \times 1}$ denotes the $j$-th relation representation in the learnable embedding matrix, $n_r$ denotes the total number of relations in the set of relations, $h_i \in \mathbb{R}^{d \times 1}$ denotes the encoding of the $i$-th token, and $W_{sub}, W_{obj} \in \mathbb{R}^{d \times 3}$ denote the learnable weights, whose three output dimensions correspond to the annotation set $\{B, I, O\}$.
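As a rough illustration of Equation (4), the sketch below tags each token for one relation by adding the relation embedding to the token encoding and applying a three-way softmax. The weight matrix is stored transposed (three rows of length $d$), greedy argmax decoding is an assumption, and the numbers are toy values.

```python
import math
from typing import List

Vector = List[float]
LABELS = ["B", "I", "O"]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tag_tokens(H: List[Vector], u_j: Vector,
               W_t: List[Vector], b: Vector) -> List[str]:
    """Per-token tag for relation j: argmax of softmax(W (h_i + u_j) + b).

    W_t holds one row per label, i.e. the transpose of the d x 3 weights.
    """
    tags = []
    for h_i in H:
        fused = [h + u for h, u in zip(h_i, u_j)]  # h_i + u_j
        logits = [sum(w * v for w, v in zip(row, fused)) + bias
                  for row, bias in zip(W_t, b)]
        probs = softmax(logits)
        tags.append(LABELS[probs.index(max(probs))])
    return tags


# Toy inputs (hypothetical values, d = 3, three tokens).
H = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
u_j = [0.0, 0.0, 0.5]
W_sub_t = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_sub = [0.0, 0.0, 0.0]
sub_tags = tag_tokens(H, u_j, W_sub_t, b_sub)
print(sub_tags)
```

The same function would be called a second time with the object weights to obtain the object tags, and once per potential relation, since $u_j$ changes with the relation.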

In the process of subject–object entity alignment, the correct subject–object entity pairs are determined using the global correspondence matrix, which can be learned at the same time as potential relation prediction. The process is as follows: first, enumerate the possible subject–object entity pairs; then, check the corresponding score of each pair in the global correspondence matrix, and keep the pair if its score exceeds the threshold $\lambda_2$; otherwise, filter it out.

For a given global correspondence matrix $M \in \mathbb{R}^{n \times n}$, where $n$ denotes the number of tokens in the sentence, each element of the matrix corresponds to the starting positions of a subject–object entity pair, and the correspondence score is the confidence of matching the subject and object entities. The higher the score, the higher the confidence of matching the two entities. The corresponding score of each element in the matrix is calculated as follows:

$$P_{i_{sub}, j_{obj}} = \sigma(W_g[h_i^{sub}; h_j^{obj}] + b_g) \tag{5}$$

Here, $h_i^{sub}, h_j^{obj} \in \mathbb{R}^{d \times 1}$ denote the input encodings of the $i$-th and $j$-th tokens that form the subject–object entity pair, $W_g \in \mathbb{R}^{2d \times 1}$ denotes the learnable weights, and $\sigma$ denotes the sigmoid function.
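The alignment step can be sketched as building the score matrix of Equation (5) and then filtering candidate start-token pairs by the threshold. The function names, the threshold value 0.5 standing in for $\lambda_2$, and the one-dimensional toy encodings are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def correspondence_matrix(H_sub: List[Vector], H_obj: List[Vector],
                          w_g: Vector, b_g: float) -> List[List[float]]:
    """M[i][j] = sigmoid(w_g . [h_i^sub ; h_j^obj] + b_g)."""
    n = len(H_sub)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            concat = H_sub[i] + H_obj[j]  # list concatenation, length 2d
            row.append(sigmoid(sum(w * c for w, c in zip(w_g, concat)) + b_g))
        matrix.append(row)
    return matrix


def align_pairs(M: List[List[float]], sub_starts: List[int],
                obj_starts: List[int], threshold: float) -> List[Tuple[int, int]]:
    """Keep the (subject start, object start) pairs scoring above the threshold."""
    return [(i, j) for i in sub_starts for j in obj_starts if M[i][j] > threshold]


# Toy inputs (hypothetical values, d = 1, three tokens).
H_sub = [[4.0], [-4.0], [-4.0]]
H_obj = [[-4.0], [-4.0], [4.0]]
M = correspondence_matrix(H_sub, H_obj, w_g=[1.0, 1.0], b_g=-4.0)
pairs = align_pairs(M, sub_starts=[0, 1], obj_starts=[0, 2], threshold=0.5)
print(pairs)
```

Of the four candidate pairs, only the one whose subject and object start tokens both score strongly survives the filter; the rest are discarded as mismatched.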

3.2.4. Training and Inference

The cross-entropy loss function is used in the training phase to calculate the final loss and optimize the model, with the BERT encoder parameters shared across the subtasks. The loss function can be decomposed into three components:

$$\begin{aligned} L_{rel} &= -\frac{1}{n_r} \sum_{i=1}^{n_r} \left( y_i \log P_{rel} + (1 - y_i) \log (1 - P_{rel}) \right) \\ L_{seg} &= -\frac{1}{2 \times n \times n_r^{pot}} \sum_{t \in \{sub, obj\}} \sum_{j=1}^{n_r^{pot}} \sum_{i=1}^{n} y_{i,j}^{t} \log P_{i,j}^{t} \\ L_{global} &= -\frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( y_{i,j} \log P_{i_{sub}, j_{obj}} + (1 - y_{i,j}) \log (1 - P_{i_{sub}, j_{obj}}) \right) \end{aligned} \tag{6}$$

Here, $n_r$ denotes the number of all relations and $n_r^{pot}$ denotes the number of predicted potential relations.

The total loss is the weighted sum of the three components:

$$L_{total} = \alpha L_{rel} + \beta L_{seg} + \gamma L_{global} \tag{7}$$

Here, $\alpha$, $\beta$, and $\gamma$ denote the adjustable hyperparameters, which can be set to 1 to simplify the model calculation.
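The three loss terms and their weighted sum can be checked numerically with a small pure-Python sketch. The data layout (for example, passing the predicted probability of the gold tag directly to the sequence loss) and the function names are assumptions; a real implementation would use a deep learning framework's built-in losses.

```python
import math
from typing import Dict, List


def bce(y: float, p: float) -> float:
    """Binary cross-entropy for one label y in {0, 1} and probability p."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def relation_loss(y_rel: List[float], p_rel: List[float]) -> float:
    """L_rel: mean binary cross-entropy over all n_r relations."""
    return sum(bce(y, p) for y, p in zip(y_rel, p_rel)) / len(y_rel)


def sequence_loss(gold_probs: Dict[str, List[List[float]]]) -> float:
    """L_seg: gold_probs[t][j][i] is the predicted probability of the gold
    tag of token i under potential relation j, for t in {"sub", "obj"}."""
    n_pot = len(gold_probs["sub"])
    n = len(gold_probs["sub"][0])
    total = sum(math.log(p) for role in ("sub", "obj")
                for seq in gold_probs[role] for p in seq)
    return -total / (2 * n * n_pot)


def global_loss(Y: List[List[float]], P: List[List[float]]) -> float:
    """L_global: mean binary cross-entropy over the n x n matrix."""
    n = len(Y)
    return sum(bce(Y[i][j], P[i][j]) for i in range(n) for j in range(n)) / n ** 2


def total_loss(l_rel: float, l_seg: float, l_global: float,
               alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Equation (7): weighted sum, with all weights defaulting to 1."""
    return alpha * l_rel + beta * l_seg + gamma * l_global


# Toy check: with every probability at 0.5, each term equals ln 2.
l_rel = relation_loss([1.0, 0.0], [0.5, 0.5])
l_seg = sequence_loss({"sub": [[0.5, 0.5]], "obj": [[0.5, 0.5]]})
l_glob = global_loss([[1.0, 0.0], [0.0, 0.0]], [[0.5, 0.5], [0.5, 0.5]])
loss = total_loss(l_rel, l_seg, l_glob)
print(loss)
```

The uniform-probability case is a convenient sanity check because each term reduces to $\ln 2$ regardless of the labels.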

4. UAV Knowledge Extraction Experiments

4.1. Experimental Setup

4.1.1. Dataset

The experimental dataset was generated by manual annotation based on the UAV ontology defined in Section 3.1, and contains 377,670 characters and 8063 sentences in total. Relation labeling centers on the UAV entity and covers: relations between one UAV and another, including homologation and modifications before and after production design; operation relations between a UAV and an organization; relations between a UAV and its related locations, such as the production, origin, and user countries; the direct correspondence between a UAV and its engine; and relations between a UAV and events, such as participation, exhibition, and accidents. The final annotated corpus entity data and representative examples are shown in Table 1.

4.1.2. Model Setting

The UAV knowledge extraction dataset constructed above was divided into training, validation, and test sets in the ratio of 8:1:1. Accuracy, precision (P), recall (R), and the F1 score were selected as the evaluation metrics for the knowledge extraction task:

$$\begin{aligned} P &= T_P / (T_P + F_P) \\ R &= T_P / (T_P + F_N) \\ F1 &= 2 \times P \times R / (P + R) \end{aligned} \tag{8}$$

where $T_P$ denotes the number of entities whose true class is positive and that were predicted to be positive as well, $F_P$ denotes the number of entities whose true class is negative but were predicted to be positive, and $F_N$ denotes the number of entities whose true class is positive but that were predicted to be negative.
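Equation (8) can be illustrated with a short Python function. Scoring by exact match of whole triples is an assumption made here for the example, and the incorrect prediction ("40 h") is a hypothetical value invented purely to produce one false positive and one false negative.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def precision_recall_f1(gold: Set[Triple], pred: Set[Triple]) -> Tuple[float, float, float]:
    """Compute P, R, and F1 from exact-match counts (assumed matching rule)."""
    tp = len(gold & pred)   # predicted and correct
    fp = len(pred - gold)   # predicted but not in the gold set
    fn = len(gold - pred)   # in the gold set but missed
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


gold = {
    ("RQ-4", "deployment location", "Western Pacific"),
    ("MQ-9", "endurance", "42 h"),
}
pred = {
    ("RQ-4", "deployment location", "Western Pacific"),
    ("MQ-9", "endurance", "40 h"),  # hypothetical wrong prediction
}
p, r, f1 = precision_recall_f1(gold, pred)
print(p, r, f1)
```

The guards against zero denominators are a practical addition for empty prediction sets and are not part of Equation (8) itself.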

The RoBERTa-wwm-ext pre-trained language model proposed by Cui et al. [38] was chosen as the basis for character-level vector encoding, and the basic experimental hyperparameters were set as shown in Table 2. The deep learning framework was PyTorch V1.6.0, the basic running environment was Python V3.7.0, and the hardware configuration was 32 GB RAM and an NVIDIA RTX A4500 (20 GB).

4.2. Experimental Results

4.2.1. Pre-Experiment Result

Before applying the UASR model on the in-house UAV knowledge extraction dataset, a pre-experiment on the public dataset NYT was conducted to justify the feasibility. As depicted in Table 3, the UASR model demonstrated superior performance compared to other models on the NYT dataset. Our model outperformed the baselines in terms of precision, recall, and F1 score, with a notable increase of 0.34% in the F1 score over PRGC. While the performance on the public dataset was relatively satisfactory, it should be noted that there is still room for improvement. Nevertheless, this pre-experiment serves as a strong indication that the UASR model is viable for knowledge extraction tasks, providing a solid foundation for subsequent experiments on our in-house UAV knowledge extraction dataset.

4.2.2. Overall Comparison

The UAV knowledge extraction results are presented in Table 4. Our proposed UASR model demonstrated the best performance compared to other models, achieving an F1 score of 70.23%. Of the two other entity–relation joint extraction models, CasRel and TPLinker, the CasRel model exhibited exposure bias and error propagation issues. It overlooked the interaction between entities and relations, resulting in an F1 score 6.37% lower than that of our model. The TPLinker model performed relatively better, with an F1 score 5.49% lower than ours. This improvement over CasRel is primarily attributed to the adoption of a handshake labeling strategy for end-to-end sequence labeling. This strategy connects entity heads with entity tails, head entity heads with tail entity heads, and head entity tails with tail entity tails, using a matrix-based approach for labeling. By training the model to predict the matrix under each relation, the TPLinker model achieved simultaneous entity–relation learning, avoiding the inconsistency between training and prediction orders. Consequently, it performed better than CasRel on the UAV dataset. Additionally, the PRGC model, which shares closer similarities with our approach, attained an F1 score of 66.59%, which is 3.64% lower than ours. This outcome further emphasizes the advantages of our attention-based relation-aware method in acquiring contextual semantic information.

4.2.3. Ablation Experiments

To further demonstrate the effectiveness of each improvement strategy in the UASR model, ablation experiments were conducted. The experiments removed, in turn, the MLP attention module in the relation prediction phase and the relation attention module in the subject–object entity identification phase. As shown in Table 5, both improvement modules had significant effects on the overall performance. When the MLP attention module was removed, the F1 score of the model decreased by 3.22%, presumably because MLP attention in the relation prediction stage enhances the model’s perception of relation classification through the relation-embedding representation. When the relation attention module was removed, the F1 score of the model decreased by 2.68%, presumably because the relation-aware semantic representation in the subject–object entity recognition stage further enhances the information fusion between the associated tasks.

The impact of the individual components of the loss function on the performance of the UASR model was analyzed in a further experiment. The results, presented in Table 6, show the significance of the three components. Removing the relation loss function decreased the F1 score of the UASR model by 5.67 percentage points, which validates the crucial role of the relation prediction component in accurately predicting potential relation subsets. Table 6 also shows that, without the sequence loss function, the UASR model could not correctly extract UAV knowledge at all (reported as Null). This can be attributed to the model’s tendency to merely memorize entity positions rather than comprehend the underlying semantics; the limited size of our in-house dataset may also have contributed. Finally, removing the global correspondence loss function severely degraded performance, reducing the F1 score to 33.16%, because the model predicted more triples with numerous mismatched pairs. This emphasizes the importance of the constraint imposed by this component.
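
The F1 values discussed above follow from the precision and recall columns of Table 6 through F1 = 2PR/(P + R). The short check below recomputes them and the resulting drops relative to the full model. The figures are copied from Table 6; the sequence-loss row is omitted because it is reported as Null. The function name `f1_score` and the row labels are illustrative choices.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) in percent, copied from Table 6.
table6 = {
    "full model": (67.79, 72.86, 70.23),
    "without relation loss": (57.32, 73.90, 64.56),
    "without global loss": (39.26, 28.70, 33.16),
}

# Recompute F1 from P and R, then measure the drop in percentage points.
recomputed = {name: round(f1_score(p, r), 2) for name, (p, r, _) in table6.items()}
full_f1 = table6["full model"][2]
drops = {name: round(full_f1 - f1, 2) for name, (_, _, f1) in table6.items()}

for name in table6:
    print(f"{name}: recomputed F1 = {recomputed[name]}, drop = {drops[name]}")
```

The same figures show where each loss acts: without the relation loss, recall is essentially unchanged (72.86 to 73.90) while precision falls by more than ten points, whereas without the global correspondence loss both precision and recall fall sharply.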

To assess the impact of different BERT models on our proposed model, we compared three prominent BERT variants: BERT-base, BERT-wwm, and RoBERTa-wwm. These models have demonstrated state-of-the-art performance in related knowledge extraction tasks. The experimental results, presented in Table 7, indicate that RoBERTa-wwm outperformed the other models, with F1 scores 2.10 and 1.00 percentage points higher than those of BERT-base and BERT-wwm, respectively. However, as pre-trained language models are not the focus of this paper and these differences are smaller than those observed in the other ablation experiments, the influence of the choice of BERT model can be considered comparatively minor.

4.3. UAV Knowledge Graph Visualization

UAV knowledge was extracted with the UASR knowledge extraction model, and the extracted relations then needed to be stored and visualized. Neo4j was chosen as the storage tool in view of the structure of the UAV data and the convenience of interactive visualization. The UAV knowledge graph is represented as triples of the form <entity, relation, entity> and <entity, attribute, attribute value>, such as <RQ-4, deployment location, Western Pacific> and <MQ-9, endurance, 42 h>. We used the LOAD CSV clause of the Cypher query language to import these knowledge triples and visualized them using Cypher statements. Figure 3a shows the UAV knowledge graph of 4000 entities and more than 10,000 relations at a macro-level. Figure 3b shows the knowledge graph of a single entity, “RQ-4”, and its relations and attributes at a micro-level.
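
The import step can be sketched as follows. This is a minimal illustration rather than the script used in this work: the file name `uav_triples.csv`, the column names, the node label `Entity`, and the relationship type `RELATED` are all assumptions made here, and attribute values are stored as nodes purely to keep the example short. The two triples are the examples given above.

```python
import csv
import io

# Example triples taken from the text; everything else below is illustrative.
triples = [
    ("RQ-4", "deployment location", "Western Pacific"),
    ("MQ-9", "endurance", "42 h"),
]


def triples_to_csv(rows) -> str:
    """Serialize (head, relation, tail) triples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["head", "relation", "tail"])
    writer.writerows(rows)
    return buffer.getvalue()


# Cypher import statement. Plain Cypher cannot take a relationship type from a
# CSV column, so the relation name is stored as a property in this sketch.
LOAD_CSV_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///uav_triples.csv' AS row
MERGE (h:Entity {name: row.head})
MERGE (t:Entity {name: row.tail})
MERGE (h)-[:RELATED {name: row.relation}]->(t)
"""

# Cypher query for a micro-level view of a single entity and its neighbors.
RQ4_NEIGHBORHOOD_QUERY = """
MATCH (h:Entity {name: 'RQ-4'})-[r]-(t)
RETURN h, r, t
"""

csv_text = triples_to_csv(triples)
print(csv_text)
```

The CSV text would be saved under the assumed file name in the Neo4j import directory, from which `file:///` URLs are resolved by default, and the two statements would then be run in the Neo4j Browser or through a driver. Using `MERGE` rather than `CREATE` keeps an entity that appears in many triples as a single node.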

5. Conclusions

As a basis for intelligent target identification and detection, UAV knowledge graphs can provide fine-grained structured knowledge for downstream situational awareness tasks. In this paper, a knowledge graph construction and application framework for the UAV domain was proposed to address problems at the data, system, technology, and application levels of UAV knowledge graph construction. A UAV ontology was built from the characteristics of the UAV knowledge structure system, and a UAV knowledge extraction triple dataset was labeled according to the ontology model, which addressed the difficulties at the data and knowledge levels. Based on the UAV data and schema layers, a two-step UASR knowledge extraction model was proposed. Building on the word embeddings of a large-scale pre-trained language model and taking the characteristics of the UAV corpus into account, the model optimizes the contextual semantic representation of its two stages, relation prediction and subject–object entity recognition, using an attention representation and a relation-aware attention mechanism, respectively. Finally, extensive experiments on the labeled dataset demonstrated the effectiveness of the UASR model, and the extracted UAV knowledge was used for visual storage and intelligent question-and-answer (Q&A) applications.

However, the proposed framework for UAV knowledge graph construction and application still has some limitations:

(1) Due to the limited UAV data, there is still room for improvement in the knowledge extraction accuracy of the UASR model. In the future, once a sufficiently large dataset is available, we could divide it into sub-datasets according to the categories of overlapping relations and the length of the entity texts in the relations. We could then conduct experiments on these sub-datasets, analyze the results from multiple perspectives, and identify both the reasons for the model’s limited performance on the UAV dataset and the directions for improvement.

(2) Although a simple application of the UAV knowledge graph was realized in this paper, there is still a technological gap between intelligent Q&A and UAV target recognition and detection applications. In the future, knowledge graph embedding can be used to feed UAV knowledge, in the form of pre-trained templates, into the downstream situational game perception task, so that UAV knowledge graphs can support knowledge understanding and reasoning applications in actual aerospace situations.

Author Contributions

Conceptualization, B.M.; methodology, Y.F.; writing—review and editing, Y.S.; writing—original draft, L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 12202363, and Fundamental Research Projects in Characteristic Disciplines, grant number G2023WD0124.

Data Availability Statement

The data are unavailable due to privacy reasons.

Acknowledgments

The authors would like to acknowledge the support of the National Natural Science Foundation of China (Grant No. 12202363) and the support of Fundamental Research Projects in Characteristic Disciplines (Grant No. G2023WD0124).

Conflicts of Interest

The authors declare no potential conflict of interest with respect to the research, authorship, and/or publication of this article.


Figure 1. UAV ontology of the composition system.

Figure 2. The overall structure of our proposed model.

Figure 3. UAVKG visualization results.

Table 1. Statistics and examples of each type of relation.

Relation	Examples	Number
UAV and Engine	RQ-4 Global Hawk—Rolls-Royce AE 3007	88
UAV and UAV	RQ-4 Global Hawk—ACTD Prototype	974
UAV and Events	RQ-4 Global Hawk—Afghanistan War	135
UAV and Country	RQ-4 Global Hawk—United States	363
UAV and Organization	RQ-4 Global Hawk—Northrop Grumman	702

Table 2. Relation extraction model hyperparameter setting.

Hyperparameter	Value
Batch size	8
Sequence length	256
$λ_{1}$	0.5
$λ_{2}$	0.1
Warmup proportion	0.05
Decay rate	0.5
Learning rate	1 × 10⁻³
Embedding learning rate	1 × 10⁻⁴
Dropout	0.3
Loss	CE
Optimizer algorithm	Adam

Table 3. Comparison of the results of different models for the pre-experiment.

Model	P (%)	R (%)	F1 (%)
CasRel	87.71	90.53	89.10
TPLinker	89.35	90.67	90.01
PGCN	93.54	91.62	92.57
UASR (Our Model)	93.58	92.24	92.91

Table 4. Comparison of the results of different models for UAV knowledge extraction.

Model	P (%)	R (%)	F1 (%)
CasRel	67.12	60.16	63.86
TPLinker	62.13	67.58	64.74
PGCN	64.93	68.35	66.59
UASR (Our Model)	67.79	72.86	70.23

Table 5. Comparison of the results of UAV knowledge extraction ablation experiments.

Model	P (%)	R (%)	F1 (%)
UASR (Our Model)	67.79	72.86	70.23
w/o MLP attention	66.21	67.83	67.01
w/o Relation attention	62.51	73.49	67.55

Table 6. Comparison of the results of UASR loss function ablation experiments.

Model	P (%)	R (%)	F1 (%)
UASR (Our Model)	67.79	72.86	70.23
w/o Relation loss	57.32	73.90	64.56
w/o Sequence loss	Null	Null	Null
w/o Global loss	39.26	28.70	33.16

Table 7. Comparison of the results of UASR BERT ablation experiments.

Model	P (%)	R (%)	F1 (%)
RoBERTa-wwm (Our Model)	67.79	72.86	70.23
BERT-base	66.71	69.62	68.13
BERT-wwm	65.22	73.77	69.23

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
