Article

An Open Relation Extraction Method for Domain Text Based on Hybrid Supervised Learning

Xiaoxiong Wang and Jianpeng Hu *
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(5), 2962; https://doi.org/10.3390/app13052962
Submission received: 26 January 2023 / Revised: 14 February 2023 / Accepted: 20 February 2023 / Published: 25 February 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Current research on knowledge graph construction focuses chiefly on general-purpose fields, whereas constructing knowledge graphs in vertically segmented professional fields faces numerous difficulties. To address the complex relation types of domain entities, the lack of a large annotated corpus, and the resulting difficulty of extraction, this study proposes a method for constructing domain-annotated datasets from publicly available web texts that integrates remote supervision and semi-supervision. For extracting the relational triples of a given core entity (drawn from an entity lexicon defined semi-automatically by experts), a dilated gate attention network structure that enlarges the receptive field of the model is proposed. On top of this structure, a relation extraction model, Ro-DGANet, is designed that incorporates the idea of a probability graph. The Ro-DGANet model was evaluated on the publicly available Chinese datasets LIC2019 and CHIP2020 and compared with mainstream relation extraction models, achieving the best results with F1 values of 82.99% and 66.39%, respectively. Finally, the model was applied to relation extraction for equipment components in industrial scenarios and for core knowledge points of programming languages. The results show that the proposed method applies to open relation extraction among core entities in different domains with reliable performance and portability.

1. Introduction

The field of knowledge graphs [1] has seen widespread application in various domains, including search, question answering, and recommendation systems. Knowledge graph construction has primarily focused on general domains, whereas building knowledge graphs in specialized domains presents numerous challenges, particularly as manual annotation becomes increasingly impractical. The aim of this study is to minimize the cost of manual annotation and generate high-quality datasets in specialized domains. In addition, this study leverages deep learning techniques to extract entities and relations from publicly available web texts, thus constructing a specialized domain knowledge graph that uncovers previously undiscovered latent relations between entities and knowledge.
Entity relation extraction, one of the essential tasks in knowledge graph construction [2,3], has received extensive attention from academia and industry in recent years. Current relation extraction is primarily divided into two major categories: (1) traditional restricted-domain relation extraction and (2) open-domain relation extraction. Traditional relation extraction systems are often limited to a given ontology and can extract only those relations predefined in the ontology. In 2008, Etzioni [4] proposed open information extraction, which does not require training data annotated for each relation type and is not limited to predefined relation types. At present, there are two main paradigms for relation extraction, as follows:
  • The pipeline approach, in which entity recognition is performed before the relation classification task;
  • A joint model that couples entity recognition with relation classification.
Currently, most research on knowledge graph construction focuses on general-purpose domains, whereas constructing knowledge graphs for vertically segmented specialist domains faces many difficulties.
Compared with general-purpose domains, domain-specific relation extraction relies heavily on expert-defined entity relations. The number of relation types is small, and high-quality annotated corpora are lacking, so annotation requires a considerable investment of human and material resources. Most annotation tasks can only be performed by experienced frontline experts, particularly in the industrial sector. To reduce the cost of data annotation, Mintz [5] proposed a remotely supervised approach that automatically generates training data based on the idea that if an entity pair has a relation in an external knowledge base, every sample containing that entity pair is matched and assumed to express this relation. However, this approach generates a large amount of noisy data, which hinders the performance of the trained model.
Traditional relation extraction targets a fixed set of relation categories and attributes [6]. For example, Banko [7] extracted words or phrases directly from text to represent entity relation types; such studies primarily used remotely supervised external labels to identify relation phrases from external knowledge bases and then identify entity relation pairs. However, in the face of the massive amount of data accumulated in different domains, supervised extraction models based on manually annotated corpora can hardly handle the complexity of specialist-domain entities and relation types. Open relation extraction aims to extract relational facts from domain corpora, and Wu [8] proposed using high-quality annotated data from predefined relations to achieve open relation extraction. However, the manual annotation of high-quality datasets is still required. To further reduce annotation costs while obtaining high-quality annotated samples, this study proposes a remotely supervised [9,10,11] and semi-supervised domain dataset construction method to provide data support for constructing knowledge graphs in different professional domains.
Unlike in the generic domain, the open relation extraction task in a specialized domain is simplified by the given core entities. These primary entities often come from a lexicon defined semi-automatically by experts, such as part names of equipment or core knowledge points of specific domain knowledge. The lexicon is obtained by mining domain terms from a large corpus with extraction algorithms followed by manual review; extracting sub-entities, relation words, attributes, and attribute values under the condition of known primary entities improves the extraction effect to a certain extent. At the same time, domain texts contain many complex relations. For example, in an industrial text, a graphitization furnace temperature measurement device includes a furnace body mechanism and a temperature measurement mechanism. The overlapping triples of the primary entity "graphitization furnace temperature measurement device" include (1) (graphitization furnace temperature measurement device, includes, furnace body mechanism) and (2) (graphitization furnace temperature measurement device, includes, temperature measurement mechanism). A hidden triple is (graphitization furnace temperature measurement device, has function, temperature measurement).
The joint extraction model proposed by Zheng [12] transforms the problem into a sequence labeling problem and uses a tight-coupling approach to extract relational triples. Although this approach can extract multiple relations from sentences, it does not effectively solve the relation extraction task across different domains. To solve the problem of complex entity relations and relation extraction in different domains, this study proposes a dilated gate attention network structure that enlarges the receptive field of the model for a given primary entity and, based on this structure, designs an open relation extraction model, Ro-DGANet, that fuses the idea of probabilistic graphs [13,14,15]. The model incorporates Chinese lexical information and uses a dilated convolutional neural network [16] to enrich the input features. It is trained with a gating [17] mechanism of residual structure, which better captures semantic information, is less prone to vanishing gradients, and improves computational speed. An attention mechanism is also introduced to capture the interrelations between word vectors and thus achieve better extraction results. The main contributions of this study are as follows:
  • To address the lack of a domain-annotated corpus, this study improves the bootstrapping [18,19] algorithm and combines it with a remote supervision algorithm to back-label high-quality triple samples from external knowledge bases, such as Baidu Encyclopedia. It further refines extraction patterns based on the syntactic structure of the existing training samples and uses these patterns to openly label triples from unstructured text. A high-quality training corpus is then constructed with manual review, solving the problem of the missing annotated corpus;
  • To solve the problem of complex entity relation types and overlapping relations in professional domains, the Ro-DGANet model provides an efficient and concise open relation extraction scheme. Based on the idea of a probabilistic graph, the model handles the case in which multiple relations exist for the same primary entity without limiting the relation types. It also introduces a dilated gate attention network structure that enlarges the receptive field of the model to extract the semantic features of the entire sentence, enabling the model to better capture semantic information;
  • Experiments are conducted on datasets from different domains, such as relation extraction for equipment components in industrial scenarios and for core knowledge points of programming languages. The results show that the model significantly improves F1 values across these datasets, indicating that the proposed method is suitable for open relation extraction between core entities in different domains and has reliable performance and portability.
In summary, this study introduces a novel dataset construction method that combines remote supervision and semi-supervision, setting it apart from conventional methods. Moreover, the Ro-DGANet relation extraction model is developed from the perspective of open relations, bridging the knowledge gap regarding latent relations across professional domains.

2. Related Work

Entity relation extraction is a crucial technique for building knowledge graphs. The two main types of existing supervised extraction methods [20,21] are (1) pipelined learning and (2) joint learning. In pipeline learning, entity recognition is conducted first, followed by relation extraction between entities. This method is prone to error propagation and ignores the direct connection between the two subtasks, which harms the relation extraction effect; redundant information also degrades model performance. To address these issues, researchers fused the two subtasks of entity recognition and relation extraction into one task for joint learning.
In joint learning, a sample sentence is input and the entity relation triple is obtained directly by a joint extraction model. Miwa [22] first proposed using recurrent neural networks for entity relation extraction, which reduced error propagation through parameter sharing and thus improved the model. However, the model did not consider long-range dependencies between entities. Zheng [23] proposed a new annotation mechanism that transforms entity relation extraction into a sequence annotation problem, using an end-to-end neural network model to directly extract relational triples and reducing the impact of redundant information on the model. However, this model cannot handle complex entity relation types, such as one entity being related to multiple entities or multiple relations holding between the same entity pair.
Although supervised deep learning extraction methods effectively learn many features and improve model performance, the models must be trained on a large amount of well-labeled data. To reduce annotation costs, remotely supervised entity relation extraction significantly reduces manual effort and is highly portable. Mintz [5] first applied remotely supervised methods to the relation extraction task by aligning news-domain data with entities in Freebase and extracting text features from the annotated data. However, this approach suffers from noisy data and error propagation. Alfonseca [24] used the Freebase knowledge base to inductively process relations and applied heuristic rules to automatically identify relevant words for extracting relations, reducing mislabeling to a certain extent. Huang [25] proposed a method that combines a gated recurrent unit (GRU) with an attention mechanism for remotely supervised relation extraction, addressing the poor long-distance entity dependency of traditional models and the label errors easily generated in remote supervision; however, its performance still lags behind supervised relation extraction. In general, the accuracy of data samples obtained by the remotely supervised approach is low, limiting the extraction performance of the model.
Traditional relation extraction is based on specific predefined relations, which makes the extraction task time-consuming and annotation costly. The emergence of bidirectional encoder representations from transformers (BERT) [26] has led to an increasing number of relation extraction models based on pre-trained models [27,28]. Shi [29] proposed a simple BERT-based model for entity relation extraction. Shen [30] used BERT for interpersonal relation extraction to reduce the impact of noisy data on the model. However, BERT does not handle long text well, and domain texts in relation extraction mostly contain long sentences. Moreover, in the joint learning process, the model is susceptible to the accumulation and propagation of errors caused by noisy data, which ultimately leads to unsatisfactory results.
Each of the above methods has shortcomings: either noisy data and propagated errors, or a limited ability to handle long-text features. The Ro-DGANet model proposed in this study not only reduces error propagation during training but also improves relation extraction for long-text sentences.

3. Problem Background

Traditional relation extraction is based on specific relations, which makes the extraction task time-consuming and labor-intensive and drives up labor costs. Meanwhile, such an extraction method applies only to the current field and is difficult to transfer to other fields. Open relation extraction is a newer research direction in information extraction that combines semantic and morphological features to automatically extract relations of unrestricted types from open web texts. Because open relation extraction does not require relation types to be defined in advance, it reduces the cost of manual annotation, and extraction systems based on this design are more portable and better suited to open web corpora with complex relation types.
The construction of domain knowledge graphs from open web texts proposed in this paper includes building a core entity lexicon, automatically annotating corpus entities, manually annotating open relations, training relation extraction models, and performing open relation extraction for a given core entity. One of the core tasks is relation extraction from open texts; an example is the relation extraction of equipment components in an industrial scenario. For a known, defined core entity lexicon, the inputs and outputs of open relation extraction are shown in Table 1.
This paper employs the TF-IDF [31] and TextRank [32] algorithms as tools for data pre-processing in order to compute the frequency of keywords in texts publicly available on the web, as demonstrated in Table 2.
Because the TF-IDF algorithm measures the significance of words solely by word frequency, words remain unconnected and carry no sequence information. For instance, in the industrial data, the word "feature" may appear frequently in the text but is not a valid entity. As Table 2 shows, incorporating the TextRank algorithm as an auxiliary tool supplies co-occurrence information between words, markedly improving the statistical results. Although this approach can replace basic manual statistics, it still requires expert review to eliminate invalid entity words and add a small number of missing entities. The outcome is a high-quality core entity candidate lexicon, shown in part in Table 3, which serves as a reference for primary entity matching and labeling.
  • The core idea of the TF-IDF algorithm is that the importance of a word is proportional to the number of times it appears in the document and inversely proportional to its frequency in the corpus. The calculation formula is as follows.
$$tf_{ij} = \frac{n_{i,j}}{\sum_k n_{k,j}}, \qquad idf_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|}, \qquad TFIDF = tf_{ij} \cdot idf_i$$
  • The TextRank algorithm calculates the relations between words. Its core idea is the same as that of the PageRank [33] algorithm: it extracts a text's keywords and key phrases from the co-occurrence information (semantics) between words within the document. The calculation formula is as follows; a combined sketch of both algorithms appears after the formula.
$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} WS(V_j)$$
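As a minimal sketch of the candidate-entity scoring step, the snippet below computes plain TF-IDF and a simple TextRank over a word co-occurrence graph and combines the two signals. The toy corpus, the window size, and the score-combination rule (a product) are illustrative assumptions, not the authors' exact pre-processing pipeline.

```python
import math
from collections import Counter, defaultdict

docs = [
    ["graphitization", "furnace", "includes", "furnace", "body"],
    ["carbonization", "furnace", "connected", "transport", "equipment"],
]

def tf_idf(docs):
    """tf_ij = n_ij / sum_k n_kj ; idf_i = log(|D| / df_i)."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = defaultdict(float)
    for doc in docs:
        tf = Counter(doc)
        for w, n in tf.items():
            scores[w] = max(scores[w], (n / len(doc)) * math.log(len(docs) / df[w]))
    return scores

def textrank(docs, window=2, d=0.85, iters=50):
    """PageRank-style scores on a word co-occurrence graph."""
    graph = defaultdict(Counter)
    for doc in docs:
        for i, w in enumerate(doc):
            for v in doc[max(0, i - window):i]:
                if v != w:
                    graph[w][v] += 1
                    graph[v][w] += 1
    ws = {w: 1.0 for w in graph}
    for _ in range(iters):
        ws = {w: (1 - d) + d * sum(cnt / sum(graph[v].values()) * ws[v]
                                   for v, cnt in graph[w].items())
              for w in graph}
    return ws

tfidf, tr = tf_idf(docs), textrank(docs)
# Combine the two signals (simple product here) to rank candidate entities.
ranked = sorted(tfidf, key=lambda w: tfidf[w] * tr.get(w, 0.0), reverse=True)
print(ranked[:5])
```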
This study used cascading pointer annotation for the training samples to annotate entities and relations separately, as shown in Figure 1 (from the Chinese patent dataset, marked with translations). A "1" label in the figure indicates the start or end position of an entity, whereas a "0" label carries no entity information. The sequences of start and end positions of the primary entity are used to predict the corresponding sub-entities and their relations. For example, the main entity "vertical graphitization furnace" has three corresponding sub-entities: support plate, furnace body, and furnace cover. The relation between the main entity and each of them is "includes". A minimal sketch of this labeling scheme follows.
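The sketch below builds the start and end pointer sequences for a sentence: each entity contributes a "1" at its first token in one sequence and a "1" at its last token in another. Word-level tokenization and the example spans are illustrative assumptions.

```python
text = "vertical graphitization furnace includes support plate"
tokens = text.split()
# (entity string, start token index, end token index)
entities = [("vertical graphitization furnace", 0, 2), ("support plate", 4, 5)]

start_labels = [0] * len(tokens)
end_labels = [0] * len(tokens)
for _, start, end in entities:
    start_labels[start] = 1  # marks the first token of an entity
    end_labels[end] = 1      # marks the last token of an entity

print(start_labels)  # [1, 0, 0, 0, 1, 0]
print(end_labels)    # [0, 0, 1, 0, 0, 1]
```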

4. Training Corpus Acquisition

The bootstrapping algorithm iteratively obtains a high-confidence annotated corpus using an external knowledge base. The essence is to extract sentences containing all entities and relation words of a triple and to select high-confidence samples, together with the triple, as annotated data to form the training corpus. The flowchart for obtaining the training corpus in this study is shown in Figure 2.
Method 1 adopts the idea of remote supervision: the Infobox lists on Baidu Encyclopedia pages related to the primary entity are searched, and the relevant triple sets are extracted and added to the candidate corpus. A heuristic algorithm then builds the training corpus by using the external knowledge base to automatically back-label entity pairs and triples in related sentences, as shown in Algorithm 1. A final manual review of the annotated dataset ensures its quality.
Algorithm 1: Remote annotation algorithm
Input: external knowledge base U of triples k = (s, p, o); candidate corpus M
Output: labeled dataset K
for each k ∈ U do
    M_k ← get(M, k)
    for each sentence m ∈ M_k do
        if MATCH_ENTITY(k_s, k_o, m) then
            add (m, k_s, k_p, k_o) to the set K
        else
            remove m from M_k
    end for
end for
for each k ∈ K do
    if FILTER_SENTENCES(k_s, k_o) then
        remove k from K
    end if
end for
return K = {K_1, K_2, …, K_n}
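Below is a hedged Python sketch of Algorithm 1. The knowledge base, corpus, and the match/filter heuristics are toy stand-ins for the paper's Baidu Encyclopedia Infobox triples and manually reviewed filtering rules.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

knowledge_base: List[Triple] = [
    ("graphitization furnace", "includes", "furnace body"),
]
corpus: List[str] = [
    "The graphitization furnace includes a furnace body and a furnace cover.",
    "The furnace body is made of steel.",
]

def match_entity(subj: str, obj: str, sentence: str) -> bool:
    # Remote-supervision heuristic: both entities must co-occur.
    return subj in sentence and obj in sentence

def filter_sentence(subj: str, obj: str, sentence: str) -> bool:
    # Placeholder for a confidence filter, e.g. dropping sentences
    # where the two entities are too far apart to express a relation.
    return abs(sentence.find(subj) - sentence.find(obj)) > 100

labeled = []
for subj, pred, obj in knowledge_base:
    for sent in corpus:
        if match_entity(subj, obj, sent) and not filter_sentence(subj, obj, sent):
            labeled.append((sent, subj, pred, obj))

print(labeled)  # [(sentence, subject, predicate, object), ...]
```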
Method 2 uses crawler technology to collect domain documents and performs open-ended extraction without restricting the relation types. In contrast to the traditional bootstrapping-based OLLIE [34] extraction system, this study learns extraction patterns from domain documents and a small number of annotated samples according to Chinese syntactic structure. Triples are then openly annotated from unstructured text based on these extraction patterns, and a high-quality training corpus is finally constructed through manual review.
Syntactic structures include subject–predicate, verb–object, prepositional, complementary, and joint structures; these five structures reflect the fundamental grammatical relations in Chinese. Dependency syntactic analysis helps determine the subject and predicate of a sentence and their relation, so a generic annotation template can be extracted by revealing the Chinese syntactic structure through dependency analysis. Figure 3 (showing results for Chinese word segmentation) presents the output of the language technology platform (LTP) [35] dependency syntactic analysis, in which the labels on the connecting lines represent the relations between syntactic components. For example, the subject–predicate relation between "device" and "includes" is identified by "SBV".
To learn effective extraction patterns, the entity and relation dependency paths of the triple and related sentences are first labeled on the results of the dependency syntactic analysis, with specific entities labeled "Entity" and relations labeled "Relation". The content on the dependency path is then replaced with these labels to create the triple relation pattern, thus annotating the entire sentence with the relation, as shown in Figure 4 (which illustrates the process on Chinese text). A minimal sketch of this pattern abstraction appears below.
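The sketch below turns a dependency parse into a reusable extraction pattern. The parse is given here as (token, head index, dependency label) tuples; in the paper this comes from the LTP parser. The labeling and replacement scheme is an illustrative assumption.

```python
# Toy dependency parse of "the device includes a mechanism".
parse = [
    ("device", 1, "SBV"),      # subject of "includes"
    ("includes", -1, "HED"),   # sentence head
    ("mechanism", 1, "VOB"),   # object of "includes"
]
entity_words = {"device", "mechanism"}
relation_words = {"includes"}

# Replace concrete words on the dependency path with role labels to
# obtain a pattern such as Entity --SBV--> Relation <--VOB-- Entity,
# which can then be matched against new unstructured sentences.
pattern = []
for word, head, dep in parse:
    if word in entity_words:
        pattern.append(("Entity", dep))
    elif word in relation_words:
        pattern.append(("Relation", dep))

print(pattern)  # [('Entity', 'SBV'), ('Relation', 'HED'), ('Entity', 'VOB')]
```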
Using the above method, the industrial domain dataset (IDATA) and the C programming domain dataset (CDATA) were constructed from domain documents crawled on the web. The IDATA dataset is a collection of industrial domain documents from an industrial equipment patent information website, covering 3257 patents; the related contents are shown in Table 4.
The CDATA dataset is a collection of C-language domain documents for vertical learning websites, such as Niuke.com, W3C School, C-Language Network, Rookie Tutorial, and Baidu Encyclopedia, with a total of 529 document pages, as shown in Table 5.

5. Relation Extraction Model Ro-DGANet

Relation extraction from unstructured data, such as domain documents and expert documents, is not easy. Compared with general-purpose domains, specialized domains have their own characteristics: the knowledge content structure is relatively stable, and the terminology is relatively standardized, consisting mainly of nouns, verbs, and adjectives. Given these data characteristics, the information extraction task is transformed into an open relation [36] extraction task under a given primary entity to enhance the extraction effect.
This study proposes an open relation extraction model built on a dilated gate attention network and a probabilistic graph formulation, given the primary entity. The model architecture is shown in Figure 5.
Richer semantic features are captured by the dilate gated attention network (DGANet), which adds lexical encoding based on the features of the domain text. The resulting encoded sequence is concatenated with the primary-entity encoding vector as a condition, and a conditional LayerNorm [37] is applied to the encoded sequence to predict the sub-entities and corresponding relations; a sketch of this conditioning step follows. To counter the poor robustness caused by the local instability of neural network models, a gradient penalty is added to the model loss. The experimental results show that this method improves the robustness of the model.
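A hedged PyTorch sketch of conditional LayerNorm appears below: the gain and bias of a standard LayerNorm are shifted by linear projections of the primary-entity encoding, so the sequence representation is conditioned on the subject. The dimensions and zero initialization are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ConditionalLayerNorm(nn.Module):
    def __init__(self, hidden: int, cond: int, eps: float = 1e-12):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(hidden))
        self.beta = nn.Parameter(torch.zeros(hidden))
        # Zero-initialised so training starts from plain LayerNorm.
        self.gamma_dense = nn.Linear(cond, hidden, bias=False)
        self.beta_dense = nn.Linear(cond, hidden, bias=False)
        nn.init.zeros_(self.gamma_dense.weight)
        nn.init.zeros_(self.beta_dense.weight)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, hidden); cond: (batch, cond_dim)
        mean = x.mean(-1, keepdim=True)
        std = x.std(-1, keepdim=True, unbiased=False)
        gamma = self.gamma + self.gamma_dense(cond).unsqueeze(1)
        beta = self.beta + self.beta_dense(cond).unsqueeze(1)
        return gamma * (x - mean) / (std + self.eps) + beta

x = torch.randn(2, 16, 768)        # encoded sequence
subject_vec = torch.randn(2, 768)  # primary-entity encoding
print(ConditionalLayerNorm(768, 768)(x, subject_vec).shape)  # (2, 16, 768)
```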

5.1. Joint Extraction of Ideas

The model in this study draws on the joint extraction approach, which jointly optimizes entity recognition and relation classification, and designs a seq2seq-like extraction framework based on the probability graph idea. First, assume that Equation (3) holds.
$$P(s, p, o) = P(s)\,P(o \mid s)\,P(p \mid s, o)$$
In Equation (3), s denotes the primary entity, p the relation, and o the sub-entity. Passing in s predicts the corresponding o, and passing in (s, o) predicts the corresponding p. The predictions of o and p can be synchronized during training to shorten training time. To handle multiple s, o, or p in the same input text, the Sigmoid activation function is used so that several candidates can fire independently. This design not only handles multiple-triple extraction but also decodes efficiently, as sketched below.
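The sketch below illustrates the cascaded decoding implied by Equation (3): subjects are predicted first, then objects conditioned on each subject, then relations conditioned on the (subject, object) pair, with independent sigmoid scores allowing several candidates to survive at each stage. The linear heads and thresholds are illustrative stand-ins for the trained pointer networks.

```python
import torch
import torch.nn as nn

hidden, n_rel, seq_len = 768, 4, 16
seq_repr = torch.randn(seq_len, hidden)   # encoder output for one sentence

subject_head = nn.Linear(hidden, 1)           # P(s) per position
object_head = nn.Linear(hidden * 2, 1)        # P(o | s) per position
relation_head = nn.Linear(hidden * 2, n_rel)  # P(p | s, o)

def decode(thr: float = 0.5):
    triples = []
    # P(s): independent sigmoid per position, so multiple subjects survive.
    subj_scores = torch.sigmoid(subject_head(seq_repr)).squeeze(-1)
    for s in subj_scores.gt(thr).nonzero().flatten().tolist():
        # P(o | s): object pointers conditioned on the chosen subject.
        cond = torch.cat([seq_repr, seq_repr[s].expand_as(seq_repr)], dim=-1)
        obj_scores = torch.sigmoid(object_head(cond)).squeeze(-1)
        for o in obj_scores.gt(thr).nonzero().flatten().tolist():
            # P(p | s, o): relation scores for the entity pair.
            pair = torch.cat([seq_repr[s], seq_repr[o]])
            rel_scores = torch.sigmoid(relation_head(pair))
            for p in rel_scores.gt(thr).nonzero().flatten().tolist():
                triples.append((s, p, o))
    return triples

print(decode())  # list of (subject_pos, relation_id, object_pos)
```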

5.2. Encoding Layer

We used the RoFormer [38] pre-trained model for initial data processing. RoFormer adopts the rotary position embedding (RoPE) technique, which differs from the absolute position encodings used by mainstream BERT and A Lite BERT (ALBERT) [39]. Absolute position encoding is simple to implement and fast to compute; in contrast, relative position encoding directly reflects the interrelations between tokens, and the model's practical performance is better.
RoPE introduces positional information into the learning process of pre-trained language models by using rotation angles between vectors to represent the relative relation between features. Specifically, RoPE uses a rotation matrix to encode absolute position information and incorporates explicit relative position correlation into Attention, making it scalable to arbitrary sequence lengths and having relative position encoding capabilities. As a result, the Transformer with rotated positional embedding performs excellently in textual processing tasks.
We used a trained Chinese RoFormer model consisting of 12 encoding blocks, each with a 12-head self-attention sub-module, and an embedding dimension of 768.
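A hedged sketch of rotary position embedding follows: pairs of feature dimensions are rotated by a position-dependent angle, so the dot product of two rotated vectors depends only on their relative offset. The base 10000 follows the RoFormer paper; the tensor layout is an illustrative simplification.

```python
import torch

def rope(x: torch.Tensor) -> torch.Tensor:
    # x: (seq_len, dim) with dim even.
    seq_len, dim = x.shape
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    freq = 10000 ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angle = pos * freq                     # (seq_len, dim/2)
    cos, sin = angle.cos(), angle.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]        # split into rotation pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin     # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(16, 64)
k = torch.randn(16, 64)
# Attention scores computed on rotated q/k encode relative positions.
scores = rope(q) @ rope(k).T
print(scores.shape)  # (16, 16)
```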

5.3. Dilate Gate Attention Network

The DGANet proposed in this paper consists of 12 layers of dilate gated attention (DGA), each of which includes a one-dimensional dilated convolution and a gated attention unit (GAU) [40] fused with a residual structure. The structure is shown in Figure 6.
Suppose the input is the vector sequence X = [x_1, x_2, …, x_n]. A gated linear unit (GLU) [16] is added to the one-dimensional convolution, as in the following equations.
$$Y = U \otimes V$$
$$U = \mathrm{Conv1D}_1(X)$$
$$V = \mathrm{Conv1D}_2(X)$$
The two Conv1D operators in Equations (5) and (6) have the same structure but do not share parameters. Suppose the dimension of a word vector is d; then there are 2d convolution kernels, one group activated by the Sigmoid function and the other without an activation function, and their outputs are multiplied element-wise. Because the Sigmoid function has the range (0, 1), it essentially adds a "valve" to each output of Conv1D to control the information flow. This design reduces the risk of vanishing gradients, because the convolution branch without an activation function is less prone to gradient vanishing.
Because the Conv1D parameters in the model are not shared and operate independently, the interaction between tokens is missing. To solve this problem, an attention matrix A is added, which fuses information between tokens so that the output contains their interactions, using the following formulas.
$$Y = U \otimes (A V)$$
$$A = \frac{1}{n} \mathrm{relu}^2\!\left(\frac{Q(Z) K(Z)^\top}{\sqrt{s}}\right) = \frac{1}{n s} \mathrm{relu}^2\!\left(Q(Z) K(Z)^\top\right)$$
$$Z = \phi(X W_z), \qquad W_z \in \mathbb{R}^{d \times s}$$
In the above equations, s denotes the attention head size (s = 128 in this paper), Q and K are simple affine transformations, and φ is the SiLU (Sigmoid Linear Unit) [41] activation function.
Because the input X and output Y of this layer have the same dimension, X can participate in the operation; in other words, the combination of the residual and gated attention not only alleviates the vanishing-gradient problem but also lets feature information flow through multiple channels, as in Equation (10).
$$Y = X + (U - X) \otimes (A V)$$
For a clearer understanding of the DGANet structure, Equation (10) can be rewritten in the following equivalent form.
$$Y = X \otimes (1 - A V) + U \otimes (A V)$$
The convolutional structure used in the model is a convolutional neural network (CNN) [42], which significantly reduces model training time and is more effective than a recurrent neural network (RNN) structure. To let the CNN capture richer semantic features and enlarge the receptive field of the model, a dilated convolutional network is used to capture distant semantic information without increasing the number of parameters.
The model uses 12 layers of dilated convolutions with dilation rates [1, 2, 4] repeated three times in sequence and then set to [1, 1, 1] for fine-grained tuning, giving the model automatic feature selection capability and stronger contextual semantic capture; a sketch of such a stack follows.
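Below is a hedged PyTorch sketch of one DGA layer: two unshared dilated 1-D convolutions form the GLU, a single-head relu² attention in the spirit of GAU mixes information across tokens, and the gated residual of Equation (11) carries the input forward. The head size s = 128 mirrors the text; everything else is an illustrative simplification of the authors' model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DGALayer(nn.Module):
    def __init__(self, d: int = 768, s: int = 128, dilation: int = 1):
        super().__init__()
        pad = dilation  # "same" padding for kernel size 3
        self.conv_u = nn.Conv1d(d, d, 3, padding=pad, dilation=dilation)
        self.conv_v = nn.Conv1d(d, d, 3, padding=pad, dilation=dilation)
        self.z_proj = nn.Linear(d, s)
        self.q_proj = nn.Linear(s, s)  # simple affine Q
        self.k_proj = nn.Linear(s, s)  # simple affine K
        self.s = s

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d)
        n = x.size(1)
        h = x.transpose(1, 2)                       # Conv1d expects (b, d, n)
        u = self.conv_u(h).transpose(1, 2)          # branch without activation
        v = torch.sigmoid(self.conv_v(h)).transpose(1, 2)  # sigmoid "valve"
        z = F.silu(self.z_proj(x))                  # Z = phi(X W_z)
        q, k = self.q_proj(z), self.k_proj(z)
        a = F.relu(q @ k.transpose(1, 2) / self.s ** 0.5) ** 2 / n
        av = a @ v                                  # token interaction
        return x * (1 - av) + u * av                # gated residual, Eq. (11)

# Twelve layers with dilation rates [1, 2, 4] x 3 followed by [1, 1, 1].
rates = [1, 2, 4] * 3 + [1, 1, 1]
net = nn.Sequential(*[DGALayer(dilation=r) for r in rates])
print(net(torch.randn(2, 32, 768)).shape)  # (2, 32, 768)
```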

6. Experimental Assessment

6.1. Experimental Dataset

The Baidu LIC2019 Chinese information extraction contest dataset and the CHIP2020 Chinese medical text entity relation extraction dataset are used as model evaluation datasets, and the industrial patent dataset (IDATA) and the Chinese C-language entity relation dataset (CDATA), collected and collated by ourselves, are used as model application datasets.
The LIC2019 dataset contains text recorded from Baidu Encyclopedia, Baidu Tieba, and the Baidu information feed, with approximately 210,000 data samples, 450,000 triples, and 50 predefined entity relation types.
The CHIP2020 dataset contains a pediatric training corpus derived from 518 pediatric diseases and a common diseases training corpus derived from 109 common diseases, with approximately 75,000 triples, 28,000 disease statements, and 53 well-defined entity relation types.

6.2. Experimental Environment and Model Parameter Settings

All experiments were run under PyTorch 1.10.0 and Python 3.8 on 64-bit Ubuntu 20.04, with a 12-core Intel(R) Xeon(R) Platinum 8255C CPU, 48 GB RAM, and an NVIDIA RTX 3090 graphics card. The experimental parameters are shown in Table 6.

6.3. Evaluation Metrics

The training and validation sets are divided in a 7:3 ratio, and the precision P, recall R, and F1 value are used as evaluation metrics, defined by the following equations.
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 P R}{P + R}$$
In the formulas, TP is the number of positive samples predicted as positive, FP is the number of negative samples predicted as positive, FN is the number of positive samples predicted as negative, and TN is the number of negative samples predicted as negative.

6.4. Contrast Model

To prove the effectiveness of the proposed extraction model, it was compared with the current mainstream relation extraction models, summarized below:
  • CasRel [43]: The extraction process employs a "half-pointer, half-label" method based on the BERT model, which yields a relatively straightforward architecture. However, during data processing, a large number of unnecessary relations must be evaluated, reducing computational efficiency, and the model may suffer exposure bias and error propagation during training;
  • TPLinker [44]: TPLinker mitigates the exposure bias and error propagation of joint extraction models by reformulating joint extraction as a token-pair linking task and introducing a novel linking approach that ensures consistency between the training and prediction phases. However, TPLinker adopts a multi-head labeling strategy in which, for N relation classes, the model creates 2N sparse relation matrices during training, resulting in slower convergence, and it does not exploit richer semantic features;
  • TPLinkerPlus: The model task is transformed from a multi-classification task to a multi-label classification task based on TPLinker, which reduces the sparsity of the matrix;
  • GPLinker: A GlobalPointer-based [45] joint entity relation extraction scheme similar to TPLinker, with additional labeling to distinguish entity type labels.

6.5. Experimental Results Analysis

6.5.1. Experimental Analysis of Comparative Models

The experimental results of the different models on the LIC2019 and CHIP2020 datasets are shown in Table 7. Compared with the other mainstream baseline models, Ro-DGANet performs better on both datasets. On the LIC2019 dataset, the recall R and F1 values improved by 1.64% and 0.41%, respectively, over the best-performing baseline. Although the model achieves high recall and F1 scores, its precision is lower than that of models such as GPLinker, for two reasons: first, the stacked pointer labeling approach creates class imbalance, which hurts precision; second, training places greater emphasis on recall, limiting precision gains. On the CHIP2020 dataset, Ro-DGANet improved in precision P, recall R, and F1, with the R and F1 values being the best. These results demonstrate the effectiveness of the Ro-DGANet model.
The evaluation metrics of the different models on the LIC2019 dataset as training progresses are shown in Figure 7. TPLinker and TPLinkerPlus converge more slowly in recall and F1 because of their more complex models. In contrast, GPLinker and CasRel converge faster in recall and F1 during training, and their F1 values fluctuate little after convergence. The proposed model is comparable to CasRel and GPLinker in F1 convergence speed on the LIC2019 dataset, but its recall is higher than that of the other models and varies more stably after convergence. Meanwhile, its precision fluctuates less than that of the other models during training and remains stable. In summary, the Ro-DGANet model achieves the best extraction results and converges quickly, indicating that the dilated gate attention network module obtains more effective attention information.

6.5.2. Experimental Analysis of Ablation

Ablation experiments were carried out on the LIC2019 test set to evaluate the efficacy of the various components and layers, as illustrated in Table 8. The dilated gate attention network extracts semantic feature information and increases the interaction between tokens through multiple channels, leading to a significant performance gain over models without this module. Adding only lexical encoding (the Postag component) slightly improves performance. Similarly, the GP component introduces perturbations to the Embedding layer, which is to some extent equivalent to adding a gradient penalty to the loss, and also slightly improves performance, whereas the Base model includes no additional functional modules. The results in Table 8 indicate that the DGANet, Postag, and GP components all aid the joint extraction of relational triples. The DGANet layer has the most significant impact on extraction, as most of the relational information is embedded in the content between the two entities.

6.5.3. Model Applications

The Ro-DGANet model was applied to the relation extraction task for equipment components in industrial fields and to the information extraction task for C-language knowledge graph construction. The constructed IDATA and CDATA datasets were randomly shuffled; 70% were selected as training samples and the remaining 30% as test samples. The results, averaged over multiple cross-validation runs, are shown in Figure 8.
The precision P, recall R, and F1 values all improved over the baseline model without any added functional module: F1 improved by 3.15% on the IDATA dataset, and recall improved by 3.17% on the CDATA dataset. This shows that the Ro-DGANet model improves not only on the IDATA dataset, composed of short texts, but also on the CDATA dataset, consisting of medium and long texts. Taking the C language as an example, the constructed core knowledge graph of the programming language is shown in Figure 9. The precision of Ro-DGANet ranged from 82% to 86% in the industrial and C programming domains, which can serve as a reference for constructing domain knowledge graphs in the future.

6.5.4. Training Time Cost for Different Models

The total training time of the various models over 20 epochs is shown in Table 9. Adding the dilated gate attention network makes the model more complex overall. However, the GAU, with only a single-head attention mechanism, improves computational speed compared with multi-head attention, reduces GPU memory usage, and achieves the same or even better results; its training time is comparable to that of the CasRel and GPLinker models. Compared with TPLinker and TPLinkerPlus, which use handshaking tagging, this study's pointer tagging model, based on probabilistic graphs, is simpler to compute and trains faster. Overall, the Ro-DGANet model has a more complex structure, but its training time is shorter and its F1 value is higher.
Table 10 shows the GPU memory consumption during the training of the various models. Despite its relatively complex structure, the Ro-DGANet model demonstrates a quick training time, low memory consumption, and a high F1 value.

7. Conclusions

To address the lack of an annotated corpus in the domain, this study proposed a method for constructing domain-annotated datasets that incorporates remote supervision and semi-supervision. To address the difficulty of extracting entity relation types, this study simplified the relation extraction task and proposed a relation extraction model, Ro-DGANet, given the primary entity. Comparison experiments with mainstream relation extraction models on the publicly available Chinese datasets LIC2019 and CHIP2020 demonstrated the effectiveness of the proposed model. At the same time, applying the model to the relation extraction of equipment components in industrial scenarios and of knowledge points in programming-language knowledge graphs can reduce part of the labor cost of building domain knowledge graphs. It should be noted that the dataset construction approach proposed in this paper does not completely eliminate the need for manual review. Although using dependency parsing to derive extraction patterns for constructing the training corpus covers high-frequency syntactic structures, it may miss some valuable data. Furthermore, because the Ro-DGANet model uses a cascaded pointer annotation approach, a class imbalance problem may arise when extracting entity relations: given the characteristics of the existing data, target entities are significantly fewer than invalid entities, so the "1" label is scarce relative to the "0" label in pointer annotation, which affects the training effect to a certain extent. In the future, we aim to compile more comprehensive extraction patterns based on the document characteristics of different domains. To address the class imbalance problem, we also plan to investigate the adaptive gradient weight concept to achieve more effective practical application outcomes.

Author Contributions

X.W. performed the experiments, contributed to the construction of the data and the design of the model, executed a detailed analysis, and wrote some sections. J.H. set the direction of the research, wrote some sections, and performed the final corrections. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project granted by the Ministry of Science and Technology, grant number 2020AAA0109300.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fu, L.; Cao, Y.; Bai, Y.; Leng, J. Development status and prospect of vertical domain knowledge graph in China. Appl. Res. Comput. 2021, 38, 3201–3214.
  2. Feng, J.; Wei, D.; Su, D.; Hang, T.; Lu, J. Survey of document-level entity relation extraction methods. Comput. Sci. 2022, 49, 224–242.
  3. Li, D.; Zang, Y.; Li, D.; Lin, D. Review of entity relation extraction methods. J. Comput. Res. Dev. 2020, 57, 1424–1448.
  4. Etzioni, O.; Banko, M.; Soderland, S.; Weld, D.S. Open information extraction from the web. Commun. ACM 2008, 51, 68–74.
  5. Mintz, M.; Bills, S.; Snow, R.; Jurafsky, D. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Suntec, Singapore, 2–7 August 2009; pp. 1003–1011.
  6. Qin, B.; Liu, A.A.; Liu, T. Unsupervision for relation entity extraction. J. Comput. Res. Dev. 2015, 52, 1029–1035.
  7. Banko, M.; Etzioni, O. The tradeoffs between open and traditional relation extraction. In Proceedings of the ACL-08: HLT, Columbus, OH, USA, 15–20 June 2008; pp. 28–36.
  8. Wu, R.; Yao, Y.; Han, X.; Xie, R.; Liu, Z.; Lin, F.; Lin, L.; Sun, M. Open relation extraction: Relational knowledge transfer from supervised data to unsupervised data. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 219–228.
  9. Xiao, Y.; Jin, Y.; Cheng, R.; Hao, K. Hybrid attention-based transformer block model for distant supervision relation extraction. Neurocomputing 2022, 470, 29–39.
  10. Liu, M.; Zhou, F.; He, J.; Yan, X. Knowledge graph attention mechanism for distant supervision neural relation extraction. Knowl.-Based Syst. 2022, 256, 109800.
  11. Zhang, Y.; Fei, H.; Li, P. ReadsRE: Retrieval-Augmented Distantly Supervised Relation Extraction. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Montreal, QC, Canada, 11–15 July 2021; pp. 2257–2262.
  12. Zheng, S.; Hao, Y.; Lu, D.; Bao, H.; Xu, J.; Hao, H.; Xu, B. Joint entity and relation extraction based on a hybrid neural network. Neurocomputing 2017, 257, 59–66.
  13. Wang, S.; Zhang, Y.; Che, W.; Liu, T. Joint extraction of entities and relations based on a novel graph scheme. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; pp. 4461–4467.
  14. Li, X.; Yin, F.; Sun, Z.; Li, X.; Yuan, A.; Chai, D.; Zhou, M.; Li, J. Entity-relation extraction as multi-turn question answering. arXiv 2019, arXiv:1905.05529.
  15. Chang, H.; Zan, H.; Guan, T.; Zhang, K.; Sui, Z. Application of cascade binary pointer tagging in joint entity and relation extraction of Chinese medical text. Math. Biosci. Eng. 2022, 19, 10656–10672.
  16. Gehring, J.; Auli, M.; Grangier, D.; Yarats, D.; Dauphin, Y.N. Convolutional sequence to sequence learning. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1243–1252.
  17. Wu, F.; Lao, N.; Blitzer, J.; Yang, G.; Weinberger, K. Fast reading comprehension with convnets. arXiv 2017, arXiv:1711.04352.
  18. Brin, S. Extracting patterns and relations from the world wide web. In Proceedings of the International Workshop on the World Wide Web and Databases, Valencia, Spain, 27–28 March 1998; pp. 172–183.
  19. Gao, T.; Han, X.; Xie, R.; Liu, Z.; Lin, F.; Lin, L.; Sun, M. Neural snowball for few-shot relation learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 7772–7779.
  20. Zhang, S.; Wang, X.; Chen, Z.; Wang, L.; Xu, D.; Jia, Y. Survey of Supervised Joint Entity Relation Extraction Methods. J. Front. Comput. Sci. Technol. 2022, 16, 713–733.
  21. Jiang, J.; Zhai, C. A systematic exploration of the feature space for relation extraction. In Proceedings of the Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference, Rochester, NY, USA, 22–27 April 2007; pp. 113–120.
  22. Miwa, M.; Bansal, M. End-to-end relation extraction using lstms on sequences and tree structures. arXiv 2016, arXiv:1601.00770.
  23. Zheng, S.; Wang, F.; Bao, H.; Hao, Y.; Zhou, P.; Xu, B. Joint extraction of entities and relations based on a novel tagging scheme. arXiv 2017, arXiv:1706.05075.
  24. Alfonseca, E.; Filippova, K.; Delort, J.-Y.; Garrido, G. Pattern learning for relation extraction with a hierarchical topic model. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Jeju Island, Republic of Korea, 8–14 July 2012; pp. 54–59.
  25. Huang, Z.; Chang, L.; Bin, C.; Sun, Y.; Sun, L. Distant supervision relation extraction based on GRU and attention mechanism. Appl. Res. Comput. 2019, 36, 2930–2933.
  26. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
  27. Xue, Y.; Zhu, J.; Lyu, J. Construction and Application of Text Entity Relation Joint Extraction Model Based on Multi-Head Attention Neural Network. Comput. Intell. Neurosci. 2022, 2022, 1530295.
  28. Qiao, B.; Zou, Z.; Huang, Y.; Fang, K.; Zhu, X.; Chen, Y. A joint model for entity and relation extraction based on BERT. Neural Comput. Appl. 2022, 34, 3471–3481.
  29. Shi, P.; Lin, J. Simple bert models for relation extraction and semantic role labeling. arXiv 2019, arXiv:1904.05255.
  30. Shen, T.; Wang, D.; Feng, S.; Zhang, Y. NS-Hunter: BERT-Cloze based semantic denoising for distantly supervised relation classification. In Proceedings of the Chinese Computational Linguistics: 20th China National Conference, CCL 2021, Hohhot, China, 13–15 August 2021; pp. 324–340.
  31. Chowdhury, G.G. Introduction to Modern Information Retrieval; Facet Publishing: London, UK, 2010.
  32. Mihalcea, R.; Tarau, P. Textrank: Bringing order into text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, 25–26 July 2004; pp. 404–411.
  33. Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab. 1999. Available online: https://www.semanticscholar.org/paper/The-PageRank-Citation-Ranking-%3A-Bringing-Order-to-Page-Brin/eb82d3035849cd23578096462ba419b53198a556 (accessed on 25 January 2023).
  34. Schmitz, M.; Soderland, S.; Bart, R.; Etzioni, O. Open language learning for information extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island, Republic of Korea, 12–14 July 2012; pp. 523–534.
  35. Che, W.; Li, Z.; Liu, T. Ltp: A chinese language technology platform. In Proceedings of the Coling 2010: Demonstrations, Beijing, China, 23–27 August 2010; pp. 13–16.
  36. Jia, S.; Shijia, E.; Ding, L.; Chen, X.; Xiang, Y. Hybrid neural tagging model for open relation extraction. Expert Syst. Appl. 2022, 200, 116951.
  37. Dathathri, S.; Madotto, A.; Lan, J.; Hung, J.; Frank, E.; Molino, P.; Yosinski, J.; Liu, R. Plug and play language models: A simple approach to controlled text generation. arXiv 2019, arXiv:1912.02164.
  38. Su, J.; Lu, Y.; Pan, S.; Wen, B.; Liu, Y. Roformer: Enhanced transformer with rotary position embedding. arXiv 2021, arXiv:2104.09864.
  39. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. Albert: A lite bert for self-supervised learning of language representations. arXiv 2019, arXiv:1909.11942.
  40. Hua, W.; Dai, Z.; Liu, H.; Le, Q. Transformer quality in linear time. In Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 9099–9117.
  41. So, D.; Mańke, W.; Liu, H.; Dai, Z.; Shazeer, N.; Le, Q.V. Searching for Efficient Transformers for Language Modeling. Adv. Neural Inf. Process. Syst. 2021, 34, 6010–6022.
  42. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Imaging 2018, 9, 611–629.
  43. Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A novel cascade binary tagging framework for relational triple extraction. arXiv 2019, arXiv:1909.03227.
  44. Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage joint extraction of entities and relations through token pair linking. arXiv 2020, arXiv:2010.13415.
  45. Su, J.; Murtadha, A.; Pan, S.; Hou, J.; Sun, J.; Huang, W.; Wen, B.; Liu, Y. Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition. arXiv 2022, arXiv:2208.03054.
Figure 1. Schematic diagram of sample labeling.
Figure 2. Flowchart of training corpus acquisition.
Figure 3. Dependency parsing tags.
Figure 4. Flowchart of relation labeling.
Figure 5. Structure of Ro-DGANet model.
Figure 6. Structure of DGA model.
Figure 7. Comparison of metrics of different models on the LIC2019 dataset. (a) Changes in precision P; (b) changes in recall R; (c) changes in the F1 score.
Figure 8. Experimental results on IDATA and CDATA datasets.
Figure 9. Core knowledge graph of C programming language (partial results).
Table 1. Open relation extraction sample.

Sample 1
Input: { "text": "The vertical graphitization furnace includes a support plate, a furnace body and a furnace cover above it.", "sub": ["Vertical Graphitization Furnace", "Support plate", "Furnace body", "Furnace lid"] }
Output: { "text": "The vertical graphitization furnace includes a support plate, a furnace body and a furnace cover above it.", "spo": [["Vertical Graphitization Furnace", "Include", "Support plate"], ["Vertical Graphitization Furnace", "Include", "Furnace body"], ["Vertical Graphitization Furnace", "Include", "Furnace lid"], ["Furnace body", "Above", "Furnace lid"]] }

Sample 2
Input: { "text": "The raw material silo, drying furnace and carbonization furnace are connected in sequence by transportation equipment.", "sub": ["Raw material silo", "Drying furnace", "Carbonization furnace", "Transport equipment"] }
Output: { "text": "The raw material silo, drying furnace and carbonization furnace are connected in sequence by transportation equipment.", "spo": [["Raw material silo", "Connected", "Transport equipment"], ["Drying furnace", "Connected", "Transport equipment"], ["Carbonization furnace", "Connected", "Transport equipment"]] }
Table 2. Core entity extraction results.

Method     TF-IDF   TextRank   TF-IDF + TextRank
Accuracy   75%      67%        81%
Table 3. Core entity candidate lexicon (partial results).

Entity                     Score     Entity                        Score
High-temperature furnace   12.5378   Stove top                     6.8294
Carbonization furnace      11.6943   Furnace shell                 6.6275
Furnace body               11.0950   Thermal insulation panels     6.5890
Graphitization furnace      9.4932   Atmospheric chamber furnace   6.1748
Furnace door                9.0723   Carbon tube furnace           5.4379
Furnace cover               7.5063   Support seat                  5.4318
Table 4. Statistics of IDATA sources.

Content                  Number of Document Pages
Carbonization furnace    1355
Graphitization furnace   1902
Table 5. Statistics of CDATA sources.

Data Source                Niuke   W3C School   C-Language Network   Rookie Tutorial
Number of document pages   177     79           101                  172
Table 6. Experimental parameters of the Ro-DGANet model.

Parameter       Value
max epochs      20
batch size      16
seq length      128
dropout         0.2
learning rate   2 × 10−5
hidden size     768
Table 7. Results of different models on the LIC2019 and CHIP2020 datasets (ternary relation extraction).

Dataset    Model          P%      R%      F1%
LIC2019    CasRel         80.75   79.77   82.15
           TPLinker       79.34   78.27   80.43
           TPLinkerPlus   80.89   79.38   81.31
           GPLinker       81.75   79.96   82.58
           Ro-DGANet      80.83   81.60   82.99
CHIP2020   CasRel         64.10   58.28   62.86
           TPLinker       64.76   63.81   63.72
           TPLinkerPlus   65.90   64.11   65.32
           GPLinker       65.59   60.97   63.66
           Ro-DGANet      65.74   66.37   66.39
Table 8. The influence of different functional modules.

Model                   F1 (%)   Influence
Base                    80.86    0
Base+Postag+GP          81.82    0.96
Base+DGANet+GP          82.48    1.62
Base+DGANet+Postag      82.26    1.40
Base+DGANet+Postag+GP   82.99    2.13
Table 9. Total training time of different models (seconds).

Models         IDATA (s)   CDATA (s)
CasRel         2948        2064
TPLinker       38,412      29,853
TPLinkerPlus   21,536      14,841
GPLinker       4837        3687
Ro-DGANet      3514        2728
Table 10. Memory usage of different models.

Models        CasRel   TPLinker   TPLinkerPlus   GPLinker   Ro-DGANet
Memory (MB)   5038     8157       15,978         5919       5405