Article

TransRFT: A Knowledge Representation Learning Model Based on a Relational Neighborhood and Flexible Translation

1 Fundamentals Department, Air Force Engineering University of PLA, Xi’an 710051, China
2 The Sixty-Third Research Institute, National University of Defense Technology, Nanjing 210007, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(19), 10864; https://doi.org/10.3390/app131910864
Submission received: 30 August 2023 / Revised: 29 September 2023 / Accepted: 29 September 2023 / Published: 29 September 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The use of knowledge graphs has grown significantly in recent years. However, entities and relationships must be transformed into forms that computers can process before a knowledge graph can be constructed and applied. Owing to its simplicity, effectiveness, and strong interpretability, the family of translation models led by TransE has attracted the most attention among the many knowledge representation models proposed to date. However, these models perform poorly on complex relations such as one-to-many, many-to-one, and reflexive relations. Therefore, this paper proposes a knowledge representation learning model based on a relational neighborhood and flexible translation (TransRFT). Firstly, the triples are mapped onto the relational hyperplane following the idea of TransH. Then, flexible translation is applied to relax the strict constraint h + r = t of TransE. Finally, relational neighborhood information is added to further improve the performance of the model. Experimental results show that the model performs well on triplet classification and link prediction.

1. Introduction

The knowledge graph, a large-scale organized semantic knowledge base [1], has made groundbreaking progress in tandem with the rapid development of big data, natural language processing (NLP), and artificial intelligence technology [2]. Knowledge graphs grew out of knowledge engineering and expert systems in the 1970s, but it was not until 2012 that Google officially launched its large-scale knowledge graph for Internet search [3], which has since been widely used in search engines, question-answering systems, recommendation systems, and other fields [4]. A knowledge graph is a technological approach that employs the graph model to describe knowledge and to model the interactions between objects in the world [5]. It uses entities, relationships, and attributes to describe knowledge and concepts in real life [6]. In the traditional sense, a knowledge graph is a large-scale semantic network that expresses knowledge in the form of nodes and edges: nodes are generally entities, concepts, or values, while edges are generally attributes or relationships. However, the knowledge graph is no longer a simple semantic network, but rather the sum of a series of representative technologies of knowledge engineering [7,8]. Large-scale knowledge graphs that have already been built include ConceptNet [9], YAGO [10,11,12,13], and CN-DBpedia [14].
The traditional representation of a knowledge graph is usually based on logic or rules. Logic-based methods typically use predicate logic to describe knowledge; however, logic is often too abstract for knowledge reasoning. Rule-based methods frequently require a large number of hand-crafted rules to describe entity–relationship–entity semantics, which makes it difficult to express tacit and procedural knowledge and to capture the intricate semantic relationships between entities. Both logic-based and rule-based knowledge representation methods require a great deal of logic or rules to be defined manually, which makes it difficult to update a knowledge graph in a timely manner. Moreover, knowledge in the real world is almost endless, and it is difficult to guarantee the completeness of a knowledge graph by manual means. These disadvantages seriously limit the application prospects of knowledge graphs.
To overcome the limitations of the manual approaches mentioned above and to enable knowledge to be learned automatically from data in a form that machine learning algorithms can process, researchers have proposed knowledge representation learning. The goal of knowledge representation learning is to create representations that appropriately capture knowledge so that machines can process and comprehend it more effectively. To this end, researchers proposed word vector models (word2vec [15], GloVe [16], BERT [17], etc.), whose basic idea is to transform words in natural language into low-dimensional dense vectors, so that words with similar semantics are closer together in the vector space. This representation can enhance the correlation between data through semantic connections, yielding more standardized and higher-quality data. It can also capture the semantic relationships between words, which is crucial to representation learning for knowledge graphs.
At present, researchers have proposed many knowledge representation models: examples include the distance model (SE) [18], the single layer neural network model (SLM) [19] and many other neural network models [20,21,22], the generative adversarial network model [23], the semantic matching energy model (SME) [24], the bilinear model (LFM) [25], multi-modal pretraining [26,27,28,29], the translation models [30,31,32,33,34], etc. Among them, the translation models headed by TransE [30] have attracted a great deal of research interest. TransE, proposed in 2013, has become a classic model for knowledge graph embedding. It encodes each entity and relationship as a vector; in the vector space, the angle, length, and position of the vectors convey the semantic meaning of the entity or relationship, which is reflected in link prediction and triplet classification. The score function for the relationship (r) between the head entity (h) and the tail entity (t) can be written as $f_r(h, t) = \|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_2^2$, i.e., the Euclidean distance between the sum of the head entity vector and the relation vector, on the one hand, and the tail entity vector, on the other. The training process for this method is simple and efficient, and the method is intuitive and easy to understand. However, it still has limitations: TransE performs poorly in handling complex relationships and cannot accurately model one-to-many, many-to-one, many-to-many, and reflexive relations. There is therefore significant room for improvement, and many researchers have enhanced the TransE model in various ways. Study [31] found that the constraint $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ in TransE was too strong, leading to learning errors with complex relations; building on TransE, it projects the head entity h and tail entity t onto a relation-specific hyperplane, relaxing TransE’s strict assumption of $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$. Study [32] proposed the TransR model, which constructs a separate vector space for each relation on top of TransH, effectively solving the problem that TransH places entities and relations in the same vector space and therefore cannot distinguish two entities with similar semantics. Study [33] created the TransD model, which revises the mapping function of TransR to successfully reduce computational complexity. Study [35] proposed a knowledge graph embedding method based on a double limit scoring loss (DLS), which effectively improves link prediction and triplet classification. Study [36] proposed a knowledge graph embedding method based on an adaptive double-limited loss (ADL), which effectively improves the accuracy of link prediction. Study [37] proposed a translation-based model (TransCoRe) that exploits relational correlation and tested it on the WordNet and Freebase datasets, effectively improving the accuracy of link prediction and triplet classification. Study [38] proposed the TransModE approach, which builds on TransE and represents the relation vector as a translation in modular space, effectively reducing computational complexity. Study [39] proposed a distributed translation-based knowledge graph embedding model (DTransE), which applies the Gather–Apply–Scatter model on top of TransE to significantly increase calculation speed. The methods above improve link prediction, triplet classification, or computational efficiency.
However, the models in the above approaches either ignore the neighborhood information of the triples when modeling the triples in the knowledge base, which leaves them unable to deal with rare entities that have little associated knowledge, or fail to adaptively extract the most relevant neighboring node attributes for each entity when introducing neighborhood information, thereby introducing redundant information. To solve these problems, this paper proposes a knowledge representation learning model (TransRFT) based on a relational neighborhood and flexible translation, which is implemented as follows:
Firstly, the head entity, relation, and tail entity are vectorized by means of the hyperplane projection of TransH, so that complex relations can be represented effectively. Secondly, the idea of flexible translation is introduced: $\mathbf{h}$, $\mathbf{r}$, and $\mathbf{t}$ are no longer regarded as fixed vectors, but as vectors allowed to lie within a planar region, which effectively alleviates the strict assumption $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ of the TransE model and enhances the ability of the model to deal with complex relationships. Finally, the neighborhood information of different relationships is integrated to further improve the performance of the model.
This paper is organized as follows: Section 2 introduces the principle of the TransRFT model and the model training methodology. Section 3 presents the experimental results and a performance analysis of the model. Section 4 concludes the paper.

2. TransRFT Model Principle and Training

This section explains the foundations of the TransRFT knowledge representation model in detail. TransRFT enhances the translation-based paradigm by incorporating neighborhood information and employing flexible translation to address the drawbacks of TransH. Finally, a probabilistic sampling technique is used to improve the quality of negative triples when replacing the head and tail entities.
The symbols used in this paper are as follows: h denotes the head entity, t the tail entity, and r the relationship between them; $\mathbf{h}$, $\mathbf{r}$, and $\mathbf{t}$ denote the corresponding embedded representations. S is the set of positive example triples and S′ is the set of negative example triples.

2.1. The TransRFT Model

As shown in Figure 1, suppose there are two triples in the knowledge graph: (Obama, position, US President) and (Trump, position, US President). For TransE, both the relation r and the tail entity are the same, so the corresponding head entity vectors h would have to be identical. This is clearly factually incorrect, and TransE cannot effectively distinguish between Obama and Trump. Therefore, inspired by the TransH model, entities are first projected onto a relation-specific hyperplane on the basis of TransE to improve the ability to handle complex relationships. Assuming that the normal vector of the relation hyperplane is $\mathbf{w}_r$, and that the projected head and tail entities are $\mathbf{h}_{\perp}$ and $\mathbf{t}_{\perp}$, then:
$$\mathbf{h}_{\perp} = \mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r, \qquad \mathbf{t}_{\perp} = \mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r \qquad (1)$$
After projection onto the relational hyperplane, the resulting projected triplet representation is shown in Figure 2.
From Figure 2, it can be observed that after projecting the head entities (Obama, Trump) onto the relation-specific hyperplane (Position), even though Obama and Trump are not the same person, they end up with the same projection $\mathbf{h}_{\perp}$. By representing each entity and relation with separate entity and relation vectors, and assigning a relation-specific hyperplane to each relation, it is possible to distinguish between different head entities that share the same relation to a given tail entity. Additionally, the differences and interactions between different relations can be captured by their corresponding hyperplanes. Compared to TransE, this approach enables better handling of one-to-many, one-to-one, and many-to-many relationships.
Normally, it is hoped that, in the vector space, the projection of the head entity vector onto the relational hyperplane, $\mathbf{h}_{\perp}$, plus the relation $\mathbf{r}$ equals the projection of the tail entity vector onto the relational hyperplane, $\mathbf{t}_{\perp}$. Therefore, the score function after projection is
$$f_r(h, t) = \left\| \mathbf{h}_{\perp} + \mathbf{r} - \mathbf{t}_{\perp} \right\|_{l_1/l_2} = \left\| \left(\mathbf{h} - \mathbf{w}_r^{\top}\mathbf{h}\,\mathbf{w}_r\right) + \mathbf{r} - \left(\mathbf{t} - \mathbf{w}_r^{\top}\mathbf{t}\,\mathbf{w}_r\right) \right\|_{l_1/l_2} \qquad (2)$$
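For concreteness, the following is a minimal PyTorch-style sketch of this projected score. The function and variable names are illustrative assumptions rather than the authors' released code, and the normal vector w_r is assumed to be L2-normalized.

```python
import torch

def project_to_hyperplane(e, w_r):
    """Project entity embeddings e onto the relation hyperplane with unit normal w_r
    (Equation (1)): e_perp = e - (w_r^T e) w_r."""
    return e - (e * w_r).sum(dim=-1, keepdim=True) * w_r

def projected_score(h, r, t, w_r, p=1):
    """Score ||h_perp + r - t_perp|| with an l1 (p=1) or l2 (p=2) norm (Equation (2)).
    Lower scores indicate more plausible triples."""
    h_perp = project_to_hyperplane(h, w_r)
    t_perp = project_to_hyperplane(t, w_r)
    return torch.norm(h_perp + r - t_perp, p=p, dim=-1)
```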
The score function measures how well the head and tail entities of a triplet match after "translation" by the relation. The lower the score, the more likely the triplet is a positive example; the higher the score, the more likely it is a negative example. However, this approach also has its own limitations. For instance, compared to TransE, it significantly increases model complexity and training time, and it still performs poorly when dealing with reflexive relationships. Therefore, the idea of flexible translation is introduced after projection onto the relational hyperplane.
Specifically, for each triplet $(h, r, t)$, if the embeddings of h and r are given, then t is allowed to lie anywhere within a planar region, rather than being a fixed vector or a set of vectors pointing in the same direction, as in earlier translation models. Similarly, if the embeddings of h and t are given, r is allowed to lie within a plane; and if the embeddings of r and t are given, h is also allowed to lie within a plane, as illustrated in Figure 3.
Finally, in order to utilize the existing knowledge in the knowledge graph and make knowledge representation learning more accurate, relational neighborhood knowledge is introduced to exploit the large amount of information provided by the structure of the knowledge graph. Specifically, if a relation has more than three neighbor nodes, only the three closest are selected; if it has fewer than three, all of them are selected. Moreover, when the neighbor nodes are represented, a relationship weight ω is added: the closer a node is, the higher its weight, as shown in Figure 4.
During model training, the sampling strategy for negative example triples is also improved: replacement entities are selected according to one-to-many and many-to-one mapping relationships so that as many entities as possible are trained. When replacing entities, those most similar in semantics are selected, which improves the differentiation between entities. The score function of TransRFT is then
$$f_r(h, t) = \left\| \omega\,\mathbf{h}_{\perp} + \left(\mathbf{r} + \beta\mathbf{r}\right) - \mathbf{t}_{\perp} \right\|_{l_1/l_2} \qquad (3)$$
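Extending the sketch above, the TransRFT score can be written as follows. Since the text does not fully specify how the neighborhood weight ω and the flexible-translation coefficient β are obtained, they are treated here as given scalars; this is an assumption made purely for illustration.

```python
def transrft_score(h, r, t, w_r, omega=1.0, beta=0.1, p=1):
    """Sketch of Equation (3): || omega * h_perp + (r + beta * r) - t_perp ||.
    omega: relational-neighborhood weight (assumed precomputed per triple).
    beta:  flexible-translation relaxation coefficient (assumed hyperparameter)."""
    h_perp = project_to_hyperplane(h, w_r)   # reuses the projection sketched above
    t_perp = project_to_hyperplane(t, w_r)
    return torch.norm(omega * h_perp + (1.0 + beta) * r - t_perp, p=p, dim=-1)
```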

2.2. Model Training

The model must be trained before both the link prediction and triplet classification experiments. The conventional approach is to construct negative example triples by randomly replacing the head or tail entity of a positive example triple with an entity from the entity set, thereby labeling the originally correct triplet as incorrect. However, this simple sampling method produces some low-quality negative triples, which makes training less effective than expected and thus affects the final scores. Therefore, when the model is trained, a probabilistic technique is used to replace the head and tail entities. In this way, more entity attributes can be trained, so the model can learn more entities and relationships and obtain better parameter combinations. In addition, to enhance the model's capacity to distinguish between comparable entities, replacement entities that are similar to the original ones are chosen.
(1) Probability method to replace the head and tail entities
When creating a negative triplet, the head or tail entity is replaced depending on the nature of the relationship between the entities. Whether the relationship is one-to-many or many-to-one, both the head and tail entities have several attributes. When the relationship between entities is many-to-one, the strategy is to replace the tail entity with a higher probability, because this allows the multiple attributes of the tail entity to be better trained. When the relationship is one-to-many, the head entity is replaced with a greater probability so that its various attributes can be trained more effectively.
During model training, we denote by tph the average number of tail entities per head entity for a relation, and by hpt the average number of head entities per tail entity. The head or tail entity is then replaced according to the probability in Equation (4), so the sampling process follows a Bernoulli distribution.
$$q = \frac{tph}{tph + hpt} \qquad (4)$$
When constructing a negative triplet from a positive triplet, the head entity is replaced with probability q and the tail entity with probability 1 − q. At the same time, it is stipulated that when tph < 1.5 and hpt < 1.5, the relationship r is one-to-one; when tph > 1.5 and hpt > 1.5, the relationship r is many-to-many; when tph ≥ 1.5 and hpt < 1.5, the relationship r is one-to-many; and when tph < 1.5 and hpt ≥ 1.5, the relationship r is many-to-one.
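As a concrete illustration, a minimal sketch of this Bernoulli corruption strategy is given below. The data structures and function names are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

def relation_stats(train_triples):
    """For each relation, compute tph (average tails per head) and
    hpt (average heads per tail) from the training triples."""
    tails_per = defaultdict(set)   # (r, h) -> set of tails
    heads_per = defaultdict(set)   # (r, t) -> set of heads
    for h, r, t in train_triples:
        tails_per[(r, h)].add(t)
        heads_per[(r, t)].add(h)
    tph_sum, tph_cnt = defaultdict(int), defaultdict(int)
    hpt_sum, hpt_cnt = defaultdict(int), defaultdict(int)
    for (r, _), ts in tails_per.items():
        tph_sum[r] += len(ts); tph_cnt[r] += 1
    for (r, _), hs in heads_per.items():
        hpt_sum[r] += len(hs); hpt_cnt[r] += 1
    tph = {r: tph_sum[r] / tph_cnt[r] for r in tph_cnt}
    hpt = {r: hpt_sum[r] / hpt_cnt[r] for r in hpt_cnt}
    return tph, hpt

def corrupt(triple, entities, tph, hpt):
    """Replace the head with probability q = tph / (tph + hpt) (Equation (4)),
    otherwise replace the tail."""
    h, r, t = triple
    q = tph[r] / (tph[r] + hpt[r])
    if random.random() < q:
        return (random.choice(entities), r, t)   # corrupt the head
    return (h, r, random.choice(entities))       # corrupt the tail
```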
(2) The entities are selected based on their similarity
When judging the similarity between entities, the semantic similarity between entities or relationships is typically used, which in the vector space corresponds to the distance between the embedding vectors. The calculation formula is as follows:
$$\mathrm{dis}(\mathbf{E}, \mathbf{E}') = \sqrt{\sum_{j=1}^{k}\left(E_j - E'_j\right)^2} \qquad (5)$$
where $\mathbf{E}$ and $\mathbf{E}'$ are entity embedding vectors of dimension k. Specifically, given a positive example triplet $(h, r, t)$, when replacing the head entity to generate a negative example triplet $(h', r, t)$, $h'$ is selected so as to minimize $\mathrm{dis}(\mathbf{h}, \mathbf{h}')$; when replacing the tail entity to generate the negative example triplet $(h, r, t')$, $t'$ is selected so as to minimize $\mathrm{dis}(\mathbf{t}, \mathbf{t}')$.
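A simple sketch of this nearest-neighbor replacement is shown below, assuming all entity embeddings are stored in a single matrix; the names are illustrative.

```python
import torch

def most_similar_entity(e_idx, entity_emb):
    """Return the index of the entity whose embedding is closest (Euclidean
    distance, Equation (5)) to entity e_idx, excluding e_idx itself."""
    dists = torch.norm(entity_emb - entity_emb[e_idx], p=2, dim=-1)
    dists[e_idx] = float("inf")          # never pick the original entity
    return int(torch.argmin(dists).item())
```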
During model training, in order to distinguish correct triples from incorrect ones, the following margin-based loss function is adopted as the optimization objective:
$$L = \sum_{(h,r,t)\in S}\ \sum_{(h',r',t')\in S'} \max\left( f_r(h, t) + \gamma - f_{r'}(h', t'),\ 0 \right) \qquad (6)$$
where S is the set of all positive example triples and S′ is the set of all negative example triples, max(x, y) returns the larger of x and y, and γ is the margin between the scores of positive and negative triples.
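Putting the pieces together, a hedged sketch of one training step with this margin loss is given below; the optimizer choice and the value of γ are assumptions for illustration only.

```python
import torch

def margin_loss(pos_scores, neg_scores, gamma):
    """Margin-based ranking loss (Equation (6)): each positive triple should
    score at least gamma lower than its corresponding negative triple."""
    return torch.clamp(pos_scores + gamma - neg_scores, min=0.0).sum()

# One illustrative training step (optimizer and gamma are assumed):
# pos = transrft_score(h, r, t, w_r)
# neg = transrft_score(h_neg, r, t_neg, w_r)
# loss = margin_loss(pos, neg, gamma=4.0)
# loss.backward(); optimizer.step(); optimizer.zero_grad()
```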

3. Experimental Results and Analysis

The IDE used for the experiments was VS Code, the operating system was Ubuntu 20.04, and the programming language was Python 3.8. The CPU was an i7-6700HQ, and the PyTorch version was 1.5. Additionally, to speed up model training, a GTX 1080 GPU with 6 GB of memory was used.

3.1. Dataset

Two public datasets were selected for this experiment: WordNet and Freebase. WordNet is an English lexical database constructed by Princeton University. It contains a vast collection of English words along with their hypernyms, hyponyms, synonyms, and more. Freebase, on the other hand, is an open and semi-structured graph database developed by Google. It encompasses 39 million entities from various domains in the real world, such as art, abstract concepts, objects, and more. Specifically, we used two subsets from WordNet: WN18 and WN11, as well as two subsets from Freebase: FB15K and FB13. The details of the entity number, relationship number, training set, verification set, and test set of these datasets are shown in Table 1.

3.2. Link Prediction

In this paper, unif denotes the traditional method of replacing head and tail entities with equal probability, and bern denotes Bernoulli sampling. During training, we tune the hyperparameters to obtain the best model performance: the learning rate α is chosen from {0.01, 0.001, 0.0001, 0.00001}, the margin γ from {0.5, 1, 2, 3, 4, 5, 6}, the embedding dimension k from {50, 100, 150, 200, 250, 300}, and the batch size B from {600, 1200, 4800, 6000}. We evaluate TransRFT using two metrics: MeanRank and Hits@10. MeanRank is the average rank of the true entity among the candidate entities for each test triple; a lower value indicates more accurate predictions. Hits@10 is the proportion of test triples for which the true entity appears among the top 10 predictions (reported as a percentage); a higher value indicates better prediction capability. As is customary, "Raw" refers to link prediction without any filtering of the candidate set, and "Filter" refers to link prediction in which corrupted triples that already appear as positive triples in the dataset are removed from the ranking. The Raw metric assesses the overall prediction capability of the model, while the Filter metric provides a more accurate evaluation of the model's ability to predict unknown entity relationships. We train for 500 epochs on all triples of the WN18 and FB15K datasets. With the "unif" sampling method, the TransRFT model performs best on WN18 with α = 0.0001, γ = 5, k = 100, and B = 4800, and on FB15K with α = 0.001, γ = 4, k = 200, and B = 4800. With the "bern" sampling method, the best settings are likewise α = 0.0001, γ = 5, k = 100, and B = 4800 on WN18, and α = 0.001, γ = 4, k = 200, and B = 4800 on FB15K. The experimental results for link prediction are shown in Table 2.
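Before turning to the results, a hedged sketch of how these two metrics can be computed under the "Filter" setting is given below. It reuses transrft_score from the earlier sketch; the dictionary true_tails, which maps each (head, relation) pair to its known true tails, and the restriction to tail-side ranking are simplifying assumptions.

```python
import torch

def link_prediction_metrics(test_triples, true_tails, entity_emb, rel_emb, norm_emb):
    """Rank every entity as a candidate tail for each test triple and report
    MeanRank and Hits@10 under the 'Filter' setting (other known true tails are
    masked out). Head-side ranking, done analogously, is omitted for brevity."""
    ranks = []
    n = entity_emb.shape[0]
    for h, r, t in test_triples:
        h_e = entity_emb[h].expand(n, -1)
        r_e = rel_emb[r].expand(n, -1)
        w_r = norm_emb[r].expand(n, -1)
        scores = transrft_score(h_e, r_e, entity_emb, w_r)    # lower = more plausible
        for other in true_tails.get((h, r), set()) - {t}:     # 'Filter': drop other positives
            scores[other] = float("inf")
        ranks.append(int((scores < scores[t]).sum().item()) + 1)
    mean_rank = sum(ranks) / len(ranks)
    hits_at_10 = 100.0 * sum(rk <= 10 for rk in ranks) / len(ranks)
    return mean_rank, hits_at_10
```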
Through our experiments, we found that on the WN18 dataset, TransRFT (unif) and TransRFT (bern) achieve the best results on the Hits@10 metric, although their MeanRank is not as good as that of TransE. This is because the WN18 dataset contains fewer entities and relations, so TransE already ranks entities quite accurately there. On the FB15K dataset, TransRFT (unif) and TransRFT (bern) perform better overall than the other methods. In terms of the Hits@10 metric, TransRFT improves on TransE by 3.8% on WN18 and by 35.9% on FB15K, a significant improvement in performance.
To confirm that TransRFT handles complex relationships better, we chose the FB15K dataset, which contains more relations, and used the optimal configuration parameters for Hits@10 to measure the scores under 1-to-1, 1-to-N, N-to-1, and N-to-N relationships. From the experimental results in Table 3, it can be seen that the TransRFT model achieves the best results for both predicting the left (head) entity and predicting the right (tail) entity, outperforming the other models. In particular, TransRFT achieves 97.1% when predicting the left entity for 1-to-N relations and 96.0% when predicting the right entity for N-to-1 relations.

3.3. Triplet Classification

In the triplet classification tests, we conducted experiments on three datasets: WN11, FB13, and FB15K. On the WN11 dataset, the best configuration for the hyperparameters was found to be α = 0.001, γ = 10, k = 100, and B = 4800. On the FB13 dataset, the best configuration for the hyperparameters was found to be α = 0.001, γ = 5, k = 200, and B = 4800. On the FB15K dataset, the best configuration for the hyperparameters was found to be α = 0.001, γ = 5, k = 100, and B = 120.
Table 4 shows the results of the triplet classification evaluation. In WN11, the TransRFT model outperforms the TransE model by 12.8%; in FB13, TransRFT outperforms TransE by 8.9%; and in FB15K, TransRFT performs best, with an improvement of 12.8% over TransE. This shows that TransRFT adapts and performs well on both sparse and dense datasets.

4. Conclusions

To further enhance the performance of knowledge graph embedding models, this paper proposes a novel knowledge representation model, TransRFT, based on a relational neighborhood and flexible translation. The model improves its ability to handle rare entities by incorporating relational neighborhood information and enhances the handling of complex relationships through flexible translation. The method first projects the triplet vectors onto the relational hyperplane and then employs flexible translation to relax the strict constraint on the triplet, which strengthens the model's ability to deal with complex relationships. Relational neighborhood information is then incorporated to better utilize the data already present in the graph structure and boost the model's performance. Finally, Bernoulli sampling is used to replace the entities in the positive example triples, which improves the training effect. We conducted link prediction and triplet classification experiments on the WordNet and Freebase datasets and found that the TransRFT model outperformed the other models on both sparse and dense datasets. This indicates that our model can better handle complex knowledge graph data, thereby enhancing the practical value of knowledge graphs.

Author Contributions

Conceptualization, Y.N., C.C. and B.W.; methodology, Y.N.; software, B.W.; validation, B.W. and Z.Z.; formal analysis, Y.N. and C.C.; investigation, Y.N. and C.C.; resources, Y.N., C.C., B.W. and Z.Z.; data curation, B.W.; writing—original draft preparation, B.W.; writing—review and editing, Y.N., C.C.; visualization, Y.N. and C.C.; supervision, Y.N. and C.C.; project administration, Y.N.; funding acquisition, Y.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science Foundation of China (NSFC grant: U19bB2014).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liang, K.; Liu, Y.; Zhou, S.; Tu, W.; Wen, Y.; Yang, X.; Dong, X.; Liu, X. Knowledge Graph Contrastive Learning Based on Relation-Symmetrical Structure. IEEE Trans. Knowl. Data Eng. 2023, 1, 1–12. [Google Scholar] [CrossRef]
  2. Jin, X.; Wah, B.W.; Cheng, X.; Wang, Y. Significance and challenges of big data research. Big Data Res. 2015, 2, 59–64. [Google Scholar] [CrossRef]
  3. Xiaohan, Z. A survey on application of knowledge graph. J. Phys. Conf. Ser. 2020, 1487, 012016. [Google Scholar]
  4. Amador-Domínguez, E.; Serrano, E.; Manrique, D. GEnI: A framework for the generation of explanations and insights of knowledge graph embedding predictions. Neurocomputing 2023, 521, 199–212. [Google Scholar] [CrossRef]
  5. Singhal, A. Introducing the knowledge graph: Things, not strings. Official Google Blog 6 May 2012.
  6. Gutiérrez, C.; Sequeda, J.F. Knowledge graphs. Commun. ACM 2021, 64, 96–104. [Google Scholar] [CrossRef]
  7. Lu, R.; Jin, X.; Zhang, S.; Qiu, M.; Wu, X. A study on big knowledge and its engineering issues. IEEE Trans. Knowl. Data Eng. 2018, 31, 1630–1644. [Google Scholar] [CrossRef]
  8. Wu, X.; Chen, H.; Wu, G.; Liu, J.; Zheng, Q.; He, X.; Zhou, A.; Zhao, Z.-Q.; Wei, B.; Li, Y.; et al. Knowledge engineering with big data. IEEE Intell. Syst. 2015, 30, 46–55. [Google Scholar] [CrossRef]
  9. Liu, H.; Singh, P. ConceptNet—A practical commonsense reasoning tool-kit. BT Technol. J. 2004, 22, 211–226. [Google Scholar]
  10. Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, Banff, AB, Canada, 8–12 May 2007. [Google Scholar]
  11. Hoffart, J.; Suchanek, F.M.; Berberich, K.; Lewis-Kelham, E.; De Melo, G.; Weikum, G. YAGO2: Exploring and querying world knowledge in time, space, context, and many languages. In Proceedings of the 20th International Conference Companion on World Wide Web, Hyderabad, India, 28 March–1 April 2011. [Google Scholar]
  12. Biega, J.; Kuzey, E.; Suchanek, F.M. Inside YAGO2s: A transparent information extraction architecture. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013. [Google Scholar]
  13. Mahdisoltani, F.; Biega, J.; Suchanek, F.M. Yago3: A knowledge base from multilingual wikipedias. In Proceedings of the 7th Biennial Conference on Innovative Data Systems Research (CIDR 2015), Asilomar, CA, USA, 4–7 January 2015. [Google Scholar]
  14. Xu, B.; Xu, Y.; Liang, J.; Xie, C.; Liang, B.; Cui, W.; Xiao, Y. CN-DBpedia: A never-ending Chinese knowledge extraction system. In Proceedings of the 30th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE 2017), Arras, France, 27–30 June 2017; Springer International Publishing: Cham, Switzerland, 2017. [Google Scholar]
  15. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
  16. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014. [Google Scholar]
  17. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  18. Bordes, A.; Weston, J.; Collobert, R.; Bengio, Y. Learning structured embeddings of knowledge bases. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 7–11 August 2011; Volume 25. [Google Scholar]
  19. Socher, R.; Chen, D.; Manning, C.D.; Ng, A. Reasoning with neural tensor networks for knowledge base completion. In Proceedings of the 27th Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–8 December 2013. [Google Scholar]
  20. Shi, B.; Weninger, T. Proje: Embedding projection for knowledge graph completion. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
  21. Sun, Z.; Huang, J.; Hu, W.; Chen, M.; Guo, L.; Qu, Y. Transedge: Translating relation-contextualized embeddings for knowledge graphs. In Proceedings of the Semantic Web–ISWC 2019: 18th International Semantic Web Conference, Auckland, New Zealand, 26–30 October 2019; Proceedings Part I 18. Springer International Publishing: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  22. Yuan, J.; Gao, N.; Xiang, J. TransGate: Knowledge graph embedding with shared gate structure. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33. [Google Scholar]
  23. Cai, L.; Wang, W.Y. Kbgan: Adversarial learning for knowledge graph embeddings. arXiv 2017, arXiv:1711.04071. [Google Scholar]
  24. Bordes, A.; Glorot, X.; Weston, J.; Bengio, Y. A semantic matching energy function for learning with multi-relational data: Application to word-sense disambiguation. Mach. Learn. 2014, 94, 233–259. [Google Scholar] [CrossRef]
  25. Jenatton, R.; Roux, N.; Bordes, A.; Obozinski, G.R. A latent factor model for highly multi-relational data. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 3–8 December 2012. [Google Scholar]
  26. Zolfaghari, M.; Zhu, Y.; Gehler, P.; Brox, T. Crossclr: Cross-modal contrastive learning for multi-modal video representations. In Proceedings of the IEEE/CVF 2021 International Conference on Computer Vision, Virtual Conference, 11–17 October 2021. [Google Scholar]
  27. Liu, Y. Contrastive multimodal fusion with tupleinfonce. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual Conference, 11–17 October 2021. [Google Scholar]
  28. Liu, C.; Cheng, S.; Chen, C.; Qiao, M.; Zhang, W.; Shah, A.; Bai, W.; Arcucci, R. M-flag: Medical vision-language pre-training with frozen language models and latent space geometry optimization. arXiv 2023, arXiv:2307.08347. [Google Scholar]
  29. Wan, Z.; Liu, C.; Zhang, M.; Fu, J.; Wang, B.; Cheng, S.; Ma, L.; Quilodrán-Casas, C.; Arcucci, R. Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias. arXiv 2023, arXiv:2305.19894. [Google Scholar]
  30. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS’13), Lake Tahoe, NV, USA, 5–10 December 2013. [Google Scholar]
  31. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; Volume 28. [Google Scholar]
  32. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; Volume 29. [Google Scholar]
  33. Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; Volume 1: Long papers. [Google Scholar]
  34. Xiao, H.; Huang, M.; Hao, Y.; Zhu, X. TransA: An adaptive approach for knowledge graph embedding. arXiv 2015, arXiv:1509.05490. [Google Scholar]
  35. Zhou, X.; Niu, L.; Zhu, Q.; Zhu, X.; Liu, P.; Tan, J.; Guo, L. Knowledge graph embedding by double limit scoring loss. IEEE Trans. Knowl. Data Eng. 2021, 34, 5825–5839. [Google Scholar] [CrossRef]
  36. Zou, X.; Wang, X.; Cen, S.; Dai, G.; Liu, C. Knowledge graph embedding with self adaptive double-limited loss. Knowl. Based Syst. 2022, 252, 109310. [Google Scholar] [CrossRef]
  37. Zhu, J.-Z.; Jia, Y.-T.; Xu, J.; Qiao, J.-Z.; Cheng, X.-Q. Modeling the correlations of relations for knowledge graph embedding. J. Comput. Sci. Technol. 2018, 33, 323–334. [Google Scholar] [CrossRef]
  38. Baalbaki, H.; Hazimeh, H.; Harb, H.; Angarita, R. TransModE: Translational Knowledge Graph Embedding Using Modular Arithmetic. Procedia Comput. Sci. 2022, 207, 1154–1163. [Google Scholar] [CrossRef]
  39. Song, D.; Zhang, F.; Lu, M.; Yang, S.; Huang, H. DTransE: Distributed translating embedding for knowledge graph. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2509–2523. [Google Scholar] [CrossRef]
  40. Bordes, A.; Glorot, X.; Weston, J.; Bengio, Y. Joint learning of words and meaning representations for open-text semantic parsing. In Proceedings of the Artificial Intelligence and Statistics, Fifteenth International Conference on Artificial Intelligence and Statistics, La Palma, Canary Islands, 21–23 April 2012. [Google Scholar]
Figure 1. Example of triples. Orange and blue arrows represent the head entity and the tail entity, respectively, and the green arrow represents the relationship.
Figure 2. Example of projected triples. Orange and blue arrows represent the head entity and the tail entity, respectively, and the green arrow represents the relationship.
Figure 3. Example of triples based on flexible translation. Orange and blue arrows represent the head entity and the tail entity, respectively, and the green arrow represents the relationship.
Figure 4. Example of triples based on Trans-WTT. Orange and blue arrows represent the head entity and the tail entity, respectively, and the green arrow represents the relationship.
Table 1. The Datasets.

Dataset   Entities   Relationships   Train     Verify   Test
WN18      40,943     18              141,442   5000     5000
WN11      38,696     11              112,581   2609     10,544
FB15K     14,951     1345            483,142   50,000   59,071
FB13      75,043     13              316,232   5908     23,733
Table 2. Experimental results of link prediction.

Data Sets              WN18                                  FB15K
Metric                 MeanRank         Hits@10/%            MeanRank         Hits@10/%
                       Raw      Filt    Raw      Filt        Raw      Filt    Raw      Filt
Unstructured [40]      315      304     35.3     38.2        1074     979     4.5      6.3
SE [18]                1011     985     68.5     80.5        273      162     28.8     39.8
SME (Linear) [24]      545      533     65.1     74.1        274      154     30.7     40.8
SME (Bilinear) [24]    526      509     54.7     61.3        284      158     31.3     41.3
LFM [25]               469      456     71.4     81.6        283      164     26.0     33.1
TransE [30]            263      251     75.4     89.2        243      125     34.9     47.1
TransH [31]            401      388     73.0     82.3        212      87      45.7     64.4
TransRFT (unif)        307      290     78.3     93.0        171      24.1    53.1     83.1
TransRFT (bern)        304      288     78.4     92.8        140      37.5    53.9     80.0
Table 3. Hits@10 values of all kinds of relationships in FB15K.

Method                 Predicting Left (Hits@10)            Predicting Right (Hits@10)
                       1-to-1   1-to-N   N-to-1   N-to-N    1-to-1   1-to-N   N-to-1   N-to-N
Unstructured [40]      34.5     2.5      6.1      6.6       34.3     4.2      1.9      6.6
SE [18]                35.6     62.6     17.2     37.5      34.9     14.6     68.3     41.3
SME (Linear) [24]      35.1     53.7     19.0     40.3      32.7     14.9     61.6     43.3
SME (Bilinear) [24]    30.9     69.6     19.9     38.6      28.2     13.1     76.0     41.8
TransE [30]            43.7     65.7     18.2     47.2      43.7     19.7     66.7     50.0
TransH (unif) [31]     66.7     81.7     30.2     57.4      63.7     30.1     83.2     67.2
TransH (bern) [31]     66.8     87.6     28.7     64.5      65.5     39.8     83.3     67.2
TransRFT (unif)        92.7     96.7     65.4     76.2      92.4     71.2     95.4     78.9
TransRFT (bern)        92.4     97.1     56.1     75.4      91.8     58.7     96.0     78.6
Table 4. Triplet classification accuracy of different models.

Method                 WN11     FB13     FB15K
SE [40]                53.0     75.2     —
SME (bilinear) [24]    73.8     84.3     —
NTN [19]               70.4     87.1     66.5
TransE [30]            75.8     81.5     79.7
TransH (unif) [31]     77.7     76.5     74.2
TransH (bern) [31]     78.8     83.8     87.7
TransRFT (unif)        86.7     88.7     91.4
TransRFT (bern)        88.6     90.4     92.5
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
