
Recommendation method for fusion of knowledge graph convolutional network

Abstract

In Internet of Vehicles (IOV) applications, it is particularly important to obtain real-time, reliable vehicle information and to provide personalized functional services for vehicle operation. This paper combines knowledge graph technology with a convolutional network and presents a new algorithm model: when computing the representation of a given entity in the knowledge graph, the information of neighboring entities is incorporated together with a bias term. By fusing neighbor-entity information, the local neighborhood structure around each entity is better captured and preserved; and because the weight of each neighbor depends both on the relation connecting it and on the specific user, the model better reflects the user's personalized interests and more fully characterizes the entity. Compared with the traditional collaborative filtering SVD model, the proposed model improves both accuracy and F1 score.

Introduction

The Internet of Vehicles (IOV) is an integral part of the Internet of Things [1]. On-board equipment can obtain vehicle information from an information network platform through wireless communication technology, enabling recommendation services and real-time monitoring of the vehicle. A recommendation algorithm can then provide personalized functional services for vehicle operation [2, 3]. The key idea of the KGCN model is that, when computing the feature representation of a given entity in the knowledge graph, it first performs a breadth-first search to obtain the entity's multi-hop associated entities [4], and then aggregates the information of its neighbor nodes with learned weights to enrich the entity's features. This approach has two main implications [5, 6]: first, the feature vector of an entity is computed by a weighted aggregation of the information of neighboring entities within a certain range; second, the degree to which a neighbor's information is selected is jointly determined by the given entity and its neighbors. The method not only effectively exploits the semantic information between entities [7], but also characterizes the user's own interests. In extreme cases the number of neighbor nodes varies greatly from entity to entity and may be very large. For this reason, borrowing the idea of the graph convolutional network, the knowledge graph convolutional network defines the concept of a receptive field [7]: a neighbor set of fixed size. In this way the computational cost of the knowledge graph convolutional network remains controllable, which greatly improves the scalability of the algorithm and makes it possible to access vehicle information effectively and in real time.

Methods

KGCN layer

The prediction function for judging whether a user is interested in vehicle information in the IOV system is:

$$\hat{y}_{uv} = F\left( {u,v|\theta ,Y,G} \right)$$
(1)

where G is the knowledge graph composed of entity-relation-entity triples, Y is the sparse user-vehicle interaction matrix, u denotes a user, v denotes a vehicle, and \(\theta\) denotes the set of learnable parameters of the whole model.

Measure users’ preference for different vehicle information relationships

$$\pi_{r}^{u} = g\left( {r,u} \right)$$
(2)

where r denotes a relation and \(\pi_r^u\) denotes its weight for user u. The function g: \(R^d \times R^d \to R\) is a scoring function that computes a score between the user and a given relation, where d is the dimension of the knowledge graph feature representations.

Intuitively, Eq. (2) computes the user's preference for different relations. For example, if a user pays close attention to the ride-hailing service information of vehicles, he will pay more attention to the number of completed services, user ratings, and similar information, and will prefer high-quality vehicles based on these judgments. The function g is therefore used to measure the user's preference for different vehicle information relationships.
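As a concrete sketch of this scoring step, the snippet below assumes g is a plain inner product (one simple, common choice; the embeddings and relation names are purely illustrative):

```python
import numpy as np

def relation_score(user_vec: np.ndarray, rel_vec: np.ndarray) -> float:
    """Score g(r, u): the user's affinity for a relation type.

    Here g is an inner product R^d x R^d -> R; any differentiable
    scoring function could be substituted.
    """
    return float(np.dot(user_vec, rel_vec))

# Illustrative embeddings: this user cares about service-count info
u = np.array([0.9, 0.1, 0.0])
r_service = np.array([1.0, 0.0, 0.0])  # "number of services" relation
r_color = np.array([0.0, 0.0, 1.0])    # "vehicle color" relation
```

With these toy vectors, `relation_score(u, r_service)` exceeds `relation_score(u, r_color)`, mirroring the ride-hailing example above.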

Use a linear combination of neighbor information to describe the neighborhood of a node

$$v_{N\left( v \right)}^{u} = \sum\nolimits_{e \in N\left( v \right)} {\pi_{r_{v,e}}^{u} e}$$
(3)

where N(v) is the set of all entities directly connected to entity v (i.e., within one hop). Note that the weight π here depends not only on the relation between nodes v and e, but also on the features of the corresponding user u. The weight is obtained by scoring, with g, every entity e and relation r in N(v) with respect to v, and then normalizing the scores as in formula 4.

$$\pi_{r_{v,e}}^{u} = \frac{\exp \left( g\left( r_{v,e}, u \right) \right)}{\sum\nolimits_{e' \in N\left( v \right)} \exp \left( g\left( r_{v,e'}, u \right) \right)}$$
(4)

where e denotes the feature vector of a neighboring entity in N(v).

When computing the representation of a given entity's neighborhood, the user-relation scores act like personalized filters: the neighbor feature vectors are weighted according to these user-specific scores.
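Equations (3) and (4) together amount to a softmax-weighted sum of neighbor embeddings. A minimal sketch, again assuming g is an inner product (function and variable names are illustrative):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    x = x - np.max(x)  # shift for numerical stability
    e = np.exp(x)
    return e / e.sum()

def neighborhood_repr(user_vec, rel_vecs, nbr_vecs):
    """Compute v_N(v)^u: softmax-normalize the user-relation scores
    (Eq. 4), then take the weighted sum of neighbor embeddings (Eq. 3)."""
    scores = np.array([np.dot(user_vec, r) for r in rel_vecs])  # g(r, u)
    weights = softmax(scores)
    return (weights[:, None] * np.asarray(nbr_vecs)).sum(axis=0)

# Toy example: the user aligns with the first relation, so the first
# neighbor dominates the neighborhood representation
u = np.array([1.0, 0.0])
rels = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
nbrs = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
v_nbh = neighborhood_repr(u, rels, nbrs)
```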

Control the number of sampled neighbors

$$S\left( v \right) \to \left\{ {e|e \in N\left( v \right)} \right\}$$
(5)
$$\left| {S\left( v \right)} \right| = K$$
(6)

where N(v) is the set of all entities directly connected to entity v (i.e., within one hop), S(v) is the (single-layer) receptive field of the entity, e denotes an entity's features, and K is a hyperparameter specifying the number of sampled neighbors.

In a real knowledge graph [7], a node v may have far too many neighbors, which puts great pressure on the computation of the overall model. The model therefore does not use all neighbors; instead, a hyperparameter K is defined and, for each node v, a fixed-size set of K neighbors is sampled uniformly at random for the computation. The neighborhood representation of v is then written as \(v_{S\left( v \right)}^{u}\). In KGCN, S(v) is also called the (single-layer) receptive field of v, because the final feature of v is sensitive to these regions.
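The fixed-size sampling can be sketched as follows (a minimal helper under the stated uniform-sampling assumption; the function name is illustrative):

```python
import random

def sample_receptive_field(neighbors, K, seed=None):
    """Return exactly K uniformly sampled neighbors: without
    replacement when the node has at least K neighbors, with
    replacement otherwise, so |S(v)| = K always holds."""
    rng = random.Random(seed)
    if len(neighbors) >= K:
        return rng.sample(neighbors, K)
    return [rng.choice(neighbors) for _ in range(K)]
```

Sampling with replacement for under-connected nodes keeps every receptive field the same size, which is what makes the computation cost controllable.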

Define aggregator

The key step in the KGCN model is fusing the feature vector v of an entity with its neighborhood feature \(v_{S\left( v \right)}^{u}\). Three types of aggregators are defined in KGCN; in this experiment, the sum aggregator is used for the comparison experiments.

Sum aggregator The sum aggregator adds the two feature vectors element-wise and then applies a nonlinear transformation.

$${\text{agg}}_{{\text{sum}}} = \sigma \left( {W \cdot \left( {v + v_{S(v)}^{u} } \right) + b} \right)$$
(7)

In the formula, W is the transformation weight matrix, b is a bias term, and S(v) is the entity's (single-layer) receptive field.

Concatenation aggregator The concatenation aggregator concatenates the two feature vectors and then applies a nonlinear transformation.

$${\text{agg}}_{{\text{concat}}} = \sigma \left( {W \cdot {\text{concat}}\left( {v,v_{S(v)}^{u} } \right) + b} \right)$$
(8)

Neighbor aggregator The neighbor aggregator uses only the neighborhood features of entity v, directly replacing the representation of node v with its neighborhood representation.

$${\text{agg}}_{{\text{neighbor}}} = \sigma \left( {W \cdot v_{S(v)}^{u} + b} \right)$$
(9)
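The three aggregators of Eqs. (7)-(9) can be sketched in a few lines of NumPy (assuming, for illustration, a ReLU nonlinearity for σ; the weight shapes are the only constraint that differs between them):

```python
import numpy as np

def sigma(x: np.ndarray) -> np.ndarray:
    """Nonlinear transformation; ReLU is assumed here for illustration."""
    return np.maximum(x, 0.0)

def agg_sum(W, b, v, v_nbr):
    """Eq. (7): element-wise sum, then transform. W maps R^d -> R^d."""
    return sigma(W @ (v + v_nbr) + b)

def agg_concat(W, b, v, v_nbr):
    """Eq. (8): concatenate, then transform. W maps R^{2d} -> R^d."""
    return sigma(W @ np.concatenate([v, v_nbr]) + b)

def agg_neighbor(W, b, v, v_nbr):
    """Eq. (9): drop v's own features, use only the neighborhood."""
    return sigma(W @ v_nbr + b)
```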

Algorithm

For a given entity, the final feature vector clearly depends on its directly connected neighbors; this is called the first-order entity feature.

To further exploit the relationships between entities in the knowledge graph and uncover other potential interests of users, the neighborhood set is expanded from one layer to multiple layers: the feature of a single entity is propagated to its adjacent one-hop neighborhood to obtain the first-order feature vector, as shown in Fig. 1, and repeating this process yields higher orders. In other words, the h-order feature of a single entity aggregates the neighbor feature representations within h hops of itself. Different aggregators can be used to collect the neighborhood information.

Fig. 1 Neighborhood distribution of K = 2
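The layered expansion described above can be sketched as follows (assuming an adjacency-list graph and a pluggable sampler; in practice a uniform random sampler would be used, and all names here are illustrative):

```python
def receptive_field(v, adj, H, K, sampler):
    """Collect the layered entity sets M[0..H] around item v:
    M[H] = {v}, and M[h] adds K sampled neighbors of every entity
    in M[h+1], for h = H-1, ..., 0."""
    M = {H: {v}}
    for h in range(H - 1, -1, -1):
        layer = set(M[h + 1])
        for e in M[h + 1]:
            layer.update(sampler(adj[e], K))
        M[h] = layer
    return M

# Toy graph; a deterministic sampler keeps the example reproducible
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
M = receptive_field(0, adj, H=2, K=2, sampler=lambda nbrs, k: nbrs[:k])
```

Each outer layer M[h] is a superset of M[h+1], so the aggregation at iteration h always has the features it needs from iteration h+1.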

H denotes the maximum depth of the receptive field. For a given user-vehicle entity pair, the receptive field of the item is first obtained iteratively, and the aggregation is then repeated H times. At iteration h, the neighborhood features of each entity \(e \in M\left[ h \right]\) are computed and aggregated with the entity's own features \(e^{u} \left[ {h - 1} \right]\) to obtain the feature representation of v for the next iteration, as shown in Fig. 2. Finally, the H-order item representation \(v^{u}\) and the user feature u are passed into a function \(f: R^{d} \times R^{d} \to R\) to obtain the predicted probability.

$$\hat{y}_{uv} = f\left( {u,v^{u} } \right)$$
(10)

where \(v^{u}\) denotes the H-order entity feature. Since the algorithm would otherwise traverse all possible user-item pairs, negative sampling is used during training to make it more efficient. The loss function of the KGCN model is therefore:

$$\Gamma = - \sum\nolimits_{u \in U} {\left( {\sum\nolimits_{v:y_{uv} = 1} {\ell \left( {y_{uv} ,\hat{y}_{uv} } \right)} - \sum\nolimits_{i = 1}^{T^{u}} {E_{v_{i} \sim P(v_{i} )} \ell \left( {y_{uv_{i}} ,\hat{y}_{uv_{i}} } \right)} } \right)} + \lambda \left\| F \right\|_{2}^{2}$$
(11)
Fig. 2 Self-vector representation at iteration h

In the formula, \(\ell\) is the cross-entropy loss function, P is the distribution of negative samples (here a uniform distribution), and \(T^{u}\) is the number of negative samples for user u.

$$T^{u} = \left| {\left\{ {v:y_{uv} = 1} \right\}} \right|$$
(12)
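One common reading of this objective is binary cross-entropy over the observed pairs plus \(T^u\) sampled negatives per user (labeled 0), with an L2 penalty on the parameters; the sketch below assumes that reading, and all names are illustrative:

```python
import numpy as np

def bce(y, y_hat, eps=1e-9):
    """Cross-entropy loss l(y, y_hat) for a single user-item pair."""
    return -(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))

def kgcn_loss(pos_scores, neg_scores, params, lam):
    """Objective for one user: cross-entropy over observed pairs
    (label 1) and sampled negatives (label 0), plus lam * ||F||^2."""
    loss = sum(bce(1.0, s) for s in pos_scores)   # v : y_uv = 1
    loss += sum(bce(0.0, s) for s in neg_scores)  # T^u sampled negatives
    loss += lam * sum(float(np.sum(p ** 2)) for p in params)
    return loss
```

A model that scores positives near 1 and negatives near 0 yields a smaller loss than an uninformative one, which is what gradient descent on this objective encourages.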

The code of the KGCN algorithm is shown in Table 1.

Table 1 KGCN algorithm code

Results and discussion

Results

This experiment uses a medical database as the training data. The specific data are shown in Table 2.

Table 2 Training data

The experimental data and various parameters are shown in Table 3.

Table 3 Experimental parameters

In this experiment, a classic latent factor model based on collaborative filtering, singular value decomposition (SVD), is compared against the fused knowledge graph convolutional network KGCN [8, 9]; the results are shown in Table 4.

Table 4 Comparison of data and experimental results

The F1 and accuracy line graphs for the training, validation, and test sets are shown in Figs. 3 and 4.

Fig. 3 F1 line chart of the training, validation, and test sets

Fig. 4 Accuracy line chart of the training, validation, and test sets

Discussion

The traditional collaborative filtering algorithm SVD lacks the auxiliary information of the knowledge graph, and its experimental performance is worse than that of KGCN. In reality, the attributes of users and vehicles do not exist in isolation but are related to each other, forming a knowledge graph. KGCN can therefore represent these relationships better and outperforms SVD [9]. Because a multi-hop neighborhood structure is used, the results show that the neighborhood information captured by the knowledge graph is helpful for recommendation.

First, consider the influence of the neighborhood sample size: the performance of the algorithm is analyzed by varying the number of sampled neighbors, as shown in Table 5. The best performance is achieved when the sample size K is 8. A K that is too small lacks the capacity to contain enough neighbor information, while a large K is easily affected by noise [10].

Table 5 The influence of increasing K value on accuracy

Second, consider the influence of the receptive field depth on the experimental results. Varying the number of layers H from 1 to 4 shows a clear effect on KGCN's performance; the results are given in Table 6. Comparing the data, the model is more sensitive to H than to the neighborhood sample size K. When H = 4, the accuracy drops significantly, because too many receptive field layers harm the experimental results. This is consistent with common-sense judgment: when the relation chains in the knowledge graph become long, the factors at the far end of a chain are meaningless for judging vehicle similarity. In other words, H = 1 or 2 is sufficient for real-life scenarios.

Table 6 Influence of increasing H value on accuracy

In the IOV system, extracting physical vehicle information demands a large amount of training data, and high-quality labeled data carries a high labor cost. Unsupervised learning will therefore be considered in subsequent studies to reduce the manual effort as much as possible. For the recommendation technology itself, since the algorithm is built on knowledge graph construction, the construction workload is also relatively large.

Conclusions

This paper presents an in-depth study of the knowledge graph-based recommendation technique KGCN. The method aggregates the feature vectors of the entities adjacent to a given entity [11, 12], computes the entity's neighborhood features by weighting, and controls the computational cost of the algorithm by fixing the size of the neighborhood set used as the receptive field. Comparison with the traditional collaborative filtering algorithm demonstrates the superiority of the algorithm and its suitability for knowledge graph-based recommendation in IOV applications. The proposed algorithm better obtains real-time, reliable vehicle information and provides personalized functions for vehicle operation, delivering better services.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

IOV:

Internet of vehicles

KGCN:

Knowledge graph convolutional network

SVD:

Singular value decomposition

References

  1. P. Zeng, Z. Chen, Y. Ma, G. Zhao, Design of IOV privacy protection authentication scheme based on blockchain. Comput. Appl. Res. 38(10), 2919–2925 (2021)
  2. X. Liu, X. Zhang, M. Jia et al., 5G-based green broadband communication system design with simultaneous wireless information and power transfer. Phys. Commun. 28, 130–137 (2018)
  3. Y. Xue, J. Jin, A. Song, Y. Zhang, Y. Liu, K. Wang, Relation-based multi-type aware knowledge graph embedding. Neurocomputing 456, 11–22 (2021)
  4. T. Phan, P. Do, Building a Vietnamese question answering system based on knowledge graph and distributed CNN. Neural Comput. Appl. 33, 14887–14907 (2021)
  5. L. Sang, M. Xu, S. Qian, X. Wu, Knowledge graph enhanced neural collaborative filtering with residual recurrent network. Neurocomputing 454, 417–429 (2021)
  6. H. Werneck, N. Silva, M. Viana, A.C. Pereira, F. Mourão, L. Rocha, Points of interest recommendations: methods, evaluation, and future directions. Inform. Syst. 101, 101789 (2021)
  7. R. Kojima, S. Ishida, M. Ohta, H. Iwata, T. Honma, Y. Okuno, kGCN: a graph-based deep learning framework for chemical structures. J. Cheminform. 12(1), 1–10 (2020)
  8. X. Wang, X. Liu, J. Liu, X. Chen, H. Wu, A novel knowledge graph embedding based API recommendation method for Mashup development. World Wide Web 24(3), 869–894 (2021)
  9. Nanjing University of Science and Technology, Kg2rec: LSH-CF recommendation method based on knowledge graph for cloud services. Comput. Technol. J. (2020)
  10. X. Liu, X. Zhang, Rate and energy efficiency improvements for 5G-based IoT with simultaneous transfer. IEEE Internet Things J. 6(4), 5971–5980 (2019)
  11. J. Ren, J. Long, Z. Xu, Financial news recommendation based on graph embeddings. Decis. Support Syst. 125, 113115 (2019)
  12. C. Ma, B. Zhang, A new query recommendation method supporting exploratory search based on search goal shift graphs. IEEE Trans. Knowl. Data Eng. 30(11), 2024–2036 (2018)


Acknowledgements

The authors thank the anonymous reviewers and editors for their valuable comments and suggestions.

Funding

This work was supported in part by the Heilongjiang Province Natural Science Fund Project (No. F2020011) and Heilongjiang Provincial Postdoctoral Science Foundation (No. LBH-Z16054).

Author information

Authors and Affiliations

Authors

Contributions

Xiaolin Jiang proposed the innovative ideas and theoretical analysis. The other authors also contributed jointly to the manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Changchun Dong.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Jiang, X., Fu, Y. & Dong, C. Recommendation method for fusion of knowledge graph convolutional network. EURASIP J. Adv. Signal Process. 2022, 27 (2022). https://doi.org/10.1186/s13634-022-00854-7


Keywords

  • Knowledge graph
  • Recommendation technology
  • Convolutional network