Unifying Node Labels, Features, and Distances for Deep Network Completion
Abstract
1. Introduction
1.1. Background
1.2. Related Methods
1.3. Present Work
 We formalize network completion (NC) with side information as a graph refinement problem;
 We propose LFDNC, a deep graph-convolutional-network-based completion method that unifies the observed structure with node label, feature, and distance information;
 We validate LFDNC through extensive experiments on several real-world networks.
2. Methods
2.1. Problem Formulation
2.2. Overview of LFDNC
2.3. Label-Based Topology Initialization
2.4. Edge Probability Learning
2.5. Distance Pruning and Topology Refinement
Algorithm 1. LFDNC
Input: node features $X$, non-edge mask ${M}_{nonedge}$, observed graph matrix ${A}_{O}$, SBM-estimated matrix ${W}_{L}$, and topology refinement round $R$.
Output: estimated graph ${A}_{{G}_{D}}$.
 // label-based topology initialization by Equation (4)
 // topology refinement
 // node embedding by Equation (6)
 // link prediction by Equation (12)
 // distance pruning by Equation (13)
 // update ${A}_{{G}_{L}}$
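The step comments of Algorithm 1 suggest an iterative loop of roughly the following shape. The sketch below is only one possible reading of that control flow, not the authors' implementation. The four stage functions (`init_fn`, `embed_fn`, `predict_fn`, `prune_fn`) are hypothetical placeholders for Equations (4), (6), (12), and (13), which are not reproduced here, and passing the non-edge mask to the pruning stage is an assumption. The `toy_*` functions are invented stand-ins that exist only so the loop can run.

```python
from typing import Callable, List

Matrix = List[List[float]]


def lfdnc_loop(X: Matrix, M_nonedge: Matrix, A_O: Matrix, W_L: Matrix, R: int,
               init_fn: Callable, embed_fn: Callable,
               predict_fn: Callable, prune_fn: Callable) -> Matrix:
    """Control-flow skeleton only; every stage is injected by the caller."""
    A_GL = init_fn(A_O, W_L)            # label-based topology initialization
    A_GD = A_GL
    for _ in range(R):                  # topology refinement rounds
        Z = embed_fn(X, A_GL)           # node embedding
        P = predict_fn(Z)               # link prediction
        A_GD = prune_fn(P, M_nonedge)   # distance pruning (mask use is assumed)
        A_GL = A_GD                     # update the working graph
    return A_GD


# Toy stand-ins, NOT the paper's equations.
def toy_init(A_O, W_L):
    # keep observed edges and add pairs whose label-based weight is >= 0.5
    n = len(A_O)
    return [[max(A_O[i][j], 1.0 if W_L[i][j] >= 0.5 else 0.0)
             for j in range(n)] for i in range(n)]


def toy_embed(X, A):
    # each node adds its neighbours' features to its own
    n, d = len(X), len(X[0])
    return [[X[i][k] + sum(A[i][j] * X[j][k] for j in range(n))
             for k in range(d)] for i in range(n)]


def toy_predict(Z):
    # predict an edge when two distinct embeddings have a positive dot product
    n = len(Z)
    return [[1.0 if i != j and sum(a * b for a, b in zip(Z[i], Z[j])) > 0 else 0.0
             for j in range(n)] for i in range(n)]


def toy_prune(P, M_nonedge):
    # drop every predicted edge that the mask marks as a known non-edge
    n = len(P)
    return [[0.0 if M_nonedge[i][j] else P[i][j] for j in range(n)]
            for i in range(n)]


X = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
A_O = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]   # edge 1-2 observed
W_L = [[0.0] * 3 for _ in range(3)]
M_nonedge = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]               # pair 0-2 is a known non-edge
A_hat = lfdnc_loop(X, M_nonedge, A_O, W_L, 2,
                   toy_init, toy_embed, toy_predict, toy_prune)
print(A_hat)  # [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Injecting the stages as callables keeps the loop independent of any particular embedding or pruning rule, so each placeholder can be swapped for a real implementation without touching the control flow.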
 

3. Experiments
3.1. Experimental Settings
3.1.1. Datasets
3.1.2. Baselines and Evaluation Metrics
 SBM [42] uses only node labels: the symmetric $C\times C$ matrix of edge probabilities $\left[{p}_{Y\left(u\right),Y\left(v\right)}\right]$ is estimated from the observed $O$;
 KronEM [7] uses only the network graph structure and ignores node features, node labels, and node distances;
 MCDT [15] employs both the pairwise similarity of node features and the network graph structure, but ignores node labels and node distances. The similarity information is exploited linearly, through matrix factorization;
 MLPNC [48] considers node features and the network graph structure, but ignores node labels and node distances. Unlike MCDT, MLPNC directly learns a nonlinear similarity metric;
 GGCN [34] also considers node features and the network graph structure, but ignores node labels and node distances. Unlike MLPNC, GGCN adopts a generative graph convolution model.
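As a concrete illustration of the label-only SBM baseline, the following minimal sketch estimates the symmetric $C\times C$ block probability matrix from labelled node pairs. It assumes that the observed part $O$ is available as a mapping from unordered node pairs (each listed once) to a known edge or non-edge status, and it uses the plain frequency estimate: observed edges divided by observed pairs for each pair of labels. The function and variable names are hypothetical, and the estimator used in [42] may differ in detail.

```python
from typing import Dict, List, Tuple


def estimate_block_probs(labels: List[int],
                         observed: Dict[Tuple[int, int], int],
                         num_classes: int) -> List[List[float]]:
    """Frequency estimate of p[a][b] = P(edge | endpoint labels a and b)."""
    edges = [[0] * num_classes for _ in range(num_classes)]
    pairs = [[0] * num_classes for _ in range(num_classes)]
    for (u, v), is_edge in observed.items():
        a, b = labels[u], labels[v]
        pairs[a][b] += 1
        edges[a][b] += is_edge
        if a != b:  # mirror the counts so the matrix stays symmetric
            pairs[b][a] += 1
            edges[b][a] += is_edge
    # Label pairs that were never observed get 0.0 (an arbitrary fallback).
    return [[edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0
             for b in range(num_classes)] for a in range(num_classes)]


def sbm_score(u: int, v: int, labels: List[int], p: List[List[float]]) -> float:
    """Score an unobserved pair by the probability of its label pair."""
    return p[labels[u]][labels[v]]


labels = [0, 0, 1, 1]                      # Y(u) for four nodes
observed = {(0, 1): 1, (0, 2): 0, (1, 3): 1, (2, 3): 1, (0, 3): 0}
p = estimate_block_probs(labels, observed, num_classes=2)
print(p[0][0], p[1][1])                    # 1.0 1.0
print(sbm_score(1, 2, labels, p))          # one edge in three observed pairs, about 0.333
```

Because the score depends only on the two labels, every unobserved pair with the same label combination receives the same probability, which is why this baseline cannot separate pairs within a block.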
3.1.3. Implementation Details
3.2. Completion Performance
3.2.1. Comparison with State-of-the-Art Methods
3.2.2. Impact Analysis of Node Labels
3.2.3. Impact Analysis of Distance Constraints
3.2.4. Ablation Study
4. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
References
[1] Barabási, A.-L. Networks at the heart of complex systems. In Network Science; Cambridge University Press: Cambridge, UK, 2016.
[2] Newman, M.E.J. Communities, modules and large-scale structure in networks. Nat. Phys. 2012, 8, 25–31.
[3] Hanneke, S.; Xing, E.P. Network completion and survey sampling. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics, Clearwater Beach, FL, USA, 16–19 April 2009; PMLR: Clearwater Beach, FL, USA, 2009; Volume 5, pp. 209–215.
[4] Hric, D.; Peixoto, T.P.; Fortunato, S. Network structure, metadata, and the prediction of missing nodes and annotations. Phys. Rev. X 2016, 6.
[5] Newman, M.E.J. Network structure from rich but noisy data. Nat. Phys. 2018, 14, 542–545.
[6] Huisman, M.; Krause, R.W. Imputation of missing network data. In Encyclopedia of Social Network Analysis and Mining; Springer: New York, NY, USA, 2018; pp. 1044–1053.
[7] Kim, M.; Leskovec, J. The network completion problem: Inferring missing nodes and edges in networks. In Proceedings of the 2011 SIAM International Conference on Data Mining, Mesa, AZ, USA, 28–30 April 2011; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2011; pp. 47–58.
[8] Ouédraogo, F.; Magnien, C. Impact of sources and destinations on the observed properties of the internet topology. Comput. Commun. 2011, 34, 670–679.
[9] Chung, F.; Garrett, M.; Graham, R.; Shallcross, D. Distance realization problems with applications to internet tomography. J. Comput. Syst. Sci. 2001, 63, 432–448.
[10] Kannan, S.; Mathieu, C.; Zhou, H. Graph reconstruction and verification. ACM Trans. Algorithms 2018, 14, 1–30.
[11] Erlebach, T.; Hall, A.; Hoffmann, M.; Mihaľák, M. Network discovery and verification with distance queries. In Proceedings of the 2006 Conference on Algorithms and Complexity, Rome, Italy, 29–31 May 2006; pp. 69–80.
[12] Vasanthakumar, G.U.; Sunithamma, K.; Deepa Shenoy, P.; Venugopal, K.R. An overview on user profiling in online social networks. Int. J. Appl. Inf. Syst. 2017, 11, 25–42.
[13] Wei, Q.; Hu, G.; Shen, C.; Yin, Y. A fast method for shortest-path cover identification in large complex networks. Comput. Mater. Contin. 2020, 63, 705–724.
[14] Eriksson, B.; Barford, P.; Crovella, M.; Nowak, R. Learning network structure from passive measurements. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, San Diego, CA, USA, 24–26 October 2007; pp. 209–214.
[15] Forsati, R.; Barjasteh, I.; Ross, D.; Esfahanian, A.-H.; Radha, H. Network completion by leveraging similarity of nodes. Soc. Netw. Anal. Min. 2016, 6, 1–22.
[16] Tran, C.; Shin, W.-Y.; Spitz, A.; Gertz, M. DeepNC: Deep generative network completion. IEEE Trans. Pattern Anal. Mach. Intell. 2020.
[17] Cui, P.; Wang, X.; Pei, J.; Zhu, W. A survey on network embedding. IEEE Trans. Knowl. Data Eng. 2019, 31, 833–852.
[18] Bianconi, G.; Pin, P.; Marsili, M. Assessing the relevance of node features for network structure. Proc. Natl. Acad. Sci. USA 2009, 106, 11433–11438.
[19] Kim, M.; Leskovec, J. Modeling social networks with node attributes using the Multiplicative Attribute Graph model. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, Barcelona, Spain, 14–17 July 2011; AUAI Press: Arlington, VA, USA, 2011; pp. 400–409.
[20] Yang, L.; Kang, Z.; Cao, X.; Jin, D.; Yang, B.; Guo, Y. Topology optimization based graph convolutional network. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019; pp. 4054–4061.
[21] Newman, M.E.J. The structure and function of complex networks. SIAM Rev. 2003, 45, 167–256.
[22] Shi, M.; Tang, Y.; Zhu, X. Topology and content co-alignment graph convolutional learning. arXiv 2020, arXiv:2003.12806.
[23] Rafailidis, D.; Crestani, F. Network completion via joint node clustering and similarity learning. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA, 18–21 August 2016; pp. 63–68.
[24] Kaya, M.; Bilge, H.Ş. Deep metric learning: A survey. Symmetry 2019, 11, 1066.
[25] Zhang, Z.; Cui, P.; Zhu, W. Deep learning on graphs: A survey. IEEE Trans. Knowl. Data Eng. 2020.
[26] Zhou, J.; Cui, G.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81.
[27] Franceschi, L.; Niepert, M.; Pontil, M.; He, X. Learning discrete structures for graph neural networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019; pp. 1972–1982.
[28] Chen, Y.; Wu, L.; Zaki, M.J. Deep iterative and adaptive learning for graph neural networks. arXiv 2019, arXiv:1912.07832.
[29] Yu, D.; Zhang, R.; Jiang, Z.; Wu, Y.; Yang, Y. Graph-revised convolutional network. arXiv 2020, arXiv:1911.07123.
[30] Hao, Y.; Cao, X.; Fang, Y.; Xie, X.; Wang, S. Inductive link prediction for nodes having only attribute information. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan, 7–15 January 2021; pp. 1209–1215.
[31] You, J.; Ying, R.; Ren, X.; Hamilton, W.L.; Leskovec, J. GraphRNN: Generating realistic graphs with deep autoregressive models. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 5708–5717.
[32] Alderson, D.; Li, L.; Willinger, W.; Doyle, J.C. Understanding internet topology: Principles, models, and validation. IEEE/ACM Trans. Netw. 2005, 13, 1205–1218.
[33] Grover, A.; Zweig, A.; Ermon, S. Graphite: Iterative generative modeling of graphs. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, 10–15 June 2019; Volume 97, pp. 2434–2444.
[34] Xu, D.; Ruan, C.; Motwani, K.; Korpeoglu, E.; Kumar, S.; Achan, K. Generative graph convolutional network for growing graphs. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 3167–3171.
[35] Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations (ICLR), Palais des Congrès Neptune, Toulon, France, 24–26 April 2017; pp. 1–14.
[36] Lin, W.; He, F.; Zhang, F.; Cheng, X.; Cai, H. Initialization for network embedding: A graph partition approach. In Proceedings of the Thirteenth ACM International Conference on Web Search and Data Mining (WSDM 2020), Houston, TX, USA, 3–7 February 2020; pp. 367–374.
[37] Wang, H.; Leskovec, J. Unifying graph convolutional neural networks and label propagation. arXiv 2020, arXiv:2002.06755.
[38] Jia, J.; Benson, A.R. A unifying generative model for graph learning algorithms: Label propagation, graph convolutions, and combinations. arXiv 2021, arXiv:2101.07730.
[39] Chen, D.; Lin, Y.; Li, W.; Li, P.; Zhou, J.; Sun, X. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. arXiv 2019.
[40] Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017.
[41] Huang, Q.; He, H.; Singh, A.; Lim, S.-N.; Benson, A.R. Combining label propagation and simple models outperforms graph neural networks. arXiv 2020, arXiv:2010.13993.
[42] Karrer, B.; Newman, M.E.J. Stochastic blockmodels and community structure in networks. Phys. Rev. E 2011, 83, 016107.
[43] Abbe, E. Community detection and stochastic block models: Recent developments. J. Mach. Learn. Res. 2017, 18, 6446–6531.
[44] Agarap, A.F.M. Deep learning using rectified linear units (ReLU). arXiv 2018, arXiv:1803.08375.
[45] Pei, H.; Wei, B.; Chang, K.C.-C.; Lei, Y.; Yang, B. Geom-GCN: Geometric graph convolutional networks. arXiv 2020, arXiv:2002.05287.
[46] Yang, Z.; Cohen, W.W.; Salakhutdinov, R. Revisiting semi-supervised learning with graph embeddings. In Proceedings of the 33rd International Conference on Machine Learning (ICML), New York, NY, USA, 19–24 June 2016; Volume 48, pp. 40–48.
[47] Mernyei, P.; Cangea, C. WikiCS: A Wikipedia-based benchmark for graph neural networks. arXiv 2020, arXiv:2007.02901.
[48] Wei, Q. Network completion via deep metric learning. In Proceedings of the International Conference on Big Data Mining and Information Processing (BDMIP), Qingdao, China, 24–26 July 2020; Volume 1656, p. 012026.
[49] Leskovec, J.; Sosič, R. SNAP: A general-purpose network analysis and graph-mining library. ACM Trans. Intell. Syst. Technol. 2016, 8, 1–20.
Dataset    Nodes   Edges    Classes  Features
---------  ------  -------  -------  --------
Actor      7600    33,544   5        931
Cornell    183     295      5        1703
Texas      183     309      5        1703
Wisconsin  251     499      5        1703
Cora       2708    5429     7        1433
Citeseer   3327    4732     6        3703
PubMed     19,717  44,338   3        500
WikiCS     11,701  216,123  10       300
Completion performance in AUC (%).
Method  Actor        Cornell      Texas        Wisconsin    Cora         Citeseer     PubMed       WikiCS
------  -----------  -----------  -----------  -----------  -----------  -----------  -----------  -----------
SBM     49.7 ± 0.6   45.1 ± 5.8   65.2 ± 8.0   52.7 ± 7.0   88.9 ± 1.5   80.7 ± 2.0   76.1 ± 1.6   83.1 ± 0.8
KronEM  54.0 ± 1.2   52.0 ± 12.3  59.3 ± 9.7   51.5 ± 9.2   50.9 ± 1.4   49.3 ± 2.1   57.6 ± 1.7   62.2 ± 2.9
MCDT    50.6 ± 0.7   48.9 ± 9.4   58.8 ± 10.5  72.7 ± 2.9   91.6 ± 1.6   87.7 ± 1.6   89.4 ± 0.8   92.3 ± 0.9
MLPNC   51.6 ± 1.1   43.1 ± 9.1   41.8 ± 9.4   66.7 ± 8.0   90.1 ± 0.9   85.6 ± 1.6   88.6 ± 1.2   92.3 ± 1.2
GGCN    50.4 ± 0.7   53.8 ± 4.5   38.9 ± 7.8   60.0 ± 6.0   93.2 ± 0.2   88.7 ± 0.2   88.7 ± 0.2   90.7 ± 0.5
LFDNC   72.1 ± 1.0   85.7 ± 2.3   88.4 ± 2.8   89.4 ± 4.2   97.1 ± 0.6   96.0 ± 0.7   93.7 ± 0.9   92.5 ± 0.9
Completion performance in AP (%).
Method  Actor        Cornell      Texas        Wisconsin    Cora         Citeseer     PubMed       WikiCS
------  -----------  -----------  -----------  -----------  -----------  -----------  -----------  -----------
SBM     49.9 ± 0.5   49.4 ± 3.9   68.7 ± 8.3   56.7 ± 4.5   85.8 ± 2.1   79.2 ± 1.8   71.3 ± 1.7   81.6 ± 0.9
KronEM  53.0 ± 1.3   56.0 ± 11.4  59.9 ± 7.6   55.0 ± 7.9   51.5 ± 1.4   50.0 ± 1.6   55.3 ± 1.6   59.9 ± 2.5
MCDT    51.1 ± 0.7   54.6 ± 7.3   57.3 ± 7.3   73.4 ± 4.0   89.0 ± 2.3   86.3 ± 1.9   88.3 ± 0.8   91.5 ± 1.1
MLPNC   52.2 ± 1.2   52.6 ± 8.9   47.5 ± 6.2   67.1 ± 8.1   87.3 ± 1.3   81.5 ± 2.3   86.5 ± 1.4   92.0 ± 1.3
GGCN    51.4 ± 0.9   60.8 ± 5.4   46.2 ± 4.8   59.6 ± 5.8   91.4 ± 0.2   86.2 ± 0.2   86.2 ± 0.2   90.6 ± 0.5
LFDNC   66.7 ± 1.4   83.3 ± 3.5   84.8 ± 5.4   87.5 ± 6.4   96.6 ± 0.7   94.6 ± 1.2   92.3 ± 1.2   92.3 ± 1.0
LTI  EPL  DP  TF  AUC on Actor  AP on Actor  AUC on Texas  AP on Texas 

✓  ✓  ✓  ✓  72.1 ± 1.0  66.7 ± 1.4  88.4 ± 2.8  84.8 ± 5.4 
✓  ✓  ✓  66.4 ± 1.7  60.1 ± 1.6  86.7 ± 6.7  81.3 ± 8.9  
✓  ✓  ✓  70.7 ± 1.4  65.4 ± 2.1  79.1 ± 8.0  74.1 ± 10.8  
✓  ✓  ✓  60.4 ± 0.8  57.3 ± 0.9  50.9 ± 3.3  58.3 ± 6.5 
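The table headers above report AUC and AP. For readers less familiar with these ranking metrics, the sketch below computes both from scratch for a list of candidate node pairs, assuming binary ground-truth labels (1 for a true edge, 0 otherwise) and real-valued predicted scores. It is a generic illustration with hypothetical function names, not the evaluation script behind the reported numbers. Both functions assume at least one positive pair (the AUC routine also needs one negative), and the AP routine assumes no tied scores.

```python
from typing import List


def auc(y_true: List[int], scores: List[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank      # precision at this cut-off
    return total / hits


y_true = [1, 0, 1, 0]                 # two true edges, two non-edges
scores = [0.9, 0.8, 0.7, 0.1]         # predicted edge probabilities
print(auc(y_true, scores))                # 0.75
print(average_precision(y_true, scores))  # (1/1 + 2/3) / 2, about 0.833
```

Multiplying either value by 100 gives a percentage on the scale used in the tables.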
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wei, Q.; Hu, G. Unifying Node Labels, Features, and Distances for Deep Network Completion. Entropy 2021, 23, 771. https://doi.org/10.3390/e23060771