A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems
Abstract
1. Introduction
2. State of the Art
2.1. Bias in Machine Learning
2.2. Bias in GNNs
2.3. Bias in RSs
2.4. Bias in GNN-Based RSs
3. Experimental Study
3.1. Benchmark Datasets
3.1.1. MovieLens 100K [57]
3.1.2. LastFM [59]
3.2. Recommendation Methods
 1.
 Collaborative Filtering (CF): Collaborative (or social) filtering techniques are based on user preferences and use the ratings given by users to items to compute user or item similarity. If two users rate items similarly, they are likely to rate new items similarly as well; therefore, the target user is recommended items that were well rated by users with similar tastes. Another CF strategy is to recommend items similar to those the user has consumed or rated positively, where the similarity between items is calculated from the ratings they received from users. In CF approaches, user ratings on items are collected into a user–item rating matrix, which is then used to find similarities between users or items. CF approaches can tackle some of the problems of content-based approaches, which also recommend items similar to those consumed or well rated by the user but compute this similarity from item features. A drawback of content-based approaches, avoided by CF, is that item features may be unavailable or difficult to obtain, whereas CF recommendations rely only on the feedback of other users. Moreover, the quality of these techniques tends to be higher because they are based on items evaluated by users instead of relying on content, whose quality can be low. Unlike content-based systems, CF approaches can recommend items whose content differs from that of previously consumed items, as long as other users have already shown interest in them. CF techniques use different approaches, including:
 User-based: these systems assess the preference of a target user for an item using the ratings given to this item by his/her neighbors, i.e., users that have a similar rating behavior [4].
 Item-based: these approaches predict the rating of a user for an item from the ratings given by this user to similar items. In such approaches, two items are similar if they have received similar ratings from several users in the system. This differs from content-based methods, which base item similarity on characteristics or attributes. This approach is more convenient in common commercial recommender systems, where the number of users is much higher than the number of items in the catalog. Usually, item-based approaches are more reliable, require less computation time, and do not need to be updated as frequently [4].
Figure 10 shows the differences between user-based and item-based approaches. The CF approaches used are as follows: ItemKNN: This method is an item-based approach that computes the similarity between items based on the ratings that users give them. The main motivation behind this method is that customers are more prone to purchase items that are compatible with their previous purchases. Historical purchase information in the user–item matrix can be used to recognize sets of similar items and build the top-K recommendations from them. At a high level, this algorithm consists of two major components: the first builds a model capturing the relations among items, and the second applies this model to obtain the top-K recommendations for a user. This method also shows high performance in comparison to other similar CF approaches [60,61,62].
 Neural Collaborative Filtering model with interaction-based Neighborhood (NNCF): This model uses deep learning to capture the complicated interactions between users and items. It also uses neighborhood information to complement the user–item interaction data, thereby improving the model's performance. NNCF models can overcome the limitations of traditional algorithms, such as simple linear factorization, which may not fully capture the complex interactions among users and items. This method can also provide user/item embeddings of good quality [63,64,65].
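As an illustration of the item-based idea behind ItemKNN, the sketch below computes item–item cosine similarities from a toy user–item rating matrix, keeps only the k most similar neighbors per item, and scores unseen items for one user. The function name, parameters, and dense-matrix representation are illustrative assumptions, not taken from the implementation evaluated in this study.

```python
import numpy as np

def item_knn_recommend(ratings, user, k_items=10, top_k=5):
    """Item-based KNN over a dense user-item rating matrix (toy sketch).

    `ratings` is assumed to be a (n_users, n_items) array with 0 for
    missing ratings; real systems use sparse matrices instead.
    """
    # Cosine similarity between item columns.
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0                      # avoid division by zero
    sim = (ratings.T @ ratings) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)                   # ignore self-similarity

    # Keep only the k most similar neighbours per item (the "model").
    for j in range(sim.shape[1]):
        drop = np.argsort(sim[:, j])[:-k_items]
        sim[drop, j] = 0.0

    # Predicted score = similarity-weighted sum of the user's own ratings.
    scores = ratings[user] @ sim
    scores[ratings[user] > 0] = -np.inf          # exclude already-rated items
    return np.argsort(scores)[::-1][:top_k]
```

The two loops mirror the two components described above: the pruned similarity matrix is the model, and the scoring step applies it to produce the top-K list.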
 2.
 Matrix factorization: Matrix factorization (MF) encompasses a group of model-based techniques in which the rating matrix is decomposed into two matrices of latent factors representing users and items, respectively, in an attempt to tackle the sparsity of the rating matrix. This is a low-dimensional factor model in which the inner products between the user and item latent factors are assumed to determine the preferences of the user for an item [66]. Currently, MF has become one of the most popular methods for implementing RSs [67]. The MF approaches used are as follows:
 Deep Matrix Factorization (DMF): This method uses a neural network architecture. It constructs the user–item matrix from explicit ratings and non-preference implicit feedback; this matrix is then used as the input for learning a common low-dimensional space with a deep structure learning architecture. The method also uses a novel loss function based on binary cross-entropy, which considers both explicit ratings and implicit feedback to enhance optimization. DMF provides better top-K recommendations than traditional models by exploiting implicit feedback, reconstructing the users' ratings by learning hidden structures from explicit historical ratings. It also supports two-channel structures, which can combine side information from both users and items. Some articles also indicate that this approach can outperform recent recommendation algorithms with respect to accuracy and training efficiency [68,69,70].
 Neural Collaborative Filtering (NeuMF): Since the most important factor in CF models is the interaction between user and item features, the inner product in these methods can be replaced by a neural network architecture. Neural network-based Collaborative Filtering (NCF) is a framework that expresses and generalizes matrix factorization and can be enhanced with non-linear kernels. To achieve this, a multilayer perceptron is used to learn the user–item interaction function [71]. The capacity and non-linearity of deep neural networks are the main reasons for their good performance. Furthermore, the general NCF framework used in NeuMF makes it possible to combine various models [67].
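To make the latent-factor idea concrete, here is a minimal plain matrix factorization trained by SGD on the observed entries of a rating matrix; DMF and NeuMF replace the inner product below with neural layers. All hyperparameters and names are illustrative assumptions.

```python
import numpy as np

def factorize(R, n_factors=8, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Plain matrix factorization by SGD on observed (non-zero) entries.

    Sketch of the latent-factor model: rating(u, i) is approximated by the
    inner product P[u] @ Q[i] of low-dimensional user and item factors.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, n_factors))  # user factors
    Q = 0.1 * rng.standard_normal((n_items, n_factors))  # item factors
    users, items = np.nonzero(R)                         # observed ratings only
    for _ in range(epochs):
        for u, i in zip(users, items):
            err = R[u, i] - P[u] @ Q[i]                  # prediction error
            P[u] += lr * (err * Q[i] - reg * P[u])       # regularized SGD step
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q
```

Training only on observed entries is what lets MF cope with the sparsity mentioned above: missing ratings are predicted afterwards from the learned factors.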
 3.
 GNN-based: One of the fastest-growing technologies, with great capability shown in recent years, is Graph Learning (GL) [14], i.e., machine learning applied to graph-structured data. Exploiting this ability to learn from relational data, Graph Learning-based Recommender Systems (GLRSs) have been proposed [6]. In reality, most objects are explicitly or implicitly connected with each other, and these relations can be represented by graphs. In RSs, where the objects include users, items, attributes, and context, this characteristic is even clearer: these objects are strongly connected with each other and affect each other via different relations. The quality of RSs can be remarkably increased by using graph techniques, since graph learning has a great ability to learn complex relations and a high potential for extracting the knowledge enclosed in a variety of graphs [72]. There are different types of entities in RSs, including users, items, and attributes, which maintain different types of relationships with each other and can therefore be represented by graphs of diverse types. The three main objects used in recommender models are the user, the item, and the user–item interaction, although other information concerning users and/or items may also be used. On this basis, data used in RSs can be classified into two broad categories: user–item interaction data (clicks, purchases, or ratings made by the users on the items) and side information data (user and item attributes). In addition, interaction data can be classified into two categories depending on whether the interactions are sequential or general [72]. Each class is further divided into various subclasses, as can be seen in Table 3. Each entry in the user–item matrix describes the type of interaction that happened between a user and an item. Interaction data can be divided into two types: explicit and implicit.
Explicit interaction happens when a user is asked to provide an opinion on an item (e.g., users' ratings on items). Implicit interaction is inferred from the user's actions (e.g., clicks, views) [72,73]. The GNN methods used are the following:
 LightGCN: This model is a simplified version of a Graph Convolution Network (GCN) that keeps only the components of GCNs essential for recommendation tasks. LightGCN linearly propagates the user and item embeddings on the user–item interaction graph and then uses the weighted sum of the embeddings learned at all layers as the final embedding [74]. The symmetric normalization in LightGCN is the same as in the standard GCN and controls the growth of the embedding scale across graph convolution operations. This method has also shown strong performance in comparison to conventional approaches [75,76].
 Neural Graph Collaborative Filtering (NGCF): This model, another method chosen for this investigation, introduces a graph structure into user–item interactions. It exploits the user–item graph structure by generating embeddings on it, capturing high-order connectivity in the user–item graph, and injects the collaborative signal into the embedding process in an explicit way [72]. Moreover, this method uses multiple embedding propagation layers, whose concatenated outputs produce the final prediction for the recommendation task. NGCF also shows great performance with respect to model optimization [77].
 Self-supervised Graph Learning (SGL): This method improves GCN models with respect to accuracy and robustness, and it also performs better on interactions with noise. It augments the classical supervised recommendation task with an auxiliary self-supervised task that reinforces node representation learning via self-discrimination. This structure generates different views of a node and maximizes the agreement between views of the same node relative to views of other nodes. Three operators are devised to generate these views (node dropout, edge dropout, and random walk), which change the graph structure in different ways. The SGL method has shown great performance in RS tasks, which makes it a suitable choice for this experiment [78,79,80].
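The propagation step that LightGCN keeps from standard GCNs can be sketched as follows: build the symmetrically normalized adjacency of the user–item bipartite graph, propagate embeddings linearly through a few layers, and combine the per-layer embeddings (a uniform-weight version of the weighted sum described above). The embeddings here are random stand-ins for trained parameters; only the propagation structure is reproduced.

```python
import numpy as np

def lightgcn_embeddings(interactions, dim=4, n_layers=3, seed=0):
    """LightGCN-style propagation on a user-item bipartite graph (sketch).

    `interactions` is a binary (n_users, n_items) matrix. No feature
    transformations or non-linearities: propagation is purely linear.
    """
    n_users, n_items = interactions.shape
    # Bipartite adjacency: users and items stacked into one node set.
    A = np.zeros((n_users + n_items, n_users + n_items))
    A[:n_users, n_users:] = interactions
    A[n_users:, :n_users] = interactions.T
    # Symmetric normalization D^{-1/2} A D^{-1/2}, as in standard GCNs.
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    rng = np.random.default_rng(seed)
    E = rng.standard_normal((n_users + n_items, dim))    # layer-0 embeddings
    layers = [E]
    for _ in range(n_layers):
        layers.append(A_hat @ layers[-1])        # linear propagation, no weights
    final = np.mean(layers, axis=0)              # combine all layers uniformly
    return final[:n_users], final[n_users:]      # user and item embeddings
```

In the trained model the layer-0 embeddings are the only learned parameters, and user–item scores are inner products of the combined embeddings.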
3.3. Evaluation Metrics
4. Results
5. Conclusions and Future Work
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Lin, S.; Wang, J.; Zhu, Z.; Caverlee, J. Quantifying and Mitigating Popularity Bias in Conversational Recommender Systems. arXiv 2022, arXiv:2208.03298.
Chen, J.; Dong, H.; Wang, X.; Feng, F.; Wang, M.; He, X. Bias and debias in recommender system: A survey and future directions. arXiv 2020, arXiv:2010.03240.
Zhang, S.; Yao, L.; Sun, A.; Tay, Y. Deep learning based recommender system: A survey and new perspectives. ACM Comput. Surv. (CSUR) 2019, 52, 1–38.
Ricci, F.; Rokach, L.; Shapira, B. Recommender Systems Handbook; Springer: New York, NY, USA, 2015.
Alam, M.; Iana, A.; Grote, A.; Ludwig, K.; Müller, P.; Paulheim, H. Towards Analyzing the Bias of News Recommender Systems Using Sentiment and Stance Detection. arXiv 2022, arXiv:2203.05824.
Gao, C.; Wang, X.; He, X.; Li, Y. Graph neural networks for recommender system. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, Tempe, AZ, USA, 21–25 February 2022; pp. 1623–1625.
Di Noia, T.; Tintarev, N.; Fatourou, P.; Schedl, M. Recommender systems under European AI regulations. Commun. ACM 2022, 65, 69–73.
Fahse, T.; Huber, V.; Giffen, B.v. Managing bias in machine learning projects. In Proceedings of the International Conference on Wirtschaftsinformatik, Nuremberg, Germany, 21–23 February 2022; Springer: Berlin/Heidelberg, Germany, 2021; pp. 94–109.
Kordzadeh, N.; Ghasemaghaei, M. Algorithmic bias: Review, synthesis, and future research directions. Eur. J. Inf. Syst. 2022, 31, 388–409.
Boratto, L.; Fenu, G.; Marras, M. Connecting user and item perspectives in popularity debiasing for collaborative recommendation. Inf. Process. Manag. 2021, 58, 102387.
Mu, R. A survey of recommender systems based on deep learning. IEEE Access 2018, 6, 69009–69022.
Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81.
Bronstein, M.M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric deep learning: Going beyond Euclidean data. IEEE Signal Process. Mag. 2017, 34, 18–42.
Wu, S.; Sun, F.; Zhang, W.; Xie, X.; Cui, B. Graph neural networks in recommender systems: A survey. ACM Comput. Surv. (CSUR) 2020.
Dai, E.; Wang, S. Say no to the discrimination: Learning fair graph neural networks with limited sensitive attribute information. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Online, 8–12 March 2021; pp. 680–688.
Baeza-Yates, R. Data and algorithmic bias in the web. In Proceedings of the 8th ACM Conference on Web Science, Hannover, Germany, 22–25 May 2016; p. 1.
Boratto, L.; Marras, M. Advances in Bias-aware Recommendation on the Web. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Virtual Event, 8–12 March 2021; pp. 1147–1149.
Milano, S.; Taddeo, M.; Floridi, L. Recommender systems and their ethical challenges. AI Soc. 2020, 35, 957–967.
Bozdag, E. Bias in algorithmic filtering and personalization. Ethics Inf. Technol. 2013, 15, 209–227.
Ciampaglia, G.L.; Nematzadeh, A.; Menczer, F.; Flammini, A. How algorithmic popularity bias hinders or promotes quality. Sci. Rep. 2018, 8, 15951.
Eryarsoy, E.; Piramuthu, S. Experimental evaluation of sequential bias in online customer reviews. Inf. Manag. 2014, 51, 964–971.
Vall, A.; Quadrana, M.; Schedl, M.; Widmer, G. Order, context and popularity bias in next-song recommendations. Int. J. Multimed. Inf. Retr. 2019, 8, 101–113.
Olteanu, A.; Castillo, C.; Diaz, F.; Kıcıman, E. Social data: Biases, methodological pitfalls, and ethical boundaries. Front. Big Data 2019, 2, 13.
Bruce, P.; Bruce, A.; Gedeck, P. Practical Statistics for Data Scientists, 2nd ed.; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2020; pp. 50–51.
Gu, J.; Oelke, D. Understanding bias in machine learning. arXiv 2019, arXiv:1909.01866.
Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR) 2021, 54, 1–35.
Akter, S.; Dwivedi, Y.K.; Sajib, S.; Biswas, K.; Bandara, R.J.; Michael, K. Algorithmic bias in machine learning-based marketing models. J. Bus. Res. 2022, 144, 201–216.
Blanzeisky, W.; Cunningham, P. Algorithmic factors influencing bias in machine learning. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Bilbao, Spain, 13–17 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 559–574.
Hall, M.; van der Maaten, L.; Gustafson, L.; Adcock, A. A Systematic Study of Bias Amplification. arXiv 2022, arXiv:2201.11706.
Ashokan, A.; Haas, C. Fairness metrics and bias mitigation strategies for rating predictions. Inf. Process. Manag. 2021, 58, 102646.
Zhang, Q.; Wipf, D.; Gan, Q.; Song, L. A biased graph neural network sampler with near-optimal regret. Adv. Neural Inf. Process. Syst. 2021, 34, 8833–8844.
Dong, Y.; Liu, N.; Jalaian, B.; Li, J. Edits: Modeling and mitigating data bias for graph neural networks. In Proceedings of the ACM Web Conference 2022, Virtual Event, Lyon, France, 25–29 April 2022; pp. 1259–1269.
Liu, Y.; Ao, X.; Feng, F.; He, Q. UD-GNN: Uncertainty-aware Debiased Training on Semi-Homophilous Graphs. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 1131–1140.
Dong, Y.; Wang, S.; Wang, Y.; Derr, T.; Li, J. On Structural Explanation of Bias in Graph Neural Networks. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 316–326.
Xu, B.; Shen, H.; Sun, B.; An, R.; Cao, Q.; Cheng, X. Towards consumer loan fraud detection: Graph neural networks with role-constrained conditional random field. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 11–15 October 2021; Volume 35, pp. 4537–4545.
Zeng, Z.; Islam, R.; Keya, K.N.; Foulds, J.; Song, Y.; Pan, S. Fair representation learning for heterogeneous information networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media, Virtual, 7–10 June 2021; Volume 15.
Chen, Z.; Xiao, T.; Kuang, K. BA-GNN: On Learning Bias-Aware Graph Neural Network. In Proceedings of the 2022 IEEE 38th International Conference on Data Engineering (ICDE), Kuala Lumpur, Malaysia, 9–12 May 2022; pp. 3012–3024.
Gao, C.; Lei, W.; Chen, J.; Wang, S.; He, X.; Li, S.; Li, B.; Zhang, Y.; Jiang, P. CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System. arXiv 2022, arXiv:2204.01266.
Fabbri, F.; Croci, M.L.; Bonchi, F.; Castillo, C. Exposure Inequality in People Recommender Systems: The Long-Term Effects. In Proceedings of the International AAAI Conference on Web and Social Media, Atlanta, GA, USA, 6–9 June 2022; Volume 16, pp. 194–204.
Mansoury, M.; Abdollahpouri, H.; Pechenizkiy, M.; Mobasher, B.; Burke, R. A graph-based approach for mitigating multi-sided exposure bias in recommender systems. ACM Trans. Inf. Syst. (TOIS) 2021, 40, 1–31.
Sun, W.; Khenissi, S.; Nasraoui, O.; Shafto, P. Debiasing the human-recommender system feedback loop in collaborative filtering. In Proceedings of the 2019 World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 645–651.
Ahanger, A.B.; Aalam, S.W.; Bhat, M.R.; Assad, A. Popularity Bias in Recommender Systems—A Review. In Proceedings of the International Conference on Emerging Technologies in Computer Engineering, Jaipur, India, 4–5 February 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 431–444.
Wu, P.; Li, H.; Deng, Y.; Hu, W.; Dai, Q.; Dong, Z.; Sun, J.; Zhang, R.; Zhou, X.H. On the Opportunity of Causal Learning in Recommendation Systems: Foundation, Estimation, Prediction and Challenges. In Proceedings of the International Joint Conference on Artificial Intelligence, Vienna, Austria, 23–29 July 2022.
Kowald, D.; Lacic, E. Popularity Bias in Collaborative Filtering-Based Multimedia Recommender Systems. arXiv 2022, arXiv:2203.00376.
Neophytou, N.; Mitra, B.; Stinson, C. Revisiting popularity and demographic biases in recommender evaluation and effectiveness. In Proceedings of the European Conference on Information Retrieval, Stavanger, Norway, 10–14 April 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 641–654.
Rahmani, H.A.; Naghiaei, M.; Tourani, A.; Deldjoo, Y. Exploring the Impact of Temporal Bias in Point-of-Interest Recommendation. arXiv 2022, arXiv:2207.11609.
Ekstrand, M.D.; Das, A.; Burke, R.; Diaz, F. Fairness in recommender systems. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2022; pp. 679–707.
Abdollahpouri, H.; Mansoury, M.; Burke, R.; Mobasher, B. The connection between popularity bias, calibration, and fairness in recommendation. In Proceedings of the Fourteenth ACM Conference on Recommender Systems, Virtual Event, Brazil, 22–26 September 2020; pp. 726–731.
Abdollahpouri, H.; Burke, R.; Mobasher, B. Managing popularity bias in recommender systems with personalized re-ranking. In Proceedings of the Thirty-Second International Flairs Conference, Sarasota, FL, USA, 19–22 May 2019.
Liu, H.; Wang, Y.; Lin, H.; Xu, B.; Zhao, N. Mitigating sensitive data exposure with adversarial learning for fairness recommendation systems. Neural Comput. Appl. 2022, 1–15.
Shakespeare, D.; Porcaro, L.; Gómez, E.; Castillo, C. Exploring artist gender bias in music recommendation. arXiv 2020, arXiv:2009.01715.
Saxena, S.; Jain, S. Exploring and Mitigating Gender Bias in Recommender Systems with Explicit Feedback. arXiv 2021, arXiv:2112.02530.
Rahman, T.; Surma, B.; Backes, M.; Zhang, Y. Fairwalk: Towards fair graph embedding. In Proceedings of the International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019.
Chen, J.; Wu, W.; Shi, L.; Zheng, W.; He, L. Long-tail session-based recommendation from calibration. Appl. Intell. 2022, 1–18.
Zhao, M.; Wu, L.; Liang, Y.; Chen, L.; Zhang, J.; Deng, Q.; Wang, K.; Shen, X.; Lv, T.; Wu, R. Investigating Accuracy-Novelty Performance for Graph-based Collaborative Filtering. arXiv 2022, arXiv:2204.12326.
Kim, M.; Oh, J.; Do, J.; Lee, S. Debiasing Neighbor Aggregation for Graph Neural Network in Recommender Systems. arXiv 2022, arXiv:2208.08847.
MovieLens. 2021. Available online: https://grouplens.org/datasets/movielens/ (accessed on 31 June 2022).
Floridi, L.; Holweg, M.; Taddeo, M.; Amaya Silva, J.; Mökander, J.; Wen, Y. capAI: A Procedure for Conducting Conformity Assessment of AI Systems in Line with the EU Artificial Intelligence Act. 2022. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4064091 (accessed on 31 June 2022).
Million Song Dataset. Available online: http://millionsongdataset.com/lastfm/ (accessed on 31 June 2022).
AlGhamdi, M.; Elazhary, H.; Mojahed, A. Evaluation of Collaborative Filtering for Recommender Systems. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 559–565.
Airen, S.; Agrawal, J. Movie recommender system using k-nearest neighbors variants. Natl. Acad. Sci. Lett. 2022, 45, 75–82.
Deshpande, M.; Karypis, G. Item-based top-n recommendation algorithms. ACM Trans. Inf. Syst. (TOIS) 2004, 22, 143–177.
Sang, L.; Xu, M.; Qian, S.; Wu, X. Knowledge graph enhanced neural collaborative filtering with residual recurrent network. Neurocomputing 2021, 454, 417–429.
Girsang, A.S.; Wibowo, A.; Jason; Roslynlia. Neural collaborative for music recommendation system. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1071, 012021.
Bai, T.; Wen, J.R.; Zhang, J.; Zhao, W.X. A neural collaborative filtering model with interaction-based neighborhood. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1979–1982.
Himabindu, T.V.; Padmanabhan, V.; Pujari, A.K. Conformal matrix factorization based recommender system. Inf. Sci. 2018, 467, 685–707.
Kuang, H.; Xia, W.; Ma, X.; Liu, X. Deep matrix factorization for cross-domain recommendation. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 March 2021; Volume 5, pp. 2171–2175.
Xue, H.J.; Dai, X.; Zhang, J.; Huang, S.; Chen, J. Deep matrix factorization models for recommender systems. In Proceedings of the IJCAI, Melbourne, Australia, 19–25 August 2017; Volume 17, pp. 3203–3209.
Yi, B.; Shen, X.; Liu, H.; Zhang, Z.; Zhang, W.; Liu, S.; Xiong, N. Deep matrix factorization with implicit feedback embedding for recommendation system. IEEE Trans. Ind. Inform. 2019, 15, 4591–4601.
Liang, G.; Sun, C.; Zhou, J.; Luo, F.; Wen, J.; Li, X. A General Matrix Factorization Framework for Recommender Systems in Multi-access Edge Computing Network. Mob. Netw. Appl. 2022, 1–13.
He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182.
Wang, S.; Hu, L.; Wang, Y.; He, X.; Sheng, Q.Z.; Orgun, M.A.; Cao, L.; Ricci, F.; Yu, P.S. Graph learning based recommender systems: A review. arXiv 2021, arXiv:2105.06339.
Schafer, J.B.; Frankowski, D.; Herlocker, J.; Sen, S. Collaborative filtering recommender systems. In The Adaptive Web; Springer: Berlin/Heidelberg, Germany, 2007; pp. 291–324.
He, X.; Deng, K.; Wang, X.; Li, Y.; Zhang, Y.; Wang, M. LightGCN: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 639–648.
Broman, N. Comparison of Recommender Systems for Stock Inspiration. Master’s Thesis, Linköping University, Linköping, Sweden, 2021.
Ding, S.; Feng, F.; He, X.; Liao, Y.; Shi, J.; Zhang, Y. Causal incremental graph convolution for recommender system retraining. IEEE Trans. Neural Netw. Learn. Syst. 2022.
Sun, W.; Chang, K.; Zhang, L.; Meng, K. INGCF: An Improved Recommendation Algorithm Based on NGCF. In Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, Xiamen, China, 3–5 December 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 116–129.
Wu, J.; Wang, X.; Feng, F.; He, X.; Chen, L.; Lian, J.; Xie, X. Self-supervised graph learning for recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 11–15 July 2021; pp. 726–735.
Yang, C. Supervised Contrastive Learning for Recommendation. arXiv 2022, arXiv:2201.03144.
Tang, H.; Zhao, G.; Wu, Y.; Qian, X. Multi-sample-based Contrastive Loss for Top-k Recommendation. IEEE Trans. Multimed. 2021.
Foulds, J.R.; Islam, R.; Keya, K.N.; Pan, S. Differential Fairness. In NeurIPS 2019 Workshop on Machine Learning with Guarantees, Vancouver, Canada; UMBC Faculty Collection: Baltimore, MD, USA, 2019.
Foulds, J.R.; Islam, R.; Keya, K.N.; Pan, S. An intersectional definition of fairness. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1918–1921.
Naghiaei, M.; Rahmani, H.A.; Dehghan, M. The Unfairness of Popularity Bias in Book Recommendation. arXiv 2022, arXiv:2202.13446.
Lazovich, T.; Belli, L.; Gonzales, A.; Bower, A.; Tantipongpipat, U.; Lum, K.; Huszar, F.; Chowdhury, R. Measuring disparate outcomes of content recommendation algorithms with distributional inequality metrics. arXiv 2022, arXiv:2202.01615.
Wang, X.; Wang, W.H. Providing Item-side Individual Fairness for Deep Recommender Systems. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Korea, 21–24 June 2022; pp. 117–127.
Islam, R.; Keya, K.N.; Zeng, Z.; Pan, S.; Foulds, J. Debiasing career recommendations with neural fair collaborative filtering. In Proceedings of the Web Conference, Ljubljana, Slovenia, 19–23 April 2021; pp. 3779–3790.
Features  Description  Data Type  Count  Mean  Std 

Age  Age of users  int  100 K  32.96  11.56 
Rating  Rating on movies provided by users  float  100 K  3.52  1.12 
User id  IDs of the users  int  100 K     
Movie id  IDs of the movies  int  100 K     
Gender  Gender of the user  String  100 K     
Occupation  Users’ job  String  100 K     
Movie title  The title of rated movies  String  100 K     
Features  Description  Data Type  Count  Mean  Std 

Weight  Listening count for each artist  float  100 K  745.24  3751.32 
User id  IDs of the users  int  100 K     
Item id  IDs of the artists  int  100 K     
Gender  Gender of users  String  100 K     
Country  Users’ country  String  100 K     
Name  Names of the artists  String  100 K     
Data Class  Data Subclass  Representing Graph 

General interaction  Explicit interaction, implicit interaction  Weighted bipartite graph, unweighted bipartite graph 
Sequential interaction  Single-type interactions, multi-type interactions  Directed homogeneous graph, directed heterogeneous graph 
Side information  Attribute information, social information, external knowledge  Heterogeneous graph, homogeneous graph, tree or heterogeneous graph 
Notation  Definition 

U  A set of users 
I  A set of items 
u  A user 
i  An item 
$R\left(u\right)$  The ground-truth set of items that user u interacted with 
$\widehat{R}\left(u\right)$  A ranked list of items that a model produces 
K  The length of the recommendation list 
$M\left(x\right)$  Algorithmic mechanism for the RS with input x and output y 
$\theta $  The distribution that generates x 
$\Theta $  The set of distributions $\theta $ that generate each instance x 
Metric Name  Description 
MRR  Computes the reciprocal rank of the first relevant item found by the algorithm, where ${Rank}_{u}^{*}$ denotes the rank position of the first relevant item found for user u. 
$$MRR@K=\frac{1}{\left|U\right|}\sum _{u\in U}\frac{1}{{Rank}_{u}^{*}}$$
 
NDCG  Is a measure of ranking quality, where positions are discounted logarithmically. It accounts for the position of the hit by assigning higher scores to hits at top ranks. $\delta \left(\cdot \right)$ is an indicator function. 
$$NDCG@K=\frac{1}{\left|U\right|}\sum _{u\in U}\frac{1}{{\sum}_{i=1}^{min\left(\left|R\left(u\right)\right|,K\right)}\frac{1}{{log}_{2}(i+1)}}\sum _{i=1}^{K}\delta (i\in R\left(u\right))\frac{1}{{log}_{2}(i+1)}$$
 
Precision  Is a measure for computing the fraction of relevant items out of all the recommended items. Its final value is the average of the metric values for each user. $\left|\widehat{R}\left(u\right)\right|$ represents the item count of $\widehat{R}\left(u\right)$. 
$$Precision@K=\frac{1}{\left|U\right|}\sum _{u\in U}\frac{\left|\widehat{R}\left(u\right)\cap R\left(u\right)\right|}{\left|\widehat{R}\left(u\right)\right|}$$
 
Recall  Is a measure for computing the fraction of relevant items out of all relevant items. $\left|R\left(u\right)\right|$ represents the item count of $R\left(u\right)$. 
$$Recall@K=\frac{1}{\left|U\right|}\sum _{u\in U}\frac{\left|\widehat{R}\left(u\right)\cap R\left(u\right)\right|}{\left|R\left(u\right)\right|}$$
 
HR (HIT)  This metric is also known as the truncated hit-ratio. It is a way of calculating how many “hits” are included in a K-sized list of ranked items. If at least one item falls in the ground-truth set, we call it a hit. $\delta \left(\cdot \right)$ is an indicator function: $\delta \left(b\right)=1$ if b is true; otherwise, it is 0. ∅ denotes the empty set. 
$$HR@K=\frac{1}{\left|U\right|}\sum _{u\in U}\delta (\widehat{R}\left(u\right)\cap R\left(u\right)\ne \varnothing )$$

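As a concrete illustration, the accuracy metrics above can be computed per user in a few lines of Python; averaging these per-user scores over all users yields the values reported in the results tables. The function and variable names (`ranking_metrics`, `recommended`, `relevant`) are illustrative, not taken from the paper's codebase:

```python
import math

def ranking_metrics(recommended, relevant, k):
    """Compute Precision@K, Recall@K, MRR@K, HR@K, and NDCG@K for one user.

    recommended: ranked list of item ids (best first)
    relevant:    set of ground-truth item ids for the user
    """
    top_k = recommended[:k]
    # 1-based rank positions of the hits inside the top-K list
    hit_ranks = [pos + 1 for pos, item in enumerate(top_k) if item in relevant]

    precision = len(hit_ranks) / k
    recall = len(hit_ranks) / len(relevant)
    mrr = 1.0 / hit_ranks[0] if hit_ranks else 0.0   # first relevant item only
    hr = 1.0 if hit_ranks else 0.0                   # at least one hit?

    # DCG discounts each hit by log2(rank + 1); the normalizer (IDCG)
    # assumes all min(|R(u)|, K) relevant items occupy the top ranks.
    dcg = sum(1.0 / math.log2(rank + 1) for rank in hit_ranks)
    idcg = sum(1.0 / math.log2(i + 1)
               for i in range(1, min(len(relevant), k) + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {"precision": precision, "recall": recall,
            "mrr": mrr, "hr": hr, "ndcg": ndcg}
```

For example, with `recommended = [3, 1, 7, 9]`, `relevant = {1, 9, 5}`, and `k = 4`, the hits fall at ranks 2 and 4, giving Precision@4 = 0.5, Recall@4 ≈ 0.67, and MRR@4 = 0.5.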
Metric Name  Description 
Average Popularity  Computes the average popularity of recommended items, where $\varphi (i)$ is the number of interactions with item i in the training data [83]. 
$$AveragePopularity@K=\frac{1}{|U|}\sum _{u\in U}\frac{\sum _{i\in R_{u}}\varphi (i)}{|R_{u}|}$$
Gini Index [41]  Measures the diversity of the recommended items through the inequality of their distribution. In the following formula, $P(i)$ represents the number of times item i appears in the recommendation lists, with items indexed in non-decreasing order of this count [84]. 
$$GiniIndex@K=\frac{\sum _{i=1}^{|I|}(2i-|I|-1)P(i)}{|I|\sum _{i=1}^{|I|}P(i)}$$
Item Coverage  Computes the coverage of the recommended items over all items [85]. 
$$ItemCoverage@K=\frac{|\bigcup _{u\in U}\widehat{R}(u)|}{|I|}$$
Differential Fairness (DF) for the sensitive attribute gender [81,86]  Ensures unbiased treatment of all protected groups and also admits a privacy interpretation of disparity. The mechanism $M(x)$ is $\epsilon $-differentially fair with respect to $(A,\theta )$ if the bound below holds for all $\theta \in \Theta $ with $x\sim \theta $, all $y\in Range(M)$, and all $(s_{i},s_{j})\in A\times A$ with $P(s_{i})>0$ and $P(s_{j})>0$. Here, $s_{i},s_{j}\in A$ are tuples of protected attribute values (male and female). 
$$e^{-\epsilon }\le \frac{P_{M,\theta }(M(x)=y\mid s_{i},\theta )}{P_{M,\theta }(M(x)=y\mid s_{j},\theta )}\le e^{\epsilon }$$
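The bias metrics above can likewise be sketched in Python. This is a minimal illustration, assuming recommendation lists are given as a dict mapping each user to a ranked item list; all names (`rec_lists`, `train_interactions`, `df_epsilon`, etc.) are hypothetical. The DF helper returns the smallest ϵ for which the differential-fairness bound holds between two groups, given their per-outcome probabilities:

```python
import math
from itertools import chain

def average_popularity(rec_lists, train_interactions):
    """Mean training popularity phi(i) of recommended items, averaged over users."""
    per_user = [sum(train_interactions[i] for i in items) / len(items)
                for items in rec_lists.values()]
    return sum(per_user) / len(per_user)

def gini_index(rec_lists, n_items):
    """Inequality of the recommendation-count distribution P(i), with all
    n_items catalogue items (including never-recommended ones) indexed in
    non-decreasing order of count; 0 means a perfectly even distribution."""
    counts = {}
    for item in chain.from_iterable(rec_lists.values()):
        counts[item] = counts.get(item, 0) + 1
    p = sorted(counts.values()) + [0] * (n_items - len(counts))
    p.sort()  # non-decreasing order, zeros first
    total = sum(p)
    return sum((2 * i - n_items - 1) * c
               for i, c in enumerate(p, start=1)) / (n_items * total)

def item_coverage(rec_lists, n_items):
    """Fraction of the catalogue appearing in at least one recommendation list."""
    return len(set().union(*rec_lists.values())) / n_items

def df_epsilon(p_y_given_si, p_y_given_sj):
    """Smallest epsilon satisfying the DF bound for two groups: the maximum
    absolute log-ratio of matching outcome probabilities."""
    return max(abs(math.log(a / b))
               for a, b in zip(p_y_given_si, p_y_given_sj))
```

For instance, with two users recommended `[1, 2]` and `[2, 3]` from a four-item catalogue, item coverage is 3/4, and a group pair with outcome probabilities `[0.5, 0.5]` versus `[0.25, 0.75]` yields ϵ = ln 2.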
| Approach | Method | Top K | Recall | Precision | MRR | NDCG | HIT | Item Coverage | Gini Index | Average Popularity |
|---|---|---|---|---|---|---|---|---|---|---|
| MF | DMF | K = 5 | 0.14 | 0.22 | 0.43 | 0.26 | 0.62 | 0.18 | 0.94 | 256.29 |
| MF | DMF | K = 10 | 0.21 | 0.17 | 0.42 | 0.25 | 0.73 | 0.20 | 0.93 | 252.25 |
| MF | DMF | K = 15 | 0.29 | 0.16 | 0.45 | 0.28 | 0.83 | 0.28 | 0.90 | 219.49 |
| MF | NeuMF | K = 5 | 0.15 | 0.23 | 0.45 | 0.27 | 0.65 | 0.25 | 0.91 | 228.52 |
| MF | NeuMF | K = 10 | 0.23 | 0.18 | 0.46 | 0.27 | 0.78 | 0.36 | 0.89 | 212.41 |
| MF | NeuMF | K = 15 | 0.30 | 0.16 | 0.46 | 0.28 | 0.83 | 0.40 | 0.86 | 196.89 |
| CF | ItemKNN | K = 5 | 0.15 | 0.23 | 0.44 | 0.28 | 0.63 | 0.19 | 0.93 | 231.96 |
| CF | ItemKNN | K = 10 | 0.22 | 0.18 | 0.46 | 0.27 | 0.75 | 0.24 | 0.93 | 249.74 |
| CF | ItemKNN | K = 15 | 0.31 | 0.16 | 0.46 | 0.29 | 0.84 | 0.29 | 0.89 | 208.12 |
| CF | NNCF | K = 5 | 0.15 | 0.24 | 0.47 | 0.29 | 0.64 | 0.17 | 0.95 | 284.47 |
| CF | NNCF | K = 10 | 0.24 | 0.19 | 0.46 | 0.22 | 0.78 | 0.25 | 0.91 | 217.70 |
| CF | NNCF | K = 15 | 0.28 | 0.15 | 0.47 | 0.27 | 0.81 | 0.30 | 0.91 | 231.28 |
| GNN | NGCF | K = 5 | 0.15 | 0.24 | 0.48 | 0.29 | 0.66 | 0.15 | 0.95 | 277.85 |
| GNN | NGCF | K = 10 | 0.25 | 0.20 | 0.49 | 0.30 | 0.77 | 0.25 | 0.93 | 255.49 |
| GNN | NGCF | K = 15 | 0.32 | 0.17 | 0.49 | 0.31 | 0.86 | 0.32 | 0.89 | 219.13 |
| GNN | LightGCN | K = 5 | 0.11 | 0.17 | 0.36 | 0.21 | 0.55 | 0.05 | 0.98 | 245.13 |
| GNN | LightGCN | K = 10 | 0.18 | 0.14 | 0.37 | 0.21 | 0.67 | 0.07 | 0.97 | 312.47 |
| GNN | LightGCN | K = 15 | 0.23 | 0.12 | 0.38 | 0.21 | 0.76 | 0.10 | 0.96 | 292.80 |
| GNN | SGL | K = 5 | 0.15 | 0.25 | 0.47 | 0.29 | 0.66 | 0.24 | 0.91 | 229.24 |
| GNN | SGL | K = 10 | 0.25 | 0.20 | 0.49 | 0.29 | 0.80 | 0.31 | 0.89 | 209.39 |
| GNN | SGL | K = 15 | 0.31 | 0.17 | 0.49 | 0.30 | 0.85 | 0.34 | 0.88 | 200.63 |
| Approach | Method | Top K | Recall | Precision | MRR | NDCG | HIT | Item Coverage | Gini Index | Average Popularity |
|---|---|---|---|---|---|---|---|---|---|---|
| MF | DMF | K = 5 | 0.05 | 0.05 | 0.11 | 0.05 | 0.20 | 0.01 | 0.99 | 377.81 |
| MF | DMF | K = 10 | 0.07 | 0.03 | 0.12 | 0.06 | 0.25 | 0.02 | 0.99 | 341.85 |
| MF | DMF | K = 15 | 0.08 | 0.02 | 0.12 | 0.07 | 0.30 | 0.02 | 0.99 | 309.64 |
| MF | NeuMF | K = 5 | 0.10 | 0.10 | 0.25 | 0.12 | 0.40 | 0.05 | 0.98 | 167.49 |
| MF | NeuMF | K = 10 | 0.15 | 0.07 | 0.27 | 0.14 | 0.52 | 0.06 | 0.98 | 157.12 |
| MF | NeuMF | K = 15 | 0.20 | 0.06 | 0.27 | 0.16 | 0.60 | 0.09 | 0.98 | 140.17 |
| CF | ItemKNN | K = 5 | 0.12 | 0.11 | 0.29 | 0.14 | 0.41 | 0.12 | 0.96 | 152.64 |
| CF | ItemKNN | K = 10 | 0.16 | 0.08 | 0.30 | 0.16 | 0.50 | 0.23 | 0.93 | 131.54 |
| CF | ItemKNN | K = 15 | 0.20 | 0.06 | 0.30 | 0.18 | 0.57 | 0.31 | 0.91 | 118.00 |
| CF | NNCF | K = 5 | 0.09 | 0.07 | 0.16 | 0.09 | 0.31 | 0.04 | 0.98 | 195.14 |
| CF | NNCF | K = 10 | 0.12 | 0.06 | 0.17 | 0.10 | 0.38 | 0.05 | 0.98 | 185.23 |
| CF | NNCF | K = 15 | 0.15 | 0.05 | 0.19 | 0.12 | 0.49 | 0.06 | 0.99 | 177.06 |
| GNN | NGCF | K = 5 | 0.12 | 0.11 | 0.29 | 0.14 | 0.44 | 0.03 | 0.99 | 202.55 |
| GNN | NGCF | K = 10 | 0.18 | 0.09 | 0.32 | 0.17 | 0.59 | 0.06 | 0.98 | 155.32 |
| GNN | NGCF | K = 15 | 0.21 | 0.07 | 0.31 | 0.18 | 0.64 | 0.08 | 0.98 | 160.38 |
| GNN | LightGCN | K = 5 | 0.13 | 0.14 | 0.31 | 0.15 | 0.47 | 0.05 | 0.98 | 174.23 |
| GNN | LightGCN | K = 10 | 0.19 | 0.09 | 0.33 | 0.18 | 0.59 | 0.09 | 0.98 | 148.15 |
| GNN | LightGCN | K = 15 | 0.23 | 0.07 | 0.34 | 0.20 | 0.66 | 0.12 | 0.97 | 132.52 |
| GNN | SGL | K = 5 | 0.13 | 0.13 | 0.33 | 0.15 | 0.48 | 0.06 | 0.98 | 142.71 |
| GNN | SGL | K = 10 | 0.20 | 0.10 | 0.35 | 0.19 | 0.62 | 0.10 | 0.97 | 114.49 |
| GNN | SGL | K = 15 | 0.24 | 0.07 | 0.35 | 0.21 | 0.69 | 0.14 | 0.96 | 103.11 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chizari, N.; Shoeibi, N.; Moreno-García, M.N. A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems. Electronics 2022, 11, 3301. https://doi.org/10.3390/electronics11203301