The Recommendation Algorithm Based on Improved Conditional Variational Autoencoder and Constrained Probabilistic Matrix Factorization

Zhang, Yunfei; Xu, Hongzhen; Yu, Xiaojun

doi:10.3390/app132112027


by Yunfei Zhang ¹, Hongzhen Xu ¹,²,* and Xiaojun Yu ²

¹ School of Information Engineering, East China University of Technology, Nanchang 330013, China

² School of Software, East China University of Technology, Nanchang 330013, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(21), 12027; https://doi.org/10.3390/app132112027

Submission received: 4 September 2023 / Revised: 19 October 2023 / Accepted: 2 November 2023 / Published: 4 November 2023


Abstract

An improved recommendation algorithm based on Conditional Variational Autoencoder (CVAE) and Constrained Probabilistic Matrix Factorization (CPMF) is proposed to address the issues of poor recommendation performance in traditional user-based collaborative filtering algorithms caused by data sparsity and suboptimal feature extraction. Firstly, in the data preprocessing stage, a hidden layer is added to CVAE, and random noise is introduced into the hidden layer to constrain the data features, thereby obtaining more accurate latent features and improving the model’s robustness and generative capability. Secondly, the category of items is incorporated as auxiliary information in CVAE to supervise the encoding and decoding of item data. By learning the distribution characteristics of the data, missing values in the rating data can be effectively reconstructed, thereby reducing the sparsity of the rating matrix. Subsequently, the reconstructed data is processed using CPMF, which optimizes the feature extraction performance by imposing constraints on user features. Finally, the prediction rating of a user for an item can be obtained through the matrix product of user and item feature matrices. Experimental results on the MovieLens-100K and MovieLens-1M datasets demonstrate the effectiveness and superiority of the proposed algorithm over four comparative algorithms, as it exhibits significant advantages in terms of root mean square error and mean absolute error metrics.

Keywords:

collaborative filtering; conditional variational autoencoder; constrained probabilistic matrix factorization; auxiliary information; feature matrix

1. Introduction

In the current era of exponential growth in internet information, the volume of available data has increased geometrically. As users search for information, it has become challenging to efficiently and accurately access desired content. Personalized recommendation algorithms [1] leverage user behavioral data to assist users in rapidly obtaining relevant content amidst data overload [2]. Presently, personalized recommendation algorithms can be broadly categorized into three types: content-based recommendation algorithms, collaborative filtering recommendation algorithms, and hybrid recommendation algorithms. Among these, collaborative filtering recommendation algorithms, as the most widely applied recommendation technique, primarily focus on prediction and recommendation. Collaborative filtering recommendation algorithms can be further divided into item-based collaborative filtering recommendation algorithms and user-based collaborative filtering recommendation algorithms. In the item-based approach, similar items are identified and tagged, and recommendations are made to users based on items similar to those they prefer. The user-based collaborative filtering recommendation algorithm operates by recommending based on groups of similar users [3]. Leveraging historical user behavioral data features, it groups users with high similarity into clusters, enabling prediction of ratings [4].

With the rapid expansion of the user base, traditional user-based collaborative filtering recommendation algorithms have encountered challenges such as data sparsity and suboptimal feature extraction, which hinder the ability to thoroughly and comprehensively uncover user interests [5]. As a result, scholars both domestically and internationally have begun conducting research to address these issues. Salakhutdinov et al. [6] were among the first to propose a recommendation algorithm based on deep learning models. They employed a restricted Boltzmann machine model to characterize rating data. Wang et al. [7] introduced a recommendation algorithm based on deep belief networks and Probabilistic Matrix Factorization, combining learned latent features with collaborative filtering and applying them to music recommendation. Zhang et al. [8] introduced a recommendation algorithm based on deep variational matrix factorization, utilizing deep nonlinear structures to independently capture latent features of both users and items. Fo et al. [9] devised a matrix factorization algorithm based on Bernoulli distribution, leveraging the binary nature of model distribution to enhance recommendation accuracy and reliability. Chen et al. [10] proposed a joint matrix factorization algorithm based on transfer learning, which could improve the effectiveness of similarity measurement by mining latent features of users. Ma et al. [11] introduced a biased deep matrix factorization model that employed neural networks to enhance matrix factorization. Rong et al. [12] put forth a regularized least squares optimization algorithm with case-weighting, leading to improved collaborative filtering recommendation performance.

While the aforementioned recommendation algorithms have to some extent alleviated the issues of data sparsity and suboptimal feature extraction, they primarily analyze the associations between users and items based solely on user behavior information, without considering auxiliary item information [13,14,15]. Consequently, they may fall short of fully capturing users’ genuine preferences. To address this limitation, this paper introduces a recommendation algorithm based on an improved Conditional Variational Autoencoder (CVAE) and Constrained Probabilistic Matrix Factorization (CPMF). Firstly, enhancements are made to the CVAE by adding latent layers and introducing random noise to these layers, thereby constraining the data features. This augmentation improves the model’s robustness against interference and enhances its generative capabilities. Secondly, the CVAE incorporates item categories as auxiliary information for supervising the encoding and decoding of item data. By learning the distribution characteristics of data, it reconstructs missing values in rating data, effectively reducing data sparsity. Subsequently, the CPMF algorithm constrains user features, optimizing the feature extraction process. The stochastic gradient descent algorithm is employed for parameter updates. Finally, the product of user and item feature matrices yields predicted ratings.

The experimental results indicate that the proposed algorithm effectively addresses the challenges of data sparsity and suboptimal feature extraction, leading to a significant improvement in recommendation performance.

The main contributions of this paper are as follows:

(1): Proposing a recommendation algorithm based on an improved CVAE and CPMF, which effectively addresses the issues of poor recommendation performance in traditional user-based collaborative filtering recommendation algorithms due to data rating sparsity and suboptimal feature extraction.
(2): Enhancements are made to the CVAE. In addition to the existing structure, latent layers are incorporated, and random noise is introduced into these layers. Processing input data through these latent layers yields more precise implicit features, thereby enhancing the model’s resilience to interference and its generative capabilities. Furthermore, the CVAE incorporates item categories as auxiliary information for supervising the encoding and decoding of item data. By learning the distribution characteristics of data, it effectively reconstructs missing values in rating data, reducing data sparsity. The reconstructed data is subsequently subjected to CPMF to extract implicit rating features of users and items, optimizing the feature extraction process.
(3): The MovieLens-100K and MovieLens-1M datasets are selected for experimental evaluation, and the proposed algorithm is compared with four baseline algorithms (CDL, CTR, PHD, and PMF) on the same data. On MovieLens-100K, the proposed algorithm reduces the Root Mean Square Error (RMSE) by 2.26%, 3.30%, 2.57%, and 5.12%, and the Mean Absolute Error (MAE) by 2.63%, 3.66%, 2.53%, and 5.33%, respectively. On MovieLens-1M, RMSE is reduced by 4.39%, 5.49%, 4.22%, and 5.99%, and MAE by 3.28%, 4.17%, 3.08%, and 5.97%, respectively.

2. Related Algorithms

2.1. User-Based Collaborative Filtering Recommendation

User-based collaborative filtering is a prevalent recommendation algorithm within the field of recommender systems. This approach aims to predict items of potential interest to a user by leveraging the historical behaviors and preferences of other users who exhibit similar interests. The fundamental principle revolves around the notion that if two users have demonstrated similar preferences or purchases for items in the past, they are likely to share similar preferences for other items in the future [16].

The algorithm entails the following key steps:

(1): Construction of User-Item Matrix

A rating matrix is built from the historical ratings provided by users. Given a set of users U = {u₁, u₂, …, u_m} and a set of items I = {i₁, i₂, …, i_n}, the rating matrix is denoted as R_m,n, as illustrated in Equation (1), where r_i,j represents the rating given by user i to item j.

R_{m,n} = \begin{bmatrix} r_{1,1} & r_{1,2} & \dots & r_{1,j} & \dots & r_{1,n} \\ r_{2,1} & r_{2,2} & \dots & r_{2,j} & \dots & r_{2,n} \\ \vdots & \vdots & & \vdots & & \vdots \\ r_{i,1} & r_{i,2} & \dots & r_{i,j} & \dots & r_{i,n} \\ \vdots & \vdots & & \vdots & & \vdots \\ r_{m,1} & r_{m,2} & \dots & r_{m,j} & \dots & r_{m,n} \end{bmatrix}

(1)

(2): Calculation of User Similarity

The user-based collaborative filtering (U-CF) approach recommends items among users with similar historical ratings by analyzing the extent of their similarity. It employs the Pearson correlation coefficient as the metric for assessing similarity, which is used to delineate the likeness between users. Let r_u denote the rating vector of user u, defined as r_u = {r_u,₁, r_u,₂, …, r_u,n}. The Pearson correlation coefficient between users u and v is depicted by Equation (2):

\mathrm{sim}(u, v) = \frac{\sum_{i \in S} (r_{u,i} - \bar{r}_{u}) (r_{v,i} - \bar{r}_{v})}{\sqrt{\sum_{i \in S} (r_{u,i} - \bar{r}_{u})^{2}} \sqrt{\sum_{i \in S} (r_{v,i} - \bar{r}_{v})^{2}}}

(2)

where S represents the set of items that both users u and v have rated. The variables r_u,i and r_v,i denote the ratings given by users u and v to item i, while \bar{r}_{u} and \bar{r}_{v} denote the average ratings given by users u and v, respectively.
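Equation (2) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `pearson_sim`, the use of `np.nan` to mark unrated items, and the fallback value of 0 when fewer than two items are co-rated are all assumptions made for the example.

```python
import numpy as np

def pearson_sim(r_u, r_v):
    """Pearson correlation between two users over their co-rated items.

    r_u, r_v: 1-D rating vectors of equal length; np.nan marks an unrated
    item (an assumed convention for this sketch).
    """
    r_u = np.asarray(r_u, dtype=float)
    r_v = np.asarray(r_v, dtype=float)
    # S: the items rated by both users
    co_rated = ~np.isnan(r_u) & ~np.isnan(r_v)
    if co_rated.sum() < 2:
        return 0.0  # assumed fallback: too little overlap to correlate
    # Deviations from each user's mean rating (mean over all items they rated)
    du = r_u[co_rated] - np.nanmean(r_u)
    dv = r_v[co_rated] - np.nanmean(r_v)
    denom = np.sqrt((du ** 2).sum()) * np.sqrt((dv ** 2).sum())
    # A user with constant ratings on S has zero variance; return 0 in that case
    return float((du * dv).sum() / denom) if denom > 0 else 0.0
```

Two users with identical rating vectors obtain a similarity of 1, and users with exactly opposed deviations obtain −1.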

(3): Identifying Nearest Neighbors

For the target user, calculate the similarity with other users using Equation (2), and subsequently select users with higher similarity scores as the nearest neighbors.

(4): Predict Ratings and Generate Recommendations

Compute the final predicted ratings based on the nearest neighbors and the target user’s ratings, and generate recommendations according to the obtained predicted ratings. The formula for calculating predicted ratings is as follows:

P_{u,i} = \bar{r}_{u} + \frac{\sum_{n \in N_{u}} \mathrm{sim}(u, n) \times (r_{n,i} - \bar{r}_{n})}{\sum_{n \in N_{u}} \mathrm{sim}(u, n)}

(3)

where N_u denotes the set of nearest neighbors of user u, \bar{r}_{u} represents the mean rating of user u, r_n,i is the rating given by neighbor n to item i, and \bar{r}_{n} is the mean rating of neighbor n.
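The prediction step of Equation (3) can be illustrated as follows. The sketch assumes a precomputed similarity matrix `sims`, marks missing ratings with `np.nan`, and restricts the neighborhood to the `k` most similar users with positive similarity who have rated the item; the function name and these choices are hypothetical and are not specified in the text.

```python
import numpy as np

def predict_rating(u, i, R, sims, k=2):
    """Predict the rating of user u for item i from similar users.

    R:    (m, n) rating matrix with np.nan for missing entries.
    sims: (m, m) matrix of user similarities, e.g. Pearson coefficients.
    k:    neighborhood size (an assumed hyperparameter).
    """
    R = np.asarray(R, dtype=float)
    sims = np.asarray(sims, dtype=float)
    user_means = np.nanmean(R, axis=1)
    # Candidate neighbors: other users who rated item i and resemble user u
    candidates = [n for n in range(R.shape[0])
                  if n != u and not np.isnan(R[n, i]) and sims[u, n] > 0]
    neighbors = sorted(candidates, key=lambda n: sims[u, n], reverse=True)[:k]
    if not neighbors:
        return float(user_means[u])  # fall back to the user's own mean
    # Similarity-weighted average of the neighbors' mean-centered ratings
    num = sum(sims[u, n] * (R[n, i] - user_means[n]) for n in neighbors)
    den = sum(sims[u, n] for n in neighbors)
    return float(user_means[u] + num / den)
```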

2.2. Autoencoder

The autoencoder [17,18,19,20] is an unsupervised learning neural network. It leverages the backpropagation algorithm along with optimization techniques like gradient descent to learn the mapping between input x and its corresponding reconstructed output x′.

This model consists of two primary components: the encoder and the decoder. The encoder converts high-dimensional input data x into latent space variables h, which are encoded representations. The decoder reconstructs the latent space variables h back into the original dimensions of the output x′, which is an approximation of the input. A well-performing autoencoder will yield a reconstructed output x′ that closely resembles the original input x. The principle is illustrated in Figure 1.

The objective of an autoencoder is to train a network to learn data features and reproduce the original input as faithfully as possible. Leveraging this characteristic, autoencoders can be applied to rating prediction within recommendation systems. Consider a scenario with a sparse rating matrix R. Treating R as the input, it undergoes encoding to transform it into a latent space representation. The transition from the hidden layer to the output layer is executed by the decoder, which reconstructs the input from the latent space representation. Notably, autoencoders possess the property of having input and output dimensions equal, and they learn to reconstruct a user’s ratings using the acquired latent space representation.

This utilization of autoencoders proves beneficial in tackling the issue of sparse data in recommendation systems, as they can effectively learn meaningful latent representations from the input data and aid in accurate rating predictions.

The basic form of an autoencoder is as follows, where f(x) represents the encoding function and g(x) represents the decoding function:

x \overset{f (x)}{\to} h \overset{g (x)}{\to} x^{'}

(4)

During the training process, the constraint conditions are:

x \approx x^{'}

(5)

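To ensure that the input and output of the autoencoder closely match, a loss function is defined on the discrepancy between the input x and the reconstruction x′, where the reconstruction is obtained by applying the decoder to the encoder output: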

x^{'} = g (f (x))

(6)
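The encode, decode, and compare cycle of Equations (4)–(6) can be sketched with a single-layer encoder and decoder. The layer sizes, the sigmoid activations, the random weights, and the mean squared error loss are assumptions chosen for illustration; the text does not fix any of them.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(x, W_enc, b_enc):
    """f: map the input x to the latent code h."""
    return sigmoid(W_enc @ x + b_enc)

def decode(h, W_dec, b_dec):
    """g: map the latent code h back to the input dimension."""
    return sigmoid(W_dec @ h + b_dec)

def reconstruction_loss(x, x_rec):
    """Mean squared error between the input and its reconstruction."""
    return float(np.mean((x - x_rec) ** 2))

# Hypothetical sizes: 6 items per user, 3 latent units
n_in, n_hidden = 6, 3
W_enc = rng.normal(scale=0.1, size=(n_hidden, n_in))
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(scale=0.1, size=(n_in, n_hidden))
b_dec = np.zeros(n_in)

x = rng.uniform(size=n_in)                              # ratings scaled to [0, 1]
x_rec = decode(encode(x, W_enc, b_enc), W_dec, b_dec)   # x' = g(f(x))
loss = reconstruction_loss(x, x_rec)
```

Training would adjust the weights by gradient descent so that `loss` shrinks; only the forward pass is shown here.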

2.3. Matrix Factorization

Matrix factorization [21,22,23,24,25] is based on the concept of introducing implicit features onto the foundation of a “co-occurrence matrix”. It translates the user-item rating matrix into implicit user ratings and implicit item ratings, ultimately yielding the final rating through a combination of these implicit ratings. This approach enhances the capability to handle data sparsity effectively.

The fundamental principle of matrix factorization algorithms involves decomposing the rating matrix into two interacting low-dimensional matrices, corresponding to the latent rating features of users and items, such that M = U^TV, where M, U, and V denote the rating matrix, the implicit user rating matrix, and the implicit item rating matrix, respectively. The matrix factorization model diagram is depicted in Figure 2.

The prediction of a user’s rating for a specific item is derived from the interaction between the user’s latent rating features and the latent rating features of the corresponding item. For instance, consider that the latent rating vector for user u after factorization is u = [1, 1, 2], and the latent rating vector for item v is v = [1, 0, 1]^T. The three features of the user u signify preferences for action, romance, and comedy genres, respectively. The item vector v indicates that the movie has an action component of magnitude 1, no romance component (magnitude 0), and a comedy component of magnitude 1. The user’s predicted rating for the item is obtained by taking the inner product of these vectors, that is, u·v = 1×1 + 1×0 + 2×1 = 3.

Utilizing the feature vectors of users and items for rating prediction enables the attainment of enhanced prediction accuracy and concurrently mitigates the adverse impact of data sparsity.
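The inner-product prediction can be checked numerically. The first part reproduces the example vectors u = [1, 1, 2] and v = [1, 0, 1]^T, whose inner product 1×1 + 1×0 + 2×1 evaluates to 3; the second part applies M = U^TV to small feature matrices whose additional columns are invented purely for illustration.

```python
import numpy as np

# Example vectors: preferences / components for action, romance, comedy
u = np.array([1, 1, 2])
v = np.array([1, 0, 1])
rating = int(u @ v)  # inner product of user and item latent vectors

# Whole-matrix form M = U^T V, with latent vectors stored as columns.
# The second user and the extra items are hypothetical.
U = np.array([[1, 0],
              [1, 2],
              [2, 1]])          # D = 3 latent features x 2 users
V = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])       # D = 3 latent features x 3 items
M_pred = U.T @ V                # (2 users) x (3 items) predicted ratings
```

Entry `M_pred[0, 0]` is the same user-item pair as `rating`, so the two values agree.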

3. Methodology

This paper introduces a recommendation algorithm based on an improved Conditional Variational Autoencoder and Constrained Probabilistic Matrix Factorization. The recommendation process is shown in Figure 3.

Firstly, enhancements are made to the CVAE. A hidden layer is added on top of the CVAE architecture, and random noise is introduced into the hidden layer to provide constraints. This serves to enhance both the model’s resilience against interference and its generative capabilities.

Secondly, CVAE incorporates item categories as auxiliary information to supervise the encoding and decoding of item data. This utilization of category information enables the reconstruction of missing values in the rating data, effectively reducing the sparsity of the rating matrix.

Subsequently, the reconstructed data is subjected to the CPMF algorithm. This step involves mining the latent implicit rating features of users and items, thus optimizing the process of feature extraction.

Finally, the multiplication of user and item feature matrices yields the predicted ratings of users for items, thereby accomplishing the objective of item recommendation.

3.1. Improved Data Reconstruction for CVAE

The Improved CVAE introduces external auxiliary information into the encoding and decoding layers of the Variational Autoencoder (VAE) to supervise the encoding and decoding of item data. By reconstructing missing values in the rating data, it effectively reduces the sparsity of the rating matrix. The processing through latent layers enhances the efficiency of guiding sample outputs. The improved CVAE model is illustrated in Figure 4.

In the description provided:

X represents the original input data.

h_hid represents the newly added latent layer.

Y represents the data processed through the latent layers.

h_inf represents the inference neural network of the encoding stage’s latent space.

μ and ρ represent the mean and variance of the latent space distribution.

z represents the implicit representation of input data in the latent space layer.

h_gen represents the neural network guiding the output of new samples in the decoding stage.

Y′ represents the output data processed by the model.

c represents the auxiliary information of selecting item categories to guide the encoding and decoding networks.

In this study, a latent layer is introduced before the encoding layer of the CVAE, and random noise is injected into the latent layer to enhance the model’s resistance to interference. By processing input data through this latent layer, more accurate implicit features can be obtained, thereby further improving the model’s generative capabilities. Assuming the input variable is denoted as x and x ∈ X, the transformed data y is obtained through processing x in the latent layer. Formula (7) represents the processing function of the latent layer:

J_{θ} (y | x) = f (y; x, θ)

(7)

where f represents the sigmoid non-linear transformation function, and θ stands for the parameters of the encoder, representing random noise.
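One possible reading of Equation (7) is a sigmoid layer whose pre-activation is perturbed by random noise. The sketch below follows that reading; the additive Gaussian form of the noise, the `noise_std` parameter, and the names `W` and `b` are assumptions, since the text does not state how the noise enters the layer.

```python
import numpy as np

def noisy_hidden_layer(x, W, b, noise_std=0.1, rng=None):
    """Transform input x into y with a sigmoid layer and injected noise.

    x: (n_in,) input rating vector
    W: (n_out, n_in) weights, b: (n_out,) bias (hypothetical parameters)
    noise_std: standard deviation of the additive Gaussian noise (assumed)
    """
    rng = np.random.default_rng() if rng is None else rng
    pre_activation = W @ x + b
    # Random noise perturbs the pre-activation before the non-linearity
    noise = rng.normal(scale=noise_std, size=pre_activation.shape)
    return 1.0 / (1.0 + np.exp(-(pre_activation + noise)))
```

Setting `noise_std=0` recovers a plain deterministic sigmoid layer, which is a convenient way to switch the noise off at prediction time.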

During the training phase of CVAE, q_{ϕ}(z ∣ y, c) denotes the probability distribution of z in the learned latent space, while p_{θ}(z ∣ y, c) represents the implicit probability distribution of z in CVAE, referred to as the “posterior probability.” The Kullback-Leibler (KL) divergence is employed as a metric to quantify the similarity between the two distributions, which can be expressed as follows:

\begin{array}{l} D_{KL}(q_{ϕ}(z \mid y, c) \,\|\, p_{θ}(z \mid y, c)) \\ = E_{q_{ϕ}(z \mid y, c)} \log \frac{q_{ϕ}(z \mid y, c)}{p_{θ}(z \mid y, c)} \\ = E_{q_{ϕ}(z \mid y, c)} \log \frac{q_{ϕ}(z \mid y, c)}{p_{θ}(z, y, c)} + E_{q_{ϕ}(z \mid y, c)} \log p_{θ}(y, c) \\ = E_{q_{ϕ}(z \mid y, c)} \left( \log \frac{q_{ϕ}(z \mid y, c)}{p_{θ}(z, y, c)} + \log p_{θ}(y, c) \right) \end{array}

(8)

where D_{KL}(q_{ϕ}(z ∣ y, c) ∥ p_{θ}(z ∣ y, c)) represents the relative entropy between the approximate distribution and the posterior probability. A smaller relative entropy indicates a closer approximation between the distribution q_{ϕ}(z ∣ y, c) and the posterior probability p_{θ}(z ∣ y, c). To generate authentic samples, it is necessary to maximize the probability of output samples and minimize the relative entropy. Therefore, the following objective is obtained:

\begin{array}{l} L(θ, ϕ, y, c) = - E_{q_{ϕ}(z \mid y, c)} \log \frac{q_{ϕ}(z \mid y, c)}{p_{θ}(z, y, c)} \\ = E_{q_{ϕ}(z \mid y, c)} \log p_{θ}(z, y, c) - E_{q_{ϕ}(z \mid y, c)} \log q_{ϕ}(z \mid y, c) \\ = E_{q_{ϕ}(z \mid y, c)} \log p_{θ}(y \mid z, c) + E_{q_{ϕ}(z \mid y, c)} \log p_{θ}(z, c) - E_{q_{ϕ}(z \mid y, c)} \log q_{ϕ}(z \mid y, c) \\ = E_{q_{ϕ}(z \mid y, c)} \log p_{θ}(y \mid z, c) - D_{KL}(q_{ϕ}(z \mid y, c) \,\|\, p_{θ}(z, c)) \end{array}

(9)

where L(θ, ϕ, y, c) represents the variational lower bound of CVAE. When the variational lower bound is maximized, the value of D_{KL}(q_{ϕ}(z ∣ y, c) ∥ p_{θ}(z ∣ y, c)) is minimized. Therefore, L(θ, ϕ, y, c) is utilized as the optimization objective for training CVAE.
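The two terms of the variational lower bound in Equation (9) have a simple closed form under common modeling choices. The sketch below assumes a diagonal Gaussian approximate posterior parameterized by a mean and a log-variance, a standard normal prior on z, and a unit-variance Gaussian likelihood (so the reconstruction term reduces to a squared error up to a constant). These are assumptions for illustration and the paper may use different distributions. The encoder and decoder networks, which would take the category information c as an extra input, are not shown.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * float(np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def negative_lower_bound(y, y_rec, mu, log_var):
    """Negative variational lower bound to be minimized.

    The reconstruction term stands in for -log p(y | z, c) under the
    assumed unit-variance Gaussian likelihood (constant terms dropped).
    """
    reconstruction = 0.5 * float(np.sum((y - y_rec) ** 2))
    return reconstruction + kl_to_standard_normal(mu, log_var)
```

Minimizing `negative_lower_bound` is equivalent to maximizing L(θ, ϕ, y, c) under these assumptions: the first term rewards faithful reconstructions and the second keeps the learned latent distribution close to the prior.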

3.2. CPMF

When dealing with sparse rating matrices, the features of users with fewer ratings tend to approach the prior mean, resulting in predicted ratings gravitating toward the average value. Therefore, in this study, CPMF is employed to handle sparse matrices and impose constraints on users with fewer ratings. This approach effectively enhances prediction accuracy and optimizes feature extraction outcomes. The model diagram of CPMF is illustrated in Figure 5.

Where:

M represents the interaction matrix generated between users and items.

U signifies the matrix of implicit space features for users.

V signifies the matrix of implicit space features for items.

W is the user constraint matrix.

I represents user rating information.

Y signifies the user compensation matrix.

Initially, the constraint matrix W enforces constraints on user features based on user rating information. The result is then combined with the compensation matrix Y to derive the matrix of implicit space features for users, U. Finally, the user implicit space feature matrix U and the item implicit space feature matrix V jointly generate the interaction matrix M between users and items.

Given N users and G items, the interaction matrix M has size N × G, and the implicit rating vectors of users and items have dimension D. M = U^TV represents the inner product of users and items in the implicit space. The latent space feature vector for user i can be represented as:

U_{i} = Y_{i} + \frac{\sum_{k = 1}^{G} I_{i k} W_{k}}{\sum_{k = 1}^{G} I_{i k}}

(10)

where I_ik is the indicator function, taking the value 1 when user i has rated item k, and 0 otherwise.
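Equation (10) amounts to adding, to each user's offset vector, the average of the constraint vectors of the items that user has rated. A minimal sketch follows; storing the vectors as rows and giving users with no ratings a zero constraint term are layout and edge-case assumptions of the example.

```python
import numpy as np

def user_features(Y, W, I):
    """Compute U_i = Y_i + (sum_k I_ik W_k) / (sum_k I_ik) for every user.

    Y: (N, D) user compensation vectors, one row per user
    W: (G, D) constraint vectors, one row per item
    I: (N, G) indicator matrix, 1 where user i rated item k, else 0
    """
    Y = np.asarray(Y, dtype=float)
    W = np.asarray(W, dtype=float)
    I = np.asarray(I, dtype=float)
    counts = I.sum(axis=1, keepdims=True)        # number of items each user rated
    avg_W = np.zeros_like(Y)
    # Average the constraint vectors of rated items; users without ratings
    # keep a zero constraint term (assumed, avoids division by zero)
    np.divide(I @ W, counts, out=avg_W, where=counts > 0)
    return Y + avg_W
```

Users who rated the same items share the same constraint term, which is how the model pulls the features of sparsely rated users toward those of users with similar rating behavior rather than toward the prior mean.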

Assuming that the error between the observation matrix M and the approximate rating matrix M′ follows a Gaussian distribution with a mean of 0 and variance σ², the conditional distribution of the rating matrix is given by:

\begin{array}{l} p(M \mid Y, V, W, σ^{2}) \\ = \prod_{i=1}^{N} \prod_{j=1}^{G} \left[ N\left( M_{ij} \mid g\left( \left[ Y_{i} + \frac{\sum_{k=1}^{G} I_{ik} W_{k}}{\sum_{k=1}^{G} I_{ik}} \right]^{T} V_{j} \right), σ^{2} \right) \right]^{I_{ij}} \end{array}

(11)

Assuming that the latent feature vectors of an item, user compensation matrices, and user constraint matrix also follow Gaussian distributions with a mean of 0, we have:

p(V \mid σ_{V}^{2}) = \prod_{j=1}^{G} N(V_{j} \mid 0, σ_{V}^{2} I)

(12)

p(Y \mid σ_{Y}^{2}) = \prod_{i=1}^{N} N(Y_{i} \mid 0, σ_{Y}^{2} I)

(13)

p(W \mid σ_{W}^{2}) = \prod_{k=1}^{G} N(W_{k} \mid 0, σ_{W}^{2} I)

(14)

Based on the objective function formula of Probabilistic Matrix Factorization [26], the loss function of CPMF can be derived as follows:

\begin{array}{l} J = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{G} I_{ij} \left( M_{ij} - g\left( \left[ Y_{i} + \frac{\sum_{k=1}^{G} I_{ik} W_{k}}{\sum_{k=1}^{G} I_{ik}} \right]^{T} V_{j} \right) \right)^{2} \\ + \frac{λ_{Y}}{2} \sum_{i=1}^{N} \| Y_{i} \|_{Fro}^{2} + \frac{λ_{V}}{2} \sum_{j=1}^{G} \| V_{j} \|_{Fro}^{2} \\ + \frac{λ_{W}}{2} \sum_{k=1}^{G} \| W_{k} \|_{Fro}^{2} \end{array}

(15)

where g(x) = 1/(1 + e^−x) is the logistic function, λ_Y = σ²/σ_{Y}^{2}, λ_V = σ²/σ_{V}^{2}, λ_W = σ²/σ_{W}^{2}, and \| x \|_{Fro}^{2} = x^Tx denotes the squared Frobenius norm.

Training is carried out using the stochastic gradient descent optimization algorithm [27] until convergence or reaching the maximum number of training iterations. The process is outlined as follows:

First, compute the gradient of the loss function for an observed rating M_ij:

\begin{array}{l} \frac{\partial J}{\partial Y_{i}} = - (M_{ij} - g_{ij}) g_{ij}^{'} V_{j} + λ_{Y} Y_{i} \\ = - e_{ij} g_{ij}^{'} V_{j} + λ_{Y} Y_{i} \end{array}

(16)

\begin{array}{l} \frac{\partial J}{\partial V_{j}} = - (M_{ij} - g_{ij}) g_{ij}^{'} U_{i} + λ_{V} V_{j} \\ = - e_{ij} g_{ij}^{'} U_{i} + λ_{V} V_{j} \end{array}

(17)

\begin{array}{l} \frac{\partial J}{\partial W_{k}} = - \frac{(M_{ij} - g_{ij}) g_{ij}^{'}}{\sum_{k=1}^{G} I_{ik}} V_{j} + λ_{W} W_{k} \\ = - \frac{e_{ij} g_{ij}^{'}}{\sum_{k=1}^{G} I_{ik}} V_{j} + λ_{W} W_{k} \end{array}

(18)

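where U_{i} = Y_{i} + \frac{\sum_{k=1}^{G} I_{ik} W_{k}}{\sum_{k=1}^{G} I_{ik}}, g_{ij} = g(U_{i}^{T} V_{j}), e_{ij} = M_{ij} - g_{ij} is the prediction error, and g_{ij}^{'} is the derivative of g evaluated at U_{i}^{T} V_{j}. Then, update the variables by stepping along the negative gradient: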

Y_{i} = Y_{i} - η \frac{\partial J}{\partial Y_{i}} = Y_{i} + η (e_{ij} g_{ij}^{'} V_{j} - λ_{Y} Y_{i})

(19)

V_{j} = V_{j} - η \frac{\partial J}{\partial V_{j}} = V_{j} + η (e_{ij} g_{ij}^{'} U_{i} - λ_{V} V_{j})

(20)

W_{k} = W_{k} - η \frac{\partial J}{\partial W_{k}} = W_{k} + η \left( \frac{e_{ij} g_{ij}^{'}}{\sum_{k=1}^{G} I_{ik}} V_{j} - λ_{W} W_{k} \right)

(21)

where η denotes the learning rate. The user-item rating matrix is then estimated by predicting scores with the matrix product M = U^TV.
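The loss of Equation (15) and the per-rating updates of Equations (16)–(21) can be sketched as below. Several details are assumptions of the example rather than statements from the text: vectors are stored as rows, ratings are taken to be rescaled to [0, 1] so that they are comparable with the sigmoid output, and the full regularization term is applied at every observed rating, which is a common stochastic simplification.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cpmf_loss(M, I, Y, V, W, lam_Y, lam_V, lam_W):
    """Regularized squared-error loss J over the observed ratings."""
    counts = np.maximum(I.sum(axis=1, keepdims=True), 1)  # guard empty rows
    U = Y + (I @ W) / counts                              # user features U_i
    err = I * (M - sigmoid(U @ V.T))                      # e_ij on observed entries
    reg = lam_Y * np.sum(Y ** 2) + lam_V * np.sum(V ** 2) + lam_W * np.sum(W ** 2)
    return 0.5 * float(np.sum(err ** 2)) + 0.5 * float(reg)

def cpmf_sgd_epoch(M, I, Y, V, W, lam_Y, lam_V, lam_W, eta):
    """One pass over the observed ratings, updating Y, V, W in place.

    M: (N, G) ratings in [0, 1]; I: (N, G) 0/1 indicator
    Y: (N, D), V: (G, D), W: (G, D) float arrays
    """
    for i, j in zip(*np.nonzero(I)):
        rated = np.nonzero(I[i])[0]              # items rated by user i
        U_i = Y[i] + W[rated].mean(axis=0)       # user feature vector
        g = sigmoid(U_i @ V[j])
        e = M[i, j] - g                          # prediction error e_ij
        dg = g * (1.0 - g)                       # derivative of the sigmoid
        grad_Y = -e * dg * V[j] + lam_Y * Y[i]
        grad_V = -e * dg * U_i + lam_V * V[j]
        # Each rated item's constraint vector receives an equal share
        grad_W = -(e * dg / len(rated)) * V[j] + lam_W * W[rated]
        Y[i] -= eta * grad_Y
        V[j] -= eta * grad_V
        W[rated] -= eta * grad_W
```

Calling `cpmf_sgd_epoch` repeatedly until `cpmf_loss` stops improving, or until a maximum number of epochs is reached, mirrors the stopping rule described above.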

3.3. Algorithm Flow

The algorithm presented in this study takes raw user ratings as input and generates predicted ratings as output. The detailed algorithmic steps are described as follows:

Input: Target user i, user rating matrix M.

Output: Predicted ratings for target user i.

Step 1: Incorporate the rating variable x from matrix M into the hidden layer processing of Equation (7), introducing random noise for constraint. This yields transformed data y.

Step 2: Introduce the category of items, denoted as c, as auxiliary information in CVAE. This supervises the encoding and decoding of item data. Input the transformed data y into the improved encoding part of CVAE for training. Obtain the probability distribution of latent space z and express its Kullback-Leibler (KL) divergence with the posterior probability using Equation (8). To maximize the output sample probability and minimize KL divergence, the modified CVAE training loss function is derived, as shown in Equation (9).

Step 3: Decode through the decoding part of CVAE to reconstruct missing values of rating data, ultimately obtaining the reconstructed rating matrix M₂.

Step 4: Initialize the user compensation matrix Y, constraint matrix W, and item latent space feature matrix V based on the reconstructed matrix M₂.

Step 5: Assuming that the error between the observed matrix M₂ and the approximated rating matrix M₂′ follows a Gaussian distribution with mean 0, the conditional distribution of the rating matrix is expressed in Equation (11).

Step 6: Based on the objective function of Probabilistic Matrix Factorization [26], the loss function of CPMF is derived, as presented in Equation (15).

Step 7: Train with the stochastic gradient descent optimization algorithm. First, compute the gradient of the loss function according to Equations (16)–(18). Then, update the variables along the negative gradient, as given in Equations (19)–(21).

Step 8: Halt training when the maximum number of iterations is reached or the loss function value stabilizes. Ultimately, derive the user latent space feature matrix U and the item latent space feature matrix V. Output the final user-item rating matrix M₃ = U^TV.

4. Experiment and Analysis

4.1. Experimental Data

The methodology presented in this study is evaluated using the MovieLens-100K and MovieLens-1M datasets. The MovieLens-100K dataset comprises 100,000 ratings provided by 943 users for 1682 movies. The MovieLens-1M dataset comprises 993,482 ratings provided by 6040 users for 3544 movies. Each dataset is split into training and testing sets at an 8:2 ratio. The movie data includes movie IDs, titles, and categories such as Action, Adventure, and others. The ratings in the dataset range from 1 to 5. Table 1 presents pertinent details about the datasets.
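An 8:2 split of rating records can be sketched as follows. The text does not say how the split was drawn, so the uniform random shuffle, the fixed seed, and the (user, item, rating) row layout are assumptions of this example.

```python
import numpy as np

def split_ratings(ratings, train_fraction=0.8, seed=0):
    """Randomly split rating records into training and testing sets.

    ratings: array-like of shape (n, 3) holding (user, item, rating) rows
             (an assumed layout).
    """
    ratings = np.asarray(ratings)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(ratings))          # shuffled row indices
    cut = int(round(train_fraction * len(ratings)))
    return ratings[order[:cut]], ratings[order[cut:]]
```

With 100,000 records this yields 80,000 training and 20,000 testing ratings.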

4.2. Evaluation Index

This study employs the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) metrics to assess recommendation quality. Both metrics quantify the disparities between predicted ratings and actual ratings, and lower MAE and RMSE values indicate better recommendation performance. Assuming the actual ratings are denoted as Q = {q₁, q₂, …, q_N} and the predicted ratings are R = {r₁, r₂, …, r_N}, where N is the number of ratings being evaluated, the formulas for calculating MAE and RMSE are as follows:

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} | q_{i} - r_{i} |

(22)

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (q_{i} - r_{i})^{2}}

(23)
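Both metrics are direct to compute with NumPy. The function names are chosen for this sketch, and the sample ratings in the usage note are made up for illustration.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    q = np.asarray(actual, dtype=float)
    r = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(q - r)))

def rmse(actual, predicted):
    """Root mean square error between actual and predicted ratings."""
    q = np.asarray(actual, dtype=float)
    r = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((q - r) ** 2)))
```

For actual ratings [4, 3, 5] and predictions [3, 3, 3], the absolute errors are 1, 0, and 2, so the MAE is 1 and the RMSE is the square root of 5/3. RMSE penalizes the single large error more heavily than MAE does.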

4.3. Experimental Results and Analysis

Initially, the parameters required for the improved CVAE are selected. Figure 6 and Figure 7 illustrate the curve of RMSE results for different epochs (training iterations) during the training process in MovieLens-100K and MovieLens-1M datasets.

Here, epoch stands for the number of training iterations. As the number of epochs increases, the RMSE of the improved CVAE gradually decreases. The best performance is reached at 180 epochs on MovieLens-100K and at 140 epochs on MovieLens-1M. Therefore, the number of epochs for the improved CVAE model is set to 180 and 140 for the two datasets, respectively.

Upon obtaining the reconstructed matrix using the improved CVAE, the CPMF matrix factorization is applied. Initially, various numbers of implicit rating factors are chosen to train the model. Figure 8, Figure 9, Figure 10 and Figure 11 display the RMSE and MAE values for different numbers of implicit rating factors in different datasets.

Here, the number of implicit factors is the dimension of the matrix factorization. Figure 8 and Figure 10 show that the training outcomes stabilize as the number of implicit factors increases, and that the optimal point for both RMSE and MAE is 40 on MovieLens-100K. Figure 9 and Figure 11 show that the optimal point is 15 on MovieLens-1M.

Then, the RMSE and MAE values of the algorithm were tested under different λ_Y, λ_V, and λ_W. λ_Y, λ_V, λ_W represent the regularization parameters of CPMF.

Figure 12, Figure 13, Figure 14 and Figure 15 show that the RMSE and MAE curves gradually decrease as λ increases and reach their optimum once λ reaches 10 on both datasets. Thus, the values of λ_Y, λ_V, and λ_W are all set to 10, which gives the best experimental performance.

Lastly, to demonstrate the effectiveness of the proposed model, four comparative algorithms are selected for evaluation: the Probabilistic Matrix Factorization (PMF) model proposed by Salakhutdinov et al. [26], the Collaborative Deep Learning (CDL) model proposed by Wang et al. [28], the Hybrid Deep Probabilistic Model (PHD) introduced by Liu et al. [29], and the Collaborative Topic Regression (CTR) model, which incorporates PMF, proposed by Wang et al. [30]. These algorithms are compared against the proposed model using the MovieLens-100K and MovieLens-1M datasets. The experimental results are presented in Figure 16, Figure 17, Figure 18 and Figure 19. Several experiments were performed, and the average values were taken as the final RMSE and MAE results, as shown in Table 2 and Table 3.

On MovieLens-100K, the proposed model converges at epoch 20. Relative to CDL, CTR, PHD, and PMF, it lowers RMSE by 0.0226, 0.0330, 0.0257, and 0.0512 and MAE by 0.0263, 0.0366, 0.0253, and 0.0533, respectively (Table 2). These absolute differences correspond to relative reductions of about 2.45%, 3.53%, 2.77%, and 5.38% in RMSE and 3.51%, 4.82%, 3.38%, and 6.87% in MAE. On MovieLens-1M, the proposed model converges at epoch 30. Relative to CDL, CTR, PHD, and PMF, RMSE is lower by 0.0439, 0.0549, 0.0422, and 0.0599 and MAE by 0.0328, 0.0417, 0.0308, and 0.0597, respectively (Table 3), corresponding to relative reductions of about 4.92%, 6.08%, 4.74%, and 6.60% in RMSE and 4.57%, 5.73%, 4.30%, and 8.01% in MAE. The final RMSE and MAE values show that the proposed model outperforms all four baselines on both datasets.

Overall, the proposed model achieves the lowest RMSE and MAE among the compared algorithms on both datasets, which indicates that it can effectively improve recommendation performance.
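The gaps between the proposed model and each baseline can be recomputed directly from Table 2 and Table 3. The sketch below copies the table values and reports, for every baseline, both the absolute difference and the relative reduction; the helper name `reductions` and the dictionary layout are illustrative choices.

```python
# (RMSE, MAE) pairs copied from Table 2 and Table 3.
RESULTS = {
    "MovieLens-100K": {
        "CDL": (0.9234, 0.7485),
        "CTR": (0.9338, 0.7588),
        "PHD": (0.9265, 0.7475),
        "PMF": (0.9520, 0.7755),
        "This Paper": (0.9008, 0.7222),
    },
    "MovieLens-1M": {
        "CDL": (0.8922, 0.7183),
        "CTR": (0.9032, 0.7272),
        "PHD": (0.8905, 0.7163),
        "PMF": (0.9082, 0.7452),
        "This Paper": (0.8483, 0.6855),
    },
}


def reductions(baseline, proposed):
    """Return (absolute difference, relative reduction in percent)."""
    diff = baseline - proposed
    return diff, 100.0 * diff / baseline


for dataset, rows in RESULTS.items():
    ours_rmse, ours_mae = rows["This Paper"]
    for name, (rmse, mae) in rows.items():
        if name == "This Paper":
            continue
        d_rmse, r_rmse = reductions(rmse, ours_rmse)
        d_mae, r_mae = reductions(mae, ours_mae)
        print(f"{dataset} {name}: RMSE -{d_rmse:.4f} ({r_rmse:.2f}%), "
              f"MAE -{d_mae:.4f} ({r_mae:.2f}%)")
```

Keeping the two quantities separate avoids ambiguity: for CDL on MovieLens-100K, for instance, the absolute RMSE difference is 0.9234 − 0.9008 = 0.0226, which is about 2.45% of the CDL value.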

5. Conclusions

This paper introduces a recommendation algorithm based on an improved Conditional Variational Autoencoder (CVAE) and Constrained Probabilistic Matrix Factorization (CPMF). In this algorithm, a hidden layer is added to the CVAE architecture, and random noise is introduced into the hidden layer to enhance the model’s robustness against interference. Processing the item data through the hidden layer further enhances the model’s generative capability. Additionally, CVAE incorporates item categories as auxiliary information to supervise the encoding and decoding of item data. By learning the distribution features of the data, the algorithm reconstructs missing values in the rating data, effectively mitigating data sparsity.

Furthermore, the reconstructed data is processed by the CPMF algorithm, which optimizes feature extraction by mining the latent rating features of users and items. Finally, the predicted ratings are obtained through the multiplication of the user and item feature matrices. Experimental validation is conducted on the MovieLens-100K and MovieLens-1M datasets, and a comparison is made against four alternative algorithms. The experimental results demonstrate the superior recommendation performance of the proposed algorithm. This indicates that the proposed algorithm effectively addresses the poor recommendation performance that traditional user-based collaborative filtering algorithms often show due to data sparsity and suboptimal feature extraction.

However, the current algorithm uses only a single auxiliary information source to guide item outputs. In future research, the incorporation of multiple auxiliary information sources, such as knowledge graphs, attention mechanisms, and social networks, could be explored to enhance user and item features and further improve recommendation performance.

Author Contributions

Writing—original draft, Y.Z.; Writing—review & editing, H.X. and X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Fuzhou Talent Program of Jiangxi Province (2021ED008), and Jiangxi Key Laboratory of Cyberspace Security Intelligent Perception (JKLCIP202202).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found at https://grouplens.org/datasets/movielens/.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Wu, Z.; Tang, Y.; Liu, H. Survey of personalized learning recommendation. J. Front. Comput. Sci. Technol. 2022, 16, 21.
2. Awan, M.J.; Khan, R.A.; Nobanee, H.; Yasin, A.; Anwar, S.M.; Naseem, U.; Singh, V.P. A recommendation engine for predicting movie ratings using a big data approach. Electronics 2021, 10, 1215.
3. Yimu, J.; Ke, L.; Shangdong, L.; Qiang, L.; Haichang, Y.; Kui, L. Collaborative filtering recommendation algorithm based on interactive data classification. J. China Univ. Posts Telecommun. 2020, 27, 1.
4. Goyani, M.; Chaurasiya, N. A review of movie recommendation system: Limitations, Survey and Challenges. ELCVIA Electron. Lett. Comput. Vis. Image Anal. 2020, 19, 0018–0037.
5. Qian, Z.; Yang, J.; Li, D.; Ye, Z. Event Recommendation Strategy Combining User Long-Short Term Interest and Event Influence. J. Comput. Res. Dev. 2022, 59, 2803–2815.
6. Salakhutdinov, R.; Mnih, A.; Hinton, G. Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, 20–24 June 2007; ACM: New York, NY, USA, 2007; pp. 791–798.
7. Wang, X.; Wang, Y. Improving content-based and hybrid music recommendation using deep learning. In Proceedings of the 22nd ACM International Conference on Multimedia, Kaohsiung, Taiwan, 8–10 December 2014; ACM: New York, NY, USA, 2014; pp. 627–636.
8. Zhang, W.; Zhang, X.; Wang, H.; Chen, D. A deep variational matrix factorization method for recommendation on large scale sparse dataset. Neurocomputing 2019, 334, 206–218.
9. Ortega, F.; Lara-Cabrera, R.; González-Prieto, Á.; Bobadilla, J. Providing reliability in recommender systems through Bernoulli matrix factorization. Inf. Sci. 2021, 553, 110–128.
10. Chen, J.; Zhu, Y.; Zhou, G.; Cui, L.; Wu, S. Collaborative filtering recommendation based on transfer learning and joint matrix decomposition. J. Sichuan Univ. (Nat. Sci. Ed.) 2020, 57, 1096–1102.
11. Ma, C.; Li, J.; Pan, P.; Li, G.; Du, J. BDMF: A Biased Deep Matrix Factorization Model for Recommendation. In Proceedings of the 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Leicester, UK, 19–23 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1039–1045.
12. Pan, R.; Zhou, Y.; Cao, B.; Liu, N.N.; Lukose, R.; Scholz, M.; Yang, Q. One-class collaborative filtering. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 502–511.
13. Wu, Y.; Dubois, C.; Zheng, A.X.; Ester, M. Collaborative Denoising Auto-Encoders for Top-N Recommender Systems. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA, 22–25 February 2016; ACM: New York, NY, USA, 2016; pp. 153–162.
14. Li, X.; She, J. Collaborative variational autoencoder for recommender systems. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; ACM: New York, NY, USA, 2017; pp. 305–314.
15. Chen, Y.; Rijke, M.D. Top-N Recommendation with High-Dimensional Side Information via Locality Preserving Projection. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Tokyo, Japan, 7–11 August 2017; ACM: New York, NY, USA, 2017; pp. 985–988.
16. Na, L.; Ming-Xia, L.; Hai-Yang, Q.; Hao-Long, S. A hybrid user-based collaborative filtering algorithm with topic model. Appl. Intell. 2021, 51, 7946–7959.
17. Bank, D.; Koenigstein, N.; Giryes, R. Autoencoders. In Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook; Springer: Berlin/Heidelberg, Germany, 2023; pp. 353–374.
18. Pan, Y.; He, F.; Yu, H. Learning social representations with deep autoencoder for recommender system. World Wide Web 2020, 23, 2259–2279.
19. Niu, Y.; Su, Y.; Li, S.; Wan, S.; Cao, X. Deep adversarial autoencoder recommendation algorithm based on group influence. Inf. Fusion 2023, 100, 101903.
20. Tahmasebi, H.; Ravanmehr, R.; Mohamadrezaei, R. Social movie recommender system based on deep autoencoder network using Twitter data. Neural Comput. Appl. 2021, 33, 1607–1623.
21. Wang, H.; Hong, Z.; Hong, M. Research on product recommendation based on matrix factorization models fusing user reviews. Appl. Soft Comput. 2022, 123, 108971.
22. Liu, N.; Zhao, J. Recommendation system based on deep sentiment analysis and matrix factorization. IEEE Access 2023, 11, 16994–17001.
23. Zheng, X.; Guan, M.; Jia, X.; Guo, L.; Luo, Y. A Matrix Factorization Recommendation System-Based Local Differential Privacy for Protecting Users’ Sensitive Data. IEEE Trans. Comput. Soc. Syst. 2022, 10, 1189–1198.
24. Bin, S.; Sun, G. Matrix factorization recommendation algorithm based on multiple social relationships. Math. Probl. Eng. 2021, 2021, 6610645.
25. Zhang, H.; Ganchev, I.; Nikolov, N.S.; Ji, Z.; O’Droma, M. FeatureMF: An item feature enriched matrix factorization model for item recommendation. IEEE Access 2021, 9, 65266–65276.
26. Salakhutdinov, R.; Mnih, A. Probabilistic matrix factorization. Adv. Neural Inf. Process. Syst. 2008, 1257–1264.
27. Shi, J.; Wang, D.; Shang, F.; Zhang, H.Y. Research Advances on Stochastic Gradient Descent Algorithms. Acta Autom. Sin. 2021, 47, 2103–2119.
28. Wang, H.; Wang, N.; Yeung, D.Y. Collaborative deep learning for recommender systems. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; ACM: New York, NY, USA, 2015; pp. 1235–1244.
29. Liu, J.; Wang, D.; Ding, Y. PHD: A Probabilistic Model of Hybrid Deep Collaborative Filtering for Recommender Systems. In Proceedings of the Asian Conference on Machine Learning, Seoul, Republic of Korea, 15–17 November 2017; PMLR: New York, NY, USA, 2017; pp. 224–239.
30. Wang, C.; Blei, D.M. Collaborative Topic Modeling for Recommending Scientific Articles. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; ACM: New York, NY, USA, 2011; pp. 448–456.

Figure 1. Autoencoder model.

Figure 2. Matrix factorization model.

Figure 3. Recommendation process.

Figure 4. Improved CVAE model.

Figure 5. CPMF model.

Figure 6. RMSE values under different epochs in MovieLens-100K.

Figure 7. RMSE values under different epochs in MovieLens-1M.

Figure 8. RMSE values for different numbers of implicit factors in MovieLens-100K.

Figure 9. RMSE values for different numbers of implicit factors in MovieLens-1M.

Figure 10. MAE values for different numbers of implicit factors in MovieLens-100K.

Figure 11. MAE values for different numbers of implicit factors in MovieLens-1M.

Figure 12. RMSE values for different values of λ in MovieLens-100K.

Figure 13. RMSE values for different values of λ in MovieLens-1M.

Figure 14. MAE values for different values of λ in MovieLens-100K.

Figure 15. MAE values for different values of λ in MovieLens-1M.

Figure 16. RMSE values of each algorithm in different epochs in MovieLens-100K.

Figure 17. RMSE values of each algorithm in different epochs in MovieLens-1M.

Figure 18. MAE values of each algorithm in different epochs in MovieLens-100K.

Figure 19. MAE values of each algorithm in different epochs in MovieLens-1M.

Table 1. MovieLens-100K and MovieLens-1M datasets.

Dataset Information	MovieLens-100K	MovieLens-1M
Number of users	943	6040
Number of movies	1682	3544
Movie categories	unknown|Action|Adventure|Animation|Children’s|Comedy|Crime|Documentary…
Range of ratings	1–5
Number of ratings	100,000	993,482

Table 2. Results of Different Methods in MovieLens-100K.

Comparison Method	RMSE	MAE
CDL	0.9234	0.7485
CTR	0.9338	0.7588
PHD	0.9265	0.7475
PMF	0.9520	0.7755
This Paper	0.9008	0.7222

Table 3. Results of Different Methods in MovieLens-1M.

Comparison Method	RMSE	MAE
CDL	0.8922	0.7183
CTR	0.9032	0.7272
PHD	0.8905	0.7163
PMF	0.9082	0.7452
This Paper	0.8483	0.6855

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
