Article

Deep Matrix Factorization Based on Convolutional Neural Networks for Image Inpainting

by Xiaoxuan Ma, Zhiwen Li and Hengyou Wang
1 School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
2 School of Science, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(10), 1500; https://doi.org/10.3390/e24101500
Submission received: 7 September 2022 / Revised: 26 September 2022 / Accepted: 17 October 2022 / Published: 20 October 2022
(This article belongs to the Topic Recent Trends in Image Processing and Pattern Recognition)

Abstract

In this work, we formulate image in-painting as a matrix completion problem. Traditional matrix completion methods are generally based on linear models and assume that the matrix is low rank. When the original matrix is large and the observed elements are few, these methods easily over-fit and their performance decreases significantly. Recently, researchers have tried to apply deep learning and nonlinear techniques to matrix completion. However, most existing deep learning-based methods restore each column or row of the matrix independently, which loses the global structure of the matrix and therefore fails to achieve the expected results in image in-painting. In this paper, we propose a deep matrix factorization completion network (DMFCNet) for image in-painting that combines deep learning with a traditional matrix completion model. The main idea of DMFCNet is to map the iterative updates of variables in a traditional matrix completion model into a neural network of fixed depth. The potential relationships between the observed matrix data are learned in a trainable end-to-end manner, which leads to a high-performance and easy-to-deploy nonlinear solution. Experimental results show that DMFCNet provides higher matrix completion accuracy than state-of-the-art matrix completion methods in a shorter running time.

1. Introduction

Matrix completion (MC) [1,2,3,4,5] aims to recover a matrix from missing or incomplete data. It has been successfully applied to a wide range of signal processing and image analysis tasks, including collaborative filtering [6,7], image in-painting [8,9,10], image denoising [11,12], and image classification [13,14]. MC methods assume that the original matrix is low rank, so that the missing elements can be estimated by rank minimization. It should be noted that the rank minimization problem is generally non-convex and NP-hard [15]; a typical way to address this issue is to construct a convex approximation of the original non-convex objective function.
Existing approaches for solving the MC problem are mainly based on nuclear norm minimization (NNM) and matrix factorization (MF). The NNM approach [16,17,18] minimizes the sum of the matrix singular values, which is a convex relaxation of the matrix rank. Nuclear norm minimization can be solved by singular value thresholding (SVT) algorithms [19], inexact augmented Lagrange multiplier (IALM) methods [16], and alternating direction methods (ADM) [17,20]. One major disadvantage of the NNM approach is that a singular value decomposition (SVD) must be performed in each iteration of the optimization process, which is computationally very expensive when the matrix is large. To avoid this problem, matrix factorization (MF), which does not need the SVD, has been proposed to solve the MC problem [6,21,22,23]. Assuming that the rank of the original matrix is known, the MF approach decomposes the matrix into the product of a tall matrix and a wide matrix [21,24,25,26] and then reconstructs the missing elements from this low-rank representation. Low-rank matrix fitting (LMaFit) [21] was one of the earliest MF methods. Although LMaFit can obtain an exact solution, it is sensitive to the rank estimate and cannot be globally optimized because of its non-convex formulation.
Both the NNM and MF methods assume that the original matrix is low rank. Their performance degrades significantly when this assumption no longer holds and the data are generated by a nonlinear latent variable model [10,27,28,29]. Recently, encouraged by the remarkable success of deep learning in many computer vision and machine learning tasks [30,31,32,33], researchers have applied deep learning methods to nonlinear MC problems [28,29,30]. For example, the autoencoder-based collaborative filtering (AECF) approach [34] learns an autoencoder network that maps the input matrix into a latent space and then reconstructs the matrix by minimizing the reconstruction error. The deep-learning-based matrix completion (DLMC) method [28] learns a stacked autoencoder network with a nonlinear latent variable model. One major disadvantage of these deep learning-based methods is that they are unable to exploit the global structure of the matrix, which degrades their performance in matrix completion, especially in image analysis, where the global structure plays an important role in the restoration process.
In this paper, we propose a deep matrix factorization and completion network (DMFCNet) for matrix completion by coupling deep learning with traditional matrix completion methods. Our main idea is to use a neural network to simulate the iterative update of variables in the traditional matrix factorization process and learn the underlying relationship between input matrix data and the recovered output data after matrix completion in an end-to-end manner. We apply the proposed method to image in-painting to demonstrate its performance.
The main contributions of this paper can be summarized as follows.
(1) Compared with existing methods, our proposed method addresses the nonlinear data model problem faced by traditional MC methods, as well as the global structure problem in existing deep learning-based MC methods.
(2) The proposed method can be pre-trained to learn the global image structure and the underlying relationship between input matrix data with missing elements and the recovered output data. Once successfully trained, the network does not need to be optimized again for subsequent image in-painting tasks, thereby providing a high-performance and easy-to-deploy nonlinear matrix completion solution.
(3) To improve the performance of the proposed method, a new algorithm for pre-filling the missing elements of the image is proposed. This pre-filling method performs a global analysis of the matrix data to predict initial values for the missing elements, which improves the performance of matrix completion and image in-painting.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 presents our approach of deep matrix factorization and completion for image in-painting. Experimental results are presented in Section 4. Section 5 concludes the paper with a discussion of future research work.

2. Related Work

In this section, we review existing work related to our proposed method; in particular, we introduce the mathematical models of the low-rank Hankel matrix factorization (LRHMF) method [35] and the deep Hankel matrix factorization (DHMF) method [36]. LRHMF is a low-rank matrix factorization method that avoids singular value decomposition to achieve fast signal reconstruction. DHMF [36], inspired by LRHMF, recovers complex exponential signals by combining deep learning with Hankel matrix factorization. The image in-painting method proposed in this paper is inspired by both.
As shown in [35], the rank of the Hankel matrix equals the number of exponentials in the signal $x$, a vector of exponential functions. Thus, the low-rank Hankel matrix completion (LRHMC) problem can be solved by exploiting the low-rank property of the Hankel matrix. It can be formulated as:
$$\min_{x} \; \|\mathcal{R}x\|_* + \frac{\lambda}{2}\,\|y - \mathcal{U}x\|_2^2, \tag{1}$$
where $x$ is the signal to be recovered from the undersampled data $y$, $\mathcal{R}$ is the operator that maps the signal to the Hankel matrix $\mathcal{R}x$, $\mathcal{U}$ denotes the undersampling operator, and $\lambda$ is a balancing parameter. $\|\cdot\|_*$ is the nuclear norm of the matrix, which is used to restrict its rank, and the second term measures data consistency. However, solving this problem is very time-consuming because every iteration requires a singular value decomposition (SVD). To avoid this cost, the LRHMF method uses matrix factorization [37,38] instead of nuclear norm minimization. For any matrix $V$, the nuclear norm can be expressed as:
$$\|V\|_* = \min_{P,Q} \frac{1}{2}\left(\|P\|_F^2 + \|Q\|_F^2\right), \quad \text{s.t. } V = PQ^H, \tag{2}$$
where $P \in \mathbb{R}^{n_1 \times r}$, $Q \in \mathbb{R}^{n_2 \times r}$, $\|\cdot\|_F^2$ denotes the squared Frobenius norm of a matrix, and the superscript $H$ denotes the conjugate transpose. Substituting Equation (2) into optimization problem (1), the problem can be reformulated as:
$$\min_{x,P,Q} \frac{1}{2}\left(\|P\|_F^2 + \|Q\|_F^2\right) + \frac{\lambda}{2}\,\|y - \mathcal{U}x\|_2^2, \quad \text{s.t. } \mathcal{R}x = PQ^H. \tag{3}$$
Since the nuclear norm of $\mathcal{R}x$ is replaced by the Frobenius norms of its factors, computing the singular value decomposition is no longer necessary. To solve this problem effectively, LRHMF [35] adopts the alternating direction method of multipliers (ADMM), whose augmented Lagrangian function is:
$$\mathcal{L}(x,P,Q,D) = \frac{1}{2}\|P\|_F^2 + \frac{1}{2}\|Q\|_F^2 + \frac{\lambda}{2}\,\|y - \mathcal{U}x\|_2^2 + \langle D,\, \mathcal{R}x - PQ^H\rangle + \frac{\gamma}{2}\,\|\mathcal{R}x - PQ^H\|_F^2, \tag{4}$$
where $D$ denotes the Lagrange multiplier, $\langle\cdot,\cdot\rangle$ denotes the inner product, and $\lambda > 0$ and $\gamma > 0$ are balancing parameters.
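Before moving on, the factorization identity in Equation (2) can be checked numerically. The following minimal NumPy sketch (our illustration, not code from [35]) verifies that the balanced SVD factorization attains the nuclear norm:

```python
import numpy as np

# Numerical check of Equation (2): for a balanced factorization P = U*sqrt(S),
# Q = V*sqrt(S) of V = P Q^H, the objective (||P||_F^2 + ||Q||_F^2) / 2
# equals the nuclear norm ||V||_*.
rng = np.random.default_rng(0)
V = rng.standard_normal((40, 30)) @ rng.standard_normal((30, 50))  # rank <= 30

U, s, Vh = np.linalg.svd(V, full_matrices=False)
nuclear_norm = s.sum()

P = U * np.sqrt(s)            # left factor, columns scaled by sqrt(sigma_i)
Q = Vh.conj().T * np.sqrt(s)  # right factor

surrogate = 0.5 * (np.linalg.norm(P, "fro")**2 + np.linalg.norm(Q, "fro")**2)
assert np.allclose(P @ Q.conj().T, V)       # the constraint V = P Q^H holds
assert np.isclose(surrogate, nuclear_norm)  # the objective attains ||V||_*
```

Any feasible pair $(P, Q)$ with $V = PQ^H$ only upper-bounds $\|V\|_*$; the balanced factorization attains the bound, which is why minimizing the Frobenius terms in (3) and (4) acts as a surrogate for nuclear norm minimization.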
To solve the signal reconstruction problem, Huang et al. [36] gave the $k$-th iteration of the solution of (3) obtained by minimizing (4), as shown in (5). Based on this iterative formulation, they designed a deep Hankel matrix factorization network for fast signal reconstruction.
$$\begin{aligned}
x^{k+1} &= \left(\lambda\,\mathcal{U}^T\mathcal{U} + \gamma\,\mathcal{R}^*\mathcal{R}\right)^{-1}\left(\lambda\,\mathcal{U}^T y + \gamma\,\mathcal{R}^*\!\left(P^k (Q^k)^H - D^k\right)\right), \\
P^{k+1} &= \gamma\left(\mathcal{R}x^{k+1} + D^k\right) Q^k \left(\gamma\,(Q^k)^H Q^k + I\right)^{-1}, \\
Q^{k+1} &= \gamma\left(\mathcal{R}x^{k+1} + D^k\right)^H P^{k+1} \left(\gamma\,(P^{k+1})^H P^{k+1} + I\right)^{-1}, \\
D^{k+1} &= D^k + \tau^k\left(\mathcal{R}x^{k+1} - P^{k+1}(Q^{k+1})^H\right).
\end{aligned} \tag{5}$$

3. The Proposed Method

In this section, we construct a deep matrix factorization completion network (DMFCNet) for matrix completion and image in-painting. We derive the mathematical model for our DMFCNet method, discuss the network design, and then introduce two network structures based on different prediction methods for missing elements. Finally, we introduce the loss functions and explain the network training process.

3.1. Mathematical Model of the DMFCNet Method

The proposed DMFCNet is based on low-rank matrix factorization [35,36]. The optimization objective function of our proposed model can be formulated as
$$\min_{X} \; \|X\|_* + \frac{\lambda}{2}\,\|\Psi \odot (Y - X)\|_F^2, \tag{6}$$
where $Y \in \mathbb{R}^{m \times n}$ is the observation matrix whose missing elements are initialized to a predefined constant, $X \in \mathbb{R}^{m \times n}$ is the matrix to be recovered from $Y$, and $\lambda$ is a regularization parameter. $\|X\|_*$ is the nuclear norm of $X$, which is used to restrict the rank of $X$. $\|\Psi \odot (Y - X)\|_F^2$ denotes the reconstruction error of $Y$, where $\odot$ is the Hadamard product and $\Psi \in \{0,1\}^{m \times n}$ is a mask indicating the positions of missing data: if $Y$ is missing data at position $(i,j)$, the value of $\Psi_{ij}$ is 0; otherwise, it is 1. Replacing the traditional nuclear norm minimization with matrix factorization, the proposed model can be formulated as follows:
$$\min_{X,U,V} \frac{1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right) + \frac{\lambda}{2}\,\|\Psi \odot (Y - X)\|_F^2, \quad \text{s.t. } X = UV^T, \tag{7}$$
where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. The augmented Lagrangian function for (7) is given by
$$\mathcal{L}(X,U,V,S) = \frac{1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right) + \frac{\lambda}{2}\,\|\Psi \odot (Y - X)\|_F^2 + \langle S,\, X - UV^T\rangle + \frac{\eta}{2}\,\|X - UV^T\|_F^2. \tag{8}$$
Here, $\eta > 0$ is the penalty parameter and $S \in \mathbb{R}^{m \times n}$ is the Lagrange multiplier corresponding to the constraint $X = UV^T$. Since it is difficult to solve for $U$, $V$, $S$ and $X$ simultaneously in (8), following the idea of the alternating direction method of multipliers (ADMM), we minimize the Lagrangian function with respect to one block variable at a time while fixing the other blocks at their latest values. The proposed optimization process thus becomes:
$$\begin{aligned}
U^{k+1} &= \arg\min_{U \in \mathbb{R}^{m \times r}} \mathcal{L}(X^k, U, V^k, S^k), \\
V^{k+1} &= \arg\min_{V \in \mathbb{R}^{n \times r}} \mathcal{L}(X^k, U^{k+1}, V, S^k), \\
S^{k+1} &= S^k + \mu\left(X^k - U^{k+1}(V^{k+1})^T\right), \\
X^{k+1} &= \arg\min_{X \in \mathbb{R}^{m \times n}} \mathcal{L}(X, U^{k+1}, V^{k+1}, S^{k+1}),
\end{aligned} \tag{9}$$
where μ > 0 is the step size of the optimization process.
Solving this problem directly with traditional algorithms has many limitations, so we propose to solve it with a deep learning approach. The main idea is to update the variables using neural network modules. As shown in Figure 1, we construct a deep neural network based on (9), with the three updating modules shown in Figure 1b. A complete restoration module contains a U updating module and a V updating module for updating the matrices $U$ and $V$, together with an X updating module for restoring the incomplete matrix.

3.1.1. U and V Updating Modules

In our proposed DMFCNet method, the input matrix is first processed by the U and V updating modules. According to the analysis in [36], $U$ and $V$ are updated as follows:
$$\begin{aligned}
U_1 &= \eta\left(X_0 + S_0\right) V_0 \left(\eta\,V_0^T V_0 + I\right)^{-1}, \\
V_1 &= \eta\left(X_0 + S_0\right)^T U_1 \left(\eta\,U_1^T U_1 + I\right)^{-1}.
\end{aligned} \tag{10}$$
Note that the variables $(X_0 + S_0)V_0$ and $V_0$ appear in the update formula of $U$, so we add them to the input of the U updating module. To allow the network to learn richer convolutional features, $U_0$ is also added as an input. The auxiliary matrix variable $S_0$ in (10) is initialized as a zero matrix and can therefore be removed from the U and V updating modules. Based on (10), the U updating module concatenates the variables $X_0 V_0$, $V_0$ and $U_0$ along the channel dimension as its input and updates the variable with a convolutional neural network. The V matrix is updated by a similar procedure.
Once $U_1$ is updated, we concatenate $X_0^T U_1$, $U_1$ and $V_0$ channel-wise to obtain $V_1$. The update formulas for the $U$ and $V$ matrices are thus:
$$U_1 = \mathcal{C}_U(X_0 V_0,\, V_0,\, U_0), \qquad V_1 = \mathcal{C}_V(X_0^T U_1,\, U_1,\, V_0), \tag{11}$$
where $\mathcal{C}$ denotes a convolutional neural network.
We observe that the final matrix recovery performance is sensitive to the initialization of $U$ and $V$. To address this issue, we perform the following SVD of $X_0 \in \mathbb{R}^{m \times n}$ to initialize them:
$$X_0 = U\Sigma V^T, \qquad \Sigma = \mathrm{diag}\left(\{\tilde{\sigma}_i\}_{1 \le i \le d}\right), \tag{12}$$
where $\Sigma \in \mathbb{R}^{d \times d}$ is a diagonal matrix with $\tilde{\sigma}_1, \ldots, \tilde{\sigma}_d$ on the diagonal and zeros elsewhere, $d = \min(m, n)$, and $\tilde{\sigma}_i > 0$ is the $i$-th singular value of the matrix $X_0$. $U \in \mathbb{R}^{m \times d}$ and $V \in \mathbb{R}^{n \times d}$ contain the left and right singular vectors, respectively. Then, $U_0 \in \mathbb{R}^{m \times r}$ and $V_0 \in \mathbb{R}^{n \times r}$ are initialized by
$$U_0 = \tilde{U}\sqrt{\tilde{\Sigma}}, \qquad V_0 = \tilde{V}\sqrt{\tilde{\Sigma}}, \tag{13}$$
where $\tilde{U} \in \mathbb{R}^{m \times r}$ consists of the first $r$ columns of $U$, $\tilde{V} \in \mathbb{R}^{n \times r}$ consists of the first $r$ columns of $V$, and $\tilde{\Sigma} \in \mathbb{R}^{r \times r}$ is the leading $r \times r$ block of $\Sigma$. In this paper, we set $m = n$.
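A minimal NumPy sketch of this initialization (our illustration; the $\sqrt{\tilde{\Sigma}}$ scaling follows (13) so that $U_0 V_0^T$ is the best rank-$r$ approximation of $X_0$):

```python
import numpy as np

def init_uv(X0: np.ndarray, r: int):
    """Initialize U0 (m x r) and V0 (n x r) from the pre-filled matrix X0 by
    truncated SVD, following Equations (12)-(13)."""
    U, s, Vh = np.linalg.svd(X0, full_matrices=False)
    scale = np.sqrt(s[:r])         # square roots of the r leading singular values
    U0 = U[:, :r] * scale          # first r left singular vectors, scaled
    V0 = Vh[:r, :].T * scale       # first r right singular vectors, scaled
    return U0, V0                  # U0 @ V0.T is the best rank-r approximation of X0

U0, V0 = init_uv(np.random.rand(256, 256), r=50)  # r = 50, as used in Section 3.4
```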
To retain as much information as possible during matrix completion, a dense convolutional structure is used in the network, and a residual connection is added to improve the stability of training. The Mish function is chosen as the activation function because it is smooth at almost all points, which allows more information to flow through the neural network. Batch normalization (BN) layers are inserted between convolution layers to speed up convergence.
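For illustration, a PyTorch sketch of what one $\mathcal{C}_U$ updating module might look like is given below; the layer widths, depth, and the plain sequential layout are our simplifications of Figure 1c, not the exact architecture, and it relies on $m = n$ so that $X_0 V_0$, $V_0$ and $U_0$ share the same shape:

```python
import torch
import torch.nn as nn

class UUpdateModule(nn.Module):
    """Illustrative C_U: stacks X0 @ V0, V0 and U0 (each m x r, with m = n) as
    three input channels and regresses the updated U1 with a residual connection."""
    def __init__(self, width: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.BatchNorm2d(width), nn.Mish(),
            nn.Conv2d(width, width, 3, padding=1), nn.BatchNorm2d(width), nn.Mish(),
            nn.Conv2d(width, 1, 3, padding=1),
        )

    def forward(self, X0V0, V0, U0):
        x = torch.stack([X0V0, V0, U0], dim=1)  # (B, 3, m, r): channel-wise concatenation
        return U0 + self.body(x).squeeze(1)     # residual update, output shape (B, m, r)

# Hypothetical usage with batched tensors X0 (B, n, n) and V0, U0 (B, n, r):
# U1 = UUpdateModule()(X0 @ V0, V0, U0)
```

In the actual network, the dense connections of Figure 1c replace this plain sequential body.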

3.1.2. X Updating Module

After obtaining $U_1$ and $V_1$ from the U and V updating modules, the Lagrange multiplier $S_1$ is updated using the following formula:
$$S_1 = \mu\left(X_0 - U_1 V_1^T\right). \tag{14}$$
These quantities are then fed into the X updating module, and $\hat{X}_1$ is obtained by the following equation:
$$\hat{X}_1 = U_1 V_1^T - S_1. \tag{15}$$
To improve the reconstruction performance, we further process $\hat{X}_1$ with an autoencoder network. As shown in Figure 1b, this network contains four convolution layers, a batch normalization module, and a final activation layer with the tanh function. For image in-painting applications, to enhance the smoothness of the recovered image, we incorporate the following weighted averaging operation into the network:
$$X_1 = (1 - \Psi)\odot\tilde{X}_1 + \frac{\Psi\odot\tilde{X}_1 + \gamma\,\Psi\odot X_0}{1 + \gamma}, \tag{16}$$
where $X_0$ is the initial matrix and $\gamma$ is a weighting parameter. When a pixel value is missing at a location in the image, the network output is assigned directly to that location. Otherwise, the final reconstructed pixel value is a weighted average of the network output and the corresponding pixel of the input image.
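The merge in Equation (16) is a single vectorized operation; a NumPy sketch (our illustration, with $\gamma = 10$ as in Section 3.4):

```python
import numpy as np

def merge(X_tilde: np.ndarray, X0: np.ndarray, mask: np.ndarray, gamma: float = 10.0):
    """Equation (16): take the network output at missing pixels (mask == 0) and a
    gamma-weighted average of output and observation at observed pixels (mask == 1)."""
    return (1 - mask) * X_tilde + mask * (X_tilde + gamma * X0) / (1 + gamma)
```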

3.2. Pre-Filling

Note that the initialization of $U$ and $V$ in the network is obtained from the original incomplete matrix via SVD, so the missing entries of the matrix must be filled with predefined constants before the singular value decomposition. However, the network is extremely sensitive to these pre-filled constants, and improper filling directly degrades the in-painting performance.
To reduce the effect of filling with arbitrary constants, we first obtain $X_0$ by replacing the missing values of the observation matrix with a predefined constant such as 255. Then, the singular value decomposition of $X_0$ yields $U_0$ and $V_0$, which are fed into the restoration module for a preliminary restoration $X_1$, and the output matrix $X_{new}$ is computed by:
$$X_{new} = (1 - \Psi)\odot X_1 + \Psi\odot X_0. \tag{17}$$
This step is the restoration module's preliminary inference of the missing values. The predicted values are filled into the missing positions of the observation matrix, and the filled $X_{new}$ is then used as the input to a second restoration module. Since $U_0$ and $V_0$ in the second restoration module are obtained by the singular value decomposition of the new $X_0$, the negative effects of random constant filling are largely eliminated, and the second restoration module obtains better restoration results. Algorithm 1 summarizes the DMFCNet-1 algorithm with network-based pre-filling.
Algorithm 1 DMFCNet-1
Require: $X_\Omega$: original incomplete image matrix; $\Omega$: the positions of the observed entries; non-negative parameters $r$, $\mu$ and $\lambda$.
Ensure: the restored matrix $X_{new}$.
1: Init: $X_0 \in \mathbb{R}^{m \times n}$: the matrix obtained by replacing the missing values of $X_\Omega$ with the constant 255; $\Psi \in \{0,1\}^{m \times n}$: $\Psi_{ij} = 1$ if $(i,j) \in \Omega$, $\Psi_{ij} = 0$ if $(i,j) \notin \Omega$;
2: for $i = 1:2$ do
3:   Compute $U_0 \in \mathbb{R}^{m \times r}$ and $V_0 \in \mathbb{R}^{n \times r}$ from $X_0$ using (12) and (13);
4:   $U_1 \leftarrow \mathcal{C}_U(X_0 V_0, V_0, U_0)$;
5:   $V_1 \leftarrow \mathcal{C}_V(X_0^T U_1, U_1, V_0)$;
6:   $\hat{X}_1 \leftarrow U_1 V_1^T - \mu\,(X_0 - U_1 V_1^T)$;
7:   $\tilde{X}_1 \leftarrow \mathrm{AutoEncoder}(\hat{X}_1)$;
8:   $X_1 \leftarrow (1 - \Psi)\odot\tilde{X}_1 + (\Psi\odot\tilde{X}_1 + \lambda\,\Psi\odot X_0)/(1 + \lambda)$;
9:   $X_{new} \leftarrow (1 - \Psi)\odot X_1 + \Psi\odot X_0$;
10:  $X_0 \leftarrow X_{new}$;
11: end for
12: return $X_{new}$
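In pseudo-Python, the two-pass structure of Algorithm 1 reads as follows (a sketch for orientation; `restore_module` stands in for the trained U, V and X updating modules and is our naming, not the authors' API):

```python
import numpy as np

def dmfcnet1(X_obs: np.ndarray, mask: np.ndarray, restore_module, passes: int = 2):
    """Sketch of Algorithm 1: constant pre-fill, then two passes through the
    restoration module, re-filling the missing entries between passes."""
    X0 = np.where(mask == 1, X_obs, 255.0)    # step 1: replace missing values with 255
    for _ in range(passes):                   # steps 2-11
        X1 = restore_module(X0, mask)         # steps 3-8: U/V init by SVD, C_U, C_V,
                                              #            autoencoder and Eq. (16) merge
        X_new = (1 - mask) * X1 + mask * X0   # step 9: adopt predictions, keep observations
        X0 = X_new                            # step 10: predictions become the next pre-fill
    return X_new
```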
However, network-based pre-filling requires a singular value decomposition, which takes a relatively long time. To improve the running time, we exploit the structural characteristics of images and present a new pre-filling algorithm, called Nearest Neighbor Mean Filling (NNMF). It infers each missing value from the observed data near its location.
Suppose the data at a missing location $(i, j)$ need to be filled. Let $V_l^{ij}$ be the value of the first non-missing position reached when traversing left from $(i, j)$, $V_r^{ij}$ the first non-missing value to the right, and, similarly, $V_t^{ij}$ and $V_b^{ij}$ the first non-missing values above and below. The fill value $V^{ij}$ at location $(i, j)$ is then computed as follows:
$$V^{ij} = \frac{V_l^{ij} + V_r^{ij} + V_t^{ij} + V_b^{ij}}{4}. \tag{18}$$
Traversing every missing location and searching for the four values in turn is very time-consuming. We therefore design the calculation procedure shown in Figure 2, which efficiently computes the fill values at all missing locations by dynamic programming. As shown in Figure 2, the missing values at the image edges are first filled in a clockwise direction; four matrices are then generated by propagating values in the four directions; finally, the four matrices are averaged to obtain the filled matrix.
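For reference, a NumPy sketch of NNMF (our illustration) that propagates the nearest observed value in each of the four directions, in the spirit of the four-matrix scheme of Figure 2, and averages them per Equation (18); missing border entries with no observed neighbor in a given direction keep their pre-filled placeholder values, which takes the place of the clockwise edge fill described above:

```python
import numpy as np

def nnmf_fill(X: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Nearest Neighbor Mean Filling: each missing pixel (mask == 0) becomes the
    mean of the nearest observed values to its left, right, top and bottom,
    as in Equation (18)."""
    def propagate(Xf, Mf):
        # Carry the most recent observed value forward along each row (left to right).
        out = Xf.astype(float).copy()
        for j in range(1, out.shape[1]):
            missing = Mf[:, j] == 0
            out[missing, j] = out[missing, j - 1]
        return out

    left   = propagate(X, mask)                                   # nearest value to the left
    right  = propagate(X[:, ::-1], mask[:, ::-1])[:, ::-1]        # nearest value to the right
    top    = propagate(X.T, mask.T).T                             # nearest value above
    bottom = propagate(X.T[:, ::-1], mask.T[:, ::-1])[:, ::-1].T  # nearest value below
    filled = (left + right + top + bottom) / 4.0                  # Equation (18)
    return np.where(mask == 1, X, filled)
```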
Algorithm 2 summarizes the DMFCNet-2 algorithm, which uses NNMF for the pre-filling operation: the matrix obtained by pre-filling the observation matrix with NNMF is used as the input of the restoration module. Based on the two pre-filling methods, the network frameworks of the DMFCNet-1 and DMFCNet-2 algorithms are shown in Figure 1a. During training, only the weights of the convolutional networks $\mathcal{C}_U$ and $\mathcal{C}_V$ and of the autoencoder are optimized.
Algorithm 2 DMFCNet-2
Require: $X_\Omega$: original incomplete image matrix; $\Omega$: the positions of the observed entries; non-negative parameters $r$, $\mu$ and $\lambda$.
Ensure: the restored matrix $X_{new}$.
1: Init: $X_0 \in \mathbb{R}^{m \times n}$: the matrix obtained by pre-filling $X_\Omega$ with the NNMF algorithm; $\Psi \in \{0,1\}^{m \times n}$: $\Psi_{ij} = 1$ if $(i,j) \in \Omega$, $\Psi_{ij} = 0$ if $(i,j) \notin \Omega$;
2: Compute $U_0 \in \mathbb{R}^{m \times r}$ and $V_0 \in \mathbb{R}^{n \times r}$ from $X_0$ using (12) and (13);
3: $U_1 \leftarrow \mathcal{C}_U(X_0 V_0, V_0, U_0)$;
4: $V_1 \leftarrow \mathcal{C}_V(X_0^T U_1, U_1, V_0)$;
5: $\hat{X}_1 \leftarrow U_1 V_1^T - \mu\,(X_0 - U_1 V_1^T)$;
6: $\tilde{X}_1 \leftarrow \mathrm{AutoEncoder}(\hat{X}_1)$;
7: $X_1 \leftarrow (1 - \Psi)\odot\tilde{X}_1 + (\Psi\odot\tilde{X}_1 + \lambda\,\Psi\odot X_0)/(1 + \lambda)$;
8: $X_{new} \leftarrow (1 - \Psi)\odot X_1 + \Psi\odot X_0$;
9: return $X_{new}$

3.3. Loss Function

A general convolutional neural network is internally a black box, so it can only be optimized globally by constraining the final output of the network. In contrast, each variable in the interpretable network built here from the iterative model has practical significance. Therefore, in addition to constraining the final output $X_1$ of the restoration module, the loss function also constrains the intermediate variables of the module, which makes training more stable and efficient. The Frobenius norm is used to constrain the network variables, from which the loss function of a restoration module is derived as follows:
$$\mathcal{L}(\Theta) = \frac{1}{B}\sum_{b=1}^{B}\left(\|X_b(\Theta) - Y_b\|_F^2 + \alpha\,\|\hat{X}_b(\Theta) - Y_b\|_F^2 + \beta\,\|U_b(\Theta)V_b(\Theta)^T - Y_b\|_F^2\right), \tag{19}$$
where $\Theta$ denotes the network parameters of the restoration module, $B$ is the number of samples input to the network, and $\alpha$ and $\beta$ are regularization coefficients. $X_b$ denotes the output $X_1$ of the restoration module for the $b$-th sample, $\hat{X}_b$ is the autoencoder input $\hat{X}_1$ of the $b$-th sample in the X updating module, $U_b$ and $V_b$ are the corresponding outputs $U_1$ and $V_1$, and $Y_b$ is the complete image corresponding to the $b$-th sample.
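In PyTorch terms, Equation (19) might be rendered as follows (a sketch; tensor shapes and argument names are our assumptions, and $\alpha$, $\beta$ default to the values given in Section 3.4):

```python
import torch

def dmfcnet_loss(X1, X1_hat, U1, V1, Y, alpha: float = 0.1, beta: float = 0.01):
    """Equation (19): constrain the module output X1, the autoencoder input X1_hat,
    and the raw factorization U1 @ V1^T against the clean image Y.
    Shapes: X1, X1_hat, Y are (B, m, n); U1 is (B, m, r); V1 is (B, n, r)."""
    frob2 = lambda A: (A ** 2).sum(dim=(-2, -1))          # squared Frobenius norm per sample
    per_sample = (frob2(X1 - Y)
                  + alpha * frob2(X1_hat - Y)
                  + beta * frob2(U1 @ V1.transpose(-2, -1) - Y))
    return per_sample.mean()                              # (1/B) * sum over the batch
```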

3.4. Training

Because of its diversity, the VOC dataset [39,40] is selected as the training set so that the model can adapt to the recovery of more complex images. Each image is first converted into a grayscale image of size 256 × 256, and then some randomly chosen pixel values are replaced by 255.
The training hyperparameters are set as follows. The first 50 singular values are used when initializing the $U$ and $V$ matrices. Adam is chosen as the optimizer; the learning rate is set to $1 \times 10^{-3}$, reduced to $1 \times 10^{-4}$ after the loss stabilizes, and set to $1 \times 10^{-5}$ for global fine-tuning. In the X updating module, $\mu$ is set to $1 \times 10^{-3}$ and $\gamma$ to 10. The loss regularization coefficients $\alpha$ and $\beta$ are set to 0.1 and 0.01, respectively. The autoencoder in the X updating module contains three hidden layers with dimensions $(H/2, W/2, 32)$, $(H/4, W/4, 64)$ and $(H/2, W/2, 32)$.
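The staged learning-rate schedule could be set up in PyTorch roughly as follows (a sketch; the paper switches rates "after stabilization", whereas here the switch points are left to the caller, and `model` is a stand-in module):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 1, 3, padding=1)  # stand-in for the DMFCNet modules being trained
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial learning rate

def set_lr(opt: torch.optim.Optimizer, lr: float) -> None:
    for group in opt.param_groups:
        group["lr"] = lr

# ... train until the loss stabilizes, then refine:
set_lr(optimizer, 1e-4)
# ... and for the final global fine-tuning with all modules unfrozen:
set_lr(optimizer, 1e-5)
```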
To better target images with different missing rates, two models are trained for each of the DMFCNet-1 and DMFCNet-2 networks. The first model is trained on a dataset of images with 30% to 50% missing rates and is mainly used to recover images with missing rates of 50% and below. The second model is trained on images with 50% to 70% missing rates and is used to recover images in that range.
Specifically, DMFCNet-1 is trained one restoration module at a time. The first restoration module is trained, and its weights are frozen once training completes. The second restoration module is then added and trained, and the weights of the first restoration module are unfrozen for global fine-tuning after the second module finishes training. Figure 3 shows the loss convergence during training of the two models and the reconstruction results on the test data.

4. Experiments

In this section, we first compare the two versions of the DMFCNet model proposed in this paper on image in-painting tasks and then compare them with six popular matrix completion methods: matrix factorization (MF) by LMaFit [21], nuclear norm minimization (NNM) by IALM [16], truncated nuclear norm minimization (TNNM) by ADMM [8], the deep-learning-based DLMC method [28], the NC-MC method [41], and the LNOP method by ADMM [42]. The peak signal-to-noise ratio (PSNR) [43] and structural similarity (SSIM) [44] are used to evaluate the quality of the restored images.
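Both metrics are available in scikit-image; a minimal evaluation sketch (our code, not the authors'):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(restored: np.ndarray, reference: np.ndarray):
    """Return (PSNR in dB, SSIM) for two grayscale images with values in [0, 255]."""
    psnr = peak_signal_noise_ratio(reference, restored, data_range=255)
    ssim = structural_similarity(reference, restored, data_range=255)
    return psnr, ssim
```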

4.1. Datasets

In this part, we discuss how the dataset for training the model is selected. The proposed method requires pre-training the network parameters, which demands a large amount of training data. We want the algorithm to handle not only simple low-rank images but also more complex ones. Therefore, two datasets are chosen to train the model and to test the effect of the dataset on image restoration performance. The first is the CelebFaces Attributes Dataset (CelebA) [45], a large-scale face attribute dataset with over 200,000 celebrity images; these images contain some degree of pose variation but remain relatively simple and homogeneous overall. The second is the VOC dataset [39,40], which is more diverse and includes simple low-rank images as well as complex images.
Both datasets are used to train the DMFCNet-2 model; the images are converted to grayscale images of size 256 × 256, and 30% of the pixel information is discarded at random. Figure 4 shows the test results on complex images of the models trained on the different datasets. As Figure 4 shows, the training loss when using the CelebA dataset is smaller than when using the VOC dataset because of CelebA's relative simplicity. However, when tested on complex images, the network trained on the VOC dataset outperforms the one trained on CelebA in both loss and reconstruction performance. Therefore, to improve image restoration performance, we recommend using a dataset matched to the target images.

4.2. Experimental Settings

To obtain the best performance from the six comparison methods, their hyperparameters were chosen as follows. In MF, since automatic rank estimation often performs poorly on image restoration problems, a fixed rank is used for each missing rate: 30 for restoring images with 20% to 30% missing rates, 20 for 40% to 50%, and 10 for 60% to 70% and for text masks. In TNNM, the parameter $r$ is uniformly set to 10. In DLMC, the weight decay penalty is set to 0.01, and the network contains three hidden layers with [100, 50, 100] hidden units. In LNOP, $p$ is set to 0.7. All other parameters follow the settings in the original papers.

4.3. Image In-Painting

We first compare DMFCNet-1 and DMFCNet-2, including both the preliminary restoration (pre-filling) results and the final restoration results. Figure 5 and Figure 6 show the two models restoring images with 40% and 60% missing rates, respectively. As shown in Figure 5e and Figure 6e, pre-filling with NNMF achieves relatively good results in the smoother areas of the image but produces noticeable vertical stripes in areas with large variations in pixel values. Figure 5f and Figure 6f show that most of these vertical stripes disappear in the images restored by the DMFCNet-2 model and the overall result is smoother, although some spots left by the NNMF pre-filling remain. DMFCNet-1 uses the restoration module for its preliminary restoration, as shown in Figure 5b and Figure 6b: there are no vertical stripes, but the image contains some white spots and is rougher overall. Figure 5c and Figure 6c show the final restoration results of DMFCNet-1.
After the second restoration module, the image is refined on top of the preliminary restoration: the white spots largely disappear, although the result is still slightly rougher than that of DMFCNet-2. In addition, Table 1 compares the two methods at different missing rates, including their preliminary restoration performance. As Table 1 shows, the NNMF pre-filling of DMFCNet-2 is better at low missing rates, whereas the preliminary restoration of DMFCNet-1 is stronger than NNMF pre-filling at high missing rates. Nevertheless, the final recovery of DMFCNet-2 is better than that of DMFCNet-1 because the images obtained by NNMF are smoother overall.
Next, the proposed methods are compared with the other matrix completion methods on the five images shown in Figure 7. Two masks are considered in the experiments: the first is a random pixel mask, in which 20% to 70% of the pixels in the image are removed at random; the second is a text mask containing English words. Although the DMFCNet-1 and DMFCNet-2 models are not trained to restore images with text masks, the DMFCNet-2 network is used in the text-mask comparisons because of the characteristics of NNMF.
Figure 8, Figure 9 and Figure 10 show the original images, the images with random pixel masks (30%, 50% and 70% of the pixels removed, respectively), and examples of the restored images produced by each method. Visually, the images restored by the MF method are rougher than those obtained by the other methods, while DMFCNet-1 and DMFCNet-2 give the best restoration results. We also conducted more comprehensive tests on the other images: Table 2 reports the PSNR and SSIM values of the five images recovered by the eight methods at 20% to 70% missing rates. Figure 11 illustrates the average recovery performance of the eight methods over the five images at different missing rates. Figure 12 shows the execution times of the eight methods for recovering 256 × 256 grayscale images at different missing rates, and Table 3 lists their average running times for images of different sizes and missing rates.
The data show that MF (LMaFit) takes the shortest time, followed by the methods proposed in this paper. Thanks to deep learning, using the trained network model for image restoration significantly reduces the time required for the restoration task; even as the missing rate increases, the restoration time does not grow. In contrast, the deep-learning-based DLMC method takes the longest time because it must optimize its network weights at test time, and its running time grows fastest among all methods as the image size increases. Overall, DMFCNet-1 and DMFCNet-2 achieve better recovery performance than the competing methods in a shorter running time, for images with both small and large missing rates. Especially at large missing rates, where the performance of the other methods drops quickly, the proposed methods still achieve satisfactory results.
Figure 13 shows examples of images with text masks and grid masks and the images recovered by the seven methods. Table 4 reports the restoration results of the seven methods on the five images with text and grid masks. The data in Table 4 show that the proposed DMFCNet-2 network, although not trained to recover text-masked or grid-masked images, still performs well owing to the characteristics of NNMF.

5. Conclusions

In this work, we proposed DMFCNet, a new end-to-end neural network structure for image restoration that combines deep learning with traditional matrix completion algorithms. Experimental results on data with random masks and other masks show that DMFCNet outperforms currently popular methods in image restoration and remains stable even at high missing rates.
Although the proposed methods perform well, there is still room for improvement. For example, when restoring images with a high missing rate, the restoration result of DMFCNet-1 contains white spots and that of DMFCNet-2 contains vertical stripes; how to combine these two results to obtain better restorations is a problem for future work. The tuning of the hyperparameters of this method is another important next step. We will also carry out further experiments to apply the methods of this paper more widely, for example to larger images (e.g., 512 × 512 pixels) or to data missing due to other factors (e.g., image processing or transmission).

Author Contributions

Conceptualization, X.M., Z.L. and H.W.; data curation, Z.L.; formal analysis, X.M. and Z.L.; funding acquisition, X.M. and H.W.; investigation, X.M. and Z.L.; methodology, X.M., Z.L. and H.W.; project administration, X.M.; resources, X.M.; software, X.M. and Z.L.; supervision, X.M. and H.W.; validation, Z.L.; visualization, Z.L.; writing—original draft, X.M. and Z.L.; writing—review and editing, X.M., Z.L. and H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China under Grants No. 2020YFB2103604 and No. 2020YFF0305504, the National Natural Science Foundation of China (Nos. 62072024 and 61971290), the Research Ability Enhancement Program for Young Teachers of Beijing University of Civil Engineering and Architecture (No. X21024), the Talent Program of Beijing University of Civil Engineering and Architecture, and the BUCEA Post Graduate Innovation Project.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Candès, E.J.; Recht, B. Exact matrix completion via convex optimization. Found. Comput. Math. 2009, 9, 717–772.
2. Fan, J.; Chow, T.W. Sparse subspace clustering for data with missing entries and high-rank matrix completion. Neural Netw. 2017, 93, 36–44.
3. Liu, G.; Li, P. Low-rank matrix completion in the presence of high coherence. IEEE Trans. Signal Process. 2016, 64, 5623–5633.
4. Lu, X.; Gong, T.; Yan, P.; Yuan, Y.; Li, X. Robust alternative minimization for matrix completion. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2012, 42, 939–949.
5. Wang, H.; Zhao, R.; Cen, Y. Rank adaptive atomic decomposition for low-rank matrix completion and its application on image recovery. Neurocomputing 2014, 145, 374–380.
6. Lara-Cabrera, R.; González-Prieto, A.; Ortega, F.; Bobadilla, J. Evolving matrix-factorization-based collaborative filtering using genetic programming. Appl. Sci. 2020, 10, 675.
7. Zhang, D.; Liu, L.; Wei, Q.; Yang, Y.; Yang, P.; Liu, Q. Neighborhood aggregation collaborative filtering based on knowledge graph. Appl. Sci. 2020, 10, 3818.
8. Hu, Y.; Zhang, D.; Ye, J.; Li, X.; He, X. Fast and accurate matrix completion via truncated nuclear norm regularization. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 2117–2130.
9. Le Pendu, M.; Jiang, X.; Guillemot, C. Light field inpainting propagation via low rank matrix completion. IEEE Trans. Image Process. 2018, 27, 1981–1993.
10. Alameda-Pineda, X.; Ricci, E.; Yan, Y.; Sebe, N. Recognizing emotions from abstract paintings using non-linear matrix completion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5240–5248.
11. Yang, Y.; Feng, Y.; Suykens, J.A. Correntropy based matrix completion. Entropy 2018, 20, 171.
12. Ji, H.; Liu, C.; Shen, Z.; Xu, Y. Robust video denoising using low rank matrix completion. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 1791–1798.
13. Cabral, R.; De la Torre, F.; Costeira, J.P.; Bernardino, A. Matrix completion for weakly-supervised multi-label image classification. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 121–135.
14. Luo, Y.; Liu, T.; Tao, D.; Xu, C. Multiview matrix completion for multilabel image classification. IEEE Trans. Image Process. 2015, 24, 2355–2368.
15. Harvey, N.J.; Karger, D.R.; Yekhanin, S. The complexity of matrix completion. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, Alexandria, VA, USA, 9–12 January 2006; pp. 1103–1111.
16. Lin, Z.; Chen, M.; Ma, Y. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv 2010, arXiv:1009.5055.
17. Shen, Y.; Wen, Z.; Zhang, Y. Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization. Optim. Methods Softw. 2014, 29, 239–263.
18. Toh, K.C.; Yun, S. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pac. J. Optim. 2010, 6, 615–640.
19. Cai, J.F.; Candès, E.J.; Shen, Z. A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 2010, 20, 1956–1982.
20. Chen, C.; He, B.; Yuan, X. Matrix completion via an alternating direction method. IMA J. Numer. Anal. 2012, 32, 227–245.
21. Wen, Z.; Yin, W.; Zhang, Y. Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math. Program. Comput. 2012, 4, 333–361.
22. Han, H.; Huang, M.; Zhang, Y.; Bhatti, U.A. An extended-tag-induced matrix factorization technique for recommender systems. Information 2018, 9, 143.
23. Wang, C.; Liu, Q.; Wu, R.; Chen, E.; Liu, C.; Huang, X.; Huang, Z. Confidence-aware matrix factorization for recommender systems. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
24. Luo, Z.; Zhou, M.; Li, S.; Xia, Y.; You, Z.; Zhu, Q.; Leung, H. An efficient second-order approach to factorize sparse matrices in recommender systems. IEEE Trans. Ind. Inform. 2015, 11, 946–956.
25. Luo, Z.; Zhou, M.; Li, S.; Xia, Y.; You, Z.; Zhu, Q. A nonnegative latent factor model for large-scale sparse matrices in recommender systems via alternating direction method. IEEE Trans. Neural Netw. Learn. Syst. 2015, 27, 579–592.
26. Cao, X.; Zhao, Q.; Meng, D.; Chen, Y.; Xu, Z. Robust low-rank matrix factorization under general mixture noise distributions. IEEE Trans. Image Process. 2016, 25, 4677–4690.
27. Lawrence, N. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. J. Mach. Learn. Res. 2005, 6, 1783–1816.
28. Fan, J.; Chow, T. Deep learning based matrix completion. Neurocomputing 2017, 266, 540–549.
29. Fan, J.; Cheng, J. Matrix completion by deep matrix factorization. Neural Netw. 2018, 98, 34–41.
30. Nguyen, D.M.; Tsiligianni, E.; Calderbank, R.; Deligiannis, N. Regularizing autoencoder-based matrix completion models via manifold learning. In Proceedings of the 2018 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, 3–7 September 2018; pp. 1880–1884.
31. Abavisani, M.; Patel, V.M. Deep sparse representation-based classification. IEEE Signal Process. Lett. 2019, 26, 948–952.
32. Bobadilla, J.; Alonso, S.; Hernando, A. Deep learning architecture for collaborative filtering recommender systems. Appl. Sci. 2020, 10, 2441.
33. Zhang, S.; Yao, L.; Sun, A.; Tay, Y. Deep learning based recommender system: A survey and new perspectives. ACM Comput. Surv. 2019, 52, 1–38.
34. Sedhain, S.; Menon, A.K.; Sanner, S.; Xie, L. AutoRec: Autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 111–112.
35. Guo, D.; Lu, H.; Qu, X. A fast low rank Hankel matrix factorization reconstruction method for non-uniformly sampled magnetic resonance spectroscopy. IEEE Access 2017, 5, 16033–16039.
36. Huang, Y.; Zhao, J.; Wang, Z.; Guo, D.; Qu, X. Complex exponential signal recovery with deep Hankel matrix factorization. arXiv 2020, arXiv:2007.06246.
37. Signoretto, M.; Cevher, V.; Suykens, J.A. An SVD-free approach to a class of structured low rank matrix optimization problems with application to system identification. In Proceedings of the IEEE Conference on Decision and Control (CDC), Firenze, Italy, 10–13 December 2013.
38. Lee, D.; Jin, K.H.; Kim, E.Y.; Park, S.H.; Ye, J.C. Acceleration of MR parameter mapping using annihilating filter-based low rank Hankel matrix (ALOHA). Magn. Reson. Med. 2016, 76, 1848–1864.
39. Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 2010, 88, 303–338.
40. Everingham, M.; Eslami, S.A.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL visual object classes challenge: A retrospective. Int. J. Comput. Vis. 2015, 111, 98–136.
41. Nie, F.; Hu, Z.; Li, X. Matrix completion based on non-convex low-rank approximation. IEEE Trans. Image Process. 2019, 28, 2378–2388.
42. Chen, L.; Jiang, X.; Liu, X.; Zhou, Z. Robust low-rank tensor recovery via nonconvex singular value minimization. IEEE Trans. Image Process. 2020, 29, 9044–9059.
43. Gu, K.; Zhai, G.; Yang, X.; Zhang, W. Using free energy principle for blind image quality assessment. IEEE Trans. Multimed. 2014, 17, 50–63.
44. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
45. Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3730–3738.
Figure 1. The structure of the DMFCNet network. (a) Network architecture. (b) Restoration Module. (c) U and V updating modules.
Figure 2. Schematic diagram of the operation of the NNMF algorithm.
Figure 3. The loss convergence of the network and the reconstruction results. (a) Loss convergence of the training. (b) Loss convergence of the test set. (c) PSNR values of the reconstructed images in the test set. (d) SSIM values of the reconstructed images in the test set.
Figure 4. Test results of the model on complex images obtained by training with different datasets. (a) Average PSNR values of recovered images. (b) Average SSIM values of recovered images.
Figure 5. DMFCNet-1 and DMFCNet-2 restore image containing 40% missing rate. (a) Original image. (b) Preliminary restoration result of DMFCNet-1. (c) Final restoration result of DMFCNet-1. (d) Partially missing image. (e) Preliminary restoration result of DMFCNet-2. (f) Final restoration result of DMFCNet-2.
Figure 6. DMFCNet-1 and DMFCNet-2 restore image containing 60% missing rate. (a) Original image. (b) Preliminary restoration result of DMFCNet-1. (c) Final restoration result of DMFCNet-1. (d) Partially missing image. (e) Preliminary restoration result of DMFCNet-2. (f) Final restoration result of DMFCNet-2.
Figure 7. Five grayscale images of 256 × 256 size for comparison experiments, numbered 1–5 from left to right.
Figure 8. Image recovery containing a 30% random pixel mask. (a) Complete image of size 256 × 256. (b) Partially missing image (10.52 dB/0.127). (c) Restored result by MF (27.42 dB/0.822/0.099 s). (d) Restored result by NNM (29.37 dB/0.873/5.488 s). (e) Restored result by TNNM (29.676 dB/0.883/2.669 s). (f) Restored result by DLMC (29.354 dB/0.867/15.181 s). (g) Restored result by NC-MC (29.69 dB/0.893/5.399 s). (h) Restored result by LNOP (29.816 dB/0.881/2.391 s). (i) Restored result by DMFCNet-1 (33.29 dB/0.956/0.388 s). (j) Restored result by DMFCNet-2 (34.68 dB/0.971/0.345 s).
Figure 9. Image recovery containing a 50% random pixel mask. (a) Complete image of size 256 × 256. (b) Partially missing image (8.29 dB/0.076). (c) Restored result by MF (25.69 dB/0.784/0.072 s). (d) Restored result by NNM (26.82 dB/0.805/4.297 s). (e) Restored result by TNNM (27.264 dB/0.820/3.094 s). (f) Restored result by DLMC (27.08 dB/0.812/17.956 s). (g) Restored result by NC-MC (27.63 dB/0.837/3.473 s). (h) Restored result by LNOP (27.29 dB/0.816/2.105 s). (i) Restored result by DMFCNet-1 (29.73 dB/0.898/0.403 s). (j) Restored result by DMFCNet-2 (30.04 dB/0.908/0.346 s).
Figure 10. Image recovery containing a 70% random pixel mask. (a) Complete image of size 256 × 256. (b) Partially missing image (4.61 dB/0.024). (c) Restored result by MF (24.86 dB/0.614/0.034 s). (d) Restored result by NNM (25.78 dB/0.683/4.836 s). (e) Restored result by TNNM (26.126 dB/0.664/8.630 s). (f) Restored result by DLMC (27.057 dB/0.719/19.767 s). (g) Restored result by NC-MC (26.09 dB/0.663/2.266 s). (h) Restored result by LNOP (26.53 dB/0.68/1.808 s). (i) Restored result by DMFCNet-1 (29.82 dB/0.859/0.449 s). (j) Restored result by DMFCNet-2 (31.47 dB/0.884/0.339 s).
Figure 11. The average restoration performance of the eight methods for five images with different missing rates. The PSNR and SSIM values of the recovered images are shown in (a) and (b), respectively.
Figure 12. Execution times of eight methods to recover grayscale images of size 256 × 256 containing different missing rates. (a) Lena. (b) Butterfly. (c) Bridge.
Figure 13. Image recovery with text mask and grid mask. (a) Images with masks. (b) Restored results by MF. (c) Restored results by NNM. (d) Restored results by TNNM. (e) Restored results by DLMC. (f) Restored results by NC-MC. (g) Restored results by LNOP. (h) Restored results by DMFCNet-2.
Table 1. Restoration results of the two methods at different missing rates; each method shows its preliminary restoration results (left column) and final restoration results (right column). All values are PSNR (dB)/SSIM.

| Missing Rate | Image No. | DMFCNet-1 Preliminary | DMFCNet-1 Final | DMFCNet-2 Preliminary | DMFCNet-2 Final |
|---|---|---|---|---|---|
| 30% | 1 | 29.89/0.904 | 33.29/0.956 | 31.11/0.952 | 34.68/0.971 |
| | 2 | 27.63/0.915 | 30.59/0.958 | 28.08/0.953 | 31.94/0.974 |
| | 3 | 30.80/0.908 | 33.44/0.959 | 32.21/0.963 | 35.29/0.973 |
| | 4 | 30.10/0.902 | 32.56/0.946 | 31.92/0.946 | 33.70/0.960 |
| | 5 | 32.52/0.875 | 36.47/0.956 | 35.59/0.968 | 38.32/0.976 |
| | Average | 30.19/0.901 | 33.27/0.955 | 31.78/0.956 | 34.78/0.971 |
| 50% | 1 | 27.46/0.847 | 30.04/0.914 | 27.24/0.879 | 30.30/0.929 |
| | 2 | 23.98/0.863 | 27.23/0.929 | 23.24/0.856 | 27.74/0.939 |
| | 3 | 27.97/0.852 | 30.19/0.921 | 27.77/0.896 | 31.00/0.938 |
| | 4 | 27.56/0.831 | 29.73/0.898 | 28.18/0.873 | 30.04/0.908 |
| | 5 | 29.85/0.804 | 33.54/0.919 | 31.58/0.923 | 34.39/0.947 |
| | Average | 27.37/0.840 | 30.14/0.916 | 27.60/0.885 | 30.69/0.932 |
| 70% | 1 | 24.92/0.763 | 26.23/0.834 | 23.79/0.729 | 26.12/0.829 |
| | 2 | 21.65/0.781 | 23.43/0.860 | 19.12/0.665 | 22.29/0.810 |
| | 3 | 25.02/0.770 | 26.62/0.861 | 23.94/0.744 | 26.66/0.853 |
| | 4 | 24.93/0.721 | 26.47/0.801 | 24.95/0.739 | 26.62/0.808 |
| | 5 | 27.70/0.746 | 29.82/0.859 | 28.14/0.832 | 31.47/0.884 |
| | Average | 24.84/0.756 | 26.51/0.843 | 23.99/0.742 | 26.63/0.837 |
Table 2. PSNR and SSIM values of the five images recovered by the eight methods at 20% to 70% missing rates.

| Missing Rate | Image No. | MF [21] | NNM [16] | TNNM [8] | DLMC [28] | NC-MC [41] | LNOP [42] | DMFCNet-1 | DMFCNet-2 |
|---|---|---|---|---|---|---|---|---|---|
| 20% | 1 | 29.72/0.890 | 32.06/0.929 | 32.30/0.935 | 31.22/0.913 | 32.05/0.939 | 32.55/0.934 | 35.60/0.972 | 37.57/0.984 |
| | 2 | 26.10/0.831 | 29.24/0.896 | 29.60/0.904 | 29.09/0.888 | 29.85/0.912 | 30.12/0.906 | 33.35/0.972 | 35.03/0.985 |
| | 3 | 30.90/0.901 | 33.26/0.939 | 33.54/0.943 | 32.70/0.933 | 33.37/0.948 | 33.92/0.945 | 35.77/0.973 | 37.69/0.984 |
| | 4 | 31.99/0.938 | 33.39/0.951 | 33.63/0.954 | 32.26/0.938 | 33.73/0.959 | 33.81/0.955 | 34.97/0.969 | 36.20/0.978 |
| | 5 | 35.71/0.939 | 37.31/0.959 | 37.71/0.961 | 36.92/0.955 | 37.55/0.967 | 37.87/0.962 | 39.04/0.972 | 41.08/0.986 |
| | Average | 30.88/0.900 | 33.05/0.935 | 33.36/0.940 | 32.44/0.926 | 33.31/0.945 | 33.65/0.940 | 35.75/0.972 | 37.51/0.983 |
| 30% | 1 | 27.42/0.822 | 29.37/0.873 | 29.68/0.883 | 29.35/0.867 | 29.69/0.893 | 29.82/0.881 | 33.29/0.956 | 34.68/0.971 |
| | 2 | 23.91/0.762 | 25.80/0.916 | 26.14/0.827 | 26.46/0.825 | 26.70/0.842 | 26.54/0.827 | 30.59/0.958 | 31.94/0.974 |
| | 3 | 28.63/0.845 | 30.29/0.885 | 30.68/0.894 | 30.35/0.891 | 30.88/0.909 | 31.02/0.896 | 33.44/0.959 | 35.29/0.973 |
| | 4 | 29.54/0.895 | 30.72/0.913 | 31.07/0.919 | 30.34/0.903 | 31.31/0.927 | 31.15/0.920 | 32.56/0.946 | 33.70/0.960 |
| | 5 | 33.32/0.904 | 34.48/0.931 | 34.87/0.932 | 34.68/0.931 | 35.03/0.944 | 35.05/0.934 | 36.47/0.956 | 38.32/0.976 |
| | Average | 28.56/0.846 | 30.13/0.904 | 30.49/0.891 | 30.24/0.883 | 30.72/0.903 | 30.72/0.892 | 33.27/0.955 | 34.78/0.971 |
| 40% | 1 | 25.84/0.771 | 27.13/0.804 | 27.43/0.818 | 27.40/0.799 | 27.70/0.836 | 27.56/0.812 | 31.00/0.932 | 32.10/0.952 |
| | 2 | 21.04/0.655 | 23.16/0.718 | 23.44/0.732 | 23.93/0.730 | 23.96/0.740 | 23.87/0.732 | 28.68/0.937 | 29.17/0.953 |
| | 3 | 26.17/0.777 | 27.65/0.816 | 28.07/0.828 | 28.36/0.839 | 28.56/0.854 | 28.38/0.829 | 31.60/0.939 | 32.50/0.956 |
| | 4 | 27.36/0.842 | 28.69/0.865 | 29.04/0.874 | 28.79/0.863 | 29.42/0.889 | 29.14/0.874 | 30.57/0.916 | 31.81/0.940 |
| | 5 | 30.72/0.852 | 32.07/0.889 | 32.51/0.893 | 33.02/0.900 | 32.80/0.911 | 32.63/0.895 | 34.23/0.930 | 36.12/0.962 |
| | Average | 26.23/0.779 | 27.74/0.818 | 28.10/0.830 | 28.30/0.826 | 28.49/0.846 | 28.32/0.828 | 31.21/0.931 | 32.34/0.953 |
| 50% | 1 | 24.01/0.685 | 25.07/0.717 | 25.37/0.731 | 25.10/0.703 | 25.54/0.746 | 25.42/0.722 | 30.04/0.914 | 30.30/0.929 |
| | 2 | 19.65/0.588 | 20.95/0.626 | 21.27/0.635 | 21.75/0.639 | 21.56/0.634 | 21.57/0.632 | 27.23/0.929 | 27.74/0.939 |
| | 3 | 24.60/0.713 | 25.40/0.733 | 25.96/0.757 | 26.42/0.772 | 26.50/0.790 | 26.12/0.751 | 30.19/0.921 | 31.00/0.938 |
| | 4 | 25.69/0.784 | 26.82/0.805 | 27.26/0.819 | 27.08/0.812 | 27.63/0.837 | 27.29/0.816 | 29.73/0.898 | 30.04/0.908 |
| | 5 | 29.26/0.798 | 29.92/0.839 | 30.54/0.842 | 31.19/0.858 | 31.06/0.867 | 30.60/0.844 | 33.54/0.919 | 34.39/0.947 |
| | Average | 24.64/0.714 | 25.63/0.744 | 26.08/0.757 | 26.31/0.757 | 26.46/0.775 | 26.20/0.753 | 30.14/0.916 | 30.69/0.932 |
| 60% | 1 | 22.27/0.618 | 23.21/0.624 | 23.50/0.642 | 23.31/0.623 | 23.53/0.643 | 23.52/0.631 | 28.17/0.889 | 28.27/0.893 |
| | 2 | 16.75/0.468 | 18.81/0.518 | 19.04/0.527 | 19.36/0.530 | 18.39/0.481 | 19.28/0.524 | 25.48/0.907 | 25.34/0.901 |
| | 3 | 21.99/0.607 | 23.49/0.641 | 24.11/0.666 | 24.43/0.684 | 24.34/0.673 | 24.22/0.654 | 28.94/0.904 | 29.05/0.907 |
| | 4 | 24.04/0.704 | 24.94/0.722 | 25.33/0.739 | 25.37/0.735 | 25.55/0.754 | 25.41/0.736 | 28.23/0.862 | 28.45/0.868 |
| | 5 | 26.44/0.692 | 28.15/0.770 | 28.71/0.869 | 29.54/0.810 | 29.17/0.796 | 28.83/0.773 | 32.08/0.901 | 32.88/0.924 |
| | Average | 22.30/0.618 | 23.72/0.655 | 24.14/0.689 | 24.40/0.676 | 24.20/0.669 | 24.25/0.724 | 28.58/0.893 | 28.80/0.900 |
| 70% | 1 | 21.04/0.520 | 21.41/0.515 | 21.82/0.532 | 21.50/0.494 | 21.17/0.476 | 21.85/0.518 | 26.23/0.834 | 26.12/0.829 |
| | 2 | 15.50/0.384 | 16.77/0.404 | 16.84/0.405 | 17.04/0.406 | 15.10/0.312 | 17.25/0.412 | 23.43/0.860 | 22.29/0.810 |
| | 3 | 20.80/0.520 | 21.32/0.515 | 21.99/0.539 | 22.19/0.549 | 21.17/0.481 | 22.09/0.531 | 26.62/0.861 | 26.66/0.853 |
| | 4 | 22.59/0.617 | 23.16/0.623 | 23.38/0.636 | 23.50/0.642 | 23.12/0.620 | 23.57/0.634 | 26.47/0.801 | 26.62/0.808 |
| | 5 | 24.86/0.614 | 25.78/0.683 | 26.12/0.664 | 27.06/0.719 | 26.09/0.663 | 26.53/0.680 | 29.82/0.859 | 31.47/0.884 |
| | Average | 21.00/0.531 | 21.69/0.548 | 22.03/0.555 | 22.26/0.562 | 21.33/0.510 | 22.26/0.555 | 26.51/0.843 | 26.63/0.837 |
Table 3. Average running time (in seconds) of the eight methods for restoring images with different missing rates and sizes.

| Image Size | Missing Rate | MF | NNM | TNNM | DLMC | NC-MC | LNOP | DMFCNet-1 | DMFCNet-2 |
|---|---|---|---|---|---|---|---|---|---|
| 256 × 256 | 30% | 0.095 | 4.905 | 2.974 | 16.129 | 5.651 | 2.384 | 0.390 | 0.341 |
| | 50% | 0.096 | 4.340 | 3.455 | 17.429 | 3.507 | 2.165 | 0.420 | 0.339 |
| | 70% | 0.130 | 3.534 | 6.816 | 18.644 | 2.260 | 1.649 | 0.436 | 0.329 |
| 512 × 512 | 30% | 0.288 | 23.380 | 12.416 | 59.581 | 48.207 | 9.120 | 1.763 | 1.019 |
| | 50% | 0.199 | 18.998 | 17.128 | 69.528 | 23.970 | 8.302 | 1.661 | 1.015 |
| | 70% | 0.116 | 16.193 | 18.801 | 78.087 | 12.122 | 7.415 | 1.769 | 1.050 |
Table 4. Restoration results (PSNR (dB)/SSIM) of the seven methods on the five images containing text masks and grid masks.

| Mask Type | Image No. | MF | NNM | TNNM | DLMC | NC-MC | LNOP | DMFCNet-2 |
|---|---|---|---|---|---|---|---|---|
| Text Mask | 1 | 30.33/0.936 | 32.37/0.950 | 32.52/0.952 | 31.64/0.942 | 32.42/0.953 | 32.65/0.953 | 37.13/0.985 |
| | 2 | 24.80/0.891 | 28.74/0.926 | 28.79/0.929 | 28.47/0.920 | 29.25/0.932 | 29.57/0.934 | 34.63/0.984 |
| | 3 | 30.27/0.930 | 31.97/0.947 | 32.49/0.956 | 32.27/0.948 | 32.75/0.958 | 32.57/0.955 | 35.17/0.985 |
| | 4 | 33.18/0.957 | 35.16/0.968 | 35.69/0.971 | 34.35/0.960 | 35.54/0.972 | 35.65/0.971 | 37.44/0.983 |
| | 5 | 34.30/0.949 | 37.54/0.975 | 38.31/0.975 | 37.96/0.974 | 38.70/0.979 | 38.45/0.977 | 40.82/0.990 |
| | Average | 30.58/0.933 | 33.16/0.953 | 33.56/0.957 | 32.94/0.949 | 33.73/0.959 | 33.78/0.958 | 37.04/0.985 |
| Grid Mask | 1 | 29.19/0.899 | 32.95/0.942 | 32.96/0.944 | 33.22/0.928 | 32.62/0.943 | 33.20/0.943 | 37.11/0.983 |
| | 2 | 23.22/0.812 | 29.30/0.914 | 29.52/0.918 | 28.92/0.902 | 29.75/0.917 | 30.11/0.919 | 34.34/0.986 |
| | 3 | 28.41/0.882 | 33.17/0.946 | 33.57/0.951 | 32.44/0.940 | 33.41/0.954 | 33.89/0.951 | 37.67/0.986 |
| | 4 | 31.01/0.933 | 34.39/0.962 | 34.59/0.964 | 33.39/0.952 | 34.71/0.966 | 34.72/0.964 | 36.75/0.980 |
| | 5 | 33.14/0.918 | 37.45/0.966 | 37.88/0.967 | 36.46/0.958 | 37.78/0.972 | 38.01/0.968 | 41.16/0.988 |
| | Average | 28.99/0.889 | 33.45/0.946 | 33.70/0.949 | 32.89/0.936 | 33.65/0.950 | 33.99/0.949 | 37.41/0.985 |