Article

Adaptive Feature Fusion and Kernel-Based Regression Modeling to Improve Blind Image Quality Assessment

Electronics and Telecommunications Research Institute (ETRI), Gwangju 61012, Republic of Korea
Appl. Sci. 2023, 13(13), 7522; https://doi.org/10.3390/app13137522
Submission received: 23 May 2023 / Revised: 2 June 2023 / Accepted: 15 June 2023 / Published: 26 June 2023

Abstract

Blind image quality assessment (BIQA) remains a difficult task in image processing and computer vision. In this paper, a BIQA framework is presented that integrates feature extraction, feature selection, and regression using a support vector machine (SVM). The framework draws on several types of image features: the wavelet transform; Prewitt and Gaussian filtering; Log and Gaussian filtering; and Prewitt, Sobel, and Gaussian filtering. An SVM regression model is trained on these features to predict the quality ratings of images. The proposed model uses the Information Gain attribute approach for feature selection to improve the performance of the regression model and reduce the size of the feature space. Three commonly used benchmark datasets, TID2013, CSIQ, and LIVE, are utilized to assess the performance of the proposed methodology. Through thorough experiments, the study examines how various feature types and feature selection strategies affect the performance of the framework. The experimental findings demonstrate that the suggested framework achieves high accuracy and robustness, suggesting considerable potential to improve the accuracy and dependability of BIQA approaches, with applications extending to image transmission, compression, and restoration. Overall, the results demonstrate the framework's promise for advancing research into image quality assessment.

1. Introduction

Images have become a vital channel for conveying information due to the rapid advance of digital technology. This has led to immersive user experiences and to digital images accumulating exponentially in large data volumes. The quality of a digital image declines over its life cycle of acquisition, compression, processing, and transmission to storage. This inevitable degradation undermines the immersive user experience, producing poor image quality and thus creating the need that image quality assessment (IQA) approaches fill [1]. On the other side of the spectrum, the ability to robustly assess image quality deeply affects various computer vision applications, which broadly include bio-medical imaging, self-driving vehicles, and object detection, to name a few [2,3].
Of the two quality assessment approaches, subjective and objective IQA, the latter is the more widely utilized, even though subjective IQA is the definitive and nearly error-free approach. Because subjective IQA requires human evaluation, it is a time-intensive and tedious task; priority therefore goes to objective IQA, which automatically deduces image quality by deploying computational methods that estimate it in a manner consistent with the quality scores of human subjects [4]. Objective IQA is sub-classed into full-reference IQA, reduced-reference IQA, and no-reference IQA, widely labeled blind IQA (BIQA). The three types are distinguished by the availability of the distortion-free version of an image, termed the pristine image. The reference or pristine image is unavailable during BIQA, making it the most noteworthy assessment approach among the three objective IQA methods [4,5].
Moreover, BIQA is further classified into distortion-specific and general-purpose approaches. The latter is designed with the fact in mind that multiple distortions can occur in a digital image, making it superior to the distortion-specific approach. Because the general-purpose approach is not limited to a specific type of image distortion, its ability to assess image quality relies heavily on obtaining features that capture the essential information of multiple image distortions [6]. Conventional BIQA methods utilizing hand-crafted features are based on models such as the human visual system (HVS) and natural scene statistics (NSS). However, because BIQA must agree closely with human perception, compiling hand-crafted features that can effectively capture and represent levels of image quality degradation is a tedious task [7].
Improper lighting, constraints on imaging lenses, and sensor limitations can induce distortions of a more complex nature, and such cases usually occur in real-world scenarios. Traditional quality assessment methods have instead gravitated toward artificially distorted images with distortions such as JPEG compression, Gaussian noise, and blur, to name a few [8]. The bottleneck in these datasets is that they contain a limited number of images with limited diversity of distortions. Furthermore, as multiple distortions may coexist in a single image in real-world cases, such artificially distorted images do not emulate the complex nature of the distortions that occur in real-world images [9].
Traditional BIQA models extract low-level features from distorted images and predict image quality by applying machine learning regression models [10,11]. Representing image quality with NSS in this way does not capture the local distortions in an image. The recent adoption of deep learning in IQA research prompted the use of convolutional neural networks (CNNs) to extract image features that highlight degraded regions of an image. One major issue is that deep-learning-based IQA methods are usually trained on small, artificially distorted image datasets [12,13]. CNNs are also deployed to extract features of distorted images, replacing the conventional hand-crafted features used in IQA research [1]. CNN-based IQA methods should be capable of capturing two quantities: distortion and image visual content. This requirement arises because most CNN-based IQA methods are oriented toward visual content, emulating the process of image recognition [14]. A well-designed IQA method should also generalize, since various distortions from a vast set can coexist in an image. In addition, deep IQA methods require a large quantity of images to train a neural network, usually more than one million [15].
In recent times, deep neural networks have shown great improvement in research fields such as bioinformatics [16,17,18], agriculture [19,20], and medical imaging [21,22]. Deep IQA methods utilize convolutional operations in deep neural networks to extract features from distorted regions of an image [1]. Such methods cannot yield accurate predictions on authentic, real-world datasets, as they cannot learn the patterns and feature representations that occur in those distorted images. The distortions induced in artificial datasets range over Gaussian blur, Gaussian noise, JPEG compression, and so on [15], while distortions in authentic datasets are generally caused by naturally occurring conditions. This can be observed by examining the gradient maps of both types of datasets: the high-frequency components in the gradient map are object-oriented for a naturally distorted image, while they are distributed across the whole image for an artificially distorted one [23].
A no-reference IQA approach that can handle both types of distorted images is an effective solution because it is more versatile. Multiple IQA approaches address this issue, but these methods still have shortcomings; for example, they rely on high-level semantic features alone to predict image quality. Furthermore, such approaches try to combine the effects of both types of distorted images using a bilinear pooling operation, which is inadequate for capturing the relationship between the two [24].

2. Datasets

The TID2013 dataset comprises 25 reference images and 3000 distorted images created with 24 different types of distortion, such as JPEG compression, Gaussian blur, and white noise. The collection also contains subjective quality scores for each distorted image, gathered from human observers in a subjective experiment. The scores range from 0 to 9, with 0 representing the poorest quality and 9 the best.
The CSIQ dataset contains 30 reference images and 866 distorted images created with six distinct types of distortion, including JPEG2000 compression, JPEG compression, and Gaussian blur. The subjective quality scores for each distorted image were likewise collected from human observers in a subjective experiment; they range from 1 to 5, with 1 representing the poorest quality and 5 the best.
The LIVE dataset comprises 29 reference images and 779 distorted images made using five different types of distortion, including JPEG compression, Gaussian blur, and white noise. The subjective quality scores for each distorted image were determined by human observers in a subjective experiment; they range from 0 to 100, with 0 representing the lowest quality and 100 the highest.
While these datasets have been frequently used in the literature to test the effectiveness of blind image quality assessment algorithms, they do have certain drawbacks. For example, the distortion types they contain may not completely reflect the spectrum of real-world image distortions. Nonetheless, these datasets serve as a valuable standard for comparing the performance of various approaches in the field of blind image quality assessment. Figure 1 shows sample images from the datasets.

3. Proposed Methodology

Figure 2 represents the overall framework of the Blind Image Quality Assessment system used in this research paper. The framework consists of three main stages: Feature Extraction, Feature Selection, and Quality Prediction.
In the Feature Extraction stage, the reference and distorted images are processed using several well-known image processing techniques to extract relevant features, including Haar wavelet coefficients, Prewitt and Sobel edge detection responses, and Log filter responses.
In the Feature Selection stage, the extracted features are passed through the WEKA machine learning tool, and the most relevant features are selected using the Information Gain attribute algorithm, which aims to reduce the dimensionality of the feature map and improve the efficiency of the proposed model. In the Quality Prediction stage, the selected features are supplied to the Support Vector Machine (SVM) regression model to forecast the quality of the distorted images.

3.1. Feature Extraction

Feature extraction is an important step in blind image quality assessment because it extracts vital details from distorted images. Several well-known image processing algorithms were utilized in this study to extract characteristics from the reference and distorted images. The images were processed using the following filters:

3.1.1. Wavelet Transform

The wavelet transform is a common image processing approach for extracting features. It divides the image into frequency sub-bands that capture varying levels of detail and texture. The Haar wavelet transform was utilized in this study to extract characteristics from the reference and distorted images. The Haar wavelet transform is defined as
$$W_{l,p} = \frac{1}{\sqrt{2^{l}}} \sum_{q=0}^{2^{l}-1} h_{q} \cdot x_{p-q}, \qquad l, p = 0, 1, 2, \ldots$$
where $W_{l,p}$ denotes the wavelet coefficient at scale $l$ and position $p$, $h_{q}$ denotes the Haar wavelet filter, and $x_{p}$ denotes the input image pixel value.
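As a concrete illustration, the following minimal Python sketch extracts Haar wavelet features with the PyWavelets package; the decomposition depth and the mean/standard-deviation pooling are illustrative assumptions, not the paper's exact feature definition.

```python
import numpy as np
import pywt  # PyWavelets

def haar_wavelet_features(image, levels=3):
    """Decompose a grayscale image with the Haar wavelet and pool each
    sub-band into simple statistics (mean and std of |coefficients|)."""
    coeffs = pywt.wavedec2(image, wavelet="haar", level=levels)
    # coeffs[0] is the approximation band; the rest are (cH, cV, cD) tuples.
    bands = [coeffs[0]] + [b for detail in coeffs[1:] for b in detail]
    features = []
    for band in bands:
        mag = np.abs(band)
        features.extend([mag.mean(), mag.std()])
    return np.asarray(features)
```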

3.1.2. Prewitt and Gaussian

The Prewitt filter is an edge detection filter that enhances the edges in an image. In this study, the Prewitt filter was applied to the pristine and distorted images, followed by a Gaussian filter to minimize noise. The Prewitt filter is defined by the following equation,
$$G_{p} = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} * I \quad \text{and} \quad G_{q} = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} * I$$
where $G_{p}$ and $G_{q}$ are the Prewitt filters for identifying vertical and horizontal edges, respectively, and $I$ is the input image. The Gaussian filter is defined as
$$G(p, q) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{p^{2} + q^{2}}{2\sigma^{2}}\right)$$
where $G(p, q)$ denotes the Gaussian filter at location $(p, q)$, and $\sigma$ denotes the standard deviation of the filter.
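A hedged sketch of this Prewitt-then-Gaussian pipeline using scipy.ndimage is shown below; the sigma value and the gradient-magnitude combination are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def prewitt_gaussian_map(image, sigma=1.0):
    """Combine the two Prewitt responses into a gradient magnitude,
    then smooth with a Gaussian filter to reduce noise."""
    g_p = ndimage.prewitt(image, axis=0)  # derivative along rows
    g_q = ndimage.prewitt(image, axis=1)  # derivative along columns
    magnitude = np.hypot(g_p, g_q)
    return ndimage.gaussian_filter(magnitude, sigma=sigma)
```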

3.1.3. Log and Gaussian

The Log filter is used to identify edges and minimize image noise. In this study, a Log filter was applied to the reference and distorted images, followed by a Gaussian filter to minimize noise. The Log filter is defined as
$$L(p, q) = -\frac{1}{\pi\sigma^{4}} \left(1 - \frac{p^{2} + q^{2}}{2\sigma^{2}}\right) \exp\!\left(-\frac{p^{2} + q^{2}}{2\sigma^{2}}\right)$$
where $L(p, q)$ denotes the Log filter at location $(p, q)$, and $\sigma$ denotes the standard deviation of the filter.
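In code, the Laplacian-of-Gaussian response can be obtained directly with scipy.ndimage.gaussian_laplace, followed by the extra Gaussian smoothing described above; both sigma values are illustrative assumptions.

```python
from scipy import ndimage

def log_gaussian_map(image, sigma_log=1.5, sigma_smooth=1.0):
    """Apply the Log (Laplacian-of-Gaussian) operator to highlight edges,
    then smooth the response with a Gaussian filter to suppress noise."""
    response = ndimage.gaussian_laplace(image, sigma=sigma_log)
    return ndimage.gaussian_filter(response, sigma=sigma_smooth)
```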

3.1.4. Prewitt, Sobel, and Gaussian

The Sobel filter is another edge detection filter that enhances image edges. In this study, the Prewitt and Sobel filters were applied to the reference and distorted images, followed by a Gaussian filter to remove noise, as sketched below.
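The sketch reuses scipy.ndimage and pools the smoothed response into a short statistics vector; averaging the Prewitt and Sobel magnitudes and the chosen pooling statistics are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage

def prewitt_sobel_gaussian_features(image, sigma=1.0):
    """Fuse Prewitt and Sobel gradient magnitudes, Gaussian-smooth the
    result, and pool it into simple global statistics."""
    prewitt_mag = np.hypot(ndimage.prewitt(image, axis=0),
                           ndimage.prewitt(image, axis=1))
    sobel_mag = np.hypot(ndimage.sobel(image, axis=0),
                         ndimage.sobel(image, axis=1))
    smoothed = ndimage.gaussian_filter(0.5 * (prewitt_mag + sobel_mag),
                                       sigma=sigma)
    return np.array([smoothed.mean(), smoothed.std(),
                     np.percentile(smoothed, 90)])
```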

3.2. Feature Selection

The proposed blind image quality assessment system relies on feature selection to minimize the complexity of the feature vector and increase the model's performance. In this study, the WEKA machine learning tool is utilized for feature selection.
The Information Gain attribute evaluator ranks attributes based on their capacity to convey information about the overall quality of the distorted images. Using a feature as a split attribute, the evaluator calculates the reduction in the entropy of the quality scores. The Information Gain score $IG(F, Q)$ for feature set $F$ and quality scores $Q$ is derived using the following equation:
$$IG(F, Q) = H(Q) - \sum_{f \in F} \frac{\lvert \{ q \in Q : q_f = f \} \rvert}{\lvert Q \rvert} \, H(\{ q \in Q : q_f = f \})$$
In this equation, $H(Q)$ denotes the entropy of the quality scores, $f$ denotes an attribute, and $q_f$ denotes the quality rating of an image determined by that selected feature. $\lvert \{ q \in Q : q_f = f \} \rvert$ is the number of images in $Q$ for which the feature $f$ is the split attribute, and $H(\{ q \in Q : q_f = f \})$ is the entropy of the quality ratings of those images.
The Information Gain metric represents the quantity of information obtained about image quality by incorporating a feature into the model. The greater the Information Gain score, the more essential the attribute is for quality forecasting, and the higher its position in the feature ranking.
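The paper performs this ranking with WEKA's Information Gain attribute evaluator; as a rough Python analogue, the sketch below ranks features by their estimated mutual information with the quality scores using scikit-learn. The feature matrix X, score vector y, and the number of retained features are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def rank_features(X, y, keep=20):
    """Score each feature column of X against quality scores y and
    return the indices and scores of the top-ranked features."""
    scores = mutual_info_regression(X, y, random_state=0)
    order = np.argsort(scores)[::-1]  # highest information first
    return order[:keep], scores[order[:keep]]
```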

4. Quality Prediction Stage

In the final stage, the selected features are used to construct a regression model capable of predicting the quality score of a distorted image. Due to its capacity to handle high-dimensional feature vectors and nonlinear relationships between features and quality ratings, support vector machine (SVM) regression is employed as the prediction model in this work. The proposed system extracts numerous image characteristics, such as wavelet coefficients, edge detection results, and filter responses, which together form a multidimensional feature space; the capacity of the SVM to handle such complex, high-dimensional data makes it well suited to this task. A further strength of the SVM is its ability to deal with non-linear relationships between features and target variables: in image quality evaluation, the relationship between image attributes and quality scores can be non-linear and complicated, and the kernel trick allows the SVM to implicitly map features into a higher-dimensional space, enabling the successful modeling of such relationships. Moreover, the SVM offers strong generalization and is less susceptible to overfitting, both of which are important when constructing a robust and trustworthy BIQA model.
The SVM regression model establishes the relationship between the feature set F and the quality ratings Q. A training set D of M labeled images is used to learn the regression function $f(p)$, which is defined as
$$f(p) = \sum_{l=1}^{M} \alpha_{l} y_{l} \, k(p, p_{l}) + w$$
In this formula, $p$ is an image sample expressed as a feature vector, and $\alpha_{l}$ and $y_{l}$ are the Lagrange multiplier and label of the $l$th training sample, respectively. The kernel function $k(p, p_{l})$ maps the feature vector $p$ into a higher-dimensional feature space, allowing for nonlinear decision boundaries. The radial basis function (RBF) kernel is utilized in this paper,
$$k(p, p_{l}) = \exp\!\left(-\gamma \lVert p - p_{l} \rVert^{2}\right)$$
where $\gamma$ denotes the kernel parameter.
The following optimization problem is solved to learn the regression coefficients $\alpha_{l}$ and the bias term $w$:
$$\min_{\alpha, w} \; \frac{1}{2} \sum_{l=1}^{M} \sum_{j=1}^{M} \alpha_{l} \alpha_{j} y_{l} y_{j} \, k(p_{l}, p_{j}) - \sum_{l=1}^{M} \alpha_{l} \quad \text{s.t.} \quad 0 \le \alpha_{l} \le C, \quad \sum_{l=1}^{M} \alpha_{l} y_{l} = 0$$
Here, $C$ is a hyperparameter that determines the tradeoff between maximizing the margin and reducing the training error. The optimization problem is solved iteratively using Sequential Minimal Optimization (SMO). Once trained on the training set, the SVM regression model can predict the quality scores of new, unseen images from their feature vectors using the learned regression function $f(p)$.
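A minimal sketch of this prediction stage with scikit-learn's SVR (whose underlying libsvm solver is SMO-based) is shown below; the feature standardization step and the C, gamma, and epsilon values are assumptions, and in practice they would be tuned by cross-validation.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def build_quality_predictor(C=10.0, gamma="scale", epsilon=0.1):
    """Standardize features, then fit epsilon-SVR with the RBF kernel
    k(p, p_l) = exp(-gamma * ||p - p_l||^2)."""
    return make_pipeline(StandardScaler(),
                         SVR(kernel="rbf", C=C, gamma=gamma, epsilon=epsilon))

# Usage sketch: X_train holds selected feature vectors, y_train the
# corresponding subjective quality scores.
# model = build_quality_predictor()
# model.fit(X_train, y_train)
# predicted_scores = model.predict(X_test)
```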

5. Evaluation Metrics

Several measures are used to compare predicted quality scores with ground-truth quality scores when evaluating the performance of BIQA models. To evaluate the proposed framework, we use four of the most commonly used evaluation metrics, discussed below.

5.1. Spearman Rank-Order Correlation Coefficient (SROCC)

SROCC measures the monotonic relationship between two ranked variables. In BIQA, it is used to assess the agreement between predicted and actual quality ratings. SROCC is defined as
$$\rho = 1 - \frac{6 \sum_{l=1}^{m} (r_{l} - q_{l})^{2}}{m(m^{2} - 1)},$$
where $m$ is the total number of images in the dataset, $r_{l}$ is the rank of the predicted quality rating of the $l$th image, and $q_{l}$ is the rank of its ground-truth rating. SROCC takes values between −1 and 1, with 1 indicating a perfect positive association, 0 indicating no correlation, and −1 indicating a perfect negative association.

5.2. Linear Correlation Coefficient (LCC)

LCC measures the linear relationship between two variables. In BIQA, it is used to assess the agreement between predicted and actual quality ratings. LCC is defined as
$$LCC = \frac{\sum_{l=1}^{m} (p_{l} - \bar{p})(q_{l} - \bar{q})}{\sqrt{\sum_{l=1}^{m} (p_{l} - \bar{p})^{2}} \sqrt{\sum_{l=1}^{m} (q_{l} - \bar{q})^{2}}},$$
where $p_{l}$ and $q_{l}$ are the predicted and true quality ratings of the $l$th image, and $\bar{p}$ and $\bar{q}$ are the averages of the predicted and ground-truth quality scores. LCC takes values between −1 and 1, with 1 indicating perfect positive correlation, 0 indicating no correlation, and −1 indicating perfect negative correlation.

5.3. Kendall Rank-Order Correlation Coefficient (KROCC)

KROCC assesses the ordinal relationship between two variables. In BIQA, it is used to assess the agreement between predicted and actual quality ratings. KROCC is defined as
$$\tau = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{\binom{m}{2}},$$
where $m$ is the total number of images in the database; a pair of images is concordant if the rankings of their predicted and actual scores agree, and discordant otherwise. KROCC takes values between −1 and 1, with 1 indicating perfect positive correlation, 0 indicating no correlation, and −1 indicating perfect negative correlation.

5.4. Root Mean Squared Error (RMSE)

RMSE gauges the difference between the predicted and actual quality ratings. It is defined as
$$RMSE = \sqrt{\frac{1}{M} \sum_{l=1}^{M} (q_{pred,l} - q_{true,l})^{2}}$$
where $M$ is the total number of test images, $q_{pred,l}$ is the predicted quality rating for image $l$, and $q_{true,l}$ is its actual quality rating. The smaller the RMSE, the better the model's performance.
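For reference, the four metrics can be computed as in the following sketch, assuming NumPy arrays of predicted and ground-truth scores; scipy.stats provides the three correlation coefficients directly.

```python
import numpy as np
from scipy import stats

def biqa_metrics(predicted, actual):
    """Compute SROCC, LCC, KROCC, and RMSE between predicted and
    ground-truth quality scores."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    srocc, _ = stats.spearmanr(predicted, actual)
    lcc, _ = stats.pearsonr(predicted, actual)
    krocc, _ = stats.kendalltau(predicted, actual)
    rmse = np.sqrt(np.mean((predicted - actual) ** 2))
    return {"SROCC": srocc, "LCC": lcc, "KROCC": krocc, "RMSE": rmse}
```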

6. Results and Discussion

In this section, a comparative analysis is carried out between the different feature types and between existing and proposed methods.

6.1. Performance Comparison of Different Features

The performance of the proposed Blind Image Quality Assessment (BIQA) methodology on three frequently used benchmark datasets is shown in Table 1. The table displays the results produced by employing various image attributes in the suggested technique. The proposed model employed four types of image features in this study: Haar wavelet coefficients (WT); Prewitt and Gaussian filtering (PG); Log and Gaussian filtering (LG); and the Prewitt, Sobel, and Gaussian filter combination (PSG).
Table 1 also includes the results of applying the Information Gain feature selection algorithm in the WEKA tool to the Prewitt and Gaussian (WEKA PG) and Prewitt, Sobel, and Gaussian (WEKA PSG) features, which achieved a performance improvement over the unselected features.

6.2. Feature Analysis

Figure 3 depicts the Prewitt and Gaussian (PG) filter feature vectors for four distinct forms of distortion from the CSIQ dataset: JPEG compression, white noise, Gaussian blur, and JPEG 2000 compression. Each subfigure depicts the PG feature vector distribution for a distinct type of distortion.
As seen in the figure, each form of distortion has a distinct feature vector pattern. In the first subfigure, for example, the feature vector for JPEG compression shows a different pattern compared with the other forms of distortion. This shows that the suggested technique is capable of capturing and learning the unique characteristics of each form of distortion.
Furthermore, the results show that by extracting and selecting relevant features with the Information Gain algorithm and then training the SVM regression model, the proposed framework can accurately predict the quality of distorted images. On the widely used benchmark datasets TID2013, CSIQ, and LIVE, the approach outperforms state-of-the-art methods in terms of accuracy and robustness.

6.3. Comparison with Existing Techniques

Table 2 compares the performance of the proposed Blind Image Quality Assessment (BIQA) model with that of several leading models, including FRIQUEE, NFERM, BRISQUE, BLINDS-II, CORNIA, and DIIVINE, in terms of LCC, SROCC, KROCC, and RMSE on all three benchmark datasets (CSIQ, LIVE, and TID2013). On all three datasets, the proposed model outperformed all other models on every evaluation metric. On the TID2013 dataset, for example, the proposed model achieved an LCC of 0.9453, an SROCC of 0.9289, a KROCC of 0.7851, and an RMSE of 0.4006, all of which are significantly better than the best results reported by the other models. Similarly, on the LIVE and CSIQ datasets, the suggested model consistently outperformed the other models on all assessment measures.
Several factors contribute to the superior performance of the proposed BIQA model. First, combining multiple image features, such as the wavelet transform, Prewitt and Sobel edge detection, and Log filters, and selecting among them with the Information Gain attribute evaluator captures a wide range of image attributes relevant to quality assessment. This enables the algorithm to discriminate successfully between different forms of image distortion and to estimate their quality reliably.
Second, using SVM regression as the underlying machine learning algorithm brings several advantages: it handles high-dimensional feature spaces, models non-linear relationships between features and quality scores, and generalizes well to new, previously unseen data.
Furthermore, the extensive experimental evaluation and analysis of the various feature types and feature selection techniques helps identify the most effective combination of features for BIQA. Overall, the proposed BIQA model provides a promising method for reliably and robustly assessing the quality of distorted images, with potential applications in image compression, transmission, and restoration.
It is worth highlighting that the use of machine learning techniques such as the SVM regression model provides computational efficiency: ML algorithms can analyze data effectively and generate predictions from previously learned patterns. Furthermore, the feature selection phase in the proposed system reduces the complexity of the feature space, enhancing computational efficiency even further.
Certain restrictions and possible downsides of the suggested approach must also be considered. First, the framework's performance depends strongly on the feature extraction techniques used, such as the wavelet transform and edge detection. While these strategies have yielded promising results, their efficacy may vary with the precise properties of the images under consideration. Furthermore, the framework's generalizability to various image types should be examined: although the system has been evaluated on benchmark datasets, its performance may differ when applied to other image categories or domains.

7. Conclusions

This study presented a system for Blind Image Quality Assessment (BIQA) that integrates feature extraction, feature selection, and support vector machine (SVM) regression. The incorporation of four separate types of image features, together with the Information Gain attribute approach for feature selection, contributes to the regression model's increased performance. Experiments on well-known benchmark datasets, TID2013, CSIQ, and LIVE, show that the proposed framework outperforms previous techniques in terms of accuracy and robustness. Furthermore, this study performed extensive tests to investigate the effect of the various feature types and feature selection methodologies on the framework's performance. By including a wide set of image properties, the system improves the resilience and accuracy of the quality prediction model, while the Information Gain approach to feature selection minimizes the complexity of the feature space and improves the regression model's performance. As a whole, the proposed methodology outperforms the competition across multiple benchmark datasets, proving its efficacy and robustness in blind image quality evaluation. These findings illustrate the major benefits of the proposed methodology and help to advance the discipline. It is important to note, however, that the suggested approach has limitations, such as its dependence on particular datasets and the need for further development of the feature extraction techniques. Future work should address these constraints and investigate the framework's use in additional image-processing tasks.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. ur Rehman, M.; Nizami, I.F.; Majid, M. DeepRPN-BIQA: Deep architectures with region proposal network for natural-scene and screen-content blind image quality assessment. Displays 2022, 71, 102101.
  2. Qi, K.; Li, H.; Rong, C.; Gong, Y.; Li, C.; Zheng, H.; Wang, S. Blind Image Quality Assessment for MRI with a Deep Three-Dimensional Content-Adaptive Hyper-Network. arXiv 2021, arXiv:2107.06888.
  3. Li, G.; Yang, Y.; Qu, X.; Cao, D.; Li, K. A deep learning based image enhancement approach for autonomous driving at night. Knowl.-Based Syst. 2021, 213, 106617.
  4. Rajevenceltha, J.; Gaidhane, V.H. An efficient approach for no-reference image quality assessment based on statistical texture and structural features. Eng. Sci. Technol. Int. J. 2022, 30, 101039.
  5. Li, Q.; Lin, W.; Gu, K.; Zhang, Y.; Fang, Y. Blind image quality assessment based on joint log-contrast statistics. Neurocomputing 2019, 331, 189–198.
  6. Xu, L.; Jiang, X. Blind image quality assessment for anchor-assisted adaptation to practical situations. Multimed. Tools Appl. 2023, 83, 17929–17946.
  7. Zhai, G.; Min, X. Perceptual image quality assessment: A survey. Sci. China Inf. Sci. 2020, 63, 211301.
  8. Nizami, I.F.; Rehman, M.U.; Waqar, A.; Majid, M. Impact of visual saliency on multi-distorted blind image quality assessment using deep neural architecture. Multimed. Tools Appl. 2022, 81, 25283–25300.
  9. Hosu, V.; Lin, H.; Sziranyi, T.; Saupe, D. KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment. IEEE Trans. Image Process. 2020, 29, 4041–4056.
  10. Nizami, I.F.; Majid, M.; Rehman, M.U.; Anwar, S.M.; Nasim, A.; Khurshid, K. No-reference image quality assessment using bag-of-features with feature selection. Multimed. Tools Appl. 2020, 79, 7811–7836.
  11. Nizami, I.F.; Rehman, M.U.; Majid, M.; Anwar, S.M. Natural scene statistics model independent no-reference image quality assessment using patch based discrete cosine transform. Multimed. Tools Appl. 2020, 79, 26285–26304.
  12. Ribeiro, R.; Trifan, A.; Neves, A.J. Blind Image Quality Assessment with Deep Learning: A Replicability Study and Its Reproducibility in Lifelogging. Appl. Sci. 2023, 13, 59.
  13. Fateh, A.; Fateh, M.; Abolghasemi, V. Multilingual handwritten numeral recognition using a robust deep network joint with transfer learning. Inf. Sci. 2021, 581, 479–494.
  14. Yang, G.; Wang, Y. Deep Superpixel-Based Network for Blind Image Quality Assessment. arXiv 2021, arXiv:2110.06564.
  15. Wu, J.; Ma, J.; Liang, F.; Dong, W.; Shi, G.; Lin, W. End-to-end blind image quality prediction with cascaded deep neural network. IEEE Trans. Image Process. 2020, 29, 7414–7426.
  16. Rehman, M.U.; Tayara, H.; Zou, Q.; Chong, K.T. i6mA-Caps: A CapsuleNet-based framework for identifying DNA N6-methyladenine sites. Bioinformatics 2022, 38, 3885–3891.
  17. Rehman, M.U.; Tayara, H.; Chong, K.T. DL-m6A: Identification of N6-methyladenosine sites in mammals using deep learning based on different encoding schemes. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022, 20, 904–911.
  18. Rehman, M.U.; Tayara, H.; Chong, K.T. DCNN-4mC: Densely connected neural network based N4-methylcytosine site prediction in multiple species. Comput. Struct. Biotechnol. J. 2021, 19, 6009–6019.
  19. Rakhmatulin, I.; Kamilaris, A.; Andreasen, C. Deep neural networks to detect weeds from crops in agricultural environments in real-time: A review. Remote Sens. 2021, 13, 4486.
  20. Espejo-Garcia, B.; Mylonas, N.; Athanasakos, L.; Fountas, S. Improving weeds identification with a repository of agricultural pre-trained deep neural networks. Comput. Electron. Agric. 2020, 175, 105593.
  21. Rehman, M.U.; Ryu, J.; Nizami, I.F.; Chong, K.T. RAAGR2-Net: A brain tumor segmentation network using parallel processing of multiple spatial frames. Comput. Biol. Med. 2023, 152, 106426.
  22. Rehman, M.U.; Akhtar, S.; Zakwan, M.; Mahmood, M.H. Novel architecture with selected feature vector for effective classification of mitotic and non-mitotic cells in breast cancer histology images. Biomed. Signal Process. Control 2022, 71, 103212.
  23. Chetouani, A.; Quach, M.; Valenzise, G.; Dufaux, F. Combination of Deep Learning-based and Handcrafted Features for Blind Image Quality Assessment. In Proceedings of the 2021 9th European Workshop on Visual Information Processing (EUVIP), Paris, France, 23–25 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
  24. Pan, Z.; Zhang, H.; Lei, J.; Fang, Y.; Shao, X.; Ling, N.; Kwong, S. DACNN: Blind image quality assessment via a distortion-aware convolutional neural network. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 7518–7531.
  25. Ghadiyaram, D.; Bovik, A.C. Perceptual quality prediction on authentically distorted images using a bag of features approach. J. Vis. 2017, 17, 32.
  26. Gu, K.; Zhai, G.; Yang, X.; Zhang, W. Using free energy principle for blind image quality assessment. IEEE Trans. Multimed. 2014, 17, 50–63.
  27. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
  28. Saad, M.A.; Bovik, A.C.; Charrier, C. Blind image quality assessment: A natural scene statistics approach in the DCT domain. IEEE Trans. Image Process. 2012, 21, 3339–3352.
  29. Ye, P.; Kumar, J.; Kang, L.; Doermann, D. Unsupervised feature learning framework for no-reference image quality assessment. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1098–1105.
  30. Moorthy, A.K.; Bovik, A.C. Blind image quality assessment: From natural scene statistics to perceptual quality. IEEE Trans. Image Process. 2011, 20, 3350–3364.
Figure 1. Sample images from the datasets.
Figure 2. Overview of the proposed framework for Blind Image Quality Assessment. The framework includes three main stages: Feature Extraction, Feature Selection, and Quality Prediction.
Figure 3. Visual word occurrences for different feature vectors.
Table 1. Performance comparison on the TID2013, CSIQ, and LIVE datasets. WT: Wavelet Transform; PG: Prewitt and Gaussian; LG: Log and Gaussian; PSG: Prewitt, Sobel and Gaussian.

Feature    Metric |  TID2013                                                 |  CSIQ                            |  LIVE
                  |  SSR     SCN     HFN     IN      ID      GBLUR   JPEG    |  Awgn    JPEG    Jp2k    Fnoise  |  Jp2k    JPEG    Wn      Gblur   FF
WT         SROCC  |  0.9054  0.9553  0.9320  0.9069  0.9014  0.9181  0.9308  |  0.9365  0.9012  0.9217  0.8928  |  0.9245  0.9204  0.9724  0.9586  0.9216
           LCC    |  0.9111  0.9654  0.9654  0.9078  0.9201  0.9219  0.9779  |  0.9517  0.9348  0.9356  0.9011  |  0.9376  0.9502  0.9819  0.9634  0.9253
           KROCC  |  0.7467  0.8333  0.7903  0.7446  0.7323  0.7667  0.7933  |  0.7931  0.7537  0.7655  0.7149  |  0.7727  0.7761  0.8966  0.8424  0.7492
           RMSE   |  0.8177  0.2038  0.2792  0.2630  0.6634  0.4972  0.3323  |  0.0547  0.1117  0.1140  0.1006  |  5.7690  5.1151  2.9712  4.3413  6.3678
PG         SROCC  |  0.9169  0.9292  0.9363  0.9120  0.9201  0.9331  0.9159  |  0.9429  0.8987  0.9253  0.8921  |  0.9240  0.9167  0.9773  0.9655  0.9231
           LCC    |  0.9346  0.9345  0.9710  0.9131  0.9440  0.9404  0.9610  |  0.9539  0.9310  0.9295  0.8959  |  0.9340  0.9430  0.9889  0.9698  0.9267
           KROCC  |  0.7533  0.7867  0.8001  0.7600  0.7713  0.7960  0.7803  |  0.7636  0.7448  0.7571  0.7611  |  0.7652  0.7648  0.8916  0.8621  0.7635
           RMSE   |  0.7210  0.2865  0.2580  0.2498  0.5465  0.4460  0.3936  |  0.0548  0.1053  0.1197  0.1051  |  6.0187  5.4605  2.5090  3.9717  6.1869
LG         SROCC  |  0.9311  0.9130  0.9333  0.9021  0.9032  0.9042  0.9123  |  0.9369  0.8891  0.9150  0.9034  |  0.9174  0.9124  0.9842  0.9542  0.9097
           LCC    |  0.9482  0.9172  0.9709  0.9042  0.9260  0.9082  0.9664  |  0.9445  0.9330  0.9278  0.9004  |  0.9243  0.9358  0.9920  0.9602  0.9405
           KROCC  |  0.7867  0.7656  0.8001  0.7579  0.7401  0.7379  0.7600  |  0.7833  0.7291  0.7589  0.7287  |  0.7576  0.7559  0.9064  0.8325  0.7635
           RMSE   |  0.6049  0.3118  0.2529  0.2509  0.5689  0.5458  0.3949  |  0.0574  0.1130  0.1195  0.1012  |  6.4027  5.8812  2.1291  4.5495  5.7394
PSG        SROCC  |  0.9177  0.9208  0.9323  0.9238  0.9078  0.9385  0.9046  |  0.9423  0.8940  0.9030  0.8989  |  0.9174  0.9122  0.9823  0.9675  0.9167
           LCC    |  0.9326  0.9293  0.9681  0.9296  0.9403  0.9491  0.9444  |  0.9550  0.9187  0.9211  0.8979  |  0.9298  0.9401  0.9902  0.9704  0.9297
           KROCC  |  0.7667  0.7780  0.7933  0.7800  0.7523  0.8000  0.7567  |  0.8128  0.7389  0.7586  0.7517  |  0.7538  0.7546  0.8916  0.8473  0.7538
           RMSE   |  0.6785  0.2878  0.2597  0.2249  0.5616  0.4107  0.4460  |  0.0481  0.1186  0.1185  0.0967  |  6.2532  5.5626  2.3792  4.1871  5.7901
WEKA PG    SROCC  |  0.9200  0.9305  0.9416  0.9185  0.9265  0.9348  0.9301  |  0.9453  0.9135  0.9336  0.9127  |  0.9295  0.9196  0.9788  0.9688  0.9312
           LCC    |  0.9298  0.9348  0.9735  0.9195  0.9458  0.9422  0.9716  |  0.9549  0.9472  0.9355  0.9259  |  0.9407  0.9447  0.9901  0.9722  0.9366
           KROCC  |  0.7667  0.7923  0.8047  0.7667  0.7800  0.7986  0.7867  |  0.8079  0.7562  0.7954  0.7621  |  0.7727  0.7637  0.8966  0.8674  0.7712
           RMSE   |  0.7084  0.2760  0.2396  0.2418  0.5339  0.4360  0.3688  |  0.0515  0.1041  0.1145  0.1029  |  5.6505  5.3760  2.4960  3.8752  5.9869
WEKA PSG   SROCC  |  0.9222  0.9267  0.9383  0.9255  0.9104  0.9378  0.9085  |  0.9458  0.9116  0.9163  0.9012  |  0.9251  0.9148  0.9836  0.9678  0.9178
           LCC    |  0.9348  0.9323  0.9695  0.9323  0.9433  0.9450  0.9577  |  0.9571  0.9327  0.9286  0.9022  |  0.9360  0.9421  0.9911  0.9708  0.9311
           KROCC  |  0.7733  0.7850  0.7975  0.7867  0.7583  0.7989  0.7578  |  0.8130  0.7411  0.7648  0.7621  |  0.7727  0.7634  0.9015  0.8621  0.7638
           RMSE   |  0.6824  0.2838  0.2472  0.2176  0.5587  0.4097  0.4423  |  0.0478  0.1149  0.1143  0.0943  |  5.9252  5.4526  2.2905  3.8823  5.7641
Table 2. Performance Comparison of Proposed Method with Existing BIQA Techniques.

BIQA Model      |  LIVE Dataset                    |  TID2013 Dataset                 |  CSIQ Dataset
                |  LCC     SROCC   KROCC   RMSE    |  LCC     SROCC   KROCC   RMSE    |  LCC     SROCC   KROCC   RMSE
FRIQUEE [25]    |  0.9411  0.9347  0.7817  9.2061  |  0.7688  0.6926  0.5161  0.7965  |  0.9069  0.8815  0.7077  0.1113
NFERM [26]      |  0.9463  0.9427  0.8063  8.8021  |  0.7465  0.6747  0.4976  0.8301  |  0.8658  0.8213  0.6394  0.1298
BRISQUE [27]    |  0.9482  0.9436  0.8005  8.6605  |  0.6213  0.5739  0.4149  0.9668  |  0.8311  0.7403  0.5590  0.1442
BLINDS-II [28]  |  0.9370  0.9298  0.7754  9.5072  |  0.6511  0.5723  0.4137  0.9403  |  0.8134  0.7528  0.5652  0.1522
CORNIA [29]     |  0.9473  0.9452  0.7953  8.7478  |  0.7451  0.6542  0.4770  0.8247  |  0.8044  0.7325  0.5464  0.1554
DIIVINE [30]    |  0.9134  0.9120  0.7487  11.096  |  0.7294  0.6735  0.4947  0.8504  |  0.8077  0.7594  0.5718  0.1546
Proposed Model  |  0.9569  0.9456  0.8143  4.6769  |  0.9453  0.9289  0.7851  0.4006  |  0.9409  0.9263  0.7804  0.0932
