Article

UMGAN: Underwater Image Enhancement Network for Unpaired Image-to-Image Translation

Boyang Sun, Yupeng Mei, Ni Yan and Yingyi Chen

1 National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
2 Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, Beijing 100083, China
3 Beijing Engineering and Technology Research Centre for the Internet of Things in Agriculture, China Agricultural University, Beijing 100083, China
4 College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2023, 11(2), 447; https://doi.org/10.3390/jmse11020447
Submission received: 27 January 2023 / Revised: 11 February 2023 / Accepted: 13 February 2023 / Published: 17 February 2023

Abstract

Due to light absorption and scattering, underwater images suffer from low contrast, color distortion, blurred details, and uneven illumination, which affect underwater vision tasks and research. Underwater image enhancement is therefore of great significance in vision applications. In contrast to existing methods designed for specific underwater environments or reliant on paired datasets, this study proposes an underwater multiscene generative adversarial network (UMGAN) to enhance underwater images. The network implements unpaired image-to-image translation between the underwater turbid domain and the underwater clear domain and achieves a strong enhancement effect on several types of underwater images. A feedback mechanism and a noise reduction network are designed to optimize the generator and address the issue of noise and artifacts in GAN-produced images. Furthermore, a global–local discriminator is employed to improve the overall image while adaptively modifying the local region image effect, resolving the issue of over- and underenhancement in local regions. The reliance on paired training data is eliminated through a cycle consistency network structure. UMGAN performs satisfactorily on various types of data when compared quantitatively and qualitatively with other state-of-the-art algorithms. It has strong robustness and can be applied to enhancement tasks in a variety of scenes.

1. Introduction

Underwater imaging is a significant source of oceanic information, advancing the study of marine biology, marine archaeology, marine ecology, and naval and military applications [1,2]. The effects of absorption and scattering on the underwater imaging process lead to significant image deterioration with low contrast, color distortion, blurred details, and uneven illumination [3]. Degraded images seriously affect underwater vision tasks and research. It is of great scientific significance to improve the visual quality of original underwater images with the help of image processing technology. However, the degree of deterioration and the types of degradation differ from one underwater image to the next. Therefore, underwater image enhancement for different scenes is a challenging task.
To address these issues, researchers currently employ a variety of techniques, including contrast enhancement [4,5], frequency domain enhancement [6,7], color constancy [8,9], and image fusion [10,11]. However, these techniques frequently focus on a single data type; they have low robustness and do not handle different styles of underwater images well. With the development of image processing technology [12,13,14,15,16], deep learning techniques have advanced significantly in the field of image processing in recent years, and an increasing number of researchers are applying deep learning methods to underwater image enhancement [17,18,19,20,21]. Compared with traditional methods, deep learning methods trained with enormous amounts of data can better handle images of different scenes.
In this study, a generative adversarial network called UMGAN is proposed for the enhancement of images from several underwater scenes. A GAN is adopted to build unpaired mappings between the turbid and clear underwater image domains, which frees the method from reliance on precisely paired images. According to the characteristics of the underwater environment, a feedback mechanism based on underwater image quality measurement is proposed: based on the quality of the generated images, feedback is provided to the generator to continuously optimize the generated results. A noise reduction network is designed to suppress image noise during the training process, since artifacts and texture noise are frequently present in the images produced by GANs. In addition, dual discriminators are used to balance global and local enhancement; the issues of over- and underenhancement in local regions are avoided by adaptive modification of the enhancement effect in each region. Figure 1 shows underwater images of different scenes and the corresponding enhancement effects.
The main contributions of this work are summarized according to three aspects. First, a feedback mechanism for the characteristics of underwater images is proposed. The generator is optimized by providing constant feedback about the image quality, achieving image enhancement for a variety of underwater scenes. Second, the noise reduction network is designed. The problem of artifacts and textural noise in the images generated by GAN is solved. Third, a global–local discriminator is employed. Dual discriminators weight the image results and adaptively enhance local regions while enhancing the global image.

2. Related Work

2.1. Underwater Image Enhancement

When light is transmitted in water, it is affected by both absorption and scattering. Therefore, underwater images often suffer from low contrast, color distortion, blurred details, and uneven illumination [22]. Two contrast enhancement methods, ICM [23] and UCM [4], were proposed by Iqbal et al., who were the first to research underwater image enhancement. Inspired by Iqbal et al., a two-stage technique for underwater image contrast enhancement and color correction was presented by Ghani et al. [5]. A series of histogram equalization methods (HE [24], AHE [25], and CLAHE [26]) was also used to enhance underwater images [27,28]. Jin et al. [29] combined CLAHE with Gaussian differential pyramids to solve the problem of low-contrast and blurred details in underwater images.
Frequency domain techniques such as the Fourier transform and wavelet transform are used by researchers for underwater image enhancement [6,7]. Retinex, which models the color constancy of human vision, is also widely used for image enhancement. Extended multiscale retinex was utilized by Zhang et al. [8] to enhance underwater image quality. Zhou et al. [30] used multiscale retinex to extract the light components and adjust the three channels for color correction. Image fusion entails performing several enhancement operations on the source image and then merging the various improved images by weighting. Following white balance adjustment, Ancuti et al. [10] performed a multiscale fusion of underwater images. A combination of enhanced background filtering and wavelet fusion methods was used by Ghani et al. [11]. They then integrated homomorphic filtering, recursive overlapping CLAHS, and dual-image wavelet fusion for underwater image enhancement [31]. Lei et al. [32] implemented underwater image enhancement based on color correction and dual-image multiscale fusion.

2.2. Deep Learning Methods

In recent years, an increasing number of researchers have applied deep learning methods to underwater image enhancement. Depending on the network model used, deep-learning-based underwater image enhancement methods can be classified as convolutional neural network (CNN)-based or generative adversarial network (GAN)-based methods. Wang et al. [33] first applied a CNN to underwater image enhancement. Li et al. proposed Water-Net [21] and UWCNN [20] to implement end-to-end underwater image enhancement. To better support the deployment of underwater exploration missions on portable devices, Shallow-UWnet [34] maintains performance while reducing the number of parameters.
WaterGAN [17] is an example of the early use of GANs for underwater image enhancement. It alleviates the reliance on real paired underwater training data by synthesizing underwater-style images from in-air RGB images and depth maps. Researchers have focused on the application of GANs to underwater image enhancement from two directions: the conditional generative adversarial network (cGAN) and the cycle-consistent generative adversarial network (CycleGAN). Guo et al. [35] proposed a multiscale dense GAN for underwater image enhancement. Zhou et al. combined the physical model with a GAN and used an underwater imaging model to simulate an underwater training dataset from RGB-D data. Inspired by CycleGAN [36], a weakly supervised color-transfer method was proposed by Li et al. [18] to solve the color distortion problem in underwater images. On this basis, Lu et al. [19] proposed a multiscale CycleGAN (MCycleGAN) by combining the dark channel prior and CycleGAN. FUnIE-GAN [37] implements real-time underwater image enhancement. Du et al. [38] added a content-loss regularizer and a blur-promoting adversarial loss regularizer to CycleGAN to retain detailed information in the enhanced images.

3. Materials and Methods

An underwater multiscene generative adversarial network (UMGAN) is proposed for underwater image enhancement. In the forward process, the turbid underwater image is converted to a clear image by the generator G and the noise reduction network. A quality feedback mechanism is designed: the quality of the generated images is fed back to the generator to continuously optimize the generated results. A global–local discriminator is then used to jointly judge the quality of the generated image. In the same way, the generator F maps the clear image domain to the turbid image domain. SSIM loss is also used to keep the image structure consistent between the two domains. The overall framework is shown in Figure 2.
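As a concrete illustration of this data flow, the following PyTorch-style sketch runs both mapping directions once; G, F, and denoise are placeholder modules standing in for the two generators and the noise reduction network, so the names and structure are assumptions rather than the authors' implementation.

```python
# Minimal sketch of the cycle structure described above (illustrative only).
import torch.nn as nn

def cycle_forward(x_turbid, y_clear, G: nn.Module, F: nn.Module, denoise: nn.Module):
    """Run both mapping directions once and return all intermediate images."""
    # Forward process: turbid -> clear (with noise reduction) -> back to turbid.
    fake_clear = denoise(G(x_turbid))
    rec_turbid = F(fake_clear)
    # Backward process: clear -> turbid -> back to clear.
    fake_turbid = F(y_clear)
    rec_clear = G(fake_turbid)
    return fake_clear, rec_turbid, fake_turbid, rec_clear
```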

3.1. Underwater Image Quality Measure Feedback

A non-reference underwater image quality measure (UIQM) was proposed by Panetta et al. [39]. The UIQM includes three image attribute measures: the underwater image colorfulness measure (UICM), the underwater image sharpness measure (UISM), and the underwater image contrast measure (UIConM). Inspired by Panetta et al.’s work, a feedback mechanism for underwater images is proposed. The UIQM loss function is designed to evaluate the quality of the generated images in terms of colorfulness, sharpness, and contrast, which is fed back to the generator to optimize the generated image results. During the training process, G mapping continuously receives feedback from the UIQM loss function, and the quality of the generated images is gradually improved. In the forward process, the UICM, UISM, and UIConM values of the generated images are fed back to the generation module. The UIQM loss of the image is expressed as follows:
$$L_{UIQM}(I, G) = \frac{1}{n}\sum_{pix=1}^{n}\lambda_1 L_{UICM}\big(G(I)_{pix}\big) + \frac{1}{n}\sum_{pix=1}^{n}\lambda_2 L_{UISM}\big(G(I)_{pix}\big) + \frac{1}{n}\sum_{pix=1}^{n}\lambda_3 L_{UIConM}\big(G(I)_{pix}\big)$$

where $I$ denotes the input turbid image, $G(\cdot)$ denotes the enhanced output of the generator, $pix$ denotes each pixel of the image, and $n$ denotes the total number of pixels. $\lambda_1$, $\lambda_2$, and $\lambda_3$ are set to 1, 2.5, and 2, respectively, as determined by extensive experiments on the training dataset.
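For illustration, the following sketch shows how such a weighted feedback term could be assembled in PyTorch; the three component losses are placeholder callables (the full UICM/UISM/UIConM computations from [39] are omitted), and reducing each term to a per-image mean is a simplification of the per-pixel sums above.

```python
# Hedged sketch of the UIQM feedback loss: a weighted sum of colorfulness,
# sharpness, and contrast penalties on the generator output. uicm_loss,
# uism_loss, and uiconm_loss are placeholder callables (assumed, not the
# authors' code); the weights follow the values reported in the text.
import torch

LAMBDA_UICM, LAMBDA_UISM, LAMBDA_UICONM = 1.0, 2.5, 2.0

def uiqm_feedback_loss(enhanced: torch.Tensor, uicm_loss, uism_loss, uiconm_loss) -> torch.Tensor:
    """enhanced: generator output batch of shape (N, 3, H, W), values in [0, 1]."""
    return (LAMBDA_UICM * uicm_loss(enhanced).mean()
            + LAMBDA_UISM * uism_loss(enhanced).mean()
            + LAMBDA_UICONM * uiconm_loss(enhanced).mean())
```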

3.2. Noise Reduction Network

The images generated by GAN networks frequently contain noise. To address this, a noise reduction network is added after the generator. Before being judged by the discriminator, the image matrix created during each training round is passed through the noise reduction network, which improves the quality of the generated images in each round. The network's filter is based on Gaussian filtering: a 3 × 3 Gaussian kernel is convolved with the generated images. The noise reduction network effectively mitigates the severe textural noise in the images generated by the GAN. A visual comparison after adding the noise reduction network is shown in Figure 3.
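A minimal sketch of this filtering step is given below, assuming a fixed, normalized 3 × 3 Gaussian kernel applied depthwise to the generated image; the exact kernel weights and padding are not specified in the text and are therefore assumptions.

```python
# Sketch of the noise reduction step: a fixed 3x3 Gaussian kernel convolved
# depthwise over the generated image batch (kernel values are an assumption).
import torch
import torch.nn.functional as nnf

def gaussian_denoise(img: torch.Tensor) -> torch.Tensor:
    """img: generated image batch of shape (N, 3, H, W)."""
    k = torch.tensor([[1.0, 2.0, 1.0],
                      [2.0, 4.0, 2.0],
                      [1.0, 2.0, 1.0]]) / 16.0           # normalized 3x3 Gaussian
    kernel = k.view(1, 1, 3, 3).repeat(3, 1, 1, 1)        # one kernel per channel
    return nnf.conv2d(img, kernel.to(img), padding=1, groups=3)
```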

3.3. Global–Local Discriminators

Different scenes in a picture are located at different distances from the camera, and lighting conditions also differ from region to region. A global discriminator alone often cannot adaptively enhance every local region in an image. Inspired by previous work [40], a global–local discriminator structure for underwater image enhancement is designed. It adaptively enhances local areas while enhancing the global image. The original global discriminator judges the enhancement effect of the full image, while the local discriminator crops random local blocks from the generated image and judges each block separately. This method avoids the problems of over- and underenhancement of local areas. The combined result of the two discriminators is:
$$D(I) = \alpha D_{Global}(I) + \sum_{k=1}^{n}\beta_k D_{Local}\big(I_k^{ij}\big)$$

where $\alpha$ and $\beta_k$ are the weights of the two discriminators, and $\alpha + \sum_{k=1}^{n}\beta_k = 1$. $i$ and $j$ denote the height and width of the randomly cropped block, respectively ($i = 64$, $j = 64$). To ensure the stability of the cycle structure, local discriminators are introduced in both the forward and backward processes. The visual comparison after adding the global–local discriminator is shown in Figure 4.
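The following sketch illustrates how such a weighted global–local decision could be computed; the number of crops, the equal β_k weights, and the mean-reduction of each discriminator output are illustrative assumptions, and d_global and d_local are placeholder discriminators.

```python
# Sketch of the weighted global-local decision: one global score plus scores
# on randomly cropped 64x64 patches, with weights that sum to one.
import torch

def global_local_score(img, d_global, d_local, alpha=0.5, n_crops=4, size=64):
    """img: (N, 3, H, W); d_global and d_local are placeholder discriminators."""
    betas = [(1.0 - alpha) / n_crops] * n_crops            # alpha + sum(betas) = 1
    score = alpha * d_global(img).mean()
    _, _, h, w = img.shape
    for beta in betas:
        top = torch.randint(0, h - size + 1, (1,)).item()
        left = torch.randint(0, w - size + 1, (1,)).item()
        patch = img[:, :, top:top + size, left:left + size]
        score = score + beta * d_local(patch).mean()
    return score
```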

3.4. Loss Function

UMGAN realizes underwater image enhancement by learning from unpaired datasets, which eliminates the dependence on paired training data. For this purpose, a set of loss functions is proposed to evaluate the effect of image enhancement. The following four loss functions are used to train the network.

3.4.1. UIQM Loss

UIQM is introduced to measure the colorfulness, sharpness, and contrast of the generated images, and the evaluation results are fed back to the generator to optimize the generated results. Since only the image generated from the turbid input needs to be enhanced, the UIQM loss is applied only in the forward process. Referring to the UIQM loss defined in Section 3.1, the UIQM loss of the algorithm is expressed as follows:

$$L_{UIQM}(X, G_{X2Y}) = \frac{1}{n}\sum_{pix=1}^{n}\lambda_1 L_{UICM}\big(G(X)_{pix}\big) + \frac{1}{n}\sum_{pix=1}^{n}\lambda_2 L_{UISM}\big(G(X)_{pix}\big) + \frac{1}{n}\sum_{pix=1}^{n}\lambda_3 L_{UIConM}\big(G(X)_{pix}\big)$$

3.4.2. Adversarial Loss (Global–Local)

Adversarial loss is used to train the generators and discriminators to make the generated images more realistic. The loss function at the global level is as follows:
$$L_{GAN}^{Global}(G, D_Y, X, Y) = \mathbb{E}_{y\sim p_{data}(y)}\big[\log D_Y(y)\big] + \mathbb{E}_{x\sim p_{data}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big]$$

The mapping function $F: Y \rightarrow X$ and its discriminator $D_X$ are trained with a similar adversarial loss, i.e., $L_{GAN}^{Global}(F, D_X, Y, X)$. The loss function on the local discriminator is:

$$L_{GAN}^{Local}(G, D_L, X, Y) = \mathbb{E}_{y_{ij}\sim y}\big[\log D_L(y_{ij})\big] + \mathbb{E}_{y_{ij}\sim G(x)}\big[\log\big(1 - D_L(y_{ij})\big)\big]$$

Referring to the weighted combination of the two discriminators in Section 3.3, the total adversarial loss is:

$$L_{GAN} = \alpha\big(L_{GAN}^{Global}(G, D_Y, X, Y) + L_{GAN}^{Global}(F, D_X, Y, X)\big) + \sum_{k=1}^{n}\beta_k\big(L_{GAN}^{Local}(G, D_L, X, Y) + L_{GAN}^{Local}(F, D_X, Y, X)\big)$$
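For reference, the sketch below writes these adversarial terms in the standard binary cross-entropy form; whether training uses this exact GAN variant rather than, for example, a least-squares loss is not stated here, so the code is purely illustrative.

```python
# Hedged sketch of the adversarial terms in the standard BCE form.
import torch
import torch.nn.functional as nnf

def discriminator_loss(d, real, fake):
    """E[log D(y)] + E[log(1 - D(G(x)))], maximized by the discriminator."""
    real_out, fake_out = d(real), d(fake.detach())
    return (nnf.binary_cross_entropy_with_logits(real_out, torch.ones_like(real_out))
            + nnf.binary_cross_entropy_with_logits(fake_out, torch.zeros_like(fake_out)))

def generator_loss(d, fake):
    """Generator side: push D(G(x)) toward the 'real' label."""
    fake_out = d(fake)
    return nnf.binary_cross_entropy_with_logits(fake_out, torch.ones_like(fake_out))
```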

3.4.3. Cycle Consistency Loss

The space of potential mapping functions is constrained by the cycle consistency loss. In the forward process, each image $x$ from domain $X$ should be mapped back to itself after mapping $G$ and then mapping $F$, i.e., $x \rightarrow G(x) \rightarrow F(G(x)) \approx x$. Similarly, $G$ and $F$ should satisfy backward cycle consistency, i.e., $y \rightarrow F(y) \rightarrow G(F(y)) \approx y$. The cycle consistency loss is:

$$L_{cyc}(G, F) = \mathbb{E}_{x\sim p_{data}(x)}\big[\lVert F(G(x)) - x\rVert_1\big] + \mathbb{E}_{y\sim p_{data}(y)}\big[\lVert G(F(y)) - y\rVert_1\big]$$
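This reduces to a simple L1 penalty on the reconstructions, as in the sketch below (tensor names are illustrative).

```python
# Minimal sketch of the cycle consistency loss: L1 distance between each
# input and its reconstruction through both mappings.
import torch

def cycle_consistency_loss(x, rec_x, y, rec_y):
    """rec_x = F(G(x)) and rec_y = G(F(y)); all tensors share one shape."""
    return (rec_x - x).abs().mean() + (rec_y - y).abs().mean()
```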

3.4.4. SSIM Loss

There is a strong correspondence between the enhanced image and the original image. SSIM loss preserves the content and structure between them [18]. The formula for SSIM is:
$$SSIM(p) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}\cdot\frac{2\sigma_{xy} + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$$

where $p$ is the center pixel, and $x$ and $y$ are the image patches in $X$ and $G(x)$, respectively. A 13 × 13 sliding window is used for convolutional filtering. $\mu_x$ is the mean of $x$, $\sigma_x$ is the standard deviation of $x$, and $\sigma_{xy}$ is the covariance of $x$ and $y$; $C_1 = 0.02$ and $C_2 = 0.03$. The SSIM loss can be expressed as follows:

$$L_{SSIM}(x, G(x)) = 1 - \frac{1}{n}\sum_{p=1}^{n}SSIM(p)$$

where $n$ is the number of pixels. SSIM losses are calculated in both the forward and backward processes; the SSIM loss in the backward process is $L_{SSIM}(y, F(y))$.
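A compact sketch of this loss is shown below, using a 13 × 13 sliding window and the constants given above; the use of a uniform (box) window for the local statistics is an assumption made for brevity.

```python
# Sketch of the SSIM loss with a 13x13 sliding window, C1 = 0.02, C2 = 0.03.
import torch
import torch.nn.functional as nnf

def ssim_loss(x: torch.Tensor, y: torch.Tensor, win: int = 13,
              c1: float = 0.02, c2: float = 0.03) -> torch.Tensor:
    """x, y: image batches of shape (N, C, H, W) with values in [0, 1]."""
    mean = lambda t: nnf.avg_pool2d(t, win, stride=1, padding=win // 2)
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean(x * x) - mu_x ** 2                 # local variance of x
    var_y = mean(y * y) - mu_y ** 2                 # local variance of y
    cov_xy = mean(x * y) - mu_x * mu_y              # local covariance
    ssim_map = ((2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)) * \
               ((2 * cov_xy + c2) / (var_x + var_y + c2))
    return 1.0 - ssim_map.mean()                    # 1 - mean SSIM over all pixels
```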
The total loss function is a linear combination of the above four losses and is expressed as follows:
$$Loss = L_{UIQM} + L_{GAN} + L_{cyc} + L_{SSIM}$$

4. Discussion

4.1. Dataset and Implementation Details

Since UMGAN can be trained using unpaired underwater turbid/clear images, it is possible to collect unpaired training sets covering different image qualities and contents. A large number of training samples ensures the performance of the model and avoids overfitting. The images from the UIEB [21] and EUVP [37] datasets were indexed with natural numbers starting from 1; the combined dataset totals 15,980 clear images and 18,405 blurred images. The images for training were selected by generating random numbers. We first trained on the entire dataset, which took more than 72 h. After comparing the experimental results, we found no significant difference between training on the randomly selected images and on the full dataset, meaning that the randomly selected subset is large enough to guarantee the effectiveness of the algorithm. To complete the extensive training required for the subsequent ablation and comparison experiments efficiently, we therefore adopted the randomly selected subset. The final selection from the dataset was 2397 underwater turbid images and 2635 clear images. All input images were converted to PNG format and resized to 256 × 256 × 3 pixels. Training dataset samples are shown in Figure 5.
First, 100 epochs are trained from scratch with a learning rate of $10^{-4}$. Then, another 100 epochs are performed, during which the learning rate decays linearly to 0. The learning rate is set to $10^{-4}$ because a number of parameters in the algorithm affect the final result and interact with each other; the lower learning rate ensures that no local minima are skipped and avoids overadjusting one parameter in a single training step, which could lead to large changes in other parameters. The Adam optimizer was used with a batch size of 16. The entire training process took 14 h on four Nvidia 2080Ti GPUs.
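A sketch of this schedule is shown below; the LambdaLR realization and the stand-in module are assumptions used only to make the example self-contained.

```python
# Sketch of the training schedule: 100 epochs at a constant 1e-4 learning
# rate, then 100 epochs of linear decay to zero, with the Adam optimizer.
import torch

model = torch.nn.Conv2d(3, 3, 3)                     # stand-in for a UMGAN sub-network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def lr_factor(epoch: int, constant: int = 100, decay: int = 100) -> float:
    """Multiplier on the base rate: 1.0 for 100 epochs, then linear decay to 0."""
    if epoch < constant:
        return 1.0
    return max(0.0, 1.0 - (epoch - constant + 1) / decay)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)

for epoch in range(200):
    # ... one training epoch over the unpaired batches would run here ...
    scheduler.step()                                  # advance the decay schedule
```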

4.2. Ablation Study

To demonstrate the effectiveness of each component of the algorithm, several ablation experiments were performed. Specifically, three experiments were designed by removing the UIQM loss, the noise reduction network, and the local discriminator, respectively. The effects of the noise reduction network and the local discriminator are demonstrated in Figure 3 and Figure 4, respectively. In Figure 6, the first row shows the input images. The second row shows the result of the algorithm without the UIQM loss. The third row shows the result without the noise reduction network. The fourth row shows the enhanced image using only the global discriminator. The last row is produced by the final version of UMGAN. The color enhancement of the images is significantly reduced after removing the UIQM loss. After the noise reduction network is removed, more noise and artifacts are produced. After removing the local discriminator, the enhancement of some edge details is insufficient.
Full-reference image quality metrics (MSE, PSNR, and SSIM) and no-reference metrics (UCIQE and UIQM) were used to objectively evaluate the underwater image enhancement effect. A total of 300 underwater images of different styles were used to evaluate the results of the ablation experiments. As shown in Table 1, UMGAN achieved the best MSE, PSNR, SSIM, UCIQE, and UIQM scores.
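The full-reference part of this evaluation can be reproduced with scikit-image, as sketched below; UCIQE and UIQM are underwater-specific no-reference metrics that are not provided by scikit-image and are therefore omitted from the sketch.

```python
# Sketch of the full-reference metrics (MSE, PSNR, SSIM) with scikit-image.
# The channel_axis argument assumes scikit-image >= 0.19.
import numpy as np
from skimage.metrics import (mean_squared_error, peak_signal_noise_ratio,
                             structural_similarity)

def full_reference_scores(enhanced: np.ndarray, reference: np.ndarray) -> dict:
    """enhanced, reference: uint8 RGB arrays of shape (H, W, 3)."""
    return {
        "MSE": mean_squared_error(reference, enhanced),
        "PSNR": peak_signal_noise_ratio(reference, enhanced, data_range=255),
        "SSIM": structural_similarity(reference, enhanced, channel_axis=-1, data_range=255),
    }
```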

4.3. Qualitative Evaluation

A comparison was made between our algorithm and several other algorithms. The results are shown in Figure 7. The first row shows the turbid underwater images. The second to tenth rows show the enhanced images obtained by the UCM algorithm [4], the CLAHE algorithm [26], the white balance algorithm [41], the retinex algorithm [9], the image fusion algorithm [10], the UWCNN algorithm [20], the CycleGAN algorithm [36], the FUnIE-GAN algorithm [37], and the Shallow-UWnet algorithm [34], respectively. The last row shows the results generated by our proposed UMGAN.
The contrast of the enhanced images was improved to some extent using the UCM method, and colors such as red and green are more prominent; however, the images are still blurred, and the defogging effect is poor. The CLAHE method corrects the color cast considerably better than the UCM method, and the distribution of red, green, and blue pixels is more balanced, but some noise and artifacts are introduced. The white balance method highlights the red color, but the images' overall color is dark, and some of them contain more artifacts. The retinex method improves image brightness and contrast while introducing a small number of artifacts. Over- and underenhancement issues are successfully resolved by the image fusion technique; the image contrast is high, and the color cast is greatly improved, but the colors are not very vivid. UWCNN provides weights trained for a variety of scenarios, and the red color was significantly enhanced, but the results were all darkened. The CycleGAN method restores images well, but overenhancement produces artifacts and textural noise. There are artifacts in the images generated by FUnIE-GAN. Shallow-UWnet has fewer parameters and runs fast, but the output images have low resolution. In contrast, UMGAN has the best overall visual appearance. In addition, in terms of detail processing, UMGAN not only adaptively enhances each region of the image but also reduces artifacts and noise.

4.4. Quantitative Evaluation

A total of 300 underwater images of different styles were selected for quantitative analysis of the above methods. As shown in Table 2, among all the methods, UMGAN achieved the best MSE, PSNR, SSIM, and UIQM scores, improving on the second-best method by 27.49%, 7.93%, 1.49%, and 2.57%, respectively. The proposed method achieved good overall performance.

4.5. Results and Significance

The UMGAN algorithm was trained and tested using publicly available datasets. Ablation experiments demonstrate the effectiveness of each module of the algorithm: the UIQM loss improves the enhancement performance, the noise reduction network removes artifacts and noise from the image, and the global–local discriminator adaptively adjusts local region image effects. In the qualitative evaluation, UMGAN produces the best visual results, with artifacts and noise eliminated. In the quantitative analysis, UMGAN has the best overall score, ranking first in MSE, PSNR, SSIM, and UIQM. According to tests on different types of images from several datasets, UMGAN can enhance different types of underwater images very well. It addresses the common problems of greenish, yellowish, and blueish color casts, low light, white fog, and low contrast in underwater images. The problem of artifacts and textural noise in GAN-generated images is also solved by the designed modules. UMGAN can be successfully applied to various types of underwater image enhancement tasks.

5. Conclusions

An image enhancement network called UMGAN, which is an end-to-end network trained by unpaired datasets, is proposed for a variety of underwater scenes. To force the generator to produce clear, high-quality, noise-free underwater images, an image quality feedback mechanism and a noise reduction network were designed. A global–local discriminator was created to ensure the enhancing effect in the local region using adaptive weighting loss to balance the impact of the two discriminators. Our algorithm’s performance was demonstrated through an ablation study. The results of qualitative and quantitative analysis show that our algorithm outperforms other algorithms in the processing of various styles of underwater images. In the future, we will attempt to improve the algorithm’s performance in terms of image detail information retention and training speed.

Author Contributions

Conceptualization, B.S. and Y.C.; formal analysis, B.S.; investigation, Y.M.; resources, Y.C.; data curation, N.Y.; writing—original draft preparation, B.S.; writing—review and editing, B.S.; funding acquisition, Y.C.; supervision, Y.C.; project administration, B.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China “Analysis and feature recognition on feeding behaviour of fish school in facility farming based on machine vision” (No. 62076244), in part by the Beijing Digital Agriculture Innovation Consortium Project (BAIC10-2022), and in part by the National Natural Science Foundation of China “Intelligent identification method of underwater fish morphological characteristics based on binocular vision” (No. 62206021).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The UIEB dataset is available at https://li-chongyi.github.io/proj_benchmark.html, accessed on 1 July 2022. The EUVP dataset is available at http://irvlab.cs.umn.edu/resources/euvp-dataset, accessed on 1 July 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Raveendran, S.; Patil, M.D.; Birajdar, G.K. Underwater image enhancement: A comprehensive review, recent trends, challenges and applications. Artif. Intell. Rev. 2021, 54, 5413–5467.
2. Han, M.; Lyu, Z.; Qiu, T.; Xu, M. A Review on Intelligence Dehazing and Color Restoration for Underwater Images. IEEE Trans. Syst. Man Cybern. Syst. 2018, 50, 1820–1832.
3. Lu, H.; Li, Y.; Zhang, Y.; Chen, M.; Serikawa, S.; Kim, H. Underwater Optical Image Processing: A Comprehensive Review. Mob. Netw. Appl. 2017, 22, 1204–1211.
4. Iqbal, K.; Odetayo, M.O.; James, A.E.; Salam, R.A.; Talib, A.Z. Enhancing the low quality images using Unsupervised Colour Correction Method. In Proceedings of the 2010 IEEE International Conference on Systems, Man and Cybernetics, Istanbul, Turkey, 10–13 October 2010; IEEE: Piscataway, NJ, USA, 2010.
5. Abdul, G.A.; Mat, I.N. Underwater image quality enhancement through composition of dual-intensity images and Rayleigh-stretching. Springerplus 2014, 3, 757.
6. Vasamsetti, S.; Mittal, N.; Neelapu, B.C.; Sardana, H.K. Wavelet based perspective on variational enhancement technique for underwater imagery. Ocean Eng. 2017, 141, 88–100.
7. Priyadharsini, R.; Sharmila, T.S.; Rajendran, V. A wavelet transform based contrast enhancement method for underwater acoustic images. Multidimens. Syst. Signal Process. 2018, 29, 1845–1859.
8. Zhang, S.; Wang, T.; Dong, J.; Yu, H. Underwater image enhancement via extended multi-scale Retinex. Neurocomputing 2017, 245, 1–9.
9. Fu, X.; Zhuang, P.; Yue, H.; Liao, Y.; Zhang, X.P.; Ding, X. A retinex-based enhancing approach for single underwater image. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014.
10. Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Bekaert, P. Color Balance and Fusion for Underwater Image Enhancement. IEEE Trans. Image Process. 2018, 27, 379–393.
11. Ghani, A.S.A.; Nasir, A.F.A.; Tarmizi, W.F.W. Integration of enhanced background filtering and wavelet fusion for high visibility and detection rate of deep sea underwater image of underwater vehicle. In Proceedings of the 2017 5th International Conference on Information and Communication Technology (ICoIC7), Melaka, Malaysia, 17–19 May 2017.
12. Merugu, S.; Tiwari, A.; Sharma, S.K. Spatial–Spectral Image Classification with Edge Preserving Method. J. Indian Soc. Remote Sens. 2021, 49, 703–711.
13. Shaik, A.S.; Karsh, R.K.; Islam, M.; Laskar, R.H. A review of hashing based image authentication techniques. Multimed. Tools Appl. 2022, 81, 2489–2516.
14. Shaik, A.S.; Karsh, R.K.; Islam, M.; Singh, S.P. A Secure and Robust Autoencoder-Based Perceptual Image Hashing for Image Authentication. Wirel. Commun. Mob. Comput. 2022, 2022, 1645658.
15. Karsh, R.K. LWT-DCT based image hashing for image authentication via blind geometric correction. Multimed. Tools Appl. 2022, 81, 1–19.
16. Shaheen, H.; Ravikumar, K.; Anantha, N.L.; Kumar, A.U.S.; Jayapandian, N.; Kirubakaran, S. An efficient classification of cirrhosis liver disease using hybrid convolutional neural network-capsule network. Biomed. Signal Process. Control 2023, 80, 104152.
17. Li, J.; Skinner, K.A.; Eustice, R.M.; Johnson-Roberson, M. WaterGAN: Unsupervised Generative Network to Enable Real-time Color Correction of Monocular Underwater Images. IEEE Robot. Autom. Lett. 2017, 3, 387–394.
18. Li, C.; Guo, J.; Guo, C. Emerging From Water: Underwater Image Color Correction Based on Weakly Supervised Color Transfer. IEEE Signal Process. Lett. 2018, 25, 323–327.
19. Lu, J.; Li, N.; Zhang, S.; Yu, Z.; Zheng, H.; Zheng, B. Multi-scale adversarial network for underwater image restoration. Opt. Laser Technol. 2019, 110, 105–113.
20. Li, C.; Anwar, S.; Porikli, F. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recognit. 2020, 98, 107038.
21. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An Underwater Image Enhancement Benchmark Dataset and Beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389.
22. Yang, M.; Hu, J.; Li, C.; Rohde, G.; Du, Y.; Hu, K. An In-Depth Survey of Underwater Image Enhancement and Restoration. IEEE Access 2019, 7, 123638–123657.
23. Kashif, I.; Salam, R.A.; Azam, O.; Talib, A.Z. Underwater Image Enhancement Using an Integrated Colour Model. IAENG Int. J. Comput. Sci. 2007, 34, 239–244.
24. Hummel, R. Image enhancement by histogram transformation. Comput. Graph. Image Process. 1977, 6, 184–195.
25. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368.
26. Zuiderveld, K. Contrast Limited Adaptive Histogram Equalization. Graph. Gems 1994, 8, 474–485.
27. Akila, C.; Varatharajan, R. Color fidelity and visibility enhancement of underwater image de-hazing by enhanced fuzzy intensification operator. Multimed. Tools Appl. 2018, 77, 4309–4322.
28. Singh, K.; Kapoor, R.; Sinha, S.K. Enhancement of low exposure images via recursive histogram equalization algorithms. Optik 2015, 126, 2619–2625.
29. Jin, S.; Qu, P.; Zheng, Y.; Zhao, W.; Zhang, W. Color Correction and Local Contrast Enhancement for Underwater Image Enhancement. IEEE Access 2022, 10, 119193–119205.
30. Zhou, J.; Yao, J.; Zhang, W.; Zhang, D. Multi-scale retinex-based adaptive gray-scale transformation method for underwater image enhancement. Multimed. Tools Appl. 2022, 81, 1811–1831.
31. Abdul Ghani, A.S. Image contrast enhancement using an integration of recursive-overlapped contrast limited adaptive histogram specification and dual-image wavelet fusion for the high visibility of deep underwater image. Ocean Eng. 2018, 162, 224–238.
32. Lei, X.; Wang, H.; Shen, J.; Liu, H. Underwater image enhancement based on color correction and complementary dual image multi-scale fusion. Appl. Opt. 2022, 61, 5304.
33. Yang, W.; Jing, Z.; Yang, C.; Wang, Z. A deep CNN method for underwater image enhancement. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017.
34. Naik, A.; Swarnakar, A.; Mittal, K. Shallow-UWnet: Compressed Model for Underwater Image Enhancement. arXiv 2021, arXiv:2101.02073.
35. Guo, Y.; Li, H.; Zhuang, P. Underwater Image Enhancement Using a Multiscale Dense Generative Adversarial Network. IEEE J. Ocean. Eng. 2019, 45, 862–870.
36. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
37. Islam, J.; Xia, Y.; Sattar, J. Fast Underwater Image Enhancement for Improved Visual Perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234.
38. Du, R.; Li, W.; Chen, S.; Li, C.; Zhang, Y. Unpaired Underwater Image Enhancement Based on CycleGAN. Information 2021, 13, 1.
39. Panetta, K.; Gao, C.; Agaian, S. Human-Visual-System-Inspired Underwater Image Quality Measures. IEEE J. Ocean. Eng. 2015, 41, 541–551.
40. Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Yang, J.; Zhou, P.; Wang, Z. EnlightenGAN: Deep Light Enhancement Without Paired Supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349.
41. Liu, Y.-C.; Chan, W.-H.; Chen, Y.-Q. Automatic white balance for digital still camera. IEEE Trans. Consum. Electron. 1995, 41, 460–466.
Figure 1. Different types of underwater images enhanced with UMGAN. Columns 1, 3, and 5 are turbid underwater images, while columns 2, 4, and 6 are corresponding images enhanced by UMGAN. (a) Greenish images. (b) Yellowish images. (c) Blueish images. (d) Low-light images. (e) Images with white fog. (f) Low-contrast images.
Figure 2. The overall framework of UMGAN.
Figure 3. Visual comparison after adding the noise reduction network. (a) Turbid underwater image. (b) Enhanced image after removing the noise reduction network. (c) Enhanced image of UMGAN. Adding the noise reduction network significantly reduces the noise in the image.
Figure 4. Visual comparison after adding the global–local discriminator. (a) Turbid underwater image. (b) Enhanced image after removing the global–local discriminator. (c) Enhanced image of UMGAN. After adding the global–local discriminator, the local area detail enhancement is better, and the problem of insufficient local area enhancement is solved.
Figure 5. Samples from the training dataset. First row: underwater clear images. Second row: underwater turbid images.
Figure 6. Visual comparison from the ablation study.
Figure 7. Visual comparison of various underwater image enhancement methods. (a) Input images, (b) UCM, (c) CLAHE, (d) White balance, (e) Retinex, (f) Image fusion, (g) UWCNN, (h) CycleGAN, (i) FUnIE-GAN, (j) Shallow-UWnet, (k) UMGAN. Columns one to five show greenish, white-haze, blueish, low-contrast, and yellowish images, respectively.
Table 1. Quantitative evaluation of ablation experiments (best scores are bolded; the numbers in parentheses represent the percentage boost after adding the module).
Metric | Without UIQM Loss | Without Noise Reduction Network | Without Local Discriminator | UMGAN
MSE    | 532.58 (1.24%)    | 678.45 (28.97%)                 | 561.63 (6.76%)              | 526.05
PSNR   | 23.33 (0.94%)     | 21.48 (9.64%)                   | 23.07 (2.08%)               | 23.55
SSIM   | 0.8934 (0.68%)    | 0.8214 (9.51%)                  | 0.8915 (0.90%)              | 0.8995
UCIQE  | 0.4208 (4.66%)    | 0.4372 (0.73%)                  | 0.4232 (4.06%)              | 0.4404
UIQM   | 3.191 (9.40%)     | 3.268 (6.82%)                   | 3.193 (9.33%)               | 3.491
Table 2. Quantitative evaluation results of several algorithms (best scores are bolded).
Method        | MSE     | PSNR  | SSIM   | UCIQE  | UIQM
UCM           | 725.47  | 21.82 | 0.8863 | 0.4114 | 3.0164
CLAHE         | 1924.31 | 15.67 | 0.5573 | 0.4528 | 2.9699
White balance | 8238.24 | 10.61 | 0.4266 | 0.3089 | 2.3813
Retinex       | 2873.83 | 14.18 | 0.6353 | 0.4417 | 3.1860
Image fusion  | 1884.54 | 16.75 | 0.7336 | 0.4202 | 3.3534
UWCNN         | 1959.92 | 16.22 | 0.8095 | 0.3498 | 3.1654
CycleGAN      | 1150.65 | 19.05 | 0.7431 | 0.4587 | 3.3417
FUnIE-GAN     | 952.22  | 19.15 | 0.7481 | 0.4224 | 3.4052
Shallow-UWnet | 740.95  | 20.15 | 0.8460 | 0.3758 | 3.0890
Ours          | 526.05  | 23.55 | 0.8995 | 0.4404 | 3.4911