
Extended Depth-of-Field Imaging Using Multi-Scale Convolutional Neural Network Wavefront Coding

1 College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325000, China
2 China North Vehicle Research Institute, Beijing 100072, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(19), 4028; https://doi.org/10.3390/electronics12194028
Submission received: 2 September 2023 / Revised: 20 September 2023 / Accepted: 20 September 2023 / Published: 25 September 2023

Abstract

Wavefront coding (WFC) is a depth-of-field (DOF) extension technology that combines optical encoding with digital decoding. The system extends the DOF at the expense of intermediate image quality and then decodes the intermediate image with an image restoration algorithm to obtain a clear image. Because the point spread function (PSF) varies with defocus, traditional decoding methods are often accompanied by artifacts and noise amplification. In this paper, based on lens-combined modulated wavefront coding (LM-WFC), we simulate the imaging process at different object distances, generate a simulated WFC data set, and train a multi-scale convolutional neural network. Simulation experiments show that this method effectively reduces artifacts and improves image clarity. In addition, we used the LM-WFC camera to capture real scenes at different target distances for experiments. The decoding results show that the network model enhances restoration quality and produces clear images that better match human vision, which supports the improvement and practical application of wavefront coding systems.

1. Introduction

Wavefront coding (WFC) is a depth-of-field (DOF) extension technology that integrates optical encoding and digital image processing [1]. The system not only achieves a large DOF extension but also corrects defocus-related aberrations, so it has broad application prospects in many fields. For example, WFC can increase the image-plane defocus that a system tolerates, mitigating the image-plane deviation and imaging-quality degradation caused by environmental changes in infrared optical systems and space optical systems [2,3]. In optical microscopy, the enlarged focal depth of WFC helps in observing three-dimensional objects and obtaining clear sectional images [4]. By inserting a specially designed mask at the pupil of the imaging system, such as the classical cubic phase plate [5], WFC modulates the incident light so that each ray deviates from the path it would take in a traditional system. The light is no longer focused to a point on the focal plane; instead, over a certain range around the focal plane, the diameter of the blur spot remains almost constant. Consequently, the point spread function (PSF) and optical transfer function (OTF) of the optical system are nearly invariant over a large defocus range, and the image plane obtains intermediate images with almost the same degree of blur at different object distances. After decoding with an effective image restoration technique, a final clear image with a large depth of field can be obtained.
Until now, improving the signal-to-noise ratio and clarity of the final image has remained a challenge, mainly because the residual differences in the modulation transfer function (MTF) or PSF are difficult to eliminate completely. Research has proceeded in two directions. On one hand, researchers have studied various mask types, such as logarithmic, tangent, and exponential phase plates, to suppress the small differences in PSF or MTF as far as possible [6,7]. However, the capability of a single phase plate has proved very limited, and it is difficult to achieve the target effect. More importantly, except for the cubic phase plate, masks with other surface profiles demand extremely high machining accuracy, which makes them very difficult to apply in practice [8]. On the other hand, image restoration in wavefront coding is essentially a deconvolution with a PSF that is approximately invariant to defocus. Residual differences in the PSF therefore strongly affect the quality of the final image, for example by producing artifacts. Classical decoding algorithms, such as the Wiener filter [9] and the L-R algorithm [10], restore images at all defocus positions with a single filter built from one fixed PSF, so they depend heavily on the selected PSF; differences between PSFs often lead to artifacts, noise amplification, and other problems. Guo et al. [11] reported that phase plate eccentricity changes the energy distribution and envelope shape of the PSF, and by restoring blurred images with PSFs of different eccentricities they verified that eccentricity reduces the focal-depth extension. Sheng et al. [12] designed an equal-increment automatic rotation-corrected PSF, based on the sum of gray gradient vector moduli, to match an actual wavefront coding system and improve restoration quality.
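To make the dependence on a single PSF concrete, the following is a minimal sketch of classical single-PSF decoding using scikit-image's restoration module; the `coded` image and `psf` arrays are placeholders rather than data from this paper.

```python
# Minimal sketch of classical single-PSF decoding with scikit-image.
# `coded` and `psf` are placeholders for an intermediate WFC image and
# the PSF chosen at one defocus position.
import numpy as np
from skimage import restoration

rng = np.random.default_rng(0)
coded = rng.random((256, 256))        # placeholder coded image in [0, 1]
psf = np.ones((5, 5)) / 25.0          # placeholder fixed PSF

# Wiener deconvolution: `balance` trades data fidelity against noise.
wiener_out = restoration.wiener(coded, psf, balance=0.1)

# Richardson-Lucy (L-R): iterative deconvolution with the same fixed PSF.
lr_out = restoration.richardson_lucy(coded, psf, num_iter=30)
```

Both calls take exactly one PSF, so any mismatch between the assumed PSF and the true, distance-dependent PSF propagates directly into the artifacts described above.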
In recent years, deep learning has brought significant changes to many fields of computer vision [13,14,15]. Unlike traditional image restoration algorithms that rely on physical models, deep learning is data driven: it learns the mapping between blurred and clear images through training. This sidesteps problems such as the difficulty of obtaining the PSF blur kernel in traditional non-blind restoration, and it has greatly improved the core restoration metrics. Jin et al. [16], motivated by the influence of the blur kernel on deconvolution results, used convolutional neural networks to estimate blur kernels region by region. Cho et al. [17] used a U-net structure with multiple inputs and multiple outputs to effectively improve deblurring quality. Kupyn et al. [18] proposed an image deblurring architecture based on a generative adversarial network, improved the loss function by adding a content loss term, and achieved a good deblurring effect.
To resolve the impact that small PSF differences across defocus have on the final image quality in WFC, we optimized lens-combined modulated wavefront coding (LM-WFC) based on the MTF consistency principle, and we trained a multi-scale convolutional neural network as an end-to-end decoder, exploiting the advantages of deep learning in image restoration. Deep learning decoders rely heavily on training data. In the training stage, we simulated the system's imaging process and generated a training set containing PSF information at multiple object distances, realizing an end-to-end mapping between intermediate blurred images and clear images across multiple imaging distances. The simulation results show that this method effectively suppresses noise and improves the quality of restored images. In addition, by having a traditional imaging system and the LM-WFC system share a field of view, we captured real scenes for experimental verification. The experimental results show that the resolution of the restored images is clearly improved and details are effectively recovered. In summary, this work realizes a practical application of deep learning in a visible-light LM-WFC system; the trained network effectively solves the problem of small PSF differences degrading decoded images and improves the imaging performance of wavefront coding.

2. Design and Methods

2.1. LM-WFC Camera

To demonstrate that a deep learning model can be effectively applied to the decoding part of a wavefront coding system, we used a visible-light WFC imaging system based on a combined lens. Traditional WFC modulates the Gaussian beam by adding a phase mask at the pupil so that the detector obtains intermediate images with essentially consistent blur within a certain range. The LM-WFC optimizes this design: instead of a single phase mask, the system uses multiple spherical profiles to control aberrations [19], and its PSF and MTF are highly consistent over a wide range. Figure 1 provides the lens design drawing and a photograph of the system. The system parameters are listed in Table 1.
Figure 2 shows the full-frequency MTF curves of the LM-WFC system for object distances of 50–1000 m. The curves were obtained by using Zemax OpticStudio 2022 to simulate imaging at different object distances, extracting the corresponding PSFs, and Fourier-transforming them. The figure shows that the MTF is highly consistent across object distances and has no zeros, indicating that the system has the coding characteristics of a WFC system. In principle, the system therefore produces highly consistent coded images, and applying image restoration to those coded images yields a clear image, achieving the goal of extending the depth of field.
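As a minimal sketch of that computation, the function below derives an MTF as the normalized magnitude of the 2-D Fourier transform of a PSF; the `psf` array here is a placeholder for a PSF exported from Zemax.

```python
# Sketch: the MTF as the normalized magnitude of the PSF's 2-D Fourier
# transform. `psf` stands in for a PSF array exported from Zemax.
import numpy as np

def mtf_from_psf(psf: np.ndarray) -> np.ndarray:
    otf = np.fft.fftshift(np.fft.fft2(psf))   # optical transfer function
    mtf = np.abs(otf)
    return mtf / mtf.max()                    # normalize: 1 at zero frequency

psf = np.ones((64, 64)) / 4096.0              # placeholder PSF
mtf = mtf_from_psf(psf)
```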
Although the MTF is highly consistent, some variation remains. Figure 3 shows the PSFs at 50–1000 m, from which it can be observed directly that, as the target moves away from the best focus position, the blur radius gradually increases and differences between the PSFs emerge. These differences affect image decoding, leading to artifacts or noise amplification in the final image.

2.2. Data Set Generation

In general, training the parameters of a deep learning network requires a large number of samples, and the quantity and quality of the samples largely determine how well the network performs. Obtaining an effective training set is therefore one of the most important steps in training a network. In this paper, we take a public data set as the base data and use the physical imaging model to create the data set for training the neural network. The base data set was MS COCO [20], from which we randomly selected clear images. For each clear image, we simulated the system's imaging to obtain the coded image. This operation is denoted by
$I_{WFC} = I_{ori} \otimes PSF_{LM\text{-}WFC} + n_{AWGN}$ (1)
where $I_{WFC}$ is the coded image obtained by simulating LM-WFC imaging; $I_{ori}$ is the clear image from the original data set, which serves as the target image during training; $\otimes$ denotes convolution; $PSF_{LM\text{-}WFC}$ is the PSF data obtained from the LM-WFC; and $n_{AWGN}$ is additive white Gaussian noise, added at a random level of 10–30 dB to simulate the effect of ambient noise on imaging.
The LM-WFC data set generation process is shown in Figure 4. By simulating the imaging process, we obtained wavefront-coded simulated blurred images and paired them with the clear images to create a training set of clear-blurred image pairs for the LM-WFC system. We cropped all images in the data set and resized them to 640 × 420.
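A minimal sketch of Equation (1) applied to one grayscale image follows, reading the 10–30 dB range as a randomly drawn signal-to-noise ratio; the function and array names are illustrative, not the authors' code.

```python
# Sketch of Equation (1) for one grayscale image in [0, 1]: convolve
# with the LM-WFC PSF, then add white Gaussian noise at a random
# 10-30 dB SNR. Names are illustrative, not the authors' code.
import numpy as np
from scipy.signal import fftconvolve

def simulate_coded(clear: np.ndarray, psf: np.ndarray,
                   rng: np.random.Generator) -> np.ndarray:
    blurred = fftconvolve(clear, psf, mode="same")
    snr_db = rng.uniform(10.0, 30.0)              # random noise level
    noise_power = np.mean(blurred ** 2) / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=blurred.shape)
    return np.clip(blurred + noise, 0.0, 1.0)

rng = np.random.default_rng(0)
clear = rng.random((420, 640))                    # stand-in for a COCO crop
psf = np.ones((15, 15)) / 225.0                   # stand-in for an LM-WFC PSF
coded = simulate_coded(clear, psf, rng)
```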

3. Decoding Model: Deep Multi-Scale Convolutional Neural Network

In this paper, a deep multi-scale convolutional neural network (Deepdeblur) was selected as the decoding network and trained on the simulated data set. Deepdeblur follows a coarse-to-fine approach, training in a multi-scale space organized as a Gaussian pyramid. In addition, the residual network structure [21] is used as the building block of the model: since the pixel values of a blurred image and its corresponding sharp image are similar, it is efficient for the parameters to learn only the difference between them. Furthermore, stacking a sufficient number of convolutional layers via residual blocks extends the receptive field at each scale while achieving a deeper architecture. The overall structure is shown in Figure 5.
Deepdeblur consists of three stages, with scale levels defined in order of decreasing resolution; the highest-resolution scale is level 1. The network downsamples the original blurred image and the corresponding clear image simultaneously to obtain blurred-clear image pairs of sizes {256 × 256, 128 × 128, 64 × 64}, as sketched below. Multi-layer convolutional networks learn the end-to-end mapping between image pairs at each size, and the reconstructed clear image is obtained at the finest scale.
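A short sketch of this pyramid construction, assuming 256 × 256 input tensors and bilinear downsampling (the interpolation mode is our assumption):

```python
# Sketch: building the three-level input pyramid {256, 128, 64} by
# repeatedly halving the blurred (or sharp) image.
import torch.nn.functional as F

def build_pyramid(img, levels=3):
    # img: (N, C, 256, 256) tensor; returns a list ordered coarse to fine.
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(F.interpolate(pyramid[-1], scale_factor=0.5,
                                     mode="bilinear", align_corners=False))
    return pyramid[::-1]   # coarse to fine: 64, 128, 256
```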
In the coarsest stage, the first convolutional layer converts the 64 × 64 image into 64 feature maps, which pass through a stack of 19 ResBlocks before a final convolutional layer converts the feature maps back to the input dimensions. Every convolutional filter is 5 × 5 with zero padding, so the resolution is preserved. This stage contains 40 convolutional layers in total, giving it a receptive field large enough to cover the entire patch, and it outputs a coarse latent image. To pass the coarse stage's output to the next, finer stage, the reconstructed coarse image is dimensionally transformed by an upconvolution layer [22] and concatenated with the blurred input of the next stage. Since sharp and blurred patches share low-frequency information, learning suitable features with an upconvolution layer helps remove redundancy compared with other resizing methods. The other stages have essentially the same structure, except that the first convolutional layer takes the sharp features from the previous stage concatenated with the stage's own blurred input, and every stage except the finest also passes its output features through an upconvolution layer. Finally, the finest stage outputs a clear image at the original resolution.
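The following compressed PyTorch sketch illustrates one such scale stage under the stated assumptions (5 × 5 convolutions, 64 feature maps, 19 ResBlocks, an upconvolution passing the coarse result upward); it is a simplified illustration, not the authors' released implementation.

```python
# Compressed PyTorch sketch of one scale stage: 5x5 convolutions,
# 64 feature maps, 19 residual blocks, and a transposed convolution
# that upsamples the stage output for the next finer stage.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return x + self.body(x)  # learn only the blur-sharp residual

class ScaleStage(nn.Module):
    def __init__(self, in_ch: int, ch: int = 64, n_blocks: int = 19):
        super().__init__()
        self.head = nn.Conv2d(in_ch, ch, kernel_size=5, padding=2)
        self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(ch, 3, kernel_size=5, padding=2)
        # Doubles spatial resolution so the reconstruction can be
        # concatenated with the next stage's blurred input.
        self.up = nn.ConvTranspose2d(3, 3, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        out = self.tail(self.blocks(self.head(x)))
        return out, self.up(out)

coarse = ScaleStage(in_ch=3)   # coarsest stage: blurred image only
finer = ScaleStage(in_ch=6)    # finer stages: blur + upsampled result
```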
When optimizing the network parameters, we combine a multi-scale content loss with an adversarial loss to train the model. The loss function of the network is:
$L = L_{cont} + \lambda L_{adv}$ (2)
where $L_{cont}$ is the multi-scale content loss, $L_{adv}$ is the adversarial loss, and the weight constant $\lambda = 1 \times 10^{-4}$. The pyramid structure processes blurred images of different sizes and produces reconstructed clear images at each scale, so the mean square error of each stage's intermediate output is computed, and the multi-scale content loss is defined as:
$L_{cont} = \frac{1}{2K} \sum_{k=1}^{K} \frac{1}{c_k w_k h_k} \left\| L_k - S_k \right\|^2$ (3)
where $L_k$ and $S_k$ denote the reconstructed image and the clear image at the size of the $k$-th scale, respectively; $c_k$, $w_k$, and $h_k$ are the channel number, width, and height of the $k$-th scale image; and the content loss at each scale is normalized accordingly.
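A compact PyTorch rendering of Equation (3) might look like the following; it adds an average over the batch dimension, which the equation leaves implicit.

```python
# Compact rendering of Equation (3). F.mse_loss normalizes by
# c_k * w_k * h_k (and by the batch size N, left implicit in Eq. (3)).
import torch.nn.functional as F

def multiscale_content_loss(outputs, targets):
    # outputs/targets: lists of (N, c_k, h_k, w_k) tensors, one per scale.
    losses = [F.mse_loss(l_k, s_k) for l_k, s_k in zip(outputs, targets)]
    return sum(losses) / (2 * len(losses))
```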
Following the generator-discriminator architecture of GANs, adversarial alternating training can improve network performance and yield better image restoration results [18,23]. We therefore introduced a GAN discriminator for joint training of the network [23]; the discriminator structure is shown in Figure 6, and its model parameters are listed in Table 2. Given either the output image of the finest stage of the multi-scale network or a true sharp image, the discriminator judges which kind of input it received.
The adversarial loss is defined in Equation (4), where $G$ is the generator, i.e., Deepdeblur; $D$ is the introduced discriminator; $S \sim p_{sharp}(S)$ denotes a clear image; and $B \sim p_{blurry}(B)$ denotes a blurred input image to the deblurring network:
$L_{adv} = \mathbb{E}_{S \sim p_{sharp}(S)}\left[\log D(S)\right] + \mathbb{E}_{B \sim p_{blurry}(B)}\left[\log\left(1 - D(G(B))\right)\right]$ (4)
The network input size is 256 × 256. During training, the batch size was set to 2 and the initial learning rate to 5 × 10⁻⁵. An Adam [24] optimizer was used to update the network parameters, and the model converged after 9 × 10⁵ iterations.
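Putting Equations (2)–(4) and these settings together, a training step could be sketched as below; `G`, `D`, and the data loader are assumed objects (see the sketches above), and the generator uses the common non-saturating form of the adversarial term in place of the min-max form of Equation (4).

```python
# Sketch of a joint training step for L = L_cont + lambda * L_adv with
# the stated settings (lambda = 1e-4, Adam, lr = 5e-5). `G`, `D`, and
# `loader` are assumed objects; pyramids are lists ordered coarse to fine.
import torch

lam = 1e-4
opt_g = torch.optim.Adam(G.parameters(), lr=5e-5)
opt_d = torch.optim.Adam(D.parameters(), lr=5e-5)
eps = 1e-8                                    # numerical safety inside log

for blur_pyramid, sharp_pyramid in loader:
    outputs = G(blur_pyramid)                 # reconstructions per scale
    fake = outputs[-1]                        # finest-scale output

    # Discriminator step: maximize log D(S) + log(1 - D(G(B))), Eq. (4).
    d_loss = -(torch.log(D(sharp_pyramid[-1]) + eps).mean()
               + torch.log(1.0 - D(fake.detach()) + eps).mean())
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: multi-scale content loss plus the (non-saturating)
    # adversarial term, which pushes D(G(B)) toward 1.
    g_loss = (multiscale_content_loss(outputs, sharp_pyramid)
              - lam * torch.log(D(fake) + eps).mean())
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```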

4. Simulation Results and Analysis

The PSF of a WFC system can be obtained by optical design or by measurement, so the decoding process is in fact an image restoration process based on prior knowledge of the PSF. As noted in Section 2, the PSF of the wavefront coding system varies to a certain extent between 50 and 1000 m. Traditional image restoration algorithms such as the Wiener filter, however, can only restore based on a single PSF, so PSF differences degrade the restoration.
To achieve better restoration and to verify the impact of PSF differences on image restoration, we used the data set generation method of Section 2 to create two data sets and train two networks, Deepdeblur1 and Deepdeblur2. Deepdeblur1 was trained on a data set generated with the PSF at 100 m, totaling 9000 image pairs. Deepdeblur2 was trained on a data set generated with PSFs at eight positions spaced across the 50–1000 m range: each original image was simulated with a randomly chosen PSF, yielding a 9000-pair training set containing PSF information at multiple object distances. The models were implemented in the PyTorch deep learning framework, and all experiments in this paper were run on an NVIDIA GeForce RTX 2080.
To verify the effectiveness of the deblurring network, the traditional Wiener filter [9] and L-R algorithm [10], together with FCN [25], Deepdeblur1, and Deepdeblur2, were used to deblur the coded image at 100 m, and the experimental results were analyzed with visual examples. PSNR and SSIM served as objective evaluation indices for quantitative comparison.
Figure 7 compares the results of the decoding algorithms, and Table 3 reports the image quality evaluation. The results show that the Deepdeblur architecture adopted in this paper achieves higher PSNR and SSIM. In the restored images, the Wiener filter and L-R algorithm show a certain restoration ability, recovering the edge outlines of the letters to some extent, but their results contain substantial noise and unclear texture details. FCN improves clarity, but detail recovery is still poor, with artifacts along the building edges and the car body outline. Compared with the other algorithms, Deepdeblur's restored images have clear edges, rich texture, and effective suppression of noise amplification, giving better visual results.
To verify the depth-of-field extension, this paper compares the decoding results of the Wiener filter, Deepdeblur1, and Deepdeblur2 at multiple object distances. Figure 8 shows the decoding results at 50 m, 100 m, 300 m, 500 m, and 1000 m. As the imaging distance increases, artifacts gradually appear along the building edges in the Wiener-filter results, the font outlines blur, and the noise is obviously amplified. The Deepdeblur1 images are clear overall, but detail recovery also degrades gradually, and a ringing effect becomes apparent in the enlarged font details. In contrast, Deepdeblur2 recovers details better, produces higher image quality, and deblurs more stably across imaging distances. The experimental results show that a deep learning model trained with multi-position PSFs better solves the decoding problem in a wavefront coding system and achieves the goal of depth-of-field extension.

5. Experimental Results and Analysis

In this section, to verify the practical applicability of the proposed network in real scenes, we used the Deepdeblur network to decode images captured by the LM-WFC system in real scenes. During the experiment, we mounted a traditional camera alongside the LM-WFC camera to capture the same scene for comparison and tested the performance of the LM-WFC system with the decoding method of this paper. The setup is shown in Figure 9; the F/# of the conventional optical lens is 0.95.
Figure 10 shows the experimental results. In the first group of images the target objects are about 80–200 m away, a relatively close range. As shown in Figure 10a, the coded image captured by the LM-WFC camera is blurrier than the image from the traditional camera, but after decoding the clarity improves significantly, and the restored image shows rich texture and more distinct details. The second group, shown in Figure 10b, was taken of objects about 400–500 m away. The edges of objects such as the wire rack are blurred in the coded image but improve after network deblurring: the structure of the wire rack is imaged clearly, and the image quality is effectively improved. The third group, shown in Figure 10c, has target distances of about 800–1000 m, beyond the working range of the traditional camera. In the coded image the building edges are severely blurred and the building's appearance is indistinct, yet the decoder still recovers clear details, with the building edges and window-ledge outlines clearly visible.
The results show that, as the target distance increases, the blur of the coded image gradually increases and decoding is slightly affected, but image details are still effectively recovered, and clear decoded images are obtained over a wide range of object distances. This verifies that the method proposed in this paper can effectively restore coded images at different imaging distances in real scenes.
For a more intuitive comparison, we used the no-reference Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) to quantify the imaging characteristics of the wavefront-coded camera and the traditional camera [26]. BRISQUE is a general-purpose no-reference quality assessment algorithm based on natural scene statistics: it uses scene statistics of locally normalized luminance coefficients to quantify the loss of image "naturalness" caused by distortion and thereby measures overall image quality. The statistical features it extracts correlate with human perception, so it evaluates images well; the smaller the value, the better the perceived quality. We evaluated the images in Figure 10. Table 4 lists the scores for the traditional-camera images, the intermediate coded images, and the decoded LM-WFC images in the three scenes. Although the perceived quality of the LM-WFC camera's intermediate coded images is relatively poor, after decoding they achieve lower (better) scores than the traditional camera, demonstrating that the method used in this paper improves wavefront encoding-decoding restoration and the quality of the deblurred images.
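For reference, BRISQUE scoring can be reproduced with off-the-shelf implementations; the sketch below uses the third-party piq package, which the paper does not name, so treat the call as an assumption.

```python
# Hedged sketch: BRISQUE scoring via the third-party `piq` package
# (the paper does not state which implementation it used).
import torch
import piq

img = torch.rand(1, 3, 420, 640)      # placeholder image tensor in [0, 1]
score = piq.brisque(img, data_range=1.0)
print(f"BRISQUE: {score.item():.4f}") # lower = better perceived quality
```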

6. Conclusions and Future Work

In this paper, a multi-scale convolutional neural network for deblurring WFC images is proposed to solve the artifact and noise amplification problems of WFC systems under large defocus. Based on an LM-WFC optical imaging system, PSFs at different target distances were simulated, a synthetic WFC data set was generated by simulated imaging, and the Deepdeblur model was trained. Simulations and experiments show that the method effectively suppresses noise amplification, enhances detail recovery, and provides good image restoration. In the future, the architecture of the multi-scale convolutional neural network could be optimized, for example by improving the residual structure or adding feature-transfer paths between scale levels, to further improve the network's image-restoration ability, improve wavefront coding imaging performance, and increase its potential for widespread application in various fields.

Author Contributions

Y.Z. finished the original manuscript; Y.W. and W.G. reviewed and edited the manuscript; X.G. designed the lens-combined modulated wavefront coding; Y.Z. completed the experiment with Y.W.’s guidance. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ20F050011).

Data Availability Statement

The data of the experiments are not shared.

Acknowledgments

Thanks to the Beijing Key Laboratory for Precision Optoelectronic Measurement Instrument and Technology for use of its equipment.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yao, C.; Shen, Y. Optical Aberration Calibration and Correction of Photographic System Based on Wavefront Coding. Sensors 2021, 21, 4011. [Google Scholar] [CrossRef] [PubMed]
  2. Zhang, F.; Feng, B.; Li, H. Research on athermalization design of infrared optical system based on wavefront coding. Laser Optoelectron. Prog. 2021, 58, 2208001. [Google Scholar]
  3. Zhou, Y.; Xie, J.; Chen, H.; Zhang, L.; Yan, Y.; Li, F.; Zhang, L.; Zhou, P.; Deng, L. Wavefront control of 2D curved coding metasurfaces based on extended array theory. IEEE Access 2019, 7, 158427–158433. [Google Scholar] [CrossRef]
  4. Tang, M.; Tang, Z.Y.; Qiao, X.; Lang, K.Q.; Sun, Y.; Wang, X.P. Extension of refocus depth range in digital holographic microscopy. Appl. Opt. 2020, 59, 8540–8552. [Google Scholar] [CrossRef] [PubMed]
  5. Zhu, L.; Li, F.; Huang, Z.; Zhao, T. An apodized cubic phase mask used in a wavefront coding system to extend the depth of field. Chin. Phys. B 2022, 31, 054217. [Google Scholar] [CrossRef]
  6. Le, V.; Kuang, C.; Xu, L. Optimization of wavefront coding systems based on the use of multitarget optimization. Opt. Eng. 2017, 56, 093102. [Google Scholar]
  7. Wang, L.; Ye, Q.; Nie, J.; Sun, X. Optimized asymmetrical arcsine phase mask for extending the depth of field. IEEE Photonics Technol. Lett. 2018, 30, 1309–1312. [Google Scholar] [CrossRef]
  8. Wei, X.; Han, J.; Xie, S.; Yang, B.; Wan, X.; Zhang, W. Experimental analysis of a wavefront coding system with a phase plate in different surfaces. Appl. Opt. 2019, 58, 9195–9200. [Google Scholar] [CrossRef]
  9. Arines, J.; Hernandez, R.O.; Sinzinger, S.; Grewe, A.; Acosta, E. Wavefront-coding technique for inexpensive and robust retinal imaging. Opt. Lett. 2014, 39, 3986–3988. [Google Scholar] [CrossRef]
  10. Lucy, L.B. An iterative technique for the rectification of observed distributions. Astron. J. 1974, 79, 745. [Google Scholar] [CrossRef]
  11. Guo, X.H.; Zhao, C.X.; Zhou, P.; Tian, J.W.; Zhao, J.; Wang, F.P.; Zhang, W. Application and analysis of the extension of depth of field and focus in zoom system. Opt. Techn. 2019, 45, 263–268. [Google Scholar]
  12. Sheng, J.; Cai, H.; Wang, Y.; Chen, X.; Xu, Y. Improved Exponential Phase Mask for Generating Defocus Invariance of Wavefront Coding Systems. Appl. Sci. 2022, 12, 5290. [Google Scholar] [CrossRef]
  13. Wang, W.; Hu, Y.; Luo, Y.; Zhang, T. Brief survey of single image super-resolution reconstruction based on deep learning approaches. Sens. Imaging 2020, 21, 1–20. [Google Scholar]
  14. Shi, Y.; Fan, C.; Zou, L.; Sun, C.; Liu, Y. Unsupervised adversarial defense through tandem deep image priors. Electronics 2020, 9, 1957. [Google Scholar] [CrossRef]
  15. Wu, X.; Sahoo, D.; Hoi, S.C.H. Recent advances in deep learning for object detection. Neurocomputing 2020, 396, 39–64. [Google Scholar] [CrossRef]
  16. Jin, Y.; Zhang, D.; Li, M.; Wang, Z.; Chen, Y. A fuzzy support vector machine-enhanced convolutional neural network for recognition of glass defects. Int. J. Fuzzy Syst. 2019, 21, 1870–1881. [Google Scholar] [CrossRef]
  17. Cho, S.-J.; Ji, S.-W.; Hong, J.-P.; Jung, S.-W.; Ko, S.-J. Rethinking coarse-to-fine approach in single image deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 17 October 2021; pp. 4641–4650. [Google Scholar]
  18. Kupyn, O.; Budzan, V.; Mykhailych, M.; Mishkin, D.; Matas, J. Deblurgan: Blind motion deblurring using conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8183–8192. [Google Scholar]
  19. Shou, Q.; Kuang, W.; Liu, M.; Zhou, Z.; Chen, Z.; Hu, W.; Guo, Q. Two dimensional large-scale optical manipulation of microparticles by circular Airy beams with spherical and oblique wavefronts. Opt. Commun. 2022, 525, 128561. [Google Scholar] [CrossRef]
  20. Lin, T.Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L.; Dollár, P. Microsoft COCO: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014, Part V; Springer International Publishing: Berlin, Germany, 2014; pp. 740–755. [Google Scholar]
  21. Li, W.; Sun, W.; Zhao, Y.; Yuan, Z.; Liu, Y. Deep image compression with residual learning. Appl. Sci. 2020, 10, 4023. [Google Scholar] [CrossRef]
  22. Wu, C.; Wo, Y.; Han, G.; Wu, Z.; Liang, J. Non-uniform image blind deblurring by two-stage fully convolution network. IET Image Process. 2020, 14, 2588–2596. [Google Scholar] [CrossRef]
  23. Kim, D.W.; Chung, J.R.; Kim, J.; Lee, D.Y.; Jeong, S.Y.; Jung, S.-W. Constrained adversarial loss for generative adversarial network-based faithful image restoration. Etri J. 2019, 41, 415–425. [Google Scholar] [CrossRef]
  24. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  25. Ke, L.; Liao, P.; Zhang, X.; Chen, G.; Zhu, K.; Wang, Q.; Tan, X. Haze removal from a single remote sensing image based on a fully convolutional neural network. J. Appl. Remote Sens. 2019, 13, 036505. [Google Scholar]
  26. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708. [Google Scholar] [CrossRef] [PubMed]
Figure 1. LM-WFC system: (a) diagram; (b) lens.
Figure 2. MTF curves of LM-WFC.
Figure 3. Simulated PSFs of the LM-WFC. Representative object distances from 50 to 1000 m.
Figure 4. The process of LM-WFC data set generation.
Figure 5. Overview of the decoding model architecture.
Figure 6. Overview of the discriminator.
Figure 7. Decoding results for the coded image at a target distance of 100 m. From top to bottom: the coded image, Wiener, L-R, FCN, Deepdeblur1, and Deepdeblur2.
Figure 8. Image restoration results encoded by different object distances. The distances from top to bottom are 50 m, 100 m, 300 m, 500 m, and 1000 m.
Figure 9. Experiment equipment for LM-WFC test.
Figure 10. Images captured in real scenes by the traditional camera and the LM-WFC camera. (a) Near scene. (b) Middle scene. (c) Far scene. Left: image captured with the traditional lens (F/# 0.95). Middle: coded image captured by the LM-WFC. Right: decoded image from the LM-WFC.
Table 1. Parameters of the LM-WFC system.

Parameter        Value
Focal length     58 mm
Field of view    ±3°
Wavelength       visible
Pixel size       4.5 μm
Table 2. Model parameters of the discriminator.

#    Layer     Weight Dimension     Stride
1    conv      32 × 3 × 5 × 5       2
2    conv      64 × 32 × 5 × 5      1
3    conv      64 × 64 × 5 × 5      2
4    conv      128 × 64 × 5 × 5     1
5    conv      128 × 128 × 5 × 5    4
6    conv      256 × 128 × 5 × 5    1
7    conv      256 × 256 × 5 × 5    4
8    conv      512 × 256 × 5 × 5    1
9    conv      512 × 512 × 4 × 4    4
10   fc        512 × 1 × 1 × 1      -
11   sigmoid   -                    -
Table 3. Quantitative results for the LM-WFC test data set.

Method        PSNR     SSIM
Wiener        23.36    0.68
L-R           24.12    0.71
FCN           25.58    0.74
Deepdeblur1   27.32    0.81
Deepdeblur2   27.31    0.81
Table 4. BRISQUE values for traditional camera and LM-WFC camera images.

             Conventional Images   Intermediate Encoded Images   Decoded Images
Scenario a   18.2938               29.0970                       15.9342
Scenario b   50.8396               61.7248                       42.6337
Scenario c   73.1439               69.7806                       54.8342
