# An Improved Boosting Learning Saliency Method for Built-Up Areas Extraction in Sentinel-2 Images


## Abstract


## 1. Introduction

## 2. Proposed Method

#### 2.1. Image Preprocessing

#### 2.1.1. Sentinel-2 Constellation

#### 2.1.2. Atmospheric Correction and Image Sharpening

#### 2.1.3. Optimal Band Selection

#### 2.2. Multiscale Segmentation

The superpixels were merged into segmented objects O_i, i = 1, ..., N, where N is the number of segmented objects. Figure 2d,e show the results of merging 20,000 superpixels into 2000 and 4000 objects, respectively. Merging not only reduces the number of superpixels, but also ensures that the outline of the built-up areas is well captured.
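The merging step can be sketched as a greedy procedure that repeatedly fuses the most similar pair of adjacent regions until the target object count is reached. The following is a minimal, illustrative numpy implementation; the function name and the mean-color merging criterion are our assumptions, not the paper's exact algorithm.

```python
import numpy as np

def merge_regions(labels, image, target_count):
    """Greedily merge adjacent regions with the most similar mean color
    until only `target_count` regions remain (illustrative sketch)."""
    def region_means(labels, image):
        return {r: image[labels == r].mean(axis=0) for r in np.unique(labels)}

    def adjacent_pairs(labels):
        pairs = set()
        # horizontal and vertical neighbors carrying different labels
        h = labels[:, :-1] != labels[:, 1:]
        v = labels[:-1, :] != labels[1:, :]
        for a, b in zip(labels[:, :-1][h], labels[:, 1:][h]):
            pairs.add((min(a, b), max(a, b)))
        for a, b in zip(labels[:-1, :][v], labels[1:, :][v]):
            pairs.add((min(a, b), max(a, b)))
        return pairs

    labels = labels.copy()
    while len(np.unique(labels)) > target_count:
        means = region_means(labels, image)
        pairs = adjacent_pairs(labels)
        # pick the adjacent pair with the smallest mean-color distance
        a, b = min(pairs, key=lambda p: np.linalg.norm(means[p[0]] - means[p[1]]))
        labels[labels == b] = a  # merge region b into region a
    return labels
```

In practice the initial `labels` map would come from SLIC (as in the paper); here any integer label map works.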

#### 2.3. Feature Selection

The color feature was computed for each segmented object O_i in RGB space and CIELab space, and the color feature of the segmented object O_i can be described by the channel means, where c_{r,g,b} and c_{L,a,b} represent the average value of each color channel of the pixels in the segmented object O_i in the RGB and CIELab color spaces.

For texture, an LBP histogram of each segmented object O_i was constructed, and the texture feature of the segmented object O_i can be described by this histogram, where t_i is the value of the i-th bin in the LBP histogram.

For shape, O_{Area} is the area and O_{Ecce} is the eccentricity of segmented object O_i, where O_{Ecce} lies between 0 and 1. To avoid erroneously eliminating the roads inside the built-up areas, we only consider long strip-shaped segmented objects with a large area. We experimentally set the threshold th_1 to 500 pixels, and th_2 was set to 0.95. The feature vector of segmented object O_i can then be obtained by concatenating these features.
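As a rough sketch of this feature construction, the following computes per-object color means plus a basic 8-neighbor LBP histogram. The LBP here is a simplified, non-rotation-invariant variant, and the helper names and the bounding-box patch are our assumptions rather than the paper's exact formulation.

```python
import numpy as np

def lbp_histogram(gray, bins=256):
    """Basic 8-neighbor LBP histogram (non-rotation-invariant sketch)."""
    g = gray
    c = g[1:-1, 1:-1]                      # center pixels
    code = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= ((neighbor >= c).astype(np.uint8) << bit)  # set bit if neighbor >= center
    hist, _ = np.histogram(code, bins=bins, range=(0, bins))
    return hist / hist.sum()

def object_features(rgb, gray, mask):
    """Per-channel color means over the object's pixels, concatenated
    with an LBP histogram of the object's bounding-box patch."""
    color = rgb[mask].mean(axis=0)
    ys, xs = np.where(mask)
    patch = gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    texture = lbp_histogram(patch)
    return np.concatenate([color, texture])
```

The paper additionally uses CIELab means and shape descriptors; those would be appended to the same vector.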

#### 2.4. Coarse Saliency Map

#### 2.4.1. Multiple Cues Fusion

#### Compactness Saliency Using Color Cues

A graph G = (V, E) was constructed with nodes V = {v_1, v_2, …, v_N} and edges E weighted by an affinity matrix **W** = [w_{ij}]_{N×N}. Node v_i corresponds to the i-th segmented object, edge e_{ij} links nodes v_i and v_j, and the CIELab color space distance l_{ij} between nodes v_i and v_j is defined as the distance between their mean colors, where c_i and c_j are the means of the segmented objects corresponding to nodes v_i and v_j in the CIELab color space. Note that the distance matrix **L** = [l_{ij}]_{N×N} is normalized to the interval [0, 1]. The affinity w_{ij} is defined over local neighborhoods, where N_i denotes the set of neighbors of node v_i: if v_i and v_j are adjacent, v_j is treated as a neighbor of v_i, so the size of N_i equals the number of objects adjacent to v_i.

The similarity a_{ij} between a pair of segmented objects, v_i and v_j, is obtained through a diffusion process with **A** = [a_{ij}]_{N×N} and **D** = diag{d_{11}, d_{22}, …, d_{NN}}, where d_{ii} is the degree of node v_i, and **H** = [h_{ij}]_{N×N} is the similarity matrix after the diffusion process. The parameter α balances the smoothness and fitting constraints of the manifold ranking algorithm; empirically, α was set to 0.99 as in [65]. The spatial variance of segmented objects can then be calculated, where n_j represents the number of pixels that belong to segmented object v_j, b_j = [${b}_{j}^{x}$, ${b}_{j}^{y}$] represents the centroid of segmented object v_j, and μ_i = [${\mu}_{i}^{x}$, ${\mu}_{i}^{y}$] represents the spatial mean. Finally, p = [p_x, p_y] is the spatial coordinate of the image center.
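One way the affinity, diffusion, and spatial-variance steps could fit together is sketched below. The H = (D − αW)⁻¹ diffusion form, the σ² value, and the final normalization are assumptions chosen to match common manifold-ranking compactness formulations, not necessarily the paper's exact equations.

```python
import numpy as np

def diffusion_similarity(lab_means, adjacency, sigma2=0.1, alpha=0.99):
    """Neighbor-restricted color affinity followed by a manifold-ranking
    style diffusion, H = (D - alpha*W)^{-1} (one common formulation)."""
    diff = lab_means[:, None, :] - lab_means[None, :, :]
    L = np.linalg.norm(diff, axis=2)
    L /= L.max()                                   # normalize distances to [0, 1]
    W = np.where(adjacency, np.exp(-L / sigma2), 0.0)  # affinity on adjacent pairs only
    D = np.diag(W.sum(axis=1))                     # degree matrix
    return np.linalg.inv(D - alpha * W)

def compactness(H, centroids, areas):
    """Spatial variance of each object's diffused color support; a compact
    spatial distribution (low variance) maps to high saliency."""
    w = H * areas[None, :]                  # weight similar objects by their size
    w /= w.sum(axis=1, keepdims=True)
    mu = w @ centroids                      # spatial mean per object
    var = (w * np.linalg.norm(centroids[None, :, :] - mu[:, None, :], axis=2) ** 2).sum(axis=1)
    return 1.0 - var / var.max()            # low variance -> high saliency
```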

#### Foreground Saliency Using Multiple Cues Contrast

In the foreground contrast measure, F_s is the foreground seed set, D_t is the texture similarity between segmented objects based on LBP, and ||**b**_i − **b**_j|| is the Euclidean distance between the positions of the segmented objects. The S_{FG} map was propagated using manifold ranking; the propagated map was then normalized to [0, 1] and denoted S_{fore}(i). The S_{com}(i) and S_{fore}(i) maps are complementary to one another, and both saliency maps were integrated to define the initial saliency map as a weighted combination, with weight η on the compactness saliency map S_{com}(i) and (1 − η) on the foreground saliency map S_{fore}(i). In the optimal band combination, the built-up areas can be better identified using color features, while they are also sensitive to texture features. Both make important contributions, so η was set to 0.5.

#### 2.4.2. Geodesic Weighted Bayesian

In the geodesic weighted Bayesian framework, s_i is a segmented object, sal is the initial set of salient regions, and bk is the initial set of background regions. p_{geo}(s_i) denotes the geodesic probability of s_i, namely, the weight of segmented object s_i.

The observation likelihood of s_i can be calculated from feature histograms, where n_j denotes the number of pixels within segmented object O_i, n_j(f(x)) denotes the number of f(x) values contained in segmented object O_i, and f ∈ {L, a, b, LBP} denotes the component of feature vector v. Substituting the observation likelihoods (16) and (17) into (14), and utilizing the initial saliency map as a prior distribution, generates a more precise saliency map. The initial saliency map was then further refined to obtain the coarse saliency map S_{coarse} based on the graph cut method [68].
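The histogram-based observation likelihood and the Bayesian combination can be sketched as follows. The feature channels, bin count, and function names are illustrative assumptions; the paper factors the likelihood over the {L, a, b, LBP} components in the same independent-channel fashion.

```python
import numpy as np

def observation_likelihood(obj_pixels, region_pixels, bins=16, rng=(0.0, 1.0)):
    """p(v | region) as a product over independent feature channels of the
    fraction of the region's pixels falling in the object's histogram bin."""
    n_channels = region_pixels.shape[1]
    lik = np.ones(obj_pixels.shape[0])
    for f in range(n_channels):
        hist, edges = np.histogram(region_pixels[:, f], bins=bins, range=rng)
        p = hist / max(hist.sum(), 1)
        # locate each object pixel's bin and multiply in its probability
        idx = np.clip(np.digitize(obj_pixels[:, f], edges) - 1, 0, bins - 1)
        lik *= p[idx]
    return lik

def bayes_saliency(prior, lik_sal, lik_bk):
    """Posterior saliency from the prior map and the two region likelihoods."""
    num = prior * lik_sal
    return num / (num + (1.0 - prior) * lik_bk + 1e-12)
```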

#### 2.4.3. Removing the Water Bodies

We set the saliency value of segmented objects whose SWIR reflectance is below a threshold, T_w, to 0, thereby removing water bodies. To determine T_w, the histogram of the SWIR band was first generated. For cities with more water bodies, water occupies a larger area, so there is a peak on the left side of the histogram. The gray values of other ground objects are usually greater than those of water bodies, and their peaks on the histogram lie to the right of the water peak. We took the value corresponding to the first trough to the right of the water peak as T_w. Based on statistical results from multiple images, we set the threshold T_w to 0.15. Cities with less water are almost unaffected by water bodies, and for them T_w was set to 0.01. The gray value of building shadow is also low, but its area is small. To avoid removing building shadows, we only removed segmented objects with a large area and a gray value less than T_w. Considering that buildings in some areas are dense and the shadow area is then relatively large, we empirically set the area threshold for removing water to 100 pixels.
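The trough-finding heuristic for T_w might be implemented along these lines; the smoothing window, peak/trough tests, and fallback value are our assumptions.

```python
import numpy as np

def water_threshold(swir, bins=256, default=0.01):
    """Estimate T_w as the first trough to the right of the leftmost
    histogram peak (the water peak); fall back to `default` when no
    clear water peak exists."""
    hist, edges = np.histogram(swir, bins=bins, range=(0.0, 1.0))
    smooth = np.convolve(hist, np.ones(5) / 5, mode="same")  # reduce bin noise
    # leftmost local maximum = candidate water peak
    peak = None
    for i in range(1, bins - 1):
        if smooth[i] >= smooth[i - 1] and smooth[i] > smooth[i + 1]:
            peak = i
            break
    if peak is None:
        return default
    # first local minimum to the right of the peak
    for i in range(peak + 1, bins - 1):
        if smooth[i] < smooth[i - 1] and smooth[i] <= smooth[i + 1]:
            return edges[i]
    return default
```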

#### 2.4.4. Training Sample Selection

Two thresholds, T_h and T_l (T_h greater than T_l), were set to generate initial built-up area and non-built-up area training samples. Both thresholds can be adaptively determined from the mean value ϖ of the coarse saliency map: T_h was set to ϑ times ϖ, and T_l was set to ϖ, where ϑ is a parameter set to 1.8; more discussion of the value of ϑ can be found in Section 3.1.3. The segmented objects with saliency values above T_h were selected as initial built-up area samples, while those with saliency values below T_l were selected as initial non-built-up area samples. Next, we constrained the initial training sample set using the spatial feature F_{spatial} to obtain the training samples {s_i, l_i}$_{i=1}^{P}$, where s_i is the i-th training sample from the coarse saliency map S_{coarse}, l_i is the binary label of the training sample, and P is the number of samples; built-up area samples are labeled +1, and non-built-up area samples are labeled −1.
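A minimal sketch of the adaptive two-threshold sample selection described above (the function name is ours; objects between the thresholds are left unlabeled):

```python
import numpy as np

def select_training_samples(saliency, vartheta=1.8):
    """Adaptive sample selection: T_h = vartheta * mean, T_l = mean.
    Returns object indices and +1/-1 labels."""
    mean = saliency.mean()
    t_h, t_l = vartheta * mean, mean
    pos = np.where(saliency > t_h)[0]   # built-up area samples
    neg = np.where(saliency < t_l)[0]   # non-built-up area samples
    idx = np.concatenate([pos, neg])
    labels = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    return idx, labels
```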

#### 2.5. Refined Saliency Map

The refined saliency map was learned from N_f (N_{feature} × N_{kernel}) different standard SVM classifiers, where N_{feature} is the number of features and N_{kernel} is the number of kernel functions. The four kernel functions are linear, polynomial, radial basis function, and sigmoid. For different feature sets, the decision function can be defined as a kernel expansion, where k_n is the kernel weight, w_i is the Lagrange multiplier, and $\overline{b}$ is the bias in the standard SVM algorithm. Equation (19) is a conventional function for the multiple kernel learning method; when the boosting algorithm replaces the simple combination of single-kernel SVMs in multiple kernel learning, Equation (19) can be rewritten in boosted form.

In vector form, **k**_n(s) = [k_n(s, s_1), k_n(s, s_2), …, k_n(s, s_P)]^T, w = [w_1 l_1, w_2 l_2, …, w_P l_P]^T, and b = ${\displaystyle \sum}_{n=1}^{{N}_{f}}\overline{b}$. By setting the decision function as ${Z}_{n}\left(S\right)={w}^{T}{k}_{n}\left(S\right)+{\overline{b}}_{n}$, the AdaBoost method may be employed to train a strong classifier, and Formula (20) can be rewritten accordingly.

Here, β_j is the combination coefficient of the weak classifier selected at iteration j, and J represents the number of iterations of the boosting process. The process is as follows:

**Step 1:** Begin with uniform weights, ω_1(i) = 1/P, i = 1, 2, …, P, and assign a set of decision functions {Z_n(S), n = 1, 2, …, N_f} to each weak classifier.

**Step 2:** Compute the classification error {ε_n} for each of the weak classifiers, and select the decision function z_j(s) with the minimum error ε_j; the combination coefficient β_j is then computed, where β_j must exceed 0.

**Step 3:** Update the weights according to Equation (23), and repeat Step 2 for the next iteration until J iterations are completed.

After J iterations, β_j and z_j(s) can be obtained, and the strong classifier is learned. Subsequently, a pixel-wise saliency map was generated using the strong classifier. Finally, the refined saliency map S_{refined} was improved based on the graph cut method [68] and guided filter [71].
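The three steps above can be sketched with discrete AdaBoost over a pool of precomputed weak decision functions. The β formula shown is the standard AdaBoost coefficient, assumed to match the paper's Equation (22); the weak outputs stand in for the single-kernel SVM decision functions Z_n.

```python
import numpy as np

def boost_weak_classifiers(Z, y, J=10):
    """Discrete AdaBoost over weak decision functions.
    Z: (n_weak, P) real-valued outputs on the P training samples;
    y: labels in {+1, -1}. Returns selected indices and coefficients beta_j."""
    n_weak, P = Z.shape
    preds = np.sign(Z)
    omega = np.full(P, 1.0 / P)              # Step 1: uniform weights
    picked, betas = [], []
    for _ in range(J):
        # Step 2: weighted error of every weak classifier, pick the best
        errs = np.array([(omega * (preds[n] != y)).sum() for n in range(n_weak)])
        j = int(errs.argmin())
        eps = errs[j]
        if eps >= 0.5:                       # beta must stay positive
            break
        beta = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        picked.append(j)
        betas.append(beta)
        # Step 3: re-weight samples and normalize
        omega *= np.exp(-beta * y * preds[j])
        omega /= omega.sum()
    return picked, betas

def strong_predict(Z, picked, betas):
    """Sign of the beta-weighted vote of the selected weak classifiers."""
    return np.sign(sum(b * np.sign(Z[j]) for j, b in zip(picked, betas)))
```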

#### 2.6. Multiscale Saliency

The refined saliency map at scale m is denoted S_m. Following segmentation, a pixel i can be classified as foreground or background. If a pixel i is foreground, the probability that one of its neighboring pixels j is measured as foreground is λ, while µ is the probability that j is measured as background when i belongs to the background. We assumed that λ equals µ, so that a pixel is considered equally likely to belong to the foreground or the background. The posterior probability can be denoted as ${S}_{m,i}^{\left(t\right)}$·λ, which represents the probability of pixel i belonging to the foreground F on the condition that its neighboring pixel j in the m-th saliency map was binarized as foreground at time t, and the posterior probability ${S}_{m,i}^{(t+1)}$ represents the probability of pixel i belonging to the foreground F at time t + 1. Based on the prior ratio in [46], the update rule follows, with **1** = [1, 1, …, 1]. After T_C iterations, the integrated saliency map S^{(T_C)} is obtained.

#### 2.7. Integration

#### 2.8. Built-Up Area Extraction

In the final saliency map S_{final}, built-up areas usually have the highest values, ground objects similar to built-up areas have the next highest values, and other ground objects have very low values. The final saliency map can therefore be broadly segmented into three parts based on gray value. To extract accurate built-up areas, an appropriate segmentation threshold needs to be set. In [72], a genetic algorithm was used to determine the optimal segmentation threshold and achieved good results. In our paper, a multi-threshold segmentation algorithm, FODPSO [47], was employed. Following segmentation, the highest-value part is the binary map of the built-up areas. The pseudo-code for FODPSO is presented in Table 1.
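As a simple stand-in for FODPSO, an exhaustive two-threshold Otsu-style search illustrates the three-class segmentation idea: maximize between-class variance over all threshold pairs, then keep pixels above the second threshold as built-up areas. This brute-force search replaces the swarm optimizer and is not the paper's algorithm.

```python
import numpy as np

def two_threshold_otsu(values, bins=64):
    """Exhaustive two-threshold Otsu-style split into three classes;
    returns the pair (t1, t2) maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    total_mean = (p * centers).sum()
    best, best_t = -1.0, (0.0, 0.0)
    for t1 in range(1, bins - 1):
        for t2 in range(t1 + 1, bins):
            var = 0.0
            for lo, hi in ((0, t1), (t1, t2), (t2, bins)):
                w = p[lo:hi].sum()
                if w > 0:
                    mu = (p[lo:hi] * centers[lo:hi]).sum() / w
                    var += w * (mu - total_mean) ** 2  # between-class variance
            if var > best:
                best, best_t = var, (edges[t1], edges[t2])
    return best_t  # pixels above the second threshold = built-up areas
```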

## 3. Experimental Results

#### 3.1. Comparison to the State-of-the-Art Saliency Methods

#### 3.1.1. Qualitative Experiment

#### 3.1.2. Quantitative Experiment

#### ROC-AUC Metric

#### Precision, Recall, and F-Measure

β² was set to 1 to balance the importance of precision and recall.
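The pixel-wise precision, recall, and F-measure used here can be sketched as follows (the function name is ours):

```python
import numpy as np

def precision_recall_f(pred, truth, beta2=1.0):
    """Pixel-wise precision, recall, and F-measure with beta^2 = 1."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()       # true positives
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(truth.sum(), 1)
    f = (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-12)
    return precision, recall, f
```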

#### Time Comparison

#### 3.1.3. Important Parameter Settings

The important parameters include the number of scales M and the thresholds T_h and T_l. To determine the optimal parameter M, we compared the accuracy and running time for different values of M; the results are shown in Table 6. From Table 6, as M increases, the accuracy increases, but the calculation time also rises sharply; if M were increased further, the computational time would become unacceptable.

#### 3.2. Comparison to the State-of-the-Art Built-Up Areas Extraction Methods

Here, b_{swir} is the reflectance of the SWIR band, b_{nir} is the reflectance of the NIR band, and b_{red} is the reflectance of the red band. For PanTex, tx_i = f(w = 9, **v**, m = CON), i ∈ [a_1, d_1; a_2, d_2; …; a_n, d_n], where w is the window size, a and d are the distance and angle defining the displacement vector **v** required to select the pairs producing the co-occurrence matrix, and m is the textural measure applied to the given co-occurrence matrix distribution. CON = ${\displaystyle \sum}_{i=1}^{{N}_{g}}{\displaystyle \sum}_{j=1}^{{N}_{g}}{\left(i-j\right)}^{2}\cdot {P}_{ij}$, where N_g is the number of gray levels present in the image, and P_{ij} is the (i, j)-th entry of the co-occurrence matrix.
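The CON measure for one displacement vector can be sketched directly from its definition; the quantization to a fixed number of gray levels is our assumption (PanTex computes this within a sliding window and over several displacements).

```python
import numpy as np

def glcm_contrast(gray, dx, dy, levels=16):
    """CON for one displacement (dx, dy): sum_{i,j} (i-j)^2 * P_ij,
    with P the normalized gray-level co-occurrence matrix."""
    q = np.clip((gray * levels).astype(int), 0, levels - 1)  # quantize to gray levels
    h, w = q.shape
    # co-occurring pixel pairs under the displacement vector
    a = q[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = q[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    P = np.zeros((levels, levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1)   # accumulate pair counts
    P /= P.sum()
    i, j = np.indices((levels, levels))
    return ((i - j) ** 2 * P).sum()
```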

Next, we set an area threshold T_{area} and removed regions smaller than T_{area}, where T_{area} is empirically set to 3000 pixels. From Figure 6, it is clear that the two index-based methods perform poorly on the images of desert cities and valley cities: they can hardly identify and extract the built-up areas, while desert and bare rock are clearly, and wrongly, extracted. However, they perform very well on the images of coastal, riverside, and plain cities, because of the high vegetation coverage or large water area in these cities. The PanTex method performs better than both index-based methods in detecting built-up areas, with the locations clearly identifiable. However, PanTex only utilizes texture features, so areas with texture similar to built-up areas may also be extracted. For example, desert cities are surrounded by large loess and desert areas whose texture resembles that of built-up areas, and these are therefore incorrectly extracted. Although the land cover around cities varies, our proposed method can still efficiently identify the locations and boundaries of built-up areas and extract them accurately.

To quantitatively evaluate the various methods, three statistical measures were used: overall accuracy, commission error, and omission error. The commission error is the percentage of pixels that belong to non-built-up areas but were classified as built-up areas; the omission error is the percentage of pixels that belong to built-up areas but were classified as non-built-up areas. Table 8 shows the average statistical results for five different types of cities. The overall accuracies of our proposed method in all five types of cities are higher than those of the other three methods, and its commission and omission errors are the lowest among the four methods. NDBI and NBI have low overall accuracy and high commission and omission errors on images of desert and valley cities. This suggests that these two index-based methods are not suitable for extracting built-up areas surrounded by bare rock and desert. However, they perform well on images of the other three types of cities and achieve high overall accuracy. PanTex performs very well, second only to the proposed method. Its omission error is also relatively low, while its commission error is high in desert cities, indicating that when PanTex extracted the built-up areas of desert cities, a large number of non-built-up pixels were incorrectly extracted. In summary, the proposed method takes into account the different features of built-up areas based on visual saliency, and achieves good results in different types of cities.
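The three evaluation measures follow directly from the definitions above; a minimal sketch (the function name is ours):

```python
import numpy as np

def accuracy_errors(pred, truth):
    """Overall accuracy, commission error, and omission error for a
    binary built-up mask."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    overall = (pred == truth).mean()
    # fraction of predicted built-up pixels that are actually non-built-up
    commission = np.logical_and(pred, ~truth).sum() / max(pred.sum(), 1)
    # fraction of true built-up pixels missed by the prediction
    omission = np.logical_and(~pred, truth).sum() / max(truth.sum(), 1)
    return overall, commission, omission
```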

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Taubenböck, H.; Esch, T.; Felbier, A.; Wiesner, M.; Roth, A.; Dech, S. Monitoring urbanization in mega cities from space. Remote Sens. Environ.
**2012**, 117, 162–176. [Google Scholar] [CrossRef] - Yu, S.; Sun, Z.; Guo, H.; Zhao, X.; Sun, L.; Wu, M. Monitoring and analyzing the spatial dynamics and patterns of megacities along the maritime silk road. J. Remote Sens.
**2017**, 21, 169–181. [Google Scholar] - Sun, Z.; Guo, H.; Li, X.; Lu, L.; Du, X. Estimating urban impervious surfaces from landsat-5 tm imagery using multilayer perceptron neural network and support vector machine. J. Appl. Remote Sens.
**2011**, 5, 053501. [Google Scholar] [CrossRef] - Deng, C.; Wu, C. Bci: A biophysical composition index for remote sensing of urban environments. Remote Sens. Environ.
**2012**, 127, 247–259. [Google Scholar] [CrossRef] - Jieli, C.; Manchun, L.; Yongxue, L.; Chenglei, S.; Wei, H.U. Extract residential areas automatically by new built-up index. In Proceedings of the 18th International Conference on Geoinformatics, Beijing, China, 18–20 June 2010. [Google Scholar]
- Xu, H. A new index for delineating built-up land features in satellite imagery. Int. J. Remote Sens.
**2008**, 29, 4269–4276. [Google Scholar] [CrossRef] - Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from tm imagery. Int. J. Remote Sens.
**2003**, 24, 583–594. [Google Scholar] [CrossRef] - Sun, G.; Chen, X.; Jia, X.; Yao, Y.; Wang, Z. Combinational build-up index (cbi) for effective impervious surface mapping in urban areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.
**2016**, 9, 2081–2092. [Google Scholar] [CrossRef] - Zhang, P.; Sun, Q.; Liu, M.; Li, J.; Sun, D. A strategy of rapid extraction of built-up area using multi-seasonal landsat-8 thermal infrared band 10 images. Remote Sens.
**2017**, 9, 1126. [Google Scholar] [CrossRef] - Shao, Z.; Tian, Y.; Shen, X. Basi: A new index to extract built-up areas from high-resolution remote sensing images by visual attention model. Remote Sens. Lett.
**2014**, 5, 305–314. [Google Scholar] [CrossRef] - Pesaresi, M.; Gerhardinger, A.; Kayitakire, F. A robust built-up area presence index by anisotropic rotation-invariant textural measure. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.
**2008**, 1, 180–192. [Google Scholar] [CrossRef] - Wentz, E.A.; Stefanov, W.L.; Gries, C.; Hope, D. Land use and land cover mapping from diverse data sources for an arid urban environments. Comput. Environ. Urban Syst.
**2006**, 30, 320–346. [Google Scholar] [CrossRef] - Leinenkugel, P.; Esch, T.; Kuenzer, C. Settlement detection and impervious surface estimation in the mekong delta using optical and sar remote sensing data. Remote Sens. Environ.
**2011**, 115, 3007–3019. [Google Scholar] [CrossRef] - Zhu, Z.; Woodcock, C.E.; Rogan, J.; Kellndorfer, J. Assessment of spectral, polarimetric, temporal, and spatial dimensions for urban and peri-urban land cover classification using landsat and sar data. Remote Sens. Environ.
**2012**, 117, 72–82. [Google Scholar] [CrossRef] - Zhang, J.; Li, P.; Wang, J. Urban built-up area extraction from landsat tm/etm+ images using spectral information and multivariate texture. Remote Sens.
**2014**, 6, 7339–7359. [Google Scholar] [CrossRef] - Borji, A.; Cheng, M.; Jiang, H.; Li, J. Salient object detection: A benchmark. IEEE Trans. Imag. Process.
**2015**, 24, 5706–5722. [Google Scholar] [CrossRef] [PubMed] - Dong, C.; Liu, J.; Xu, F. Ship detection in optical remote sensing images based on saliency and a rotation-invariant descriptor. Remote Sens.
**2018**, 10, 400. [Google Scholar] [CrossRef] - Zhang, Y.; Wang, X.; Xie, X.; Li, Y. Salient object detection via recursive sparse representation. Remote Sens.
**2018**, 10, 652. [Google Scholar] [CrossRef] - Zhang, L.; Li, A.; Zhang, Z.; Yang, K. Global and local saliency analysis for the extraction of residential areas in high-spatial-resolution remote sensing image. IEEE Trans. Geosci. Remote Sens.
**2016**, 54, 3750–3763. [Google Scholar] [CrossRef] - Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell.
**1998**, 20, 1254–1259. [Google Scholar] [CrossRef] [Green Version] - Harel, J.; Koch, C.; Perona, P. Graph-based visual saliency. In Advances in Neural Information Processing Systems; Touretzky, D.S., Mozer, M.C., Hasselmo, M.E., Eds.; MIT Press: Cambridge, MA, USA, 2007. [Google Scholar]
- Ma, Y.; Zhang, H. Contrast-based image attention analysis by using fuzzy growing. In Proceedings of the Eleventh ACM International Conference on Multimedia, Berkeley, CA, USA, 2–8 November 2003. [Google Scholar]
- Gao, D.; Mahadevan, V.; Vasconcelos, N. The discriminant center-surround hypothesis for bottom-up saliency. In Advances in Neural Information Processing Systems; Platt, J.C., Koller, D., Singer, Y., Roweis, S.T., Eds.; Curran Associates Icn: Vancouver, BC, Canada, 2008. [Google Scholar]
- Gao, D.; Vasconcelos, N. Bottom-up saliency is a discriminant process. In Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil, 14–21 October 2007. [Google Scholar]
- Cheng, M.; Mitra, N.J.; Huang, X.; Torr, P.H.; Hu, S.-M.; Intelligence, M. Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell.
**2015**, 37, 569–582. [Google Scholar] [CrossRef] [PubMed] - Perazzi, F.; Krähenbühl, P.; Pritch, Y.; Hornung, A. Saliency filters: Contrast based filtering for salient region detection. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012. [Google Scholar]
- Shi, K.; Wang, K.; Lu, J.; Lin, L. Pisa: Pixelwise image saliency by aggregating complementary appearance contrast measures with spatial priors. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 25–27 June 2013. [Google Scholar]
- Gopalakrishnan, V.; Hu, Y.; Rajan, D. Random walks on graphs for salient object detection in images. IEEE Trans. Imag. Process.
**2010**, 19, 3232–3242. [Google Scholar] [CrossRef] [PubMed] - Wei, Y.; Wen, F.; Zhu, W.; Sun, J. Geodesic saliency using background priors. In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012. [Google Scholar]
- Jiang, B.; Zhang, L.; Lu, H.; Yang, C.; Yang, M.-H. Saliency detection via absorbing markov chain. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 3–6 December 2013. [Google Scholar]
- Yan, Q.; Xu, L.; Shi, J.; Jia, J. Hierarchical saliency detection. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Oregon, Portland, 25–27 June 2013. [Google Scholar]
- Qin, Y.; Lu, H.; Xu, Y.; Wang, H. Saliency detection via cellular automata. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 8–10 June 2015. [Google Scholar]
- Bruce, N.; Tsotsos, J. Saliency based on information maximization. In Advances in Neural Information Processing Systems; Jordan, M.I., LeCun, Y., Solla, S.A., Eds.; MIT Press: Cambridge, MA, USA, 2006; pp. 155–162. [Google Scholar]
- Zhang, L.; Tong, M.H.; Marks, T.K.; Shan, H.; Cottrell, G.W. Sun: A bayesian framework for saliency using natural statistics. J. Vis.
**2008**, 8, 32. [Google Scholar] [CrossRef] [PubMed] - Shen, X.; Wu, Y. A unified approach to salient object detection via low rank matrix recovery. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 18–20 June 2012. [Google Scholar]
- Borji, A.; Itti, L. Exploiting local and global patch rarities for saliency detection. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 18–20 June 2012. [Google Scholar]
- Wang, Q.; Zheng, W.; Piramuthu, R. Grab: Visual saliency via novel graph model and background priors. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016. [Google Scholar]
- Zhu, W.; Liang, S.; Wei, Y.; Sun, J. Saliency optimization from robust background detection. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014. [Google Scholar]
- Peng, H.; Li, B.; Ji, R.; Hu, W.; Xiong, W.; Lang, C. Salient object detection via low-rank and structured sparse matrix decomposition. IEEE Trans. Patt. Anal. Mach. Intell.
**2013**, 39, 796–802. [Google Scholar] - Lang, C.; Liu, G.; Yu, J.; Yan, S. Saliency detection by multitask sparsity pursuit. IEEE Trans. Imag. Process.
**2012**, 21, 1327–1338. [Google Scholar] [CrossRef] [PubMed] - Jiang, H.; Wang, J.; Yuan, Z.; Wu, Y.; Zheng, N.; Li, S. Salient object detection: A discriminative regional feature integration approach. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Oregon, Portland, 25–27 June 2013. [Google Scholar]
- Yang, J.; Yang, M. Top-down visual saliency via joint crf and dictionary learning. IEEE Trans. Patt. Anal. Mach. Intell.
**2017**, 39, 576–588. [Google Scholar] [CrossRef] [PubMed] - Cholakkal, H.; Rajan, D.; Johnson, J. Top-Down Saliency with Locality-Constrained Contextual Sparse Coding. Available online: http://www.bmva.org/bmvc/2015/papers/paper159/paper159.pdf (accessed on 4 August 2018).
- Tong, N.; Lu, H.; Ruan, X.; Yang, M.-H. Salient object detection via bootstrap learning. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 8–10 June 2015. [Google Scholar]
- Wang, X.; Ma, H.; Chen, X. Geodesic weighted bayesian model for salient object detection. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 397–401. [Google Scholar]
- Qin, Y.; Feng, M.; Lu, H.; Cottrell, G.W. Hierarchical cellular automata for visual saliency. Int. J. Comput. Vis.
**2018**, 1–20. [Google Scholar] [CrossRef] - Couceiro, M.; Ghamisi, P. Fractional Order Darwinian Particle Swarm Optimization: Applications and Evaluation of an Evolutionary Algorithm; Springer: Berlin, Germany, 2015. [Google Scholar]
- Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P. Sentinel-2: Esa’s optical high-resolution mission for gmes operational services. Remote Sens. Environ.
**2012**, 120, 25–36. [Google Scholar] [CrossRef] - Vuolo, F.; Żółtak, M.; Pipitone, C.; Zappa, L.; Wenng, H.; Immitzer, M.; Weiss, M.; Baret, F.; Atzberger, C. Data service platform for sentinel-2 surface reflectance and value-added products: System use and examples. Remote Sens.
**2016**, 8, 938. [Google Scholar] [CrossRef] - Mueller-Wilm, U. Sentinel-2 msi—Level-2a Prototype Processor Installation and User Manual. Available online: http://step.esa.int/thirdparties/sen2cor/2.2.1/S2PAD-VEGA-SUM-0001-2.2.pdf (accessed on 6 July 2018).
- Park, H.; Choi, J.; Park, N.; Choi, S. Sharpening the vnir and swir bands of sentinel-2a imagery through modified selected and synthesized band schemes. Remote Sens.
**2017**, 9, 1080. [Google Scholar] [CrossRef] - Valdiviezo-N, J.C.; Téllez-Quiñones, A.; Salazar-Garibay, A.; López-Caloca, A.A. Built-up index methods and their applications for urban extraction from sentinel 2a satellite data: Discussion. J. Opt. Soc. Am. A
**2018**, 35, 35–44. [Google Scholar] [CrossRef] [PubMed] - Pesaresi, M.; Corbane, C.; Julea, A.; Florczyk, A.J.; Syrris, V.; Soille, P.; Sensing, R. Assessment of the added-value of sentinel-2 for detecting built-up areas. Remote Sens.
**2016**, 8, 299. [Google Scholar] [CrossRef] [Green Version] - Chavez, P.; Berlin, G.L.; Sowers, L.B. Statistical method for selecting landsat mss ratios. J. Appl. Photogr. Eng.
**1982**, 8, 23–30. [Google Scholar] - Richards, J.A.; Richards, J. Remote Sensing Digital Image Analysis; Springer: Berlin, Germany, 1999. [Google Scholar]
- Swain, P.H.; Davis, S.M. Remote sensing: The quantitative approach. IEEE Trans. Patt. Anal. Mach. Intell.
**1981**, 713–714. [Google Scholar] [CrossRef] - Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Patt. Anal. Mach. Intell.
**2012**, 34, 2274–2282. [Google Scholar] [CrossRef] [PubMed] - Moya, M.M.; Koch, M.W.; Perkins, D.N.; West, R.D.D. Superpixel segmentation using multiple sar image products. In Radar Sensor Technology XVIII; International Society for Optics and Photonics: Bellingham, WA, USA, 2014; p. 90770R. [Google Scholar]
- Hu, Z.; Wu, Z.; Zhang, Q.; Fan, Q.; Xu, J. A spatially-constrained color–texture model for hierarchical vhr image segmentation. IEEE Geosci. Remote Sens. Lett.
**2013**, 10, 120–124. [Google Scholar] [CrossRef] - Connolly, C.; Fleiss, T. A study of efficiency and accuracy in the transformation from rgb to cielab color space. IEEE Trans. Imag. Process.
**1997**, 6, 1046–1048. [Google Scholar] [CrossRef] [PubMed] - Hu, P.; Wang, W.; Zhang, C.; Lu, K. Detecting salient objects via color and texture compactness hypotheses. IEEE Trans. Imag. Process.
**2016**, 25, 4653–4664. [Google Scholar] [CrossRef] [PubMed] - Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Patt. Anal. Mach. Intell.
**2002**, 24, 971–987. [Google Scholar] [CrossRef] [Green Version] - Jia, S.; Deng, B.; Zhu, J.; Jia, X.; Li, Q. Local binary pattern-based hyperspectral image classification with superpixel guidance. IEEE Trans. Geosci. Remote Sens.
**2018**, 56, 749–759. [Google Scholar] [CrossRef] - Zhou, L.; Yang, Z.; Yuan, Q.; Zhou, Z.; Hu, D. Salient region detection via integrating diffusion-based compactness and local contrast. IEEE Trans. Imag. Process.
**2015**, 24, 3308–3320. [Google Scholar] [CrossRef] [PubMed] - Zhou, D.; Weston, J.; Gretton, A.; Bousquet, O.; Schölkopf, B. Ranking on data manifolds. In Advances in Neural Information Processing Systems; Mit Press: Cambridge, MA, USA, 2004; pp. 169–176. [Google Scholar]
- Qiao, C.; Wang, J.; Shang, J.; Daneshfar, B. Spatial relationship-assisted classification from high-resolution remote sensing imagery. Int. J. Dig. Earth
**2015**, 8, 710–726. [Google Scholar] [CrossRef] - Xie, Y.; Lu, H.; Yang, M. Bayesian saliency via low and mid level cues. IEEE Trans. Imag. Process.
**2013**, 22, 1689–1698. [Google Scholar] - Boykov, Y.; Kolmogorov, V. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Patt. Anal. Mach. Intell.
**2004**, 26, 1124–1137.
- Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. **2006**, 27, 3025–3033.
- Lu, H.; Zhang, X.; Qi, J.; Tong, N.; Ruan, X.; Yang, M.-H. Co-bootstrapping saliency. IEEE Trans. Image Process. **2017**, 26, 414–425.
- He, K.; Sun, J.; Tang, X. Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. **2013**, 1397–1409.
- Li, K.; Chen, Y. A genetic algorithm-based urban cluster automatic threshold method by combining VIIRS DNB, NDVI, and NDBI to monitor urbanization. Remote Sens. **2018**, 10, 277.
- Li, X.; Lu, H.; Zhang, L.; Ruan, X.; Yang, M.-H. Saliency detection via dense and sparse reconstruction. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013.
- Lou, J.; Ren, M.; Wang, H. Regional principal color based saliency detection. PLoS ONE **2014**, 9, e112475.
- Li, H.; Lu, H.; Lin, Z.; Shen, X.; Price, B. Inner and inter label propagation: Salient object detection in the wild. IEEE Trans. Image Process. **2015**, 24, 3176–3186.
- Zhou, L.; Yang, Z.; Zhou, Z.; Hu, D. Salient region detection using diffusion process on a two-layer sparse graph. IEEE Trans. Image Process. **2017**, 26, 5882–5894.
- Yuan, Y.; Li, C.; Kim, J.; Cai, W.; Feng, D.D. Reversion correction and regularized random walk ranking for saliency detection. IEEE Trans. Image Process. **2018**, 27, 1311–1322.
- Pesaresi, M.; Huadong, G.; Blaes, X.; Ehrlich, D.; Ferri, S.; Gueguen, L.; Halkia, M.; Kauffmann, M.; Kemper, T.; Lu, L. A global human settlement layer from optical HR/VHR RS data: Concept and first results. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2013**, 6, 2102–2131.
- Zhang, L.; Lv, X.; Liang, X. Saliency analysis via hyperparameter sparse representation and energy distribution optimization for remote sensing images. Remote Sens. **2017**, 9, 636.

**Figure 2.** (**a**) Segmentation result of SLIC (N = 100); (**b**) segmentation result of SLIC (N = 1000); (**c**) segmentation result of SLIC (N = 10,000); (**d**) result of merging 20,000 superpixels into 2000 objects; (**e**) result of merging 20,000 superpixels into 4000 objects.
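For readers who want to reproduce the superpixel step behind Figure 2, the sketch below is a minimal SLIC-style segmentation: k-means over joint color and spatial features, with the spatial coordinates weighted by m/S as in SLIC. `slic_like` is a hypothetical helper run on synthetic data, not the authors' implementation; in practice a library routine such as scikit-image's `slic` would be used.

```python
import numpy as np

def slic_like(img, n_segments, m=10.0, n_iter=5):
    # Minimal SLIC-style superpixels: k-means over joint (color, position)
    # features, spatial term scaled by m/S (S = grid interval) as in SLIC.
    h, w, c = img.shape
    s = max(1, int(np.sqrt(h * w / n_segments)))          # grid interval S
    yy, xx = np.mgrid[0:h, 0:w]
    pos = np.stack([yy, xx], axis=-1).reshape(-1, 2).astype(float)
    feat = np.concatenate([img.reshape(-1, c), pos * (m / s)], axis=1)
    # Seed cluster centres on a regular grid, one per superpixel.
    seeds = [(y, x) for y in range(s // 2, h, s) for x in range(s // 2, w, s)]
    centers = feat[[y * w + x for y, x in seeds]]
    for _ in range(n_iter):
        # Assign every pixel to its nearest centre in the joint feature space.
        d = ((feat[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(len(centers)):                     # recompute centres
            mask = labels == k
            if mask.any():
                centers[k] = feat[mask].mean(0)
    return labels.reshape(h, w)

rng = np.random.default_rng(0)
img = rng.random((40, 40, 3))        # stand-in for a Sentinel-2 band composite
labels = slic_like(img, n_segments=16)
print(labels.shape, labels.max() + 1)
```

A full SLIC implementation additionally restricts each pixel's search to a 2S × 2S window around nearby centres and enforces connectivity, which is why library versions are both faster and cleaner at object boundaries.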

**Figure 3.** Study areas: Alxa (AL), Jinchang (JC), Wuwei (WW), Yulin (YL), Dalian (DL), Haikou (HK), Rizhao (RZ), Shanwei (SW), Chongqing (CQ), Jingzhou (JZ), Sanmenxia (SM), Wuhan (WH), Lanzhou (LZ), Lhasa (LS), Tianshui (TS), Xining (XN), Baoding (BD), Kaifeng (KF), Shangqiu (SQ), and Suzhou (SZ).

**Figure 4.** Saliency maps produced by our proposed model and eight competing models. (**a**) Original images; (**b**) optimal band combinations; (**c**) ground truth; (**d**) dense and sparse reconstruction (DSR); (**e**) discriminative regional feature integration (DRFI); (**f**) regional principal color (RPC); (**g**) diffusion-based compactness and local contrast (DCLC); (**h**) inner and inter label propagation (LPS); (**i**) bootstrap learning (BL); (**j**) diffusion process on a two-layer sparse graph (DPTLSG); (**k**) reversion correction and regularized random walks ranking (RCRR); (**l**) ours.

**Figure 5.** Quantitative evaluation results of different methods: (**a**) ROC curves of different methods on Sentinel-2 images; (**b**) precision, recall, and F-measure of different methods on Sentinel-2 images.

**Figure 6.** Comparison of the results of the four methods. (**a**) RGB images; (**b**) ground truth; (**c**) NDBI maps; (**d**) NBI maps; (**e**) PanTex maps; (**f**) our saliency maps; (**g**) built-up areas maps (NDBI); (**h**) built-up areas maps (NBI); (**i**) built-up areas maps (PanTex); (**j**) built-up areas maps (proposed method).

**Table 1.** Pseudo-code for the fractional-order Darwinian particle swarm optimization (FODPSO) algorithm.

```
Start
    Set initial parameters v_n[0], x_n[0], χ_1n[0], χ_2n[0]
    // x_n is the particle position, v_n is the velocity,
    // χ_1n is the local (personal) best, χ_2n is the global best
    for i = 1 : 1 : max. number of iterations
        Generate the swarm matrix
        for n = 1 : 1 : number of swarm matrix rows
            Calculate the fitness function of each row
        end
        Obtain the parameter configuration with the minimum fitness
        if min. fitness(i) < min. fitness(i − 1)
            Update χ_1n[t], χ_2n[t]
            Update v_n[t + 1], x_n[t + 1]
        else
            Kill all swarm matrix members
            Go to "Generate the swarm matrix"
        end
    end
End
```
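The loop in Table 1 can be sketched in code as follows. This is a minimal, hypothetical illustration (names and parameters are ours, not the authors' code): it keeps the fractional-order velocity update, with the Grünwald–Letnikov fractional derivative truncated to the first four past-velocity terms, but omits the Darwinian delete-and-respawn branch for brevity and minimizes a simple sphere fitness.

```python
import numpy as np

def fodpso_sketch(fitness, dim=2, n_particles=20, n_iter=80,
                  alpha=0.6, c1=1.5, c2=1.5, seed=0):
    # Fractional-order PSO: the inertia term is replaced by a weighted
    # memory of the last four velocities (truncated GL fractional derivative).
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))            # positions
    v_hist = [np.zeros((n_particles, dim)) for _ in range(4)]  # v[t]..v[t-3]
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                    # global best
    # Coefficients of the first four Grünwald-Letnikov terms of order alpha.
    w = [alpha,
         alpha * (1 - alpha) / 2,
         alpha * (1 - alpha) * (2 - alpha) / 6,
         alpha * (1 - alpha) * (2 - alpha) * (3 - alpha) / 24]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        v = sum(wi * vi for wi, vi in zip(w, v_hist))     # fractional memory
        v += c1 * r1 * (pbest - x) + c2 * r2 * (g - x)    # cognitive + social
        x = x + v
        v_hist = [v] + v_hist[:3]
        f = np.array([fitness(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

best, best_f = fodpso_sketch(lambda p: float((p ** 2).sum()))
print(best, best_f)
```

In the full Darwinian variant, multiple swarms run this loop in parallel, and a swarm that fails to improve is killed and respawned, which is what the `else` branch of Table 1 expresses.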

| City Type | City | Size (pixels) | Date |
|---|---|---|---|
| Desert cities | Alxa (Inner Mongolia) | 1000 × 900 | 11/01/2016 |
| | Jinchang (Gansu) | 858 × 858 | 11/24/2016 |
| | Wuwei (Gansu) | 1000 × 1000 | 11/04/2016 |
| | Yulin (Shaanxi) | 1500 × 1500 | 04/01/2016 |
| Coastal cities | Dalian (Liaoning) | 1300 × 1300 | 11/21/2016 |
| | Haikou (Hainan) | 1250 × 1250 | 12/20/2016 |
| | Rizhao (Shandong) | 1000 × 1000 | 11/03/2016 |
| | Shanwei (Guangdong) | 500 × 500 | 12/13/2016 |
| Riverside cities | Chongqing | 1650 × 1650 | 04/14/2017 |
| | Jingzhou (Hubei) | 1250 × 1250 | 08/01/2016 |
| | Sanmenxia (Henan) | 1000 × 1000 | 09/23/2016 |
| | Wuhan (Hubei) | 1800 × 1200 | 08/28/2016 |
| Valley cities | Lanzhou (Gansu) | 1750 × 900 | 12/01/2016 |
| | Lhasa (Tibet) | 1500 × 1500 | 10/24/2016 |
| | Tianshui (Gansu) | 1020 × 1020 | 06/06/2017 |
| | Xining (Qinghai) | 1750 × 1750 | 11/04/2016 |
| Plain cities | Baoding (Hebei) | 1400 × 1400 | 09/01/2016 |
| | Kaifeng (Henan) | 1200 × 1200 | 08/23/2016 |
| | Shangqiu (Henan) | 1250 × 1250 | 08/28/2016 |
| | Suzhou (Anhui) | 1200 × 1200 | 08/28/2016 |

| City Type | City | Optimal Band Combination |
|---|---|---|
| Desert cities | Alxa | Bands 12, 11, 7 |
| | Jinchang | Bands 12, 11, 7 |
| | Wuwei | Bands 12, 11, 5 |
| | Yulin | Bands 12, 11, 7 |
| Coastal cities | Dalian | Bands 12, 11, 7 |
| | Haikou | Bands 12, 11, 7 |
| | Shanwei | Bands 12, 11, 7 |
| | Rizhao | Bands 12, 11, 5 |
| Riverside cities | Chongqing | Bands 12, 11, 7 |
| | Jingzhou | Bands 12, 11, 7 |
| | Sanmenxia | Bands 12, 11, 7 |
| | Wuhan | Bands 12, 11, 7 |
| Valley cities | Lanzhou | Bands 12, 11, 5 |
| | Lhasa | Bands 12, 11, 5 |
| | Tianshui | Bands 12, 11, 5 |
| | Xining | Bands 12, 11, 7 |
| Plain cities | Baoding | Bands 12, 11, 7 |
| | Kaifeng | Bands 12, 11, 7 |
| | Shangqiu | Bands 12, 11, 7 |
| | Suzhou | Bands 12, 11, 7 |

| Method | DSR | DRFI | RPC | DCLC | LPS | BL | DPTLSG | RCRR | Ours |
|---|---|---|---|---|---|---|---|---|---|
| AUC | 0.8890 | 0.8788 | 0.7521 | 0.9146 | 0.8485 | 0.8366 | 0.9167 | 0.8405 | 0.9687 |

| Method | DSR | DRFI | RPC | DCLC | LPS | BL | DPTLSG | RCRR | Ours |
|---|---|---|---|---|---|---|---|---|---|
| Time (s) | 178.33 | 1547.65 | 514.23 | 325.42 | 137.92 | 135.24 | 1583.00 | 1363.15 | 1226.22 |

| | M = 3 (N = 1000, 1500, 2000) | M = 5 (N = 1000, 1500, 2000, 2500, 3000) | M = 7 (N = 1000, 1500, 2000, 2500, 3000, 3500, 4000) |
|---|---|---|---|
| AUC | 0.9645 | 0.9664 | 0.9687 |
| F-measure | 0.7853 | 0.7868 | 0.8038 |
| Time (s) | 259.60 | 626.42 | 1226.22 |

| ϑ | 1.2 | 1.4 | 1.6 | 1.8 | 2.0 |
|---|---|---|---|---|---|
| AUC | 0.9534 | 0.9581 | 0.9592 | 0.9597 | 0.9487 |
| F-measure | 0.7147 | 0.7212 | 0.7299 | 0.7378 | 0.7148 |
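The F-measure values reported above combine precision and recall. A common convention in the saliency-detection literature (assumed here; the exact β is not restated in this excerpt) is the weighted form with β² = 0.3, which emphasizes precision. A minimal sketch with a hypothetical helper and toy masks:

```python
import numpy as np

def precision_recall_fmeasure(pred, gt, beta2=0.3):
    # pred: binarized saliency map; gt: ground-truth mask (both boolean).
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    # Weighted F-measure; beta2 = 0.3 favors precision over recall.
    f = (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-12)
    return precision, recall, f

gt = np.zeros((10, 10), bool); gt[2:8, 2:8] = True       # 36 true positives
pred = np.zeros((10, 10), bool); pred[2:8, 2:6] = True   # 24 predicted, all correct
p, r, f = precision_recall_fmeasure(pred, gt)
print(round(p, 3), round(r, 3), round(f, 3))             # 1.0 0.667 0.897
```

With β² = 0.3 the toy prediction scores 0.897 despite recalling only two-thirds of the target, illustrating why this weighting rewards conservative, high-precision maps.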

| City Type | NDBI OA | NDBI Co | NDBI Om | NBI OA | NBI Co | NBI Om | PanTex OA | PanTex Co | PanTex Om | Ours OA | Ours Co | Ours Om |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Desert | 36.40 | 90.85 | 65.66 | 42.52 | 90.48 | 64.32 | 81.27 | 65.18 | 33.87 | 95.56 | 11.25 | 26.26 |
| Coastal | 78.22 | 48.27 | 30.59 | 86.06 | 27.66 | 27.18 | 87.88 | 36.66 | 36.39 | 95.13 | 8.04 | 11.17 |
| Riverside | 80.96 | 38.64 | 24.07 | 84.51 | 29.34 | 28.92 | 81.57 | 37.21 | 29.03 | 93.35 | 10.49 | 17.84 |
| Valley | 48.62 | 83.01 | 48.86 | 64.62 | 71.72 | 43.35 | 87.81 | 29.54 | 25.91 | 96.37 | 11.68 | 18.79 |
| Plain | 88.43 | 30.87 | 16.87 | 88.56 | 21.61 | 40.45 | 87.80 | 30.27 | 22.26 | 94.77 | 8.72 | 17.97 |

OA = overall accuracy, Co = commission error, Om = omission error (all in %).
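Assuming OA, Co, and Om denote overall accuracy, commission error, and omission error in percent (the standard remote-sensing usage), they follow directly from the confusion counts of the extracted versus reference built-up masks. A sketch with a hypothetical helper:

```python
import numpy as np

def accuracy_metrics(pred, gt):
    # pred, gt: boolean masks of extracted vs. reference built-up areas.
    tp = np.logical_and(pred, gt).sum()      # built-up, correctly extracted
    fp = np.logical_and(pred, ~gt).sum()     # falsely extracted (commission)
    fn = np.logical_and(~pred, gt).sum()     # missed built-up (omission)
    tn = np.logical_and(~pred, ~gt).sum()
    oa = 100.0 * (tp + tn) / (tp + tn + fp + fn)   # overall accuracy
    co = 100.0 * fp / max(tp + fp, 1)              # commission error
    om = 100.0 * fn / max(tp + fn, 1)              # omission error
    return oa, co, om

gt = np.zeros((10, 10), bool); gt[:5] = True       # 50 reference pixels
pred = np.zeros((10, 10), bool); pred[1:6] = True  # extraction shifted one row
oa, co, om = accuracy_metrics(pred, gt)
print(oa, co, om)                                  # 80.0 20.0 20.0
```

Read against the table, a method like ours with low Co and Om simultaneously over-extracts little and misses little, which is why its OA stays above 93% across all five city types.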

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sun, Z.; Meng, Q.; Zhai, W.
An Improved Boosting Learning Saliency Method for Built-Up Areas Extraction in Sentinel-2 Images. *Remote Sens.* **2018**, *10*, 1863.
https://doi.org/10.3390/rs10121863
