Feature Map Analysis of Neural Networks for the Application of Vacant Parking Slot Detection

Hwang, Jung-Ha; Cho, Byungwoo; Choi, Doo-Hyun

doi:10.3390/app131810342


by Jung-Ha Hwang ¹, Byungwoo Cho ²,³ and Doo-Hyun Choi ¹,*

¹ School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea

² School of ICT, Robotics, and Mechanical Engineering, Hankyong National University, Anseong 17579, Republic of Korea

³ Korea Institute of Medical Microrobotics, Gwangju 61011, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(18), 10342; https://doi.org/10.3390/app131810342

Submission received: 18 August 2023 / Revised: 12 September 2023 / Accepted: 13 September 2023 / Published: 15 September 2023


Abstract

Vacant parking slot detection using image classification has been studied for a long time. Deep neural networks are now widely used in this research field, and most work has concentrated on improving their performance. As a result, little attention has been paid to the features that the networks extract from the images, and it remains unclear which features are crucial when a neural network determines whether a particular parking slot is full. To address this issue, this study divides the structure of neural networks into a feature extraction part and a classification part. The output of the feature extraction part is visualized through normalization and grayscale imaging. The visualized feature maps are analyzed to relate the feature characteristics to the classification results. The results show that a specific region of the feature maps is activated if the parking slot is full. In addition, it is verified that different networks whose classification parts are identical extract similar features from parking slot images. This study demonstrates that feature map analyses help to find hidden characteristics of features and to understand how neural networks operate. Our findings suggest that handcrafted algorithms using the features found by machine learning algorithms could replace neural network-based classification parts.

Keywords: vacant parking slot detection; image classification; deep neural networks; feature extraction; feature map visualization

1. Introduction

Image classification has been used in various fields [1,2,3,4], and detecting vacant parking slots is currently a popular research topic. Population growth has increased the number of vehicle users and has led to a lack of parking spaces [5,6,7]. To address this shortage, researchers have paid attention to image classification methods that can manage parking spaces effectively. Vacant parking slot detection using image classification is usually evaluated in terms of accuracy and processing speed [8,9,10,11]; however, most researchers have primarily focused on increasing detection accuracy.

Researchers have deployed diverse feature extraction methods to improve the accuracy of vacant parking slot detection. Edge detection, color histograms, and histograms of oriented gradients (HOGs) are commonly used methods. Edge detection finds the points at which the value difference between adjacent pixels is significant. Therefore, edge detection helps in the detection of objects, as the points usually appear at object borders or an object’s component borders. Edge detection can be used to compute the ratio of edges in a parking slot [12,13], and the closed contours of edges can be used in binary decisions [12]. Color histograms represent the distributions of all the pixels in an image based on the pixel values and frequency. Experts count the number of bright pixels in images by analyzing the histograms of grayscale images [12] or employ hue histograms obtained from the hue, saturation, and value color space for the training of support vector machines (SVMs) [14]. HOGs divide an image into many cells and generate one histogram for each cell; in turn, the histograms consist of the magnitudes and directions of all the pixel gradients in the cells. Several authors have used HOGs to train conventional machine learning algorithms, such as SVM and k-nearest neighbor [15] or Bayesian hierarchical frameworks, which are statistical models [16]. Additionally, there are many feature extraction methods, and these methods require the use of processed pixel colors for template matching [13,17], the application of a morphological operation to binary images [18], or the extraction of texture information through local binary patterns and local phase quantization [19]. These feature extraction methods help improve the accuracy of vacant parking slot detection, but generalization is difficult because each method is differently affected by the environment. 
For this reason, conventional feature extraction methods are limited, as they cannot be easily applied to different situations such as camera angle and direction variations, illumination, and weather conditions.

Experts have started to deploy neural networks (NNs) to overcome these limitations while maintaining the accuracy of vacant parking slot detection methods. Convolutional NNs (CNNs) have become the structural base of many NNs and have demonstrated remarkable image-processing performance. To identify vacant parking slots, researchers have employed CNN-based NNs as classifiers [20,21], which determine the status of each parking slot from the slot image. However, the use of NNs as classifiers has a disadvantage: the regions of interest (ROIs) must be initialized because only one parking slot image is input into the classifier at a time. For this reason, classifiers must repeat their operations until every parking slot has been classified. To surmount this disadvantage, researchers have deployed NNs as detectors, which can identify several vacant parking slots simultaneously. A combination of YOLOv3 and ResNet, trained and tested on a public dataset, performs well as a detector [22]. Faster R-CNN has also shown excellent performance as a detector [23]. A variation of the ConvNeXT model attains an accuracy of >99% on public datasets [24]. Likewise, many studies have demonstrated that feature extraction using NNs is better for identifying vacant parking slots than feature extraction using conventional methods [20,21,25]. Vacant parking slot detection using NNs is advantageous in that it can be applied to various situations without designing a feature extraction method for each. In addition, there are many ways to improve the performance of NNs, and an adequate quantity of data can enhance their fidelity and robustness. For these reasons, NNs are replacing conventional feature extraction methods.

However, because researchers using NNs do not need to manually extract useful features from images, the NNs themselves decide which features to extract. As a result, most researchers concentrate on achieving good results rather than analyzing the features extracted by NNs. Such a result-oriented perspective cannot ultimately solve fundamental problems because the underlying processes are not understood. Various feature analysis methods have been studied to determine the internal processes of NNs. Class activation mapping (CAM) shows researchers which regions of an image NNs treat as important [26]. Gradient-weighted CAM is a generalization of CAM that can be applied regardless of the type of NN [27]. The convolutional block attention module refines features by unifying channel and spatial domain information [28]. Pyramid feature attention enhances feature diversity by applying spatial attention modules to low-level features and channel attention modules to high-level features [29]. Efficient channel attention improves the performance of NNs by overcoming the dimensionality reduction issues that many channel attention-based methods encounter [30]. SimAM builds three-dimensional attention modules by calculating each neuron's importance and refines features using the module [31]. These studies effectively utilize the features extracted by NNs. However, they still cannot determine the characteristics of the features.

Therefore, this study focuses on the determination of the characteristics of the features extracted from images by analyzing the features based on a feature map visualization. Figure 1 shows the differences between our proposal and other studies.

In addition, the analysis and comparison of the features extracted by different NNs help us understand how NNs deal with vacant parking slot detection. The contributions of this study are as follows: (1) it is verified that the features extracted by NNs have specific patterns through a comparison and analysis based on feature map visualization. In addition, it is identified that it is possible to replace the classification part of NNs with handcrafted algorithms. (2) Various NNs were designed for systematic feature map analysis, and it is verified that NNs with different feature extraction structures extract similar features from images. (3) A new dataset is constructed to simplify vacant parking slot detection, and an ROI initialization method to optimize the performance of NNs is proposed.

Section 2 explains the proposed methods and materials in detail. Section 3 shows the experimental results with diverse materials, such as figures and tables. Section 4 discusses the results thoroughly. Section 5 concludes this study.

2. Materials and Methods

The current investigation involved the analysis of the features extracted from images using networks in vacant parking slot detection applications. However, the networks and datasets were generally so complicated [8,24] that the analysis of the features was difficult. Thus, simple networks and an appropriate dataset were needed, and feature map visualization was also proposed. In this context, only the parking slots at the same line of camera sight were considered. In total, 2700 parking slot images were collected and deployed for NN training and testing. Furthermore, four different NNs were designed and used for the experiments. In addition, feature map visualization was employed to convert the features into visual information by applying normalization. The dataset was acquired in the morning and afternoon to reflect illumination variations, like in other studies [21,25]. The NNs were designed to generate feature maps of the same size to directly compare their features. Moreover, a method was chosen for feature map visualization, which could indicate the correlation of the features rather than the features themselves. The experimental setup bears a close resemblance to [32].

2.1. System Configuration

Our computer system comprised an NVIDIA GeForce RTX 2070 SUPER graphics card, 64 GB of random access memory, and a Ryzen 3900X processor. The learning environment comprised CUDA Toolkit 10.0, cuDNN 7.6.5, the GPU build of TensorFlow 1.13.1, and Python 3.6. The Adam optimizer and the categorical cross-entropy loss function were used. Our NNs were trained for 100 epochs with a batch size of 32; the remaining hyperparameters were left at their default settings.

2.2. Data Acquisition

NNs with an outstanding classification performance were needed to obtain the correct feature analysis outcome, and an appropriate dataset was required to make outstanding NNs. Public datasets, such as PKLot [33] and CNRPark [25], have been used in many studies. However, the datasets were inappropriate for our experiments because they comprise parking slot images with small sizes or extensive occlusion caused by lampposts and trees. On the other hand, it has been demonstrated that the classification performance of NNs is affected by ROIs of a specific parking slot [34,35]. For these reasons, a specific dataset was needed. A camera was placed on the wall of our campus building to acquire images of the outer parking lot. The dataset was composed of 2700 cropped parking slot images. The parking lot in front of the campus building is shown in Figure 2.

Among all the parking slots, only 20 slots (P1 to P20 in Figure 2) in the middle of the parking lot were employed to collect the dataset. Five different ROIs were applied to each parking slot to collect parking slot images in various views, and 27 images per ROI were obtained from the video frames. Thus, 135 images per slot were acquired, and 2700 images were collected. Example images of five ROIs are shown in Figure 3.

ROI initialization was conducted in the following sequence. First, the coordinates of the four corners of each parking slot were found; these were taken as the four corners of ROI level 0 of that slot. As shown in Figure 2, higher ROI levels were obtained by moving the four corners of ROI level 0 in the +Y direction. However, critical information can be lost when the corners move excessively in the +Y direction; for instance, only unrelated information remains if they move too far. Likewise, occlusion by adjacent vehicles increases the amount of unnecessary information. Moving the ROI in the X direction disturbs the classification by increasing the occlusion, so it was not considered. Therefore, ROI level 4 was fixed immediately after ROI level 0 was found, as an upper bound on the shift. Let k be the difference between the Y coordinates of the lowest and rightmost corners of ROI level 0. ROI level 4 is obtained by moving the four corners of ROI level 0 in the +Y direction by k, and ROI levels 1, 2, and 3 are obtained by moving them in the +Y direction by k/4, 2k/4, and 3k/4, respectively.
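The ROI level construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it treats each ROI as four corner points, and it assumes image coordinates in which the lowest corner has the largest y value and the rightmost corner has the largest x value. The function names `offset_k` and `roi_levels` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def offset_k(corners: List[Point]) -> float:
    """Difference between the Y coordinates of the lowest and rightmost corners.

    Assumption: image coordinates, so the lowest corner has the largest y
    and the rightmost corner has the largest x.
    """
    lowest = max(corners, key=lambda p: p[1])
    rightmost = max(corners, key=lambda p: p[0])
    return abs(lowest[1] - rightmost[1])


def roi_levels(level0: List[Point], n_levels: int = 5) -> List[List[Point]]:
    """Return ROI levels 0..n_levels-1; level i is shifted by i*k/(n_levels-1) in +Y."""
    k = offset_k(level0)
    step = k / (n_levels - 1)
    return [[(x, y + i * step) for x, y in level0] for i in range(n_levels)]


# Hypothetical corner coordinates of one slot (ROI level 0)
level0 = [(10.0, 50.0), (60.0, 40.0), (70.0, 80.0), (20.0, 90.0)]
levels = roi_levels(level0)
```

With these sample corners, k is 10, so ROI levels 1 to 4 are shifted by 2.5, 5, 7.5, and 10 pixels; the X coordinates never change, matching the decision not to move the ROI in the X direction.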

2.3. Network Design

A network based on CNNs was built to extract features from parking slot images because CNNs are known to perform well in image processing. In addition, several structurally different networks whose classification parts are identical were designed to compare the characteristics of their extracted features. The networks were constructed to output same-size feature maps for direct comparison, because feature map comparisons are difficult if the sizes of their feature maps are different. The networks were based on ordinary convolutional neural networks, since vacant parking slot detection using ROI initialization is a simple task. The networks consisted of convolution, max-pooling, drop-out, and fully connected layers. The convolution layers extract valuable features from the input images based on convolution operations between the convolution filters and receptive fields of the input images. The max-pooling layers find the largest value in receptive fields and eliminate the rest to make the network faster and lighter by eliminating trivial information. The drop-out layers prevent networks from overfitting by randomly disconnecting propagation. The fully connected layers establish complete connections between successive layers. Our baseline network structure is shown in Figure 4.

The baseline network comprises four convolution layers, four max-pooling layers, one flatten layer, and two dense layers. The third max-pooling layer is followed by one drop-out layer. The rectified linear unit activation function was used in all the layers except the last dense layer, where the Softmax function was used. The other networks in this study were derived from the baseline network. As mentioned before, all the networks were designed to output same-size feature maps for direct feature comparisons, as shown in Figure 5.
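One way to check that structurally different feature extractors still emit same-size feature maps is to trace the tensor shape layer by layer. The sketch below does only that bookkeeping. It assumes stride-1 convolutions with "same" padding and 2x2 max pooling, and the 64x64 input size, the filter counts, and the `variant` layout are hypothetical; only the four convolution and four max-pooling layers of the baseline and the 16 output feature maps come from the text.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def trace_shapes(input_shape: Shape, layers: List[Tuple[str, int]]) -> List[Shape]:
    """Trace output shapes through ("conv", n_filters) and ("pool", size) layers.

    Assumptions: convolutions use 'same' padding with stride 1, so only the
    channel count changes; pooling uses a stride equal to its window size
    and rounds down.
    """
    h, w, c = input_shape
    shapes = []
    for kind, value in layers:
        if kind == "conv":
            c = value
        elif kind == "pool":
            h, w = h // value, w // value
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((h, w, c))
    return shapes


# Hypothetical baseline: four blocks of one convolution + one 2x2 max pooling
baseline = [("conv", 16), ("pool", 2)] * 4
# Hypothetical variant: two convolutions per block in the first two blocks
variant = ([("conv", 8), ("conv", 8), ("pool", 2)] * 2
           + [("conv", 16), ("pool", 2)] * 2)

base_out = trace_shapes((64, 64, 1), baseline)[-1]
var_out = trace_shapes((64, 64, 1), variant)[-1]
```

Under these assumptions both layouts end with sixteen 4x4 maps, because the output size depends only on the number of pooling stages and the filter count of the last convolution, not on how many convolutions precede each pooling layer.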

2.4. Feature Map Visualization

Many studies have been performed to understand how features are involved in classification performance, and CAM [26] is one such approach. CAM uses heat maps to show which points of an image NNs regard as important features, but exactly how the extracted features influence the classification results remains unknown. Feature map visualization was used here to examine how the extracted features are directly related to the classification results. This method was easily implemented using a Python library. The principle of the method is as follows: the feature maps, which are the outputs of hidden layers in neural networks, were visualized to identify the common characteristics between feature maps from different networks. As the input data passed through the hidden layers, they were transformed into small patches (called feature maps). The output of the last max-pooling layer was used for feature map visualization in this study. Feature maps are composed of elements that carry compressed information in the form of decimal fractions. Normalization was utilized to consider the relations between the elements rather than the values of the elements themselves. The entire visualization process is shown in Figure 6.

The normalizing equation is as follows:

V' = \frac{V - V_{\min}}{V_{\max} - V_{\min}} \times 255 \qquad (1)

where V is the value of each element in the feature maps and V' is the normalized value. V_{\max} and V_{\min} are the maximum and the minimum values of all the elements in a feature map, respectively. Normalization yields decimal fractions, which are rounded down to integers. The elements thus become integers ranging from 0 to 255, resulting in a visible grayscale image, as shown in Figure 6.
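Equation (1) followed by rounding down can be sketched in plain Python as below. The function name `normalize_feature_map` and the sample values are illustrative, and the handling of a constant feature map (where the maximum equals the minimum and Equation (1) is undefined) is an assumption, since the text does not say how that case was treated.

```python
import math
from typing import List


def normalize_feature_map(fmap: List[List[float]]) -> List[List[int]]:
    """Min-max normalize one feature map to integer gray levels in [0, 255]."""
    values = [v for row in fmap for v in row]
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant map has no contrast; render it as all black.
        return [[0 for _ in row] for row in fmap]
    # Equation (1), then rounding down to an integer gray level
    return [
        [math.floor((v - v_min) / (v_max - v_min) * 255) for v in row]
        for row in fmap
    ]


fmap = [[0.0, 0.5], [1.0, 2.0]]
gray = normalize_feature_map(fmap)  # [[0, 63], [127, 255]]
```

Because the minimum and maximum are taken per feature map, the gray levels express how each element relates to the others in the same map rather than its absolute activation, which is the stated purpose of the normalization.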

2.5. Experimental Processes

The experimental processes are divided into three phases: data preparation, feature analysis preparation, and feature analysis. The processes are shown in Figure 7.

Data preparation is the phase in which the new datasets needed for the experiments are prepared. The original dataset in the data acquisition section consisted of five subsets based on ROI levels. New datasets were constructed by uniting parts of the subsets; only subsets with successive ROI levels were unified. As a result, ten new datasets were prepared by controlling the number of unified subsets.
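The text does not list the ten datasets individually. One reading that reproduces the count is to take every run of at least two successive ROI levels out of levels 0 to 4, which gives 4 + 3 + 2 + 1 = 10 ranges. The sketch below enumerates them under that assumption; the function name and the `min_size` parameter are hypothetical.

```python
from typing import List, Tuple


def successive_roi_ranges(n_levels: int = 5, min_size: int = 2) -> List[Tuple[int, int]]:
    """List (first, last) ROI-level ranges made only of successive levels.

    Assumption: each dataset unites at least `min_size` subsets; this value
    is chosen to match the count of ten and is not stated in the source.
    """
    return [
        (first, last)
        for first in range(n_levels)
        for last in range(first + min_size - 1, n_levels)
    ]


ranges = successive_roi_ranges()
names = [f"ROI levels {first} to {last}" for first, last in ranges]
```

Under this reading, the list includes both "ROI levels 0 to 1" and "ROI levels 0 to 4", the two dataset names that appear in the results.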

Feature analysis preparation is the phase in which an optimal dataset is prepared and the networks are optimally trained. New datasets were input into networks for training and testing after grayscaling and resizing. The experiments were repeated 40 times because there were ten datasets and four networks. One of the new datasets, ROI levels 0 to 1, was chosen as an optimal dataset, since the networks trained by the dataset showed good performance. The optimal dataset and optimally trained networks were deployed for the last phase.

Feature analysis is the phase in which the feature maps obtained from parking slot images are compared and analyzed. Feature maps were extracted by the feature extraction part of the networks and changed into human-recognizable images using the feature map visualization method. When one image was input into the networks, 16 feature maps were extracted. There were various patterns on the 16 feature maps, and the tenth feature map showed prominent pattern differences according to the parking slot states. Thus, that feature map was employed for feature map comparisons.

3. Results

The characteristics of the extracted features must be determined in vacant parking slot detection cases to understand the processing mechanism of NNs. For this reason, feature map visualization was employed, and NNs with an excellent classification performance were used to increase the reliability of the feature analysis outcome. Various ROI combinations were utilized to train and test the NNs to achieve an outstanding performance. Normalization and grayscale imaging were used in the feature map visualization. Certain ROI combinations maximized the performance of the NNs, but the performance was not guaranteed by the diversity of the ROI combinations. In correct classification cases, there were apparent differences among the features of parking states and the features of empty states. In misclassifications, the patterns of visualized features were opposite compared to those associated with correct classifications. In addition, similar visualized feature patterns were observed in structurally different networks with the same performance. The results indicate a specific rule in an image classification of NNs and a certain directionality, regardless of the extent of the NNs.

3.1. Feature Map Analysis of the Baseline Network

Experiments were conducted to analyze the characteristics of the features extracted from the parking slot images by the baseline network. The dataset that maximized the performance of the baseline was selected to enhance the reliability of the analyzed results. Table 1 lists the training and testing results of the baseline network.

ROI levels 0 to 4 indicate that the dataset used for the experiments contained all the ROI levels. ROI levels 0 to 1 indicate that the dataset contained ROI levels 0 and 1 only. The datasets were divided into training, validation, and testing sets at a fixed ratio of 69%, 17%, and 14%, respectively. The baseline network used the training sets to update its weights and the validation and test sets for inspection only. The validation sets helped inspect the classification performance of the baseline network during training. The test sets confirmed the generalization performance of the baseline, as the baseline had never seen the test sets during training. Misclassification indicates the number of erroneous classifications when the test sets were input to the baseline network. In general, the baseline had one or zero misclassifications. Consequently, the following feature analysis is reliable because the classification performance is convincing.
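A fixed-ratio split of this kind can be sketched as follows. The shuffling, the seed, and the way fractional counts are rounded are assumptions, since the text gives only the ratio; the size of 1080 images is an illustrative figure for a two-level dataset (2 ROI levels x 27 images x 20 slots).

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n: int,
    ratios: Sequence[float] = (0.69, 0.17, 0.14),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into train/validation/test parts.

    Assumption: train and validation counts are rounded down and the test
    set takes whatever remains.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(1080)
```

The three index lists are disjoint by construction, so no test image can leak into the weight updates.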

Images of the parking and empty states in the test set of the dataset were visualized to interpret the feature extraction mechanism of the baseline network. The qualitative information of the features extracted by the baseline network was provided by the proposed visualization method. The features extracted from the images of the parking and empty states were compared based on their qualitative information. Figure 8 shows that there was a difference between the images of the parking and empty states.

In the parking state, a visualized feature map has many bright elements within the red circle. In the empty state, by contrast, it has fewer bright elements within the red circle. Not all feature maps yield the same results, but the visualized feature maps of the parking states tend to have more bright elements within the red circle than those of the empty states. Figure 9 supports our observations.
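The observation above is qualitative, but it could be turned into a number by measuring the share of bright elements inside a circular region of a normalized feature map. The sketch below is one hypothetical way to do so and is not a measure defined in the study: the region center, the radius, the brightness threshold of 128, and the two toy 4x4 maps are all assumptions made for illustration.

```python
from typing import List, Tuple


def bright_fraction(
    gray: List[List[int]],
    center: Tuple[float, float],
    radius: float,
    bright_threshold: int = 128,
) -> float:
    """Fraction of elements inside a circular region that are >= bright_threshold.

    `center` is (row, column) in feature-map coordinates.
    """
    center_row, center_col = center
    inside = 0
    bright = 0
    for r, row in enumerate(gray):
        for c, value in enumerate(row):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2:
                inside += 1
                if value >= bright_threshold:
                    bright += 1
    return bright / inside if inside else 0.0


# Toy normalized maps (values 0..255), invented for illustration only
parking_like = [[0, 0, 0, 0], [0, 200, 220, 0], [0, 210, 255, 0], [0, 0, 0, 0]]
empty_like = [[255, 0, 0, 0], [0, 30, 40, 0], [0, 20, 10, 0], [0, 0, 0, 0]]

parking_score = bright_fraction(parking_like, center=(1.5, 1.5), radius=1.0)
empty_score = bright_fraction(empty_like, center=(1.5, 1.5), radius=1.0)
```

A score like this ignores bright elements outside the region, such as the top-left corner of `empty_like`, which mirrors how the comparison in the text looks only at the circled region.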

In Figure 9, the green numbers in brackets indicate the classification results. Zero means that the parking slot was empty, and one means the parking slot was full. The green decimal numbers represent the class probabilities. The green numbers in Figure 9a indicate the empty state with a 74% probability. However, this is a misclassification, because it is the parking state. Figure 9c shows the reason for this misclassification. A visualized feature map of the parking case should have many bright elements within the red circle; instead, many dark or ambiguous elements are shown in Figure 9c. Figure 9b is the empty state, but many relatively bright elements are within the red circle in Figure 9d. It is verified that specific regions of feature maps are activated if the parking slot is full in correct classification. Additionally, experiments were conducted to survey the response time of our classifiers. The inference time of the baseline without preprocessing was nearly 8.00 ms on average with respect to one parking slot image. The inference time with preprocessing was approximately 9.29 ms on average. Thus, the baseline operates at approximately 5.3 FPS when only 20 parking slots are considered in a frame.

3.2. Feature Map Comparison of Different Networks

Experiments were performed to compare different feature maps, which were obtained from parking slot images based on networks A, B, C, and the baseline. An optimal dataset that simultaneously maximizes the classification performance of all the designed networks was needed for the feature map comparison in the same condition. All the datasets used for the feature map analysis of the baseline were deployed to identify the optimal dataset. Table 2 shows the number of misclassifications in networks A, B, and C in various dataset cases.

The four networks were designed to have the same input and output sizes for direct comparisons of their feature maps. The experimental results in Table 2 indicate that several optimal dataset candidates are available for the feature map comparison. Among the candidates, the dataset with ROI levels 0 to 1 was chosen as the optimal dataset to compare the features extracted by the different networks. Figure 10 shows the parking slot images and their feature maps extracted by network C.

Figure 10g,h shows that the feature maps of the empty states contained many dark elements within the red circle. Figure 10e,f shows that the feature maps of the parking states contained many bright elements within the red circle. Therefore, Figure 10 indicates that network C extracted features from the images that were similar to those extracted using the baseline network. Figure 11 shows a direct comparison of the feature maps extracted by the baseline network and network C.

Similarities were found in the feature maps extracted by the baseline and network C. While many bright elements were contained within the red circle in the feature maps obtained from the parking state images, many dark elements were contained within the red circle in the feature maps obtained from the empty state images, independent of the network that was employed for feature extraction. Consequently, the different networks extracted similar features from the parking slot images to classify the parking and empty states. In addition, the inference time of network C was estimated in the same way as the baseline. The inference time of network C without preprocessing was approximately 7.26 ms on average, and the inference time with preprocessing was nearly 8.09 ms on average. Thus, network C operated at approximately 6.18 FPS when only 20 parking slots were considered in a frame.
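The reported frame rates follow from the per-slot inference times with preprocessing, assuming that the 20 slots of a frame are classified one after another with no other overhead. A short check of that arithmetic (the function name is illustrative):

```python
def frames_per_second(per_slot_ms: float, slots_per_frame: int = 20) -> float:
    """Frame rate when every slot in a frame is classified sequentially."""
    frame_ms = per_slot_ms * slots_per_frame
    return 1000.0 / frame_ms


baseline_fps = frames_per_second(9.29)   # 20 x 9.29 ms = 185.8 ms per frame
network_c_fps = frames_per_second(8.09)  # 20 x 8.09 ms = 161.8 ms per frame

print(round(baseline_fps, 2), round(network_c_fps, 2))  # 5.38 6.18
```

Both values agree with the figures quoted in the text (approximately 5.3 FPS for the baseline and 6.18 FPS for network C).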

4. Discussion

The use of NNs for vacant parking slot detection through image classification has been widespread. NNs have overcome the need to develop feature extraction methods based on manual image analyses or to change feature extraction methods whenever the detecting environment changes [21,23]. However, most studies on vacant parking slot detection employing NNs do not determine which features are extracted from images to accomplish accurate classifications [24,35]. The current study determined which features were extracted by NNs and investigated these by using the proposed feature map visualization method.

If NNs have a low classification performance, then the feature analysis result will be unreliable. Previous studies have verified that diversified data improve the classification performance of NNs [34,35]. For this reason, the datasets for the experiments were diversified to maximize the classification performance of the NNs. Among the datasets, the optimal dataset (which maximized the performance) was utilized for the feature map analysis. The feature maps extracted by the NNs showed different patterns according to the status of the parking slots. Relatively bright elements existed within a specific region in the feature maps of the parking state, and dark elements existed in the corresponding region in the feature maps of the empty state. We attributed this pattern difference to vehicle bonnets: the region that showed pattern differences was consistent with the position of the vehicle bonnet if a car was in the slot, whereas it corresponded to the ground surface if there was no car. Therefore, it was concluded that the NNs used the information in this region as a crucial factor for classifying the parking slot states. As a result, our findings demonstrate that the characteristics of the features extracted by NNs can be intuitively recognized.

Many studies have been conducted to understand how NNs find solutions. One study suggested a method to indicate which regions in images are considered crucial factors for finding and classifying objects; these regions were found to be consistent with the positions of the objects [26]. Another study made important features more useful to performance than trivial features, although critical features were not shown in a human-recognizable way [28]. However, these studies cannot be easily compared with our study because the viewpoints are slightly different. Thus, expanded experiments were conducted to support our analysis by investigating the various features extracted by different NNs. As performed, an optimal dataset that simultaneously maximized the performance of different NNs was employed for the feature map analysis. Similar characteristics were found in the features extracted from the images, regardless of the used NNs, and the result of the feature map analysis was consistent with the results of our experiment on the baseline network. Thus, the results indicate that NNs classify parking slot images by extracting features based on a specific rule.

Our study can be deployed in various ways. First, a feature map analysis can help in finding the basis of network compression. Most network compression studies emphasize crucial weights or exclude trivial weights. Those studies conclude that network compression is successful if compressed NNs and uncompressed NNs have similar performances. However, a similar performance does not guarantee that one is a compressed version of the other. Therefore, reasonable explanations are needed to establish the correlation between different networks. In that context, a feature map analysis can be used to determine the correlation. The result of a feature map analysis can be the basis of compression if the compressed and uncompressed NNs have similar patterns on the feature maps. Second, the inference time of the NNs can be reduced. Our results indicate that the classification part of NNs can be replaced by handcrafted algorithms, and this replacement can reduce the processing time of systems, as the classification part requires significant computing resources.

This study is associated with various limitations. First, our study dealt with feature map analyses based on image classifications of parking slot images to identify the characteristics of these maps. The feature map analysis was simple because there were only two choices (the parking and empty states). However, recent studies have employed detection; detection is the end effect of classification and localization. Features extracted by NNs for detection are considerably complicated because NNs used for detection identify multiple objects. Although our experiments were effective in determining the characteristics of features extracted by classifiers, an additional study should be conducted to analyze the features extracted by detectors. Second, specific patterns of features were found, but handcrafted classification algorithms using the patterns were not designed and implemented. It should be verified that classification algorithms correctly classify the statuses of parking slots, making the patterns critical indicators to explain how NNs operate. Therefore, future studies will analyze the features extracted by detectors (more complicated networks) and verify that handcrafted classification algorithms perform well in place of neural-network-based classification parts.

5. Conclusions

Custom datasets and networks were designed to analyze feature maps for application in vacant parking slot detection. Feature map visualization was used to convert feature maps into human-recognizable images. Datasets with high resolution and little occlusion were constructed to minimize information loss. In addition, networks with different feature extraction structures were built to compare the feature maps obtained by various networks. The results verified that there were specific patterns on the feature maps extracted by the networks. The feature maps of the parking state images had bright elements within a specific region, and those of the empty state images had dark elements within the corresponding region. This pattern difference can therefore be used to decide whether a parking slot is occupied. Moreover, the analysis results based on feature map visualization imply that handcrafted algorithms can replace the classification part of NNs. Consequently, feature map analyses are useful in finding ways to deploy NNs efficiently.
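
The visualization step summarized above (normalization followed by grayscale imaging) can be sketched as follows. Min-max normalization to the 8-bit range is assumed here; the exact normalization used in this study may differ, and the function name and input values are hypothetical.

```python
from typing import List, Sequence


def to_grayscale(fmap: Sequence[Sequence[float]]) -> List[List[int]]:
    """Min-max normalize a 2-D feature map to integer gray levels in 0..255.

    A constant map carries no contrast, so it is rendered as all zeros
    (an assumption of this sketch).
    """
    values = [v for row in fmap for v in row]
    if not values:
        raise ValueError("feature map must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0 for _ in row] for row in fmap]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in fmap]


# Hypothetical raw activations of one feature map.
raw = [[0.0, 1.0], [2.0, 5.0]]
print(to_grayscale(raw))  # → [[0, 51], [102, 255]]
```

After this mapping, the strongest activation in each map appears white and the weakest appears black, so bright and dark regions can be compared by eye across images.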

Author Contributions

Conceptualization, methodology, validation, investigation, resources, and writing, J.-H.H. and B.C.; project administration, D.-H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data for this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Proposed method used to identify the basis of classifications.

Figure 2. Dataset acquisition environment. (a) Parking lot in front of the campus building; (b) parking slot numbers and examples of ROIs distinguished based on height from the ground.

Figure 3. Sample images corresponding to the five selected ROIs. (a) ROI level 0; (b) ROI level 1; (c) ROI level 2; (d) ROI level 3; (e) ROI level 4.

Figure 4. Architecture of the baseline network.

Figure 5. Other networks designed for direct feature comparison with the baseline.

Figure 6. Feature map visualization process.

Figure 7. Overall processes of feature analysis.

Figure 8. Visualized feature maps of the baseline. (a) Parking and (b) empty states. The red circles show the regions considered to contain important information.

Figure 9. Examples of misclassifications and their visualized baseline feature maps. Misclassifications of (a) parking state and (b) empty state, (c) a visualized feature map obtained from (a), and (d) a visualized feature map obtained from (b).

Figure 10. Examples of input images and their feature maps extracted by network C. Parking state images of (a) ROI level 0 and (b) ROI level 1. Empty state images of (c) ROI level 0 and (d) ROI level 1. (e–h) Feature maps individually corresponding to (a–d).

Figure 11. Examples of feature maps extracted by the baseline network and network C. Feature map extracted by the baseline network in the (a) parking and (c) empty states; feature maps extracted by network C in the (b) parking and (d) empty states.

Table 1. Classification performance of the baseline network based on various datasets.

Dataset	No. of Training Datapoints	No. of Validation Datapoints	No. of Test Datapoints	No. of Misclassifications
ROI levels 0 to 4	1850	460	390	1
ROI levels 0 to 3	1480	368	312	0
ROI levels 1 to 4	1480	368	312	1
ROI levels 0 to 2	1110	276	234	0
ROI levels 1 to 3	1110	276	234	0
ROI levels 2 to 4	1110	276	234	1
ROI levels 0 to 1	740	184	156	0
ROI levels 1 to 2	740	184	156	1
ROI levels 2 to 3	740	184	156	0
ROI levels 3 to 4	740	184	156	0

Table 2. Number of misclassifications based on various networks and datasets.

Dataset	Baseline	Network A (FP1 + CP)	Network B (FP2 + CP)	Network C (FP3 + CP)
ROI levels 0 to 4	1	1	3	1
ROI levels 0 to 3	0	1	1	2
ROI levels 1 to 4	1	2	1	0
ROI levels 0 to 2	0	0	0	0
ROI levels 1 to 3	0	0	0	0
ROI levels 2 to 4	1	0	1	0
ROI levels 0 to 1	0	0	0	0
ROI levels 1 to 2	1	3	1	2
ROI levels 2 to 3	0	0	0	0
ROI levels 3 to 4	0	0	0	0
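
For a quick comparison of the four networks, the entries of Table 2 can be summed per network and the datasets on which no network made an error can be listed. The sketch below only re-arranges the numbers given in the table; the variable names are chosen for illustration.

```python
DATASETS = [
    "ROI levels 0 to 4", "ROI levels 0 to 3", "ROI levels 1 to 4",
    "ROI levels 0 to 2", "ROI levels 1 to 3", "ROI levels 2 to 4",
    "ROI levels 0 to 1", "ROI levels 1 to 2", "ROI levels 2 to 3",
    "ROI levels 3 to 4",
]

# Misclassification counts from Table 2, in the order of DATASETS.
TABLE_2 = {
    "Baseline": [1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
    "Network A": [1, 1, 2, 0, 0, 0, 0, 3, 0, 0],
    "Network B": [3, 1, 1, 0, 0, 1, 0, 1, 0, 0],
    "Network C": [1, 2, 0, 0, 0, 0, 0, 2, 0, 0],
}

# Total number of misclassifications per network over all ten datasets.
totals = {network: sum(counts) for network, counts in TABLE_2.items()}

# Datasets on which every network classified all images correctly.
error_free = [
    name
    for i, name in enumerate(DATASETS)
    if all(counts[i] == 0 for counts in TABLE_2.values())
]

print(totals)
print(error_free)
```

The totals are 4, 7, 7, and 5 misclassifications for the baseline and networks A, B, and C, respectively, and five of the ten datasets are classified without error by all four networks.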

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
