Article

A Machine Learning and Computer Vision Study of the Environmental Characteristics of Streetscapes That Affect Pedestrian Satisfaction

1
Department of Urban Planning, Hanyang University, Seoul 04763, Korea
2
Department of Urban Planning and Engineering, Hanyang University, Seoul 04763, Korea
*
Author to whom correspondence should be addressed.
Sustainability 2022, 14(9), 5730; https://doi.org/10.3390/su14095730
Submission received: 24 March 2022 / Revised: 4 May 2022 / Accepted: 6 May 2022 / Published: 9 May 2022

Abstract

Pedestrian-friendly cities are a recent global trend due to the various urbanization problems. Since humans are greatly influenced by sight while walking, this study identified the physical and visual characteristics of the street environment that affect pedestrian satisfaction. In this study, vast amounts of visual data were collected and analyzed using computer vision techniques. Furthermore, these data were analyzed through a machine learning prediction model and SHAP algorithm. As a result, every visual feature of the streetscape, for example, the visible area and urban design quality, had a greater effect on pedestrian satisfaction than any physical features. Therefore, to build a street with high pedestrian satisfaction, the perspective of pedestrians must be considered, and wide sidewalks, fewer lanes, and the proper arrangement of street furniture are required. In conclusion, visually, low enclosure, adequate complexity, and large green areas combine to create a highly satisfying pedestrian walkway. Through this study, we could suggest an approach from a visual perspective for the pedestrian environment of the street and see the possibility of using computer vision techniques.

1. Introduction

South Korea has experienced rapid economic growth and achieved rapid industrialization and urbanization. In that process, to quickly secure the competitiveness of the city, a car-oriented transportation system was established. However, the automobile-centered lifestyle has created many problems, including traffic congestion and environmental pollution. Walking is the most basic means of transport for humans; it is eco-friendly and promotes good health without producing any pollutants. Therefore, walking is one way to solve the urban problems caused by automobiles, and it has become the basic background for Compact City and Transit-Oriented Development. In fact, the transition from automobile-centric cities to pedestrian-friendly cities is gaining attention worldwide.
Walking mainly takes place on the road, and people perceive the streetscape mainly through sight [1]. Visual cognition, along with various other factors such as the physical environment of the street, produces a response to the environment in which that cognition occurs that can eventually be experienced as satisfaction [2]. Therefore, to allow residents to feel satisfied with walking, the landscape plan should consider what pedestrians see.
Prior studies examined the relationship between pedestrians and the walking environment. However, few studies have looked at the relationship between walking as an activity and the walking environment because such research is labor-intensive, and the scope of analysis has been limited to field investigations or surveys. Therefore, to create a truly pedestrian-friendly city, researchers need to overcome the limitations of the prior studies. For example, the pedestrian environment needs to be examined broadly enough to draw generalizable conclusions.
In recent years, newly developed technology has expanded the methodological possibilities for this kind of research [3]. For example, Google and other platforms now provide street view services that allow anyone to freely navigate vast amounts of street data through online map services. Accordingly, methods for batch processing of visual image data using computer vision techniques are emerging [4,5,6,7]. Therefore, studies beyond the limits of previous methods can now be undertaken using advanced technologies.

2. Literature Review

2.1. Streetscapes and Walkability

Streets are the most essential and representative element of a city, forming the city’s structure and characteristics. The visually perceived streetscape represents the landscape image of a city or region [8].
Streetscapes are not created by the individual landscape elements that make up the street; instead, they are created by the mutual connections of multiple elements [9]. Nonetheless, the physical environmental elements that make up the street play an important role in establishing a street’s identity. The physical environment of the street greatly affects human visual perception, and human visual perception plays an important role in determining the image of a place [10].
Walking is the most basic human means of transportation. Humans experience the urban environment comprehensively by walking [11], so pedestrian activity makes a great contribution to the formation of local communities, as well as vitalizing the urban economy by creating natural contacts among urban residents. Therefore, walking is an essential factor in revitalizing local and urban economies. A great street environment improves the frequency of outdoor activities and affects the behavior of street users, including pedestrians. For example, trees and green spaces, low building layouts, aesthetic facades of streets and buildings, and open spaces between them amplify the attractiveness of the walking experience and attract pedestrians [12]. Because pedestrians continue to walk voluntarily in an environment that continuously offers walking motivation, the meaning and attributes of the walking environment must be considered in combination to create a good walking experience. Therefore, walking satisfaction has a high correlation with the perception of the walking environment [13].
Various authors argued that people’s perceptions could be objectively evaluated by dealing with the walking environment of a street from a visual point of view. They defined five factors: imageability, enclosure, human scale, openness, and complexity [6,14,15]. They then argued that those factors influence people’s perceptions of the pedestrian environment in a measurable way, enabling researchers to concretely express the relationship between the physical characteristics of a street environment and pedestrians. Ernawati et al. [16] classified pedestrian environmental elements into spatial and visual factors. They considered urban design characteristics, which are the perceptual characteristics of the street landscape, to be important for promoting pedestrian activities.

2.2. Machine-Learning Model and Explainable AI

Machine learning is a field of artificial intelligence (AI) in which an algorithm predicts results by applying a mathematical technique that learns patterns based on data to enhance statistical reliability and minimize prediction errors [17]. In other words, machine learning predicts the output from new input data by learning from old data and automatically sensing data trends [18].
Machine learning can be flexibly applied to various situations because it focuses on understanding the major relationships between input variables and evaluating and predicting patterns in the input data through data learning [19]. Machine learning can analyze data that are difficult to analyze with traditional statistics because it can collect data without a specific intention or data that have various forms [20], and it can analyze complex interactions between nonlinear variables in the models [21]. Therefore, in today’s era of big data, especially unstructured data, many scholars are using predictive models based on machine learning algorithms to interpret data and make accurate predictions.
Machine-learning models are often called black-box models because their internal algorithms are complex, making it difficult to explain the reasoning behind the resulting predictions. In addition, a model can contain thousands or millions of hyperparameters that are unrelated to the input values but affect the predictive power [22]. Many studies have therefore been conducted to interpret these prediction processes and validate the reliability of the results. As a result, machine learning no longer has to be treated as a black box, and there are many different ways to interpret it.
Explainable artificial intelligence (XAI) is a technique for interpreting a machine-learning model [7]. Unlike other forms of AI, which cannot provide evidence, reliability statistics, or error correction methods for the derived results, XAI can explain the process, reasoning, and cause of errors throughout a machine-learning model [23]. XAI can identify the reasons behind the results derived by AI and explain the causes of any errors. We tried to improve upon existing research by providing an explanation interface that humans can understand and interpret immediately, rather than simply showing the calculated result values.
When a machine-learning model is analyzed through XAI, the following effects can be obtained [24]. First, analyzing the characteristics of the input variables makes it possible to identify factors that degrade the performance of a machine-learning model and derive the optimal hyperparameters for improving the model performance. Second, XAI can offer stability to the problem-solving process. In machine-learning models, it is important to design a fluid model that can operate under various conditions. Third, analyzing the structure of the model makes it possible to identify and cope with the causes of any errors that occur in the model and makes it possible to secure the reliability of the model by explaining the result-derivation process. Some examples of XAI techniques are Explain Like I’m 5 (ELI5), Skater, Local Interpretable Model-Agnostic Explanation (LIME), and Shapley Additive Explanations (SHAP).
Among them, the SHAP algorithm interprets a machine-learning model by evaluating the importance of each feature to the model’s predictions, following a game-theoretic approach [25,26,27]. SHAP is therefore widely used to interpret machine-learning models reliably.
Ding et al. [28] studied the importance of the characteristics of the construction environment for driving distance using the gradient boosting model, which is one of the machine learning models. Moreover, researchers have used SHAP to validate the predictive performance of machine learning–derived land-use change predictions [29], home price prediction through a street view analysis [30], and the prediction of traffic accidents [31]. Thus, research is expanding not only by using machine learning to analyze data but also by using XAI techniques to interpret the results from machine-learning models.

2.3. Limitation of Previous Studies and Differences in This Study

Many studies have analyzed the effects of street design factors on pedestrian satisfaction. However, most preceding studies have used limited research targets due to limitations in data collection or measurement methods, and they viewed the street environment factors fragmentarily, such as judging the physical elements of a street environment only by their existence.
Some studies about pedestrian environments have focused on only one or a few streets and lacked consideration for visual cognitive characteristics [9,32,33]. Likewise, research that classifies walking space using attributes carries the inherent limitation of difficulty in generalizing the results because the criteria for selecting emotional elements or dividing the street space are different for each researcher [32,34].
In this study, we synthetically considered the physical environmental factors in a streetscape that affect pedestrians’ walking satisfaction. First, we measured the components of the streetscape from the visual perspective to determine which characteristics influence pedestrian satisfaction. And then, we used computer vision techniques to quantify and measure the visual perceptions of a streetscape. Using these data, we built machine learning models, compared the performances, chose the best model, and interpreted it using SHAP, which is one of the XAI techniques. By using computer vision and machine learning techniques, we could analyze visual data to present an urban design perspective for improving the pedestrian environment. Additionally, we discussed the possibility of using computer vision as basic data to improve the quality of streetscapes in the future.

3. Materials and Methods

3.1. Materials

Seoul, the capital and largest city of South Korea, is already carrying out projects to create a pedestrian-friendly environment, and studies are being conducted to determine the best way to create a highly satisfactory pedestrian environment that encourages urban residents to walk.
This study is based on data from the Seoul Floating Population Report (Table 1), which is a large-scale survey conducted in Seoul. The Seoul Metropolitan Government surveyed 20,000 pedestrians at 1000 points per year from 2009 to 2015. Seoul’s floating population survey data are divided into observation data, attribute data, and point data. Among them, we investigated the attribute data by dividing the overall satisfaction with the walking environment into a 5-point Likert scale from ‘very unsatisfied’ to ‘very satisfied.’ These data were collected from individual surveys of pedestrians passing through representative points, and they contain the visitor’s gender, age, occupation, residence, purpose, frequency of visits, overall satisfaction and dissatisfaction scores for the walking environment, and the reasons for those scores. Additionally, the report provides information about the physical environment of each street, including the location of the survey point, the sidewalk status at each point, and so on. Therefore, because the Seoul Floating Population Report contains information about both the physical environment at each survey point and pedestrian reports of walking satisfaction, we used it as the basic data for this study.

3.1.1. Study Area

The spatial background of this study is Seoul, which is the capital and largest city in South Korea. First, the content of this study is limited to the visual perceptions of pedestrians, defined as the physical environmental elements that a pedestrian can visually perceive while walking on the street. The temporal range is 2015, which is the time of the most recent Seoul Floating Population Report. The spatial range is the entire area of Seoul, specifically the survey points of the Floating Population Report. However, because the walking satisfaction survey was not conducted at all the survey sites, this study targeted only the points for which walking satisfaction scores exist in the attribute data. Therefore, 978 sample points from the 1000 total sample points are used, excluding the points with missing data.

3.1.2. Streetscapes Images and the Naver Street View API

The streetscape images were obtained using the Naver Street View API with the Seoul Floating Population Report’s point data. Naver is the most popular portal site in Korea, and it supports map services and offers street view services similar to Google. Acquiring the streetscape images using an internet map service allowed us to acquire several images quickly and easily and enabled us to gather images from the year we wanted. In addition, using the images extracted through the street view service makes it possible to measure subjective perceptions of the built environment contained in the image [35,36,37,38].
In this study, we used 360° panoramic images from the Naver Street View API (Figure 1) because pedestrians perceive the overall image of the street, not a specific view. We used Python to extract the landscape images from the Naver Street View API.

3.2. Methods

We measured streetscape data from Naver Street View (NSV) images in terms of physical features and urban design quality. We used physical features from the Seoul Floating Population Report, which contains the width of the sidewalk, the number of lanes, road type, the presence or absence of a centerline, street furniture, slope, braille block, fence, bus stop, subway station, and crosswalk. Additionally, we measured urban design quality using semantic segmentation and edge detection, which are computer vision technologies. Using semantic segmentation, we measured enclosure, openness, greenery, and the feature area ratio, and using edge detection, we measured the complexity. Then, we analyzed and interpreted them with the machine-learning model and the SHAP algorithm. All procedures are shown in Figure 2.

3.2.1. Semantic Segmentation

Deep learning, a machine-learning technique, is rapidly developing in areas such as data mining, image recognition, speech recognition, and natural language processing. The semantic segmentation technique (Figure 3) is a kind of computer vision that divides the objects that appear in images into meaningful unit classes through deep learning. It mimics the process of recognizing images by performing several steps simultaneously, just as a person visually recognizes an image. That means that it segments the image into pixels and then predicts the class to which each pixel belongs. Semantic segmentation thus enables the elements that constitute the landscape to be extracted at the pixel level. We used the High-Resolution Networks model jointly developed by the University of Illinois and Google in 2019 [39].
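As a rough illustration of how a pixel-level segmentation result can be turned into area ratios, the following sketch counts the pixels assigned to each class in a label mask. The class ids, the class names, and the function name `area_ratios` are hypothetical; the actual label map depends on the trained model and is not given in the text.

```python
from collections import Counter

# Hypothetical class ids; the real label map depends on the segmentation model.
CLASS_NAMES = {0: "road", 1: "sidewalk", 2: "building", 3: "vegetation", 4: "sky"}


def area_ratios(mask):
    """Return the share of pixels assigned to each class in a 2-D label mask."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {CLASS_NAMES.get(k, f"class_{k}"): v / total for k, v in counts.items()}


# Toy 2 x 4 mask standing in for a segmented panorama (assumed data).
toy_mask = [
    [4, 4, 2, 3],
    [0, 0, 1, 3],
]
ratios = area_ratios(toy_mask)
```

In this toy mask, the sky and vegetation shares would play the role of openness and greenery, respectively; a real mask would usually be an array produced by the segmentation network.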

3.2.2. Edge Detection

Edge detection (Figure 4) derives edges by detecting the contours of an image and is used for object extraction and identification [8]. The detected edges reflect the objects constituting the image. Sobel and Canny are representative edge detection methods. Canny edge detection is available as an OpenCV function, so it is easy to use, and it has the advantage of marking each edge exactly once. It renders the detected edges as white lines on a black background.
Applying edge detection to our streetscape images allowed us to represent the number of street elements. Previous studies used edge detection to quantify the visual complexity of the landscape and concluded that the measured complexity coincided with the complexity visually perceived by humans [40,41,42,43]. In that way, the perceived complexity can be expressed numerically [44].
The image complexity measurement taken by edge detection follows the theory of entropy, a physical quantity that measures the degree of disorder or degree of irregularity. The number of edges extracted from the entire image was used to calculate the entropy value using the formula given in Equation (1):
$$H_{factor} = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (1)$$
where $H$ is the entropy value and $p_i$ is the probability of random object occurrence among all objects.
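The entropy calculation can be sketched as follows. How the probabilities are derived from the detected edges is not specified in the text, so this example assumes they come from edge-pixel counts in a few image regions; the counts and the function name `shannon_entropy` are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts contribute nothing to the sum, so they are skipped.
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical edge-pixel counts in four image regions.
uniform = shannon_entropy([10, 10, 10, 10])    # edges spread evenly
concentrated = shannon_entropy([40, 0, 0, 0])  # edges in a single region
```

Evenly spread edges give the maximum entropy for four regions (2 bits), whereas edges concentrated in one region give zero, which matches the reading of entropy as a measure of disorder.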

3.2.3. Analytic Frame

This study uses objective indicators to represent the degree of influence that the elements of the street have on walking satisfaction. To that end, we built a model that defined the evaluation factors in the physical environment of pedestrians using people’s perceptions (Figure 5).
We used the information provided by the floating population research as the physical features of the streetscape. For the visual features, we used the measured area ratio together with enclosure, openness, and complexity, which are urban design qualities used in previous studies [6,14,15]. We used a semantic segmentation technique to extract visual features.
Enclosure describes the degree of surrounded space in terms of the D:H ratio [45]. In other words, enclosure describes the degree to which streetscape elements visually define the three elements of a ceiling, wall, and floor through buildings, walls, trees, and other vertical elements [14,15]. In urban space, the enclosure is mainly formed by rows of buildings or trees. Openness describes the feeling of visual visibility, determining the amount of perceived lightness, which has an impact on visual perception and pleasantness [15]. Complexity describes the visual abundance at a particular point as the number of repetitive forms perceived by humans [46]. In terms of the complexity perceived visually by humans [47], pedestrians need a significant amount of environmental complexity to feel that walking is attractive [16]. The area ratios of buildings, roads, sidewalks, and street furniture measure the proportions of each streetscape image that each element occupies.

4. Variables

This study is based on data from the floating population research conducted by the Seoul city government (2015). The purpose of this study is to identify the characteristics of a satisfactory walking environment. Therefore, we constructed machine learning classification models to predict walking satisfaction in streetscapes. For this, we transformed the walking satisfaction scores of normal, satisfied, or very satisfied into a dummy variable of ‘satisfied with the walking environment.’ Similarly, scores of unsatisfied and very unsatisfied were converted into ‘unsatisfied with the pedestrian environment.’
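The conversion of the 5-point scale into a binary target can be sketched as below. The numeric coding (1 = very unsatisfied through 5 = very satisfied) and the function name are assumptions; the coding in the survey file may differ.

```python
def to_binary_satisfaction(score):
    """Map a 5-point Likert score to 1 (satisfied) or 0 (unsatisfied).

    Assumes 1 = very unsatisfied, 2 = unsatisfied, 3 = normal,
    4 = satisfied, 5 = very satisfied.
    """
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected Likert score: {score!r}")
    # Normal, satisfied, and very satisfied are grouped as "satisfied".
    return 1 if score >= 3 else 0


labels = [to_binary_satisfaction(s) for s in [1, 2, 3, 4, 5]]
```

Note that grouping the neutral response with the satisfied responses, as described above, makes the positive class larger than a strict satisfied/unsatisfied split would.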
Furthermore, we divided personal characteristics into gender, age group, frequency of passage, the purpose of visit, and job. Physical features were divided into land use, the width of the sidewalk, the number of lanes, road type, the presence or absence of a centerline, street furniture, slope, braille block, fence, bus stop, subway station, and crosswalk. All of the personal characteristics and the presence or absence of the street’s physical features were converted into 0 and 1. Table 2 shows the variables in this study.

4.1. Enclosure

Enclosure (Figure 6) describes the streetscape frame formed by the vertical elements that make up the landscape. It represents the degree to which streetscapes are defined by buildings, walls, street trees, and other vertical elements [14]. The proper ratio of vertical and horizontal factors on a street can give pedestrians a comfortable feeling [48,49].

4.2. Openness

Openness (Figure 7) describes the ratio of the visible sky in the street environment. Openness determines the amount of light a person perceives and affects both visual perception and comfort [15].

4.3. Greenery

The amount of greenery (Figure 8) is an essential factor in walking satisfaction. In this study, we calculate all the green in the images, including both trees and other plants. Greenery is the most effective factor in improving the quality of a street environment [50], with the attractiveness and aesthetic value of a street increasing along with the proportion of greenery.

4.4. Complexity

Complexity (Figure 9) describes the visual abundance of a streetscape. People are more curious and satisfied with complex scenery than with monotonous scenery. However, if the amount of visual information that must be processed at one time is excessive, people feel cognitive confusion and soon lose interest. Therefore, the highest satisfaction scores are found when complexity has the median value [46].

4.5. Area Ratio

The area ratio describes the proportion of each physical feature. It includes the proportions of buildings (Figure 10), roads (Figure 11), sidewalks (Figure 12), and street furniture (Figure 13).

5. Analysis

5.1. Machine-Learning Analysis Methods

In this study, we constructed a walking satisfaction prediction model using three machine-learning classification models: logistic regression, random forest, and XGBoost. Then we used the best model to examine the factors in the street environment that affect walking satisfaction.

5.1.1. Logistic Regression Classification

Logistic regression is a machine-learning algorithm that applies linear regression to classification. It predicts the probability of a binary outcome by passing a linear combination of the input variables through the logistic function. It is often used as a basic model of binary classification because it is light and fast and has excellent predictive performance [17].
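A minimal sketch of the prediction step of logistic regression is given below. The coefficients and feature values are made up for illustration and are not taken from the fitted model in this study.

```python
import math


def predict_proba(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias, threshold=0.5):
    """Binary decision obtained by thresholding the predicted probability."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical coefficients for two made-up features.
weights = [2.0, -1.0]
bias = 0.0
p_neutral = predict_proba([0.0, 0.0], weights, bias)
```

With all inputs at zero and no bias, the linear score is zero and the predicted probability is exactly 0.5; positive scores push the prediction toward the satisfied class.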

5.1.2. Random Forest Classifier

The random forest is a set of randomly generated decision trees. It works well and has excellent performance without requiring parameter tuning or data scaling because it can compensate for data shortcomings while taking advantage of the single-tree model. It is currently the most popular machine-learning algorithm because of its fast speed and high predictive performance [17].

5.1.3. XGBoost Classifier

XGBoost is a gradient boosting model (GBM) that sequentially trains simple models (such as shallow decision trees), with each new model correcting the errors of the previous ones. The name stands for eXtreme Gradient Boosting, a specific implementation of the GBM framework. Unlike the random forest, XGBoost has no randomness. The trees are sequentially created to compensate for errors in a previous tree, so it generally shows better classification performance than other machine-learning models [17,51].
XGBoost solves the problems of slow speed and overfitting, which are the disadvantages of GBM, and can independently derive the importance of characteristics. In addition, it provides a variety of custom optimization options due to its high flexibility, and it links well with other algorithms [52].
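The sequential error-correction idea behind gradient boosting can be sketched with one-split regression stumps fitted to residuals. This is plain gradient boosting with squared-error loss on a single feature, not the regularized objective or tree construction used by XGBoost itself; the toy data, the function names, and the default settings are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to residuals (needs two distinct x values)."""
    best = None
    # Candidate thresholds exclude the maximum so the right side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Toy data (assumed): the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
```

Each round shrinks the remaining residuals by the learning rate, so the predictions approach the targets gradually rather than in one step.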

5.2. Evaluating the Machine-Learning Techniques

The predictive performance indicators for machine-learning models are accuracy, precision, recall, F1 score, and AUC score. In binary classification, other evaluation indicators are often more important than model accuracy [17].
Accuracy is the most basic performance evaluation index, and it is the ratio of correctly classified data to all data. Precision is the proportion of pedestrians who actually answered that they were satisfied among the pedestrians predicted by the classification model to be satisfied. Recall is the proportion of pedestrians who answered that they were actually satisfied whom the model correctly predicted to be satisfied. Precision and recall are thus complementary to each other, and it is important to achieve high numbers for both. The F1 score is the harmonic mean of precision and recall, and it has a high value when precision and recall are not biased to either side. The ROC curve shows how recall (the true positive rate) changes with the false positive rate, that is, the rate at which the model misclassifies unsatisfied pedestrians as satisfied. It thus evaluates the overall performance of the classification model. The area under the ROC curve is the AUC score.
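The first four indicators can be computed directly from the counts of true and false positives and negatives, as in the following sketch. The label vectors are made up, and the AUC score is omitted because it requires predicted probabilities rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = satisfied)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true and predicted labels for eight respondents.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case there are four true positives, two true negatives, one false positive, and one false negative, so accuracy is 0.75 while precision, recall, and F1 are all 0.8.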

5.3. Interpretable Machine-Learning Techniques

We used the SHAP algorithm to interpret the results of the machine-learning models. The SHAP algorithm is an XAI technique for interpreting machine learning by evaluating the importance of each variable’s features in the machine-learning predictions [25,26,27,53].
The Shapley values of feature j are calculated as follows [25,54,55]:
$$\phi_j = \sum_{S \subseteq \{x_1, \ldots, x_p\} \setminus \{x_j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left( val(S \cup \{x_j\}) - val(S) \right)$$
$\phi_j$: Feature contribution
$S$: Subset of features used in the model
$x$: Vector of the feature values of the observations
$p$: Number of features
SHAP reliably interprets machine-learning models by calculating Shapley values for each input variable. SHAP then uses those Shapley values to graphically display the importance of each feature in determining the predictive performance, allowing users to visually check the Shapley values of any specific input variable [54,55].
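To make the Shapley formula concrete, the sketch below evaluates it exactly by enumerating every subset for a toy value function. The feature names and payoffs are hypothetical, and this brute-force enumeration is only practical for a handful of features; SHAP implementations rely on faster approximations or model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, val):
    """Exact Shapley values for a value function val(frozenset) -> float."""
    p = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(p):  # subset sizes 0 .. p-1
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature j when added to subset s.
                total += weight * (val(s | {j}) - val(s))
        phi[j] = total
    return phi


def toy_val(s):
    """Hypothetical payoff: two additive effects plus one interaction."""
    payoff = 0.0
    if "sidewalk" in s:
        payoff += 2.0
    if "greenery" in s:
        payoff += 1.0
    if "sidewalk" in s and "greenery" in s:
        payoff += 1.0
    return payoff


phi = shapley_values(["sidewalk", "greenery", "road"], toy_val)
```

The interaction payoff is split equally between the two interacting features, the feature that never changes the payoff receives zero, and the contributions sum to the payoff of the full feature set.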

6. Results

6.1. Machine-Learning Models

We used 80% of the total data for training and 20% for testing. The results for each machine-learning model are shown in Table 3. We compared the performance evaluation indicators of the three models (logistic regression, random forest, and XGBoost) on the test data and selected the XGBoost model for this study because it performed best on all the indicators.

6.2. Interpretation of the Machine-Learning Model

6.2.1. Analysis of the Street Environment Characteristics That Affect Pedestrian Satisfaction

The results of the SHAP analysis are shown in Figure 14 and Table 4. The x-axis of the SHAP value plot indicates the importance of each variable as calculated by its Shapley value, such that the greater the absolute value, the greater the degree of contribution to the prediction [55]. In Figure 14, a red bar indicates a positive effect (+) on walking satisfaction, and a blue bar indicates a negative effect (−) on walking satisfaction.
The SHAP value plot ranks the importance of each variable to the performance of the machine-learning model across the entire data set. However, it does not show how each variable affects the model performance, so we use the SHAP summary plot to express that information in terms of direction and size. The SHAP summary plot is arranged in descending order of Shapley values, similar to the SHAP value plot. Relative to 0.0 on the x-axis, points on the right contribute positively to the model performance, and points on the left contribute negatively. In addition, red indicates a high feature value, and blue indicates a low feature value. For the dummy variables, 1 (yes) is represented in red, and 0 (no) is represented in blue. Each point constituting the plot represents the response of one person. When several points fall at the same place on the x-axis, they appear densely stacked. Long tails of individual dots gathered to the right or left indicate that extreme measurements can be important to a particular individual [54,55].
In summary, the SHAP value plot ranks the impact that each variable has on the performance of the model and whether its effects are approximately positive or negative. The SHAP summary plot shows how and which values specifically affect each variable in terms of direction and size. Because our focus is to determine the physical characteristics of a satisfying walking environment, we analyzed the SHAP summary plot to find the positive Shapley values, which are on the right side of the plot.
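A feature ranking of this kind is commonly obtained by averaging the absolute Shapley values of each feature over all observations. Whether Figure 14 was built exactly this way is an assumption, and the feature names and values below are made up for illustration.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute Shapley value over all observations."""
    n = len(shap_rows)
    means = {
        name: sum(abs(row[i]) for row in shap_rows) / n
        for i, name in enumerate(feature_names)
    }
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical Shapley values: one row per respondent, one column per feature.
feature_names = ["road_ratio", "sidewalk_ratio", "crosswalk"]
shap_rows = [
    [-0.4, 0.3, 0.05],
    [0.2, 0.6, -0.05],
    [-0.3, 0.3, 0.0],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling out, which is why the direction of each effect has to be read separately from the summary plot.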
Our analysis showed that all the visual features (the urban design qualities and the area ratios) significantly affected the model’s predictive performance more than any physical features. Among the visual features, the proportions of the road and street furniture, enclosure, complexity, and openness had a negative effect on walking satisfaction, whereas the proportions of the sidewalk, buildings, and greenery had a positive effect. For a better understanding of the results, we attached the example image of the street with the high or low pedestrian satisfaction score in Figure 15.
For the proportion of the road, the red dots appear predominantly on the left side of the SHAP summary plot, indicating that a wide road negatively affects walking satisfaction. In contrast, for the proportion of sidewalks, the red dots appear predominantly on the right side of the plot and create a long tail, indicating that a large sidewalk area has a very positive effect on pedestrian satisfaction. For complexity and openness, the high and low values are mixed, but they seem to have a negative effect overall.
In terms of feature importance, the physical elements immediately following the visual elements are the width of the sidewalk and the number of lanes, with a wide sidewalk and a small number of lanes positively affecting walking satisfaction. Among the land-use variables, semi-industrial areas and class Ⅱ residential areas ranked relatively high and negatively affected walking satisfaction. People particularly disliked walking in semi-industrial areas. In addition, the results indicate that a satisfying walking environment should have crosswalks and no slope. Among the road types, pedestrian-only lanes appear to positively affect walking satisfaction, whereas mixed-use lanes and car-only lanes do not seem to have much influence on the predictive power of the model.
The most important individual characteristic variables were, in order, the purpose of passage, the purpose of commute, daily visits, and visits 3–5 times a week. The purpose of passage positively affected walking satisfaction, whereas the purpose of commute, daily visits, and visits 3–5 times a week negatively affected it. Excluding the two types of visitors, individual characteristics seemed to have little effect on the model’s performance. Among the pedestrian occupations, only student was among the top 20 factors that influence walking satisfaction.
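As a rough illustration of how the two plots in Figure 14 are read, the sketch below ranks features by their mean absolute Shapley value (the idea behind the SHAP value plot) and checks whether high feature values tend to go with positive Shapley values (red dots on the right of the summary plot). It is a minimal sketch in plain Python: the feature names echo the text, but the numbers, the helper names `mean_abs` and `direction`, and the use of a Pearson correlation as a direction indicator are assumptions made for illustration, not the authors' data or procedure.

```python
from math import sqrt

# Hypothetical toy data: for each feature, (feature value, Shapley value) pairs,
# one pair per respondent. All numbers are invented for illustration.
samples = {
    "proportion_of_road":     [(0.05, 0.20), (0.10, 0.10), (0.20, -0.25), (0.25, -0.30)],
    "proportion_of_sidewalk": [(0.01, -0.15), (0.03, -0.05), (0.08, 0.20), (0.15, 0.35)],
    "slope":                  [(0, 0.02), (0, 0.03), (1, -0.04), (1, -0.03)],
}

def mean_abs(pairs):
    """Global importance: mean absolute Shapley value of one feature."""
    return sum(abs(phi) for _, phi in pairs) / len(pairs)

def direction(pairs):
    """Pearson correlation between feature value and Shapley value.
    > 0: high values push toward 'satisfied'; < 0: toward 'not satisfied'."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    mp = sum(p for _, p in pairs) / n
    cov = sum((x - mx) * (p - mp) for x, p in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sp = sqrt(sum((p - mp) ** 2 for _, p in pairs))
    return cov / (sx * sp)

# Rank features from most to least important.
ranking = sorted(samples, key=lambda name: mean_abs(samples[name]), reverse=True)
for name in ranking:
    sign = "positive" if direction(samples[name]) > 0 else "negative"
    print(f"{name}: importance={mean_abs(samples[name]):.4f}, direction={sign}")
```

A single correlation cannot capture the nonlinear patterns discussed in the next subsection, which is why the dependence plots are examined feature by feature.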

6.2.2. Analysis of the Relationship between Walking Satisfaction and Visual Features

Because the previous analysis could not clearly reveal the detailed and nonlinear relationships between street environment characteristics and walking satisfaction, we used a SHAP dependence plot (Table 5) to investigate the effects of visual factors on walking satisfaction. The SHAP dependence plot shows the interaction effect of each variable. The x-axis represents the numerical value of the variable, and the y-axis represents the Shapley value. In addition, because negative numbers indicate dissatisfaction with the walking environment, and positive numbers indicate satisfaction with the walking environment, the value of the street environment factor for a satisfying walking environment can be calculated. The vertically wide distribution of Shapley values for each value in the plot indicates that walking satisfaction varies from person to person [55].
In this analysis, most of the visual features showed a nonlinear relationship with walking satisfaction. For the proportion of the road, the Shapley values are mostly positive below 16% and become negative above that value, indicating that a wide road negatively affects walking satisfaction.
The Shapley values for the proportion of the sidewalk showed a positive correlation with walking satisfaction overall. In addition, on streets where the proportion of the sidewalk was 7% or more, almost all respondents were satisfied with the walking environment. This indicates that a wide sidewalk is necessary to create a highly satisfying walking environment.
The Shapley values for the proportion of street furniture are mostly positive, indicating a positive effect on walking satisfaction. However, when the area ratio is between about 0.02 and 0.06 (2–6%), the Shapley values are negative, indicating that pedestrians do not prefer that proportion. The highest satisfaction appeared when the proportion of street furniture was 10% or more, close to the maximum in the input data.
Enclosure showed a negative correlation until about 0.4 and then showed a positive correlation thereafter. In addition, all Shapley values with more than 70% enclosure were positive, indicating that they positively affected walking satisfaction. The enclosure is the sum of vertical elements that make up the streetscape, so people prefer streets with many things to see.
For complexity, values below 1.4 had mostly positive Shapley values, whereas values above 1.4 showed a mixture of positive and negative Shapley values. Therefore, people tend to feel dissatisfied on streets where the complexity is too high. However, at complexity values above 1.5, the points on the plot are spread vertically across both positive and negative values. Thus, even on streets with the same complexity, satisfaction and dissatisfaction with the walking environment vary from person to person.
Greenery showed a positive correlation overall, indicating that walking satisfaction increases as more greenery is visible. Streets without any greenery had a very negative effect on walking satisfaction. Particularly because most of the negative Shapley values are from streets with less than 1% greenery, it is clear that green space must be present for a street to satisfy pedestrians.
The proportion of buildings had the lowest Shapley values when no buildings were visible in the street image. Furthermore, most Shapley values were positive, so it had a positive effect on walking satisfaction. However, at about 90%, which is the maximum in the input data, the Shapley values fell from −0.3 to −0.5. Thus, pedestrians do not prefer streets with no buildings or with buildings that occupy about 90% of the visual landscape.
Similar to complexity, openness showed a pattern in which vertically long Shapley values appear at the same values, indicating that the satisfaction felt at the same openness varied from person to person. In particular, when the area of the sky was less than about 0.15, walking satisfaction varied widely by person. With about 0.35 or more openness, all Shapley values became negative. Compared with other values, openness had very low Shapley values, so when pedestrians saw too much sky, walking satisfaction was quite low. Due to the extremely low Shapley values in this section, it can be presumed that in the previous SHAP value plot, openness was expressed as a blue bar that negatively affects walking satisfaction. However, at values between 0.15 and 0.33, most Shapley values were positive, indicating that people do not prefer too much or too little sky; they prefer streets with a moderate amount of sky. In other words, as previously stated, people prefer streets with suitable sights.
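The reading of the dependence plots above (where the Shapley values turn from positive to negative, and how widely they spread at a given feature value) can be approximated numerically by binning. The following is a minimal sketch in plain Python under stated assumptions: the `pairs` data are invented to mimic a sign change near 0.16, and the bin edges and the helper `bin_summary` are hypothetical choices, not part of the study.

```python
from statistics import mean, pstdev

# Hypothetical (feature value, Shapley value) pairs for one feature, e.g. the
# proportion of the road. Invented numbers with a sign change near 0.16.
pairs = [
    (0.04, 0.18), (0.07, 0.12), (0.09, 0.15), (0.12, 0.05), (0.14, 0.08),
    (0.17, -0.06), (0.19, -0.20), (0.21, -0.10), (0.24, -0.28), (0.26, -0.22),
]

def bin_summary(pairs, edges):
    """Group pairs into [edges[i], edges[i+1]) bins and summarise each bin."""
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        phis = [phi for x, phi in pairs if lo <= x < hi]
        if phis:  # skip empty bins
            rows.append({"lo": lo, "hi": hi, "mean": mean(phis),
                         "spread": pstdev(phis), "n": len(phis)})
    return rows

edges = [0.00, 0.08, 0.16, 0.24, 0.32]
summary = bin_summary(pairs, edges)
for row in summary:
    label = "satisfying" if row["mean"] > 0 else "dissatisfying"
    print(f"[{row['lo']:.2f}, {row['hi']:.2f}): mean SHAP={row['mean']:+.3f}, "
          f"spread={row['spread']:.3f}, n={row['n']} -> {label}")

# First bin whose mean Shapley value is negative: a crude estimate of the
# feature value at which the effect on satisfaction turns negative.
turning_bin = next(row for row in summary if row["mean"] < 0)
print("effect turns negative from about", turning_bin["lo"])
```

The per-bin spread plays the role of the vertical width of the point cloud: a large spread at a given value corresponds to satisfaction that varies from person to person, as described for complexity and openness.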

7. Discussion

In this study, we used a machine-learning model and an XAI technique to investigate which physical characteristics of the streetscape influence walking satisfaction. We found that all the visual features (the urban design quality factors and the area ratios of the street environment) were more important than any of the physical features. Therefore, the satisfaction that pedestrians feel while walking on the street is strongly influenced by visual input.
In particular, the proportions of the road, sidewalk, and street furniture had the greatest influence on model performance, followed by enclosure, complexity, greenery, the proportion of buildings, and openness. Among the physical features, the width of the sidewalk had the largest positive effect, followed by the number of lanes, semi-industrial area, and Class Ⅱ residential area, which all negatively affected the model performance. After that, the personal characteristics of the purpose of the passage, commute, and daily visitor followed in importance. Among them, the purpose of commute and daily visitors negatively affected walking satisfaction, and the purpose of passage had a positive effect.
In the SHAP rankings for their degree of impact on the performance of the machine-learning model, the proportions of road, sidewalk, and street furniture were the most important. In other words, the biggest influence on the satisfaction of pedestrians is the floor surface of the streetscape, which is directly related to walking [34].
The streetscape is defined and formed by vertical elements that obstruct people’s gaze [14], and a proper ratio of vertical and horizontal elements provides a comfortable feeling for pedestrians [49]. The fact that enclosure and complexity appeared immediately after the floor surface indicates that the vertical elements and visual diversity of the landscape are important for walking satisfaction.
Most of the visual elements in the streetscape had a nonlinear relationship with walking satisfaction. People generally preferred streets with a low proportion of road, and the walking satisfaction score rose with the proportion of sidewalks. In other words, to create a satisfying walking environment, wider sidewalks are better. The fact that a low road area and high sidewalk area both positively affected walking satisfaction indicates that a large number of vehicles on the road increases the difficulty of creating a comfortable walking environment [56]. In addition, people preferred pedestrian-only streets, so road-oriented urban planning should be avoided, and safe and comfortable walking environments should be created instead.
Once street furniture was present, most of the Shapley values for its proportion were positive, indicating that the presence of street facilities positively affects pedestrian satisfaction, but streets with a street furniture area ratio of 3 to 5% were not preferred.
Enclosure had a negative effect on walking satisfaction overall, but streets with an enclosure of 0.7 or higher were found to be satisfactory for walking.
In terms of complexity and openness, satisfaction and dissatisfaction tended to vary widely from person to person at the same street points. In general, people felt dissatisfied in streets with a complexity of 1.5 or higher, and overall, low complexity had a positive effect on walking satisfaction. In addition, people do not prefer streets where they can see too much or too little sky, with most pedestrians reporting satisfaction on streets with a middle value of openness. Therefore, people prefer streets with a moderate number of things to see.
The percentage of green space was a major factor in improving satisfaction with the pedestrian environment. Greenery is the most effective factor in improving the quality of the street environment [50], with a larger area of green space correlating with better aesthetics in the landscape. The green area ratio showed a positive correlation with satisfaction as a whole, but negative Shapley values were found when the green area ratio was less than about 1%. Thus, any amount of green space positively affects walking satisfaction, with walking satisfaction increasing along with the percentage of green space.
For the proportion of buildings, the most negative effect was seen when buildings were completely missing from the view. In general, a higher area ratio of buildings correlated with a positive influence on walking satisfaction, but the maximum area ratio in the input data, 90%, had a negative effect. The fact that most high building area ratios positively affected pedestrian satisfaction is consistent with the results of a previous study showing that the amount of walking increased as the number of buildings and the building coverage ratio increased [33].
Our results make clear that, when dealing with the streetscape and pedestrian environments, the visual features should be considered along with the physical features that constitute the streetscape. To create a satisfying walking environment, a low proportion of road, low enclosure, and a small number of lanes are necessary, together with abundant greenery, a high proportion of buildings, and wide sidewalks. For the proportion of street furniture, complexity, and openness, it is necessary to estimate the values that yield optimal walking satisfaction. Pedestrians did not prefer walking in semi-industrial areas or class Ⅱ residential areas. Among the types of road, pedestrians preferred pedestrian-only lanes and streets with no slope and a crosswalk. Among the personal characteristics, the walking satisfaction of pedestrians who visited each day as part of their commute was negative, whereas the purpose of passage affected walking satisfaction positively.

8. Conclusions

This study measured the constituent elements of the streetscape from the viewpoint of visual cognitive characteristics and examined their relationships with walking satisfaction. In the process, visual data were constructed using computer vision techniques. Then, we used machine-learning prediction models and the SHAP algorithm, an XAI technique, to analyze the complex interactions among the variables with nonlinear relationships.
We found that all the visual features in the streetscape (the urban design quality factors and area ratios of the street environmental factors) had greater effects on the machine-learning model performance than any of the physical features. Thus, for pedestrians to feel satisfied with their walking environment, the composition of the visible elements is more important than the physical elements on the street. To build a street for optimal pedestrian satisfaction, it is necessary to view the street environment as an overall image and manage the composition ratio of each element.
This study has shown the potential for using computer vision techniques in urban design. In particular, the results suggest that a satisfying pedestrian environment can be created through the design of the streetscape. Therefore, urban design could be used to improve pedestrian environments, and our results can be used as basic data to improve the quality of street landscapes.
However, this study has several limitations. First, because the street view image is a 360° panoramic image taken by a vehicle in the middle of the lane, the viewpoints of the pedestrian and the street view image do not match. Second, the 360° panoramic image is distorted, so the measured area ratio could be lower than the actual ratio. Nevertheless, the fact that the proportion of the sidewalk has the second greatest influence on pedestrian satisfaction means that the sidewalk is just as important. Third, we borrowed some of the models from Ewing and Handy [14] but did not use all of the urban design qualities they proposed. Therefore, further research is needed to examine which factors of the streetscape affect pedestrian satisfaction by individual characteristics and to find various ways to create a pedestrian environment with high satisfaction.

Author Contributions

Conceptualization, J.L. and J.P.; methodology, J.L. and D.K.; software, J.L.; validation, J.P.; formal analysis, J.L. and J.P.; investigation, J.L. and J.P.; resources, D.K. and J.P.; data curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, D.K. and J.P.; visualization, J.L. and D.K.; supervision, J.P.; project administration, J.P.; funding acquisition, J.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A2C1088467).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Suh, J.; Choi, Y. Research on the Visual Cognitivity of Urban Plaza—Focused on preference and complexity. Korea Soc. Des. Trend 2012, 34, 197–206.
  2. Lim, S. Environmental Psychology and Human Behavior: Human-Friendly Environment Design; Bomoondang: Seoul, Korea, 2007.
  3. Kandt, J.; Batty, M. Smart cities, big data and urban policy: Towards urban analytics for the long run. Cities 2021, 109, 102992.
  4. Gong, Z.; Ma, Q.; Kan, C.; Qi, Q. Classifying Street Spaces with Street View Images for a Spatial Indicator of Urban Functions. Sustainability 2019, 11, 6424.
  5. Ibrahim, M.R.; Haworth, J.; Cheng, T. Understanding cities with machine eyes: A review of deep computer vision in urban analytics. Cities 2020, 96, 102481.
  6. Ma, X.; Ma, C.; Wu, C.; Xi, Y.; Yang, R.; Peng, N.; Zhang, C.; Ren, F. Measuring human perceptions of streetscapes to better inform urban renewal: A perspective of scene semantic parsing. Cities 2021, 110, 103086.
  7. Meng, L.; Wen, K.-H.; Zeng, Z.; Brewin, R.; Fan, X.; Wu, Q. The Impact of Street Space Perception Factors on Elderly Health in High-Density Cities in Macau—Analysis Based on Street View Images and Deep Learning Technology. Sustainability 2020, 12, 1799.
  8. Choi, I.J.; Jo, H.D. A Study on the Form-Element of Buildings Affecting in Street Spaces. J. Korean Assoc. Geogr. Inf. Stud. 2010, 13, 16–27.
  9. Kim, J.G. A Study on Analysis of Recognition and Preference in Urban Landscape—A Quantatative Experimental Analysis for Subjected Streetscapes. J. Korean Soc. Civ. Eng. 2005, D25, 305–309.
  10. Proshansky, H.M. The city and self-identity. Environ. Behav. 1978, 10, 147–169.
  11. Lee, I. Development of Pedestrian Path-Choice Model in Urban Residential Area—Comparison of Importance, Satisfaction, and Environmental Tradeoff Models. J. Urban Des. Inst. Korea Urban Des. 2000, 1, 63–78.
  12. Hahm, Y.; Yoon, H.; Choi, Y. The effect of built environments on the walking and shopping behaviors of pedestrians; a study with GPS experiment in Sinchon retail district in Seoul, South Korea. Cities 2019, 89, 1–13.
  13. Lee, S.; Han, M.; Rhee, K.; Bae, B. Identification of Factors Affecting Pedestrian Satisfaction toward Land Use and Street Type. Sustainability 2021, 13, 10725.
  14. Ewing, R.; Handy, S. Measuring the Unmeasurable: Urban Design Qualities Related to Walkability. J. Urban Des. 2009, 14, 65–84.
  15. Tang, J.; Long, Y. Measuring visual quality of street space and its temporal variation: Methodology and its application in the Hutong area in Beijing. Landsc. Urban Plan. 2019, 191, 103436.
  16. Ernawati, J.; Adhitama, M.S.; Sudarmo, B.S. Urban Design Qualities Related Walkability in a Commercial Neighbourhood. Environ. Behav. Proc. J. 2016, 1, 242–250.
  17. Müller, A.C.; Guido, S. Introduction to Machine Learning with Python: A Guide for Data Scientists, 1st ed.; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2016.
  18. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science & Business Media: New York, NY, USA, 2009.
  19. Bzdok, D.; Altman, N.; Krzywinski, M. Statistics versus machine learning. Nat. Methods 2018, 15, 233–234.
  20. Yoo, J.-E. Predictor exploration via group lasso: Focusing on middle school students’ life satisfaction. Stud. Korean Youth 2017, 28, 127–149.
  21. Yoo, J.-E.; Rho, M. Predictive Modeling of Students Creativity via Elastic Net. SNU J. Educ. Res. 2018, 27, 185–205.
  22. Suman, R.R.; Mall, R.; Sukumaran, S.; Satpathy, M. Extracting State Models for Black-Box Software Components. J. Object Technol. 2010, 9, 79–103.
  23. Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019, 4, eaay7120.
  24. Lee, Y.-G.; Oh, J.-Y.; Kim, G. Interpretation of Load Forecasting Using Explainable Artificial Intelligence Techniques. Trans. Korean Inst. Electr. Eng. 2020, 69, 480–485.
  25. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. arXiv 2017, arXiv:1705.07874.
  26. Ribeiro, M.T.; Singh, S.; Guestrin, C. Why Should I Trust You? Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
  27. Shapley, L.S. A Value for n-Person Games. In Contributions to the Theory of Games (AM-28); Kuhn, H.W., Tucker, A.W., Eds.; Princeton University Press: Princeton, NJ, USA, 1953; Volume II, pp. 307–318.
  28. Ding, C.; Cao, X.; Næss, P. Applying gradient boosting decision trees to examine non-linear effects of the built environment on driving distance in Oslo. Transp. Res. Part A Policy Pract. 2018, 110, 107–117.
  29. Wieland, R.; Lakes, T.; Nendel, C. Using SHAP to interpret XGBoost predictions of grassland degradation in Xilingol, China. Geosci. Model Dev. 2020, 1–28.
  30. Chen, L.; Yao, X.; Liu, Y.; Zhu, Y.; Chen, W.; Zhao, X.; Chi, T. Measuring Impacts of Urban Environmental Elements on Housing Prices Based on Multisource Data—A Case Study of Shanghai, China. ISPRS Int. J. Geo-Inf. 2020, 9, 106.
  31. Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405.
  32. Kum, K.-J.; Jang, H.-Y.; Son, S.-N.; Hwang, Y.-J. Analysis of Pedestrian-Streetscape Image in Commercial District Using Structural Equation Model. J. Korea Plan. Assoc. 2010, 45, 97–109.
  33. Yun, N.Y.; Choi, C.G. Relationship between Pedestrian Volume and Pedestrian Environmental Factors on the Commercial Streets in Seoul. J. Korea Plan. Assoc. 2013, 48, 135–150.
  34. Kim, K.-R.; Lee, J.-S. Pedestrian Cognition and Satisfaction on the Physical Elements in Pedestrian Space. J. Urban Des. Inst. Korea Urban Des. 2016, 17, 89–103.
  35. Hu, L.; He, S.; Han, Z.; Xiao, H.; Su, S.; Weng, M.; Cai, Z. Monitoring housing rental prices based on social media: An integrated approach of machine-learning algorithms and hedonic modeling to inform equitable housing policies. Land Use Policy 2019, 82, 657–673.
  36. Lu, Y.; Sarkar, C.; Xiao, Y. The effect of street-level greenery on walking behavior: Evidence from Hong Kong. Soc. Sci. Med. 2018, 208, 41–49.
  37. Yin, L.; Wang, Z. Measuring visual enclosure for street walkability: Using machine learning algorithms and Google Street View imagery. Appl. Geogr. 2016, 76, 147–153.
  38. Zeng, L.; Lu, J.; Li, W.; Li, Y. A fast approach for large-scale Sky View Factor estimation using street view images. Build. Environ. 2018, 135, 74–84.
  39. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364.
  40. Gunawardena, G.M.W.L. Evaluation of Streetscape Complexity Created by Streestscape Signage Using Different Objective Analysis Techniques. In Proceedings of the 5th International Conference on Arts and Humanities, Copenhagen, Denmark, 24–27 June 2019; Volume 5, pp. 50–61.
  41. Machado, P.; Romero, J.; Nadal, M.; Santos, A.; Correia, J.; Carballal, A. Computerized measures of visual complexity. Acta Psychol. 2015, 160, 43–57.
  42. Tucker, C.; Ostwald, M.J.; Chalup, S.K. A method for the visual analysis of streetscape character using digital image processing. In Proceedings of the 38th Annual Conference of the Architectural Science Association ANZAScA and the International Building Performance Simulation Association, Contexts of Architecture, Launceston, Australia, 10–12 November 2004; Australia and New Zealand Architectural Science Association: Sydney, Australia; Auckland, New Zealand, 2004; pp. 134–140.
  43. Wang, H.; Duan, J.; Han, X.; Xiao, B. Research on image complexity evaluation method based on color information. In LIDAR Imaging Detection and Target Recognition; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; p. 106051Q.
  44. Lee, J.-Y.; Park, J. The Analysis of the Effect of Visual Information Volume on the Preference of Commercial Streetscape—Using the Computer Vision Techniques. J. Urban Des. Inst. Korea Urban Des. 2020, 21, 75–86.
  45. Jacobs, J. The Death and Life of Great American Cities; Random House: New York, NY, USA, 1961.
  46. Berlyne, D.E. Studies in the New Experimental Aesthetics: Steps toward an Objective Psychology of Aesthetic Appreciation; Hemisphere: New York, NY, USA, 1974.
  47. Rapoport, A. History and Precedent in Environmental Design; Kluwer Academic Publishers/Plenum Press: New York, NY, USA, 1990.
  48. Ewing, R.; Clemente, O. Measuring Urban Design: Metrics for Livable Places; Island Press: Washington, DC, USA, 2013.
  49. Jacobs, A.; Appleyard, D. Toward an Urban Design Manifesto. J. Am. Plan. Assoc. 1987, 53, 112–120.
  50. Wolf, K.L. Business district streetscapes, trees, and consumer response. J. For. 2005, 103, 396–400.
  51. Cao, K.; Guo, H.; Zhang, Y. Comparison of Approaches for Urban Functional Zones Classification Based on Multi-Source Geospatial Data: A Case Study in Yuzhong District, Chongqing, China. Sustainability 2019, 11, 660.
  52. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  53. Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
  54. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67.
  55. Lundberg, S.M.; Erion, G.; Lee, S. Consistent individualized feature attribution for tree ensembles. arXiv 2018, arXiv:1802.03888.
  56. Ji, W.S.; Koo, Y.S.; Jwa, S.H. A Study on Satisfaction for Pedestrian Environment; Gyeonggi Research Institute: Suwon, Korea, 2008; pp. 3–4.
Figure 1. Example extracted Naver Street View image.
Figure 2. Research procedure.
Figure 3. Area extracted by semantic segmentation from the image in Figure 1.
Figure 4. The image from Figure 1 with each object’s edges extracted by edge detection.
Figure 5. The framework of this study.
Figure 6. An example image of enclosure.
Figure 7. An example image of openness.
Figure 8. An example image of greenery.
Figure 9. An example image of complexity.
Figure 10. Example image showing the proportion of buildings.
Figure 11. Example image showing the proportion of the road.
Figure 12. Example image showing the proportion of the sidewalk.
Figure 13. Example image showing the proportion of street furniture.
Figure 14. Result of SHAP analysis: (a) SHAP value plot; (b) SHAP summary plot.
Figure 15. Results of visual analysis: (a) Example image with a high walking satisfaction score (4.05); (b) Example image with a poor walking satisfaction score (1.35).
Table 1. Contents of the Seoul Floating Population Report.

| Survey Point | 1000 representative places in Seoul |
|---|---|
| Survey Target | Pedestrians passing through the representative points |
| Survey Period | Every Friday and Saturday during October 2015 |
| Survey Range | 10 people per point per day |
| Survey Method | Individual interviews using questionnaires |
| Survey Contents | Gender, age, occupation, residence, purpose, frequency of visits, overall satisfaction with the walking environment |

Survey Locations in Seoul: (map image)
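Table 2. Variable definitions and basic statistics (n = 17,960).

| Factor | Element | Item | Mean/Proportion | S.D. | Min. | Max. |
|---|---|---|---|---|---|---|
| Dependent variable | Pedestrian satisfaction | Very dissatisfied / Slightly dissatisfied / Normal / Slightly satisfied / Very satisfied | 2.73 | 0.966 | 1 | 5 |
| | | Not satisfied (ref.) | 35% | - | - | - |
| | | Satisfied | 65% | - | - | - |
| Personal characteristics | Gender | Men (ref.) | 46% | - | - | - |
| | | Women | 54% | - | - | - |
| | Age | 15–19 | 5% | - | - | - |
| | | 20–29 | 19% | - | - | - |
| | | 30–39 | 18% | - | - | - |
| | | 40–49 | 17% | - | - | - |
| | | 50–59 | 23% | - | - | - |
| | | 60+ | 18% | - | - | - |
| | Frequency of passage | First time | 4% | - | - | - |
| | | Every day | 30% | - | - | - |
| | | 1–2 times a week | 19% | - | - | - |
| | | 3–5 times a week | 33% | - | - | - |
| | | Less than twice a month | 14% | - | - | - |
| | Purpose of visit | Commute | 22% | - | - | - |
| | | Personal | 37% | - | - | - |
| | | Work or study | 19% | - | - | - |
| | | Leisure | 14% | - | - | - |
| | | Passage | 8% | - | - | - |
| | Job | Student | 14% | - | - | - |
| | | Housewife | 22% | - | - | - |
| | | White-collar | 27% | - | - | - |
| | | Blue-collar | 7% | - | - | - |
| | | Sales and service | 11% | - | - | - |
| | | Self-employment | 11% | - | - | - |
| | | Other | 8% | - | - | - |
| Physical features | Width of the sidewalk | Width of the sidewalk (m) | 4.12 | 2.312 | 1 | 24 |
| | The number of lanes | The number of lanes | 3.93 | 2.637 | 1 | 18 |
| | Bus road | Bus road: No (ref.) | 30% | - | - | - |
| | | Bus road: Yes | 70% | - | - | - |
| | Centerline | Centerline: No (ref.) | 29% | - | - | - |
| | | Centerline: Yes | 71% | - | - | - |
| | Street furniture | Street furniture: No (ref.) | 8% | - | - | - |
| | | Street furniture: Yes | 92% | - | - | - |
| | Road type | Car only | 7% | - | - | - |
| | | Pedestrian only | 70% | - | - | - |
| | | Mixed-use | 23% | - | - | - |
| | Braille block | Braille block: No (ref.) | 64% | - | - | - |
| | | Braille block: Yes | 36% | - | - | - |
| | Slope | Slope: No (ref.) | 77% | - | - | - |
| | | Slope: Yes | 23% | - | - | - |
| | Fence | Pedestrian safety fence: No (ref.) | 79% | - | - | - |
| | | Pedestrian safety fence: Yes | 21% | - | - | - |
| | Bus stop | Bus stop: No (ref.) | 64% | - | - | - |
| | | Bus stop: Yes | 36% | - | - | - |
| | Subway station | Subway station: No (ref.) | 65% | - | - | - |
| | | Subway station: Yes | 35% | - | - | - |
| | Crosswalk | Crosswalk: No (ref.) | 39% | - | - | - |
| | | Crosswalk: Yes | 61% | - | - | - |
| | Land use | Commercial area | 18% | - | - | - |
| | | Green area | 3% | - | - | - |
| | | Semi-residential area | 4% | - | - | - |
| | | Semi-industrial area | 5% | - | - | - |
| | | Class Ⅰ residential area | 6% | - | - | - |
| | | Class Ⅱ residential area | 30% | - | - | - |
| | | Class Ⅲ residential area | 33% | - | - | - |
| Visual features | Urban design qualities | Enclosure: sum of the area ratios of buildings and street trees | 0.549 | 0.153 | 0.145 | 0.912 |
| | | Openness: the area ratio of the sky | 0.208 | 0.096 | 0 | 0.539 |
| | | Greenery: sum of planting area ratios | 0.208 | 0.096 | 0 | 0.539 |
| | | Complexity: amount of visual information | 1.283 | 0.238 | 0.516 | 2.249 |
| | Area ratio | The proportion of buildings | 0.394 | 0.212 | 0.002 | 0.895 |
| | | The proportion of the road | 0.144 | 0.060 | 0 | 0.273 |
| | | The proportion of the sidewalk | 0.038 | 0.031 | 0 | 0.287 |
| | | The proportion of the street furniture | 0.013 | 0.018 | 0 | 0.160 |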
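The definitions of the visual features in Table 2 (enclosure as the sum of the area ratios of buildings and street trees, openness as the area ratio of the sky, and the per-class area ratios) can be illustrated with a small sketch that counts pixels in a segmentation mask. This is a minimal sketch under stated assumptions: the class labels, the toy 4 × 5 mask, and the helper names `area_ratios` and `visual_features` are hypothetical, greenery is approximated by the tree class alone, and complexity (which relies on edge detection) is left out.

```python
from collections import Counter

# Hypothetical class labels; the label set of the segmentation model used in
# the study is not reproduced here.
SKY, BUILDING, TREE, ROAD, SIDEWALK, FURNITURE, OTHER = range(7)

def area_ratios(mask):
    """Return the share of pixels per class for a 2D list of class labels."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {label: counts.get(label, 0) / total for label in range(7)}

def visual_features(mask):
    """Derive Table 2 style visual features from per-class area ratios."""
    r = area_ratios(mask)
    return {
        "enclosure": r[BUILDING] + r[TREE],  # buildings + street trees
        "openness": r[SKY],                  # area ratio of the sky
        "greenery": r[TREE],                 # assumption: trees are the only planting class here
        "proportion_of_buildings": r[BUILDING],
        "proportion_of_road": r[ROAD],
        "proportion_of_sidewalk": r[SIDEWALK],
        "proportion_of_street_furniture": r[FURNITURE],
    }

# A toy 4 x 5 "segmentation mask" (20 pixels), invented for illustration.
mask = [
    [SKY, SKY, SKY, SKY, TREE],
    [BUILDING, BUILDING, SKY, TREE, TREE],
    [BUILDING, BUILDING, FURNITURE, SIDEWALK, SIDEWALK],
    [ROAD, ROAD, ROAD, ROAD, SIDEWALK],
]
features = visual_features(mask)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Working from pixel shares keeps every feature on the same 0 to 1 scale as the statistics reported in the table.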
Table 3. Analysis results for each machine-learning model.

| Model | Logistic Regression | Random Forest | XGBoost |
|---|---|---|---|
| Accuracy | 0.65 | 0.72 | 0.82 |
| Precision | 0.60 | 0.76 | 0.83 |
| Recall | 0.51 | 0.81 | 0.92 |
| F1 score | 0.53 | 0.79 | 0.87 |
| AUC score | 0.56 | 0.76 | 0.90 |
| ROC curve | (plot) | (plot) | (plot) |
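For readers less familiar with the metrics in Table 3, the sketch below shows how accuracy, precision, recall, and the F1 score follow from the counts of a binary confusion matrix, with "satisfied" treated as the positive class. It is a minimal sketch: the counts are hypothetical and are not taken from the study, the function name `classification_metrics` is an illustrative choice, and the AUC is omitted because it requires predicted probabilities rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive ('satisfied') class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical confusion-matrix counts (not taken from the study).
m = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because the F1 score is the harmonic mean of precision and recall, it is pulled toward the lower of the two, which makes it a stricter summary than accuracy when the classes are imbalanced.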
Table 4. Shapley values for each of the top 20 values.

| Variable | SHAP Value | Variable | SHAP Value |
|---|---|---|---|
| The proportion of the road | 0.1551 | Semi industrial area | 0.0493 |
| The proportion of the sidewalk | 0.1475 | Class Ⅱ residential area | 0.0409 |
| The proportion of the street furniture | 0.1412 | Purpose of passage | 0.0356 |
| Enclosure | 0.1399 | Purpose of commute | 0.0316 |
| Complexity | 0.1311 | Visit every day | 0.0315 |
| Greenery | 0.1252 | Pedestrian-only road | 0.0276 |
| Proportion of buildings | 0.1157 | Slope | 0.0266 |
| Openness | 0.0949 | Crosswalk | 0.0247 |
| Width of the sidewalk | 0.0792 | Student | 0.0236 |
| The number of lanes | 0.0636 | Visit 3–5 times a week | 0.0255 |
Table 5. SHAP dependence plot.
Panels (one plot each): The Proportion of the Road; The Proportion of the Sidewalk; Complexity; Greenery; The Proportion of Street Furniture; Enclosure; The Proportion of Buildings; Openness.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lee, J.; Kim, D.; Park, J. A Machine Learning and Computer Vision Study of the Environmental Characteristics of Streetscapes That Affect Pedestrian Satisfaction. Sustainability 2022, 14, 5730. https://doi.org/10.3390/su14095730
