Review

Social Media Image and Computer Vision Method Application in Landscape Studies: A Systematic Literature Review

by Ruochen Ma and Katsunori Furuya *
Graduate School of Horticulture, Chiba University, Chiba 271-8510, Japan
* Author to whom correspondence should be addressed.
Land 2024, 13(2), 181; https://doi.org/10.3390/land13020181
Submission received: 6 January 2024 / Revised: 1 February 2024 / Accepted: 1 February 2024 / Published: 3 February 2024

Abstract

This study systematically reviews 55 landscape studies that use computer vision methods to interpret social media images and summarizes their spatiotemporal distribution, research themes, method trends, platform and data selection, and limitations. The results reveal that over the past six years, social media–based landscape studies have moved from an exploratory period into a refined and diversified phase of automatic visual image analysis, driven by the rapid development of machine learning. Efficiently processing large samples of crowdsourced images while accurately interpreting image content with the help of text content and metadata will be the main topic of the next stage of research. Finally, this study proposes a development framework based on existing gaps in four aspects, namely image data, social media platforms, computer vision methods, and ethics, to provide a reference for future research.

1. Introduction

A landscape is a boundary object at the intersection of multiple disciplines, scales, and theories [1]. Landscape knowledge spans the material and the immaterial, and the natural and human sciences; moreover, landscapes hold multiple values for contemporary society [2,3]. The signing of the European Landscape Convention (ELC) in 2000 enhanced people’s awareness of the landscape [4], and the convention played a crucial role in integrating landscapes, cultural identity, and governance [1,5]. The ELC defines a landscape as “an area, as perceived by people, the character of which is the result of the action and interaction of natural and/or human factors” [6]. Mueller et al. [7] classify landscapes into urban, peri-urban, and rural or agricultural landscapes, as well as natural, semi-natural, and cultural landscapes. A landscape’s ability to provide specific ecosystem services is a critical consideration in maintaining and improving regional and human well-being [8].
The insights and technological innovations generated by landscape studies help stakeholders understand, monitor, and manage landscapes and make appropriate decisions based on regional conditions [9]. The ELC emphasizes the importance of public participation in landscape management; accordingly, investigating the public’s subjective perception is an important landscape management approach [10]. Studies in this field typically collect data through questionnaire surveys [11], interviews [12], participatory geographic information systems [13], and similar methods, which incur substantial temporal and financial costs and draw on only limited data sources [14,15]. In this situation, the advent of social media, an interactive digital media technology based on Web 2.0, has opened new avenues for landscape studies [16,17]. Social media provides volunteered geographic information drawn from broad sources, with large sample sizes and high spontaneity, thereby overcoming certain spatial and temporal limitations of conventional methods [16,18,19,20]. Wilkins et al. [14] and Ghermandi [21] concluded that social media data can serve as an effective proxy for visits to landscape areas. In addition, current visitors share experiences and opinions about destinations on social media, thereby influencing the views of potential visitors [22]. Social media data can also be a viable option for estimating travel and destination demand [23]. In recent years, researchers have analyzed issues associated with landscape characteristics, user satisfaction, activity types, and movement patterns based on the metadata, texts, and images uploaded by social media users [16,24,25].
Images can help visualize landscapes and avoid the misunderstandings caused by text (for example, informal idiomatic expressions and pluralistic languages), thereby building a bridge between researchers and the public [26]. The introduction of photography in the 19th century was a milestone in the history of landscape visualization, and since then, images have gradually become a commonly used and accepted tool in landscape studies [27]. Common image-based approaches, such as questionnaires or the Photo-Elicitation Interview, involve researchers taking or providing photos; another example is Visitor-Employed Photography, in which a small group of participants takes photos for analysis [10]. The drawbacks of these approaches are that the images have limited content richness and are susceptible to personal bias [28]. By leveraging social media, researchers can access rich image data voluntarily shared by users without relying on assumed landscape preferences [18]. However, when trying to understand the image content of a large, crowdsourced dataset, manual coding is time-consuming and potentially introduces researcher bias [14,29]. Therefore, the current research trend is to use artificial intelligence for automated image content analysis [30]. Artificial intelligence algorithms have evolved to the point where they can process data in its natural form, so specific algorithms can quickly analyze large-scale unstructured data such as text and images [31]. Researchers mine such unstructured datasets from platforms such as social media to generate insights about mass behavior and thought [32]. Computer vision refers to a computer’s comprehension of visual information, modeled after the human visual system, through three steps: feature extraction, processing, and semantic information generation [33]. Today, state-of-the-art computer vision tools help people comprehend landscapes by categorizing pictures by content, recognizing captured objects, and so on [34].
Therefore, the current approach of using computer vision to interpret social media images in the landscape domain appears to overcome the shortcomings of both small-sample data and manual analysis simultaneously and is expected to become a research trend; its implementation details and application potential are noteworthy. However, because this approach is new, the number of papers published to date that adopt it is relatively small. Yang and Liu [16] summarized it as an important research direction in their review of social media data studies on urban landscapes, focusing on deep learning tasks and frameworks. However, a more systematic, in-depth review of studies using this approach is lacking in the current literature, particularly across broad landscape types and scales. Meanwhile, given the rapid development of relevant technologies, the potential challenges of this advanced yet evolving approach should be considered. Therefore, this study reviews the relevant literature, analyzes the short-term progress and roles of a system that uses computer vision as a research tool and social media images as data sources in landscape research, summarizes the gaps and potential biases of current studies, and predicts future trends. This study focuses on the following four questions:
  • What are the spatial and temporal distributions and research themes for these studies?
  • Which computer vision methods are used in these studies? What are the trends and purposes of their use?
  • Which social media platforms are used by relevant studies? How do these studies use other data provided by social media while analyzing images?
  • What are the limitations of these studies?
Based on the results of the quantitative review, this study suggests future directions for improvement and optimizes the research framework for this system, thereby providing guidance for the effective application of this advanced methodology in future studies.

2. Materials and Methods

Figure 1 shows the workflow of this study, which involves four steps: data collection, data processing, data analysis and visualization, and interpretation.

2.1. Literature Search and Screening

This study used the Web of Science (WoS) Core Collection, a database that covers academic literature in more than 250 research fields and enables landscape researchers to search across disciplinary boundaries, as the source of information. It provides convenient citation indexing and academic influence evaluation functions, allowing users to access and analyze various literature data [35]. This study conducted a systematic review according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework [36]. We used TS = ((“landscape”) AND (“social media” OR “crowdsourced” OR “crowdsourcing” OR “crowdsource”) AND (“computer vision” OR “machine learning” OR “deep learning”)) as the query for an advanced search. The search terms were divided into three parts: “landscape” defined the theme; “social media” and three words expressing the “crowdsource” concept determined the data source; and words related to “computer vision” clarified the data analysis methods. “Machine learning” refers to algorithms that can solve tasks without being explicitly programmed by human developers [37]. In computer vision, machine learning plays a significant role in extracting information from images [38]. “Deep learning” denotes modern neural networks, a sub-branch of machine learning that uses multiple layers to learn features of data at multiple levels of abstraction [37,39]. The data retrieval date was 1 October 2023, and the database search returned 134 papers. We also added papers (n = 25) that matched the relevant topics but were not captured by the search, for example because their publication dates were too recent for them to have been indexed in the database.
The papers were screened according to the criteria listed in Table 1. First, the titles and abstracts of all records were read, and papers that were not relevant to the research field of this review according to the first three criteria in Table 1 were excluded (n = 95). When compliance with the last two criteria could not be determined from the abstract, we read the complete text and excluded the papers that did not conform to this review’s research objectives (n = 9). Finally, 55 papers satisfied the screening criteria (approximately 35% of the screened articles).
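For reference, the screening arithmetic implied by these counts (assuming the 25 manually added papers entered screening together with the 134 database records) is

$$134 + 25 = 159, \qquad 159 - 95 - 9 = 55, \qquad 55/159 \approx 0.35.$$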

2.2. Literature Review and Data Analysis

We reviewed the complete text of the 55 papers. Based on this review’s four research questions, the following information from each paper was recorded in Microsoft Excel for data arrangement and preliminary analysis (Table 2). The four questions correspond to four broad categories: main characteristics, computer vision, social media data, and limitations. Finally, OriginPro 2021 (OriginLab Corporation, USA) was used for the quantitative analysis and visualization of the data [40].

3. Results

3.1. Main Characteristics of Reviewed Studies

Figure 2 (source: Publication year recorded in Table 2, summarized by the authors of this review) depicts the growth trend of studies using social media images and computer vision in the landscape field as of September 2023. Studies combining computer vision as a research tool with social media images as a data source have been in an exploratory stage for the past six years or more. The number of papers published annually remains below 20 but follows a generally increasing trend.
Figure 3 (source: Study location and Study area recorded in Table 2, summarized by the authors of this review) depicts the spatial distribution of the studies by continent. Many research teams focus on landscape areas within their own countries. Landscape studies in foreign and multinational contexts are now emerging, with research teams from European countries contributing the most of these (64%). On the other continents, studies conducted within the country of the first author’s affiliation are overwhelmingly dominant. Overall, research teams from Asian countries contributed the most studies (Figure 3). The statistics also reveal that China (18%) has the highest total number of publications, followed by South Korea (10%) and the United Kingdom (9%).
Figure 4 (source: Setting and Research theme recorded in Table 2, summarized by the authors of this review) presents a Sankey diagram linking the settings and research themes of the studies. In this figure, “Region” refers to a region within a country that contains multiple cities, including administrative divisions, such as provinces and prefectures, and non-administrative divisions. Research themes are divided into three branches: cultural ecosystem services (CES), urban planning and management, and tourism management. The largest number of studies draws on computer vision to interpret social media images’ content and assess an area’s CES. For example, landscape aesthetic services are assessed using photographs depicting broad and large-scale landscapes; recreational services are assessed through photographs showing people engaging in recreational, social, and sporting activities; and the services of biological observation and interaction are assessed through photographs featuring living things, such as animals and plants [41]. The dominant theme is the comprehensive assessment of these services. In the urban planning and management branch, researchers use crowdsourced images to develop an overall image of a city comprising various imagery elements or to focus on the public perception and use of certain places [42,43]. Distinguishing different urban styles through landscape feature quantification is another important research theme [44]. In the tourism management branch, the visual content analysis of tourism photographs is an effective method for examining tourists’ perceptions of destination images and tapping into tourists’ activities [45]. Figure 4 indicates that most studies take urban parks or urban open spaces as settings (20%), which involve two major themes: CES, and urban planning and management. These are followed by studies on cities (16%) and nature reserves (15%).

3.2. Utilization of Computer Vision in Image Analysis

Table 3 summarizes the computer vision tasks used in the reviewed studies. According to the proportional statistics depicted in Figure 5, image classification and recognition are the most widely used tasks. In this study, image recognition refers to the image recognition function in visual artificial intelligence products developed by commercial providers, such as Google, Microsoft, and Amazon. Researchers use the providers’ Application Programming Interfaces (APIs), pretrained on large datasets, to identify entire scenes and multiple objects, faces, or text in an image and return preset or custom labels [46]. In earlier studies, these labels were usually clustered semantically to clarify the topic trends of the images. For instance, Lee et al. [29] clustered the labels predicted by Clarifai into nine groups using a hierarchical clustering algorithm and extracted two CES-related groups, “Landscape Aesthetics” and “Existence.” Since this approach requires less machine learning experience from researchers and operates differently from other deep learning-based tasks, this study classifies it separately. Google Cloud Vision is the most popular among the commercial service options.
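As a minimal illustration of this label-detection workflow (not the pipeline of any particular reviewed study), the Python sketch below sends one photograph to the Google Cloud Vision API and prints the returned labels with their confidence scores. It assumes the google-cloud-vision client library is installed and API credentials are already configured in the environment; the file name photo.jpg is purely illustrative.

```python
from google.cloud import vision  # pip install google-cloud-vision

# The client reads credentials from the GOOGLE_APPLICATION_CREDENTIALS environment variable.
client = vision.ImageAnnotatorClient()

# Read one downloaded social media photograph (file name is illustrative).
with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Request up to ten machine labels describing the scene and objects in the image.
response = client.label_detection(image=image, max_results=10)

# Each annotation carries a textual label and a confidence score in [0, 1];
# these are the labels that later studies cluster semantically.
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```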
The advent of deep learning and deep neural networks, such as the Convolutional Neural Network (CNN), the Recurrent Neural Network, and the Graph Neural Network (GNN), resulted in tremendous advances in computer vision [47]. Image classification, image segmentation, and object detection are the main tasks applied to social media image data. The goal of image classification is to assign an image to a preset category based on its content. In outdoor research, researchers often use CNN architectures trained on the Places365 dataset to generate the five most likely scenes among 365 scene categories. Xiao et al. [48] used this dataset to classify tourist photographs in Jiangxi Province, China, into scenes, such as mountains, fields, villages, forests, and lakes, and to reveal the spatiotemporal heterogeneity of tourist destination landscapes. ResNet is the most commonly used image classification architecture (Figure 5). Image segmentation and object detection facilitate the identification and localization of multiple targets within an image and are more complex than image classification. In landscape research, image segmentation is often used to determine the proportions of various landscape elements, thereby quantifying the landscape. Qi et al. [49] used the proportion of elements such as vegetation and water to measure the naturalness dimension, and the proportion of elements such as sidewalks and buildings to measure the artificial environment dimension, to describe the urban landscape’s composition and quality. Object detection, in turn, is used to infer visitors’ activities from specific objects. For example, Song et al. [50] treated vehicles, handbags, pets, and bicycles as proxies for wilderness, sightseeing, walking, and cycling activities, respectively, to clarify park use. Facial detection is a branch of object detection, and its role in interpreting people’s emotional states gives it an advantage in analyzing social media images.
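The following sketch shows, under assumed conditions rather than as any reviewed study’s exact setup, how a ResNet-18 pretrained on Places365 can return the top-5 scene labels for one photograph. The checkpoint resnet18_places365.pth.tar and the categories_places365.txt file are assumed to have been downloaded from the public Places365 project release; the input file name is illustrative.

```python
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

# Build a ResNet-18 with 365 output classes and load Places365 weights
# (checkpoint keys in the public release are prefixed with "module.").
model = models.resnet18(num_classes=365)
ckpt = torch.load("resnet18_places365.pth.tar", map_location="cpu")
state = {k.replace("module.", ""): v for k, v in ckpt["state_dict"].items()}
model.load_state_dict(state)
model.eval()

# Category file lines look like "/a/airfield 0"; keep only the scene name.
with open("categories_places365.txt") as f:
    classes = [line.strip().split(" ")[0][3:] for line in f]

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("photo.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = F.softmax(model(img), dim=1).squeeze(0)

# Report the five most likely scene categories, as in the reviewed studies.
top5 = torch.topk(probs, 5)
for p, idx in zip(top5.values.tolist(), top5.indices.tolist()):
    print(f"{classes[idx]}: {p:.3f}")
```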
Image clustering tasks are implemented in two different ways. Before the emergence of deep learning, researchers used image embedding algorithms that extract fixed feature points from an image, such as the Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) [51,52]. The widespread use of CNN architectures in image classification has since led to new image clustering methods that use a pretrained image classification CNN to convert image content into feature vectors that can be processed by machine learning. Kim and Kang [51] used a pretrained CNN to embed each image as a feature vector and combined dimensionality reduction tools and hierarchical clustering to clarify the visual elements attractive to tourists. Such studies confirm the following advantages of image clustering: it does not require classification categories to be established in advance and can improve classification performance on small sample datasets.
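A hedged sketch of the CNN-feature variant is given below: an ImageNet-pretrained ResNet-50 with its classifier removed embeds each photograph as a 2048-dimensional vector, which is then reduced with PCA and grouped by hierarchical (agglomerative) clustering. The photos/*.jpg folder, the 50 retained components, and the choice of 8 clusters are illustrative assumptions, not values taken from the reviewed studies.

```python
import glob
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

# ImageNet-pretrained ResNet-50 used purely as a feature extractor:
# replacing the final fully connected layer with Identity yields 2048-d embeddings.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def embed(paths):
    """Return an (n_images, 2048) matrix of CNN feature vectors."""
    feats = []
    with torch.no_grad():
        for p in paths:
            x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
            feats.append(backbone(x).squeeze(0).numpy())
    return np.vstack(feats)

photo_paths = sorted(glob.glob("photos/*.jpg"))         # illustrative folder of crowdsourced photos
vectors = embed(photo_paths)
reduced = PCA(n_components=50).fit_transform(vectors)   # dimensionality reduction
labels = AgglomerativeClustering(n_clusters=8).fit_predict(reduced)  # hierarchical clustering
print(dict(zip(photo_paths, labels)))
```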
Table 3. Computer vision task categories used in reviewed studies.
Computer Vision Task | Description | Example
Image recognition | A machine learning algorithm to identify objects or scenes in images. Pretrained models from commercial cloud services are often used to add content-relevant machine labels to photographs [30,53]. | [54]
Image classification | Based on the overall information expressed by an image, a neural network is used to assign and label images to the most likely scene categories [16,42]. | [45]
Image clustering | An unsupervised learning method that uses algorithms to cluster semantically similar images by extracting the images’ features and converting them into vectors [34,51]. | [41]
Image segmentation | Based on the image’s semantic features, a neural network is used to segment the image at the pixel level and divide it into subparts or sub-objects [16,55]. | [56]
Object detection | The location and shape of each object in an image are detected using a neural network–based detector that identifies target objects’ bounding boxes or boundaries [34,50]. | [57]
Facial detection | The commercial service provides an accessible API * that takes images as input and outputs the detected face’s attributes (gender, age, and expression) [50]. | [50]
* API = Application Programming Interface.
Table 4 depicts the characteristics of the computer vision models used in the reviewed studies. It is undoubtedly convenient for researchers to use models that are pretrained on large-scale datasets and that already perform image content analysis well in related fields. However, pretrained models have limitations when applied to specific areas: some images may be misidentified or misclassified when they primarily reflect regional features rather than common scenes [58]. Training a model from scratch, on the other hand, requires a large amount of corresponding image data, the collection of which is expensive and time-consuming. Transfer learning alleviates this problem to some extent [47]. For example, researchers can fine-tune the pretrained model, continually updating the initialized weights so that the network learns the specific characteristics of a new task, or they can use the pretrained model to extract image features and modify or replace only the classifier part. The latter form of transfer learning also enables image clustering: the pretrained model serves as a feature extractor, and the fully connected layer used for classification is removed or replaced so that feature vectors can be obtained for clustering [41]. In addition, Havinga et al. [59] changed the output of a pretrained model into image scenery scores by replacing the model’s classifier layers. Hence, some studies apply models using the two aforementioned methods, whereas others either propose new models or do not use models at all. Mouttaki et al. [60] proposed a new convolutional image classification architecture, trained in a supervised manner, that takes images as input and outputs defined CES categories. Bai et al. [61] used semi-supervised learning to train multiple GNN models as an ensemble, incorporating multivariate data from social media into an attributed multigraph structure to assist in mapping cultural significance in geographical contexts. Finally, Hartmann et al. [52] used the SIFT algorithm, rather than a deep learning architecture, for image clustering.
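The two transfer learning options described above can be sketched as follows in PyTorch. This is a generic, assumed setup (the six landscape categories and the learning rates are placeholders), not the configuration of any reviewed study.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6  # placeholder: scene categories defined for a hypothetical study area

# Start from an ImageNet-pretrained backbone.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Option A: feature extraction. Freeze every pretrained weight and replace
# only the classifier layer; training then updates the new classifier alone.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# Option B: fine-tuning. After replacing the classifier, unfreeze part of the
# backbone (here, the last residual block) and keep updating the initialized
# weights with a smaller learning rate so the network adapts to the new task.
for param in model.layer4.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)

criterion = nn.CrossEntropyLoss()
# A standard supervised training loop over labeled regional photographs would follow here.
```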
Figure 6 depicts the trends in researchers’ selection of computer vision methods. Multiple tasks or services/architectures used in one study are counted multiple times; therefore, this study counted 70 methods across the 55 papers and plotted them as percentage stacks. In the early stages of such studies (2018–2019), image recognition methods using pretrained APIs from commercial providers were dominant. With the development of deep learning, the number of deep learning-based tasks increased significantly during 2020–2021, and researchers started examining the use of transfer learning to adapt models to their study areas. The studies published in 2022 featured the largest variety of computer vision tasks, dominated by tasks related to overall content analysis. In 2023, the proportion of image segmentation increased to a level similar to that of image recognition and classification; that is, image content analysis is shifting from the whole image to its constituent elements. In the past two years, some researchers have started proposing new architectures based on their research purposes and objectives, rather than relying on pretrained models.
To clarify the effectiveness of the aforementioned methods, we collected the accuracies of the computer vision analysis results reported in the 55 studies. Accuracy refers to the proportion of samples correctly predicted by the computer out of the total samples. Studies that provided only indicators such as precision or recall were not included. When authors provided both top-1 and top-5 results for image classification, we included the top-5 results in the statistics. In total, we counted 27 accuracy values and plotted them by model and task (Figure 7, source: Accuracy recorded in Table 2, summarized by the authors of this review). The accuracy of image recognition tasks based on pretrained models was relatively stable, ranging from 0.78 to 1. However, according to Kim et al. [66], the image classification result obtained using a pretrained model was not ideal, with an accuracy below 0.30. Subsequently, Kang et al. [58] used the same deep learning architecture (the Inception-v3 model) to analyze the same area (Seoul); to overcome the limitations of the earlier study, they retrained the model using transfer learning and achieved an accuracy of more than 95%. The box above the scatter points reflects the high dispersion of the results obtained using transfer learning, which range from 0.49 to 0.97. Further, both Cardoso et al. [67] and Winder et al. [68] showed that when a model trained using data from one region is applied to another region, its performance decreases; the generalizability of models after transfer learning is therefore limited. Too few of the reviewed studies reported accuracies for deep learning architectures proposed by the authors themselves to draw conclusions; only Mouttaki et al. [60] reported an accuracy, of up to 0.99.
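In formula form (our notation, not taken from the reviewed papers), the two accuracy measures used above are:

```latex
\text{Accuracy} = \frac{\text{number of correctly predicted samples}}{\text{total number of samples}},
\qquad
\text{Top-5 accuracy} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\!\left[\,y_i \in \hat{Y}_i^{(5)}\,\right],
```

where y_i is the true class of image i, Ŷ_i^(5) is the set of the five highest-scoring classes predicted for that image, and N is the number of evaluated images.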
To examine the problems that can be solved by computer vision methods in landscape studies, we summarized the specific purposes for which they are used as a research step. When multiple purposes were involved in a study, all were counted. Ultimately, we recorded 174 purposes from the 55 papers and summarized the 15 categories of purposes that appeared in more than one study (Figure 8, source: Purpose of use recorded in Table 2, summarized by the authors of this review). The graph on the right of Figure 8 indicates that most of the purposes can be achieved through image recognition and classification tasks, except for “select representative photographs of study site,” which is often realized by selecting the image at the cluster center as the representative image after image clustering [69]. The graph on the left visualizes the trend of each purpose by calculating the proportion of studies with that purpose among all studies each year (software: OriginPro 2021). Because only a single study was published in 2018, its representativeness is limited. Nevertheless, “classify/cluster images for descriptive statistics” and “map the spatial distribution of image content” were the dominant purposes across the six years of review. “Sort recognized shooting objects” and “investigate cooccurring landscape elements” were more popular in the early review stages than in the later ones, whereas “learn photography preferences at different study sites,” “explore the correlation of image content with other data,” “compare and optimize research methods,” “compare results with other research methods,” and “select representative photos of study site” were more popular in the later stages of the review.

3.3. Social Media Platforms and Data

Among the reviewed landscape studies, 46 (84%) used a single social media platform as the data source. Data availability is the most important criterion in platform selection. A geographical query function that is not limited by region or time range makes Flickr the most popular platform for researchers (Table 5). Compared to Instagram and Twitter users, who are more active, Flickr users are more likely to share high-quality, professional photographs and have a higher average age [70,71]. Instagram users typically share selected daily experiences and express themselves through photographs [72]. Although this platform is advantageous for analyzing human activities, computer vision analysis is limited by its privacy policies [73,74]. In addition, Instagram represents locations through georeferenced tags rather than the actual shooting locations recorded in photographs’ meta-information [57]. Platforms such as VKontakte and Weibo have advantages over globally popular platforms in targeting user groups of particular languages (e.g., Russian and Chinese). Some groups (outdoor sports enthusiasts and tourists) prefer platforms such as Wikiloc and TripAdvisor; the users of such platforms are therefore more homogeneous than those of general social media platforms [18]. In addition, images uploaded by users of platforms related to tourism and outdoor sports reflect human–landscape interactions more than people’s perception of the physical properties of the environment [75].
Most studies (60%) incorporated the geographical location of platform contributions while performing image analysis, and the number of such studies is increasing every year (Figure 9). The two main motivations are understanding the characteristics and geographical drivers of the spatial distribution of photographs, and relying on photograph content to map a study area’s potential (Table 6). Common analysis methods include kernel density estimation [81] and Getis-Ord Gi* cold-hotspot analysis [48] performed in ArcGIS. Machine learning techniques, namely random forests [82], maximum entropy models [83], and self-organizing maps [84], are also used in spatial analysis. In addition, timestamps and user information are useful auxiliary information in the metadata contributed by the platform. Using timestamps, it is possible to visualize changes along a timeline, such as changes in the popularity of landscape types [81], changes in user groups [64], and variations in a specific landscape attribute [59], or to compare two points in time, such as before and after a temporary urban event [85] or a major public health event [57]. The primary user information used in these studies is nationality/region. In the studies we reviewed, discussions based on demographic characteristics, such as gender or age, were rare, which might be attributed to a strong bias in the user groups posting landscape photographs on popular platforms or to incomplete user information [34,43]. Further, with the increasing development of deep mining methods for textual and image information, the frequency of textual content utilization has increased significantly in recent studies. The pixel-level segmentation of images in computer vision enables the quantification of landscape elements in an image, whereas natural language processing enables the quantification of emotions expressed in text; therefore, research that appropriately combines the quantification of landscape elements and emotions is gradually emerging [86]. Platform interactions are also beginning to be used as a proxy for public preferences.
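As a rough, assumed equivalent of the ArcGIS kernel density workflow mentioned above, the Python sketch below estimates a photo-density surface from geotagged photo records with scikit-learn; the photo_metadata.csv file, its column names, and the roughly 1 km bandwidth are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KernelDensity

EARTH_RADIUS_KM = 6371.0

# Illustrative table of geotagged photo records, one row per photo.
photos = pd.read_csv("photo_metadata.csv")  # assumed columns: photo_id, user_id, lat, lon

# Kernel density estimation over photo locations; the haversine metric works
# on (lat, lon) in radians, so the bandwidth below corresponds to about 1 km.
coords = np.radians(photos[["lat", "lon"]].to_numpy())
kde = KernelDensity(kernel="gaussian", metric="haversine",
                    bandwidth=1.0 / EARTH_RADIUS_KM).fit(coords)

# Evaluate the relative photo density on a regular grid covering the study area;
# high-density cells correspond to the photograph hotspots discussed in the text.
lat_grid = np.linspace(photos["lat"].min(), photos["lat"].max(), 100)
lon_grid = np.linspace(photos["lon"].min(), photos["lon"].max(), 100)
grid = np.radians([[la, lo] for la in lat_grid for lo in lon_grid])
density = np.exp(kde.score_samples(grid)).reshape(100, 100)
```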

3.4. Limitations of Landscape Studies

While reading the full text of each paper, we looked for paragraphs in the Methods, Discussion, and Conclusions sections that explicitly mentioned biases, limitations, challenges, or gaps. Each limitation mentioned in these paragraphs was itemized in Excel, and limitations with similar meanings expressed in different papers were merged into one item. For example, Karasov et al. [88] proposed that the elderly and children are likely to be the age groups least represented on social media; Huai et al. [41] proposed that social media data have inherent biases, that is, certain groups are underrepresented; and Liu et al. [53] proposed that data collected from specific platforms may reflect age-group bias, affecting result reliability. These were merged into one item (the first item of Main content in Table 7). Items involved in more than one study were then summarized into broad categories (Limitations in Table 7). In our review, landscape research using computer vision as a research tool and social media images as data sources showed considerable advancement and effectiveness; nevertheless, most studies (95%) clearly stated their limitations. The most frequently reported limitations are the inherent sampling biases in social media data. Studies reporting the influence of overly active users mitigated it by setting filter conditions during data collection. To limit the bias introduced by overly productive users, 13 studies introduced Photo-User Days (PUD), the number of individual users who upload at least one photograph on a given day in a particular location [91]. In addition, some studies proposed rules such as “retaining only one photo per user per square kilometer” and “retaining only one photo with the same label taken within one minute” [76,92].
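A minimal pandas sketch of the PUD count is given below; the photo_metadata.csv file and its column names (user_id, site_id, taken_at) are assumed for illustration only.

```python
import pandas as pd

# Assumed photo metadata: one row per photo with the uploading user, the
# location unit (e.g., park or grid-cell ID), and the upload/shooting time.
photos = pd.read_csv("photo_metadata.csv", parse_dates=["taken_at"])
photos["date"] = photos["taken_at"].dt.date

# One Photo-User Day = one user contributing at least one photo at one
# location on one day, so duplicates within (site, user, date) collapse to one.
pud_per_site = (photos.drop_duplicates(subset=["site_id", "user_id", "date"])
                      .groupby("site_id")
                      .size()
                      .rename("photo_user_days"))

print(pud_per_site.sort_values(ascending=False).head())
```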

4. Discussion

4.1. Status of Computer Vision Use in Social Media Image Interpretation in Landscape Studies

Social media enables people to shift from passively receiving information to actively sharing opinions and knowledge, and it provides a new direction for examining human–nature interactions in landscape studies [93]. Although a considerable number of landscape studies based on social media data were conducted in the past decade, our review results indicate that the combined system of image data and computer vision methods remains in an exploratory stage of development. Currently, the research objects are mainly domestic areas; however, studies in foreign and transnational contexts are emerging. The image categories posted by users can be matched to or associated with various CES types, which enables the system to play a prominent role in comprehensive CES assessment [94]. Furthermore, in the fields of urban science and tourism, this system has significant potential for investigating large landscape areas. Initially, following the introduction of automatic methods, researchers focused primarily on the image content itself in their case studies, such as the frequency of occurrence of each content type or the cooccurrence relationships between elements. With the continuous advancement and enrichment of machine learning methods, researchers have started focusing on the accuracy and applicability of methods, differences among locations, and combinations of social media data with other data sources (geographic environmental variables and onsite survey data) [82,95].
In terms of computer vision methods, learning tasks focused on analyzing overall image content in the early stages (2018–2020). Since 2021, the proportion of tasks identifying and locating discrete objects in images has been increasing annually, and landscape studies based on social media have entered a refined and diversified phase of automatic visual image analysis. The image recognition services provided by commercial providers have been actively used over the past six years because they require little machine learning experience. Although these services continue to improve their algorithms, their models are pretrained on generic image datasets and hence have limited capabilities in identifying certain landscape images with regional characteristics [71]. Deep learning-based tasks, such as image classification and segmentation, enable researchers to adjust a model through transfer learning to make it more suitable for their studies. However, the applicability of transfer learning depends on adequate similarity between the initial classification task and the task to which it is transferred [96]. The automatic analysis accuracy reported by current studies using transfer learning is not as stable as that of studies that directly use pretrained models provided by commercial services. Further, due to the biogeographical pattern of the distance decay of biotic similarity, a model retrained through transfer learning for a certain area has limited reuse capability in geographically distant areas [67]. Recently, some researchers proposed new architectures to overcome some of the limitations of this system and provide potential research directions.
Regarding the application of social media data, combining geographic information and crowdsourced image content to understand either the geographic drivers of landscape preferences or the landscape potential of geographic locations represents two important trends in current landscape studies. The former usually uses tools such as ArcGIS to visualize different perception themes on maps and explain the geographical features of hotspot areas [45,58,81], or examines the impact of geographical variables on landscape perception themes through regression analysis [68,97]. The latter usually draws on the content of the images taken at each location to simulate the spatial distribution of landscapes, activities, and groups with the help of machine learning methods (random forests, maximum entropy models, and self-organizing maps) [64,82]. Therefore, location-based social media platforms offer strong advantages as data sources. Flickr, which provides more liberal data access and use policies than other platforms, has emerged as the most popular crowdsourced image source. However, Flickr data have some limitations, such as insufficient user representation, incomplete geographical data, and a decline in popularity [43,98]. Additionally, data quality and study reproducibility may be affected by changes in user settings or platform policies, which are potential risks for researchers using any platform. The difficulty of conveying intangible concepts through images and the subjectivity of researchers in interpreting publishers’ intentions were also reported in the reviewed studies. Therefore, studies using the text information released along with images to assist image interpretation have begun to emerge in the past two years, thanks to technological advances in natural language processing and computer vision. Although more appropriate combination methods are still being explored, the next hot trend lies at the intersection of these two artificial intelligence fields.

4.2. Development Framework for Future Research

The main challenges faced by researchers under the combined system of social media image and computer vision method can be summarized into four aspects: image data, social media platforms, computer vision methods, and ethics. Based on this classification, we propose a development framework that can limit bias and promote the integration of these four aspects (Figure 10).
Image data: To reasonably interpret intangible content that is difficult to convey through a single image, images should be analyzed together with the text content and tags added by users at the time of posting. Because researchers may not be able to accurately capture users’ experiences and motivations from a single photograph, we recommend grouping the photographs posted by users in a certain area into clusters for overall analysis [99]. Rules should also be established to filter out images that are irrelevant to a topic or may interfere with machine analysis. In addition, when researchers cannot explain some results by relying on social media data alone, they can collect survey data to provide complementary perspectives.
Social media platforms: Since each platform’s user composition is different, integrating data from different social media platforms can make analysis results more comprehensive. To avoid demographic biases, the selection of crowdsourcing platforms specializing in sports, outdoor recreation, travel, and so on, is sometimes more effective than selecting popular social media platforms because specialized platform users are more homogeneous and suitable for investigation on related topics than general platform users. In addition, relevant technical personnel are recommended to develop hybrid deep learning models for social media platforms that can consider metadata and text content in image analysis.
Computer vision methods: Since each model is trained on different image datasets, future research should consider using multiple models and comparing their validity to avoid the biases caused by tool selection. In addition, researchers should select appropriate tasks and models based on their image data. When the preset categories of a pretrained model are insufficient to meet their needs, transfer learning can be used to flexibly adjust or train new models. Further, the manual verification of a small number of images is essential to avoid bias when using machine learning alone.
Ethics: Due to the involvement of public information and emerging technologies, the practices in the three aforementioned aspects must comply with relevant ethical principles. Although computer-based image viewing is less intrusive than manual viewing, researchers must manage data carefully to protect users’ personal information. Finally, according to Winder et al. [68], a public repository that sets clear permissions and restrictions on the use of voluntary crowdsourced images should be established to assist researchers in training models on the basis of privacy protection.
Crowdsourced data have become an essential part of the landscape research field [24]. The results reported in these studies suggest that the practical benefits of this data source for planning, governance, and design deserve further exploration. Large-scale crowdsourced datasets can be used to assess the landscape impacts of large national energy projects [100], to identify saturated green space areas in cities and establish controls so that carrying capacity is not exceeded [101], and to build smart city-related systems that improve the quality of life and well-being of residents [102]. Small-scale crowdsourced datasets can be used for in-depth studies, such as evaluating the effectiveness of landscape design projects [103] and residents’ participation in project site selection [104], to assist planners in their decision-making. The introduction of artificial intelligence technologies such as computer vision undoubtedly improves the efficiency of these processes and reduces costs. The framework proposed in this study can provide theoretical guidance for crowdsourced data research that is not limited to social media, so that such data can be applied accurately and widely without violating ethical principles.
However, this study has some limitations. First, the search strategy and screening criteria affect which records are retrieved, so not all relevant literature could be covered. In addition, using only the WoS Core Collection as the information source may introduce disciplinary and language biases [105]. Second, some of the analysis and interpretation is limited by the authors’ experience. Moreover, due to the rapid technological development in this field, changes that are difficult to predict with existing knowledge are likely to occur. Nonetheless, this study provides a clear understanding of the status of landscape studies under the combined system of social media images and computer vision methods, and the aforementioned framework can serve as a general theoretical foundation for the next research stage.

5. Conclusions

Several related reviews confirm the rapid growth of studies using social media as a data source in the landscape studies field. Image content is one of the most informative parameters of social media data, and the introduction of computer vision enables the automatic analysis of large numbers of images, thereby overcoming the shortcomings of manual analysis, which is time-consuming, laborious, and highly subjective. This study found that, despite the recent emergence of studies interpreting social media images with computer vision methods and the relatively small number of publications, the turnover of research tools and ideas is rapid. Current studies continue to face the sampling biases inherent in social media data, the pitfalls of computer vision automation, and the presence of biased information in images. However, with the development of deep learning and big data mining technologies, the accuracy of artificial intelligence analysis is expected to improve, and combining other information sources to facilitate image interpretation will open new research frontiers and enhance the potential of studies using this system. The development framework proposed in this review, based on the research experience to date, will help landscape researchers minimize foreseeable biases and increase the system’s reliability as its application becomes widespread.

Author Contributions

Conceptualization, R.M.; methodology, R.M.; software, R.M.; formal analysis, R.M.; investigation, R.M.; data curation, R.M.; writing—original draft preparation, R.M.; writing—review and editing, K.F.; visualization, R.M.; supervision, K.F.; project administration, K.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Acknowledgments

We thank all reviewers for their valuable comments on this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Plieninger, T.; Kizos, T.; Bieling, C.; Le Dû-Blayo, L.; Budniok, M.-A.; Bürgi, M.; Crumley, C.L.; Girod, G.; Howard, P.; Kolen, J.; et al. Exploring Ecosystem-Change and Society through a Landscape Lens: Recent Progress in European Landscape Research. Ecol. Soc. 2015, 20, art5. [Google Scholar] [CrossRef]
  2. Löfgren, S. Knowing the Landscape: A Theoretical Discussion on the Challenges in Forming Knowledge about Landscapes. Landsc. Res. 2020, 45, 921–933. [Google Scholar] [CrossRef]
  3. Medeiros, A.; Fernandes, C.; Gonçalves, J.F.; Farinha-Marques, P. Research Trends on Integrative Landscape Assessment Using Indicators—A Systematic Review. Ecol. Indic. 2021, 129, 107815. [Google Scholar] [CrossRef]
  4. Marine, N. Landscape Assessment Methods Derived from the European Landscape Convention: Comparison of Three Spanish Cases. Earth 2022, 3, 522–536. [Google Scholar] [CrossRef]
  5. Vlami, V.; Zogaris, S.; Djuma, H.; Kokkoris, I.P.; Kehayias, G.; Dimopoulos, P. A Field Method for Landscape Conservation Surveying: The Landscape Assessment Protocol (LAP). Sustainability 2019, 11, 2019. [Google Scholar] [CrossRef]
  6. Council of Europe. European Landscape Convention; Council of Europe: Florence, Italy, 2000. [Google Scholar]
  7. Mueller, L.; Eulenstein, F.; Mirschel, W.; Antrop, M.; Jones, M.; McKenzie, B.M.; Dronin, N.M.; Kazakov, L.K.; Kravchenko, V.V.; Khoroshev, A.V.; et al. Landscapes, Their Exploration and Utilisation: Status and Trends of Landscape Research. In Current Trends in Landscape Research; Mueller, L., Eulenstein, F., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 105–164. [Google Scholar]
  8. Wu, J. Landscape Sustainability Science: Ecosystem Services and Human Well-Being in Changing Landscapes. Landsc. Ecol. 2013, 28, 999–1023. [Google Scholar] [CrossRef]
  9. Mueller, L.; Eulenstein, F. Editors’ Preface. In Current Trends in Landscape Research; Springer International Publishing: Cham, Switzerland, 2019; pp. 5–8. [Google Scholar]
  10. Kang, N.; Liu, C. Towards Landscape Visual Quality Evaluation: Methodologies, Technologies, and Recommendations. Ecol. Indic. 2022, 142, 109174. [Google Scholar] [CrossRef]
  11. Gkoltsiou, A.; Paraskevopoulou, A. Landscape Character Assessment, Perception Surveys of Stakeholders and SWOT Analysis: A Holistic Approach to Historical Public Park Management. J. Outdoor Recreat. Tour. 2021, 35, 100418. [Google Scholar] [CrossRef]
  12. Santo-Tomás Muro, R.; Sáenz de Tejada Granados, C.; Rodríguez Romero, E.J. Green Infrastructures in the Peri-Urban Landscape: Exploring Local Perception of Well-Being through ‘Go-Alongs’ and ‘Semi-Structured Interviews’. Sustainability 2020, 12, 6836. [Google Scholar] [CrossRef]
  13. Bijker, R.A.; Sijtsma, F.J. A Portfolio of Natural Places: Using a Participatory GIS Tool to Compare the Appreciation and Use of Green Spaces inside and Outside Urban Areas by Urban Residents. Landsc. Urban Plan. 2017, 158, 155–165. [Google Scholar] [CrossRef]
  14. Wilkins, E.J.; Wood, S.A.; Smith, J.W. Uses and Limitations of Social Media to Inform Visitor Use Management in Parks and Protected Areas: A Systematic Review. Environ. Manage. 2021, 67, 120–132. [Google Scholar] [CrossRef]
  15. Norman, P.; Pickering, C.M. Discourse about National Parks on Social Media: Insights from Twitter. J. Outdoor Recreat. Tour. 2023, 44, 100682. [Google Scholar] [CrossRef]
  16. Yang, C.; Liu, T. Social Media Data in Urban Design and Landscape Research: A Comprehensive Literature Review. Land 2022, 11, 1796. [Google Scholar] [CrossRef]
  17. Tieskens, K.F.; Van Zanten, B.T.; Schulp, C.J.E.; Verburg, P.H. Aesthetic Appreciation of the Cultural Landscape through Social Media: An Analysis of Revealed Preference in the Dutch River Landscape. Landsc. Urban Plan. 2018, 177, 128–137. [Google Scholar] [CrossRef]
  18. Callau, A.À.; Albert, M.Y.P.; Rota, J.J.; Giné, D.S. Landscape Characterization Using Photographs from Crowdsourced Platforms: Content Analysis of Social Media Photographs. Open Geosci. 2019, 11, 558–571. [Google Scholar] [CrossRef]
  19. Johnson, M.L.; Campbell, L.K.; Svendsen, E.S.; McMillen, H.L. Mapping Urban Park Cultural Ecosystem Services: A Comparison of Twitter and Semi-Structured Interview Methods. Sustainability 2019, 11, 6137. [Google Scholar] [CrossRef]
  20. Sim, J.; Miller, P. Understanding an Urban Park through Big Data. Int. J. Environ. Res. Public Health 2019, 16, 3816. [Google Scholar] [CrossRef] [PubMed]
  21. Ghermandi, A. Geolocated Social Media Data Counts as a Proxy for Recreational Visits in Natural Areas: A Meta-Analysis. J. Environ. Manage. 2022, 317, 115325. [Google Scholar] [CrossRef]
  22. Rodríguez-Fernández, M.-M.; Sánchez-Amboage, E.; Juanatey-Boga, O. Virtual Tourist Communities and Online Travel Communities. In Communication: Innovation & Quality; Miguel, T., Valentín-Alejandro, M., Xosé, L., Xosé, R., Francisco, C., Eds.; Springer: Cham, Switzerland, 2019; pp. 435–446. [Google Scholar]
  23. Liao, Y.; Yeh, S.; Gil, J. Feasibility of Estimating Travel Demand Using Geolocations of Social Media Data. Transportation 2022, 49, 137–161. [Google Scholar] [CrossRef]
  24. Ghermandi, A.; Sinclair, M. Passive Crowdsourcing of Social Media in Environmental Research: A Systematic Map. Glob. Environ. Change 2019, 55, 36–47. [Google Scholar] [CrossRef]
  25. Cui, N.; Malleson, N.; Houlden, V.; Comber, A. Using VGI and Social Media Data to Understand Urban Green Space: A Narrative Literature Review. ISPRS Int. J. Geoinf. 2021, 10, 425. [Google Scholar] [CrossRef]
  26. Carter, S.; Weerkamp, W.; Tsagkias, M. Microblog Language Identification: Overcoming the Limitations of Short, Unedited and Idiomatic Text. Lang. Resour. Eval. 2013, 47, 195–215. [Google Scholar] [CrossRef]
  27. Nakarmi, G.; Yuill, C.; Strager, M.P.; Butler, P.; Moreira, J.C.; Burns, R.C. A Crowdsource Approach to Documenting Users’ Preferences for Landscape Attributes in the Proposed Appalachian Geopark Project in West Virginia, United States. Int. J. Geoheritage Parks 2023, 11, 310–327. [Google Scholar] [CrossRef]
  28. Oteros-Rozas, E.; Martín-López, B.; Fagerholm, N.; Bieling, C.; Plieninger, T. Using Social Media Photos to Explore the Relation between Cultural Ecosystem Services and Landscape Features across Five European Sites. Ecol. Indic. 2018, 94, 74–86. [Google Scholar] [CrossRef]
  29. Lee, H.; Seo, B.; Koellner, T.; Lautenbach, S. Mapping Cultural Ecosystem Services 2.0—Potential and Shortcomings from Unlabeled Crowd Sourced Images. Ecol. Indic. 2019, 96, 505–515. [Google Scholar] [CrossRef]
  30. Ghermandi, A.; Depietri, Y.; Sinclair, M. In the AI of the Beholder: A Comparative Analysis of Computer Vision-Assisted Characterizations of Human-Nature Interactions in Urban Green Spaces. Landsc. Urban Plan. 2022, 217, 104261. [Google Scholar] [CrossRef]
  31. Dwivedi, Y.K.; Kshetri, N.; Hughes, L.; Slade, E.L.; Jeyaraj, A.; Kar, A.K.; Baabdullah, A.M.; Koohang, A.; Raghavan, V.; Ahuja, M.; et al. Opinion Paper: “So What If ChatGPT Wrote It?” Multidisciplinary Perspectives on Opportunities, Challenges and Implications of Generative Conversational AI for Research, Practice and Policy. Int. J. Inf. Manage. 2023, 71, 102642. [Google Scholar] [CrossRef]
  32. Samuel, J.; Ali, G.G.M.N.; Rahman, M.M.; Esawi, E.; Samuel, Y. COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification. Information 2020, 11, 314. [Google Scholar] [CrossRef]
  33. Xu, S.; Wang, J.; Shou, W.; Ngo, T.; Sadick, A.-M.; Wang, X. Computer Vision Techniques in Construction: A Critical Review. Arch. Comput. Methods Eng. 2021, 28, 3383–3397. [Google Scholar] [CrossRef]
  34. Väisänen, T.; Heikinheimo, V.; Hiippala, T.; Toivonen, T. Exploring Human–Nature Interactions in National Parks with Social Media Photographs and Computer Vision. Conserv. Biol. 2021, 35, 424–436. [Google Scholar] [CrossRef] [PubMed]
  35. Web of Science Core Collection. Available online: https://clarivate.com/products/scientific-and-academic-research/research-discovery-and-workflow-solutions/webofscience-platform/web-of-science-core-collection (accessed on 27 January 2024).
  36. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 2009, 6, e1000097. [Google Scholar] [CrossRef] [PubMed]
  37. Mohimont, L.; Alin, F.; Rondeau, M.; Gaveau, N.; Steffenel, L.A. Computer Vision and Deep Learning for Precision Viticulture. Agronomy 2022, 12, 2463. [Google Scholar] [CrossRef]
  38. Khan, A.; Laghari, A.; Awan, S. Machine Learning in Computer Vision: A Review. EAI Endorsed Trans. Scalable Inf. Syst. 2021, 8, e4. [Google Scholar] [CrossRef]
  39. Oztel, I.; Yolcu, G.; Oz, C. Performance Comparison of Transfer Learning and Training from Scratch Approaches for Deep Facial Expression Recognition. In Proceedings of the 2019 4th International Conference on Computer Science and Engineering (UBMK), Samsun, Turkey, 11–15 September 2019; pp. 1–6. [Google Scholar]
  40. Originlab. Available online: https://www.originlab.com/ (accessed on 4 January 2024).
  41. Huai, S.; Chen, F.; Liu, S.; Canters, F.; Van De Voorde, T. Using Social Media Photos and Computer Vision to Assess Cultural Ecosystem Services and Landscape Features in Urban Parks. Ecosyst. Serv. 2022, 57, 101475. [Google Scholar] [CrossRef]
  42. Shen, Y.; Xu, Y.; Liu, L. Crowd-Sourced City Images: Decoding Multidimensional Interaction between Imagery Elements with Volunteered Photos. ISPRS Int. J. Geoinf. 2021, 10, 740. [Google Scholar] [CrossRef]
  43. Yang, C.; Liu, T.; Zhang, S. Using Flickr Data to Understand Image of Urban Public Spaces with a Deep Learning Model: A Case Study of the Haihe River in Tianjin. ISPRS Int. J. Geoinf. 2022, 11, 497. [Google Scholar] [CrossRef]
  44. Zhao, L.; Luo, L.; Li, B.; Xu, L.; Zhu, J.; He, S.; Li, H. Analysis of the Uniqueness and Similarity of City Landscapes Based on Deep Style Learning. ISPRS Int. J. Geoinf. 2021, 10, 734. [Google Scholar] [CrossRef]
  45. Zhang, K.; Chen, Y.; Li, C. Discovering the Tourists’ Behaviors and Perceptions in a Tourism Destination by Analyzing Photos’ Visual Content with a Computer Deep Learning Model: The Case of Beijing. Tour. Manag. 2019, 75, 595–608. [Google Scholar] [CrossRef]
  46. Google Cloud Vision AI. Available online: https://cloud.google.com/vision (accessed on 4 January 2024).
  47. Han, X.; Zhang, Z.; Ding, N.; Gu, Y.; Liu, X.; Huo, Y.; Qiu, J.; Yao, Y.; Zhang, A.; Zhang, L.; et al. Pre-Trained Models: Past, Present and Future. AI Open 2021, 2, 225–250. [Google Scholar] [CrossRef]
  48. Xiao, X.; Fang, C.; Lin, H. Characterizing Tourism Destination Image Using Photos’ Visual Content. ISPRS Int. J. Geoinf. 2020, 9, 730. [Google Scholar] [CrossRef]
  49. Qi, Z.; Duan, J.; Su, H.; Fan, Z.; Lan, W. Using Crowdsourcing Images to Assess Visual Quality of Urban Landscapes: A Case Study of Xiamen Island. Ecol. Indic. 2023, 154, 110793. [Google Scholar] [CrossRef]
  50. Song, Y.; Ning, H.; Ye, X.; Chandana, D.; Wang, S. Analyze the Usage of Urban Greenways through Social Media Images and Computer Vision. Environ. Plan. B Urban Anal. City Sci. 2022, 49, 1682–1696. [Google Scholar] [CrossRef]
  51. Kim, J.; Kang, Y. Automatic Classification of Photos by Tourist Attractions Using Deep Learning Model and Image Feature Vector Clustering. ISPRS Int. J. Geoinf. 2022, 11, 245. [Google Scholar] [CrossRef]
  52. Hartmann, M.C.; Koblet, O.; Baer, M.F.; Purves, R.S. Automated Motif Identification: Analysing Flickr Images to Identify Popular Viewpoints in Europe’s Protected Areas. J. Outdoor Recreat. Tour. 2022, 37, 100479. [Google Scholar] [CrossRef]
  53. Liu, S.; Su, C.; Zhang, J.; Takeda, S.; Liu, J.; Yang, R. Cross-Cultural Comparison of Urban Green Space through Crowdsourced Big Data: A Natural Language Processing and Image Recognition Approach. Land 2023, 12, 767. [Google Scholar] [CrossRef]
  54. Arefieva, V.; Egger, R.; Yu, J. A Machine Learning Approach to Cluster Destination Image on Instagram. Tour. Manag. 2021, 85, 104318. [Google Scholar] [CrossRef]
  55. Su, L.; Chen, W.; Zhou, Y.; Fan, L. Exploring City Image Perception in Social Media Big Data through Deep Learning: A Case Study of Zhongshan City. Sustainability 2023, 15, 3311. [Google Scholar] [CrossRef]
  56. Wang, Y.; Shi, X.; Cheng, K.; Zhang, J.; Chang, Q. How Do Urban Park Features Affect Cultural Ecosystem Services: Quantified Evidence for Design Practices. Urban For. Urban Green. 2022, 76, 127713. [Google Scholar] [CrossRef]
  57. Matasov, V.; Vasenev, V.; Matasov, D.; Dvornikov, Y.; Filyushkina, A.; Bubalo, M.; Nakhaev, M.; Konstantinova, A. COVID-19 Pandemic Changes the Recreational Use of Moscow Parks in Space and Time: Outcomes from Crowd-Sourcing and Machine Learning. Urban For. Urban Green. 2023, 83, 127911. [Google Scholar] [CrossRef]
  58. Kang, Y.; Cho, N.; Yoon, J.; Park, S.; Kim, J. Transfer Learning of a Deep Learning Model for Exploring Tourists’ Urban Image Using Geotagged Photos. ISPRS Int. J. Geoinf. 2021, 10, 137. [Google Scholar] [CrossRef]
  59. Havinga, I.; Marcos, D.; Bogaart, P.W.; Hein, L.; Tuia, D. Social Media and Deep Learning Capture the Aesthetic Quality of the Landscape. Sci. Rep. 2021, 11, 20000. [Google Scholar] [CrossRef]
  60. Mouttaki, I.; Bagdanavičiūtė, I.; Maanan, M.; Erraiss, M.; Rhinane, H.; Maanan, M. Classifying and Mapping Cultural Ecosystem Services Using Artificial Intelligence and Social Media Data. Wetlands 2022, 42, 86. [Google Scholar] [CrossRef]
  61. Bai, N.; Nourian, P.; Luo, R.; Cheng, T.; Pereira Roders, A. Screening the Stones of Venice: Mapping Social Perceptions of Cultural Significance through Graph-Based Semi-Supervised Classification. ISPRS J. Photogramm. Remote Sens. 2023, 203, 135–164. [Google Scholar] [CrossRef]
  62. TensorFlow. Available online: https://www.tensorflow.org/tutorials/images/transfer_learning (accessed on 4 January 2024).
  63. Richards, D.R.; Tunçer, B. Using Image Recognition to Automate Assessment of Cultural Ecosystem Services from Social Media Photographs. Ecosyst. Serv. 2018, 31, 318–325. [Google Scholar] [CrossRef]
  64. Gosal, A.S.; Geijzendorffer, I.R.; Václavík, T.; Poulin, B.; Ziv, G. Using Social Media, Machine Learning and Natural Language Processing to Map Multiple Recreational Beneficiaries. Ecosyst. Serv. 2019, 38, 100958. [Google Scholar] [CrossRef]
  65. Cho, N.; Kang, Y.; Yoon, J.; Park, S.; Kim, J. Classifying Tourists’ Photos and Exploring Tourism Destination Image Using a Deep Learning Model. J. Qual. Assur. Hosp. Tour. 2022, 23, 1480–1508. [Google Scholar] [CrossRef]
  66. Kim, D.; Kang, Y.; Park, Y.; Kim, N.; Lee, J. Understanding Tourists’ Urban Images with Geotagged Photos Using Convolutional Neural Networks. Spat. Inf. Res. 2020, 28, 241–255. [Google Scholar] [CrossRef]
  67. Cardoso, A.S.; Renna, F.; Moreno-Llorca, R.; Alcaraz-Segura, D.; Tabik, S.; Ladle, R.J.; Vaz, A.S. Classifying the Content of Social Media Images to Support Cultural Ecosystem Service Assessments Using Deep Learning Models. Ecosyst. Serv. 2022, 54, 101410. [Google Scholar] [CrossRef]
  68. Winder, S.G.; Lee, H.; Seo, B.; Lia, E.H.; Wood, S.A. An Open-source Image Classifier for Characterizing Recreational Activities across Landscapes. People Nat. 2022, 4, 1249–1262. [Google Scholar] [CrossRef]
  69. Payntar, N.D.; Hsiao, W.-L.; Covey, R.A.; Grauman, K. Learning Patterns of Tourist Movement and Photography from Geotagged Photos at Archaeological Heritage Sites in Cuzco, Peru. Tour. Manag. 2021, 82, 104165. [Google Scholar] [CrossRef]
  70. Tenkanen, H.; Di Minin, E.; Heikinheimo, V.; Hausmann, A.; Herbst, M.; Kajala, L.; Toivonen, T. Instagram, Flickr, or Twitter: Assessing the Usability of Social Media Data for Visitor Monitoring in Protected Areas. Sci. Rep. 2017, 7, 17615. [Google Scholar] [CrossRef]
  71. Santos Vieira, F.A.; Vinhas Santos, D.T.; Bragagnolo, C.; Campos-Silva, J.V.; Henriques Correia, R.A.; Jepson, P.; Mendes Malhado, A.C.; Ladle, R.J. Social Media Data Reveals Multiple Cultural Services along the 8.500 Kilometers of Brazilian Coastline. Ocean Coast. Manag. 2021, 214, 105918. [Google Scholar] [CrossRef]
  72. Boy, J.D.; Uitermark, J. How to Study the City on Instagram. PLoS ONE 2016, 11, e0158161. [Google Scholar] [CrossRef]
  73. Toivonen, T.; Heikinheimo, V.; Fink, C.; Hausmann, A.; Hiippala, T.; Järv, O.; Tenkanen, H.; Di Minin, E. Social Media Data for Conservation Science: A Methodological Overview. Biol. Conserv. 2019, 233, 298–315. [Google Scholar] [CrossRef]
  74. Gülçin, D.; Yalçınkaya, N.M. Correlating Fluency Theory-Based Visual Aesthetic Liking of Landscape with Landscape Types and Features. Geo-Spat. Inf. Sci. 2022, 1–20. [Google Scholar] [CrossRef]
  75. Chhetri, P.; Chhetri, A. Theoretical Perspectives on Landscape Perception. In Practising Cultural Geographies; Ravi, S., Bharat, D., Arun, K.S., Padma, C.P., Eds.; Springer: Singapore, 2022; pp. 85–110. [Google Scholar]
  76. Wartmann, F.M.; Tieskens, K.F.; Van Zanten, B.T.; Verburg, P.H. Exploring Tranquillity Experienced in Landscapes Based on Social Media. Appl. Geogr. 2019, 113, 102112. [Google Scholar] [CrossRef]
  77. Gao, Q.; Abel, F.; Houben, G.-J.; Yu, Y. A Comparative Study of Users’ Microblogging Behavior on Sina Weibo and Twitter. In User Modeling, Adaptation, and Personalization; Masthoff, J., Mobasher, B., Desmarais, M.C., Nkambou, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 88–101. [Google Scholar]
  78. Zielstra, D.; Hochmair, H.H. Positional Accuracy Analysis of Flickr and Panoramio Images for Selected World Regions. J. Spat. Sci. 2013, 58, 251–273. [Google Scholar] [CrossRef]
  79. Taecharungroj, V.; Mathayomchan, B. Analysing TripAdvisor Reviews of Tourist Attractions in Phuket, Thailand. Tour. Manag. 2019, 75, 550–568. [Google Scholar] [CrossRef]
  80. Lee, K.; Yu, C. Assessment of Airport Service Quality: A Complementary Approach to Measure Perceived Service Quality Based on Google Reviews. J. Air Transp. Manag. 2018, 71, 28–44. [Google Scholar] [CrossRef]
  81. Ding, Y.; Bai, Z.; Xia, H.; Tang, H. Tourists’ Landscape Preferences of Luoxiao Mountain National Forest Trail Based on Deep Learning. Wirel. Commun. Mob. Comput. 2022, 2022, 1–18. [Google Scholar] [CrossRef]
  82. Goldspiel, H.; Barr, B.; Badding, J.; Kuehn, D. Snapshots of Nature-Based Recreation Across Rural Landscapes: Insights from Geotagged Photographs in the Northeastern United States. Environ. Manage. 2023, 71, 234–248. [Google Scholar] [CrossRef] [PubMed]
  83. Richards, D.R.; Lavorel, S. Integrating Social Media Data and Machine Learning to Analyse Scenarios of Landscape Appreciation. Ecosyst. Serv. 2022, 55, 101422. [Google Scholar] [CrossRef]
  84. Lee, S.; Son, Y. Mapping of User-Perceived Landscape Types and Spatial Distribution Using Crowdsourced Photo Data and Machine Learning: Focusing on Taeanhaean National Park. J. Outdoor Recreat. Tour. 2023, 44, 100616. [Google Scholar] [CrossRef]
  85. Paukaeva, A.A.; Setoguchi, T.; Watanabe, N.; Luchkova, V.I. Temporary Design on Public Open Space for Improving the Pedestrian’s Perception Using Social Media Images in Winter Cities. Sustainability 2020, 12, 6062. [Google Scholar] [CrossRef]
  86. Zhang, J.; Li, D.; Ning, S.; Furuya, K. Sustainable Urban Green Blue Space (UGBS) and Public Participation: Integrating Multisensory Landscape Perception from Online Reviews. Land 2023, 12, 1360. [Google Scholar] [CrossRef]
  87. Lingua, F.; Coops, N.C.; Griess, V.C. Assessing Forest Recreational Potential from Social Media Data and Remote Sensing Technologies Data. Ecol. Indic. 2023, 149, 110165. [Google Scholar] [CrossRef]
  88. Karasov, O.; Heremans, S.; Külvik, M.; Domnich, A.; Chervanyov, I. On How Crowdsourced Data and Landscape Organisation Metrics Can Facilitate the Mapping of Cultural Ecosystem Services: An Estonian Case Study. Land 2020, 9, 158. [Google Scholar] [CrossRef]
  89. Spalding, M.D.; Longley-Wood, K.; McNulty, V.P.; Constantine, S.; Acosta-Morel, M.; Anthony, V.; Cole, A.D.; Hall, G.; Nickel, B.A.; Schill, S.R.; et al. Nature Dependent Tourism—Combining Big Data and Local Knowledge. J. Environ. Manag. 2023, 337, 117696. [Google Scholar] [CrossRef]
  90. Cao, H.; Wang, M.; Su, S.; Kang, M. Explicit Quantification of Coastal Cultural Ecosystem Services: A Novel Approach Based on the Content and Sentimental Analysis of Social Media. Ecol. Indic. 2022, 137, 108756. [Google Scholar] [CrossRef]
  91. Wood, S.A.; Guerry, A.D.; Silver, J.M.; Lacayo, M. Using Social Media to Quantify Nature-Based Tourism and Recreation. Sci. Rep. 2013, 3, 2976. [Google Scholar] [CrossRef]
  92. Chen, M.; Arribas-Bel, D.; Singleton, A. Quantifying the Characteristics of the Local Urban Environment through Geotagged Flickr Photographs and Image Recognition. ISPRS Int. J. Geoinf. 2020, 9, 264. [Google Scholar] [CrossRef]
  93. Li, S.; Yang, B. Social Media for Landscape Planning and Design: A Review and Discussion. Landsc. Res. 2022, 47, 648–663. [Google Scholar] [CrossRef]
  94. Zhang, H.; Huang, R.; Zhang, Y.; Buhalis, D. Cultural Ecosystem Services Evaluation Using Geolocated Social Media Data: A Review. Tour. Geogr. 2022, 24, 646–668. [Google Scholar] [CrossRef]
  95. Wilkins, E.J.; Van Berkel, D.; Zhang, H.; Dorning, M.A.; Beck, S.M.; Smith, J.W. Promises and Pitfalls of Using Computer Vision to Make Inferences about Landscape Preferences: Evidence from an Urban-Proximate Park System. Landsc. Urban Plan. 2022, 219, 104315. [Google Scholar] [CrossRef]
  96. Lingua, F.; Coops, N.C.; Griess, V.C. Valuing Cultural Ecosystem Services Combining Deep Learning and Benefit Transfer Approach. Ecosyst. Serv. 2022, 58, 101487. [Google Scholar] [CrossRef]
  97. Song, X.P.; Richards, D.R.; He, P.; Tan, P.Y. Does Geo-Located Social Media Reflect the Visit Frequency of Urban Parks? A City-Wide Analysis Using the Count and Content of Photographs. Landsc. Urban Plan. 2020, 203, 103908. [Google Scholar] [CrossRef]
  98. Mancini, F.; Coghill, G.M.; Lusseau, D. Using Social Media to Quantify Spatial and Temporal Dynamics of Nature-Based Recreational Activities. PLoS ONE 2018, 13, e0200565. [Google Scholar] [CrossRef] [PubMed]
  99. Fox, N.; Graham, L.J.; Eigenbrod, F.; Bullock, J.M.; Parks, K.E. Enriching Social Media Data Allows a More Robust Representation of Cultural Ecosystem Services. Ecosyst. Serv. 2021, 50, 101328. [Google Scholar] [CrossRef]
  100. McKenna, R.; Mulalic, I.; Soutar, I.; Weinand, J.M.; Price, J.; Petrović, S.; Mainzer, K. Exploring Trade-Offs between Landscape Impact, Land Use and Resource Quality for Onshore Variable Renewable Energy: An Application to Great Britain. Energy 2022, 250, 123754. [Google Scholar] [CrossRef]
  101. García-Palomares, J.C.; Gutiérrez, J.; Mínguez, C. Identification of Tourist Hot Spots Based on Social Networks: A Comparative Analysis of European Metropolises Using Photo-Sharing Services and GIS. Appl. Geogr. 2015, 63, 408–417. [Google Scholar] [CrossRef]
  102. Alhalabi, W.; Lytras, M.; Aljohani, N. Crowdsourcing Research for Social Insights into Smart Cities Applications and Services. Sustainability 2021, 13, 7531. [Google Scholar] [CrossRef]
  103. Ioannidis, R.; Sargentis, G.-F.; Koutsoyiannis, D. Landscape Design in Infrastructure Projects–Is It an Extravagance? A Cost-Benefit Investigation of Practices in Dams. Landsc. Res. 2022, 47, 370–387. [Google Scholar] [CrossRef]
  104. Afzalan, N.; Muller, B. The Role of Social Media in Green Infrastructure Planning: A Case Study of Neighborhood Participation in Park Siting. J. Urban Technol. 2014, 21, 67–83. [Google Scholar] [CrossRef]
  105. Mongeon, P.; Paul-Hus, A. The Journal Coverage of Web of Science and Scopus: A Comparative Analysis. Scientometrics 2016, 106, 213–228. [Google Scholar] [CrossRef]
Figure 1. Workflow of this study.
Figure 2. Reviewed studies’ publication growth trend.
Figure 3. Spatial distribution of studies: within each flow of a single color, the starting point is the study location and the arrowhead points to the study area; * among the studies whose flows both start and end in Europe, four study areas lie not in the country of the first author’s institution but in other European countries.
Figure 4. Flow relationships between study settings and research themes: * “Other” includes the “archaeological site,” “tourist attraction,” and “island” settings that each appeared in only a single study, as well as areas delineated by the authors themselves.
Figure 5. Proportional statistical results for various computer vision methods in reviewed studies: the inner, middle, and outer layers represent task categories, models, and the commercial services/deep learning architectures used, respectively; VGG = Visual Geometry Group, YOLO = You Only Look Once.
Figure 6. Temporal distribution of various computer vision methods in included studies: the left graph represents task categories, and the right graph indicates models.
Figure 7. Statistics of accuracy assessment results for automated image content analysis: the light gray dot represents an outlier.
Figure 8. Purpose of using computer vision methods: the graph on the left shows the popularity of each purpose over time, and the graph on the right indicates the tasks involved in achieving each purpose.
Figure 9. Frequency of use of other social media data to assist in image content analysis.
Figure 10. Development framework for future research.
Table 1. Literature-screening criteria.
Term | Inclusion Criteria | Exclusion Criteria
WOS 1 categories or citation topics | Environmental studies; environmental sciences; hospitality, leisure, sport, and tourism; geography; agriculture, environment, and ecology | Unrelated to these categories
Type of study | Empirical study | Literature review, commentary, or meta-analysis
Study area | Urban, peri-urban, and rural or agricultural landscape areas; natural or semi-natural, and cultural landscape areas | No designated area
Data source | Image data posted spontaneously by social media users | Only using non-VGI 2 data (street views, remote sensing images, etc.) or non-image data
Research method | Using computer vision to understand images | Only using other artificial intelligence methods (natural language processing, random forest classifier, maximum entropy model, etc.)
1 WOS = Web of Science; 2 VGI = volunteered geographic information.
Table 2. Items recorded for each paper.
Broad Category | Variable | Description
Main characteristics | Publication year | Paper’s publication year in the citation information
 | Study location | Country and continent of the first author’s institution
 | Study area | Country and continent to which the study area mentioned in the Materials and Methods Section of the paper belongs
 | Setting | Study area’s scale and type, such as country, city, and park
 | Research theme | Summary of the authors’ research theme/purpose/question stated in the title, keywords, and abstract
Computer vision | Task | Summary of methods used in the paper to acquire, process, analyze, and understand digital images
 | Model | Training options for the models used in realizing computer vision tasks (pretrained model, transfer learning, etc.)
 | Tool | Names of commercial services/deep learning architectures used in computer vision tasks
 | Accuracy | Accuracy verification results of computer vision analysis provided in the paper
 | Purpose of use | Summary of the specific purpose of using computer vision methods as a research step
Social media data | Platform | Name and characteristics of the social media platform from which the data originated
 | Auxiliary data | Methods to assist image analysis using the metadata (geographic location, timestamp, user information, and interactions) or textual content provided by social media
Limitations | Limitation | Biases, limitations, challenges, or gaps explicitly stated in the Methods, Discussion, and Conclusions Sections of the paper
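Table 2 in effect defines a per-paper coding schema. Purely as an illustration of that schema, and not as the authors’ actual data model, the record for one reviewed paper could be held in a small Python dataclass whose field names simply mirror the variables in Table 2; every name and comment below is an assumption for illustration.

```python
# A minimal sketch of the per-paper coding record implied by Table 2
# (hypothetical schema; field names only mirror the table).
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewedPaper:
    # Main characteristics
    publication_year: int
    study_location: str            # country/continent of the first author's institution
    study_area: str                # country/continent of the studied landscape
    setting: str                   # e.g., "city", "urban park", "national park"
    research_theme: str
    # Computer vision
    cv_task: str                   # e.g., "scene classification", "object detection"
    cv_model: str                  # "pretrained model", "transfer learning", or "other"
    cv_tool: str                   # commercial service or deep learning architecture used
    cv_purpose: str                # why computer vision was used as a research step
    cv_accuracy: Optional[float] = None   # accuracy verification result, if reported
    # Social media data
    platform: str = ""
    auxiliary_data: List[str] = field(default_factory=list)   # e.g., ["geotag", "timestamp"]
    # Limitations
    limitations: List[str] = field(default_factory=list)
```

Keeping the coded items in a structured form like this is one way the frequency counts reported later in Tables 5–7 could be reproduced with simple grouping and filtering operations.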
Table 4. Computer vision task models used in reviewed studies.
Model | Description | Example
Pretrained model | A saved network previously trained on large datasets and applied directly to the task by the authors [62]. | [63,64]
Transfer learning | A machine learning method in which a pretrained model developed for one task is reused as the starting point for a model on a second task [27]. | [44,65]
Other | Instead of relying on pretrained models, the authors propose new architectures to train models or use other computer vision algorithms to process images. | [52,60]
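The “transfer learning” option in Table 4 is typically implemented by freezing an ImageNet-pretrained backbone and training only a new classification head, as in the TensorFlow tutorial cited as [62]. The sketch below shows that pattern in minimal Keras form; the folder name, class structure, and MobileNetV2 backbone are illustrative assumptions, not a reconstruction of any reviewed study’s pipeline.

```python
# A minimal transfer-learning sketch (TensorFlow/Keras), assuming social media
# photos have been downloaded into class-labelled folders such as
# photos/forest, photos/waterfront, photos/urban_park (hypothetical paths).
import tensorflow as tf

IMG_SIZE = (224, 224)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "photos", image_size=IMG_SIZE, batch_size=32)
num_classes = len(train_ds.class_names)

# Reuse an ImageNet-pretrained backbone as a frozen feature extractor.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)  # only the new classification head is trained
```

Unfreezing some of the top backbone layers afterwards, at a lower learning rate, is the usual second fine-tuning step when the target landscape imagery differs strongly from ImageNet.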
Table 5. Social media platforms used in reviewed studies.
Platform | Introduction | Count
Flickr | An online image-sharing-based photo album. It provides a free API * for obtaining user-uploaded images with their attached tags and text and for querying by geographical location [76]. | 39
Instagram | An image-sharing-based social networking service. It enables users to upload media that can be edited using filters and annotated with hashtags and geotags. The platform places limits on image content analysis [73]. | 6
VKontakte | A Russian general social networking service that supports the sharing of geotagged images. It provides an open API * to query images by geolocation [57]. | 4
Weibo | A Chinese general social networking service that provides a location check-in function, enabling users to record their experiences and share text and images [77]. | 3
Wikiloc | A crowdsourced outdoor web service. It enables users to record and share movement tracks, which can be supplemented with comments and photographs [18]. | 2
Panoramio | An online application that relied entirely on geotagged images. It ceased operations in 2016 [78]. | 2
TripAdvisor | The most popular online information platform in the tourism field, where users can post reviews (images and text) and ratings of tourist attractions [79]. | 2
Google Maps | A web-mapping platform with user-comment functionality, enabling users to post image and text comments and ratings on specific locations [80]. | 2
Other | Includes Twitter, Foooooot, 2bulu, Sixfoot, Ramblr, Mafengwo, and an unknown Internet photo community, each of which was reported in only one study. | 7
* API = Application Programming Interface.
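Expanding on the Flickr entry in Table 5, a geotagged-photo harvest is usually a bounding-box query against the public flickr.photos.search REST endpoint. The sketch below is a hedged illustration of such a query, assuming a valid API key; the key placeholder and the example bounding box are assumptions, and production use would also need paging and rate-limit handling.

```python
# A minimal sketch of querying geotagged Flickr photos via the public REST API.
import requests

API_KEY = "YOUR_FLICKR_API_KEY"  # assumption: obtained from Flickr's developer portal
params = {
    "method": "flickr.photos.search",
    "api_key": API_KEY,
    "bbox": "139.69,35.67,139.78,35.72",    # min_lon,min_lat,max_lon,max_lat (example area)
    "has_geo": 1,
    "extras": "geo,tags,date_taken,url_m",  # coordinates, tags, timestamp, image URL
    "per_page": 250,
    "format": "json",
    "nojsoncallback": 1,
}
resp = requests.get("https://api.flickr.com/services/rest/", params=params, timeout=30)
photos = resp.json()["photos"]["photo"]

for p in photos[:5]:
    # each record carries the metadata types used as auxiliary data in Table 6
    print(p["id"], p["latitude"], p["longitude"], p["datetaken"], p["tags"][:60])
```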
Table 6. Methods of utilization of social media metadata and textual content in image analysis.
Data Type | Utilization Method | Count | Example
Geographic location | 1. Mapping landscape distribution. | 21 | [45]
 | 2. Examining the effect of geographic variables on landscape image categories through regression analysis. | 5 | [68]
 | 3. Predicting the (landscape/activity) potential of locations in the study area through machine learning modeling. | 4 | [87]
 | 4. Understanding areas of interest within the study site and visitor photography preferences in each area. | 3 | [66]
 | 5. Analyzing CES within each land type (land cover/category of protection). | 2 | [88]
 | 6. Drawing a self-organizing map based on the image content to divide the study area into spatial clusters and counting the contribution of each image category to the clusters. | 2 | [84]
Timestamp | 1. Examining the temporal distribution of photographed scenes or objects (year/quarter/month/week). | 9 | [48]
 | 2. Comparing differences in image content before and after certain events. | 2 | [85]
User information | 1. Comparing differences in photographed content between locals and tourists, and among tourists from different countries. | 6 | [34]
 | 2. Filtering images based on user sources. | 4 | [58]
 | 3. Clustering users into potential preference groups based on posted photograph content. | 3 | [60]
Interactions | Using the number of views/reposts/comments/likes as public preference variables. | 2 | [49]
Textual content | 1. Analyzing image and text content separately and then combining them to jointly characterize and evaluate the study area. | 4 | [89]
 | 2. Examining the correlation between the sentiment expressed in the text and the image content. | 3 | [90]
 | 3. Filtering images based on user-added tags. | 1 | [76]
 | 4. Training deep learning models that incorporate features from image and text data to achieve research goals. | 1 | [61]
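Several of the utilization methods in Table 6, as well as the active-user caveat discussed later in Table 7, reduce to simple grouping and filtering operations on photo metadata once image labels are available. The sketch below, assuming a hypothetical labelled_photos.csv with one row per image (photo_id, label, latitude, longitude, date_taken, user_id), is only meant to make those steps concrete.

```python
# A minimal sketch of combining computer vision labels with photo metadata
# (hypothetical input file and column names).
import pandas as pd

df = pd.read_csv("labelled_photos.csv", parse_dates=["date_taken"])

# Timestamp use (Table 6): monthly distribution of photographed scenes/objects.
monthly = (df.assign(month=df["date_taken"].dt.to_period("M"))
             .groupby(["month", "label"]).size()
             .unstack(fill_value=0))
print(monthly.tail())

# Geographic use (Table 6): a crude point map of one label's spatial distribution.
forest = df[df["label"] == "forest"]      # hypothetical label name
forest.plot.scatter(x="longitude", y="latitude", s=2)

# Active-user caveat (Table 7): collapse same-user, same-day, same-label bursts
# into a single record before counting, a common de-biasing step.
df["day"] = df["date_taken"].dt.date
deduplicated = df.drop_duplicates(subset=["user_id", "label", "day"])
print(len(df), "->", len(deduplicated), "records after collapsing daily bursts")
```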
Table 7. Main limitations stated in reviewed studies, summarized by the authors of this review.

Inherent sampling biases in social media data (76% of studies)
1. Social media users are not representative of all demographics, and some social groups, such as children and the elderly, are ignored.
2. Different platforms are preferred by different types of users; hence, the choice of platform causes differences in results.
3. Not all visitors take photographs and post them on the Internet, so social media images underrepresent real visits.
4. Users provide little or no information on age, gender, education, family, or racial origin due to privacy settings, which makes it difficult to assess the representativeness and bias of the data.
5. Users tend to upload content with more positive than negative connotations and may only take photographs in accessible areas and popular places; hence, researchers cannot obtain accurate feedback.
6. Social media popularity varies worldwide.

Pitfalls of automation with computer vision (47% of studies)
1. Results contain omissions or misidentifications, particularly for small datasets.
2. Model accuracy varies across regions, and pretrained models may not be suitable for some scenarios with strong regional characteristics.
3. Since the outputs are mostly labels rather than natural language, the analysis of labels may produce different results.
4. Training high-precision models for a study area requires significant amounts of energy and time and has high video-memory requirements.
5. Machine learning is not yet fully capable of capturing intangible aspects, such as spiritual or cultural heritage values.
6. The selection of computer vision tools affects results.

Biases in information expressed by images (36% of studies)
1. Human experience is highly personalized and subjective. Researchers cannot fully understand users’ intentions for taking and posting photographs and can only make experience-based assumptions.
2. Images cannot always convey elusive aspects, such as emotion, inspiration, or cognitive values.
3. Photography is restricted during some activities, such as biking, skiing, water-related activities, and religious acts.
4. Differences in the field of view, composition, focus, and proportion among photographs may skew the results.

Geotag-related limitations (33% of studies)
1. Photo geotags may be offset.
2. The location where the user uploaded or manually edited the photograph is not where the photograph was taken.
3. Researchers are unable to access geolocation data due to policy restrictions, or users are unwilling to provide their location.
4. A single social media platform’s spatial coverage is limited, or some areas are inaccessible to users.

Effects of active users (31% of studies)
Some active users upload images in batches in the same area on the same day, which affects the representativeness of the analysis results.

Concerns regarding repeatability (16% of studies)
1. Publishers can delete or restrict access to data on social media platforms, resulting in data loss.
2. Changes in platform policies may compromise reproducibility and the ability to monitor spatiotemporal trends.
3. The popularity of social media platforms changes over time.
4. The impact of the COVID-19 1 pandemic has caused a lack of data.

Ethical concerns (9% of studies)
Ethical issues, such as user privacy, must be considered when using public information and emerging technologies.

Cost issues (4% of studies)
High costs are incurred when using AI 2 products from providers or mining multiple image streams.

1 COVID-19 = coronavirus disease 2019; 2 AI = artificial intelligence.
