Article
Peer-Review Record

Different Ways Ambient and Immobile Population Distributions Influence Urban Crime Patterns

ISPRS Int. J. Geo-Inf. 2022, 11(12), 581; https://doi.org/10.3390/ijgi11120581
by Natalia Sypion-Dutkowska 1, Minxuan Lan 2,3,*, Marek Dutkowski 1 and Victoria Williams 4
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 11 August 2022 / Revised: 10 October 2022 / Accepted: 19 November 2022 / Published: 22 November 2022
(This article belongs to the Special Issue Urban Geospatial Analytics Based on Crowdsourced Data)

Round 1

Reviewer 1 Report

Nice paper overall, needing some language revisions to clarify the main concepts.

Additional revisions, requested to have the manuscript published, include

1) a broader literature review, moving toward a global perspective

2) a better description of the study area and a broader technical presentation of the methodology

3) Novelty and originality of the approach should be highlighted better.

4) Future research directions can be discussed a bit more.

8/24 additional comments:
1) Table of descriptive statistics should be checked for minor errors and typos (2015-2917???)
2) the manuscript is really concise and all my previous suggestions should be taken in due account to expand the size and content of the paper
3) please change the type of article into 'short note' or 'short communication'.
4) a broader description of indicators in Table 1 (descriptive statistics) is meaningful. What is for instance, 'demographic load'?
5) literature review should be expanded.

In general, the manuscript is well written and should be accepted pending moderate revisions.

Author Response

Nice paper overall, needing some language revisions to clarify the main concepts.

Thanks very much for acknowledging our efforts! We have made substantial changes in this revision. Please see our point-by-point responses below.

Additional revisions, requested to have the manuscript published, include

  • a broader literature review, moving toward a global perspective

Thanks for the suggestion. In this revision, we have added references and enriched the literature review to include more studies from around the world, including the USA, the UK, and Canada (lines 124-136).

“…Gerber (2014) found that the addition of Twitter-derived variables improves prediction performance for 19 of 25 crime types compared with a model based solely on historical crimes [25]. This study demonstrates the benefits of tweets for crime prediction. Bendler and colleagues (2014) used the number of points of interest (POIs) as the independent variable to simulate crime patterns in San Francisco and added the count of tweets into the model [35]. Ristea and colleagues (2017) used tweet counts as an explanatory variable to test the crime-tweets relationship [42]. Lan and colleagues conducted a series of studies to test the crime-tweets relationship and suggested that tweets are a feasible and reliable dynamic population measure [12,31,43]. Hipp and colleagues (2018) also found that tweets can help explain the presence of crime in California [11].

However, as suggested by previous studies, geotagged tweets generally show the dynamic distribution of the population with higher mobility as they are the major Twitter users [23,44-51]…”

  • a better description of the study area and a broader technical presentation of the methodology

Thanks for the suggestion. In this revision, we expanded the description of the study area (lines 179-186) and methodology (lines 264-282).

“…Szczecin is the capital of the West Pomeranian Voivodeship in northwestern Poland. It is a major seaport as it is near the Baltic Sea and the German border. In 2015, it had a population of 404,712, and this number slightly declined to 404,000 in 2017. The size of the city is 301 km², composed of 37 neighborhoods. The city lies on the delta of the Oder River and is known as the “Paris of the North”, because of the characteristic star-shaped layout of streets and squares [57]. As shown in the following figures, this city experiences crime problems as many populated cities would.”

“To use a traditional linear regression model, the dependent variable needs to follow the normal distribution. However, as “Law of Crime Concentration” and “Iron Law of Troublesome Places” suggest: few places are responsible for most of the crime, and most places do not experience any crime, so the distribution of crime is always skewed [5,6,8,9]. As evidenced in figures 1-5, crime patterns in Szczecin are also clustered. This clearly violates the basic assumption of linear regression; thus, Poisson or negative binomial regression should be used [67-69]. Poisson regression model may be used for count data like crime incidents. However, to use Poisson regression, the dependent variable’s mean needs to be equal to the variance, which is often not satisfied in crime data. Therefore, following the general practice in crime studies, we use the negative binomial regression model to assess the relationship between crime and the ambient population versus the immobile population (≤15 & >65). The negative binomial regression model has been widely used in criminology studies because it does not assume homogeneity of variance and does not require normal distribution of the dependent variable [68]. The unit of analysis is the neighborhood (N = 37). The negative binomial regression model is:

 

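$$y_i \sim \mathrm{NB}\left(\exp\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_{ik}\right), \alpha\right)$$

where NB stands for negative binomial, $y_i$ is the crime count in the ith (i = 1, . . ., n) neighborhood, $x_{ik}$ is the kth explanatory variable for neighborhood i, $\beta_k$ (k = 0, 1, . . ., p) are the coefficients, and α is the parameter of overdispersion [69].”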
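As a rough illustration of the model described above, the sketch below evaluates a negative binomial log-likelihood in plain Python. It assumes the common NB2 parameterization with a log link (mean μ_i = exp(β_0 + Σ β_k x_ik), variance μ_i + α μ_i²), which the response itself does not spell out. The function names and the toy numbers are hypothetical and are not taken from the manuscript.

```python
import math

def nb2_log_pmf(y, mu, alpha):
    """Log-probability of count y under NB2 with mean mu and overdispersion alpha."""
    r = 1.0 / alpha  # "size" parameter; variance = mu + alpha * mu**2
    return (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))

def nb2_log_likelihood(counts, rows, beta, alpha):
    """Sum of NB2 log-probabilities over neighborhoods.

    counts: crime count per neighborhood
    rows:   explanatory variables per neighborhood (without the intercept)
    beta:   coefficients, with beta[0] as the intercept
    """
    total = 0.0
    for y, x in zip(counts, rows):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
        total += nb2_log_pmf(y, math.exp(eta), alpha)  # log link: mu = exp(eta)
    return total

# Toy data for three hypothetical neighborhoods (not from the study).
toy_counts = [4, 60, 161]
toy_rows = [[0.1], [1.2], [2.0]]
toy_ll = nb2_log_likelihood(toy_counts, toy_rows, beta=[2.0, 1.5], alpha=0.5)
```

In practice the coefficients and α would be estimated by maximizing this quantity, typically with a statistics package rather than by hand.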

  • Novelty and originality of the approach should be highlighted better.

Thanks for the suggestion. In this revision, we rewrote the introduction section (lines 134-153) and Research purpose section (lines 166-173) to highlight the novelty and originality of our approach.

“However, as suggested by previous studies, geotagged tweets generally show the dynamic distribution of the population with higher mobility as they are the major Twitter users [23,44-51]. Consequently, only using geotagged tweets to study crime patterns may overlook the population with limited mobility, e.g., those who are very young (younger than 16) and are elderly (older than 65). This study fills this gap by using two data sources to include both the ambient and immobile populations in crime analysis. The first source of data is geotagged tweets, as a measure of the ambient population who have increased mobility during the day as they commute and move a lot (Group 1) [23]. The second source of data is census data which are used to locate the immobile social groups (not tweet as much) such as children (age ≤15) and the elderly (age >65), who usually stay in their residential neighborhoods (Group 2). As routine activity theory argues, crime happens when a motivated offender meets a suitable target/victim at space and time, while no capable guardian is onsite [2]. Crime pattern theory also emphasizes space’s importance by arguing that the overlapped activity spaces of both offender and victims are riskier [1]. Therefore, both the mobile population (Group 1) and the immobile population (Group 2) are possibly involved in crime in their residential neighborhoods; additionally, the mobile population (Group 1) are also possibly involved in crime in the neighborhoods they frequent and visit [11-13,31]. Thus, we feel it is necessary to fill this gap and consider both groups to assess different ways ambient and immobile population distributions influence urban crime patterns in a European country.”

“These questions reflect the level of originality and novelty of the approach used in this study. It is crucial to distinguish between two groups of the population, with varying degrees of risk of crime: mobile and immobile. This was done using two different estimation methods. The first is the number of tweets posted in a given neighborhood, which is rarely used in the region; on this basis, the ambient population was estimated. The second is census age-structure data, used to locate the immobile population. Another innovation is taking into account not only ambient and immobile population relations in the research, but also socioeconomic characteristics that are important for crime opportunities. It is also worth emphasizing the significant number of crime types being analyzed.”

  • Future research directions can be discussed a bit more.

Thanks for the suggestion. In this revision, we added further discussion of future research directions (lines 325-333 and lines 350-368).

“Due to data limitations, we are only able to conduct this study in Szczecin, Poland. More tests should be done in other European countries, or even other parts of the world, to further improve the discipline and advance knowledge. Further, we use the neighborhood as the study unit because that is the finest unit of the sociodemographic data we can get in this area. As suggested by a previous study, the study unit size may influence the results due to the modifiable areal unit problem (MAUP) [31]. Thus, if data are available, more tests may be needed at even finer spatial units, e.g., tract, block group, and block, to see if such relationships persist across different spatial scales.”

“As discussed in the earlier section, more tests in other regions at different spatial scales are recommended. Comparison studies among different cities across countries can also be beneficial, using the same method. This procedure, commonly found in experimental sciences, is seldom used in the social sciences. However, it should be taken into account that in different socioeconomic conditions, the control variables may have different meanings and require different interpretations. The overall level of crime in a country or region is also relevant in such surveys. In Poland it is relatively low, in Szczecin it is average compared to the rest of the country, and the most serious crimes with the use of weapons and murders are practically isolated cases.

The assumption that age structure and mobile abilities need to be systematically considered in crime analysis studies should also be verified in other cities, but with a similar population age structure, level and lifestyle. In more affluent countries, young people and the elderly are generally more mobile thanks to individual and collective means of transportation.

In future studies, it should be checked as well whether the applied methods of estimating the number of mobile and immobile populations reflect their real mobility. Such an experimental study, using detailed lists of residents obtained, for example, from property owners and CCTV cameras, could verify the assumptions of this method and indicate the level of possible error.”

8/24 additional comments:
1) Table of descriptive statistics should be checked for minor errors and typos (2015-2917???)

Thanks for the valuable comment. We have corrected these typos (line 262):

| Role | Variables (2015-2017) | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| Dependent | Burglary in commercial buildings | 4 | 161 | 60.22 | 40.09 |
| Dependent | Drug crime | 1 | 345 | 55.68 | 74.03 |
| Dependent | Fight and battery | 3 | 313 | 46.19 | 57.74 |
| Dependent | Property damage | 4 | 201 | 49.84 | 46.33 |
| Dependent | Theft | 9 | 1369 | 272.97 | 268.86 |
| Independent | Ambient population (Tweets) | 6 | 2322 | 723.14 | 688.43 |
| Independent | Immobile population (≤15 or >65) | 292 | 8094 | 3269.57 | 2328.62 |
| Control | Population density | 33 | 27389 | 4570.83 | 5831.37 |
| Control | Population assisted by the Municipal Family Assistance Center | 11 | 950 | 281.24 | 250.35 |
| Control | Unemployed population | 51 | 1083 | 312.78 | 237.58 |
| Control | Demographic load index | 31.10 | 66 | 45.10 | 8.82 |
| Control | Count of liquor stores | 3 | 279 | 54.70 | 52.77 |
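The descriptive statistics also allow a quick check of the overdispersion that motivates a negative binomial rather than a Poisson model. The sketch below uses only the means and standard deviations reported above for the five crime counts and computes the variance-to-mean ratio for each. The helper name `dispersion_ratio` is hypothetical.

```python
# Mean and standard deviation of each crime count, copied from the table above.
crime_stats = {
    "Burglary in commercial buildings": (60.22, 40.09),
    "Drug crime": (55.68, 74.03),
    "Fight and battery": (46.19, 57.74),
    "Property damage": (49.84, 46.33),
    "Theft": (272.97, 268.86),
}

def dispersion_ratio(mean, std_dev):
    """Variance-to-mean ratio; a Poisson variable would give roughly 1."""
    return std_dev ** 2 / mean

ratios = {name: dispersion_ratio(m, s) for name, (m, s) in crime_stats.items()}
for name, ratio in ratios.items():
    print(f"{name}: variance/mean = {ratio:.1f}")
```

All five ratios come out far above 1, which is consistent with the choice of a model that includes an overdispersion parameter.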


2) the manuscript is really concise and all my previous suggestions should be taken in due account to expand the size and content of the paper

3) please change the type of article into 'short note' or 'short communication'.

Thanks for the valuable suggestion. In this revision, we have made substantial changes and added additional information, introduction, literature review, and discussion. The manuscript is expanded from 5311 words to 6995 words. If still needed, we will be happy to discuss with the editor to explore 'short note' or 'short communication' options.


4) a broader description of indicators in Table 1 (descriptive statistics) is meaningful. What is for instance, 'demographic load'?

Thanks for the valuable suggestion. In this revision, we added a new paragraph to explicitly explain what variables are used and why they are used with necessary references (lines 246-261).

“Five dependent variables are counts of burglary in commercial buildings (commercial burglaries), drug crime, fight and battery (assault), property damage (vandalism), and theft. Two major independent variables are the tweets-derived ambient population, and the immobile population (population under age 16 or older than 65). Additional control variables are population density (a measure of population concentration), population assisted by the municipal family assistance center (people in poverty), unemployed population (people who are not employed), demographic load index (shows the burden to the society by the unproductive population), and the count of liquor stores (known to be related to many violent and property crime types) [53,61-66]. Control variables in this study are additional independent variables that are not of direct interest to this study's goals but are controlled because they have been tested by previous studies to influence crime patterns. We include them to make sure the crime-tweets relationship and crime-age structure relationship are not accidental. For each of these five dependent variables, two models are fit, one with the ambient population (tweets) and controls, another with the immobile population (≤15 or >65) and controls. Consequently, 2 × 5 = 10 models are fit.”
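The count of ten models follows from crossing the five crime types with the two population measures. The sketch below simply enumerates those specifications, under the assumption that every model carries the same five controls. The variable names and the dictionary layout are illustrative, not the authors' code.

```python
from itertools import product

crime_types = [
    "commercial burglary", "drug crime", "fight and battery",
    "property damage", "theft",
]
population_measures = [
    "ambient population (tweets)",
    "immobile population (<=15 or >65)",
]
controls = [
    "population density",
    "population assisted by the municipal family assistance center",
    "unemployed population",
    "demographic load index",
    "count of liquor stores",
]

# One specification per (crime type, population measure) pair.
model_specs = [
    {"dependent": crime, "predictors": [measure] + controls}
    for crime, measure in product(crime_types, population_measures)
]

for spec in model_specs:
    print(spec["dependent"], "~", " + ".join(spec["predictors"]))
```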


5) literature review should be expanded.

Thanks for the suggestion. In this revision, we have added additional references and enriched the literature review (lines 37-66, lines 124-153).

“To study crime patterns, various efforts have been made by scholars not only in criminology, but also geography, sociology, and social sciences at large. Geography of crime in the geography field and environmental criminology (also known as crime science) in the criminology field specifically focus on the location of crime incidents.

Theories from environmental criminology are helpful in understanding temporal and spatial crime patterns. Routine activity theory argues that crime is the result of the direct contact of three elements in space and time: a motivated offender, a suitable target, and the absence of the capable guardian [2,3]. It suggests that routine activities of people contribute to possible intersections in time and space with a potential offender, and when it happens, the probability of crime increases drastically [2].

Crime pattern theory also shows that the distribution of population in time and space is important to crime pattern studies. This theory suggests that the locations of crime incidents are not accidental, rather, they have clear spatio-temporal patterns. This theory highlights the activity spaces and routes connecting them. These activity spaces include 1) nodes – e.g., shopping centers, workplaces, schools, recreation and entertainment areas, places to meet friends, etc., 2) paths connecting these nodes, and 3) edges, which are boundaries dividing areas with various forms of management, rulership, or functions. Moving along the paths between the nodes, the criminal's awareness spaces are created. The space of action is reflected in the criminal's consciousness in the form of a cognitive space map. According to this theory, a motivated offender has contact with a relatively small part of the city areas in the course of routine everyday activities. From perceived and realized nodes, paths, and edges, he/she selects appropriate objects or victims of crime in a multi-stage decision-making process. The distribution of crime in a city depends on its spatial structure, transport system, and street networks, and is shaped by the distribution of crime generators, attractors, and detractors [1,4]. Moreover, the city crime problem usually concentrates in relatively small areas, as the “Law of Crime Concentration” and “Iron Law of Troublesome Places” suggest [5-9]. In many cities, the downtown area or Central Business Districts would fall in this category because of the concentration of population and opportunities.”

“…Gerber (2014) found that the addition of Twitter-derived variables improves prediction performance for 19 of 25 crime types compared with a model based solely on historical crimes [25]. This study demonstrates the benefits of tweets for crime prediction. Bendler and colleagues (2014) used the number of points of interest (POIs) as the independent variable to simulate crime patterns in San Francisco and added the count of tweets into the model [35]. Ristea and colleagues (2017) used tweet counts as an explanatory variable to test the crime-tweets relationship [42]. Lan and colleagues conducted a series of studies to test the crime-tweets relationship and suggested that tweets are a feasible and reliable dynamic population measure [12,31,43]. Hipp and colleagues (2018) also found that tweets can help explain the presence of crime in California [11].

However, as suggested by previous studies, geotagged tweets generally show the dynamic distribution of the population with higher mobility as they are the major Twitter users [23,44-51]. Consequently, only using geotagged tweets to study crime patterns may overlook the population with limited mobility, e.g., those who are very young (younger than 16) and are elderly (older than 65). This study fills this gap by using two data sources to include both the ambient and immobile populations in crime analysis. The first source of data is geotagged tweets, as a measure of the ambient population who have increased mobility during the day as they commute and move a lot (Group 1) [23]. The second source of data is census data which are used to locate the immobile social groups (not tweet as much) such as children (age ≤15) and the elderly (age >65), who usually stay in their residential neighborhoods (Group 2). As routine activity theory argues, crime happens when a motivated offender meets a suitable target/victim at space and time, while no capable guardian is onsite [2]. Crime pattern theory also emphasizes space’s importance by arguing that the overlapped activity spaces of both offender and victims are riskier [1]. Therefore, both the mobile population (Group 1) and the immobile population (Group 2) are possibly involved in crime in their residential neighborhoods; additionally, the mobile population (Group 1) are also possibly involved in crime in the neighborhoods they frequent and visit [11-13,31]. Thus, we feel it is necessary to fill this gap and consider both groups to assess different ways ambient and immobile population distributions influence urban crime patterns in a European country.”

In general, the manuscript is well written and should be accepted pending moderate revisions.

Thank you very much for the valuable suggestions and comments!


Reviewer 2 Report

The article proposes a way of estimating the ambient and immobile urban population using geotagged tweets and age structure from census data, and tests how they are related to some urban crime patterns. The crime and population data used in the paper relate to the city of Szczecin in Poland.

Even if the topic of the paper is interesting and the problem is well introduced, my main concern with the paper is the lack of detail on numerous aspects of the proposed method. In particular, the negative binomial regression model is never presented in detail. It may be widely used in criminology, but it may be unknown to readers from other domains. It is also not clear why some variables have been added and treated as controls. How are they used in the method? How are the numbers presented in Table 2 for these different variables computed?

It is also mentioned that 37 neighborhoods have been used. How? What are the main characteristics and locations of these neighborhoods? Are they defined by administrative boundaries? If so, why are they a good fit for this kind of analysis? Administrative limits are usually in no way related to human activities on the ground.

Table 2 is also unclear. It mentions 10 models? Why 10? What are these models? Is it just a focus on 10 neighborhoods? 

I also have concerns about the conclusion of the paper. As the experiment is limited to the city of Szczecin in Poland, it is impossible to generalize and conclude that age structure and mobile abilities have to be systematically considered in crime analysis studies. Yes, there are some correlations in this area, but is that the case in another city or country? The paper would gain significantly in soundness if the experiment were reproduced in various other cities in the world.

Author Response

The article proposes a way of estimating the ambient and immobile urban population using geotagged tweets and age structure from census data, and tests how they are related to some urban crime patterns. The crime and population data used in the paper relate to the city of Szczecin in Poland.

Thanks for the valuable comments! We have made substantial changes in this revision. Please see our point-by-point responses below.

Even if the topic of the paper is interesting and the problem is well introduced, my main concern with the paper is the lack of detail on numerous aspects of the proposed method. In particular, the negative binomial regression model is never presented in detail. It may be widely used in criminology, but it may be unknown to readers from other domains.

Thanks for the valuable comments! In this revision, we explained in detail why the negative binomial regression model is used and how it works (lines 264-282): “To use a traditional linear regression model, the dependent variable needs to follow the normal distribution. However, as “Law of Crime Concentration” and “Iron Law of Troublesome Places” suggest: few places are responsible for most of the crime, and most places do not experience any crime, so the distribution of crime is always skewed [5,6,8,9]. As evidenced in figures 1-5, crime patterns in Szczecin are also clustered. This clearly violates the basic assumption of linear regression; thus, Poisson or negative binomial regression should be used [67-69]. Poisson regression model may be used for count data like crime incidents. However, to use Poisson regression, the dependent variable’s mean needs to be equal to the variance, which is often not satisfied in crime data. Therefore, following the general practice in crime studies, we use the negative binomial regression model to assess the relationship between crime and the ambient population versus the immobile population (≤15 & >65). The negative binomial regression model has been widely used in criminology studies because it does not assume homogeneity of variance and does not require normal distribution of the dependent variable [68]. The unit of analysis is the neighborhood (N = 37). The negative binomial regression model is:

$$y_i \sim \mathrm{NB}\left(\exp\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_{ik}\right), \alpha\right)$$

where NB stands for negative binomial, $y_i$ is the crime count in the ith (i = 1, . . ., n) neighborhood, $x_{ik}$ is the kth explanatory variable for neighborhood i, $\beta_k$ (k = 0, 1, . . ., p) are the coefficients, and α is the parameter of overdispersion [69].”

It is also not clear why some variables have been added and treated as controls. How are they used in the method? How are the numbers presented in Table 2 for these different variables computed?

Thanks for the valuable suggestion and comment. Control variables in this study are simply additional independent variables that are not of direct interest to this study's goals but are included in the regression models because previous studies have shown that they influence crime patterns. We need to include them in the model to show that the relationships between the dependent and major independent variables are not accidental. In this revision, we added a new paragraph to explicitly explain what control variables are used and why they are used, with necessary references (lines 250-258): “…Additional control variables are population density (a measure of population concentration), population assisted by the municipal family assistance center (people in poverty), unemployed population (people who are not employed), demographic load index (shows the burden to the society by the unproductive population), and the count of liquor stores (known to be related to many violent and property crime types) [53,61-66]. Control variables in this study are additional independent variables that are not of direct interest to this study's goals but are controlled because they have been tested by previous studies to influence crime patterns. We include them to make sure the crime-tweets relationship and crime-age structure relationship are not accidental...”

It is also mentioned that 37 neighborhoods have been used. How? What are the main characteristics and locations of these neighborhoods? Are they defined by administrative boundaries? If so, why are they a good fit for this kind of analysis? Administrative limits are usually in no way related to human activities on the ground.

Thanks for the valuable comment. Because of data availability, the neighborhood is the smallest unit we can find in this city that contains socioeconomic factors such as poverty and population density. In this revision, we added sentences to explain why neighborhoods are used as the study unit (lines 235-238): “…The neighborhood is the finest unit at which sociodemographic data are available in this city. Therefore, though we can collect point-level crime data and tweets, we have to aggregate them to the neighborhood level to compare them with age structure and other sociodemographic factors...”

We also added further discussion of the study unit in lines 328-333: “…Further, we use the neighborhood as the study unit because that is the finest unit of the sociodemographic data we can get in this area. As suggested by a previous study, the study unit size may influence the results due to the modifiable areal unit problem (MAUP) [31]. Thus, if data are available, more tests may be needed at even finer spatial units, e.g., tract, block group, and block, to see if such relationships persist across different spatial scales.”
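The aggregation of point-level crimes and tweets to neighborhoods described above can be sketched with a simple point-in-polygon count. This is a minimal illustration only: the two square "neighborhoods" and the point coordinates are made up, and a real workflow would more likely rely on GIS software and the actual administrative boundaries.

```python
from collections import Counter

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_by_neighborhood(points, neighborhoods):
    """Count points per neighborhood; points outside every polygon are dropped."""
    counts = Counter({name: 0 for name in neighborhoods})
    for x, y in points:
        for name, polygon in neighborhoods.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break
    return counts

# Hypothetical neighborhoods (unit-free squares) and event locations.
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
crime_points = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]
crime_counts = count_by_neighborhood(crime_points, neighborhoods)
```

The same function could be applied to geotagged tweets, giving one crime count and one tweet count per neighborhood to set beside the census variables.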

Table 2 is also unclear. It mentions 10 models? Why 10? What are these models? Is it just a focus on 10 neighborhoods? 

Thanks for the valuable comment. In this revision, we clarified this by adding additional sentences to explain why 10 models are used (lines 246-261 and lines 286-289): “…Five dependent variables are counts of burglary in commercial buildings (commercial burglaries), drug crime, fight and battery (assault), property damage (vandalism), and theft... For each of these five dependent variables, two models are fit, one with the ambient population (tweets) and controls, another with the immobile population (≤15 or >65) and controls. Consequently, 2 × 5 = 10 models are fit.” & “As specified in the methodology section, each of these five crime types is fit with two different models: one with the ambient population (tweets), and one with the immobile population (≤15 or >65). All models also contain necessary control socioeconomic variables, and the sample size of each model is 37...”

I also have concerns about the conclusion of the paper. As the experiment is limited to the city of Szczecin in Poland, it is impossible to generalize and conclude that age structure and mobile abilities have to be systematically considered in crime analysis studies. Yes, there are some correlations in this area, but is that the case in another city or country? The paper would gain significantly in soundness if the experiment were reproduced in various other cities in the world.

Thank you very much for the valuable comments and suggestions. We acknowledge this limitation and have added paragraphs in this revision to discuss it (lines 325-328 & 350-368). We are also exploring the possibility of expanding this work to other cities and regions in follow-up projects.

“Due to data limitations, we are only able to conduct this study in Szczecin, Poland. More tests should be done in other European countries, or even other parts of the world, to further improve the discipline and advance knowledge…”

“As discussed in the earlier section, more tests in other regions at different spatial scales are recommended. Comparison studies among different cities across countries can also be beneficial, using the same method. This procedure, commonly found in experimental sciences, is seldom used in the social sciences. However, it should be taken into account that in different socioeconomic conditions, the control variables may have different meanings and require different interpretations. The overall level of crime in a country or region is also relevant in such surveys. In Poland it is relatively low, in Szczecin it is average compared to the rest of the country, and the most serious crimes with the use of weapons and murders are practically isolated cases.

The assumption that age structure and mobile abilities need to be systematically considered in crime analysis studies should also be verified in other cities, but with a similar population age structure, level and lifestyle. In more affluent countries, young people and the elderly are generally more mobile thanks to individual and collective means of transportation.

In future studies, it should be checked as well whether the applied methods of estimating the number of mobile and immobile populations reflect their real mobility. Such an experimental study, using detailed lists of residents obtained, for example, from property owners and CCTV cameras, could verify the assumptions of this method and indicate the level of possible error.”


Reviewer 3 Report

Mobility research from geolocated tweets is an interesting and proven methodology for younger age groups. Considering also data from other "immobile" age groups is appropriate.

I think that the bibliographic references and the methodological explanation could have been expanded.

Author Response

Mobility research from geolocated tweets is an interesting and proven methodology for younger age groups. Considering also data from other "immobile" age groups is appropriate.

Thanks very much for acknowledging our efforts! We have made substantial changes in this revision. Please see our point-by-point responses below.

I think that the bibliographic references and the methodological explanation could have been expanded.

Thanks for the suggestion! In this revision, we have added 11 additional references and enriched the literature review (lines 37-66, lines 124-153).

“To study crime patterns, various efforts have been made by scholars not only in criminology, but also geography, sociology, and social sciences at large. Geography of crime in the geography field and environmental criminology (also known as crime science) in the criminology field specifically focus on the location of crime incidents.

Theories from environmental criminology are helpful in understanding temporal and spatial crime patterns. Routine activity theory argues that crime is the result of the direct contact of three elements in space and time: a motivated offender, a suitable target, and the absence of a capable guardian [2,3]. It suggests that people's routine activities contribute to possible intersections in time and space with a potential offender, and when this happens, the probability of crime increases drastically [2].

Crime pattern theory also shows that the distribution of population in time and space is important to crime pattern studies. This theory suggests that the locations of crime incidents are not accidental; rather, they have clear spatio-temporal patterns. It highlights activity spaces and the routes connecting them. These activity spaces include 1) nodes – e.g., shopping centers, workplaces, schools, recreation and entertainment areas, places to meet friends, etc., 2) paths connecting these nodes, and 3) edges, which are boundaries dividing areas with various forms of management, rulership, or functions. As offenders move along the paths between the nodes, their awareness spaces are created. The space of action is reflected in the offender's consciousness in the form of a cognitive space map. According to this theory, a motivated offender has contact with a relatively small part of the city in the course of routine everyday activities. From perceived and realized nodes, paths, and edges, the offender selects suitable objects or victims of crime in a multi-stage decision-making process. The distribution of crime in a city depends on its spatial structure, transport system, and street networks, and is shaped by the distribution of crime generators, attractors, and detractors [1,4]. Moreover, a city's crime problem usually concentrates in relatively small areas, as the “Law of Crime Concentration” and the “Iron Law of Troublesome Places” suggest [5-9]. In many cities, the downtown area or Central Business District falls into this category because of the concentration of population and opportunities.”

“…Gerber (2014) found that adding Twitter-derived variables improved prediction performance for 19 of 25 crime types compared with a model based solely on historical crimes [25]. This study demonstrates the benefits of tweets for crime prediction. Bendler and colleagues (2014) used the number of points of interest (POIs) as the independent variable to simulate crime patterns in San Francisco and added the count of tweets to the model [35]. Ristea and colleagues (2017) used tweet counts as an explanatory variable to test the crime-tweets relationship [42]. Lan and colleagues conducted a series of studies testing the crime-tweets relationship and suggested that tweets are a reliable, feasible measure of the dynamic population [12,31,43]. Hipp and colleagues (2018) also found that tweets can help explain the presence of crime in California [11].

However, as suggested by previous studies, geotagged tweets generally show the dynamic distribution of the population with higher mobility, because this group accounts for most Twitter users [23,44-51]. Consequently, using only geotagged tweets to study crime patterns may overlook the population with limited mobility, e.g., those who are very young (younger than 16) or elderly (older than 65). This study fills this gap by using two data sources to include both the ambient and immobile populations in crime analysis. The first source of data is geotagged tweets, as a measure of the ambient population, who have increased mobility during the day as they commute and move around a lot (Group 1) [23]. The second source of data is census data, which are used to locate the immobile social groups (who do not tweet as much), such as children (age ≤15) and the elderly (age >65), who usually stay in their residential neighborhoods (Group 2). As routine activity theory argues, crime happens when a motivated offender meets a suitable target/victim in space and time while no capable guardian is on site [2]. Crime pattern theory also emphasizes the importance of space by arguing that the overlapping activity spaces of offenders and victims are riskier [1]. Therefore, both the mobile population (Group 1) and the immobile population (Group 2) are possibly involved in crime in their residential neighborhoods; additionally, the mobile population (Group 1) is also possibly involved in crime in the neighborhoods they frequent and visit [11-13,31]. Thus, we feel it is necessary to fill this gap and consider both groups to assess the different ways ambient and immobile population distributions influence urban crime patterns in a European country.”

We also further expanded the methodology section (lines 264-282): “To use a traditional linear regression model, the dependent variable needs to follow the normal distribution. However, as the “Law of Crime Concentration” and the “Iron Law of Troublesome Places” suggest, few places are responsible for most of the crime, and most places do not experience any crime, so the distribution of crime is always skewed [5,6,8,9]. As evidenced in Figures 1-5, crime patterns in Szczecin are also clustered. This clearly violates the basic assumption of linear regression; thus, Poisson or negative binomial regression should be used [67-69]. A Poisson regression model may be used for count data such as crime incidents. However, to use Poisson regression, the dependent variable’s mean needs to be equal to its variance, a condition that is often not satisfied in crime data. Therefore, following the general practice in crime studies, we use the negative binomial regression model to assess the relationship between crime and the ambient population versus the immobile population (≤15 & >65). The negative binomial regression model has been widely used in criminology studies because it does not assume homogeneity of variance and does not require a normal distribution of the dependent variable [68]. The unit of analysis is the neighborhood (N = 37). The negative binomial regression model is:

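$$y_i \sim \mathrm{NB}\left(\exp\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_{ik}\right),\ \alpha\right)$$

where NB stands for negative binomial, $y_i$ is the crime count in the $i$th ($i = 1, \ldots, n$) neighborhood, $x_{ik}$ is the $k$th explanatory variable for neighborhood $i$, $\beta_k$ ($k = 0, 1, \ldots, p$) are the coefficients, and $\alpha$ is the parameter of overdispersion [69].”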
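The modelling choice described in this response can be illustrated with a minimal sketch. It is not the authors' code: it assumes the common NB2 parameterization with a log link, in which the variance is mu + alpha * mu^2, and the function names and the example counts are hypothetical, not data from the study.

```python
import math


def expected_count(beta, x):
    """Mean crime count under a log link: exp(beta_0 + sum_k beta_k * x_k).

    `beta` holds the intercept followed by one coefficient per explanatory
    variable in `x` (an assumed layout, chosen for this sketch).
    """
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return math.exp(eta)


def nb2_log_pmf(y, mu, alpha):
    """Log-probability of count y under NB2 with mean mu and overdispersion alpha.

    Assumes alpha > 0; the variance is mu + alpha * mu**2, so the model
    approaches the Poisson case as alpha shrinks toward zero.
    """
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Values well above 1 suggest overdispersion, i.e. the Poisson
    assumption of equal mean and variance is unlikely to hold.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean


# Hypothetical, highly clustered neighborhood crime counts (illustration only).
example_counts = [0, 0, 1, 0, 2, 0, 15, 0, 1, 40]
ratio = dispersion_ratio(example_counts)
```

A ratio far above 1 for counts like these is the kind of evidence that motivates choosing negative binomial over Poisson regression; in practice the coefficients and alpha would be estimated with a statistics package rather than written by hand.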

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thanks for the detailed revisions.

Reviewer 2 Report

All comments and remarks have been appropriately addressed by the authors. Thanks for the clear document which details the modifications performed. On my side, the paper could now be published.
