Stats

28 pages, 5284 KiB

Open Access | Feature Paper | Article

Negative Spatial Autocorrelation: One of the Most Neglected Concepts in Spatial Statistics

by Daniel A. Griffith

Stats 2019, 2(3), 388-415; https://doi.org/10.3390/stats2030027 - 15 Aug 2019

Cited by 21 | Viewed by 7849

Abstract

Negative spatial autocorrelation is one of the most neglected concepts in quantitative geography, regional science, and spatial statistics/econometrics in general. This paper focuses on, and contributes to the literature about, three reasons for this neglect: existing spatial autocorrelation quantification, the popular form of georeferenced variables studied, and the presence of both hidden negative spatial autocorrelation and mixtures of positive and negative spatial autocorrelation in georeferenced variables. The paper also presents details and insights by furnishing concrete empirical examples of negative spatial autocorrelation. These examples include multi-locational chain store market areas, the shrinking city of Detroit, Dallas-Fort Worth journey-to-work flows, and county crime data. The paper concludes by enumerating a number of future research topics that would help raise the profile of negative spatial autocorrelation in the literature.
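As a small illustration of what negative spatial autocorrelation looks like numerically, the sketch below computes Moran's I, a commonly used index that is chosen here for illustration and is not necessarily the quantification discussed in the paper. The function name `morans_i`, the binary rook (edge-sharing) neighbour weights, and the two toy 4×4 grids are all assumptions made for this example.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using binary rook (edge-sharing) weights."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0  # sum of w_ij * dev_i * dev_j over ordered neighbour pairs
    s0 = 0       # sum of all weights (number of ordered neighbour pairs)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    s0 += 1

    denom = sum(d * d for row in dev for d in row)
    return (n / s0) * (cross / denom)


# Toy data (hypothetical): a checkerboard, where every neighbour differs,
# and a two-block pattern, where most neighbours are alike.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
blocks = [[0 if c < 2 else 1 for c in range(4)] for r in range(4)]

print(round(morans_i(checkerboard), 3))  # -1.0
print(round(morans_i(blocks), 3))        # 0.667
```

The checkerboard yields the strongly negative value because dissimilar values are always adjacent, whereas the block pattern clusters similar values and yields a positive value.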

17 pages, 1920 KiB

Open Access | Article

On Generalized Slash Distributions: Representation by Hypergeometric Functions

by Peter Zörnig

Stats 2019, 2(3), 371-387; https://doi.org/10.3390/stats2030026 - 19 Jul 2019

Cited by 4 | Viewed by 2688

Abstract

The popular concept of the slash distribution is generalized by considering the quotient Z = X/Y of independent random variables X and Y, where X is any continuous random variable and Y has a general beta distribution. The density of Z can usually be expressed by means of generalized hypergeometric functions. We study the distribution of Z for various parent distributions of X and indicate a possible application in finance.
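The quotient construction Z = X/Y can be sketched by simulation. The example below assumes a standard normal parent X and a standard beta distribution on (0, 1) for Y; the paper's "general beta distribution" may be broader than this, and the function name `sample_generalized_slash` and its parameters are hypothetical. With `alpha = beta = 1`, Y is uniform on (0, 1), which corresponds to the classical slash distribution.

```python
import math
import random


def sample_generalized_slash(n, alpha, beta, seed=0):
    """Draw n samples of Z = X / Y with X ~ N(0, 1) and Y ~ Beta(alpha, beta).

    X and Y are drawn independently. Assumption: Y is a standard beta
    variable on (0, 1), not a beta on a general interval.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.gauss(0.0, 1.0)
        y = rng.betavariate(alpha, beta)
        if y == 0.0:
            continue  # guard against division by zero in a degenerate draw
        z = x / y
        if math.isfinite(z):
            samples.append(z)
    return samples


# Classical slash as a special case: Y ~ Beta(1, 1) = Uniform(0, 1).
draws = sample_generalized_slash(10_000, alpha=1.0, beta=1.0, seed=42)
share_beyond_3 = sum(abs(z) > 3.0 for z in draws) / len(draws)
print(f"share of |Z| > 3: {share_beyond_3:.3f}")
```

Because Y never exceeds 1 under this assumption, |Z| is never smaller than |X|, so the simulated Z has tails at least as heavy as those of the parent distribution.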

24 pages, 11414 KiB

Open Access | Article

Computing Happiness from Textual Data

by Emad Mohamed and Sayed A. Mostafa

Stats 2019, 2(3), 347-370; https://doi.org/10.3390/stats2030025 - 03 Jul 2019

Cited by 2 | Viewed by 3792

Abstract

In this paper, we use a corpus of about 100,000 happy moments written by people of different genders, marital statuses, parenthood statuses, and ages to explore the following questions: Are there differences between men and women, married and unmarried individuals, parents and non-parents, and people of different age groups in terms of their causes of happiness and how they express happiness? Can gender, marital status, parenthood status, and/or age be predicted from textual data expressing happiness? The first question is tackled in two steps: first, we transform the happy moments into a set of topics, lemmas, part-of-speech sequences, and dependency relations; then, we use each set as predictors in multi-variable binary and multinomial logistic regressions to rank these predictors in terms of their influence on each outcome variable (gender, marital status, parenthood status, and age). For the prediction task, we use character, lexical, grammatical, semantic, and syntactic features in a machine learning document classification approach. The classification algorithms used include logistic regression, gradient boosting, and fastText. Our results show that textual data expressing moments of happiness can be quite beneficial in understanding the “causes of happiness” for different social groups, and that social characteristics like gender, marital status, parenthood status, and, to some extent, age can be successfully predicted from such textual data. This research aims to bring together elements from philosophy and psychology to be examined by computational corpus linguistics methods in a way that promotes the use of Natural Language Processing for the Humanities.

15 pages, 401 KiB

Open Access | Article

Confidence Sets for Statistical Classification

by Wei Liu, Frank Bretz, Natchalee Srimaneekarn, Jianan Peng and Anthony J. Hayter

Stats 2019, 2(3), 332-346; https://doi.org/10.3390/stats2030024 - 30 Jun 2019

Cited by 2 | Viewed by 2821

Abstract

Classification has applications in a wide range of fields, including medicine, engineering, computer science, and the social sciences, among others. In statistical terms, classification is inference about unknown parameters, i.e., the true classes of future objects. Hence, various standard statistical approaches can be used, such as point estimators, confidence sets, and decision-theoretic approaches. For example, a classifier that classifies a future object as belonging to only one of several known classes is a point estimator. The purpose of this paper is to propose a confidence-set-based classifier that classifies a future object into a single class only when there is enough evidence to warrant this, and into several classes otherwise. By allowing classification of an object into possibly more than one class, this classifier guarantees a pre-specified proportion of correct classification among all future objects. An example is provided to illustrate the method, and a simulation study is included to highlight the desirable feature of the method.

Stats, Volume 2, Issue 3 (September 2019) – 4 articles