Article

Models of Wealth and Inequality Using Fiscal Microdata: Distribution in Spain from 2015 to 2020

by Ignacio González García and Alfonso Mateos Caballero *,†
Artificial Intelligence Department, E.T.S. de Ingenieros Informáticos, Universidad Politécnica de Madrid, 28660 Madrid, Spain
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2021, 9(4), 377; https://doi.org/10.3390/math9040377
Submission received: 12 January 2021 / Revised: 2 February 2021 / Accepted: 3 February 2021 / Published: 13 February 2021

Abstract

In this research, we used Spanish wealth distribution microdata for the period 2015–2020 to provide a general framework for comparing different models and explaining different empirical datasets related to wealth distribution. We present a methodology to output the current value of assets and participations held by the population in order to calculate their real and current distribution. We propose a new methodology for mixture analysis, whereby we identify and analyze subpopulations and then go on to study their influence on wealth distribution. We use concepts of symmetry to identify two internal processes that are characteristic of the wealth accumulation process for the subpopulations of entrepreneurs and non-entrepreneurs. Finally, we propose a method to adjust these results to other empirical data in other countries and periods, providing a methodology for comparing results output with differing data granularity.

1. Introduction

The discipline of econophysics originated during the 1970s and has evolved significantly over a period of just 50 years [1,2,3,4]. From the very first, it was effectively applied to economic problems, such as the puzzle of excess volatility, where other consolidated theories such as dynamic stochastic general equilibrium had proved insufficient ([5], p. 2), and strange regularities as observed in the distribution of the number of workers of companies [6,7] or the study of wealth and income.
Economists’ traditional interest in wealth distribution resurged in the 1970s. Some studied the question from the normative point of view [8,9], and many other pragmatists considered its effects on growth [10]. With the turn of the century, the interest in inequality [11,12] extended the focus beyond the academic sphere, renewing econophysicists’ interest in the problem [13,14].
As is well known, Pareto observed that the distribution of the number of individuals with income above a specified level presented a strange regularity. In 1953, Champernowne [15] proposed a model according to which a taxpayer’s annual income was a function of the preceding year’s income and a stochastic factor. Gibrat [16] used data on company size as a variable to study the proportional effects leading to lognormal distributions. This was the first attempt to explain this specific skewed pattern of distribution in stochastic terms. He showed that the cumulative effect of a succession of independent random multiplicative shocks would generate a variable whose logarithm would be normally distributed regardless of the distribution that governed the shocks. The idea is sometimes explained by saying that company growth is independent of company size.
These ideas inspired countless empirical studies [17,18,19,20]. Some tried to confirm Pareto’s law with a better fit [21,22,23,24], and others attempted disproof by counterexample [25]. It has been consistently found in ln–ln plots of the survival curve (the complement of the cumulative curve) that 90–95% of incomes (above a threshold) can be fitted to a lognormal distribution, whereas the richest 5–10% fit a Pareto distribution.
The cause of the duality is an enigma. According to [26], the upper tail of the income distribution has long been regarded as a source of fascination for economists (for a recent review, see [27]).
Scholars have focused on three aspects:
  • Distributions. Many researchers have tried to achieve a comprehensive unique and universal fit of the observed density or distribution functions with: biparametric (lognormal, gamma, Weibull) distributions, triparametric (generalized gamma, Singh-Maddala or Dagum) distributions [28], and pentaparametric (generalized beta) distributions [29]. Economists have preferred lognormal distributions [30], while statisticians [31] and econophysicists [32] have shown the utility of gamma distributions. Gibbs’ distributions have been used to fit cumulative data.
  • Models. They have been investigated by scholars who believe that it is plausible for a unique simple physical law or a stochastic model to explain the observed regularity. For a complete review of the models, see [18].
  • Subpopulations. Some studies, interested in the design of effective policies, have attempted to adjust the density function as a mixture of component subpopulations of wage earners, pensioners, etc. [33].
Many questions remain open today:
  • What are the underlying causes of the distribution?
  • How is it possible that such skewed distributions are observed in competitive capitalist markets that should be efficient?
  • How many different groups are there within the society? [34,35].
  • What are the processes governing the constitution of existing subgroups, and their influence on wealth and income distributions?
In this research we have had the opportunity to process microdata, which provide a very detailed picture of wealth in Spain over the period 2015–2020. The aim was to:
  • Determine which of the income and wealth variables best determines the form of a universal law of wealth distribution provided there is access to microdata.
  • Identify the subgroups existing in a society whose mix determines the wealth distribution and find the distribution of each subgroup in the case of Spain.
  • Model the wealth generation processes of the subgroups.
  • Quantify the deviations and errors that occur in estimating income and wealth caused by the involuntary use of variables with partial or biased information.
  • Explain the emergence of the shape of the upper tail of the wealth distribution.
This paper has four sections. In Section 2, we describe the methodology, with special attention to the method used to obtain indirect wealth from tax declaration microdata. In Section 3, we apply the methodology to data on the wealth distribution in Spain during the period 2015–2020. We present the results of the method used to identify the mixture of subpopulations and explain their effects on the observable distribution using the concept of symmetry. In Section 4, we summarize the results obtained.

2. Methodology

The study of wealth and inequality has traditionally come up against a systematic scarcity of detailed data, because: (i) statistical data collection is designed according to the objectives of each survey, and it is not easy to adapt the collected data items for other purposes; (ii) tax data are subject to special protection and are not easily accessible for academia; (iii) wealth is hidden; (iv) the value of wealth components is volatile; (v) some wealth is associated with legal persons, like companies, and the task of associating such assets with their holders is complex, and, last but not least, (vi) considerable computational power is required to process all the data on all forms of wealth.
The methodology used in this research is divided into four phases: (1) data collection; (2) variable selection; (3) decision on data binarization; and (4) analysis.
1. Data collection. We used detailed data from the declarations presented to the Spanish State Tax Agency (AEAT): taxpayer census, information submitted by public bodies (e.g., cadastre), tax (personal income tax, corporate tax, inheritance tax) self-assessments, as well as informative declarations submitted by insurance companies and financial institutions (the authors are solely responsible for their conclusions, which have not been reviewed by AEAT).
While some authors [26] have already used income tax microdata, this study increases the level of detail. We selected the variables of interest from many tax declarations, including, for example, the gross tax base from Income Tax Form 100; data on partners and % participation in other companies from Corporate Tax Form 200, and data on assets and liabilities from other forms, such as F 720, F 174, F 196, F 714 and F 189. All the forms and variables used are described on the AEAT web page (AEAT, 2020).
The taxpayer declarations filed in an average year (2018) include 19.98 million for income tax, 1.59 million for corporate tax and 207,225 for inheritance tax. A total of 94.7 million current accounts, and loans worth around 139.5 million euros are processed annually, using several hundreds of millions of data items sourced from the Spanish cadastre and from the informative declarations provided by financial institutions. These data are included in the AEAT annual reports [36].
We have studied three variables:
(a) Income. Gross tax base, declared on Form 100.
(b) Assets. This is the sum of direct assets (value of property, financial products, deposits, insurance policies, shares in listed companies, etc.) and indirect (or corporate) assets, that is, the value of the % participation in the assets of non-listed companies.
(c) Net worth ($NW$). Assets minus liabilities (loans and accounts payable). It is the sum of two components, $DNW$ (direct net worth) and $INW$ (indirect net worth).
For each inhabitant, indexed by $i$, $i \in \{1, \ldots, N\}$ ($N \approx 59.73$ M),
$$NW_i = \sum_{j \in DA_i} DNW_{ij} + \sum_{k \in C_i} INW_{ik}, \qquad (1)$$
where $DA_i$ is the set of direct assets belonging to inhabitant $i$ and $C_i$ is the set of companies in which inhabitant $i$ has a share.
The calculation of the components of indirect wealth is complex because taxpayers may or may not have participations in non-listed companies, which may or may not hold shares in other companies and so on. It is calculated according to two different processes explained using the toy example in Figure 1 with three natural persons and six companies {N, A, B, C, D, E}.
Following the example of Natural Person 1, we calculate the direct wealth of taxpayers from the declarations presented by the taxpayer and by others (e.g., financial institutions). Indirect wealth is calculated by accumulating the market value of participations in non-listed companies {A, B, C, D} that are downstream in the flow of relations between participations. In this example, Natural Person 2 is a shareholder of A only. The relationship between company A, Natural Person 1 and Natural Person 2 is declared by A on page 2 of Form 200. The relationship between A and {B, C} is declared by A on page 24 of Form 200 and includes the % participation and its value. This information can be crossed with the information provided by B and C.
The process is divided into two phases:
(a) Calculate the % participation of each taxpayer in each company. This is a network-based calculation using the information provided by companies on pages 2 and 24 of Form 200. For example, the % participation of Natural Person 1 in D is obtained by adding two components along paths A-B-D and A-C-D, giving a total of 0.25, as illustrated in Figure 1d.
(b) Calculate the target variable (assets or net worth). We used balance-sheet data declared on Form 200 to obtain, for each network node, two sets of values: (i) linked assets and liabilities; (ii) non-linked assets and liabilities (that is, the balance-sheet items that appear due to company participation in other companies) as illustrated in Figure 1 for the example.
Now,
$$INW_i = \sum_{c \in C_i} \sum_{p \in P_{ic}} \prod_{l \in p} \%PAR_l \, NLNW_l, \qquad (2)$$
where $C_i$ is the set of companies in which inhabitant $i$ has a share, $P_{ic}$ is the set of all the paths from $i$ to $c$ and $l$ is a company along the path $p$.
The indirect net worth of a taxpayer’s participation in a company is the sum, for the companies, indexed by $c$, in which it has a share, and for all the paths, indexed by $p$, that connect the taxpayer with a company, of the product of % participations $PAR$ multiplied by the non-linked net worth ($NLNW$) of these participations in the $L$ companies along a path. The non-linked assets (referred to in Spanish as “no vinculados”) are calculated using data from Forms 200 and 220, including the balance sheet items. A close approximation to the linked value can be calculated from the balance sheet by deducting the value declared under items 00118, 00153, 00160 referring to assets and items 00223, 00238, 00243 referring to group and partner companies from the company assets. It was decided that the value of a taxpayer’s urban property is, in each case, the maximum of three available values: (a) the value declared on Form 714 (Boxes A, A1, C, D and M), the cadastral value of the property plus the value declared on Form 720 [37].
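The two-phase calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ownership stakes and the non-linked net worth values are hypothetical (they are not the values of Figure 1), the ownership graph is assumed to contain no cross-holdings, and the second phase uses a simple look-through valuation (effective share times the company's non-linked net worth), which is one possible reading of the procedure rather than the exact computation used in this research. The names `STAKES`, `NLNW`, `effective_participation` and `indirect_net_worth` are invented for the example.

```python
from collections import defaultdict

# Hypothetical ownership stakes (owner -> {company: fraction held}).
# Illustrative numbers only, chosen so that the two paths from "NP1"
# to company "D" (A-B-D and A-C-D) add up to 0.25.
STAKES = {
    "NP1": {"A": 0.5},
    "NP2": {"A": 0.5},
    "A": {"B": 0.5, "C": 0.5},
    "B": {"D": 0.5},
    "C": {"D": 0.5},
}

# Hypothetical non-linked net worth of each company, in euros.
NLNW = {"A": 100_000.0, "B": 40_000.0, "C": 60_000.0, "D": 200_000.0}


def effective_participation(owner, stakes):
    """Sum, over every ownership path, the product of the stakes along it.

    Assumes the ownership graph has no cycles (no cross-holdings).
    """
    totals = defaultdict(float)

    def walk(node, share):
        for company, fraction in stakes.get(node, {}).items():
            held = share * fraction
            totals[company] += held
            walk(company, held)

    walk(owner, 1.0)
    return dict(totals)


def indirect_net_worth(owner, stakes, nlnw):
    """Look-through value: effective share times each company's NLNW."""
    shares = effective_participation(owner, stakes)
    return sum(share * nlnw[company] for company, share in shares.items())


shares_np1 = effective_participation("NP1", STAKES)
inw_np1 = indirect_net_worth("NP1", STAKES, NLNW)
```

With these hypothetical stakes, `shares_np1["D"]` is 0.125 + 0.125 = 0.25, mirroring the two-path sum of the toy example, and `inw_np1` is 125,000 euros.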
2. Variable selection. We collected and studied the descriptive statistics of candidate variables: income, assets, net worth, net worth of the net holders of wealth, etc. We then examined the limitations derived from technical taxation issues, such as exemptions from declaration at specified wealth intervals or special regulations, and decided that the best theoretical option for gaining a clear understanding of reality would be total value of assets.
3. Decision on the optimal binarization of the data. We decided on the optimal binarization (binning) strategy. The wealth range is $[10^0; 10^{10}]$ with a mean, $\mu$, of the order of $5 \times 10^5$, within which both high and low incomes have to be closely scrutinized.
4. Analysis. Although different distributions have been studied (see Section 1 above), biparametric distributions have been most often used in meta-analysis. Normal, lognormal, Pareto and, occasionally, gamma distributions have been used to compare the quality of the fits against distributions with more parameters.
As a reminder, Kleiber and Kotz [38] studied at length the types of this family of distributions in which Mandelbrot [39] singled out a strong form, expressed in Equation (3), where $m_0$ is a scale factor, and the value of $\alpha$ is not determined.
(a) Pareto I distribution
$$\pi(m) = \begin{cases} (m/m_0)^{-\alpha} & \text{when } m > m_0 \\ 1 & \text{when } m \le m_0 \end{cases} = \int_m^{\infty} P(m')\,dm', \qquad (3)$$
or alternatively, using the survival function, which is the complement of the distribution function $F(x)$:
$$S(x) = 1 - F(x) = \left(\frac{m_0}{x}\right)^{\alpha}, \quad x > m_0. \qquad (4)$$
Pareto’s law [40] considers the variable in the positive range (even in the knowledge that there are people who are net debtors, he studied wealth above the survival threshold), where the higher the Pareto index, the smaller the proportion of very rich. The value output by modern empirical studies is close to 2, bigger than in older studies [19,41].
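As an illustration of how a Pareto index can be estimated from individual records, the sketch below draws synthetic Pareto I data by inverting the survival function $(m_0/x)^{\alpha}$ and recovers $\alpha$ with the maximum-likelihood (Hill) estimator. The Hill estimator is a standard technique swapped in for the example; it is not claimed to be the estimation method used in this research. The sample size, seed, scale and function names are assumptions.

```python
import math
import random


def sample_pareto(alpha, m0, size, rng):
    """Draw Pareto I samples by inverting S(x) = (m0 / x) ** alpha."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [m0 * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]


def hill_estimator(values, threshold):
    """Maximum-likelihood estimate of alpha from the values above threshold."""
    tail = [v for v in values if v > threshold]
    if not tail:
        raise ValueError("no observations above the threshold")
    return len(tail) / sum(math.log(v / threshold) for v in tail)


rng = random.Random(0)
# Synthetic wealth with a true index of 2 and a scale of 100,000 (assumed).
wealth = sample_pareto(alpha=2.0, m0=100_000.0, size=50_000, rng=rng)
alpha_hat = hill_estimator(wealth, threshold=100_000.0)
alpha_hat_high = hill_estimator(wealth, threshold=300_000.0)
```

For exact Pareto data both estimates stay close to the true index whatever the threshold; on real data, a drift of the estimate as the threshold moves indicates that the lower part of the sample does not follow the Pareto form.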
(b) Lognormal distribution
This type of distribution is intuitively applicable and is endorsed by Gibrat’s research, because the distribution of any manifestation of wealth is universally asymmetrical [42].
From another point of view, note that Koch [17,25,43,44,45,46,47,48] observed that asymmetry is a possible exponential trace. The exponential increase in the wealth of the richest could be studied by analogy with biological phenomena [42].
5. Analysis of the population mix. Pareto [49] observed that “society is not homogeneous”. This led him to think that inequality would be understood by studying the shape of the distribution of the total population. We now know that this intuition was true and that the Pareto index is an indicator of this form. While in a conventional exploratory study, we have to infer the number of subpopulations in the mix, knowledge of multiple covariates for each taxpayer, like age, activity, and income details, as well as expert knowledge, greatly facilitates the identification and choice of the “natural” subpopulations.
In a mixture experiment, the experimenter selects a number of different mixtures and varies the proportions of two or more of their components. If the number of components is $q$ and $x_i$ is the proportion of the $i$-th component, then each proportion lies between 0 and 1 and the proportions sum to 1. Due to these constraints, the factor space that contains the $q$ components consists of all the points within the bounds of a $(q-1)$-simplex. The points inside the simplex have positive components, and its centroid describes the mixture with an equal proportion of each component.
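The constraints on the mixture proportions can be stated compactly in code. The following minimal sketch checks whether a vector of proportions lies in the simplex and builds its centroid; the function names and the numerical tolerance are assumptions made for the example.

```python
def in_simplex(proportions, tol=1e-9):
    """True if every proportion lies in [0, 1] and they sum to 1."""
    return (all(-tol <= p <= 1.0 + tol for p in proportions)
            and abs(sum(proportions) - 1.0) <= tol)


def centroid(q):
    """Equal-proportion mixture of q components."""
    return [1.0 / q] * q


equal_mix = centroid(3)            # three components, one third each
valid = in_simplex(equal_mix)      # proportions sum to 1
invalid = in_simplex([0.7, 0.4])   # proportions sum to 1.1
```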
Having identified the factor space, we could ascertain which mixture of populations best explains the empirically observed distribution. If society is made up of $K$ subpopulations and the density function of each subpopulation is $f_k(\gamma; \theta_k)$, where $\theta_k$ is a parameter vector, then the density function of the population is
$$f(\gamma; \theta) = \sum_{k=1}^{K} \pi_k f_k(\gamma; \theta_k), \qquad (5)$$
where $\pi_k$ is the proportion of each subpopulation, with $0 \le \pi_k \le 1$, and the proportions sum to 1. The population density of the model is approximated by estimating the parameter vector, without knowing to which of the groups each member belongs. If the number of subpopulations is known, the parameters can be estimated by maximum likelihood, where the most common options are the EM algorithm [50] or Bayesian methods.
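A minimal sketch of the EM option is given below for the simplest case of $K = 2$ lognormal components, fitted as a two-component normal mixture on the logarithm of wealth. Everything specific in it is an assumption made for illustration: the synthetic sample, the component parameters (70% around $\ln$ wealth 10 and 30% around 13.5), the crude initialisation and the fixed number of iterations. It is not the fitting procedure applied to the Spanish microdata.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_lognormals(log_wealth, n_iter=50):
    """EM for a two-component normal mixture fitted to ln(wealth)."""
    xs = sorted(log_wealth)
    half = len(xs) // 2
    # Crude initialisation: means of the lower and upper halves of the sample.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 0.
        resp = []
        for x in xs:
            p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
            denom = p0 + p1
            resp.append(p0 / denom if denom > 0.0 else 0.5)
        # M-step: weighted updates of proportions, means and deviations.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            total = sum(r)
            weight[k] = total / len(xs)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / total
            sigma[k] = math.sqrt(max(var, 1e-6))
    return weight, mu, sigma


rng = random.Random(1)
# Synthetic ln(wealth): 2100 draws from one group, 900 from a richer group.
sample = ([rng.gauss(10.0, 0.8) for _ in range(2100)]
          + [rng.gauss(13.5, 0.6) for _ in range(900)])
weights, means, sigmas = em_two_lognormals(sample)
```

Because group membership is never passed to `em_two_lognormals`, the recovered `weights` illustrate how the proportions $\pi_k$ can be estimated without knowing to which group each member belongs.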

3. Wealth Distribution in Spain

3.1. Data Collection

Data from the National Institute of Statistics (2020) indicate that Spain has a population of 46.94 million. The Spanish Tax Agency’s taxpayer census contains data on 72.87 million taxpayers, of which 64.25 million are natural persons. The difference between the above figures is due to the fact that the taxpayer census includes legal persons belonging to one of the 24 existing categories, non-residents, and deceased persons whose tax obligations have not yet lapsed. The baseline of the research was the year 2015, when there were 59.733 million natural persons, including non-residents, far outnumbering the households (18.47 million) in the Household Financial Survey used in other studies [51].
Table 1 shows some descriptive statistics divided into three blocks. The first block shows income (gross tax base, as defined in art. 47 of the Income Tax Law) of declarants with a positive tax base. The remainder, up to a total of 38.261 million taxpayers, had a negative or zero base or were not obliged to declare. The second block, taxpayer net worth (NW) (assets less liabilities of all kinds), is divided into two groups, taxpayers with a NW greater than zero and taxpayers with a NW less than zero (net debtors). The last three columns are the values of the assets deciles for all taxpayers holding assets, grouped according to total, direct, and indirect assets (that is, assets owned due to participation in non-listed companies).
The rows show the following variables: total sum, mean, standard deviation, number of elements and distribution deciles in euros. For example, the value of variable income observed for the taxpayer ranked in the position 2,147,207, first decile, is 2806 €. The first positive values for the last variable, indirect assets, are in percentile 93. Note that only 3.86% of asset owners hold assets in non-listed companies and the proportion of taxpayers who do not hold assets is 59.7 M − 37.7 M = 22.01 M (36.85%).
There are major changes in variability ( σ ) in the evolution of the value of assets over time, see Table 2, which could be due to macro errors in taxpayer declarations that have to be accounted for. In this research, we have used data as declared and do not process possible errors.

3.2. Variable Selection

The term income is used to mean different things in studies taking a statistical approach, where it generally refers to a variable of interest in an interval, $Y = [\underline{y}, \bar{y}]$, constituting any manifestation of wealth. We have studied three variables:
1. Income. Some problems are inevitable:
(a) Low incomes (<€22,000) are exempt from declaration.
(b) The comparability of different studies is limited because these variables are defined legally, and there are special income attribution schemes in each country.
(c) The tax base may be negative.
(d) Tax fraud related to salaried income is easier to detect than for income of professionals or the extremely wealthy.
(e) The data distribution may be contaminated due to the tendency to defraud [52].
(f) The distribution mode for low incomes is sensitive to binning. For example, with a €100 grouping, the mode is situated in the interval [€13,800–€13,900] (73,822 taxpayers). For a €1000 grouping, the mode is in the interval [€10,000–€11,000] (537,999 taxpayers). For a €10,000 grouping, the peak is in the interval [€50,000–€60,000].
(g) The use of income introduces a bias because workers with a similar salary may or may not have mortgages or other assets, and may or may not be homeowners, which places them in completely different situations.
(h) Income tax declarations are sometimes filed individually and sometimes by household unit. Atkinson [20] and Atkinson, Piketty and Saez [53] have discussed the limitations of income tax data.
2. Net worth. Its associated difficulties are:
(a) Values may be negative if there are net debtors.
(b) Although possible, it is very difficult to calculate. The problems are:
  • All the tax data on declared incomes and debts are required for its calculation.
  • Details contained in the corporate tax declaration that are not available in all countries are required to calculate indirect equity, that is, shares owned in unlisted investee companies.
  • All the assets of the companies owned by taxpayers have to be converted to current prices.
  • The declared corporate tax values have to be adjusted for each company due to different amortization rates that are acceptable under accounting rules, and the process is very complex [37].
  • Access to all the direct and indirect liabilities (loans and debts) of each taxpayer, even if assumed by the company in which the taxpayer has a share, is required.
3. Total assets. This variable is more detailed than income, is always positive and does not have so many technical issues (from the viewpoint of accounting and data accessibility) as net worth.
Therefore, we decided to use total assets in most cases, with a detailed analysis of net worth on some specific occasions.
We use data for a period covering the last four years.

3.3. Decision on Binarization of Data

The decision on whether the “bins” should be of the same length, and on their position and, generally, their width, is not straightforward. Silverman’s rule [54] is:
$$\hat{h}_{OPT} = \min\left(\hat{\sigma}, \frac{\hat{q}_3 - \hat{q}_1}{1.349}\right) n^{-1/5}. \qquad (6)$$
Its logic is that the variance is more sensitive to extreme values than the interquartile range. We find that, in this case, $\hat{\sigma}$ is much greater than the scaled interquartile range, which is logical in a curve whose right tail describes the distribution of the assets of big fortunes:
$$\min\left(4{,}123{,}829,\; \frac{136{,}972 - 2658}{1.349}\right) \times 37{,}714{,}346^{-1/5} = 3039. \qquad (7)$$
The range of assets in € is $[0, 10^{10}]$. The $\hat{h}_{OPT}$ is €3039. This binning can be used for the low ranges. However, it implies $10^{10}/3039 \approx 3.3 \times 10^{6}$ intervals. As we will see, this level of granularity is unnecessary.
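The bandwidth arithmetic above can be checked directly. In the sketch below, the function name and the `factor` argument are assumptions made for the example. Evaluating $\min(\hat{\sigma}, (\hat{q}_3 - \hat{q}_1)/1.349)\, n^{-1/5}$ with the quoted values reproduces the bin width of about €3039; applying the multiplicative constant 0.9 that commonly accompanies Silverman's rule of thumb would instead give about €2736.

```python
def silverman_bandwidth(sigma, q1, q3, n, factor=1.0):
    """Return factor * min(sigma, (q3 - q1) / 1.349) * n ** (-1/5)."""
    spread = min(sigma, (q3 - q1) / 1.349)
    return factor * spread * n ** (-1.0 / 5.0)


# Values quoted in the text for total assets (euros) and number of asset owners.
SIGMA = 4_123_829.0
Q1, Q3 = 2_658.0, 136_972.0
N = 37_714_346

h_plain = silverman_bandwidth(SIGMA, Q1, Q3, N)               # about 3039.5
h_scaled = silverman_bandwidth(SIGMA, Q1, Q3, N, factor=0.9)  # about 2735.6
n_bins = 1e10 / h_plain                                       # about 3.3 million
```

Either way, the number of equal-width bins needed to cover a range of $10^{10}$ euros runs into the millions, which is the practical point made in the text.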

3.4. Analysis of Wealth

First, we updated previous studies providing a global metric of the effect of microdata. In the case of Spain, García Docampo [55] used different data sources (the National Statistics Institute (INE) Household Financial Surveys and the INE Household Panel) and obtained Pareto indices between 0.8 and 0.9. We repeated the method and obtained an index of 1.85, in line with the current tendency to find values closer to 2.

3.4.1. Variable Distribution

Figure 2 shows the ln–ln plots of the survival functions for income, net worth and assets.
The origin of the ordinate is always 0, because the value of 100% of the elements of each variable is greater than 0 (we have only plotted the positive part of the net worth variable whose magnitude is shown on the abscissa).
For example, only 10% of individuals have a value of the variable greater than decile 9. Using the value $\ln(1 - 0.9) = -2.30$ on the ordinate, we get the value of the variable considered on each curve, for example, 12.52 for assets, which is equivalent to €273,795 (Table 1).
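The reading of the ln–ln survival plot described above can be reproduced with a short sketch. The function `ln_survival_curve` is a hypothetical helper that builds the points of an empirical curve from individual values (assuming no ties); the last two lines repeat the decile-9 arithmetic of the text.

```python
import math


def ln_survival_curve(values):
    """Return (ln x, ln S(x)) pairs, where S(x) is the share of values above x."""
    xs = sorted(v for v in values if v > 0)
    n = len(xs)
    points = []
    for rank, x in enumerate(xs, start=1):
        survivors = n - rank  # observations strictly above x (no ties assumed)
        if survivors > 0:     # the largest value has S(x) = 0 and no logarithm
            points.append((math.log(x), math.log(survivors / n)))
    return points


toy_curve = ln_survival_curve([1, 10, 100, 1000])

# Reading the ninth decile off the curve, as in the text.
ordinate = math.log(1 - 0.9)          # ln(0.1), about -2.30
assets_at_decile_9 = math.exp(12.52)  # abscissa 12.52 converted back to euros
```

Converting the rounded abscissa 12.52 back to euros gives a value of roughly €273,800, in agreement with the decile reported in Table 1.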
We find that income has a singular behavior, especially in the lower region, because, given the declaration threshold, it is not universally declared, its range of variability is smaller than for wealth, and it is sometimes submitted for household units. The distribution for the other group of variables is similar. If we were to consider net worth, a transformation would be necessary because it has an interval with negative values. There are 4.53 million taxpayers with “negative assets”, most of whom are paying back a mortgage on their home. Many of these taxpayers have income, and they drop out of the analysis only when the variable analyzed is “net assets > 0”. Therefore, we concluded that the best option is to study total assets.

3.4.2. Definition of Subpopulations

We agree with Pareto that the population is composed of subpopulations with different characteristics. The use of microdata has the advantage that it is possible to ascertain how many subpopulations there are, as well as their characteristics. We hypothesize that the characteristics of a subpopulation will impose a specific pattern on the wealth interval in which it accounts for the numerical majority.
Figure 3 shows the number of taxpayers with assets. Figure 3a illustrates intervals from €$10^j$ to €$10^{j+1}$ with $j \in \{0, 1, 2, 3, \ldots, 9\}$. The first interval is $[10^0, 10^1)$. The abscissa represents the values of interval bounds $j : j+1$. Figure 3b shows the details of the interval $[10^4, 10^5)$ with a binning of 10,000. Figure 3a clearly shows that there is a subpopulation, probably minors, in the range €[0, 10) without assets, and a population on the abscissa that is, on this scale, not normal, but could, however, be fitted to a lognormal distribution. The pattern shown in Figure 3b is similar, albeit on a natural and bimodal scale.
There is a tendency to consider a total population but we should not forget that the observed data illustrate the emergence of a phenomenon originated by the coexistence of different wealth generation processes in different populations.
We tackle the problem by hypothesizing that there are five different subpopulations. The first is made up of taxpayers without assets; the other four are subgroups of the taxpayers with assets ($C_a$):
  • Taxpayers without assets ($C_0$). Minors, net debtors, non-resident workers who receive income or subsidies, but do not have assets to their name.
  • Passive taxpayers ($C_p$). With assets received by donation or inheritance (minors) and attributed to them by third parties (their parents or financial institutions).
  • Non-entrepreneurs ($C_n$). Salary earners and professionals whose main income is a salary. They do not hold shares in non-listed companies. The wealth generation process is governed by the accumulation of savings.
  • Entrepreneurs ($C_e$). Owners or shareholders in unlisted companies. They are characterized by owning assets, shares in listed companies or participations in companies through other companies. The wealth generation process is a combination of the accumulation characteristic of savings (direct assets) and multiplicative effects (indirect assets), mediated in this case by a random variable (success of the non-listed companies) in a process that can be modelled based on Gibrat’s ideas.
  • Large fortunes ($C_f$). In some cases, exceptionally successful entrepreneurs, but more often holders of large family fortunes. The wealth accumulation process is governed by other rules and is affected by inheritances and marriages.

3.4.3. Distribution of the Subpopulations

There is a population of 37,714,346 ($C_a$) that own assets and a group ($C_0$) of 22,019,068 that do not have assets to their name.
Figure 4 shows: (a) the distribution of the taxpayer census by birth year, where the mode is 1977 and the impact of sociological events, such as the mortality of the Civil War in Spain in 1936, is visible; (b) the distribution by birth year of the value of assets owned at three different wealth levels, where the peak of the distribution of large fortunes (birth year 1962) corresponds to younger people than that of smaller fortunes (birth year 1947); and (c) the distribution of wealth between minors and adults.
The low values of the distribution of assets could be considered to follow a normal distribution, assuming hypothetically that they are random donations or small gains without a regular pattern. Table 3 shows that, on the contrary, the asymmetry of the monotonically decreasing curves in all ranges (from €1 to €10, from €10 to €$10^2$, and up to €$10^4$–$10^5$) is similar.
In summary, donations to minors reflect the wealth of their parents.
According to Pareto, we should find the value $m_0$ (mode), which serves to normalize the Pareto curve. A major issue in measuring the Pareto coefficient is the choice of threshold above which the distribution is assumed to adopt the Pareto shape. This is an arbitrary, but critical, decision because the value of $\alpha$ depends on the choice of the threshold. There are two approaches: (i) make a pragmatic decision, or (ii) use a statistical criterion to find a value, with the undesired collateral effect of yearly change. It is usual practice [20] to opt for a pragmatic approach, where an income threshold of £55,000 in the UK corresponded to the upper 5%.
We used different binnings for our analysis. With €1000, the mode is €1500, with 1,668,736 asset owners; with €10,000, the mode is in the interval [50,000–60,000), with 1,853,312 asset owners. If the binning is €100,000, the mode is near the endpoint of the interval [0, 100,000). Applying Atkinson’s criterion, we decided to take a pragmatic approach to facilitate comparison between years and countries, and we used €100,000, rounding to the value €101,000 for total assets used in previous studies [37]. From the pragmatic point of view, this suffices; it is placed between the 6th and 7th deciles.
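The dependence of the mode on the bin width can be illustrated with a toy sample. The values below are invented for the example (they are not taken from the microdata) and are arranged so that the modal bin moves as the width grows from €1000 to €10,000 to €100,000, qualitatively as described above; `modal_bin` is a hypothetical helper.

```python
from collections import Counter


def modal_bin(values, width):
    """Return (lower_edge, count) of the fullest bin [edge, edge + width)."""
    counts = Counter(int(v // width) for v in values)
    # Break ties in favour of the lowest bin so the result is deterministic.
    index, count = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return index * width, count


# Toy asset values in euros: a tight cluster near 1500 and a wider one in the 50,000s.
toy_assets = [1_200, 1_400, 1_500, 1_600, 1_800,
              51_000, 52_500, 54_000, 55_500, 57_000, 58_500]

mode_1k = modal_bin(toy_assets, 1_000)      # (1000, 5)
mode_10k = modal_bin(toy_assets, 10_000)    # (50000, 6)
mode_100k = modal_bin(toy_assets, 100_000)  # (0, 11)
```

A narrow bin isolates the tight cluster, a medium bin favours the broader cluster, and a very wide bin absorbs everything into the first interval, which is why a fixed pragmatic threshold makes comparisons between years and countries easier.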
The distribution of the group under the mode could be explained by a mixture of three populations: $C_0$, without assets (minors or adults) (36.86%); $C_p$, minors with assets (5.25%), with a characteristic distribution similar in shape to that of their parents; and the least rich of the populations (57.89%), see Equation (5).
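The share attributed to $C_0$ follows from the population counts given earlier in this section, and the three quoted shares can be checked to form a complete set of mixture weights. The short sketch below does this arithmetic; the variable names are assumptions.

```python
C_A = 37_714_346   # taxpayers who own assets
C_0 = 22_019_068   # taxpayers with no assets to their name
total = C_A + C_0  # all natural persons in the 2015 census

share_c0 = 100 * C_0 / total  # percentage of taxpayers without assets

# Shares quoted in the text for the three groups of the mixture (percent).
quoted = {"C_0": 36.86, "C_p": 5.25, "least_rich": 57.89}
quoted_total = sum(quoted.values())
```

Here `total` is 59,733,414, `share_c0` rounds to 36.86, and the quoted shares add up to 100%, as mixture proportions must.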
Our first hypothesis is that adults can be divided into two large subpopulations: entrepreneurs, including all individuals that have a holding in non-listed companies, and non-entrepreneurs (including farmers, professionals, civil servants, self-employed, etc.). We assume that there are rich and poor people in both groups, but we believe that their distribution, and, more importantly, their processes of wealth generation are different.
Figure 5a–c show the wealth distribution for non-entrepreneurs and entrepreneurs in three wealth intervals. Clearly, the curve of the total population is shaped by the combination of both subpopulations in the first interval, but the non-entrepreneur pattern is dominant only in the middle wealth interval.
Figure 6a shows that the two curves cross at the value 15.52 on the abscissa (approximately €5.5 million), and the entrepreneur distribution begins to reveal a different wealth accumulation process to non-entrepreneurs receiving a salary.
This is studied in greater detail in Figure 6b with one continuous and one dotted curve (a ratio between entrepreneurs and non-entrepreneurs). The abscissa represents the ln of wealth. The primary ordinate, on the left, shows the values of the $\mu$/Max variable. The left end of the curves indicates that the value of the ratio between the average and maximum (€100,000) values of the assets owned is 0.274 for non-entrepreneurs and 0.44 for entrepreneurs (whose average value is €50,000 and ln(50,000) = 10.82). Figure 6 shows transitions near the abscissa values 13, 15 and 20, which are in the range of €500,000, €5,000,000 and €500,000,000, respectively. In the proximity of these abscissa points, the slope of the accumulated curve changes quickly. The curve that fits the values observed in the range [10,13] does not fit the range [15,20], and the curve that fits this range does not fit the range of big fortunes. The observed curve could be considered as the “envelope” of the curves of different subpopulations.
This is compatible with the idea that the value of assets that workers can save is “limited” to €0.5 million, or that there is a natural limit to the process by which non-entrepreneurs accumulate wealth through savings.
The interval between €0.5 million and €5 million is characterized by a population that is a mixture of professionals and entrepreneurs, the interval [5 M, 500 M) by a population of successful entrepreneurs governed by a multiplicative process, and the interval [500 M, 10,000 M) is the range in which the accumulation process applicable to large fortunes holds.
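The intervals described above can be summarized as a simple lookup. The sketch below is only an illustration: the function name `wealth_regime` and the label strings are hypothetical, and the cut-offs are the approximate thresholds quoted in the text rather than fitted parameters.

```python
from bisect import bisect_right

# Approximate thresholds quoted in the text, in euros (assumed sharp here).
THRESHOLDS_EUR = [0.5e6, 5e6, 500e6]

# Hypothetical labels for the four regimes delimited by those thresholds.
REGIMES = [
    "saving-limited (mostly non-entrepreneurs)",
    "mixture of professionals and entrepreneurs",
    "multiplicative process (successful entrepreneurs)",
    "large fortunes",
]


def wealth_regime(assets_eur: float) -> str:
    """Return the accumulation regime suggested for a given asset value."""
    if assets_eur < 0:
        raise ValueError("assets must be non-negative")
    # bisect_right places a value equal to a threshold in the upper interval,
    # matching the half-open intervals [0.5 M, 5 M) and [5 M, 500 M).
    return REGIMES[bisect_right(THRESHOLDS_EUR, assets_eur)]


for value in (50_000, 2e6, 40e6, 2e9):
    print(f"EUR {value:,.0f}: {wealth_regime(value)}")
```

In the data the transitions are gradual, so a hard classification like this is a reading aid rather than a model.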
The columns in Figure 6b show the ratio between the number of entrepreneurs and non-entrepreneurs in each interval. On the far left, there are 270,266 entrepreneurs and 24,733,827 non-entrepreneurs, with a ratio of 0.01. The maximum ratio value of 15.94 (right secondary axis) is reached in the interval [10 M, 100 M) with 19,802 entrepreneurs and 1830 non-entrepreneurs. Looking at the two curves, we find that the shape of the curve is characterized by the parameters of the subpopulation of successful entrepreneurs above the threshold of €500 million.
Figure 7 shows the behavior of the different subgroups of the population. Entrepreneurs own more assets than non-entrepreneurs, and this is clearly illustrated by the shapes of their distributions which are very unalike, especially in the lower wealth intervals. Hence, the accumulation process must be different.
Figure 7b shows the distribution of the richest ($C_f$), which can be fitted to a power-law curve. It represents the 776 cases with assets between €100 million and €300 million. Using a power law, the fit is near perfect ($R^2 = 0.99$), also accounting for the 141 people with assets between €300 million and €10,000 million (whose data are not represented in the graph for privacy reasons).
Figure 8 illustrates the evolution of the distribution of large fortunes over the period 2015–2018, with, in all cases, very good fits and a Zipf coefficient of the order of −1.3.
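A power-law fit of this kind is commonly obtained as a straight-line regression in log-log coordinates. The sketch below shows that technique on synthetic data only; the microdata are not reproduced here. The exponent of −1.3 and the count of 776 are used purely as illustrative inputs, and the function name `fit_power_law` is hypothetical. It is not claimed to be the authors' fitting procedure.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of ln(y) = a + b * ln(x); returns (a, b, r_squared)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Synthetic example: wealth levels from EUR 100 M to about EUR 285 M and an
# exact power-law count of holders above each level (illustrative values).
wealth = [100e6 * 1.1 ** k for k in range(12)]
count_above = [776 * (w / 100e6) ** -1.3 for w in wealth]

intercept, slope, r_squared = fit_power_law(wealth, count_above)
print(f"slope = {slope:.3f}, R^2 = {r_squared:.4f}")
```

With real data the points scatter around the line, so the slope estimate and R² depend on the binning and on the lower cut-off chosen for the tail.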

3.5. Analysis of the Components of Wealth

Table 4 shows the components of the net worth of taxpayers by intervals of net worth (2015). The # column contains the number of taxpayers in each range. There are fewer natural persons with positive net worth (37,006,939) than natural persons with positive assets (37,714,346) because some are indebted. The right-hand columns, from liabilities onward, express the ratio of each variable to total assets; the last column contains financial assets.
Taxpayers at the bottom of the scale (whose wealth is below the mode of €100,000) owe 63.59% of the value of their assets, 73.15% of which are accounted for by real estate (urban property), 9.25%, by indirect assets held in non-listed companies, another 14.03% are held in current accounts and deposits, and 3.57% are financial assets.
We highlight three facts:
  • The growing importance of indirect assets (shares in non-listed companies) as we move up from the bottom interval (9.25%) to large fortunes (79.44%), where it far outweighs total financial assets (19.83%) (including shareholdings in IBEX-listed companies).
  • The importance of the value of immovable property in the analysis. The share of real estate assets (first and second residences) decreases with wealth. It represents 73.15% of the wealth of the poorest and accounts for an almost negligible fraction (0.50%) of large fortunes.
  • Liabilities and indirect assets play an extremely important role in conducting a correct analysis. If the variable used does not take liabilities into account, the poor appear to be richer (inequality is attenuated). If indirect assets are not counted, the richest appear to be poorer than they really are.
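The composition figures discussed above can be checked with a few lines of arithmetic. The sketch below transcribes the percentages from Table 4 and verifies that the four asset components add up to 100% of total assets in every interval; it also derives net worth as a share of total assets (100% minus liabilities). The variable names are illustrative, and the derived net-worth share is a computed quantity, not a figure reported by the authors.

```python
# Rows transcribed from Table 4: percentages with respect to total assets.
# (interval, liabilities, indirect assets, real estate, current accounts, FF.AA)
TABLE_4 = [
    ("Less 100 K", 63.59, 9.25, 73.15, 14.03, 3.57),
    ("100 K to 1 M", 11.01, 8.24, 70.35, 15.27, 6.14),
    ("1 M to 10 M", 17.63, 40.23, 37.59, 7.19, 14.99),
    ("10 M to 100 M", 23.05, 69.62, 9.09, 3.11, 18.18),
    ("100 M to 1000 M", 25.45, 79.47, 2.03, 1.74, 16.76),
    (">1000 M", 26.43, 79.44, 0.23, 0.50, 19.83),
]

asset_share_sums = {}
net_worth_share = {}
for label, liabilities, indirect, real_estate, accounts, financial in TABLE_4:
    # The four asset components should account for all assets.
    asset_share_sums[label] = indirect + real_estate + accounts + financial
    # Net worth as a percentage of total assets (derived, not tabulated).
    net_worth_share[label] = 100.0 - liabilities

for label in asset_share_sums:
    print(
        f"{label:>16}: components sum to {asset_share_sums[label]:.2f}%, "
        f"net worth is {net_worth_share[label]:.2f}% of assets"
    )
```

Ignoring the liabilities column makes the lowest interval look almost three times richer than its net worth implies, which is the attenuation effect described in the third point.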
It is of the greatest importance to highlight that, when models are fitted to income or direct assets, the presence of residuals is not a direct manifestation of model insufficiencies. These models are not fitting the real wealth.
Analyzing its components in Figure 9 (plotting ln values), we observe that the distribution of real estate assets almost perfectly fits the Pareto curve in the interval from €10,000 to €20,000,000, but differs from total assets and total wealth. It is also very clear that the distribution of total assets, which fits a Pareto distribution nicely, is not fully explained by real estate and financial assets, that is, these assets do not suffice to explain wealth.
Figure 10 shows that, globally, a gamma distribution provides a correct fit. However, it is not a matter of fitting empirical data to a curve: this problem has been solved.

3.6. Characterization of the Nature of the Underlying Processes

In the light of this analysis, we describe some characteristics of the underlying economic processes. Table 5 shows the number of individuals, range, mean and standard deviation of the population subgroups. The Entrepreneurs group contains the population with indirect assets, including minors. For reasons of privacy regarding the biggest fortunes in Spain, we use the notation $a \times 10{,}000$ M with $1 < a < 10$.

Generation of Additive and Multiplicative Effects

We explored the idea put forward by Limpert et al. [42], studying the mechanisms that induce lognormal distributions and the principles of additive and multiplicative effects. They used the example of an experiment with two ordinary dice. The addition of the two numbers leads to values from 2 to 12 with a mean of 7. The total range can be expressed as $7 \pm 5$. The multiplication of the two numbers produces a skewed distribution with a range from 1 to 36. The range in this case can be expressed as $(x/c, \; x c)$, with $c = 6$ and $x = 6$. Here, symmetry has shifted to the multiplicative level, in a distribution with mean $\mu = 12.25$. They illustrated this point using natural models based on Galton’s quincunx. In the first case, they used rows of equilateral triangles whose edges lay at $x + c$ and $x - c$ from the central vertex to get normal distributions with additive symmetry, and a variant of the quincunx with scalene triangles with edges at distances $x c$ and $x / c$ to get lognormal distributions [56].
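The dice example can be reproduced by enumerating the 36 equally likely outcomes. The short Python sketch below does so; the variable names are illustrative. It recovers the additive description $7 \pm 5$ and the multiplicative description with $x = 6$ and $c = 6$, where $x$ is taken as the geometric centre of the range.

```python
from itertools import product
from math import sqrt

# All 36 equally likely outcomes of two ordinary dice.
pairs = list(product(range(1, 7), repeat=2))
sums = [a + b for a, b in pairs]
prods = [a * b for a, b in pairs]

mean_sum = sum(sums) / len(sums)
mean_prod = sum(prods) / len(prods)

# Additive symmetry: the range is x_add - c_add .. x_add + c_add.
x_add = (min(sums) + max(sums)) / 2
c_add = (max(sums) - min(sums)) / 2

# Multiplicative symmetry: the range is x_mul / c_mul .. x_mul * c_mul.
x_mul = sqrt(min(prods) * max(prods))
c_mul = sqrt(max(prods) / min(prods))

print(f"sum:     range {min(sums)}..{max(sums)}, mean {mean_sum}")
print(f"product: range {min(prods)}..{max(prods)}, mean {mean_prod}")
print(f"additive x={x_add}, c={c_add}; multiplicative x={x_mul}, c={c_mul}")
```

The mean of the product (12.25) lies well above the multiplicative centre (6), which is the skewness the text refers to.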
We took up this idea to investigate the uniformity of the wealth creation process in each interval of wealth. Each interval, $i$, except the first, $[0, 10]$, has endpoints $[10^j, 10^{j+1}]$ with $j \in \{1, 2, \ldots, 9\}$. If, following Gibrat’s ideas, the distribution happens to be correctly described by a lognormal, we could affirm that it reflects an underlying multiplicative process.
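The link between a multiplicative process and a lognormal shape can be illustrated with a small simulation. In the sketch below every agent starts with the same wealth and is hit by a sequence of independent random growth factors, which is one simple reading of Gibrat's idea. All parameters (number of agents, number of steps, the uniform range of the growth factor, the seed) are arbitrary assumptions chosen for illustration, not values from the study.

```python
import math
import random


def skewness(values):
    """Population skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)       # fixed seed so the run is reproducible
N_AGENTS, N_STEPS = 5000, 50  # illustrative sizes

wealth = []
for _ in range(N_AGENTS):
    w = 1.0
    for _ in range(N_STEPS):
        # Proportional shock: wealth is multiplied, not incremented.
        w *= rng.uniform(0.9, 1.1)
    wealth.append(w)

log_wealth = [math.log(w) for w in wealth]
skew_raw = skewness(wealth)
skew_log = skewness(log_wealth)

print(f"skewness of wealth:     {skew_raw:.2f}")
print(f"skewness of ln(wealth): {skew_log:.2f}")
```

The raw values are clearly right-skewed while their logarithms are close to symmetric, which is the signature of an approximately lognormal distribution.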
We know that:
  • In each interval $[10^j, 10^{j+1}]$, by construction, $\frac{x_{j+1} c_{j+1}}{x_{j+1} / c_{j+1}} = \frac{10^{j+1}}{10^{j}} = 10$. For all the intervals, $c_{j+1} = 10^{1/2} = 3.16$, $\forall j$, that is, $\ln(3.16) = 1.15$.
  • If the underlying process is multiplicative, with the proposed intervals $[10^j, 10^{j+1}]$, there is a point $x_{j+1}$ in the interval such that
    $x_{j+1} \, c_{j+1} = 10^{j+1}$
    $x_{j+1} / c_{j+1} = 10^{j}$
    and, therefore,
    $x_{j+1} = 10^{(2j+1)/2}, \quad \forall j.$
    Comparing the real value for both subpopulations reveals any difference with respect to the situation where the accumulative effect holds. We use the notation $y$ for this observed value:
    $y_e = \frac{10^{(2j+1)/2}}{\mu_{\mathrm{entrepreneurs}}},$
    and
    $y_n = \frac{10^{(2j+1)/2}}{\mu_{\mathrm{non\text{-}entrepreneurs}}}.$
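The two quantities derived above, the common factor $c = \sqrt{10}$ and the geometric midpoint $10^{(2j+1)/2}$ of each decade, are easy to tabulate. The following Python sketch does so for $j = 0, \ldots, 9$; the variable names are illustrative.

```python
import math

# Common multiplicative half-width of every decade [10^j, 10^(j+1)].
c = math.sqrt(10)
ln_c = math.log(c)

# Geometric midpoint of each decade: x = 10^((2j + 1) / 2).
midpoints = {j: 10 ** ((2 * j + 1) / 2) for j in range(10)}

print(f"c = {c:.2f}, ln(c) = {ln_c:.2f}")
for j, x in midpoints.items():
    # x / c and x * c recover the endpoints of the decade.
    print(f"j = {j}: x = {x:,.2f}, x/c = {x / c:,.0f}, x*c = {x * c:,.0f}")
```

For $j = 4$ this gives a midpoint of about €31,623, the value against which the observed means of the interval $[10^4, 10^5]$ are compared.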
In Table 6, the columns for the groups of entrepreneurs and non-entrepreneurs show the values of $\mu$, $\sigma$ and their ratio in the intervals $[10^j, 10^{j+1})$ from $j = 0$ to $j = 9$. For these groups, the ln of $c = \mu/\sigma$ in each case and the excess with respect to the ideal behavior (1.15) of the multiplicative symmetry are shown in the right-hand columns.
Figure 11 contains three graphics. In all of them, the abscissa represents the ln of assets. The upper left graphic shows the distribution of the ratio $\mu/\sigma$ for entrepreneurs, non-entrepreneurs and the total population. The lower graphic shows the ratio $\ln(\mu)/\ln(\sigma)$.
The right-hand graphic shows the difference with respect to the theoretical value of 1.15, which, given our construction of the wealth intervals, is the expression of a multiplicative process that appears as a lognormal distribution. We observe a symmetry, marked with the double arrow in the graphic, between the lines with the values for entrepreneurs and non-entrepreneurs.
This graphic shows that, over the intervals, there is an excess for the total population with respect to the expected behavior of a multiplicative process.
We find that, throughout the region covered by Pareto’s law (on the right-hand side of the mode) and even down to €10,000 on the left-hand side, in the region of the 3rd to 4th deciles, the value $\ln(\mu/\sigma)$ is similar to the theoretical value of multiplicative accumulation (1.15). The excess with respect to the 1.15 value tends asymptotically toward 0.1. This effect is generated by the mixture of two subpopulations whose behavior is symmetrical with respect to the asymptotic and ideal value of 1.15.
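The comparison against the ideal value of 1.15 can be sketched numerically with two rows of Table 6. One assumption should be stated loudly: the tabulated Ln() and Exc. columns appear to be reproduced by the ratio $\ln(\mu)/\ln(\sigma)$ and by 1.15 minus that ratio, respectively. This reading, including the sign convention, is inferred from the tabulated numbers and is not stated explicitly in the text. The function names are illustrative.

```python
import math

IDEAL = 1.15  # rounded ln(sqrt(10)), the value for a multiplicative process

# (mu, sigma) pairs transcribed from Table 6 for two wealth intervals.
ROWS = {
    "4-5": {"entrepreneurs": (54_091, 27_005),
            "non_entrepreneurs": (53_382, 25_653)},
    "5-6": {"entrepreneurs": (364_262, 225_982),
            "non_entrepreneurs": (199_751, 156_081)},
}


def log_ratio(mu: float, sigma: float) -> float:
    """ln(mu) / ln(sigma): assumed reading of the Ln() columns."""
    return math.log(mu) / math.log(sigma)


def excess(mu: float, sigma: float) -> float:
    """IDEAL minus the log ratio: assumed reading of the Exc. columns."""
    return IDEAL - log_ratio(mu, sigma)


for interval, groups in ROWS.items():
    for group, (mu, sigma) in groups.items():
        print(f"{interval} {group:>17}: ratio = {log_ratio(mu, sigma):.2f}, "
              f"excess = {excess(mu, sigma):.2f}")
```

Under this assumption both rows reproduce the tabulated values to two decimals (for example 1.07 and 0.08 for entrepreneurs in the interval 4–5).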
There is a zone, defined by the accumulation process of entrepreneurs, that can be fitted with precision to a lognormal, but the upper wealth interval (big fortunes) is characterized by a different process of wealth accumulation: inheritances and marriages.
In Figure 12, we develop the idea expressed in Equations (6)–(8). We interpret the observed values of $\mu$, which are greater than the values expected in a multiplicative process, as the expression of the coexistence of one multiplicative process of wealth accumulation with another that would be observed in the absence of non-entrepreneurs and is more similar to a Pareto distribution.
Table 7 shows the number of individuals in these groups and their evolution over time.
It is clear that the regularity observed for the total population emerges from the mixture of two subpopulations; it is the sum of two regularities.
Table 8 shows the values obtained with (7) and (8) to investigate whether there is an underlying multiplicative process, using in this case the position of the mean in each interval of wealth.
Figure 13 shows the ratio between the theoretical $\mu$ and the observed $\hat{y}$ for each interval and subpopulation. For example, the value 1.69 in the interval $[10^4, 10^5]$ expresses that the value of $\mu$ (€53,382) for the subpopulation of non-entrepreneurs is 1.69 times the value of $10^{(2 \cdot 4 + 1)/2}$, that is, €31,623.
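The worked example above can be checked directly. The sketch below recomputes the geometric midpoint of the interval $[10^4, 10^5]$ and the ratio quoted for non-entrepreneurs. It also computes the inverse ratio for entrepreneurs, because the entrepreneur column of Table 8 appears to report the midpoint divided by the mean; that observation is inferred from the tabulated numbers and should be treated as an assumption. The function name `midpoint` is illustrative.

```python
def midpoint(j: int) -> float:
    """Geometric midpoint 10^((2j + 1) / 2) of the decade [10^j, 10^(j+1)]."""
    return 10 ** ((2 * j + 1) / 2)


# Means for the interval [10^4, 10^5], transcribed from Table 8.
MU_NON_ENTREPRENEURS = 53_382
MU_ENTREPRENEURS = 54_091

x = midpoint(4)

# Ratio described in the text: observed mean over theoretical midpoint.
ratio_non_entrepreneurs = MU_NON_ENTREPRENEURS / x

# Assumed reading of the entrepreneur column: midpoint over observed mean.
ratio_entrepreneurs = x / MU_ENTREPRENEURS

print(f"midpoint = {x:,.0f}")
print(f"non-entrepreneurs: {ratio_non_entrepreneurs:.2f}")
print(f"entrepreneurs:     {ratio_entrepreneurs:.2f}")
```

Both ratios agree with the tabulated 1.69 and 0.58 to two decimals under this reading.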
The total population is composed of two subpopulations (entrepreneurs and non-entrepreneurs) whose behavior in the graph is almost symmetric with respect to $y = 1$ and almost symmetric with respect to the Pareto threshold, but the combined effect of the mixture of subpopulations is symmetric with respect to two axes: the ideal value of $y = 1$ and the chosen Pareto threshold. The trend lines of both subpopulations cross with striking precision at O.
Non-entrepreneurs save and invest in their homes or in the stock market. Up to the limit of €0.5 million, in the case of Spain, they outnumber entrepreneurs. Some of them have large fortunes, but the process of accumulation by saving and investing in urban property has a decreasing rate of return, and the presence of non-entrepreneurs in the right zone of the distribution is sparse.
In the interval between €0.5 M and €1000 M, the dominant process is the activity of entrepreneurs, with a different, multiplicative process of wealth accumulation. A further change occurs in the wealth interval of big fortunes, which increase their wealth by investment, inheritance and marriage; this is reflected in a Pareto distribution.
Looking at the global trendlines, without the use of microdata, the observable total distribution of wealth can be fitted, by intervals, to two or three distributions, one of which, once a threshold is reached, is a Pareto distribution.
Figure 14 concludes the process initiated in Figure 2. There are three curves: total assets in the middle, indirect assets on the right and liabilities on the left. The methodology explained in Section 2 has found that indirect wealth, the value of the non-listed companies owned by entrepreneurs, accounts for 11.08% of the total wealth in Spain.
Studies that do not include indirect wealth (the wealth accumulated in non-listed companies), or that use the income declared in tax forms, clearly do not reflect reality.
We have proved that there are two wealth accumulation processes: one for non-entrepreneurs (savings) and another for entrepreneurs (investments); the second is multiplicative.
There are five subpopulations, and the shares can be calculated from public statistics. There are two wealth accumulation processes (saving and investment). The evolution of the accumulation process can be studied using Equation (8). We suggest that this strategy could be used to fit data from sources with different granularity.

4. Conclusions

We have shown that:
  • The best option for analyzing wealth distribution regularities and studying inequality is total assets, including direct and indirect assets, because it uses more relevant data for accurately studying entrepreneur activity. The choice between the two available options of studying positive values or positive and negative values has to be made considering the objectives of the study and the availability of microdata on mortgages.
  • Analyzing the Spanish population with a wealth distribution in the range $[0, 10^{10}]$, we have found that the best threshold for Pareto analysis is at the point $10^{5}$ and the best binning is $10^{4}$.
  • We have identified five subpopulations with the following population shares: $C_0$ (without assets) accounts for 36.86%. Asset owners are a mixture of four subpopulations: minors, non-entrepreneurs, entrepreneurs and big fortunes.
  • We have identified two types of accumulative processes (savings and investments), where the region of transition between the processes is $5.5 \times 10^{7}$. The process of generation of big fortunes is a mixture of both.
  • The distribution of the subpopulations with assets in the Pareto zone can be fitted to lognormal curves, but the process of accumulation proposed by Gibrat is clearer for entrepreneurs. Within the population that work for a salary, wealth accumulation is attenuated by the payment of mortgages. Table 4 shows that, on average, 73.15% of the value of assets owned by people that work for a salary takes the form of real estate, that is, their homes, where liabilities (mostly mortgages) account for 63.59% of total assets. Their wealth accumulation is limited by interest payments and is, in some cases, driven by the increase of the property values. There is a huge difference with the average “private” liabilities of entrepreneurs (with total assets ranging from €100 K to €1 M), which are less affected by interest rates.
  • The wealth accumulation process is different for large fortunes, where the wealth accumulated as shareholdings in non-listed companies is relevant.
  • Wealth models should consider that: (a) the subpopulation of minors and donees of wealth is distributed so as to reflect the wealth of the donors (and has a lognormal distribution), (b) the population of entrepreneurs (with corporate assets) should be differentiated from the population of employees and professionals (non-entrepreneurs), which behave differently, and aggregate behavior with a level of multiplicative symmetry emerges only as a result of the mixture of these two populations.
  • We have explained the difference observed in empirical studies with respect to Pareto’s law by the effect on the wealth distribution of two processes of wealth accumulation: one of them stationary, and the other characteristic of entrepreneurs, which changes within each society with different uses of technology and with the growing importance of financial activities.

Author Contributions

Conceptualization, I.G.G. and A.M.C.; methodology, I.G.G. and A.M.C.; software, I.G.G.; validation, I.G.G. and A.M.C.; formal analysis, I.G.G.; investigation, I.G.G. and A.M.C.; resources, I.G.G.; data curation, I.G.G.; writing (original draft preparation), I.G.G. and A.M.C.; writing (review and editing), I.G.G. and A.M.C.; visualization, I.G.G.; supervision, A.M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

This research was supported by the Ministry of Economy and Competitiveness, project MTM2017-86875-C3-3-R.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chandra Dash, K. The Story of Econophysics; Cambridge Scholar Publishing: Cambridge, UK, 2019.
  2. Jovanovic, F.; Schinckus, C. The History of Econophysics Emergence: A new Approach in Modern Financial Theory. Hist. Political Econ. 2013, 45, 443–474.
  3. Savoiu, G.; Siman, I. History and Role of Econophysics in Scientific Research, Cap 1 en Econophysics: Background and Applications in Economics. Financ. Sociophys. 2013, 3–16.
  4. Stanley, H.E. Stanley on Econophysics; Sage Publication: Thousand Oaks, CA, USA, 2013; Volume 2, pp. 73–78.
  5. Bouchaud, J.P. Econophysics: Still Fringe after 30 Years? Available online: https://arxiv.org/abs/1901.03691v1 (accessed on 8 February 2021).
  6. Axtell, R. Zipf distribution of US firm sizes. Science 2001, 293, 1818–1820.
  7. Atkinson, A.B.; Voitchovsky, S. The distribution of top earnings in the UK since the Second World War. Economica 2010, 78, 440–459.
  8. Rawls, J. A Theory of Justice; Cambridge, M.A., Ed.; Harvard University Press: Cambridge, MA, USA, 1971.
  9. Sen, A. Poverty and Famines; Oxford University Press: New Delhi, India, 1999.
  10. Forbes, K.J. A reassessment of the relationships between inequality and growth. Am. Econ. Rev. 2000, 90, 869–887.
  11. Piketty, T. 21 Lessons for the 21st Century; Harvard University Press: Cambridge, MA, USA, 2013.
  12. Mantegna, R.N.; Stanley, H.E. 21 Lessons for the 21st Century; Spiegel & Grau: New York, NY, USA, 2018.
  13. Mantegna, R.N.; Stanley, H.E. An introduction to Econophysics: Correlations and Complexity Finance; Cambridge University Press: Cambridge, UK, 2001.
  14. Chatterjee, A.; Yarlagadda, S.; Chakrabarti, B.K. (Eds.) Econophysics of Wealth Distributions; Springer: Milan, Italy, 2005.
  15. Champernowe, D.G. A model of income distribution. Science 1953, 63, 318–351.
  16. Gibrat, R. Les Inegalités Economiques; Librairie du Recueil. Sirey: Paris, France, 1931.
  17. Santarelli, E.; Klomp, L.; Thurick, R. Gibrat’s Law: An overview of the empiric literature. In International Studies in Entreprenership; Springer: Berlin/Heidelberg, Germany, 2006; Volume 12, pp. 1–73.
  18. Chakrabarti, B.; Chakraborti, A.; Chakrabarty, S.; Chaterjee, A. Econophysics of Income and Wealth Distribution; U.K. Cambridge University Press: Cambridge, UK, 2013.
  19. Richmond, P.; Hutzler, S.; Coelho, R. A review of empirical studies and Models of Income Distributions in Society. In Econophysics and Sociophysics. Trends and Perspectives; Chakrabarti, B.K., Chakabrorti, A., Chaterjee, A., Eds.; Wiley-VCH: Berlin, Germany, 2016.
  20. Atkinson, A.B. The distribution of top incomes in the United Kingdom 1908–2000. In Top Incomes over the 20th Century; Atkinson, A.B., Piketty, T., Eds.; Oxford University Press: Oxford, UK, 2007.
  21. Yakovenko, V.M.; Rosser, J.B. Colloquium: Statistical mechanics of money wealth and income. Econ. J. 2013, 81, 703–725.
  22. Aoyama, H.; Souma, W.; Nagahara, Y.; Okazaki, M.P.; Takayasu, H.; Takayasu, M. Pareto’s law for income of individuals and debt of bankrupt companies. Fractals 2000, 8, 293–300.
  23. Aoyama, H.; Souma, W.; Fujiwara, Y. Growth and fluctuations of personal and company’s income. Physica A 2003, 324, 352–358.
  24. Aoyama, H.; Fujiwara, Y.; Fujiwara, Y. Econophysics and Companies: Statistical Life and Death in Complex Business Networks; Cambridge University Press: New York, NY, USA, 2011.
  25. Shirras, F. The Pareto Law and the Distribution of Income. Econ. J. 1935, 45, 663–681.
  26. Atkinson, A.B. Pareto and the upper tail of the income distribution in the UK: 1799 to the present. Econ. Spec. Issue Inequality 2017, 84, 129–343.
  27. Benhabib, J.; Bisin, A. Skewed Wealth Distributions: Theory and Empirics; NBER Working Paper 21924; National Bureau of Economic Research: Cambridge, MA, USA, 2016.
  28. Kleiber, C. Dagum vs. Singh-Maddala income distributions. Econ. Lett. 1996, 53, 265–268.
  29. Bandourian, R.; McDonald, J.B.; Turkey, R.S. A comparison of parametric models of income distribution across countries and overtime. Estadística 2003, 55, 135–152.
  30. Montroll, E.W.; Schelesinger, M.F. On 1/f noise and other distributions with long tails. Proc. Natl. Acad. Sci. USA 1982, 79, 3380–3383.
  31. Hogg, R.V.; Mckean, J.W.; Craig, A.T. Introduction to Mathematical Statistics; Pearson Education: Delhi, India, 2007.
  32. Chatterjee, A.; Chakrabarti, B.K. Kinetic exchange models for income and wealth distributions. Eur. Phys. J. B 2006, 60, 135–149.
  33. Cowell, A.; Flachaire, E. Statistical Methods for Distributional Analysis. In Handbook of Income Distribution; Elsevier: Amsterdam, The Netherlands, 2015.
  34. Chotikapanich, D.; Griffiths, W. Estimating income distributions using a mixture of gamma densities. In Modelling Income Distributions and Lorenz Curves; Chotikapanich, D., Ed.; Springer: New York, NY, USA, 2008; Chapter 16; pp. 285–302.
  35. Pittau, M.G.; Zelli, R. Empirical evidence of income dynamics across EU regions. Italian evidence in the 1990s from kernel density estimates. Empir. Econom. 2006, 29, 415–430.
  36. AEAT 2020. Available online: www.aeat.es (accessed on 8 February 2021).
  37. González, I.; Mateos, A. La distribución de la riqueza. Pareto deconstruido, Gibrat reconstruido. Rev. Econ. Apl. Número Extraordin. Econophys. 2019, 37, 22–40.
  38. Kleiber, C.; Kotz, S. Statistical Size Distributions in Economics and Actuarial Sciences; John Wiley and Sons: Hoboken, NJ, USA, 2003.
  39. Mandelbrot, B.B. The Pareto Levy law and the distribution of income. Int. Econ. Rev. 1960, 1, 79–106.
  40. Arnold, B.C. Pareto Distributions, 2nd ed.; CRC Press, Taylor & Francis Group: Boca Raton, FL, USA, 2015.
  41. Richmond, P.; Solomon, S. Stable power laws in variable economics; Lotka-Volterra implies Pareto-Zipf. Eur. Phys. J. B 2002, 27, 257–261.
  42. Limpert, E.; Stahel, W.A.; Abbt, M. Log-normal Distributions across the Sciences: Keys and Clues. BioScience 2001, 5, 341–352.
  43. Koch, A.L. The logarithm in biology. I. Mechanisms generating the log-normal distribution exactly. J. Theor. Biol. 1966, 23, 276–290.
  44. Koch, A.L. The logarithm in biology. II. Mechanisms generating the log-normal distribution exactly. J. Theor. Biol. 1966, 23, 251–268.
  45. Coelho, R.; Neda, Z.; Ramasco, J.; Santos, A. A family-network model for wealth distribution in societies. Physica A 2005, 353, 515–528.
  46. Gabaix, X. Power laws in economics and finance. Annu. Rev. Econ. 2009, 1, 255–294.
  47. Montebruno, P.; Bennet, R.J.; Van Lieshout, C.; Smith, H. A tale of two tails: Do power Law and Lognormal models fit firm size distribution in the mid-Victorian Era? Physica A 2019, 523, 858–865.
  48. Solomon, S.; Richmond, P. Power laws of wealth, market order volumes and market returns. Physica A 2001, 299, 188.
  49. Pareto, V. Cours d’Economie Politique; Trad. “Manual of Political Economy”; Macmillan: Paris, France; London, UK, 1897; Volume 2.
  50. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. R. Stat. Soc. Ser. 1977, 39, 1–38.
  51. Brindusa, A.; Basso, H.; Bover, O.; Casado, J.M.; Hospido, L.; Izquierdo, M.; Kataryniu, I.; Lacuesta, A.; Montero, J.M.; Vozmediano, E. La desigualdad de la renta el consumo y la riqueza en España. Doc. Ocas. Banco Espa. 2018, 6, 1–49.
  52. Chescher, A.; Schluter, C. Measurement error and inequality measurement. Rev. Econ. Stud. 2002, 69, 357–378.
  53. Atkinson, A.B.; Piketty, T.; Saez, E. Top incomes in the long run of history. J. Econ. Lit. 2011, 49, 3–71.
  54. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Chapman & Hall/CRC: London, UK, 1986.
  55. Garcia Docampo, M. Medición del análisis de las desigualdades en la distribución de la renta. Empiria. Rev. Metodol. Las Cienc. Soc. 2000, 3, 73–99.
  56. Gut, C.; Limpert, E.; Hinterberger, H. A Computer Simulation on the Web to Visualize the Genesis of Normal and Log-Normal Distributions. 2000. Available online: http://stat.ethz.ch/vis/log-normal (accessed on 8 February 2021).
Figure 1. Example of calculation of indirect wealth.
Figure 2. Distributions: ln–ln of income, wealth and assets.
Figure 3. Assets by intervals.
Figure 4. Net worth by year.
Figure 5. Assets distribution.
Figure 6. Assets with a value greater than mode.
Figure 7. Behavior of subpopulations and large fortunes.
Figure 8. Evolution of the top wealth interval over time.
Figure 9. Distribution of the components of wealth.
Figure 10. Cullen and Frey graph.
Figure 11. Analysis of the subpopulations in ranges of order 10,000.
Figure 12. Interpretation of Equations (6)–(8).
Figure 13. Analysis by subpopulations.
Figure 14. Analysis by subpopulations.
Table 1. Descriptive statistics of wealth distribution in Spain (2015).
Statistic | INCOME > 0 | NET WORTH > 0 | NET WORTH < 0 | ASSETS > 0 (Total) | ASSETS > 0 (Direct) | ASSETS > 0 (Indirect)
∑ (M€) | 442,810 | 4,228,499 | −123,893 | 5,504,770 | 4,372,866 | 1,131,697
μ | 20,593 | 114,205 | −27,335 | 145,959 | 115,974 | 21,936
# | 21,472,065 | 37,025,366 | 4,532,469 | 37,714,346 | 37,714,346 | 1,458,186
σ | 59,448 | 5,142,106 | 202.669 | 5,423,997 | 4,216,853 | 3,321,528
D1 | 2806 | 165 | −65,663 | 143 | 128 | 0
D2 | 5267 | 1258 | −41,439 | 1202 | 1123 | 0
D3 | 9326 | 5417 | −27,605 | 5996 | 5490 | 0
D4 | 13,632 | 18,652 | −18,286 | 27,783 | 26,145 | 0
Me | 16,655 | 39,897 | −11,892 | 54,621 | 53,137 | 0
D6 | 20,088 | 64,136 | −7558 | 81,058 | 79,063 | 0
D7 | 24,336 | 95,216 | −4393 | 114,745 | 111,579 | 0
D8 | 30,210 | 142,423 | −1820 | 165,736 | 159,887 | 0
D9 | 38,428 | 240,632 | −480 | 273,795 | 270,312 | 0
Table 2. Evolution of assets over time.
Var. | 2015 | 2016 | 2017 | 2018 | 2016/2015 | 2017/2015 | 2018/2015
 | 5,504,811 | 6,522,811 | 5,925,214 | 6,297,723 | 118.49 | 107.64 | 114.40
μ | 149,959 | 168,858 | 156,417 | 190,390 | 115.69 | 107.17 | 130.44
σ | 4,916,808 | 126,005 | 127,548 | 3,972,678 | 2.56 | 2.59 | 80.80
# | 37,714,346 | 38,806,625 | 37,880,766 | 38,072,958 | 102.90 | 100.44 | 100.95
Table 3. Frequency of assets in the range 1–100,000 €.
Interval (€) | # | Interval | # | Interval | # | Interval | #
1 to 10 | 1,845,563 | <100 | 3,522,730 | <1 K | 7,512,309 | <10 K | 13,411,653
10 to 20 | 419,723 | 100 to 200 | 858,755 | 1 K to 2 K | 1,791,396 | 10 K to 20 K | 1,520,055
20 to 30 | 264,339 | 200 to 300 | 606,925 | 2 K to 3 K | 1,035,198 | 20 K to 30 K | 1,447,770
30 to 40 | 205,114 | 300 to 400 | 508,197 | 3 K to 4 K | 744,920 | 30 K to 40 K | 1,520,055
40 to 50 | 161,698 | 400 to 500 | 430,310 | 4 K to 5 K | 564,586 | 40 K to 50 K | 1,614,721
50 to 60 | 186,280 | 500 to 600 | 390,268 | 5 K to 6 K | 465,260 | 50 K to 60 K | 1,639,613
60 to 70 | 133,745 | 600 to 700 | 347,597 | 6 K to 7 K | 397,574 | 60 K to 70 K | 1,591,818
70 to 80 | 113,784 | 700 to 800 | 309,361 | 7 K to 8 K | 336,237 | 70 K to 80 K | 1,483,085
80 to 90 | 104,315 | 800 to 900 | 281,055 | 8 K to 9 K | 300,479 | 80 K to 90 K | 1,353,336
90 to 100 | 88,169 | 900 to 1 K | 257,111 | 9 K to 10 K | 263,694 | 90 K to 100 K | 1,231,181
Total | 3,522,730 | | 7,512,309 | | 13,411,653 | | 27,146,544
Table 4. Distribution of net worth and its components by intervals.
Columns from Liabilities to FF.AA are expressed in % with respect to total assets.
Interval | # | Liabilities | Indirect Assets | Real Estate | Current Accounts | FF.AA
Less 100 K | 29,965,372 | 63.59 | 9.25 | 73.15 | 14.03 | 3.57
100 K to 1 M | 9,629,289 | 11.01 | 8.24 | 70.35 | 15.27 | 6.14
1 M to 10 M | 364,357 | 17.63 | 40.23 | 37.59 | 7.19 | 14.99
10 M to 100 M | 14,326 | 23.05 | 69.62 | 9.09 | 3.11 | 18.18
100 M to 1000 M | 471 | 25.45 | 79.47 | 2.03 | 1.74 | 16.76
>1000 M | 19 | 26.43 | 79.44 | 0.23 | 0.50 | 19.83
Table 5. Parameters of the subpopulation distributions.
#Range μ σ
Total Population 59,733,414[0, 10,000 M]
WITHOUT ASSETS C 0 22,019,068 [ 0 , 0 ] 00
Minors without assets 3,128,625 [ 0 , 0 ]
Adults without assets 18,890,443[0, 10,000 M]
ASSET OWNERS < 0 4,532,569
ASSET OWNERS C a = C n + C f 37,714,346[0, a ∗ 10,000 M]
Minors with assets C p 4,072,002[0;130 M]5165
Adults with assets 34,585,721[[0, a ∗ 10,000 M]1594,318,838
Non-entrepreneurs C n 35,256,160[0, 10,000 M]98,974217
Entrepreneurs C e 1,458,186[0;100 M]8903,031,214
Big fortunes (>100 M) C f 917[100 M, a ∗ 10,000 M]318,213,094787,168,132
Entrepreneurs 863[100 M, a ∗ 10,000 M]314,059,377785,624,254
Other cases 44[100 M, a ∗ 10,000 M]512,925,903857,383,064
Table 6. Distribution by wealth intervals and subpopulations.
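Intervals are given as powers of ten (j to j + 1). E = entrepreneurs; N.E = non-entrepreneurs.
Interval | E μ | E σ | E μ/σ | N.E μ | N.E σ | N.E μ/σ | Ln() E | Ln() N.E | Exc. E | Exc. N.E
0–1 | 3.34 | 3.01 | 1.11 | 0.02 | 0.10 | 0.20 | 1.09 | 1.70 | 0.06 | −0.55
1–2 | 52 | 25 | 2.08 | 42 | 26 | 1.62 | 1.23 | 1.15 | −0.08 | 0.00
2–3 | 569 | 261 | 2.18 | 450 | 263 | 1.71 | 1.14 | 1.10 | 0.01 | 0.05
3–4 | 4840 | 2543 | 1.90 | 3830 | 2471 | 1.55 | 1.08 | 1.06 | 0.07 | 0.09
4–5 | 54,091 | 27,005 | 2.00 | 53,382 | 25,653 | 2.08 | 1.07 | 1.07 | 0.08 | 0.08
5–6 | 364,262 | 225,982 | 1.61 | 199,751 | 156,081 | 1.28 | 1.04 | 1.02 | 0.11 | 0.13
6–7 | 2,510,687 | 1,780,143 | 1.41 | 1,686,501 | 1,100,573 | 1.53 | 1.02 | 1.03 | 0.13 | 0.12
7–8 | 22,482,106 | 15,811,009 | 1.42 | 16,394,325 | 10,320,486 | 1.59 | 1.02 | 1.03 | 0.13 | 0.12
8–9 | 206,604,515 | 143,069,745 | 1.44 | 158,398,773 | 86,480,258 | 1.83 | 1.02 | 1.03 | 0.13 | 0.12
9–10 | 3,233,259,800 | 2,871,975,594 | 1.13 | 2,286,954,185 | 732,488,355 | 3.12 | 1.01 | 1.06 | 0.14 | 0.09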
Table 7. Distributions of entrepreneurs by wealth interval over time.
100 K < Actives < 1 M1 M < Actives < 100 M100 M < Actives < 500 M1500 M < Actives < 15 KM
YearEntrep.N-Entrep.Entrep.N-Entrep.Entrep.N-Entrep.Entrep.N-Entrep.
20151,143,14311,151,678324,620193,88976315823
20161,183,17811,330,440333,804206,02879619921
20171,176,69811,970,874363,025276,286822221022
20181,192,84411,225,120375,442281,007877209910
Table 8. Deviations with respect to multiplicative process of the mean.
Interval | F = (2j + 1)/2 | 10^F | μ (Entrepreneurs) | y_e | μ (Non-Ent.) | y_n
0–1 | 0.5 | 3.16 | 3 | 1.05 | 4 | 1.26
1–2 | 1.5 | 31.62 | 52 | 0.61 | 42 | 1.33
2–3 | 2.5 | 316 | 569 | 0.56 | 450 | 1.42
3–4 | 3.5 | 3162 | 4840 | 0.65 | 3830 | 1.21
4–5 | 4.5 | 31,623 | 54,091 | 0.58 | 53,382 | 1.69
5–6 | 5.5 | 316,228 | 364,262 | 0.87 | 199,751 | 0.63
6–7 | 6.5 | 3,162,278 | 22,482,106 | 1.26 | 1,686,501 | 0.53
7–8 | 7.5 | 31,622,777 | 22,482,106 | 1.41 | 16,394,325 | 0.52
8–9 | 8.5 | 316,227,766 | 206,604,515 | 1.53 | 158,398,773 | 0.50
9–10 | 9.5 | 3,162,277,660 | 3,233,249,800 | 0.98 | 2,286,954,185 | 0.72
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

González García, I.; Mateos Caballero, A. Models of Wealth and Inequality Using Fiscal Microdata: Distribution in Spain from 2015 to 2020. Mathematics 2021, 9, 377. https://doi.org/10.3390/math9040377
