# Random Permutations, Non-Decreasing Subsequences and Statistical Independence


## Abstract


## 1. Introduction

## 2. The Procedure

**Definition 1.**

- i. the subsequence $\{q_{i_1},\dots,q_{i_k}\}$ of $Q$ is a non-decreasing subsequence of $Q$ if $1\le i_1<\dots<i_k\le n$ and $q_{i_1}\le q_{i_2}\le\dots\le q_{i_k}$;
- ii. the length of a subsequence verifying i. is $k$;
- iii. $\mathrm{lnd}_n(Q)=\max_{k}\{1\le k\le n:\{q_{i_1},\dots,q_{i_k}\}\in S_n\}$, where $S_n$ is the set of subsequences of $Q$ verifying i.
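The quantity in item iii. of Definition 1, the length of a longest non-decreasing subsequence, can be computed in $O(n\log n)$ time. The following is a minimal sketch, not necessarily the authors' implementation; the function name `lnd` and the patience-sorting approach are illustrative choices.

```python
from bisect import bisect_right


def lnd(q):
    """Length of a longest non-decreasing subsequence of the sequence q."""
    # tails[j] holds the smallest possible last value of a
    # non-decreasing subsequence of length j + 1 seen so far.
    tails = []
    for value in q:
        # bisect_right (rather than bisect_left) lets equal values
        # extend a subsequence, so ties count as non-decreasing.
        pos = bisect_right(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return len(tails)


print(lnd([7, 1, 4, 2, 6, 8, 4, 2]))  # 4, for instance 1, 2, 6, 8
```

Replacing `bisect_right` with `bisect_left` would give the strictly increasing variant, which differs from the one above only when the sequence contains ties.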

**Definition 2.**

**Remark 1.**

- i. Note that, in the absence of ties, the set $Q_{\mathcal{D}}$ is one particular permutation of the values in the set $\{1,\dots,n\}$.
- ii. With ties, there is more than one way of defining ranks. We apply the minimum rank notion. For example, the sample $6.1, 2.1, 5.3, 4.7, 5.5, 6.2, 5.3, 4.7$ has ranks $7, 1, 4, 2, 6, 8, 4, 2$.
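The minimum rank notion used in Remark 1 assigns to every tied value the smallest rank of its group. A short sketch in plain Python reproduces the example; the helper name `min_ranks` is hypothetical.

```python
def min_ranks(sample):
    """Minimum ranks: tied values all receive the smallest rank of their group."""
    first_position = {}
    # Walk the sorted sample and remember the first (1-based) position
    # at which each distinct value appears.
    for position, value in enumerate(sorted(sample), start=1):
        first_position.setdefault(value, position)
    return [first_position[value] for value in sample]


print(min_ranks([6.1, 2.1, 5.3, 4.7, 5.5, 6.2, 5.3, 4.7]))
# [7, 1, 4, 2, 6, 8, 4, 2]
```

Because two values share rank 4 and two share rank 2, ranks 3 and 5 are never assigned.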

**Example 1.**

**Remark 2.**

**Definition 3.**

**Definition 4.**

#### 2.1. ${F}_{JLN{D}_{n}}$ Estimates

## 3. Simulations

- i. $D1(m,a)$: Uniform on $A=\left\{(x,y)\in\{1,\dots,m\}^{2}:|x-y|\le a\right\}$;
- ii. $D2(m,a)$: Uniform on $A=\left\{(x,y)\in\{1,\dots,m\}^{2}:|x-y|\le a\ \text{or}\ |x+y-m-1|\le a\right\}$;
- iii. $D3(m,a,b)$: Uniform on $A=\left\{(x,y)\in\{1,\dots,m\}^{2}:|x-y|\le a\ \text{or}\ |x-y+b|\le a\ \text{or}\ |x-y-b|\le a\ \text{or}\ |x-y-2b|\le a\ \text{or}\ |x-y+2b|\le a\right\}$.
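Since each of $D1$, $D2$ and $D3$ is uniform on a finite set $A$, one simple way to simulate from them is to enumerate $A$ and draw points with equal probability. The sketch below takes that route; the function names are hypothetical and the enumeration is only one of several possible sampling strategies.

```python
import random


def support_d1(m, a):
    """Points of {1,...,m}^2 with |x - y| <= a."""
    grid = range(1, m + 1)
    return [(x, y) for x in grid for y in grid if abs(x - y) <= a]


def support_d2(m, a):
    """Points with |x - y| <= a or |x + y - m - 1| <= a."""
    grid = range(1, m + 1)
    return [(x, y) for x in grid for y in grid
            if abs(x - y) <= a or abs(x + y - m - 1) <= a]


def support_d3(m, a, b):
    """Points within a of one of the lines x - y = 0, +-b, +-2b."""
    grid = range(1, m + 1)
    shifts = (0, b, -b, 2 * b, -2 * b)
    return [(x, y) for x in grid for y in grid
            if any(abs(x - y + s) <= a for s in shifts)]


def sample_uniform(support, n, rng):
    """n independent draws, each uniform on the finite set `support`."""
    return [rng.choice(support) for _ in range(n)]


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
pairs = sample_uniform(support_d3(20, 1, 6), 5, rng)
```

Enumerating the support costs $O(m^2)$ memory, which is negligible for the grid sizes considered here ($m\le 100$).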

- iv. $M1(m,a)$: $p\,D1(m,a)+(1-p)\,U(m)$;
- v. $M2(m,a)$: $p\,D2(m,a)+(1-p)\,U(m)$;
- vi. $M3(m,a,b)$: $p\,D3(m,a,b)+(1-p)\,U(m)$.
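A draw from a two-component mixture such as $M1(m,a)$ can be obtained by first choosing the component with probability $p$ and then sampling from it. The sketch below illustrates this for $M1$ under one loudly stated assumption: $U(m)$ is taken to be a pair of independent uniform variables on $\{1,\dots,m\}$, which is not confirmed by the text above. All function names are hypothetical.

```python
import random


def sample_d1(m, a, rng):
    """One draw from D1(m, a): rejection sampling from the full grid."""
    while True:
        x, y = rng.randint(1, m), rng.randint(1, m)
        if abs(x - y) <= a:  # accepted points are uniform on A
            return (x, y)


def sample_u(m, rng):
    """ASSUMPTION: U(m) is two independent uniforms on {1,...,m}."""
    return (rng.randint(1, m), rng.randint(1, m))


def sample_m1(m, a, p, n, rng):
    """n draws from p * D1(m, a) + (1 - p) * U(m)."""
    return [sample_d1(m, a, rng) if rng.random() < p else sample_u(m, rng)
            for _ in range(n)]


rng = random.Random(1)  # fixed seed only to make the sketch repeatable
sample = sample_m1(m=20, a=1, p=0.8, n=100, rng=rng)
```

With $p=1$ the mixture reduces to $D1(m,a)$ itself, and with $p=0$ it reduces to the assumed independent component.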

- vii. $D4(m,a)$: Uniform distribution on $A=\left\{(x,y)\in\{1,\dots,m\}\times[0,m+1]:|x-y|\le a\right\}$;
- viii. $D5(m,a)$: Uniform on $A=\left\{(x,y)\in\{1,\dots,m\}\times[0,m+1]:|x-y|\le a\ \text{or}\ |x+y-m-1|\le a\right\}$.
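In $D4(m,a)$ the first coordinate is discrete and the second is continuous, so $A$ is a union of $m$ vertical segments. The sketch below reads "uniform on $A$" as uniform with respect to total segment length: $x$ is chosen with probability proportional to the length of its segment and $y$ is then uniform on that segment. This reading, the requirement $a>0$, and the function name `sample_d4` are assumptions made for illustration.

```python
import random


def sample_d4(m, a, n, rng):
    """n draws from D4(m, a), read as uniform over the segments of A (a > 0)."""
    xs = list(range(1, m + 1))
    # For each x, the admissible y values form the interval
    # [x - a, x + a] clipped to [0, m + 1].
    segments = [(max(0.0, x - a), min(m + 1.0, x + a)) for x in xs]
    lengths = [hi - lo for lo, hi in segments]
    draws = []
    for _ in range(n):
        # Pick a segment with probability proportional to its length,
        # then a point uniformly inside it.
        i = rng.choices(range(m), weights=lengths, k=1)[0]
        lo, hi = segments[i]
        draws.append((xs[i], rng.uniform(lo, hi)))
    return draws


rng = random.Random(3)  # fixed seed only to make the sketch repeatable
points = sample_d4(m=5, a=0.5, n=10, rng=rng)
```

For $a\le 1$ no segment is clipped, so every $x$ is equally likely and the weighting step has no effect.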

- ix. $M4(m,a)$: $p\,D4(m,a)+(1-p)\,W(m)$;
- x. $M5(m,a)$: $p\,D5(m,a)+(1-p)\,W(m)$.

## 4. Applying the Test in Real Data

## 5. Concluding Remarks

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Agresti, A. Categorical Data Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2002.
2. Kendall, M.G. The treatment of ties in ranking problems. Biometrika **1945**, 33, 239–251.
3. Cramér, H. Mathematical Methods of Statistics; Princeton U. Press: Princeton, NJ, USA, 1946; Volume 500.
4. Romik, D. The Surprising Mathematics of Longest Increasing Subsequences; Cambridge University Press: New York, NY, USA, 2015; Volume 4.
5. García, J.E.; González-López, V.A. Independence test for sparse data. AIP Conf. Proc. **2016**, 1738, 140002.
6. García, J.E.; González-López, V.A. Independence tests for continuous random variables based on the longest increasing subsequence. J. Multivar. Anal. **2014**, 127, 126–146.
7. Zelterman, D. Goodness-of-fit tests for large sparse multinomial distributions. J. Am. Stat. Assoc. **1987**, 82, 624–629.
8. Hollander, M.; Wolfe, D. Nonparametric Statistical Methods; John Wiley & Sons: New York, NY, USA, 1973; pp. 185–194.
9. Simonoff, J.S. Smoothing Methods in Statistics; Springer: New York, NY, USA, 1996.
10. Weisberg, S. Applied Linear Regression, 4th ed.; John Wiley & Sons: Minneapolis, MN, USA, 2005; Volume 528.

**Figure 2.** (**left**) $X$ vs. $Y$. (**right**) $\mathrm{ranks}(X)$ vs. $\mathrm{ranks}(Y)$. The values of $X$ and $Y$ are simulated from two independent exponential distributions, $\lambda=10$ for $X$ and $\lambda=20$ for $Y$, $n=100$.

**Figure 5.** (**left**) Alcohol vs. Flavanoids. (**right**) Flavanoids vs. Intensity. Variables from the wine data set in the gclus R package.

$x_i$ | $y_i$ | Rank ($x_i$) | Rank ($y_i$) |
---|---|---|---|
5.3 | 10.2 | 1 | 5 |
5.3 | 9.3 | 1 | 1 |
6.1 | 9.3 | 3 | 1 |
6.1 | 10.1 | 3 | 3 |
7.1 | 10.1 | 5 | 3 |
7.3 | 11.0 | 6 | 6 |

**Table 2.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (2), in 1000 simulations of size $n$, of two independent and discrete Uniform distributions in $\{1,\dots,m\}$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$.

$\alpha$ | $n$ | $m=10$ | $m=20$ | $m=50$ | $m=100$ |
---|---|---|---|---|---|
0.01 | 20 | 0.013 | 0.021 | 0.022 | 0.032 |
0.01 | 40 | 0.021 | 0.038 | 0.037 | 0.041 |
0.01 | 60 | 0.025 | 0.033 | 0.043 | 0.050 |
0.01 | 80 | 0.019 | 0.040 | 0.053 | 0.050 |
0.01 | 100 | 0.028 | 0.034 | 0.044 | 0.059 |
0.05 | 20 | 0.084 | 0.089 | 0.112 | 0.100 |
0.05 | 40 | 0.091 | 0.105 | 0.134 | 0.143 |
0.05 | 60 | 0.104 | 0.114 | 0.148 | 0.149 |
0.05 | 80 | 0.095 | 0.124 | 0.139 | 0.159 |
0.05 | 100 | 0.113 | 0.111 | 0.125 | 0.143 |

**Table 3.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$, of two independent and discrete Uniform distributions in $\{1,\dots,m\}$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$.

$\alpha$ | $n$ | $m=2$ | $m=3$ | $m=4$ | $m=5$ | $m=10$ | $m=20$ | $m=50$ | $m=100$ |
---|---|---|---|---|---|---|---|---|---|
0.01 | 20 | 0.004 | 0.006 | 0.008 | 0.014 | 0.011 | 0.007 | 0.007 | 0.004 |
0.01 | 40 | 0.004 | 0.009 | 0.005 | 0.008 | 0.007 | 0.008 | 0.010 | 0.011 |
0.01 | 60 | 0.005 | 0.004 | 0.009 | 0.007 | 0.006 | 0.004 | 0.010 | 0.014 |
0.01 | 80 | 0.006 | 0.005 | 0.010 | 0.011 | 0.012 | 0.009 | 0.006 | 0.008 |
0.01 | 100 | 0.005 | 0.011 | 0.011 | 0.009 | 0.007 | 0.009 | 0.012 | 0.008 |
0.05 | 20 | 0.021 | 0.032 | 0.041 | 0.047 | 0.038 | 0.048 | 0.042 | 0.043 |
0.05 | 40 | 0.019 | 0.038 | 0.030 | 0.043 | 0.030 | 0.046 | 0.045 | 0.037 |
0.05 | 60 | 0.029 | 0.041 | 0.042 | 0.044 | 0.040 | 0.032 | 0.056 | 0.044 |
0.05 | 80 | 0.039 | 0.031 | 0.045 | 0.046 | 0.051 | 0.046 | 0.046 | 0.052 |
0.05 | 100 | 0.031 | 0.041 | 0.048 | 0.049 | 0.052 | 0.054 | 0.053 | 0.053 |

**Table 4.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$. $m=20$, $a=1$, $b=6$.

$\alpha$ | $n$ | $D1$ ($p=1.0$) | $D2$ ($p=1.0$) | $D3$ ($p=1.0$) | $M1$ ($p=0.8$) | $M2$ ($p=0.8$) | $M3$ ($p=0.8$) |
---|---|---|---|---|---|---|---|
0.01 | 20 | 1.000 | 0.349 | 0.028 | 0.994 | 0.179 | 0.018 |
0.01 | 40 | 1.000 | 0.798 | 0.050 | 1.000 | 0.568 | 0.034 |
0.01 | 60 | 1.000 | 0.983 | 0.136 | 1.000 | 0.858 | 0.078 |
0.01 | 80 | 1.000 | 0.999 | 0.252 | 1.000 | 0.963 | 0.109 |
0.01 | 100 | 1.000 | 1.000 | 0.352 | 1.000 | 0.990 | 0.181 |
0.05 | 20 | 1.000 | 0.537 | 0.101 | 1.000 | 0.366 | 0.064 |
0.05 | 40 | 1.000 | 0.906 | 0.177 | 1.000 | 0.757 | 0.125 |
0.05 | 60 | 1.000 | 0.993 | 0.306 | 1.000 | 0.934 | 0.192 |
0.05 | 80 | 1.000 | 1.000 | 0.468 | 1.000 | 0.985 | 0.250 |
0.05 | 100 | 1.000 | 1.000 | 0.601 | 1.000 | 0.997 | 0.368 |

**Table 5.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$. $m=50$, $a=2$, $b=12$.

$\alpha$ | $n$ | $D1$ ($p=1.0$) | $D2$ ($p=1.0$) | $D3$ ($p=1.0$) | $M1$ ($p=0.8$) | $M2$ ($p=0.8$) | $M3$ ($p=0.8$) |
---|---|---|---|---|---|---|---|
0.01 | 20 | 1.000 | 0.431 | 0.044 | 0.993 | 0.229 | 0.024 |
0.01 | 40 | 1.000 | 0.902 | 0.172 | 1.000 | 0.719 | 0.079 |
0.01 | 60 | 1.000 | 0.989 | 0.374 | 1.000 | 0.924 | 0.180 |
0.01 | 80 | 1.000 | 1.000 | 0.615 | 1.000 | 0.986 | 0.342 |
0.01 | 100 | 1.000 | 0.999 | 0.762 | 1.000 | 0.998 | 0.515 |
0.05 | 20 | 1.000 | 0.610 | 0.126 | 0.998 | 0.404 | 0.097 |
0.05 | 40 | 1.000 | 0.949 | 0.368 | 1.000 | 0.837 | 0.196 |
0.05 | 60 | 1.000 | 0.997 | 0.608 | 1.000 | 0.963 | 0.370 |
0.05 | 80 | 1.000 | 1.000 | 0.823 | 1.000 | 0.993 | 0.571 |
0.05 | 100 | 1.000 | 1.000 | 0.918 | 1.000 | 0.999 | 0.711 |

**Table 6.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$. $m=100$, $a=5$, $b=30$.

$\alpha$ | $n$ | $D1$ ($p=1.0$) | $D2$ ($p=1.0$) | $D3$ ($p=1.0$) | $M1$ ($p=0.8$) | $M2$ ($p=0.8$) | $M3$ ($p=0.8$) |
---|---|---|---|---|---|---|---|
0.01 | 20 | 1.000 | 0.409 | 0.038 | 0.997 | 0.179 | 0.024 |
0.01 | 40 | 1.000 | 0.865 | 0.137 | 1.000 | 0.623 | 0.063 |
0.01 | 60 | 1.000 | 0.984 | 0.292 | 1.000 | 0.884 | 0.141 |
0.01 | 80 | 1.000 | 0.999 | 0.473 | 1.000 | 0.969 | 0.247 |
0.01 | 100 | 1.000 | 1.000 | 0.655 | 1.000 | 0.991 | 0.394 |
0.05 | 20 | 1.000 | 0.597 | 0.110 | 1.000 | 0.345 | 0.090 |
0.05 | 40 | 1.000 | 0.933 | 0.300 | 1.000 | 0.771 | 0.194 |
0.05 | 60 | 1.000 | 0.996 | 0.520 | 1.000 | 0.949 | 0.326 |
0.05 | 80 | 1.000 | 1.000 | 0.715 | 1.000 | 0.990 | 0.468 |
0.05 | 100 | 1.000 | 1.000 | 0.848 | 1.000 | 0.998 | 0.620 |

**Table 7.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$. Distribution $M4$ with $a=0.5$.

$\alpha$ | $n$ | $m=2$ ($p=1.0$) | $m=3$ ($p=1.0$) | $m=5$ ($p=1.0$) | $m=10$ ($p=1.0$) | $m=2$ ($p=0.8$) | $m=3$ ($p=0.8$) | $m=5$ ($p=0.8$) | $m=10$ ($p=0.8$) |
---|---|---|---|---|---|---|---|---|---|
0.01 | 20 | 1.000 | 1.000 | 1.000 | 1.000 | 0.713 | 0.936 | 0.985 | 0.996 |
0.01 | 40 | 1.000 | 1.000 | 1.000 | 1.000 | 0.993 | 1.000 | 1.000 | 1.000 |
0.01 | 60 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
0.01 | 80 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
0.01 | 100 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
0.05 | 20 | 1.000 | 1.000 | 1.000 | 1.000 | 0.882 | 0.977 | 0.999 | 0.999 |
0.05 | 40 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 |
0.05 | 60 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
0.05 | 80 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
0.05 | 100 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |

**Table 8.** The proportion of p-value $\le\alpha$ computed from Definition 4 and Equation (3), in 1000 simulations of size $n$. On top, results for $\alpha=0.01$; on bottom, results for $\alpha=0.05$. Distribution $M5$ with $a=0.5$.

$\alpha$ | $n$ | $m=3$ ($p=1.0$) | $m=5$ ($p=1.0$) | $m=10$ ($p=1.0$) | $m=3$ ($p=0.8$) | $m=5$ ($p=0.8$) | $m=10$ ($p=0.8$) |
---|---|---|---|---|---|---|---|
0.01 | 20 | 0.070 | 0.209 | 0.446 | 0.031 | 0.105 | 0.209 |
0.01 | 40 | 0.211 | 0.634 | 0.926 | 0.118 | 0.374 | 0.708 |
0.01 | 60 | 0.446 | 0.910 | 0.996 | 0.224 | 0.655 | 0.945 |
0.01 | 80 | 0.611 | 0.977 | 1.000 | 0.364 | 0.852 | 0.994 |
0.01 | 100 | 0.728 | 0.999 | 1.000 | 0.528 | 0.949 | 0.999 |
0.05 | 20 | 0.161 | 0.385 | 0.638 | 0.112 | 0.239 | 0.388 |
0.05 | 40 | 0.358 | 0.791 | 0.971 | 0.231 | 0.578 | 0.828 |
0.05 | 60 | 0.612 | 0.970 | 0.999 | 0.384 | 0.810 | 0.973 |
0.05 | 80 | 0.736 | 0.995 | 1.000 | 0.529 | 0.951 | 0.998 |
0.05 | 100 | 0.850 | 0.999 | 1.000 | 0.671 | 0.981 | 1.000 |

**Table 9.** p-value of Copula’s test, Hoeffding’s test, $JLND_n$’s test ($B=5000$); p-value and coefficient of Kendall’s test, Pearson’s test and Spearman’s test. Case (ii) Flavanoids vs. Intensity.

| | Copula | Hoeffding | Spearman | Pearson | Kendall | $JLND_n$ Equation (2) | $JLND_n$ Equation (3) |
|---|---|---|---|---|---|---|---|
| p-value | 0.0005 | 0.0000 | 0.5695 | 0.0214 | 0.5713 | 0.0380 | 0.0044 |
| coefficient | | | −0.0429 | −0.1724 | 0.0287 | | |

**Table 10.** Data set cdrate from [9] organized by attributes Return on CD and Type = 0 (bank), 1 (thrift).

Return on CD | Type = 0 | Type = 1 | Return on CD | Type = 0 | Type = 1 | Return on CD | Type = 0 | Type = 1 |
---|---|---|---|---|---|---|---|---|
7.51 | 0 | 1 | 8.15 | 0 | 1 | 8.49 | 0 | 3 |
7.56 | 1 | 0 | 8.17 | 1 | 0 | 8.50 | 1 | 9 |
7.57 | 1 | 0 | 8.20 | 0 | 1 | 8.51 | 1 | 0 |
7.71 | 1 | 0 | 8.25 | 0 | 2 | 8.52 | 0 | 1 |
7.75 | 0 | 1 | 8.30 | 1 | 2 | 8.55 | 1 | 0 |
7.82 | 2 | 0 | 8.33 | 2 | 1 | 8.57 | 1 | 0 |
7.90 | 1 | 1 | 8.34 | 0 | 1 | 8.65 | 2 | 0 |
8.00 | 7 | 3 | 8.35 | 0 | 2 | 8.70 | 0 | 1 |
8.05 | 2 | 0 | 8.36 | 0 | 1 | 8.71 | 1 | 0 |
8.06 | 1 | 0 | 8.40 | 1 | 6 | 8.75 | 0 | 1 |
8.11 | 1 | 0 | 8.45 | 0 | 1 | 8.78 | 0 | 1 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

García, J.E.; González-López, V.A.
Random Permutations, Non-Decreasing Subsequences and Statistical Independence. *Symmetry* **2020**, *12*, 1415.
https://doi.org/10.3390/sym12091415
