# Consistent Estimation of Partition Markov Models


## Abstract


## 1. Introduction

## 2. Preliminaries

**Definition 1.**

- (i) $s,r\in \mathcal{S}$ are equivalent (denoted by $s\sim_{p}r$) if $P(a|s)=P(a|r)\ \ \forall a\in A.$
- (ii) $(X_{t})$ is a Markov chain with partition $\mathcal{L}=\{L_{1},L_{2},\dots ,L_{|\mathcal{L}|}\}$ if this partition is the one defined by the equivalence relation $\sim_{p}$ introduced in item (i).
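
As an illustration of Definition 1, the following Python sketch groups states whose conditional distributions coincide, which yields the partition induced by $\sim_{p}$ when the transition probabilities are known. The helper name `partition_by_transition` and the rounding used to compare probabilities are assumptions of this example. The input uses the values of $P(0|s)$ for $s\in\{0,1\}^{3}$ tabulated in this article, with $P(1|s)=1-P(0|s)$.

```python
from collections import defaultdict


def partition_by_transition(P, decimals=12):
    """Group states s that share the same conditional distribution P(.|s).

    P maps each state to a dict {symbol: probability}. Probabilities are
    rounded before comparison (an assumption of this sketch) so that tiny
    floating-point differences do not split a part.
    """
    groups = defaultdict(list)
    for state, dist in P.items():
        key = tuple(sorted((a, round(p, decimals)) for a, p in dist.items()))
        groups[key].append(state)
    # Sort states inside each part, then sort the parts, for a stable output.
    return sorted(sorted(part) for part in groups.values())


# P(0|s) for s in {0,1}^3, as tabulated in this article.
p0 = {"000": 0.1, "001": 0.1, "010": 0.1, "011": 0.2,
      "100": 0.1, "101": 0.1, "110": 0.1, "111": 0.1}
P = {s: {"0": q, "1": 1.0 - q} for s, q in p0.items()}

partition = partition_by_transition(P)
for part in partition:
    print(part)
```

With these values the state 011 forms a part on its own and the remaining seven states form a second part.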

**Remark 1.**

**Example 1.**

- (i) Suppose $P(\cdot|s)\ne P(\cdot|s^{\prime})$ for all $s,s^{\prime}\in \mathcal{T}$ with $s\ne s^{\prime}.$ Then $\mathcal{L}$ satisfies Definition 1(ii);
- (ii) Suppose $P(\cdot|s)\ne P(\cdot|s^{\prime})$ for all $s,s^{\prime}\in \mathcal{T}\setminus \{0\}$ with $s\ne s^{\prime}$, and $P(\cdot|\{0\})=P(\cdot|\{01\}).$ Define $L_{1}^{\prime}=L_{1}\cup L_{2}$; then $\mathcal{L}^{\prime}=\{L_{1}^{\prime},L_{3},L_{4}\}$ satisfies Definition 1(ii), while $\mathcal{L}$ does not.

**Example 2.**

## 3. Consistent Estimation through the Bayesian Information Criterion

**Definition 2.**

**Remark 2.**

**Definition 3.**

- (i) $L\in \mathcal{L}$ is a good part of $\mathcal{L}$ if, for all $s,s^{\prime}\in L,$ $\operatorname{Prob}(X_{t}=\cdot\,|\,X_{t-M}^{t-1}=s)=\operatorname{Prob}(X_{t}=\cdot\,|\,X_{t-M}^{t-1}=s^{\prime})$ for all $t>M;$
- (ii) $\mathcal{L}$ is a good partition of $\mathcal{S}$ if, for each $i\in \{1,\dots ,|\mathcal{L}|\},$ $L_{i}$ satisfies item (i).
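
Definition 3 can be checked directly when the conditional distributions are known. The sketch below is a minimal illustration under that assumption. The function names, the tolerance `tol`, and the order-2 transition probabilities are hypothetical and chosen only for the example.

```python
def is_good_part(part, P, tol=1e-12):
    """True if all states in `part` share the same distribution P(.|s)."""
    states = sorted(part)
    reference = P[states[0]]
    return all(
        abs(P[s][a] - reference[a]) <= tol
        for s in states[1:]
        for a in reference
    )


def is_good_partition(partition, P, tol=1e-12):
    """True if every part of `partition` is a good part (Definition 3(ii))."""
    return all(is_good_part(part, P, tol) for part in partition)


# Hypothetical order-2 chain on A = {0, 1}; the numbers are illustrative only.
P = {
    "00": {"0": 0.3, "1": 0.7},
    "01": {"0": 0.6, "1": 0.4},
    "10": {"0": 0.3, "1": 0.7},
    "11": {"0": 0.5, "1": 0.5},
}

singletons = [{s} for s in P]               # one state per part
coarser = [{"00", "10"}, {"01"}, {"11"}]    # merges two equivalent states
too_coarse = [{"00", "01"}, {"10", "11"}]   # mixes different distributions

print(is_good_partition(singletons, P))
print(is_good_partition(coarser, P))
print(is_good_partition(too_coarse, P))
```

The partition into singletons is always good, and a good partition may be finer than the partition of Definition 1(ii).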

**Example 3.**

- (i) $\mathcal{L}=\mathcal{S}$ is a good partition of $\mathcal{S}.$
- (ii) In Example 1(ii), the partition $\mathcal{L}$ is a good partition of $\mathcal{S}=\{0,1\}^{3}.$

**Notation 1.**

- (a) Let $\mathcal{L}=\{L_{1},\dots ,L_{|\mathcal{L}|}\}$ be a partition of $\mathcal{S}.$ For $1\le i<j\le |\mathcal{L}|,$ let $\mathcal{L}^{ij}$ denote the partition $\mathcal{L}^{ij}=\{L_{1},\dots ,L_{i-1},L_{ij},L_{i+1},\dots ,L_{j-1},L_{j+1},\dots ,L_{|\mathcal{L}|}\},$ where $L_{ij}=L_{i}\cup L_{j}.$
- (b) For $a\in A$, we write $P(L_{ij},a)=P(L_{i},a)+P(L_{j},a)$ and $P(L_{ij})=P(L_{i})+P(L_{j}).$ In addition, $$N_{n}(L_{ij},a)=N_{n}(L_{i},a)+N_{n}(L_{j},a);\qquad N_{n}(L_{ij})=N_{n}(L_{i})+N_{n}(L_{j}).$$
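
The merge described in Notation 1 can be sketched in Python as follows. The function name `merge_parts`, the use of 0-based indices, and the example counts are assumptions of this illustration. Each part carries a `Counter` of the counts $N_{n}(L_{i},a)$, and the sketch takes $N_{n}(L_{i})$ to be the sum of these counts over $a$.

```python
from collections import Counter


def merge_parts(partition, counts, i, j):
    """Build L^{ij} and its counts from a partition, for 0-based i < j.

    partition: list of frozensets of states.
    counts: list of Counters, counts[k][a] playing the role of N_n(L_k, a).
    The merged part replaces L_i and L_j is dropped, as in Notation 1(a).
    """
    if not 0 <= i < j < len(partition):
        raise ValueError("indices must satisfy 0 <= i < j < len(partition)")
    merged_part = partition[i] | partition[j]
    # Counter addition adds counts symbol by symbol (Notation 1(b)).
    merged_counts = counts[i] + counts[j]
    new_partition = partition[:i] + [merged_part] + partition[i + 1:j] + partition[j + 1:]
    new_counts = counts[:i] + [merged_counts] + counts[i + 1:j] + counts[j + 1:]
    return new_partition, new_counts


# Hypothetical partition of {0,1}^2 with made-up counts.
partition = [frozenset({"00"}), frozenset({"01"}), frozenset({"10", "11"})]
counts = [Counter({"0": 3, "1": 7}), Counter({"0": 2, "1": 8}), Counter({"0": 5, "1": 5})]

new_partition, new_counts = merge_parts(partition, counts, 0, 2)
print(new_partition)
print(new_counts)
```

Note that `Counter` addition omits symbols whose total count is zero, which is harmless here because a missing key reads as 0.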

#### 3.1. A Metric on the State Space

**Theorem 1.**

**Proof.**

**Corollary 1.**

**Proof.**

**Remark 3.**

**Definition 4.**

**Theorem 2.**

- (i) $d_{\mathcal{L}}(i,j)\ge 0,$ with equality if and only if $\frac{N_{n}(L_{i},a)}{N_{n}(L_{i})}=\frac{N_{n}(L_{j},a)}{N_{n}(L_{j})}\ \ \forall a\in A;$
- (ii) $d_{\mathcal{L}}(i,j)=d_{\mathcal{L}}(j,i);$
- (iii) $d_{\mathcal{L}}(i,k)\le d_{\mathcal{L}}(i,j)+d_{\mathcal{L}}(j,k).$
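
The formula of $d_{\mathcal{L}}$ is not reproduced in the text above, so the following Python sketch relies on an assumed form: a scaled log-likelihood ratio between keeping $L_{i}$ and $L_{j}$ separate and merging them into $L_{ij}$, with scaling $2/((|A|-1)\ln n)$. Both the form and the scaling are assumptions of this example, as are the function names and the counts. The sketch only illustrates numerically properties (i) and (ii) for a quantity of that type; it makes no claim about (iii).

```python
import math


def _term(count, total):
    """count * ln(count / total), with the convention 0 * ln(0) = 0."""
    return 0.0 if count == 0 else count * math.log(count / total)


def part_distance(counts_i, counts_j, alphabet, n):
    """Assumed log-likelihood-ratio distance between two parts.

    counts_i[a], counts_j[a] play the role of N_n(L_i, a), N_n(L_j, a);
    n is the sample size (n > 1) and `alphabet` has at least two symbols.
    """
    n_i = sum(counts_i.get(a, 0) for a in alphabet)
    n_j = sum(counts_j.get(a, 0) for a in alphabet)
    total = 0.0
    for a in alphabet:
        c_i = counts_i.get(a, 0)
        c_j = counts_j.get(a, 0)
        total += _term(c_i, n_i) + _term(c_j, n_j) - _term(c_i + c_j, n_i + n_j)
    return 2.0 * total / ((len(alphabet) - 1) * math.log(n))


alphabet = ["0", "1"]
n = 1000  # hypothetical sample size
c1 = {"0": 10, "1": 90}
c2 = {"0": 20, "1": 180}   # same proportions as c1
c3 = {"0": 90, "1": 10}    # very different proportions

print(part_distance(c1, c2, alphabet, n))  # numerically zero
print(part_distance(c1, c3, alphabet, n))  # strictly positive
```

Non-negativity of this assumed quantity follows from the log-sum inequality, with equality exactly when the empirical proportions agree for every symbol.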

**Proof.**

**Corollary 2.**

**Proof.**

#### 3.2. Consistent Estimation of the Process’s Partition

**Theorem 3.**

**Proof.**

**Corollary 3.**

**Remark 4.**

## 4. Navigation Patterns on a Web Site (MSNBC.com)

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A. Proofs

**Definition A1.**

#### Appendix A.1. Proof of Theorem 1

#### Appendix A.2. Proof of Theorem 2

**Proof.**

#### Appendix A.3. Proof of Theorem 3

**Proof.**

## Appendix B. Auxiliary Results

**Proposition A1.**

**Proof.**


$s$ | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
$P(0\mid s)$ | 0.1 | 0.1 | 0.1 | 0.2 | 0.1 | 0.1 | 0.1 | 0.1 |

User | Page categories visited, in order |
---|---|
User 1 | frontpage, tech, tech, frontpage. |
User 2 | weather, weather, weather, misc, local, weather, weather, weather. |
User 3 | on-air, msn-news, msn-news, msn-news, msn-news, misc, msn-news. |
User 4 | news. |
User 5 | msn-sports, sports, msn-sports. |
User 6 | frontpage, frontpage, frontpage. |
User 7 | news, business, tech, local, business, business. |
User 8 | frontpage. |
User 9 | local. |
User 10 | frontpage, tech, tech. |
User 11 | frontpage, frontpage, business, frontpage. |
User 12 | sports, sports, sports, sports, sports, sports. |

**Table 3.** Number of parts (cardinality of $\mathcal{L}$) and BIC value of the model (Definition 2), for memories 2 and 3. In (ii), $\epsilon=\frac{(|A|-1)}{2}\frac{1}{10}$ was used. The highest BIC values, which indicate the best method, are marked in bold.

Order | Method | Number of Parts ($\lvert\mathcal{L}\rvert$) | BIC Value |
---|---|---|---|
3 | (i) | 196 | −2957442 |
3 | (ii) | 210 | −2895322 |
3 | (iii) | 269 | **−2865622** |
2 | (i) | 177 | −3614825 |
2 | (ii) | 177 | −3613655 |
2 | (iii) | 181 | **−3611092** |
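
The selection rule stated in the caption of Table 3 (the highest BIC value indicates the best method) can be reproduced with a few lines of Python. This is only a sketch of the comparison. The dictionary layout and variable names are assumptions of the example, while the numbers are the BIC values reported in Table 3.

```python
# BIC values from Table 3, indexed by order and then by method.
bic = {
    3: {"(i)": -2957442, "(ii)": -2895322, "(iii)": -2865622},
    2: {"(i)": -3614825, "(ii)": -3613655, "(iii)": -3611092},
}

# For each order, pick the method whose BIC value is largest.
best = {order: max(values, key=values.get) for order, values in bic.items()}

for order, method in best.items():
    print(f"order {order}: best method {method} with BIC {bic[order][method]}")
```

For both orders the comparison selects method (iii).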

Part | Strings | $P(\text{local}\mid L_{i})$ |
---|---|---|
$L_{1}$ | msn-news.news.local, msn-news.business.local, on-air.tech.local, tech.local.local, msn-news.tech.local, business.local.local, on-air.local.local, msn-news.local.local | 0.7257 |
$L_{2}$ | health.news.local, health.local.local, news.local.local | 0.6096 |
$L_{3}$ | local.local.local | 0.8874 |
$L_{4}$ | misc.local.local, tech.weather.local, frontpage.opinion.misc, local.news.misc, local.misc.local, misc.misc.local | 0.6355 |
$L_{5}$ | weather.local.local, local.weather.misc, weather.weather.misc | 0.7822 |
$L_{6}$ | local.local.misc, on-air.weather.misc, msn-news.weather.misc, local.misc.misc | 0.7373 |
$L_{7}$ | misc.local.misc | 0.8563 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

García, J.E.; González-López, V.A.
Consistent Estimation of Partition Markov Models. *Entropy* **2017**, *19*, 160.
https://doi.org/10.3390/e19040160
