# Addressing the Cold-Start Problem in Recommender Systems Based on Frequent Patterns

## Abstract

## 1. Introduction

- It proposes a novel hybrid approach that combines clustering, discriminative itemset mining and the user’s/product’s context.
- It can handle both the sparsity and the cold-start recommendation problems.
- It can handle the out-of-matrix prediction (pure) cold-start recommendation problem, where users have no rating history. Most similar approaches need several initial ratings.
- It does not require an interview process in which the user must supply extra information (e.g., extended profile data, answers to questions, or item ratings) before using the system for the first time [5].
- It does not rely on many interrelated modules with multiple dependencies.

## 2. Literature Review

## 3. Proposed Methodology

Algorithm 1: Recommendation Algorithm based on Discriminant Frequent Itemsets

Step 1: Input

- The set of users $U=\left\{u_{1},\dots ,u_{nofu}\right\}$, where $nofu$ is the number of users
- The set of products $P=\left\{p_{1},\dots ,p_{nofp}\right\}$, where $nofp$ is the number of products
- The $\left(nofu\times c\right)$ user matrix $UM=\left\{\left(u_{i},u_{i1},\dots ,u_{ic}\right) \mid u_{i1},\dots ,u_{ic}\ \text{are the characteristics of user}\ u_{i}\in U\right\}$
- The $\left(nofu\times nofp\right)$ user–product matrix $UP=\left\{\left(u_{i},p_{1},\dots ,p_{nofp}\right) \mid u_{i}\in U,\ p_{1},\dots ,p_{nofp}\in P\right\}$

Step 2: Extract $C=\left\{c_{1},\dots ,c_{n}\right\}$ clusters of users

Step 3: For each cluster $c_{i}\in C$ do

- Form the $\left|c_{i}\right|\times nofp$ sub user–product binary matrix $UP_{c_{i}}=\left\{\left(u_{i},p_{1},\dots ,p_{nofp}\right) \mid \left(u_{i},p_{1},\dots ,p_{nofp}\right)\in UP,\ u_{i}\in c_{i}\right\}$
- Extract frequent itemsets $FIS_{c_{i}}=\left\{fis_{c_{i}1},\dots ,fis_{c_{i}m}\right\}$ of $UP_{c_{i}}$

End For

Step 4: For each cluster $c_{i}\in C$ do

- Form the $\left|c_{i}\right|\times nofp$ sub user–product binary matrix $UP_{c_{i}}=\left\{\left(u_{i},p_{1},\dots ,p_{nofp}\right) \mid \left(u_{i},p_{1},\dots ,p_{nofp}\right)\in UP,\ u_{i}\in c_{i}\right\}$
- Extract discriminant frequent itemsets $DFIS_{c_{i}}=\left\{dfis_{c_{i}1},\dots ,dfis_{c_{i}q}\right\}$ of $UP_{c_{i}}$, where $DFIS_{c_{i}}\subseteq FIS_{c_{i}}$

End For

Step 5: For each $u_{i}\in U$ do

- If $\left(u_{i},p_{1},\dots ,p_{nofp}\right)\in UP$ with $p_{1}=p_{2}=\dots =p_{nofp}=0$ (cold user) Then
  - For each $dfis_{c_{i}k}\in DFIS_{c_{i}}\wedge u_{i}\in c_{i}$ do
    - $p_{s}=1,\ s\in dfis_{c_{i}k}$
  - End For
- Else (sparsity)
  - For each $dfis_{c_{i}k}\in DFIS_{c_{i}}\wedge u_{i}\in c_{i}$ do
    - If $\left|\left\{p_{s} \mid s\in dfis_{c_{i}k}\wedge p_{s}=1\right\}\right| > ct$ Then
      - If $s\in dfis_{c_{i}k}\wedge p_{s}=NULL$ Then $p_{s}=1$ End If
    - End If
  - End For
- End If

End For
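A minimal Python sketch of Step 5 follows, assuming the discriminant frequent itemsets of the user's cluster have already been extracted. The names `recommend` and `Row` and the 0-based product indices are illustrative and do not come from the paper. The comparison against the threshold `ct` is assumed to be a strict "greater than", and NULL entries are modeled as `None`.

```python
from typing import FrozenSet, List, Optional

# One row of the user-product matrix:
# 1 = positive entry, 0 = zero entry, None = NULL (unknown).
Row = List[Optional[int]]


def recommend(row: Row, dfis: List[FrozenSet[int]], ct: int) -> Row:
    """Apply Step 5 to one user, given the DFIS of that user's cluster.

    `dfis` holds itemsets of 0-based product indices (an assumption of
    this sketch). The input row is left untouched; a new row is returned.
    """
    updated = list(row)

    # Cold user: every entry is 0, so recommend all products that
    # appear in any discriminant frequent itemset of the cluster.
    if all(value == 0 for value in row):
        for itemset in dfis:
            for s in itemset:
                updated[s] = 1
        return updated

    # Sparsity case: an itemset fires only if the user already has
    # more than `ct` of its products set to 1 (assumed strict '>').
    for itemset in dfis:
        matched = sum(1 for s in itemset if row[s] == 1)
        if matched > ct:
            for s in itemset:
                if row[s] is None:  # fill only NULL entries
                    updated[s] = 1
    return updated


# Hypothetical cluster with two discriminant itemsets over 5 products.
cluster_dfis = [frozenset({0, 2}), frozenset({1, 3})]

cold_user = [0, 0, 0, 0, 0]
sparse_user = [1, None, None, 0, None]

print(recommend(cold_user, cluster_dfis, ct=0))    # [1, 1, 1, 1, 0]
print(recommend(sparse_user, cluster_dfis, ct=0))  # [1, None, 1, 0, None]
```

Matches are counted against the original row rather than the partially updated one, so the result does not depend on the order in which itemsets are visited. That is a design choice of this sketch, not something the algorithm specifies.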

#### 3.1. Input Matrices

#### 3.2. Clustering Procedure

#### 3.3. Extraction of Frequent Itemsets

#### 3.4. Extraction of Discriminant Itemsets

#### 3.5. Recommendation

#### 3.6. An Illustrative Example

## 4. Empirical Results

- To increase the density of the initial matrix by recommending more movies to users who have a small rating history (in-matrix prediction);
- To recommend movies to new users with no rating history at all, based on the rating history of users with similar characteristics (out-of-matrix prediction).

#### Comparison to Similar Methods

## 5. Discussion

## 6. Conclusions

#### 6.1. Limitations

#### 6.2. Future Work

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

User | Age | Sex | Occupation | Country | Wage (in \$) |
---|---|---|---|---|---|
$u_{1}$ | 25–35 | F | Pharmacist | Germany | 3000 |
$u_{2}$ | 18–24 | M | Teacher | USA | 5000 |
$u_{3}$ | 36–45 | F | Economist | France | 2500 |
$u_{4}$ | 25–35 | F | Doctor | UK | 3000 |
$u_{5}$ | 46–55 | M | Engineer | USA | 8000 |
$u_{6}$ | 18–24 | M | Professor | USA | 6000 |
$u_{7}$ | 25–35 | M | Lawyer | USA | 8000 |
$u_{8}$ | 18–24 | F | Teacher | USA | 6000 |
$u_{nofu-1}$ | Over 55 | F | Teacher | Canada | 6000 |
$u_{nofu}$ | 46–55 | M | Assistant | China | 1500 |

p_{1} | p_{2} | p_{3} | p_{4} | p_{5} | p_{nofp-1} | p_{nofp} | ||
---|---|---|---|---|---|---|---|---|

u_{1} | 1 | 0 | 1 | Not updated | ||||

u_{2} | 1 | 1 | 1 | Updated (*cold user) | ||||

u_{3} | 1 | 1 | 1 | 1 | Updated | |||

u_{4} | 1 | 0 | 1 | 0 | 1 | Not updated | ||

u_{5} | 0 | 1 | 0 | 0 | ||||

u_{6} | 1 | 0 | 1 | Not updated | ||||

u_{7} | 1 | 1 | 1 | Not updated | ||||

u_{8} | 1 | 1 | 1 | Updated | ||||

u_{nofu-1} | 0 | 1 | ||||||

u_{nofu} | 0 | 1 | 0 | 1 |

Method | Precision | Recall | F1-Measure |
---|---|---|---|
Proposed method (out-of-matrix) | 0.895 | 0.463 | 0.61 |
Proposed method (in-matrix, 30–90%) | 0.903–0.900 | 0.444–0.445 | 0.595–0.595 |
de Carvalho et al. (2020) [8] | 0.035 | 0.121 | 0.052 |
Yanxiang, L. et al. (2013) [9] | 0.651 | - | - |
Peng, Lu et al. (2016) [11] | 0.786 | 0.287 | 0.367 |
Park and Chu (2009) [12] | 0.666 | 0.239 | 0.311 |
Bobadilla, Ortega [13] | 0.37–0.57 | 0.25–1 | - |
Huang and Yin (2010) [14] | - | - | 0.308 |
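
To help read the table above, recall that the F1-measure is the harmonic mean of precision and recall, $F_{1}=2PR/(P+R)$. The short sketch below recomputes it from the precision and recall reported for the proposed method. The function name `f1_score` is illustrative, and only the first endpoint of the in-matrix range is used.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the rows of the proposed method.
out_of_matrix = f1_score(0.895, 0.463)
in_matrix_30 = f1_score(0.903, 0.444)

print(round(out_of_matrix, 2))  # 0.61
print(round(in_matrix_30, 3))   # 0.595
```

Because the harmonic mean is pulled toward the smaller of the two values, the recall of roughly 0.45 keeps the F1-measure near 0.6 even though precision is close to 0.9.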

