# Residual Analysis of Predictive Modelling Data for Automated Fault Detection in Building’s Heating, Ventilation and Air Conditioning Systems

## Abstract

## 1. Introduction

## 2. Description of Simulated Data and System Faults

## 3. Statistical Tests

#### 3.1. Moving p-Value

#### 3.2. Mean p-Value (MPV)

#### 3.3. Parameter Optimization in the Case of Observed Faults

#### 3.4. Parameter Optimization in the Case of Unobserved Faults

#### 3.4.1. Rate of Estimated Faults

#### 3.4.2. Restricting Areas

## 4. Results

#### 4.1. The Case of Observed Faults

#### 4.2. The Case of Unobserved Faults

#### 4.3. Comparison of Prediction Models

## 5. Conclusions

## 6. Outlook

## Abbreviations

AFDD | Automated fault detection and diagnostics |

ARX | Autoregressive with exogenous variables |

C | Choice set |

FD | Fault detection |

FDD | Fault detection and diagnostics |

${R}^{2}$ | Coefficient of determination |

$(L,\alpha ,H)$ | Decision rule |

ECB | Energy in Buildings and Communities Programme |

${\widehat{y}}_{t}$ | Estimated total heating power |

$\mathsf{\Phi}(L,\alpha ,H)$ | Fault function |

H | Minimum number of MPVs that must indicate a fault before a fault is assumed overall |

HVAC | Heating, Ventilation and Air Conditioning |

${I}_{A}(.)$ | Indicator function |

IEA | International Energy Agency |

MPV, ${\mathsf{\Gamma}}_{T,L}\left({\widehat{\epsilon}}_{t}\right)$ | Mean p-value |

MSE | Mean squared error |

${L}_{m}$, ${\alpha}_{m}$ and ${H}_{m}$ | Mode of L, $\alpha $, H |

${M}_{\tilde{\epsilon}}(s,L)$ | Moving residuals |

n | Number of observations |

${y}_{t}$ | Observed total heating power |

${p}_{T}(.)$ | p-value of the statistical test T |

${\widehat{\epsilon}}_{t}$ | Residual at time t |

G | Set of decision rules |

$\alpha $ | Significance level |

t | Time |

L | Time window length |

$\widehat{\epsilon}$ | Vector of residuals |

$\tilde{\epsilon}$ | $({\widehat{\epsilon}}_{1},\cdots ,{\widehat{\epsilon}}_{n},{\widehat{\epsilon}}_{1},\cdots ,{\widehat{\epsilon}}_{L-1})$ |
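The extended residual vector $\tilde{\epsilon}$ appends the first $L-1$ residuals to the end of $\widehat{\epsilon}$, so that a window of length $L$ can start at every one of the $n$ time steps. The following is a minimal sketch of that construction. The function name `moving_residuals` and the reading of ${M}_{\tilde{\epsilon}}(s,L)$ as the $L$ consecutive entries of $\tilde{\epsilon}$ starting at position $s$ are assumptions made for illustration; they are not taken from the definitions in the paper.

```python
import numpy as np


def moving_residuals(residuals, s, L):
    """Return a window of L residuals starting at 1-based position s.

    Assumption (not from the source): M(s, L) is read as the slice
    (eps_tilde[s], ..., eps_tilde[s + L - 1]) of the extended vector.
    """
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    if not (1 <= L <= n):
        raise ValueError("L must satisfy 1 <= L <= n")
    if not (1 <= s <= n):
        raise ValueError("s must satisfy 1 <= s <= n")
    # Extended vector: (eps_1, ..., eps_n, eps_1, ..., eps_{L-1}).
    extended = np.concatenate([eps, eps[: L - 1]])
    # Convert the 1-based start index to a 0-based slice.
    return extended[s - 1 : s - 1 + L]


# Toy residuals, chosen only to make the wrap-around visible.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
window_start = moving_residuals(example, s=1, L=3)
window_wrap = moving_residuals(example, s=4, L=3)
```

With this extension, a window that starts near the end of the series wraps around to the first residuals instead of being truncated.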

**Figure 1.** Summary of the FD procedure. This study focuses on the fault detection area highlighted in light red.

**Figure 3.** Example of restricting C using adjacency, simplified for $H=7$. Combinations in C that are adjacent to a combination assuming exactly one fault are shown in green.

**Figure 4.** Fault estimation with decision rule $(106,11.2\%,8)$ for the ARX model and $(397,6.2\%,7)$ for the random forest model. The periods with faults in the data are marked in red. The y-axis indicates whether the decision rule decided on faults.

**Figure 5.** Fault estimation with decision rule $(119,11.8\%,7)$ for the ARX model and $(101,20\%,8)$ for the random forest model. The periods with faults in the data are marked in red. The y-axis indicates whether the decision rule decided on faults.

**Figure 6.** Fault estimation with decision rule $(84,10.4\%,8)$ for the ARX model and $(97,9.6\%,7)$ for the random forest model. The periods with faults in the data are marked in red. The y-axis indicates whether the decision rule decided on faults.

**Figure 7.** Fault estimation with decision rule $(392,3\%,7)$ for the ARX model and $(545,10.6\%,7)$ for the random forest model. The periods with faults in the data are marked in red. The y-axis indicates whether the decision rule decided on faults.

**Figure 8.** Rate of estimated faults for the first fault. The period with the fault is shown in red and the period without faults in green; the median of the estimated rates is shown in blue and the mean in orange.

**Figure 9.** Rate of estimated faults for the second fault. The period with the fault is shown in red and the period without faults in green; the median of the estimated rates is shown in blue and the mean in orange.

**Figure 10.** Rate of estimated faults for the third fault. The period with the fault is shown in red and the period without faults in green; the median of the estimated rates is shown in blue and the mean in orange.

**Figure 11.** Rate of estimated faults with all faults. The periods with faults are shown in red and the periods without faults in green; the median of the estimated rates is shown in blue and the mean in orange.

**Figure 12.** Rate of estimated faults for the faultless data. Periods with faults are shown in red and periods without faults in green; the median of the estimated rates is shown in blue and the mean in orange.

Room | Supply Air [m$^{3}$/h] | Return Air [m$^{3}$/h] |
---|---|---|
Living | 50 | 0 |
Sleeping | 50 | 0 |
Child 1 | 25 | 0 |
Child 2 | 25 | 0 |
Dining | 0 | 50 |
Bath | 0 | 50 |
Kitchen (open to Living) | 0 | 50 |

Room | Occupancy | Time Slots |
---|---|---|
Living and Kitchen | 3 Persons | 6:30–7:30 and 17:00–23:00 |
Sleeping | 2 Persons | 23:00–6:30 |
Child 1 | 1 Person | 23:00–6:30 |
Child 2 | 1 Person | 23:00–6:30 |
Dining | 3 Persons | 6:30–7:30 and 17:00–23:00 |
Bath | 1 Person | 6:30–7:30 and 17:00–23:00 |

Fault | Description | Start | End |
---|---|---|---|
F1 | Circuit breaker failure of the electrical heating on the upper floor | 2 February 0:00 | 4 February 23:59 |
F2 | MHVR summer bypass switches off the heat recovery for the duration of the fault | 10 February 0:00 | 15 February 23:59 |
F3 | Living room thermostat set to a temperature of 28 $^{\circ}\mathrm{C}$ | 20 February 0:00 | 23 February 23:59 |

Indoor Properties | air temperatures for each room; total heating power supplied by all electrical radiators |

Outdoor Properties | air temperature; relative humidity; diffuse and direct solar irradiation on horizontal surfaces; wind speed and wind direction |

**Table 5.** Example of the calculation of the rate of estimated faults. The number of decision rules that assume a fault is marked in blue.

Decision Rule No. | t = 1 | t = 2 | t = 3 | t = 4 | $\mathsf{\Sigma}>0$ |
---|---|---|---|---|---|
1 | 0 | 1 | 1 | 0 | True |
2 | 0 | 0 | 0 | 0 | False |
3 | 0 | 0 | 0 | 0 | False |
4 | 0 | 1 | 0 | 1 | True |
5 | 1 | 1 | 1 | 0 | True |
$\mathsf{\Sigma}$ | 1 | 3 | 2 | 1 | 3 |
$\mathsf{\Sigma}/3$ | 1/3 | 1 | 2/3 | 1/3 | |
$\mathsf{\Sigma}/3>1/2$ | False | True | True | False | |
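The arithmetic in Table 5 can be reproduced with a short sketch. It follows what the table itself shows: the number of rules that flag a fault at each time step is divided by the number of rules that flag a fault at least once (3 of 5 here), and the result is compared with 1/2. The function name `rate_of_estimated_faults` and the array layout are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# Rows: decision rules 1 to 5; columns: time steps t = 1..4 (values from Table 5).
decisions = np.array([
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
])


def rate_of_estimated_faults(decisions):
    """Return (rate per time step, number of rules that ever flag a fault).

    Hypothetical helper: a rule counts towards the denominator only if it
    flags a fault at least once (row sum > 0), as in the last column of Table 5.
    """
    decisions = np.asarray(decisions)
    n_active = int((decisions.sum(axis=1) > 0).sum())
    votes = decisions.sum(axis=0)  # rules flagging a fault at each time step
    if n_active == 0:
        # No rule ever flags a fault: define the rate as zero everywhere.
        return np.zeros(decisions.shape[1]), 0
    return votes / n_active, n_active


rate, n_active = rate_of_estimated_faults(decisions)
fault_flag = rate > 0.5  # last row of Table 5
```

For the values in Table 5 this gives rates of 1/3, 1, 2/3 and 1/3, so a fault is assumed at t = 2 and t = 3 only.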

