# Utilizing Mixture Regression Models for Clustering Time-Series Energy Consumption of a Plastic Injection Molding Process


## Abstract


## 1. Introduction

#### Motivating Example

## 2. The Mixture Model for the Analysis of Time Series

- Smoothing: These techniques smooth the data and reduce noise, which is particularly useful for datasets that contain outliers.
- Robustness: Spline and polynomial regression are robust to missing data and to differences in sampling times across time series.
- Ease of interpretation: The fitted spline and polynomial coefficients make the results of the regression analysis easier to interpret.
- Prediction: These techniques are useful for prediction, particularly when modeling nonlinear relationships.
- Computational efficiency: Spline and polynomial regression can be computationally efficient, even on large datasets.
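As an illustrative sketch of the smoothing property listed above, the following uses synthetic data (a hypothetical noisy daily load profile, not the case-study dataset) and a low-order polynomial fit with numpy:

```python
import numpy as np

# Hypothetical hourly energy profile (24 samples): a smooth daily load
# shape plus measurement noise.
rng = np.random.default_rng(0)
hours = np.arange(24)
true_profile = 5 + 3 * np.sin(2 * np.pi * hours / 24)
observed = true_profile + rng.normal(0, 0.5, size=24)

# Fit a low-order polynomial: the fitted curve smooths out the noise and
# can be evaluated at arbitrary time points, which also accommodates
# irregular sampling across series.
coeffs = np.polyfit(hours, observed, deg=5)
smoothed = np.polyval(coeffs, hours)

# The smoothed curve is closer to the noiseless profile than the raw
# measurements are.
err_raw = np.mean((observed - true_profile) ** 2)
err_fit = np.mean((smoothed - true_profile) ** 2)
print(err_fit < err_raw)
```

The same idea extends to B-splines, where piecewise polynomials joined at knots give more local control over the fitted curve.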

## 3. Regression Mixtures for Clustering Time Series with Auto-Correlated Data

## 4. Case Study

#### 4.1. The Benchmark K-Means Method for Time-Series Clustering

#### 4.2. The Benchmark Spectral Clustering Method for Time-Series Clustering

- A similarity matrix is constructed to quantify the similarity between each pair of data points.
- The eigenvalues and eigenvectors of the similarity matrix (or of its graph Laplacian) are computed.
- The eigenvalue gap is used to determine the number of distinct clusters.
- Each data point is represented by its coordinates in the leading eigenvectors and assigned to a cluster in this embedded space, typically via k-means.
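The steps above can be sketched in a minimal numpy implementation. This is an illustrative version using a Gaussian similarity kernel, the symmetric normalized Laplacian, and a small Lloyd's k-means loop on the spectral embedding; the bandwidth `sigma` and the farthest-point initialization are choices made here, not details taken from the paper:

```python
import numpy as np

def spectral_clusters(X, k, sigma=1.0, iters=50):
    """Minimal spectral clustering sketch following the steps above."""
    # 1. Similarity matrix: Gaussian (RBF) kernel on pairwise distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    # 2. Eigen-decomposition of the symmetric normalized Laplacian.
    deg = W.sum(1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(L)
    # 3./4. Embed each point via the k smallest-eigenvalue eigenvectors,
    # normalize rows, and cluster the embedding with Lloyd's k-means.
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Farthest-point initialization avoids duplicate initial centers.
    centers = [U[0]]
    for _ in range(1, k):
        dists = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[dists.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        centers = np.array([U[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

# Two well-separated groups: the algorithm recovers the grouping.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(3, 0.1, (10, 2))])
labels = spectral_clusters(X, k=2)
```

In practice, the eigenvalue gap of `vals` can be inspected to choose `k` before the final assignment step.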

#### 4.3. Application of the Benchmark Algorithms to Case Study

#### 4.4. B-Spline Regression Mixtures for Time-Series Clustering

#### 4.5. Polynomial Regression Mixtures for Time-Series Clustering

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References


**Figure 1.** Process flow chart for fins production. The production of fins is managed through two IMMs (machines 1 and 2). Machine 1 is used for the creation of the base product, namely paddles, while machine 2 is used for colored booties. (Image courtesy of STAM and SEACSUB).

**Figure 2.** Historical time-series dataset comprising 30 instances of hourly energy consumption in kilowatt-hours (kWh), recorded on machine 1 over 24 h for each of 30 days. The vertical axis represents the energy consumption in kWh, and a different color is used to distinguish between the various production days in the dataset.

**Figure 3.** Collection and recording of hourly energy consumption data of nine IMMs into the SQL database for effective data management and analysis.

**Figure 4.** Graphical representation of the centroids for the two clusters generated through the implementation of the K-means algorithm with $K=2$. Varying colors differentiate profiles.

**Figure 5.** Graphical representation of the centroids for the three clusters generated through the implementation of the SC algorithm with $K=3$. Varying colors differentiate profiles.

**Figure 6.** Graphical depiction of the centroids for each of the six clusters (1–6) resulting from the implementation of B-spline regression mixtures. Varying colors differentiate profiles.

**Figure 7.** Graphical representation of the B-spline of order 3 estimated by the EM algorithm, resampled at one-minute intervals over 24 h (1440 data points).

**Figure 8.** Sixteen energy profiles grouped into Cluster 2, exhibiting a high-load profile spanning a period of approximately 16 h. Varying colors differentiate profiles.

**Figure 9.** Six energy profiles grouped into Cluster 3, exhibiting a high-load profile spanning a period of approximately 7 h during the first half of the day. Varying colors differentiate profiles.

**Figure 10.** Five energy profiles grouped into Cluster 6, exhibiting a high-load profile spanning a period of approximately 7 h during the second half of the day. Varying colors differentiate profiles.

**Figure 11.** Graphical depiction of the centroids for each of the six clusters (1–6) resulting from the implementation of polynomial regression mixtures. Varying colors differentiate profiles.

**Table 1.** Values of the $CH$ and $DB$ indices for different values of $K$ for the K-means algorithm in the reference case study. The maximum value of $CH$ and minimum value of $DB$ are obtained for $K=2$.

|       | $CH$   | $DB$   |
|-------|--------|--------|
| $K=1$ | NaN    | NaN    |
| $K=2$ | 356.80 | 0.4507 |
| $K=3$ | 272.79 | 0.7536 |
| $K=4$ | 281.66 | 0.8234 |
| $K=5$ | 259.30 | 0.7661 |
| $K=6$ | 228.21 | 0.7605 |
| $K=7$ | 262.66 | 0.6251 |
| $K=8$ | 236.26 | 0.7108 |
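The two validity indices used in Tables 1 and 2 can be computed directly from their definitions. The following is a minimal numpy sketch (run here on synthetic data, not the case-study profiles): the Calinski-Harabasz index is the ratio of between- to within-cluster dispersion (higher is better), while the Davies-Bouldin index averages the worst-case ratio of intra-cluster spread to centroid separation (lower is better):

```python
import numpy as np

def ch_db_indices(X, labels):
    """Calinski-Harabasz (higher is better) and Davies-Bouldin (lower is
    better) indices, implemented from their definitions."""
    ks = np.unique(labels)
    n, k = len(X), len(ks)
    mu = X.mean(0)
    cents = np.array([X[labels == j].mean(0) for j in ks])
    sizes = np.array([(labels == j).sum() for j in ks])
    # Between- and within-cluster dispersion for CH.
    B = (sizes * ((cents - mu) ** 2).sum(1)).sum()
    W = sum(((X[labels == j] - cents[i]) ** 2).sum()
            for i, j in enumerate(ks))
    ch = (B / (k - 1)) / (W / (n - k))
    # Mean intra-cluster distance and centroid separation for DB.
    s = np.array([np.mean(np.linalg.norm(X[labels == j] - cents[i], axis=1))
                  for i, j in enumerate(ks)])
    db = 0.0
    for i in range(k):
        db += max((s[i] + s[j]) / np.linalg.norm(cents[i] - cents[j])
                  for j in range(k) if j != i)
    return ch, db / k

# Two tight, well-separated groups: CH is large and DB is small.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.2, (15, 3)), rng.normal(5, 0.2, (15, 3))])
labels = np.array([0] * 15 + [1] * 15)
ch, db = ch_db_indices(X, labels)
```

Scanning these indices over a range of $K$ values, as in Tables 1 and 2, gives a simple data-driven choice of the number of clusters.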

**Table 2.** Values of the $CH$ and $DB$ indices for different values of $K$ for the SC algorithm in the reference case study. The maximum value of $CH$ and minimum value of $DB$ are obtained for $K=3$.

|       | $CH$   | $DB$   |
|-------|--------|--------|
| $K=1$ | NaN    | NaN    |
| $K=2$ | 0.8369 | 7.5416 |
| $K=3$ | 1.5623 | 3.3407 |
| $K=4$ | 0.5670 | 4.4918 |
| $K=5$ | 0.3818 | 4.9000 |
| $K=6$ | 0.9225 | 3.7632 |
| $K=7$ | 1.3381 | 4.4297 |
| $K=8$ | 1.4708 | 5.0791 |

**Table 3.** BIC values for B-spline regression mixtures for different values of $K$ and orders. The maximum BIC value is obtained for $K=6$ and B-spline of order 3 (BIC equal to $-7.7054$).

|       | Order 2   | Order 3   | Order 4   | Order 5   |
|-------|-----------|-----------|-----------|-----------|
| $K=1$ | $-9.8558$ | $-9.8551$ | $-9.8576$ | $-9.8578$ |
| $K=2$ | $-7.8063$ | $-7.8004$ | $-7.8044$ | $-7.8031$ |
| $K=3$ | $-7.7764$ | $-7.7357$ | $-7.7765$ | $-7.7370$ |
| $K=4$ | $-7.7549$ | $-7.7309$ | $-7.7510$ | $-7.7350$ |
| $K=5$ | $-7.7321$ | $-7.7232$ | $-7.7285$ | $-7.7209$ |
| $K=6$ | $-7.7236$ | $-7.7054$ | $-7.7343$ | $-7.7227$ |
| $K=7$ | $-7.7392$ | $-7.7095$ | $-7.7281$ | $-7.7201$ |
| $K=8$ | $-7.7317$ | $-7.7204$ | $-7.7663$ | $-7.7293$ |
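Since Table 3 selects the model by *maximizing* BIC, the values appear to follow the penalized log-likelihood convention (an assumption here, as the exact definition is not restated in this excerpt):

```latex
\mathrm{BIC}(K) \;=\; \log \hat{L}_K \;-\; \frac{\nu_K}{2}\,\log n
```

where $\hat{L}_K$ is the maximized likelihood of the $K$-component mixture, $\nu_K$ the number of free parameters, and $n$ the number of observations; the combination of $K$ and spline order that maximizes this criterion is retained.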


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Pacella, M.; Mangini, M.; Papadia, G.
Utilizing Mixture Regression Models for Clustering Time-Series Energy Consumption of a Plastic Injection Molding Process. *Algorithms* **2023**, *16*, 524.
https://doi.org/10.3390/a16110524
