Article

Thermal Comfort Prediction Accuracy with Machine Learning between Regression Analysis and Naïve Bayes Classifier

1
Department of Informatics Engineering, Qur’anic Science University (Universitas Sains Al-Qur’an), Jl. Hasyim Asy’ari Km. 03, Wonosobo 56351, Indonesia
2
Department of Construction Technology and Management, Technical University of Košice, 042 00 Košice, Slovakia
3
Department of Architecture, Qur’anic Science University (Universitas Sains Al-Qur’an), Jl. Hasyim Asy’ari Km. 03, Wonosobo 56351, Indonesia
4
Department of Civil Engineering, Qur’anic Science University (Universitas Sains Al-Qur’an), Jl. Hasyim Asy’ari Km. 03, Wonosobo 56351, Indonesia
5
Department of Civil Engineering, Universitas Islam Indonesia, Jl. Kaliurang Km.14,5, Yogyakarta 55584, Indonesia
*
Author to whom correspondence should be addressed.
Sustainability 2022, 14(23), 15663; https://doi.org/10.3390/su142315663
Submission received: 20 September 2022 / Revised: 11 November 2022 / Accepted: 21 November 2022 / Published: 24 November 2022

Abstract

Thermal comfort prediction models can be built with various data analysis methods. One frequently used method is multiple linear regression. The accuracy of regression analysis needs to be checked against other analytical methods. This study compares thermal comfort prediction models built with regression analysis and with naïve Bayes analysis. Data on thermal comfort were collected quantitatively. The thermal comfort variables, consisting of eight independent variables and one dependent variable, were measured at Wonosobo High School, Indonesia. The prediction model was built with two different analyses: multiple linear regression analysis and naïve Bayes analysis. The results show that naïve Bayes is more accurate than multiple linear regression analysis.

1. Introduction

The problem of energy and thermal comfort has become a global issue. Many studies have tried to solve energy problems by predicting thermal comfort for different building types and shapes. Thermal comfort predictions can reduce energy wastage in buildings [1]. Thermal comfort cannot be separated from the scale of thermal sensation, model, climate, and personal variables. Several studies have used a variety of thermal sensation scales. The model is the result of an analysis of the thermal comfort variables and is used to predict human thermal comfort [2]. Climatic variables associated with the thermal sensation of residents in naturally ventilated nursing homes can influence the mathematical model. Regions with different climates will also produce different responses, so research must take into account whether an area is hot or cold [3]. A user’s thermal perception is one of the main variables in thermal comfort research. Field tests can be used in thermal comfort research, with a questionnaire based on the thermal perception standard from ASHRAE 55. The data are analyzed with regression analysis, producing a mathematical model or equation of thermal comfort [4]. Airflow is one of the factors of thermal comfort. Human perception of airflow is also essential to achieving thermal comfort. Using a laboratory as an experimental room yields results that can be accounted for. Regression analysis can also be applied to airflow research in an experimental room [5].
The thermal comfort research model aims to predict building occupants’ thermal comfort. The prediction model generated from research is used as a standard for making building designs. The current prediction model uses an adaptive thermal comfort approach. Research verifying the adaptive thermal comfort model has been conducted in four Brazilian cities. The study used the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) method and found that the variation in the verified adaptive thermal comfort model was more than 90% for the four cities in Brazil [6]. Predictive modeling of personal thermal comfort has become a trending topic in the improvement of human comfort in rooms. Thermal comfort is closely related to the design and performance of building systems, especially in sustainable and intelligent buildings [7]. Thermal comfort modeling is manifold. A basis for modeling that is currently developing is the use of stochastic algorithms and variables [8].
The model’s accuracy in predicting the thermal comfort of occupants is an important aspect that must be continuously developed. Accurate models that predict the right results are convincing [9]. Linear regression is also used in outdoor thermal comfort studies. Thermal perception and other variables such as air temperature, wind, and sun exposure have been analyzed using linear regression. The results showed a predictive model of thermal comfort. In addition, equations about air temperature have also been found based on wind and sun exposure [10]. In outdoor thermal comfort research, sun exposure is a factor that is more considered than in indoor thermal comfort research. The temperature of solar radiation in the room is not too significant compared with the temperature of solar radiation outside. The mathematical model of thermal comfort in the interior still includes the average solar radiation temperature according to the thermal comfort factor that has been formulated [11].
Methods in thermal comfort research include simulation with software, modeling in the lab, and field testing. Field tests are the most widely used in thermal comfort research. The measurement of the thermal comfort variable also sometimes coincides with the measurement of the acoustic and visual comfort variables. A study’s results will show the relationship between the variables of thermal, acoustic, and visual comfort [12]. The experimental method is also one of the methods used in thermal comfort research. Some studies use the experimental space as a tool to test a model. Developing technology makes the experimental space more varied [13]. Simulation methods are often used to validate a model found in research, and software-based simulation is widely used for this purpose. One of the programs used for simulation is ENVI-met.
Simulations are often combined with field measurements to obtain more valid results, with both methods carried out in a study [14]. Measurements in the field coupled with simulations can use building information modeling (BIM). Computational fluid dynamic (CFD) analysis is often one of the analytical methods in BIM [15]. Several studies have combined the two methods, but research comparing the two methods has not been widely carried out.
The development of the database field has led to the emergence of increasingly sophisticated data analysis tools. Machine learning algorithms are a new approach to analyzing thermal or visual comfort data. Machine learning algorithms as analytical tools are widely used for research on human comfort in buildings [16]. Machine learning algorithm data analysis methods continue to be developed and used in various types of buildings. Numerical computing is one of the strengths of machine learning algorithms [17]. The algorithm can also be used in urban heat island research. Research can produce effective strategies to reduce the urban heat island effect by avoiding overcrowding in infrastructure development; increasing plantations, waterbodies, and roof gardens; and using a white roof color in construction. Research findings will enable urban planners, policymakers, and local governments to achieve environmentally friendly outcomes [18]. Machine learning (ML)-based building models have gained popularity in building predictive control (MPC) models for building energy management applications. However, ML-based building models are usually nonlinear in capturing building dynamics, which causes a high computational load for MPC models, prohibiting their application in real-time building control [19]. Analysis of the hot water usage model for the thermal comfort of occupants can also be performed with machine learning. The created model provides control performance for occupant adaptation [20].
Developing comfort models using machine learning is inevitable given database development and automated calculations. Choosing the analytical method is essential to finding an accurate thermal comfort model. The purpose of this study was to compare multiple linear regression analysis and naïve Bayes in making an accurate thermal comfort model.
Measuring thermal comfort can be performed with objective and subjective measurements. Objective measurements use a thermal measuring instrument that measures air temperature, average solar radiation temperature, wind speed, and air humidity. Subjective measurements include filling out a thermal sensation questionnaire from ASHRAE after the respondent has sat for 15 min within the research object [21]. Gender is an important factor in thermal comfort. The selection of respondents needs to consider gender. Different sexes will produce different thermal sensations [22]. Thermal comfort measurements have PMV (predicted mean vote) and PPD (predicted percent dissatisfied) indicators. These two aspects are the keywords in the analysis of thermal comfort [23].
Thermal comfort prediction with machine learning is still being developed. Prediction can be used to develop the science of building design. Not many studies predict thermal comfort using machine learning [24]. Machine learning methods have various forms, often using mathematical calculations and computational fluid dynamics (CFD). The use of various machine learning methods means that research results vary [17].
Thermal comfort prediction using a neural network has also been carried out. The thermoelectric airduct (TE-AD) cooling system is used to predict air temperature, PMV, and PPD. The prediction model is accurate in predicting thermal comfort [25]. The development of artificial neural networks for machine learning is still being carried out to find the right predictions [26]. Machine learning in predicting indoor air quality is also still being developed [27]. Thermal comfort prediction can also combine methods with thermal comfort variables [28].
Regression analysis is widely used in thermal comfort research to find a predictive model of thermal comfort. Currently, there are many models of thermal comfort that use regression analysis. Many comparisons of machine learning methods to find predictive models of thermal comfort have been carried out. One of the machine learning methods is naïve Bayes. A comparison between naïve Bayes and other methods has also been carried out [29]. Naïve Bayes can be a machine learning alternative that does not require complicated calculations. Research comparing the regression method with naïve Bayes for building a predictive thermal comfort model is needed to establish which method is both uncomplicated and produces the better predictive model.
Clothing is one of the variables that affects thermal comfort. The majority of students in Wonosobo, Indonesia, wear closed clothes in carrying out learning activities at school. Currently, there are few predictive models with respondents who wear closed clothes and have a religious culture. Thus, this study has the novelty of finding a predictive thermal comfort model using closed clothing variables in cold areas.
This research can contribute to computer science to find predictive models that are simple and accurate. Contributions to architecture can be used as a basis for architectural design by predicting thermal comfort in naturally ventilated buildings.
Some of the related works are shown in Table 1, below.

2. Materials and Methods

This research compares two data analysis methods, regression analysis and naïve Bayes. The data used come from field measurements of thermal comfort. Respondents were students at two private high schools in Kejajar District, Wonosobo Regency. The variables used are gender, age, height, weight, temperature, globe temperature, humidity, air velocity, and thermal sensation vote (TSV). The survey was conducted on two measurement days in the morning, afternoon, and evening. Temperature and humidity were measured using an Extech-brand measuring tool. We measured the globe temperature using a black copper ball with a diameter of 15 cm.
Respondents were asked to wait for 15 min when the initial measurements were taken. Respondents were high school students, so respondents had gone through adaptation to the room. The research subjects involved were male and female respondents. In thermal comfort research, it is possible that there are differences in the results of thermal sensation between men and women, although there are studies that say there are not many differences between men and women [32]. Research subjects were not selected using sampling. Responses were taken from all students who became the object of the research. The number of high school students was 252, and all of them were used as research subjects.
Data analysis used regression analysis in SPSS Statistics 25 and naïve Bayes analysis in Weka 3.8.6. The accuracy of the two analyses was compared to identify the more accurate analytical tool. Regression analysis was performed with the following mathematical model:
$$Y = \alpha + \sum_{i=1}^{8} \beta_i X_i$$
where X1: gender, X2: age, X3: height, X4: weight, X5: temperature, X6: globe_temperature, X7: relative_humidity, and X8: velocity.
Data analysis using naïve Bayes requires a reasonably long process starting with determining the training data that will be the test data. The calculation is performed by calculating the class probability P(Y), the probability of each P(X) criterion, and the final probability. Naïve Bayes analysis will produce an output from Y or TSV predictions. The resulting output can predict the TSV generated by building occupants with the value of the independent variable set (Figure 1).
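The calculation described above corresponds to the standard naïve Bayes decision rule. As a sketch in the notation of this section (the class index $c$ and the observed criterion values $x_i$ are symbols introduced here for illustration), the final probability of each TSV class is proportional to the class probability multiplied by the probabilities of the individual criteria, and the predicted TSV is the class with the largest product:

$$P(Y = c \mid x_1, \dots, x_8) \propto P(Y = c) \prod_{i=1}^{8} P(X_i = x_i \mid Y = c), \qquad \hat{Y} = \arg\max_{c \in \{-3, \dots, 3\}} P(Y = c) \prod_{i=1}^{8} P(X_i = x_i \mid Y = c)$$

The product form rests on the "naïve" assumption that the eight criteria are conditionally independent given the TSV class.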

3. Results

The data obtained amounted to 252 records, one for each of the 252 high school student respondents. Female respondents wear hijab school uniforms, while men do not wear head coverings. Female students wear long skirts and long-sleeved tops, and male students wear shirts and trousers. All students, both girls and boys, wear shoes and socks. The activities they perform are sitting writing and sitting listening for 7–16 h. The total number of data points from eight independent variables and one dependent variable is 2268. Respondents consisted of 44% men and 56% women. Respondent ages ranged from 14 to 19 years, with an average of 16.3 years. The respondents’ heights were between 140 and 177 cm, with an average of 156 cm. The respondents’ body weights were between 30 and 82 kg, averaging 47.9 kg. The temperature in the class was between 22 and 25.5 °C, with an average of 23.75 °C. Globe temperature in the class was between 23 and 26.5 °C, with an average of 24.6 °C. Humidity was between 60 and 80%, with an average of 68.63%. There was little air movement, so most air velocity readings were zero. The highest velocity was 1 m/s. The thermal sensation votes obtained ranged from −3 (very cold) to +3 (very hot), with an average of −0.89 (near cool). Measurements were taken close to the respondent by placing the measuring instrument on the classroom table (Figure 2).
Data analysis using multiple linear regression has several data test requirements, namely, validity and reliability. In addition, the classical assumption test also needs to be carried out to obtain data that can be used for multiple linear regression analysis. Analysis using SPSS software resulted in a large amount of valid test data. The normality test was part of the regression analysis and obtained a model that meets the assumption of normality (Figure 3).
Multiple linear regression data analysis using SPSS produced unstandardized coefficients that can be used as the coefficients of the prediction model. Some of the resulting values were not statistically significant, which indicates that the influence of those independent variables on the dependent variable is weak (Table 2). These values can still be used in predicting thermal comfort because several models from other studies obtained the same results.
Based on the constant and regression coefficients obtained from the regression analysis, the fitted multiple linear regression equation, following the form of Equation (1), is as follows:
$$Y = -8.796 - 0.308X_1 - 0.093X_2 - 0.015X_3 + 0.005X_4 + 0.265X_5 + 0.183X_6 + 0.018X_7 + 1.420X_8$$
Several variables show a high significance value. This indicates that several independent variables did not strongly influence the dependent variable. This is possible because of the type of clothing worn by the respondents.
The comfortable air temperature was calculated when Y = 0 (comfortable thermal sensation condition), and a comfortable air temperature of 33.19 °C was obtained. The comfortable air temperature produced was higher than the average comfortable air temperature in the tropics, which is 27 °C. This is possible because the closed clothes worn by respondents are hijab for women and trousers for men.
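The reported value can be checked arithmetically. As a sketch, and assuming (this is inferred from the numbers, not stated in the text) that every predictor other than air temperature $X_5$ is set to zero, solving the fitted equation for $Y = 0$ gives:

$$0 = -8.796 + 0.265X_5 \quad\Rightarrow\quad X_5 = \frac{8.796}{0.265} \approx 33.19\ ^{\circ}\mathrm{C}$$

Under a different assumption the result changes. Holding the other predictors at the sample values used for the Table 3 prediction (gender 1, age 18, height 160, weight 46, globe temperature 24, relative humidity 62, velocity 0), the same equation reaches $Y = 0$ at roughly 28.1 °C, so the neutral temperature depends on how the remaining variables are treated.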
Based on Equation (1) that was generated, a prediction of thermal comfort can be made for the sample data based on Table 3 as follows:
$$Y = -8.796 - 0.308(1) - 0.093(18) - 0.015(160) + 0.005(46) + 0.265(22) + 0.183(24) + 0.018(62) + 1.420(0)$$
$$Y = -1.61 \approx -2$$
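The worked prediction above can be reproduced in a few lines. The following is a minimal sketch rather than the authors' SPSS workflow: the coefficients and sample values are copied from the text, while the names `predict_tsv`, `COEFFICIENTS`, and `sample`, and the clamping of the result to the −3 to 3 scale, are assumptions made here for illustration.

```python
# Constant and coefficients copied from the fitted regression equation in the text.
INTERCEPT = -8.796
COEFFICIENTS = {
    "gender": -0.308,
    "age": -0.093,
    "height": -0.015,
    "weight": 0.005,
    "temperature": 0.265,
    "globe_temperature": 0.183,
    "relative_humidity": 0.018,
    "velocity": 1.420,
}


def predict_tsv(sample):
    """Return the raw regression output and the nearest TSV class."""
    raw = INTERCEPT + sum(COEFFICIENTS[name] * sample[name] for name in COEFFICIENTS)
    # Round to the nearest class; clamping to the 7-point scale is an assumption.
    tsv_class = max(-3, min(3, round(raw)))
    return raw, tsv_class


# Sample values from the worked example in the text.
sample = {
    "gender": 1,
    "age": 18,
    "height": 160,
    "weight": 46,
    "temperature": 22,
    "globe_temperature": 24,
    "relative_humidity": 62,
    "velocity": 0,
}

raw, tsv_class = predict_tsv(sample)
print(f"raw = {raw:.2f}, class = {tsv_class}")  # raw = -1.61, class = -2
```

Rounding turns a continuous regression output into a class label, which is why regression predictions tend to cluster around the middle of the scale.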
Naïve Bayes analysis begins with determining the training data in as many as 252 datasets according to the data from the measurement results. The variables used in predicting thermal comfort (TSV) are the same as the regression analysis, namely, gender, age, height, weight, temperature, globe temperature, relative humidity, and velocity. The training data is used as test data, which is the basis of our calculations.
Based on the training data used, there are 7 TSV classes: class −3 with 3 records, class −2 with 87, class −1 with 81, class 0 with 50, class 1 with 24, class 2 with 6, and class 3 with 1. The TSV class distribution in the Weka software is shown in Figure 4.
The probability value for each criterion is shown in Appendix A.
In this test data, a thermal comfort (TSV) prediction can be made if the data used are as follows: gender: 1, age: 18, height: 160, weight: 46, temperature: 22, globe temperature: 24, relative humidity: 62, and velocity: 0.
Based on the test data in Table 3, a prediction calculation can be made for the data above.
Based on the naïve Bayes algorithm calculations based on Table 4, the highest value is 0.000026 for class −2. Thus, the prediction of the test data is class −2.
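The class decision for this test instance can be illustrated with the rounded probabilities listed in Appendix A. This is a sketch under stated assumptions, not a reproduction of Weka's estimator: the names `CLASS_COUNTS`, `LIKELIHOODS`, and `naive_bayes_scores` are introduced here, and the absolute score it produces need not equal the 0.000026 reported above, because the intermediate values of Table 4 are not available.

```python
CLASSES = [-3, -2, -1, 0, 1, 2, 3]
CLASS_COUNTS = [3, 87, 81, 50, 24, 6, 1]  # training records per TSV class

# P(criterion value | class) for the test instance, read from Appendix A
# (rounded to two decimals there), ordered as in CLASSES.
LIKELIHOODS = {
    "gender=1": [0, 0.28, 0.59, 0.52, 0.33, 0.83, 1],
    "age=18": [0.33, 0.15, 0.09, 0.16, 0, 0, 0],
    "height=160": [0, 0.06, 0.1, 0.04, 0.04, 0, 0],
    "weight=46": [0, 0.05, 0, 0.04, 0.04, 0, 0],
    "temperature=22": [0, 0.11, 0.04, 0, 0, 0, 0],
    "globe_temperature=24": [0, 0.29, 0.12, 0.02, 0.08, 0, 0],
    "relative_humidity=62": [0, 0.08, 0.12, 0.06, 0, 0, 0],
    "velocity=0": [1, 1, 0.94, 0.98, 0.88, 1, 1],
}


def naive_bayes_scores(class_counts, likelihoods):
    """Multiply each class prior by its per-criterion probabilities."""
    total = sum(class_counts)
    scores = []
    for idx, count in enumerate(class_counts):
        score = count / total  # class prior P(Y)
        for probs in likelihoods.values():
            score *= probs[idx]  # P(X_i | Y)
        scores.append(score)
    return scores


scores = naive_bayes_scores(CLASS_COUNTS, LIKELIHOODS)
predicted = CLASSES[scores.index(max(scores))]
print(predicted)  # -2
```

With these inputs every class other than −2 contains at least one zero probability, so its product collapses to zero and class −2 wins by default. Unsmoothed frequency estimates behave this way on small classes, and Laplace smoothing is a common remedy.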
The accuracy of the prediction results can be calculated from the confusion matrix. Of the 252 available records, 0 records in class A (−3) were predicted correctly and 3 were predicted incorrectly. In class B (−2), 72 records were predicted correctly and 15 incorrectly. In class C (−1), 49 records were predicted correctly and 32 incorrectly. In class D (0), 29 records were predicted correctly and 21 incorrectly. In class E (1), 16 records were predicted correctly and 8 incorrectly. In class F (2), 3 records were predicted correctly and 3 incorrectly. In class G (3), 0 records were predicted correctly and 1 incorrectly.
The results show that the number of correctly predicted data (correctly classified instances) was 169 data, or 67.06%, while the incorrectly predicted results (incorrectly classified instances) amounted to 83 data, or 32.94%.
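These totals follow directly from the per-class counts. The sketch below takes the second number of each pair in the confusion matrix description as the misclassified count for that class, which is the only reading consistent with the class sizes. The names `PER_CLASS` and `recall` are introduced here for illustration.

```python
# (correct, incorrect) predictions per actual TSV class, from the text.
PER_CLASS = {
    -3: (0, 3),
    -2: (72, 15),
    -1: (49, 32),
    0: (29, 21),
    1: (16, 8),
    2: (3, 3),
    3: (0, 1),
}

correct = sum(c for c, _ in PER_CLASS.values())
incorrect = sum(w for _, w in PER_CLASS.values())
total = correct + incorrect
accuracy = 100 * correct / total

# Share of each actual class that was predicted correctly.
recall = {cls: c / (c + w) for cls, (c, w) in PER_CLASS.items()}

print(correct, incorrect, total)  # 169 83 252
print(f"{accuracy:.2f}%")         # 67.06%
```

The per-class shares show that the overall figure is carried by the large classes (−2 and −1), while the rare classes −3 and 3 are never predicted correctly.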
The comparison between regression analysis and naïve Bayes seen from the number of TSVs shows that the regression analysis found the highest TSV predictions in the cool range (−1), with as many as 147 data. In the naïve Bayes prediction, the highest TSV was found in the cold range (−2), as much as 104. The highest TSV difference showed a difference in results between the regression analysis and naïve Bayes (Figure 5).
Higher data variation was found in the results of naïve Bayes analysis, which could predict all TSV categories, while the data generated using linear regression look clustered at a value of 0 to −2. If the prediction results using linear regression and the naïve Bayes method are compared with the actual data in the field, then the naïve Bayes method has a better level of accuracy because the results obtained from naïve Bayes can approach the actual data in the field.
The value generated from naïve Bayes looks close to the actual data generated in field testing. The value using regression analysis shows the greater the value, the higher the value (Figure 6).
By comparing the prediction results with the initial data, the level of accuracy of both methods can be found. The linear regression analysis has data accuracy in as many as 84 of 252 datasets, and naïve Bayes analysis makes correct predictions in 181 out of 252 datasets. The accuracy of linear regression analysis is 33%, while the naïve Bayes analysis is 67%. Prediction results show that the accuracy of naïve Bayes is higher than the multiple linear regression analysis (Table 5).

4. Discussion

Thermal comfort data obtained based on age are still relevant as a basis for formulating a thermal comfort model. Age is essential in thermal comfort, and other studies have analyzed elderly respondents. Thermal comfort models built with different age data will produce different findings. Research in Tibet in winter and summer with elderly respondents found differences in the research results regarding the acceptance of thermal comfort [33]. Individual thermal comfort response is inseparable from the microclimate of each region. Solar radiation influences the individual’s thermal comfort response with an influence on the average solar radiation temperature. The air content also influences the thermal comfort response of each individual in an area [34]. Modeling with thermal comfort data more often uses regression analysis. The use of machine learning has now grown so that the methods applied are more varied [35].
The use of regression analysis is still carried out in modeling thermal comfort in outdoor spaces. Research with regression analysis is still accurate in evaluating outdoor thermal comfort by including physiological parameters. The model found that it can provide a design basis for creating thermally comfortable open spaces in urban parks [36]. A comparison of methods in modeling thermal comfort using thermal sensation vote (TSV) was carried out, but it still needs to be used to find the most accurate method. TSV is one of the appropriate variables in accurately predicting thermal comfort at a rate of 95.8%. The Bayesian optimization technique is considered an accurate method for making prediction models. Algorithms in Bayesian optimization techniques can predict individual thermal comfort [31]. The results of other studies show that linear discriminant analysis (LDA) is better than linear regression (LA). Several algorithms show different results in different cases. These findings can contribute to studying subjective and objective feelings of indoor thermal comfort in public buildings, thereby guiding architectural design, the intelligent control of ventilation systems, and realizing human–building interaction interfaces [37].
Naïve Bayes is better than regression analysis. The results showed that naïve Bayes has a calculation accuracy of 67%. Another study compared naïve Bayes with artificial neural network (ANN), fuzzy logic (FL), and PMV-based algorithms. Other results show that the naïve Bayes calculation provides a prediction accuracy of 73% [30]. The difference compared with the research conducted in other studies is 1%. Another study comparing several machine learning methods in finding predictions of city thermal comfort found that naïve Bayes resulted in a data accuracy of 40.43% [38]. The results of other studies are quite different from the research that has been undertaken. Thermal comfort data in urban areas may differ from indoor data. Research on energy consumption savings that compares naïve Bayes and regression has also found results that are not different from the research on thermal comfort that has been carried out. The results of the study of energy consumption savings with regression resulted in a data accuracy of 41.43% and a naïve Bayes accuracy of 73% [39]. The results of other research regressions compared with the research that has been performed have a difference of 41.43 minus 33%, which is 8.43%. The difference compared with naïve Bayes accuracy is 1%.
The prediction results obtained from linear regression and naïve Bayes are not precise but instead are based on the closest value [40]. In linear regression, the results are obtained by rounding the final grade to the nearest side of the class, while the results from naïve Bayes are obtained from the class that has the largest final score.
PMV (predicted mean vote) and PPD (predicted percentage of dissatisfied) values were obtained using the CBE Thermal Comfort Tool software from https://comfort.cbe.berkeley.edu/ (accessed on 3 October 2022). Thermal variable data in the form of air temperature, average solar radiation temperature (globe temperature), wind speed, humidity, metabolism, and respondent activity were entered into the software, and PMV and PPD values were obtained. PMV and PPD were calculated for all 252 respondents. Most PMV values fell in the range of −0.5 to −1, which indicates that a respondent is almost cold (score: −1). Some respondents had a PMV value of 0.5, which indicates that they feel close to warm (score: 1). The overall PMV results in Figure 7 show that the respondents are neither too cold nor too hot.
The highest PPD value produced by the respondents reached 19%. The minimum value is 5%, and the average PPD produced is 9%. Few respondents reached the 19% value. The PPD value generated using the software from https://comfort.cbe.berkeley.edu/software (accessed on 3 October 2022) shows that respondents are predicted to still be able to accept the existing thermal conditions. The PPD value is still below 25%, which means that respondents are still relatively comfortable with the existing thermal conditions (Figure 8).
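The 5% minimum is consistent with the standard PMV-to-PPD relation of Fanger's model as given in ISO 7730, in which PPD reaches its floor of 5% at PMV = 0. A minimal sketch of that relation follows. The function name `ppd_from_pmv` is illustrative, and the CBE tool's own implementation is not reproduced here.

```python
import math


def ppd_from_pmv(pmv):
    """Predicted percentage of dissatisfied (%) for a given PMV (ISO 7730 relation)."""
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv**4 - 0.2179 * pmv**2)


for pmv in (0.0, -0.5, 0.5):
    print(f"PMV {pmv:+.1f} -> PPD {ppd_from_pmv(pmv):.1f}%")
```

The relation is symmetric in PMV, so a PMV of −0.5 and a PMV of +0.5 give the same PPD of about 10%.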

5. Conclusions

We studied the use of multiple linear regression analysis as a model for predicting thermal comfort. Predicting thermal comfort using a regression model in a study at Wonosobo High School showed results with a better range of coolness. Naïve Bayes analysis is one of the analytical alternatives classified as machine learning. The use of machine learning in thermal comfort is important because the analysis is expected to produce accurate predictions. Using naïve Bayes analysis in thermal comfort research at Wonosobo Senior High School, Indonesia, we found differences compared with the results of multiple linear regression analysis. The difference in prediction results shows that naïve Bayes analysis has a cooler TSV result than multiple linear regression analysis. The level of accuracy in predictions using naïve Bayes is higher than that of the multiple linear regression analysis method. The predictions of the two analytical methods are not too different, so it is still possible to use both methods in predicting the thermal comfort of building occupants.

Author Contributions

Conceptualization, H.H.; Methodology, H.H.; Validation, H.S.; Data curation, H.S.; Writing—original draft, N.F.; Writing—review & editing, J.S.; Visualization, A.N.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

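Table A1. Gender criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Gender | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 24 | 48 | 26 | 8 | 5 | 1 | 0 | 0.28 | 0.59 | 0.52 | 0.33 | 0.83 | 1 |
| 2 | 3 | 63 | 33 | 24 | 16 | 1 | 0 | 1 | 0.72 | 0.41 | 0.48 | 0.67 | 0.17 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
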
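Table A2. Age criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Age | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 |
| 15 | 0 | 9 | 16 | 10 | 7 | 3 | 0 | 0 | 0.1 | 0.2 | 0.2 | 0.29 | 0.5 | 0 |
| 16 | 0 | 29 | 36 | 18 | 12 | 2 | 0 | 0 | 0.33 | 0.44 | 0.36 | 0.5 | 0.33 | 0 |
| 17 | 2 | 35 | 21 | 14 | 5 | 1 | 1 | 0.67 | 0.4 | 0.26 | 0.28 | 0.21 | 0.17 | 1 |
| 18 | 1 | 13 | 7 | 8 | 0 | 0 | 0 | 0.33 | 0.15 | 0.09 | 0.16 | 0 | 0 | 0 |
| 19 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
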
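Table A3. Height criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Height | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 140 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.02 | 0 | 0 | 0 |
| 142 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 |
| 143 | 0 | 5 | 2 | 1 | 0 | 0 | 0 | 0 | 0.06 | 0.02 | 0.02 | 0 | 0 | 0 |
| 144 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0.01 | 0 | 0 | 0 | 0 |
| 145 | 0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0.03 | 0 | 0 | 0.04 | 0 | 0 |
| 146 | 0 | 4 | 2 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0.02 | 0 | 0 | 0 | 0 |
| 147 | 0 | 4 | 2 | 1 | 1 | 0 | 0 | 0 | 0.05 | 0.02 | 0.02 | 0.04 | 0 | 0 |
| 148 | 0 | 3 | 1 | 0 | 1 | 0 | 0 | 0 | 0.03 | 0.01 | 0 | 0.04 | 0 | 0 |
| 149 | 0 | 1 | 2 | 1 | 1 | 0 | 0 | 0 | 0.01 | 0.02 | 0.02 | 0.04 | 0 | 0 |
| 150 | 2 | 10 | 6 | 6 | 2 | 0 | 1 | 0.67 | 0.11 | 0.07 | 0.12 | 0.08 | 0 | 1 |
| 151 | 0 | 5 | 2 | 2 | 2 | 1 | 0 | 0 | 0.06 | 0.02 | 0.04 | 0.08 | 0.17 | 0 |
| 152 | 0 | 3 | 2 | 1 | 2 | 0 | 0 | 0 | 0.03 | 0.02 | 0.02 | 0.08 | 0 | 0 |
| 153 | 0 | 6 | 3 | 3 | 2 | 0 | 0 | 0 | 0.07 | 0.04 | 0.06 | 0.08 | 0 | 0 |
| 154 | 1 | 1 | 6 | 3 | 1 | 0 | 0 | 0.33 | 0.01 | 0.07 | 0.06 | 0.04 | 0 | 0 |
| 155 | 0 | 5 | 5 | 3 | 2 | 2 | 0 | 0 | 0.06 | 0.06 | 0.06 | 0.08 | 0.33 | 0 |
| 156 | 0 | 5 | 2 | 3 | 1 | 1 | 0 | 0 | 0.06 | 0.02 | 0.06 | 0.04 | 0.17 | 0 |
| 157 | 0 | 3 | 1 | 2 | 1 | 1 | 0 | 0 | 0.03 | 0.01 | 0.04 | 0.04 | 0.17 | 0 |
| 158 | 0 | 3 | 7 | 3 | 1 | 0 | 0 | 0 | 0.03 | 0.09 | 0.06 | 0.04 | 0 | 0 |
| 159 | 0 | 0 | 4 | 2 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0.04 | 0 | 0 | 0 |
| 160 | 0 | 5 | 8 | 2 | 1 | 0 | 0 | 0 | 0.06 | 0.1 | 0.04 | 0.04 | 0 | 0 |
| 161 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.04 | 0 | 0 | 0 |
| 162 | 0 | 3 | 2 | 0 | 0 | 1 | 0 | 0 | 0.03 | 0.02 | 0 | 0 | 0.17 | 0 |
| 163 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0.01 | 0.01 | 0.02 | 0.04 | 0 | 0 |
| 164 | 0 | 3 | 2 | 2 | 1 | 0 | 0 | 0 | 0.03 | 0.02 | 0.04 | 0.04 | 0 | 0 |
| 165 | 0 | 3 | 4 | 3 | 0 | 0 | 0 | 0 | 0.03 | 0.05 | 0.06 | 0 | 0 | 0 |
| 166 | 0 | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 0.02 | 0.01 | 0.02 | 0.04 | 0 | 0 |
| 167 | 0 | 4 | 2 | 1 | 1 | 0 | 0 | 0 | 0.05 | 0.02 | 0.02 | 0.04 | 0 | 0 |
| 168 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0.04 | 0 | 0 | 0 |
| 169 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 |
| 170 | 0 | 0 | 5 | 1 | 1 | 0 | 0 | 0 | 0 | 0.06 | 0.02 | 0.04 | 0 | 0 |
| 171 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0.02 | 0 | 0 | 0 |
| 172 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| 174 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0 | 0 | 0 |
| 176 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0.01 | 0.01 | 0.02 | 0 | 0 | 0 |
| 177 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
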
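Table A4. Weight criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Weight | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0.02 | 0 | 0.17 | 0 |
| 36 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 | 0 |
| 37 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| 38 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| 39 | 0 | 3 | 1 | 2 | 0 | 0 | 0 | 0 | 0.03 | 0.01 | 0.04 | 0 | 0 | 0 |
| 40 | 0 | 14 | 9 | 2 | 1 | 0 | 1 | 0 | 0.16 | 0.11 | 0.04 | 0.04 | 0 | 1 |
| 41 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0.01 | 0.01 | 0 | 0.04 | 0 | 0 |
| 42 | 0 | 7 | 3 | 1 | 1 | 1 | 0 | 0 | 0.08 | 0.04 | 0.02 | 0.04 | 0.17 | 0 |
| 43 | 2 | 3 | 5 | 1 | 0 | 0 | 0 | 0.67 | 0.03 | 0.06 | 0.02 | 0 | 0 | 0 |
| 44 | 0 | 4 | 2 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0.02 | 0 | 0 | 0 | 0 |
| 45 | 0 | 13 | 9 | 9 | 5 | 0 | 0 | 0 | 0.15 | 0.11 | 0.18 | 0.21 | 0 | 0 |
| 46 | 0 | 4 | 0 | 2 | 1 | 0 | 0 | 0 | 0.05 | 0 | 0.04 | 0.04 | 0 | 0 |
| 47 | 0 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 0.01 | 0.02 | 0.04 | 0 | 0 | 0 |
| 48 | 1 | 5 | 6 | 0 | 1 | 1 | 0 | 0.33 | 0.06 | 0.07 | 0 | 0.04 | 0.17 | 0 |
| 49 | 0 | 1 | 8 | 1 | 2 | 1 | 0 | 0 | 0.01 | 0.1 | 0.02 | 0.08 | 0.17 | 0 |
| 50 | 0 | 9 | 7 | 13 | 4 | 0 | 0 | 0 | 0.1 | 0.09 | 0.26 | 0.17 | 0 | 0 |
| 51 | 0 | 2 | 3 | 1 | 1 | 0 | 0 | 0 | 0.02 | 0.04 | 0.02 | 0.04 | 0 | 0 |
| 52 | 0 | 5 | 1 | 3 | 4 | 0 | 0 | 0 | 0.06 | 0.01 | 0.06 | 0.17 | 0 | 0 |
| 53 | 0 | 4 | 4 | 1 | 0 | 0 | 0 | 0 | 0.05 | 0.05 | 0.02 | 0 | 0 | 0 |
| 54 | 0 | 4 | 3 | 2 | 2 | 0 | 0 | 0 | 0.05 | 0.04 | 0.04 | 0.08 | 0 | 0 |
| 55 | 0 | 1 | 5 | 3 | 1 | 0 | 0 | 0 | 0.01 | 0.06 | 0.06 | 0.04 | 0 | 0 |
| 56 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0.01 | 0.01 | 0.02 | 0 | 0 | 0 |
| 57 | 0 | 0 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0.02 | 0.02 | 0 | 0.17 | 0 |
| 58 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.02 | 0 | 0 | 0 | 0 |
| 59 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.04 | 0 | 0 | 0 |
| 60 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0.04 | 0.02 | 0 | 0 | 0 |
| 61 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.02 | 0 | 0 | 0 |
| 65 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0 | 0 | 0 | 0 |
| 69 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| 82 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
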
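Table A5. Temperature criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Temperature | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 22 | 0 | 10 | 3 | 0 | 0 | 0 | 0 | 0 | 0.11 | 0.04 | 0 | 0 | 0 | 0 |
| 23 | 1 | 12 | 11 | 3 | 1 | 0 | 0 | 0.33 | 0.14 | 0.14 | 0.06 | 0.04 | 0 | 0 |
| 24 | 1 | 12 | 21 | 7 | 4 | 1 | 0 | 0.33 | 0.14 | 0.26 | 0.14 | 0.17 | 0.17 | 0 |
| 25 | 0 | 9 | 11 | 9 | 4 | 2 | 0 | 0 | 0.1 | 0.14 | 0.18 | 0.17 | 0.33 | 0 |
| 22.5 | 1 | 19 | 13 | 8 | 4 | 0 | 0 | 0.33 | 0.22 | 0.16 | 0.16 | 0.17 | 0 | 0 |
| 22.8 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 | 0 | 0 | 0 | 0 |
| 23.5 | 0 | 12 | 7 | 7 | 0 | 0 | 0 | 0 | 0.14 | 0.09 | 0.14 | 0 | 0 | 0 |
| 24.5 | 0 | 9 | 12 | 15 | 4 | 2 | 1 | 0 | 0.1 | 0.15 | 0.3 | 0.17 | 0.33 | 1 |
| 25.5 | 0 | 1 | 3 | 1 | 7 | 1 | 0 | 0 | 0.01 | 0.04 | 0.02 | 0.29 | 0.17 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
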
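Table A6. Globe temperature criteria probability. Count and probability columns are given per TSV class (−3 to 3).

| Globe_Temperature | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 23 | 1 | 20 | 9 | 1 | 4 | 0 | 0 | 0.33 | 0.23 | 0.11 | 0.02 | 0.17 | 0 | 0 |
| 24 | 0 | 25 | 10 | 1 | 2 | 0 | 0 | 0 | 0.29 | 0.12 | 0.02 | 0.08 | 0 | 0 |
| 25 | 1 | 11 | 9 | 10 | 1 | 0 | 0 | 0.33 | 0.13 | 0.11 | 0.2 | 0.04 | 0 | 0 |
| 26 | 0 | 8 | 28 | 10 | 9 | 3 | 1 | 0 | 0.09 | 0.35 | 0.2 | 0.38 | 0.5 | 1 |
| 23.5 | 1 | 12 | 15 | 11 | 0 | 0 | 0 | 0.33 | 0.14 | 0.19 | 0.22 | 0 | 0 | 0 |
| 24.5 | 0 | 4 | 4 | 11 | 4 | 0 | 0 | 0 | 0.05 | 0.05 | 0.22 | 0.17 | 0 | 0 |
| 25.5 | 0 | 6 | 6 | 5 | 1 | 2 | 0 | 0 | 0.07 | 0.07 | 0.1 | 0.04 | 0.33 | 0 |
| 26.5 | 0 | 1 | 0 | 1 | 3 | 1 | 0 | 0 | 0.01 | 0 | 0.02 | 0.13 | 0.17 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |
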
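Table A7. Relative humidity criteria probability.

| Relative_Humidity | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0.33 | 0.01 | 0.02 | 0 | 0 | 0 | 0 |
| 61 | 0 | 3 | 7 | 6 | 0 | 1 | 1 | 0 | 0.03 | 0.09 | 0.12 | 0 | 0.17 | 1 |
| 62 | 0 | 7 | 10 | 3 | 0 | 0 | 0 | 0 | 0.08 | 0.12 | 0.06 | 0 | 0 | 0 |
| 63 | 0 | 11 | 11 | 0 | 4 | 1 | 0 | 0 | 0.13 | 0.14 | 0 | 0.17 | 0.17 | 0 |
| 64 | 0 | 1 | 0 | 8 | 5 | 0 | 0 | 0 | 0.01 | 0 | 0.16 | 0.21 | 0 | 0 |
| 65 | 1 | 11 | 2 | 3 | 1 | 0 | 0 | 0.33 | 0.13 | 0.02 | 0.06 | 0.04 | 0 | 0 |
| 66 | 1 | 7 | 4 | 5 | 2 | 0 | 0 | 0.33 | 0.08 | 0.05 | 0.1 | 0.08 | 0 | 0 |
| 67 | 0 | 7 | 7 | 7 | 1 | 0 | 0 | 0 | 0.08 | 0.09 | 0.14 | 0.04 | 0 | 0 |
| 68 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 0 | 0 | 0.01 | 0.06 | 0 | 0 | 0 |
| 69 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0 | 0 | 0 | 0 | 0 |
| 70 | 0 | 7 | 6 | 3 | 0 | 0 | 0 | 0 | 0.08 | 0.07 | 0.06 | 0 | 0 | 0 |
| 71 | 0 | 3 | 8 | 1 | 3 | 0 | 0 | 0 | 0.03 | 0.1 | 0.02 | 0.13 | 0 | 0 |
| 72 | 0 | 2 | 2 | 0 | 0 | 2 | 0 | 0 | 0.02 | 0.02 | 0 | 0 | 0.33 | 0 |
| 73 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0.02 | 0 | 0 | 0 | 0 |
| 74 | 0 | 1 | 5 | 1 | 1 | 1 | 0 | 0 | 0.01 | 0.06 | 0.02 | 0.04 | 0.17 | 0 |
| 75 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0.02 | 0.01 | 0 | 0 | 0 | 0 |
| 77 | 0 | 1 | 0 | 1 | 3 | 1 | 0 | 0 | 0.01 | 0 | 0.02 | 0.13 | 0.17 | 0 |
| 79 | 0 | 12 | 10 | 6 | 0 | 0 | 0 | 0 | 0.14 | 0.12 | 0.12 | 0 | 0 | 0 |
| 80 | 0 | 5 | 3 | 3 | 4 | 0 | 0 | 0 | 0.06 | 0.04 | 0.06 | 0.17 | 0 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |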
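Table A8. Velocity criteria probability.

| Velocity | Count −3 | Count −2 | Count −1 | Count 0 | Count 1 | Count 2 | Count 3 | Prob. −3 | Prob. −2 | Prob. −1 | Prob. 0 | Prob. 1 | Prob. 2 | Prob. 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 87 | 76 | 49 | 21 | 6 | 1 | 1 | 1 | 0.94 | 0.98 | 0.88 | 1 | 1 |
| 0.3 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0.04 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 2 | 1 | 3 | 0 | 0 | 0 | 0 | 0.02 | 0.02 | 0.13 | 0 | 0 |
| Total | 3 | 87 | 81 | 50 | 24 | 6 | 1 | | | | | | | |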
Figure A1. Prediction results using Weka software.
Figure A2. Confusion matrix using Weka software.
Figure A3. Naïve Bayes prediction accuracy using Weka software.

References

  1. Gong, P.; Cai, Y.; Zhou, Z.; Zhang, C.; Chen, B.; Sharples, S. Investigating spatial impact on indoor personal thermal comfort. J. Build. Eng. 2021, 45, 103536.
  2. Dzyuban, Y.; Ching, G.N.; Yik, S.K.; Tan, A.J.; Banerjee, S.; Crank, P.J.; Chow, W.T. Outdoor thermal comfort research in transient conditions: A narrative literature review. Landsc. Urban Plan. 2022, 226, 104496.
  3. Larriva, M.T.B.; Mendes, A.S.; Forcada, N. The effect of climatic conditions on occupants’ thermal comfort in naturally ventilated nursing homes. Build. Environ. 2022, 214, 108930.
  4. Zang, X.; Liu, K.; Qian, Y.; Qu, G.; Yuan, Y.; Ren, L.; Liu, G. The influence of different functional areas on customers’ thermal comfort—A Field study in shopping complexes of North China. Energy Built Environ. 2022, in press.
  5. Jia, X.; Wang, J.; Zhu, Y.; Ji, W.; Cao, B. Climate chamber study on thermal comfort of walking passengers with elevated ambient air velocity. Build. Environ. 2022, 218, 109100.
  6. Niza, I.L.; Broday, E.E. Thermal comfort conditions in Brazil: A discriminant analysis through the ASHRAE Global Thermal Comfort Database II. Build. Environ. 2022, 221, 109310.
  7. Feng, Y.; Liu, S.; Wang, J.; Yang, J.; Jao, Y.-L.; Wang, N. Data-driven personal thermal comfort prediction: A literature review. Renew. Sustain. Energy Rev. 2022, 161, 112357.
  8. Heidari, A.; Maréchal, F.; Khovalyg, D. Reinforcement Learning for proactive operation of residential energy systems by learning stochastic occupant behavior and fluctuating solar energy: Balancing comfort, hygiene and energy use. Appl. Energy 2022, 318, 119206.
  9. Jia, M.; Choi, J.-H.; Liu, H.; Susman, G. Development of facial-skin temperature driven thermal comfort and sensation modeling for a futuristic application. Build. Environ. 2022, 207, 108479.
  10. Liu, K.; Lian, Z.; Dai, X.; Lai, D. Comparing the effects of sun and wind on outdoor thermal comfort: A case study based on longitudinal subject tests in cold climate region. Sci. Total Environ. 2022, 825, 154009.
  11. Ji, Y.; Song, J.; Shen, P. A review of studies and modelling of solar radiation on human thermal comfort in outdoor environment. Build. Environ. 2022, 214, 108891.
  12. Geng, Y.; Hong, B.; Du, M.; Yuan, T.; Wang, Y. Combined effects of visual-acoustic-thermal comfort in campus open spaces: A pilot study in China’s cold region. Build. Environ. 2022, 209, 108658.
  13. Dharmasastha, K.; Samuel, D.L.; Nagendra, S.S.; Maiya, M. Thermal comfort of a radiant cooling system in glass fiber reinforced gypsum roof—An experimental study. Appl. Therm. Eng. 2022, 214, 118842.
  14. Ma, X.; Leung, T.; Chau, C.; Yung, E.H. Analyzing the influence of urban morphological features on pedestrian thermal comfort. Urban Clim. 2022, 44, 101192.
  15. Cheng, J.C.; Kwok, H.H.; Li, A.T.; Tong, J.C.; Lau, A.K. BIM-supported sensor placement optimization based on genetic algorithm for multi-zone thermal comfort and IAQ monitoring. Build. Environ. 2022, 216, 108997.
  16. Luo, Z.; Sun, C.; Dong, Q.; Qi, X. Key control variables affecting interior visual comfort for automated louver control in open-plan office—A study using machine learning. Build. Environ. 2022, 207, 108565.
  17. Zhang, R.; Liu, D.; Shi, L. Thermal-comfort optimization design method for semi-outdoor stadium using machine learning. Build. Environ. 2022, 215, 108890.
  18. Kafy, A.-A.; Saha, M.; Faisal, A.-A.; Rahaman, Z.A.; Rahman, M.T.; Liu, D.; Fattah, A.; Al Rakib, A.; AlDousari, A.E.; Rahaman, S.N.; et al. Predicting the impacts of land use/land cover changes on seasonal urban thermal characteristics using machine learning algorithms. Build. Environ. 2022, 217, 109066.
  19. Yang, S.; Wan, M.P. Machine-learning-based model predictive control with instantaneous linearization—A case study on an air-conditioning and mechanical ventilation system. Appl. Energy 2022, 306, 118041.
  20. Heidari, A.; Maréchal, F.; Khovalyg, D. An occupant-centric control framework for balancing comfort, energy use and hygiene in hot water systems: A model-free reinforcement learning approach. Appl. Energy 2022, 312, 118833.
  21. Irshad, K.; Habib, K.; Kareem, M.; Basrawi, F.; Saha, B.B. Evaluation of thermal comfort in a test room equipped with a photovoltaic assisted thermo-electric air duct cooling system. Int. J. Hydrogen Energy 2017, 42, 26956–26972.
  22. Irshad, K.; Algarni, S.; Jamil, B.; Ahmad, M.T.; Khan, M.A. Effect of gender difference on sleeping comfort and building energy utilization: Field study on test chamber with thermoelectric air-cooling system. Build. Environ. 2019, 152, 214–227.
  23. Yang, Z.; Du, C.; Xiao, H.; Li, B.; Shi, W.; Wang, B. A novel integrated index for simultaneous evaluation of the thermal comfort and energy efficiency of air-conditioning systems. J. Build. Eng. 2022, 57, 104885.
  24. Zhang, W.; Wu, Y.; Calautit, J.K. A review on occupancy prediction through machine learning for enhancing energy efficiency, air quality and thermal comfort in the built environment. Renew. Sustain. Energy Rev. 2022, 167, 112704.
  25. Irshad, K.; Khan, A.I.; Irfan, S.A.; Alam, M.; Almalawi, A.; Zahir, H. Utilizing Artificial Neural Network for Prediction of Occupants Thermal Comfort: A Case Study of a Test Room Fitted With a Thermoelectric Air-Conditioning System. IEEE Access 2020, 8, 99709–99728.
  26. Elnour, M.; Himeur, Y.; Fadli, F.; Mohammedsherif, H.; Meskin, N.; Ahmad, A.M.; Petri, I.; Rezgui, Y.; Hodorog, A. Neural network-based model predictive control system for optimizing building automation and management systems of sports facilities. Appl. Energy 2022, 318, 119153.
  27. Esrafilian-Najafabadi, M.; Haghighat, F. Impact of predictor variables on the performance of future occupancy prediction: Feature selection using genetic algorithms and machine learning. Build. Environ. 2022, 219, 109152.
  28. Rana, R.; Kusy, B.; Jurdak, R.; Wall, J.; Hu, W. Feasibility analysis of using humidex as an indoor thermal comfort predictor. Energy Build. 2013, 64, 17–25.
  29. Benito, P.I.; Sebastián, M.A.; González-Gaya, C. Study and Application of Industrial Thermal Comfort Parameters by Using Bayesian Inference Techniques. Appl. Sci. 2021, 11, 11979.
  30. Aguilera, J.J.; Toftum, J.; Kazanci, O.B. Predicting personal thermal preferences based on data-driven methods. E3S Web Conf. 2019, 111, 05015.
  31. Yang, B.; Li, X.; Liu, Y.; Chen, L.; Guo, R.; Wang, F.; Yan, K. Comparison of models for predicting winter individual thermal comfort based on machine learning algorithms. Build. Environ. 2022, 215, 108970.
  32. Asif, A.; Zeeshan, M.; Khan, S.R.; Sohail, N.F. Investigating the gender differences in indoor thermal comfort perception for summer and winter seasons and comparison of comfort temperature prediction methods. J. Therm. Biol. 2022, 110, 103357.
  33. Yao, F.; Fang, H.; Han, J.; Zhang, Y. Study on the outdoor thermal comfort evaluation of the elderly in the Tibetan plateau. Sustain. Cities Soc. 2021, 77, 103582.
  34. Wei, D.; Yang, L.; Bao, Z.; Lu, Y.; Yang, H. Variations in outdoor thermal comfort in an urban park in the hot-summer and cold-winter region of China. Sustain. Cities Soc. 2022, 77, 103535.
  35. Qin, H.; Wang, X. A multi-discipline predictive intelligent control method for maintaining the thermal comfort on indoor environment. Appl. Soft Comput. 2022, 116, 108299.
  36. Zhu, R.; Zhang, X.; Yang, L.; Liu, Y.; Cong, Y.; Gao, W. Correlation analysis of thermal comfort and physiological responses under different microclimates of urban park. Case Stud. Therm. Eng. 2022, 34, 102044.
  37. Song, G.; Ai, Z.; Zhang, G.; Peng, Y.; Wang, W.; Yan, Y. Using machine learning algorithms to multidimensional analysis of subjective thermal comfort in a library. Build. Environ. 2022, 212, 108790.
  38. Gao, N.; Shao, W.; Rahaman, M.S.; Zhai, J.; David, K.; Salim, F.D. Transfer learning for thermal comfort prediction in multiple cities. Build. Environ. 2021, 195, 107725.
  39. Lin, C.-M.; Lin, S.-F.; Liu, H.-Y.; Tseng, K.-Y. Applying the naïve Bayes classifier to HVAC energy prediction using hourly data. Microsyst. Technol. 2022, 28, 121–135.
  40. Pan, W.; Ming, H.; Yang, Z.; Wang, T. Comments on "Using k-core Decomposition on Class Dependency Networks to Improve Bug Prediction Model’s Practical Performance". IEEE Trans. Softw. Eng. 2022. Early Access.
Figure 1. Naïve Bayes flowchart.
Figure 2. Data measuring.
Figure 3. Classic assumption test.
Figure 4. TSV class.
Figure 5. Differences in prediction results.
Figure 6. Prediction result comparison between actual TSV, linear regression, and naïve Bayes.
Figure 7. PMV score.
Figure 8. PPD score.
Table 1. Related work.

| Author | Related Work |
|---|---|
| J. J. Aguilera, J. Toftum, and O. Berk Kazanci (2019) | The study “Predicting personal thermal preferences based on data-driven methods” predicts thermal comfort by comparing three machine learning algorithms: artificial neural network (ANN), naïve Bayes (NB), and fuzzy logic (FL). All methods performed well, with a 70% probability of predicting correctly [30]. |
| P. I. Benito, M. A. Sebastián, and C. González-Gaya (2021) | “Study and Application of Industrial Thermal Comfort Parameters by Using Bayesian Inference Techniques” applies Bayesian analysis to thermal comfort in industry and compares the results with the linear regression method used for thermal comfort. Bayesian analysis is better able to support the development of intelligent, thermally comfortable systems [29]. |
| B. Yang et al. (2022) | The study “Comparison of models for predicting winter individual thermal comfort based on machine learning algorithms” compares thermal comfort models based on skin temperature and environmental factors. The comparison covers four models: support vector machine, decision tree, ensemble algorithms, and K-nearest neighbor. The thermal comfort prediction accuracy reaches 95.8% [31]. |
| R. Zhang, D. Liu, and L. Shi (2022) | The study “Thermal-comfort optimization design method for semi-outdoor stadium using machine learning” aims to reveal the relationship between stadium shape and thermal performance using an artificial neural network approach and genetic algorithms. The simulated results are close to the actual measurements and can be used for stadium optimization, which can be increased by 8.96% [17]. |
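Table 2. Regression Value.

Coefficients

| Model 1 | B (unstandardized) | Std. Error (unstandardized) | Beta (standardized) | t | Sig. | Zero-Order correlation | Partial correlation | Part correlation | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|---|---|
| (Constant) | −8.796 | 3.231 | | −2.723 | 0.007 | | | | | |
| Gender | −0.308 | 0.182 | −0.138 | −1.696 | 0.091 | −0.179 | −0.108 | −0.097 | 0.493 | 2.030 |
| Age | −0.093 | 0.074 | −0.078 | −1.266 | 0.207 | −0.226 | −0.081 | −0.072 | 0.854 | 1.171 |
| Height | −0.015 | 0.013 | −0.106 | −1.235 | 0.218 | 0.083 | −0.079 | −0.070 | 0.437 | 2.289 |
| Weight | 0.005 | 0.012 | 0.030 | 0.439 | 0.661 | 0.055 | 0.028 | 0.025 | 0.705 | 1.419 |
| Temperature | 0.265 | 0.124 | 0.238 | 2.139 | 0.033 | 0.359 | 0.136 | 0.122 | 0.263 | 3.808 |
| Globe_temperature | 0.183 | 0.119 | 0.182 | 1.534 | 0.126 | 0.334 | 0.098 | 0.087 | 0.231 | 4.332 |
| Humidity | 0.018 | 0.011 | 0.102 | 1.588 | 0.114 | 0.014 | 0.101 | 0.090 | 0.784 | 1.275 |
| Velocity | 1.420 | 0.433 | 0.198 | 3.282 | 0.001 | 0.144 | 0.206 | 0.187 | 0.891 | 1.123 |

Tolerance and VIF are collinearity statistics.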
Dependent variable: TSV.
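To make the use of the coefficients concrete, the following Python sketch evaluates the linear model implied by column B of Table 2 for the test record listed in Table 3. It is an illustration only: the function name is hypothetical, and rounding the continuous output to the nearest integer vote is an assumption, since the mapping from regression output to TSV class is not restated here.

```python
# Unstandardized coefficients (column B) copied from Table 2.
INTERCEPT = -8.796
COEFFICIENTS = {
    "Gender": -0.308,
    "Age": -0.093,
    "Height": -0.015,
    "Weight": 0.005,
    "Temperature": 0.265,
    "Globe_temperature": 0.183,
    "Humidity": 0.018,
    "Velocity": 1.420,
}


def predict_tsv(sample):
    """Evaluate intercept + sum(B_i * x_i) for one respondent."""
    return INTERCEPT + sum(COEFFICIENTS[name] * sample[name] for name in COEFFICIENTS)


# Test record from Table 3; its TSV is the unknown to be predicted.
sample = {
    "Gender": 1,
    "Age": 18,
    "Height": 160,
    "Weight": 46,
    "Temperature": 22,
    "Globe_temperature": 24,
    "Humidity": 62,
    "Velocity": 0,
}

raw_tsv = predict_tsv(sample)
# Assumption: the nearest integer is taken as the predicted TSV class.
rounded_tsv = round(raw_tsv)
print(f"raw = {raw_tsv:.3f}, rounded = {rounded_tsv}")
```

With these inputs the expression evaluates to about −1.61, which rounds to −2 under the assumed rounding rule.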
Table 3. Sample of testing data for prediction.

| Gender | Age | Height | Weight | Temperature | Globe_Temperature | Relative_Humidity | Velocity | TSV |
|---|---|---|---|---|---|---|---|---|
| 1 | 18 | 160 | 46 | 22 | 24 | 62 | 0 | ? |
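Table 4. Prediction result.

| Variable | Probability (−3) | Probability (−2) | Probability (−1) | Probability (0) | Probability (1) | Probability (2) | Probability (3) |
|---|---|---|---|---|---|---|---|
| Class count | 3 | 87 | 81 | 50 | 24 | 6 | 1 |
| Gender | 0.00 | 0.2759 | 0.59 | 0.52 | 0.33 | 0.83 | 1.00 |
| Age | 0.33 | 0.1494 | 0.09 | 0.16 | 0.00 | 0.00 | 0.00 |
| Height | 0.00 | 0.0575 | 0.10 | 0.04 | 0.04 | 0.00 | 0.00 |
| Weight | 0.00 | 0.0460 | 0.00 | 0.04 | 0.04 | 0.00 | 0.00 |
| Temperature | 0.00 | 0.1149 | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 |
| Globe_temperature | 0.00 | 0.2874 | 0.12 | 0.02 | 0.08 | 0.00 | 0.00 |
| Relative_humidity | 0.00 | 0.0805 | 0.12 | 0.06 | 0.00 | 0.00 | 0.00 |
| Velocity | 1.00 | 1.0000 | 0.94 | 0.98 | 0.88 | 1.00 | 1.00 |
| Sum | 0 | 0.000026 | 0 | 0 | 0 | 0 | 0 |

Prediction result: −2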
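The class selection in Table 4 can be illustrated with the usual naïve Bayes scoring rule, in which the class prior is multiplied by the per-variable conditional probabilities and the class with the largest score is chosen. The Python sketch below applies that rule to the probabilities copied from Table 4. It is a sketch under assumptions: the prior is taken as class count divided by 252, the names are illustrative, and the score is not meant to reproduce the exact figure shown in the Sum row, only the resulting class.

```python
from math import prod

TSV_CLASSES = [-3, -2, -1, 0, 1, 2, 3]
CLASS_COUNTS = [3, 87, 81, 50, 24, 6, 1]

# Conditional probabilities for the Table 3 sample, copied from Table 4.
LIKELIHOODS = {
    "Gender": [0.00, 0.2759, 0.59, 0.52, 0.33, 0.83, 1.00],
    "Age": [0.33, 0.1494, 0.09, 0.16, 0.00, 0.00, 0.00],
    "Height": [0.00, 0.0575, 0.10, 0.04, 0.04, 0.00, 0.00],
    "Weight": [0.00, 0.0460, 0.00, 0.04, 0.04, 0.00, 0.00],
    "Temperature": [0.00, 0.1149, 0.04, 0.00, 0.00, 0.00, 0.00],
    "Globe_temperature": [0.00, 0.2874, 0.12, 0.02, 0.08, 0.00, 0.00],
    "Relative_humidity": [0.00, 0.0805, 0.12, 0.06, 0.00, 0.00, 0.00],
    "Velocity": [1.00, 1.0000, 0.94, 0.98, 0.88, 1.00, 1.00],
}


def naive_bayes_scores(likelihoods, class_counts):
    """Return prior * product of likelihoods for every class."""
    total = sum(class_counts)
    scores = []
    for i, count in enumerate(class_counts):
        prior = count / total  # assumed prior: relative class frequency
        scores.append(prior * prod(row[i] for row in likelihoods.values()))
    return scores


scores = naive_bayes_scores(LIKELIHOODS, CLASS_COUNTS)
predicted = TSV_CLASSES[scores.index(max(scores))]
print(predicted)  # -2
```

Every class other than −2 contains at least one zero probability, so its score collapses to zero and −2 is the only candidate left. This also shows how sensitive the unsmoothed estimate is to value combinations that never occurred in the training data.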
Table 5. Comparison of data analysis accuracies.

| Method | Correct Prediction | Count of Data | Accuracy |
|---|---|---|---|
| Linear Regression | 84 | 252 | 33% |
| Naïve Bayes | 169 | 252 | 67% |
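The accuracy column in Table 5 is consistent with the number of correct predictions divided by the number of records. A short Python check, with an illustrative helper name, is given below.

```python
def accuracy(correct, total):
    """Share of records whose predicted TSV equals the actual TSV."""
    return correct / total


# (correct predictions, count of data) copied from Table 5.
results = {
    "Linear Regression": (84, 252),
    "Naïve Bayes": (169, 252),
}

percentages = {m: accuracy(c, n) * 100 for m, (c, n) in results.items()}
for method, pct in percentages.items():
    print(f"{method}: {pct:.1f}%")
```

Rounded to whole percentages, the two values are 33% and 67%, as reported.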
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sibyan, H.; Svajlenka, J.; Hermawan, H.; Faqih, N.; Arrizqi, A.N. Thermal Comfort Prediction Accuracy with Machine Learning between Regression Analysis and Naïve Bayes Classifier. Sustainability 2022, 14, 15663. https://doi.org/10.3390/su142315663

AMA Style

Sibyan H, Svajlenka J, Hermawan H, Faqih N, Arrizqi AN. Thermal Comfort Prediction Accuracy with Machine Learning between Regression Analysis and Naïve Bayes Classifier. Sustainability. 2022; 14(23):15663. https://doi.org/10.3390/su142315663

Chicago/Turabian Style

Sibyan, Hidayatus, Jozef Svajlenka, Hermawan Hermawan, Nasyiin Faqih, and Annisa Nabila Arrizqi. 2022. "Thermal Comfort Prediction Accuracy with Machine Learning between Regression Analysis and Naïve Bayes Classifier" Sustainability 14, no. 23: 15663. https://doi.org/10.3390/su142315663
