
Prediction of Overall Energy Consumption of Data Centers in Different Locations

School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
AI Research Institute, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
Author to whom correspondence should be addressed.
Sensors 2022, 22(10), 3704
Received: 13 April 2022 / Revised: 7 May 2022 / Accepted: 10 May 2022 / Published: 12 May 2022
(This article belongs to the Topic Digitalization for Energy Systems)


The use of big data leads to higher demands for hyperscale data centers (HDCs) in terms of the scale and quantity required for data storage and processing. Before the construction of an HDC, it is necessary to comprehensively analyze the economic budget according to the energy requirements and potential energy cost. We propose a global energy consumption prediction framework based on the power usage effectiveness (PUE) calculation that considers all heat sources and power consumption. The framework integrates physical models and a statistical framework that combines predictions of IT equipment energy consumption and data center energy consumption. Furthermore, the framework provides a method to calculate the carbon emissions and electricity cost of the data center. Using hourly meteorological data as climate parameters, combined with a limited range of energy parameters, the annual PUE values of 60 regions were estimated, and a further analysis of the Carbon Usage Effectiveness (CUE) and electricity costs in China was conducted as an example. Based on experimental validation and an evaluation of real-time data, our framework can predict the overall energy consumption of HDCs effectively, filling a gap in HDC research in the Asia-Pacific region and providing a basis for HDC feasibility analysis.

1. Introduction

The rapid development of information technology has made data centers key infrastructure supporting cloud computing, the Internet of Things, 5G, AR/VR, etc.; as a result, they consume significant amounts of energy. According to statistics, the global energy demand of data centers rose from 194 TWh to 205 TWh between 2010 and 2018 [1]. The growing demand for computing-intensive services, such as artificial intelligence (AI), and the increasing number of internet users have resulted in exponential growth in both the types and volume of data. Data centers indirectly affect CO₂ emissions. It is estimated that, by 2030, CO₂ emissions will reach 720 million tons [2]. More HDCs have been constructed to meet this demand, which has created a conflict between huge energy consumption and the limited power supply, and it is necessary to balance the performance of HDCs with the capacity of the external environment. In addition, operators of HDCs try to find business opportunities by exploiting natural advantages [3]. At present, there are two main development directions for large-scale data centers: (1) reducing PUE and improving the energy efficiency of data centers, and (2) purchasing renewable energy to provide power, rather than using a traditional power supply [1]. Therefore, to better optimize data center energy consumption while preserving application performance, it is necessary to build an appropriate and accurate energy consumption model [4] that incorporates the performance, scale, location, and other characteristics of data centers. Energy consumption models help to predict the consequences of operational decisions, allowing for more effective management and control of a system [5]. We planned to analyze energy consumption by investigating the following two aspects: First, we identified the essential factors affecting energy consumption.
Data centers consist of a wide range of complex IT equipment and infrastructures [6], enhancing the complexity of energy consumption calculations. To simplify the calculation, we planned to identify the decisive factors that play key roles in the energy efficiency of a data center to balance the calculation accuracy and efficiency. Second, we built mathematical models for the energy consumption of data centers. Currently, there are no official statistical data on the energy use of HDCs at the global level. It is necessary to formally describe the energy consumption of data centers through mathematical models by transforming the energy efficiency optimization problem of data centers into a classic combinatorial optimization problem [7] to provide optimization strategies.
Data centers of different sizes consume different amounts of energy in the same period. It is meaningless to judge whether a data center meets the energy-saving standard from the perspective of how much energy is consumed. However, PUE can be applied to most scenarios internationally to reflect the energy efficiency of data centers. Traditional data center energy consumption models usually consider all internal components separately and then perform a linear combination. However, it is easy to ignore the interactions between components that are hard to predict. Currently, PUE can be predicted in several ways. Using the PUE formula, we can calculate the overall energy consumption of a data center by measuring the energy consumption of IT equipment.
In this paper, based on the PUE calculation method, we proposed a global energy consumption prediction framework that integrates physical models and a statistical framework and takes all heat generation sources and power consumption components of a data center into consideration. The framework uses the quasi-Monte Carlo (QMC) method [8] to calculate the Sobol sensitivity results [9]. Specifically, the PUE model takes into account the free cooling technology used [10]. Based on hourly meteorological observation data in 60 regions, we estimated the annual PUE of HDCs under the use of different cooling methods and conducted a comparative analysis. We simulated the internal and external parameters of HDCs and predicted the overall energy consumption and carbon emissions of the data center based on PUE and CUE formulas. The feasibility of the zero-carbon data center was then concluded from an in-depth analysis combined with data on the electricity cost and energy structure.
The rest of the paper is organized as follows: Section 2 provides background knowledge and work related to HDC energy consumption technologies. Section 3 presents our energy consumption prediction framework, which includes the PUE models, the IT equipment model, and related calculation methods. Section 4 presents an analysis of the experimental results, an evaluation of our framework, and a discussion. Finally, Section 5 concludes the paper.

2. Related Work

Energy-saving, cost reduction, and carbon emission reduction strategies are current research hotspots in the construction of green data centers. The deployment of energy-saving infrastructure and scheduling could dramatically increase the energy efficiency of data centers; however, such scheduling requires energy consumption to be predicted. Energy consumption prediction is the basis of data center energy scheduling management, and the energy consumed can be divided into two parts: that of the IT facilities and that of the supporting infrastructure. The former mainly refers to energy consumption during the operation of IT hardware, such as servers, switches, and disk arrays, and the latter mainly includes energy consumption by cooling equipment and power supplies. Generally, the former accounts for a larger proportion of energy than the latter [11,12].
The energy consumption of each part of a data center can be calculated separately. From the perspective of server equipment, the most popular models are the additive model [13,14,15,16] and the model based on system utilization [17,18], and each model type can be further divided into linear and non-linear models. For server energy saving, the most common technologies include server sleep scheduling [19] and Dynamic Voltage Frequency Scaling (DVFS) [20]. Refrigeration is also a major energy-consuming part of large data centers, as about 40% of the total energy consumption of data centers is used for cooling [21]. For data center refrigeration, heating, ventilation, and air conditioning (HVAC) systems and computer room air conditioning (CRAC) units are commonly applied, and many experiments and models have been developed for the accurate estimation of these processes. Some researchers have found that jointly considering the IT system layer and the supporting facility layer for cross-layer optimization can maximize the energy-saving potential of the data center [22,23]. This research focused on the cross-layer energy consumption optimization of the data center, taking the modules as a whole, considering the relationships between modules, and carrying out unified optimal scheduling and management. The power management methods of a data center air-conditioning system based on IT load scheduling mainly include cooling cost optimization, cabinet heat balancing, peak heat load reduction, and free cooling source utilization, of which free cooling is the most widely applicable. Power supply system management technologies based on IT load scheduling mainly include limiting power, the use of UPS supplemental power, and power supply equipment scheduling.
Unified cross-layer energy consumption models are mainly constructed by the following two steps:
(1) According to the physical energy consumption characteristics of the equipment, specific mathematical formulas are applied to calculate the energy consumption of each system, a process known as cross-layer joint optimization. The total energy efficiency optimization framework proposed by Wan et al. [24] is one of the examples of cross-layer joint optimization. This framework optimizes the energy cost of a cross-layer data center, spanning the chip layer, server layer, and computer room layer. The thermal prediction model ThermoCast [25] is another example of joint optimization. It can integrate data center sensor observations and physical laws and is capable of capturing cyber–physical interactions and undergoing automatic learning using the data.
(2) With the help of learning tools, we can predict the temperature or energy consumption of a system based on current/historical information (load, air conditioning parameters, external environmental parameters, etc.), a process known as prediction-based cross-layer joint optimization. However, methods based on CFD simulation software have high computational complexity. There are also methods based on machine learning, such as the self-aware workload forecasting (SAWF) framework (Hsu et al. [26]), while Gao et al. [27] chose to directly predict the PUE.
PUE [28] is the ratio of the total energy consumption of a data center ($p_{DC}$) to the energy consumption of its IT equipment ($p_{IT}$). It is an index that is used to evaluate the energy efficiency of a data center, and the result is usually greater than 1. The closer the PUE is to 1, the higher the energy efficiency level of the data center is. PUE is expressed by Equation (1):
$$PUE = \frac{p_{DC}}{p_{IT}} \tag{1}$$
Many researchers have chosen to calculate data center energy consumption by measuring PUE. The most comprehensive study applied a thermodynamic model using constant model inputs to estimate PUE and compared the results with a different free cooling method developed by Gozcü et al. [29]. Research on PUE prediction has also been conducted. The 5-layer neural network developed by Gao’s team [27], mentioned above, predicted the PUE of a Google data center with an average absolute error of 0.004. However, this work used 2 years of historical data, and training such a neural network model requires a large amount of nonpublic data containing 19 dimensions, which further increases the training difficulty. Another study by Brady et al. [28] performed high-precision PUE calculations on a set of Facebook HDCs in Prineville, Oregon. This was a thermodynamic modeling case study, which was limited to the airside economizer of a single data center and did not include an accuracy evaluation of other data centers using airside economizers or waterside economizers. Although sensitivity analyses were performed in both studies, the analyses were conducted separately for each parameter. The relative importance of parameters was not assessed, and the interaction effects of important variables were also ignored.
In addition to PUE, other energy efficiency metrics have been applied to the evaluation of data centers, such as pPUE [30] (i.e., local PUE) and RER [31] (i.e., renewable energy ratio). Greenpeace, an international environmental protection organization, believes that “green IT = energy efficiency + renewable energy”, which means the greening of the Internet not only needs to involve the reduction of costs by improving the energy efficiency but also requires the use of renewable energy to fundamentally reduce carbon emissions. Internet companies have begun to deploy data centers in places with lower electricity costs or have switched to high usage of new energy, where the power either comes from purchasing from third-party power plants or is provided by new energy power plants built by the company itself. Typical examples of new-energy-powered plants include the solar farm used by Facebook in Ohio for powering data centers and the Green House Data wind farm built in Ohio.
The mature energy consumption prediction methods mentioned above (as well as the currently used method) first establish an energy consumption prediction model based on the historical data generated during the operation of IT equipment and supporting infrastructure and then apply the algorithm to obtain the optimal parameters to control the energy consumption value of various pieces of equipment in the future. The feasibility of these methods relies on the quality of the model. Once the model deviates from the real situation of the equipment operating parameters, the quality of the control strategy cannot be guaranteed.
Cloud computing services are extremely popular and widely adopted due to their flexibility and on-demand advantages. They are hosted in cloud data centers (CDCs), enabling lower energy consumption and carbon emissions. CDC techniques are dependent on geodispersed Modular Data Center (MDC) designs and virtualization-based workload migration [32]. Ahmad et al. [33] reviewed the Virtual Machine (VM)-based workload consolidation schemes in CDCs. P. Nehra et al. [34] compared several existing energy consumption models of CDCs. Yamini et al. [35] proposed a method to reduce the number of servers based on the clique star cover number theorem in which more nodes are connected to the server. Zhang et al. [36] elaborated on the energy consumption in the cloud environment by measuring energy usage in different scenarios. The field of energy efficiency in CDCs holds great promise and remains explorable for researchers.

3. Energy Consumption Prediction Framework

To predict the total energy consumption of a data center, we (1) calculated the PUE according to different internal and external parameters, and (2) estimated the IT equipment energy consumption. The overall energy consumption, carbon emissions, and electricity cost of the data center were obtained directly.

3.1. PUE Prediction

As shown in Figure 1a, the prediction of PUE is based on the model proposed by Lei et al. [37], which considers all of the heat-generating sources and power-consuming components in the data center system. When free cooling is involved in energy-saving problems, the model mainly considers three cooling scenarios: airside economizers combined with adiabatic cooling (AE), waterside economizers utilizing the evaporative cooling capability of cooling towers (WEC), and waterside economizers using seawater for cooling (WES). When the above technologies cannot provide a sufficient cooling capacity, the mechanical chiller of the cooling system will be deployed to maintain the indoor temperature within an acceptable range.
The aim of the experiments described in this paper was to verify the WEC refrigeration method. Given the external climatic conditions and a specified indoor thermal environment, the PUE model identifies the economizer application scenarios and the amount of additional mechanical cooling that may be required. Based on the thermodynamic model, the total calorific value and electricity consumption of a data center can be described by Equations (2) and (3):
$$q_{DC} = p_{IT} + (1 - \eta_{UPS})\, p_{UPS} + \alpha_{PD}\, p_{PD} + p_{L} \tag{2}$$
$$p_{DC} = p_{IT} + (1 - \eta_{UPS})\, p_{UPS} + \alpha_{PD}\, p_{PD} + p_{L} + \sum_{f} p_{f}^{FAN} + \sum_{p} p_{p}^{PUMP} + p_{CH} \tag{3}$$
where $q_{DC}$ is the total heat generated by a data center. $p_{IT}$, $p_{UPS}$, $p_{L}$, and $p_{DC}$ represent the power used by the IT equipment, UPS, lighting system, and the entire data center, respectively. $p_{PD}$ represents the power used across the power transformation and distribution system. $p_{f}^{FAN}$ and $p_{p}^{PUMP}$ represent the power used by fan type $f$ (including CRAC fans and cooling tower fans) and the power used by pump type $p$ (including chiller pumps, waterside economizer pumps, cooling tower pumps, and humidification pumps), respectively. $p_{CH}$ is the power used by the chiller. The units of all of the above parameters are kW. $\eta_{UPS}$ is the efficiency of the UPS, and $\alpha_{PD}$ is the percentage of power loss in the power transformation and distribution system (i.e., the loss of lines and switches).
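The heat and power balances of Equations (2) and (3), together with the PUE ratio of Equation (1), can be sketched in a few lines of Python. This is only an illustrative sketch: the class name, field names, and the numbers in `example` are hypothetical and are not taken from the paper, and the fan and pump terms are assumed to be already summed over all fan and pump types.

```python
from dataclasses import dataclass


@dataclass
class PowerInputs:
    """Instantaneous power terms in kW (field names are illustrative)."""
    p_it: float       # IT equipment
    p_ups: float      # power passing through the UPS
    p_pd: float       # power passing through transformation/distribution
    p_light: float    # lighting
    p_fans: float     # sum over all fan types
    p_pumps: float    # sum over all pump types
    p_chiller: float  # mechanical chiller
    eta_ups: float    # UPS efficiency, between 0 and 1
    alpha_pd: float   # fractional loss in power distribution, between 0 and 1


def heat_load_kw(x):
    """Equation (2): total heat generated inside the data center."""
    return x.p_it + (1 - x.eta_ups) * x.p_ups + x.alpha_pd * x.p_pd + x.p_light


def total_power_kw(x):
    """Equation (3): heat-generating terms plus fans, pumps, and chiller."""
    return heat_load_kw(x) + x.p_fans + x.p_pumps + x.p_chiller


def pue(x):
    """Equation (1): total facility power divided by IT power."""
    return total_power_kw(x) / x.p_it


# Hypothetical operating point, for illustration only.
example = PowerInputs(p_it=1000.0, p_ups=1000.0, p_pd=1100.0, p_light=10.0,
                      p_fans=60.0, p_pumps=30.0, p_chiller=100.0,
                      eta_ups=0.95, alpha_pd=0.02)
print(heat_load_kw(example), pue(example))
```

Keeping the heat load as its own function mirrors the structure of the model: the cooling terms are sized against the heat load, while the PUE is computed from the total power.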
The determination of the WEC application scenario requires the water temperature delivered by the economizer heat exchanger ($T_{WEC}$, °C) and the return temperature of the facility water ($T_{rw}$, °C) to be compared. The supply temperature of the facility water ($T_{sw}$, °C) was set according to the dynamically changing temperature of the supply and return air of CRAC, which is described by Equation (4):
$$T_{sw} = T_{ra} - (T_{ra} - T_{sa})/\epsilon = T_{ra} - \Delta T_{air}/\epsilon \tag{4}$$
where $T_{ra}$ represents the CRAC return air temperature, and $T_{sa}$ represents the CRAC supply air temperature. $\Delta T_{air}$ is the temperature difference between the supplied and returned CRAC air. The units of the above parameters are °C. $\epsilon$ is the heat exchanger effectiveness of CRAC cooling coils. Then, $T_{rw}$ can be calculated from $T_{sw}$ and the temperature difference between the supplied and returned facility water ($\Delta T_{w}$, °C). $T_{WEC}$ can be calculated by Equation (5):
$$T_{WEC} = T_{wb} + (AT_{CT} + AT_{EX})/\epsilon = T_{ra} - \Delta T_{air}/\epsilon \tag{5}$$
where $T_{wb}$, $AT_{CT}$, and $AT_{EX}$ represent the wet bulb temperature of outdoor air, the approach temperature of the cooling tower, and the approach temperature of the economizer heat exchanger, respectively. Their units are °C.
In general, q D C can be expressed by Equation (6):
$$q_{DC} = q_{WEC} + q_{CH} \tag{6}$$
where $q_{WEC}$ represents free cooling supplied by WEC. Its units are kW.
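The comparison between the economizer water temperature and the facility water temperatures, and the resulting split of the heat load between free cooling and the chiller, can be illustrated as follows. This is a sketch under stated assumptions rather than the model of [37]: the function names are hypothetical, the return water temperature is assumed to equal the supply temperature plus the water-side temperature difference, and the three-mode rule (full, partial, or no free cooling, with a linear share in the partial mode) is an assumption introduced here for illustration.

```python
def facility_water_temps(t_return_air, delta_t_air, effectiveness, delta_t_water):
    """Supply water temperature per Equation (4), in degrees C.

    The return temperature assumes the return water is warmer than the
    supply water by delta_t_water (an assumption of this sketch).
    """
    t_sw = t_return_air - delta_t_air / effectiveness
    t_rw = t_sw + delta_t_water
    return t_sw, t_rw


def split_cooling_load(q_dc, t_wec, t_sw, t_rw):
    """Split q_dc (kW) into (q_wec, q_ch) so that q_dc = q_wec + q_ch.

    Assumed rule (not from the source): the economizer covers the whole
    load when its water is at least as cold as the supply set-point, none
    of it when its water is as warm as the return water or warmer, and a
    linearly interpolated share in between.
    """
    if t_wec <= t_sw:
        share = 1.0
    elif t_wec >= t_rw:
        share = 0.0
    else:
        share = (t_rw - t_wec) / (t_rw - t_sw)
    q_wec = share * q_dc
    return q_wec, q_dc - q_wec


# Hypothetical set-points: 35 C return air, 12 C air-side difference,
# coil effectiveness 0.75, 6 C water-side difference.
t_sw, t_rw = facility_water_temps(35.0, 12.0, 0.75, 6.0)
print(t_sw, t_rw, split_cooling_load(1000.0, 22.0, t_sw, t_rw))
```

Whatever rule is used for the partial mode, the two returned values always sum to the heat load, which is the constraint expressed by Equation (6).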
As shown in Figure 2, the input of the PUE model can be expressed in vector form:
$$s = [T_{oa}, RH_{oa}, P_{atm}, \eta_{UPS}, \alpha_{PD}, \ldots] \tag{7}$$
The input of the PUE model can be divided into two categories: climate parameters and data center energy system parameters. The latter includes equipment specifications, system operating efficiency parameters, and indoor environmental set-points, and a detailed description of the former can be found in [37].
In order to find the parameters that have the greatest influence on predicting the PUE value and to further evaluate the influence of the interactions among variables during PUE prediction, we first used Sobol’s method to generate sample vectors from climate and energy system parameters, processed the sample vectors using the model, and finally, calculated the Sobol sensitivity index using the quasi-Monte Carlo (QMC) method for the uncertainty analysis. If the uncertainty of key parameters can be reduced, the accuracy of the prediction results can be greatly improved.
The outdoor dry bulb temperature and outdoor relative humidity ranges were set to −40 to 40 °C (approximate range of the outdoor dry bulb temperature in all regions of China throughout the year) and 0–100%, respectively. The range of parameters was estimated and determined based on public information, and each interval adopted a uniform probability distribution.
The results of the Sobol sensitivity analysis show that climate parameters play a key role in PUE values, so site-specific climate data are needed as input. Fortunately, accurate values are relatively easy to obtain with meteorological data as the input. Specific data on energy and machinery depend on the internal documents of the specific data center. The results of the analysis of the Sobol sensitivity are described in detail in Section 4.1.
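The sampling-and-estimation procedure described in this subsection can be sketched in pure Python. Several choices here are assumptions made for illustration: a Halton sequence is swapped in for Sobol's sequence as the quasi-random generator (to keep the example free of third-party dependencies), the first-order and total-order indices use a common pair of estimators usually attributed to Saltelli and Jansen, and `model` stands for any function of the input vector, not the actual PUE model.

```python
def halton(index, base):
    """Radical-inverse (van der Corput) value of `index` in the given base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]


def sobol_indices(model, bounds, n_samples=4096):
    """Estimate first-order and total-order Sobol indices.

    model  : callable taking a list of d floats and returning a float
    bounds : list of (low, high) pairs, one per input, sampled uniformly
    """
    d = len(bounds)
    if 2 * d > len(PRIMES):
        raise ValueError("add more primes to support this many inputs")

    def scale(u, k):
        low, high = bounds[k]
        return low + u * (high - low)

    # One 2d-dimensional quasi-random point yields a row of A and a row of B.
    rows_a, rows_b = [], []
    for j in range(1, n_samples + 1):
        point = [halton(j, PRIMES[k]) for k in range(2 * d)]
        rows_a.append([scale(point[k], k) for k in range(d)])
        rows_b.append([scale(point[d + k], k) for k in range(d)])

    y_a = [model(row) for row in rows_a]
    y_b = [model(row) for row in rows_b]
    pooled = y_a + y_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / len(pooled)

    first, total = [], []
    for i in range(d):
        # Evaluate the model on A with column i replaced by column i of B.
        y_ab = [model(a[:i] + [b[i]] + a[i + 1:])
                for a, b in zip(rows_a, rows_b)]
        s_i = sum(yb * (yab - ya) for ya, yb, yab in zip(y_a, y_b, y_ab))
        st_i = sum((ya - yab) ** 2 for ya, yab in zip(y_a, y_ab))
        first.append(s_i / n_samples / variance)
        total.append(st_i / (2 * n_samples) / variance)
    return first, total
```

For the toy model `lambda x: x[0] + 2 * x[1]` with both inputs uniform on (0, 1), the analytical indices are 0.2 and 0.8 for both orders (there is no interaction), so `sobol_indices(lambda x: x[0] + 2 * x[1], [(0.0, 1.0), (0.0, 1.0)])` should return values close to those.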

3.2. IT Equipment Energy Consumption Model

Energy consumption models of IT equipment presented in previous work can be roughly divided into two categories: energy consumption evaluation models based on system utilization and energy consumption prediction models based on performance monitoring counters (PMC) [38]. We built an additive model for IT equipment energy consumption considering the former category, as described by Equation (8):
$$p_{IT} = p_{CPU} + p_{MEM} + p_{DISK} + p_{Others} \tag{8}$$
Specifically, processor energy consumption can be calculated by CPU usage modeling, memory energy consumption can be calculated by cache miss rate modeling, and disk energy consumption can be calculated as the number of read and write bytes. Based on the performance monitoring counter, the model was built as described by Equation (9):
$$p_{IT} = C_{0} + \sum_{i=1}^{n} C_{i} E_{i} \tag{9}$$
where $C_{0}$ is a constant, $E_{i}$ is the collected performance counter event, and $C_{i}$ is the influence coefficient of the $i$th event on energy consumption. $C_{0}$ and $C_{i}$ can be found by linear regression.
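The linear regression step for the PMC-based model can be illustrated with an ordinary least-squares fit written in pure Python. This is a minimal sketch: the function names are hypothetical, the fit solves the normal equations by Gauss-Jordan elimination (adequate for a handful of well-scaled counters, not a numerically robust general solver), and the data in the usage note are synthetic.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: counters are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_pmc_model(events, power):
    """Least-squares fit of power = C0 + sum_i C_i * E_i.

    events : list of samples, each a list of n counter values E_1..E_n
    power  : list of measured power values, one per sample
    Returns [C0, C1, ..., Cn].
    """
    design = [[1.0] + list(sample) for sample in events]  # leading 1 fits C0
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * y for row, y in zip(design, power)) for a in range(k)]
    return solve_linear_system(xtx, xty)
```

As a usage example with synthetic, noise-free data generated from C0 = 50, C1 = 2, and C2 = 0.5, calling `fit_pmc_model([[10, 4], [20, 1], [30, 8], [40, 3], [50, 9]], [72.0, 90.5, 114.0, 131.5, 154.5])` should recover those three coefficients up to rounding error.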
Energy prediction models based on PMC have become mainstream applications for energy optimization. They generally outperform energy modeling methods based on system utilization due to their fine-grained characteristics. A. Shahid et al. pointed out that any nonlinear energy model using only PMC (such as RF and NN models) is inconsistent and inaccurate [39] and, because the current state-of-the-art multicore CPU energy prediction models are based on linear regression, proposed a theoretical framework for computing energy prediction models [40]. The basic practical implications of the theory include selection criteria for model variables, model intercepts, and model coefficients. The model theory follows the physical laws of the conservation of computing energy.
Property 1: An abstract application run can be accurately characterized by a set of $n$-vectors of PMCs over $\mathbb{R}_{\ge 0}$. A null vector of PMCs is represented by
$$\mathrm{NULL} = \{0\}_{k=1}^{n} \tag{10}$$
A function $f_E: \mathbb{R}_{\ge 0}^{n} \to \mathbb{R}_{\ge 0}$ maps the vectors to energy values, and $\forall p, q \in \mathbb{R}_{\ge 0}^{n}$,
$$p = q \implies f_E(p) = f_E(q) \tag{11}$$
Property 2: There exists an application space, $(A, \oplus)$, where $A$ is a (infinite) set of applications, and $\oplus$ is a binary function on $A$,
$$\oplus: A \times A \to A \tag{12}$$
There exists a (infinite) set of binary operators,
$$O = \{\circ_{PQ,k}: \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0},\ P, Q \in A,\ k \in [1, n]\} \tag{13}$$
so that for each $P, Q \in A$, and their PMC vectors $p = \{p_k\}_{k=1}^{n}$, $q = \{q_k\}_{k=1}^{n} \in \mathbb{R}_{\ge 0}^{n}$, respectively, the PMC vector of the compound application $P \oplus Q$ will be equal to $\{p_k \circ_{PQ,k} q_k\}_{k=1}^{n}$.
Property 3:
$$f_E(\mathrm{NULL}) = 0 \tag{14}$$
Property 4:
$$\forall p \in \mathbb{R}_{\ge 0}^{n} \setminus \{\mathrm{NULL}\},\ f_E(p) > 0 \tag{15}$$
Property 5: $\forall P, Q \in A$, $p = \{p_k\}_{k=1}^{n}$, $q = \{q_k\}_{k=1}^{n} \in \mathbb{R}_{\ge 0}^{n}$, $\circ_{PQ,k} \in O$,
$$f_E(\{p_k \circ_{PQ,k} q_k\}_{k=1}^{n}) = f_E(p) + f_E(q) \tag{16}$$
When $f_E(x)$ is a linear function, the model is linear. The linear consistent energy prediction model can be formalized as $\forall p = (p_k)_{k=1}^{n}$, $p_k \in \mathbb{R}_{\ge 0}$,
$$f_E(p) = \beta_0 + \beta \times p = \beta_0 + \sum_{k=1}^{n} \beta_k \times p_k \tag{17}$$
where $\beta_0$ is the model intercept, and $\beta = \{\beta_1, \beta_2, \ldots, \beta_n\}$ is the vector of the regression coefficients or the model parameters. Influenced by measurement errors or stochastic noise, the measured energy can be described by Equation (18):
$$\tilde{f}_E(p) = f_E(p) + \epsilon \tag{18}$$
where the error term $\epsilon$ is a Gaussian random variable with an expectation of zero and variance of $\sigma^2$, written as $\epsilon \sim N(0, \sigma^2)$.
Linear energy models have the following properties:
Theorem 1.
If a linear energy predictive model, such as Equation (17), is consistent, the model intercept must be zero and the model coefficients must be positive.
Theorem 2.
If a consistent energy model is linear, then it is strongly composable with $O = \{+\}$.
Theorem 3.
If a consistent energy model is strongly composable with $O = \{+\}$ and the function $f_E(x)$ is continuous, then it is linear.
Details and proofs can be found in [40]. In experiments on two modern Intel multicore servers, applying the theory improved the prediction accuracy of state-of-the-art linear regression models and led to significant energy savings. This theory can be used to build accurate linear energy prediction models.
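The conditions stated in Theorem 1 and the additivity implied by composability with addition can be checked numerically for a fitted linear model. The sketch below is illustrative only: the function names and the coefficient values in the usage note are hypothetical, and passing these checks for a few vectors is a necessary test, not a proof of consistency.

```python
def predict_energy(coefficients, pmc_vector, intercept=0.0):
    """Linear model of Equation (17): intercept + sum_k beta_k * p_k."""
    return intercept + sum(b * p for b, p in zip(coefficients, pmc_vector))


def is_consistent_linear_model(coefficients, intercept=0.0):
    """Conditions from Theorem 1: zero intercept, positive coefficients."""
    return intercept == 0.0 and all(b > 0 for b in coefficients)


def is_additive(coefficients, p, q, intercept=0.0, tol=1e-9):
    """Check f_E(p + q) == f_E(p) + f_E(q) for one pair of PMC vectors."""
    combined = [a + b for a, b in zip(p, q)]
    lhs = predict_energy(coefficients, combined, intercept)
    rhs = (predict_energy(coefficients, p, intercept)
           + predict_energy(coefficients, q, intercept))
    return abs(lhs - rhs) <= tol * max(1.0, abs(lhs))
```

With hypothetical coefficients `[2.0, 0.5]`, the model is additive for `p = [10, 4]` and `q = [20, 1]` when the intercept is zero, whereas a nonzero intercept such as 5.0 breaks additivity because the intercept is counted twice on the right-hand side. This is the intuition behind the zero-intercept requirement.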
Based on the above settings, a PMC-based energy prediction model that satisfies the above five properties of the extended model can be regarded as a consistent energy model under the same computing environment.

3.3. Calculation of the Total Energy Consumption and Related Analysis

According to Equation (1), we can calculate the total energy consumption of a data center using Equation (19):
$$p_{DC} = PUE \times p_{IT} \tag{19}$$
If the PUE result is predicted by Section 3.1, and the IT equipment energy consumption is obtained by Section 3.2 or known data, the total energy consumption of a particular data center can be inferred.
PUE cannot evaluate the environmental performance and energy expenditure of a data center. The Green Grid Organization proposed a new energy measurement standard for green data centers, the Carbon Usage Effectiveness (CUE). The CUE is the carbon emission intensity per kilowatt-hour of electricity used [41], that is, the ratio of the total CO₂ emissions of the data center ($D_{total}$, kgCO₂eq) to the energy consumption of the IT equipment ($p_{IT}$, kWh):
$$CUE = \frac{D_{total}}{p_{IT}} \tag{20}$$
The CUE can also be expressed by the product of the Carbon Emission Factor (CEF) and PUE as shown in Equation (21):
$$CUE = CEF \times PUE \tag{21}$$
The CEF is the carbon emissions per unit of energy consumed (kgCO₂e·kWh⁻¹), and Table 1 shows the CEF of several common electrical energy sources [42]. The carbon emission factor of fossil fuels is the largest.
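The relations among total energy, CUE, and emissions can be combined into a small calculator. This is a sketch under stated assumptions: the function names are hypothetical, the PUE is treated as a period-average value so that the power relation can be applied to energy over the same period, and the flat electricity tariff is an assumption introduced here for illustration.

```python
def total_energy_kwh(pue, it_energy_kwh):
    """Equation (19) applied to energy over a period: E_DC = PUE * E_IT."""
    return pue * it_energy_kwh


def cue(cef_kg_per_kwh, pue):
    """Equation (21): CUE = CEF * PUE, in kgCO2e per kWh of IT energy."""
    return cef_kg_per_kwh * pue


def emissions_kg(cef_kg_per_kwh, pue, it_energy_kwh):
    """Equation (20) rearranged: D_total = CUE * E_IT."""
    return cue(cef_kg_per_kwh, pue) * it_energy_kwh


def electricity_cost(price_per_kwh, pue, it_energy_kwh):
    """Cost of the total facility energy at a flat tariff (an assumption)."""
    return price_per_kwh * total_energy_kwh(pue, it_energy_kwh)


# Hypothetical inputs: PUE 1.25, 8000 kWh of IT energy,
# CEF 0.5 kgCO2e/kWh, tariff 0.1 currency units per kWh.
print(total_energy_kwh(1.25, 8000.0))
print(cue(0.5, 1.25), emissions_kg(0.5, 1.25, 8000.0))
print(electricity_cost(0.1, 1.25, 8000.0))
```

Because emissions and cost both scale with the product of PUE and IT energy, a lower PUE reduces both in the same proportion, while the CEF and the tariff depend only on the power source.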
Of the energy sources presented above, wind energy and solar energy are the most promising green energy sources for data centers due to their extensive existence and environmental friendliness. However, the power generation of these green energy sources varies over time, causing instability. Environmental conditions also have a great impact. For example, wind speed affects wind energy, and sunshine intensity affects solar energy. In terms of early installation and deployment, new energy power generation costs more than energy production by traditional fossil fuel power plants, but the former has lower follow-up management costs and significantly less pollutant emissions during operation.
Table 2 shows the CEF and overheads of the grid and some new green energy products. In terms of the electricity cost, it is necessary to consider the power source. In addition, the impact of its carbon emissions needs to be considered.

4. Evaluation

We used hourly meteorological data as the input data for the climate component of the PUE model and generated random values within the range of established reliable mechanical parameters as the parameter input for the energy component. The annual PUE was estimated, and the carbon emissions and electricity costs were further analyzed.

4.1. Sensitivity Results

Making the key input parameters as accurate as possible is an important way to reduce the uncertainty of model prediction. As the largest source of uncertainty, climate parameters can be obtained from weather databases in most parts of the world. These climate data are measured sensor data and are beneficial to the model’s accuracy. However, the internal parameters of data centers are hard to obtain, and the specific internal settings of the data center, such as the UPS efficiency, may be difficult to determine. The accuracy of the method was assessed by bootstrapping with 100 resamples (drawn with replacement) to calculate the 95% confidence interval of each sensitivity index [43]. The results for all sensitivity indices are shown in Table A1, and factors greater than 0.01 are shown in Figure 3.
To show the interaction effect of the variables, we divided the sum of the total order sensitivity indices ($\sum_{i=1}^{k} S_{Ti}$) by the sum of the first order sensitivity indices ($\sum_{i=1}^{k} S_{i}$). The ratio was 1.96, which indicates substantial interaction effects and shows that the total order sensitivity should be used, because the global sensitivity analysis takes the interactions between parameters into account, making it more robust than the local sensitivity analysis, while the first order sensitivity index can only reflect the effect of a single variable [44].
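The bootstrap confidence interval and the interaction ratio discussed in this subsection can be sketched as follows. The details are assumptions made for illustration: the interval is a simple percentile interval, the resampling is applied jointly to the rows of three model-output lists (`y_a`, `y_b`, and `y_ab`, where `y_ab` comes from the first sample matrix with one column taken from the second), and the first-order estimator is one common choice rather than necessarily the one used in the paper.

```python
import random


def first_order_index(y_a, y_b, y_ab):
    """First-order Sobol index from three aligned lists of model outputs."""
    n = len(y_a)
    pooled = y_a + y_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / len(pooled)
    return sum(b * (ab - a) for a, b, ab in zip(y_a, y_b, y_ab)) / n / variance


def bootstrap_ci(y_a, y_b, y_ab, n_resamples=100, level=0.95, seed=0):
    """Percentile bootstrap interval for the first-order index."""
    rng = random.Random(seed)
    n = len(y_a)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # rows drawn with replacement
        estimates.append(first_order_index([y_a[j] for j in idx],
                                           [y_b[j] for j in idx],
                                           [y_ab[j] for j in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * (n_resamples - 1))]
    upper = estimates[int((1 + level) / 2 * (n_resamples - 1))]
    return lower, upper


def interaction_ratio(first_order, total_order):
    """Sum of total-order indices divided by sum of first-order indices."""
    return sum(total_order) / sum(first_order)
```

A ratio close to 1 means the inputs act almost additively, while a ratio well above 1 signals that a large share of the output variance comes from interactions between inputs.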
This section discusses the sensitivity analysis results obtained under Chinese climatic conditions (Section 3.1). When applying the model to other regions, a sensitivity analysis based on the climatic characteristics unique to that region would need to be performed.

4.2. Annual PUE Estimation Analysis

Based on the annual PUE values obtained from hourly meteorological data, we drew boxplots by season. Figure 4 lists the results for Guangzhou, Guiyang, and Mohe.
Figure 4 shows that the average PUE value is smaller in places where the average annual temperature is lower. The uncertainty range of the annual simulation results is large, so it is more informative to distinguish the results by season: the PUE is higher in summer (S2) and lower in winter (S4). Figure 5 shows scatter plots of the estimated PUE values for Nanjing and Harbin by quarter. Overall, Nanjing’s estimated PUE is higher than Harbin’s. Because temperatures are higher in summer, the PUE values of both cities are relatively high in summer and lower in the first quarter. In the fourth quarter, the difference in PUE estimates between the two cities is even more pronounced, indicating that a large-scale air-cooled data center built in Harbin has better cooling conditions when utilizing free cooling sources. The same reasoning applies to other regions: in regions with relatively higher temperatures and humidity levels, data centers require more energy. Reports of PUE values measured in existing data centers confirm this trend. Lei et al. [37] compared and evaluated the model using data from 17 HDCs operated by Google and Facebook. Most of the reported values fell within the 50% prediction interval, and almost all fell within the 90% prediction interval, supporting the accuracy of the PUE prediction model. Chinese data centers usually adopt a cooling method that combines a cooling tower and a plate heat exchanger. However, since real-time or hourly PUE tracking data from Chinese data centers have not been released, the annual PUE (WEC) values predicted in this paper were compared with actual data reported from different places to confirm the accuracy of this model.
The model has been verified with data from several countries, but no one has applied it to the Asia-Pacific region. We predicted annual PUE values for some cities in Australia, Japan, and Russia. Taking Adelaide and Sapporo as examples in Figure 6, we modified the parameters according to the actual local conditions, and the results are in line with the reported values and our expectations.
We also conducted experiments on some regions in the US not mentioned in [37]. All scatter plots are shown in Appendix A.

4.3. Carbon Emissions Prediction

This subsection and Section 4.4 discuss China as an example. The analysis method for other regions is the same.
Table 3 was taken from the “Research Report on China’s Carbon Neutrality Before 2060” released by the Global Energy Interconnection Development and Cooperation Organization (GEIDCO) [45]. Assuming that the energy mix used by the data center is similar to that in the table, the carbon emissions can be roughly estimated. If the CEF of biomass and other energy sources is taken to be 10 gCO₂eq·kWh⁻¹, and the carbon emissions of oxygen-fired units are considered, then the estimated CUE of the data center in 2020, 2030, and 2060 will be 506.705 × PUE, 311.739 × PUE, and 53.738 × PUE, respectively.
It can be inferred that the estimated carbon emissions of data centers in China in 2030 will be about 60% of those in 2020, and CO₂ emissions are expected to fall by nearly 90% by 2060, without considering climate change or the optimization of energy-saving technology in data centers. By substituting the predicted PUE values of different regions, a comparative analysis across regions can also be carried out.
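The percentages above follow from the Green Grid relation CUE = CEF × PUE [41]: for a fixed PUE, emissions scale with the grid-mix coefficient. The sketch below reproduces that arithmetic. The function names and the example PUE of 1.5 are illustrative assumptions, and the coefficients are assumed to be in gCO₂eq per kWh, matching the biomass CEF quoted in the text.

```python
# Grid-mix coefficients of PUE reported in the text for each year.
# Units are assumed to be gCO2eq per kWh (an assumption of this sketch).
CEF_MIX = {2020: 506.705, 2030: 311.739, 2060: 53.738}


def cue(cef_mix: float, pue: float) -> float:
    """CUE = CEF x PUE (Green Grid definition)."""
    return cef_mix * pue


def ratio_to_2020(year: int) -> float:
    """Emissions in `year` relative to 2020 for the same PUE and IT load."""
    return CEF_MIX[year] / CEF_MIX[2020]


if __name__ == "__main__":
    example_pue = 1.5  # hypothetical value, for illustration only
    for year, coefficient in CEF_MIX.items():
        print(year, round(cue(coefficient, example_pue), 1),
              f"{ratio_to_2020(year):.1%}")
```

Because the PUE cancels in the ratio, the 2030 and 2060 figures relative to 2020 depend only on the projected energy mix, which is why the text can state them without fixing a location.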
COVID-19 has highlighted the important roles of digital technology, the digital industry, and digital services in the operation of the economy and society. In the post-pandemic era, people’s production and lifestyles have undergone profound changes. The numbers of data centers and racks have increased dramatically, and electricity demand has grown rapidly. Increasing clean energy installations and maximizing their power generation are essential to decarbonizing the entire electricity industry. The spatiotemporal controllability of part of the power load in HDCs is conducive to promoting renewable energy consumption. Coal power harms the environment, contributes to climate change, and offers limited economic benefits. Therefore, phasing out coal is the most direct and effective measure for achieving a green, low-carbon power structure.

4.4. Electricity Cost Estimation

To simplify the calculation, we used the general industrial and commercial sales prices under 10 kV from various provinces and cities in 2019 as the electricity fee calculation parameters (Table 4). The data were collected from the local Development and Reform Commission, the Price Bureau, and other departments.
Assuming that the total annual energy consumption of IT equipment in the preconstructed data center was 100 million kWh, the calculation was performed using the estimated PUE for each region in 2019. For situations where new HDCs are built in various places, the estimated electricity costs are shown in Table 5. The top 5 regions in descending order are presented here, and the full dataset is presented in Table A2 in Appendix A.
As shown in Table 5, even if some areas consume less energy, the calculated electricity cost is relatively high because of the higher electricity price. The top 5 regions in ascending order are shown here; the full dataset is presented in Table A3.
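The cost estimate described above amounts to multiplying the annual IT energy by the regional PUE and the local unit price. The following sketch shows that calculation under stated assumptions: the function name is hypothetical, the two PUE values are invented for illustration and are not results of this paper, and the prices are the Table A2 entries for Inner Mongolia (East) and Hebei (North).

```python
IT_ENERGY_KWH = 100_000_000  # annual IT equipment energy assumed in the text


def annual_cost_musd(pue: float, price_usd_per_kwh: float,
                     it_energy_kwh: float = IT_ENERGY_KWH) -> float:
    """Total facility energy (IT energy x PUE) times unit price, in $1M per year."""
    return it_energy_kwh * pue * price_usd_per_kwh / 1e6


if __name__ == "__main__":
    # (hypothetical PUE, Table A2 price in $ per kWh)
    regions = {
        "Inner Mongolia (East)": (1.30, 0.1084),
        "Hebei (North)": (1.40, 0.0831),
    }
    for name, (pue, price) in regions.items():
        print(f"{name}: {annual_cost_musd(pue, price):.3f} M$ per year")
```

With these illustrative inputs, the region with the lower PUE still has the higher annual bill, which is the effect the text attributes to electricity price differences.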

5. Conclusions

This study proposed a framework to predict the overall energy consumption of HDCs with air-cooled IT equipment. According to the PUE predicted from the location and the internal structure of data centers from the point of view of IT equipment energy consumption, the total energy consumption can be calculated, and the carbon emissions and electricity costs can be forecast. Using the hourly meteorological data in the NOAA Integrated Surface Database (ISD) as climate parameters, the annual PUE values and the electricity cost of data centers to be built in 49 regions in China were analyzed. We also conducted an experiment involving 11 regions in other countries to extend the generality of our framework. Compared with the data presented in actual reports, our framework performed well. Our results show that climate is an important factor that impacts the energy consumption of data centers with consideration of free cooling. Generally, building HDCs in areas with lower temperatures takes advantage of free cooling and could save energy costs and improve the economic efficiency. The UPS efficiency also has a large impact on the results of the model. Data centers can improve their overall energy efficiency by increasing the efficiency of the UPS. Compared with [37], we found that when some parameters are modified according to the characteristics of regions, the sensitivity indices and their sequence will change. This reflects the impact of location factors on data center construction.
According to the results, some regions have low annual PUE values with high electricity costs and unreasonable energy structures. This means that PUE should not be seen as the only criterion for measuring the quality of data centers. With the improvement of policies and people’s awareness of environmental protection, the cost of carbon trading and climate change need to be considered in the construction of data centers as well. Therefore, it is necessary to coordinate the cost factors when considering the construction of HDCs and consider the comprehensive benefits, social impact, and environmental friendliness in general.

Author Contributions

Conceptualization, J.L.; methodology, Y.Z. and J.L.; formal analysis, Y.Z.; resources, J.L.; data curation, Y.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, Y.Z. and J.L.; visualization, Y.Z.; supervision, J.L.; project administration, J.L. All authors have read and agreed to the published version of the manuscript.


The APC was funded by AI Research Institute, Harbin Institute of Technology.

Data Availability Statement

[Integrated Surface Dataset (Global)] NOAA National Centers for Environmental Information. 2001. Integrated Surface Dataset (Global); NCEI DSI 3505_03.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Sobol Sensitivity Index (WEC).
Parameter | Total Sensitivity Index
Outdoor Dry Bulb Temperature | 7.17 × 10⁻¹
Outdoor Relative Humidity | 3.07 × 10⁻¹
UPS efficiency | 2.02 × 10⁻¹
Supply air dry bulb set point | 4.38 × 10⁻²
Chiller partial load factor | 2.64 × 10⁻²
Heat exchanger effectiveness (CRAC cooling coils) | 2.05 × 10⁻²
Approach temperature (cooling tower) | 1.53 × 10⁻²
Percentage of power loss in power transformation and distribution system | 8.27 × 10⁻³
Temperature difference (supply/return facility system water) | 7.04 × 10⁻³
COP relative error to regressed value | 3.72 × 10⁻³
Fan pressure (CRAC) | 3.60 × 10⁻³
Temperature difference (supply/return CRAC) | 3.29 × 10⁻³
Fan efficiency (CRAC) | 2.39 × 10⁻³
Temperature difference (supply/return cooling tower water) | 1.65 × 10⁻³
Fan pressure (cooling tower) | 1.51 × 10⁻³
Liquid–gas ratio (cooling tower) | 1.43 × 10⁻³
Atmospheric pressure | 1.30 × 10⁻³
Approach temperature (economizer heat exchanger) | 1.06 × 10⁻³
Pump pressure (cooling tower) | 7.36 × 10⁻⁴
Pump efficiency (cooling tower) | 3.84 × 10⁻⁴
Sensible heat ratio (SHR) | 2.05 × 10⁻⁴
Pump pressure (waterside economizer pump) | 1.42 × 10⁻⁴
Fan efficiency (cooling tower) | 1.31 × 10⁻⁴
Lighting power to IT power ratio | 7.94 × 10⁻⁵
Pump efficiency (waterside economizer pump) | 7.43 × 10⁻⁵
Pump pressure (chiller pump) | 1.42 × 10⁻⁵
Pump efficiency (chiller pump) | 7.12 × 10⁻⁶
Pump efficiency (humidification pump) | 3.09 × 10⁻⁸
Pump pressure (humidification pump) | 1.50 × 10⁻⁸

Parameter | First Order Sensitivity
Outdoor Dry Bulb Temperature | 4.20 × 10⁻¹
UPS efficiency | 2.02 × 10⁻¹
Outdoor Relative Humidity | 4.29 × 10⁻²
Supply air dry bulb set point | 8.15 × 10⁻³
Percentage of power loss in power transformation and distribution system | 8.06 × 10⁻³
Fan pressure (CRAC) | 3.56 × 10⁻³
Fan efficiency (CRAC) | 2.51 × 10⁻³
Chiller partial load factor | 2.38 × 10⁻³
Heat exchanger effectiveness (CRAC cooling coils) | 2.23 × 10⁻³
Temperature difference (supply/return facility system water) | 2.08 × 10⁻³
Temperature difference (supply/return cooling tower water) | 1.77 × 10⁻³
Fan pressure (cooling tower) | 1.44 × 10⁻³
Atmospheric pressure | 1.08 × 10⁻³
COP relative error to regressed value | 9.52 × 10⁻⁴
Liquid–gas ratio (cooling tower) | 7.90 × 10⁻⁴
Pump efficiency (cooling tower) | 3.14 × 10⁻⁴
Sensible heat ratio (SHR) | 2.67 × 10⁻⁴
Temperature difference (supply/return CRAC) | 1.92 × 10⁻⁴
Lighting power to IT power ratio | 1.61 × 10⁻⁴
Approach temperature (economizer heat exchanger) | 1.47 × 10⁻⁴
Pump pressure (waterside economizer pump) | 1.26 × 10⁻⁴
Pump efficiency (waterside economizer pump) | 4.67 × 10⁻⁵
Pump efficiency (chiller pump) | 1.28 × 10⁻⁵
Pump pressure (chiller pump) | 1.14 × 10⁻⁵
Pump pressure (humidification pump) | 4.24 × 10⁻⁸
Pump efficiency (humidification pump) | −1.68 × 10⁻⁸
Fan efficiency (cooling tower) | −6.52 × 10⁻⁵
Approach temperature (cooling tower) | −1.63 × 10⁻³
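
A total-order Sobol index includes a parameter's interactions with other parameters, whereas the first-order index captures only its individual effect, so the gap between the two gives a rough indication of how much of the influence comes through interactions. The sketch below applies this to three rows of Table A1. The function name is hypothetical, and clipping negative differences to zero is a choice made for this sketch (slightly negative Sobol estimates are a common artifact of Monte Carlo sampling).

```python
# (total-order index, first-order index) copied from Table A1
TABLE_A1 = {
    "Outdoor Dry Bulb Temperature": (7.17e-1, 4.20e-1),
    "Outdoor Relative Humidity": (3.07e-1, 4.29e-2),
    "UPS efficiency": (2.02e-1, 2.02e-1),
}


def interaction_effect(total: float, first: float) -> float:
    """Total-order minus first-order index, clipped at zero."""
    return max(total - first, 0.0)


if __name__ == "__main__":
    for name, (total, first) in TABLE_A1.items():
        print(f"{name}: {interaction_effect(total, first):.4f}")
```

Read this way, UPS efficiency acts almost entirely on its own, while most of the influence of outdoor relative humidity arises in combination with other parameters.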
Table A2. Current electricity price for general industrial and commercial use in various regions.
Area | Current Electricity Price ¹ ($·kWh⁻¹)
Inner Mongolia (East) | 0.1084
Hebei (South) | 0.0844
Inner Mongolia (West) | 0.0853
Hebei (North) | 0.0831
¹ "$" here is the price of USD in 2022.
Table A3. Estimated average annual electricity cost of data centers in various regions.
Area | Electricity Cost ¹ ($1M per year)
¹ "$" here is the price of USD in 2022.
Figure A1. Annual PUE estimation.


References

  1. Miller, R. The Sustainability Imperative: Green Data Centers and Our Cloudy Future. 2021. Available online: (accessed on 25 March 2022).
  2. Liu, Y.; Wei, X.; Xiao, J.; Liu, Z.; Xu, Y.; Tian, Y. Energy consumption and emission mitigation prediction based on data center traffic and PUE for global data centers. Glob. Energy Interconnect. 2020, 3, 272–282.
  3. Ahmed, K.M.U.; Bollen, M.H.; Alvarez, M. A Review of Data Centers Energy Consumption And Reliability Modeling. IEEE Access 2021, 9, 152536–152563.
  4. Dayarathna, M.; Wen, Y.; Fan, R. Data Center Energy Consumption Modeling: A survey. IEEE Commun. Surv. Tutor. 2016, 18, 732–794.
  5. Rambo, J.; Joshi, Y. Modeling of data center airflow and heat transfer: State of the art and future trends. Distrib. Parallel Databases 2007, 21, 193–225.
  6. Beloglazov, A.; Buyya, R.; Lee, Y.C.; Zomaya, A. A taxonomy and survey of energy-efficient data centers and cloud computing systems. In Advances in Computers; Elsevier: Amsterdam, The Netherlands, 2011; Volume 82, pp. 47–111.
  7. Jiye, W.; Biyu, Z.; Fa, Z.; Xiang, S.; Nan, Z.; Zhiyong, L. Data center energy consumption models and energy efficient algorithms. J. Comput. Res. Dev. 2019, 56, 1587.
  8. Niederreiter, H.; Winterhof, A. Quasi-monte carlo methods. In Applied Number Theory; Springer: Berlin/Heidelberg, Germany, 2015; pp. 185–306.
  9. Nossent, J.; Elsen, P.; Bauwens, W. Sobol’ sensitivity analysis of a complex environmental model. Environ. Model. Softw. 2011, 26, 1515–1525.
  10. Zhang, H.; Shao, S.; Xu, H.; Zou, H.; Tian, C. Free cooling of data centers: A review. Renew. Sustain. Energy Rev. 2014, 35, 171–182.
  11. Yuventi, J.; Mehdizadeh, R. A critical analysis of Power Usage Effectiveness and its use in communicating data center energy consumption. Energy Build. 2013, 64, 90–94.
  12. Choo, K.; Galante, R.M.; Ohadi, M.M. Energy consumption analysis of a medium-size primary data center in an academic campus. Energy Build. 2014, 76, 414–421.
  13. Roy, S.; Rudra, A.; Verma, A. An energy complexity model for algorithms. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science, Berkeley, CA, USA, 9–12 January 2013; pp. 283–304.
  14. Tudor, B.M.; Teo, Y.M. On understanding the energy consumption of arm-based multicore servers. In Proceedings of the ACM Sigmetrics/International Conference on Measurement and Modeling of Computer Systems, London, UK, 11–15 June 2013; pp. 267–278.
  15. Ge, R.; Feng, X.; Cameron, K.W. Modeling and evaluating energy-performance efficiency of parallel processing on multicore based power aware systems. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Chengdu, China, 10–12 August 2009; pp. 1–8.
  16. Song, S.L.; Barker, K.; Kerbyson, D. Unified performance and power modeling of scientific workloads. In Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, Denver, CO, USA, 17–21 November 2013; pp. 1–8.
  17. Gao, Y.; Guan, H.; Qi, Z.; Wang, B.; Liu, L. Quality of service aware power management for virtualized data centers. J. Syst. Archit. 2013, 59, 245–259.
  18. Fan, X.; Weber, W.D.; Barroso, L.A. Power provisioning for a warehouse-sized computer. ACM Sigarch Comput. Archit. News 2007, 35, 13–23.
  19. Gu, C.; Li, Z.; Huang, H.; Jia, X. Energy efficient scheduling of servers with multi-sleep modes for cloud data center. IEEE Trans. Cloud Comput. 2018, 8, 833–846.
  20. Kim, W.; Gupta, M.S.; Wei, G.Y.; Brooks, D. System level analysis of fast, per-core DVFS using on-chip switching regulators. In Proceedings of the 2008 IEEE 14th International Symposium on High Performance Computer Architecture, Salt Lake City, UT, USA, 16–20 February 2008; pp. 123–134.
  21. Van Heddeghem, W.; Lambert, S.; Lannoo, B.; Colle, D.; Pickavet, M.; Demeester, P. Trends in worldwide ICT electricity consumption from 2007 to 2012. Comput. Commun. 2014, 50, 64–76.
  22. Chi, C.; Ji, K.; Marahatta, A.; Song, P.; Zhang, F.; Liu, Z. Jointly optimizing the IT and cooling systems for data center energy efficiency based on multi-agent deep reinforcement learning. In Proceedings of the Eleventh ACM International Conference on Future Energy Systems, Melbourne, QC, Australia, 22–26 June 2020; pp. 489–495.
  23. Wang, J.; Zhou, B.; Liu, W.; Hu, S. Research progress and development trend of cross-layer energy efficiency optimization in data centers. Sci. Sin. Inf. 2020, 50, 1–24.
  24. Wan, J.; Gui, X.; Kasahara, S.; Zhang, Y.; Zhang, R. Air flow measurement and management for improving cooling and energy efficiency in raised-floor data centers: A survey. IEEE Access 2018, 6, 48867–48901.
  25. Li, L.; Liang, C.J.M.; Liu, J.; Nath, S.; Terzis, A.; Faloutsos, C. Thermocast: A cyber-physical forecasting model for datacenters. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1370–1378.
  26. Hsu, Y.F.; Matsuda, K.; Matsuoka, M. Self-aware workload forecasting in data center power prediction. In Proceedings of the 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Washington, DC, USA, 1–4 May 2018; pp. 321–330.
  27. Gao, J. Machine Learning Applications for Data Center Optimization. 2014. Available online: (accessed on 11 March 2022).
  28. Brady, G.A.; Kapur, N.; Summers, J.L.; Thompson, H.M. A case study and critical assessment in calculating power usage effectiveness for a data centre. Energy Convers. Manag. 2013, 76, 155–161.
  29. Gozcu, O.; Ozada, B.; Carfi, M.U.; Erden, H.S. Worldwide energy analysis of major free cooling methods for data centers. In Proceedings of the 2017 16th IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), Marina, San Diego, CA, USA, 31 May–1 June 2017; pp. 968–976.
  30. Chi, Y.Q.; Summers, J.; Hopton, P.; Deakin, K.; Real, A.; Kapur, N.; Thompson, H. Case study of a data centre using enclosed, immersed, direct liquid-cooled servers. In Proceedings of the 2014 Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), San Jose, CA, USA, 9–13 March 2014; pp. 164–173.
  31. Sharafi, M.; ElMekkawy, T.Y.; Bibeau, E.L. Optimal design of hybrid renewable energy systems in buildings with low to high renewable energy ratio. Renew. Energy 2015, 83, 1026–1042.
  32. Shuja, J.; Gani, A.; Shamshirband, S.; Ahmad, R.W.; Bilal, K. Sustainable cloud data centers: A survey of enabling techniques and technologies. Renew. Sustain. Energy Rev. 2016, 62, 195–214.
  33. Ahmad, R.W.; Gani, A.; Hamid, S.H.A.; Shiraz, M.; Yousafzai, A.; Xia, F. A survey on virtual machine migration and server consolidation frameworks for cloud data centers. J. Netw. Comput. Appl. 2015, 52, 11–25.
  34. Nehra, P.; Nagaraju, A. Sustainable Energy Consumption Modeling for Cloud Data Centers. In Proceedings of the 2019 IEEE 5th International Conference for Convergence in Technology (I2CT), Bombay, India, 29–31 March 2019; pp. 1–4.
  35. Yamini, B.; Selvi, D.V. Cloud virtualization: A potential way to reduce global warming. In Proceedings of the Recent Advances in Space Technology Services and Climate Change 2010 (RSTS & CC-2010), Chennai, India, 13–15 November 2010; pp. 55–57.
  36. Zhang, Z.; Fu, S. Characterizing power and energy usage in cloud computing systems. In Proceedings of the 2011 IEEE Third International Conference on Cloud Computing Technology and Science, Athens, Greece, 29 November–1 December 2011; pp. 146–153.
  37. Lei, N.; Masanet, E. Statistical analysis for predicting location-specific data center PUE and its improvement potential. Energy 2020, 201, 117556.
  38. Luo, L.; Wu, W.J.; Zhang, F. Energy modeling based on cloud data center. J. Softw. 2014, 25, 1371–1387.
  39. Shahid, A.; Fahad, M.; Manumachu, R.R.; Lastovetsky, A. A comparative study of techniques for energy predictive modeling using performance monitoring counters on modern multicore CPUs. IEEE Access 2020, 8, 143306–143332.
  40. Shahid, A.; Fahad, M.; Manumachu, R.R.; Lastovetsky, A. Energy Predictive Models of Computing: Theory, Practical Implications and Experimental Analysis on Multicore Processors. IEEE Access 2021, 9, 63149–63172.
  41. Azevedo, D.; Patterson, M.; Pouchet, J.; Tipley, R. Carbon usage effectiveness (CUE): A green grid data center sustainability metric. Green Grid 2010, 32, 4–5.
  42. Deng, W.; Liu, F.M.; Jin, H.; Li, D. Leveraging renewable energy in cloud computing datacenters: State of the art and future research. Jisuanji Xuebao (Chin. J. Comput.) 2013, 36, 582–598.
  43. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
  44. Sobol, I.M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 2001, 55, 271–280.
  45. GEIDCO. China Carbon Neutrality Research Report to 2060. 2021. Available online: (accessed on 1 April 2022).
Figure 1. Data center global energy consumption prediction framework. (a) is used for PUE prediction, and (b) is applied for IT equipment energy consumption.
Figure 2. Generation of example vectors.
Figure 3. Total order sensitivity indices greater than 0.01.
Figure 4. Annual PUE estimation in 3 cities.
Figure 5. Typical annual PUE estimation of 2 cities in China by season.
Figure 6. Typical annual PUE estimation of 2 cities worldwide by season.
Table 1. CEF of common electric energy sources.
Energy Type | Carbon Emission Factor (kgCO₂e·kWh⁻¹)
Natural Gas | 440
Solar Energy | 53
Wind Energy | 29
Nuclear Energy | 15
Table 2. Unit energy expenditure and carbon emission factor.
Energy | Price per Unit ($·kWh⁻¹) | Carbon Emission Factor (kgCO₂e·kWh⁻¹)
Electricity Grid5.0586
Table 3. Installed power generation and structure in China from 2020 to 2060 (in 100 million kWh).
Table 4. Current electricity prices for general industrial and commercial use in various regions.
Area | Current Electricity Price ¹ ($·kWh⁻¹)
¹ "$" here is the price of USD in 2022.
Table 5. Top 5 estimated average annual electricity cost of data centers in various regions.
Area | Electricity Cost ¹ ($1M per year)
¹ "$" here is the price of USD in 2022.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Zhang, Y.; Liu, J. Prediction of Overall Energy Consumption of Data Centers in Different Locations. Sensors 2022, 22, 3704.

AMA Style

Zhang Y, Liu J. Prediction of Overall Energy Consumption of Data Centers in Different Locations. Sensors. 2022; 22(10):3704.

Chicago/Turabian Style

Zhang, Yiliu, and Jie Liu. 2022. "Prediction of Overall Energy Consumption of Data Centers in Different Locations" Sensors 22, no. 10: 3704.
