Predicting Electric Vehicle Consumption: A Hybrid Physical-Empirical Model

Deschênes, Anthony; Gaudreault, Jonathan; Rioux-Paradis, Kim; Redmont, Chloé

doi:10.3390/wevj11010002

Predicting Electric Vehicle Consumption: A Hybrid Physical-Empirical Model^†

¹ CRISI Research Consortium for Industry 4.0 Systems Engineering, Université Laval, Quebec City, QC G1V 0A6, Canada

² École Polytechnique Montreal, Montreal, QC H3T 1J4, Canada

^* Author to whom correspondence should be addressed.

^† This paper is an extended version of our paper presented at the 32nd International Electric Vehicle Symposium 2019 (EVS 32), Lyon, France, 19–22 May 2019.

World Electr. Veh. J. 2020, 11(1), 2; https://doi.org/10.3390/wevj11010002

Submission received: 30 September 2019 / Revised: 13 December 2019 / Accepted: 13 December 2019 / Published: 18 December 2019

(This article belongs to the Special Issue A World of E MOTION—Selected Papers from the 32nd International Electric Vehicles Symposium and Exhibition (Lyon, France))

Abstract

Electric vehicles are becoming more important in our society. Using them in a fleet to minimize energy cost is, therefore, a compelling opportunity for taxi companies. It is crucial to develop accurate models that estimate energy consumption for traveling from one point to another. Consumption can be estimated using a physical model, but such a model fails to fit real-world data, especially in taxi-driving conditions. We compare different approaches to learn from historical data in order to correct/improve the physical model. Similar techniques can be used to estimate consumption for a new vehicle model, which can be useful for companies that want to add a new vehicle model for which they do not have historical data.

Keywords:

energy consumption; fleet; electric vehicle (EV); efficiency; taxi

1. Introduction

TEO Taxi (Taxelco Inc., Montreal, QC, Canada) is a company that runs a fleet of 100% electric taxis (approximately 170 cars). This leads to cost reduction and a reduction of greenhouse gas emissions [1]. However, the electric vehicles used by the company (Nissan LEAF, Kia Soul EV, Tesla Model S and Tesla Model X) do not offer as much range as conventional internal combustion engine cars. Since official ratings lack precision, especially in winter conditions [2,3,4], an accurate energy consumption prediction model is a real concern for the company. Such a model is needed to optimize the usage of each vehicle the company owns and to inform future acquisitions. TEO Taxi also wants to use the developed model to predict consumption more accurately for new/unknown vehicle models early on.

Such models do exist (e.g., [5,6]), but they are not adapted to managing electric taxis, especially in winter-driving conditions. Some of them rely on a physical model but do not take into account taxi-specific driving constraints such as frequent stops, doors being opened and closed frequently, and intense urban driving. Another very important factor is the temperature in which the fleet operates, which ranges from −30 °C to 30 °C. The range of any electric vehicle declines in cold temperatures [3,7,8], and the effect is worse in taxi driving, where doors open frequently. This leads to heat loss and to an increase in the energy consumed to maintain a comfortable temperature in the cabin.

Inspired by [5], our main goal is to adapt a physical model to take into consideration the various external factors that affect electric vehicle consumption in a taxi-driving context. Two objectives are considered: (1) to enhance the accuracy of a physical model for a known vehicle model by applying linear regression to historical data, and (2) to allow a better prediction by a physical model for a new/unknown vehicle model added to the fleet. We explore different models and compare them to determine which one is the most accurate for each of our objectives.

The present paper is an extended version of [9]. It is organized as follows. In Section 2, we present a literature review on the prediction of electric vehicle consumption in the context of taxi-driving. Section 3 presents a dataset containing more than 100,000 taxi rides over one year. The database from the company contains information about energy consumption, GPS position, etc. and is combined with a public database in order to include weather conditions and elevation. Section 4 describes various approaches used to build predictive models: physical models, physical models corrected by fitting historical data, models based solely on historical data, hybrid models and models with interaction terms. In Section 5.1, we compare the models according to their ability to predict electric taxi consumption. Finally, Section 5.3 presents the results for predicting electric vehicle consumption for a new/unknown vehicle.

2. Literature Review

Energy consumption is important for the fleet management of conventional combustion engine cars and is a key component of managing a fleet of electric vehicles. The limited range of electric vehicles and its variability amplify its importance [8,10]. Existing work on fleets of electric vehicles mostly focuses on charging strategies (e.g., [11]) and dispatch strategies (e.g., [12,13]). The simulations used in such work call for an accurate energy consumption model, but current models do not take into consideration the particularities of taxi-driving.

Electric vehicle consumption prediction encompasses two related problems: to predict the range of an electric vehicle [14] or the consumption for a given trip [15]. Consumption is affected by various external factors such as temperature, route condition, and driving style [3,7,16]. It has been shown that these factors have a significant impact on energy consumption for a fleet [2].

De Cauwer et al. [5] proposed to adjust/correct physical models by learning from empirical data. They started with a physical model based on a wheel power equation. Multiplying by a distance d gives the energy consumption for a very short distance traveled over a short duration (Equation (1)). A short duration is necessary to fully retain the dynamics that affect an electric vehicle.

E = \frac{1}{3600} \left( m g \left( f \cos θ + \sin θ \right) + \frac{1}{2} ρ C_{x} A \left( \frac{v}{3.6} \right)^{2} + m a \right) \cdot d

(1)

\begin{array}{ll}
E & = \text{Energy required to travel distance } d \text{ (kWh)} \\
m & = \text{Mass of vehicle (kg)} \\
g & = \text{Gravitational acceleration (m/s}^{2}\text{)} \\
f & = \text{Rolling resistance of vehicle (-)} \\
θ & = \text{Road angle in radians (-)} \\
ρ & = \text{Air density (kg/m}^{3}\text{)} \\
C_{x} & = \text{Drag coefficient of vehicle (-)} \\
A & = \text{Frontal area of vehicle (m}^{2}\text{)} \\
v & = \text{Speed of vehicle (km/h)} \\
a & = \text{Acceleration of vehicle (m/s}^{2}\text{)} \\
d & = \text{Distance driven (km)}
\end{array}

The first two terms, both scaled by $m g$, are the rolling resistance ($f \cos θ$) and the potential energy ($\sin θ$). The third one ($\frac{1}{2} ρ C_{x} A (\frac{v}{3.6})^{2}$) is the aerodynamic loss and $m a$ is the loss or regeneration caused by acceleration.
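As a minimal sketch of how Equation (1) can be evaluated, the function below computes the three force terms in newtons and converts force times distance (N·km, i.e., kJ) to kWh. The function name, argument names and the numerical values in the example are illustrative assumptions and are not taken from the paper.

```python
import math

def wheel_energy_kwh(m, f, theta, rho, c_x, area, v_kmh, a, d_km, g=9.81):
    """Energy (kWh) needed to travel d_km, following the form of Equation (1)."""
    rolling_and_grade = m * g * (f * math.cos(theta) + math.sin(theta))  # N
    aero = 0.5 * rho * c_x * area * (v_kmh / 3.6) ** 2                   # N
    inertia = m * a                                                      # N
    # N * km = kJ, and 1 kWh = 3600 kJ
    return (rolling_and_grade + aero + inertia) * d_km / 3600.0

# Made-up parameters for illustration only (flat road, constant 36 km/h)
energy = wheel_energy_kwh(m=1500, f=0.01, theta=0.0, rho=1.2, c_x=0.3,
                          area=2.3, v_kmh=36.0, a=0.0, d_km=1.0)
```

With these made-up values, the rolling term is about 147 N and the aerodynamic term about 41 N, which illustrates why rolling resistance dominates at low urban speeds.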

They improved the model to include vehicle dynamics by applying a regression to the historical data (Equation (2)). They used multiple linear regression (MLR) to determine the values of the coefficients $B_{i}$. They also added a term for accessories consumption.

E_{EV} = B_{1} s + B_{2} (v_{EV} + v_{w})^{2} + B_{3} a s + B_{4} h + B_{5} \mathit{Aux}_{T} \mathit{Aux}_{t} t

(2)

\begin{array}{ll}
\mathit{Aux}_{T} & = \text{Temperature scaling} \\
\mathit{Aux}_{t} & = \text{Fraction of time the auxiliaries are switched on} \\
t & = \text{Time} \\
s & = \text{Distance}
\end{array}

They also studied how events such as traffic lights, days of the week, traffic, driving dynamics, etc. affect the consumption of micro driving segments. This relation being non-linear, they used a neural network for this.

The present article focuses on predicting the global consumption of each taxi trip, since we do not have the dynamic driving information needed to predict the consumption associated with individual micro driving segments. We instead take into consideration additional factors (e.g., winter tires, vehicle age, etc.) as well as interactions between factors. This can be done while keeping the model linear.

3. The Dataset: Working with Real Empirical Data

We have access to historical data for all TEO Taxi’s vehicle models (Nissan LEAF, Kia Soul, Tesla Model S and Tesla Model X) for a year of operation. Coupled with public datasets from Environment Canada and NASA, we were able to set up the dataset described in Table 1. It contains information about the distance driven, speed, elevation, trip duration, temperature, wind speed, wind direction, starting and ending state of charge (SOC), vehicle id, vehicle model, date, driver, odometer, battery capacity, and vehicle positions. Each ride is divided into steps of approximately 3 s containing information obtained by a Fleet Carma data logger. The length of the steps does not allow fully capturing the exact driving dynamics, but it should allow a good enough approximation to develop models that adapt to taxi-driving conditions.

Cleaning the Data

Carefully inspecting the speed of consecutive steps of a given ride, we found some inconsistencies. The timestamps of the GPS points were not evenly spread in time as they should have been. The timestamps were therefore corrected and a moving average was applied to smooth what was identified as unexplained abnormalities. Taxi rides have an average duration of 20 min [17], and it was established with the company that rides of more than one hour were most likely the company bringing the car to a garage or similar. The same holds for rides of only a few seconds: they correspond to the driver starting, stopping and restarting the system, probably involuntarily. A known logging error occurs when a vehicle passes through a tunnel: the log records the passage as taking only 2–8 s, which results in speeds of around 900 to 2000 km/h depending on the case. We also considered that a driver in normal taxi-driving circumstances would never go faster than 130 km/h. A total of 28,283 out of 193,347 rides were removed.
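The removal rules above can be sketched as a simple plausibility filter. This is an illustration only: the field names and the 30 s lower bound are assumptions (the text only says "a few seconds"), while the one-hour and 130 km/h limits come from the description above.

```python
def is_valid_ride(duration_s, step_speeds_kmh,
                  min_duration_s=30, max_duration_s=3600, max_speed_kmh=130):
    """Return True if a ride passes the plausibility checks described above."""
    if duration_s < min_duration_s:      # accidental start/stop of the system
        return False
    if duration_s > max_duration_s:      # likely a garage trip, not a fare
        return False
    # Any step above the speed limit is treated as a logging error (e.g., tunnel)
    if any(v > max_speed_kmh for v in step_speeds_kmh):
        return False
    return True

# Made-up rides for illustration
rides = [
    {"id": "a", "duration_s": 1200, "speeds": [0, 35, 52, 48, 0]},
    {"id": "b", "duration_s": 5,    "speeds": [0, 0]},
    {"id": "c", "duration_s": 900,  "speeds": [40, 950, 45]},
    {"id": "d", "duration_s": 4000, "speeds": [30, 60, 30]},
]
kept = [r["id"] for r in rides if is_valid_ride(r["duration_s"], r["speeds"])]
```

Only ride "a" survives here: "b" is too short, "c" contains a tunnel-like speed spike, and "d" exceeds one hour.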

The company wants to evaluate the accuracy of the models for different ride lengths. Table 2 presents the number of rides in the dataset per distance cluster of 5 km. Table 3 presents the average consumption per ride (kWh) for each vehicle model.

4. Models

This section presents the various models we evaluate in Section 5.1 and discusses their particularities.

4.1. Basic Physical Model

The first model is the Basic Physical Model from [5] (Equation (1)). It omits aspects, such as powertrain efficiency, that are needed to predict energy consumption accurately [3]. It fails to fit real-world data, but it serves as a reference for the other models.

4.2. Extended Physical Model

This model is the same as the previous one, but with an additional term related to air conditioning and accessories (term $\mathit{Aux}_{T} \mathit{Aux}_{t}$ from Equation (2)). $\mathit{Aux}_{t}$ was defined as 1.0 (always on). The temperature term ($\mathit{Aux}_{T}$) expresses energy as a function of the outside temperature (in degrees Celsius). We defined $\mathit{Aux}_{T}$ as a non-linear function (Figure 1) derived from historical data. The function is divided into five segments, and the coordinates of each breakpoint were determined by fitting the data with a solver performing a least-squares minimization. The logic behind this function is the following: the colder it is, the more energy is needed to heat the cabin. There is a point at which the heating component reaches its maximal energy consumption. As the temperature rises, there is a point from which air conditioning is used, which takes less energy than heating [18]. Finally, there is a point where neither air conditioning nor heating is used, but other accessories such as the radio and the lights still draw power.
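A five-segment function of this kind can be sketched as a piecewise-linear interpolation between six breakpoints. The breakpoint coordinates below are hypothetical placeholders chosen only to reproduce the qualitative shape described above (saturated heating, a comfort plateau, then air conditioning); the fitted values from the paper are not reproduced here, and the kW unit is an assumption.

```python
from bisect import bisect_right

# Hypothetical breakpoints: (outside temperature in °C, power in kW)
BREAKPOINTS = [(-30.0, 4.0), (-20.0, 4.0), (10.0, 1.0),
               (18.0, 0.3), (22.0, 0.3), (30.0, 1.2)]

def aux_power_kw(temp_c, points=BREAKPOINTS):
    """Piecewise-linear climate and accessories power, held flat outside the range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if temp_c <= xs[0]:
        return ys[0]
    if temp_c >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, temp_c) - 1          # index of the segment's left breakpoint
    x0, x1, y0, y1 = xs[i], xs[i + 1], ys[i], ys[i + 1]
    return y0 + (y1 - y0) * (temp_c - x0) / (x1 - x0)

cold = aux_power_kw(-5.0)   # on the heating slope
mild = aux_power_kw(20.0)   # accessories only
```

In a fit such as the one described, the breakpoint coordinates would be the free parameters adjusted by the least-squares solver.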

4.3. Fitted Extended Physical Model

This model corresponds to Equation (2) with $\mathit{Aux}_{T} \mathit{Aux}_{t}$ as defined in Section 4.2. Using the historical database containing thousands of taxi rides (Table 1), we determine the best values for the weight parameters ($B_{i}$) using multiple linear regression (MLR). With this technique, we correct the extended physical model to take into consideration external factors that were not initially considered by the model (that is, taxi-specific conditions). This allows the model to achieve better accuracy for future rides. Moreover, some factors, such as battery capacity [19] and rolling resistance [20], are not constant and vary with external elements such as temperature. These variations can, in part, be corrected while training/adjusting the models.
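To illustrate what fitting the $B_{i}$ by MLR involves, the following dependency-free sketch solves the least-squares normal equations on synthetic data. The data, the coefficient values and the function name are assumptions for illustration; in practice a library routine such as `numpy.linalg.lstsq` would normally be used, and nothing here reproduces the paper's actual implementation.

```python
import random

def fit_least_squares(X, y):
    """Solve (X^T X) b = X^T y by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * t for row, t in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(0)
true_B = [0.15, 0.002, 0.05, 0.01, 0.3]   # made-up coefficients
# Each row stands in for the five per-ride terms of Equation (2)
X = [[rng.uniform(0.1, 10.0) for _ in range(5)] for _ in range(200)]
y = [sum(b * x for b, x in zip(true_B, row)) + rng.gauss(0.0, 0.01) for row in X]

B_hat = fit_least_squares(X, y)
```

No intercept is fitted, which matches the form of Equation (2) as written.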

4.4. Learned Empirical Model

The information contained in the database (Table 1) is more diverse than what is necessary for the Extended physical model. We created a Learned Empirical Model based on the available information. This model has access to more information than the Extended Physical Model; this could offer greater flexibility and better accuracy. It can be trained using the same method of learning as for the fitted extended physical model (MLR).

4.5. Hybrid Model

Another alternative is to combine the extended physical model with the learned empirical model. This model has the potential to further improve the results of the two models by exploiting the strength of each one.

4.6. Adding Interaction Terms into the Hybrid and Learned Empirical Models

Hybrid and Learned Empirical Models have multiple parameters that could interact with each other. Interaction terms are products of two or more terms in the linear regression. For example, if we have a term X and a term Y, it is possible that X and Y are related, and adding the term $X \cdot Y$ could greatly improve the accuracy of the linear regression [21]. The weight applied by the linear regression to this term reflects the importance of the interaction for the model. Since there are approximately 35 terms in the hybrid model and in the learned empirical model, the total number of possible interactions between two terms is 595, and most of them are not logical. This number rises to 6545 if we make three terms interact. Therefore, we only selected terms whose interactions could logically have a significant effect.

As for the basic and extended physical model, there is no logical interaction between the terms.
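The interaction counts quoted above are binomial coefficients, which can be checked in a couple of lines (the variable names are illustrative):

```python
from math import comb

n_terms = 35
pairs = comb(n_terms, 2)     # number of possible two-term interactions
triples = comb(n_terms, 3)   # number of possible three-term interactions
```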

4.6.1. Interaction Terms for the Learned Empirical Model

For the learned empirical model, we added the square of each individual term. This allows the linear regression to put greater weight on a parameter when it grows larger. In addition, we added five more interaction terms, all related to the average speed: average speed multiplied by (1) the lost altitude, (2) the odometer, (3) the distance (4) winter and (5) summer. The reason behind (1) is that the possible recovery of energy when going downhill might be related to the speed at which we are driving. We also wanted to have a term which could be related to the state of health of the battery. Term (2) is used for this reason and is logical because the more the vehicle drove, the more likely it is that its battery has a weaker state of health. The term (3) serves as a logical combination of the distance driven with the average speed during that distance. Finally, terms (4) and (5) are related to the fact that winter and summer might affect consumption differently. We do not have any other non-linear terms with which we could interact and it is difficult to find other interaction terms that are logical. Therefore, we decided to keep those five terms.
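As a rough sketch of this feature construction, the function below appends the square of every term and the five average-speed interactions to a ride's feature dictionary. The field names and values are hypothetical stand-ins for the corresponding terms of Table 1.

```python
def add_learned_model_terms(ride):
    """Return a copy of `ride` extended with squared terms and speed interactions."""
    out = dict(ride)
    for name, value in ride.items():
        out[f"{name}^2"] = value ** 2
    # The five interactions with average speed listed above
    for other in ("lost_altitude", "odometer", "distance", "winter", "summer"):
        out[f"avg_speed*{other}"] = ride["avg_speed"] * ride[other]
    return out

# Made-up ride for illustration
ride = {"avg_speed": 30.0, "lost_altitude": 12.0, "odometer": 40000.0,
        "distance": 4.0, "winter": 1.0, "summer": 0.0}
features = add_learned_model_terms(ride)
```

Because the added columns are fixed functions of the inputs, the model remains linear in its coefficients and can still be fitted by MLR.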

4.6.2. Interaction Terms for the Hybrid Model

For the Hybrid Model, since it already encompasses the non-linear terms of the Extended Physical Model, we only added interaction terms involving these. We did not add the square of each term as we did in Section 4.6.1. Instead, we added three groups of interactions: the odometer multiplied by all the terms of the extended physical model, to account for the state of health as mentioned in Section 4.6.1; the lost altitude multiplied by the basic physical model terms, for the same reasons as in Section 4.6.1; and winter and summer multiplied by all the terms of the extended physical model, to allow a different weight for these terms in each season. We only multiplied the terms of the basic physical model by the lost altitude because there is no logic in multiplying the lost altitude by the accessories consumption term or the climate control term.

5. Experiments

In this section, we first evaluate the models presented in Section 4 according to how well they predict the energy consumption for a ride. For each model, we report the mean absolute error (MAE), that is, the average over the rides of the absolute difference between the predicted and the real energy consumption of a ride (in kWh). MAE is the best indicator for the company as it reflects how much they can trust the model and what safety buffer they should use while dispatching. A smaller MAE allows for a more efficient use of each vehicle in the fleet.

The models are trained separately for each vehicle model to allow the best fit. For each vehicle model, a subset of the database (80% of the taxi rides) is randomly chosen in each distance cluster in order to define the training set.

Each resulting trained model is tested using the 20% remaining taxi rides. We repeat this process 10 times to create a 95% confidence interval.
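The evaluation protocol can be sketched as follows, under several assumptions: the rides are synthetic, the "model" is a placeholder that learns a single kWh-per-km rate, and the 95% interval is a t-based interval over the 10 repetitions (the text does not state how the interval was computed). All names are illustrative.

```python
import random
import statistics

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and measurements."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def stratified_split(rides, train_fraction, rng):
    """Randomly split rides inside each 5 km distance cluster."""
    clusters = {}
    for ride in rides:
        clusters.setdefault(int(ride["distance"] // 5), []).append(ride)
    train, test = [], []
    for members in clusters.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test

def interval_95(values, t_crit=2.262):
    """Mean and half-width of a t-based 95% interval; 2.262 applies to 10 values."""
    half_width = t_crit * statistics.stdev(values) / len(values) ** 0.5
    return statistics.mean(values), half_width

rng = random.Random(42)
distances = [rng.uniform(0.5, 24.0) for _ in range(500)]
# Synthetic rides: consumption proportional to distance plus noise
rides = [{"distance": d, "kwh": 0.18 * d + rng.gauss(0.0, 0.1)} for d in distances]

maes = []
for _ in range(10):
    train, test = stratified_split(rides, 0.8, rng)
    # Placeholder model: one kWh-per-km rate learned on the training set
    rate = sum(r["kwh"] for r in train) / sum(r["distance"] for r in train)
    maes.append(mean_absolute_error([rate * r["distance"] for r in test],
                                    [r["kwh"] for r in test]))
mean_mae, half_width = interval_95(maes)
```

Splitting inside each distance cluster keeps the rare long rides represented in both the training and the test set.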

5.1. Results

As a reference, Table 4 presents the actual average trip consumption in the test set. Table 5, Table 6, Table 7, Table 8, Table 9, Table 10 and Table 11 present the MAE by vehicle model for each of the models described in Section 4.

Figure 2 presents aggregated results. It shows the weighted MAE (all vehicles aggregated) divided by the actual trip consumption according to the trip length. Globally, the longer the trip is, the smaller the relative error is (this is true for every model). The relative error is about 30% for the 0–5 km distance cluster. This may be explained by the fact that the models lack information about events that have a greater impact on the shortest trips (e.g., door openings, etc.), which makes the shortest trips more error prone.

The company showed more interest in the results for long trips (the relative error is of about 12% for the 20–25 distance cluster). Long trips caused more problems to them in the past as the dispatchers had to introduce very large buffers because of uncertainty. They were very pleased we were able to efficiently keep the error low for long trips while taking temperature and other environmental factors into account. It allows the company to use a smaller buffer (1 kWh) and thus to use their electric vehicles more efficiently.

As expected, all the models are better than the basic physical model (which we only use as a reference). The models using additional empirical data are better than the simpler fitted models. Adding interaction terms helps both models. Hybrid models are better than all the others.

Figure 3 presents the average MAE of each model for each vehicle model. On average, we observed a reduction of the error by 51.75% for the hybrid model with interaction terms in comparison to the basic physical model.

Since we are using the same training and test data for each model, we can verify whether the models' MAEs are significantly different using a technique called the difference score [23]. Results show that the Hybrid Model using interaction terms dominates or co-dominates all models. For the Tesla Model S, the use of interaction terms does not bring a significant improvement in the results. Finally, the hybrid model dominates all models for the Tesla Model X except for the learned empirical model that uses interaction terms, where they co-dominate all other models. The hybrid model with interaction terms is always better than or equivalent to all other models. This model also has the lowest MAE for the higher distance cluster. It tends to have smaller confidence intervals. This means that it can consider more accurately the impact of long-range taxi-driving than the other models.
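One plausible reading of the difference-score comparison is a paired test on the per-repetition MAE differences between two models evaluated on the same splits. The sketch below follows that reading; it is an assumption about the procedure, not a reproduction of it, and the MAE values are made up.

```python
import statistics

def paired_difference_test(mae_a, mae_b, t_crit=2.262):
    """Paired t-test on difference scores; t_crit is the two-sided 95% value for 10 pairs."""
    diffs = [a - b for a, b in zip(mae_a, mae_b)]
    standard_error = statistics.stdev(diffs) / len(diffs) ** 0.5
    t_stat = statistics.mean(diffs) / standard_error
    return t_stat, abs(t_stat) > t_crit

# Made-up MAEs (kWh) of two models over the same 10 train/test splits
mae_a = [0.30, 0.31, 0.29, 0.32, 0.30, 0.31, 0.30, 0.29, 0.31, 0.30]
mae_b = [0.25, 0.27, 0.24, 0.26, 0.26, 0.25, 0.27, 0.24, 0.26, 0.25]
t_stat, significant = paired_difference_test(mae_a, mae_b)
```

Under this reading, two models would be said to co-dominate when the test does not find a significant difference between them.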

5.2. Removing Unnecessary Terms

We showed that the hybrid model using interaction terms dominates most of the other models. When two models provide equivalent results, the simplest is generally preferred [24]. The hybrid model contains a lot of terms and we explored the possible ways of removing unnecessary terms. The results for the hybrid model presented in Section 5.1 are those of the hybrid model from which we removed the terms identified in Section 5.2.1 and Section 5.2.2.

5.2.1. Removing Terms Using Recursive Feature Elimination, Cross-Validated (RFECV)

We used recursive feature elimination with cross-validated selection of the best number of features (RFECV) [25] to detect terms that are not useful to our linear regression.

Using this technique, we are able to determine that the parameter Time since last charge has no significant impact on the results. It is the only term that was identified using RFECV. This means all other terms have an impact on the accuracy of the Hybrid Model.

5.2.2. Removing Terms Using Recursive Feature Elimination (RFE)

Some parameters might have an impact on performance, but this impact might be very low. To detect these parameters, we can use recursive feature elimination (RFE) [25].

Using this method, we are able to remove the terms precipitation and gained altitude from the Hybrid Model without significantly affecting its results. This was tested with the same technique as described in Section 5.1.

5.3. Predicting Consumption for a New Vehicle

Predicting the consumption for a new/unknown vehicle is crucial for companies that manage a fleet of electric vehicles. Physical models are interesting because they can easily adapt to new vehicle particularities. We could use a similar technique as previously but we, unfortunately, lack historical data to learn from. Therefore, we propose that, for each vehicle model, we evaluate if training using data from other known vehicles allows accurate predictions.

We use linear regression to learn from all vehicles except the one we consider as the new/unknown vehicle. As an example, for the Nissan LEAF, our training set contained all data from the Kia Soul, Tesla Model S and Tesla Model X. We then test the resulting model on all Nissan LEAF trips using the Nissan LEAF physical parameters. We do not have access to data for driven distance larger than 25 km for the Nissan LEAF and we exclude data from the Nissan LEAF when training for other cars. Table 12 presents the results for all vehicle models. It compares the MAE in kWh of the predicted energy consumption compared to the real energy consumption. Table 13 presents the relative error for each model.
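The leave-one-vehicle-model-out setup described above amounts to the following split. The records and field names are made up for illustration; the regression itself would then be trained on `train` and scored on `test` using the held-out vehicle's physical parameters.

```python
def leave_one_model_out(rides, held_out_model):
    """Train on every vehicle model except the held-out one; test on the held-out one."""
    train = [r for r in rides if r["vehicle_model"] != held_out_model]
    test = [r for r in rides if r["vehicle_model"] == held_out_model]
    return train, test

# Made-up rides for illustration
rides = [
    {"vehicle_model": "Nissan LEAF", "distance": 3.2, "kwh": 0.61},
    {"vehicle_model": "Kia Soul", "distance": 4.1, "kwh": 0.58},
    {"vehicle_model": "Tesla Model S", "distance": 12.5, "kwh": 2.40},
    {"vehicle_model": "Tesla Model X", "distance": 8.0, "kwh": 1.90},
    {"vehicle_model": "Kia Soul", "distance": 2.0, "kwh": 0.31},
]

splits = {}
for model in sorted({r["vehicle_model"] for r in rides}):
    splits[model] = leave_one_model_out(rides, model)
leaf_train, leaf_test = splits["Nissan LEAF"]
```

Each test set contains every ride of the held-out vehicle model, so there is a single score per vehicle model rather than a distribution over repeated splits.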

Since each test set contains all data about a specific vehicle model, it is normal to have exact values without confidence intervals. Table 12 and Table 13 show that the fitted extended physical model is the best model except for the Nissan LEAF and we observe on average a reduction of 30.75% of the error for this model in comparison to the basic physical model. This is a smaller improvement than the one presented in Section 5.1. It is caused by the fact that we do not learn on the data from the vehicle model itself and it is normal to achieve smaller improvement. The results are still better when compared to the basic physical model. Even though there are some significant differences in the behavior of each vehicle model, we can still model general rules that apply to an electric vehicle in a taxi-driving context using data from other vehicles.

Except for the Nissan LEAF, we can observe that adding terms to the model does not improve the results. It even makes it worse than using the fitted extended physical model. Models that worked better in the experiment presented in Section 5.1 now have difficulties correctly predicting the energy consumption. It can be explained by the phenomenon of overfitting [26]. It can happen when a model highly focuses on the learning data at the cost of generality. This situation happens mostly for regression models that consider interaction between factors, as for example the hybrid model. Also, the bigger the training set is in comparison with the test set, the more likely it is that overfitting happens.

6. Conclusions

Using historical data to train the extended physical model with linear regression improves its accuracy by as much as 44% in comparison with the basic physical model. This is crucial from the point of view of the company: it allows them to use smaller buffers when dispatching, which results in a more efficient use of the electric vehicles. We then proposed a new hybrid model that uses interaction terms and historical data to further enhance accuracy. At best, it reaches an improvement of about 61% when compared with the basic physical model. However, the historical data contain a lot of variation because they come from sensors that are subject to various errors and to many uncontrolled parameters [4]. These variations suggest that the results are not as precise as they could be. Some of these variations were captured by the Hybrid Model using interaction terms, which explains its better accuracy. It is, however, possible that others were not captured by our models. Finally, our dataset did not contain enough rides in the distance clusters above 25 km, so the models could not be evaluated for longer driven distances. As for future work, since our best model included some non-linear terms, it is reasonable to suppose that other methods offering non-linear capabilities might be effective and should be explored.

Author Contributions

Conceptualization, A.D. and J.G.; methodology, A.D. and J.G.; software, A.D.; validation, A.D., C.R. and J.G.; formal analysis, A.D., C.R. and J.G.; investigation, C.R. and A.D.; resources, K.R.-P.; data curation, A.D. and C.R.; writing—original draft preparation, A.D.; writing—review and editing, A.D. and J.G.; visualization, A.D.; supervision, J.G.; project administration, J.G.; funding acquisition, J.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Taxelco inc. (the owner of TEO-Taxi) and Mitacs.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Messagie, M.; Boureima, F.S.; Coosemans, T.; Macharis, C.; Mierlo, J. A Range-Based Vehicle Life Cycle Assessment Incorporating Variability in the Environmental Assessment of Different Vehicle Technologies and Fuels. Energies 2014, 7, 1467–1482.
2. De Cauwer, C.; Maarten, M.; Heyvaert, S.; Coosemans, T.; Van Mierlo, J. Electric Vehicle Use and Energy Consumption Based on Realworld Electric Vehicle Fleet Trip and Charge Data and Its Impact on Existing EV Research Models. World Electr. Veh. J. 2015, 7, 436–446.
3. Fiori, C.; Ahn, K.; Rakha, H.A. Power-based electric vehicle energy consumption model: Model development and validation. Appl. Energy 2016, 168, 257–268.
4. Fontaras, G.; Zacharof, N.G.; Ciuffo, B. Fuel consumption and CO2 emissions from passenger cars in Europe–Laboratory versus real-world emissions. Prog. Energy Combust. Sci. 2017, 60, 97–131.
5. De Cauwer, C.; Verbeke, W.; Coosemans, T.; Faid, S.; Van Mierlo, J. A Data-Driven Method for Energy Consumption Prediction and Energy-Efficient Routing of Electric Vehicles in Real-World Conditions. Energies 2017, 10, 608.
6. Green Race 2.0. Available online: https://www.jurassictest.com/greenrace-2 (accessed on 22 February 2019).
7. Laurikko, J.; Granstrom, R.; Haakana, A. Realistic estimates of EV range based on extensive laboratory and field tests in Nordic climate conditions. In Proceedings of the IEEE 2013 World Electric Vehicle Symposium and Exhibition (EVS27), Barcelona, Spain, 17–20 November 2013; pp. 1–12.
8. Lajunen, A. Evaluation of energy consumption and carbon dioxide emissions for electric vehicles in Nordic climate conditions. In Proceedings of the 2018 Thirteenth International Conference on Ecological Vehicles and Renewable Energies (EVER), Monte Carlo, Monaco, 10–12 April 2018; pp. 1–7.
9. Deschenes, A.; Gaudreault, J.; Rioux-Paradis, K. Predicting electric vehicle consumption: A physical model that fits. In Proceedings of the 32nd International Electric Vehicle Symposium & Exposition (EVS 32), Lyon, France, 19–22 May 2019; p. 7.
10. Kambly, K.; Bradley, T.H. Geographical and temporal differences in electric vehicle range due to cabin conditioning energy consumption. J. Power Sources 2015, 275, 468–475.
11. Chen, T.D.; Kockelman, K.M.; Hanna, J.P. Operations of a shared, autonomous, electric vehicle fleet: Implications of vehicle & charging infrastructure decisions. Transp. Res. Part A Policy Pract. 2016, 94, 243–254.
12. Bischoff, J.; Maciejewski, M. Agent-based simulation of electric taxicab fleets. Transp. Res. Procedia 2014, 4, 191–198.
13. Hu, J.; Morais, H.; Sousa, T.; Lind, M. Electric vehicle fleet management in smart grids: A review of services, optimization and control aspects. Renew. Sustain. Energy Rev. 2016, 56, 1207–1226.
14. Wager, G.; McHenry, M.P.; Whale, J.; Bräunl, T. Testing energy efficiency and driving range of electric vehicles in relation to gear selection. Renew. Energy 2014, 62, 303–312.
15. De Cauwer, C.; Van Mierlo, J.; Coosemans, T. Energy Consumption Prediction for Electric Vehicles Based on Real-World Data. Energies 2015, 8, 8573–8593.
16. Yao, E.; Yang, Z.; Song, Y.; Zuo, T. Comparison of electric vehicle’s energy consumption factors for different road types. Discret. Dyn. Nat. Soc. 2013.
17. Moreno, A.T.; Michalski, A.; Llorca, C.; Moeckel, R. Shared Autonomous Vehicles Effect on Vehicle-Km Traveled and Average Trip Duration. J. Adv. Transp. 2018.
18. The Truth About Electric Vehicles in Cold Weather. Available online: https://fr.slideshare.net/fleetcarma/the-truth-about-electric-vehicles-in-cold-weather (accessed on 17 July 2018).
19. Erdinc, O.; Vural, B.; Uzunoglu, M. A dynamic lithium-ion battery model considering the effects of temperature and capacity fading. In Proceedings of the IEEE 2009 International Conference on Clean Electrical Power, Capri, Italy, 9–11 June 2009; pp. 383–386.
20. Grunditz, E.A.; Thiringer, T. Performance Analysis of Current BEVs Based on a Comprehensive Review of Specifications. IEEE Trans. Transp. Electrif. 2016, 2, 270–289.
21. Jaccard, J.; Turrisi, R. Interaction Effects in Multiple Regression; Sage: London, UK, 2003; Volume 72.
22. Kulas, J.T.; Robinson, D.H.; Smith, J.A.; Kellar, D.Z. Post-Stratification Weighting in Organizational Surveys: A Cross-Disciplinary Tutorial. Hum. Resour. Manag. 2018, 57, 419–436.
23. Triola, M.F. Elementary Statistics; Addison Wesley Publishing Company: Boston, MA, USA, 1992.
24. Blumer, A.; Ehrenfeucht, A.; Haussler, D.; Warmuth, M.K. Occam’s razor. Inf. Process. Lett. 1987, 24, 377–380.
25. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
26. Belkin, M.; Hsu, D.J.; Mitra, P. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; pp. 2300–2311.

Figure 1. Climate control and accessories power according to the outside temperature.

Figure 2. Weighted average of MAE divided by average trip consumption (all vehicle models aggregated) according to distance driven.

Figure 3. Average MAE of each model for each vehicle model.

Table 1. Dataset information for each ride.

Term	Description	Source
Distance (km)	Distance driven	TEO-Taxi database *
Average Speed (km/h)	Average speed	TEO-Taxi database *
Gained Altitude	Total gained altitude in meters	NASA elevation map *
Lost Altitude	Total lost altitude in meters	NASA elevation map *
Duration (s)	Total duration in seconds	TEO-Taxi database
Temperature (°C)	Outside temperature	Environment Canada public database
Wind Speed (km/h)	Speed of wind on the day of the ride	Environment Canada public database
Wind Direction	Direction of wind on the day of the ride	Environment Canada public database
Starting SOC	Starting SOC of the vehicle	Fleet Carma Data Logger
Ending SOC	Ending SOC of the vehicle	Fleet Carma Data Logger
Vehicle Id	Unique identifier of the vehicle	TEO-Taxi database
Vehicle Model	Model of the vehicle	TEO-Taxi database
Date	Date of the ride	TEO-Taxi database
Driver Id	Id of the driver of the vehicle	TEO-Taxi database
Time since last charge	Time since the last charge of the vehicle	TEO-Taxi database
Odometer	Odometer of the vehicle	TEO-Taxi database
Precipitation	Amount of precipitation that happened during the day	Environment Canada public database
Nominal capacity	Theoretical capacity of the vehicle	TEO-Taxi database
Vehicle positions	GPS positions of the vehicle	TEO-Taxi database *
Winter	Is equal to 1 when the date is between the 15th of December and the 15th of March, otherwise it is equal to 0. Winter tires are mandatory during this period.	TEO-Taxi database
Summer	Is equal to 1 when the date is not between the 15th of December and the 15th of March, otherwise it is equal to 0.	TEO-Taxi database
Speed histogram	21 terms representing the distance driven at speed [0...5[, [5...10[, ..., [100...105[ km/h	TEO-Taxi database *

* Derived from the GPS locations obtained using the Fleet Carma data logger.

Table 2. Number of rides per distance cluster for each vehicle model.

Vehicle Model	0–5 km	5–10 km	10–15 km	15–20 km	20–25 km	25–30 km	30+ km	Total
Kia Soul	118,621	12,772	426	31	9	3	4	131,866
Nissan LEAF	23,687	2485	92	3	2	0	0	26,269
Tesla Model S	9302	7363	4700	676	29	3	9	22,082
Tesla Model X	4858	4512	3195	538	19	3	5	13,130

Table 3. Average consumption per ride (kWh) for each vehicle model.

Vehicle Model	0–5 km	5–10 km	10–15 km	15–20 km	20–25 km	25–30 km	30+ km	Average
Kia Soul	0.50	1.44	2.449	3.11	5.068	4.40	7.58	3.51
Nissan LEAF	0.74	1.93	3.03	3.96	6.31	n/a	n/a	3.194
Tesla Model S	0.76	1.68	2.46	3.22	4.76	4.76	6.45	3.44
Tesla Model X	0.86	1.83	2.68	3.60	4.91	6.79	6.95	3.95

Table 4. Actual average trip consumption (kWh) in the test set.

Vehicle Model	0–5 km	5–10 km	10–15 km	15–20 km	20–25 km	Average *
Kia Soul	$0.54 \pm 0.0066$	$1.48 \pm 0.029$	$2.47 \pm 0.16$	$3.50 \pm 0.53$		$2.00 \pm 0.062$
Nissan LEAF	$0.80 \pm 0.017$	$1.99 \pm 0.074$	$3.08 \pm 0.44$			$1.95 \pm 0.034$
Tesla Model S	$0.81 \pm 0.022$	$1.70 \pm 0.044$	$2.49 \pm 0.046$	$3.35 \pm 0.18$	$4.64 \pm 1.00$	$2.60 \pm 0.093$
Tesla Model X	$0.93 \pm 0.047$	$1.86 \pm 0.076$	$2.75 \pm 0.048$	$3.72 \pm 0.25$	$5.84 \pm 1.24$	$3.02 \pm 0.17$
Weighted Average	$0.63 \pm 0.28$	$1.60 \pm 0.38$	$2.58 \pm 0.47$	$3.50 \pm 0.46$	$5.22 \pm 1.63$