1. Introduction
The transportation sector is experiencing a paradigm shift thanks to the rapid development of information and communication technologies (ICT). Sustainability and a multitude of other factors have contributed to the establishment of transportation 4.0. Many published reports argue that the legacy transportation system is inefficient, polluting, and unsafe [1,2,3]. To remedy the problems associated with road traffic, the intelligent transportation system based on dedicated short-range communication (DSRC) has been proposed with the following chief aims [4]: reduce congestion, increase road safety, improve the driving experience, lower greenhouse gas emissions, and make transportation more efficient. In [5], the authors proposed an eco-routing system based on vehicle-to-infrastructure (V2I) communication. The energy consumption on a given road is transmitted to a roadside unit (RSU) and forwarded to the traffic management center (TMC). Drivers use this information to find fuel-efficient routes. This solution necessitates a strong ICT infrastructure, whose cost can be prohibitive. The same design principle has been advocated in [6].
The revolution in communication, embedded systems, and the associated disruptive technologies have contributed to the realization of the fourth industrial revolution, commonly known as industry 4.0. The smart city is yet another concept that emerged with the development of the Internet of Things, as well as machine learning techniques. As pictured in Figure 1, smart mobility is one of the pillars of the smart city [7]. In the EU model for the smart city, the smart mobility indicator includes safety, sustainability, and innovation.
In academia, reducing car emissions has been the focus of intensive work. Many parameters affect car fuel consumption (load, tire pressure, road, weather, vehicle age, etc.). Extensive experiments have demonstrated that driving style substantially affects fuel consumption. Three driving styles were investigated in [8], where an economic driving style was found to reduce fuel consumption by 21% compared with dynamic driving.
In the literature, several algorithms have been devised for the estimation of fuel consumption [9,10,11,12]. The real-time implementation of fuel estimation algorithms, however, has received scant attention. This work is an extension of previously published work [9,13]. The contributions reported in this work are the following:
Reducing the computational complexity by 66% using high-level transformation techniques;
Devising two techniques for computing the RPM: binary searching based on a direct-addressable table and an approximation algorithm;
Implementing the devised architecture using both an IP method and a high-level synthesis tool (GAUT).
The rest of the paper is organized as follows.
Section 2 compares our work with existing techniques.
Section 3 reviews the fuel estimation algorithm described in [9].
Section 4 describes techniques to reduce the computational complexity of the fuel estimation algorithm and elaborates three hardware architectures.
Section 5 reports the implementation results.
Section 6 concludes the paper.
2. Related Work
In the last decade, numerous fuel estimation algorithms have been proposed. The authors of [14] elaborated an algorithm using a power-based model. The algorithm requires instantaneous values for the acceleration and speed; consequently, it cannot be used for eco-routing.
An Android application was devised in [15]. The app reads vehicle parameters through the on-board diagnostics parameter ID (OBD-II) interface. The system uses artificial intelligence techniques to provide the driver with eco-driving tips.
Using the Willan’s internal combustion engine model, the author of [9] devised a non-iterative fuel estimation model. The technique devised in [16] for the vehicle routing problem (VRP) is based on the comprehensive modal emission model (CMEM). CMEM requires both speed and acceleration to estimate the fuel consumption. The engine RPM was fixed to 2800 for a passenger vehicle and 2400 for a truck. The authors of [13] envisioned an RPM algorithm and designed a hardware architecture using floating-point arithmetic for the implementation of the fuel estimation algorithm.
Approximate computing is a new design technique conceived to reduce the power consumption or increase the speed of VLSI circuits. Floating-point arithmetic consumes more area, is slower, and is more power-hungry than fixed-point arithmetic. Fortunately, approximate computing has also been shown to substantially reduce delay and power consumption [17].
4. Optimized Hardware Architecture
The hardware architecture for implementing the fuel estimation algorithm needs to have the following features: (1) an RPM unit that determines the engine revolutions per minute given the driving speed; (2) a functional unit for computing ${h}_{1}$, ${h}_{2}$, and ${h}_{3}$; (3) a hardware module for calculating ${F}_{trac}$, ${P}_{trac}$, ${P}_{i,gb}$, ${P}_{start}$, ${P}_{fuel}$, ${V}_{f}$, and $\widehat{{f}_{c}}$; and (4) a memory unit for storing the constants used by the preceding units.
To reduce the computational complexity of the fuel estimation algorithm, the following techniques can be used. A predefined driving mode can be utilized to estimate ${h}_{1}$, ${h}_{2}$, and ${h}_{3}$. Table 2 lists the values of ${h}_{1,2,3}$ for the three driving cycles.
The computational complexity can be further reduced by precomputing the first term of ${P}_{trac}$, that is, the quantity ${h}_{1}\frac{1}{2}{\rho}_{a}{A}_{f}{c}_{d}$. Table 3 reports the precomputed values for the three driving cycles for air density ${\rho}_{a}=1.293$ kg/m${}^{3}$, drag coefficient ${c}_{d}=0.312$, and frontal area ${A}_{f}=2.06$ m${}^{2}$.
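The precomputation described above can be sketched in a few lines of Python. This is only an illustration: the three constants are those quoted in the text, while the ${h}_{1}$ values and the cycle names are hypothetical placeholders (they are not the Table 2 data), and the helper name `precompute_drag_terms` is an assumption made for this example.

```python
# Physical constants quoted in the text.
RHO_A = 1.293  # air density, kg/m^3
C_D = 0.312    # aerodynamic drag coefficient (dimensionless)
A_F = 2.06     # vehicle frontal area, m^2

# Hypothetical h1 values, one per driving cycle (placeholders, NOT Table 2 data).
H1_BY_CYCLE = {"cycle_a": 1.0, "cycle_b": 1.5, "cycle_c": 2.0}


def precompute_drag_terms(h1_by_cycle):
    """Fold h1 * (1/2) * rho_a * A_f * c_d into one constant per driving cycle."""
    base = 0.5 * RHO_A * A_F * C_D  # identical for every cycle
    return {cycle: h1 * base for cycle, h1 in h1_by_cycle.items()}


# Computed once, offline; at run time the datapath only reads the stored constant.
DRAG_TERM = precompute_drag_terms(H1_BY_CYCLE)
```

Storing one constant per driving cycle in the memory unit replaces the chain of multiplications by a single table read at run time.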
It is further possible to reduce the computational cost of the tractive force, ${F}_{trac}$, by the transformation shown in (14):
where ${K}_{1}$ and ${K}_{2}$ are constants that depend on the driving cycle. Table 2 and Table 4 show the values of ${K}_{1}$ and ${K}_{2}$, respectively, for the three driving cycle modes.
Having performed the rearrangements, the number of multiplications is reduced from 3 to 1. The required number of arithmetic operations before and after optimization is summarized in Table 5. The control dataflow graphs of the original and refined algorithms are shown in Figure 3. The control dataflow graph can be further improved by inserting pipeline latches, which reduce the critical path delay to that of one arithmetic unit, that is, $\tau = \max({t}_{add},{t}_{divider},{t}_{multiplier})$, where ${t}_{add}$, ${t}_{divider}$, and ${t}_{multiplier}$ are, respectively, the critical path delays of a floating-point adder, divider, and multiplier.
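The effect of the pipeline latches on the clock period can be illustrated with a short Python sketch. The delay values below are hypothetical, the function names are assumptions made for this example, and latch setup and clock-to-output overheads are ignored for simplicity.

```python
def critical_path_unpipelined(unit_delays):
    """Without latches, the delays of the chained arithmetic units add up."""
    return sum(unit_delays)


def clock_period_pipelined(t_add, t_divider, t_multiplier):
    """With a latch after every unit, the slowest single unit sets the period."""
    return max(t_add, t_divider, t_multiplier)


# Hypothetical unit delays in nanoseconds (not measured values).
T_ADD, T_DIVIDER, T_MULTIPLIER = 2.0, 4.0, 3.0

# Example chain: multiplier -> adder -> divider.
unpipelined = critical_path_unpipelined([T_MULTIPLIER, T_ADD, T_DIVIDER])
pipelined = clock_period_pipelined(T_ADD, T_DIVIDER, T_MULTIPLIER)
```

With these placeholder delays the chained path takes 9 ns, whereas the pipelined period is bounded by the 4 ns divider, at the cost of additional latency in clock cycles.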
The fuel estimation algorithm requires the RPM value (${\omega}_{e}$) as input. The closed-form expression to compute ${\omega}_{e}$ is shown in (15).
where D is the diameter of the wheel expressed in meters, ${\gamma}_{A}$ is the axle ratio, and ${\gamma}_{G}$ is the gearbox ratio.
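As an illustration of how the wheel diameter, axle ratio, and gearbox ratio enter such a computation, the following Python sketch uses the standard kinematic relation between vehicle speed and wheel rotation. This is an assumption: the exact form and units of (15) may differ, and the function name, parameter names, and the choice of meters per second for the speed are hypothetical.

```python
import math


def engine_rpm(speed_mps, wheel_diameter_m, axle_ratio, gearbox_ratio):
    """Engine speed in revolutions per minute from the vehicle speed.

    Assumes no wheel slip: the wheel advances one circumference (pi * D)
    per revolution, and the engine turns axle_ratio * gearbox_ratio times
    for every wheel revolution.
    """
    wheel_rev_per_s = speed_mps / (math.pi * wheel_diameter_m)
    return 60.0 * wheel_rev_per_s * axle_ratio * gearbox_ratio


# Hypothetical example: 1 m/s, 0.6 m wheel, axle ratio 4.0, gearbox ratio 1.0.
example_rpm = engine_rpm(1.0, 0.6, 4.0, 1.0)
```

The expression involves a division and several multiplications per sample, which motivates the cheaper approximation and table-based alternatives.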
In [19], the author proposed an approximation formula to compute the engine RPM, which is described in (16):
where ${\gamma}_{i}$ is the gear ratio. The RPM unit can be implemented in one of the following ways: a lookup table that stores precalculated values or a datapath unit that computes ${\omega}_{e}$ using (16). To verify the accuracy of (16), the RPM and speed were measured using a sedan vehicle. Table 6 compares the measured RPM with the one approximated using (16). The results show that the approximation error is acceptable.
The RPM unit designed using a lookup table requires the implementation of a searching algorithm. The algorithm takes as input the average speed and returns the engine RPM (${\omega}_{e}$). The known searching algorithms are linear search, binary search, hash table, and direct address table. The complexity of those algorithms is summarized in Table 7. Hardware implementation of the search algorithm: for the fuel estimation algorithm, the direct address table is the most appropriate way of searching for the engine RPM, as the size of the table is small. Algorithm 2 computes the engine RPM using the direct address table.
Algorithm 2 Pseudocode for the RPM calculation unit.
procedure RPM2($\overline{v}$, TRPM, Tspeed)
    $I \leftarrow (0.1 \cdot \overline{v}) - 1$
    ${\omega}_{e} \leftarrow$ TRPM($I$)
    return ${\omega}_{e}$
end procedure
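A minimal Python sketch of the direct address lookup is given below. Several points are assumptions rather than details taken from the paper: the table contents are hypothetical, the speed is taken in km/h with one table entry per 10 km/h step, the index is truncated to an integer, and a range check is added. The Tspeed argument is omitted because the index is computed directly from the speed.

```python
# Hypothetical RPM table: entry i holds the RPM for a speed of (i + 1) * 10 km/h.
T_RPM = [900, 1300, 1700, 2000, 2200, 2400]


def rpm_direct_address(v_avg, t_rpm):
    """Return the engine RPM for an average speed using a direct address table.

    The speed itself is turned into the table address, so the lookup needs
    one multiplication, one subtraction, and one memory read (no searching).
    """
    index = int(0.1 * v_avg) - 1  # truncation to an integer is an assumption
    if not 0 <= index < len(t_rpm):
        raise ValueError(f"average speed {v_avg} is outside the table range")
    return t_rpm[index]


rpm_at_30 = rpm_direct_address(30, T_RPM)  # reads entry 2 of the table
```

Because the address is computed rather than searched for, the lookup time does not depend on the number of entries, which suits the small table used here.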

The dataflow graphs of the RPM unit using the approximating equation, the search algorithm using the direct address table, and the binary search algorithm are shown in Figure 4.
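For comparison, the binary search variant can be sketched in Python with the standard `bisect` module. The table contents are hypothetical, and the policy of selecting the first tabulated speed that is not below the input (clamped to the last entry) is an assumption, since the text does not specify how intermediate speeds are handled.

```python
from bisect import bisect_left

# Hypothetical tables: T_SPEED is sorted, and T_RPM[i] corresponds to T_SPEED[i].
T_SPEED = [10, 20, 30, 40, 50, 60]
T_RPM = [900, 1300, 1700, 2000, 2200, 2400]


def rpm_binary_search(v_avg, t_speed, t_rpm):
    """Return the RPM of the first tabulated speed that is >= v_avg.

    bisect_left performs a binary search over the sorted speed table, so the
    number of comparisons grows logarithmically with the table size.
    """
    i = bisect_left(t_speed, v_avg)
    if i == len(t_speed):  # speed above the last entry: clamp to the last one
        i -= 1
    return t_rpm[i]


rpm_at_30 = rpm_binary_search(30, T_SPEED, T_RPM)
```

Unlike the direct address table, this variant needs the speed table and a comparator loop, but it does not require the tabulated speeds to be evenly spaced.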
5. Implementation Results
Three types of architecture have been implemented using FPGA technology. Those architectures are summarized in Table 8. The architecture ArchApproxRM uses an approximation formula to compute ${\omega}_{e}$. The binary search algorithm is used by the architecture ArchBinaryRPM. The last one uses a direct address table.
For automated architecture synthesis of the RPM unit, the academic high-level synthesis tool GAUT has been used [20]. The design flow using GAUT is shown in Figure 5. A comparison of the three architectures for the implementation of the RPM calculator is presented in Table 9. The results show that the approximation method consumes fewer resources than the other two.
The data path of the fuel estimation algorithm necessitates floating-point arithmetic because of the high dynamic range of the coefficients used in the algorithm. VHDL-2008 supports floating-point arithmetic through its floating-point package [21]. Furthermore, a number of vendors offer IP cores for floating-point operations. To select a suitable method for the realization of the data path, both the floating-point package and the IP provided by the Xilinx ISE tool were implemented and tested on a Virtex-6 FPGA. The core is reconfigurable and can be used to design an adder, multiplier, absolute value, exponential function, square root, conversion between fixed- and floating-point, natural logarithm, and accumulator [22].
The comparison presented in Table 10 favors the IP over the floating-point package. The critical path delay of the data path is 4 ns, which is nearly 17.8 times shorter than that of the design proposed in [13]. The proposed architecture can be used for eco-routing. The circuit can be further optimized by fine-tuning the floating-point arithmetic. Furthermore, the fuel estimation model does not consider factors such as driving comfort and weather conditions. A more detailed investigation will be conducted from both an algorithmic and a hardware implementation standpoint.