Article

Using Codes of Output Collections for Hardware Reduction in Circuits of LUT-Based Finite State Machines

by Alexander Barkalov 1,2, Larysa Titarenko 1,3, Kazimierz Krzywicki 4 and Kamil Mielcarek 1,*

1 Institute of Metrology, Electronics and Computer Science, University of Zielona Góra, ul. Licealna 9, 65-417 Zielona Góra, Poland
2 Department of Computer Science and Information Technology, Vasyl Stus’ Donetsk National University (in Vinnytsia), 600-richya Str. 21, 21021 Vinnytsia, Ukraine
3 Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine
4 Department of Technology, The Jacob of Paradies University, ul. Teatralna 25, 66-400 Gorzów Wielkopolski, Poland
* Author to whom correspondence should be addressed.
Electronics 2022, 11(13), 2050; https://doi.org/10.3390/electronics11132050
Submission received: 28 May 2022 / Revised: 21 June 2022 / Accepted: 27 June 2022 / Published: 29 June 2022
(This article belongs to the Special Issue Embedded Systems: Fundamentals, Design and Practical Applications)

Abstract

A method is proposed that aims to reduce the hardware in FPGA-based circuits of Mealy finite state machines (FSMs). The proposed method is a type of structural decomposition. Its main goal is to reduce the number of look-up table (LUT) elements in FSM circuits compared to the three-block FSM circuit. The main idea is to use codes of collections of FSM outputs to replace the FSM inputs and state variables. The interstate transitions are defined using collections of outputs generated in two adjacent cycles of synchronization. One of the output collection codes is kept in a register. To optimize the block generating FSM outputs, a new type of state code is proposed: a state is encoded as an element of some class of states. This approach reduces both the number of logic levels and the inter-level interconnections in a LUT-based FSM circuit. An example of a LUT-based Mealy FSM circuit synthesized with the proposed method is shown, and the results of our research are presented. The research was conducted using the CAD tool Vivado by Xilinx. The experiments show that the proposed approach reduces hardware compared with such known methods as Auto and One-hot of Vivado, and JEDI. Moreover, the proposed approach gives better results than a method based on the simultaneous replacement of inputs and encoding of collections of outputs. Compared to circuits of the three-block FSMs, the LUT counts are reduced by an average of 10.07% without a significant reduction in operating frequency. The gain in LUT counts grows as the numbers of FSM states and inputs increase.

1. Introduction

Since the 1950s, the model of the Mealy finite state machine (FSM) [1] has been widely used in the design of sequential circuits [2,3,4]. Today, this model is used, for example, to specify the behaviour of such sequential blocks as: (1) control devices of digital systems [5,6]; (2) serial communication and display protocols [7]; (3) various software tools of embedded systems [8]; (4) control-dominated systems [9]; (5) different systems in robotics [10]; (6) hardware–software interfaces of embedded systems [3]; (7) the activation functions for deep neural networks [11,12], and so on. Research related to finite state machines is actively developing [9,13,14]. This justifies the choice of this model as the object of our current research.
To improve the quality of FSM-based blocks, it is necessary to improve such characteristics of corresponding FSM circuits as chip areas occupied by them, operating frequency and power dissipation. Due to this, there is a continuous interest in developing synthesis methods leading to optimization of these characteristics. As a rule, the less chip area is occupied by an FSM circuit, the less power it consumes [15,16]. Thus, it is very important to reduce the chip area occupied by an FSM circuit.
Today, many digital systems are implemented using field programmable gate arrays (FPGAs) [17]. For example, FPGAs are widely used for implementing hardware accelerators [18]. In [19], around 1700 examples of applications of FPGAs in a wide variety of digital systems are listed. Given this popularity of FPGAs, we chose these chips as a platform for implementing Mealy FSM circuits. Practically from the beginning of the FPGA era, Xilinx has been the largest manufacturer of FPGA chips [20]. This explains why we focus our current research on Xilinx solutions. We discuss FSM circuits implemented using such internal resources of an FPGA chip as look-up table (LUT) elements, programmable flip-flops, programmable interconnects, synchronization tree, and programmable input–outputs.
To optimize the basic characteristics of FSM circuits, the methods of structural decomposition (SD) can be used [21]. These methods allow structuring an LUT-based FSM circuit and presenting it as a composition of several large logical blocks. Each block is represented by a system of Boolean functions (SBF) having unique arguments [22]. In [23], we propose an FSM design method based on simultaneously applying two methods of SD. These methods are: (1) the replacement of FSM inputs [5] and (2) the encoding of collections of FSM outputs [5]. To apply these methods, it is necessary to generate two SBFs having two sets of additional variables. To implement circuits for these SBFs, it is necessary to use some chip resources. There are three logic levels in FSM circuits based on [23]. In this article, we propose a method which allows the exclusion of a block generating the additional variables replacing the FSM inputs. We propose to replace FSM inputs by the same variables which encode the collections of FSM outputs.
The main contribution of this paper is a novel design method aimed at reducing the LUT count in circuits of the three-block FPGA-based Mealy FSMs [23]. The proposed method is based on: (1) using the same additional variables for producing both input memory functions (IMFs) and FSM outputs and (2) encoding of the FSM state using class-state codes (CSCs) proposed in this paper. Saving on the number of elements is achieved by reducing both the number of additional arguments and state variables compared to [23].
The rest of the paper includes six sections. Section 2 is devoted to the background of FPGA-based Mealy FSMs. Section 3 discusses the state of the art. The main idea of the proposed method is shown in Section 4. Section 5 shows an example of FSM circuit synthesis. The results of experiments and their analysis can be found in Section 6. A short conclusion is given in Section 7.

2. Background of Designing LUT-Based Mealy FSMs

The design process starts from a formal representation of interstate transitions. This can be done using various tools [24]. Very often, the behaviour is defined using either state transition graphs (STGs) or state transition tables (STTs) [4]. We also use these tools in our paper. Various formal methods make it possible to obtain the SBFs representing an FSM logic circuit [4]. These SBFs define the dependence of the FSM outputs and IMFs on the FSM inputs and state variables.
The FSM inputs form a set $X = \{x_1, \ldots, x_L\}$, the FSM outputs form a set $Y = \{y_1, \ldots, y_N\}$, and the FSM states form a set $A = \{a_1, \ldots, a_M\}$. The inputs cause interstate transitions. To synthesise an FSM circuit, the states $a_m \in A$ are encoded by binary codes $K(a_m)$ having $R$ bits. The $r$-th bit of $K(a_m)$ corresponds to a state variable $T_r \in T$, where $T = \{T_1, \ldots, T_R\}$ is a set of state variables. The minimum number of state variables is determined as
$$R = \lceil \log_2 M \rceil. \tag{1}$$
State codes based on (1) are called maximal state codes [25]. State codes are kept in a state code register (SCR) [5]. As a rule, in the case of FPGA-based FSMs, the SCR has informational inputs of D type [25,26]. The content of the SCR is determined by the IMFs forming a set $\Phi = \{D_1, \ldots, D_R\}$. A synchronization pulse $Clock$ allows the entry of a state code into the SCR. A single pulse $Start$ allows the entry of an initial state code into the SCR.
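As a small illustration of formula (1), the following Python sketch computes the minimum number of state variables and assigns consecutive binary codes to the states. The function names `min_state_bits` and `binary_state_codes` are hypothetical helpers introduced only for this example, and the consecutive assignment is just one possible choice of maximal state codes.

```python
def min_state_bits(num_states: int) -> int:
    """Return R = ceil(log2(M)), the minimum number of state variables (formula (1))."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    # (M - 1).bit_length() equals ceil(log2(M)) for M >= 1 and avoids floating point.
    return (num_states - 1).bit_length()


def binary_state_codes(num_states: int) -> dict:
    """Assign the codes 0, 1, 2, ... to states a1, a2, ... using R bits (an assumed ordering)."""
    r = min_state_bits(num_states)
    return {f"a{m + 1}": format(m, f"0{r}b") for m in range(num_states)}


if __name__ == "__main__":
    print(min_state_bits(7))       # number of state variables for M = 7
    print(binary_state_codes(8))   # codes for an FSM with M = 8 states
```

Other state assignments with the same number of bits are equally valid; the sketch only shows how $R$ follows from $M$.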
To construct SBFs determining an FSM circuit, the initial STT (or STG) should be transformed into a direct structure table (DST) [5]. An STT includes five columns [4]: a current state $a_m$; a state of transition $a_s$; a conjunction of inputs (or their complements) $X_h$ determining the transition from $a_m$ into $a_s$; a collection of outputs (CO) $Y_h$ generated during the $h$-th transition; and $h$, the number of the transition ($h \in \{1, \ldots, H\}$). Compared to an STT, a DST includes three additional columns [5]: the code of the current state $K(a_m)$; the code of the next state $K(a_s)$; and a collection of IMFs $\Phi_h \subseteq \Phi$ necessary to load the next state code into the SCR.
A DST is a base for deriving the SBFs
$$\Phi = \Phi(T, X); \tag{2}$$
$$Y = Y(T, X). \tag{3}$$
These SBFs determine a logic circuit of P Mealy FSM (Figure 1).
In Figure 1, a block of functions implements the SBFs (2) and (3). The SCR includes $R$ flip-flops, each of which corresponds to one bit of a current state code. The pulses $Start$ and $Clock$ have the meaning described above.
A fragment of an STG is shown in Figure 2a. It shows transitions from the current state $a_6$ to the states of transition $a_7$ (the transition number $h = 12$) and $a_4$ (the transition number $h = 13$) of some Mealy FSM. This STG fragment can be replaced by equivalent fragments of the STT (Figure 2b) and DST (Figure 2c).
As follows from Figure 2b, the transition $\langle a_6, a_7 \rangle$ is caused by the input signal $X_{12} = x_3$. The transition is accompanied by producing the outputs $y_1, y_2 \in Y$. Row 12 of the STT (Figure 2b) reflects this transition. Row 13 of the STT (Figure 2b) is filled in the same manner. If, for example, $M = 7$, then using (1) gives $R = 3$ and two sets: $T = \{T_1, T_2, T_3\}$ and $\Phi = \{D_1, D_2, D_3\}$. Let the states from Figure 2a have the following codes: $K(a_4) = 011$, $K(a_6) = 101$ and $K(a_7) = 110$. These codes and the corresponding IMFs are written in rows 12 and 13 of the DST (Figure 2c). Row 12 determines a product term $F_{12} = T_1 \bar{T}_2 T_3 x_3$, and row 13 determines a term $F_{13} = T_1 \bar{T}_2 T_3 \bar{x}_3$. These terms enter the sum-of-products (SOPs) of Boolean functions $D_1, D_2, y_1, y_2$ (the term $F_{12}$) and $D_2, D_3, y_5$ (the term $F_{13}$). All other parts of the SOPs for (2) and (3) are constructed using a similar approach [27].
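The way a DST row yields a product term and a collection of IMFs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and `~` is used here as a plain-text stand-in for the overbar (complement).

```python
def state_conjunction(code: str) -> str:
    """Turn a state code such as '101' into the conjunction 'T1 ~T2 T3'."""
    return " ".join(
        f"T{r}" if bit == "1" else f"~T{r}"
        for r, bit in enumerate(code, start=1)
    )


def product_term(current_code: str, input_literals: list) -> str:
    """Build F_h as the conjunction of state variables and input literals."""
    return " ".join([state_conjunction(current_code), *input_literals])


def input_memory_functions(next_code: str) -> list:
    """List the IMFs D_r that must equal 1 to load the next state code (D flip-flops)."""
    return [f"D{r}" for r, bit in enumerate(next_code, start=1) if bit == "1"]


if __name__ == "__main__":
    # Row 12: current state a6 (code 101), input x3, next state a7 (code 110).
    print(product_term("101", ["x3"]), input_memory_functions("110"))
    # Row 13: current state a6 (code 101), input ~x3, next state a4 (code 011).
    print(product_term("101", ["~x3"]), input_memory_functions("011"))
```

With the codes given above, the sketch reproduces the terms $F_{12}$ and $F_{13}$ and the IMFs they enter.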
In this paper, we consider a case when SBFs (2) and (3) are implemented using such resources of FPGA chips as configurable logic blocks (CLBs) including LUTs, flip-flops and dedicated multiplexors [28], the programmable routing matrix, programmable input–output blocks and the synchronization tree [25,29]. Using the notation [30], we denote a LUT having $I_L$ inputs and a single output as $I_L$-LUT. An $I_L$-LUT can implement a circuit of an arbitrary Boolean function having up to $I_L$ arguments.
If the number of arguments exceeds $I_L$, then it is necessary to apply various methods of functional decomposition (FD) of this Boolean function [31,32,33,34]. In this case, the resulting circuit is multi-level. As a rule, it has a complicated system of “spaghetti-type” interconnections [21].
If all LUTs have the same number of inputs, then such a logic basis is rigid. It means that in some cases, only a part of the available inputs will be used. However, in other cases, the LUTs should be combined to increase the number of inputs. To reduce the impact of interconnects on such a join, it is important to have internal fast interconnects between some LUTs. In Xilinx solutions, these CLBs are combined into slices [29,35]. For example, the SLICEL of Virtex-7 includes four 6-LUTs, eight flip-flops and 27 multiplexers [28].
In LUT-based FSMs, the SCR is hidden and distributed among the LUTs implementing SOPs of functions (2). Due to this, there are only two blocks in a LUT-based P Mealy FSM (Figure 3).
In this paper, a CLB-based block is denoted by the symbol LUTer. In P Mealy FSM, the LUTerT consists of CLBs generating IMFs $D_r \in \Phi$. The state variables $T_r \in T$ are kept in the distributed SCR. Due to this, the pulses $Clock$ and $Start$ enter the LUTerT. The outputs $y_n \in Y$ are generated by the LUTerY.

3. Related Work

If each function $\varphi_k \in \Phi \cup Y$ depends on not more than $I_L$ Boolean arguments, then there are exactly $N + R$ LUTs in the circuit of P Mealy FSM. This is the best possible outcome of synthesis. However, the modern LUTs have around 6 inputs [35,36,37]. In a CLB of Virtex-7 [36], it is possible to form either two 7-LUTs or a single 8-LUT using dedicated multiplexors. However, the total number of inputs and state variables of an FSM can significantly exceed 8 [17]. This leads to an imbalance between the characteristics of LUTs and SBFs (2) and (3). This imbalance is the reason why FPGA-based design methods need to be improved.
To improve area-time characteristics of CLB-based FSM circuits, it is necessary to optimize their systems of inter-slice interconnections. It is known that only 30% of power dissipation is connected with LUTs [38]. It means that around 70% of the power is dissipated on the interconnections. As shown in [38], interconnection delays are starting to play a major role in comparison with logic delays. As shown in [23], the optimization of interconnections allows the reduction of both the time of cycle and power consumption of LUT-based FSM circuits. Using either two-fold state assignment [39,40] or the extended state codes can help in the optimization of interconnections.
Each function $\varphi_k \in \Phi \cup Y$ depends on $NA(\varphi_k)$ arguments. If the condition
$$NA(\varphi_k) \le I_L \tag{4}$$
is violated, then there are several levels of LUTs in an FSM circuit. Various methods have been developed for improving characteristics of FSM circuits [21,25,26,30,34,41,42,43,44].
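Condition (4) can be checked mechanically once the SOP of a function is known. The following Python sketch counts the distinct arguments of a function given as a list of product terms and compares the count with the number of LUT inputs. The helper names and the plain-text term format (literals separated by spaces, `~` for a complement) are assumptions made for this illustration.

```python
def argument_count(terms: list) -> int:
    """Return NA(phi): the number of distinct variables appearing in the SOP terms."""
    support = set()
    for term in terms:
        for literal in term.split():
            support.add(literal.lstrip("~"))  # x3 and ~x3 are the same argument
    return len(support)


def fits_single_lut(terms: list, lut_inputs: int) -> bool:
    """Check condition (4): NA(phi) <= I_L."""
    return argument_count(terms) <= lut_inputs


if __name__ == "__main__":
    sop = ["T1 ~T2 T3 x3", "T1 ~T2 T3 ~x3"]  # illustrative terms only
    print(argument_count(sop), fits_single_lut(sop, 6))
```

When the check fails, the function cannot be mapped to a single LUT and some form of decomposition is required.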
As a rule, the known optimization methods can improve either the number of LUTs, the cycle time, or the power consumption [42]. Moreover, there are methods that try to optimize two or even three of these parameters. In our current research, we propose a method for reducing the number of LUTs in three-block circuits of Mealy FSMs [23].
The SOPs of functions $\varphi_k \in \Phi \cup Y$ depend on product terms
$$F_h = A_m X_h \quad (h \in \{1, \ldots, H\}). \tag{5}$$
These terms correspond to rows of the DST. In (5), the symbol $A_m$ stands for a conjunction of state variables corresponding to the code $K(a_m)$ of the current state written in the $h$-th row of the DST. These conjunctions add $R$ literals to the SOPs of functions $\varphi_k \in \Phi \cup Y$.
To diminish the number of literals, various methods of state assignment are used [45,46,47,48,49,50]. These methods can be found in many academic and industrial CAD tools. The well-known academic systems are, for example, SIS [51], ABC by Berkeley [52,53] and Sinthagate [54]. The manufacturers of FPGA chips have their own CAD packages. For example, AMD (Xilinx) has the CADs Vivado [55] and Vitis [56], whereas Intel (Altera) has the package Quartus [57].
There is no universal state assignment approach that achieves an optimal solution for any FSM. In [34], FSM circuits based on maximum binary codes with $R = \lceil \log_2 M \rceil$ are compared with circuits based on one-hot state codes with $R = M$. As follows from the comparison, for FSMs with $M > 16$, using one-hot codes improves the FSM characteristics. However, the circuit characteristics depend strongly on the number of FSM inputs. This is due to the limited number of LUT inputs [21]. For example, the experiments in [58] show the following: if $L > 10$, then using maximum binary codes leads to FSM circuits with better characteristics than the circuits based on one-hot codes.
So, in one case, the circuits with better characteristics could be produced due to using the one-hot state codes. However, in the other case it is better to use the maximum binary codes. Therefore, it is necessary to apply several state assignment methods and to choose a method producing the best results (for a particular FSM). Taking this fact into account, we have compared the results based on our proposed approach with characteristics of FSM circuits produced using the methods JEDI [51], binary state assignment Auto and One-hot state assignment of Vivado [55] by Xilinx [35]. We chose JEDI because it is considered one of the best state-assignment approaches [51].
If condition (4) is violated, then to implement a LUT-based FSM circuit, various methods of functional decomposition should be applied [31,42,43]. To implement a circuit, an original function $\varphi_k \in \Phi \cup Y$ is broken down into sub-functions for which the number of arguments does not exceed $I_L$. Each sub-function differs from the initial function $\varphi_k \in \Phi \cup Y$ [42]. The decomposition should be executed in a way that increases the number of LUT levels of the final FSM circuit as little as possible [31]. The methods of FD are used by both academic and industrial CAD tools dealing with FPGA-based design. Unfortunately, this approach has a serious drawback: FD-based FSM circuits have complicated systems of “spaghetti-type” interconnections [21]. This drawback manifests itself in an increase in both the cycle time and the power consumption of the resulting FSM circuit [59].
The methods of SD [21] can be viewed as an alternative to the methods of FD. The main goal of SD-based methods is the elimination of the direct connection between the variables $x_l \in X$ and $T_r \in T$, on the one hand, and the functions $y_n \in Y$ and $D_r \in \Phi$, on the other hand. To achieve this goal, the block of functions (Figure 1) is represented as a composition of several logic blocks. As a rule, there are from two to four logic blocks [21]. This approach increases the number of implemented functions. However, these new functions depend on significantly fewer arguments than the functions $\varphi_k \in \Phi \cup Y$.
The first known methods of SD were proposed in the mid-20th century by Prof. M. Wilkes [60]. These methods are the replacement of inputs and encoding of COs. In [23], we propose the joint use of these methods for optimization of LUT-based Mealy FSMs’ circuits. The main ideas of these methods are shown below.
The first method reduces to the replacement of the set $X = \{x_1, \ldots, x_L\}$ by a set of additional variables $B = \{b_1, \ldots, b_J\}$. This makes sense if the following condition holds: $J \ll L$. The replacement is based on creating a system of additional functions
$$B = B(T, X). \tag{6}$$
In the case of LUT-based FSMs, these functions can be implemented with such resources of CLBs as LUTs and dedicated multiplexors [28].
The second method represents $Q$ different COs $Y_q \subseteq Y$ by binary codes $K(Y_q)$. To do this, elements of an additional set $Z = \{z_1, \ldots, z_{R_Q}\}$ are used. The minimum number of bits in the codes $K(Y_q)$ can be found as
$$R_Q = \lceil \log_2 Q \rceil. \tag{7}$$
The following SBFs should be obtained to encode COs:
$$Z = Z(T, X); \tag{8}$$
$$Y = Y(Z). \tag{9}$$
The SBFs (8) and (9) are implemented using LUTs. To implement the system (9), it is necessary to organize LUTs as decoders.
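The decoder role of system (9) can be illustrated with a minimal Python sketch. The three collections and their two-bit codes below are hypothetical values chosen only for this example; they are not taken from the paper. The sketch builds, for each output, the set of CO codes that assert it, and then decodes a given code.

```python
def output_decoder(co_codes: dict, collections: dict) -> dict:
    """Build Y = Y(Z): map each output y_n to the set of CO codes that assert it."""
    table = {}
    for name, outputs in collections.items():
        for y in outputs:
            table.setdefault(y, set()).add(co_codes[name])
    return table


def decode(table: dict, z_code: str) -> list:
    """Return the outputs equal to 1 when the variables z_r carry the code z_code."""
    return sorted(y for y, codes in table.items() if z_code in codes)


if __name__ == "__main__":
    # Hypothetical collections of outputs and hypothetical codes K(Y_q).
    collections = {"Y1": set(), "Y2": {"y1", "y2"}, "Y3": {"y3"}}
    co_codes = {"Y1": "00", "Y2": "01", "Y3": "10"}
    table = output_decoder(co_codes, collections)
    print(decode(table, "01"))
```

Here $Q = 3$, so formula (7) gives $R_Q = 2$ bits, which is why two-bit codes suffice in the example.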
As shown in [23], combining these two methods requires the following additional SBFs:
$$\Phi = \Phi(T, B); \tag{10}$$
$$Z = Z(T, B). \tag{11}$$
The SBFs (6) and (9)–(11) determine a structural diagram of LUT-based MPY Mealy FSM (Figure 4).
In MPY Mealy FSM, LUTerIR executes the replacement of FSM inputs. Therefore, it implements SBF (6). The additional variables $b_j \in B$ enter LUTerZT, which implements SBFs (10) and (11). The IMFs $D_r \in \Phi$ enter the state code register SCR hidden inside LUTerZT. Finally, LUTerY transforms the additional variables $z_r \in Z$ into the functions $y_n \in Y$.
We discuss a case when the logic blocks of MPY FSMs are implemented using internal resources of CLBs, inter-slice interconnections, programmable chip input–outputs and synchronization tree buffers [28]. The basic characteristics of equivalent P and MPY FSMs are compared in [23]. The research results obtained in [23] show that the joint use of the discussed methods of SD improves the characteristics of LUT-based Mealy FSM circuits.
In this paper, we propose to transform the CO codes into both the output functions $y_n \in Y$ and state variables $T_r \in T$. Moreover, we propose a new type of state code which allows the optimization of a circuit generating functions $z_r \in Z$.

4. Main Idea of the Proposed Method

Our main idea is illustrated by Figure 5.
The transition $\langle a_2, a_3 \rangle$ (Figure 5a) is caused by the input $x_4$. This transition is accompanied by producing a CO $Y_2$. For the next instant of FSM time, this CO (we denote it as $Y_m$) indicates the relation $a_m = a_3$. If $X_h = x_1$, then $a_s = a_6$ and $Y_s = Y_5$. So, the transition $\langle a_3, a_6 \rangle$ can be indicated by the pair $\langle Y_2, Y_5 \rangle$. Using similar reasoning, it is possible to show that the transition $\langle a_3, a_7 \rangle$ can be indicated by the pair $\langle Y_2, Y_7 \rangle$. To show how many COs are generated during transitions to a state $a_m \in A$, we use the symbol $Q_m$. There is $Q_m = 1$ for the case represented by Figure 5a. The case with $Q_m > 1$ is illustrated by Figure 5b. Two COs ($Y_3$ and $Y_6$) are generated during transitions into the state $a_4$. So, $Q_4 = 2$. Now, the same transition $\langle a_4, a_6 \rangle$ is represented by two pairs, namely, $\langle Y_3, Y_5 \rangle$ and $\langle Y_6, Y_5 \rangle$.
This analysis shows that transitions $\langle a_m, a_s \rangle$ can be represented by pairs $\langle Y_m, Y_s \rangle$. Using this result, we propose a PZ Mealy FSM, the structural diagram of which is shown in Figure 6.
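The correspondence between transitions and pairs of output collections can be sketched in Python as follows. The function name `output_pairs` and the tuple-based transition format are assumptions of this example; the predecessor states `p` and `q` in the second case are hypothetical placeholders, since only the COs entering $a_4$ are named in the text.

```python
from collections import defaultdict


def output_pairs(transitions: list) -> dict:
    """Map each transition (a_m, a_s) to the set of pairs (Y_m, Y_s) that indicate it.

    transitions: list of (current_state, next_state, output_collection) tuples.
    """
    entering = defaultdict(set)  # state -> COs produced on transitions into it
    for _, nxt, co in transitions:
        entering[nxt].add(co)

    pairs = defaultdict(set)
    for cur, nxt, co in transitions:
        for y_m in entering[cur]:  # every CO that can precede this transition
            pairs[(cur, nxt)].add((y_m, co))
    return dict(pairs)


if __name__ == "__main__":
    # Case of Figure 5a: a single CO (Y2) is produced on entering a3.
    fig_5a = [("a2", "a3", "Y2"), ("a3", "a6", "Y5"), ("a3", "a7", "Y7")]
    print(output_pairs(fig_5a))
    # Case of Figure 5b: two COs (Y3, Y6) are produced on entering a4.
    fig_5b = [("p", "a4", "Y3"), ("q", "a4", "Y6"), ("a4", "a6", "Y5")]
    print(output_pairs(fig_5b))
```

A transition out of a state with $Q_m$ entering COs is represented by $Q_m$ pairs, which is why the second case yields two pairs for the same transition.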
There are two registers in PZ Mealy FSM. The register RZ keeps a code of the CO $Y_s \subseteq Y$ represented by variables $z_r \in Z = \{z_1, \ldots, z_{R_Q}\}$. The register RV keeps a code of the CO $Y_m \subseteq Y$ represented by variables $v_r \in V = \{v_1, \ldots, v_{R_Q}\}$. Obviously, these registers have $R_Q$ D flip-flops each, where the value of $R_Q$ is determined by (7). The registers are controlled by the same pulses $Clock$ and $Start$. So, they can be viewed as $R_Q$ single-bit shift registers. A BlockΨ generates additional variables $D_r \in \Psi = \{D_1, \ldots, D_{R_Q}\}$ used to load the code $K(Y_s)$ into RZ. The system $\Psi$ is represented as
$$\Psi = \Psi(T, X). \tag{12}$$
In each cycle, the current codes of COs $Y_m$ and $Y_s$ are kept in the registers. A BlockZ generates FSM outputs represented by SBF (9). The contents of these registers are converted into a transition state code by a BlockT. To do this, the SBF
$$T = T(Z, V) \tag{13}$$
is implemented by the BlockT.
Such an approach allows the exclusion of FSM input variables $x_l \in X$ from both FSM output functions and IMFs. Moreover, the outputs $y_n \in Y$ are registered. So, they do not depend on possible fluctuations of inputs [21] during any cycle of FSM operation. As a rule, this stability is achieved by using an additional register having $N$ flip-flops controlled by an additional synchronization pulse.
We discuss a case when an FSM circuit is implemented using slices similar to the ones present in Virtex-7 of Xilinx [35,36]. In this case, the number of flip-flops is twice the number of LUTs per slice. Each pair of flip-flops can be connected to form the shift register discussed before. So, in the same SLICEL, there are resources to produce both functions (12), as well as the additional variables $z_r \in Z$ and $v_r \in V$.
If the condition (4) is violated for functions $z_r \in Z$, then the circuit of BlockΨ is multi-level. To implement it, the methods of FD should be applied. To avoid applying FD, we propose a model of PCZ Mealy FSM. The model is based on using the class-state codes proposed in this paper.
If the condition (4) is violated for functions $z_r \in Z$, then we propose to create a partition $\Pi_A = \{A^1, \ldots, A^K\}$ of the set $A$. Each class $A^k \in \Pi_A$ determines two sets. A set $X^k \subseteq X$ includes $L_k$ FSM inputs causing transitions from states $a_m \in A^k$. A set $Z^k \subseteq Z$ consists of additional variables $z_r \in Z$ generated during these transitions. There are $M_k$ elements in the class $A^k \in \Pi_A$.
Using ideas from the articles [39,40], we propose to encode states $a_m \in A^k$ by codes $SC(a_m)$ having $R_S$ bits. The following formula determines the value of $R_S$:
$$R_S = \max(\lceil \log_2 M_1 \rceil, \ldots, \lceil \log_2 M_K \rceil). \tag{14}$$
The partition $\Pi_A$ should be created in such a way that the following condition holds for each class $A^k \in \Pi_A$:
$$R_S + L_k \le I_L. \tag{15}$$
To create a CSC, it is necessary to encode classes $A^k \in \Pi_A$ by class codes $CC(A^k)$ having $R_C$ bits:
$$R_C = \lceil \log_2 K \rceil. \tag{16}$$
Now, a state $a_m \in A^k$ is represented by its class-state code
$$CSC(a_m) = CC(A^k) * SC(a_m). \tag{17}$$
In (17), the symbol “*” stands for the concatenation of codes.
To encode the classes, we use class variables $T_r \in T_B$, where $R_C = |T_B|$. To encode the states as class elements, we use state variables $T_r \in T_A$, where $R_S = |T_A|$. These sets create a set $T = T_B \cup T_A$ having $R_T = R_C + R_S$ elements. The first $R_C$ elements of $T$ create codes of classes; the next $R_S$ variables create state codes $SC(a_m)$.
Using this encoding style, we propose a structural diagram of LUT-based PCZ Mealy FSM (Figure 7).
In PCZ Mealy FSM, a block LUTerk corresponds to the class $A^k \in \Pi_A$. It implements an SBF
$$Z^k = Z^k(T_A, X^k) \quad (k \in \{1, \ldots, K\}). \tag{18}$$
A block LUTerZV includes CLBs and hidden distributed registers RZ and RV. It implements the SBF
$$Z = Z(T_B, Z^1, \ldots, Z^K). \tag{19}$$
The variables $v_r \in V$ repeat the values of variables $z_r \in Z$ produced in the previous FSM operation cycle. A block LUTerY implements SBF (9). Finally, a block LUTerT generates CSCs. To do this, the block implements the SBF
$$T = T(Z, V). \tag{20}$$
In this paper, we propose a synthesis method for PCZ-based Mealy FSMs. The synthesis process starts from an STG. The proposed method includes the following steps:
  1. Constructing an STT corresponding to an initial STG.
  2. Encoding of FSM states by maximum binary codes $K(a_m)$.
  3. Encoding of collections of outputs $Y_q \subseteq Y$ by binary codes $K(Y_q)$.
  4. Creating the SBF $Y = Y(Z)$.
  5. Creating the modified direct structure table of PZ Mealy FSM.
  6. Creating a table of pairs $P_g = \langle Y_i, Y_j \rangle$ corresponding to pairs $\langle a_m, X_h \rangle$.
  7. Creating the partition $\Pi_A$ with the minimum number of classes, $K$.
  8. Encoding of classes and states to obtain class-state codes.
  9. Creating tables representing blocks LUTer1–LUTerK and SBFs (18).
  10. Creating the table of LUTerZV and SBF (19).
  11. Creating the table of LUTerT and SBF (20).
  12. Implementing the CLB-based circuit of PCZ Mealy FSM.

5. Example of Synthesis

We use the symbol $PCZ(S_a)$ to show that the model of PCZ Mealy FSM is used to obtain a logic circuit of an FSM $S_a$. This Section is devoted to the synthesis of Mealy FSM $PCZ(S_1)$. To implement the circuit, 5-LUTs are used. We start the synthesis process from an STG (Figure 8).
The following sets can be found from the STG (Figure 8): $A = \{a_1, \ldots, a_8\}$, $X = \{x_1, \ldots, x_6\}$ and $Y = \{y_1, \ldots, y_8\}$. So, the FSM $S_1$ has the following characteristics: $M = 8$, $L = 6$, and $N = 8$. There are $H = 17$ arcs connecting the nodes of the STG (Figure 8). So, there are 17 rows in the STT (and DST) of FSM $S_1$.
Step 1. The transformation of an STG into an equivalent STT is executed in the trivial way [27]. As follows from Figure 2, the $h$-th arc of the STG determines the $h$-th row of the corresponding STT ($h \in \{1, \ldots, H\}$). The STT of Mealy FSM $S_1$ is represented by Table 1.
Step 2. For FSM $S_1$, there is $M = 8$. Using (1) gives $R = 3$. This determines the set of state variables $T = \{T_1, T_2, T_3\}$. To simplify the presentation of our method, the states are encoded in the trivial way: $K(a_1) = 000$, $K(a_2) = 001$, …, $K(a_8) = 111$.
Step 3. The analysis of Table 1 allows finding $Q = 9$ different collections $Y_q \subseteq Y$. These COs are the following: $Y_1 = \emptyset$, $Y_2 = \{y_1, y_2\}$, $Y_3 = \{y_3\}$, $Y_4 = \{y_1, y_4\}$, $Y_5 = \{y_3, y_6\}$, $Y_6 = \{y_4\}$, $Y_7 = \{y_5, y_7\}$, $Y_8 = \{y_3, y_8\}$ and $Y_9 = \{y_4, y_5\}$. Using (7) gives $R_Q = 4$ and the set $Z = \{z_1, \ldots, z_4\}$.
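The numbers in Step 3 can be reproduced with a short Python sketch. The collections are copied from the list above; the dictionary name `COLLECTIONS` and the integer-based computation of the ceiling of the logarithm are choices made only for this illustration.

```python
# Collections of outputs of FSM S1 as listed in Step 3.
COLLECTIONS = {
    "Y1": frozenset(),
    "Y2": frozenset({"y1", "y2"}),
    "Y3": frozenset({"y3"}),
    "Y4": frozenset({"y1", "y4"}),
    "Y5": frozenset({"y3", "y6"}),
    "Y6": frozenset({"y4"}),
    "Y7": frozenset({"y5", "y7"}),
    "Y8": frozenset({"y3", "y8"}),
    "Y9": frozenset({"y4", "y5"}),
}

# Q is the number of distinct collections; identical sets would be merged here.
Q = len(set(COLLECTIONS.values()))

# R_Q = ceil(log2(Q)) by formula (7), computed with integers only.
R_Q = (Q - 1).bit_length()

# The additional variables z_1, ..., z_{R_Q} that encode the collections.
Z = [f"z{r}" for r in range(1, R_Q + 1)]

if __name__ == "__main__":
    print(Q, R_Q, Z)
```

Nine collections need four bits, so seven of the sixteen possible codes remain unused and can be treated as don't-care combinations during minimization.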
As shown in [21], COs should be encoded in a way that minimizes the number of literals in SBF (9). If the condition
$$R_Q > I_L \tag{21}$$
holds, then such an approach could minimize the LUT count for LUTerY [21]. If (21) is violated, this method of encoding reduces the number of interconnections [21]. This reduces chip areas occupied by LUT-based FSM circuits [23].
To encode COs, we use the approach proposed in [61]. The outcome of encoding is shown in Figure 9.
Step 4. Using the codes of COs (Figure 9) gives the following SBF:
$$\begin{aligned}
y_1 &= Y_2 \lor Y_4; \\
y_2 &= Y_2 = z_2 \bar{z}_3; \\
y_3 &= Y_3 \lor Y_5 \lor Y_8 = z_4; \\
y_4 &= Y_4 \lor Y_6 \lor Y_5 = z_3 \bar{z}_4; \\
y_5 &= Y_7 \lor Y_9 = z_1 \bar{z}_4; \\
y_6 &= Y_5 \lor Y_8 = z_1 z_4; \\
y_7 &= Y_7 = z_1 z_3 z_4; \\
y_8 &= Y_8 = z_3 z_4.
\end{aligned} \tag{22}$$
The analysis of (22) shows that there are 15 literals in this system. So, there are 15 interconnections between the blocks LUTerZV and LUTerY. Obviously, the maximum number of these interconnections is equal to $N \cdot R_Q$ [21]. In the discussed case, $N \cdot R_Q = 32$. So, the number of interconnections is reduced by a factor of 2.13 due to applying the approach [61].
If condition (21) is violated, then there are $N$ LUTs in the circuit of LUTerY. The analysis of (22) shows that the SOPs of functions $y_1$ and $y_3$ have a single literal. So, these functions are produced by LUTs of LUTerZV. So, there are $N - 2 = 6$ LUTs in the circuit of LUTerY of FSM $PCZ(S_1)$. Thus, the number of LUTs is reduced by a factor of 1.33 due to applying the approach [61]. This is a positive side effect of the method [61].
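The two reduction factors quoted in Step 4 follow from simple arithmetic, which the Python sketch below reproduces. The variable names are illustrative; the literal count of 15 and the two single-literal functions are taken from the text.

```python
N = 8            # number of FSM outputs
R_Q = 4          # number of bits in the codes of collections of outputs
LITERALS = 15    # literals in system (22), as stated in the text

# Upper bound on interconnections between LUTerZV and LUTerY.
max_links = N * R_Q
link_gain = max_links / LITERALS

# Functions y1 and y3 are single literals, so they need no LUT in LUTerY.
single_literal_functions = 2
luts_in_luter_y = N - single_literal_functions
lut_gain = N / luts_in_luter_y

if __name__ == "__main__":
    print(max_links, round(link_gain, 2))
    print(luts_in_luter_y, round(lut_gain, 2))
```

Both factors are ratios against the unoptimized bound, so they describe the block LUTerY only and not the whole FSM circuit.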
Step 5. The columns of a classical DST [27] are shown in Figure 2c. We have modified the traditional DST: the column $Y_h$ is replaced by a column $Z_h$ (Table 2). This table determines the Mealy FSM $PZ(S_1)$.
The column $Z_h$ contains a variable $z_r \in Z$ if the $r$-th bit of $K(Y_q)$ is equal to 1 (we assume that the CO $Y_q \subseteq Y$ is written in the $h$-th row of the STT). For example, there is the CO $Y_3$ in the second row of Table 1. As follows from Figure 9, $K(Y_3) = 0001$. Due to this, there is the symbol $z_4$ in the second row of Table 2. All other rows of the column $Z_h$ are filled in the same manner.
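Filling the column $Z_h$ is a direct translation of a code into the variables whose bits equal 1. A minimal Python sketch, with the hypothetical helper name `code_to_variables`, is:

```python
def code_to_variables(code: str, prefix: str = "z") -> list:
    """Return the variables whose bit equals 1 in the code, e.g. '0001' -> ['z4']."""
    return [f"{prefix}{r}" for r, bit in enumerate(code, start=1) if bit == "1"]


if __name__ == "__main__":
    print(code_to_variables("0001"))  # entry of Z_h for the CO Y3 with K(Y3) = 0001
```

The all-zero code produces an empty entry, which matches a row where no variable $z_r$ has to be set.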
Step 6. A table of pairs $P_g = \langle Y_i, Y_j \rangle$ shows the correspondence between these pairs and the pairs $\langle a_m, X_h \rangle$. It includes the following columns: $a_m$ (a current FSM state); $a_s$ (a state of transition); $Y_m$ and $Y_s$ (COs produced during the transitions into the states $a_m$ and $a_s$, respectively); $P_g$ (a pair $\langle Y_m, Y_s \rangle$); $g$ (the number of a pair $P_g$, $g \in \{1, \ldots, G\}$). In the discussed case, $G = 29$. These pairs are represented by Table 3.
Step 7. In the discussed example, using the methods [39,40], the partition $\Pi_A = \{A^1, A^2\}$ can be found. The states $a_m \in A$ are distributed between the classes as follows: $A^1 = \{a_1, a_2, a_4, a_8\}$ and $A^2 = \{a_3, a_5, a_6, a_7\}$. The partition determines the following sets: $X^1 = \{x_1, x_2, x_3\}$, $X^2 = \{x_4, x_5, x_6\}$, and $Z^1 = Z^2 = Z$.
So, there is $K = 2$, $M_1 = M_2 = 4$ and $L_1 = L_2 = 3$. Using (14) gives the number of state variables $R_S = 2$. To implement the circuit of $PCZ(S_1)$, LUTs having $I_L = 5$ inputs are used. Because the relation (15) holds for each class $A^k \in \Pi_A$, this partition satisfies the previously discussed requirements.
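The check performed in Step 7 can be sketched in Python. The class and input sets are copied from the partition given above; the function names `state_bits` and `partition_fits` are hypothetical and serve only this illustration.

```python
def state_bits(classes: list) -> int:
    """Return R_S: the largest ceil(log2(M_k)) over all classes (formula (14))."""
    return max((len(states) - 1).bit_length() for states in classes)


def partition_fits(classes: list, class_inputs: list, lut_inputs: int) -> bool:
    """Check condition (15), R_S + L_k <= I_L, for every class of the partition."""
    r_s = state_bits(classes)
    return all(r_s + len(inputs) <= lut_inputs for inputs in class_inputs)


if __name__ == "__main__":
    classes = [{"a1", "a2", "a4", "a8"}, {"a3", "a5", "a6", "a7"}]
    class_inputs = [{"x1", "x2", "x3"}, {"x4", "x5", "x6"}]
    print(state_bits(classes))                       # R_S for the example
    print(partition_fits(classes, class_inputs, 5))  # condition (15) with 5-LUTs
```

With 5-LUTs the condition holds with no spare input; with 4-LUTs the same partition would be rejected and more classes would be needed.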
Step 8. In the discussed example, there are $K = 2$ classes $A^k \in \Pi_A$. Using (16) gives $R_C = 1$ and $T_B = \{T_1\}$. Because $R_S = 2$, the state variables form the set $T_A = \{T_2, T_3\}$. So, there is the set $T = \{T_1, T_2, T_3\}$. The class-state codes are shown in Figure 10.
For example, the following codes can be found from Figure 10: $SC(a_2) = 01$, $CC(A^1) = 0$, $CSC(a_2) = 001$, $SC(a_5) = 01$, $CC(A^2) = 1$, $CSC(a_5) = 101$, and so on. These codes are used for creating SBFs (18)–(20).
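The composition of a class–state code from a class code and a state code can be illustrated as follows. This is a minimal sketch, assuming that $CSC(a_m)$ is the concatenation of $CC(A^k)$ (variable $T_1$) and $SC(a_m)$ (variables $T_2 T_3$), which is what the example codes above suggest. The dictionary and function names are hypothetical, and the state codes are taken from Tables 4 and 5.

```python
# Class codes CC(A^k): a single bit carried by T1.
CLASS_CODES = {"A1": "0", "A2": "1"}

# Class membership of each state (partition from Step 7).
STATE_CLASS = {
    "a1": "A1", "a2": "A1", "a4": "A1", "a8": "A1",
    "a3": "A2", "a5": "A2", "a6": "A2", "a7": "A2",
}

# State codes SC(a_m) inside a class: two bits carried by T2 T3.
STATE_CODES = {
    "a1": "00", "a2": "01", "a4": "10", "a8": "11",
    "a3": "00", "a5": "01", "a6": "10", "a7": "11",
}


def csc(state):
    """Class-state code: class code followed by the in-class state code."""
    return CLASS_CODES[STATE_CLASS[state]] + STATE_CODES[state]


print(csc("a2"), csc("a5"))  # 001 101
```

Although the in-class codes $SC(a_2)$ and $SC(a_5)$ coincide, the leading class bit keeps all eight composite codes distinct.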
Step 9. The tables of $LUTer^1$–$LUTer^2$ are created using the modified DST (Table 2) and the state codes from Figure 10. Each table includes the columns $a_m$, $SC(a_m)$, $X_h^1$, $Z_h^1$, $h$. The block $LUTerZ^1$ is represented by Table 4, and the block $LUTerZ^2$ by Table 5.
These tables are used for deriving SBFs (18). For example, the following equations can be derived for the functions $z_1^1$ (from Table 4) and $z_1^2$ (from Table 5):
$$z_1^1 = \bar{T}_2 T_3 \bar{x}_2 \bar{x}_3 \vee T_2 \bar{T}_3 \bar{x}_3 \vee T_2 T_3; \quad z_1^2 = \bar{T}_2 \bar{T}_3 \vee \bar{T}_2 T_3 \bar{x}_4. \tag{23}$$
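The equation for $z_1^1$ can be cross-checked against Table 4 by exhaustive enumeration. The sketch below is illustrative only: the helper names are hypothetical, and the three table rows are those rows of Table 4 in which $z_1$ appears (rows 5, 7 and 8).

```python
from itertools import product

# Rows of Table 4 containing z1: (state code T2 T3, required input values).
Z1_ROWS = [
    ("01", {"x2": 0, "x3": 0}),  # row 5: state a2, inputs not-x2 and not-x3
    ("10", {"x3": 0}),           # row 7: state a4, input not-x3
    ("11", {}),                  # row 8: state a8, unconditional
]


def inv(v):
    """Logical negation of a 0/1 value."""
    return 1 - v


def z1_from_table(t2, t3, x):
    """z1 = 1 if some table row matches the state code and the inputs."""
    sc = f"{t2}{t3}"
    return int(any(
        sc == code and all(x[name] == val for name, val in cond.items())
        for code, cond in Z1_ROWS
    ))


def z1_from_sop(t2, t3, x):
    """Sum-of-products form of z1 for the first class of states."""
    return (
        (inv(t2) & t3 & inv(x["x2"]) & inv(x["x3"]))
        | (t2 & inv(t3) & inv(x["x3"]))
        | (t2 & t3)
    )


def forms_agree():
    """Compare both forms on all 16 combinations of T2, T3, x2, x3."""
    for t2, t3, x2, x3 in product((0, 1), repeat=4):
        x = {"x2": x2, "x3": x3}
        if z1_from_table(t2, t3, x) != z1_from_sop(t2, t3, x):
            return False
    return True


print(forms_agree())  # True
```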
Step 10. The table of $LUTerZV$ includes the following columns: $z_r$ (a function generated by $LUTerZV$); $LUTer^k$; $r$ (the subscript of the corresponding function). If a partial function $z_r^k$ appears in the table of $LUTer^k$, then there is 1 at the intersection of the row $z_r$ and the column $k$. In the discussed case, $LUTerZV$ is represented by Table 6.
The following SBF is derived from Table 6:
$$z_1 = \bar{T}_1 z_1^1 \vee T_1 z_1^2; \quad z_2 = \bar{T}_1 z_2^1 \vee T_1 z_2^2; \quad z_3 = \bar{T}_1 z_3^1 \vee T_1 z_3^2; \quad z_4 = \bar{T}_1 z_4^1 \vee T_1 z_4^2. \tag{24}$$
Step 11. The table of $LUTerT$ is constructed using the table of pairs of COs (Table 3) and the codes of COs (Figure 9). This table includes the columns $Y_m$, $K(Y_m)$, $Y_s$, $K(Y_s)$, $a_s$, $CSC(a_s)$, $T(a_s)$, $g$. The $g$-th row of this table corresponds to the $g$-th row of the table of pairs. The column $T(a_s)$ includes the IMFs equal to 1 to create the code $CSC(a_s)$. In the discussed case, $LUTerT$ is represented by Table 7.
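This table is a base for creating SBF (20). For example, the following SOP can be derived from Table 7:
$$T_1 = E_2 \vee E_5 \vee E_6 \vee E_9 \vee E_{10} \vee E_{11} \vee E_{14} \vee E_{15} \vee E_{16} \vee E_{17} \vee E_{24} \vee E_{27} \vee E_{28} \vee E_{29} = \bar{v}_1 \bar{v}_2 \bar{v}_3 \bar{v}_4 \bar{z}_1 \bar{z}_2 \bar{z}_3 z_4 \vee \cdots \vee \bar{v}_1 v_2 \bar{v}_3 \bar{v}_4 z_1 \bar{z}_2 z_3 z_4. \tag{25}$$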
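The list of terms in the SOP for $T_1$ can be reproduced from Table 7 with a short script. This is a sketch under two assumptions: a term $E_g$ enters the function $T_r$ when the corresponding bit of $CSC(a_s)$ in the $g$-th row equals 1, and the variables $v_1$–$v_4$ carry $K(Y_m)$ while $z_1$–$z_4$ carry $K(Y_s)$. All names (`NEXT_STATE`, `CSC`, `terms_for`, `conjunction`) are hypothetical.

```python
# State of transition a_s for rows g = 1..29 of Table 7.
NEXT_STATE = [
    "a2", "a3", "a2", "a2", "a5", "a5", "a4", "a4", "a6", "a5",
    "a5", "a8", "a8", "a5", "a5", "a7", "a7", "a4", "a4", "a8",
    "a8", "a1", "a1", "a5", "a8", "a8", "a6", "a6", "a6",
]

# Class-state codes CSC(a_s), bits T1 T2 T3, as listed in Table 7.
CSC = {
    "a1": "000", "a2": "001", "a3": "100", "a4": "010",
    "a5": "101", "a6": "110", "a7": "111", "a8": "011",
}


def terms_for(bit_index):
    """Row numbers g whose CSC(a_s) has a 1 at bit_index (0 means T1)."""
    return [g for g, state in enumerate(NEXT_STATE, start=1)
            if CSC[state][bit_index] == "1"]


def conjunction(k_ym, k_ys):
    """Literal string of a term E_g; '~' marks a complemented variable."""
    literals = []
    for name, code in (("v", k_ym), ("z", k_ys)):
        for i, bit in enumerate(code, start=1):
            literals.append(f"{name}{i}" if bit == "1" else f"~{name}{i}")
    return " ".join(literals)


print(terms_for(0))
# [2, 5, 6, 9, 10, 11, 14, 15, 16, 17, 24, 27, 28, 29]
print(conjunction("0100", "1011"))  # term E29
# ~v1 v2 ~v3 ~v4 z1 ~z2 z3 z4
```

Calling `terms_for(1)` or `terms_for(2)` gives the corresponding row lists for $T_2$ and $T_3$.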
Step 12. Using the obtained SBFs, we can implement the logic circuit of the Mealy FSM $P_{CZ}(S_1)$. This circuit includes 24 LUTs, each having 5 inputs. The circuit is shown in Figure 11.
The first logic level of the circuit includes $2R_Q = 8$ LUTs. As follows from Table 4, there are 4 LUTs in the circuit of $LUTerZ^1$ (LUT1–LUT4). As follows from Table 5, there are 4 LUTs in the circuit of $LUTerZ^2$ (LUT5–LUT8).
The second level includes $R_Q = 4$ LUTs. This follows from either Table 6 or SBF (24).
The third logic level includes two logic blocks ($LUTerY$ and $LUTerT$) operating in parallel. As follows from SBF (22), there are 6 LUTs in the circuit of $LUTerY$. This circuit includes LUT13–LUT18.
For the discussed case, the condition
$$2R_Q > I_L \tag{26}$$
holds. Therefore, there are 2 LUTs in the circuit implementing any equation for $T_r \in T$. For example, the circuit for $T_1 \in T$ is a serial connection of LUT19 and LUT20. There are $2(R_C + R_S) = 6$ LUTs in the circuit of $LUTerT$. To improve the timing characteristics of $LUTerT$, the LUT pairs (LUT19–LUT20, LUT21–LUT22, and LUT23–LUT24) can be connected using the dedicated multiplexer [28].
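The LUT count of the example circuit can be tallied level by level. The function below is a hypothetical helper that only restates the arithmetic of Step 12 for a partition with two classes. The branch for $2R_Q \le I_L$ (one LUT per function $T_r$) is an assumption, not a statement from the text, and for larger $R_Q$ more than two LUTs per function may be needed.

```python
def lut_count(r_q, luter_y_luts, r_c, r_s, i_l):
    """Tally the LUTs of the example circuit for a two-class partition."""
    first_level = 2 * r_q   # LUTerZ^1 and LUTerZ^2, R_Q LUTs each
    second_level = r_q      # LUTerZV, one LUT per function z_r
    # Each T_r depends on 2*R_Q variables; two LUTs if they do not fit
    # into one LUT (the single-LUT case is an assumption of this sketch).
    luts_per_t = 2 if 2 * r_q > i_l else 1
    luter_t = luts_per_t * (r_c + r_s)
    return first_level + second_level + luter_y_luts + luter_t


# Example FSM S1: R_Q = 4, six LUTs in LUTerY, R_C = 1, R_S = 2, I_L = 5.
print(lut_count(4, 6, 1, 2, 5))  # 24
```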
To obtain LUT-based FSM circuits, the step of technology mapping [42] should be executed. Technology mapping is performed using industrial CAD tools. If an FSM circuit is based on the internal resources of Virtex-7, the industrial package Vivado [55] should be used. Vivado executes the steps of mapping, placement, routing, and testing, and reports such characteristics of a circuit as the numbers of LUTs, slices, and flip-flops, as well as the maximum operating frequency and power consumption.

6. Experimental Results

In this section, we present the results of experiments conducted using the industrial CAD package Vivado and the library of standard benchmark (BM) FSMs [62]. In these experiments, we compared the characteristics of $P_{CZ}$-based Mealy FSMs with those of FSM circuits based on some other models. The library [62] includes 48 BMs represented by STTs in the KISS2 format. These benchmarks cover a wide range of such characteristics as the numbers of states, inputs, transitions and outputs. The results of research based on this library, as well as the BM characteristics, can be found in many articles.
The research was conducted using a personal computer with the following characteristics: CPU: Intel Core i7 6700K 4.2@4.4 GHz; memory: 16 GB RAM 2400 MHz CL15. To implement CLB-based circuits, we used the Virtex-7 VC709 Evaluation Platform (xc7vx690tffg1761-2) [63]. The package Vivado v2019.1 (64-bit) of Xilinx [55] was used for the implementation of FSM circuits. The CLBs of this platform include 6-input LUTs. We used the reports of Vivado to create the tables with research results.
The created tables include such parameters of FSM circuits as the LUT counts and maximum operating frequencies. The following FSM models have been used in our experiments: (1) Auto of Vivado (the state codes of these FSMs have $R = \lceil \log_2 M \rceil$ bits); (2) One-hot of Vivado (the state codes have $R = M$ bits); (3) JEDI; (4) $MP_Y$-based FSMs [23]; and (5) $P_{CZ}$-based FSMs.
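The difference between the two Vivado encoding styles in the number of state variables can be illustrated as follows. This sketch covers only the two formulas given above; the function names are hypothetical, and $M = 8$ corresponds to the example FSM $S_1$.

```python
def auto_bits(num_states):
    """Minimum-length binary code: ceil(log2(M)) bits, integer arithmetic."""
    return (num_states - 1).bit_length()


def one_hot_bits(num_states):
    """One-hot code: one bit per state."""
    return num_states


# The example FSM S1 has M = 8 states.
print(auto_bits(8), one_hot_bits(8))  # 3 8
```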
As in [23], we have divided the BMs into five sets denoted as BM1–BM5. Membership in a particular set is determined by the relation between $L + R$ and $I_L$. In the discussed case, $I_L = 6$. The number $j$ of a set is determined as
$$j = \left\lceil \frac{L + R}{I_L} \right\rceil. \tag{27}$$
The value of (27) determines a set $BM_j$ ($j \in \{1, \ldots, 5\}$). The distribution is shown in Table 8.
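The assignment of a benchmark to a set by Formula (27) can be sketched as below. The function name and the sample values of $L$ and $R$ are hypothetical and are not taken from any particular benchmark.

```python
import math


def benchmark_set(num_inputs, num_state_vars, lut_inputs=6):
    """Set number j = ceil((L + R) / I_L) according to Formula (27)."""
    return math.ceil((num_inputs + num_state_vars) / lut_inputs)


# Hypothetical FSMs: (L, R) = (2, 3) fits into one 6-LUT, (7, 4) does not.
print(benchmark_set(2, 3))  # 1
print(benchmark_set(7, 4))  # 2
```

A value $j = 1$ means that every function of the FSM circuit fits into a single LUT, which is the case discussed for the set BM1.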
The results of the experiments are shown in Tables 9–16, which share the same organization. The table columns are marked by the names of the FSM design methods, and the rows by the names of the benchmarks. Inside each table, the benchmarks are sorted by ascending value of $j$ and listed in alphabetical order within each set. The row “Total” contains the sum of the values in each column. The row “Percentage” expresses the total for each method as a percentage of the total for $P_{CZ}$-based FSMs. We use the Mealy $P$ model for all design methods except $MP_Y$ FSMs. The sets $BM_j$ are shown in the column “Set”.
These tables include the following information: (1) the numbers of LUTs for all BMs (Table 9); (2) the numbers of LUTs for BMs of the set BM1 (Table 10); (3) the numbers of LUTs for BMs of the set BM2 (Table 11); (4) the numbers of LUTs for BMs of sets BM3–BM5 (Table 12); (5) the maximum operating frequency for all BMs (Table 13); (6) the maximum operating frequency for BMs of the set BM1 (Table 14); (7) the maximum operating frequency for BMs of the set BM2 (Table 15); (8) the maximum operating frequency for BMs of the sets BM3–BM5 (Table 16). The following conclusions can be made from the analysis of these tables.
As follows from Table 9, our approach produces FSM circuits with fewer LUTs than the other investigated methods. Taking the total LUT count of $P_{CZ}$-based FSMs as 100%, equivalent Auto-based FSMs require 50.42% more 6-LUTs, One-hot-based FSMs 75.04% more, and JEDI-based FSMs 23.88% more. As we expected, our approach also yields better LUT counts than equivalent $MP_Y$-based FSMs, with a gain of 10.07%. However, the analysis for the different sets of benchmarks shows that our method sometimes loses and sometimes wins. The amount of gain (or loss) depends on the set to which a particular BM belongs.
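The row “Percentage” of Table 9 can be recomputed from the row “Total”. The sketch below uses the totals from Table 9; the names `TOTALS`, `percentage` and `reduction` are hypothetical. It also shows the same comparison expressed as a reduction relative to the other method, which is a different measure from the percentages quoted above.

```python
# Totals of LUT counts from the row "Total" of Table 9.
TOTALS = {"Auto": 1808, "One-hot": 2104, "JEDI": 1489, "MPY": 1323, "PCZ": 1202}


def percentage(method, base="PCZ"):
    """Total of a method as a percentage of the total of the base method."""
    return round(100.0 * TOTALS[method] / TOTALS[base], 2)


def reduction(method, ours="PCZ"):
    """Share of LUTs saved by our approach, relative to the other method."""
    return 100.0 * (1.0 - TOTALS[ours] / TOTALS[method])


for name in ("Auto", "One-hot", "JEDI", "MPY"):
    print(name, percentage(name))
# Auto 150.42
# One-hot 175.04
# JEDI 123.88
# MPY 110.07
```

For instance, `reduction("Auto")` is about 33.5, so the same totals can be read either as Auto needing 50.42% more LUTs than our approach or as our approach needing roughly a third fewer LUTs than Auto.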
As follows from Table 10, our approach loses to three of the other investigated methods for the set BM1. The loss is 30.34% relative to Auto-based FSMs, 3.37% relative to One-hot-based FSMs, and 31.46% relative to JEDI-based FSMs. It is worth noting that the LUT counts are the same for equivalent $MP_Y$- and $P_{CZ}$-based FSMs. This is easily explained. If $j = 1$, then $L + R \le I_L$. In this case, LUT-based circuits of $P$ FSMs are single-level, so there is no sense in replacing inputs and encoding COs. However, the encoding of COs is executed for both $MP_Y$ and $P_{CZ}$ FSMs. Thus, their circuits include the redundant block $LUTerY$. This block consumes some chip resources and adds some delay to the FSM cycle time.
Analysis of Tables 11 and 12 shows that our approach leads to circuits with fewer LUTs compared with the other investigated methods. Compared with Auto-based FSMs, the gain in LUT counts is 27.67% (set BM2) or 68.55% (sets BM3–BM5). Compared with One-hot-based FSMs, the gain is 65.09% (set BM2) or 87.8% (sets BM3–BM5). Compared with JEDI-based FSMs, the gain is 5.97% (set BM2) or 37.23% (sets BM3–BM5). Compared with $MP_Y$-based FSMs, the gain is 1.26% (set BM2) or 14.72% (sets BM3–BM5). So, the gain from using $P_{CZ}$ FSMs increases with the growth of the value $L + R$.
As follows from Table 13, our approach produces slightly faster LUT-based FSM circuits compared to the Auto- and One-hot-based approaches. The gain is 4.48% and 5.25%, respectively. However, our approach is slightly inferior in performance to both JEDI-based FSMs (2.28%) and $MP_Y$-based FSMs (0.33%). The gain and loss vary depending on the value determined by Formula (27). For the set BM1 (Table 14), our approach loses to Auto-based FSMs (10.94%), One-hot-based FSMs (8.22%) and JEDI-based FSMs (11.94%). The same is true for $MP_Y$-based FSMs. This is explained by the existence of $LUTerY$, which is redundant for trivial FSMs. So, it does not make sense to use our approach for FSMs with $L + R \le I_L$.
Table 15 shows the results for the set BM2. As follows from Table 15, our approach produces faster circuits than both Auto- and One-hot-based FSMs (3.88% and 4% of gain, respectively). There is a loss of 0.97% relative to equivalent $MP_Y$-based FSMs. The JEDI-based FSMs win by 2.94%. So, JEDI-based FSMs are the fastest for BMs from BM2.
As follows from Table 16, our method produces the fastest FSM circuits for the sets BM3–BM5. The gain is 15.61% compared with Auto-based FSMs, 15.83% compared with One-hot-based FSMs, 5.04% compared with JEDI-based FSMs, and 0.19% compared with $MP_Y$-based FSMs. We believe that the gain compared to $MP_Y$-based FSMs is due to the fact that there are several levels of LUTs in the circuit of the block replacing FSM inputs.
So, the proposed approach allows the reduction of the LUT counts (and, therefore, of the chip area occupied by an FSM circuit) compared to equivalent $MP_Y$-based FSMs. At the same time, the gain in the number of LUTs grows with the increase in the total number of FSM inputs and state variables. The experimental results show that this gain in LUTs is not accompanied by a significant degradation in FSM operating frequency. Moreover, our approach produces slightly faster circuits for rather complex FSMs (those belonging to the sets BM3–BM5). As follows from the experimental results, $P_{CZ}$-based FSMs can replace the other investigated models starting from simple FSMs (the set BM2).

7. Conclusions

Today, FPGA chips are widely used for implementing circuits of finite state machines representing sequential blocks of various digital systems. The increasing complexity of digital systems leads to an increase in the complexity of their sequential block circuits. In turn, this leads to an increase in the values of such FSM parameters as the numbers of inputs, outputs, transitions and states. At the same time, there is a growing gap between the number of LUT inputs on the one hand, and the total number of state variables and FSM inputs on the other hand. Modern LUTs have no more than six inputs. However, the number of literals in the SOPs of functions representing FSM circuits significantly exceeds six. In these conditions, there is a need to apply various methods of functional decomposition for implementing LUT-based FSM circuits. As a result [42], the produced FSM circuits are multi-level and have complicated systems of spaghetti-type interconnections.
As follows from [21], in many cases, the structural decomposition of LUT-based FSM circuits improves their characteristics compared with equivalent FD-based FSMs. So, as shown in [23], the three-block SD-based FSM circuits require fewer LUTs than their FD-based counterparts. However, this reduction in LUT counts comes at the cost of introducing additional functions. To implement these functions, some internal resources of an FPGA chip are used. This is the main drawback of this approach.
It is known that the number of interconnections in a circuit is directly proportional to the LUT count. Interconnects have a significant impact on FSM performance and power consumption. Therefore, it is important to reduce the number of LUTs in the circuits of implemented blocks of digital systems. Modern very powerful FPGA chips are quite expensive. Many digital system designers may simply not have enough funds to purchase such expensive chips. Therefore, reducing the number of LUTs can make it possible to replace a more expensive chip with a cheaper one, where the number of elements will be sufficient to implement a system with optimized sequential blocks.
In this article, we propose to use the codes of collections of FSM outputs for generating both output functions and state variables. To do this, it is necessary to use two registers which keep these codes. The proposed method results in two-level FSM circuits which require fewer LUTs than their counterparts based on the approach [23]. On average, our approach gives a gain in LUT counts of about 10.07%. Note that the payoff in the number of LUTs increases with the increasing complexity of FSMs. Moreover, the proposed two-block FSMs have practically the same cycle times as their three-block counterparts. It is very important that reducing the number of LUTs by the proposed method does not lead to performance degradation. We think that the proposed approach has enough positive qualities to be used for the implementation of LUT-based FSM circuits.

Author Contributions

Conceptualization, A.B., L.T., K.K. and K.M.; methodology, A.B., L.T., K.K. and K.M.; formal analysis, A.B., L.T., K.K. and K.M.; writing—original draft preparation, A.B., L.T., K.K. and K.M.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BM standard benchmark
CLB configurable logic block
CO collection of outputs
CSC composite state code
DST direct structure table
FD functional decomposition
FPGA field-programmable gate array
FSM finite state machine
IMF input memory function
LUT look-up table
SBF systems of Boolean functions
SCR state code register
SD structural decomposition
SOP sum-of-products
STG state transitions graph
STT state transition table

References

  1. Glushkov, V. Synthesis of Digital Automata; FTD-MT, Translation Division, Foreign Technology Division: Wright-Patterson Air Force Base, OH, USA, 1965; p. 487.
  2. Baranov, S. Logic and System Design of Digital Systems; TUT Press: Tallinn, Estonia, 2008; p. 276.
  3. Gajski, D.D.; Abdi, S.; Gerstlauer, A.; Schirner, G. Embedded System Design: Modeling, Synthesis and Verification, 1st ed.; Springer Publishing Company, Incorporated: Berlin/Heidelberg, Germany, 2009.
  4. De Micheli, G. Synthesis and Optimization of Digital Circuits; McGraw–Hill: New York, NY, USA, 1994; p. 578.
  5. Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Norwell, MA, USA, 1994; p. 312.
  6. Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices. In Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 231, p. 172.
  7. Gazi, O.; Arli, A. State Machines Using VHDL: FPGA Implementation of Serial Communication and Display Protocols; Springer: Berlin/Heidelberg, Germany, 2021.
  8. Koo, B.; Bae, J.; Kim, S.; Park, K.; Kim, H. Test Case Generation Method for Increasing Software Reliability in Safety-Critical Embedded Systems. Electronics 2020, 9, 797.
  9. Baranov, S. High Level Synthesis of Digital Systems; Amazon Publishing: Seattle, WA, USA, 2018; p. 207.
  10. Zhao, X.; He, Y.; Chen, X.; Liu, Z. Human-Robot Collaborative Assembly Based on Eye-Hand and a Finite State Machine in a Virtual Environment. Appl. Sci. 2021, 11, 5754.
  11. Li, P.; Lilja, D.J.; Qian, W.; Riedel, M.D.; Bazargan, K. Logical Computation on Stochastic Bit Streams with Linear Finite-State Machines. IEEE Trans. Comput. 2014, 63, 1474–1486.
  12. Xie, Y.; Liao, S.; Yuan, B.; Wang, Y.; Wang, Z. Fully-Parallel Area-Efficient Deep Neural Network Design Using Stochastic Computing. IEEE Trans. Circuits Syst. II Express Briefs 2017, 64, 1382–1386.
  13. Bollig, B.; Fortin, M.; Gastin, P. Communicating finite-state machines, first-order logic, and star-free propositional dynamic logic. J. Comput. Syst. Sci. 2021, 115, 22–53.
  14. Cassel, S.; Howar, F.; Jonsson, B.; Steffen, B. Active Learning for Extended Finite State Machines. Form. Asp. Comput. 2016, 28, 233–263.
  15. Jóźwiak, L.; Ślusarczyk, A.; Chojnacki, A. Fast and compact sequential circuits for the FPGA-based reconfigurable systems. J. Syst. Archit. 2003, 49, 227–246.
  16. Islam, M.M.; Hossain, M.; Shahjalal, M.; Hasan, M.K.; Jang, Y.M. Area-Time Efficient Hardware Implementation of Modular Multiplication for Elliptic Curve Cryptography. IEEE Access 2020, 8, 73898–73906.
  17. Maruyama, T.; Yamaguchi, Y.; Osana, Y. Programmable Logic Devices (PLDs) in Practical Applications. In Principles and Structures of FPGAs; Amano, H., Ed.; Springer: Singapore, 2018; pp. 179–206.
  18. Skliarova, I.; Sklyarov, V. FPGA-Based Hardware Accelerators; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2019; p. 245.
  19. Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63.
  20. Trimberger, S. Three ages of FPGA: A Retrospective on the First Thirty Years of FPGA Technology. Proc. IEEE 2015, 103, 318–331.
  21. Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174.
  22. Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving the Characteristics of Multi-Level LUT-Based Mealy FSMs. Electronics 2020, 9, 1859.
  23. Barkalov, A.; Titarenko, L.; Krzywicki, K. Reducing LUT Count for FPGA-Based Mealy FSMs. Appl. Sci. 2020, 10, 5115.
  24. Grout, I. Digital Systems Design with FPGAs and CPLDs; Elsevier Science: Amsterdam, The Netherlands, 2011; p. 718.
  25. Kubica, M.; Kania, D.; Kulisz, J. A Technology Mapping of FSMs Based on a Graph of Excitations and Outputs. IEEE Access 2019, 7, 16123–16131.
  26. Skliarova, I.; Sklyarov, V.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Tallinn, Estonia, 2012.
  27. Baranov, S. Finite State Machines and Algorithmic State Machines; Amazon Publishing: Seattle, WA, USA, 2018; p. 185.
  28. Chapman, K. Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources; Xilinx: San Jose, CA, USA, 2014; pp. 1–32. Available online: https://www.xilinx.com/support/documentation/application_notes/xapp522-mux-design-techniques.pdf (accessed on 8 January 2022).
  29. Trimberger, S. Field-Programmable Gate Array Technology; Springer US: New York, NY, USA, 2012.
  30. Mishchenko, A.; Brayton, R.; Jiang, J.H.R.; Jang, S. Scalable Don’t-Care-Based Logic Optimization and Resynthesis. ACM Trans. Reconfigurable Technol. Syst. 2011, 4, 1–23.
  31. Scholl, C. Functional Decomposition with Application to FPGA Synthesis; Kluwer Academic Publishers: Boston, MA, USA, 2001.
  32. Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. Tech. Sci. 2019, 67, 947–956.
  33. Mishchenko, A.; Chatterjee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. Comput.-Aided Des. Integr. Circuits Syst. 2007, 26, 240–253.
  34. Khatri, S.; Gulati, K. (Eds.) Advanced Techniques in Logic Synthesis, Optimizations and Applications; Springer: New York, NY, USA; Dordrecht, The Netherlands; London, UK, 2011; p. 425.
  35. Xilinx. FPGA. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 7 January 2022).
  36. Soloviev, V. Architecture of the FILM of the Firm Xilinx: CPLD and FPGA of the 7th Series; Hotline-Telecom: Moscow, Russia, 2016; p. 392. (In Russian)
  37. Altera. Cyclone IV Device Handbook. Available online: http://www.altera.com/literature/hb/cyclone-iv/cyclone4-handbook.pdf (accessed on 6 January 2022).
  38. Feng, W.; Greene, J.; Mishchenko, A. Improving FPGA Performance with a S44 LUT Structure. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, New York, NY, USA, 25–27 February 2018; pp. 61–66.
  39. Barkalov, O.; Titarenko, L.; Mielcarek, K. Hardware reduction for LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2018, 28, 595–607.
  40. Barkalov, O.; Titarenko, L.; Mielcarek, K. Improving characteristics of LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2020, 30, 745–759.
  41. Senhadji-Navarro, R.; Garcia-Vargas, I. Methodology for Distributed-ROM-Based Implementation of Finite State Machines. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2020, 40, 2411–2415.
  42. Kubica, M.; Opara, A.; Kania, D. Technology Mapping for LUT-Based FPGA; Springer: Berlin/Heidelberg, Germany, 2021.
  43. Kubica, M.; Kania, D. Decomposition of multi-output functions oriented to configurability of logic blocks. Bull. Pol. Acad. Sci. Tech. Sci. 2017, 65, 317–331.
  44. Salauyou, V.; Ostapczuk, M. State Assignment of Finite-State Machines by Using the Values of Output Variables. In Theory and Applications of Dependable Computer Systems; Springer: Berlin/Heidelberg, Germany, 2020; pp. 543–553.
  45. Solov’ev, V.V. Implementation of finite-state machines based on programmable logic ICs with the help of the merged model of Mealy and Moore machines. J. Commun. Technol. Electron. 2013, 58, 172–177.
  46. Park, J.; Yoo, H. Area-Efficient Differential Fault Tolerance Encoding for Finite State Machines. Electronics 2020, 9, 1110.
  47. Amann, R.; Baitinger, U. Optimal state chains and states codes in finite state machines. IEEE Trans. Comput. Aided Des. 1989, 8, 153–170.
  48. Chattopadhyay, S. Area conscious state assignment with flip-flop and output polarity selection for finite state machines synthesis—A genetic algorithm. Comput. J. 2005, 48, 443–450.
  49. De Micheli, G.; Brayton, R.K.; Sangiovanni-Vincentelli, A. Optimal State Assignment for Finite State Machines. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2006, 4, 269–285.
  50. El-Maleh, A.H. A probabilistic pairwise swap search state assignment algorithm for sequential circuit optimization. Integr. VLSI J. 2017, 56, 32–43.
  51. Sentovich, E.; Singh, K.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.; Brayton, R.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; Technical Report; University of California: Berkeley, CA, USA, 1992.
  52. ABC System. 2022. Available online: https://people.eecs.berkeley.edu/~alanmi/abc/ (accessed on 1 January 2022).
  53. Brayton, R.; Mishchenko, A. ABC: An Academic Industrial-Strength Verification Tool. In Computer Aided Verification; Touili, T., Cook, B., Jackson, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 24–40.
  54. Baranov, S. From Algorithm to Digital System: HSL and RTL tool Sinthagate in Digital System Design; Amazon Publishing: Seattle, WA, USA, 2020; p. 76.
  55. Xilinx. Vivado Design Suite User Guide: Synthesis; UG901 (v2019.1). 2022. Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 2 January 2022).
  56. Xilinx. Vitis Platform. Available online: https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html (accessed on 3 January 2022).
  57. Quartus Prime. 2022. Available online: https://www.intel.pl/content/www/pl/pl/software/programmable/quartus-prime/overview.html (accessed on 4 January 2022).
  58. Sklyarov, V. Synthesis and Implementation of RAM-Based Finite State Machines in FPGAs; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1896, pp. 718–728.
  59. Tiwari, A.; Tomko, K. Saving power by mapping finite-state machines into Embedded Memory Blocks in FPGAs. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Paris, France, 16–20 February 2004; pp. 916–921.
  60. Wilkes, M.V.; Stringer, J.B. Micro-programming and the design of the control circuits in an electronic digital computer. Math. Proc. Camb. Philos. Soc. 1953, 49, 230–238.
  61. Achasova, S. Synthesis Algorithms for Automata with PLAs; M: Soviet Radio: Moscow, Russia, 1987; p. 135. (In Russian)
  62. McElvain, K. Lgsynth93 Benchmark Set: Version 4.0. 1993. Available online: https://people.engr.ncsu.edu/brglez/CBL/benchmarks/LGSynth93/LGSynth93.tar (accessed on 28 April 2022).
  63. Xilinx. VC709 Evaluation Board for the Virtex-7 FPGA. Available online: https://www.xilinx.com/support/documentation/boards_and_kits/vc709/ug887-vc709-eval-board-v7-fpga.pdf (accessed on 5 January 2022).
Figure 1. Structural diagram of $P$ Mealy FSM.
Figure 2. Equivalent fragments of STG (a), STT (b) and DST (c).
Figure 3. Structural diagram of LUT-based $P$ Mealy FSM.
Figure 4. Structural diagram of LUT-based $MP_Y$ Mealy FSM.
Figure 5. Replacement of transition pairs $\langle a_m, a_s \rangle$ by pairs $\langle Y_m, Y_s \rangle$.
Figure 6. Structural diagram of $P_Z$ Mealy FSM.
Figure 7. Structural diagram of LUT-based $P_{CZ}$ Mealy FSM.
Figure 8. State transition graph of Mealy FSM $S_1$.
Figure 9. The outcome of encoding of COs for FSM $S_1$.
Figure 10. Outcome of encoding of states and state classes.
Figure 11. Logic circuit of Mealy FSM $P_{CZ}(S_1)$.
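Table 1. State transition table of Mealy FSM $S_1$.
| $a_m$ | $a_s$ | $X_h$ | $Y_h$ | $h$ |
|---|---|---|---|---|
| $a_1$ | $a_2$ | $x_1$ | $y_1 y_2$ | 1 |
| | $a_3$ | $\bar{x}_1$ | $y_3$ | 2 |
| $a_2$ | $a_2$ | $x_2$ | $y_1 y_4$ | 3 |
| | $a_5$ | $\bar{x}_2 x_3$ | $y_4$ | 4 |
| | $a_4$ | $\bar{x}_2 \bar{x}_3$ | $y_3 y_6$ | 5 |
| $a_3$ | $a_6$ | 1 | $y_4 y_5$ | 6 |
| $a_4$ | $a_5$ | $x_3$ | $y_4$ | 7 |
| | $a_8$ | $\bar{x}_3$ | $y_3 y_8$ | 8 |
| $a_5$ | $a_5$ | $x_4$ | $y_3$ | 9 |
| | $a_7$ | $\bar{x}_4$ | $y_5 y_7$ | 10 |
| $a_6$ | $a_1$ | $x_6$ | – | 11 |
| | $a_4$ | $\bar{x}_6 x_5$ | $y_3$ | 12 |
| | $a_8$ | $\bar{x}_6 \bar{x}_5$ | $y_4$ | 13 |
| $a_7$ | $a_5$ | $x_4$ | $y_3$ | 14 |
| | $a_8$ | $\bar{x}_4 x_6$ | $y_1 y_2$ | 15 |
| | $a_8$ | $\bar{x}_4 \bar{x}_6$ | $y_4$ | 16 |
| $a_8$ | $a_6$ | 1 | $y_3 y_8$ | 17 |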
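Table 2. Modified DST of Mealy FSM $P_Z(S_1)$.
| $a_m$ | $K(a_m)$ | $a_s$ | $K(a_s)$ | $X_h$ | $\Phi_h$ | $Z_h$ | $h$ |
|---|---|---|---|---|---|---|---|
| $a_1$ | 000 | $a_2$ | 001 | $x_1$ | $D_3$ | $z_2$ | 1 |
| | | $a_3$ | 010 | $\bar{x}_1$ | $D_2$ | $z_4$ | 2 |
| $a_2$ | 001 | $a_2$ | 001 | $x_2$ | $D_3$ | $z_2 z_3$ | 3 |
| | | $a_5$ | 100 | $\bar{x}_2 x_3$ | $D_1$ | $z_3$ | 4 |
| | | $a_4$ | 011 | $\bar{x}_2 \bar{x}_3$ | $D_2 D_3$ | $z_1 z_4$ | 5 |
| $a_3$ | 010 | $a_6$ | 101 | 1 | $D_1 D_3$ | $z_1 z_3$ | 6 |
| $a_4$ | 011 | $a_5$ | 100 | $x_3$ | $D_1$ | $z_3$ | 7 |
| | | $a_8$ | 111 | $\bar{x}_3$ | $D_1 D_2 D_3$ | $z_1 z_3 z_4$ | 8 |
| $a_5$ | 100 | $a_5$ | 100 | $x_4$ | $D_1$ | $z_4$ | 9 |
| | | $a_7$ | 110 | $\bar{x}_4$ | $D_1 D_2$ | $z_1$ | 10 |
| $a_6$ | 101 | $a_1$ | 000 | $x_6$ | – | – | 11 |
| | | $a_4$ | 011 | $\bar{x}_6 x_5$ | $D_2 D_3$ | $z_4$ | 12 |
| | | $a_8$ | 111 | $\bar{x}_6 \bar{x}_5$ | $D_1 D_2 D_3$ | $z_3$ | 13 |
| $a_7$ | 110 | $a_5$ | 100 | $x_4$ | $D_1$ | $z_4$ | 14 |
| | | $a_8$ | 111 | $\bar{x}_4 x_6$ | $D_1 D_2 D_3$ | $z_2$ | 15 |
| | | $a_8$ | 111 | $\bar{x}_4 \bar{x}_6$ | $D_1 D_2 D_3$ | $z_3$ | 16 |
| $a_8$ | 111 | $a_6$ | 101 | 1 | $D_1 D_3$ | $z_1 z_3 z_4$ | 17 |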
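Table 3. Table of pairs of COs.
| $a_m$ | $a_s$ | $Y_m$ | $Y_s$ | $g$ |
|---|---|---|---|---|
| $a_1$ | $a_2$ | $Y_1$ | $Y_2$ | 1 |
| $a_1$ | $a_3$ | $Y_1$ | $Y_3$ | 2 |
| $a_2$ | $a_2$ | $Y_2$ | $Y_4$ | 3 |
| $a_2$ | $a_2$ | $Y_4$ | $Y_4$ | 4 |
| $a_2$ | $a_5$ | $Y_2$ | $Y_6$ | 5 |
| $a_2$ | $a_5$ | $Y_4$ | $Y_6$ | 6 |
| $a_2$ | $a_4$ | $Y_2$ | $Y_5$ | 7 |
| $a_2$ | $a_4$ | $Y_4$ | $Y_5$ | 8 |
| $a_3$ | $a_6$ | $Y_3$ | $Y_9$ | 9 |
| $a_4$ | $a_5$ | $Y_5$ | $Y_6$ | 10 |
| $a_4$ | $a_5$ | $Y_5$ | $Y_8$ | 11 |
| $a_4$ | $a_8$ | $Y_3$ | $Y_6$ | 12 |
| $a_4$ | $a_8$ | $Y_3$ | $Y_8$ | 13 |
| $a_5$ | $a_5$ | $Y_6$ | $Y_3$ | 14 |
| $a_5$ | $a_5$ | $Y_3$ | $Y_3$ | 15 |
| $a_5$ | $a_7$ | $Y_6$ | $Y_7$ | 16 |
| $a_5$ | $a_7$ | $Y_3$ | $Y_7$ | 17 |
| $a_6$ | $a_4$ | $Y_9$ | $Y_3$ | 18 |
| $a_6$ | $a_4$ | $Y_8$ | $Y_3$ | 19 |
| $a_6$ | $a_8$ | $Y_9$ | $Y_7$ | 20 |
| $a_6$ | $a_8$ | $Y_8$ | $Y_7$ | 21 |
| $a_6$ | $a_1$ | $Y_9$ | $Y_1$ | 22 |
| $a_6$ | $a_1$ | $Y_8$ | $Y_1$ | 23 |
| $a_7$ | $a_5$ | $Y_7$ | $Y_3$ | 24 |
| $a_7$ | $a_8$ | $Y_7$ | $Y_2$ | 25 |
| $a_7$ | $a_8$ | $Y_7$ | $Y_6$ | 26 |
| $a_8$ | $a_6$ | $Y_6$ | $Y_8$ | 27 |
| $a_8$ | $a_6$ | $Y_8$ | $Y_8$ | 28 |
| $a_8$ | $a_6$ | $Y_2$ | $Y_8$ | 29 |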
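Table 4. Table of $LUTerZ^1$.
| $a_m$ | $SC(a_m)$ | $X_h^1$ | $Z_h^1$ | $h$ |
|---|---|---|---|---|
| $a_1$ | 00 | $x_1$ | $z_2$ | 1 |
| | | $\bar{x}_1$ | $z_4$ | 2 |
| $a_2$ | 01 | $x_2$ | $z_2 z_3$ | 3 |
| | | $\bar{x}_2 x_3$ | $z_3$ | 4 |
| | | $\bar{x}_2 \bar{x}_3$ | $z_1 z_4$ | 5 |
| $a_4$ | 10 | $x_3$ | $z_3$ | 6 |
| | | $\bar{x}_3$ | $z_1 z_3 z_4$ | 7 |
| $a_8$ | 11 | 1 | $z_1 z_3 z_4$ | 8 |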
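Table 5. Table of $LUTerZ^2$.
| $a_m$ | $SC(a_m)$ | $X_h^2$ | $Z_h^2$ | $h$ |
|---|---|---|---|---|
| $a_3$ | 00 | 1 | $z_1 z_3$ | 1 |
| $a_5$ | 01 | $x_4$ | $z_4$ | 2 |
| | | $\bar{x}_4$ | $z_1$ | 3 |
| $a_6$ | 10 | $x_6$ | – | 4 |
| | | $\bar{x}_6 x_5$ | $z_4$ | 5 |
| | | $\bar{x}_6 \bar{x}_5$ | $z_3$ | 6 |
| $a_7$ | 11 | $x_4$ | $z_4$ | 7 |
| | | $\bar{x}_4 x_6$ | $z_2$ | 8 |
| | | $\bar{x}_4 \bar{x}_6$ | $z_3$ | 9 |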
Table 6. Table of $LUTerZV$.
| $z_r$ | $LUTer^1$ | $LUTer^2$ | $r$ |
|---|---|---|---|
| $z_1$ | 1 | 1 | 1 |
| $z_2$ | 1 | 1 | 2 |
| $z_3$ | 1 | 1 | 3 |
| $z_4$ | 1 | 1 | 4 |
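
Table 7. Table of LUTerT.

| $Y_m$ | $K(Y_m)$ | $Y_S$ | $K(Y_S)$ | $a_S$ | $CSC(a_S)$ | $T(a_S)$ | $g$ |
|---|---|---|---|---|---|---|---|
| $Y_1$ | 0000 | $Y_2$ | 0100 | $a_2$ | 001 | $T_3$ | 1 |
| $Y_1$ | 0000 | $Y_3$ | 0001 | $a_3$ | 100 | $T_1$ | 2 |
| $Y_2$ | 0100 | $Y_4$ | 0110 | $a_2$ | 001 | $T_3$ | 3 |
| $Y_4$ | 0110 | $Y_4$ | 0110 | $a_2$ | 001 | $T_3$ | 4 |
| $Y_2$ | 0100 | $Y_6$ | 0010 | $a_5$ | 101 | $T_1 T_3$ | 5 |
| $Y_4$ | 0110 | $Y_6$ | 0010 | $a_5$ | 101 | $T_1 T_3$ | 6 |
| $Y_2$ | 0100 | $Y_5$ | 1001 | $a_4$ | 010 | $T_2$ | 7 |
| $Y_4$ | 0110 | $Y_5$ | 1001 | $a_4$ | 010 | $T_2$ | 8 |
| $Y_3$ | 0001 | $Y_9$ | 1010 | $a_6$ | 110 | $T_1 T_2$ | 9 |
| $Y_5$ | 1001 | $Y_6$ | 0010 | $a_5$ | 101 | $T_1 T_3$ | 10 |
| $Y_5$ | 1001 | $Y_8$ | 1011 | $a_5$ | 101 | $T_1 T_3$ | 11 |
| $Y_3$ | 0001 | $Y_6$ | 0010 | $a_8$ | 011 | $T_2 T_3$ | 12 |
| $Y_3$ | 0001 | $Y_8$ | 1011 | $a_8$ | 011 | $T_2 T_3$ | 13 |
| $Y_6$ | 0010 | $Y_3$ | 0001 | $a_5$ | 101 | $T_1 T_3$ | 14 |
| $Y_3$ | 0001 | $Y_3$ | 0001 | $a_5$ | 101 | $T_1 T_3$ | 15 |
| $Y_6$ | 0010 | $Y_7$ | 1000 | $a_7$ | 111 | $T_1 T_2 T_3$ | 16 |
| $Y_3$ | 0001 | $Y_7$ | 1000 | $a_7$ | 111 | $T_1 T_2 T_3$ | 17 |
| $Y_9$ | 1010 | $Y_3$ | 0001 | $a_4$ | 010 | $T_2$ | 18 |
| $Y_8$ | 1011 | $Y_3$ | 0001 | $a_4$ | 010 | $T_2$ | 19 |
| $Y_9$ | 1010 | $Y_7$ | 1000 | $a_8$ | 011 | $T_2 T_3$ | 20 |
| $Y_8$ | 1011 | $Y_7$ | 1000 | $a_8$ | 011 | $T_2 T_3$ | 21 |
| $Y_9$ | 1010 | $Y_1$ | 0000 | $a_1$ | 000 | | 22 |
| $Y_8$ | 1011 | $Y_1$ | 0000 | $a_1$ | 000 | | 23 |
| $Y_7$ | 1000 | $Y_3$ | 0001 | $a_5$ | 101 | $T_1 T_3$ | 24 |
| $Y_7$ | 1000 | $Y_2$ | 0100 | $a_8$ | 011 | $T_2 T_3$ | 25 |
| $Y_7$ | 1000 | $Y_6$ | 0010 | $a_8$ | 011 | $T_2 T_3$ | 26 |
| $Y_6$ | 0010 | $Y_8$ | 1011 | $a_6$ | 110 | $T_1 T_2$ | 27 |
| $Y_8$ | 1011 | $Y_8$ | 1011 | $a_6$ | 110 | $T_1 T_2$ | 28 |
| $Y_2$ | 0100 | $Y_8$ | 1011 | $a_6$ | 110 | $T_1 T_2$ | 29 |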
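
Two regularities can be observed in Tables 4, 5 and 7 of this example. First, each three-bit code $CSC(a_S)$ in Table 7 coincides with a one-bit class indicator (0 for the states of Table 4, 1 for the states of Table 5) followed by the two-bit code $SC(a_m)$. Second, the column $T(a_S)$ lists exactly the variables $T_r$ whose bit equals 1 in $CSC(a_S)$. The Python sketch below reproduces both observations. It is a reading of the tables, not the authors' encoding procedure; the names `SC_CLASS_1`, `SC_CLASS_2`, `csc` and `t_functions` are hypothetical, and the codes of $a_8$ and $a_3$ are assumed to be 11 and 00, respectively.

```python
# Two-bit codes SC(a_m) read from Table 4 (first class) and Table 5 (second class).
# Assumption: SC(a8) = 11 and SC(a3) = 00, each with the input conjunction equal to 1.
SC_CLASS_1 = {"a1": "00", "a2": "01", "a4": "10", "a8": "11"}
SC_CLASS_2 = {"a3": "00", "a5": "01", "a6": "10", "a7": "11"}


def csc(state):
    """Three-bit code: class bit (0 or 1) followed by the two-bit code SC."""
    if state in SC_CLASS_1:
        return "0" + SC_CLASS_1[state]
    return "1" + SC_CLASS_2[state]


def t_functions(code):
    """Variables T_r that equal 1 for the given code, numbered from the left."""
    return [f"T{r}" for r, bit in enumerate(code, start=1) if bit == "1"]


if __name__ == "__main__":
    for state in ("a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"):
        code = csc(state)
        print(state, code, " ".join(t_functions(code)))
```

For example, the sketch yields the code 101 and the variables T1 and T3 for $a_5$, as in rows 5 and 6 of Table 7, and an empty list for $a_1$, as in rows 22 and 23.
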
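
Table 8. Distribution of benchmarks between sets BM1–BM5.

| BM1 | BM2 | BM3 | BM4 | BM5 |
|---|---|---|---|---|
| bbtas | bbara | ex1 | sand | s420 |
| dk17 | bbsse | kirkman | | s510 |
| dk27 | beecount | planet | | s820 |
| dk512 | cse | planet1 | | s832 |
| ex3 | dk14 | pma | | |
| ex5 | dk15 | s1 | | |
| lion | dk16 | s1488 | | |
| lion9 | donfile | s1494 | | |
| mc | ex2 | s1a | | |
| modulo12 | ex4 | s208 | | |
| shiftreg | ex6 | styr | | |
| | ex7 | tma | | |
| | keyb | | | |
| | mark1 | | | |
| | opus | | | |
| | s27 | | | |
| | s386 | | | |
| | s8 | | | |
| | sse | | | |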
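
Table 9. Experimental results (numbers of LUTs for BM1–BM5).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbtas | 5 | 5 | 5 | 8 | 8 | BM1 |
| dk17 | 5 | 12 | 5 | 8 | 8 | BM1 |
| dk27 | 3 | 5 | 4 | 7 | 7 | BM1 |
| dk512 | 10 | 10 | 9 | 12 | 12 | BM1 |
| ex3 | 9 | 9 | 9 | 11 | 11 | BM1 |
| ex5 | 9 | 9 | 9 | 10 | 10 | BM1 |
| lion | 2 | 5 | 2 | 6 | 6 | BM1 |
| lion9 | 6 | 11 | 5 | 8 | 8 | BM1 |
| mc | 4 | 7 | 4 | 6 | 6 | BM1 |
| modulo12 | 7 | 7 | 7 | 9 | 9 | BM1 |
| shiftreg | 2 | 6 | 2 | 4 | 4 | BM1 |
| bbara | 17 | 17 | 10 | 10 | 10 | BM2 |
| bbsse | 33 | 37 | 24 | 26 | 25 | BM2 |
| beecount | 19 | 19 | 14 | 14 | 14 | BM2 |
| cse | 40 | 66 | 36 | 33 | 33 | BM2 |
| dk14 | 16 | 27 | 10 | 12 | 11 | BM2 |
| dk15 | 15 | 16 | 12 | 6 | 7 | BM2 |
| dk16 | 15 | 34 | 12 | 11 | 11 | BM2 |
| donfile | 31 | 31 | 24 | 21 | 20 | BM2 |
| ex2 | 9 | 9 | 8 | 8 | 10 | BM2 |
| ex4 | 15 | 13 | 12 | 11 | 10 | BM2 |
| ex6 | 24 | 36 | 22 | 21 | 20 | BM2 |
| ex7 | 4 | 5 | 4 | 6 | 7 | BM2 |
| keyb | 43 | 61 | 40 | 37 | 36 | BM2 |
| mark1 | 23 | 23 | 20 | 19 | 18 | BM2 |
| opus | 28 | 28 | 22 | 21 | 21 | BM2 |
| s27 | 6 | 18 | 6 | 6 | 7 | BM2 |
| s386 | 26 | 39 | 22 | 25 | 24 | BM2 |
| s8 | 9 | 9 | 9 | 9 | 10 | BM2 |
| sse | 33 | 37 | 30 | 26 | 24 | BM2 |
| ex1 | 70 | 74 | 53 | 40 | 34 | BM3 |
| kirkman | 42 | 58 | 39 | 33 | 27 | BM3 |
| planet | 131 | 131 | 88 | 78 | 68 | BM3 |
| planet1 | 131 | 131 | 88 | 78 | 68 | BM3 |
| pma | 94 | 94 | 86 | 72 | 65 | BM3 |
| s1 | 65 | 99 | 61 | 54 | 48 | BM3 |
| s1488 | 124 | 131 | 108 | 89 | 83 | BM3 |
| s1494 | 126 | 132 | 110 | 90 | 78 | BM3 |
| s1a | 49 | 81 | 43 | 38 | 32 | BM3 |
| s208 | 12 | 31 | 10 | 9 | 9 | BM3 |
| styr | 93 | 120 | 81 | 70 | 59 | BM3 |
| tma | 45 | 39 | 39 | 30 | 27 | BM3 |
| sand | 132 | 132 | 114 | 99 | 79 | BM4 |
| s420 | 10 | 31 | 9 | 8 | 9 | BM5 |
| s510 | 48 | 48 | 32 | 22 | 19 | BM5 |
| s820 | 88 | 82 | 68 | 52 | 46 | BM5 |
| s832 | 80 | 79 | 62 | 50 | 44 | BM5 |
| Total | 1808 | 2104 | 1489 | 1323 | 1202 | |
| Percentage, % | 150.42 | 175.04 | 123.88 | 110.07 | 100.00 | |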
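
The row "Percentage, %" of Table 9 is consistent with dividing each column total by the total of our approach and multiplying by 100. The following Python sketch recomputes the row from the totals of Table 9; the names `TOTALS` and `relative_percent` are hypothetical and serve only as an illustration.

```python
# Column totals from the row "Total" of Table 9 (numbers of LUTs).
TOTALS = {
    "Auto": 1808,
    "One-Hot": 2104,
    "JEDI": 1489,
    "MPY": 1323,
    "Our Approach": 1202,
}


def relative_percent(totals, baseline="Our Approach"):
    """Express every total as a percentage of the baseline total."""
    base = totals[baseline]
    return {name: round(100.0 * value / base, 2) for name, value in totals.items()}


PERCENT = relative_percent(TOTALS)

if __name__ == "__main__":
    for name, value in PERCENT.items():
        print(f"{name}: {value:.2f}")
```

With these totals, the MPY entry evaluates to 110.07, which agrees with the 10.07% figure quoted in the abstract.
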
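
Table 10. Experimental results (numbers of LUTs for benchmarks from BM1).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| bbtas | 5 | 5 | 5 | 8 | 8 |
| dk17 | 5 | 12 | 5 | 8 | 8 |
| dk27 | 3 | 5 | 4 | 7 | 7 |
| dk512 | 10 | 10 | 9 | 12 | 12 |
| ex3 | 9 | 9 | 9 | 11 | 11 |
| ex5 | 9 | 9 | 9 | 10 | 10 |
| lion | 2 | 5 | 2 | 6 | 6 |
| lion9 | 6 | 11 | 5 | 8 | 8 |
| mc | 4 | 7 | 4 | 6 | 6 |
| modulo12 | 7 | 7 | 7 | 9 | 9 |
| shiftreg | 2 | 6 | 2 | 4 | 4 |
| Total | 62 | 86 | 61 | 89 | 89 |
| Percentage, % | 69.66 | 96.63 | 68.54 | 100.00 | 100.00 |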
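
Table 11. Experimental results (numbers of LUTs for BM2).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| bbara | 17 | 17 | 10 | 10 | 10 |
| bbsse | 33 | 37 | 24 | 26 | 25 |
| beecount | 19 | 19 | 14 | 14 | 14 |
| cse | 40 | 66 | 36 | 33 | 33 |
| dk14 | 16 | 27 | 10 | 12 | 11 |
| dk15 | 15 | 16 | 12 | 6 | 7 |
| dk16 | 15 | 34 | 12 | 11 | 11 |
| donfile | 31 | 31 | 24 | 21 | 20 |
| ex2 | 9 | 9 | 8 | 8 | 10 |
| ex4 | 15 | 13 | 12 | 11 | 10 |
| ex6 | 24 | 36 | 22 | 21 | 20 |
| ex7 | 4 | 5 | 4 | 6 | 7 |
| keyb | 43 | 61 | 40 | 37 | 36 |
| mark1 | 23 | 23 | 20 | 19 | 18 |
| opus | 28 | 28 | 22 | 21 | 21 |
| s27 | 6 | 18 | 6 | 6 | 7 |
| s386 | 26 | 39 | 22 | 25 | 24 |
| s8 | 9 | 9 | 9 | 9 | 10 |
| sse | 33 | 37 | 30 | 26 | 24 |
| Total | 406 | 525 | 337 | 322 | 318 |
| Percentage, % | 127.67 | 165.09 | 105.97 | 101.26 | 100.00 |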
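
Table 12. Experimental results (numbers of LUTs for BM3–BM5).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| ex1 | 70 | 74 | 53 | 40 | 34 |
| kirkman | 42 | 58 | 39 | 33 | 27 |
| planet | 131 | 131 | 88 | 78 | 68 |
| planet1 | 131 | 131 | 88 | 78 | 68 |
| pma | 94 | 94 | 86 | 72 | 65 |
| s1 | 65 | 99 | 61 | 54 | 48 |
| s1488 | 124 | 131 | 108 | 89 | 83 |
| s1494 | 126 | 132 | 110 | 90 | 78 |
| s1a | 49 | 81 | 43 | 38 | 32 |
| s208 | 12 | 31 | 10 | 9 | 9 |
| styr | 93 | 120 | 81 | 70 | 59 |
| tma | 45 | 39 | 39 | 30 | 27 |
| sand | 132 | 132 | 114 | 99 | 79 |
| s420 | 10 | 31 | 9 | 8 | 9 |
| s510 | 48 | 48 | 32 | 22 | 19 |
| s820 | 88 | 82 | 68 | 52 | 46 |
| s832 | 80 | 79 | 62 | 50 | 44 |
| Total | 1340 | 1493 | 1091 | 912 | 795 |
| Percentage, % | 168.55 | 187.80 | 137.23 | 114.72 | 100.00 |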
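
Table 13. Experimental results (the maximum operating frequency for BM1–BM5, MHz).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach | Set |
|---|---|---|---|---|---|---|
| bbtas | 204.16 | 204.16 | 206.12 | 200.38 | 200.38 | BM1 |
| dk17 | 199.28 | 167 | 199.39 | 199.87 | 199.87 | BM1 |
| dk27 | 206.02 | 201.9 | 204.18 | 196.65 | 196.65 | BM1 |
| dk512 | 196.27 | 196.27 | 199.75 | 194.17 | 194.17 | BM1 |
| ex3 | 194.86 | 194.86 | 195.76 | 191.22 | 191.22 | BM1 |
| ex5 | 180.25 | 180.25 | 181.16 | 178.06 | 178.06 | BM1 |
| lion | 202.43 | 204 | 202.35 | 200.18 | 200.18 | BM1 |
| lion9 | 205.3 | 185.22 | 206.38 | 199.12 | 199.12 | BM1 |
| mc | 196.66 | 195.47 | 196.87 | 193.17 | 193.17 | BM1 |
| modulo12 | 207 | 207 | 207.13 | 201.12 | 201.12 | BM1 |
| shiftreg | 262.67 | 263.57 | 276.26 | 256.69 | 256.69 | BM1 |
| bbara | 193.39 | 193.39 | 212.21 | 202.23 | 201.82 | BM2 |
| bbsse | 157.06 | 169.12 | 182.34 | 181.23 | 179.22 | BM2 |
| beecount | 166.61 | 166.61 | 187.32 | 185.14 | 183.29 | BM2 |
| cse | 146.43 | 163.64 | 178.12 | 175.18 | 171.64 | BM2 |
| dk14 | 191.64 | 172.65 | 193.85 | 190.18 | 188.12 | BM2 |
| dk15 | 192.53 | 185.36 | 194.87 | 192.23 | 190.84 | BM2 |
| dk16 | 169.72 | 174.79 | 197.13 | 194.34 | 192.18 | BM2 |
| donfile | 184.03 | 184 | 203.65 | 200.92 | 197.47 | BM2 |
| ex2 | 198.57 | 198.57 | 200.14 | 198.32 | 196.63 | BM2 |
| ex4 | 180.96 | 177.71 | 192.83 | 190.14 | 189.69 | BM2 |
| ex6 | 169.57 | 163.8 | 176.59 | 171.27 | 169.19 | BM2 |
| ex7 | 200.04 | 200.84 | 200.6 | 198.14 | 196.26 | BM2 |
| keyb | 156.45 | 143.47 | 168.43 | 162.01 | 160.65 | BM2 |
| mark1 | 162.39 | 162.39 | 176.18 | 170.18 | 168.73 | BM2 |
| opus | 166.2 | 166.2 | 178.32 | 175.29 | 173.68 | BM2 |
| s27 | 198.73 | 191.5 | 199.13 | 196.13 | 194.42 | BM2 |
| s386 | 168.15 | 173.46 | 179.15 | 176.85 | 175.16 | BM2 |
| s8 | 180.02 | 178.95 | 181.23 | 178.23 | 177.39 | BM2 |
| sse | 157.06 | 169.12 | 174.63 | 170.12 | 168.14 | BM2 |
| ex1 | 150.94 | 139.76 | 176.87 | 182.34 | 180.01 | BM3 |
| kirkman | 141.38 | 154 | 156.68 | 167.15 | 166.25 | BM3 |
| planet | 132.71 | 132.71 | 187.14 | 189.12 | 188.73 | BM3 |
| planet1 | 132.71 | 132.71 | 187.14 | 189.12 | 188.73 | BM3 |
| pma | 146.18 | 146.18 | 169.83 | 178.19 | 177.67 | BM3 |
| s1 | 146.41 | 135.85 | 157.16 | 162.23 | 162.12 | BM3 |
| s1488 | 138.5 | 131.94 | 157.18 | 168.32 | 167.54 | BM3 |
| s1494 | 149.39 | 145.75 | 164.34 | 172.27 | 171.09 | BM3 |
| s1a | 153.37 | 176.4 | 169.17 | 178.21 | 177.42 | BM3 |
| s208 | 174.34 | 176.46 | 178.76 | 181.72 | 181.02 | BM3 |
| styr | 137.61 | 129.92 | 145.64 | 161.87 | 160.73 | BM3 |
| tma | 163.88 | 147.8 | 164.14 | 176.72 | 175.72 | BM3 |
| sand | 115.97 | 115.97 | 126.82 | 145.68 | 153.49 | BM4 |
| s420 | 173.88 | 176.46 | 177.25 | 187.23 | 190.62 | BM5 |
| s510 | 177.65 | 177.65 | 181.42 | 187.32 | 189.12 | BM5 |
| s820 | 152 | 153.16 | 176.58 | 181.96 | 182.58 | BM5 |
| s832 | 145.71 | 153.23 | 173.78 | 186.12 | 188.32 | BM5 |
| Total | 8127.08 | 8061.22 | 8701.97 | 8536.27 | 8508.25 | |
| Percentage, % | 95.52 | 94.75 | 102.28 | 100.33 | 100.00 | |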

Table 14. Experimental results (the maximum operating frequency for BM1, MHz).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| bbtas | 204.16 | 204.16 | 206.12 | 200.38 | 200.38 |
| dk17 | 199.28 | 167 | 199.39 | 199.87 | 199.87 |
| dk27 | 206.02 | 201.9 | 204.18 | 196.65 | 196.65 |
| dk512 | 196.27 | 196.27 | 199.75 | 194.17 | 194.17 |
| ex3 | 194.86 | 194.86 | 195.76 | 191.22 | 191.22 |
| ex5 | 180.25 | 180.25 | 181.16 | 178.06 | 178.06 |
| lion | 202.43 | 204 | 202.35 | 200.18 | 200.18 |
| lion9 | 205.3 | 185.22 | 206.38 | 199.12 | 199.12 |
| mc | 196.66 | 195.47 | 196.87 | 193.17 | 193.17 |
| modulo12 | 207 | 207 | 207.13 | 201.12 | 201.12 |
| shiftreg | 262.67 | 263.57 | 276.26 | 256.69 | 256.69 |
| Total | 2254.90 | 2199.70 | 2275.35 | 2032.57 | 2032.57 |
| Percentage, % | 110.94 | 108.22 | 111.94 | 100.00 | 100.00 |

Table 15. Experimental results (the maximum operating frequency for BM2, MHz).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| bbara | 193.39 | 193.39 | 212.21 | 202.23 | 201.82 |
| bbsse | 157.06 | 169.12 | 182.34 | 181.23 | 179.22 |
| beecount | 166.61 | 166.61 | 187.32 | 185.14 | 183.29 |
| cse | 146.43 | 163.64 | 178.12 | 175.18 | 171.64 |
| dk14 | 191.64 | 172.65 | 193.85 | 190.18 | 188.12 |
| dk15 | 192.53 | 185.36 | 194.87 | 192.23 | 190.84 |
| dk16 | 169.72 | 174.79 | 197.13 | 194.34 | 192.18 |
| donfile | 184.03 | 184 | 203.65 | 200.92 | 197.47 |
| ex2 | 198.57 | 198.57 | 200.14 | 198.32 | 196.63 |
| ex4 | 180.96 | 177.71 | 192.83 | 190.14 | 189.69 |
| ex6 | 169.57 | 163.8 | 176.59 | 171.27 | 169.19 |
| ex7 | 200.04 | 200.84 | 200.6 | 198.14 | 196.26 |
| keyb | 156.45 | 143.47 | 168.43 | 162.01 | 160.65 |
| mark1 | 162.39 | 162.39 | 176.18 | 170.18 | 168.73 |
| opus | 166.2 | 166.2 | 178.32 | 175.29 | 173.68 |
| s27 | 198.73 | 191.5 | 199.13 | 196.13 | 194.42 |
| s386 | 168.15 | 173.46 | 179.15 | 176.85 | 175.16 |
| s8 | 180.02 | 178.95 | 181.23 | 178.23 | 177.39 |
| sse | 157.06 | 169.12 | 174.63 | 170.12 | 168.14 |
| Total | 3339.55 | 3335.57 | 3576.72 | 3508.13 | 3474.52 |
| Percentage, % | 96.12 | 96.00 | 102.94 | 100.97 | 100.00 |

Table 16. Experimental results (the maximum operating frequency for BM3–BM5, MHz).

| Benchmark | Auto | One-Hot | JEDI | MPY | Our Approach |
|---|---|---|---|---|---|
| ex1 | 150.94 | 139.76 | 176.87 | 182.34 | 180.01 |
| kirkman | 141.38 | 154 | 156.68 | 167.15 | 166.25 |
| planet | 132.71 | 132.71 | 187.14 | 189.12 | 188.73 |
| planet1 | 132.71 | 132.71 | 187.14 | 189.12 | 188.73 |
| pma | 146.18 | 146.18 | 169.83 | 178.19 | 177.67 |
| s1 | 146.41 | 135.85 | 157.16 | 162.23 | 162.12 |
| s1488 | 138.5 | 131.94 | 157.18 | 168.32 | 167.54 |
| s1494 | 149.39 | 145.75 | 164.34 | 172.27 | 171.09 |
| s1a | 153.37 | 176.4 | 169.17 | 178.21 | 177.42 |
| s208 | 174.34 | 176.46 | 178.76 | 181.72 | 181.02 |
| styr | 137.61 | 129.92 | 145.64 | 161.87 | 160.73 |
| tma | 163.88 | 147.8 | 164.14 | 176.72 | 175.72 |
| sand | 115.97 | 115.97 | 126.82 | 145.68 | 153.49 |
| s420 | 173.88 | 176.46 | 177.25 | 187.23 | 190.62 |
| s510 | 177.65 | 177.65 | 181.42 | 187.32 | 189.12 |
| s820 | 152 | 153.16 | 176.58 | 181.96 | 182.58 |
| s832 | 145.71 | 153.23 | 173.78 | 186.12 | 188.32 |
| Total | 2532.63 | 2525.95 | 2849.90 | 2995.57 | 3001.16 |
| Percentage, % | 84.39 | 84.17 | 94.96 | 99.81 | 100.00 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Barkalov, A.; Titarenko, L.; Krzywicki, K.; Mielcarek, K. Using Codes of Output Collections for Hardware Reduction in Circuits of LUT-Based Finite State Machines. Electronics 2022, 11, 2050. https://doi.org/10.3390/electronics11132050
