Research

20 pages, 16411 KiB

Open Access Article

Analog Frontend for Reliable Human Body Temperature Measurement for IoT Devices

by Paweł Narczyk and Witold Adam Pleskacz

Electronics 2022, 11(3), 434; https://doi.org/10.3390/electronics11030434 - 31 Jan 2022

Cited by 1 | Viewed by 2403

In this paper, an analog frontend for reliable measurement of human body temperature is presented. A novel temperature calibration technique using an on-chip resistor was developed specifically for the analog frontend. The analog frontend consists of a bandgap current reference, a precision current source, a programmable gain amplifier, a voltage source proportional to absolute temperature, and an on-chip calibration resistor. The developed calibration technique enables an accuracy as fine as 0.1 °C over a very wide temperature range. This calibration method was developed for integrated circuits operating at temperatures from −40 °C to +125 °C. The presented analog frontend consumes no more than 185 μA and was designed and manufactured in United Microelectronics Corporation (UMC) 130 nm CMOS technology. The data presented in the paper were obtained from process corner and Monte Carlo simulations as well as from measurements. The measurements were taken using a manual wafer prober with a climate-controlled microchamber, at temperatures ranging from −40 °C to +125 °C.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

17 pages, 1682 KiB

Open Access Article

CCSDS 131.2-B-1 Transmitter Design on FPGA with Adaptive Coding and Modulation Schemes for Satellite Communications

by Adrián Lamoral Coines and Víctor P. Gil Jiménez

Electronics 2021, 10(20), 2476; https://doi.org/10.3390/electronics10202476 - 12 Oct 2021

Cited by 5 | Viewed by 2338

Abstract

Satellite communications are a well-established research area in which the main innovation of the last decade has been the use of multi-carrier modulations and more robust channel coding techniques. However, in recent years, novel advanced signal processing has started being developed for these communications due to the increase in the signal processing capacity of transmitters and receivers. Although signal processing capabilities are increasing, they are still heavily constrained because these techniques need to be implemented in real hardware, which makes complexity a matter of critical importance. Therefore, this paper presents the design and implementation of a transmitter with adaptable coding and modulation on a field-programmable gate array (FPGA). The main motivation came from the CCSDS 131.2-B-1 standard, which recommends such a novel transmitter, one that has to date not been implemented in a real system. The system was modeled in MATLAB with the purpose of being programmed in VHDL following the AXI-stream protocol between components. Behavioral simulation results were obtained in VIVADO and compared with MATLAB for verification purposes. The transmitter logic circuit was synthesized on a Zynq Ultrascale RFSoC ZU28DR FPGA, showing low resource consumption and correct functioning, leading us to conclude that the deployment of new communication systems in state-of-the-art hardware in satellite communications is justified.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

13 pages, 1160 KiB

Open Access Article

COREA: Delay- and Energy-Efficient Approximate Adder Using Effective Carry Speculation

by Hyelin Seok, Hyoju Seo, Jungwon Lee and Yongtae Kim

Electronics 2021, 10(18), 2234; https://doi.org/10.3390/electronics10182234 - 12 Sep 2021

Cited by 10 | Viewed by 2134

Abstract

This paper presents a delay- and energy-efficient approximate adder design exploiting an effective carry speculation scheme with error reduction. The proposed scheme reduces the delay and improves the energy efficiency without any significant accuracy degradation by effectively adding the predicted carry input using the OR operation. Additionally, the error reduction technique improves the overall computation accuracy at the expense of a few logic gates. As a result, the proposed adder achieves 3.84- and 7.79-times greater energy and energy-delay product (EDP) efficiencies than the traditional adder when implemented in 65-nm CMOS technology. In particular, when jointly analyzed with hardware accuracy, our design attains 69% and 70% reductions of the energy- and EDP-normalized mean error distance (NMED) products, respectively, compared to the other approximate adders under consideration. Furthermore, the proposed adder’s efficacy over the existing adders is demonstrated by adopting it in a machine learning application.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

12 pages, 20988 KiB

Open Access Feature Paper Article

FPGA Implementation of the Range-Doppler Algorithm for Real-Time Synthetic Aperture Radar Imaging

by Yeongung Choi, Dongmin Jeong, Myeongjin Lee, Wookyung Lee and Yunho Jung

Electronics 2021, 10(17), 2133; https://doi.org/10.3390/electronics10172133 - 02 Sep 2021

Cited by 7 | Viewed by 3450

Abstract

In this paper, we propose a range-Doppler algorithm (RDA)-based synthetic aperture radar (SAR) processor for real-time SAR imaging and present FPGA-based implementation results. The processing steps for the RDA include range compression, range cell migration correction (RCMC), and azimuth compression. A matched filtering unit (MFU) and an RCMC processing unit (RPU) are required for real-time processing. Therefore, the proposed RDA-based SAR processor contains an MFU that uses the mixed-radix multi-path delay commutator (MRMDC) FFT and an RPU. The MFU reduces the memory requirements by applying a decimation-in-frequency (DIF) FFT and decimation-in-time (DIT) IFFT. The RPU provides a variable tap size and variable interpolation kernel. In addition, the MFU and RPU are designed to enable parallel processing of four 32-bit data words, which are transferred via a 128-bit AXI bus. The proposed RDA-based SAR processor was designed using Verilog-HDL and implemented in a Xilinx UltraScale+ MPSoC FPGA device. After comparing the execution time taken by the proposed SAR processor with that taken by an ARM Cortex-A53 microprocessor, we observed an 85-fold speedup for a 2048 × 2048 pixel image. A performance evaluation based on related studies indicated that the proposed processor achieved an execution time that was approximately 6.5 times less than those of previous FPGA implementations of RDA processors.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

15 pages, 1898 KiB

Open Access Article

AERO: A 1.28 MOP/s/LUT Reconfigurable Inference Processor for Recurrent Neural Networks in a Resource-Limited FPGA

by Jinwon Kim, Jiho Kim and Tae-Hwan Kim

Electronics 2021, 10(11), 1249; https://doi.org/10.3390/electronics10111249 - 24 May 2021

Cited by 5 | Viewed by 2016

Abstract

This study presents a resource-efficient reconfigurable inference processor for recurrent neural networks (RNNs), named AERO. AERO is programmable to perform inference on RNN models of various types. It was designed based on an instruction-set architecture specialized for processing the primitive vector operations that compose the dataflows of RNN models. A versatile vector-processing unit (VPU) was incorporated to perform every vector operation and achieve a high resource efficiency. To keep resource usage low, the multiplication in the VPU is carried out on the basis of an approximation scheme. In addition, the activation functions are realized with reduced tables. We developed a prototype inference system based on AERO using a resource-limited field-programmable gate array, under which the functionality of AERO was verified extensively for inference tasks based on several RNN models of different types. The resource efficiency of AERO was found to be as high as 1.28 MOP/s/LUT, which is 1.3 times higher than the previous state-of-the-art result.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

15 pages, 9268 KiB

Open Access Article

Scalable ESPRIT Processor for Direction-of-Arrival Estimation of Frequency Modulated Continuous Wave Radar

by Yongchul Jung, Hohyub Jeon, Seongjoo Lee and Yunho Jung

Electronics 2021, 10(6), 695; https://doi.org/10.3390/electronics10060695 - 16 Mar 2021

Cited by 12 | Viewed by 2871

Abstract

The estimation of signal parameters via rotational invariance techniques (ESPRIT) is an algorithm that uses the shift-invariant properties of the array antenna to estimate the direction-of-arrival (DOA) of signals received in the array antenna. Since the ESPRIT algorithm requires high-complexity operations such as covariance matrix computation and eigenvalue decomposition, a hardware processor must be implemented such that the DOA is estimated in real time. Additionally, the ESPRIT processor should support a scalable number of antennas for DOA estimation in various applications because the performance of ESPRIT depends on the number of antennas. Therefore, we propose an ESPRIT processor that supports scalable configurations of two to eight antennas. In addition, since the proposed ESPRIT processor is based on the multiple invariances (MI) algorithm, it can achieve a much better performance than the existing ESPRIT processor. The execution time is reduced by simplifying the Jacobi method, which has the most significant computational complexity in calculating the eigenvalue decomposition (EVD) in ESPRIT. Moreover, the ESPRIT processor was designed using a hardware description language (HDL), and an FPGA-based verification was performed. The proposed ESPRIT processor was implemented with 10,088 slice registers, 18,207 LUTs, and 80 DSPs, and the slice register, LUT, and DSP counts were reduced by up to 71.45%, 54.5%, and 68.38%, respectively, compared to the existing structure.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

14 pages, 1293 KiB

Open Access Article

A Modified KNN Algorithm for High-Performance Computing on FPGA of Real-Time m-QAM Demodulators

by David Marquez-Viloria, Luis Castano-Londono and Neil Guerrero-Gonzalez

Electronics 2021, 10(5), 627; https://doi.org/10.3390/electronics10050627 - 09 Mar 2021

Cited by 7 | Viewed by 3094

Abstract

A methodology for scalable and concurrent real-time implementation of highly recurrent algorithms is presented and experimentally validated using the AWS-FPGA. This paper presents a parallel implementation of a KNN algorithm focused on the m-QAM demodulators using high-level synthesis for fast prototyping, parameterization, and scalability of the design. The proposed design shows the successful implementation of the KNN algorithm for interchannel interference mitigation in a 3 × 16 Gbaud 16-QAM Nyquist WDM system. Additionally, we present a modified version of the KNN algorithm in which comparisons among data symbols are reduced by identifying the closest neighbor using the rule of the 8-connected clusters used for image processing. Real-time implementation of the modified KNN on a Xilinx Virtex UltraScale+ VU9P AWS-FPGA board was compared with the results obtained in previous work using the same data from the same experimental setup but with offline DSP in MATLAB. The results show that the difference is negligible below the FEC limit. Additionally, the modified KNN shows a reduction of operations from 43 percent to 75 percent, depending on the symbol’s position in the constellation, achieving a 47.25% reduction in total computational time for 100 K input symbols processed on 20 parallel cores compared to the KNN algorithm.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

11 pages, 1429 KiB

Open Access Article

Area-Time Efficient Two-Dimensional Reconfigurable Integer DCT Architecture for HEVC

by Pramod Kumar Meher, Siew-Kei Lam, Thambipillai Srikanthan, Dong Hwan Kim and Sang Yoon Park

Electronics 2021, 10(5), 603; https://doi.org/10.3390/electronics10050603 - 05 Mar 2021

Cited by 4 | Viewed by 2247

Abstract

In this paper, we present area-time efficient reconfigurable architectures for the implementation of the integer discrete cosine transform (DCT), which supports all the transform lengths to be used in High Efficiency Video Coding (HEVC). We propose three 1D reconfigurable architectures that can be configured for the computation of the DCT of any of the prescribed lengths such as 4, 8, 16, and 32. It is shown that matrix multiplication schemes involving fewer adders can be used to derive parallel architectures for 1D integer DCT of different lengths. A novel transposition buffer is designed to be used for the proposed 2D DCT architecture, which offers double the throughput without increasing the size of the transposition buffer. We determine the optimal pipeline locations in the proposed design through the precise estimation of propagation delays and the critical path so that the area-delay-product is optimized and all the output samples are obtained in the same cycle in spite of the recursive nature of the structure. Implementation results show that the proposed 2D integer DCT architectures provide significantly higher throughput per unit area than the existing designs for HEVC.
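As a rough illustration of how a length-4 integer DCT can be computed with fewer operations than a full matrix product, the sketch below compares the direct form with the standard even-odd (butterfly) decomposition. It uses the 4-point HEVC core transform coefficients; the function names are hypothetical, the scaling and rounding shifts that HEVC applies after each transform stage are omitted, and nothing here is taken from the architectures proposed in the paper.

```python
# 4-point HEVC core transform matrix (integer approximation of the DCT).
HEVC_DCT4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def dct4_direct(x):
    """Full matrix-vector product: 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in HEVC_DCT4]


def dct4_butterfly(x):
    """Even-odd decomposition: 6 multiplications after one butterfly stage."""
    # Butterfly stage: sums feed the even outputs, differences the odd ones.
    e0, e1 = x[0] + x[3], x[1] + x[2]
    o0, o1 = x[0] - x[3], x[1] - x[2]
    return [
        64 * (e0 + e1),      # output 0
        83 * o0 + 36 * o1,   # output 1
        64 * (e0 - e1),      # output 2
        36 * o0 - 83 * o1,   # output 3
    ]
```

Both functions return the same unscaled coefficients for any integer input; the butterfly form exploits the symmetry of the matrix rows, which is the property that larger transform lengths (8, 16, 32) also share.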

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

18 pages, 3275 KiB

Open Access Article

Low-Complexity High-Throughput QC-LDPC Decoder for 5G New Radio Wireless Communication

by Tram Thi Bao Nguyen, Tuy Nguyen Tan and Hanho Lee

Electronics 2021, 10(4), 516; https://doi.org/10.3390/electronics10040516 - 22 Feb 2021

Cited by 21 | Viewed by 5343

Abstract

This paper presents a pipelined layered quasi-cyclic low-density parity-check (QC-LDPC) decoder architecture targeting low complexity, high throughput, and efficient use of hardware resources, compliant with the specifications of the 5G new radio (NR) wireless communication standard. First, a combined min-sum (CMS) decoding algorithm, which is a combination of the offset min-sum and the original min-sum algorithms, is proposed. Then, a low-complexity and high-throughput pipelined layered QC-LDPC decoder architecture for enhanced mobile broadband specifications in 5G NR wireless standards, based on the CMS algorithm with pipelined layered scheduling, is presented. Enhanced versions of check node-based processor architectures are proposed to reduce the complexity of the LDPC decoders. An efficient minimum-finder for the check node unit architecture that reduces the hardware required for the computation of the first two minima is introduced. Moreover, a low-complexity a posteriori information update unit architecture, which requires only one adder array for its operations, is presented. The proposed architecture shows significant improvements in terms of area and throughput compared to other QC-LDPC decoder architectures available in the literature.
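To make the role of the first two minima concrete, here is a minimal software sketch of a check-node update in the min-sum family. With `offset=0.0` it behaves as the original min-sum rule, and with a positive `offset` it behaves as offset min-sum. How the paper's CMS algorithm combines the two is not stated in the abstract, so that part is not modeled; the function name and the `offset` parameter are hypothetical.

```python
def check_node_update(llrs, offset=0.0):
    """Min-sum check-node update using only the two smallest magnitudes.

    llrs: incoming variable-to-check messages (at least two values).
    offset: 0.0 gives plain min-sum; a positive value gives offset min-sum.
    """
    # Track the smallest magnitude, its position, and the second smallest.
    min1 = min2 = float("inf")
    idx1 = -1
    for i, value in enumerate(llrs):
        mag = abs(value)
        if mag < min1:
            min1, min2, idx1 = mag, min1, i
        elif mag < min2:
            min2 = mag

    # Product of all input signs (zero is treated as positive).
    total_sign = 1
    for value in llrs:
        if value < 0:
            total_sign = -total_sign

    out = []
    for i, value in enumerate(llrs):
        # Each edge excludes its own input: the edge holding the minimum
        # receives the second minimum, every other edge receives the minimum.
        mag = min2 if i == idx1 else min1
        mag = max(mag - offset, 0.0)  # offset correction, clamped at zero
        # Dividing out the edge's own sign equals multiplying by it.
        sign = total_sign * (-1 if value < 0 else 1)
        out.append(sign * mag)
    return out
```

Because every outgoing magnitude is either the first or the second minimum, a hardware check node only needs to find and store those two values plus the index of the first, which is why the minimum-finder dominates the cost of the unit.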

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

13 pages, 1888 KiB

Open Access Article

Reconfigurable Binary Neural Network Accelerator with Adaptive Parallelism Scheme

by Jaechan Cho, Yongchul Jung, Seongjoo Lee and Yunho Jung

Electronics 2021, 10(3), 230; https://doi.org/10.3390/electronics10030230 - 20 Jan 2021

Cited by 10 | Viewed by 3084

Abstract

Binary neural networks (BNNs) have attracted significant interest for the implementation of deep neural networks (DNNs) on resource-constrained edge devices, and various BNN accelerator architectures have been proposed to achieve higher efficiency. BNN accelerators can be divided into two categories: streaming and layer accelerators. Although streaming accelerators designed for a specific BNN network topology provide high throughput, they are infeasible for various sensor applications in edge AI because of their complexity and inflexibility. In contrast, layer accelerators with reasonable resources can support various network topologies, but they operate with the same parallelism for all the layers of the BNN, which degrades throughput performance at certain layers. To overcome this problem, we propose a BNN accelerator with adaptive parallelism that offers high throughput performance in all layers. The proposed accelerator analyzes target layer parameters and operates with optimal parallelism using reasonable resources. In addition, this architecture is able to fully compute all types of BNN layers thanks to its reconfigurability, and it can achieve a higher area–speed efficiency than existing accelerators. In performance evaluation using state-of-the-art BNN topologies, the designed BNN accelerator achieved an area–speed efficiency 9.69 times higher than previous FPGA implementations and 24% higher than existing VLSI implementations for BNNs.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

13 pages, 4659 KiB

Open Access Article

Area-Efficient Vision-Based Feature Tracker for Autonomous Hovering of Unmanned Aerial Vehicle

by Hyeon Kim, Jaechan Cho, Yongchul Jung, Seongjoo Lee and Yunho Jung

Electronics 2020, 9(10), 1591; https://doi.org/10.3390/electronics9101591 - 28 Sep 2020

Cited by 5 | Viewed by 2641

Abstract

In this paper, we propose a vision-based feature tracker for the autonomous hovering of an unmanned aerial vehicle (UAV) and present an area-efficient hardware architecture for its integration into a flight control system-on-chip, which is essential for small UAVs. The proposed feature tracker is based on the Shi–Tomasi algorithm for feature detection and the pyramidal Lucas–Kanade (PLK) algorithm for feature tracking. By applying an efficient hardware structure that leverages the common computations between the Shi–Tomasi and PLK algorithms, the proposed feature tracker offers good tracking performance with fewer hardware resources than existing feature tracker implementations. To evaluate the tracking performance of the proposed feature tracker, we compared it with the GPS-based trajectories of a drone in various flight environments, such as lawn, asphalt, and sidewalk blocks. The proposed tracker exhibited an average accuracy of 0.039 in terms of normalized root-mean-square error (NRMSE). The proposed feature tracker was designed using the Verilog hardware description language and implemented on a field-programmable gate array (FPGA). The proposed feature tracker uses 2744 slices, 25 DSPs, and 93 Kbit of memory and supports real-time processing at 417 FPS at an operating frequency of 130 MHz for 640 × 480 VGA images.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

12 pages, 2031 KiB

Open Access Article

High Efficiency Ring-LWE Cryptoprocessor Using Shared Arithmetic Components

by Tuy Nguyen Tan, Tram Thi Bao Nguyen and Hanho Lee

Electronics 2020, 9(7), 1075; https://doi.org/10.3390/electronics9071075 - 30 Jun 2020

Cited by 1 | Viewed by 2787

Abstract

A high-efficiency architecture for a ring learning with errors (ring-LWE) cryptoprocessor using shared arithmetic components is presented in this paper. By applying a novel approach for sharing the number theoretic transform (NTT) polynomial multiplier and the polynomial adder between encryption and decryption operations, the total number of polynomial multipliers and polynomial adders used in the proposed ring-LWE cryptoprocessor is reduced. In addition, the processing time of the NTT polynomial multiplier is shortened by employing a multiple-path delay feedback (MDF) architecture and deploying a pipelined technique between all stages of the NTT processes. As a result, the proposed architecture offers a great reduction in terms of hardware complexity and computation latency compared with existing works. The implementation result for the proposed ring-LWE cryptoprocessor on a Virtex-7 FPGA board using Xilinx VIVADO shows a significant decrease in the number of slices and LUTs compared with previous works. Moreover, the proposed ring-LWE cryptoprocessor offers higher throughput and efficiency than its predecessors.

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)

13 pages, 1868 KiB

Open Access Article

Design and Analysis of an Approximate Adder with Hybrid Error Reduction

by Hyoju Seo, Yoon Seok Yang and Yongtae Kim

Electronics 2020, 9(3), 471; https://doi.org/10.3390/electronics9030471 - 11 Mar 2020

Cited by 44 | Viewed by 5796

Abstract

This paper presents an energy-efficient approximate adder with a novel hybrid error reduction scheme to significantly improve the computation accuracy at the cost of extremely low additional power and area overheads. The proposed hybrid error reduction scheme utilizes only two input bits and adjusts the approximate outputs to reduce the error distance, which leads to an overall improvement in accuracy. The proposed design, when implemented in 65-nm CMOS technology, has 3, 2, and 2 times greater energy, power, and area efficiencies, respectively, than conventional accurate adders. In terms of accuracy, the proposed hybrid error reduction scheme lowers the error rate of the proposed adder to 50%, whereas those of the lower-part OR adder and the optimized lower-part OR constant adder reach 68% and 85%, respectively. Furthermore, the proposed adder has up to 2.24, 2.24, and 1.16 times better performance with respect to the mean error distance, normalized mean error distance (NMED), and mean relative error distance, respectively, than the other approximate adders considered in this paper. Importantly, because of an excellent design tradeoff among delay, power, energy, and accuracy, the proposed adder is found to be the most competitive approximate adder when jointly analyzed in terms of the hardware cost and computation accuracy. Specifically, our proposed adder achieves 51%, 49%, and 47% reductions of the power-, energy-, and error-delay-product-NMED products, respectively, compared to the other considered approximate adders.
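The accuracy metrics named above can be made concrete with a small behavioral model. The sketch below implements the lower-part OR adder, the baseline the abstract compares against and not the proposed design, and evaluates error rate, mean error distance, and NMED exhaustively. The bit widths (`n_bits=8`, `k=4`), the function names, and the choice to normalize NMED by the maximum exact sum are assumptions made for illustration; the paper's own configuration is not given in the abstract.

```python
from itertools import product


def loa_add(a, b, k=4):
    """Lower-part OR adder: bitwise OR on the k low bits, exact add above."""
    if k == 0:
        return a + b
    low_mask = (1 << k) - 1
    low = (a & low_mask) | (b & low_mask)
    # Carry into the exact upper part is speculated from the top low bits.
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1
    high = (a >> k) + (b >> k) + carry_in
    return (high << k) | low


def error_metrics(adder, n_bits=8):
    """Exhaustive error rate, mean error distance (MED), and NMED."""
    n = 1 << n_bits
    errors = 0
    total_ed = 0
    for a, b in product(range(n), repeat=2):
        ed = abs(adder(a, b) - (a + b))  # error distance for this input pair
        errors += ed != 0
        total_ed += ed
    count = n * n
    med = total_ed / count
    max_sum = 2 * (n - 1)  # assumed normalization: largest exact result
    return {"error_rate": errors / count, "med": med, "nmed": med / max_sum}
```

A call such as `error_metrics(loa_add)` sweeps all 65,536 input pairs of an 8-bit adder. Exhaustive evaluation is practical only for small widths; wider adders are usually characterized with random input samples instead.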

(This article belongs to the Special Issue System-on-Chip (SoC) Design and Its Applications)
