Digital Twin—Cyber Replica of Physical Things: Architecture, Applications and Future Research Directions

Qian, Cheng; Liu, Xing; Ripley, Colin; Qian, Mian; Liang, Fan; Yu, Wei

doi:10.3390/fi14020064


by Cheng Qian ¹, Xing Liu ¹, Colin Ripley ¹, Mian Qian ¹, Fan Liang ² and Wei Yu ¹,*

¹ Department of Computer and Information Science, Towson University, Towson, MD 21252, USA
² Department of Computer Science, Sam Houston State University, Huntsville, TX 77340, USA
* Author to whom correspondence should be addressed.

Future Internet 2022, 14(2), 64; https://doi.org/10.3390/fi14020064

Submission received: 27 January 2022 / Revised: 12 February 2022 / Accepted: 14 February 2022 / Published: 21 February 2022

(This article belongs to the Special Issue Towards Convergence of Internet of Things and Cyber-Physical Systems)


Abstract: The Internet of Things (IoT) connects massive numbers of smart devices to collect big data and carry out the monitoring and control of numerous things in cyber-physical systems (CPS). By leveraging machine learning (ML) and deep learning (DL) techniques to analyze the collected data, physical systems can be monitored and controlled effectively. Along with the development of IoT and data analysis technologies, a number of CPS (smart grid, smart transportation, smart manufacturing, smart cities, etc.) adopt these technologies to improve their performance and operations. Nonetheless, directly manipulating or updating the real system has inherent risks. Thus, creating a digital clone of a real physical system, denoted as a Digital Twin (DT), is a viable strategy. Generally speaking, a DT is a data-driven software and hardware emulation platform, which is a cyber replica of physical systems. Meanwhile, a DT describes a specific physical system and aims to reproduce the functions and use cases of that system. Since a DT is a complex digital system, finding a way to effectively represent a variety of things in a timely and efficient manner poses numerous challenges to the networking, computing, and data analytics for IoT. Furthermore, the design of a DT for IoT systems must consider numerous exceptional requirements (e.g., latency, reliability, safety, scalability, security, and privacy). The thoughtful design of DTs therefore offers opportunities for novel and interdisciplinary research efforts. In this paper, we first review the architectures of DTs, data representation, and communication protocols. We then review existing efforts on applying DT to IoT data-driven smart systems, including the smart grid, smart transportation, smart manufacturing, and smart cities. Further, we summarize the existing challenges from CPS, data science, optimization, and security and privacy perspectives. Finally, we outline possible future research directions from the perspectives of performance, new DT-driven services, model and learning, and security and privacy.

Keywords: Digital Twin; Internet of Things (IoT); cyber-physical systems; smart-world applications


1. Introduction

The technological trend of the Internet of Things (IoT) has led to a massive increase in the number of smart devices that are connected to cyberspace [1,2,3]. In order for these smart devices to make a meaningful impact, it is important that they have the capability of capturing information related to their intended use; some common information that would be captured from typical smart devices may include traffic flows, temperature, humidity, and energy consumption. Depending on what a smart device is designed for, it may capture the key characteristics of things so that the efficiency and intelligence of cyber-physical systems (CPS), including energy, transportation, manufacturing, agriculture, healthcare, and other critical infrastructure systems, can be realized [4,5,6,7,8,9,10,11,12,13,14]. Due to the large number of data points that are introduced in these CPS, it is important to adopt advanced networking, data analysis (e.g., deep learning), and cloud/edge computing technologies in smart systems [15,16,17,18,19,20,21]. By doing this, the relevant information can be collected, transmitted, analyzed, and shared in a timely manner so that usable information can provide the intelligence to enhance the monitoring and control abilities of physical systems [22].

As an energy-based CPS, the smart grid can be identified as a complex system, where numerous uncertainties arise in both cyber and physical components [23,24,25,26]. The uncertainties can be caused by an outage in one part of the grid; in other words, the smart grid is easily affected by random incidents such as power load imbalance, outages, and even disruption from surrounding environments. The importance and fragility of the smart grid, which is facing unprecedented challenges, cannot be overstated. Building on the development of big data and IoT, big data analysis tools (ML/DL, data mining, statistics, etc.) can be adapted to predict potential risks in order to mitigate accidents in the smart grid [27]. Furthermore, big data can be leveraged to establish a virtual smart grid environment that simulates real accidents so that mitigation plans can be investigated and developed.

Another example is related to wireless sensor networks, which can be used in different environmental monitoring systems. For example, smart farming is an emerging application that connects IoT devices so that agricultural production, soil quality, temperature, and humidity can be managed [12,28,29]. Due to harsh weather and environments, there are numerous uncertainties that could affect the reliability of connectivity and physical devices in smart farming. These uncertainties can cause failures and reduce the quantity and quality of agricultural products. Thus, it is necessary to design and implement the smart farming system as a typical CPS in the agriculture domain so that important data can be collected and efficient operations of smart farms can be supported. To this end, leveraging the collected data to create a cyber replica of the farming system for testing and evaluation is a feasible approach to deal with a variety of uncertainties in the system.

Likewise, in smart manufacturing systems, leveraging big data analysis can help system administrators and engineers discover vulnerabilities in the system. In addition, based on big data analysis, system administrators and engineers can update systems to optimize supply chain and production performance [7,30,31]. Nonetheless, directly manipulating or updating the real system has some critical risks. Thus, creating a cyber replica of real physical systems to emulate real cases in those physical systems is an efficient strategy, which is named Digital Twin (DT).

Generally speaking, DT is a digital clone of a real physical system. The vision of DT refers to a data-driven software and hardware set that describes a real physical system with all of its functions and use cases. It also includes all status and information across the life-cycle phases [32]. Despite its obvious benefits, integrating DT into CPS could pose numerous challenges to modeling, computing, networking, and data analysis, alongside the exceptional requirements of CPS with respect to latency, reliability, safety, scalability, security, and privacy, among others. While advanced networking, computing, and data analysis technologies can help the realization of DT, there are a number of issues that need to be addressed, including how to define the theoretical foundation and modeling techniques such that DT accurately and reliably reflects the states of things, how to design ML/DL models to achieve real-time big data processing, and how to secure DT and protect privacy-sensitive information during collection and publishing.

There are some existing studies that have proposed feasible approaches, such as leveraging multi-domain and multi-level designs in manufacturing systems; that is, designs based on different sub-system digital abstractions in different domains, and designs based on the principal structures of real physical systems at different levels [33]. In order to ensure that DT can represent the physical systems precisely, the multi-domain and multi-level design process must accompany the life-cycle of real physical systems. In addition, some existing surveys summarize existing DT efforts that are designed to emulate real physical systems. For example, there are several existing surveys focusing on inspecting the framework of DT in industrial systems [30,34,35,36]. Additionally, some studies have focused on categorizing different DTs in the power grid [37,38,39,40]. Nonetheless, since IoT is widely applied to different physical systems, it is necessary to conduct a comprehensive investigation of extending DTs to IoT-based smart systems driven by data science and engineering. Furthermore, based on new techniques, the challenges and future research directions for DT need to be discussed.

To address the aforementioned problems and issues, we make the following contributions in this study.

We begin with the architecture design of existing DTs, which includes DT variants, DT sub-types, and IoT-based DT. We then review the data representation and communication protocols for DT. In addition, we present the use cases for applying DT into a variety of smart-world systems, which include smart grid, smart transportation, smart city, and smart manufacturing, as typical CPS.
Based on the architecture and its applications to smart-world systems, we discuss challenges that arise from four perspectives: CPS, data science, optimization, and security and privacy. We also present several research directions for DT that we strongly feel need further research, including performance, new DT-driven services, modeling and machine learning, and security and privacy.

The remainder of this paper is organized as follows: In Section 2, we provide the background of DT, including its key concepts, techniques, architecture, and protocols. In Section 3, we review examples of applying DT to smart-world systems, including the smart grid, smart transportation, smart manufacturing, and smart cities, as case studies. In Section 4, we present the challenges of DT from the perspectives of CPS, data science, optimization, and security and privacy. In Section 5, we outline possible future research directions from the perspectives of performance, new DT-driven services, modeling and machine learning, and security and privacy. Finally, we summarize the paper in Section 6.

2. Digital Twin

In this section, we review the basic concepts and architecture of DT, as well as data representation and communication protocols for enabling DT.

2.1. Basic Concepts

A DT system can be regarded as a replica of a target physical system. It leverages a model to continuously simulate different functions of physical systems. In order to accomplish this, the DT must have a connection with the target physical entity so that the state of physical things can be collected and updated. Thus, a DT model can be used to predict, control, or optimize the functionality of things while at the same time learning from things that it represents. Note that the function of a DT is more than just simulation; it can interact with the physical system so that it can adapt to environmental changes.

DT technology relies heavily on the ability of ingesting large amounts of data and drawing conclusions on their correlation. Because of this, DT is closely tied with big modeling driven by advanced ML/DL and big data analytic techniques so that real-time forecasting and prediction can be achieved. For example, by leveraging DT in manufacturing processes, we can plan repair and maintenance activities, which could eliminate failures from the manufacturing process. In order to obtain this seamless integration of DT and physical systems, it is critical that the DT obtains the real-time state information of physical systems (i.e., sharing a connection to data); this is realized by using IoT sensors and networking technologies.

Simulation technology has been evolving for several decades and is most commonly used in Computer Aided Design (CAD), where a simulation of a system can provide insight into the likely outcome of a specific scenario. DT improves on this technology by introducing ML/DL engines that interpret data and can interact with the physical system. This important distinction between traditional simulations and DT is made possible through the use of sensors, communication networks, data aggregation and analysis, as well as a cost-effective and intelligent decision core, which can interpret the process and provide real-time feedback to the system.

2.2. Architecture

Generally speaking, DT can be divided into three parts: the physical system, the digital system, and the connection between them. Figure 1 illustrates the general architecture of a DT. The physical system represents any actual system in the real world, including the smart grid, smart transportation, smart manufacturing, and smart cities. The physical system can provide service to multiple users. However, the environment may change while the physical system operates, in which case the physical system needs to take action to handle this change. Because updating a real physical system can involve a complex operating procedure, doing so directly is often difficult. Under such a circumstance, the DT can leverage a simplified model of the physical system and, using the data from the new environment, perform simulation and guide the physical system to perform the next step.

2.2.1. DT Variants

Based on different types of data sharing between physical and digital systems, we can categorize DT into three variants: Digital Model, Digital Shadow, and Digital Twin (DT). Figure 2 represents the architecture of these DT variants.

Generally speaking, digital modeling refers to a digital representation of one existing physical system or its theoretical model. The digital representation of the system may include a detailed description of the physical components. A distinct difference is that a true DT system would utilize the data collected from the physical system to build a digital system for simulation and control of the physical system based on the simulation results. Nonetheless, for digital models, digital systems cannot automatically manipulate physical systems based on digital system simulation results [30,41]. Changes to the physical representation will not affect the digital representation and vice versa.

To expand on the modeling capabilities of the digital model, the Digital Shadow further takes the simulation and integrates a one-way data flow from the physical object to the digital representation [42]. Changes to the state of the physical systems can dynamically alter the representation of the digital model based on changing states in the physical system. This type of digital modeling can be used as a virtual representation of the physical system. In this way, the system administrator can intuitively observe the operation of the physical system and can respond in time according to the existing problems of the system. Changes to the physical system can have impacts on the virtual model but not vice versa.

The DT is an extension of the digital model and digital shadow. The distinct difference between the two previous examples and a DT system is the two-way link between the physical system and the digital model. One benefit of having the two-way link is the ability to affect the physical system based on the digital representation. Changes in the physical system can be reflected over the digital model so that possible outcomes based on system variables can be output. The link back to the physical system allows the control system to interact with the physical system so that a desired outcome can be achieved.
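The distinction between the three variants comes down to which data flows are automated. The following is a minimal sketch of that rule; the names `Variant` and `classify` are hypothetical and are used only for illustration. The combination of an automated digital-to-physical link without a physical-to-digital link is not named among the three variants, so the sketch rejects it rather than guessing.

```python
from enum import Enum


class Variant(Enum):
    DIGITAL_MODEL = "Digital Model"
    DIGITAL_SHADOW = "Digital Shadow"
    DIGITAL_TWIN = "Digital Twin"


def classify(physical_to_digital: bool, digital_to_physical: bool) -> Variant:
    """Map the set of automated data flows to one of the three variants."""
    if physical_to_digital and digital_to_physical:
        return Variant.DIGITAL_TWIN      # two-way link
    if physical_to_digital:
        return Variant.DIGITAL_SHADOW    # one-way link, physical -> digital
    if not digital_to_physical:
        return Variant.DIGITAL_MODEL     # no automated data exchange
    # Assumption: this combination is outside the three variants above.
    raise ValueError("digital-to-physical only is not one of the three variants")


labels = [classify(False, False), classify(True, False), classify(True, True)]
```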

2.2.2. Types of DT

In addition to the three variants mentioned above, the DT can be categorized into four types: DT prototype (DTP) [43], DT instance (DTI) [44], DT aggregation (DTA) [45], and DT environment (DTE) [44,45,46]. Figure 3 and Figure 4 represent the architecture of DT types and DT environment, respectively.

The DTP represents a type of DT framework that leverages all the information from the physical system, which is necessary to reproduce the system in the physical world. The data flow transfers from the physical system to the digital system. The purpose of DTP is to improve the time and cost efficiency of the operation of the physical system. DTP can be used in monitoring systems, such as smart grid monitoring systems. However, the DTP data flow is a one-way flow from the physical system to the digital system. It can only monitor the system and cannot manage it according to the incoming data from the physical system.

The DTI is a type of DT that connects to its corresponding physical target. Unlike the DTP, the DTI handles the data flow from the digital system to the physical system. The data flow contains predictions or guidelines, which assist the physical system to operate simultaneously when the environment changes. With both DTP and DTI, the two-way connection between the physical system and digital system has been established. Nonetheless, the DTI only represents a single data flow from the digital system to the physical system. In this case, another term, called DTA, has been proposed to represent the aggregation of all the DTIs. DTI/DTA can be used to manage systems as data flows from digital systems to physical systems. The operator can operate the physical system according to the prediction result of the physical system. Nonetheless, the DTI/DTA data flow cannot be transferred directly from the physical system to the digital system. Additional mechanisms need to be implemented to retrieve information from the physical system to enable the DT to operate.

The DTE (DT environment) can be considered as the boundary of the DT application. One DTE can contain multiple DT systems. The DTE is responsible for managing all the DT systems under its coverage to assist in the operation of the relevant physical systems. From the application point of view, DTE can be used to manage large systems. At the same time, a synchronization mechanism needs to be implemented to speed up the query processing among multiple digital systems.

2.2.3. Architecture for IoT Systems

As massive numbers of IoT devices have been deployed across a variety of real-world applications and physical locations, the collection, aggregation, storage, and analysis of their generated data is a prodigious challenge. Figure 5 illustrates a generic architecture for IoT systems [47], which consists of four layers (i.e., object layer, communication layer, application layer, and end-user layer).

The object layer represents all IoT sensors and is used to provide data and information for different IoT-driven applications. It contains all the components that make up the physical system. The communication layer provides the communication network infrastructure to connect IoT devices and collect data for DT. For example, a number of edge gateways can be deployed to collect and aggregate information from sensors and send the information to the application layer. IoT sensors and gateways use different communication protocols to collect and transmit information. After a gateway obtains the information from the object layer, it transmits the data to the application layer. The application layer includes the digital systems of DT. At this level, the DT system first maps all sensor and gateway information to the digital system. At the same time, we can use the IoT naming service to name the mapped IoT sensors and local gateways, so that the digital system can locate different resources [48]. The digital system can then use ML/DL models to perform prediction and analysis. By doing so, the digital system can control the actuators in the object layer based on the prediction results. Furthermore, the prediction and analysis results can provide different services in the end-user layer. The end-user layer provides services for users, who can send requests to the application layer. After that, the digital system will analyze the information obtained and respond with the results accordingly.
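The flow through the four layers can be sketched in a few lines. This is an illustrative toy under stated assumptions, not an implementation from the cited architecture: the class names `Gateway` and `DigitalSystem`, the `gateway/sensor` naming scheme, and the threshold rule standing in for an ML/DL model are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Gateway:
    """Communication layer: aggregates readings from object-layer sensors."""
    name: str
    readings: dict = field(default_factory=dict)  # sensor id -> list of values

    def collect(self, sensor_id: str, value: float) -> None:
        self.readings.setdefault(sensor_id, []).append(value)


@dataclass
class DigitalSystem:
    """Application layer: mirrors sensors under resolvable names."""
    registry: dict = field(default_factory=dict)  # resource name -> aggregate

    def sync(self, gateway: Gateway) -> None:
        # Map every sensor behind the gateway to a named digital resource.
        for sensor_id, values in gateway.readings.items():
            self.registry[f"{gateway.name}/{sensor_id}"] = mean(values)

    def predict(self, name: str) -> float:
        # Placeholder for an ML/DL model: returns the current aggregate.
        return self.registry[name]

    def control(self, name: str, limit: float) -> str:
        # Command intended for an actuator in the object layer.
        return "cool" if self.predict(name) > limit else "idle"

    def handle_request(self, name: str) -> dict:
        # Response returned to the end-user layer.
        return {"resource": name, "value": self.predict(name)}


gw = Gateway("gw1")
gw.collect("temp1", 30.0)
gw.collect("temp1", 32.0)
twin = DigitalSystem()
twin.sync(gw)
command = twin.control("gw1/temp1", limit=28.0)
reply = twin.handle_request("gw1/temp1")
```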

Based on the architecture of the DT system, data representation and communication protocols are essential for data sharing within the DT, as well as for the provisioning of services to end users.

2.3. Data Representation

Data representation enables components to understand data from different domains [49,50,51]. Several commonly used data representation protocols in DT are DT Definition Language (DTDL) [52], FIWARE [53], OPC Unified Architecture (OPC UA) [54,55], and Feature-Based DT Framework (FDTF) [56]. In particular, DTDL is an open-standard platform proposed by Microsoft [52], which can realize data transmission within the system or between different systems. The summary of data representation protocols is listed in Table 1.

DTDL [52] defines the following six elements of IoT components: (i) Interface: the basic information of a component (device ID, device type, display content, etc.). Different interfaces can inherit attributes from parent interfaces (e.g., charging station components can inherit some attributes of the parking lot such as addresses); (ii) Telemetry: the data sent from any component in the IoT-based DT (the raw data from IoT sensors, the processed data generated by DT models, etc.); (iii) Property: the status of the components in the IoT-based DT (e.g., whether the device is read-only or read-write). This can also include the states shared between different DTs (whether the data between two DTs are consistent, etc.); (iv) Command: operations that any DT can understand; (v) Relationship: the connection between the DTs (e.g., a cleaner is related to a house by a "cleans" relationship); and (vi) Component: the entities that exist in the DT, including sensors, gateways, and digital systems. Based upon these six elements, DTDL can describe multiple DTs with a unified structure and data format, which enables seamless data transmission between different DTs.
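To make the six elements concrete, the sketch below builds a small interface definition as a Python dictionary and serializes it to JSON. It assumes the JSON-LD layout of DTDL version 2; every identifier (the `dtmi:com:example:...` names, `powerDraw`, `maxCurrent`, and so on) is hypothetical and chosen only to echo the charging station example.

```python
import json

# Hypothetical interface for a charging station, assuming the DTDL v2 layout.
charging_station = {
    "@context": "dtmi:dtdl:context;2",
    "@id": "dtmi:com:example:ChargingStation;1",
    "@type": "Interface",
    "displayName": "Charging station",
    # Inherit attributes (e.g., the address) from a parent interface.
    "extends": "dtmi:com:example:ParkingLot;1",
    "contents": [
        {"@type": "Telemetry", "name": "powerDraw", "schema": "double"},
        {"@type": "Property", "name": "maxCurrent", "schema": "double",
         "writable": True},
        {"@type": "Command", "name": "stopCharging"},
        {"@type": "Relationship", "name": "serves",
         "target": "dtmi:com:example:Vehicle;1"},
        {"@type": "Component", "name": "meter",
         "schema": "dtmi:com:example:Meter;1"},
    ],
}

document = json.dumps(charging_station, indent=2)
kinds = [item["@type"] for item in charging_station["contents"]]
```

Here the Interface is the container, and the other five element kinds appear as entries of its `contents` list.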

FIWARE [53] is also an open-source project designed to support DT data transmission and the processing of contextual information received from various IoT components. FIWARE developed a data modeling standard called FIWARE NGSI that describes the collection, processing, and change notification operations of contextual information. FIWARE NGSI is an object-oriented data protocol that uses context objects to represent physical/digital objects. NGSI provides a data model that enables data exchange between multiple DTs. It also uses NGSI-LD to represent the relationship between different objects and uses JSON-LD for encoding to unify the data format.
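As an illustration of a context object, the following sketch writes one entity in the NGSI-LD style, where each attribute is tagged as either a Property or a Relationship. The entity, its attribute names, and the helper `relationships` are hypothetical; the `@context` URL is the ETSI core context as commonly published and should be checked against the FIWARE documentation before use.

```python
import json

# Hypothetical NGSI-LD style entity; names and URNs are illustrative only.
entity = {
    "id": "urn:ngsi-ld:ChargingStation:001",
    "type": "ChargingStation",
    "availableSockets": {"type": "Property", "value": 3},
    "locatedIn": {"type": "Relationship",
                  "object": "urn:ngsi-ld:ParkingLot:001"},
    "@context": ["https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"],
}


def relationships(e: dict) -> dict:
    """Return the links from this entity to other entities."""
    return {name: attr["object"] for name, attr in e.items()
            if isinstance(attr, dict) and attr.get("type") == "Relationship"}


links = relationships(entity)
payload = json.dumps(entity)
```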

OPC UA [54,55] is a modeling framework that can retrieve information from raw data. OPC UA defines the model for transmitting and interpreting data between different systems. It also has a mechanism to traverse all data and analyze its semantics. OPC UA supports data manipulation to make it easier to manage data under the DT architecture. Furthermore, it provides monitoring capabilities, enabling the DT system to manage the state of all sensors under its control.

FDTF [56] is a DT structure that enables the DT system to share information based on the data link between DT components. This structure realizes data sharing between digital systems and physical systems, as well as data sharing within different DT models.

2.4. Communication Protocols

The communication protocol realizes the information transmission between IoT devices and between IoT systems [49,50,51]. There are some representative communication protocols common across IoT systems, including Constrained Application Protocol (CoAP) [57,58], Message Queuing Telemetry Transport (MQTT), which is an OASIS standard [59], Modbus TCP/IP Protocol [60,61,62,63,64], and Ultra Reliable Low Latency Communication (URLLC) [65]. A summary of the communication protocols is listed in Table 1.

CoAP [57,58] is a specialized web communication protocol based on the User Datagram Protocol (UDP), tailored for resource-restricted devices (battery-powered IoT devices, etc.). CoAP can be used to handle communication between IoT devices. It is designed to interoperate with the Hypertext Transfer Protocol (HTTP), which makes it a good fit for Web-based applications. CoAP can also provide a publish and subscribe mechanism to simplify the process of obtaining continuous data from a sensor.
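Part of what makes CoAP suitable for constrained devices is its compact binary header. The sketch below packs the fixed 4-byte header followed by an optional token, assuming the field layout of RFC 7252 (version, type, token length, code, message ID); the function name `coap_header` is hypothetical, and options and payload are omitted.

```python
import struct

COAP_VERSION = 1
TYPE_CONFIRMABLE = 0
CODE_GET = 0x01  # request code 0.01


def coap_header(msg_type: int, code: int, message_id: int,
                token: bytes = b"") -> bytes:
    """Pack the fixed CoAP header plus token (no options, no payload)."""
    if len(token) > 8:
        raise ValueError("token length must be 0..8 bytes")
    # Version (2 bits) | type (2 bits) | token length (4 bits)
    first = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first, code, message_id) + token


frame = coap_header(TYPE_CONFIRMABLE, CODE_GET, 0x1234, token=b"\xab")
```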

MQTT [59] is a communication protocol based on the Transmission Control Protocol (TCP). It provides a lightweight way for IoT devices to communicate with each other through MQTT messages. These messages are lightweight due to their compact headers, so that the use of network resources can be reduced. The MQTT protocol includes MQTT clients and MQTT brokers. The MQTT clients can be IoT sensors, and the MQTT broker can be deployed in an IoT gateway, which is used to manage data communication between sensors and control commands to IoT actuators. Since MQTT is based on TCP, it can provide reliable data transfer. Furthermore, IoT sensors may need to stay connected to the IoT gateway for an extended period of time. In this case, with the help of MQTT brokers, IoT sensors can establish a long-lived outgoing TCP connection to enable transmission.
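The client and broker roles can be illustrated without any network code. The following is a toy in-memory broker, not a real MQTT implementation: it only mimics topic-based routing with the `+` (single level) and `#` (multi level) wildcards of MQTT topic filters. The names `ToyBroker` and `topic_matches` are hypothetical.

```python
from collections import defaultdict


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check a topic against a filter with '+' and '#' wildcards."""
    f_parts, t_parts = topic_filter.split("/"), topic.split("/")
    for i, part in enumerate(f_parts):
        if part == "#":                  # matches this level and all below
            return True
        if i >= len(t_parts):
            return False
        if part != "+" and part != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)


class ToyBroker:
    """Routes published messages to subscribers (in memory only)."""

    def __init__(self):
        self.subscriptions = defaultdict(list)  # filter -> callbacks

    def subscribe(self, topic_filter, callback):
        self.subscriptions[topic_filter].append(callback)

    def publish(self, topic, payload):
        delivered = 0
        for topic_filter, callbacks in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)
                    delivered += 1
        return delivered


received = []
broker = ToyBroker()
broker.subscribe("plant/+/temperature", lambda t, p: received.append((t, p)))
broker.publish("plant/line1/temperature", "21.5")  # delivered
broker.publish("plant/line1/humidity", "40")       # no subscriber matches
```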

Modbus TCP/IP [60,61,62,63,64] is a communication protocol used in the industrial field to realize the connection between industrial devices. The protocol was initially proposed to provide communication between Programmable Logic Controllers (PLCs). Modbus TCP/IP uses port 502 for data transfer. Meanwhile, its frames are protected by the checksums of the underlying TCP/IP and link layers, which supports the reliable data transmission required in IoT systems. With the aid of the Modbus TCP/IP protocol, reliable operations can be supported and applied to the smart grid, smart manufacturing, and other IoT systems.
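A Modbus TCP/IP request is a short binary frame: a 7-byte MBAP header (transaction ID, protocol ID, length, unit ID) followed by the function code and its data. The sketch below builds a "read holding registers" request (function code 3) under that layout; the function name and the example register values are hypothetical, and no socket is opened.

```python
import struct

READ_HOLDING_REGISTERS = 0x03


def read_holding_registers_request(transaction_id: int, unit_id: int,
                                   start_address: int, quantity: int) -> bytes:
    """Build an MBAP header followed by a read-holding-registers PDU."""
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # Length counts the bytes after the length field: unit ID + PDU.
    mbap = struct.pack(">HHHB", transaction_id, 0, 1 + len(pdu), unit_id)
    return mbap + pdu


request = read_holding_registers_request(
    transaction_id=1, unit_id=1, start_address=0, quantity=2)
```

Such a frame would be written to a TCP connection opened to port 502 of the target device.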

URLLC [65] is a communication protocol that can achieve low latency and reliability in the transmission process between IoT devices. Generally speaking, URLLC is based on the 5G wireless protocol, which can reduce the delay to 0.5 ms. In order to reduce the delay, URLLC uses a mini-slot for smaller time resource units, removes the scheduling mechanism of direct channel access, supports asynchronous upload, and adopts dynamic scheduling algorithms such as hybrid automatic repeat request (HARQ). In order to improve reliability, URLLC utilizes a channel estimation mechanism, a transmission diversity mechanism, and a bit error rate reduction mechanism. With the aid of URLLC, IoT sensors can exchange information faster and more reliably.

3. Integrating DT in CPS

In this section, we review some existing efforts on integrating DT in different CPS such as the smart grid, smart transportation, smart manufacturing, and smart cities. DT in CPS is an evolving field of research, and there has been growing research interest in integrating DT in CPS. Figure 6 shows the number of recent publications from 2018 to 2022 in IEEE Xplore on the subject of DT and CPS for four representative CPS (smart grid, smart transportation, smart manufacturing, and smart cities). The included papers relate to both DT and CPS in their content or in their experiments.

3.1. Framework

Figure 7 describes an architecture option of a DT-driven smart-world system motivated by the existing efforts on DT [33]. The architecture consists of three main components: the physical subsystem, the Artificial Intelligence (AI) model, and the DT model. The IoT gateway from each physical system collects and aggregates the data from the physical system. With the support of communication protocols, data can be transmitted from the physical system to the AI model. The AI model is used for data analysis and model design based on the states of the physical system. Then, the AI model stores its trained model as a DT model to represent the digital perspective of the physical system. If the configuration of the physical system does not change, the DT model can guide the physical system to react based on the data collected by the physical system. If the configuration of the physical system changes, or the system needs to respond differently to the same situation, the AI model can update its model according to the changes and revise the DT model to further assist the operations of the physical system. Recall that DT can assist in the control and operation of physical systems [34].
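The train, store, and revise cycle described above can be sketched as follows. This is a simplified illustration under stated assumptions: an ordinary least-squares line fit stands in for the AI model, a configuration version number stands in for "the configuration of the physical system", and the names `fit_line`, `DTModel`, and `refresh` are hypothetical.

```python
from dataclasses import dataclass


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (placeholder for the AI model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


@dataclass
class DTModel:
    """Stored, trained model that guides the physical system."""
    config_version: int
    a: float
    b: float

    def guide(self, x: float) -> float:
        return self.a * x + self.b


def refresh(dt, config_version, xs, ys):
    """Reuse the DT model unless the physical configuration changed."""
    if dt is not None and dt.config_version == config_version:
        return dt
    a, b = fit_line(xs, ys)
    return DTModel(config_version, a, b)


dt = refresh(None, 1, [0, 1, 2, 3], [1, 3, 5, 7])     # initial training
same = refresh(dt, 1, [0, 1, 2, 3], [1, 3, 5, 7])     # unchanged: reused
updated = refresh(dt, 2, [0, 1, 2, 3], [0, 3, 6, 9])  # changed: retrained
```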

3.2. Smart Grid

As a key energy infrastructure, the smart grid leverages information communication technology (i.e., cyber system) to provide two-way communication (of both energy and information) so that the monitoring and control of the grid can be improved and consumers can be engaged. The physical infrastructure of the power grid is composed of power generation facilities, transmission facilities, and power distribution facilities. Power generation facilities are responsible for generating electricity, while transmission facilities are used to deliver power to distribution facilities. Finally, power distribution facilities provide electricity to consumers.

With the integration of the cyber system, the next-generation power grid, denoted as the smart grid, has been envisioned to provide more efficient energy service for users, including reliable and intelligent distribution management, renewable energy integration, energy storage, grid monitoring and control, and the integration of electric vehicles into the grid [5]. Generally speaking, the smart grid is a highly distributed grid system, which integrates information communication techniques, including sensing, networking communications, data analytics, and machine learning [66,67]. These technologies can improve the reliability, efficiency, and security resilience of the grid system.

The smart grid system is a large-scale distributed system. To improve the monitoring and control of the grid, it is critical to know the states of physical objects (voltage, current, etc.) in the system. To do so, a massive quantity of sensors can be deployed in the grid to measure the state of critical objects and derive accurate and insightful knowledge of physical objects based on measurements. In the past, both static and dynamic state estimation have been applied to the monitoring and control of the grid, and bad data detection algorithms have also been designed to deal with corrupted or invalid measurements and/or maliciously injected data, which can be generated via measurement error, sensor failure, or even cyber threats [24,25]. Note that in addition to potential failures in the grid, cyber threats can directly or indirectly affect the grid’s operations by injecting malicious measurement data and/or manipulating the critical control information sent to the actuator. To handle cyber threats, it is critical to not only monitor the state of physical objects but also monitor the state of cyber components. In this regard, the DT can be a viable approach to link the objects in the physical world to the cyber world.
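As a simplified illustration of state estimation with bad data detection, the sketch below estimates a single state (e.g., one bus voltage) that several sensors measure directly, using weighted least squares, and then applies a largest-normalized-residual test to discard suspicious readings. This is a textbook-style toy, not the method of the cited works; the measurement values, the threshold of 3, and the function names are assumptions.

```python
from math import sqrt


def estimate(values, sigmas):
    """Weighted least-squares estimate of one directly measured state."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total, total


def detect_bad_data(values, sigmas, threshold=3.0):
    """Repeatedly drop the reading with the largest normalized residual."""
    active = list(range(len(values)))
    bad = []
    while len(active) > 2:
        vals = [values[i] for i in active]
        sigs = [sigmas[i] for i in active]
        x_hat, total_w = estimate(vals, sigs)
        # The residual variance of sensor i is sigma_i^2 - 1/sum(weights).
        scores = [abs(v - x_hat) / sqrt(s ** 2 - 1.0 / total_w)
                  for v, s in zip(vals, sigs)]
        worst = max(range(len(active)), key=scores.__getitem__)
        if scores[worst] <= threshold:
            break
        bad.append(active.pop(worst))
    x_hat, _ = estimate([values[i] for i in active],
                        [sigmas[i] for i in active])
    return x_hat, bad


# Hypothetical per-unit voltage readings; the last one is corrupted.
voltages = [1.01, 0.99, 1.00, 1.02, 1.45]
x_hat, bad = detect_bad_data(voltages, [0.01] * 5)
```

In this example the fifth reading is flagged and the estimate is recomputed from the remaining four.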

DT uses a data-driven modeling strategy to map the physical grid to the digital grid so that monitoring and management operations for the entire grid can be enabled. Based on the DT architecture that we have discussed, the smart grid system can be divided into several layers: object layer, communication layer, application layer, and end-user layer. In the object layer, grid components include power generation facilities, transmission facilities, and power distribution facilities. The communication layer connects the objects, supporting data exchange between the objects and the application layer as well as data communication within the object layer. For example, the gateway of the communication layer holds the device information of the object layer. When the application layer needs data, the gateway can retrieve the data from a specific device as required. The application layer can also construct a digital model based on the device distribution information from the gateway.

The application layer includes a digital representation of the gateways from the physical system, which contains device information from the object layer; it communicates between the application layer and the object layer through the communication layer. In the application layer, the AI model is trained based on the data collected from the object layer, and the trained model is stored in the data storage for future use. If the physical state of the object layer changes, the application layer can send commands to some actuators in the object layer to control the grid based on the current data and the trained model. If the physical components of the object layer change, the AI model can perform model updates to reflect the changes as well. The end user layer uses the information processed by the application layer to provide services for end users, including smart grid management systems, smart charging systems for autonomous vehicles, and hybrid energy management systems.
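
The interplay between the gateway and the application layer described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the class names `Gateway` and `ApplicationLayer`, the device identifiers, and the fixed sensor readings are hypothetical and only illustrate the on-demand retrieval and the construction of a digital model from the gateway's device information.

```python
from typing import Callable, Dict


class Gateway:
    """Communication-layer gateway that holds device information of the object layer."""

    def __init__(self) -> None:
        self._kinds: Dict[str, str] = {}
        self._readers: Dict[str, Callable[[], float]] = {}

    def register(self, device_id: str, kind: str, reader: Callable[[], float]) -> None:
        self._kinds[device_id] = kind
        self._readers[device_id] = reader

    def device_info(self) -> Dict[str, str]:
        """Return a copy of the device inventory (id -> kind)."""
        return dict(self._kinds)

    def retrieve(self, device_id: str) -> float:
        """Fetch the current reading from one specific device on request."""
        return self._readers[device_id]()


class ApplicationLayer:
    """Builds a digital model from the gateway inventory and refreshes it on demand."""

    def __init__(self, gateway: Gateway) -> None:
        self.gateway = gateway
        self.digital_model = {
            device_id: {"kind": kind, "state": None}
            for device_id, kind in gateway.device_info().items()
        }

    def refresh(self, device_id: str) -> float:
        value = self.gateway.retrieve(device_id)
        self.digital_model[device_id]["state"] = value
        return value


# Hypothetical devices with fixed readings standing in for real sensors.
gateway = Gateway()
gateway.register("feeder-7-voltage", "voltage_sensor", lambda: 229.8)
gateway.register("feeder-7-current", "current_sensor", lambda: 41.2)
twin = ApplicationLayer(gateway)
twin.refresh("feeder-7-voltage")
```

Only the refreshed device carries a state afterwards; the other entry stays empty until the application layer asks for it.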

There are a number of research efforts on DT for the smart grid [38,39,68,69,70,71]. For example, General Electric (GE) proposed two models related to the use of DT in wind farms [68,69]. In the object layer, a communication network was designed to enable wind turbines to communicate with each other. A wind turbine control system was also implemented to control one or more wind turbines. The middleware layer includes a cloud-based infrastructure with the support of the communication network so that the data (e.g., state of turbines) can be collected and turbines can be remotely controlled via gateways. It also includes digital models that represent digital replicas of wind turbines. The digital model can be updated based on the information extracted from the object layer, and the wind farm can be controlled through an industrial gateway. In addition, GE designed a graphical user interface (GUI) in the application layer to show the digital representation of the wind farm and perform the control and management operations of the wind farm.

Ahmed et al. [39] proposed a microgrid DT model based on the IoT to mitigate cyber attacks. The object layer includes microgrids, local controllers, and area controllers. Real-time balancing algorithms were implemented in each controller. If some microgrids are attacked and go offline, the area controller can obtain this information, so that the balancing algorithm will not transfer more power to the offline area based on the tampered data. This architecture can also mitigate distributed denial of service (DDoS) attacks and false data injection attacks. Likewise, Nikolaos et al. [38] used DT to manage a large number of devices in the smart grid system. To be specific, they used a spiking neural network (SNN) deployed on the smart meters in the grid to detect faulty nodes. A transient state estimator (TSE) was designed to reveal the dynamic state of the smart grid. William et al. [70] leveraged deep learning algorithms to analyze data from the supervisory control and data acquisition (SCADA) system to assist the DT in detecting physical faults in the smart grid system. Likewise, Payam et al. [71] proposed a DT framework with artificial neural networks (ANN) for distributed smart grids to ensure real-time model generation, verification, and identification.
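
To make the balancing idea concrete, the sketch below shows one simple way an area controller could ignore demand reported by microgrids that are known to be offline. It is not the algorithm of Ahmed et al. [39]; the proportional sharing rule, the function name `allocate_surplus`, and all numbers are assumptions chosen for illustration.

```python
def allocate_surplus(surplus_kw, reported_deficit_kw, online):
    """Share surplus power among online microgrids in proportion to their reported deficit.

    Microgrids marked offline receive nothing, whatever deficit they report,
    so tampered reports from a compromised area cannot attract power.
    """
    allocation = {name: 0.0 for name in reported_deficit_kw}
    eligible = {
        name: deficit
        for name, deficit in reported_deficit_kw.items()
        if online.get(name, False) and deficit > 0
    }
    total_deficit = sum(eligible.values())
    if total_deficit == 0:
        return allocation
    # Never hand out more than was requested or more than is available.
    scale = min(1.0, surplus_kw / total_deficit)
    for name, deficit in eligible.items():
        allocation[name] = deficit * scale
    return allocation


# Hypothetical scenario: mg3 was attacked, went offline, and reports an inflated deficit.
shares = allocate_surplus(
    surplus_kw=100.0,
    reported_deficit_kw={"mg1": 50.0, "mg2": 150.0, "mg3": 900.0},
    online={"mg1": True, "mg2": True, "mg3": False},
)
```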

3.3. Smart Transportation

The smart transportation system can be regarded as a CPS, which provides a variety of services for the transportation domain, including traffic management control. Some applications in smart transportation systems include toll payment, vehicle operation, emergency, vehicle control and safety management, and maintenance and construction management [6,72,73]. Since the smart transportation system requires real-time data transmission and analysis capabilities to implement time-sensitive services such as autonomous vehicles, it is difficult for traditional models to complete model training under massive data within a short time. It is additionally challenging for traditional smart transportation systems to complete the collection and aggregation of all sensor data within a short time. In this case, the DT model can use high performance cloud/edge servers in the digital system to conduct data aggregation and model training so that real-time processing and transmission can be achieved.

Based on the architecture of the DT that we have discussed, DT in the transportation system consists of four layers. The object layer contains the components of the infrastructure and participants (e.g., drivers), including sensors on the vehicle, traffic cameras, and other traffic sensors. The communication layer uses a variety of gateways to process sensor data from the object layer, and handle data transmission between the object and application layers. The application layer maps the gateway from the communication layer to realize data transmission between the communication layer and the application layer. Based on the data and sensor information from the gateway, the DT can use ML/DL technology to train the model and store it in data storage. The end-user layer can navigate based on the data processed by the application layer, assist the driving of autonomous vehicles, and cooperate with smart cities and smart grids to reduce transportation costs, reduce energy consumption, and make the charging process of electric vehicles more efficient.

There have been some research efforts on DT-driven smart transportation. For example, Peter et al. [74] proposed a scheme for DT-driven smart transportation that attempts to optimize traffic flows in urban areas. The proposed smart transportation system includes an AI model, a digital replica of the road network, and various traffic control and management services. The system obtains data from the object layer, including traffic lights, sensors, and cameras. After the smart transportation system processes the data, it builds an application at the end-user layer based on the collected data, which is called the city service information system. Then, operators can manage the entire system at a high level based on the end-user application. Additionally, Sagar et al. [75] proposed a DT-based adaptive traffic signal control (ATSC) scheme to reduce the waiting time at intersections, thereby improving the driving experience. In that study, Simulation of Urban MObility (SUMO) was used as the simulation platform.
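
The intuition behind adaptive signal control can be illustrated with a deliberately simple rule that splits a fixed cycle among approaches in proportion to their current queue lengths. This is a toy sketch and not the controller of Sagar et al. [75]; the function name, the cycle length, the minimum green time, and the queue counts are assumed values, and no traffic simulator is involved.

```python
def split_green_time(queues, cycle_s=60.0, min_green_s=10.0):
    """Give each approach a minimum green time, then share the rest by queue length."""
    spare_s = cycle_s - len(queues) * min_green_s
    if spare_s < 0:
        raise ValueError("cycle is too short for the requested minimum green times")
    total_queue = sum(queues.values())
    if total_queue == 0:
        # No demand anywhere: fall back to an equal split.
        return {name: cycle_s / len(queues) for name in queues}
    return {
        name: min_green_s + spare_s * queue / total_queue
        for name, queue in queues.items()
    }


# Hypothetical queue lengths (vehicles) observed by the twin at one intersection.
plan = split_green_time({"north_south": 12, "east_west": 4})
# plan == {"north_south": 40.0, "east_west": 20.0}
```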

Additionally, Wang et al. [76] designed a DT-based verification platform to perform verification between different metro systems. The object domain includes the operation center, the trackside, and the trains; their information is collected and transmitted to the DT system of the application layer through various communication protocols, including 4G, 5G, Ethernet, and NB-IoT. The DT platform uses ML/DL, data fusion, and data modeling to analyze all data and assist the end-user level in scheduling, train monitoring, pedestrian monitoring, etc. Sahal et al. [77] proposed a DT structure combined with blockchain to enable DT-based intelligent transportation systems to achieve interoperability, identity verification, distributed ML/decision-making, scalability, and robustness. Likewise, Guo et al. [78] demonstrated a DT-based 3D collaborative vehicle infrastructure system (CVIS) that visualizes real-time road data and uses LiDAR as the road sensor. They used the ROS bridge as the communication protocol between the physical and virtual worlds.

3.4. Smart Manufacturing

The smart manufacturing system, as a general concept, is the integration of traditional manufacturing systems and processes with information communication technologies to enable advanced intelligence and process optimization and automation. The demands of current manufacturing systems can be complex in nature, due to the challenges of scalability, efficiency, delivery, or design requirements [79]. The smart manufacturing system is the utilization of emerging technologies to massively optimize production in terms of throughput, efficiency, precision, loss, resource consumption, scale, and value [7].

Traditional manufacturing systems are comprised of shop floor hardware that works as a unit to streamline the production of a deliverable product. In contrast, the smart manufacturing system integrates communication and computing technologies to utilize information generated from traditional manufacturing systems to make a more complete representation of a product life cycle. The application of IoT to industrial systems, or Industry 4.0, relies on technologies such as sensing, control, communication, computing, and data science to support information flows between different components of the manufacturing system, allowing for better system interoperability and use.

One major limitation in manufacturing systems is the ability to automate and mass produce in a manner that is efficient and cost-effective. With the shifting paradigm in manufacturing of custom user deliverables, there is an increased challenge to provide custom solutions in a method that can be massively produced by manufacturing systems. A common issue in manufacturing is the ability to generate a deliverable that is flexible to the user’s needs while still maintaining automated systems that can quickly and efficiently create products. At scale, it is often difficult to analyze how changing requirements in the product design or supply chain will impact variables such as cost and production timelines while retaining quality and efficiency. Additionally, adoption rates are going to be a challenge for manufacturing environments that are looking to incorporate smart manufacturing systems in their product life-cycle. A core technology for enabling smart manufacturing is the availability of information from the physical objects (i.e., shop floor equipment) through IoT sensors connected to IoT networking infrastructure.

Implementation of DT in the smart manufacturing system can help evaluate the effectiveness of large-scale production environments by utilizing models for analyzing production efficiency and design factors. The link between the virtual and physical representation of a manufacturing system (i.e., in the way of DT) can determine whether a design is going to be cost-effective by integrating with decision support systems such as supply chain management and timeline forecasts. The integration of smart supply chain technologies involves a real-time link between the production environment and the suppliers and distribution centers to determine the deliverable dates [80]. Determining the availability of materials required for manufacturing production will allow the system to generate a cost model and timeline for delivering a final product.

Monitoring the physical production environment with DT systems can allow for the real-time analysis of system health on physical things (i.e., equipment, tools, and hardware) and prevent delays based on environmental factors. In the event of an anomaly detection on a manufacturing system, the DT model can determine whether the work load should be rerouted to another system in the manufacturing line to avoid delays in production [81]. IoT sensors in manufacturing equipment allow for real-time monitoring, enabling the detection of events that could cause breakdowns in the manufacturing system and providing alternative scenarios for maintaining system efficiency to prevent delays in production timelines [82].

Since smart manufacturing can apply to many different levels of technologies in manufacturing systems, it is important to consider the level of adoption for technologies in Industry 4.0 scenarios. For example, Frank et al. [80] analyzed different levels of adoption in several different manufacturing domains and concluded that the implementation of Industry 4.0 technologies is interrelated and provides potential benefits when more systems are integrated with smart manufacturing processes. Additionally, the utilization of both front-end technologies and base technologies could provide more complete integration between physical systems and virtual representations that can better handle the challenges of requirement complexity and decision support systems.

With the support from the communication layer and techniques to enable data aggregation, DT implementation can help with the following scenarios in a smart manufacturing system. The first scenario is fault detection and device health monitoring, which can help smart manufacturing systems avoid outages by predicting device failures and rerouting workflows in the production process. In particular, cloud-based frameworks that are designed to handle large amounts of data utilize embedded sensor systems in manufacturing devices in order to monitor system health, leading to the creation of predictive models for avoidance procedures [83]. In the example of a robotic gripper in a manufacturing production line, Redelinghuys et al. [82] investigated anomaly detection in the gripper closing speed, using a pressure sensor, to detect faults and degradation that could lead to failure, and considered utilizing the result to reroute production traffic to another assembly line. This type of fault detection compares recent samples to historical data and builds a model that determines when a device is likely to fail in a production environment, enabling the automated scheduling of preventative maintenance.
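
Comparing recent samples against historical data can be as simple as a z-score test on the mean of the latest readings. The sketch below applies this idea to hypothetical gripper closing times; it is not the method of Redelinghuys et al. [82], and the function name, the threshold of 3.0, and all sample values are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def is_degraded(history_s, recent_s, z_threshold=3.0):
    """Flag degradation when the recent mean deviates strongly from the historical mean."""
    mu = mean(history_s)
    sigma = stdev(history_s)
    recent_mean = mean(recent_s)
    if sigma == 0:
        return recent_mean != mu
    # Standard error of the mean of len(recent_s) samples under historical noise.
    z_score = abs(recent_mean - mu) / (sigma / sqrt(len(recent_s)))
    return z_score > z_threshold


# Hypothetical gripper closing times in seconds.
history = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.51, 0.49]
healthy_recent = [0.50, 0.51, 0.49]
slow_recent = [0.58, 0.60, 0.61]

needs_maintenance = is_degraded(history, slow_recent)  # True for these values
```

A positive result could then be used to schedule maintenance or to reroute work to another line, as discussed above.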

The second scenario is supporting supply chain management, which involves the integration of DT systems and the communication of real-time data between suppliers and distribution centers. The tracking of products can be integrated into the supplier systems to inform the manufacturer system and can be integrated with the customer-facing deliverable to accurately determine product delivery timelines and end-to-end supply chain logistics. The tracking of goods required for a customer deliverable can be remotely monitored and integrated between supplier and manufacturer systems to determine the availability of goods required to produce a product [84]. In the event that a good required for a custom deliverable is not available, the decision support system can determine whether there is a more suitable supplier for the product and whether it would be cost-efficient to utilize a different supplier. By generating a pricing and timeline model for the availability of goods, the manufacturer can determine whether the product is worth producing and provide an accurate cost analysis to the customer.

Note that one of the emerging challenges of manufacturing systems is the ability to create highly customized products in a manner that is efficient. Manufacturing in general is often focused on the mass production of a good or product in order to decrease the cost and effort required per unit. This type of mass production is often associated with processes that are easily automated to speed up the production time. With the shifting paradigm of manufacturing trending towards more customized deliverables [85], product variety becomes a factor when designing production systems that will maintain efficiency while still meeting diverse customer requirements. Integration with DT systems can allow for the application of ML/DL algorithms and modeling technologies to determine whether a custom deliverable requirement is feasible in a given manufacturing system. Manufacturing systems need to be both flexible and efficient in order to be useful for custom deliverable objectives [86].

There are some existing research efforts on the evolution of smart manufacturing concepts and the best way to apply DT technologies for more robust manufacturing systems [30,86,87,88]. For example, Lu et al. [86] reviewed the standards utilized in smart manufacturing systems, applied their research toward the production of new standards, and further envisioned scenarios for smart manufacturing systems, some of which can benefit from the application of DT systems. Kritzinger et al. [30] took the application of smart manufacturing one step further by studying the application of DT as an enabling technology for Industry 4.0. Likewise, Brennet et al. [87] discussed the convergence of digital and physical systems for Industry 4.0 through the application of Personalized Product Emergence Processes (PPEP) and additive manufacturing technologies.

Furthermore, the implementation of DT in smart manufacturing systems was investigated [81,89,90]. For example, Tao et al. [89] summarized the deployment structure of DT for smart manufacturing based on the concept of DT Workshop, which enables digital systems and physical systems to synchronize with one another. In this case, the digital system can be virtualized based on data from the physical system, and the physical system can be optimized based on the instructions from the digital system. In order to better explain how to deploy Digital Twin Shop-Floor (DTS), DTS was divided into four parts: physical shop-floor, virtual shop-floor, shop-floor service system, and shop-floor DT. Likewise, Kamil et al. [90] implemented a DT in an experimental assembly quality production inspection system. In particular, they used radio-frequency identification (RFID) and cameras in the physical system and used a programmable logic controller (PLC) system to collect data. The OPC UA framework was considered to transfer data between the physical system and the digital system. The digital model consists of a 3D model created in CAD design software and is simulated and virtualized using the Tecnomatix platform. The cloud platform was integrated into the digital system for data analysis so that the assembly quality inspection could be realized without stopping the manufacturing process. Open-source technology can also benefit the development of applying DT in smart manufacturing scenarios. For example, Aghenta et al. [91] proposed a low-cost IoT-based SCADA system solution. In addition to providing the basic functions of a traditional SCADA system, this particular system can not only support data acquisition, transmission, and presentation but also the monitoring and control of physical objects.

3.5. Smart Cities

The objective of the smart city is to leverage information communication and data-related technologies so that resources can be efficiently used, leading to better quality of life, reduction of resource input, etc. So far, the majority of focus has been on government and energy initiatives, followed by transportation, building, and water, among other goals. A number of research and development efforts on smart cities and DT applications have been generated [92,93,94,95,96]. Similar to smart manufacturing, smart cities are the combination of a wide range of IoT domains to address the complex challenges in cities and must incorporate cloud/edge computing with big data collection and analysis as essential techniques to drive the realization of efficiency and optimization [11,97].

It should be noted that the development of the smart city has involved several opportunities and challenges: (i) people’s expectations are constantly changing. How can people’s expectations be accounted for when they are frequently and forever changing, even when real-time communication is a reality? How can the change in expectation be evaluated? (ii) How can the latest technologies be utilized to improve the efficiency of our government? How can we forecast outcomes and avoid potential losses or mistakes? How can we mitigate the risks? These questions have become the urgent issues that a government must consider with respect to smart cities. Similar to numerous practices in smart manufacturing, in smart cities, we could leverage DT to detect and address a problem before it arises, reducing cost, rather than attempting to solve a problem after it has occurred. If the results from the DT match the expectation, data can be transferred to real physical objects to implement the changes in a fast and seamless way.

Some efforts toward integrating DT with smart cities have already been made [98,99,100,101,102,103]. For example, Deng et al. [98] reviewed some key technologies in Digital Twin Cities (DTCs), such as surveying and mapping, building information modeling, IoT, next-generation wireless networking, and blockchain. Shahat et al. [99] reviewed some challenges of DT cities and discussed some research that applied DT to smart cities. Note that while the city DT cannot provide solutions for addressing problems in cities, DT can provide benefits in several areas. In order to demonstrate the performance of DT, it is necessary to conduct research on data collection, data management, the utilization of big data, virtual reality (VR), IoT, and 3D modeling. Likewise, Shirowzhan et al. [100] discussed technologies that can be useful for enabling smarter cities. They identified trends in geospatial science, particularly in the application of geographic information systems (GIS), and observed the impacts of newly developed online applications such as ArcGIS Urban.

More and more smart cities around the world are embracing DT to replicate their performance and responses to changes. By the year 2020, 118 cities were using DT in smart city projects and systems. Cities are also using DT to reduce carbon dioxide emissions and manage traffic efficiently. According to ABI Research, DT could assist cities in carrying out cost-effective urban resource planning [104]. We believe that DT has the potential to build further on the significant gains already achieved.

4. Challenges

Integrating DT into CPS can not only improve the efficient operation of CPS through increased intelligence but also assist CPS in providing more valuable services to end-users. Recall that as DT is a digital replica of physical things, it is critical to make DT capable of representing a variety of things in a timely, accurate, and efficient manner. The realization of such a DT raises numerous challenges to the networking, computing, control, and data analysis of IoT. Furthermore, the design of DT shall consider the numerous exceptional requirements of CPS (e.g., latency, reliability, safety, scalability, security, and privacy). To address such challenges, designing DT offers opportunities for novel and interdisciplinary research efforts.

4.1. CPS Challenges

As a complex system that requires the coordination of network, computing, and control, CPS has strict performance requirements in different application scenarios. Generally speaking, CPS applications require very low latency, high reliability, and large scalability. Taking the smart transportation system as an example, suppose we are designing applications related to autonomous driving. Vehicles need to be able to quickly collect pertinent information. This information includes data collected by sensors, transmitted through the Internet of Vehicles, and so on. Then, the vehicle needs to process this data quickly, extract usable information, and control the vehicle according to the determined situation at the time. Since vehicle driving is closely related to personal safety, there is no doubt that the reliability of autonomous driving is our primary consideration. The issues that we need to consider include network failures, computing hardware failures, and many more.

Moreover, because the number of vehicles in each scenario will be different, we need to ensure the scalability of the autonomous driving application that we design to handle the problems caused by the large number of vehicles, such as network congestion and high control algorithm complexity. From this example, we can conclude that the performance requirements of CPS are very strict. The overall performance of CPS is the result of multiple subsystem (network, computing, and control) interactions and is synthesized within the CPS. This interaction means that any slight change in the performance of any one subsystem can have a compounding effect on the performance of other subsystems, thereby causing a complex impact on the performance of the entire CPS system. Thus, when building a DT for CPS, it is challenging to not only meet all these strict performance requirements but also to reflect the interactions between the various subsystems in a timely manner.

4.2. Data Science Challenges

Building the DT of a complex system is itself very challenging, involving mathematical modeling and data analysis. There are two major approaches to building an accurate DT: model-driven and data-driven. When we adopt the model-driven approach, we use mathematical models to represent physical systems. Therefore, knowing how to build a mathematical model that accurately represents a complex physical system in CPS is a challenging problem. Since there are many variables involved in the physical system of CPS, the mathematical model will be significantly complicated, typically with properties such as nonlinearity, strong coupling, and time variance. When we adopt a data-driven approach, we collect massive amounts of data from the physical system that reflect its status over time. The process of collecting and selecting these data poses some challenges.

First of all, in CPS, one type of data may be collected from multiple sources, and the hardware conditions of these sources will be different, such that it is difficult to ensure that the quality of all data is the same. Second, the storage and transmission of this massive amount of data creates significant overhead to the system. In order to truly reflect the timely updates of the physical system, we often sample the state of the physical system at a very high frequency, and these data require considerable storage space and pose significant overhead for networking and computing infrastructures with strict performance requirements. Therefore, one fundamental data science problem is how to collect the least amount of data without significantly and negatively impacting the efficacy of DT. In conclusion, how to effectively construct a DT for the complex system with the least amount of data is a challenging problem that needs to consider both mathematical modeling and data science aspects.
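
One common way to reduce the amount of transmitted and stored data is send-on-delta (deadband) reporting, in which a sample is kept only when it differs from the last reported value by at least a threshold. The source does not prescribe this technique; it is named here only as an example of trading data volume against fidelity, and the function name, the threshold, and the voltage readings are assumptions.

```python
def send_on_delta(samples, delta):
    """Keep the first sample, then only samples that moved at least `delta` from the last kept one."""
    if not samples:
        return []
    reported = [(0, samples[0])]
    last_value = samples[0]
    for index, value in enumerate(samples[1:], start=1):
        if abs(value - last_value) >= delta:
            reported.append((index, value))
            last_value = value
    return reported


# Hypothetical voltage readings sampled at a high rate.
readings = [230.0, 230.1, 230.2, 231.5, 231.6, 229.0, 229.1]
kept = send_on_delta(readings, delta=1.0)
# kept == [(0, 230.0), (3, 231.5), (5, 229.0)]
```

Here three of seven samples are reported; a larger threshold saves more bandwidth and storage at the cost of a coarser twin.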

4.3. Optimization Challenges

Realizing a DT in CPS that accounts for the interactions of computing, control, communication, and data analysis modules in an end-to-end chain is challenging. The communication resource allocation algorithms, such as time and frequency resource scheduling, are very complex, especially when combined with massively distributed networking devices and low latency requirements. In addition, the computing resource allocation, such as task-offloading in an edge/cloud computing architecture, is another optimization problem with stringent performance constraints. Furthermore, the control mechanism, possibly event-triggered or time-triggered control, or both, yields still another optimization problem. In addition, analysis of the collected data is highly complex, requiring prodigious feature extraction, and training (or at least validation), coupled with potentially continual prediction or classification to satisfy real-time requirements.

The DT is a combination of real-time computing, real-time control, and real-time communication, which gives rise to the joint/integrated optimization problem of computing, control, and communication. Consider VR-supported smart manufacturing as an example. First, we need to deliver massive amounts of sensing data within the required latency. Then, we need to use the graphics engine to generate the digital model in real time. Finally, we need to execute the control command within the latency constraint. Since any problem in this chain will directly affect the performance of the other stages, it is very challenging to perform joint optimization on this VR system, even in a centralized system with superior hardware conditions. Further, the distributed nature of CPS only makes the problem of joint optimization much harder. For example, in our previous work [105], we adopted deep reinforcement learning to help configure the network and dynamically change the sensor sampling rate at the same time to improve the control performance of a distributed smart manufacturing system. However, by introducing DT to the CPS, there are more processes that need to be considered. For example, the control process comprises two processes: the domain expert controlling the DT and the DT controlling the physical objects. The computing of DT in CPS needs to consider three computing processes: generating a digital copy of a physical object, the interaction between DTs in a simulated scenario, and the actual CPS tasks. The communication of DT in CPS needs to consider the communication between the physical object and the virtual clone, the communication between the DT and other DTs, and the communication between the DT and the human control interface. These extra processes make the optimization problem more difficult to solve.
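
The coupling between stages in the VR-supported manufacturing example can be seen in a simple end-to-end latency budget: time spent in one stage is no longer available to the others. The sketch below only checks a budget and reports the slowest stage; the stage names, the latencies, and the 20 ms deadline are hypothetical values, and a real joint optimization would adjust these quantities rather than merely inspect them.

```python
def check_latency_budget(stage_latency_ms, deadline_ms):
    """Sum per-stage latencies and report slack and the slowest stage."""
    total_ms = sum(stage_latency_ms.values())
    bottleneck = max(stage_latency_ms, key=stage_latency_ms.get)
    return {
        "total_ms": total_ms,
        "slack_ms": deadline_ms - total_ms,
        "meets_deadline": total_ms <= deadline_ms,
        "bottleneck": bottleneck,
    }


# Hypothetical per-stage latencies for one control loop iteration.
report = check_latency_budget(
    {"sensing_delivery": 8.0, "model_rendering": 11.0, "control_execution": 4.0},
    deadline_ms=20.0,
)
# report["meets_deadline"] is False and report["bottleneck"] is "model_rendering"
```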

Taking into account the limited resources of the CPS system, we also need to optimize the large amount of data generated in these processes from the perspective of data generation, data storage, and data transmission. From a data generation perspective, when we generate the data needed to build a DT in CPS, the quality of the data is a major issue. Because we need to deploy a massive number of CPS devices, the hardware conditions of a single CPS device are often very limited. In addition, in the CPS environment, one type of data may be collected by multiple sources. Thus, how to effectively select data sources for different CPS applications is a challenging problem. From a data storage perspective, when we are building the DT, in order to truly reflect the timely update of physical objects, we often sample the state of physical objects at a relatively high frequency. Inevitably, an incredible volume of data will be collected on the DT server side. On one hand, these data can be used to assist in the development of other DT systems; on the other hand, these data take up significant storage space. Because there is not a simple linear relationship between the amount of stored data and the performance of the DT system, the question of how to accurately select useful data for storage has always been difficult to answer. From a data transmission perspective, due to the heterogeneity of CPS systems, IoT/CPS devices use different communication technologies. When we need to transmit data between these devices, we usually deploy a gateway with various radio interfaces for data exchange. This solution will bring additional hardware costs and potential network bottlenecks. Considering that the DT system has very strict requirements on timing and latency, it is very challenging to design networking infrastructure to support real-time communications so that the information between different components can be delivered quickly and efficiently.

4.4. Security and Privacy Challenges

We now discuss the security and privacy challenges. Since DT needs to update the state of physical objects in real time through network communications, DT will be subject to cyber attacks, such as integrity attacks on physical hardware and sensors, transmission, and digital systems. Taking the data integrity attack as an example, the adversary can directly attack IoT devices as data collectors, causing them to upload invalid/misleading data. Alternatively, an adversary can attack the gateway, to which the IoT device uploads data, mixing the invalid/misleading data with the correct data collected by IoT devices. Finally, the adversary could directly inject false data into the DT. Because the DT and physical objects are closely connected, any one of these attacks will yield a significant impact on the integrity of the entire system.

Additionally, from the perspective of confidentiality, DT applications contain a large amount of privacy-sensitive information, such as medical records, autonomous vehicle sensing information, and real-time smart grid operation information. Therefore, it is necessary to implement an authentication mechanism not only for physical objects but also for digital communication and machine-to-machine transmission. Nonetheless, it can be costly to implement authentication mechanisms on low-energy IoT devices. In this case, knowing how to handle the authentication process on devices with limited power has become a challenge. From an availability point of view, DT applications rely on sensor information from the object layer to build the digital clone of the physical system. In this case, adversaries can launch attacks on sensors or gateways, such as denial of service (DoS) attacks or malware propagation. Once a gateway or sensor is compromised, it is difficult for the DT to obtain the overall state of the physical system. For this reason, it is important to propose and validate mitigation mechanisms to reduce the impact of such attacks.
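
A lightweight way to authenticate machine-to-machine messages is a keyed message authentication code. The following sketch uses HMAC-SHA256 from the Python standard library so that a DT server can reject readings that were altered in transit. It is an illustration only: the shared key, the message layout, and the function names are assumptions, and the sketch covers neither key distribution nor replay protection.

```python
import hashlib
import hmac
import json


def sign_reading(key: bytes, reading: dict) -> dict:
    """Serialize a reading deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(reading, sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_reading(key: bytes, message: dict) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])


# Hypothetical pre-shared key and sensor reading.
shared_key = b"example-pre-shared-key"
message = sign_reading(shared_key, {"device": "meter-12", "voltage": 229.8})

# A gateway-level attacker who changes the payload cannot produce a matching tag.
tampered = {"payload": message["payload"].replace("229.8", "180.0"), "tag": message["tag"]}
accepted = verify_reading(shared_key, message)
rejected = not verify_reading(shared_key, tampered)
```

Symmetric MACs are comparatively cheap to compute, which matters for the low-energy devices mentioned above, although the cost of managing keys across many devices remains.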

5. Research Directions

In extending the DT to IoT scenarios, promising new fields arise that are worthy of further investigation. We will discuss the future research directions from the following aspects: performance, new DT-driven services, modeling and machine learning, and security and privacy.

5.1. Performance

As we discussed in Section 4, the DT in CPS has very strict performance requirements to achieve real-time computing, real-time control, and real-time communication. We believe that the joint optimization of sensing, control, networking, computing, and data analysis is critically important for the DT to meet these performance requirements. Existing works have focused on one of these optimization targets (sensing, control, networking, or data analysis) and have achieved some results. However, to maximize the overall performance of the system, we must study the interaction between all components and design a joint optimization strategy. Taking the intelligent transportation system as an example, suppose we are generating DTs for moving vehicles in a certain area to better manage traffic. We use sensor data from both vehicles and cities to build the DTs. When we select data, simply having more data is not optimal. Instead, the pre-processing of the data should account for the quality of the data, the bandwidth of the network, the computing power of the infrastructure, and the Quality of Service (QoS) requirements of the specific application. Similarly, when we are transmitting the data, we need to consider not only the current communication environment but also the mobility of the vehicle and the data storage space of the vehicle, among others.
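The point that more data is not automatically better can be illustrated with a small sketch of quality-aware data selection under a bandwidth budget. The stream names, quality scores, and sizes below are hypothetical, and the greedy rule (highest quality per kilobyte first) is only a simple heuristic standing in for a full joint optimization, which would also account for computing power and QoS requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """A candidate sensor stream (all fields are illustrative assumptions)."""
    name: str
    quality: float   # estimated usefulness of the data, higher is better
    size_kb: float   # bandwidth cost of uploading one update


def select_streams(streams, budget_kb):
    """Greedily pick streams by quality per kilobyte until the budget is spent.

    This is a heuristic, not an optimal knapsack solution.
    """
    chosen = []
    remaining = budget_kb
    ranked = sorted(streams, key=lambda s: s.quality / s.size_kb, reverse=True)
    for s in ranked:
        if s.size_kb <= remaining:
            chosen.append(s)
            remaining -= s.size_kb
    return chosen


candidates = [
    Stream("roadside_camera", quality=0.9, size_kb=800.0),
    Stream("vehicle_gps", quality=0.6, size_kb=2.0),
    Stream("vehicle_lidar", quality=0.8, size_kb=1500.0),
    Stream("loop_detector", quality=0.2, size_kb=1.0),
]
selected = select_streams(candidates, budget_kb=1000.0)
print([s.name for s in selected])
```

In this example the lidar stream is skipped because it alone exceeds the budget, even though its quality score is high.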

5.2. New DT-Driven Services

By combining DT and IoT search services, we can interconnect different CPS and provide more services. Specifically, the DT bridges CPS in disparate domains, such as transportation, energy, and manufacturing, to obtain a ubiquitous connection. The DT enables more data sharing to create abundant content that improves the IoT search results. For example, the DT bridges electric cars, which primarily consume energy, with the smart grid, which primarily generates and stores electricity. With DT, electric cars can obtain a seamless and efficient energy supply via the information from digitized charging piles, solar panels, and so on. IoT search services such as location-based search and content-based search can also help us find IoT devices and data to build DTs [48,106]. This is particularly important in the rapidly changing CPS environment.

For example, in the smart transportation system, suppose we are using smart cameras to build digital clones of vehicles. After a few seconds, a vehicle drives out of the range of the smart camera. We can use the location-based IoT search service to find other cameras around the vehicle to ensure that the DT can continue to work. As another example, if we want to know the moving trajectory of a vehicle, we can use a content-based IoT search service to find all the information that contains the vehicle license plate. Finally, some new services can be provided by DT on different CPS. For example, in the smart manufacturing system, operators can use algorithms to check the DT and quickly perform quality inspections on a large number of products. In the smart home, users can integrate smart cameras and VR devices to provide holographic remote social interaction. In the smart grid, DT can be used to monitor the status of infrastructures in real time. In smart health care, DT can be used to monitor and record the vital signs of patients so that doctors can make better treatment plans. Likewise, in the smart transportation system, DT can be used to dynamically plan routes and provide additional driving assistance functions such as blind spot monitoring and brake warning.
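The camera handover scenario can be sketched as a simple location-based lookup. This is only an illustration: the camera registry, identifiers, and coordinates are hypothetical, and a real IoT search service would use a spatial index rather than a linear scan.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cameras_in_range(cameras, vehicle_pos, radius_m):
    """Return IDs of cameras within radius_m of the vehicle, nearest first."""
    lat, lon = vehicle_pos
    hits = []
    for cam_id, (clat, clon) in cameras.items():
        d = haversine_m(lat, lon, clat, clon)
        if d <= radius_m:
            hits.append((d, cam_id))
    return [cam_id for _, cam_id in sorted(hits)]


# Hypothetical camera registry: ID -> (latitude, longitude).
cameras = {
    "cam_a": (39.3940, -76.6090),
    "cam_b": (39.3950, -76.6100),
    "cam_c": (39.4500, -76.6000),
}
nearby = cameras_in_range(cameras, vehicle_pos=(39.3942, -76.6092), radius_m=200.0)
print(nearby)
```

Re-running the query as the vehicle position changes yields the next set of cameras that can keep the vehicle's DT updated.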

5.3. Modeling and Machine Learning

Considering the challenges of modeling a complex system in Section 4.2, a modeling methodology that can be adapted to different CPS systems is worth studying. There are two major directions that should be investigated: model-based DT and model-free DT.

For model-based DT, the main problem is that a CPS is difficult to model mathematically, and there are significant differences between different CPS. Therefore, a model that is complete and works for one CPS will need to be rebuilt every time the CPS changes. To address this problem, one solution is modular DT. This focuses on building the DTs of the small objects of the system and combining them to form the DT of the entire system, instead of building the entire system from the beginning. This not only reduces the complexity of building a DT model but also allows the constructed DT models to be reused and combined into different DT systems.

In model-free DT, we establish the DT by frequently collecting information on physical objects. Because this kind of DT completely relies on observations and has no mathematical model to support it, it is difficult to predict the state of the system, which greatly affects the control and optimization of the DT system. Therefore, we consider using DL to generate a neural network-based model by learning from data collected from the CPS devices. In this case, even for very complex systems that are difficult to model, we can still use DL models to build the DT. Nonetheless, one problem is that ML/DL algorithms usually require a large amount of data, which is sometimes difficult to provide for CPS devices with limited resources. Therefore, it is worth investigating the best way to design DL models that work with small amounts of data. Moreover, training a DL model requires considerable computing power, and trained models need to be retrained once the environment changes, both of which are too demanding for CPS devices. To solve these problems, we believe that transfer learning is a promising solution [107]. Transfer learning is an ML technique that transfers the knowledge learned by a well-trained model to a new learning model. Specifically, with transfer learning, we can use a generic DL model trained with a large amount of data as the source model and replace several of the last (terminal) layers in the network to construct a new model. During the training phase, we only update the replaced layers. By using this transfer learning technique, we can use less training data to obtain better performance than training from scratch. Because there are fewer network parameters that need to be trained, this method also reduces the requirements for device computing power and storage space. Therefore, it is suitable for real-time computing, real-time control, and real-time communication in a large-scale distributed CPS.
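The idea of replacing the terminal layers and updating only those layers can be sketched without any DL framework. In the toy example below, a fixed layer stands in for the pretrained source model, and only a new linear output layer is trained on a small target dataset. All weights, data, and hyperparameters are illustrative assumptions; a practical system would use a real pretrained network and a DL library.

```python
import math
import random

# Stand-in for the early layers of a source model trained elsewhere on a large
# dataset. The weights are illustrative assumptions and are never updated.
FROZEN_W = [[0.8, -0.5], [0.3, 0.9], [-0.6, 0.4]]
FROZEN_B = [0.1, -0.2, 0.05]


def frozen_features(x):
    """Forward pass through the frozen layer (tanh activation)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(FROZEN_W, FROZEN_B)]


def train_head(samples, epochs, lr=0.1, seed=0):
    """Fit only the replaced output layer (a linear head) by SGD on squared error."""
    rng = random.Random(seed)
    head_w = [rng.uniform(-0.1, 0.1) for _ in FROZEN_W]
    head_b = 0.0
    # The base is frozen, so its features can be computed once and reused.
    feats = [(frozen_features(x), y) for x, y in samples]
    for _ in range(epochs):
        for f, y in feats:
            err = sum(w * fi for w, fi in zip(head_w, f)) + head_b - y
            head_w = [w - lr * err * fi for w, fi in zip(head_w, f)]
            head_b -= lr * err
    return head_w, head_b


def mse(samples, head_w, head_b):
    """Mean squared error of the frozen base plus the given head."""
    total = 0.0
    for x, y in samples:
        f = frozen_features(x)
        pred = sum(w * fi for w, fi in zip(head_w, f)) + head_b
        total += (pred - y) ** 2
    return total / len(samples)


# Small synthetic target-task dataset (hypothetical): labels are produced by a
# hidden linear head on the frozen features, so the task is learnable.
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-1.0, 0.5), (0.5, -1.0)]
hidden_w, hidden_b = [1.0, -2.0, 0.5], 0.3
data = [(x, sum(w * fi for w, fi in zip(hidden_w, frozen_features(x))) + hidden_b)
        for x in inputs]

before = mse(data, *train_head(data, epochs=0))
after = mse(data, *train_head(data, epochs=500))
print(f"MSE before: {before:.4f}, after: {after:.6f}")
```

Only four parameters (three head weights and one bias) are updated here, which mirrors why training the replaced layers alone lowers the compute and storage demands on the device.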

Finally, we should consider the integration of model-based DT and model-free DT into a hybrid-model DT. Even in the most complex CPS, there will be components that can be easily modeled mathematically. Therefore, we can mathematically model those components and use a model-free approach such as DL to model the remaining components. Once a DL-based model is trained, it can be treated as a modular model and directly integrated with other model-based DTs in a modular DT system. For example, in the smart transportation system, we can mathematically model the mobility of vehicles and use DL to model the Internet of Vehicles communication. By combining these two components, we can obtain a complete DT of moving vehicles in CPS.
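A hybrid-model DT of this kind can be sketched as two modules behind one interface. In the sketch below, the mobility module is a closed-form constant-velocity model, while the communication module is fitted from measurements. To keep the example short, an ordinary least-squares line is used as a plainly simpler stand-in for the DL model; the class names, the one-dimensional road, and the measurement values are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a simple stand-in for a DL model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


class MobilityModel:
    """Model-based module: constant-velocity motion along a 1-D road."""

    def __init__(self, position_m, speed_mps):
        self.position_m = position_m
        self.speed_mps = speed_mps

    def position_at(self, t_s):
        return self.position_m + self.speed_mps * t_s


class LinkModel:
    """Model-free module: latency learned from (distance, latency) measurements."""

    def __init__(self, distances_m, latencies_ms):
        self.a, self.b = fit_line(distances_m, latencies_ms)

    def latency_ms(self, distance_m):
        return self.a * distance_m + self.b


class HybridVehicleTwin:
    """Combines both modules to predict the vehicle state at a future time."""

    def __init__(self, mobility, link, rsu_position_m):
        self.mobility = mobility
        self.link = link
        self.rsu_position_m = rsu_position_m  # roadside unit location (assumed)

    def predict(self, t_s):
        pos = self.mobility.position_at(t_s)
        dist = abs(pos - self.rsu_position_m)
        return {"position_m": pos, "latency_ms": self.link.latency_ms(dist)}


twin = HybridVehicleTwin(
    MobilityModel(position_m=0.0, speed_mps=20.0),
    LinkModel([0.0, 100.0, 200.0, 300.0], [5.0, 7.0, 9.0, 11.0]),
    rsu_position_m=300.0,
)
state = twin.predict(t_s=10.0)
print(state)
```

Because each module exposes a small interface, the fitted link model could later be swapped for a trained neural network without changing the mobility module or the combined twin.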

5.4. Security and Privacy

DT applications rely on data provided by IoT sensors and gateways. In this case, data and communication security and privacy are an important research direction. In considering security, we discuss the research from several perspectives: confidentiality, integrity, and availability.

In terms of confidentiality, we take the smart transportation system as an example. The DT obtains data from IoT sensors installed near roads, on vehicles, and so on. At the same time, IoT sensors may belong to different organizations/operators. In this case, different organizations/operators need to ensure the confidentiality of data through authentication and secure communication protocols. In addition, some data stored in the application layer, such as vehicle location and traffic camera data, are sensitive. Thus, access control needs to be in place to protect user privacy. At the same time, information security regulations can protect user privacy and achieve data confidentiality by restricting data sharing, especially on the end-user layer. For example, in the field of healthcare, the Health Insurance Portability and Accountability Act (HIPAA) regulates the sharing and confidentiality of user medical data.

In terms of integrity, some DT applications, including smart grid, smart transportation, smart manufacturing, and smart cities, require real-time data to build a digital model of the physical system. In this case, data integrity is essential for the DT system to correctly model and guide the physical system in response to changes in the environment. Based on the DT architecture, we discuss the integrity of the four layers. For the object layer, the sensors need to communicate with the gateways. In this case, the communication from sensor to gateway and from gateway to sensor needs to be secured to prevent an adversary from hijacking the communication using a man-in-the-middle (MITM) attack or other attacks. In addition, authentication and other security mechanisms must be in place to prevent unauthorized users from modifying data. At the same time, at the application layer, anomaly detection can be used to detect manipulated data and ensure data integrity. Finally, an authentication system can be implemented at the end-user layer to prevent unauthorized users from accessing the DT.
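As one simple illustration of application-layer anomaly detection, the sketch below flags readings whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. The constant 0.6745 and the threshold 3.5 are a commonly used rule of thumb rather than values from the surveyed works, and the sensor readings are hypothetical. A detector of this kind catches crude outliers only; stealthy false data injection would need model-based or learned detectors.

```python
import statistics


def flag_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > threshold]


# Hypothetical grid-frequency-like readings with one injected value.
readings = [50.0, 50.1, 49.9, 50.2, 49.8, 75.0, 50.0]
suspicious = flag_anomalies(readings)
print(suspicious)
```

The median and MAD are used instead of the mean and standard deviation because a single large injected value would otherwise distort the baseline it is compared against.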

Finally, availability is key to a real-time DT. Certain types of cyber attacks, including distributed denial of service (DDoS) and malware propagation, can target sensors and gateways. In this case, mitigation schemes need to be implemented on the gateways to filter malicious traffic and reduce the impact of attacks. At the same time, an adversary can send time-consuming queries or massive numbers of queries through the end-user layer to overload the DT. To deal with such threats, mitigation schemes can be deployed at the application layer to avert the effects. Moreover, in a certain period of time, there may be multiple legitimate users accessing the same service at the same time. In this case, load balancing algorithms can be deployed on the gateways. If a gateway is overloaded, the algorithm can use nearby low-load gateways to process the request, leading to a highly resilient system.
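The gateway fallback described above can be sketched as a small routing function. The gateway names, capacities, and neighbor lists are hypothetical, and the policy (fall back to the neighbor with the lowest relative load) is one simple choice among many possible load balancing strategies.

```python
def route_request(loads, capacity, preferred, neighbors):
    """Pick a gateway for one request and record the added load.

    Uses the preferred gateway when it has spare capacity; otherwise falls
    back to the neighbor with the lowest relative load. Returns None when
    every nearby gateway is saturated, so the caller can queue or drop.
    """
    if loads[preferred] < capacity[preferred]:
        target = preferred
    else:
        candidates = [g for g in neighbors[preferred] if loads[g] < capacity[g]]
        if not candidates:
            return None
        target = min(candidates, key=lambda g: loads[g] / capacity[g])
    loads[target] += 1
    return target


# Hypothetical deployment: gw_a is full, so its request is redirected.
loads = {"gw_a": 10, "gw_b": 3, "gw_c": 7}
capacity = {"gw_a": 10, "gw_b": 10, "gw_c": 10}
neighbors = {"gw_a": ["gw_b", "gw_c"], "gw_b": ["gw_a"], "gw_c": ["gw_a"]}

chosen = route_request(loads, capacity, "gw_a", neighbors)
print(chosen, loads)
```

Returning None instead of forcing the request onto a saturated gateway keeps overload visible to the caller, which can then apply rate limiting or queuing.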

6. Final Remarks

With the rapid advance of big data and ML/DL techniques, the cyber replica of a real physical system has been considered a viable digital platform to emulate lifecycle use cases for CPS. A DT is the cyber replica of the real physical system and involves real-time computing, real-time control, and real-time communication. By pairing a DT with a physical system, we can imitate real-world cases without directly operating the real-world system and use the results to optimize system performance. A DT can also simulate emergency situations of the system, allowing system operators to practice emergency plans. Furthermore, DT could enable data sharing to support more data-oriented services on CPS. A DT needs to represent things in a timely, accurate, and efficient manner, which poses numerous challenges to the networking, control, computing, and data analysis of CPS. Furthermore, the design of a DT must consider numerous exceptional requirements of CPS (e.g., latency, reliability, safety, scalability, security, and privacy). In this paper, we have first reviewed the principles of DT (architecture, data representation, and protocols). We have then presented how to integrate DT in different CPS, including smart grid, smart transportation, smart manufacturing, and smart cities. We have further discussed some challenges from CPS, data science, optimization, and security perspectives. Finally, we have outlined future research directions from the perspectives of performance, new DT-driven services, modeling and machine learning, and security and privacy.

Author Contributions

C.Q.: DT architectures, protocols, DT for smart grid, smart transportation, and research directions. X.L.: challenges and research directions. C.R.: DT concept, architecture, and DT for smart manufacturing. M.Q.: DT concept, DT for smart cities. F.L.: DT concept, challenges, and research directions. W.Y.: problem definition, paper organization, integrating DT in CPS framework, challenges, and research direction. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable; the study does not report any data.

Conflicts of Interest

The authors declare no conflict of interest.

References

Xu, L.D.; He, W.; Li, S. Internet of Things in Industries: A Survey. IEEE Trans. Ind. Inform. 2014, 10, 2233–2243. [Google Scholar] [CrossRef]
Stankovic, J. Research Directions for the Internet of Things. IEEE Internet Things J. 2014, 1, 3–9. [Google Scholar] [CrossRef]
Liu, X.; Qian, C.; Hatcher, W.G.; Xu, H.; Liao, W.; Yu, W. Secure Internet of Things (IoT)-Based Smart-World Critical Infrastructures: Survey, Case Study and Research Opportunities. IEEE Access 2019, 7, 79523–79544. [Google Scholar] [CrossRef]
Komninos, N.; Philippou, E.; Pitsillides, A. Survey in Smart Grid and Smart Home Security: Issues, Challenges and Countermeasures. IEEE Commun. Surv. Tutor. 2014, 16, 1933–1954. [Google Scholar] [CrossRef]
Xu, G.; Yu, W.; Griffith, D.; Golmie, N.; Moulema, P. Toward Integrating Distributed Energy Resources and Storage Devices in Smart Grid. IEEE Internet Things J. 2017, 4, 192–204. [Google Scholar] [CrossRef]
Liu, Y.; Weng, X.; Wan, J.; Yue, X.; Song, H.; Vasilakos, A.V. Exploring Data Validity in Transportation Systems for Smart Cities. IEEE Commun. Mag. 2017, 55, 26–33. [Google Scholar] [CrossRef]
Xu, H.; Yu, W.; Griffith, D.; Golmie, N. A Survey on Industrial Internet of Things: A Cyber-Physical Systems Perspective. IEEE Access 2018, 6, 78238–78259. [Google Scholar] [CrossRef]
Mahmud, M.S.; Wang, H.; Esfar-E-Alam, A.M.; Fang, H. A Wireless Health Monitoring System Using Mobile Phone Accessories. IEEE Internet Things J. 2017, 4, 2009–2018. [Google Scholar] [CrossRef]
Guo, H.; Zhang, N.; Wu, S.; Yang, Q. Deep Learning Driven Wireless Real-time Human Activity Recognition. In Proceedings of the ICC 2020—2020 IEEE International Conference on Communications (ICC), Online, 7–11 June 2020; pp. 1–6. [Google Scholar] [CrossRef]
Sun, Y.; Song, H.; Jara, A.J.; Bie, R. Internet of Things and Big Data Analytics for Smart and Connected Communities. IEEE Access 2016, 4, 766–773. [Google Scholar] [CrossRef]
Du, R.; Santi, P.; Xiao, M.; Vasilakos, A.V.; Fischione, C. The Sensable City: A Survey on the Deployment and Management for Smart City Monitoring. IEEE Commun. Surv. Tutor. 2019, 21, 1533–1560. [Google Scholar] [CrossRef]
Bartolini, A.; Corti, F.; Reatti, A.; Ciani, L.; Grasso, F.; Kazimierczuk, M.K. Analysis and Design of Stand-Alone Photovoltaic System for precision agriculture network of sensors. In Proceedings of the 2020 IEEE International Conference on Environment and Electrical Engineering and 2020 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I CPS Europe), Madrid, Spain, 9–12 June 2020; pp. 1–5. [Google Scholar] [CrossRef]
Al Rasyid, M.U.H.; Nadhori, I.U.; Sudarsono, A.; Luberski, R. Analysis of slotted and unslotted CSMA/CA Wireless Sensor Network for E-healthcare system. In Proceedings of the 2014 International Conference on Computer, Control, Informatics and Its Applications (IC3INA), Bandung, Indonesia, 21–23 October 2014; pp. 53–57. [Google Scholar] [CrossRef]
Pievanelli, E.; Plesca, A.; Stefanelli, R.; Trinchero, D. Dynamic wireless sensor networks for real time safeguard of workers exposed to physical agents in constructions sites. In Proceedings of the 2013 IEEE Topical Conference on Wireless Sensors and Sensor Networks (WiSNet), Austin, TX, USA, 20–23 January 2013; pp. 55–57. [Google Scholar] [CrossRef]
Hatcher, W.G.; Yu, W. A Survey of Deep Learning: Platforms, Applications and Emerging Research Trends. IEEE Access 2018, 6, 24411–24432. [Google Scholar] [CrossRef]
Liang, F.; Hatcher, W.G.; Liao, W.; Gao, W.; Yu, W. Machine Learning for Security and the Internet of Things: The Good, the Bad, and the Ugly. IEEE Access 2019, 7, 158126–158147. [Google Scholar] [CrossRef]
Wu, D.; Shi, H.; Wang, H.; Wang, R.; Fang, H. A Feature-Based Learning System for Internet of Things Applications. IEEE Internet Things J. 2019, 6, 1928–1937. [Google Scholar] [CrossRef]
Mohammadi, M.; Al-Fuqaha, A.; Sorour, S.; Guizani, M. Deep Learning for IoT Big Data and Streaming Analytics: A Survey. IEEE Commun. Surv. Tutor. 2018, 20, 2923–2960. [Google Scholar] [CrossRef] [Green Version]
Liang, Y.; Cai, Z.; Yu, J.; Han, Q.; Li, Y. Deep Learning Based Inference of Private Information Using Embedded Sensors in Smart Devices. IEEE Netw. 2018, 32, 8–14. [Google Scholar] [CrossRef]
Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J. 2016, 3, 637–646. [Google Scholar] [CrossRef]
Zhu, S.; Xu, J.; Guo, H.; Liu, Q.; Wu, S.; Wang, H. Indoor Human Activity Recognition Based on Ambient Radar with Signal Processing and Machine Learning. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018; pp. 1–6. [Google Scholar] [CrossRef]
Cai, Z.; Zheng, X.; Wang, J. Efficient Data Trading for Stable and Privacy Preserving Histograms in Internet of Things. In Proceedings of the 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC), Austin, TX, USA, 11–13 November 2021; pp. 1–10. [Google Scholar] [CrossRef]
Chen, S.Y.; Song, S.F.; Li, L.X.; Shen, J. Survey on smart grid technology. Power Syst. Technol. 2009, 33, 1–7. [Google Scholar] [CrossRef]
Guan, Z.; Sun, N.; Xu, Y.; Yang, T. A Comprehensive Survey of False Data Injection in Smart Grid. Int. J. Wire. Mob. Comput. 2015, 8, 27–33. [Google Scholar] [CrossRef]
Liu, Y.; Ning, P.; Reiter, M. False data injection attacks against state estimation in electric power grids. ACM Trans. Inf. Syst. Secur. 2011, 14, 13. [Google Scholar] [CrossRef]
Xu, H.; Yu, W.; Liu, X.; Griffith, D.; Golmie, N. On Data Integrity Attacks against Industrial Internet of Things. In Proceedings of the 2020 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Calgary, AB, Canada, 17–22 August 2020; pp. 21–28. [Google Scholar] [CrossRef]
Ponnusamy, V.K.; Kasinathan, P.; Madurai Elavarasan, R.; Ramanathan, V.; Anandan, R.K.; Subramaniam, U.; Ghosh, A.; Hossain, E. A Comprehensive Review on Sustainable Aspects of Big Data Analytics for the Smart Grid. Sustainability 2021, 13, 3322. [Google Scholar] [CrossRef]
Walter, A.; Finger, R.; Huber, R.; Buchmann, N. Opinion: Smart farming is key to developing sustainable agriculture. Proc. Natl. Acad. Sci. USA 2017, 114, 6148–6150. [Google Scholar] [CrossRef] [Green Version]
Jayaraman, P.P.; Yavari, A.; Georgakopoulos, D.; Morshed, A.; Zaslavsky, A. Internet of things platform for smart farming: Experiences and lessons learnt. Sensors 2016, 16, 1884. [Google Scholar] [CrossRef] [PubMed]
Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in manufacturing: A categorical literature review and classification. IFAC-PapersOnLine 2018, 51, 1016–1022. [Google Scholar] [CrossRef]
Aceto, G.; Persico, V.; Pescapé, A. A Survey on Information and Communication Technologies for Industry 4.0: State-of-the-Art, Taxonomies, Perspectives, and Challenges. IEEE Commun. Surv. Tutor. 2019, 21, 3467–3501. [Google Scholar] [CrossRef]
Boschert, S.; Rosen, R. Digital twin—the simulation aspect. In Mechatronic Futures; Springer: Berlin/Heidelberg, Germany, 2016; pp. 59–74. [Google Scholar] [CrossRef]
Tao, F.; Sui, F.; Liu, A.; Qi, Q.; Zhang, M.; Song, B.; Guo, Z.; Lu, S.C.Y.; Nee, A.Y. Digital twin-driven product design framework. Int. J. Prod. Res. 2019, 57, 3935–3953. [Google Scholar] [CrossRef] [Green Version]
Tao, F.; Zhang, H.; Liu, A.; Nee, A.Y.C. Digital twin in industry: State-of-the-art. IEEE Trans. Ind. Inform. 2018, 15, 2405–2415. [Google Scholar] [CrossRef]
Leng, J.; Zhang, H.; Yan, D.; Liu, Q.; Chen, X.; Zhang, D. Digital twin-driven manufacturing cyber-physical system for parallel controlling of smart workshop. J. Ambient Intell. Humaniz. Comput. 2019, 10, 1155–1166. [Google Scholar] [CrossRef]
Qi, Q.; Tao, F. Digital twin and big data towards smart manufacturing and industry 4.0: 360 degree comparison. IEEE Access 2018, 6, 3585–3593. [Google Scholar] [CrossRef]
Brosinsky, C.; Westermann, D.; Krebs, R. Recent and prospective developments in power system control centers: Adapting the digital twin technology for application in power system control centers. In Proceedings of the 2018 IEEE International Energy Conference (ENERGYCON), Limassol, Cyprus, 3–7 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar] [CrossRef]
Tzanis, N.; Andriopoulos, N.; Magklaras, A.; Mylonas, E.; Birbas, M.; Birbas, A. A hybrid cyber physical digital twin approach for smart grid fault prediction. In Proceedings of the 2020 IEEE Conference on Industrial Cyberphysical Systems (ICPS), Tampere, Finland, 9–12 June 2020; IEEE: Piscataway, NJ, USA, 2020; Volume 1, pp. 393–397. [Google Scholar] [CrossRef]
Saad, A.; Faddel, S.; Youssef, T.; Mohammed, O.A. On the implementation of IoT-based digital twin for networked microgrids resiliency against cyber attacks. IEEE Trans. Smart Grid 2020, 11, 5138–5150. [Google Scholar] [CrossRef]
Danilczyk, W.; Sun, Y.; He, H. Angel: An intelligent digital twin framework for microgrid security. In Proceedings of the 2019 North American Power Symposium (NAPS), Wichita, KS, USA, 13–15 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6. [Google Scholar] [CrossRef]
Bird, P. An updated digital model of plate boundaries. Geochem. Geophys. Geosystems 2003, 4, 1–46. [Google Scholar] [CrossRef]
Remeikiene, R.; Gaspareniene, L.; Schneider, F.G. The definition of digital shadow economy. Technol. Econ. Dev. Econ. 2018, 24, 696–717. [Google Scholar] [CrossRef] [Green Version]
Rasheed, A.; San, O.; Kvamsdal, T. Digital twin: Values, challenges and enablers from a modeling perspective. IEEE Access 2020, 8, 21980–22012. [Google Scholar] [CrossRef]
Haag, S.; Anderl, R. Digital twin–Proof of concept. Manuf. Lett. 2018, 15, 64–66. [Google Scholar] [CrossRef]
Jones, D.; Snider, C.; Nassehi, A.; Yon, J.; Hicks, B. Characterising the Digital Twin: A systematic literature review. CIRP J. Manuf. Sci. Technol. 2020, 29, 36–52. [Google Scholar] [CrossRef]
Liu, Z. Reading behavior in the digital environment: Changes in reading behavior over the past ten years. J. Doc. 2005, 61, 700–712. [Google Scholar] [CrossRef] [Green Version]
Čolaković, A.; Hadžialić, M. Internet of Things (IoT): A review of enabling technologies, challenges, and open research issues. Comput. Netw. 2018, 144, 17–39. [Google Scholar] [CrossRef]
Hatcher, W.G.; Qian, C.; Gao, W.; Liang, F.; Hua, K.; Yu, W. Towards Efficient and Intelligent Internet of Things Search Engine. IEEE Access 2021, 9, 15778–15795. [Google Scholar] [CrossRef]
Jaloudi, S. Communication Protocols of an Industrial Internet of Things Environment: A Comparative Study. Future Internet 2019, 11, 66. [Google Scholar] [CrossRef] [Green Version]
Al-Sarawi, S.; Anbar, M.; Alieyan, K.; Alzubaidi, M. Internet of Things (IoT) communication protocols: Review. In Proceedings of the 2017 8th International Conference on Information Technology (ICIT), Amman, Jordan, 17–18 May 2017; pp. 685–690. [Google Scholar] [CrossRef]
Stusek, M.; Zeman, K.; Masek, P.; Sedova, J.; Hosek, J. IoT Protocols for Low-power Massive IoT: A Communication Perspective. In Proceedings of the 2019 11th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), Dublin, Ireland, 28–30 October 2019; pp. 1–7. [Google Scholar] [CrossRef]
Azure. Digital Twin Definition Language. Available online: https://github.com/Azure/opendigitaltwins-dtdl (accessed on 26 January 2022).
Conde, J.; Munoz-Arcentales, A.; Alonso, A.; Lopez-Pernas, S.; Salvachua, J. Modeling Digital Twin Data and Architecture: A Building Guide with FIWARE as Enabling Technology. IEEE Internet Comput. 2021, 1. [Google Scholar] [CrossRef]
OPC Foundation. Unified Architecture. Available online: https://opcfoundation.org/about/opc-technologies/opc-ua/ (accessed on 26 January 2022).
Ala-Laurinaho, R.; Autiosalo, J.; Nikander, A.; Mattila, J.; Tammi, K. Data Link for the Creation of Digital Twins. IEEE Access 2020, 8, 228675–228684. [Google Scholar] [CrossRef]
Autiosalo, J.; Vepsäläinen, J.; Viitala, R.; Tammi, K. A Feature-Based Framework for Structuring Industrial Digital Twins. IEEE Access 2020, 8, 1193–1208. [Google Scholar] [CrossRef]
Kome, M.L.; Cuppens, F.; Cuppens-Boulahia, N.; Frey, V. CoAP Enhancement for a Better IoT Centric Protocol: CoAP 2.0. In Proceedings of the 2018 Fifth International Conference on Internet of Things: Systems, Management and Security, Valencia, Spain, 15–18 October 2018; pp. 139–146. [Google Scholar] [CrossRef]
Silva, D.; Carvalho, L.I.; Soares, J.; Sofia, R.C. A Performance Analysis of Internet of Things Networking Protocols: Evaluating MQTT, CoAP, OPC UA. Appl. Sci. 2021, 11, 4879. [Google Scholar] [CrossRef]
Yang, K.; Zhang, B.; Zhang, J.; Zhu, J. Design of Remote Control Inverter Based on MQTT Communication Protocol. In Proceedings of the 2021 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, Japan, 8–11 August 2021; pp. 1374–1378. [Google Scholar] [CrossRef]
Cagnano, A.; De Tuglie, E.; Mancarella, P. Microgrids: Overview and guidelines for practical implementations and operation. Appl. Energy 2020, 258, 114039. [Google Scholar] [CrossRef]
González, I.; Calderón, A.J.; Portalo, J.M. Innovative multi-layered architecture for heterogeneous automation and monitoring systems: Application case of a photovoltaic smart microgrid. Sustainability 2021, 13, 2234. [Google Scholar] [CrossRef]
Liu, Q.; Li, Y. Modbus/TCP based Network Control System for Water Process in the Firepower Plant. In Proceedings of the 2006 6th World Congress on Intelligent Control and Automation, Dalian, China, 21–23 June 2006; Volume 1, pp. 432–435. [Google Scholar] [CrossRef]
Sharma, A.; Airan, S.; Shah, D. Designing C Library for MODBUS-RTU to CANBUS and MODBUS-TCP IOT Converters. In Proceedings of the 2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 4–6 August 2021; pp. 731–737. [Google Scholar] [CrossRef]
Galketiya, T.; Kahahena, J.; Chandran, J.; Kavalchuk, I. Novel Communication System for SCADA Tied Smart Inverter for Vietnam. In Proceedings of the 2019 25th Asia-Pacific Conference on Communications (APCC), Ho Chi Minh City, Vietnam, 6–8 November 2019; pp. 331–335. [Google Scholar] [CrossRef]
Tan, J.; Sha, X.; Dai, B.; Lu, T. Wireless Technology and Protocol for IIoT and Digital Twins. In Proceedings of the 2020 ITU Kaleidoscope: Industry-Driven Digital Transformation (ITU K), Online, 7–11 December 2020; pp. 1–8. [Google Scholar] [CrossRef]
Zhou, M.; Yan, J.; Feng, D. Digital twin framework and its application to power grid online analysis. CSEE J. Power Energy Syst. 2019, 5, 391–398. [Google Scholar] [CrossRef]
Dileep, G. A survey on grid technologies and applications. Renew. Energy 2020, 146, 2589–2625. [Google Scholar] [CrossRef]
Lund, A.M.; Mochel, K.; Lin, J.W.; Onetto, R.; Srinivasan, J.; Gregg, P.; Chotai, S. Digital Wind Farm System. U.S. Patent US20160333855A1, 17 November 2016. [Google Scholar]
Lund, A.M.; Mochel, K.; Lin, J.W.; Onetto, R.; Srinivasan, J.; Gregg, P.; Chotai, S. Digital Twin Interface for Operating Wind Farms. U.S. Patent US9995278B2, 12 June 2018. [Google Scholar]
Danilczyk, W.; Sun, Y.L.; He, H. Smart Grid Anomaly Detection using a Deep Learning Digital Twin. In Proceedings of the 2020 52nd North American Power Symposium (NAPS), Tempe, AZ, USA, 11–13 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
Baboli, P.T.; Babazadeh, D.; Kumara Bowatte, D.R. Measurement-based Modeling of Smart Grid Dynamics: A Digital Twin Approach. In Proceedings of the 2020 10th Smart Grid Conference (SGC), Kashan, Iran, 16–17 December 2020; pp. 1–6. [Google Scholar] [CrossRef]
Chen, C.; Liu, L.; Qiu, T.; Jiang, J.; Pei, Q.; Song, H. Routing With Traffic Awareness and Link Preference in Internet of Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 200–214. [Google Scholar] [CrossRef]
Jiang, D.; Huo, L.; Lv, Z.; Song, H.; Qin, W. A Joint Multi-Criteria Utility-Based Network Selection Approach for Vehicle-to-Infrastructure Networking. IEEE Trans. Intell. Transp. Syst. 2018, 19, 3305–3319. [Google Scholar] [CrossRef]
Rudskoy, A.; Ilin, I.; Prokhorov, A. Digital Twins in the Intelligent Transport Systems. Transp. Res. Procedia 2021, 54, 927–935. [Google Scholar] [CrossRef]
Dasgupta, S.; Rahman, M.; Lidbe, A.D.; Lu, W.; Jones, S. A Transportation Digital-Twin Approach for Adaptive Traffic Control Systems. arXiv 2021, arXiv:2109.10863. [Google Scholar]
Wang, X.; Song, H.; Zha, W.; Li, J.; Dong, H. Digital twin based validation platform for smart metro scenarios. In Proceedings of the 2021 IEEE 1st International Conference on Digital Twins and Parallel Intelligence (DTPI), Beijing, China, 15 July–15 August 2021; pp. 386–389. [Google Scholar] [CrossRef]
Sahal, R.; Alsamhi, S.H.; Brown, K.N.; O’Shea, D.; McCarthy, C.; Guizani, M. Blockchain-Empowered Digital Twins Collaboration: Smart Transportation Use Case. Machines 2021, 9, 193. [Google Scholar] [CrossRef]
Guo, Y.; Zou, K.; Chen, S.; Yuan, F.; Yu, F. 3D Digital Twin of Intelligent Transportation System based on Road-Side Sensing. In Proceedings of the Journal of Physics: Conference Series, London, UK, 5 March 2021; IOP Publishing: Bristol, UK, 2021; Volume 2083, p. 032022. [Google Scholar]
Wallace, F.R.E. Panel on Enabling Smart Manufacturing; APMS: State College, PA, USA, 2013. [Google Scholar]
Frank, A.G.; Dalenogare, L.S.; Ayala, N.F. Industry 4.0 technologies: Implementation patterns in manufacturing companies. Int. J. Prod. Econ. 2019, 210, 15–26. [Google Scholar] [CrossRef]
Kunath, M.; Winkler, H. Integrating the Digital Twin of the manufacturing system into a decision support system for improving the order management process. Procedia Cirp 2018, 72, 225–231. [Google Scholar] [CrossRef]
Redelinghuys, A.; Kruger, K.; Basson, A. A Six-Layer Architecture for Digital Twins with Aggregation; Springer: Berlin/Heidelberg, Germany, 2020; pp. 171–182. [Google Scholar] [CrossRef]
Huo, Z.; Mukherjee, M.; Shu, L.; Chen, Y.; Zhou, Z. Cloud-based Data-intensive Framework towards fault diagnosis in large-scale petrochemical plants. In Proceedings of the 2016 International Wireless Communications and Mobile Computing Conference (IWCMC), Paphos, Cyprus, 5–9 September 2016; pp. 1080–1085. [Google Scholar] [CrossRef] [Green Version]
Pfohl, H.C.; Yahsi, B.; Kurnaz, T. Concept and Diffusion-Factors of Industry 4.0 in the Supply Chain; Springer: Berlin/Heidelberg, Germany, 2017; pp. 381–390.
Hu, S.J. Evolving Paradigms of Manufacturing: From Mass Production to Mass Customization and Personalization. Procedia CIRP 2013, 7, 3–8.
Lu, Y.; Xu, X.; Wang, L. Smart manufacturing process and system automation—A critical review of the standards and envisioned scenarios. J. Manuf. Syst. 2020, 56, 312–325.
Brenner, B.; Hummel, V. A Seamless Convergence of the Digital and Physical Factory Aiming in Personalized Product Emergence Process (PPEP) for Smart Products within ESB Logistics Learning Factory at Reutlingen University. Procedia CIRP 2016, 54, 227–232.
Salah, B. Real-Time Implementation of a Fully Automated Industrial System Based on IR 4.0 Concept. Actuators 2021, 10, 318.
Tao, F.; Zhang, M. Digital Twin Shop-Floor: A New Shop-Floor Paradigm Towards Smart Manufacturing. IEEE Access 2017, 5, 20418–20427.
Židek, K.; Piteľ, J.; Adámek, M.; Lazorík, P.; Hošovský, A. Digital Twin of Experimental Smart Manufacturing Assembly System for Industry 4.0 Concept. Sustainability 2020, 12, 3658.
Aghenta, L.O.; Iqbal, M.T. Low-cost, open source IoT-based SCADA system design using Thinger.IO and ESP32 Thing. Electronics 2019, 8, 822.
Kaur, M.J.; Mishra, V.P.; Maheshwari, P. The convergence of digital twin, IoT, and machine learning: Transforming data into action. In Digital Twin Technologies and Smart Cities; Springer: Berlin/Heidelberg, Germany, 2020; pp. 3–17.
Mishra, K.N.; Chakraborty, C. A novel approach toward enhancing the quality of life in smart cities using clouds and IoT-based technologies. In Digital Twin Technologies and Smart Cities; Springer: Berlin/Heidelberg, Germany, 2020; pp. 19–35.
Seuwou, P.; Banissi, E.; Ubakanma, G. The future of mobility with connected and autonomous vehicles in smart cities. In Digital Twin Technologies and Smart Cities; Springer: Berlin/Heidelberg, Germany, 2020; pp. 37–52.
Jraisat, L. Information sharing in sustainable value chain network (SVCN)—The perspective of transportation in cities. In Digital Twin Technologies and Smart Cities; Springer: Berlin/Heidelberg, Germany, 2020; pp. 67–77.
Anthopoulos, L.G.; Janssen, M.; Weerakkody, V. Comparing Smart Cities with different modeling approaches. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 525–528.
Gharaibeh, A.; Salahuddin, M.A.; Hussini, S.J.; Khreishah, A.; Khalil, I.; Guizani, M.; Al-Fuqaha, A. Smart Cities: A Survey on Data Management, Security, and Enabling Technologies. IEEE Commun. Surv. Tutor. 2017, 19, 2456–2501.
Deng, T.; Zhang, K.; Shen, Z.J.M. A Systematic Review of a Digital Twin City: A New Pattern of Urban Governance toward Smart Cities. J. Manag. Sci. Eng. 2021, 6, 125–134.
Shahat, E.; Hyun, C.T.; Yeom, C. City digital twin potentials: A review and research agenda. Sustainability 2021, 13, 3386.
Shirowzhan, S.; Tan, W.; Sepasgozar, S.M. Digital twin and CyberGIS for improving connectivity and measuring the impact of infrastructure construction planning in smart cities. ISPRS Int. J. Geo-Inf. 2020, 9, 240.
Castro, D. Planning in Virtual Reality. Available online: https://www.govtech.com (accessed on 26 January 2022).
Gassmann, O.; Böhm, J.; Palmié, M. Smart Cities: Introducing Digital Innovation to Cities; Emerald Group Publishing: Bentley, UK, 2019.
Schrotter, G.; Hürzeler, C. The digital twin of the city of Zurich for urban planning. PFG-Photogramm. Remote Sens. Geoinf. Sci. 2020, 88, 99–112.
ABI Research. The Use of Digital Twins for Urban Planning to Yield US$280 Billion in Cost Savings by 2030. Available online: https://www.abiresearch.com/press/use-digital-twins-urban-planning-yield-us280-billion-cost-savings-2030/ (accessed on 26 January 2022).
Xu, H.; Liu, X.; Yu, W.; Griffith, D.; Golmie, N. Reinforcement Learning-Based Control and Networking Co-Design for Industrial Internet of Things. IEEE J. Sel. Areas Commun. 2020, 38, 885–898.
Liang, F.; Qian, C.; Hatcher, W.G.; Yu, W. Search Engine for the Internet of Things: Lessons From Web Search, Vision, and Opportunities. IEEE Access 2019, 7, 104673–104691.
Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.

Figure 1. DT Architecture.

Figure 2. DT Variants.

Figure 3. DT types.

Figure 4. DT environment.

Figure 5. DT architecture for IoT.

Figure 6. Number of publications related to DT and CPS in IEEE Xplore.

Figure 7. DT-based CPS.

Table 1. Protocols of DTs.

Protocol Name	Protocol Type	Protocol Characteristics
DTDL	Data Representation	As an open-standard platform, it defines six characteristics of IoT components and enables seamless data transmission between different DTs.
FIWARE	Data Representation	It supports DT data transmission and the processing of contextual information received from various IoT components.
OPC UA	Data Representation	As a modeling framework, it can retrieve information from raw data, support data manipulation, and provide monitoring capabilities.
FDTF	Data Representation	As a DT structure, it enables the DT system to share information based on the data link between DT components.
CoAP	Communication	As a specialized web transfer protocol based on the User Datagram Protocol (UDP), it is tailored for resource-constrained devices, follows a REST model that maps readily onto the Hypertext Transfer Protocol (HTTP), and provides an observe mechanism, similar to publish and subscribe, that simplifies obtaining continuous data from a sensor.
MQTT	Communication	As a communication protocol based on the Transmission Control Protocol (TCP), it offers a lightweight way for IoT devices to communicate, provides reliable data transfer, and can maintain a long-lived outgoing TCP connection to enable transmission.
Modbus TCP/IP	Communication	As a communication protocol based on the Transmission Control Protocol (TCP), it connects industrial devices, provides reliable data transfer, and relies on the checksum protection built into the underlying TCP/IP and link layers.
URLLC	Communication	As a 5G communication service category, it aims to achieve low latency and high reliability in transmissions between IoT devices.
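
To make the Modbus TCP/IP row of Table 1 more concrete, the following is a minimal Python sketch of how a single "read holding registers" request could be framed before it is written to a TCP socket. The field layout (a 7-byte MBAP header followed by the function code and its arguments) follows the public Modbus messaging specification rather than anything stated in this article, and the names `build_read_request` and `parse_mbap` are hypothetical helpers introduced only for illustration.

```python
import struct

# MBAP header layout (big-endian): transaction id, protocol id, length, unit id.
MBAP_FORMAT = ">HHHB"
MBAP_SIZE = struct.calcsize(MBAP_FORMAT)  # 7 bytes
READ_HOLDING_REGISTERS = 0x03  # standard Modbus function code


def build_read_request(transaction_id, unit_id, start_address, quantity):
    """Build a Modbus TCP request frame (hypothetical helper, illustration only)."""
    # PDU: function code, first register address, number of registers.
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # The length field counts the unit id byte plus the PDU bytes.
    length = 1 + len(pdu)
    # Protocol id is always 0 for Modbus.
    header = struct.pack(MBAP_FORMAT, transaction_id, 0, length, unit_id)
    return header + pdu


def parse_mbap(frame):
    """Decode the MBAP header of a frame into a dictionary."""
    transaction_id, protocol_id, length, unit_id = struct.unpack(
        MBAP_FORMAT, frame[:MBAP_SIZE]
    )
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,
        "unit_id": unit_id,
    }


if __name__ == "__main__":
    frame = build_read_request(
        transaction_id=1, unit_id=1, start_address=0, quantity=2
    )
    print(frame.hex())  # 000100000006010300000002
    print(parse_mbap(frame))
```

In a DT deployment, a frame like this would typically be sent by the twin's data-collection layer to poll a physical controller; a production system would more likely use an established Modbus library than hand-packed bytes.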

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

