Safety of Control Systems with Dual Architecture Based on PLCs

Medvedík, Milan; Ždánsky, Juraj; Rástočný, Karol; Hrbček, Jozef; Gregor, Michal

doi:10.3390/app12199799


by Milan Medvedík, Juraj Ždánsky, Karol Rástočný, Jozef Hrbček * and Michal Gregor

Department of Control and Information Systems, Faculty of Electrical Engineering and Information Technology, University of Žilina, 010 26 Žilina, Slovakia

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(19), 9799; https://doi.org/10.3390/app12199799

Submission received: 8 September 2022 / Revised: 23 September 2022 / Accepted: 26 September 2022 / Published: 29 September 2022

(This article belongs to the Section Electrical, Electronics and Communications Engineering)


Abstract:

The implementation of safety functions, in contrast to ordinary control functions, requires a different approach to the design of the hardware and software of the control system. The reason is that each safety function must meet the required Safety Integrity Level (SIL). This has two aspects: safety integrity against random failures and safety integrity against systematic failures. Hardware is primarily related to random failure safety integrity, and software primarily to systematic failure safety integrity. The focus of this contribution lies in the design of a method that allows the software of a safety function for industrial applications to be designed using a model of the functional behavior of this function. Since commercially available programmable logic controllers (PLCs) with a defined SIL (so-called safety PLCs) do not support such solutions, a dual architecture based on standard PLCs, including their software control, is proposed in the paper. Such an approach makes it possible to significantly limit the occurrence of systematic errors in the creation of application software, as well as to test the created application software and reveal deficiencies that arose in previous phases of the life cycle (e.g., when specifying safety functions). To implement the method proposed in this paper, a dual architecture created from standard PLCs of the Simatic S7-1500 series is used. With the help of this architecture, the safety function “Muting” is implemented.

Keywords:

safety integrity; dual architecture; safety function; programmable logic controller; state diagram

1. Introduction

This article discusses the specifics of fPLC configuration and programming and their impact on the safety function. PLCs specially designed for this purpose are commonly used today to implement safety functions in industry. To distinguish this category of PLC from standard PLCs, the designation “safety PLC” or “fail-safe PLC” is used (in this article, we use the designation “fPLC”; the designation “sPLC” is used for standard Programmable Logic Controllers, which are not intended for the implementation of safety functions).

The fPLCs use a composite fail-safe technique to achieve safety (more on this technique can be found in, for example, [1]) and are designed to be as easy to use as possible. In practice, this means that the way they are configured and programmed is close to the way sPLCs are configured and programmed. However, it cannot be entirely identical because a dual architecture is used to implement the composite fail-safe technique, and some specifics are associated with executing a program on such an architecture. The configuration and programming of the fPLC must also be adapted to these specifics. The aim is to achieve the required Safety Integrity Level (SIL), which applies to both hardware and software. In terms of hardware, it is primarily a question of the correct selection of sensors and actuators and their correct connection according to the safety function requirements. The connection of sensors and actuators with respect to the requirements of the safety function is dealt with in more detail in the literature [2,3]. In terms of software, it is primarily a matter of using programming procedures and methods that ensure the required function and enable its verification in various (even non-standard) operating states.

To simplify the procedures for ensuring safety, standardized hardware connections have been developed on the hardware side [4,5,6], and software modules [7] that cover the most common application problems are prepared on the software side. These solutions significantly facilitate the achievement of the required SIL. However, there are also applications for which standardized solutions are not suitable and which require an individual approach. In such cases, the limitations resulting from standardized solutions become a significant barrier to application development.

One example of such a limitation is that fPLCs can be programmed only in the Ladder Logic (LAD) or Function Block Diagram (FBD) languages. The fPLCs do not allow safety functions to be programmed in other languages typical of sPLCs, mainly the different versions of structured text. Therefore, fPLCs cannot be programmed with third-party software tools that make it possible to create a model of the functional behavior of the safety function (SF; the model can be, for example, a state diagram) and, after debugging it, to generate a program in structured text. Examples of such software are Rational Rhapsody, Visual Paradigm for UML, Millet UML Modeller, Modelio, MagicDraw, Matlab, etc. With sPLCs, automatic code generation from third-party tools is becoming an increasingly common way of implementing complex functions.

This led us to the idea of exploring the possibilities of implementing safety functions using two sPLCs joined in a dual architecture. The architecture can suitably eliminate the consequences of random sPLC failures and thus affect the safety or reliability of the created control system. With sPLCs, the possibility of using code automatically generated by various modelling and simulation tools to implement safety functions would be retained. This would bring undeniable advantages in programming more complex functions, not only in the design and programming phase of the system but also in its verification and validation.

A common feature of the use of an fPLC vs. two sPLCs is the utilization of a composite fail-safe technique using a dual architecture. The difference lies in how these architectures are realized. In an fPLC, the cooperation of the dual architecture channels is solved at the firmware level of each module. If we want to create a dual architecture using two sPLCs, however, their collaboration must be addressed at the application level. The solution to the problem of sPLC interoperability in a dual architecture is the focus of this paper (basically, it is the management of the dual architecture at the application level). By creating a suitable algorithm controlling the dual architecture, a somewhat universal, affordable platform could be created, enabling the implementation of safety functions based on application code generated from pre-tested models.

This paper presents a dual architecture based on sPLC controlled at the application level. The architecture can perform any safety function based on a model created in Matlab Stateflow software (more information about this software can be found in ref. [8]). In this paper, the safety function “Muting” is provided as an example. There is currently no procedure that would allow implementing a safety function into the PLC-based control system from a model. However, this trend is proving to be very advantageous (precisely because of the elimination of systematic errors in software development), and the paper is devoted precisely to this issue.

The use of a dual architecture is one of the ways to apply redundancy in the system. In the case dealt with in this article, redundancy is used for quick fault detection using comparison. This allows us to detect random failures and respond to them in a predefined way, which will lead to an increased SIL. Another option is to use redundancy so that after a part of the control system fails, the redundant part will perform the control tasks. This leads to an increase in the availability of the control system. The realization of such a system at the application level is dealt with, for example, in ref. [9]. If necessary, redundancy can be used in combination to achieve the required SIL and the control system’s required availability. However, this requires a multi-channel architecture (not just a dual-channel one). A more detailed description of this issue can be found in ref. [10].

2. System Safety

According to ref. [11], safety is defined as the absence of unacceptable risk. To make safety measurable, ref. [11] defines four Safety Integrity Levels. SIL 1 represents the lowest level of safety, and SIL 4 represents the highest level. The classification into individual SILs depends on the degree of fault tolerance of the system. Ageing-related failures are random failures and can be quantified using a probabilistic approach. Faults that can be inadvertently introduced into the system during its development (so-called systematic failures) are very difficult to quantify because we do not know the probability distribution of these failures. However, we can prevent them using various tools and techniques in system development, or they can be detected by testing the system. To achieve a certain SIL, it is necessary to meet the required resistance to both random and systematic failures.

Resistance to random failures is expressed by the intensity (rate) of dangerous system failures. For continuous operation, a dangerous failure rate in the range from 10⁻⁹ h⁻¹ to 10⁻⁸ h⁻¹ is required to meet SIL 4, and a rate in the range from 10⁻⁸ h⁻¹ to 10⁻⁷ h⁻¹ is required to meet SIL 3. To meet SIL 2, a rate in the range from 10⁻⁷ h⁻¹ to 10⁻⁶ h⁻¹ is required, and to meet SIL 1, a rate in the range from 10⁻⁶ h⁻¹ to 10⁻⁵ h⁻¹ is required.
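The mapping from a dangerous failure rate to a SIL band can be sketched in a few lines of Python. This is only an illustration of the ranges listed above; the function name `sil_for_rate`, the treatment of the band edges (lower bound inclusive, upper bound exclusive), and the `None` result outside the listed ranges are assumptions of this sketch, not part of the original text.

```python
# SIL bands for continuous operation as listed in the text, in failures per hour.
# Assumption of this sketch: lower bound inclusive, upper bound exclusive.
SIL_BANDS = {
    4: (1e-9, 1e-8),
    3: (1e-8, 1e-7),
    2: (1e-7, 1e-6),
    1: (1e-6, 1e-5),
}


def sil_for_rate(rate_per_hour):
    """Return the SIL whose band contains the dangerous failure rate, else None."""
    for sil, (low, high) in SIL_BANDS.items():
        if low <= rate_per_hour < high:
            return sil
    return None  # outside the ranges listed in the text


if __name__ == "__main__":
    for rate in (5e-9, 5e-8, 5e-7, 5e-6, 1e-4):
        print(rate, sil_for_rate(rate))
```

A rate above 10⁻⁵ h⁻¹ falls outside all four bands, so the sketch reports `None` rather than guessing a level.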

Resistance to systematic failures is expressed indirectly through measures to prevent and detect failures. It is true that the higher the SIL we want to achieve, the stronger the measures (or combinations of different measures) that must be used. The specific measures depend on the area of application.

If the system performs more than one safety function, the SIL is evaluated individually for each safety function. This guarantees the objectivity of the evaluation, as not every safety function uses all hardware parts of the system. Since the systems are implemented modularly today, the manufacturers state the SIL for each modular part of the system. Depending on whether this part applies to the implementation of a specific safety function or not, it does or does not affect its SIL. In this article, we will assume that the system performs only one safety function. The problem related to the implementation of multiple safety functions with different SILs utilizing a modular system is dealt with in more detail in refs. [12,13].

The SIL of individual safety functions is determined based on a risk analysis. The risk analysis is a separate problem, which is not the subject of this article. Refs. [14,15,16] deal with this in more detail.

Besides random and systematic failures, security incidents can also significantly affect safety, so this type of incident must be considered as well. sPLCs and fPLCs have some built-in mechanisms for security management. In general, it can be said that these mechanisms are not currently as strong as the mechanisms used in other information technologies. One problem, for example, is the need for frequent updates, which is not a simple issue at the process control level. Therefore, a suitable combination of mechanisms at the PLC level and the enterprise level is necessary to eliminate security incidents (we assume that the PLC is one node of the enterprise network). The issue of security incidents is not part of this article and is dealt with in more detail in refs. [17,18].

3. The Safety of the Dual Architecture

3.1. The Safety Integrity against Random Failures

Figure 1 shows the block diagram of a dual architecture. Its principle of operation is based on the composite fail-safe technique. Channel A and channel B independently perform the same function, and their results are compared with each other. The command for the Equipment Under Control (EUC) is only issued if the results of both channels are identical. If the results do not match, the output of the dual architecture must be safe for the EUC (this process is called fault negation). The condition required for such a solution is that channels A and B are independent of each other (the occurrence of a fault in one channel does not affect the probability of the occurrence of a fault in the other channel).
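The comparison and fault negation principle can be illustrated with a minimal Python sketch. The function name `dual_channel_output` is hypothetical, and the choice of logical zero (`False`) as the safe output is an assumption here, consistent with the safe state used later in the article.

```python
# Assumption: logical zero is the safe output for the EUC.
SAFE_OUTPUT = False


def dual_channel_output(result_a, result_b):
    """Pass a command to the EUC only if both channels agree; otherwise negate the fault."""
    if result_a == result_b:
        return result_a
    return SAFE_OUTPUT  # mismatch between channels: fall back to the safe output


if __name__ == "__main__":
    print(dual_channel_output(True, True))
    print(dual_channel_output(True, False))
```

The sketch deliberately ignores timing: in a real system the two results never arrive at exactly the same instant, which is why synchronization is treated separately in the article.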

If the above assumptions are met, then the probability of a dangerous failure of the dual architecture can be calculated according to the relation:

P_{D} = P_{A} \cdot P_{B}    (1)

where P_{A} is the probability of failure of channel A and P_{B} is the probability of failure of channel B.

If we assume that they are electronic systems, then for these systems, the distribution of failures can be described by an exponential distribution.

P_{D} (t) = (1 - e^{- λ_{A} t}) \cdot (1 - e^{- λ_{B} t})    (2)

where λ_{A} is the failure rate of channel A, λ_{B} is the failure rate of channel B, and t is the time of fault detection and negation.

To evaluate the SIL of the dual architecture, it is necessary to know the intensity of its dangerous failures, which is given by

λ_{D} (t) = \frac{\frac{d P_{D} (t)}{d t}}{1 - P_{D} (t)}    (3)

Assuming that λ \cdot t ≪ 1, a simplified relation can be derived from Equations (2) and (3):

λ_{D} (t) = 2 \cdot λ_{A} \cdot λ_{B} \cdot t    (4)
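The step from Equations (2) and (3) to Equation (4) can be made explicit. For λ·t ≪ 1, 1 − e^{−λt} ≈ λt, so P_{D}(t) ≈ λ_{A} λ_{B} t², its derivative is approximately 2 λ_{A} λ_{B} t, and the denominator 1 − P_{D}(t) is approximately 1. The following Python sketch checks this numerically; the function names and the example values (10⁻⁵ h⁻¹, 24 h) are illustrative assumptions, not data from the paper.

```python
import math


def hazard_exact(lam_a, lam_b, t):
    """Dangerous failure rate from Equations (2) and (3), with an analytic derivative."""
    q_a = -math.expm1(-lam_a * t)  # 1 - exp(-lam_a * t), computed accurately
    q_b = -math.expm1(-lam_b * t)  # 1 - exp(-lam_b * t)
    # d/dt of q_a * q_b, using d(q)/dt = lam * exp(-lam * t) = lam * (1 - q)
    dp_dt = lam_a * (1.0 - q_a) * q_b + lam_b * (1.0 - q_b) * q_a
    return dp_dt / (1.0 - q_a * q_b)


def hazard_approx(lam_a, lam_b, t):
    """Simplified relation of Equation (4), valid for lam * t << 1."""
    return 2.0 * lam_a * lam_b * t


if __name__ == "__main__":
    lam, t = 1e-5, 24.0  # hypothetical channel failure rate [1/h] and detection time [h]
    exact = hazard_exact(lam, lam, t)
    approx = hazard_approx(lam, lam, t)
    print(exact, approx, abs(exact - approx) / approx)
```

For these example values the two results differ by well under one percent, which is what the condition λ·t ≪ 1 is meant to guarantee.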

3.2. Safety Integrity against Systematic Failures

As mentioned above, safety integrity against systematic failures cannot be quantified but is expressed indirectly through failure prevention and fault detection measures. Minimum lists of measures necessary for individual SILs can be found, for example, in ref. [11]. If channels A and B of the dual architecture are mass-produced components, then the measures to prevent failures will apply to the application software, because the hardware and firmware of such devices are proven against systematic failures by their mass deployment. Failure prevention measures are applied by each manufacturer at each stage of the equipment life cycle, but mass deployment in various applications is usually an even more effective test for systematic failures. With the number of applications as well as the time of their operation, the probability that an existing systematic fault will manifest itself, and thus be discovered, also increases. If a systematic defect is found, the manufacturer will rectify it. Therefore, it can be argued that the number of remaining systematic failures decreases with the number of pieces produced and the time of their operation. However, the application software is individual for almost every application, so it is necessary to focus failure prevention measures precisely on the elimination of application software errors (from here on, “software” means application software unless explicitly stated otherwise).

The hardware and software of channels A and B used in the dual architecture may be identical or different. It is crucial for integrity against random failures that the channels be mutually independent. This can be achieved using appropriate hardware-level measures (galvanic channel separation, electromagnetic shielding, etc.). Only when this assumption is met does the channel comparison have the intended meaning. For suppressing systematic failures at the application software level, the significance of the comparison depends on whether the hardware and software are identical or different. Since different hardware will also require different software, and identical hardware may or may not have identical software, this problem can be simplified to the problem of identical or different software.

Identical software will also have identical failures, which cannot be detected by comparison. With different software, the chance of detecting systematic failures is much greater, although it cannot be quantified. The situation is shown in Figure 2. The SysA denotes the set of systematic failures of channel A, and the SysB denotes the set of systematic failures of channel B. Ideally, the intersection of these sets is an empty set. Then, by comparing the channels with each other, we would reveal any systematic failures. In the case of identical hardware and software, the SysA and SysB sets completely overlap, and the comparison does not reveal any systematic failure.
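The set relationship described here can be illustrated with Python sets. This is an idealized sketch: the fault labels are hypothetical, and it assumes that a fault present in only one channel always shows up as a difference in the comparison, which the text notes cannot be quantified in practice.

```python
def comparison_coverage(sys_a, sys_b):
    """Split systematic faults into those a channel comparison can and cannot reveal."""
    undetectable = sys_a & sys_b  # same fault in both channels: results still agree
    detectable = sys_a ^ sys_b    # fault in exactly one channel: results can differ
    return detectable, undetectable


if __name__ == "__main__":
    # Hypothetical fault labels for diverse software in channels A and B.
    sys_a = {"f1", "f2"}
    sys_b = {"f2", "f3"}
    print(comparison_coverage(sys_a, sys_b))
    # Identical software: the sets overlap completely and nothing is revealed.
    print(comparison_coverage(sys_a, sys_a))
```

With identical software the detectable set is empty, which matches the case of completely overlapping SysA and SysB sets.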

Since the intersection of the SysA and SysB sets cannot be quantified, it must be conservatively assumed that the safety integrity against systematic failures of the dual architecture is equal to that of the worse channel (the channel with the greater number of systematic failures).

Therefore, it can be argued that each unit must meet the safety integrity against systematic failures that corresponds to the required SIL of the dual architecture.

4. Creating a Dual Architecture from sPLCs

The block diagram of the dual architecture based on sPLCs is shown in Figure 3. sPLC A and sPLC B may be sPLCs from the same or different manufacturers. The comparison is performed in software on both PLCs (it is therefore also two-channel). Communication between the PLCs is necessary for comparison and mutual synchronization. Both systems, PLC A and PLC B, contain I/O ports, and sensors and terminals must be connected to both PLCs. This is fully compatible with sensors and terminals certified for use in safety applications. fPLCs have the same two-channel interfaces. The connection of the sensors and terminals to the fPLC is discussed in more detail in refs. [2,3,19].

4.1. The Safety Integrity against Random Failures

The failure rate of channels A and B, shown in Figure 3, can be expressed by the relation

λ_{A, B} = \sum_{i = 1}^{n} λ_{i}    (5)

where λ_{i} is the failure rate of the i-th element involved in the implementation of the safety function, and n is the number of these elements.

The dangerous failure rate of the dual architecture (necessary for the determination of SIL) can be determined according to Equation (4).

Table 1 and Table 2 show the maximum permissible failure rate of channel A and channel B (for simplicity, we assume identical channels) for achieving the respective SIL, assuming a fault detection and negation time of 1 h (Table 1) and 24 h (Table 2). The considered detection and negation times are relatively long; with the proposed method of comparison (Section 4.2.1), these times are at the level of tens, at most hundreds, of milliseconds. The tables also show the corresponding Mean Time to Failure (MTTF) values, as manufacturers generally state these values:

MTTF_{i} = \frac{1}{λ_{i}}    (6)

By comparing the MTTF values of commonly available sPLCs (obtained from ref. [20]) with the minimum values calculated in Table 1 and Table 2, it can be stated that achieving SIL 1 to SIL 4 is possible using a dual sPLC-based architecture.
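The kind of calculation behind such tables can be sketched as follows, under the assumptions of identical channels, 100% diagnostic coverage, and Equation (4), so that λ_{D} = 2·λ²·t must stay below the upper bound of the SIL band. The function names are illustrative, and the resulting numbers come from this sketch only; they are not copied from Table 1 or Table 2.

```python
import math

# Upper bounds of the dangerous failure rate per SIL (continuous operation) [1/h].
SIL_UPPER_BOUND = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}


def max_channel_rate(sil, t_hours):
    """Largest channel failure rate for which 2 * lam**2 * t stays at the SIL bound."""
    return math.sqrt(SIL_UPPER_BOUND[sil] / (2.0 * t_hours))


def min_mttf_hours(sil, t_hours):
    """Equation (6) applied to the largest permissible channel failure rate."""
    return 1.0 / max_channel_rate(sil, t_hours)


if __name__ == "__main__":
    for t in (1.0, 24.0):
        for sil in (1, 2, 3, 4):
            print(t, sil, max_channel_rate(sil, t), min_mttf_hours(sil, t))
```

Under these assumptions, SIL 4 with a 1 h detection and negation time corresponds to a minimum channel MTTF of roughly 14,000 h, and a longer detection time raises that requirement.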

The data in Table 1 and Table 2 assume a diagnostic coverage (DC) of 100%. If the DC is <100%, the requirements on the channel failure rates will be stricter. However, given the considered margin in the failure detection and negation time, it can be reasonably assumed that the required SIL can be achieved with a dual sPLC-based architecture. A more detailed model needs to be developed to accurately determine the rate of dangerous faults in dual architectures with DC < 100%. Such a model can be created using Markov diagrams; the influence of diagnostic coverage on the safety of dual architectures is dealt with in more detail, e.g., in refs. [21,22,23]. The remainder of this paper discusses safety integrity against systematic failures in more detail.

4.2. The Safety Integrity against Systematic Failures

General assumptions for achieving safety integrity against systematic failures of dual architecture were described in Section 3.2. This section focuses more on the problems associated with the created dual architecture based on sPLC. It solves two problems, namely synchronization and the comparison of channels (in summary, it can be said that this is a solution to the problem of controlling a dual architecture). These problems are very closely related to each other. Comparison would not be possible without synchronization. Conversely, synchronization depends on the method of comparison.

These problems can be solved by two approaches. The first consists of integrating the safety function with the architecture, and the second consists of separating these two parts. The first method is more advantageous when implementing program code for a single safety function. However, if we want to use such a procedure universally, we must design the architecture control independently of the implemented safety functions. The authors strive to propose a general approach, so we will continue to deal with the second option.

In the simplest case, the comparison would take place only at the level of outputs. This would not place high demands on synchronization accuracy, but it is the weakest option in terms of fault detection: a fault may occur in one of the channels, but the changed internal state may not be reflected immediately at the output. Ultimately, this extends the fault detection time, which negatively affects the dangerous failure rate of the dual architecture (this follows from Equation (4)).

From the point of view of the fast detection of each failure, the comparison of the internal states of channels A and B (not only the comparison of outputs) is ideal. This corresponds to a comparison on the channel data bus. However, such a comparison would require accurate external synchronization, which would be very difficult to implement. Comparisons in each bus cycle would not be possible for today’s computer systems, which are also used in PLCs. Therefore, some compromises need to be reached in this case, for example, comparing hashes of the data (CRC checksums) at regular intervals. However, this solution cannot be used in the case of a dual architecture based on sPLCs because the sPLC is available to the user only at the application level and cannot be synchronized externally at the bus level: its buses cannot be accessed, its firmware cannot be changed, and so on.

Therefore, the authors have proposed a comparison approach that is applicable at the application level and, at the same time, is sufficiently detailed (captures the internal states of the sPLC, not just its outputs). This approach consists of comparing the states of the performed function in the application program. It is assumed that the safety function always has an identifiable state. This can be achieved by modelling the function using a state-based model. A typical representative of such a model is the state diagram.

The model based on the state diagram can be rewritten practically in any language, or software tools can be used to generate an application program in the selected programming language or directly for the selected sPLC.

If the proposed procedure is combined with a suitably designed synchronization method, then channels A and B can be implemented not only from the same sPLCs but also using sPLCs from different manufacturers. Channel synchronization and comparison will form the control strategy for dual architecture. This control must be implemented at the application level and can also be described by a state diagram. For the dual architecture to be universal (usable for the implementation of various safety functions), its control must be designed independently of the implemented safety functions. For this to be possible, it is necessary to establish the preconditions for the implementation of safety functions that the control action of the dual architecture will be able to perform.

The assumptions can be summarized as follows:

Description of the safety function using a state diagram;
Availability of a software tool enabling the simulation of the created model (for its verification);
Availability of a software tool to generate a program from a state diagram for different types of sPLC;
Defining an interface for reading the current state of the safety function.

These requirements can be met using software tools Matlab Stateflow (function modelling) and Matlab Simulink (function simulation) (versions 2022a, Mathworks, Natick, MA, USA). These tools have been used in the design, simulation, and implementation of dual architecture control as well as an illustrative safety function.

4.2.1. Model of the Dual Architecture Control System

Dual architecture control must address:

Synchronization of units A and B;
Comparison of internal states of units A and B;
Negation of the fault after detecting a mismatch between units A and B. The fault detection and negation mechanism must place the system in a safe state; in our case, the safe state will be defined as logical zero at the system’s outputs.

As already indicated above, the control of the dual architecture would be most easily implemented using an external element cooperating with both sPLCs. However, such a solution would be unsatisfactory from the point of view of safety since the failure of this element could cause a dangerous failure of the dual architecture (it could induce the same failure in both channels; such a failure would be undetectable utilizing channel comparison). For this reason, the control of the architecture must be two-channel, and in case of failure of one channel, it must also terminate the operation of the system in a safe state. The proposed solution consists of two parts.

The first part represents the template for the safety function. The template ensures cooperation with the architecture control algorithm and is shown in Figure 4. The safety function model must be inserted into the red-framed state marked as “Logic” (the safety function model is realized by a state diagram; the “Logic” state represents a macro state). The “Do not execute” state is used to synchronize the operation of the safety functions in both units. In this state, the safety function awaits a command from the comparator. After a match with the second unit, the safety function continues in the “Logic” macro state. If there is no match, the dual architecture goes into the safe state “Safe_state”. The mentioned template is used in both sPLCs forming the dual architecture.

The second part cooperates with the first part. It is designed to avoid external synchronization, which would represent a potential source of errors. The principle is that after the state changes in one unit, this state is compared with the state in the other unit. The system waits for a predefined time for the states to become identical. If the states of the units do not become identical within this time, the architecture goes into the safe state (“Safe_state” in Figure 4). If the states of the units become identical within the specified time, the architecture evaluates the next potential transition between states. This method of evaluation is implemented in both units, and its detailed operation is described by the state diagram shown in Figure 5. Unit A (or B) compares the state of its safety function with the state in unit B (or A). The results of the comparison are exchanged between the units, and for unit A (B) to continue operating, a positive comparison result is needed from both unit A (B) and unit B (A). This ensures a two-channel unit comparison and a two-channel control of the mismatch tolerance time of channels A and B. This time is important and must not be exceeded because the response time of the system depends on it.

The meaning of the states in the state diagram of the comparator (Figure 5) is as follows.

Safe_state is the safe state in which the comparator remains after the PLC is restarted. It is the initialization state. The comp_A variable is set to false and the diag_code variable to 0.

The Recovery state is used to restore the functionality of the comparator after a PLC restart or after a fault. Recovery requires a sequence of the logical values true and false at the recovery input within three seconds. If the true value is not followed by a false value within three seconds, the comparator goes into the Missing state, and the sequence of signals at the recovery input must then be repeated.
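The recovery sequence check can be sketched in Python as follows. This is one possible reading of the description, assuming that the three-second window starts at the rising edge of the recovery input and that a late falling edge (the Missing case) simply requires the sequence to be repeated; the function name and the sampled-input representation are hypothetical.

```python
RECOVERY_WINDOW_S = 3.0  # time allowed between the true value and the following false value


def recovery_sequence_ok(samples):
    """Check time-ordered (time_s, recovery_input) samples for true followed by false in time."""
    rise_time = None
    previous = False
    for t, value in samples:
        if value and not previous:
            rise_time = t  # rising edge: the window starts here (assumption)
        elif previous and not value:
            # Falling edge: accept only if it arrives within the window.
            if t - rise_time <= RECOVERY_WINDOW_S:
                return True
            rise_time = None  # too late ("Missing"): the sequence must be repeated
        previous = value
    return False


if __name__ == "__main__":
    print(recovery_sequence_ok([(0.0, False), (1.0, True), (2.0, False)]))
    print(recovery_sequence_ok([(0.0, True), (5.0, False)]))
```

Requiring both edges, rather than a static true level, means that a recovery input stuck at true cannot restart the comparator.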

After a successful recovery sequence, the comparator goes to the Setting state, where it sets the variables diag_code to the value of 8 and comp_A to the value of true. In this state, the comparator waits for the comparison result from comparator B. If this value of the comp_B variable is equal to true, the comparator goes into the “macro state” Evaluation.

In the “macro state” Evaluation, two parallel sub-states are executed, namely the Change_status sub-state and the Comparison sub-state. The Change_status sub-state monitors the execution of the safety function, and when its state changes, the comparator goes into the Different_states state, in which it saves the current state of the safety function in the variable previous_state and sets the wait variable to the value of true. If the state of the safety function in PLC A equals the state of the safety function in PLC B, then the Change_status sub-state enters the state Same_states, in which the wait variable is reset to the value of false, and the state of the safety function from PLC A is continuously stored in the variable status_act.

When the “macro state” Evaluation is entered, the initial state of the Comparison sub-state is Wait, and the variable diag_code is set to the value of 3. If the state of the safety function in PLC A and the state of the safety function in PLC B are equal, the comparator goes to the state Ok, the comparator output comp_A is set to the value of true, and diag_code is set to the value of 4. If the states of the safety function in PLC A and PLC B are no longer equal, the comparator goes back to the state Wait, and diag_code is set to the value of 3. The comparator remains in this state as long as the states of the safety function in PLC A and PLC B are not equal, but at most for the allowed discrepancy time set at the discrepancy_time input. If the states match within the allowed discrepancy time, the comparator goes back to the Ok state, comp_A is set to the value of true, and diag_code is set to the value of 4. If the allowed discrepancy time expires, the comparator goes into the Discrepancy state, where comp_A is set to the value of false, the auxiliary variable in_time to the value of true, and diag_code to the value of 5. Setting in_time to true causes the comparator to leave the “macro state” Evaluation and go into the safe state, Safe_state, in which in_time is reset to false.

From the “macro state” Evaluation, the comparator also goes into the safe state, Safe_state, in the event of a communication failure between PLC A and PLC B, in which case diag_code is set to the value of 7. The same applies to the Recovery and Setting states: in the event of a communication failure, the comparator goes into Safe_state, and diag_code is set to the value of 7.

Within the “macro state” Evaluation, the comparator can also go from the Ok state to the Error_comp_B state when a false value appears at the input comp_B. In that case, the variable comp_A is set to the value of false, the auxiliary variable error_comp_B to the value of true, and diag_code to the value of 6. After error_comp_B is set to true, the comparator goes into the safe state, Safe_state, and error_comp_B is reset to the value of false.
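The Comparison logic described above can be summarized in a simplified Python sketch. It is not the Stateflow model of Figure 5: the Recovery and Missing states are collapsed into a single `recovered` flag, the Change_status sub-state is omitted, time is passed in explicitly, and the sketch keeps the fault code in `diag_code` after returning to Safe_state. All of these are assumptions made for illustration, as are the class and method names.

```python
from dataclasses import dataclass

# diag_code values taken from the description of the comparator states.
DIAG = {"Safe_state": 0, "Wait": 3, "Ok": 4, "Discrepancy": 5,
        "Error_comp_B": 6, "Comm_failure": 7, "Setting": 8}


@dataclass
class Comparator:
    discrepancy_time: float       # allowed mismatch time (discrepancy_time input)
    state: str = "Safe_state"
    comp_a: bool = False
    diag_code: int = 0
    wait_since: float = 0.0

    def _enter(self, state, now):
        self.state = state
        self.diag_code = DIAG[state]
        if state in ("Setting", "Ok"):
            self.comp_a = True
        if state == "Wait":
            self.wait_since = now  # start of the mismatch tolerance window

    def _fail(self, reason):
        # Fault negation: back to Safe_state; the reason stays in diag_code (assumption).
        self.state = "Safe_state"
        self.comp_a = False
        self.diag_code = DIAG[reason]

    def step(self, now, state_a, state_b, comp_b, comm_ok=True, recovered=False):
        """Advance by one evaluation cycle and return (comp_A, diag_code)."""
        if self.state == "Safe_state":
            if recovered and comm_ok:
                self._enter("Setting", now)
        elif not comm_ok:
            self._fail("Comm_failure")
        elif self.state == "Setting":
            if comp_b:
                self._enter("Wait", now)  # entry into the Evaluation macro state
        elif self.state == "Ok":
            if not comp_b:
                self._fail("Error_comp_B")
            elif state_a != state_b:
                self._enter("Wait", now)
        elif self.state == "Wait":
            if state_a == state_b:
                self._enter("Ok", now)
            elif now - self.wait_since > self.discrepancy_time:
                self._fail("Discrepancy")
        return self.comp_a, self.diag_code
```

A unit would call `step` once per program cycle with its own safety function state, the state received from the partner unit, and the partner's comparison result.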

Table 3 shows the input interface, and Table 4 shows the output interface of the comparator shown in Figure 5.
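
The comparator behavior described above can be illustrated with a minimal Python sketch. It is not the Stateflow model from Figure 5: the Recovery and Setting states are skipped, the transient states Discrepancy and Error_comp_B are collapsed into a direct transition to Safe_state (keeping their diag_code values), and the cycle time `dt` as well as the class and method names are assumptions made only for this illustration. The input and output names follow Tables 3 and 4.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    WAIT = "Wait"
    OK = "Ok"
    SAFE = "Safe_state"


@dataclass
class Comparator:
    discrepancy_time: int      # allowed discrepancy time in ms
    state: State = State.WAIT
    diag_code: int = 3
    comp_A: bool = False
    elapsed: int = 0           # ms spent with differing states

    def _to_safe(self, diag_code: int) -> None:
        self.state, self.comp_A, self.diag_code = State.SAFE, False, diag_code

    def step(self, state_A: int, state_B: int, comp_B: bool,
             communication: bool, recovery: bool, dt: int) -> bool:
        """Evaluate one cycle of length dt (ms) and return comp_A."""
        if self.state is State.SAFE:
            # Assumption: recovery leads straight back to Wait.
            if recovery and communication:
                self.state, self.diag_code, self.elapsed = State.WAIT, 3, 0
            return self.comp_A
        if not communication:
            self._to_safe(7)                   # communication failure
        elif self.state is State.OK and not comp_B:
            self._to_safe(6)                   # via Error_comp_B
        elif state_A == state_B:
            self.state, self.comp_A = State.OK, True
            self.diag_code, self.elapsed = 4, 0
        else:
            self.elapsed += dt
            if self.elapsed > self.discrepancy_time:
                self._to_safe(5)               # via Discrepancy
            else:
                # comp_A is left unchanged while waiting for a match.
                self.state, self.diag_code = State.WAIT, 3
        return self.comp_A


c = Comparator(discrepancy_time=100)
c.step(1, 1, True, True, False, 50)    # states match: Ok
for _ in range(3):                     # states differ for 150 ms in total
    c.step(1, 2, True, True, False, 50)
print(c.state, c.diag_code)
```

In this sketch, a mismatch that lasts longer than discrepancy_time ends in Safe_state with diag_code 5, which mirrors the path through the Discrepancy state in the text.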

4.2.2. Safety Function Model

With the help of the dual architecture, the safety function commonly called “Muting” can be implemented. Its task is to protect a moving object against the effects of technology located in a dangerous zone (usually, the aim is to protect a person who has entered the dangerous part of the technology). The detailed specification of the “Muting” function was taken from the freely available PLCopen safety documentation [24]. More detailed information regarding the “Muting” function is also provided in ref. [25].

The safety function “Muting” can be characterized simply by the fact that, based on the sensors located in front of the light curtain (LC), it evaluates whether the light curtain was interrupted by a person or by moving material. In the event of a human interruption, it is necessary to put the device generating the potential danger into a safe state. In the event of material interruption, this is a normal working state, and the operation of the device does not need to be interfered with (muting of the light curtain is in progress).

Figure 6 shows a state diagram describing the operation of the “Muting” safety function. The state diagram is placed in the template according to Figure 4.

After start-up, the function is in the Do_Not_Execute state, which is also the initial state. If the comparator sends a false value to the wait input, the function goes into the Execute state, whose initial state is Safe_state, in which all function outputs are set to false. If the comparator sends a true value to the comparator input, the function goes into the Recovery state, in which the ACK_REQ output is set to true and a true value is awaited at the recovery input. When a true value is detected at the recovery input, the function goes into the Standby state, in which the current state of the light curtain is checked and the ACK_REQ output is set to false. If the light curtain is interrupted, the function goes into the LC_interrupted state; from there, once the light curtain is no longer interrupted, the function goes to the Recovery state, in which the ACK_REQ output is set to true again. That is, the recovery request is active, and the function can return to the active state after a failure. From the Standby state, if the light curtain is not interrupted, the function goes into the LC_free state. In this state, output Q is set to true, and the function waits for the “Muting” sequence to start. The light curtain may be interrupted in this state; if this happens, the function goes into the LC_interrupted state, in which output Q is set to false. The sequence can start with a rising edge on input MS1, on input MS2, or on both inputs simultaneously. In the case of an incorrect “Muting” sequence (the compound condition written at the transition in Figure 6), the function enters the Error_Muting state, which is a safe state in which the outputs are set to false. From this state, the function enters the Recovery state when a false value is detected at the MS1 or MS2 input. After a correct “Muting” sequence starts, the function goes to the Muting_evaluation macro state.
From the macro state Muting_evaluation, the function goes into the state Expired_time, in which all outputs of the function are set to false, when the maximum muting duration, Muting_time, is exceeded. From the Muting_evaluation macro state, the function goes into the LC_interrupted state when the light curtain is interrupted while “Muting” is not in progress. Within the Muting_evaluation macro state, the function reaches the Muting_Start_1 state after a rising edge is detected at the MS1 input while the MS2 input is false. On entering this state, the allowed discrepancy time starts to count; when it expires, the function goes into the Expired_discrepancy_time state and then into the Expired_time state, which is a safe state in which all outputs of the function are set to false. From this state, the function enters the Recovery state after a false value is detected on the MS1 or MS2 input. The function behaves in the same way in the second case: when the “Muting” sequence starts with a rising edge at the MS2 input while the MS1 input is false, the function goes into the Muting_Start_2 state, in which the countdown of the allowed discrepancy time (discrepancy_time) starts. After it expires, the function goes into the Expired_discrepancy_time state and subsequently into the Expired_time state, in which all outputs of the function are set to false. From the Muting_Start_1 state, the function goes into the Muting_Error state after an incorrect “Muting” sequence or if the “Muting” function is not enabled, that is, if the value at the enable_Muting input is false. Likewise, the function goes from the Muting_Start_2 state into the Muting_Error state after an incorrect “Muting” sequence or if the value at the enable_Muting input is false.
From the Muting_Start_1 state, the function goes into the Muting_active state, where the MUTING output is set to true, after a rising edge is detected at the MS2 input while the MS1 input is true. Likewise, from the Muting_Start_2 state, the function goes into the Muting_active state when a rising edge is detected at the MS1 input while the MS2 input is true. From the Muting_active state, the function goes into the LC_free state after a false value is detected on the MS1 or MS2 input. In the LC_free state, the MUTING output is set to false.

The function can go from any state into the safe state Safe_state after detecting a false value at the comparator input. Likewise, the function can go from any state into the Do_Not_Execute state when a true value is detected on the wait input.
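
To make the core of the “Muting” sequence easier to follow, the fragment below sketches a reduced version of it in Python. It covers only the states LC_free, Muting_Start_1, Muting_Start_2, Muting_active, LC_interrupted, and Expired_time. The Recovery, Standby, and error states, the enable_Muting input, and the compound condition from Figure 6 are left out. The class name `MutingSketch`, the input `lc_clear`, the cycle time `dt`, the assumption that Q stays true in the start states, and the assumption that Muting_time is counted from entry into Muting_active are illustrative choices, not taken from the model.

```python
from enum import Enum, auto


class S(Enum):
    LC_FREE = auto()
    MUTING_START_1 = auto()
    MUTING_START_2 = auto()
    MUTING_ACTIVE = auto()
    LC_INTERRUPTED = auto()
    EXPIRED_TIME = auto()


class MutingSketch:
    def __init__(self, discrepancy_time: int, muting_time: int):
        self.discrepancy_time = discrepancy_time   # ms
        self.muting_time = muting_time             # ms
        self.state = S.LC_FREE
        self.timer = 0
        self.prev_ms1 = self.prev_ms2 = False

    @property
    def Q(self) -> bool:
        # Assumption: Q is true in every non-safe state of this sketch.
        return self.state not in (S.LC_INTERRUPTED, S.EXPIRED_TIME)

    @property
    def MUTING(self) -> bool:
        return self.state is S.MUTING_ACTIVE

    def _go(self, new_state: S) -> None:
        self.state, self.timer = new_state, 0

    def step(self, ms1: bool, ms2: bool, lc_clear: bool, dt: int) -> S:
        rise1 = ms1 and not self.prev_ms1          # rising edge on MS1
        rise2 = ms2 and not self.prev_ms2          # rising edge on MS2
        self.prev_ms1, self.prev_ms2 = ms1, ms2
        s = self.state
        if s in (S.LC_INTERRUPTED, S.EXPIRED_TIME):
            return s                               # recovery is not modelled
        self.timer += dt
        if s is S.MUTING_ACTIVE:
            if self.timer > self.muting_time:
                self._go(S.EXPIRED_TIME)
            elif not (ms1 and ms2):
                self._go(S.LC_FREE)
        elif not lc_clear:
            self._go(S.LC_INTERRUPTED)             # muting not in progress
        elif s is S.LC_FREE:
            if rise1 and not ms2:
                self._go(S.MUTING_START_1)
            elif rise2 and not ms1:
                self._go(S.MUTING_START_2)
        elif self.timer > self.discrepancy_time:
            self._go(S.EXPIRED_TIME)               # via Expired_discrepancy_time
        elif (s is S.MUTING_START_1 and rise2 and ms1) or \
             (s is S.MUTING_START_2 and rise1 and ms2):
            self._go(S.MUTING_ACTIVE)
        return self.state


m = MutingSketch(discrepancy_time=100, muting_time=1000)
m.step(True, False, True, 10)    # MS1 rises: Muting_Start_1
m.step(True, True, True, 10)     # MS2 rises in time: Muting_active
m.step(True, True, False, 10)    # light curtain interrupted, but muted
print(m.state, m.Q, m.MUTING)
```

The last three calls correspond to material passing the sensors in the correct order: the interruption of the light curtain while Muting_active is held does not switch off output Q.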

We created the model of the “Muting” safety function (Figure 6) in the Matlab Stateflow software tool. After creation, the function was simulated in the Matlab Simulink tool, which made it possible to run various test scenarios and compare the results with the expected results. Given that all test scenarios were successful, we were able to proceed with the implementation of the function in a dual architecture based on sPLCs.

5. Implementation of the “Muting” Safety Function in a Dual Architecture Based on sPLC

Given that the Matlab Stateflow (version 2022a, Mathworks, USA) software tool enables the generation of software blocks that can be directly imported into the Simatic sPLC, the Simatic platform was used to implement the safety function, specifically the sPLC with CPU 1516F-3PN/DP. The implementation requires the design of a dual hardware architecture according to the principles mentioned above and the subsequent implementation of the generated application software.

5.1. Dual Architecture Hardware Block Diagram

Figure 7 shows a block diagram of a dual architecture based on the sPLC. It shows the connection of sensors and actuators suitable for the “Muting” safety function. The wiring is shown in simplified form because detailed circuit wiring is not necessary for this contribution and would go beyond its scope. A similar solution can also be used to implement other safety functions (after adjusting the input and output modules).

Sensors MS1 and MS2 are in front of the light curtain, and on their basis, it is possible to identify whether the light curtain was interrupted by a person or by material (the resolution procedure is expressed by the state diagram in Figure 6). The light curtain (LC) is connected in two channels. One channel from the LC is connected to sPLC A, and the other channel is connected to sPLC B.

Contactors are connected to the outputs of sPLC A and sPLC B. They are connected in series and, according to the values at the outputs of sPLC A and sPLC B, connect or disconnect the energy source from the EUC. The two-channel connection of the contactors enables the safe disconnection of the EUC from the energy source. The feedback senses the state of the contactors and is necessary for application diagnostics. Ref. [19] deals with the connection of contactors and feedback in more detail.
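
The effect of the series connection can be shown with a short truth-table sketch. The function names and the feedback convention (feedback true means the contactor is closed) are assumptions for illustration only; the actual feedback circuit is described in ref. [19].

```python
from itertools import product


def euc_powered(out_a: bool, out_b: bool) -> bool:
    """Series contactors: the EUC is powered only if both are closed."""
    return out_a and out_b


def feedback_mismatch(commanded_closed: bool, feedback_closed: bool) -> bool:
    """Hypothetical diagnostic: contactor state differs from the command."""
    return commanded_closed != feedback_closed


# Either sPLC alone can disconnect the EUC from the energy source.
for out_a, out_b in product((False, True), repeat=2):
    print(out_a, out_b, "->", euc_powered(out_a, out_b))
```

A contactor that stays closed after being commanded open would show up as `feedback_mismatch(False, True)`, which is the kind of fault the feedback is meant to reveal.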

5.2. Software Implementation in Dual Architecture

After the software blocks are generated by the Matlab Stateflow tool, they can be imported into the TIA Portal software tool. TIA Portal is a tool for configuring and programming Siemens hardware components (the manufacturer of the CPU used is Siemens; more information about TIA Portal can be found in ref. [26]). After importing, these blocks must be called in the main program and assigned a specific physical and memory interface. Figure 8 shows the function blocks of the comparator and the “Muting” safety function in sPLC A. The same blocks are also implemented in sPLC B. The blocks in sPLC A and sPLC B differ only in the assigned interface.

For sPLC A and sPLC B to work together, communication between them is necessary, and it must be two-way. Different types of communication come into consideration, and the choice depends primarily on the sPLCs used. In the given application example, communication over the Profinet bus was used, and the communication mode was set to IO controller/I-device.

5.3. Reviving and Testing Dual Architecture

Since the functions of the comparator (Figure 5) and the safety function “Muting” (Figure 6) were thoroughly tested at the model level in the Matlab Simulink software tool, the generated software was put into operation on the dual architecture according to Figure 7 without any difficulties and without interventions in the generated software (such additional interventions would conflict with the safety requirements). Subsequently, test scenarios were created to verify the correct operation of the safety function and also to verify the reaction to simulated faults. The verification of the behavior of the “Muting” function during simulated faults is particularly important because a safe reaction to a fault is the main goal of the proposed and implemented dual architecture.

All tests were recorded using tables, where each row of the table represents one test case and also evaluates this test case (compliant/non-compliant). In the case of the tested safety function, all were evaluated as compliant. The tables with the individual test cases are not part of this article because they are too extensive.

6. Discussion and Conclusions

Dual architecture is commonly used to achieve the required SIL of electronic control systems. However, its implementation is typically at the firmware level of control systems. The proposed methodology enables the implementation of a dual architecture at the application level.

This paper deals primarily with safety integrity against systematic software failures. Safety integrity against random failures is addressed only to the extent needed to demonstrate that it is not an obstacle to implementing a dual architecture. This paper does not address cyber security. If the dual architecture were connected to a publicly accessible network (e.g., the Internet), cyber security would also have to be addressed. Useful information on this issue can be found in ref. [27].

The possibility of implementing safety functions modelled and simulated in the Matlab Stateflow software environment can be considered the main benefit of the proposed method. The solution for the implementation of safety functions in industrial applications with the possibility of their modelling and simulation is not yet commercially available, and functional safety testing can only be carried out after the system has been programmed. At the same time, testing is a very important mechanism for detecting systematic software failures. Thanks to the solution proposed in the paper, testing can be performed even before the system is implemented, which can save a significant part of the costs because systematic faults can be detected and eliminated during the design phase and not after the hardware implementation of the control system.

A Simatic PLC of the S7-1500 series (company Siemens, Berlin, Germany) was used as dual architecture hardware. However, the method is universal and can be applied to PLCs of other manufacturers or other hardware platforms. The verification of the proposed methodology also took place on other safety functions and hardware platforms, but for the sake of clarity, only the selected platform and the selected safety function are listed.

The acquired knowledge can also be used in the creation of more complex architectures of control systems (2oo3) to positively influence not only the safety but also the availability of the control system.

Author Contributions

Conceptualization, J.Ž.; data curation, M.M.; formal analysis, K.R.; funding acquisition, M.G.; investigation, K.R.; methodology, M.M. and J.Ž.; project administration, K.R.; resources, M.G.; software, M.M.; supervision, J.H.; validation, J.Ž.; visualization, J.H.; writing—original draft, M.M.; writing—review and editing, J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported under the project of Operational Programme Integrated Infrastructure: Independent research and development of technological kits based on wearable electronics products as tools for raising hygienic standards in a society exposed to the virus causing the COVID-19 disease, ITMS2014+ code 313011ASK8. The project received co-funding from the European Regional Development Fund.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All necessary data is given in this article. (Generated from equations).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rástočný, K.; Ždánsky, J.; Hrbček, J.; Medvedík, M. Calculation of the Dangerous Failure Rate of the Safety Function. Appl. Sci. 2022, 12, 2382.
2. Ždánsky, J.; Rástočný, K.; Medvedík, M. Safety of two-channel connection of sensors to safety PLC. In Proceedings of the 13th International Conference, ELEKTRO 2020, Taormina, Sicily, Italy, 25–28 May 2012; IEEE: Piscataway, NJ, USA, 2020.
3. Ždánsky, J.; Rástočný, K.; Hrbček, J. The output circuit solution of the safety PLC for a larger number of output points. In Proceedings of the 26th International Conference on Applied Electronics, Pilsen, Czech Republic, 6–7 September 2021.
4. Siemens. SIMATIC Safety-Getting Started. Available online: https://cache.industry.siemens.com/dl/files/838/49972838/att_63504/v1/safety_getting_started_en-US.pdf (accessed on 31 May 2022).
5. Beckhoff. Application Guide TwinSAFE. Examples for the Calculation of Safety Parameters for Safety Functions. Available online: https://download.beckhoff.com/download/document/automation/twinsafe/applicationguidetwinsafeen.pdf (accessed on 31 May 2022).
6. Bernecker + Rainer Industrie Elektronik GmbH. Integrated Safety Technology. MASAFETY-ENG_V1.141. Available online: https://www.br-automation.com/cs/ke-stazeni/safety-technology/integrated-safety-technology-users-manual-legacy/ (accessed on 14 July 2022).
7. Siemens. Safety Applications with the S7-1200 FC CPU. Available online: https://support.industry.siemens.com/cs/document/109478932/safety-applications-with-s7-1200-fc-cpu?dti=0&lc=en-WW (accessed on 31 May 2022).
8. Available online: https://www.mathworks.com (accessed on 23 September 2022).
9. Zhao, Y.; Liu, F. The implementation of a dual-redundant control system. Control. Eng. Pract. 2022, 12, 445–453.
10. Ždánsky, J.; Rástočný, K. Influence of Redundancy on Safety Integrity of SRCS with Safety PLC. In Proceedings of the 10th International Conference, ELEKTRO 2014, Rajecké Teplice, Slovakia, 19–20 May 2014; IEEE: Piscataway, NJ, USA, 2014.
11. EN61508: Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems. 2010. Available online: https://webstore.iec.ch/publication/22273 (accessed on 23 September 2022).
12. Rástočný, K.; Ždánsky, J.; Nagy, P. Some specific activities at the railway signalling system development. In Proceedings of the 12th International Conference Transport Systems Telematics, Telematics in the Transport Environment, Katowice-Ustron, Poland, 10–13 October 2012; Springer: Berlin/Heidelberg, Germany, 2007; Volume 329, pp. 372–381.
13. Rástočný, K.; Ždánsky, J. Specificities of safety PLC based implementation of the safety function. In Proceedings of the International Conference Applied Electronics, Pilsen, Czech Republic, 6–7 September 2012.
14. Demichela, M.; Pirani, R.; Leva, M.C. Human factor analysis embedded in risk assessment of industrial machines: Effects on the safety integrity level. Int. J. Perform. Eng. 2014, 10, 487–496.
15. Hutchinson, D.; Luria, G.; Pindek, S.; Spector, P. The effects of industry risk level on safety training outcomes: A meta-analysis of intervention studies. Saf. Sci. 2022, 152, 1–11.
16. Li, Z.; Dao, H.; Patel, H.; Liu, Y.; Zhou, B. Incorporating Traffic Control and Safety Hardware Performance Functions into Risk-Based Highway Safety Analysis. Promet-Traffic Transp. 2017, 29, 143–153.
17. Liu, B.; Chen, J.; Hu, Y. Mode division-based anomaly detection against integrity and availability attacks in industrial cyber-physical systems. Comput. Ind. 2022, 137, 1–10.
18. Yang, K.; Wang, H.N.; Sun, L.M. An effective intrusion-resilient mechanism for programmable logic controllers against data tampering attacks. Comput. Ind. 2022, 138, 1–13.
19. Ždánsky, J.; Rástočný, K.; Hrbček, J. Influence of architecture and diagnostic to the safety integrity of SRECS output part. In Proceedings of the International Conference Applied Electronics, Pilsen, Czech Republic, 8–9 September 2015.
20. Siemens. Mean Time between Failures (MTBF)-List for SIMATIC Products. Available online: https://support.industry.siemens.com/cs/document/16818490/mean-time-between-failures-(mtbf)-list-for-simatic-products?dti=0&lc=en-WW (accessed on 2 June 2022).
21. Rástočný, K.; Ždánsky, J.; Franeková, M.; Zolotová, I. Modelling of Diagnostics Influence on Control System Safety. Comput. Inform. 2018, 37, 457–475.
22. Kolek, L.; Ibrahim, M.Y.; Gunawan, I.; Laribi, M.A.; Zegloul, S. Evaluation of control system reliability using combined dynamic fault trees and Markov models. In Proceedings of the IEEE 13th International Conference on Industrial Informatics (INDIN), Cambridge, UK, 22–24 July 2015.
23. Shu, Y.; Zhao, J. A simplified Markov-based approach for safety integrity level verification. J. Loss Prev. Process Ind. 2014, 29, 262–266.
24. Technical Specification. PLCopen–Technical Committee 5–Safety Software. Available online: https://plcopen.org/system/files/downloads/plcopen_safety_part_1_version_2.01.pdf (accessed on 16 May 2022).
25. Ždánsky, J.; Medvedík, M. Performing safety functions to monitor the protected area using a light curtain. In Proceedings of the International Conference Applied Electronics, Pilsen, Czech Republic, 10–11 September 2019.
26. Tia Portal. Available online: https://new.siemens.com/global/en/products/automation/industry-software/automation-software/tia-portal.html (accessed on 23 September 2022).
27. Hajda, J.; Jakuszewski, R.; Ogonowski, S. Security Challenges in Industry 4.0 PLC Systems. Appl. Sci. 2021, 11, 9785.

Figure 1. The principle of dual architecture.

Figure 2. Representation of the sets of systematic failures of channels A and B.

Figure 3. Block diagram of the dual architecture based on sPLC.

Figure 4. Template for the safety function (H denotes the marker for the initial state after a transition back to the state Execute).

Figure 5. Comparator state diagram.

Figure 6. State diagram of the “Muting” safety function.

Figure 7. Block diagram of dual architecture based on sPLC.

Figure 8. Function block of the comparator (a) and function block of the “Muting” safety function (b).

Table 1. The minimum failure rates and MTTFs of channels A and B necessary to achieve SIL 1–4 for fault detection and negation time of 1 h determined according to Equations (4) and (6).

	SIL 1		SIL 2		SIL 3		SIL 4	
	min	max	min	max	min	max	min	max
λ [h⁻¹]	1·10⁻⁶	1·10⁻⁵	1·10⁻⁷	1·10⁻⁶	1·10⁻⁸	1·10⁻⁷	1·10⁻⁹	1·10⁻⁸
$λ_{A, B}$ [h⁻¹]	7.07·10⁻⁴	2.24·10⁻³	2.24·10⁻⁴	7.07·10⁻⁴	7.07·10⁻⁵	2.24·10⁻⁴	2.24·10⁻⁵	7.07·10⁻⁵
MTTF [h]	1414.21	447.21	4472.14	1414.21	14,142.14	4472.14	44,721.36	14,142.14
MTTF [years]	0.16	0.05	0.51	0.16	1.61	0.51	5.11	1.61

Table 2. The minimum failure rates and MTTFs of channels A and B necessary to achieve SIL 1–4 for fault detection and negation time of 24 h determined according to Equations (4) and (6).

	SIL 1		SIL 2		SIL 3		SIL 4	
	min	max	min	max	min	max	min	max
λ [h⁻¹]	1·10⁻⁶	1·10⁻⁵	1·10⁻⁷	1·10⁻⁶	1·10⁻⁸	1·10⁻⁷	1·10⁻⁹	1·10⁻⁸
$λ_{A, B}$ [h⁻¹]	1.44·10⁻⁴	4.56·10⁻⁴	4.56·10⁻⁵	1.44·10⁻⁴	1.44·10⁻⁵	4.56·10⁻⁵	4.56·10⁻⁶	1.44·10⁻⁵
MTTF [h]	6928.20	2190.89	21,908.90	6928.20	69,282.03	21,908.90	219,089.02	69,282.03
MTTF [years]	0.79	0.25	2.50	0.79	7.91	2.50	25.01	7.91
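
The values in Tables 1 and 2 can be reproduced with a few lines of Python. The relation used here, λ ≈ 2·λ_{A,B}²·t with identical channels and t the fault detection and negation time, is an assumption inferred from the tabulated numbers; it is not quoted from Equations (4) and (6). The function name and the 8760 h year are likewise illustrative choices.

```python
import math

HOURS_PER_YEAR = 8760


def channel_limits(lam_system: float, t_detect_h: float):
    """Return (channel failure rate [1/h], MTTF [h], MTTF [years]).

    Assumes lam_system = 2 * lam_channel**2 * t_detect_h.
    """
    lam_channel = math.sqrt(lam_system / (2.0 * t_detect_h))
    mttf_h = 1.0 / lam_channel
    return lam_channel, mttf_h, mttf_h / HOURS_PER_YEAR


for t in (1, 24):
    print(f"fault detection and negation time: {t} h")
    for lam in (1e-5, 1e-6, 1e-7, 1e-8, 1e-9):
        lam_ch, mttf_h, mttf_y = channel_limits(lam, t)
        print(f"  lambda={lam:.0e}  lambda_AB={lam_ch:.2e}  "
              f"MTTF={mttf_h:.2f} h  ({mttf_y:.2f} years)")
```

For example, λ = 1·10⁻⁶ h⁻¹ and t = 1 h give a channel failure rate of about 7.07·10⁻⁴ h⁻¹ and an MTTF of about 1414.21 h, matching the SIL 1 column of Table 1.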

Table 3. Comparator input interface.

Input	Data Type	Description
discrepancy_time	INT	Allowed discrepancy time between the states of the safety function executed in PLC A and PLC B in milliseconds.
comp_B	BOOL	Comparison results from comparator B. If the value is true, the comparison was successful, and if the value is false, the comparison was unsuccessful.
communication	BOOL	Communication between PLC A and PLC B. If the value is true, the communication is going well, and if the value is false, the communication is faulty or not happening at all.
recovery	BOOL	Restoring the comparator function from the safe state to the evaluation state.
state_A	INT	The status of the safety function in PLC A.
state_B	INT	The status of the safety function in PLC B.

Table 4. Comparator output interface.

Output	Data Type	Description
diag_code	INT	Diagnostic information about the state of the comparator, expressed as an integer.
comp_A	BOOL	Comparison result of comparator A. If the value is true, the comparison was successful, and if the value is false, the comparison was unsuccessful.
comparator	BOOL	The overall result of the comparison, passed to the safety function, which performs its activity based on this output.
wait	BOOL	An instruction to the safety function to suspend its execution unless the safety function is in the same state in the second PLC.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

