STMicroelectronics STM32MP1 User Manual

UM2714

User manual

STM32MP1 Series safety manual

Introduction

This document must be read along with the technical documentation such as reference manual(s) and datasheets for the STM32MP1 Series microprocessor devices, available on www.st.com.

It describes how to use the devices in the context of a safety-related system, specifying the user's responsibilities for installation and operation in order to reach the targeted safety integrity level. It also pertains to the X-CUBE-STL software product. The safety concept described in this manual is based on the possible implementation of safety function(s) on the Arm® Cortex®-M4 CPU (and associated peripherals) included in STM32MP1.

It provides the essential information pertaining to the applicable functional safety standards, which allows system designers to avoid going into unnecessary details.

The document is written in compliance with IEC 61508, and it provides information relative to other functional safety standards.

The safety analysis in this manual takes into account the device variation in terms of memory size, available peripherals, and package.

UM2714 - Rev 2 - October 2020

www.st.com

For further information contact your local STMicroelectronics sales office.

 

 

 


1 About this document

1.1 Purpose and scope

This document describes how to use STM32MP1 microprocessor unit (MPU) devices (further also referred to as Device(s)) in the context of a safety-related system, specifying the user's responsibilities for installation and operation, in order to reach the desired safety integrity level. Note that the safety concept described in this document is based on the possible implementation of safety function(s) on the Arm® Cortex®-M4 CPU (and associated peripherals) included in STM32MP1.

It is useful to system designers willing to evaluate the safety of their solution embedding one or more Device(s). For terms used, refer to the glossary at the end of the document.

Note: Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

1.2 Normative references

This document is written in compliance with the IEC 61508 international norm for functional safety of electrical, electronic and programmable electronic safety-related systems, version IEC 61508:1-7 © IEC:2010.

The other functional safety standards considered in this manual are:

ISO 13849-1:2015, ISO 13849-2:2012

IEC 62061:2005+AMD1:2012+AMD2:2015

IEC 61800-5-2:2016

The following table maps the document content with respect to the IEC 61508-2 Annex D requirements.

Table 1. Document sections versus IEC 61508-2 Annex D safety requirements

| Safety requirement | Section number |
|---|---|
| D2.1 a) a functional specification of the functions capable of being performed | 3 |
| D2.1 b) identification of the hardware and/or software configuration of the Compliant item | 3.2 |
| D2.1 c) constraints on the use of the Compliant item or assumptions on which analysis of the behavior or failure rates of the item are based | 3.2 |
| D2.2 a) the failure modes of the Compliant item due to random hardware failures, that result in a failure of the function and that are not detected by diagnostics internal to the Compliant item; | 3.7 |
| D2.2 b) for every failure mode in a), an estimated failure rate; | 3.7 |
| D2.2 c) the failure modes of the Compliant item due to random hardware failures, that result in a failure of the function and that are detected by diagnostics internal to the Compliant item; | 3.7 |
| D2.2 d) the failure modes of the diagnostics, internal to the Compliant item due to random hardware failures, that result in a failure of the diagnostics to detect failures of the function; | 3.7 |
| D2.2 e) for every failure mode in c) and d), the estimated failure rate; | 3.7 |
| D2.2 f) for every failure mode in c) that is detected by diagnostics internal to the Compliant item, the diagnostic test interval; | 3.2.2 |
| D2.2 g) for every failure mode in c) the outputs of the Compliant item initiated by the internal diagnostics; | 3.6 |
| D2.2 h) any periodic proof test and/or maintenance requirements; | 3.7 |
| D2.2 i) for those failure modes, in respect of a specified function, that are capable of being detected by external diagnostics, sufficient information must be provided to facilitate the development of an external diagnostics capability. | 3.7 |
| D2.2 j) the hardware fault tolerance; | 3 |
| D2.2 k) the classification as type A or type B of that part of the Compliant item that provides the function (see 7.4.4.1.2 and 7.4.4.1.3); | 3 |


1.3 Reference documents

[1] AN5459, FMEDA snapshots for STM32MP1 microprocessor series.

[2] AN5460, Results of FMEA on STM32MP1 Series microprocessor.


2 Device development process

The STM32 series product development process (see Figure 1), compliant with the IATF 16949 standard, is a set of interrelated activities dedicated to transforming customer specifications and market or industry domain requirements into a semiconductor device and all its associated elements (package, module, sub-system, hardware, software, and documentation), qualified with ST internal procedures and fitting ST internal or subcontracted manufacturing technologies.

Figure 1. STMicroelectronics product development process

1 Conception

· Key characteristics and requirements related to future uses of the device
· Industry domain(s), specific customer requirements and definition of controls and tests needed for compliance
· Product target specification and strategy
· Project manager appointment to drive product development
· Evaluation of the technologies, design tools and IPs to be used
· Design objective specification and product validation strategy
· Design for quality techniques (DFD, DFT, DFR, DFM, …) definition
· Architecture and positioning to make sure the software and hardware system solutions meet the target specification
· Product approval strategy and project plan

2 Design and validation

· Semiconductor design development
· Hardware development
· Software development
· Analysis of new product specification to forecast reliability performance
· Reliability plan, reliability design rules, prediction of failure rates for operating life test using Arrhenius's law and other applicable models
· Use of tools and methodologies such as APQP, DFM, DFT, DFMEA
· Detection of potential reliability issues and solution to overcome them
· Assessment of Engineering Samples (ES) to identify the main potential failure mechanisms
· Statistical analysis of electrical parameter drifts for early warning in case of fast parametric degradation (such as retention tests)
· Failure analysis on failed parts to clarify failure modes and mechanisms and identify the root causes
· Physical destructive analysis on good parts after reliability tests when required
· Electrostatic discharge (ESD) and latch-up sensitivity measurement

3 Qualification

· Successful completion of the product qualification plan
· Secure product deliveries on advanced technologies using stress methodologies to detect potential weak parts
· Successful completion of electrical characterization
· Global evaluation of new product performance to guarantee reliability of customer manufacturing process and final application of use (mission profile)
· Final disposition for product test, control and monitoring


3 Reference safety architecture

This section reports details of the STM32MP1 Series safety architecture.

3.1 Safety architecture introduction

Device(s) analyzed in this document can be used as Compliant item(s) within different safety applications.

The aim of this section is to identify such Compliant item(s), that is, to define the context of the analysis with respect to a reference concept definition. The concept definition contains reference safety requirements, including design aspects external to the defined Compliant item.

As a consequence of the Compliant item approach, the goal is to list the system-related information considered during the analysis, rather than to provide an exhaustive hazard and risk analysis of the system around the device.

3.2 Compliant item

This section defines the Compliant item term and provides information on its usage in different safety architecture schemes.

3.2.1 Definition of Compliant item

According to IEC 61508-1 clause 8.2.12, a Compliant item is any item (for example an element) on which a claim is being made with respect to the clauses of the IEC 61508 series. Any mature Compliant item must be described in a safety manual available to End user.

In this document, Compliant item is defined as a system including one or two STM32MP1 devices (see Figure 2). The communication bus is directly or indirectly connected to sensors and actuators.

Figure 2. STM32MP1 as Compliant item

[Figure: sensors (S) and actuators (A) are connected through remote controllers to the processing element. The Compliant item is the processing element built on the STM32MP1 device(s).]

In the framework of the STM32MP1 safety concept, the Compliant item is assumed to be divided into two partitions:

· Safe Partition, including all STM32MP1 logic considered as safety related and suitable for the implementation of End User safety functions. This partition also includes the STM32MP1 logic implementing (in collaboration with application software) the separation (freedom from interference) between itself and the Non-Safe Partition. The complete list of STM32MP1 internal modules belonging to the Safe Partition is reported in Section 3.7 Conditions of use. The Arm® Cortex®-M4 CPU belongs to the Safe Partition.

· Non-Safe Partition, including all STM32MP1 logic considered as non-safety related and therefore not suitable for the implementation of End User safety functions, and so out of the safety scope. The STM32MP1 modules not listed explicitly in Section 3.7 Conditions of use are considered as belonging to the Non-Safe Partition. The Arm® Cortex®-A7 CPU belongs to the Non-Safe Partition.

Accordingly, the implementation of the End User safety function is restricted to the STM32MP1 logic belonging to the Safe Partition.


 

 

Caution: According to the Non-Safe Partition definition, the implementation of safety function(s) based on the Arm® Cortex®-A7 CPU is not allowed.

Interference due to the presence in the STM32MP1 device of logic belonging to the Non-Safe Partition is managed by means of the separation concept, which is described in detail in Section 3.2.5.

Note that, because of the effective isolation between the Safe and Non-Safe Partitions, the presence of non-safety related application software running on the A7 CPU is allowed. Section 4.2 of this document provides more information on this specific case.

Other components might be related to the Compliant item, like the external HW components needed to guarantee either the functionality of the device (external memory, clock quartz and so on) or its safety (for example, the external watchdog or voltage supervisors). These components are not analyzed in this safety manual.

A defined Compliant item can be classified as an element according to IEC 61508-4, 3.4.5.

3.2.2 Safety functions performed by Compliant item

In essence, the Compliant item architecture encompasses the following processes performing the safety function or a part of it:

· input processing elements (PEi) reading safety related data from the remote controller connected to the sensor(s) and transferring them to the following computation elements

· computation processing elements (PEc) performing the algorithm required by the safety function and transferring the results to the following output elements

· output processing elements (PEo) transferring safety related data to the remote controller connected to the actuator

· computation processing elements (PEd, see Note) executing hardware and software-based safety functions devoted to:
  1. detect/prevent hardware random failures affecting all considered processing elements (PEi/PEc/PEo/PEd)
  2. prevent/mitigate systematic failures in the software executed on PEc/PEd
  3. prevent/detect interferences on the safety related hardware and software caused by the non-safety related hardware and software

· in 1oo2 architecture, potentially a further voting processing element (PEv)

· processes external to the Compliant item ensuring safety integrity, such as watchdog (WDTe) and voltage monitors (VMONe)

Note: Because most of the safety mechanisms included in STM32MP1 are software-based, the PEc and PEd hardware largely coincide.

The role of the PEv process is clarified in Section 3.2.4, while that of the WDTe and VMONe external processes is clarified in the sections where the conditions of use (CoU) (definition of safety mechanism) are detailed:

· WDTe: refer to External watchdog - CPU_SM_5 and Control flow monitoring in Application software - CPU_SM_1

· VMONe: refer to Supply voltage monitoring - VSUP_SM_1 and System-level power supply management - VSUP_SM_5

In summary, the devices support the implementation of End user safety functions consisting of three operations:

· safe acquisition of safety-related data from input peripheral(s)

· safe execution of application software program and safe computation of related data

· safe transfer of results or decisions to output peripheral(s)

Claims on the Compliant item and computation of safety metrics are done with respect to these three basic operations and to the PEd processes, which are an integral part of the STM32MP1 safety concept.

According to the definition for implemented safety functions, the Compliant item (element) can be regarded as type B (as per the IEC 61508-2, 7.4.4.1.3 definition). Despite accurate, exhaustive and detailed failure analysis, the Device has to be considered as intrinsically complex. This implies its type B classification.

Two main safety architectures are identified: 1oo1 (using one device) and 1oo2 (using two devices).

3.2.3 Reference safety architectures - 1oo1

The 1oo1 reference architecture (Figure 3) ensures safety integrity of the Compliant item through combining device internal processes (implemented safety mechanisms) with the external processes WDTe and VMONe.

The 1oo1 reference architecture targets safety integrity level 2 (SIL2).


Figure 3. 1oo1 reference architecture

[Figure: single channel Sensors, PEi, PEc, PEo, Actuators, with PEd inside the device and the external processes VMONe and WDTe.]


3.2.4 Reference safety architectures - 1oo2

The 1oo2 reference architecture (Figure 4) contains two separate channels, each implemented as a 1oo1 reference architecture ensuring safety integrity of the Compliant item through combining device internal processes (implemented safety mechanisms) with the external processes WDTe and VMONe. The overall safety integrity is then ensured by the external voter PEv, which allows claiming a hardware fault tolerance (HFT) equal to 1. Achievement of higher safety integrity levels as per IEC 61508-2 Table 3 is therefore possible. Appropriate separation between the two channels (including power supply separation) should be implemented in order to limit the impact of common-cause failures (refer to Section 4.2 Analysis of dependent failures). However, computation of the β and βD parameters is still required.

1oo2 reference architecture targets SIL3.
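The role of the external voter PEv can be illustrated with a minimal sketch. It assumes a hypothetical per-channel report (`channel_report_t`) and one possible voting policy in which either channel alone is able to force the safe state; the type names, fields and the output comparison are illustrative assumptions, not part of the STM32MP1 deliverables.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical report that each 1oo1 channel delivers to the voter (PEv). */
typedef struct {
    bool     alive;   /* heartbeat from the channel seen within its deadline */
    bool     fault;   /* channel diagnostics (PEd) reported a fault          */
    uint32_t output;  /* safety-related result computed by PEc               */
} channel_report_t;

typedef enum { VOTE_RUN, VOTE_SAFE_STATE } vote_t;

/* Assumed policy: any silent, faulty or disagreeing channel forces the
 * safe state; voted_output is written only when the vote is VOTE_RUN. */
vote_t pev_vote(const channel_report_t *a, const channel_report_t *b,
                uint32_t *voted_output)
{
    if (!a->alive || !b->alive) {
        return VOTE_SAFE_STATE;
    }
    if (a->fault || b->fault) {
        return VOTE_SAFE_STATE;
    }
    if (a->output != b->output) {
        return VOTE_SAFE_STATE;
    }
    *voted_output = a->output;
    return VOTE_RUN;
}
```

In this sketch a single failed channel cannot keep the actuators driven, which is the property behind the HFT = 1 claim; the common-cause analysis (β, βD) remains necessary because both channels could fail in the same way.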

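Figure 4. 1oo2 reference architecture

[Figure: two channels between Sensors and Actuators, each made of PEi, PEc, PEo and PEd with its own external VMONe and WDTe; the outputs of both channels reach the Actuators through the voter PEv.]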

3.2.5 The separation concept

The Safe Partition and the Non-Safe Partition are isolated by the separation concept, which is composed of two different aspects: spatial separation and temporal separation.

Spatial separation

The Safe and Non-Safe Partitions share the same device (STM32MP1), therefore interferences are possible. The protection against spatial interferences is built from two successive layers of protection:

· eTZPC protection: most of the STM32MP1 peripherals and DMA masters belonging to the Safe Partition are "isolable", so they can be allocated exclusively to Cortex®-M4 CPU domain accesses by properly setting the MCU resource isolation system of the eTZPC controller. Any access attempted by a Non-Safe Partition AXI master, such as the A7 CPU, is then detected and stopped by the eTZPC hardware module. Note that all the peripherals which are securable to the A7 CPU exclusively are excluded from the Safe Partition and the safety concept by definition.


· The local safety concept for each peripheral in the Safe Partition, which is composed of several overlapping safety mechanisms defined at application level, is able to efficiently detect unintended perturbing accesses coming from the Non-Safe Partition. All those safety mechanisms are declared as “highly recommended” in Section 3.7 Table 142. List of safety recommendations, so their presence in the final system is guaranteed. This protection is particularly valuable for the few common resources belonging to the Safe Partition which cannot be isolated by eTZPC protection, because they are fully shared with the A7 CPU (power and clock configuration interfaces).

The two layers overlap, so the second one acts as a second line of defense in case of failure of the main protection by eTZPC.

The separation concept protects the Safe Partition from unintended accesses regardless of their cause on the Non-Safe Partition side, whether random hardware failures in the hardware or systematic errors in the software.

Figure 5 represents the separation concept:

Figure 5. Spatial separation concept

[Figure: the Safe Partition (Safe software, logic implementing End User safety function(s), logic implementing the separation concept) is isolated by the separation from the Non-Safe Partition (Non-Safe software, non safety related logic).]

Note that, although the M4 isolation feature belongs primarily to STM32MP1 security, it is applied for safety purposes in this specific case.


Temporal separation

The temporal separation is required because of the specific boot structure of STM32MP1, where the A7 CPU is the key player during the boot sequence because it is in charge of loading the software image for the M4 CPU. The temporal separation guarantees that any possible interference from non-safety related hardware and/or software on the A7 CPU side during the boot sequence is detected/mitigated by adequate measures implemented on the M4 CPU side (and therefore built on safety related hardware and software). The following figure provides a graphical representation (red boxes: events on non-safety related resources, green boxes: events on safety related resources).

Figure 6. Temporal separation concept

[Figure: boot sequence along the time axis: Power up, A7 boot, A7 load image on M4, A7 releases M4 reset, M4 measures, M4 application software starts. Further labels: Interferences, Temporal separation.]

The box marked “M4 measures” includes both hardware and software measures. Hardware measures can be implemented by final system characteristics that help to keep the safe state despite issues occurring during the boot (for example, the presence of an intelligent external watchdog), while software measures can be implemented by additional checks executed on the M4 CPU just after its startup and before the execution of the end user safety application software. These measures are detailed in Section 3.7 in the form of CoU.
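As an illustration of a software measure of this kind, the sketch below checks the integrity of the image loaded by the A7 CPU before the safety application starts. The use of a CRC-32 (IEEE 802.3 polynomial) and the function names `crc32_ieee` and `m4_image_is_intact` are assumptions made for this example; the measures actually required are those stated in the CoU of Section 3.7.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
 * final XOR 0xFFFFFFFF). Slow but table-free, adequate for a startup check. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical startup check run on the M4 CPU: returns true only when the
 * loaded image matches the reference CRC stored with it at build time. */
bool m4_image_is_intact(const uint8_t *image, size_t len, uint32_t expected_crc)
{
    return crc32_ieee(image, len) == expected_crc;
}
```

If the check fails, the application would typically refuse to start the safety function and remain in (or enter) the safe state, for example by no longer servicing the external watchdog.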

Note that the authentication features implemented in the first stage boot loader (FSBL), and those that can be activated in the second stage boot loader (U-Boot), although mainly conceived for security reasons, can be considered as part of the temporal separation implementation because they protect the integrity of the software images loaded during the boot. That protection, represented in Figure 6 by the yellow boxes, can be considered valid as per the outcomes of the CoU_17 application.


3.3 Safety analysis assumptions

This section collects all assumptions made during the safety analysis of the devices.

3.3.1 Safety requirement assumptions

The concept specification, the hazard and risk analysis, the overall safety requirement specification and the consequent allocation determine the requirements for Compliant item as further listed. ASR stands for assumed safety requirements.

Caution: It is the End user’s responsibility to check the compliance of the final application with these assumptions. Furthermore, besides the recommendations included in this Safety Manual, it is also the End user’s responsibility to guarantee the compliance of the final application with the requirements of IEC 61508.

ASR1: Compliant item can be used to implement four kinds of safety function modes of operation according to IEC 61508-4, 3.5.16:

· a continuous mode (CM) or high-demand (HD) SIL3 safety function (CM3), or

· a low-demand (LD) SIL3 safety function (LD3), or

· a CM or HD SIL2 safety function (CM2), or

· a LD SIL2 safety function (LD2).

 

ASR2: Compliant item is used to implement safety function(s) allowing a specific worst-case time budget (see note below) for the STM32 MPU to detect and react to a failure. That time corresponds to the portion of the process safety time (PST) allocated to the device (STM32xx Series duty in Figure 7) in the error reaction chain at system level.

Note: The computation of the time budget mainly depends on the execution speed of the periodic tests implemented by software. Such duration might depend on the amount of hardware resources (RAM memory and peripherals) actually declared as safety-related. Further constraints and requirements from IEC 61508-2, 7.4.5.3 must be considered.

Figure 7. Allocation and target for STM32 PST

[Figure: the system-level PST is split into the STM32xx Series duty (MPU detection, FW reaction) and the End user duty (SW reaction, Actuator reaction, ….).]
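The budget allocation of ASR2 and Figure 7 amounts to two inequalities, sketched below. The structure `mpu_duty_t`, the function names and the millisecond unit are illustrative assumptions; the worst-case figures themselves must come from the End user's own timing analysis.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical worst-case figures for the device part of the reaction chain. */
typedef struct {
    uint32_t mpu_detection_ms;  /* time for the diagnostics to detect the failure */
    uint32_t fw_reaction_ms;    /* time for the firmware to react after detection */
} mpu_duty_t;

/* True when detection plus firmware reaction fits the portion of the PST
 * allocated to the device. 64-bit sum avoids wrap-around on large inputs. */
bool mpu_duty_fits_budget(const mpu_duty_t *duty, uint32_t mpu_budget_ms)
{
    uint64_t total = (uint64_t)duty->mpu_detection_ms + duty->fw_reaction_ms;
    return total <= mpu_budget_ms;
}

/* True when the device budget plus the End user part (software reaction,
 * actuator reaction, and so on) stays within the system-level PST. */
bool system_pst_is_met(uint32_t mpu_budget_ms, uint32_t end_user_duty_ms,
                       uint32_t system_pst_ms)
{
    uint64_t total = (uint64_t)mpu_budget_ms + end_user_duty_ms;
    return total <= system_pst_ms;
}
```

For example, with a 100 ms system-level PST of which 50 ms is allocated to the device, a 40 ms detection time and a 5 ms firmware reaction fit the budget, whereas declaring more RAM or peripherals as safety-related lengthens the periodic tests and may break the first inequality.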

 

 

 

ASR3: Compliant item is used to implement safety function(s) that can be continuously powered on for a period over eight hours. It is assumed to not require any proof test, and the lifetime of the product is considered to be no less than 10 years.

ASR4.1: It is assumed that End User safety functions are implemented only on Device logic belonging to the Safe Partition.

ASR4.2: It is assumed that logic and resources belonging to the Safe Partition are not used for the implementation of non-safety related functions coexisting with safety functions.

ASR4.3: It is assumed that the software functions running on the A7 CPU have been developed according to some specific quality standards.

Note: This assumption is not intended to require formal compliance of the A7 software development flow with the IEC 61508-3 model, but to guarantee a minimum quality of the software running on the Non-Safe Partition. The related CoU_17 provides more specific insights.

ASR4.4: It is assumed that only one safety function is performed or, if there are several, that all functions are classified with the same SIL and are therefore not distinguishable in terms of their safety requirements.

ASR4.5: In case of multiple safety function implementations, it is assumed that End user is responsible for duly ensuring their mutual independence.


ASR5: It is assumed that the implemented safety function(s) does (do) not depend on transition of the overall STM32MP1 or Arm® Cortex®-M4 CPU to and from a low-power or suspended state.

ASR6.1: The local safe state of Compliant item is the one in which either:

· SS1: the application software(1) is informed of the presence of a fault and a reaction by the application software(1) itself is possible.

· SS2: the application software(1) cannot be informed of the presence of a fault or the application software(1) is not able to execute a reaction.

Note: End user must consider that random hardware failures affecting the Device can compromise its operations (for example, failure modes affecting the program counter prevent the correct execution of software).

 

The following table provides details on the SS1 and SS2 safe states.

Table 2. SS1 and SS2 safe state details

| Safe state | Condition | Compliant item action | System transition to safe state - 1oo1 architecture | System transition to safe state - 1oo2 architecture |
|---|---|---|---|---|
| SS1 | The application software(1) is informed of the presence of a fault and a reaction by the application software itself is possible. | Fault reporting to application software(1) | Application software(1) drives the overall system in its safe state | Application software(1) in one of the two channels drives the overall system in its safe state |
| SS2 | The application software(1) cannot be informed of the presence of a fault or the application software is not able to execute a reaction. | Reset signal issued by WDTe | WDTe drives the overall system in its safe state (“safe shut-down”)(2) | PEv drives the overall system in its safe state |

1. Referred to application software executed on Arm® Cortex®-M4 CPU.

2. Safe state achievement intended here is compliant with the Note on IEC 61508-2, 7.4.8.1.
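The content of Table 2 can be restated as a small decision routine. This is only a sketch of the table logic; the enumerations and function names are hypothetical and introduced here for illustration.

```c
#include <stdbool.h>

typedef enum { SAFE_STATE_SS1, SAFE_STATE_SS2 } safe_state_t;
typedef enum { ARCH_1OO1, ARCH_1OO2 } architecture_t;
typedef enum { ACTOR_APPLICATION_SW, ACTOR_WDTE, ACTOR_PEV } actor_t;

/* SS1 applies only when the application software is informed of the fault
 * AND is still able to react; every other case is SS2. */
safe_state_t classify_safe_state(bool sw_informed, bool sw_can_react)
{
    return (sw_informed && sw_can_react) ? SAFE_STATE_SS1 : SAFE_STATE_SS2;
}

/* Who drives the overall system to its safe state, as listed in Table 2. */
actor_t safe_state_actor(safe_state_t state, architecture_t arch)
{
    if (state == SAFE_STATE_SS1) {
        return ACTOR_APPLICATION_SW;  /* in 1oo2: software of one of the two channels */
    }
    return (arch == ARCH_1OO1) ? ACTOR_WDTE : ACTOR_PEV;
}
```

The routine makes explicit that SS2 never relies on the application software: in 1oo1 the reset issued by WDTe is the only remaining path, and in 1oo2 it is the voter PEv.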

 

ASR6.2: It is assumed that the safe state defined at system level by End user is compatible with the assumed local safe state (SS1, SS2) for Compliant item.

ASR7: Compliant item is assumed to be analyzed according to routes 1H and 1S of IEC 61508-2.

Note: Refer to Section 3.5 Systematic safety integrity and Section 3.6 Hardware and software diagnostics.

ASR8: Compliant item is assumed to be regarded as type B, as per IEC 61508-2, 7.4.4.1.3.

ASR9: It is assumed that data exchanges between the Safe Partition and the Non-Safe Partition are implemented by using statically allocated locations in the SYSRAM bank and are restricted to non-safety related data.

ASR10.1: The STM32MP1 package is considered fully safety related, in the framework of Device failure rate computations.

ASR10.2: It is the End user's responsibility to prove the freedom from interference between safety related and non-safety related physical pins (for example, by running a pin-level FMEDA).

ASR11: It is assumed that glitches in GPIO output values with a duration lower than the adopted PST are not able to cause violations of the implemented safety function(s).

Note: This assumption can be fulfilled in two ways, either:

· Final application robustness against GPIO glitches, as required by the assumption text

or

· The execution frequency of the prescribed methods GPIO_SM_2 and GPIO_SM_0 is higher than 1/Tm (refer to the related “Recommendations and known limitations” fields for details)

ASR12: It is assumed that STM32MP1 is not used to build fail-operational solutions based exclusively on the resources of a single STM32MP1 device.
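The statically allocated exchange assumed by ASR9 can be sketched as follows. The layout, the sequence counter and the names `exchange_area_t` and `exchange_read` are assumptions made for illustration; on a real target the object would be placed at a fixed SYSRAM location (for instance through the linker script), which is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXCHANGE_WORDS 8u

/* Fixed-size exchange area: no pointers, no dynamic allocation. */
typedef struct {
    volatile uint32_t sequence;                 /* incremented by the writer */
    volatile uint32_t payload[EXCHANGE_WORDS];  /* non-safety related data   */
} exchange_area_t;

/* Ordinary global here; assumed to be mapped into SYSRAM on target. */
exchange_area_t exchange_area;

/* Copies the payload into out[] when a new, stable message is available.
 * Returns false when nothing new was written or when the area changed
 * during the copy; in that case out[] must not be used. */
bool exchange_read(const exchange_area_t *area,
                   uint32_t out[EXCHANGE_WORDS], uint32_t *last_sequence)
{
    uint32_t seq_before = area->sequence;
    if (seq_before == *last_sequence) {
        return false;
    }
    for (uint32_t i = 0u; i < EXCHANGE_WORDS; i++) {
        out[i] = area->payload[i];
    }
    if (area->sequence != seq_before) {
        return false;
    }
    *last_sequence = seq_before;
    return true;
}
```

Whatever the layout, the data received through such an area stays non-safety related: the Safe Partition software must not let it influence the safety function.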

3.4 Electrical specifications and environment limits

To ensure safety integrity, the user must operate the Device(s) within its (their) specified:

· absolute maximum rating

· capacity

· operating conditions

For electrical specifications and environmental limits of Device(s), refer to its (their) technical documentation such as datasheet(s) and reference manual(s) available on www.st.com.

3.5 Systematic safety integrity

According to the requirements of IEC 61508-2, 7.4.2.2, Route 1S is considered in the analysis of Device(s). As clearly authorized by IEC 61508-2, 7.4.6.1, STM32 MPU products can be considered as standard, mass-produced electronic integrated devices, for which stringent development procedures, rigorous testing and extensive experience of use minimize the likelihood of design faults. However, ST internally assesses the compliance of the Device development flow through techniques and measures suggested in IEC 61508-2 Annex F. A safety case database (see Section 5 List of evidences) keeps evidence of the current compliance level to the norm.

3.6 Hardware and software diagnostics

This section lists all the safety mechanisms (hardware, software and application-level) considered in the device safety analysis. It is expected that users are familiar with the architecture of the device, and that this document is used in conjunction with the related device datasheet, user manual and reference information. To avoid inconsistency and redundancy, this document does not report device functional details. In the following descriptions, the words safety mechanism, method, and requirement are used as synonyms.

As the document provides information relative to the superset of peripherals available on the devices it covers (not all devices have all peripherals), users are supposed to disregard any recommendations not applicable to their Device part number of interest.

Information provided for a function or peripheral applies to all instances of such function or peripheral on Device. Refer to its reference manual or/and datasheet for related information.

The implementation guidelines reported in the following section are for reference only. The safety verification executed by ST during the device safety analysis and related diagnostic coverage figures reported in this manual (or related documents) are based on such guidelines. For clarity, safety mechanisms are grouped by Device function.

Information is organized in form of tables, one per safety mechanism, with the following fields:

SM CODE

Unique safety mechanism code/identifier used also in FMEA document. Identifiers use the scheme mmm_SM_x where mmm is a 3- or 4-letter module (function, peripheral) short name, and x is a number. It is possible that the numbering is not sequential (although usually incremental) and/or that the module short name is different from that used in other documents.

Description

Short mnemonic description.

Ownership

ST: means that method is available on silicon.

End user: method must be implemented by End user through Application software modification, hardware solutions, or both.

Detailed implementation

Detailed implementation, sometimes including notes about the safety concept behind the introduction of the safety mechanism.

Error reporting

Describes how the fault detection is reported to application software.

Fault detection time

Time that the safety mechanism needs to detect the hardware failure.

Addressed fault model

Reports fault model(s) addressed by the diagnostic (permanent, transient, or both), and other information:
• If ranked for Fault avoidance: method contributes to lower the probability of occurrence of a failure
• If ranked for Systematic: method is conceived to mitigate systematic errors (bugs) in application software design

Dependency on Device configuration

Reports if safety mechanism implementation or characteristics change among different Device part numbers.

Initialization

Specific operation to be executed to activate the contribution of the safety mechanism.


Periodicity

Continuous: safety mechanism is active in continuous mode.

Periodic: safety mechanism is executed periodically(1).

On-demand: safety mechanism is activated in correspondence to a specified event (for instance, reception of a data message).

Startup: safety mechanism is supposed to be executed only at power-up or during off-line maintenance periods.

Test for the diagnostic

Reports specific procedure (if any and recommended) to allow on-line tests of safety mechanism efficiency.

Multiple-fault protection

Reports the safety mechanism(s) associated in order to correctly manage a multiple-fault scenario (refer to Section 4.1.3 Notes on multiple-fault scenario).

Recommendations and known limitations

Additional recommendations or limitations (if any) not reported in other fields.

1. In CM systems, a safety mechanism can be accounted for diagnostic coverage contribution only if it is executed at least once per PST. For LD and HD systems, constraints from IEC 61508-2, 7.4.5.3 must be applied.

3.6.1 Arm® Cortex®-M4 CPU

 

Table 3. CPU_SM_0

 

 

SM CODE

CPU_SM_0

 

 

Description

Periodical core self-test software for Arm® Cortex®-M4 CPU

Ownership

End user or ST

 

 

 

Detailed implementation

The software test is built around well-known techniques already addressed by IEC 61508:7, A.3.2 (Self-test by software: walking bit one-channel). To reach the required values of coverage, the self-test software is specified by means of a detailed analysis of all the CPU failure modes and related failure modes distribution.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent

 

 

Dependency on Device configuration

None

 

 

Initialization

None

 

 

Periodicity

Periodic

 

 

 

Test for the diagnostic

Self-diagnostic capabilities can be embedded in the software, according to the test implementation design strategy chosen. The adoption of checksum protection on result variables and of defensive programming is recommended.

 

 

Multiple-fault protection

CPU_SM_5: external watchdog

 

 

 

Recommendations and known limitations

This method is the main asset in STM32MP1 Series safety concept. CPU integrity is a key factor because the defined diagnostics for MPU peripherals are for the most part software-based.

Startup execution of this safety mechanism is recommended for multiple fault mitigations - refer to Section 4.1.3 Notes on multiple-fault scenario for details.

 

 


 

 

Table 4. CPU_SM_1

 

 

 

SM CODE

 

CPU_SM_1

 

 

Description

Control flow monitoring in Application software

 

 

Ownership

End user

 

 

 

Detailed implementation

A significant part of the failure distribution of CPU core for permanent faults is related to failure modes directly related to program counter loss of control or hang-up. Due to their intrinsic nature, such failure modes are not addressed by a standard software test method like CPU_SM_0. Therefore it is necessary to implement a run-time control of the Application software flow, in order to monitor and detect deviation from the expected behavior due to such faults. Linking this mechanism to watchdog firing assures that severe loss of control (or, in the worst case, a program counter hang-up) is detected.

The guidelines for the implementation of the method are the following:
• Different internal states of the Application software are well documented and described (the use of a dynamic state transition graph is encouraged).
• Monitoring of the correctness of each transition between different states of the Application software is implemented.
• Transition through all expected states during the normal Application software program loop is checked.
• A function in charge of triggering the system watchdog is implemented in order to constrain the triggering (preventing the issue of CPU reset by watchdog) also to the correct execution of the above-described method for program flow monitoring. The use of window feature available on internal window watchdog (WWDG) is recommended.
• The use of the independent watchdog (IWDG), or an external one, helps to implement a more robust control flow mechanism fed by a different clock source.

In any case, safety metrics do not depend on the kind of watchdog in use (the adoption of independent or external watchdog contributes to the mitigation of dependent failures, see Section 4.2.2 Clock).

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation. The upper bound is set by the watchdog timeout interval.

 

 

Addressed fault model

Permanent and transient

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

 

Test for the diagnostic

NA

 

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

-
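The CPU_SM_1 guidelines (checked transitions, full coverage of the expected states in each loop, watchdog refresh conditioned on both) can be illustrated with a minimal sketch. The three states, the names `flow_checkpoint()` and `flow_end_of_loop()`, and the `wdg_refresh()` stand-in are assumptions made for illustration only; a real design derives its states from its own state transition graph and calls the refresh sequence of the watchdog actually in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application states, visited in this fixed order each loop.
 * Non-trivial codes make a corrupted value unlikely to alias a valid one. */
typedef enum {
    ST_ACQUIRE = 0x3C,
    ST_COMPUTE = 0x5A,
    ST_ACTUATE = 0xA5
} app_state_t;

typedef struct {
    app_state_t current;  /* last state entered */
    uint32_t    visited;  /* checkpoints passed in the current loop */
    bool        fault;    /* latched on the first illegal transition */
} flow_monitor_t;

/* Stand-in for the real watchdog refresh (WWDG, IWDG or external device). */
static unsigned wdg_refresh_count;
static void wdg_refresh(void) { wdg_refresh_count++; }

static void flow_init(flow_monitor_t *m)
{
    m->current = ST_ACTUATE;  /* so that the only legal first step is ST_ACQUIRE */
    m->visited = 0u;
    m->fault   = false;
}

static bool transition_allowed(app_state_t from, app_state_t to)
{
    return (from == ST_ACTUATE && to == ST_ACQUIRE) ||
           (from == ST_ACQUIRE && to == ST_COMPUTE) ||
           (from == ST_COMPUTE && to == ST_ACTUATE);
}

/* Called on entry to each state of the program loop. */
static void flow_checkpoint(flow_monitor_t *m, app_state_t next)
{
    if (!transition_allowed(m->current, next)) {
        m->fault = true;
    }
    m->current = next;
    m->visited++;
}

/* Called once per loop: the watchdog is refreshed only if every expected
 * state was visited, in order, and no fault has ever been latched. */
static bool flow_end_of_loop(flow_monitor_t *m)
{
    bool ok = !m->fault && (m->visited == 3u) && (m->current == ST_ACTUATE);
    if (ok) {
        wdg_refresh();
    }
    m->visited = 0u;
    return ok;
}
```

Because the fault flag is latched, a single illegal transition stops all further refreshes, so the watchdog timeout bounds the detection time as stated in the Fault detection time field.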

 

 

 

 


 

 

Table 5. CPU_SM_2

 

 

 

SM CODE

 

CPU_SM_2

 

 

Description

Double computation in Application software

 

 

Ownership

End user

 

 

 

Detailed implementation

A timing redundancy for safety-related computation is considered to detect transient faults affecting the Arm® Cortex®-M4 CPU subparts devoted to mathematical computations and data access.

The guidelines for the implementation of the method are the following:
• The requirement needs to be applied only to safety-relevant computation, which in case of wrong result could interfere with the system safety functions. Such computation must therefore be carefully identified in the original Application software source code.
• Both mathematical operation and comparison are intended as computation.
• The redundant computation for mathematical computation is implemented by using copies of the original data for the second computation, and by using an equivalent formula if possible.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Transient

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

End user is responsible for ensuring that the optimization features of the compiler in use do not remove the timing redundancies introduced according to this condition of use.
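A minimal sketch of double computation, assuming a hypothetical safety-relevant scaling by five. The function name `scaled_value_checked()` is illustrative and not part of any ST library. The second computation uses a copy of the input and an equivalent formula; the `volatile` qualifiers are one possible way (an assumption, to be confirmed against the actual toolchain) to discourage the compiler from merging the two computations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes raw * 5 twice, with independent copies of the operand and two
 * equivalent formulas. Returns false if the results disagree. Unsigned
 * arithmetic is used so that both formulas wrap identically on overflow. */
static bool scaled_value_checked(uint32_t raw, uint32_t *out)
{
    volatile uint32_t a = raw;   /* operand for the primary computation */
    volatile uint32_t b = raw;   /* separate copy for the redundant one */

    uint32_t r1 = a * 5u;        /* primary formula */
    uint32_t t  = b;
    uint32_t r2 = (t << 2) + t;  /* equivalent formula: 4*x + x */

    if (r1 != r2) {
        return false;            /* mismatch: report according to the design */
    }
    *out = r1;
    return true;
}
```

The generated object code should still be inspected, since the effect of `volatile` and of optimization levels on the redundancy depends on the compiler.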

 

 

 

 

 

Table 6. CPU_SM_3

 

 

SM CODE

CPU_SM_3

 

 

Description

Arm® Cortex®-M4 HardFault exceptions

Ownership

ST

 

 

 

Detailed implementation

HardFault exception raise is an intrinsic safety mechanism implemented in Arm® Cortex®-M4 core, mainly dedicated to intercept systematic faults due to software limitations or error in software design (causing for example execution of undefined operations, unaligned address access). This safety mechanism is also able to detect hardware random faults inside the CPU that lead to the abnormal operations described above.

 

 

Error reporting

High-priority interrupt event

 

 

Fault detection time

Depends on implementation. Refer to functional documentation.

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

None

 

 

Periodicity

Continuous

 

 

 

Test for the diagnostic

It is possible to write a test procedure to verify the generation of the HardFault exception; anyway, given the expected minor contribution in terms of hardware random-failure detection, such implementation is not recommended.

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

Enabling related interrupt generation on the detection of errors is highly recommended.

 

 


 

 

Table 7. CPU_SM_4

 

 

 

SM CODE

 

CPU_SM_4

 

 

Description

Stack hardening for Application software

 

 

Ownership

End user

 

 

 

Detailed implementation

The stack hardening method is required to address faults (mainly transient) affecting CPU register bank. This method is based on source code modification, introducing information redundancy in register-passed information to called functions.

The guidelines for the implementation of the method are the following:
• To pass also a redundant copy of the passed parameters values (possibly inverted) and to execute a coherence check in the function.
• To pass also a redundant copy of the passed pointers and to execute a coherence check in the function.
• For parameters that are not protected by redundancy, to implement defensive programming techniques (plausibility check of passed values). For example, enumerated fields are to be checked for consistency.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

This method partially overlaps with defensive programming techniques required by IEC 61508 for software development. Therefore, in presence of Application software qualified for safety integrity greater than or equal to SC2, optimizations are possible.
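The first and third CPU_SM_4 guidelines (redundant inverted parameter copy, plausibility check of unprotected parameters) can be sketched as follows. The function `set_speed()`, the limit `MAX_RPM` and the command codes are hypothetical names chosen for illustration; pointer redundancy would follow the same pattern.

```c
#include <stdint.h>

#define MAX_RPM 6000u  /* hypothetical plausibility limit */

/* Enumerated field with non-trivial codes, checked for consistency. */
typedef enum { CMD_STOP = 0x35, CMD_RUN = 0xCA } cmd_t;

typedef enum { ARG_OK = 0, ARG_CORRUPT, ARG_IMPLAUSIBLE } arg_status_t;

/* rpm is passed together with its bitwise inverse; cmd is not duplicated
 * and is therefore covered by a plausibility check instead. */
static arg_status_t set_speed(uint32_t rpm, uint32_t rpm_inv, cmd_t cmd,
                              uint32_t *applied)
{
    if ((rpm ^ rpm_inv) != 0xFFFFFFFFu) {
        return ARG_CORRUPT;      /* value and inverted copy disagree */
    }
    if ((cmd != CMD_STOP) && (cmd != CMD_RUN)) {
        return ARG_IMPLAUSIBLE;  /* enumerated field outside its valid set */
    }
    if (rpm > MAX_RPM) {
        return ARG_IMPLAUSIBLE;  /* value outside the expected range */
    }
    *applied = (cmd == CMD_RUN) ? rpm : 0u;
    return ARG_OK;
}
```

A caller passes the inverse explicitly, for example `set_speed(rpm, ~rpm, CMD_RUN, &applied)`; as noted for CPU_SM_2, compiler optimizations that could remove the redundancy need to be checked.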

 

 

 

 

Table 8. CPU_SM_5

 

 

SM CODE

CPU_SM_5

 

 

Description

External watchdog

 

 

Ownership

End user

 

 

 

Detailed implementation

Using an external watchdog linked to control flow monitoring method (refer to CPU_SM_1) addresses failure mode of program counter or control structures of CPU.

External watchdog can be designed to be able to generate the combination of signals needed on the final system to achieve the safe state. It is recommended to carefully check the assumed requirements about system safe state reported in Section 3.3.1 Safety requirement assumptions.

It also contributes to reduce potential common cause failures, because the external watchdog is clocked and supplied independently of Device.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation (watchdog timeout interval)

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

To be defined at system level (outside the scope of Compliant item analysis)

 

 

Multiple-fault protection

CPU_SM_1: control flow monitoring in Application software


 

CPU_SM_6: MCU window watchdog

 

 

Recommendations and known limitations

In case of usage of windowed watchdog, End user must consider possible tolerance in

Application software execution, to avoid false error reports (affecting system availability).

 

 

 

 

Table 9. CPU_SM_6

 

 

SM CODE

CPU_SM_6

 

 

Description

MCU window watchdog

 

 

Ownership

ST

 

 

Detailed implementation

Using the WWDG1 watchdog linked to control flow monitoring method (refer to CPU_SM_1)

addresses failure mode of program counter or control structures of CPU.

 

 

 

Error reporting

Reset signal generation

 

 

Fault detection time

Depends on implementation (watchdog timeout interval)

 

 

Addressed fault model

Permanent

 

 

Dependency on Device configuration

None

 

 

Initialization

WWDG1 activation. It is recommended to use hardware watchdog in Option byte settings

(WWDG1 is automatically enabled after reset)

 

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

WDG_SM_1: Software test for watchdog at startup

 

 

Multiple-fault protection

CPU_SM_1: control flow monitoring in Application software

WDG_SM_0: periodical read-back of configuration registers

 

 

 

 

Recommendations and known limitations

The WWDG1 intervention is able to achieve a potentially “incomplete” local safe state because it can only guarantee that CPU is reset. There is no guarantee that Application software can still be executed to generate combinations of output signals that might be needed by the external system to achieve the final safe state.

 

 

 

 

Table 10. CPU_SM_7

 

 

 

SM CODE

 

CPU_SM_7

 

 

Description

Memory protection unit (MPU)

 

 

 

Ownership

ST

 

 

 

Detailed implementation

The CPU memory protection unit is able to detect illegal access to protected memory areas,

according to criteria set by End user.

 

 

 

Error reporting

Exception raise (MemManage)

 

 

Fault detection time

Refer to functional documentation

 

 

Addressed fault model

Systematic (software errors)

Permanent and transient (only program counter and memory access failures)

 

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

MPU registers must be programmed at start-up

 

 

Periodicity

On line

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

MPU_SM_0: Periodical read-back of configuration registers

 

 

 

Recommendations and known limitations

The use of memory partitioning and protection by MPU functions is highly recommended when multiple safety functions are implemented in Application software. The MPU can indeed be used to:
• enforce privilege rules
• separate processes
• enforce access rules

Hardware random-failure detection capability for MPU is restricted to well-selected failure modes, mainly affecting program counter and memory access CPU functions. The associated diagnostic coverage is therefore not expected to be relevant for the safety concept of Device.

Enabling related interrupt generation on the detection of errors is highly recommended.

 

 

 

 

 

 

 

Table 11. MPU_SM_0

 

 

 

 

SM CODE

 

MPU_SM_0

 

 

 

 

Description

 

Periodical read-back of MPU configuration registers

 

 

 

 

Ownership

 

End user

 

 

 

 

 

 

 

Detailed implementation

This method must be applied to MPU configuration registers (also those unused by the End user Application software).

Detailed information on the implementation of this method can be found in Section 3.6.4 EXTI controller.

 

 

 

 

Error reporting

 

Refer to NVIC_SM_0

 

 

 

 

Fault detection time

 

Refer to NVIC_SM_0

 

 

 

 

Addressed fault model

 

Refer to NVIC_SM_0

 

 

 

 

Dependency on Device configuration

 

Refer to NVIC_SM_0

 

 

 

 

Initialization

 

Refer to NVIC_SM_0

 

 

 

 

Periodicity

 

Refer to NVIC_SM_0

 

 

 

 

Test for the diagnostic

 

Refer to NVIC_SM_0

 

 

 

 

Multiple-fault protection

 

Refer to NVIC_SM_0

 

 

 

Recommendations and known limitations

Refer to NVIC_SM_0
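The periodical read-back of configuration registers referenced above (NVIC_SM_0) can be sketched as a table of expected values compared against the live registers. The type `cfg_entry_t` and the functions `cfg_record()` and `cfg_readback()` are hypothetical; the per-register mask follows the NVIC_SM_0 recommendation to exclude status bits, and the register addresses are supplied by the caller rather than taken from any device header.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    volatile const uint32_t *reg;  /* address of the configuration register */
    uint32_t mask;                 /* configuration bits only, no status bits */
    uint32_t expected;             /* golden value, already masked */
} cfg_entry_t;

/* Records the golden value; to be called after each configuration change. */
static void cfg_record(cfg_entry_t *e, volatile const uint32_t *reg,
                       uint32_t mask)
{
    e->reg      = reg;
    e->mask     = mask;
    e->expected = *reg & mask;
}

/* Periodic check: returns the index of the first mismatching entry,
 * or -1 when every register still holds its expected configuration. */
static int cfg_readback(const cfg_entry_t *tab, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((*tab[i].reg & tab[i].mask) != tab[i].expected) {
            return (int)i;
        }
    }
    return -1;
}
```

The table of golden values resides in SRAM, so its own integrity depends on the SRAM safety mechanisms described in this manual.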

 

 

 

 

 

 

 

Table 12. MPU_SM_1

 

 

 

 

SM CODE

 

 

MPU_SM_1

 

 

 

Description

 

MPU software test.

 

 

 

Ownership

 

End User.

 

 

 

 

 

Detailed implementation

This method tests MPU capability to detect and report memory accesses violating the policy enforcement implemented by the MPU itself.

The implementation is based on intentionally performing memory accesses (both write and read) to memory areas outside those allowed by the MPU region programming, and on collecting and verifying the related generated error exceptions.

Test can be executed with the final MPU region programming or with a dedicated one.

 

 

 

Error reporting

 

Depends on implementation.

 

 

 

Fault detection Time

 

Depends on implementation.

 

 

 

Addressed Fault Model

 

Permanent.

 

 

 

Dependency on device configuration

 

None.

 

 

 

Initialization

 

Depends on implementation.

 

 

 

Periodicity

 

On demand.

 

 

 

Test for the diagnostic

 

Not needed.

 

 

 

Multiple faults protection

 

CPU_SM_0: Periodical core self test software

 

 

 

 

Recommendations and known limitations

Startup execution of this safety mechanism is recommended for multiple fault mitigations - refer to Section 4.1.3 Notes on multiple-fault scenario for details.

3.6.2 Embedded SRAM1/2/3/4

 

Table 13. RAM_SM_0

 

 

SM CODE

RAM_SM_0

 

 

Description

Periodical software test for static random access memory (SRAM or RAM)

 

 

Ownership

End user or ST

 

 

 

Detailed implementation

To enhance the coverage on SRAM data cells and to ensure adequate coverage for permanent faults affecting the address decoder, it is required to execute a periodical software test on the system RAM memory. The selection of the algorithm must ensure the target SFF coverage for both the RAM cells and the address decoder. Evidence of the effectiveness of the coverage of the selected method must also be collected.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent

 

 

Dependency on Device configuration

RAM size can change according to the part number

 

 

Initialization

Depends on implementation

 

 

Periodicity

Periodic

 

 

Test for the diagnostic

Self-diagnostic capabilities can be embedded in the software, according to the test implementation design strategy chosen.

 

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

Usage of a March C- test is recommended.

Because this test can be destructive, restoration of the RAM contents must be implemented. Possible interference with interrupt service routines triggered during test execution must also be considered (such routines can access invalid RAM contents).

Unused RAM sections can be excluded from the testing, under End user responsibility regarding the actual RAM usage by the final application software.

Startup execution of this safety mechanism is recommended for multiple fault mitigations - refer to Section 4.1.3 Notes on multiple-fault scenario for details.

RAM sections hosting application software or diagnostic libraries can be excluded from the testing with this method, if correctly protected by the dedicated safety mechanism RAM_SM_5. Indeed, the diagnostic coverage granted by RAM_SM_5 makes it possible to avoid the overlap of the two methods on the same RAM area.
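A minimal sketch of a word-oriented March C- test with save and restore of the tested block, as recommended above. The function names are hypothetical, a single data background (all zeros / all ones) is used for brevity, and the sketch assumes that interrupts able to touch the block are masked by the caller and that the backup buffer lies in an area that has already been tested. Whether this variant reaches the target coverage must be verified by the End user.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
 * Destroys the contents of mem[0..n-1]. Returns false on the first mismatch. */
static bool march_c_minus(volatile uint32_t *mem, size_t n)
{
    const uint32_t p0 = 0x00000000u;
    const uint32_t p1 = 0xFFFFFFFFu;
    size_t i;

    for (i = 0; i < n; i++) { mem[i] = p0; }
    for (i = 0; i < n; i++) { if (mem[i] != p0) { return false; } mem[i] = p1; }
    for (i = 0; i < n; i++) { if (mem[i] != p1) { return false; } mem[i] = p0; }
    for (i = n; i-- > 0; )  { if (mem[i] != p0) { return false; } mem[i] = p1; }
    for (i = n; i-- > 0; )  { if (mem[i] != p1) { return false; } mem[i] = p0; }
    for (i = 0; i < n; i++) { if (mem[i] != p0) { return false; } }
    return true;
}

/* Wraps the destructive test with save and restore of the block contents. */
static bool ram_block_test(volatile uint32_t *block, uint32_t *backup, size_t n)
{
    size_t i;
    bool ok;

    for (i = 0; i < n; i++) { backup[i] = block[i]; }
    ok = march_c_minus(block, n);
    for (i = 0; i < n; i++) { block[i] = backup[i]; }
    return ok;
}
```

Testing the RAM in small blocks per call keeps the time during which interrupts are masked short, at the cost of a longer overall test period.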

 

 


 

Table 14. RAM_SM_2

 

 

SM CODE

RAM_SM_2

 

 

Description

Stack hardening for application software

 

 

Ownership

End user

 

 

 

Detailed implementation

The stack hardening method is used to enhance the application software robustness to SRAM faults that affect the address decoder. The method is based on source code modification, introducing information redundancy in the stack-passed information to the called functions.

Method contribution is relevant in case the combination between the final application software structure and the compiler settings requires a significant use of the stack for passing function parameters.

Implementation is the same as method CPU_SM_4.

 

 

Error reporting

Refer to CPU_SM_4

 

 

Fault detection time

Refer to CPU_SM_4

 

 

Addressed fault model

Refer to CPU_SM_4

 

 

Dependency on Device configuration

Refer to CPU_SM_4

 

 

Initialization

Refer to CPU_SM_4

 

 

Periodicity

Refer to CPU_SM_4

 

 

Test for the diagnostic

Refer to CPU_SM_4

 

 

Multiple-fault protection

Refer to CPU_SM_4

 

 

Recommendations and known limitations

Refer to CPU_SM_4

 

 

 

 

Table 15. RAM_SM_3

 

 

 

SM CODE

 

RAM_SM_3

 

 

Description

Information redundancy for safety-related variables in application software

 

 

Ownership

End user

 

 

 

Detailed implementation

To address transient faults affecting SRAM controller, it is required to implement information redundancy on the safety-related system variables stored in the RAM.

The guidelines for the implementation of this method are the following:
• The system variables that are safety-related (in the sense that a wrong value due to a failure in reading on the RAM affects the safety functions) are well-identified and documented.
• The arithmetic computation or decision based on such variables are executed twice and the two final results are compared.
• Safety-related variables are stored and updated in two redundant locations, and comparison is checked before consuming data.
• Enumerated fields must use non-trivial values, checked for coherence at least one time per PST.
• Data vectors stored in SRAM must be protected by an encoding checksum (such as CRC).

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

Implementation of this safety method shows a partial overlap with an already foreseen method for Arm® Cortex®-M4 (CPU_SM_1); optimizations in implementing both methods are therefore possible.
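Two of the RAM_SM_3 guidelines (storage in two redundant locations with a check before use, and enumerated fields with non-trivial values) can be sketched as follows. The names `safe_u32_t`, `safe_write()`, `safe_read()` and `op_mode_t` are hypothetical, and storing the second copy bitwise inverted is an assumption of this sketch rather than a requirement of the method.

```c
#include <stdbool.h>
#include <stdint.h>

/* Safety-related variable kept in two locations, the second one inverted. */
typedef struct {
    uint32_t value;
    uint32_t inverse;
} safe_u32_t;

/* Both copies are updated together on every write. */
static void safe_write(safe_u32_t *v, uint32_t x)
{
    v->value   = x;
    v->inverse = ~x;
}

/* The copies are compared before the data is consumed. */
static bool safe_read(const safe_u32_t *v, uint32_t *out)
{
    if ((v->value ^ v->inverse) != 0xFFFFFFFFu) {
        return false;  /* copies disagree: do not consume the data */
    }
    *out = v->value;
    return true;
}

/* Enumerated field with non-trivial codes: no valid code is 0 or 1, and the
 * two codes differ in many bit positions. */
typedef enum { MODE_SAFE = 0x5AA5, MODE_RUN = 0xA55A } op_mode_t;

static bool mode_is_valid(op_mode_t m)
{
    return (m == MODE_SAFE) || (m == MODE_RUN);
}
```

Placing the two copies in different SRAM banks, where the memory map allows it, further reduces the chance that one fault affects both.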

 

Table 16. RAM_SM_4

 

 

SM CODE

RAM_SM_4

 

 

Description

Control flow monitoring in application software

 

 

Ownership

End user

 

 

 

Detailed implementation

Because end user application software is executed from SRAM, permanent and transient faults affecting the memory (cells and address decoder) can interfere with the program execution. To address such failures, this method must be implemented.

For more details on the implementation, refer to the description of CPU_SM_1.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation. The upper bound is set by the watchdog timeout interval.

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

NA

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

CPU_SM_1 correct implementation supersedes this requirement

 

 

 

Table 17. RAM_SM_5

 

 

SM CODE

RAM_SM_5

Description

Periodical integrity test for application software in RAM

 

 

Ownership

End user

 

 

 

Detailed implementation

Because application software and diagnostic libraries are executed in RAM, it is necessary to protect the integrity of the code itself against soft-error corruptions and related code mutations. This method must check the integrity of the stored code by checksum computation techniques, on a periodic basis (at least once per PST), or another timing constraint; refer to (1) in Section 3.6 Hardware and software diagnostics. RAM memory cell contents are then checked versus the expected values, using signature-based techniques. According to IEC 61508:2 Table A.5, the effective diagnostic coverage of such techniques depends on the width of the signature in relation to the block length of the information to be protected - therefore the signature computation method must be carefully selected. Note that the simple signature method (IEC 61508:7 - A.4.2 Modified checksum) is inadequate as it only achieves a low value of coverage. The use of internal hardware CRC module is therefore recommended.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Periodic

 

 

Test for the diagnostic

Self-diagnostic capabilities can be embedded in the software, according to the test implementation design strategy chosen.

 

 

 

Multiple-fault protection

CPU_SM_0: periodical core self test software

CPU_SM_1: control flow monitoring in application software

 

 

 

Recommendations and known limitations

Refer to RAM_SM_0 description for information about the management of the overlap on

RAM between that method and this one.
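The signature check of RAM_SM_5 can be sketched with a software CRC-32 compared against a golden value computed when the code image is loaded. This is only a stand-in: the manual recommends the internal hardware CRC module, whose default configuration does not necessarily produce the same value as the reflected CRC-32 (polynomial 0xEDB88320) used here. The function names and the way the golden value is obtained are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320, initial value and final
 * XOR 0xFFFFFFFF). Slow but table-free, so it adds no data to be protected. */
static uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, zero otherwise. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Periodic integrity check of a code region held in RAM. */
static bool code_region_intact(const uint8_t *start, size_t len,
                               uint32_t golden)
{
    return crc32_compute(start, len) == golden;
}
```

For large code images the computation can be split across several calls so that the whole region is still covered at least once per PST.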

 

 

 

 

Table 18. RAM_SM_9

 

 

SM CODE

RAM_SM_9

 

 

Description

SRAM static data encapsulation

 

 

Ownership

End user

 

 

 

Detailed implementation

If static data are stored in SRAM, encapsulation by a checksum field with encoding capability (such as CRC) must be implemented.

Checksum validity is checked by application software before consuming the static data.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self test software

 

 

Recommendations and known limitations

None

 

 

3.6.3 System bus architecture/peripherals interconnect matrix

 

Table 19. BUS_SM_0

 

 

SM CODE

BUS_SM_0

 

 

Description

Periodical software test for interconnections

 

 

Ownership

End user

 

 

 

Detailed implementation

The intra-chip connection resources (main AHB interconnection matrix, AHB or APB bridges) need to be periodically tested for permanent faults detection. Note that STM32MP1 Series devices have no hardware safety mechanism to protect these structures. The test executes a connectivity test of these shared resources, including the testing of the arbitration mechanisms between peripherals.

According to IEC 61508:2 Table A.8, A.7.4 the method is considered able to achieve high levels of coverage.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Periodic

 

 

Test for the diagnostic

Not needed

 

 


Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

Implementation can be considered in large part as overlapping with the widely used Periodical

read-back of configuration registers required for several peripherals

 

 

 

 

Table 20. BUS_SM_1

 

 

SM CODE

BUS_SM_1

Description

Information redundancy in intra-chip data exchanges

 

 

Ownership

End user

 

 

 

Detailed implementation

This method requires adding some kind of redundancy (for example a CRC checksum at packet level) to each data message exchanged inside Device.

Message integrity is verified using the checksum by the application software, before consuming data.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and Transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

Implementation can be in large part overlapping with other safety mechanisms requiring information redundancy on data messages for communication peripherals. Optimizations are therefore possible.

 

 

 

Table 21. LOCK_SM_0

 

 

SM CODE

LOCK_SM_0

 

 

Description

Lock mechanism for configuration options

 

 

Ownership

ST

 

 

 

Detailed implementation

The STM32MP1 Series devices feature spread protection to prevent unintended configuration changes for some peripherals and system registers (for example PVD lock enable bit, timers); the spread protection detects systematic faults in software application. The use of this method is encouraged to enhance the end application robustness to systematic faults.

 

 

Error reporting

Not generated (when locked, register overwrites are just ignored)

 

 

Fault detection time

NA

 

 

Addressed fault model

None (Systematic only)

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

Not needed

 

 

Recommendations and known limitations

No DC associated because this test addresses software systematic faults

 

 


3.6.4 EXTI controller

 

 

 

 

 

Table 22. NVIC_SM_0

 

 

 

 

 

SM CODE

 

NVIC_SM_0

 

 

 

Description

 

Periodical read-back of configuration registers

 

 

 

Ownership

 

End user

 

 

 

 

 

Detailed implementation

This test is implemented by executing a periodical check of the configuration registers for a system peripheral against its expected value. Expected values are previously stored in RAM and adequately updated after each configuration change. The method mainly addresses transient faults affecting the configuration registers, by detecting bit flips in the registers contents. It addresses also permanent faults on registers because it is executed at least one time within PST (or another timing constraint; refer to (1) in Section 3.6 Hardware and software diagnostics) after a peripheral update.

Method must be implemented to any configuration register whose contents are able to interfere with NVIC or EXTI behavior in case of incorrect settings. Check includes NVIC vector table.

According to the state-of-the-art automotive safety standard ISO 26262, this method can achieve high levels of diagnostic coverage (DC) (refer to ISO 26262:5, Table D.4).

An alternative valid implementation requiring less space in SRAM can be realized on the basis of signature concept:
• Peripheral registers to be checked are read in a row, computing a CRC checksum (use of hardware CRC is encouraged)
• Obtained signature is compared with the golden value (computed in the same way after each register update, and stored in SRAM)
• Coherence between signatures is checked by the application software; signature mismatch is considered as failure detection

 

 

 

Error reporting

 

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent/transient

 

 

 

Dependency on Device configuration

None

 

 

 

 

Initialization

 

Values of configuration registers must be read after the boot before executing the first check

 

 

 

Periodicity

 

Periodic

 

 

Test for the diagnostic

Not required

 

 

Multiple-fault protection

CPU_SM_0: Periodic core self-test software

 

 

 

 

 

Recommendations and known limitations

This method addresses only failures affecting configuration registers, and not peripheral core logic or external interfaces.

Attention must be paid to registers containing a mix of configuration and status bits. A mask must be applied before saving register contents or computing the signature, and in the related checks, to avoid false positive detections.
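The signature variant of NVIC_SM_0 can be sketched as follows. This is a minimal, host-runnable illustration and not an ST reference implementation: the `cfg_reg_t` descriptor, the function names, and the software CRC-32 (used here as a stand-in for the hardware CRC unit) are all assumptions introduced for the example. Each monitored register carries a mask that keeps only its configuration bits, as recommended above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: one monitored register plus the mask that
 * keeps only its configuration bits (status bits are masked out). */
typedef struct {
    const volatile uint32_t *reg;
    uint32_t cfg_mask;
} cfg_reg_t;

/* Bitwise CRC-32 step (reflected polynomial 0xEDB88320) over one 32-bit
 * word. Software stand-in for the hardware CRC unit. */
static uint32_t crc32_word(uint32_t crc, uint32_t word)
{
    crc ^= word;
    for (int bit = 0; bit < 32; bit++) {
        crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
    }
    return crc;
}

/* Read all monitored registers in a row and compute the signature.
 * Call it after each configuration update to obtain the golden value. */
uint32_t cfg_signature(const cfg_reg_t *regs, size_t count)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(regs != NULL);
    for (size_t i = 0; i < count; i++) {
        crc = crc32_word(crc, *regs[i].reg & regs[i].cfg_mask);
    }
    return ~crc;
}

/* Periodic check: returns 1 when the current signature matches the
 * golden value, 0 on mismatch (failure detected). */
int cfg_check(const cfg_reg_t *regs, size_t count, uint32_t golden)
{
    return cfg_signature(regs, count) == golden;
}
```

In an application, `cfg_check()` would be called at least once per PST, and the reaction to a mismatch (for example entering the safe state) remains implementation dependent.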

 

 

 

 


 

 

Table 23. NVIC_SM_1

 

 

 

SM CODE

 

NVIC_SM_1

 

 

Description

Expected and unexpected interrupt check

 

 

Ownership

End user

 

 

 

Detailed implementation

According to IEC 61508:2 Table A.1 recommendations, a diagnostic measure for continuous, absence or cross-over of interrupt must be implemented. The method of expected and unexpected interrupt check is implemented at application software level.

The guidelines for the implementation of the method are the following:

• The interrupts implemented on the MPU are well documented, also reporting, when possible, the expected frequency of each request (for example, the interrupts related to ADC conversion completion that come on a regular basis).

• Individual counters are maintained for each interrupt request served, in order to detect in a given time frame the cases of a) no interrupt at all b) too many interrupt requests (“babbling idiot” interrupt source). The control of the time frame duration must be regulated according to the individual interrupt expected frequency.

• Interrupt vectors related to unused interrupt sources point to a default handler that reports, in case of triggering, a faulty condition (unexpected interrupt).

• In case an interrupt service routine is shared between different sources, a plausibility check on the caller identity is implemented.

• The interrupt generation capability of each used peripheral must be periodically checked, at least once per PST (or timing requirement as per (1)). In case no periodic interrupts are generated during normal operations (for example, a peripheral raising an interrupt only in case of incoming data), the interrupt generation capability must be verified by other means. For instance, a pair of input/output GPIO lines can be used to generate a GPIO event interrupt.

• Interrupt requests related to non-safety-related peripherals are handled with the same method described here, regardless of the safety classification of their originator.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent/transient

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

Not required

 

 

Multiple-fault protection

CPU_SM_0: Periodic core self-test software

 

 

Recommendations and known limitations

To decrease the complexity of the method implementation, it is suggested to use polling techniques (when possible) instead of interrupts in the end system implementation.
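The counter-based guideline of NVIC_SM_1 can be sketched as follows. This is an illustrative, host-runnable example under stated assumptions: the `irq_monitor_*` names, the `IRQ_COUNT` value, and the per-window limits are hypothetical, and a real implementation must additionally protect the counter against concurrent access between the interrupt handler and the checking task.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_COUNT 4u   /* hypothetical number of monitored sources */

typedef enum {
    IRQ_OK = 0,
    IRQ_MISSING,     /* fewer requests than expected in the time frame */
    IRQ_BABBLING,    /* too many requests ("babbling idiot" source)    */
    IRQ_UNEXPECTED   /* request from a source declared as unused       */
} irq_status_t;

typedef struct {
    uint32_t min_per_window;   /* lower bound per time frame            */
    uint32_t max_per_window;   /* upper bound; 0 marks an unused source */
    volatile uint32_t count;   /* requests served in the current frame  */
} irq_monitor_t;

static irq_monitor_t monitors[IRQ_COUNT];

/* Declare the expected request rate of one interrupt source. */
void irq_monitor_configure(uint32_t irq, uint32_t min, uint32_t max)
{
    assert(irq < IRQ_COUNT);
    monitors[irq].min_per_window = min;
    monitors[irq].max_per_window = max;
    monitors[irq].count = 0u;
}

/* Called at the start of every handler, including the default handler
 * that serves unused vectors. */
void irq_monitor_record(uint32_t irq)
{
    assert(irq < IRQ_COUNT);
    monitors[irq].count++;
}

/* Called once per time frame: evaluates the counter, then resets it. */
irq_status_t irq_monitor_check(uint32_t irq)
{
    uint32_t n;

    assert(irq < IRQ_COUNT);
    n = monitors[irq].count;
    monitors[irq].count = 0u;

    if (monitors[irq].max_per_window == 0u) {
        return (n == 0u) ? IRQ_OK : IRQ_UNEXPECTED;
    }
    if (n < monitors[irq].min_per_window) {
        return IRQ_MISSING;
    }
    if (n > monitors[irq].max_per_window) {
        return IRQ_BABBLING;
    }
    return IRQ_OK;
}
```

The time frame used to call `irq_monitor_check()` would be chosen per source, according to its expected request frequency.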

 

 

 

 

3.6.5 Direct memory access controller (DMA/DMAMUX)

 

Table 24. DMA_SM_0

 

 

SM CODE

DMA_SM_0

 

 

Description

Periodical read-back of configuration registers

 

 

Ownership

End user

 

 

 

Detailed implementation

This method must be applied to the DMA configuration registers and to the channel address registers as well.

Detailed information on the implementation of this method can be found in Section 3.6.4 EXTI controller.

 

 

Error reporting

Refer to NVIC_SM_0

 

 

Fault detection time

Refer to NVIC_SM_0

 

 


Addressed fault model

Refer to NVIC_SM_0

 

 

Dependency on Device configuration

Refer to NVIC_SM_0

 

 

Initialization

Refer to NVIC_SM_0

 

 

Periodicity

Refer to NVIC_SM_0

 

 

Test for the diagnostic

Refer to NVIC_SM_0

 

 

Multiple-fault protection

Refer to NVIC_SM_0

 

 

Recommendations and known limitations

Refer to NVIC_SM_0

 

 

 

Table 25. DMA_SM_1

 

 

SM CODE

DMA_SM_1

Description

Information redundancy on data packet transferred via DMA

 

 

Ownership

End user

 

 

 

Detailed implementation

This method is implemented by adding to data packets transferred by DMA a redundancy check (such as a CRC check, or a similar one) with encoding capability. Full data packet redundancy would be overkill.

The checksum encoding capability must be robust enough to guarantee at least 90% probability of detection for a single bit flip in the data packet.

Consistency of the data packet must be checked by the application software before consuming data.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

As an example regarding checksum encoding capability, using just a bit-by-bit addition is inappropriate.
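A possible shape for DMA_SM_1 is sketched below. It is only an illustration under stated assumptions: the `dma_packet_t` layout, the `PAYLOAD_LEN` value, and the function names are hypothetical, and CRC-16/CCITT-FALSE is chosen here merely as one example of a checksum that detects every single bit flip in the payload. The producer seals the packet before the DMA transfer and the consumer validates it before using the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_LEN 16u   /* hypothetical payload size in bytes */

/* Hypothetical packet layout: payload followed by its checksum. */
typedef struct {
    uint8_t  payload[PAYLOAD_LEN];
    uint16_t crc;
} dma_packet_t;

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no reflection, no final XOR. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Producer side: compute the checksum before starting the transfer. */
void dma_packet_seal(dma_packet_t *p)
{
    assert(p != NULL);
    p->crc = crc16_ccitt(p->payload, PAYLOAD_LEN);
}

/* Consumer side: returns 1 if the packet is consistent, 0 otherwise.
 * Must be called before the payload is consumed. */
int dma_packet_is_valid(const dma_packet_t *p)
{
    assert(p != NULL);
    return crc16_ccitt(p->payload, PAYLOAD_LEN) == p->crc;
}
```

On the target, the hardware CRC unit could replace the software loop; the check before consumption stays the same.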

 

 

 

 

 

Table 26. DMA_SM_2

 

 

 

SM CODE

 

DMA_SM_2

 

 

Description

Information redundancy by including sender or receiver identifier on data packet transferred via DMA

 

 

 

Ownership

End user

 

 

 

Detailed implementation

This method helps to identify inside the device the source and the originator of the message exchanged by DMA.

Implementation is realized by adding an additional field to the protected message, with a coding convention for message type identification fixed at Device level. Guidelines for the identification fields are:

• The identification field value must be different for each possible sender/receiver pair on DMA transactions

• The values chosen must be enumerated and non-trivial

• Coherence between the identification field value and the message type is checked by application software before consuming data.

This method, when implemented in combination with DMA_SM_4, makes available a kind of “virtual channel” between source and destination entities.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

On demand

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

Recommendations and known limitations

None
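The identification field of DMA_SM_2 can be illustrated with the sketch below. Everything in it is an assumption made for the example: the channel names, the identifier values, the `dma_msg_t` layout, and the function names are hypothetical. The identifiers are enumerated and avoid trivial values such as 0x00 and 0xFF, so that a cleared or erased buffer is never accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identifiers, one per sender/receiver pair. */
typedef enum {
    CH_ID_ADC_TO_CTRL = 0x5A,
    CH_ID_UART_TO_LOG = 0xA5,
    CH_ID_SPI_TO_CTRL = 0x3C
} channel_id_t;

/* Hypothetical message layout: identification field plus payload. */
typedef struct {
    uint8_t id;
    uint8_t data[8];
} dma_msg_t;

/* Sender side: fill the payload and tag the message with its channel. */
void dma_msg_tag(dma_msg_t *m, channel_id_t id, const uint8_t *data)
{
    assert(m != NULL && data != NULL);
    m->id = (uint8_t)id;
    memcpy(m->data, data, sizeof m->data);
}

/* Receiver side: returns 1 only if the message carries the identifier
 * expected on this channel. Must be called before consuming data. */
int dma_msg_accept(const dma_msg_t *m, channel_id_t expected)
{
    assert(m != NULL);
    return m->id == (uint8_t)expected;
}
```

A message routed to the wrong consumer, or a buffer that was never written by the DMA, is rejected because its identifier does not match the expected value.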

 

 

 

 

Table 27. DMA_SM_3

 

 

 

SM CODE

 

DMA_SM_3

 

 

Description

Periodical software test for DMA

 

 

Ownership

End user

 

 

 

Detailed implementation

This method requires the periodical testing of the DMA basic functionality, implemented through a deterministic transfer of a data packet from one source to another (for example from memory to memory) and the checking of the correct transfer of the message on the target. Data packets are composed of non-trivial patterns (avoid the use of 0x0000, 0xFFFF values) and organized in order to allow the detection, during the check, of the following failures:

• incomplete packet transfer

• errors in a single transferred word

• wrong order in the transmitted packet data

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent

 

 

 

Dependency on Device configuration

None

 

 

 

Initialization

Depends on implementation

 

 

Periodicity

Periodic

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

None
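The check performed by DMA_SM_3 can be sketched as follows. This is a host-runnable illustration, not a driver: the `dma_copy_fn` callback stands in for a real memory-to-memory DMA transfer, and all names, the packet size, and the pattern formula are assumptions. The pattern is position dependent and non-trivial, and the destination is preloaded with the inverted pattern so that an incomplete transfer leaves detectable stale data. The deliberately faulty stand-in transfers exist only to show which failures the check catches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEST_WORDS 8u   /* hypothetical test packet size in words */

/* Stand-in for a memory-to-memory DMA transfer (assumption). */
typedef void (*dma_copy_fn)(uint32_t *dst, const uint32_t *src, size_t words);

/* Position-dependent, non-trivial pattern: every word is different. */
static uint32_t pattern(size_t i)
{
    return 0xA55A0000u + 0x1111u * (uint32_t)(i + 1u);
}

/* Returns 1 if the whole packet arrived intact and in order, else 0. */
int dma_self_test(dma_copy_fn copy)
{
    uint32_t src[TEST_WORDS];
    uint32_t dst[TEST_WORDS];

    assert(copy != NULL);
    for (size_t i = 0; i < TEST_WORDS; i++) {
        src[i] = pattern(i);
        dst[i] = ~pattern(i);   /* stale data differs from expected */
    }
    copy(dst, src, TEST_WORDS);
    for (size_t i = 0; i < TEST_WORDS; i++) {
        if (dst[i] != pattern(i)) {
            return 0;
        }
    }
    return 1;
}

/* Fault-free stand-in transfer. */
void copy_ok(uint32_t *dst, const uint32_t *src, size_t words)
{
    memcpy(dst, src, words * sizeof *src);
}

/* Faulty stand-in: last word is never transferred. */
void copy_truncated(uint32_t *dst, const uint32_t *src, size_t words)
{
    memcpy(dst, src, (words - 1u) * sizeof *src);
}

/* Faulty stand-in: first two words arrive in the wrong order. */
void copy_swapped(uint32_t *dst, const uint32_t *src, size_t words)
{
    memcpy(dst, src, words * sizeof *src);
    dst[0] = src[1];
    dst[1] = src[0];
}

/* Faulty stand-in: one bit is corrupted in a single word. */
void copy_bitflip(uint32_t *dst, const uint32_t *src, size_t words)
{
    memcpy(dst, src, words * sizeof *src);
    dst[words / 2u] ^= 1u;
}
```

On the target, `copy_ok` would be replaced by a routine that programs a DMA channel and waits for completion; the comparison logic is unchanged.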

 

 

 

 

 

Table 28. DMA_SM_4

 

 

SM CODE

DMA_SM_4

 

 

Description

DMA transaction awareness

 

 

Ownership

End user

 

 

 

Detailed implementation

DMA transactions are non-deterministic by nature, because they are typically driven by external events such as the reception of communication messages. Nevertheless, well-designed safety systems should keep as much control as possible of events; refer for instance to IEC 61508:3 Table 2 item 13 requirements for software architecture.

This method is based on system knowledge of the frequency and type of expected DMA transactions. For instance, an externally connected sensor is supposed to periodically send some messages to an STM32 peripheral. Monitoring DMA transactions with a dedicated state machine makes it possible to detect missing or unexpected DMA activities.

 

 

Error reporting

Depends on implementation

 

 

Fault detection time

Depends on implementation

 

 

Addressed fault model

Permanent and transient

 

 

Dependency on Device configuration

None

 

 

Initialization

Depends on implementation

 

 

Periodicity

Continuous

 

 

Test for the diagnostic

Not needed

 

 

Multiple-fault protection

CPU_SM_0: periodical core self-test software

 

 

 

Recommendations and known limitations

Because DMA transaction termination is often linked to an interrupt generation, implementation of this method can be merged with the safety mechanism NVIC_SM_1: expected and unexpected interrupt check.
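The dedicated state machine mentioned for DMA_SM_4 could look like the following sketch. It is an assumption-based illustration: the `dma_watch_*` names, the three states, and the tick-based deadline are hypothetical. The application arms the monitor when a transaction is expected; a completion outside that window is reported as unexpected, and a window that expires without completion is reported as missing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    DMA_WATCH_IDLE = 0,    /* no transaction expected             */
    DMA_WATCH_EXPECTING,   /* a transaction must complete in time */
    DMA_WATCH_FAULT        /* missing or unexpected DMA activity  */
} dma_watch_state_t;

typedef struct {
    dma_watch_state_t state;
    uint32_t deadline_ticks;   /* allowed ticks until completion */
    uint32_t elapsed_ticks;
} dma_watch_t;

void dma_watch_init(dma_watch_t *w, uint32_t deadline_ticks)
{
    assert(w != NULL);
    w->state = DMA_WATCH_IDLE;
    w->deadline_ticks = deadline_ticks;
    w->elapsed_ticks = 0u;
}

/* Called by the application when a transaction is expected. */
void dma_watch_expect(dma_watch_t *w)
{
    assert(w != NULL);
    if (w->state == DMA_WATCH_IDLE) {
        w->state = DMA_WATCH_EXPECTING;
        w->elapsed_ticks = 0u;
    }
}

/* Called from the transfer-complete handler. */
void dma_watch_on_complete(dma_watch_t *w)
{
    assert(w != NULL);
    if (w->state == DMA_WATCH_EXPECTING) {
        w->state = DMA_WATCH_IDLE;
    } else {
        w->state = DMA_WATCH_FAULT;   /* unexpected DMA activity */
    }
}

/* Called from a periodic time base. */
void dma_watch_tick(dma_watch_t *w)
{
    assert(w != NULL);
    if (w->state == DMA_WATCH_EXPECTING) {
        w->elapsed_ticks++;
        if (w->elapsed_ticks > w->deadline_ticks) {
            w->state = DMA_WATCH_FAULT;   /* missing DMA activity */
        }
    }
}
```

Because `dma_watch_on_complete()` is typically driven by the transfer-complete interrupt, the same hook can also feed the counters used for NVIC_SM_1.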

 

 

3.6.6 Universal synchronous/asynchronous and low-power universal asynchronous receiver/transmitter (USART2/3/6, UART4/5/7/8)

 

Table 29. UART_SM_0

 

 

SM CODE

UART_SM_0

 

 

Description

Periodical read-back of configuration registers

 

 

Ownership

End user

 

 

 

Detailed implementation

This method must be applied to UART configuration registers.

Detailed information on the implementation of this method can be found in Section 3.6.4 EXTI controller.

 

 

Error reporting

Refer to NVIC_SM_0

 

 

Fault detection time

Refer to NVIC_SM_0

 

 

Addressed fault model

Refer to NVIC_SM_0

 

 

Dependency on Device configuration

Refer to NVIC_SM_0

 

 

Initialization

Refer to NVIC_SM_0

 

 

Periodicity

Refer to NVIC_SM_0

 

 

Test for the diagnostic

Refer to NVIC_SM_0

 

 

Multiple-fault protection

Refer to NVIC_SM_0

 

 

Recommendations and known limitations

Refer to NVIC_SM_0

 

 

 

Table 30. UART_SM_1

 

 

SM CODE

UART_SM_1

 

 

Description

Protocol error signals

 

 

Ownership

ST

 

 

 

Detailed implementation

The USART communication module embeds protocol error checks (such as additional parity bit check, overrun, frame error) conceived to detect network-related abnormal conditions. These mechanisms are nevertheless able to detect a marginal percentage of hardware random failures affecting the module itself.
