STMicroelectronics STM32MP1 User Manual

UM2714
User manual
STM32MP1 Series safety manual
Introduction
This document must be read along with the technical documentation such as reference manual(s) and datasheets for the STM32MP1 Series microprocessor devices, available on www.st.com.
It describes how to use the devices in the context of a safety-related system, specifying the user's responsibilities for installation and operation in order to reach the targeted safety integrity level. It also pertains to the X-CUBE-STL software product. The
safety concept described in this manual is based on the possible implementation of safety function(s) on the Arm® Cortex®-M4 CPU (and associated peripherals) included in STM32MP1.
It provides the essential information pertaining to the applicable functional safety standards, which allows system designers to avoid going into unnecessary details.
The document is written in compliance with IEC 61508, and it provides information relative to other functional safety standards.
The safety analysis in this manual takes into account the device variation in terms of memory size, available peripherals, and package.
UM2714 - Rev 2 - October 2020 For further information contact your local STMicroelectronics sales office.
www.st.com

1 About this document

1.1 Purpose and scope

This document describes how to use STM32MP1 microprocessor unit (MPU) devices (further also referred to as Device(s)) in the context of a safetyrelated system, specifying the user's responsibilities for installation and operation, in order to reach the desired safety integrity level. Note that the safety concept described in this
document is based on the possible implementation of safety function(s) on the Arm® Cortex®-M4 CPU (and associated peripherals) included in STM32MP1.
It is useful to system designers willing to evaluate the safety of their solution embedding one or more Device(s).
For terms used, refer to the glossary at the end of the document.
Note: Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

1.2 Normative references

This document is written in compliance with the IEC 61508 international norm for functional safety of electrical, electronic and programmable electronic safety-related systems, version IEC 61508:1-7 © IEC:2010.
The other functional safety standards considered in this manual are:
ISO 13849-1:2015, ISO13849-2:2012
IEC 62061:2005+AMD1:2012+AMD2:2015
IEC 61800-5-2:2016
The following table maps the document content with respect to the IEC 61508-2 Annex D requirements.
UM2714
About this document
Table 1. Document sections versus IEC 61508-2 Annex D safety requirements
Safety requirement Section number
D2.1 a) a functional specification of the functions capable of being performed 3
D2.1 b) identification of the hardware and/or software configuration of the Compliant item 3.2
D2.1 c) constraints on the use of the Compliant item or assumptions on which analysis of the behavior or failure rates of the item are based
D2.2 a) the failure modes of the Compliant item due to random hardware failures, that result in a failure of the function and that are not detected by diagnostics internal to the Compliant item;
D2.2 b) for every failure mode in a), an estimated failure rate;
D2.2 c) the failure modes of the Compliant item due to random hardware failures, that result in a failure of the function and that are detected by diagnostics internal to the Compliant item;
D2.2 d) the failure modes of the diagnostics, internal to the Compliant item due to random hardware failures, that result in a failure of the diagnostics to detect failures of the function;
D2.2 e) for every failure mode in c) and d), the estimated failure rate;
D2.2 f) for every failure mode in c) that is detected by diagnostics internal to the Compliant item, the diagnostic test interval;
D2.2 g) for every failure mode in c) the outputs of the Compliant item initiated by the internal diagnostics; 3.6
D2.2 h) any periodic proof test and/or maintenance requirements;
D2.2 i) for those failure modes, in respect of a specified function, that are capable of being detected by external diagnostics, sufficient information must be provided to facilitate the development of an external diagnostics capability.
D2.2 j) the hardware fault tolerance;
D2.2 k) the classification as type A or type B of that part of the Compliant item that provides the function (see 7.4.4.1.2 and 7.4.4.1.3);
3.2
3.7
3.2.2
3.7
3
UM2714 - Rev 2
page 2/114

1.3 Reference documents

[1] AN5459, FMEDA snapshots for STM32MP1 microprocessor series.
[2] AN5460, Results of FMEA on STM32MP1 Series microprocessor.
UM2714
Reference documents
UM2714 - Rev 2
page 3/114

2 Device development process

STM32 series product development process (see Figure 1), compliant with the IATF 16949 standard, is a set of interrelated activities dedicated to transform customer specification and market or industry domain requirements into a semiconductor device and all its associated elements (package, module, sub-system, hardware, software, and documentation), qualified with ST internal procedures and fitting ST internal or subcontracted manufacturing technologies.
Figure 1. STMicroelectronics product development process
UM2714
Device development process
1 Conception
·Key characteristics and requirements related to future uses of the device
·Industry domain(s), specific customer requirements and definition of controls and tests needed for compliance
·Product target specification and strategy
·Project manager appointment to drive product development
·Evaluation of the technologies, design tools and IPs to be used
·Design objective specification and product validation strategy
·Design for quality techniques (DFD, DFT, DFR, DFM, …) definition
·Architecture and positioning to make sure the software and hardware system solutions meet the target specification
·Product approval strategy and project plan
2 Design and
validation
·Semiconductor design development
·Hardware development
·Software development
·Analysis of new product
specification to forecast reliability performance
·Reliability plan, reliability design rules, prediction of failure rates for operating life test using Arrhenius’s law and other applicable models
·Use of tools and methodologies such as APQP, DFM, DFT, DFMEA
·Detection of potential reliability issues and solution to overcome them
·Assessment of Engineering Samples (ES) to identify the main potential failure mechanisms
·Statistical analysis of electrical parameter drifts for early warning in case of fast parametric degradation (such as retention tests)
·Failure analysis on failed parts to clarify failure modes and mechanisms and identify the root causes
·Physical destructive analysis on good parts after reliability tests when required
·Electrostatic discharge (ESD) and latch-up sensitivity measurement
3 Qualification
·Successful completion of the product qualification plan
·Secure product deliveries on advanced technologies using stress methodologies to detect potential weak parts
·Successful completion of electrical characterization
·Global evaluation of new product performance to guarantee reliability of customer manufacturing process and final application of use (mission profile)
·Final disposition for product test, control and monitoring
UM2714 - Rev 2
page 4/114

3 Reference safety architecture

This section reports details of the STM32MP1 Series safety architecture.

3.1 Safety architecture introduction

Device(s) analyzed in this document can be used as Compliant item(s) within different safety applications.
The aim of this section is to identify such Compliant item(s), that is, to define the context of the analysis with respect to a reference concept definition. The concept definition contains reference safety requirements, including design aspects external to the defined Compliant item.
As a consequence of Compliant item approach, the goal is to list the system-related information considered during the analysis, rather than to provide an exhaustive hazard and risk analysis of the system around the device.

3.2 Compliant item

This section defines the Compliant item term and provides information on its usage in different safety architecture schemes.

3.2.1 Definition of Compliant item

According to IEC 61508:1 clause 8.2.12, Compliant item is any item (for example an element) on which a claim is being made with respect to the clauses of IEC 61508 series. Any mature Compliant item must be described in a safety manual available to End user.
In this document, Compliant item is defined as a system including one or two STM32MP1 devices (see Figure 2). The communication bus is directly or indirectly connected to sensors and actuators.
UM2714
Reference safety architecture
Figure 2. STM32MP1 as Compliant item
Actuator
Sensor
S
S
In the framework of STM32MP1 safety concept, the Compliant item is assumed to be divided in two partitions:
Safe Partition, including all STM32MP1 logic considered as safety related and suitable for the
implementation of End User safety functions. This partition includes also the STM32MP1 logic implementing (in collaboration with application software) the separation (freedom from interference) between itself and the Non-Safe Partition.
The complete list of STM32MP1 internal modules belonging to Safe Partition is reported in
Section 3.7 Conditions of use. The Arm® Cortex®‑M4 CPU belongs to Safe Partition.
Non-Safe Partition, including all STM32MP1 logic considered as non-safety related and therefore not
suitable for the implementation of End User safety functions and so out of the safety scope.
The STM32MP1 modules not listed explicitly in Section 3.7 Conditions of use are considered belonging to Non-Safe Partition. The Arm® A7 CPU belongs to Non-Safe Partition.
Accordingly, the implementation of the End User safety function is restricted to the STM32MP1 logic belonging to Safe Partition.
Remote
controller
Remote
controller
Processing element
STM32MP1
device(s)
Compliant item
Remote
controller
Remote
controller
A
A
UM2714 - Rev 2
page 5/114
UM2714
Compliant item
Caution:
According to Non-Safe Partition definition, the implementation of safety function(s) based on Arm® A7 CPU is not allowed.
Interference due to the presence in STM32MP1 device of logic belonging to Non-Safe Partition are managed by the help of the separation concept, which is described in detail in Section 3.2.5 .
It is worth to note that because of the effective isolation between Safe and Non Safe Partition, the presence of non-safety related application software running on A7 CPU is allowed. Section 4.2 of this document provides more information on this specific case.
Other components might be related to the Compliant item, like the external HW components needed to guarantee either the functionality of the device (external memory, clock quartz and so on) or its safety (for example, the external watchdog or voltage supervisors). These components are not analyzed in this safety manual.
A defined Compliant item can be classified as element according to IEC61508-4, 3.4.5.

3.2.2 Safety functions performed by Compliant item

In essence, Compliant item architecture encompasses the following processes performing the safety function or a part of it:
input processing elements (PEi) reading safety related data from the remote controller connected to the sensor(s) and transferring them to the following computation elements
computation processing elements (PEc) performing the algorithm required by the safety function and transferring the results to the following output elements
output processing elements (PEo) transferring safety related data to the remote controller connected to the actuator
computation processing elements (PEd, see Note) executing hardware and software-based safety functions devoted to:
1. detect/prevent hardware random failures affecting all considered processing elements (PEi/PEc/PEo/
PEd)
2. prevent/mitigate systematic failures in the software executed on PEc/PEd
3. prevent/detect interferences on the safety related hardware and software caused by the non-safety
related hardware and software
in 1oo2 architecture, potentially a further voting processing element (PEv)
processes external to the Compliant item ensuring safety integrity, such as watchdog (WDTe) and voltage monitors (VMONe)
Note: Because the large part of safety mechanisms included in STM32MP1 are software-based, PEc and PEd
hardware are mainly coincident.
The role of the PEv process is clarified in Section 3.2.4 , while the one of WDTe and VMONe external processes is clarified in the sections where the conditions of use (CoU) (definition of safety mechanism) are detailed:
WDTe: refer to External watchdog – CPU_SM_5 and Control flow monitoring in Application software – CPU_SM_1,
VMONe: refer to Supply voltage monitoring – VSUP_SM_1 and System-level power supply management ­VSUP_SM_5.
In summary, the devices support the implementation of End user safety functions consisting of three operations:
safe acquisition of safety-related data from input peripheral(s)
safe execution of application software program and safe computation of related data
safe transfer of results or decisions to output peripheral(s)
Claims on the Compliant item and computation of safety metrics are done with respect to these three basic operations and to the PEd processes which are integral part of STM32MP1 safety concept.
According to the definition for implemented safety functions, Compliant item (element) can be regarded as type B (as per IEC61508-2, 7.4.4.1.3 definition). Despite accurate, exhaustive and detailed failure analysis, Device has to be considered as intrinsically complex. This implies its type B classification.
Two main safety architectures are identified: 1oo1 (using one device) and 1oo2 (using two devices).

3.2.3 Reference safety architectures - 1oo1

1oo1 reference architecture (Figure 3) ensures safety integrity of Compliant item through combining device internal processes (implemented safety mechanisms) with external processes WDTe and VMONe.
1oo1 reference architecture targets safety integrity level (SIL)SIL2.
UM2714 - Rev 2
page 6/114
Figure 3. 1oo1 reference architecture
UM2714
Compliant item
Sensors
VMONe
PEc
PEd
WDTe
PEoPEi
Actuators
UM2714 - Rev 2
page 7/114

3.2.4 Reference safety architectures - 1oo2

1oo2 reference architecture (Figure 4) contains two separate channels, either implemented as 1oo1 reference architecture ensuring safety integrity of Compliant item through combining device internal processes (implemented safety mechanisms) with external processes WDTe and VMONe. The overall safety integrity is then ensured by the external voter PEv, which allows claiming hardware fault tolerance (HFT) equal to 1. Achievement of higher safety integrity levels as per IEC61508-2 Table 3 is therefore possible. Appropriate separation between the two channels (including power supply separation) should be implemented in order to avoid huge impact of common-cause failures (refer to Section 4.2 Analysis of dependent failures). However, β and βD parameters computation is required.
1oo2 reference architecture targets SIL3.
Figure 4. 1oo2 reference architecture
UM2714
Compliant item
Sensors
VMONe
PEc
PEd
WDTe
PEc
PEd
PEoPEi
PEv
PEoPEi
Actuators

3.2.5 The separation concept

The Safe Partition and the Non-Safe Partition are isolated by the separation concept, which is composed by two different aspects, spatial separation and temporal separation.
Spatial separation
Safe and Non-Safe Partitions share the same device (STM32MP1), therefore interferences are possible. The protection against spatial interferences is built by two successive layers of protection:
eTZPC protection: the major part of STM32MP1 peripherals and DMA masters belonging to Safe Partition are "isolable" and so can be allocated to Cortex®-M4 CPU domain accesses exclusively by proper setting of
MCU resource isolation system of the eTZPE controller. Then any access tried by a Non-Safe partition AXI master like the A7 CPU is detected and stopped by eTZPC hardware module. Note that all the peripherals which are securable to A7 CPU exclusively are excluded from the Safe Partition and safety concept by definition.
UM2714 - Rev 2
WDTeVMONe
page 8/114
UM2714
Compliant item
The local safety concept for each peripheral in Safe Partition, which is composed by several overlapped safety mechanism defined at application level, is able to detect in efficient way unintended perturbing accesses coming from Non Safe Partition. All those safety mechanisms are declared as “highly recommended” in Section 3.7 Table 142. List of safety recommendations, so their presence in the final system is guaranteed. This protection is quite valuable for the few common resources belonging to Safe Partitions which cannot be isolated by eTZPC protection, because fully shared with A7 CPU (power and clock configuration interfaces).
The two layers are overlapped, so the second one acts as second line of defense in case of failures of the main protection by eTZPC.
The separation concept protects the Safe Partition from unintended accesses regardless their nature on the Non Safe Partition side, so because of random hardware failures in the hardware or systematic errors in the software.
Figure 5 represents the separation concept:
Figure 5. Spatial separation concept
Safe software
Logic implementing
End User safety function(s)
Non-Safe software
Non safety related logic
Logic
implementing
Separation
concept
Safe Partition
Separation
Non-Safe Partition
It is worth to highlight that despite the M4 isolation feature belongs primarily to STM32MP1 security, it is applied for safety purposes in this specific case.
UM2714 - Rev 2
page 9/114
UM2714
Compliant item
Temporal separation
The temporal separation is required because of the specific boot structure of STM32MP1, where the A7 CPU is the key player during the boot sequence because it is in charge to load the software image for M4 CPU. The temporal separation guarantees that any possible interference from non-safety related hardware and/or software on A7 CPU side during the boot sequence is detected/mitigated by adequate measures implemented on M4 CPU side (and therefore built over safety related hardware and software). Following picture provides a graphical representation (red boxes: events on non-safety related resources, green boxes: events on safety related resources)
Figure 6. Temporal separation concept
Interferences
Power up A7 boot
A7 load image on
M4
A7 releases M4
reset
M4 measures
Temporal separation
M4 application software starts
Time
The box marked “M4 measures” includes both hardware and software measures. Hardware measures can be implemented by final system characteristics helping to keep the safe state despite issues occurred during the boot (for example the presence of an intelligent external watchdog), while software measures can be implemented by additional checks executed on M4 CPU just after its startup and before the execution of the end user safety application software. These measures are detailed in Section 3.7 in form of CoU.
It is worth to note that the authentication features implemented in first stage boot loader (FSBL) and that can be activated in second stage boot loader (U-Boot), despite mainly conceived for security reasons, can be considered as part of the temporal separation implementation because they protect the integrity of the software images loaded during the boot. That protection. represented Figure 6 by the yellow boxes, can be considered valid as per the outcomes of CoU_17 application.
UM2714 - Rev 2
page 10/114
Safety analysis assumptions

3.3 Safety analysis assumptions

This section collects all assumptions made during the safety analysis of the devices.

3.3.1 Safety requirement assumptions

The concept specification, the hazard and risk analysis, the overall safety requirement specification and the consequent allocation determine the requirements for Compliant item as further listed. ASR stands for assumed safety requirements.
Caution: It is the End user’s responsibility to check the compliance of the final application with these assumptions.
Furthermore, beside the recommendation included in this Safety Manual, it is also End user’s responsibility to guarantee the compliance of the final application with the requirements of the IEC61508.
ASR1: Compliant item can be used to implement four kinds of safety function modes of operation according to part 4,3.5.16:
a continuous mode (CM) or high-demand (HD) SIL3 safety function (CM3), or
a low-demand (LD) SIL3 safety function (LD3), or
a CM or HD SIL2 safety function (CM2), or
a LD SIL2 safety function (LD2).
ASR2: Compliant item is used to implement safety function(s) allowing a specific worst-case time budget (see note below) for the STM32 MPU to detect and react to a failure. That time corresponds to the portion of the
process safety time (PST) allocated to the device (STM32xx Series duty in Figure 7) in error reaction chain at
system level.
Note: The computation for time budget mainly depends on the execution speed for periodic tests implemented
by software. Such duration might depends on the actual amount of hardware resources (RAM memory and peripherals) actually declared as safety-related. Further constraints and requirements from IEC61508-2, 7.4.5.3 must be considered.
UM2714
Figure 7. Allocation and target for STM32 PST
STM32xx Series duty End user duty
MPU detection FW reaction SW reaction Actuator reaction
System-level PST
….
ASR3: Compliant item is used to implement safety function(s) that can be continuously powered on for a period over eight hours. It is assumed to not require any proof test, and the lifetime of the product is considered to be no less than 10 years.
ASR4.1: It is assumed that End User safety functions are implemented only on Device logic belonging to the Safe Partition.
ASR4.2: It is assumed that logic and resources belonging to Safe Partition are not used for the implementation of non-safety related functions, coexisting with safety functions.
ASR4.3: It is assumed that the software functions running on A7 CPU have been developed according some specific quality standards.
Note: This Assumption is not placed to require a formal compliance of A7 software development flow to IEC61508
model, but to guarantee a minimum quality of the software running on the Non-Safe partition. Related CoU_17 provides more specific insights.
ASR4.4: It is assumed that only one safety function is performed or if many, all functions are classified with the same SIL and therefore they are not distinguishable in terms of their safety requirements.
ASR4.5: In case of multiple safety function implementations, it is assumed that End user is responsible to duly ensure their mutual independence.
3
UM2714 - Rev 2
page 11/114
UM2714

Electrical specifications and environment limits

ASR5: It is assumed that the implemented safety function(s) does (do) not depend on transition of the overall
STM32MP1 or Arm® Cortex®-M4 CPU to and from a low-power or suspended state.
ASR6.1: The local safe state of Compliant item is the one in which either:
SS1: the application software software
(1)
itself is possible.
SS2: the application software not able to execute a reaction.
Note: End user must consider that random hardware failures affecting the Device can compromise its operations (for
example failure modes affecting the program counter prevent the correct execution of software).
The following table provides details on the SS1 and SS2 safe states.
(1)
is informed by the presence of a fault and a reaction by the application
(1)
cannot be informed by the presence of a fault or the application software
Table 2. SS1 and SS2 safe state details
(1)
is
Safe
state
The application software informed by the presence of
SS1
a fault and a reaction by the application software itself is possible.
The application software be informed by the presence of a
SS2
fault or the application software is
1.
2. Safe state achievement intended here is compliant to Note on IEC 61508-2, 7.4.8.1
not able to execute a reaction.
Referred to application software executed on Arm® Cortex®-M4 CPU.
Condition
(1)
is
(1)
cannot
Compliant item
action
Fault reporting to application software
Reset signal issued by WDTe
(1)
System transition to safe state – 1oo1 architecture
(2)
(1)
drives
Application software the overall system in its safe state
WDTe drives the overall system in its safe state (“safe shut-down”)
System transition to safe state – 1oo2 architecture
Application software of the two channels drives the overall system in its safe state
PEv drives the overall system in its safe state
ASR6.2: It is assumed that the safe state defined at system level by End user is compatible with the assumed local safe state (SS1, SS2) for Compliant item.
ASR7: Compliant item is assumed to be analyzed according to routes 1H and 1S of IEC 61508-2.
Note: Refer to Section 3.5 Systematic safety integrity and Section 3.6 Hardware and software diagnostics.
ASR8: Compliant item is assumed to be regarded as type B, as per IEC 61508:2, 7.4.4.1.2.
ASR9: It is assumed that data exchanges between the Safe Partition and the Non-Safe Partition are implemented
by using statically allocated locations in SYSRAM bank and restricted to non-safety related data.
ASR10.1: STM32MP1 package is considered full safety related, in the framework of Device failure rate computations.
ASR10.2: is End User responsibility to prove the freedom from interferences between safety related and non­safety related physical pins (e.g. by running a pin-level FMEDA).
ASR11: it is assumed that glitches in GPIO output values with a duration lower than the adopted PST are not able to cause violations of the implemented safety function(s)
Note: This assumption can be fulfilled in two ways, either:
Final application robustness against GPIO glitches, as required by the assumption text
or
The execution frequency of the prescribed method GPIO_SM_2 and GPIO_SM_0 is higher than 1/Tm (refer to related “Recommendations and known limitations“ fields for details)
ASR12: it is assumed that STM32MP1 is not used to build fail-operational solutions based exclusively on the resources of a unique STM32MP1 device itself.
(1)
in one
3.4
UM2714 - Rev 2
Electrical specifications and environment limits
To ensure safety integrity, the user must operate the Device(s) within its (their) specified:
absolute maximum rating
page 12/114
capacity
operating conditions
For electrical specifications and environmental limits of Device(s), refer to its (their) technical documentation such as datasheet(s) and reference manual(s) available on www.st.com.

3.5 Systematic safety integrity

According to the requirements of IEC 61508-2, 7.4.2.2, the Route 1S is considered in the analysis of Device(s). As clearly authorized by IEC61508-2, 7.4.6.1, STM32 MPU products can be considered as standard, mass­produced electronic integrated devices, for which stringent development procedures, rigorous testing and extensive experience of use minimize the likelihood of design faults. However, ST internally assesses the compliance of the Device development flow, through techniques and measures suggested in the IEC 61508-2 Annex F. A safety case database (see Section 5 List of evidences) keeps evidences of the current compliance level to the norm.

3.6 Hardware and software diagnostics

This section lists all the safety mechanisms (hardware, software and application-level) considered in the device safety analysis. It is expected that users are familiar with the architecture of the device, and that this document is used in conjunction with the related device datasheet, user manual and reference information. To avoid inconsistency and redundancy, this document does not report device functional details. In the following descriptions, the words safety mechanism, method, and requirement are used as synonyms.
As the document provides information relative to the superset of peripherals available on the devices it covers (not all devices have all peripherals), users are supposed to disregard any recommendations not applicable to their Device part number of interest.
Information provided for a function or peripheral applies to all instances of such function or peripheral on Device. Refer to its reference manual or/and datasheet for related information.
The implementation guidelines reported in the following section are for reference only. The safety verification executed by ST during the device safety analysis and related diagnostic coverage figures reported in this manual (or related documents) are based on such guidelines. For clarity, safety mechanisms are grouped by Device function.
Information is organized in form of tables, one per safety mechanism, with the following fields:
UM2714
Systematic safety integrity
SM CODE Unique safety mechanism code/identifier used also in FMEA document. Identifiers use the scheme
Description Short mnemonic description
Ownership ST : means that method is available on silicon.
Detailed implementation
Error reporting Describes how the fault detection is reported to application software.
Fault detection time Time that the safety mechanism needs to detect the hardware failure.
Addressed fault model
Dependency on Device configuration
Initialization Specific operation to be executed to activate the contribution of the safety mechanism
mmm_SM_x where mmm is a 3- or 4-letter module (function, peripheral) short name, and x is a number. It is possible that the numbering is not sequential (although usually incremental) and/or that the module short name is different from that used in other documents.
End user: method must be implemented by End user through Application software modification, hardware solutions, or both.
Detailed implementation sometimes including notes about the safety concept behind the introduction of the safety mechanism.
Reports fault model(s) addressed by the diagnostic (permanent, transient, or both), and other information:
If ranked for Fault avoidance: method contributes to lower the probability of occurrence of a failure
If ranked for Systematic: method is conceived to mitigate systematic errors (bugs) in application software design
Reports if safety mechanism implementation or characteristics change among different Device part numbers.
UM2714 - Rev 2
page 13/114
3.6.1
UM2714
Hardware and software diagnostics
Periodicity Continuous : safety mechanism is active in continuous mode.
Periodic: safety mechanism is executed periodically
On-demand: safety mechanism is activated in correspondence to a specified event (for instance, reception of a data message).
Startup: safety mechanism is supposed to be executed only at power-up or during off-line maintenance periods.
Test for the diagnostic
Multiple-fault protection
Recommendations and known limitations
Reports specific procedure (if any and recommended) to allow on-line tests of safety mechanism efficiency.
Reports the safety mechanism(s) associated in order to correctly manage a multiple-fault scenario (refer to Section 4.1.3 Notes on multiple-fault scenario).
Additional recommendations or limitations (if any) not reported in other fields.
1. In CM systems, safety mechanism can be accounted for diagnostic coverage contribution only if it is executed at least once per PST. For LD and HD systems, constraints from IEC61508-2, 7.4.5.3 must be applied.

Arm® Cortex®-M4 CPU

Table 3. CPU_SM_0
(1)
.
SM CODE CPU_SM_0
Description
Ownership End user or ST
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent
Dependency on Device configuration None
Initialization None
Periodicity Periodic
Test for the diagnostic
Multiple-fault protection CPU_SM_5: external watchdog
Recommendations and known limitations
Periodical core self-test software for Arm® Cortex®-M4 CPU
The software test is built around well-known techniques already addressed by IEC 61508:7, A.3.2 (Self-test by software: walking bit one-channel). To reach the required values of coverage, the self-test software is specified by means of a detailed analysis of all the CPU failure modes and related failure modes distribution
Self-diagnostic capabilities can be embedded in the software, according the test implementation design strategy chosen. The adoption of checksum protection on results variables and defensive programming are recommended.
This method is the main asset in STM32MP1 Series safety concept. CPU integrity is a key factor because the defined diagnostics for MPU peripherals are to major part software-based.
Startup execution of this safety mechanism is recommended for multiple fault mitigations ­refer to Section 4.1.3 Notes on multiple-fault scenario for details.
UM2714 - Rev 2
page 14/114
SM CODE CPU_SM_1
Description Control flow monitoring in Application software
Ownership End user
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation. Higher value is fixed by watchdog timeout interval.
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic NA
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations -
UM2714
Hardware and software diagnostics
Table 4. CPU_SM_1
A significant part of the failure distribution of CPU core for permanent faults is related to failure modes directly related to program counter loss of control or hang-up. Due to their intrinsic nature, such failure modes are not addressed by a standard software test method like SM_CPU_0. Therefore it is necessary to implement a run-time control of the Application software flow, in order to monitor and detect deviation from the expected behavior due to such faults. Linking this mechanism to watchdog firing assures that severe loss of control (or, in the worst case, a program counter hang-up) is detected.
The guidelines for the implementation of the method are the following:
Different internal states of the Application software are well documented and described (the use of a dynamic state transition graph is encouraged).
Monitoring of the correctness of each transition between different states of the Application software is implemented.
Transition through all expected states during the normal Application software program loop is checked.
A function in charge of triggering the system watchdog is implemented in order to constrain the triggering (preventing the issue of CPU reset by watchdog) also to the correct execution of the above-described method for program flow monitoring. The use of window feature available on internal window watchdog (WWDG) is recommended.
The use of the independent watchdog (IWDG), or an external one, helps to implement a more robust control flow mechanism fed by a different clock source.
In any case, safety metrics do not depend on the kind of watchdog in use (the adoption of independent or external watchdog contributes to the mitigation of dependent failures, see
Section 4.2.2 Clock)
UM2714 - Rev 2
page 15/114
SM CODE CPU_SM_2
Description Double computation in Application software
Ownership End user
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
UM2714
Hardware and software diagnostics
Table 5. CPU_SM_2
A timing redundancy for safety-related computation is considered to detect transient faults affecting the Arm® Cortex®-M4 CPU subparts devoted to mathematical computations and data
access.
The guidelines for the implementation of the method are the following:
The requirement needs be applied only to safety-relevant computation, which in case of wrong result could interfere with the system safety functions. Such computation must be therefore carefully identified in the original Application software source code
Both mathematical operation and comparison are intended as computation.
The redundant computation for mathematical computation is implemented by using copies of the original data for second computation, and by using an equivalent formula if possible
End user is responsible to carefully avoid that the intervention of optimization features of the used compiler removes timing redundancies introduced according to this condition of use.
Table 6. CPU_SM_3
SM CODE CPU_SM_3
Description
Ownership ST
Detailed implementation
Error reporting High-priority interrupt event
Fault detection time Depends on implementation. Refer to functional documentation.
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization None
Periodicity Continuous
Test for the diagnostic
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations Enabling related interrupt generation on the detection of errors is highly recommended.
Arm® Cortex®-M4 HardFault exceptions
HardFault exception raise is an intrinsic safety mechanism implemented in Arm® Cortex®-M4 core, mainly dedicated to intercept systematic faults due to software limitations or error in software design (causing for example execution of undefined operations, unaligned address access). This safety mechanism is also able to detect hardware random faults inside the CPU bringing to such described abnormal operations.
It is possible to write a test procedure to verify the generation of the HardFault exception; anyway, given the expected minor contribution in terms of hardware random-failure detection, such implementation is not recommended.
UM2714 - Rev 2
page 16/114
SM CODE CPU_SM_4
Description Stack hardening for Application software
Ownership End user
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
UM2714
Hardware and software diagnostics
Table 7. CPU_SM_4
The stack hardening method is required to address faults (mainly transient) affecting CPU register bank. This method is based on source code modification, introducing information redundancy in register-passed information to called functions.
The guidelines for the implementation of the method are the following:
To pass also a redundant copy of the passed parameters values (possibly inverted) and to execute a coherence check in the function.
To pass also a redundant copy of the passed pointers and to execute a coherence check in the function.
For parameters that are not protected by redundancy, to implement defensive programming techniques (plausibility check of passed values). For example enumerated fields are to be checked for consistency.
This method partially overlaps with defensive programming techniques required by IEC61508 for software development. Therefore in presence of Application software qualified for safety integrity greater or equal to SC2, optimizations are possible.
Table 8. CPU_SM_5
SM CODE CPU_SM_5
Description External watchdog
Ownership End user
Using an external watchdog linked to control flow monitoring method (refer to CPU_SM_1) addresses failure mode of program counter or control structures of CPU.
External watchdog can be designed to be able to generate the combination of signals needed
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation (watchdog timeout interval)
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic To be defined at system level (outside the scope of Compliant item analysis)
Multiple-fault protection CPU_SM_1: control flow monitoring in Application software
on the final system to achieve the safe state. It is recommended to carefully check the assumed requirements about system safe state reported in Section 3.3.1 Safety requirement
assumptions.
It also contributes to reduce potential common cause failures, because the external watchdog is clocked and supplied independently of Device.
UM2714 - Rev 2
page 17/114
SM CODE CPU_SM_5
Recommendations and known limitations
SM CODE CPU_SM_6
Description MCU window watchdog
Ownership ST
Detailed implementation
Error reporting Reset signal generation
Fault detection time Depends on implementation (watchdog timeout interval)
Addressed fault model Permanent
Dependency on Device configuration None
Initialization
Periodicity Continuous
Test for the diagnostic WDG_SM_1: Software test for watchdog at startup
Multiple-fault protection
Recommendations and known limitations
UM2714
Hardware and software diagnostics
CPU_SM_6: MCU window watchdog
In case of usage of windowed watchdog, End user must consider possible tolerance in Application software execution, to avoid false error reports (affecting system availability).
Table 9. CPU_SM_6
Using the WWDG1 watchdog linked to control flow monitoring method (refer to CPU_SM_1) addresses failure mode of program counter or control structures of CPU.
WWDG1 activation. It is recommended to use hardware watchdog in Option byte settings (WWDG1 is automatically enabled after reset)
CPU_SM_1: control flow monitoring in Application software
WDG_SM_0: periodical read-back of configuration registers
The WWDG1 intervention is able to achieve a potentially “incomplete” local safe state because it can only guarantee that CPU is reset. No guarantee that Application software can be still executed to generate combinations of output signals that might be needed by the external system to achieve the final safe state.
SM CODE CPU_SM_7
Description Memory protection unit (MPU)
Ownership ST
Detailed implementation
Error reporting Exception raise (MemManage)
Fault detection time Refer to functional documentation
Addressed fault model
Dependency on Device configuration None
Initialization MPU registers must be programmed at start-up
Periodicity On line
Test for the diagnostic Not needed
Multiple-fault protection MPU_SM_0: Periodical read-back of configuration registers
Recommendations and known limitations
Table 10. CPU_SM_7
The CPU memory protection unit is able to detect illegal access to protected memory areas, according to criteria set by End user.
Systematic (software errors)
Permanent and transient (only program counter and memory access failures)
The use of memory partitioning and protection by MPU functions is highly recommended when multiple safety functions are implemented in Application software. The MPU can be indeed used to
enforce privilege rules
UM2714 - Rev 2
page 18/114
SM CODE CPU_SM_7
separate processes
enforce access rules
Hardware random-failure detection capability for MPU is restricted to well-selected failure modes, mainly affecting program counter and memory access CPU functions. The associated diagnostic coverage is therefore not expected to be relevant for the safety concept of Device.
Enabling related interrupt generation on the detection of errors is highly recommended.
Table 11. MPU_SM_0
SM CODE MPU_SM_0
Description Periodical read-back of MPU configuration registers
Ownership End user
This method must be applied to MPU configuration registers (also unused by the End
Detailed implementation
Error reporting Refer to NVIC_SM_0
Fault detection time Refer to NVIC_SM_0
Addressed fault model Refer to NVIC_SM_0
Dependency on Device configuration Refer to NVIC_SM_0
Initialization Refer to NVIC_SM_0
Periodicity Refer to NVIC_SM_0
Test for the diagnostic Refer to NVIC_SM_0
Multiple-fault protection Refer to NVIC_SM_0
Recommendations and known limitations Refer to NVIC_SM_0
userApplication software).
Detailed information on the implementation of this method can be found in
Section 3.6.4 EXTI controller.
UM2714
Hardware and software diagnostics
Table 12. MPU_SM_1
SM CODE MPU_SM_1
Description MPU software test.
Ownership End User.
This method tests MPU capability to detect and report memory accesses violating the policy enforcement implemented by the MPU itself.
Detailed implementation
Error reporting Depends on implementation.
Fault detection Time Depends on implementation.
Addressed Fault Model Permanent.
Dependency on device configuration None.
Initialization Depends on implementation.
Periodicity On demand.
Test for the diagnostic Not needed.
Multiple faults protection CPU_SM_0: Periodical core self test software
The implementation is based on intentionally performing memory accesses (in writing and read) to memory areas outside of the allowed by the MPU regions programming, and to collect and verify related generated error exceptions.
Test can be executed with the final MPU region programming or with a dedicated one.
UM2714 - Rev 2
page 19/114
SM CODE MPU_SM_1
Recommendations and known limitations
UM2714
Hardware and software diagnostics
Startup execution of this safety mechanism is recommended for multiple fault mitigations - refer to Section 4.1.3 Notes on multiple-fault scenario for details.
UM2714 - Rev 2
page 20/114

3.6.2 Embedded SRAM1/2/3/4

Table 13. RAM_SM_0
SM CODE RAM_SM_0
Description Periodical software test for static random access memory (SRAM or RAM)
Ownership End user or ST
To enhance the coverage on SRAM data cells and to ensure adequate coverage for
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent
Dependency on Device configuration RAM size can change according to the part number
Initialization Depends on implementation
Periodicity Periodic
Test for the diagnostic
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
permanent faults affecting the address decoder it is required to execute a periodical software test on the system RAM memory. The selection of the algorithm must ensure the target SFF coverage for both the RAM cells and the address decoder. Evidences of the effectiveness of the coverage of the selected method must be also collected
Self-diagnostic capabilities can be embedded in the software, according the test implementation design strategy chosen
Usage of a March test C- is recommended.
Because the nature of this test can be destructive, RAM contents restore must be implemented. Possible interferences with interrupt-serving routines fired during test execution must be also considered (such routines can access to RAM invalid contents).
Unused RAM sections can be excluded by the testing, under end user responsibility on actual RAM usage by final application software
Startup execution of this safety mechanism is recommended for multiple fault mitigations ­refer to Section 4.1.3 Notes on multiple-fault scenario for details.
RAM sections hosting application software or diagnostic libraries can be excluded by the testing with this method, if correctly protected by the dedicated safety mechanism CPU_SM_5. Indeed the diagnostic coverage granted by CPU_SM_5 permits to avoid the overlap of the two methods on the same RAM area.
UM2714
Hardware and software diagnostics
UM2714 - Rev 2
page 21/114
Table 14. RAM_SM_2
SM CODE RAM_SM_2
Description Stack hardening for application software
Ownership End user
The stack hardening method is used to enhance the application software robustness to SRAM faults that affect the address decoder. The method is based on source code modification, introducing information redundancy in the stack-passed information to the called functions.
Detailed implementation
Error reporting Refer to CPU_SM_4
Fault detection time Refer to CPU_SM_4
Addressed fault model Refer to CPU_SM_4
Dependency on Device configuration Refer to CPU_SM_4
Initialization Refer to CPU_SM_4
Periodicity Refer to CPU_SM_4
Test for the diagnostic Refer to CPU_SM_4
Multiple-fault protection Refer to CPU_SM_4
Recommendations and known limitations Refer to CPU_SM_4
Method contribution is relevant in case the combination between the final application software structure and the compiler settings requires a significant use of the stack for passing function parameters.
Implementation is the same as method CPU_SM_4
UM2714
Hardware and software diagnostics
Table 15. RAM_SM_3
SM CODE RAM_SM_3
Description Information redundancy for safety-related variables in application software
Ownership End user
To address transient faults affecting SRAM controller, it is required to implement information redundancy on the safety-related system variables stored in the RAM.
The guidelines for the implementation of this method are the following:
The system variables that are safety-related (in the sense that a wrong value due to a failure in reading on the RAM affects the safety functions) are well-identified and documented.
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
The arithmetic computation or decision based on such variables are executed twice and the two final results are compared.
Safety-related variables are stored and updated in two redundant locations, and comparison is checked before consuming data.
Enumerated fields must use non-trivial values, checked for coherence at least one time per PST
Data vectors stored in SRAM must be protected by a encoding checksum (such as CRC)
UM2714 - Rev 2
page 22/114
Hardware and software diagnostics
SM CODE RAM_SM_3
Implementation of this safety method shows a partial overlap with an already foreseen method
Recommendations and known limitations
for Arm®Cortex®-M4 (CPU_SM_1); optimizations in implementing both methods are therefore possible
Table 16. RAM_SM_4
SM CODE RAM_SM_4
Description Control flow monitoring in application software
Ownership End user
Because end user application software is executed from SRAM, permanent and transient faults affecting the memory (cells and address decoder) can interfere with the program
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation. Higher value is fixed by watchdog timeout interval.
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic NA
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations CPU_SM_1 correct implementation supersedes this requirement
execution.
To address such failures it is needed to implement this method.
For more details on the implementation, refer to description CPU_SM_1
UM2714
SM CODE RAM_SM_5
Description Periodical integrity test for application software in RAM
Ownership End user
Because application software and diagnostic libraries are executed in RAM, it is needed to protect the integrity of the code itself against soft-error corruptions and related code mutations. This method must check the integrity of the stored code by checksum computation techniques, on a periodic basis (at least once per PST), or another timing constraint; refer to
(1)
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Periodic
in Section 3.6 Hardware and software diagnostics. RAM memory cell contents are then checked versus the expected values, using signature-based techniques. According to IEC 61508:2 Table A.5, the effective diagnostic coverage of such techniques depends on the width of the signature in relation to the block length of the information to be protected - therefore the signature computation method must be carefully selected. Note that the simple signature method (IEC 61508:7 - A.4.2 Modified checksum) is inadequate as it only achieves a low value of coverage. The use of internal hardware CRC module is therefore recommended.
Table 17. RAM_SM_5
UM2714 - Rev 2
page 23/114
SM CODE RAM_SM_5
Test for the diagnostic
Multiple-fault protection
Recommendations and known limitations
SM CODE RAM_SM_9
Description SRAM static data encapsulation
Ownership End user
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self test software
Recommendations and known limitations None
Self-diagnostic capabilities can be embedded in the software, according the test implementation design strategy chosen.
CPU_SM_0: periodical core self test software
CPU_SM_1: control flow monitoring in application software
Refer to RAM_SM_0 description for information about the management of the overlap on RAM between that method and this one.
If static data are stored in SRAM, encapsulation by a checksum field with encoding capability (such as CRC) must be implemented.
Checksum validity is checked by application software before static data consuming.
UM2714
Hardware and software diagnostics
Table 18. RAM_SM_9

3.6.3 System bus architecture/peripherals interconnect matrix

Table 19. BUS_SM_0
SM CODE BUS_SM_0
Description Periodical software test for interconnections
Ownership End user
The intra-chip connection resources (main AHB interconnection matrix, AHB or APB bridges) needs to be periodically tested for permanent faults detection. Note that STM32MP1 Series devices have no hardware safety mechanism to protect these structures. The test
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Periodic
Test for the diagnostic Not needed
executes a connectivity test of these shared resources, including the testing of the arbitration mechanisms between peripherals.
According to IEC 61508:2 Table A.8, A.7.4 the method is considered able to achieve high levels of coverage
UM2714 - Rev 2
page 24/114
SM CODE BUS_SM_0
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
SM CODE BUS_SM_1
Description Information redundancy in intra-chip data exchanges
Ownership End user
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and Transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
UM2714
Hardware and software diagnostics
Implementation can be considered in large part as overlapping with the widely used Periodical read-back of configuration registers required for several peripherals
Table 20. BUS_SM_1
This method requires to add some kind of redundancy (for example a CRC checksum at packet level) to each data message exchanged inside Device.
Message integrity is verified using the checksum by the application software, before consuming data.
Implementation can be in large part overlapping with other safety mechanisms requiring information redundancy on data messages for communication peripherals. Optimizations are therefore possible.
Table 21. LOCK_SM_0
SM CODE LOCK_SM_0
Description Lock mechanism for configuration options
Ownership ST
The STM32MP1 Series devices feature spread protection to prevent unintended configuration
Detailed implementation
Error reporting Not generated (when locked, register overwrites are just ignored)
Fault detection time NA
Addressed fault model None (Systematic only)
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic Not needed
Multiple-fault protection Not needed
Recommendations and known limitations No DC associated because this test addresses software systematic faults
changes for some peripherals and system registers (for example PVD lock enable bit, timers); the spread protection detects systematic faults in software application. The use of this method is encouraged to enhance the end application robustness to systematic faults.
UM2714 - Rev 2
page 25/114
UM2714
Hardware and software diagnostics

3.6.4 EXTI controller

Table 22. NVIC_SM_0
SM CODE NVIC_SM_0
Description Periodical read-back of configuration registers
Ownership End user
This test is implemented by executing a periodical check of the configuration registers for a system peripheral against its expected value. Expected values are previously stored in RAM and adequately updated after each configuration change. The method mainly addresses transient faults affecting the configuration registers, by detecting bit flips in the registers contents. It addresses also permanent faults on registers because it is executed at least one time within PST (or another timing constraint; refer to
software diagnostics) after a peripheral update.
Method must be implemented to any configuration register whose contents are able to interfere with NVIC or EXTI behavior in case of incorrect settings. Check includes NVIC vector table.
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent/transient
Dependency on Device configuration None
Initialization Values of configuration registers must be read after the boot before executing the first check
Periodicity Periodic
Test for the diagnostic Not required
Multiple-fault protection CPU_SM_0: Periodic core self-test software
Recommendations and known limitations
According to the state-of-the-art automotive safety standard ISO26262, this method can achieve high levels of diagnostic coverage (DC) (refer to ISO26262:5, Table D.4)
An alternative valid implementation requiring less space in SRAM can be realized on the basis of signature concept:
Peripheral registers to be checked are read in a row, computing a CRC checksum (use
of hardware CRC is encouraged)
Obtained signature is compared with the golden value (computed in the same way after
each register update, and stored in SRAM)
Coherence between signatures is checked by the application software – signature
mismatch is considered as failure detection
This method addresses only failures affecting configuration registers, and not peripheral core logic or external interface.
Attention must be paid to registers containing mixed combination of configuration and status bits. Mask must be used before saving register contents affecting signature, and related checks, to avoid false positive detections.
(1)
in Section 3.6 Hardware and
UM2714 - Rev 2
page 26/114
Table 23. NVIC_SM_1
SM CODE NVIC_SM_1
Description Expected and unexpected interrupt check
Ownership End user
According to IEC 61508:2 Table A.1 recommendations, a diagnostic measure for continuous, absence or cross-over of interrupt must be implemented. The method of expected and unexpected interrupt check is implemented at application software level.
The guidelines for the implementation of the method are the following:
The interrupts implemented on the MPU are well documented, also reporting, when
possible, the expected frequency of each request (for example, the interrupts related to ADC conversion completion that come on a regular basis).
Individual counters are maintained for each interrupt request served, in order to detect in
a given time frame the cases of a) no interrupt at all b) too many interrupt requests (“babbling idiot” interrupt source). The control of the time frame duration must be regulated according to the individual interrupt expected frequency.
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent/transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic Not required
Multiple-fault protection CPU_SM_0: Periodic core self-test software
Recommendations and known limitations
Interrupt vectors related to unused interrupt source point to a default handler that
reports, in case of triggering, a faulty condition (unexpected interrupt).
In case an interrupt service routine is shared between different sources, a plausibility
check on the caller identity is implemented.
The interrupt generation capability of each used peripherals must be periodically
checked, at least once per PST (or timing requirement as per interrupt are generated during normal operations (e.g. a peripheral raising interrupt just in case of incoming data), the interrupt generation capability must be verified by other means. For instance a couple of input/output GPIO lines can be used to generate a GPIO event interrupt.
Interrupt requests related to non-safety-related peripherals are handled with the same
method here described, despite their originator safety classification
In order to decrease the complexity of method implementation, it is suggested to use polling technique (when possible) instead of interrupt for end system implementation
UM2714
Hardware and software diagnostics
(1)
). In case no periodical

3.6.5 Direct memory access controller (DMA/ DMAMUX)

Table 24. DMA_SM_0
SM CODE DMA_SM_0
Description Periodical read-back of configuration registers
Ownership End user
This method must be applied to DMA configuration register and channel addresses register as
Detailed implementation
Error reporting Refer to NVIC_SM_0
Fault detection time Refer to NVIC_SM_0
UM2714 - Rev 2
well.
Detailed information on the implementation of this method can be found in
Section 3.6.4 EXTI controller
page 27/114
SM CODE DMA_SM_0
Addressed fault model Refer to NVIC_SM_0
Dependency on Device configuration Refer to NVIC_SM_0
Initialization Refer to NVIC_SM_0
Periodicity Refer to NVIC_SM_0
Test for the diagnostic Refer to NVIC_SM_0
Multiple-fault protection Refer to NVIC_SM_0
Recommendations and known limitations Refer to NVIC_SM_0
Table 25. DMA_SM_1
SM CODE DMA_SM_1
Description Information redundancy on data packet transferred via DMA
Ownership End user
This method is implemented adding to data packets transferred by DMA a redundancy check (such as CRC check, or similar one) with encoding capability. Full data packet redundancy would be overkilling.
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
The checksum encoding capability must be robust enough to guarantee at least 90% probability of detection for a single bit flip in the data packet
Consistency of data packet must be checked by the application software before consuming data
To give an example about checksum encoding capability, using just a bit-by-bit addition is unappropriated
UM2714
Hardware and software diagnostics
Detailed implementation
UM2714 - Rev 2
Table 26. DMA_SM_2
SM CODE DMA_SM_2
Description
Ownership End user
Information redundancy by including sender or receiver identifier on data packet transferred via DMA
This method helps to identify inside the device the source and the originator of the message exchanged by DMA.
Implementation is realized by adding an additional field to protected message, with a coding convention for message type identification fixed at Device level. Guidelines for the identification fields are:
Identification field value must be different for each possible couple of sender or receiver
on DMA transactions
Values chosen must be enumerated and non-trivial
Coherence between the identification field value and the message type is checked by
application software before consuming data.
page 28/114
SM CODE DMA_SM_2
This method, when implemented in combination with DMA_SM_4, makes available a kind of “virtual channel” between source and destinations entities.
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity On demand
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations None
SM CODE DMA_SM_3
Description Periodical software test for DMA
Ownership End user
This method requires the periodical testing of the DMA basic functionality, implemented through a deterministic transfer of a data packet from one source to another (for example from memory to memory) and the checking of the correct transfer of the message on the
Detailed implementation
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Periodic
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations None
target. Data packets are composed by non-trivial patterns (avoid the use of 0x0000, 0xFFFF values) and organized in order to allow the detection during the check of the following failures:
incomplete packed transfer
errors in single transferred word
wrong order in packed transmitted data
UM2714
Hardware and software diagnostics
Table 27. DMA_SM_3
Detailed implementation
UM2714 - Rev 2
Table 28. DMA_SM_4
SM CODE DMA_SM_4
Description DMA transaction awareness
Ownership End user
DMA transactions are non-deterministic by nature, because typically driven by external events like communication messages reception. Anyway, well-designed safety systems should keep much control as possible of events – refer for instance to IEC61508:3 Table 2 item 13 requirements for software architecture.
page 29/114
UM2714
Hardware and software diagnostics
SM CODE DMA_SM_4
This method is based on system knowledge of frequency and type of expected DMA transaction. For instance, an externally connected sensor supposed to send periodically some messages to a STM32 peripheral. Monitoring DMA transaction by a dedicated state machine permits to detect missing or unexpected DMA activities.
Error reporting Depends on implementation
Fault detection time Depends on implementation
Addressed fault model Permanent and transient
Dependency on Device configuration None
Initialization Depends on implementation
Periodicity Continuous
Test for the diagnostic Not needed
Multiple-fault protection CPU_SM_0: periodical core self-test software
Recommendations and known limitations
3.6.6 Universal synchronous/asynchronous and low-power universal asynchronous receiver/
transmitter (USART2/3/6, UART4/5/7/8)
Because DMA transaction termination is often linked to an interrupt generation, implementation of this method can be merged with the safety mechanism NVIC_SM_1: expected and unexpected interrupt check.
Table 29. UART_SM_0
SM CODE UART_SM_0
Description Periodical read-back of configuration registers
Ownership End user
This method must be applied to UART configuration registers.
Detailed implementation
Error reporting Refer to NVIC_SM_0
Fault detection time Refer to NVIC_SM_0
Addressed fault model Refer to NVIC_SM_0
Dependency on Device configuration Refer to NVIC_SM_0
Initialization Refer to NVIC_SM_0
Periodicity Refer to NVIC_SM_0
Test for the diagnostic Refer to NVIC_SM_0
Multiple-fault protection Refer to NVIC_SM_0
Recommendations and known limitations Refer to NVIC_SM_0
Detailed information on the implementation of this method can be found in
Section 3.6.4 EXTI controller.
Table 30. UART_SM_1
SM CODE UART_SM_1
Description Protocol error signals
Ownership ST
USART communication module embeds protocol error checks (like additional parity bit
Detailed implementation
check, overrun, frame error) conceived to detect network-related abnormal conditions. These mechanisms are able anyway to detect a marginal percentage of hardware random failures affecting the module itself.
UM2714 - Rev 2
page 30/114
Loading...
+ 84 hidden pages