AIAG & VDA FMEA For Monitoring And System Response (FMEA-MSR)

In a Supplemental FMEA for Monitoring and System Response, potential Failure Causes which might occur under customer operating conditions are analyzed with respect to their technical effects on the system, vehicle, people, and regulatory compliance. The method considers whether Failure Causes or Failure Modes are detected by the system, or whether Failure Effects are detected by the driver. Customer operation is to be understood as end-user operation, in-service operation, and maintenance operations.
FMEA-MSR includes the following elements of risk:

  1. Severity of harm, regulatory noncompliance, loss or degradation of functionality, and unacceptable quality; represented by (S)
  2. Estimated frequency of a Failure Cause in the context of an operational situation; represented by (F)
  3. Technical possibilities to avoid or limit the Failure Effect via diagnostic detection and automated response, combined with human possibilities to avoid or limit the Failure Effect via sensory perception and physical reaction; represented by (M)

The combination of F and M is an estimate of the probability of occurrence of the Failure Effect due to the Fault (Failure Cause), and resulting malfunctioning behavior (Failure Mode).
NOTE: The overall probability of a Failure Effect to occur may be higher, because different Failure Causes may lead to the same Failure Effect.
FMEA-MSR adds value by assessing risk reduction as a result of monitoring and response. FMEA-MSR evaluates the current state of risk of failure and derives the necessity for additional monitoring by comparison with the conditions for acceptable residual risk.
The analysis can be part of a Design FMEA in which the aspects of Development are supplemented by aspects of Customer Operation. However, it is usually only applied when diagnostic detection is necessary to maintain safety or compliance. Detection in DFMEA is not the same as Monitoring in Supplemental FMEA-MSR. In DFMEA, Detection Controls document the ability of testing to demonstrate the fulfillment of requirements in development and validation. For monitoring that is already part of the system design, validation is intended to demonstrate that diagnostic monitoring and system response work as intended. Conversely, Monitoring in FMEA-MSR assesses the effectiveness of fault detection performance in customer operation, assuming that specifications are fulfilled. The Monitoring rating also comprehends the safe performance and reliability of system reactions to monitored faults. It contributes to the assessment of the fulfillment of Safety Goals and may be used for deriving the Safety Concept.
Supplemental FMEA-MSR addresses risks that in DFMEA would otherwise be assessed as High, by considering additional factors which more accurately reflect the lower assessed risk attributable to the diagnostic functions of the vehicle operating system. These additional factors contribute to an improved depiction of the risk of failure (including risk of harm, risk of noncompliance, and risk of not fulfilling specifications). FMEA-MSR contributes to the provision of evidence of the ability of the diagnostic, logical, and actuation mechanisms to achieve and maintain a safe or compliant state (in particular, appropriate failure mitigation ability within the maximum fault handling time interval and within the fault tolerant time interval). FMEA-MSR evaluates the current state of risk of failure under end-user conditions (not just risk of harm to persons).
The detection of faults/failures during customer operation can be used to avoid the original Failure Effect by switching to a degraded operational state (including disabling the vehicle), informing the driver, and/or writing a Diagnostic Trouble Code (DTC) into the control unit for service purposes. In terms of FMEA, the result of RELIABLE diagnostic detection and response is to eliminate (prevent) the original effect and replace it with a new, less severe effect. FMEA-MSR is useful in deciding whether the system design fulfills the performance requirements with respect to safety and compliance. The results may include items such as:

  • additional sensor(s) may be needed for monitoring purposes
  • redundancy in processing may be needed
  • plausibility checks may reveal sensor malfunctions

Step 1: Planning and Preparation

1.1 Purpose

The main objectives of Planning and Preparation in FMEA-MSR are:

  • Project identification
  • Project plan: InTent, Timing, Team, Tasks, Tools (5T)
  • Analysis boundaries: What is included and excluded from the analysis
  • Identification of baseline FMEA
  • Basis for the Structure Analysis step

1.2 FMEA-MSR Project Identification and Boundaries

FMEA-MSR project identification includes a clear understanding of what needs to be evaluated. This involves a decision-making process to define the FMEA-MSRs that are needed for a customer program. What to exclude can be just as important as what to include in the analysis. The following may assist the team in defining FMEA-MSR projects, as applicable:

  • Hazard Analysis and Risk Assessment.
  • Legal Requirements
  • Technical Requirements
  • Customer wants/needs/expectation (external and internal customers)
  • Requirements specification
  • Diagrams (Block/Boundary/System)
  • Schematics, Drawings, and/or 3D Models
  • Bill of Materials (BOM), Risk Assessment
  • Previous FMEA for similar products

Answers to these questions and others defined by the company help create the list of FMEA-MSR projects needed. The FMEA-MSR project list assures consistent direction, commitment, and focus. Below are some basic questions that help identify FMEA-MSR boundaries:

  1. After completing a DFMEA on an Electrical/Electronic/Programmable Electronic System, are there effects that may be harmful to persons or involve regulatory noncompliance?
  2. Did the DFMEA indicate that all of the causes which lead to harm or noncompliance can be detected by direct sensing and/or plausibility algorithms?
  3. Did the DFMEA indicate that the intended system response to any and all of the detected causes is to switch to a degraded operational state (including disabling the vehicle), inform the driver, and/or write a Diagnostic Trouble Code (DTC) into the control unit for service purposes?

FMEA for Monitoring and System Response may be used to examine systems which have integrated fault monitoring and response mechanisms during operation. Typically, these are more complex systems composed of sensors, actuators, and logical processing units. The diagnosis and monitoring in such systems may be achieved through hardware and/or software. Systems that may be considered in a Supplemental FMEA for Monitoring and System Response consist in general of at least a sensor, a control unit, and an actuator, or a subset of them, and are called mechatronic systems. Systems in scope may also contain mechanical hardware components (e.g., pneumatics and hydraulics).

Generic-block diagram of an Electrical / Electronic / Programmable Electronic system

The scope of a Supplemental FMEA for Monitoring and System Response may be established in consultation between customer and supplier. Applicable scoping criteria may include, but are not limited to:

  • System Safety relevance
  • ISO Standards, i.e., Safety Goals according to ISO 26262
  • Documentation requirements from legislative bodies, e.g., UN/ECE Regulations, FMVSS/CMVSS, NHTSA, and On-Board Diagnostic (OBD) compliance requirements

1.3 FMEA-MSR Project Plan

A plan for the execution of the FMEA-MSR should be developed once the FMEA-MSR project is known. It is recommended that the 5T method (InTent, Timing, Team, Tasks, Tools) be used. The plan for the FMEA-MSR helps the company be proactive in starting the FMEA-MSR early. The FMEA-MSR activities (5-step process) should be incorporated into the overall design project plan.

Step 2 : Structure Analysis

2.1 Purpose

The main objectives of Structure Analysis in FMEA-MSR are:

  • Visualization of the analysis scope
  • Structure tree or equivalent: block diagram, boundary diagram, digital model, physical parts
  • Identification of design interfaces, interactions
  • Collaboration between customer and supplier engineering teams (interface responsibilities)
  • Basis for the Function Analysis step

Depending on the scope of analysis, the structure may consist of hardware elements and software elements. Complex structures may be split into several structures (work packages) or different layers of block diagrams and analyzed separately for organizational reasons or to ensure sufficient clarity. The scope of the FMEA-MSR is limited to the elements of the system for which the baseline DFMEA showed that there are causes of failure which can result in hazardous or non-compliant effects. The scope may be expanded to include signals received by the control unit. In order to visualize a system structure, two methods are commonly used:

  • Block (Boundary) Diagrams
  • Structure Trees

2.2 Structure Trees

In a Supplemental FMEA for Monitoring and System Response, the root element of a structure tree can be at the vehicle level (e.g., for OEMs which analyze the overall system) or at the system level (e.g., for suppliers which analyze a subsystem or component).

Example of a structure tree of a window lift system for investigating erroneous signals, monitoring, and system response

The sensor element and the control unit may also be part of one component (smart sensor). Diagnostics and monitoring in such systems may be realized by hardware and/or software elements.

Example of a structure tree of a smart sensor with an internal sensing element and output to an interface

In case there is no sensor within the scope of analysis, an Interface Element is used to describe the data/current/voltage received by the ECU. One function of any ECU is to receive signals via a connector. These signals can be missing or erroneous; with no monitoring, the result is erroneous output. In case there is no actuator within the scope of analysis, an Interface Element is used to describe the data/current/voltage sent by the ECU. Another function of any ECU is to send signals, e.g., via a connector. These signals can also be missing or erroneous, and the output can also be "no output" or "failure information." The causes of erroneous signals may be within a component which is outside the scope of responsibility of the engineer or organization, yet these erroneous signals may have an effect on the performance of a component which is within that scope of responsibility. It is therefore necessary to include such causes in the FMEA-MSR analysis.
NOTE: Ensure that the structure is consistent with the Safety Concept (as applicable).

STRUCTURE ANALYSIS (STEP 2)
1. Next Higher Level | 2. Focus Element | 3. Next Lower Level or Characteristic Type
Window Lift System | ECU Window Lifter | Connector ECU Window Lifter
Example of Structure Analysis in the FMEA-MSR Form Sheet

Step 3 : Function Analysis

The main objectives of Function Analysis in FMEA-MSR are:

  • Visualization of functions and relationships between functions in a function tree/function net, or equivalent parameter diagram (P-diagram)
  • Cascade of customer (external and internal) functions with associated requirements
  • Association of requirements or characteristics to functions
  • Collaboration between engineering teams (systems, safety, and components)
  • Basis for the Failure Analysis step

In a Supplemental FMEA for Monitoring and System Response, monitoring for failure detection and failure responses are considered as functions. Hardware and software functions may include monitoring of system states. Functions for monitoring and detection of faults/failures may consist of, for example: out-of-range detection, cyclic redundancy checks, plausibility checks, and sequence counter checks. Functions for failure reactions may consist of, for example, provision of default values, switching to a limp-home mode, switching off the corresponding function, and/or display of a warning. Such functions are modeled for those structural elements that are carriers of these functions, i.e., control units or components with computational abilities like smart sensors. Additionally, sensor signals which are received by control units can be considered; therefore, functions of signals may be described as well. Finally, functions of actuators can be added, which describe the way the actuator or vehicle reacts on demand. Performance requirements are assumed to be the maintenance of a safe or compliant state. Fulfillment of requirements is assessed through the risk assessment. In case sensors and/or actuators are not within the scope of analysis, functions are assigned to the corresponding interface elements (consistent with the Safety Concept, as applicable).
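The monitoring and reaction functions listed above can be sketched in code. The following Python fragment is purely illustrative (the sensor range, names, and default value are assumptions, not taken from the handbook): an out-of-range detection on a sensor signal, with provision of a default value as the failure reaction.

```python
# Illustrative sketch only: the voltage band and default value are assumed,
# not specified by the handbook.
SENSOR_MIN_V, SENSOR_MAX_V = 0.5, 4.5   # assumed valid Hall-sensor voltage band
DEFAULT_VALUE = 0.0                      # assumed safe default provided on fault

def out_of_range(signal_v: float) -> bool:
    """Out-of-range detection: flag a signal outside the valid electrical band."""
    return not (SENSOR_MIN_V <= signal_v <= SENSOR_MAX_V)

def condition_signal(signal_v: float) -> tuple[float, bool]:
    """Return (value, fault_flag); the failure reaction here is the
    provision of a default value when a fault is detected."""
    if out_of_range(signal_v):
        return DEFAULT_VALUE, True
    return signal_v, False
```

In a real ECU, the reaction might instead be switching to a limp-home mode or displaying a warning, as the text notes.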

Example of a Structure Tree with functions
FUNCTION ANALYSIS (STEP 3)
1. Next Higher Level Function and Requirement | 2. Focus Element Function and Requirement | 3. Next Lower Level Function and Requirement or Characteristic Type
Provide anti-pinch protection for comfort closing mode | Provide signal to stop and reverse window lifter motor in case of pinch situation | Transmit signal from Hall effect sensor to ECU
Example of Function Analysis in FMEA-MSR Form Sheet.

Step 4: Failure Analysis

4.1 Purpose

The purpose of Failure Analysis in FMEA-MSR is to describe the chain of events which lead up to the end effect, in the context of a relevant scenario. The main objectives of Failure Analysis in FMEA-MSR are:

  • Establishment of the failure chain
  • Potential Failure Cause, Monitoring, System Response, Reduced Failure Effect
  • Identification of product Failure Causes using a parameter diagram or failure network
  • Collaboration between customer and supplier (Failure Effects)
  • Basis for the documentation of failures in the FMEA form sheet and the Risk Analysis step

4.2 Failure Scenario

A Failure Scenario comprises a description of relevant operating conditions in which a fault results in malfunctioning behavior, and possible sequences of events (system states) that lead to an end system state (Failure Effect). It starts from defined Failure Causes and leads to the Failure Effects.

Theoretical failure chain model DFMEA and FMEA-MSR

The focus of the analysis is a component with diagnostic capabilities, e.g., an ECU. If the component is not capable of detecting the fault/failure, the Failure Mode will occur which leads to the end effect with a corresponding degree of Severity. However, if the component can detect the failure, this leads to a system response with a Failure Effect with a lower Severity compared to the original Failure Effect. Details are described in the following scenarios (1) to (3).

Failure Scenario (1) – Non-Hazardous

Failure Scenario (1) describes the malfunctioning behavior from the occurrence of the fault to the Failure Effect, which in this example is not hazardous but may reach a non-compliant end system state.

Failure Scenario (2) – Hazardous

Failure Scenario (2) describes the malfunctioning behavior from the occurrence of the fault to the Failure Effect, which in this example leads to a hazardous event. As an aspect of the Failure Scenario, it is necessary to estimate the magnitude of the Fault Handling Time Interval (time between the occurrence of the fault, and the occurrence of the hazard/non-compliant Failure Effect). The Fault Handling Time Interval is the maximum time span of malfunctioning behavior before a hazardous event occurs, if the safety mechanisms are not activated.
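The timing condition described above can be made concrete: detection plus system response must complete within the Fault Handling Time Interval, otherwise the hazardous event is not prevented. The sketch below is an illustration of that relation only; the function name and millisecond units are assumptions, not handbook definitions.

```python
def response_within_fhti(detection_time_ms: float,
                         reaction_time_ms: float,
                         fault_handling_interval_ms: float) -> bool:
    """Check whether fault detection plus the system reaction complete
    before the hazardous or non-compliant effect can occur, i.e., within
    the Fault Handling Time Interval (FHTI)."""
    return detection_time_ms + reaction_time_ms <= fault_handling_interval_ms
```

If this check fails for a given monitoring mechanism, the mechanism cannot mitigate the original Failure Effect, which corresponds to the "not effective" monitoring case discussed in Step 5.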

Failure Scenario (3) – Mitigated Effect

Failure Scenario (3) describes the malfunctioning behavior from the occurrence of the fault to the mitigated Failure Effect, which in this example leads to a loss or degradation of a function instead of the hazardous event.

4.3 Failure Cause

The description of the Failure Cause is the starting point of the Failure Analysis in a Supplemental FMEA for Monitoring and System Response. The Failure Cause is assumed to have occurred and is not the true root cause. Typical Failure Causes are electrical/electronic faults (E/E faults). Root causes may be insufficient robustness when exposed to various factors such as the external environment, vehicle dynamics, wear, service, stress cycling, data bus overloading, erroneous signal states, etc. Failure Causes can be derived from the DFMEA, catalogues for failures of E/E components, and network communication data descriptions.

NOTE: In FMEA-MSR, diagnostic monitoring is assumed to function as intended (however, it may not be effective). Therefore, Failure Causes of diagnostics are not part of FMEA-MSR but can be added to the DFMEA section of the form sheet. These include: failed to detect fault; falsely detected fault (nuisance); unreliable fault response (variation in response capability).

Teams may decide not to include failures of diagnostic monitoring in DFMEA because Occurrence ratings are most often very low (including "latent faults," ref. ISO 26262); therefore, this analysis may be of limited value. However, the correct implementation of diagnostic monitoring should be part of the test protocol. Prevention Controls of diagnostics in a DFMEA describe how reliably a mechanism is estimated to detect the Failure Cause and react in time with respect to the performance requirements. Detection Controls of diagnostics in a DFMEA would relate back to development tests which verify the correct implementation and the effectiveness of the monitoring mechanism.

4.4 Failure Mode

A Failure Mode is the consequence of the fault (Failure Cause). In FMEA-MSR two possibilities are considered:

  1. In case of failure scenarios (1) and (2) the fault is not detected or the system reaction is too late. Therefore, the Failure Mode in FMEA-MSR is the same as in DFMEA.
  2. Failure scenario (3) is different: the fault is detected and the system response leads to a mitigated Failure Effect. In this case a description of the diagnostic monitoring and system response is added to the analysis. Because the failure chain in this specific possibility consists of a fault/failure and a description of an intended behavior, this is called a hybrid failure chain or hybrid failure network.

4.5 Failure Effect

A Failure Effect is defined as the consequence of a Failure Mode. Failure Effects in FMEA-MSR are either a malfunctioning behavior of the system or an intended behavior after detection of a Failure Cause. The end effect may be a "hazard" or "non-compliant state" or, in case of detection and timely system response, a "safe state" or "compliant state" with loss or degradation of a function. The Severity of Failure Effects is evaluated on a ten-point scale.

FAILURE ANALYSIS (STEP 4)
1. Failure Effect (FE) to the Next Higher Level Element and/or End User | 2. Failure Mode (FM) of the Focus Element | 3. Failure Cause (FC) of the Next Lower Level Element or Characteristic
No anti-pinch protection in comfort closing mode. (Hand or neck may be pinched between window glass and frame.) | No signal to stop and reverse window lifter motor in case of pinch situation | Signal of Hall effect sensor is not transmitted to ECU due to poor connection of Hall effect sensor
Example of Failure Analysis in FMEA-MSR Form Sheet.

Step 5: Risk Analysis

5.1 Purpose

The purpose of Risk Analysis in FMEA-MSR is to estimate the risk of failure by evaluating Severity, Frequency, and Monitoring, and to prioritize the need for actions to reduce risk. The main objectives of the FMEA-MSR Risk Analysis are:

  • Assignment of existing and/or planned controls and rating of failures
  • Assignment of Prevention Controls to the Failure Causes
  • Assignment of Detection Controls to the Failure Causes and/or Failure Modes
  • Rating of Severity, Frequency, and Monitoring for each failure chain
  • Evaluation of Action Priority
  • Collaboration between customer and supplier (Severity)
  • Basis for the Optimization step

5.2 Evaluations

Each Failure Mode, Cause and Effect relationship (failure chain or hybrid network) is assessed by the following three criteria:

  • Severity (S): represents the Severity of the Failure Effect
  • Frequency (F): represents the Frequency of Occurrence of the Cause in a given operational situation, during the intended service life of the vehicle
  • Monitoring (M): represents the Detection potential of the Diagnostic Monitoring functions (detection of Failure Cause, Failure Mode and/or Failure Effect)

Evaluation numbers from 1 to 10 are used for S, F, and M respectively, where 10 stands for the highest risk contribution. By examining these ratings individually and in combinations of the three factors, the need for risk-reducing actions may be prioritized.
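A failure chain with its three ratings can be represented as a small data structure. The prioritization rule below is a deliberately simplified stand-in (the handbook's actual Action Priority table is not reproduced in this section), so the flagging logic here is an assumption for illustration only.

```python
from dataclasses import dataclass

@dataclass
class FailureChain:
    severity: int    # S: 1-10, 10 = highest risk contribution
    frequency: int   # F: 1-10
    monitoring: int  # M: 1-10

def needs_action(fc: FailureChain) -> bool:
    """Simplified illustrative rule (NOT the handbook's Action Priority
    table): flag a chain whose effect is severe and whose cause is
    neither extremely rare nor reliably monitored."""
    for rating in (fc.severity, fc.frequency, fc.monitoring):
        assert 1 <= rating <= 10, "S, F, and M are rated on a 1-10 scale"
    return fc.severity >= 9 and fc.frequency > 1 and fc.monitoring > 1
```

In practice the team would replace `needs_action` with a lookup into the published Action Priority table for MSR.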

5.3 Severity (S)

The Severity rating (S) is a measure associated with the most serious Failure Effect for a given Failure Mode of the function being evaluated, and is identical for DFMEA and FMEA-MSR. Severity should be estimated using the criteria in the Severity Table. The table may be augmented to include product-specific examples. The FMEA project team should agree on evaluation criteria and a rating system which is consistent, even if modified for individual design analysis. The Severity evaluations of the Failure Effects should be transferred by the customer to the supplier, as needed.

Product General Evaluation Criteria Severity (S)
Potential Failure Effects rated according to the criteria below. (Corporate or Product Line Examples: blank until filled in by user.)

S | Effect | Severity criteria
10 | Very High | Affects safe operation of the vehicle and/or other vehicles, the health of driver or passengers or road users or pedestrians.
9 | Very High | Noncompliance with regulations.
8 | High | Loss of primary vehicle function necessary for normal driving during expected service life.
7 | High | Degradation of primary vehicle function necessary for normal driving during expected service life.
6 | Moderate | Loss of secondary vehicle function.
5 | Moderate | Degradation of secondary vehicle function.
4 | Moderate | Very objectionable appearance, sound, vibration, harshness, or haptics.
3 | Low | Moderately objectionable appearance, sound, vibration, harshness, or haptics.
2 | Low | Slightly objectionable appearance, sound, vibration, harshness, or haptics.
1 | Very Low | No discernible effect.
Supplemental FMEA-MSR SEVERITY (S)

5.4 Rationale for Frequency Rating

In a Supplemental FMEA for Monitoring and System Response, the likelihood of a failure occurring in the field under customer operating conditions during service life is relevant. Analysis of end-user operation requires the assumption that the manufacturing process is adequately controlled, in order to assess the sufficiency of the design. Examples on which a rationale may be based:

  • Evaluation based on the results of Design FMEAs
  • Evaluation based on the results of Process FMEAs
  • Field data of returns and rejected parts
  • Customer complaints
  • Warranty databases
  • Data handbooks

The rationale is documented in the column "Rationale for Frequency Rating" of the FMEA-MSR form sheet.

5.5 Frequency (F)

The Frequency rating (F) is a measure of the likelihood of occurrence of the cause in relevant operating situations during the intended service life of the vehicle or the system, using the criteria in the table below. If the Failure Cause does not always lead to the associated Failure Effect, the rating may be adapted, taking into account the probability of exposure to the relevant operating condition. In such cases, the operational situation and the rationale are to be stated in the column "Rationale for Frequency Rating."
Example: From field data it is known how often a control unit is defective in ppm/year. This may lead to F=3. The system under investigation is a parking system which is used for only a very limited time in comparison to the overall operating time, so harm to persons is only possible when the defect occurs during the parking maneuver. Therefore, Frequency may be lowered to F=2.

Frequency Potential (F) for the Product
Frequency criteria (F) for the estimated occurrence of the Failure Cause in relevant operating situations during the intended service life of the vehicle. (Corporate or Product Line Examples: blank until filled in by user.)

F | Estimated Frequency | Frequency criteria – FMEA-MSR
10 | Extremely High, or cannot be determined | Frequency of occurrence of the Failure Cause is unknown or known to be unacceptably high during the intended service life of the vehicle.
9 | High | Failure Cause is likely to occur during the intended service life of the vehicle.
8 | High | Failure Cause may occur often in the field during the intended service life of the vehicle.
7 | Medium | Failure Cause may occur frequently in the field during the intended service life of the vehicle.
6 | Medium | Failure Cause may occur somewhat frequently in the field during the intended service life of the vehicle.
5 | Medium | Failure Cause may occur occasionally in the field during the intended service life of the vehicle.
4 | Low | Failure Cause is predicted to occur rarely in the field during the intended service life of the vehicle. At least ten occurrences in the field are predicted.
3 | Very Low | Failure Cause is predicted to occur in isolated cases in the field during the intended service life of the vehicle. At least one occurrence in the field is predicted.
2 | Extremely Low | Failure Cause is predicted not to occur in the field during the intended service life of the vehicle, based on prevention and detection controls and field experience with similar parts. Isolated cases cannot be ruled out; there is no proof that it will not happen.
1 | Cannot occur | Failure Cause cannot occur during the intended service life of the vehicle, or is virtually eliminated. There is evidence that the Failure Cause cannot occur. Rationale is documented.

Percentage of relevant operating condition in comparison to overall operating time | Value by which F may be lowered
< 10% | 1
< 1% | 2
Supplemental FMEA-MSR FREQUENCY (F)

Note:

  1. Probability increases as the number of vehicles in the field increases
  2. The reference value for estimation is one million vehicles in the field
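The exposure adjustment in the table above (lower F by 1 when the relevant operating condition covers less than 10% of overall operating time, by 2 when less than 1%) can be expressed directly. This sketch is an illustration of that rule; clamping the result at 1 is an assumption, since a rating below 1 is not defined on the scale.

```python
def adjust_frequency(base_f: int, exposure_fraction: float) -> int:
    """Lower the Frequency rating per the exposure table:
    <10% of overall operating time -> F may be lowered by 1,
    <1%  of overall operating time -> F may be lowered by 2.
    The result is clamped at 1 (assumed, since the scale is 1-10)."""
    assert 1 <= base_f <= 10
    if exposure_fraction < 0.01:
        reduction = 2
    elif exposure_fraction < 0.10:
        reduction = 1
    else:
        reduction = 0
    return max(1, base_f - reduction)
```

Applied to the parking-system example in 5.5: a base rating of F=3 with, say, 5% exposure yields F=2.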

5.6 Current Monitoring Controls

All controls that are planned or already implemented and lead to a detection of the Failure Cause, the Failure Mode, or the Failure Effect by the system or by the driver are entered into the "Current Monitoring Controls" column. In addition, the fault reaction after detection should be described, e.g., provision of default values (if not already sufficiently described by the Failure Effect). Monitoring evaluates the potential that the Failure Cause, the Failure Mode, or the Failure Effect can be detected early enough that the initial Failure Effect can be mitigated before a hazard occurs or a non-compliant state is reached. The result is an end-state effect with a lower Severity.

5.7 Monitoring (M)

The Monitoring rating (M) is a measure of the ability to detect a fault/failure during customer operation and to apply the fault reaction in order to maintain a safe or compliant state. The Monitoring rating relates to the combined ability of all sensors, logic, and human sensory perception to detect the fault/failure, and to react by modifying the vehicle behavior by means of mechanical actuation and physical reaction (controllability). In order to maintain a safe or compliant state of operation, the sequence of fault detection and reaction needs to take place before the hazardous or non-compliant effect occurs. The resulting rating describes the ability to maintain a safe or compliant state of operation.
Monitoring is a relative rating within the scope of the individual FMEA and is determined without regard for Severity or Frequency. Monitoring should be estimated using the criteria in the table below. This table may be augmented with examples of common monitoring. The FMEA project team should agree on evaluation criteria and a rating system which is consistent, even if modified for individual product analysis. The assumption is that Monitoring is implemented and tested as designed. The effectiveness of Monitoring depends on the design of the sensor hardware, sensor redundancy, and the diagnostic algorithms that are implemented. Plausibility metrics alone are not considered to be effective. Implementation of monitoring and the verification of effectiveness should be part of the development process and therefore may be analyzed in the corresponding DFMEA of the product. The effectiveness of diagnostic monitoring and response, the fault monitoring response time, and the Fault Tolerant Time Interval need to be determined prior to rating. Determination of the effectiveness of diagnostic monitoring is addressed in detail in ISO 26262-5:2018 Annex D.
In practice, three different monitoring/response cases may be distinguished:

If there is no monitoring control, or if monitoring and response do not occur within the Fault Handling Time Interval, then Monitoring should be rated as Not Effective (M=10).

The original Failure Effect is virtually eliminated; only the mitigated Failure Effect remains relevant for the risk estimation of the product or system. In this instance, only the mitigated FE is relevant for the Action Priority rating, not the original FE. The assignment of Monitoring ratings to Failure Causes and their corresponding Monitoring Controls can vary depending on:

  • Variations in the Failure Cause or Failure Mode
  • Variations in the hardware implemented for diagnostic monitoring
  • The execution timing of the safety mechanism, e.g., failure is detected during "power-up" only
  • Variations in system response
  • Variations in human perception and reaction
  • Knowledge of implementation and effectiveness from other projects (newness)

Depending on these variations or execution timing, Monitoring Controls may not be considered to be RELIABLE in the sense of M=1.

The original Failure Effect occurs less often: most of the failures are detected, and the system response leads to a mitigated Failure Effect. The reduced risk is represented by the Monitoring rating. The most serious Failure Effect remains S=10.

Supplemental FMEA for Monitoring and System Response (M)
Monitoring Criteria (M) for Failure Causes, Failure Modes, and Failure Effects by Monitoring during Customer Operation. Use the rating number that corresponds to the least effective of the two criteria, Monitoring or System Response. (The Corporate or Product Line Examples column is left blank until filled in by the user.)

M = 10 (Not effective)
  Diagnostic Monitoring / Sensory Perception: The fault/failure cannot be detected at all, or not during the Fault Handling Time Interval, by the system, the driver, passenger, or service technician.
  System Response / Human Reaction: No response during the Fault Handling Time Interval.
M = 9 (Very low)
  Diagnostic Monitoring / Sensory Perception: The fault/failure can almost never be detected in relevant operating conditions. Monitoring control with low effectiveness, high variance, or high uncertainty. Minimal diagnostic coverage.
  System Response / Human Reaction: The reaction to the fault/failure by the system or the driver may not reliably occur during the Fault Handling Time Interval.
M = 8 (Low)
  Diagnostic Monitoring / Sensory Perception: The fault/failure can be detected in very few relevant operating conditions. Monitoring control with low effectiveness, high variance, or high uncertainty. Diagnostic coverage estimated <60%.
  System Response / Human Reaction: The reaction to the fault/failure by the system or the driver may not always occur during the Fault Handling Time Interval.
M = 7 (Moderately low)
  Diagnostic Monitoring / Sensory Perception: Low probability of detecting the fault/failure during the Fault Handling Time Interval by the system or the driver. Monitoring control with low effectiveness, high variance, or high uncertainty. Diagnostic coverage estimated >60%.
  System Response / Human Reaction: Low probability of reacting to the detected fault/failure during the Fault Handling Time Interval by the system or the driver.
M = 6 (Moderate)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system or the driver only during power-up, with medium variance in detection time. Diagnostic coverage estimated >90%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure in many operating conditions.
M = 5 (Moderate)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with medium variance in detection time, or detected by the driver in very many operating conditions. Diagnostic coverage estimated between 90% and 97%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure during the Fault Handling Time Interval in very many operating conditions.
M = 4 (Moderately high)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with medium variance in detection time, or detected by the driver in most operating conditions. Diagnostic coverage estimated >97%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions.
M = 3 (High)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with very low variance in detection time and with high probability. Diagnostic coverage estimated >99%.
  System Response / Human Reaction: The system will automatically react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions, with very low variance in system response time and with a high probability.
M = 2 (Very high)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with very low variance in detection time and with very high probability. Diagnostic coverage estimated >99.9%.
  System Response / Human Reaction: The system will automatically react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions, with very low variance in system response time and with a very high probability.
M = 1 (Reliable and acceptable for elimination of original failure effect)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will always be automatically detected by the system. Diagnostic coverage estimated to be significantly greater than 99.9%.
  System Response / Human Reaction: The system will always automatically react to the detected fault/failure during the Fault Handling Time Interval.
SUPPLEMENTAL FMEA-MSR MONITORING (M)
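The "least effective" selection rule in the table heading can be sketched as a small helper. This is a minimal illustration; the function name and the way the two criteria are passed in are assumptions, not part of the handbook:

```python
def monitoring_rating(diagnostic_rating: int, response_rating: int) -> int:
    """Return the Monitoring (M) rating per the table's selection rule:
    use the rating number that corresponds to the least effective of the
    Diagnostic Monitoring criterion and the System Response criterion.
    Higher numbers mean less effective, so this is simply the maximum.
    """
    for r in (diagnostic_rating, response_rating):
        if not 1 <= r <= 10:
            raise ValueError("M ratings run from 1 (reliable) to 10 (not effective)")
    return max(diagnostic_rating, response_rating)

# Example: detection is High (3) but the automated response is only
# Moderate (5), so the overall M rating is 5.
print(monitoring_rating(3, 5))  # -> 5
```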

5.8 Action Priority (AP) for FMEA-MSR

The Action Priority is a methodology which allows for the prioritization of the need for action, considering Severity, Frequency, and Monitoring (SFM). This is done by the assignment of SFM ratings which provide a basis for the estimation of risk.

  • Priority High (H): Highest priority for review and action. The team needs to either identify an appropriate action to lower frequency and/or to improve monitoring controls or justify and document why current controls are adequate.
  • Priority Medium (M): Medium priority for review and action. The team should identify appropriate actions to lower frequency and/or to improve monitoring controls, or, at the discretion of the company, justify and document why controls are adequate.
  • Priority Low (L): Low priority for review and action. The team could identify actions to lower frequency and/or to improve monitoring controls.

It is recommended that potential Severity 9-10 failure effects with Action Priority High and Medium, at a minimum, be reviewed by management, including any recommended actions that were taken. This is not the prioritization of High, Medium, or Low risk; it is the prioritization of the need for actions to reduce risk.
NOTE: It may be helpful to include a statement such as “No further action is needed” in the Remarks field as appropriate.
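The minimum management-review rule above can be expressed as a one-line predicate. The function name is illustrative; only the stated minimum (Severity 9-10 with AP High or Medium) is encoded:

```python
def requires_management_review(severity: int, action_priority: str) -> bool:
    """Minimum review rule: potential Severity 9-10 failure effects with
    Action Priority High (H) or Medium (M) should be reviewed by management."""
    return severity >= 9 and action_priority in ("H", "M")

print(requires_management_review(10, "H"))  # -> True
print(requires_management_review(8, "H"))   # -> False (below the stated minimum)
```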

Action Priority is based on combinations of Severity, Frequency, and Monitoring ratings in order to prioritize actions for risk reduction.
Columns: Effect and Severity (S); Prediction of Failure Cause occurring during service life of vehicle (F); Effectiveness of Monitoring (M); Action Priority (AP).

Product or Plant Effect Very high (S = 10)
  F = 5-10 (Medium – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 4 (Low): M = 4-10 (Moderately high – Not effective): AP = H
  F = 4 (Low): M = 2-3 (Very high – High): AP = H
  F = 4 (Low): M = 1 (Reliable): AP = M
  F = 3 (Very low): M = 4-10 (Moderately high – Not effective): AP = H
  F = 3 (Very low): M = 2-3 (Very high – High): AP = M
  F = 3 (Very low): M = 1 (Reliable): AP = L
  F = 2 (Extremely low): M = 4-10 (Moderately high – Not effective): AP = M
  F = 2 (Extremely low): M = 1-3 (Reliable – High): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect High (S = 9)
  F = 4-10 (Low – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 2-3 (Extremely low – Very low): M = 2-10 (Very high – Not effective): AP = H
  F = 2-3 (Extremely low – Very low): M = 1-3 (Reliable – High): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Moderately high (S = 7-8)
  F = 6-10 (Medium – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5 (Medium): M = 5-10 (Moderate – Not effective): AP = H
  F = 5 (Medium): M = 1-4 (Reliable – Moderately high): AP = M
  F = 4 (Low): M = 7-10 (Moderately low – Not effective): AP = H
  F = 4 (Low): M = 4-6 (Moderately high – Moderate): AP = M
  F = 4 (Low): M = 1-3 (Reliable – High): AP = L
  F = 3 (Very low): M = 9-10 (Very low – Not effective): AP = H
  F = 3 (Very low): M = 7-8 (Moderately low – Low): AP = M
  F = 3 (Very low): M = 1-6 (Reliable – Moderate): AP = L
  F = 2 (Extremely low): M = 7-10 (Moderately low – Not effective): AP = M
  F = 2 (Extremely low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Moderately low (S = 4-6)
  F = 7-10 (High – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5-6 (Medium): M = 6-10 (Moderate – Not effective): AP = H
  F = 5-6 (Medium): M = 1-5 (Reliable – Moderate): AP = M
  F = 2-4 (Extremely low – Low): M = 9-10 (Very low – Not effective): AP = M
  F = 2-4 (Extremely low – Low): M = 7-8 (Moderately low – Low): AP = M
  F = 2-4 (Extremely low – Low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Low (S = 2-3)
  F = 7-10 (High – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5-6 (Medium): M = 7-10 (Moderately low – Not effective): AP = M
  F = 5-6 (Medium): M = 1-6 (Reliable – Moderate): AP = L
  F = 2-4 (Extremely low – Low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Very low (S = 1)
  F = 1-10 (Very low – Very high): M = 1-10 (Reliable – Not effective): AP = L
ACTION PRIORITY FOR FMEA-MSR
  • NOTE 1: If M=1, the Severity rating of the Failure Effect after Monitoring and System Response is to be used for determining MSR Action Priority. If M is not equal to 1, then the Severity rating of the original Failure Effect is to be used for determining MSR Action Priority.
  • NOTE 2: When FMEA-MSR is used and M=1, then DFMEA Action Prioritization replaces the Severity rating of the original Failure Effect with the Severity rating of the mitigated Failure Effect.
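As a worked illustration, the table lookup and the Severity-selection rule in NOTE 1 can be combined in a short sketch. This encodes only the S=10 band of the table above as an excerpt; the function names are assumptions for illustration, and a full implementation would cover every (S, F, M) combination:

```python
def msr_action_priority(s: int, f: int, m: int) -> str:
    """Return H/M/L for the S=10 band of the FMEA-MSR Action Priority table
    (excerpt only; other Severity bands are not encoded here)."""
    if f == 1:            # Failure Cause cannot occur
        return "L"
    if s == 10:
        if f >= 5:        # Medium - Extremely high frequency
            return "H"
        if f == 4:        # Low frequency
            return "H" if m >= 2 else "M"
        if f == 3:        # Very low frequency
            return "H" if m >= 4 else ("M" if m >= 2 else "L")
        if f == 2:        # Extremely low frequency
            return "M" if m >= 4 else "L"
    raise NotImplementedError("only the S=10 excerpt is encoded")

def severity_for_ap(m: int, s_original: int, s_mitigated: int) -> int:
    """NOTE 1: if M=1, use the Severity of the mitigated Failure Effect;
    otherwise use the Severity of the original Failure Effect."""
    return s_mitigated if m == 1 else s_original

# Example: S=10 effect, very low frequency (F=3), high monitoring (M=2).
print(msr_action_priority(10, 3, 2))  # -> M
```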
Example of FMEA-MSR Risk Analysis – Evaluation of Current Risk Form Sheet

Step 6: Optimization

6.1 Purpose

The primary objective of Optimization in FMEA-MSR is to develop actions that reduce risk and improve safety. In this step, the team reviews the results of the risk analysis and evaluates action priorities. The main objectives of FMEA-MSR Optimization are:

  • Identification of the actions necessary to reduce risks
  • Assignment of responsibilities and target completion dates for action implementation
  • Implementation and documentation of actions taken including confirmation of the effectiveness of the implemented actions and assessment of risk after actions taken.
  • Collaboration between the FMEA team, management, customers, and suppliers regarding potential failures
  • Basis for refinement of the product requirements and prevention/detection controls

High and medium action priorities may indicate a need for technical improvement. Improvements may be achieved by introducing more reliable components, which reduce the occurrence potential of the Failure Cause in the field, or by introducing additional monitoring, which improves the detection capabilities of the system. Introducing monitoring is similar to a design change, but the Frequency of the Failure Cause is not changed. It may also be possible to eliminate the Failure Effect by introducing redundancy. If the team decides that no further actions are necessary, “No further action is needed” is written in the Remarks field to show the risk analysis was completed. The optimization is most effective in the following order:

  • Component design modifications in order to reduce the Frequency (F) of the Failure Cause (FC)
  • Increase the Monitoring (M) ability for the Failure Cause (FC) or Failure Mode (FM).

In the case of design modifications, all impacted design elements are evaluated again.
In the case of concept modifications, all steps of the FMEA are reviewed for the affected sections. This is necessary because the original analysis is no longer valid, since it was based upon a different design concept.

6.2 Assignment of Responsibilities

Each action should have a responsible individual and a Target Completion Date (TCD) associated with it. The responsible person ensures the action status is updated; if the action is confirmed, this person is also responsible for the action implementation. The Actual Completion Date is documented, including the date the actions are implemented. Target Completion Dates should be realistic (i.e., in accordance with the product development plan, prior to process validation, prior to start of production).

6.3 Status of the Actions

Suggested levels for Status of Actions:

  • Open: No Action defined.
  • Decision pending (optional): The action has been defined but has not yet been decided on. A decision paper is being created.
  • Implementation pending (optional): The action has been decided on but not yet implemented.
  • Completed: Completed actions have been implemented and their effectiveness has been demonstrated and documented. A final evaluation has been done.
  • Not Implemented: Not Implemented status is assigned when a decision is made not to implement an action. This may occur when risks related to practical and technical limitations are beyond current capabilities.

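The responsibilities in 6.2 and the status levels above can be modeled as a minimal record. This is a sketch under assumptions; the class and field names are illustrative, not prescribed by the handbook:

```python
from dataclasses import dataclass
from typing import Optional

# Status levels from section 6.3 (two of them optional).
STATUSES = (
    "Open",                    # no action defined
    "Decision pending",        # action defined, not yet decided on
    "Implementation pending",  # decided on, not yet implemented
    "Completed",               # implemented, effectiveness demonstrated
    "Not Implemented",         # decision made not to implement
)

@dataclass
class OptimizationAction:
    description: str
    responsible_person: str            # each action has a responsible individual (6.2)
    target_completion_date: str        # TCD, e.g. "2024-06-30" (illustrative format)
    status: str = "Open"
    actual_completion_date: Optional[str] = None

    def set_status(self, new_status: str) -> None:
        """Update the action status, rejecting anything outside 6.3's levels."""
        if new_status not in STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        self.status = new_status

# Hypothetical example action:
action = OptimizationAction(
    description="Introduce additional monitoring of sensor plausibility",
    responsible_person="J. Smith",
    target_completion_date="2024-06-30",
)
action.set_status("Implementation pending")
print(action.status)  # -> Implementation pending
```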
The FMEA is not considered “complete” until the team assesses each item’s Action Priority and either accepts the level of risk or documents closure of all actions. Closure of all actions should be documented before the FMEA is released at Start of Production (SOP). If “No Action Taken”, then the Action Priority is not reduced and the risk of failure is carried forward into the product design.

6.4 Assessment of Action Effectiveness

When an action has been completed, the Frequency and Monitoring values are reassessed, and a new Action Priority may be determined. The new action receives a preliminary Action Priority rating as a prediction of effectiveness. However, the status of the action remains “implementation pending” until the effectiveness has been tested. After the tests are finalized, the preliminary rating has to be confirmed or adapted, when indicated. The status of the action is then changed from “implementation pending” to “completed.” The reassessment should be based on the effectiveness of the MSR Preventive and Diagnostic Monitoring Actions taken, and the new values are based on the definitions in the FMEA-MSR Frequency and Monitoring rating tables.

6.5 Continuous Improvement

FMEA-MSR serves as a historical record for the design. Therefore, the original Severity, Frequency, and Monitoring (S, F, M) numbers are not modified once actions have been taken. The completed analysis becomes a repository to capture the progression of design decisions and design refinements. However, original S, F, M ratings may be modified for basis, family, or generic DFMEAs, because the information is used as a starting point for an application-specific analysis.

Example of FMEA-MSR Optimization with new Risk Evaluation Form Sheet

Step 7: Results Documentation

7.1 Purpose

The purpose of the results documentation step is to summarize and communicate the results of the Failure Mode and Effects Analysis activity. The main objectives of FMEA-MSR Results Documentation are:

  • Communication of results and conclusions of the analysis.
  • Establishment of the content of the documentation
  • Documentation of actions taken including confirmation of the effectiveness of the implemented actions and assessment of risk after actions taken
  • Communication of actions taken to reduce risks, including within the organization and with customers and/or suppliers as appropriate
  • Record of risk analysis and reduction to acceptable levels

7.2 FMEA Report

The scope and results of an FMEA should be summarized in a report. The report can be used for communication purposes within a company, or between companies. The report is not meant to replace reviews of the FMEA-MSR details when requested by management, customers, or suppliers. It is meant to be a summary for the FMEA-MSR team and others to confirm completion of each of the tasks and review the results of the analysis. It is important that the content of the documentation fulfills the requirements of the organization, the intended reader, and relevant stakeholders. Details may be agreed upon between the parties. In this way, it is also ensured that all details of the analysis and the intellectual property remain at the developing company. The layout of the document may be company-specific. However, the report should indicate the technical risk of failure as a part of the development plan and project milestones. The content may include the following:

  1. A statement of final status compared to original goals established in the Project Plan
    • FMEA Intent – Purpose of this FMEA?
    • FMEA Timing – FMEA due date?
    • FMEA Team – List of participants?
    • FMEA Task – Scope of this FMEA?
    • FMEA Tool – What method was used to conduct the analysis?
  2. A summary of the scope of the analysis, identifying what is new.
  3. A summary of how the functions were developed.
  4. A summary of at least the high-risk failures as determined by the team and provide a copy of the specific S/F/M rating tables and method of action prioritization (e.g. Action Priority table).
  5. A summary of the actions taken and/or planned to address the high-risk failures, including the status of those actions.
  6. A plan and commitment of timing for ongoing FMEA improvement actions.
    • Commitment and timing to close open actions.
    • Commitment to review and revise the FMEA-MSR during mass production to ensure the accuracy and completeness of the analysis as compared with the original production design (e.g. revisions triggered from design changes, corrective actions, etc., based on company procedures).
    • Commitment to capture “things gone wrong” in foundation FMEA-MSRs for the benefit of future analysis reuse, when applicable.
Standard FMEA-MSR Form Sheet
FMEA-MSR Software View
