AIAG & VDA FMEA For Monitoring And System Response (FMEA-MSR)

In a Supplemental FMEA for Monitoring and System Response, potential Failure Causes which might occur under customer operating conditions are analyzed with respect to their technical effects on the system, vehicle, people, and regulatory compliance. The method considers whether Failure Causes or Failure Modes are detected by the system, or whether Failure Effects are detected by the driver. Customer operation is to be understood as end-user operation, in-service operation, and maintenance operations.
FMEA-MSR includes the following elements of risk:

  1. Severity of harm, regulatory noncompliance, loss or degradation of functionality, and unacceptable quality; represented by (S)
  2. Estimated frequency of a Failure Cause in the context of an operational situation; represented by (F)
  3. Technical possibilities to avoid or limit the Failure Effect via diagnostic detection and automated response, combined with human possibilities to avoid or limit the Failure Effect via sensory perception and physical reaction; represented by (M)

The combination of F and M is an estimate of the probability of occurrence of the Failure Effect due to the Fault (Failure Cause), and resulting malfunctioning behavior (Failure Mode).
NOTE: The overall probability of a Failure Effect to occur may be higher, because different Failure Causes may lead to the same Failure Effect.
FMEA-MSR adds value by assessing risk reduction as a result of monitoring and response. FMEA-MSR evaluates the current state of risk of failure and derives the necessity for additional monitoring by comparison with the conditions for acceptable residual risk.
The analysis can be part of a Design FMEA in which the aspects of Development are supplemented by aspects of Customer Operation. However, it is usually only applied when diagnostic detection is necessary to maintain safety or compliance. Detection in DFMEA is not the same as Monitoring in Supplemental FMEA-MSR. In DFMEA, Detection Controls document the ability of testing to demonstrate the fulfillment of requirements in development and validation. For monitoring that is already part of the system design, validation is intended to demonstrate that diagnostic monitoring and system response work as intended. Conversely, Monitoring in FMEA-MSR assesses the effectiveness of fault detection performance in customer operation, assuming that specifications are fulfilled. The Monitoring rating also comprehends the safe performance and reliability of system reactions to monitored faults. It contributes to the assessment of the fulfillment of Safety Goals and may be used for deriving the Safety Concept.
Supplemental FMEA-MSR addresses risks that in DFMEA would otherwise be assessed as High, by considering additional factors which more accurately reflect the lower assessed risk attributable to the diagnostic functions of the vehicle operating system. These additional factors contribute to an improved depiction of the risk of failure (including risk of harm, risk of noncompliance, and risk of not fulfilling specifications). FMEA-MSR contributes to the provision of evidence of the ability of the diagnostic, logical, and actuation mechanisms to achieve and maintain a safe or compliant state (in particular, appropriate failure mitigation ability within the maximum fault handling time interval and within the fault tolerant time interval). FMEA-MSR evaluates the current state of risk of failure under end-user conditions (not just risk of harm to persons).
The detection of faults/failures during customer operation can be used to avoid the original Failure Effect by switching to a degraded operational state (including disabling the vehicle), informing the driver, and/or writing a Diagnostic Trouble Code (DTC) into the control unit for service purposes. In terms of FMEA, the result of RELIABLE diagnostic detection and response is to eliminate (prevent) the original effect and replace it with a new, less severe effect. FMEA-MSR is useful in deciding whether the system design fulfills the performance requirements with respect to safety and compliance. The results may include items such as:

  • additional sensor(s) may be needed for monitoring purposes
  • redundancy in processing may be needed
  • plausibility checks may reveal sensor malfunctions

Step 1: Planning and Preparation

1.1 Purpose

The main objectives of Planning and Preparation in FMEA-MSR are:

  • Project identification
  • Project plan: InTent, Timing, Team, Tasks, Tools (5T)
  • Analysis boundaries: What is included and excluded from the analysis
  • Identification of baseline FMEA
  • Basis for the Structure Analysis step

1.2 FMEA-MSR Project Identification and Boundaries

FMEA-MSR project identification includes a clear understanding of what needs to be evaluated. This involves a decision-making process to define the FMEA-MSRs that are needed for a customer program. What to exclude can be just as important as what to include in the analysis. The following may assist the team in defining FMEA-MSR projects, as applicable:

  • Hazard Analysis and Risk Assessment.
  • Legal Requirements
  • Technical Requirements
  • Customer wants/needs/expectation (external and internal customers)
  • Requirements specification
  • Diagrams (Block/Boundary/System)
  • Schematics, Drawings, and/or 3D Models
  • Bill of Materials (BOM), Risk Assessment
  • Previous FMEA for similar products

Answers to these questions and others defined by the company help create the list of FMEA-MSR projects needed. The FMEA-MSR project list assures consistent direction, commitment, and focus. Below are some basic questions that help identify FMEA-MSR boundaries:

  1. After completing a DFMEA on an Electrical/Electronic/Programmable Electronic System, are there effects that may be harmful to persons or involve regulatory noncompliance?
  2. Did the DFMEA indicate that all of the causes which lead to harm or noncompliance can be detected by direct sensing and/or plausibility algorithms?
  3. Did the DFMEA indicate that the intended system response to any and all of the detected causes is to switch to a degraded operational state (including disabling the vehicle), inform the driver, and/or write a Diagnostic Trouble Code (DTC) into the control unit for service purposes?

FMEA for Monitoring and System Response may be used to examine systems which have integrated fault monitoring and response mechanisms during operation. Typically, these are more complex systems composed of sensors, actuators, and logical processing units. The diagnosis and monitoring in such systems may be achieved through hardware and/or software. Systems that may be considered in a Supplemental FMEA for Monitoring and System Response consist in general of at least a sensor, a control unit, and an actuator, or a subset of them, and are called mechatronic systems. Systems in scope may also contain mechanical hardware components (e.g., pneumatics and hydraulics).

Generic-block diagram of an Electrical / Electronic / Programmable Electronic system

The scope of a Supplemental FMEA for Monitoring and System Response may be established in consultation between customer and supplier. Applicable scoping criteria may include, but are not limited to:

  • System Safety relevance
  • ISO Standards, i.e., Safety Goals according to ISO 26262
  • Documentation requirements from legislative bodies, e.g., UN/ECE Regulations, FMVSS/CMVSS, NHTSA, and On-Board Diagnostic (OBD) compliance requirements

1.3 FMEA-MSR Project Plan

A plan for the execution of the FMEA-MSR should be developed once the FMEA-MSR project is known. It is recommended that the 5T method (InTent, Timing, Team, Tasks, Tools) be used. The plan for the FMEA-MSR helps the company be proactive in starting the FMEA-MSR early. The FMEA-MSR activities (5-step process) should be incorporated into the overall design project plan.

Step 2 : Structure Analysis

2.1 Purpose

The main objectives of Structure Analysis in FMEA-MSR are:

  • Visualization of the analysis scope
  • Structure tree or equivalent: block diagram, boundary diagram, digital model, physical parts
  • Identification of design interfaces, interactions
  • Collaboration between customer and supplier engineering teams (interface responsibilities)
  • Basis for the Function Analysis step

Depending on the scope of analysis, the structure may consist of hardware elements and software elements. Complex structures may be split into several structures (work packages) or different layers of block diagrams and analyzed separately for organizational reasons or to ensure sufficient clarity. The scope of the FMEA-MSR is limited to the elements of the system for which the baseline DFMEA showed that there are causes of failure which can result in hazardous or non-compliant effects. The scope may be expanded to include signals received by the control unit. In order to visualize a system structure, two methods are commonly used:

  • Block (Boundary) Diagrams
  • Structure Trees

2.2 Structure Trees

In a Supplemental FMEA for Monitoring and System Response, the root element of a structure tree can be at the vehicle level (e.g., for OEMs which analyze the overall system) or at the system level (e.g., for suppliers which analyze a subsystem or component).

Example of a structure tree of a window lift system for investigating erroneous signals, monitoring, and system response

The sensor element and the control unit may also be part of one component (smart sensor). Diagnostics and monitoring in such systems may be realized by hardware and/or software elements.

Example of a structure tree of a smart sensor with an internal sensing element and output to an interface

In case there is no sensor within the scope of analysis, an Interface Element is used to describe the data/current/voltage received by the ECU. One function of any ECU is to receive signals via a connector. These signals can be missing or erroneous; with no monitoring, the result is erroneous output. In case there is no actuator within the scope of analysis, an Interface Element is used to describe the data/current/voltage sent by the ECU. Another function of any ECU is to send signals, e.g., via a connector. These signals can also be missing or erroneous, and the output can also be "no output" or "failure information." The causes of erroneous signals may be within a component which is outside the scope of responsibility of the engineer or organization, yet these erroneous signals may have an effect on the performance of a component which is within that scope of responsibility. It is therefore necessary to include such causes in the FMEA-MSR analysis.
NOTE: Ensure that the structure is consistent with the Safety Concept (as applicable).

STRUCTURE ANALYSIS (STEP 2)
1. Next Higher Level | 2. Focus Element | 3. Next Lower Level or Characteristic Type
Window Lift System | ECU Window Lifter | Connector ECU Window Lifter
Example of Structure Analysis in the FMEA-MSR Form Sheet

Step 3 : Function Analysis

The main objectives of Function Analysis in FMEA-MSR are:

  • Visualization of functions and relationships between functions in a function tree/function net, or equivalent parameter diagram (P-diagram)
  • Cascade of customer (external and internal) functions with associated requirements
  • Association of requirements or characteristics to functions
  • Collaboration between engineering teams (systems, safety, and components)
  • Basis for the Failure Analysis step

In a Supplemental FMEA for Monitoring and System Response, monitoring for failure detection and failure responses are considered as functions. Hardware and software functions may include monitoring of system states. Functions for monitoring and detection of faults/failures may consist of, for example: out-of-range detection, cyclic redundancy checks, plausibility checks, and sequence counter checks. Functions for failure reactions may consist of, for example, provision of default values, switching to a limp-home mode, switching off the corresponding function, and/or display of a warning. Such functions are modeled for those structural elements that are carriers of these functions, i.e., control units or components with computational abilities like smart sensors. Additionally, sensor signals which are received by control units can be considered; therefore, functions of signals may be described as well. Finally, functions of actuators can be added, which describe the way the actuator or vehicle reacts on demand. Performance requirements are assumed to be the maintenance of a safe or compliant state. Fulfillment of requirements is assessed through the risk assessment. In case sensors and/or actuators are not within the scope of analysis, functions are assigned to the corresponding interface elements (consistent with the Safety Concept, as applicable).
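The monitoring and reaction functions listed above can be sketched in code. The following Python fragment is purely illustrative (the sensor range, names, and default value are assumptions, not taken from the handbook): an out-of-range detection on a sensor signal, with provision of a default value as the failure reaction.

```python
# Illustrative sketch only: the voltage band and default value are assumed,
# not specified by the handbook.
SENSOR_MIN_V, SENSOR_MAX_V = 0.5, 4.5   # assumed valid Hall-sensor voltage band
DEFAULT_VALUE = 0.0                      # assumed safe default provided on fault

def out_of_range(signal_v: float) -> bool:
    """Out-of-range detection: flag a signal outside the valid electrical band."""
    return not (SENSOR_MIN_V <= signal_v <= SENSOR_MAX_V)

def condition_signal(signal_v: float) -> tuple[float, bool]:
    """Return (value, fault_flag); the failure reaction here is the
    provision of a default value when a fault is detected."""
    if out_of_range(signal_v):
        return DEFAULT_VALUE, True
    return signal_v, False
```

In a real ECU, the reaction might instead be switching to a limp-home mode or displaying a warning, as the text notes.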

Example of a Structure Tree with functions
FUNCTION ANALYSIS (STEP 3)
1. Next Higher Level Function and Requirement | 2. Focus Element Function and Requirement | 3. Next Lower Level Function and Requirement or Characteristic Type
Provide anti-pinch protection for comfort closing mode | Provide signal to stop and reverse window lifter motor in case of pinch situation | Transmit signal from Hall effect sensor to ECU
Example of Function Analysis in FMEA-MSR Form Sheet.

Step 4: Failure Analysis

4.1 Purpose

The purpose of Failure Analysis in FMEA-MSR is to describe the chain of events which lead up to the end effect, in the context of a relevant scenario. The main objectives of Failure Analysis in FMEA-MSR are:

  • Establishment of the failure chain
  • Potential Failure Cause, Monitoring, System Response, Reduced Failure Effect
  • Identification of product Failure Causes using a parameter diagram or failure network
  • Collaboration between customer and supplier (Failure Effects)
  • Basis for the documentation of failures in the FMEA form sheet and the Risk Analysis step

4.2 Failure Scenario

A Failure Scenario comprises a description of relevant operating conditions in which a fault results in malfunctioning behavior, and possible sequences of events (system states) that lead to an end system state (Failure Effect). It starts from defined Failure Causes and leads to the Failure Effects.

Theoretical failure chain model DFMEA and FMEA-MSR

The focus of the analysis is a component with diagnostic capabilities, e.g., an ECU. If the component is not capable of detecting the fault/failure, the Failure Mode will occur which leads to the end effect with a corresponding degree of Severity. However, if the component can detect the failure, this leads to a system response with a Failure Effect with a lower Severity compared to the original Failure Effect. Details are described in the following scenarios (1) to (3).

Failure Scenario (1) – Non-Hazardous

Failure Scenario (1) describes the malfunctioning behavior from the occurrence of the fault to the Failure Effect, which in this example is not hazardous but may reach a non-compliant end system state.

Failure Scenario (2) – Hazardous

Failure Scenario (2) describes the malfunctioning behavior from the occurrence of the fault to the Failure Effect, which in this example leads to a hazardous event. As an aspect of the Failure Scenario, it is necessary to estimate the magnitude of the Fault Handling Time Interval (time between the occurrence of the fault, and the occurrence of the hazard/non-compliant Failure Effect). The Fault Handling Time Interval is the maximum time span of malfunctioning behavior before a hazardous event occurs, if the safety mechanisms are not activated.
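The timing condition described above can be made concrete: detection plus system response must complete within the Fault Handling Time Interval, otherwise the hazardous event is not prevented. The sketch below is an illustration of that relation only; the function name and millisecond units are assumptions, not handbook definitions.

```python
def response_within_fhti(detection_time_ms: float,
                         reaction_time_ms: float,
                         fault_handling_interval_ms: float) -> bool:
    """Check whether fault detection plus the system reaction complete
    before the hazardous or non-compliant effect can occur, i.e., within
    the Fault Handling Time Interval (FHTI)."""
    return detection_time_ms + reaction_time_ms <= fault_handling_interval_ms
```

If this check fails for a given monitoring mechanism, the mechanism cannot mitigate the original Failure Effect, which corresponds to the "not effective" monitoring case discussed in Step 5.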

Failure Scenario (3) – Mitigated Effect

Failure Scenario (3) describes the malfunctioning behavior from the occurrence of the fault to the mitigated Failure Effect, which in this example leads to a loss or degradation of a function instead of the hazardous event.

4.3 Failure Cause

The description of the Failure Cause is the starting point of the Failure Analysis in a Supplemental FMEA for Monitoring and System Response. The Failure Cause is assumed to have occurred and is not the true root cause. Typical Failure Causes are electrical/electronic faults (E/E faults). Root causes may be insufficient robustness when exposed to various factors such as the external environment, vehicle dynamics, wear, service, stress cycling, data bus overloading, erroneous signal states, etc. Failure Causes can be derived from the DFMEA, catalogues for failures of E/E components, and network communication data descriptions.

NOTE: In FMEA-MSR, diagnostic monitoring is assumed to function as intended (however, it may not be effective). Therefore, Failure Causes of diagnostics are not part of FMEA-MSR but can be added to the DFMEA section of the form sheet. These include: failed to detect fault; falsely detected fault (nuisance); unreliable fault response (variation in response capability).

Teams may decide not to include failures of diagnostic monitoring in DFMEA because Occurrence ratings are most often very low (including "latent faults," ref. ISO 26262); therefore, this analysis may be of limited value. However, the correct implementation of diagnostic monitoring should be part of the test protocol. Prevention Controls of diagnostics in a DFMEA describe how reliably a mechanism is estimated to detect the Failure Cause and react in time with respect to the performance requirements. Detection Controls of diagnostics in a DFMEA would relate back to development tests which verify the correct implementation and the effectiveness of the monitoring mechanism.

4.4 Failure Mode

A Failure Mode is the consequence of the fault (Failure Cause). In FMEA-MSR two possibilities are considered:

  1. In case of failure scenarios (1) and (2) the fault is not detected or the system reaction is too late. Therefore, the Failure Mode in FMEA-MSR is the same as in DFMEA.
  2. Failure scenario (3) is different: the fault is detected and the system response leads to a mitigated Failure Effect. In this case a description of the diagnostic monitoring and system response is added to the analysis. Because the failure chain in this specific possibility consists of a fault/failure and a description of an intended behavior, this is called a hybrid failure chain or hybrid failure network.

4.5 Failure Effect

A Failure Effect is defined as the consequence of a Failure Mode. Failure Effects in FMEA-MSR are either a malfunctioning behavior of the system or an intended behavior after detection of a Failure Cause. The end effect may be a "hazard" or "non-compliant state" or, in case of detection and timely system response, a "safe state" or "compliant state" with loss or degradation of a function. The Severity of Failure Effects is evaluated on a ten-point scale.

FAILURE ANALYSIS (STEP 4)
1. Failure Effect (FE) to the Next Higher Level Element and/or End User | 2. Failure Mode (FM) of the Focus Element | 3. Failure Cause (FC) of the Next Lower Level Element or Characteristic
No anti-pinch protection in comfort closing mode. (Hand or neck may be pinched between window glass and frame.) | No signal to stop and reverse window lifter motor in case of pinch situation | Signal of Hall effect sensor is not transmitted to ECU due to poor connection of Hall effect sensor
Example of Failure Analysis in FMEA-MSR Form Sheet.

Step 5: Risk Analysis

5.1 Purpose

The purpose of Risk Analysis in FMEA-MSR is to estimate the risk of failure by evaluating Severity, Frequency, and Monitoring, and to prioritize the need for actions to reduce risk. The main objectives of the FMEA-MSR Risk Analysis are:

  • Assignment of existing and/or planned controls and rating of failures
  • Assignment of Prevention Controls to the Failure Causes
  • Assignment of Detection Controls to the Failure Causes and/or Failure Modes
  • Rating of Severity, Frequency, and Monitoring for each failure chain
  • Evaluation of Action Priority
  • Collaboration between customer and supplier (Severity)
  • Basis for the Optimization step

5.2 Evaluations

Each Failure Mode, Cause and Effect relationship (failure chain or hybrid network) is assessed by the following three criteria:

  • Severity (S): represents the Severity of the Failure Effect
  • Frequency (F): represents the Frequency of Occurrence of the Cause in a given operational situation, during the intended service life of the vehicle
  • Monitoring (M): represents the Detection potential of the Diagnostic Monitoring functions (detection of Failure Cause, Failure Mode and/or Failure Effect)

Evaluation numbers from 1 to 10 are used for S, F, and M respectively, where 10 stands for the highest risk contribution. By examining these ratings individually and in combinations of the three factors, the need for risk-reducing actions may be prioritized.
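A failure chain with its three ratings can be represented as a small data structure. The prioritization rule below is a deliberately simplified stand-in (the handbook's actual Action Priority table is not reproduced in this section), so the flagging logic here is an assumption for illustration only.

```python
from dataclasses import dataclass

@dataclass
class FailureChain:
    severity: int    # S: 1-10, 10 = highest risk contribution
    frequency: int   # F: 1-10
    monitoring: int  # M: 1-10

def needs_action(fc: FailureChain) -> bool:
    """Simplified illustrative rule (NOT the handbook's Action Priority
    table): flag a chain whose effect is severe and whose cause is
    neither extremely rare nor reliably monitored."""
    for rating in (fc.severity, fc.frequency, fc.monitoring):
        assert 1 <= rating <= 10, "S, F, and M are rated on a 1-10 scale"
    return fc.severity >= 9 and fc.frequency > 1 and fc.monitoring > 1
```

In practice the team would replace `needs_action` with a lookup into the published Action Priority table for MSR.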

5.3 Severity (S)

The Severity rating (S) is a measure associated with the most serious Failure Effect for a given Failure Mode of the function being evaluated, and is identical for DFMEA and FMEA-MSR. Severity should be estimated using the criteria in the Severity Table. The table may be augmented to include product-specific examples. The FMEA project team should agree on evaluation criteria and a rating system which is consistent, even if modified for individual design analysis. The Severity evaluations of the Failure Effects should be transferred by the customer to the supplier, as needed.

Product General Evaluation Criteria Severity (S)
Potential Failure Effects rated according to the criteria below. (Corporate or Product Line Examples: blank until filled in by user.)

S | Effect | Severity criteria
10 | Very High | Affects safe operation of the vehicle and/or other vehicles, the health of driver or passengers or road users or pedestrians.
9 | Very High | Noncompliance with regulations.
8 | High | Loss of primary vehicle function necessary for normal driving during expected service life.
7 | High | Degradation of primary vehicle function necessary for normal driving during expected service life.
6 | Moderate | Loss of secondary vehicle function.
5 | Moderate | Degradation of secondary vehicle function.
4 | Moderate | Very objectionable appearance, sound, vibration, harshness, or haptics.
3 | Low | Moderately objectionable appearance, sound, vibration, harshness, or haptics.
2 | Low | Slightly objectionable appearance, sound, vibration, harshness, or haptics.
1 | Very Low | No discernible effect.
Supplemental FMEA-MSR SEVERITY (S)

5.4 Rationale for Frequency Rating

In a Supplemental FMEA for Monitoring and System Response, the likelihood of a failure occurring in the field under customer operating conditions during service life is relevant. Analysis of end-user operation requires the assumption that the manufacturing process is adequately controlled, in order to assess the sufficiency of the design. Examples on which a rationale may be based:

  • Evaluation based on the results of Design FMEAs
  • Evaluation based on the results of Process FMEAs
  • Field data of returns and rejected parts
  • Customer complaints
  • Warranty databases
  • Data handbooks

The rationale is documented in the column "Rationale for Frequency Rating" of the FMEA-MSR form sheet.

5.5 Frequency (F)

The Frequency rating (F) is a measure of the likelihood of occurrence of the cause in relevant operating situations during the intended service life of the vehicle or the system, using the criteria in the table below. If the Failure Cause does not always lead to the associated Failure Effect, the rating may be adapted, taking into account the probability of exposure to the relevant operating condition. In such cases, the operational situation and the rationale are to be stated in the column "Rationale for Frequency Rating."
Example: From field data it is known how often a control unit is defective in ppm/year. This may lead to F=3. The system under investigation is a parking system which is used for only a very limited time in comparison to the overall operating time, so harm to persons is only possible when the defect occurs during the parking maneuver. Therefore, Frequency may be lowered to F=2.

Frequency Potential (F) for the Product
Frequency criteria (F) for the estimated occurrence of the Failure Cause in relevant operating situations during the intended service life of the vehicle. (Corporate or Product Line Examples: blank until filled in by user.)

F | Estimated Frequency | Frequency criteria – FMEA-MSR
10 | Extremely High, or cannot be determined | Frequency of occurrence of the Failure Cause is unknown or known to be unacceptably high during the intended service life of the vehicle.
9 | High | Failure Cause is likely to occur during the intended service life of the vehicle.
8 | High | Failure Cause may occur often in the field during the intended service life of the vehicle.
7 | Medium | Failure Cause may occur frequently in the field during the intended service life of the vehicle.
6 | Medium | Failure Cause may occur somewhat frequently in the field during the intended service life of the vehicle.
5 | Medium | Failure Cause may occur occasionally in the field during the intended service life of the vehicle.
4 | Low | Failure Cause is predicted to occur rarely in the field during the intended service life of the vehicle. At least ten occurrences in the field are predicted.
3 | Very Low | Failure Cause is predicted to occur in isolated cases in the field during the intended service life of the vehicle. At least one occurrence in the field is predicted.
2 | Extremely Low | Failure Cause is predicted not to occur in the field during the intended service life of the vehicle, based on prevention and detection controls and field experience with similar parts. Isolated cases cannot be ruled out; there is no proof that it will not happen.
1 | Cannot occur | Failure Cause cannot occur during the intended service life of the vehicle, or is virtually eliminated. There is evidence that the Failure Cause cannot occur. Rationale is documented.

Percentage of relevant operating condition in comparison to overall operating time | Value by which F may be lowered
< 10% | 1
< 1% | 2
Supplemental FMEA-MSR FREQUENCY (F)

Note:

  1. Probability increases as the number of vehicles in the field increases
  2. The reference value for estimation is one million vehicles in the field
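The exposure adjustment in the table above (lower F by 1 when the relevant operating condition covers less than 10% of overall operating time, by 2 when less than 1%) can be expressed directly. This sketch is an illustration of that rule; clamping the result at 1 is an assumption, since a rating below 1 is not defined on the scale.

```python
def adjust_frequency(base_f: int, exposure_fraction: float) -> int:
    """Lower the Frequency rating per the exposure table:
    <10% of overall operating time -> F may be lowered by 1,
    <1%  of overall operating time -> F may be lowered by 2.
    The result is clamped at 1 (assumed, since the scale is 1-10)."""
    assert 1 <= base_f <= 10
    if exposure_fraction < 0.01:
        reduction = 2
    elif exposure_fraction < 0.10:
        reduction = 1
    else:
        reduction = 0
    return max(1, base_f - reduction)
```

Applied to the parking-system example in 5.5: a base rating of F=3 with, say, 5% exposure yields F=2.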

5.6 Current Monitoring Controls

All controls that are planned or already implemented and lead to a detection of the Failure Cause, the Failure Mode, or the Failure Effect by the system or by the driver are entered into the "Current Monitoring Controls" column. In addition, the fault reaction after detection should be described, e.g., provision of default values (if not already sufficiently described by the Failure Effect). Monitoring evaluates the potential that the Failure Cause, the Failure Mode, or the Failure Effect can be detected early enough that the initial Failure Effect can be mitigated before a hazard occurs or a non-compliant state is reached. The result is an end-state effect with a lower Severity.

5.7 Monitoring (M)

The Monitoring rating (M) is a measure of the ability to detect a fault/failure during customer operation and to apply the fault reaction in order to maintain a safe or compliant state. The Monitoring rating relates to the combined ability of all sensors, logic, and human sensory perception to detect the fault/failure, and to react by modifying the vehicle behavior by means of mechanical actuation and physical reaction (controllability). In order to maintain a safe or compliant state of operation, the sequence of fault detection and reaction needs to take place before the hazardous or non-compliant effect occurs. The resulting rating describes the ability to maintain a safe or compliant state of operation.
Monitoring is a relative rating within the scope of the individual FMEA and is determined without regard for Severity or Frequency. Monitoring should be estimated using the criteria in the table below. This table may be augmented with examples of common monitoring. The FMEA project team should agree on evaluation criteria and a rating system which is consistent, even if modified for individual product analysis. The assumption is that Monitoring is implemented and tested as designed. The effectiveness of Monitoring depends on the design of the sensor hardware, sensor redundancy, and the diagnostic algorithms that are implemented. Plausibility metrics alone are not considered to be effective. Implementation of monitoring and the verification of effectiveness should be part of the development process and therefore may be analyzed in the corresponding DFMEA of the product. The effectiveness of diagnostic monitoring and response, the fault monitoring response time, and the Fault Tolerant Time Interval need to be determined prior to rating. Determination of the effectiveness of diagnostic monitoring is addressed in detail in ISO 26262-5:2018 Annex D.
In practice, three different monitoring/response cases may be distinguished:

If there is no monitoring control, or if monitoring and response do not occur within the Fault Handling Time Interval, then Monitoring should be rated as Not Effective (M=10).

The original Failure Effect is virtually eliminated; only the mitigated Failure Effect remains relevant for the risk estimation of the product or system. In this instance, only the mitigated FE is relevant for the Action Priority rating, not the original FE. The assignment of Monitoring ratings to Failure Causes and their corresponding Monitoring Controls can vary depending on:

  • Variations in the Failure Cause or Failure Mode
  • Variations in the hardware implemented for diagnostic monitoring
  • The execution timing of the safety mechanism, e.g., failure is detected during "power-up" only
  • Variations in system response
  • Variations in human perception and reaction
  • Knowledge of implementation and effectiveness from other projects (newness)

Depending on these variations or execution timing, Monitoring Controls may not be considered to be RELIABLE in the sense of M=1.

The original Failure Effect occurs less often: most of the failures are detected, and the system response leads to a mitigated Failure Effect. The reduced risk is represented by the Monitoring rating. The most serious Failure Effect remains S=10.

Supplemental FMEA for Monitoring and System Response (M)
Monitoring Criteria (M) for Failure Causes, Failure Modes, and Failure Effects by Monitoring during Customer Operation. Use the rating number that corresponds to the least effective of the two criteria, Monitoring or System Response. (The Corporate or Product Line Examples column is left blank until filled in by the user.)

M = 10 (Not effective)
  Diagnostic Monitoring / Sensory Perception: The fault/failure cannot be detected at all, or not during the Fault Handling Time Interval, by the system, the driver, passenger, or service technician.
  System Response / Human Reaction: No response during the Fault Handling Time Interval.
M = 9 (Very low)
  Diagnostic Monitoring / Sensory Perception: The fault/failure can almost never be detected in relevant operating conditions. Monitoring control with low effectiveness, high variance, or high uncertainty. Minimal diagnostic coverage.
  System Response / Human Reaction: The reaction to the fault/failure by the system or the driver may not reliably occur during the Fault Handling Time Interval.
M = 8 (Low)
  Diagnostic Monitoring / Sensory Perception: The fault/failure can be detected in very few relevant operating conditions. Monitoring control with low effectiveness, high variance, or high uncertainty. Diagnostic coverage estimated <60%.
  System Response / Human Reaction: The reaction to the fault/failure by the system or the driver may not always occur during the Fault Handling Time Interval.
M = 7 (Moderately low)
  Diagnostic Monitoring / Sensory Perception: Low probability of detecting the fault/failure during the Fault Handling Time Interval by the system or the driver. Monitoring control with low effectiveness, high variance, or high uncertainty. Diagnostic coverage estimated >60%.
  System Response / Human Reaction: Low probability of reacting to the detected fault/failure during the Fault Handling Time Interval by the system or the driver.
M = 6 (Moderate)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system or the driver only during power-up, with medium variance in detection time. Diagnostic coverage estimated >90%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure in many operating conditions.
M = 5 (Moderate)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with medium variance in detection time, or detected by the driver in very many operating conditions. Diagnostic coverage estimated between 90% and 97%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure during the Fault Handling Time Interval in very many operating conditions.
M = 4 (Moderately high)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with medium variance in detection time, or detected by the driver in most operating conditions. Diagnostic coverage estimated >97%.
  System Response / Human Reaction: The automated system or the driver will be able to react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions.
M = 3 (High)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with very low variance in detection time and with high probability. Diagnostic coverage estimated >99%.
  System Response / Human Reaction: The system will automatically react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions, with very low variance in system response time and with a high probability.
M = 2 (Very high)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will be automatically detected by the system during the Fault Handling Time Interval, with very low variance in detection time and with very high probability. Diagnostic coverage estimated >99.9%.
  System Response / Human Reaction: The system will automatically react to the detected fault/failure during the Fault Handling Time Interval in most operating conditions, with very low variance in system response time and with a very high probability.
M = 1 (Reliable and acceptable for elimination of original failure effect)
  Diagnostic Monitoring / Sensory Perception: The fault/failure will always be automatically detected by the system. Diagnostic coverage estimated to be significantly greater than 99.9%.
  System Response / Human Reaction: The system will always automatically react to the detected fault/failure during the Fault Handling Time Interval.
SUPPLEMENTAL FMEA-MSR MONITORING (M)
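The "least effective" selection rule in the table heading can be sketched as a small helper. This is a minimal illustration; the function name and the way the two criteria are passed in are assumptions, not part of the handbook:

```python
def monitoring_rating(diagnostic_rating: int, response_rating: int) -> int:
    """Return the Monitoring (M) rating per the table's selection rule:
    use the rating number that corresponds to the least effective of the
    Diagnostic Monitoring criterion and the System Response criterion.
    Higher numbers mean less effective, so this is simply the maximum.
    """
    for r in (diagnostic_rating, response_rating):
        if not 1 <= r <= 10:
            raise ValueError("M ratings run from 1 (reliable) to 10 (not effective)")
    return max(diagnostic_rating, response_rating)

# Example: detection is High (3) but the automated response is only
# Moderate (5), so the overall M rating is 5.
print(monitoring_rating(3, 5))  # -> 5
```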

5.8 Action Priority (AP) for FMEA-MSR

The Action Priority is a methodology which allows for the prioritization of the need for action, considering Severity, Frequency, and Monitoring (SFM). This is done by the assignment of SFM ratings which provide a basis for the estimation of risk.

  • Priority High (H): Highest priority for review and action. The team needs to either identify an appropriate action to lower frequency and/or to improve monitoring controls or justify and document why current controls are adequate.
  • Priority Medium (M): Medium priority for review and action. The team should identify appropriate actions to lower frequency and/or to improve monitoring controls, or, at the discretion of the company, justify and document why controls are adequate.
  • Priority Low (L): Low priority for review and action. The team could identify actions to lower frequency and/or to improve monitoring controls.

It is recommended that potential Severity 9-10 failure effects with Action Priority High and Medium, at a minimum, be reviewed by management, including any recommended actions that were taken. This is not the prioritization of High, Medium, or Low risk; it is the prioritization of the need for actions to reduce risk.
NOTE: It may be helpful to include a statement such as “No further action is needed” in the Remarks field as appropriate.
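The minimum management-review rule above can be expressed as a one-line predicate. The function name is illustrative; only the stated minimum (Severity 9-10 with AP High or Medium) is encoded:

```python
def requires_management_review(severity: int, action_priority: str) -> bool:
    """Minimum review rule: potential Severity 9-10 failure effects with
    Action Priority High (H) or Medium (M) should be reviewed by management."""
    return severity >= 9 and action_priority in ("H", "M")

print(requires_management_review(10, "H"))  # -> True
print(requires_management_review(8, "H"))   # -> False (below the stated minimum)
```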

Action Priority is based on combinations of Severity, Frequency, and Monitoring ratings in order to prioritize actions for risk reduction.
Columns: Effect and Severity (S); Prediction of Failure Cause occurring during service life of vehicle (F); Effectiveness of Monitoring (M); Action Priority (AP).

Product or Plant Effect Very high (S = 10)
  F = 5-10 (Medium – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 4 (Low): M = 4-10 (Moderately high – Not effective): AP = H
  F = 4 (Low): M = 2-3 (Very high – High): AP = H
  F = 4 (Low): M = 1 (Reliable): AP = M
  F = 3 (Very low): M = 4-10 (Moderately high – Not effective): AP = H
  F = 3 (Very low): M = 2-3 (Very high – High): AP = M
  F = 3 (Very low): M = 1 (Reliable): AP = L
  F = 2 (Extremely low): M = 4-10 (Moderately high – Not effective): AP = M
  F = 2 (Extremely low): M = 1-3 (Reliable – High): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect High (S = 9)
  F = 4-10 (Low – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 2-3 (Extremely low – Very low): M = 2-10 (Very high – Not effective): AP = H
  F = 2-3 (Extremely low – Very low): M = 1-3 (Reliable – High): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Moderately high (S = 7-8)
  F = 6-10 (Medium – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5 (Medium): M = 5-10 (Moderate – Not effective): AP = H
  F = 5 (Medium): M = 1-4 (Reliable – Moderately high): AP = M
  F = 4 (Low): M = 7-10 (Moderately low – Not effective): AP = H
  F = 4 (Low): M = 4-6 (Moderately high – Moderate): AP = M
  F = 4 (Low): M = 1-3 (Reliable – High): AP = L
  F = 3 (Very low): M = 9-10 (Very low – Not effective): AP = H
  F = 3 (Very low): M = 7-8 (Moderately low – Low): AP = M
  F = 3 (Very low): M = 1-6 (Reliable – Moderate): AP = L
  F = 2 (Extremely low): M = 7-10 (Moderately low – Not effective): AP = M
  F = 2 (Extremely low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Moderately low (S = 4-6)
  F = 7-10 (High – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5-6 (Medium): M = 6-10 (Moderate – Not effective): AP = H
  F = 5-6 (Medium): M = 1-5 (Reliable – Moderate): AP = M
  F = 2-4 (Extremely low – Low): M = 9-10 (Very low – Not effective): AP = M
  F = 2-4 (Extremely low – Low): M = 7-8 (Moderately low – Low): AP = M
  F = 2-4 (Extremely low – Low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Low (S = 2-3)
  F = 7-10 (High – Extremely high): M = 1-10 (Reliable – Not effective): AP = H
  F = 5-6 (Medium): M = 7-10 (Moderately low – Not effective): AP = M
  F = 5-6 (Medium): M = 1-6 (Reliable – Moderate): AP = L
  F = 2-4 (Extremely low – Low): M = 1-6 (Reliable – Moderate): AP = L
  F = 1 (Cannot occur): M = 1-10 (Reliable – Not effective): AP = L
Product Effect Very low (S = 1)
  F = 1-10 (Very low – Very high): M = 1-10 (Reliable – Not effective): AP = L
ACTION PRIORITY FOR FMEA-MSR
  • NOTE 1: If M=1, the Severity rating of the Failure Effect after Monitoring and System Response is to be used for determining MSR Action Priority. If M is not equal to 1, then the Severity rating of the original Failure Effect is to be used for determining MSR Action Priority.
  • NOTE 2: When FMEA-MSR is used and M=1, then DFMEA Action Prioritization replaces the Severity rating of the original Failure Effect with the Severity rating of the mitigated Failure Effect.
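As a worked illustration, the table lookup and the Severity-selection rule in NOTE 1 can be combined in a short sketch. This encodes only the S=10 band of the table above as an excerpt; the function names are assumptions for illustration, and a full implementation would cover every (S, F, M) combination:

```python
def msr_action_priority(s: int, f: int, m: int) -> str:
    """Return H/M/L for the S=10 band of the FMEA-MSR Action Priority table
    (excerpt only; other Severity bands are not encoded here)."""
    if f == 1:            # Failure Cause cannot occur
        return "L"
    if s == 10:
        if f >= 5:        # Medium - Extremely high frequency
            return "H"
        if f == 4:        # Low frequency
            return "H" if m >= 2 else "M"
        if f == 3:        # Very low frequency
            return "H" if m >= 4 else ("M" if m >= 2 else "L")
        if f == 2:        # Extremely low frequency
            return "M" if m >= 4 else "L"
    raise NotImplementedError("only the S=10 excerpt is encoded")

def severity_for_ap(m: int, s_original: int, s_mitigated: int) -> int:
    """NOTE 1: if M=1, use the Severity of the mitigated Failure Effect;
    otherwise use the Severity of the original Failure Effect."""
    return s_mitigated if m == 1 else s_original

# Example: S=10 effect, very low frequency (F=3), high monitoring (M=2).
print(msr_action_priority(10, 3, 2))  # -> M
```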
Example of FMEA-MSR Risk Analysis – Evaluation of Current Risk Form Sheet

Step 6: Optimization

6.1 Purpose

The primary objective of Optimization in FMEA-MSR is to develop actions that reduce risk and improve safety. In this step, the team reviews the results of the risk analysis and evaluates action priorities. The main objectives of FMEA-MSR Optimization are:

  • Identification of the actions necessary to reduce risks
  • Assignment of responsibilities and target completion dates for action implementation
  • Implementation and documentation of actions taken including confirmation of the effectiveness of the implemented actions and assessment of risk after actions taken.
  • Collaboration between the FMEA team, management, customers, and suppliers regarding potential failures
  • Basis for refinement of the product requirements and prevention/detection controls

High and medium action priorities may indicate a need for technical improvement. Improvements may be achieved by introducing more reliable components, which reduce the occurrence potential of the Failure Cause in the field, or by introducing additional monitoring, which improves the detection capabilities of the system. Introducing monitoring is similar to a design change, but the Frequency of the Failure Cause is not changed. It may also be possible to eliminate the Failure Effect by introducing redundancy. If the team decides that no further actions are necessary, “No further action is needed” is written in the Remarks field to show the risk analysis was completed. The optimization is most effective in the following order:

  • Component design modifications in order to reduce the Frequency (F) of the Failure Cause (FC)
  • Increase the Monitoring (M) ability for the Failure Cause (FC) or Failure Mode (FM).

In the case of design modifications, all impacted design elements are evaluated again.
In the case of concept modifications, all steps of the FMEA are reviewed for the affected sections. This is necessary because the original analysis is no longer valid, since it was based upon a different design concept.

6.2 Assignment of Responsibilities

Each action should have a responsible individual and a Target Completion Date (TCD) associated with it. The responsible person ensures the action status is updated; if the action is confirmed, this person is also responsible for the action implementation. The Actual Completion Date is documented, including the date the actions are implemented. Target Completion Dates should be realistic (i.e., in accordance with the product development plan, prior to process validation, prior to start of production).

6.3 Status of the Actions

Suggested levels for Status of Actions:

  • Open: No Action defined.
  • Decision pending (optional): The action has been defined but has not yet been decided on. A decision paper is being created.
  • Implementation pending (optional): The action has been decided on but not yet implemented.
  • Completed: Completed actions have been implemented and their effectiveness has been demonstrated and documented. A final evaluation has been done.
  • Not Implemented: Not Implemented status is assigned when a decision is made not to implement an action. This may occur when risks related to practical and technical limitations are beyond current capabilities.

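The responsibilities in 6.2 and the status levels above can be modeled as a minimal record. This is a sketch under assumptions; the class and field names are illustrative, not prescribed by the handbook:

```python
from dataclasses import dataclass
from typing import Optional

# Status levels from section 6.3 (two of them optional).
STATUSES = (
    "Open",                    # no action defined
    "Decision pending",        # action defined, not yet decided on
    "Implementation pending",  # decided on, not yet implemented
    "Completed",               # implemented, effectiveness demonstrated
    "Not Implemented",         # decision made not to implement
)

@dataclass
class OptimizationAction:
    description: str
    responsible_person: str            # each action has a responsible individual (6.2)
    target_completion_date: str        # TCD, e.g. "2024-06-30" (illustrative format)
    status: str = "Open"
    actual_completion_date: Optional[str] = None

    def set_status(self, new_status: str) -> None:
        """Update the action status, rejecting anything outside 6.3's levels."""
        if new_status not in STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        self.status = new_status

# Hypothetical example action:
action = OptimizationAction(
    description="Introduce additional monitoring of sensor plausibility",
    responsible_person="J. Smith",
    target_completion_date="2024-06-30",
)
action.set_status("Implementation pending")
print(action.status)  # -> Implementation pending
```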
The FMEA is not considered “complete” until the team assesses each item’s Action Priority and either accepts the level of risk or documents closure of all actions. Closure of all actions should be documented before the FMEA is released at Start of Production (SOP). If “No Action Taken”, then the Action Priority is not reduced and the risk of failure is carried forward into the product design.

6.4 Assessment of Action Effectiveness

When an action has been completed, the Frequency and Monitoring values are reassessed, and a new Action Priority may be determined. The new action receives a preliminary Action Priority rating as a prediction of effectiveness. However, the status of the action remains “implementation pending” until the effectiveness has been tested. After the tests are finalized, the preliminary rating has to be confirmed or adapted, when indicated. The status of the action is then changed from “implementation pending” to “completed.” The reassessment should be based on the effectiveness of the MSR Preventive and Diagnostic Monitoring Actions taken, and the new values are based on the definitions in the FMEA-MSR Frequency and Monitoring rating tables.

6.5 Continuous Improvement

FMEA-MSR serves as a historical record for the design. Therefore, the original Severity, Frequency, and Monitoring (S, F, M) numbers are not modified once actions have been taken. The completed analysis becomes a repository to capture the progression of design decisions and design refinements. However, original S, F, M ratings may be modified for basis, family, or generic DFMEAs, because the information is used as a starting point for an application-specific analysis.

Example of FMEA-MSR Optimization with new Risk Evaluation Form Sheet

Step 7: Results Documentation

7.1 Purpose

The purpose of the results documentation step is to summarize and communicate the results of the Failure Mode and Effects Analysis activity. The main objectives of FMEA-MSR Results Documentation are:

  • Communication of results and conclusions of the analysis.
  • Establishment of the content of the documentation
  • Documentation of actions taken including confirmation of the effectiveness of the implemented actions and assessment of risk after actions taken
  • Communication of actions taken to reduce risks, including within the organization and with customers and/or suppliers as appropriate
  • Record of risk analysis and reduction to acceptable levels

7.2 FMEA Report

The scope and results of an FMEA should be summarized in a report. The report can be used for communication purposes within a company, or between companies. The report is not meant to replace reviews of the FMEA-MSR details when requested by management, customers, or suppliers. It is meant to be a summary for the FMEA-MSR team and others to confirm completion of each of the tasks and review the results of the analysis. It is important that the content of the documentation fulfills the requirements of the organization, the intended reader, and relevant stakeholders. Details may be agreed upon between the parties. In this way, it is also ensured that all details of the analysis and the intellectual property remain at the developing company. The layout of the document may be company-specific. However, the report should indicate the technical risk of failure as a part of the development plan and project milestones. The content may include the following:

  1. A statement of final status compared to original goals established in the Project Plan
    • FMEA Intent – Purpose of this FMEA?
    • FMEA Timing – FMEA due date?
    • FMEA Team – List of participants?
    • FMEA Task – Scope of this FMEA?
    • FMEA Tool – What method was used to conduct the analysis?
  2. A summary of the scope of the analysis, identifying what is new.
  3. A summary of how the functions were developed.
  4. A summary of at least the high-risk failures as determined by the team and provide a copy of the specific S/F/M rating tables and method of action prioritization (e.g. Action Priority table).
  5. A summary of the actions taken and/or planned to address the high-risk failures, including the status of those actions.
  6. A plan and commitment of timing for ongoing FMEA improvement actions.
    • Commitment and timing to close open actions.
    • Commitment to review and revise the FMEA-MSR during mass production to ensure the accuracy and completeness of the analysis as compared with the original production design (e.g. revisions triggered from design changes, corrective actions, etc., based on company procedures).
    • Commitment to capture “things gone wrong” in foundation FMEA-MSRs for the benefit of future analysis reuse, when applicable.
Standard FMEA-MSR Form Sheet
FMEA-MSR Software View
