Measurement System Analysis

Introduction:

Measurement Systems Analysis (MSA) is a type of experiment in which the same item is measured repeatedly by different people or with different pieces of equipment. MSA quantifies how much of the observed variation in a measure comes from the measurement system itself rather than from product or process variation, and it helps you determine how the measurement system needs to be improved. It assesses a measurement system for some or all of the following five characteristics:

Accuracy
Accuracy is attained when the measured value deviates little from the actual value. Accuracy is usually tested by comparing an average of repeated measurements to a known standard value for that unit of measure.
Repeatability
Repeatability is attained when the same person taking multiple measurements on the same item or characteristic gets the same result every time.
Reproducibility
Reproducibility is attained when other people (or other instruments or labs) get the same results you get when measuring the same item or characteristic.
Stability
Stability is attained when measurements that are taken by one person, in the same way, vary little over time.
Adequate Resolution
Adequate resolution means that your measurement instrument can give at least five (and preferably more) distinct values in the range you need to measure. For example, if you measure the heights of adults with a device that measures only to the nearest foot, you will get readings of just three distinct values: four feet, five feet, and six feet. If you needed to measure lengths between 5.1 centimetres and 5.5 centimetres, the measurement instrument would have to be capable of measuring to the nearest 0.1 centimetres to give five distinct values in the measurement range.
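
The resolution rule can be checked with a few lines of Python. This is a minimal sketch, assuming the lower end of the range falls on a graduation of the instrument; the function names `distinct_readings` and `has_adequate_resolution` are illustrative and do not come from any standard.

```python
from decimal import Decimal


def distinct_readings(low, high, resolution):
    """Count the distinct values an instrument can report between low and high.

    Arguments are passed as strings so that Decimal arithmetic stays exact.
    Assumes `low` coincides with a graduation of the instrument.
    """
    low, high, resolution = Decimal(low), Decimal(high), Decimal(resolution)
    # Whole steps that fit in the range, plus the starting value itself.
    return int((high - low) // resolution) + 1


def has_adequate_resolution(low, high, resolution, minimum=5):
    """Apply the 'at least five distinct values' rule of thumb."""
    return distinct_readings(low, high, resolution) >= minimum


# Lengths between 5.1 cm and 5.5 cm read to the nearest 0.1 cm.
print(distinct_readings("5.1", "5.5", "0.1"))       # 5
# Heights between 4 ft and 6 ft read to the nearest foot.
print(has_adequate_resolution("4", "6", "1"))       # False
```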

An MSA can be conducted as follows:

Conduct an experiment where different people (or machines) measure the same group of items repeatedly. This group should contain items that vary enough to cover the full range of typical variation.
Plot the data.
Analyze the data. Use statistical techniques such as Analysis of Variance (ANOVA) to determine what portion of the variation is due to operator differences and what portion is due to the measurement process.
Improve the measurement process, if necessary. Do this based on what you learn from your analysis. For example, if there is too much person-to-person variation, your measurement method must be standardized for multiple persons.
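
As an illustration of the analysis step, the sketch below estimates variance components from a balanced, crossed study (every operator measures every part the same number of times) using a two-way ANOVA with interaction. The function name, the data layout and the sample readings are assumptions made for this example. Negative component estimates are truncated to zero, which is a common convention rather than something stated above.

```python
from statistics import mean


def gage_rr_anova(data):
    """Variance components for a balanced crossed study.

    data[part][operator] is a list of repeated readings.
    """
    p, o, r = len(data), len(data[0]), len(data[0][0])
    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    oper_means = [mean(x for part in data for x in part[j]) for j in range(o)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for parts, operators, interaction and pure error.
    ss_part = o * r * sum((m - grand) ** 2 for m in part_means)
    ss_oper = p * r * sum((m - grand) ** 2 for m in oper_means)
    ss_cell = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cell - ss_part - ss_oper
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i, part in enumerate(data)
                   for j, cell in enumerate(part)
                   for x in cell)

    # Mean squares.
    ms_part = ss_part / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Variance components; negative estimates are set to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_inter - ms_error) / r)
    operator = max(0.0, (ms_oper - ms_inter) / (p * r))
    part_to_part = max(0.0, (ms_part - ms_inter) / (o * r))
    reproducibility = operator + interaction
    return {
        "repeatability": repeatability,
        "reproducibility": reproducibility,
        "grr": repeatability + reproducibility,
        "part_to_part": part_to_part,
    }


# Hypothetical study: 3 parts x 2 operators x 2 trials.
study = [
    [[10.0, 10.2], [10.1, 10.3]],
    [[12.0, 12.2], [12.1, 12.3]],
    [[14.0, 14.2], [14.1, 14.3]],
]
for name, value in gage_rr_anova(study).items():
    print(f"{name}: {value:.4f}")
```

Comparing the `grr` component with `part_to_part` shows how much of the observed variation belongs to the measurement system rather than to the items being measured.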

Measurement data are used more often and in more ways than ever before. For instance, the decision to adjust a manufacturing process is now commonly based on measurement data. The data, or some statistic calculated from them, are compared with statistical control limits for the process, and if the comparison indicates that the process is out of statistical control, then an adjustment of some kind is made. Otherwise, the process is allowed to run without adjustment. Another use of measurement data is to determine if a significant relationship exists between two or more variables. For example, it may be suspected that a critical dimension on a molded plastic part is related to the temperature of the feed material. That possible relationship could be studied by using regression analysis to compare measurements of the critical dimension with measurements of the temperature of the feed material. Studies that explore such relationships are called analytic studies; they increase knowledge about the system of causes that affect the process. Analytic studies are among the most important uses of measurement data because they lead ultimately to a better understanding of processes. The benefit of using a data-based procedure is largely determined by the quality of the measurement data used. If the data quality is low, the benefit of the procedure is likely to be low. Similarly, if the quality of the data is high, the benefit is likely to be high also. To ensure that the benefit derived from using measurement data is great enough to warrant the cost of obtaining it, attention needs to be focused on the quality of the data.

The quality of measurement data is defined by the statistical properties of multiple measurements obtained from a measurement system operating under stable conditions. For instance, suppose that a measurement system, operating under stable conditions, is used to obtain several measurements of a certain characteristic. If the measurements are all “close” to the master value for the characteristic, then the quality of the data is said to be “high”. Similarly, if some, or all, of the measurements, are “far away” from the master value, then the quality of the data is said to be “low”. The statistical properties most commonly used to characterize the quality of data are the bias and variance of the measurement system. The property called bias refers to the location of the data relative to a reference (master) value, and the property called variance refers to the spread of the data. One of the most common reasons for low-quality data is too much variation. Much of the variation in a set of measurements may be due to the interaction between the measurement system and its environment. For instance, a measurement system used to measure the volume of liquid in a tank may be sensitive to the ambient temperature of the environment in which it is used. In that case, variation in the data may be due either to changes in the volume or to changes in the ambient temperature. That makes interpreting the data more difficult and the measurement system, therefore, less desirable. If the interaction generates too much variation, then the quality of the data may be so low that the data are not useful. For example, a measurement system with a large amount of variation may not be appropriate for use in analyzing a manufacturing process because the measurement system’s variation may mask the variation in the manufacturing process. Much of the work of managing a measurement system is directed at monitoring and controlling variation. 
Among other things, this means that emphasis needs to be placed on learning how the measurement system interacts with its environment so that only data of acceptable quality are generated.
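
The two statistical properties described above, bias (location) and variance (spread), can be estimated from repeated readings of a part whose reference value is known. A minimal sketch, assuming the readings were taken under stable conditions; the function name and the numbers are hypothetical.

```python
from statistics import mean, variance


def bias_and_variance(readings, reference):
    """Return (bias, sample variance) of repeated readings of one characteristic.

    bias     = average reading minus the reference (master) value
    variance = spread of the readings around their own average
    """
    return mean(readings) - reference, variance(readings)


# Hypothetical readings of a master whose reference value is 10.00.
readings = [10.02, 10.01, 10.03, 10.02, 10.02]
bias, var = bias_and_variance(readings, reference=10.00)
print(f"bias = {bias:.3f}, variance = {var:.6f}")
```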

The terminology used in Measurement System Analysis

1. Measurement:

Measurement is defined as “the assignment of numbers [or values] to material things to represent the relations among them with respect to particular properties.” The process of assigning the numbers is defined as the measurement process, and the value assigned is defined as the measurement value.

2. Gage:

A gage is any device used to obtain measurements. The term is frequently used to refer specifically to the devices used on the shop floor, and it includes go/no-go devices.

3. Measurement System:

Measurement System is the collection of instruments or gages, standards, operations, methods, fixtures, software, personnel, environment and assumptions used to quantify a unit of measure or fix assessment to the feature characteristic being measured; the complete process used to obtain measurements.

4. Operational Definition

An operational definition is one that people can do business with. An operational definition of safe, round, reliable, or any other quality [characteristic] must be communicable, with the same meaning to the vendor as to the purchaser, same meaning yesterday and today to the production worker. Example:

A specific test of a piece of material or an assembly
A criterion (or criteria) for judgment
Decision: yes or no, the object or the material did or did not meet the criterion (or criteria)

5. Standards and Traceability:

The National Physical Laboratory of India, situated in New Delhi, is the measurement standards laboratory of India. It maintains standards of SI units in India and calibrates the national standards of weights and measures. Each modernized country, including India, has a National Metrological Institute (NMI), which maintains the standards of measurements; in India, this responsibility has been given to the National Physical Laboratory, New Delhi. Its primary responsibility is to provide measurement services and maintain measurement standards that assist Indian industry in making traceable measurements which ultimately assist in the trade of products and services. It provides these services directly to many types of industries, but primarily to those industries that require the highest level of accuracy for their products and that incorporate state-of-the-art measurements in their processes. Most of the industrialized countries throughout the world maintain their own NMIs, which provide a high level of metrology standards or measurement services for their respective countries. The National Physical Laboratory works collaboratively with these other NMIs to assure that measurements made in one country do not differ from those made in another. This is accomplished through Mutual Recognition Arrangements (MRAs) and by performing interlaboratory comparisons between the NMIs. One thing to note is that the capabilities of these NMIs will vary from country to country and not all types of measurements are compared on a regular basis, so differences can exist. This is why it is important to understand to whom measurements are traceable and how traceable they are.

5.1 Standard

5.2 Reference Standards

5.3 Measurement and Test Equipment (M&TE)

5.4 Calibration Standard

5.5 Transfer Standard

5.6 Master

5.7 Working Standard

5.8 Check Standard

6. Traceability:

Traceability is an important concept in the trade of goods and services. Measurements that are traceable to the same or similar standards will agree more closely than those that are not traceable. This helps reduce the need for re-test, rejection of good product, and acceptance of the bad product. Traceability is defined by the ISO International Vocabulary of Basic and General Terms in Metrology (VIM) as:
“The property of a measurement or the value of a standard whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons all having stated uncertainties.”
The traceability of measurement will typically be established through a chain of comparisons back to the NMI. However, in many instances in industry, the traceability of a measurement may be linked back to an agreed-upon reference value or “consensus standard” between a customer and a supplier. The traceability linkage of these consensus standards to the NMI may not always be clearly understood, so ultimately it is critical that the measurements are traceable to the extent that satisfies customer needs. With the advancement in measurement technologies and the usage of state-of-the-art measurement systems in the industry, the definition as to where and how a measurement is traceable is an ever-evolving concept.

NMIs work closely with various national labs, gage suppliers, state-of-the-art manufacturing companies, etc. to assure that their reference standards are properly calibrated and directly traceable to the standards maintained by the NMI. These government and private industry organizations will then use their standards to provide calibration and measurement services to their customers’ metrology or gage laboratories, calibrating working or other primary standards. This linkage or chain of events ultimately finds its way onto the factory floor and then provides the basis for measurement traceability. Measurements that can be connected back to the NMI through this unbroken chain of measurements are said to be traceable to the NMI. Not all organizations have metrology or gage laboratories within their facilities; those that do not depend on outside commercial/independent laboratories to provide traceability calibration and measurement services. This is an acceptable and appropriate means of attaining traceability to the NMI, provided that the capability of the commercial/independent laboratory can be assured through processes such as laboratory accreditation.

7. Calibration Systems:

A calibration system is a set of operations that establish, under specified conditions, the relationship between a measuring device and a traceable standard of known reference value and uncertainty. Calibration may also include steps to detect, correlate, report, or eliminate by adjustment any discrepancy in accuracy of the measuring device being compared. The calibration system determines measurement traceability to the measurement systems through the use of calibration methods and standards. Traceability is the chain of calibration events originating with the calibration standards of appropriate metrological capability or measurement uncertainty. Each calibration event includes all of the elements necessary including standards, measurement and test equipment being verified, calibration methods and procedures, records, and qualified personnel. An organization may have an internal calibration laboratory or organization which controls and maintains the elements of the calibration events. These internal laboratories will maintain a laboratory scope which lists the specific calibrations they are capable of performing as well as the equipment and methods/procedures used to perform the calibrations. The calibration system is part of an organization’s quality management system and therefore should be included in any internal audit requirements. Measurement Assurance Programs (MAPs) can be used to verify the acceptability of the measurement processes used throughout the calibration system. Generally, MAPs will include verification of a measurement system’s results through a secondary independent measurement of the same feature or parameter. Independent measurements imply that the traceability of the secondary measurement process is derived from a separate chain of calibration events from those used for the initial measurement. MAPs may also include the use of statistical process control (SPC) to track the long-term stability of a measurement process.
When the calibration event is performed by an external, commercial, or independent calibration service supplier, the service supplier’s calibration system can (or may) be verified through accreditation to ISO/IEC 17025. When a qualified laboratory is not available for a given piece of equipment, calibration services may be performed by the equipment manufacturer.

8. True Value:

The measurement process TARGET is the “true” value of the part. It is desired that any individual reading be as close to this value as (economically) possible. Unfortunately, the true value can never be known with certainty. However, uncertainty can be minimized by using a reference value based on a well-defined operational definition of the characteristic and using the results of a measurement system that has higher-order discrimination and is traceable to NIST or another NMI. Because the reference value is used as a surrogate for the true value, these terms are commonly used interchangeably. This usage is not recommended.

9. Reference Value

A reference value, also known as the accepted reference value or master value, is a value of an artefact or ensemble that serves as an agreed-upon reference for comparison. Accepted reference values are based upon the following:

Determined by averaging several measurements with a higher level (e.g., metrology lab or layout equipment) of measuring equipment
Legal values: defined and mandated by law
Theoretical values: based on scientific principles
Assigned values: based on experimental work (supported by sound theory) of some national or international organization
Consensus values: based on collaborative experimental work under the auspices of a scientific or engineering group; defined by a consensus of users such as professional and trade organizations
Agreement values: values expressly agreed upon by the affected parties

In all cases, the reference value needs to be based upon an operational definition and the results of an acceptable measurement system. To achieve this, the measuring system used to determine the reference value should include:

Instrument(s) with higher-order discrimination and a lower measurement system error than the systems used for normal evaluation
Calibration with standards traceable to the NIST or other NMI

10. Discrimination

Discrimination is the amount of change from a reference value that an instrument can detect and faithfully indicate. This is also referred to as readability or resolution. The measure of this ability is typically the value of the smallest graduation on the scale of the instrument. If the instrument has “coarse” graduations, then a half-graduation can be used. A general rule of thumb is the measuring instrument discrimination ought to be at least one-tenth of the range to be measured. Traditionally this range has been taken to be the product specification. Recently the 10 to 1 rule is being interpreted to mean that the measuring equipment is able to discriminate to at least one-tenth of the process variation. This is consistent with the philosophy of continual improvement (i.e., the process focus is a customer designated target). The above rule of thumb can be considered as a starting point to determine the discrimination since it does not include any other element of the measurement system’s variability.
Because of economic and physical limitations, the measurement system will not perceive all parts of a process distribution as having separate or different measured characteristics. Instead, the measured characteristic will be grouped by the measured values into data categories. All parts in the same data category will have the same value for the measured characteristic. If the measurement system lacks discrimination (sensitivity or effective resolution), it may not be an appropriate system to identify the process variation or quantify individual part characteristic values. If that is the case, better measurement techniques should be used. The discrimination is unacceptable for analysis if it cannot detect the variation of the process, and unacceptable for control if it cannot detect the special cause variation.

The figure above contains two sets of control charts derived from the same data. Control Chart (a) shows the original measurement to the nearest thousandth of an inch. Control Chart (b) shows these data rounded off to the nearest hundredth of an inch. Control Chart (b) appears to be out of control due to the artificially tight limits. The zero ranges are more a product of the rounding off than they are an indication of the subgroup variation. A good indication of inadequate discrimination can be seen on the SPC range chart for process variation. In particular, when the range chart shows only one, two, or three possible values for the range within the control limits, the measurements are being made with inadequate discrimination. Also, if the range chart shows four possible values for the range within control limits and more than one-fourth of the ranges are zero, then the measurements are being made with inadequate discrimination. Another good indication of inadequate discrimination is on a normal probability plot where the data will be stacked into buckets instead of flowing along the 45-degree line. In Control Chart (b), there are only two possible values for the range within the control limits (values of 0.00 and 0.01). Therefore, the rule correctly identifies the reason for the lack of control as inadequate discrimination (sensitivity or effective resolution). This problem can be remedied, of course, by changing the ability to detect the variation within the subgroups by increasing the discrimination of the measurements. A measurement system will have adequate discrimination if its apparent resolution is small relative to the process variation. Thus a recommendation for adequate discrimination would be for the apparent resolution to be at most one-tenth of the total process spread (six standard deviations), instead of the traditional rule that the apparent resolution be at most one-tenth of the tolerance spread.
Eventually, there are situations that reach a stable, highly capable process using a stable, “best-in-class” measurement system at the practical limits of technology. Effective resolution may be inadequate and further improvement of the measurement system becomes impractical. In these special cases, measurement planning may require alternative process monitoring techniques. Customer approval will typically be required for the alternative process monitoring technique.
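
The range-chart rule described above can be expressed as a short check. This is a sketch under stated assumptions: the upper control limit is computed as D4 times the average range using the usual range-chart constants for subgroup sizes 2 to 5, the lower limit is taken as zero, and the function name and sample data are hypothetical.

```python
# Usual D4 constants for range charts, keyed by subgroup size.
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def inadequate_discrimination(subgroups, decimals):
    """Apply the range-chart rule of thumb for inadequate discrimination.

    subgroups: list of equally sized lists of readings
    decimals:  number of decimal places the instrument reports
    """
    n = len(subgroups[0])
    # Round to the instrument's resolution so equal ranges compare equal.
    ranges = [round(max(s) - min(s), decimals) for s in subgroups]
    ucl = D4[n] * (sum(ranges) / len(ranges))
    in_control = [r for r in ranges if r <= ucl]
    distinct = len(set(in_control))
    zero_fraction = ranges.count(0) / len(ranges)
    # One, two or three possible range values: inadequate.
    if distinct <= 3:
        return True
    # Four possible values and more than one-fourth of ranges are zero.
    return distinct == 4 and zero_fraction > 0.25


# Hypothetical readings recorded to the nearest hundredth.
coarse = [[0.14, 0.14, 0.14], [0.14, 0.15, 0.14],
          [0.13, 0.14, 0.14], [0.14, 0.14, 0.14]]
print(inadequate_discrimination(coarse, decimals=2))
```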

11. Measurement Process Variation:

For most measurement processes, the total measurement variation is usually described as a normal distribution. Normal probability is an assumption of the standard methods of measurement systems analysis. In fact, there are measurement systems that are not normally distributed. When this happens, and normality is assumed, the MSA method may overestimate the measurement system error. The measurement analyst must recognize and correct evaluations for non-normal measurement systems.

12. Accuracy

Accuracy describes how close measurements are to the true value, without bias, and is normally reported as the difference between the average of a number of measurements and the true value. Checking a micrometer with a gage block is an example of an accuracy check. Accuracy is a generic concept of exactness related to the closeness of agreement between the average of one or more measured results and a reference value. The measurement process must be in a state of statistical control; otherwise, the accuracy of the process has no meaning. In some organizations, accuracy is used interchangeably with bias. The ISO (International Organization for Standardization) and the ASTM (American Society for Testing and Materials) use the term accuracy to embrace both bias and repeatability. In order to avoid confusion which could result from using the word accuracy, ASTM recommends that only the term bias be used as the descriptor of location error.

13. Bias

Bias is often referred to as “accuracy.” Because “accuracy” has several meanings in literature, its use as an alternate for “bias” is not recommended. Bias is the difference between the true value (reference value) and the observed average of measurements on the same characteristic on the same part. Bias is the measure of the systematic error of the measurement system. It is the contribution to the total error comprised of the combined effects of all sources of variation, known or unknown, whose contributions to the total error tend to offset consistently and predictably all results of repeated applications of the same measurement process at the time of the measurements.

Possible causes of excessive bias are:

Instrument needs calibration
Worn instrument, equipment or fixture
Worn or damaged master, error in master
Improper calibration or use of the setting master
Poor quality instrument – design or conformance
Linearity error
Wrong gage for the application
Different measurement method – setup, loading, clamping, technique
Measuring the wrong characteristic
Distortion (gage or part)
Environment – temperature, humidity, vibration, cleanliness
Violation of an assumption, error in an applied constant
Application – part size, position, operator skill, fatigue, observation error (readability, parallax)

The measurement procedure employed in the calibration process (i.e., using “masters”) should be as identical as possible to the normal operation’s measurement procedure.

14. Precision

In gage terminology, “repeatability” is often substituted for precision. Repeatability is the ability to repeat the same measurement by the same operator at or near the same time. Precision describes the net effect of discrimination, sensitivity and repeatability over the operating range (size, range and time) of the measurement system. In some organizations, precision is used interchangeably with repeatability. In fact, precision is most often used to describe the expected variation of repeated measurements over the range of measurement; that range may be size or time (i.e., “a device is as precise at the low range as the high range of measurement”, or “as precise today as yesterday”). One could say precision is to repeatability what linearity is to bias (although the first is random and the other systematic errors). The ASTM defines precision in a broader sense to include the variation from different readings, gages, people, labs or conditions. The calibration of measuring instruments is necessary to maintain accuracy but does not necessarily increase precision. In order to improve the accuracy and precision of a measurement process, it must have a defined test method and must be statistically stable.

15. Stability:

Stability (or drift) is the total variation in the measurements obtained with a measurement system on the same master or parts when measuring a single characteristic over an extended time period. That is, stability is the change in bias over time. Possible causes for instability include:

Instrument needs calibration; reduce the calibration interval
Worn instrument, equipment or fixture
Normal ageing or obsolescence
Poor maintenance – air, power, hydraulic, filters, corrosion, rust, cleanliness
Worn or damaged master, error in master
Improper calibration or use of the setting master
Poor quality instrument – design or conformance
Instrument design or method lacks robustness
Different measurement method – setup, loading, clamping, technique
Distortion (gage or part)
Environmental drift – temperature, humidity, vibration, cleanliness
Violation of an assumption, error in an applied constant
Application – part size, position, operator skill, fatigue, observation error (readability, parallax)
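
One way to watch for drift is to measure the same master at regular intervals and plot the readings on an individuals control chart. The following is a minimal sketch, assuming the baseline readings come from a period when the measurement system was stable; the constant 2.66 is the usual individuals-chart factor (3 / 1.128), and the function names and readings are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(readings):
    """Return (LCL, center line, UCL) from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(readings, readings[1:])]
    center = mean(readings)
    mr_bar = mean(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def out_of_control_points(new_readings, baseline):
    """Indices of new readings of the master that fall outside the limits."""
    lcl, _, ucl = individuals_chart_limits(baseline)
    return [i for i, x in enumerate(new_readings) if not lcl <= x <= ucl]


# Hypothetical weekly readings of the same master.
baseline = [10.00, 10.02, 9.99, 10.01, 10.00, 10.02, 9.98, 10.01]
later = [10.01, 10.03, 10.09]
print(out_of_control_points(later, baseline))   # [2]
```

A point outside the limits suggests that the bias has shifted and that one of the causes listed above should be investigated.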

16. Linearity

The difference of bias throughout the expected operating (measurement) range of the equipment is called linearity. Linearity can be thought of as a change of bias with respect to size. Note that unacceptable linearity can come in a variety of flavors. Do not assume a constant bias.
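
Because linearity is a change of bias with size, a simple way to look at it is to measure several reference parts across the operating range and fit a straight line to bias versus reference value. The sketch below uses ordinary least squares; the function name and data are hypothetical, and a straight-line fit is only one possible model since, as noted above, bias need not change in a constant way.

```python
from statistics import mean


def linearity_fit(reference_values, observed_averages):
    """Least-squares line: bias = slope * reference + intercept."""
    biases = [obs - ref
              for ref, obs in zip(reference_values, observed_averages)]
    x_bar, b_bar = mean(reference_values), mean(biases)
    sxx = sum((x - x_bar) ** 2 for x in reference_values)
    sxy = sum((x - x_bar) * (b - b_bar)
              for x, b in zip(reference_values, biases))
    slope = sxy / sxx
    intercept = b_bar - slope * x_bar
    return slope, intercept


# Hypothetical masters spanning the operating range and the average
# reading obtained for each one.
refs = [2.0, 4.0, 6.0, 8.0, 10.0]
obs = [2.1, 4.2, 6.3, 8.4, 10.5]
slope, intercept = linearity_fit(refs, obs)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope near zero indicates that bias is roughly the same across the range; a clearly non-zero slope indicates that bias grows or shrinks with size.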

Possible causes for linearity error include:

Instrument needs calibration; reduce the calibration interval.
Worn instrument, equipment or fixture.
Poor maintenance – air, power, hydraulic, filters, corrosion, rust, cleanliness.
Worn or damaged master(s), error in master(s) – minimum/ maximum.
Improper calibration (not covering the operating range) or use of the setting master(s).
Poor quality instrument – design or conformance.
Instrument design or method lacks robustness.
Wrong gage for the application.
Different measurement method – setup, loading, clamping, technique.
Distortion (gage or part) changes with part size.
Environment – temperature, humidity, vibration, cleanliness.
Violation of an assumption, error in an applied constant.
Application – part size, position, operator skill, fatigue, observation error (readability, parallax).

17. Sensitivity

The gage should be sensitive enough to detect differences in measurement as slight as one-tenth of the total tolerance specification or process spread, whichever is smaller. Inadequate discrimination will affect both the accuracy and precision of an operator’s reported values. Sensitivity is the smallest input that results in a detectable (usable) output signal. It is the responsiveness of the measurement system to changes in the measured feature. Sensitivity is determined by gage design (discrimination), inherent quality (OEM), in-service maintenance, and the operating condition of the instrument and standard. It is always reported as a unit of measure. Factors that affect sensitivity include:

Ability to dampen an instrument
The skill of the operator
Repeatability of the measuring device
Ability to provide drift-free operation in the case of electronic or pneumatic gages
Conditions under which the instrument is being used such as ambient air, dirt, humidity

18. Repeatability

This is traditionally referred to as the “within appraiser” variability. Repeatability is the variation in measurements obtained with one measurement instrument when used several times by one appraiser while measuring the identical characteristic on the same part. This is the inherent variation or capability of the equipment itself. Repeatability is commonly referred to as equipment variation (EV), although this is misleading. In fact, repeatability is the common cause (random error) variation from successive trials under defined conditions of measurement. The best term for repeatability is within-system variation when the conditions of measurement are fixed and defined – fixed part, instrument, standard, method, operator, environment, and assumptions. In addition to within-equipment variation, repeatability will include all within variation (see below) from any condition in the error model.

Possible causes for poor repeatability include:

Within-part (sample): form, position, surface finish, taper, sample consistency
Within-instrument: repair, wear, equipment or fixture failure, poor quality or maintenance
Within-standard: quality, class, wear
Within-method: variation in setup, technique, zeroing, holding, clamping
Within-appraiser: technique, position, lack of experience, manipulation skill or training, feel, fatigue
Within-environment: short-cycle fluctuations in temperature, humidity, vibration, lighting, cleanliness
Violation of an assumption – stable, proper operation
Instrument design or method lacks robustness, poor uniformity
Wrong gage for the application
Distortion (gage or part), lack of rigidity
Application – part size, position, observation error (readability, parallax)

19. Reproducibility

Reproducibility is the “reliability” of the gage system or similar gage systems in reproducing measurements. The reproducibility of a single gage is customarily checked by comparing the results of different operators taken at different times. Gage reproducibility affects both accuracy and precision. This is traditionally referred to as the “between appraisers” variability. Reproducibility is typically defined as the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part. This is often true for manual instruments influenced by the skill of the operator. It is not true, however, for measurement processes (e.g., automated systems) where the operator is not a major source of variation. For this reason, reproducibility is referred to as the average variation between systems or between conditions of measurement.

Potential sources of reproducibility error include:

Between-parts (samples): average difference when measuring types of parts A, B, C, etc., using the same instrument, operators, and method.
Between-instruments: average difference using instruments A, B, C, etc., for the same parts, operators and environment. Note: in this study reproducibility error is often confounded with the method and/or operator.
Between-standards: average influence of different setting standards in the measurement process.
Between-methods: average difference caused by changing point densities, manual versus automated systems, zeroing, holding or clamping methods, etc.
Between-appraisers (operators): the average difference between appraisers A, B, C, etc., caused by training, technique, skill and experience. This is the recommended study for product and process qualification and a manual measuring instrument.
Between-environment: average difference in measurements over time 1, 2, 3, etc. caused by environmental cycles; this is the most common study for highly automated systems in product and process qualifications.
Violation of an assumption in the study
Instrument design or method lacks robustness
Operator training effectiveness
Application – part size, position, observation error (readability, parallax)

20. Gage R&R or GRR

Gage R&R is an estimate of the combined variation of repeatability and reproducibility. Stated another way, GRR is the variance equal to the sum of within-system and between-system variances.

σ²_GRR = σ²_{reproducibility} + σ²_{repeatability}
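
As a minimal sketch of the formula above: because variances add, the GRR standard deviation is the root-sum-of-squares of the two component standard deviations. The function name and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def combine_grr(sigma_repeatability: float, sigma_reproducibility: float) -> float:
    """Return sigma_GRR from the repeatability and reproducibility standard deviations."""
    # Variances add; standard deviations do not.
    variance_grr = sigma_repeatability ** 2 + sigma_reproducibility ** 2
    return math.sqrt(variance_grr)


# Hypothetical component standard deviations, in the units of the measurement.
sigma_grr = combine_grr(sigma_repeatability=0.03, sigma_reproducibility=0.04)
print(round(sigma_grr, 4))  # 0.05
```

Note that the combined value (0.05) is smaller than the plain sum of the two standard deviations (0.07), which is why the components must be combined as variances.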

21. Consistency

Consistency is the difference in the variation of the measurements taken over time. It may be viewed as repeatability over time. Factors impacting consistency are special causes of variation such as:

Temperature of parts
Warm-up required for electronic equipment
Worn equipment

22. Uniformity

Uniformity is the difference in variation throughout the operating range of the gage. It may be considered to be the homogeneity (sameness) of the repeatability over size. Factors impacting uniformity include:

The fixture allows smaller/larger sizes to position differently
Poor readability on the scale
Parallax in reading

23. Capability

The capability of a measurement system is an estimate of the combined variation of measurement errors (random and systematic) based on a short term assessment. Simple capability includes the components of:

Uncorrected bias or linearity
Repeatability and reproducibility (GRR), including short-term consistency

An estimate of measurement capability, therefore, is an expression of the expected error for defined conditions, scope and range of the measurement system (unlike measurement uncertainty, which is an expression of the expected range of error or values associated with a measurement result). The capability expression of combined variation (variance) when the measurement errors are uncorrelated (random and independent) can be quantified as:

σ²_capability = σ²_{bias(linearity)} + σ²_GRR

There are two essential points to understand in order to apply measurement capability correctly:

First, an estimate of capability is always associated with a defined scope of measurement – conditions, range and time. For example, to say that the capability of a 25 mm micrometer is 0.1 mm is incomplete without qualifying the scope and range of measurement conditions. Again, this is why an error model to define the measurement process is so important. The scope for an estimate of measurement capability could be very specific or a general statement of operation, over a limited portion or entire measurement range. Short-term could mean the capability over a series of measurement cycles, the time to complete the GRR evaluation, a specified period of production, or time represented by the calibration frequency. A statement of measurement capability need only be as complete as to reasonably replicate the conditions and range of measurement. A documented Control Plan could serve this purpose.

Second, short-term consistency and uniformity (repeatability errors) over the range of measurement are included in a capability estimate. For a simple instrument, such as a 25 mm micrometer, the repeatability over the entire range of measurement using typical, skilled operators is expected to be consistent and uniform. In this example, a capability estimate may include the entire range of measurement for multiple types of features under general conditions. Longer range or more complex measurement systems (e.g., a CMM) may demonstrate measurement errors of (uncorrected) linearity, uniformity, and short-term consistency over range or size. Because these errors are correlated, they cannot be combined using the simple linear formula above. When (uncorrected) linearity, uniformity or consistency varies significantly over the range, the measurement planner and analyst have only two practical choices:

Report the maximum (worst case) capability for the entire defined conditions, scope and range of the measurement system, or
Determine and report multiple capability assessments for defined portions of the measurement range (e.g., low, mid and larger range).

24. Performance

As with process performance, measurement system performance is the net effect of all significant and determinable sources of variation over time. Performance quantifies the long-term assessment of combined measurement errors (random and systematic). Therefore, the performance includes the long term error components of:

Capability (short-term errors)
Stability and consistency

An estimate of measurement performance is an expression of the expected error for defined conditions, scope and range of the measurement system (unlike measurement uncertainty, which is an expression of the expected range of error or values associated with a measurement result). The performance expression of combined variation (variance) when the measurement errors are uncorrelated (random and independent) can be quantified as:

σ²_performance = σ²_capability + σ²_stability + σ²_consistency

Again, just as with short-term capability, long-term performance is always associated with a defined scope of measurement – conditions, range and time. The scope for an estimate of measurement performance could be very specific or a general statement of operation, over a limited portion or entire measurement range. Long-term could mean the average of several capability assessments over time, the long-term average error from a measurement control chart, an assessment of calibration records or multiple linearity studies, or average error from several GRR studies over the life and range of the measurement system. A statement of measurement performance need only be as complete as to reasonably represent the conditions and range of measurement. Long-term consistency and uniformity (repeatability errors) over the range of measurement are included in a performance estimate. The measurement analyst must be aware of the potential correlation of errors so as not to overestimate the performance estimate. This depends on how the component errors were determined. When long-term (uncorrected) linearity, uniformity or consistency varies significantly over the range, the measurement planner and analyst have only two practical choices:

Report the maximum (worst case) performance for the entire defined conditions, scope and range of the measurement system, or
Determine and report multiple performance assessments for defined portions of the measurement range (e.g., low, mid and larger range).

25. Measurement Uncertainty

Measurement Uncertainty is a term that is used internationally to describe the quality of a measurement value. While this term has traditionally been reserved for many of the high accuracy measurements performed in metrology or gage laboratories, many customer and quality system standards require that measurement uncertainty be known and consistent with required measurement capability of any inspection, measuring or test equipment. In essence, uncertainty is the value assigned to a measurement result that describes, within a defined level of confidence, the range expected to contain the true measurement result. Measurement uncertainty is normally reported as a bilateral quantity. Uncertainty is a quantified expression of measurement reliability. A simple expression of this concept is:
True measurement = observed measurement (result) ± U
U is the term for the “expanded uncertainty” of the measurand and measurement result. Expanded uncertainty is the combined standard error (u_c), or standard deviation of the combined errors (random and systematic), in the measurement process multiplied by a coverage factor (k) that represents the area of the normal curve for a desired level of confidence. A normal distribution is often applied as a principal assumption for measurement systems. The ISO/IEC Guide to the Expression of Uncertainty in Measurement establishes the coverage factor as sufficient to report uncertainty at 95% of a normal distribution. This is often interpreted as k = 2.

U = k × u_c

The combined standard error (u_c) includes all significant components of variation in the measurement process. Often, the most significant error component can be quantified by σ²_performance. Other significant error sources may apply based on the measurement application. An uncertainty statement must include an adequate scope that identifies all significant errors and allows the measurement to be replicated. Some uncertainty statements will build from long-term, others from short-term, measurement system error. However, the simple expression can be quantified as:

u²_c = σ²_performance + σ²_others
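
The two expressions above can be sketched together as follows. This is only an illustration under the stated assumptions of uncorrelated errors and a coverage factor of k = 2. The function name, the parameter names and all numeric values are hypothetical.

```python
import math


def expanded_uncertainty(sigma_performance, other_sigmas=(), k=2.0):
    """Return U = k * u_c, where u_c^2 = sigma_performance^2 + sum of other variances."""
    # Assumes the error sources are uncorrelated, so their variances simply add.
    u_c_squared = sigma_performance ** 2 + sum(s ** 2 for s in other_sigmas)
    u_c = math.sqrt(u_c_squared)
    return k * u_c


# Hypothetical inputs: long-term measurement system error plus one other source
# (for example, the uncertainty of the master standard).
U = expanded_uncertainty(sigma_performance=0.006, other_sigmas=[0.008], k=2.0)
print(f"true measurement = observed measurement ± {U:.3f}")
```

With these inputs u_c is 0.010, so the reported expanded uncertainty is 0.020 in the units of the measurement.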

It is important to remember that measurement uncertainty is simply an estimate of how much a measurement may vary at the time of measurement. It should consider all significant sources of measurement variation in the measurement process plus significant errors of calibration, master standards, method, environment and others not previously considered in the measurement process. In many cases, this estimate will use methods of MSA and GRR to quantify those significant standard errors. It is appropriate to periodically reevaluate uncertainty related to a measurement process to assure the continued accuracy of the estimate. The major difference between uncertainty and the MSA is that the MSA focus is on understanding the measurement process, determining the amount of error in the process, and assessing the adequacy of the measurement system for product and process control. MSA promotes understanding and improvement (variation reduction). Uncertainty is the range of measurement values, defined by a confidence interval, associated with a measurement result and expected to include the true value of the measurement.

The Measurement Process

In order to effectively manage variation of any process, there needs to be knowledge of:

What should the process be doing?
What can go wrong?
What is the process doing?

Specifications and engineering requirements define what the process should be doing.

The purpose of a Process Failure Mode Effects Analysis (PFMEA) is to define the risk associated with potential process failures and to propose corrective action before these failures can occur. The outcome of the PFMEA is transferred to the control plan. Knowledge is gained of what the process is doing by evaluating the parameters or results of the process. This activity, often called inspection, is the act of examining process parameters, in-process parts, assembled subsystems, or complete end products with the aid of suitable standards and measuring devices which enable the observer to confirm or deny the premise that the process is operating in a stable manner with acceptable variation to a customer designated target. But this examination activity is itself a process.

The measurement and analysis activity is a process – a measurement process. Any and all of the management, statistical, and logical techniques of process control can be applied to it. This means that the customers and their needs must first be identified. The customer, the owner of the process, wants to make a correct decision with minimum effort. Management must provide the resources to purchase equipment which is necessary and sufficient to do this. But purchasing the best or the latest measurement technology will not necessarily guarantee correct production process control decisions. Equipment is only one part of the measurement process. The owner of the process must know how to correctly use this equipment and how to analyze and interpret the results. Management must therefore also provide clear operational definitions and standards as well as training and support. The owner of the process has, in turn, the obligation to monitor and control the measurement process to assure stable and correct results which include a total measurement systems analysis perspective – the study of the gage, procedure, user, and environment; i.e., normal operating conditions.

Statistical Properties of Measurement Systems:

An ideal measurement system would produce only “correct” measurements each time it is used. Each measurement would always agree with a standard. A measurement system that could produce measurements like that would be said to have the statistical properties of zero variance, zero bias, and zero probability of misclassifying any product it measured. Unfortunately, measurement systems with such desirable statistical properties seldom exist, and so process managers are typically forced to use measurement systems that have less desirable statistical properties. The quality of a measurement system is usually determined solely by the statistical properties of the data it produces over time. Other properties, such as cost, ease of use, etc., are also important in that they contribute to the overall desirability of a measurement system. But it is the statistical properties of the data produced that determine the quality of the measurement system. Statistical properties that are most important for one use are not necessarily the most important properties for another use. For instance, for some uses of a coordinate measuring machine (CMM), the most important statistical properties are “small” bias and variance. A CMM with those properties will generate measurements that are “close” to the certified values of standards that are traceable. Data obtained from such a machine can be very useful for analyzing the manufacturing process. But, no matter how “small” the bias and variance of the CMM may be, the measurement system which uses the CMM may be unable to do an acceptable job of discriminating between good and bad product because of the additional sources of variation introduced by the other elements of the measurement system. Management has the responsibility for identifying the statistical properties that are the most important for the ultimate use of the data.
Management is also responsible for ensuring that those properties are used as the basis for selecting a measurement system. To accomplish this, operational definitions of the statistical properties, as well as acceptable methods of measuring them, are required. Although each measurement system may be required to have different statistical properties, there are certain fundamental properties that define a “good” measurement system. These include:

Adequate discrimination and sensitivity. The increments of measure should be small relative to the process variation or specification limits for the purpose of measurement. The commonly known Rule of Tens, or 10-to-1 Rule, states that instrument discrimination should divide the tolerance (or process variation) into ten parts or more. This rule of thumb was intended as a practical minimum starting point for gage selection.
The measurement system ought to be in statistical control. This means that under repeatable conditions, the variation in the measurement system is due to common causes only and not due to special causes. This can be referred to as statistical stability and is best evaluated by graphical methods.
For product control, the variability of the measurement system must be small compared to the specification limits. Assess the measurement system against the feature tolerance.
For process control, the variability of the measurement system ought to demonstrate effective resolution and be small compared to manufacturing process variation. Assess the measurement system against the 6-sigma process variation and/or Total Variation from the MSA study.
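
The Rule of Tens in the first property can be checked with a one-line ratio. The sketch below is an illustration only. The function names and the example tolerance and increments are hypothetical, and the reference width may be either the tolerance or the process variation, depending on the purpose of measurement.

```python
def discrimination_ratio(reference_width: float, instrument_increment: float) -> float:
    """Number of instrument increments that fit into the tolerance (or process spread)."""
    if instrument_increment <= 0:
        raise ValueError("instrument_increment must be positive")
    return reference_width / instrument_increment


def meets_rule_of_tens(reference_width: float, instrument_increment: float) -> bool:
    """True when the instrument divides the reference width into ten parts or more."""
    return discrimination_ratio(reference_width, instrument_increment) >= 10


# Hypothetical example: a 0.5 mm tolerance band checked with two instruments.
print(meets_rule_of_tens(0.5, 0.01))  # True: the increment is 1/50 of the tolerance
print(meets_rule_of_tens(0.5, 0.1))   # False: only 5 increments across the tolerance
```

Because the rule is a practical minimum starting point, passing this check does not by itself show that the measurement system has effective resolution.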

Sources of Variation:

Similar to all processes, the measurement system is impacted by both random and systematic sources of variation. These sources of variation are due to common and special causes. In order to control the measurement system variation:

Identify the potential sources of variation.
Eliminate (whenever possible) or monitor these sources of variation.

Although the specific causes will depend on the situation, some typical sources of variation can be identified. There are various methods of presenting and categorizing these sources of variation such as cause-effect diagrams, fault tree diagrams, etc., but the guidelines presented here will focus on the major elements of a measuring system.
The acronym S.W.I.P.E. is used to represent the six essential elements of a generalized measuring system to assure attainment of required objectives. S.W.I.P.E. stands for Standard, Workpiece, Instrument, Person and Procedure, and Environment. This may be thought of as an error model for a complete measurement system. Factors affecting those six areas need to be understood so they can be controlled or eliminated.

Types of Measurement System Variation

It is often assumed that measurements are exact, and frequently the analysis and conclusions are based upon this assumption. An individual may fail to realize there is variation in the measurement system which affects the individual measurements, and subsequently, the decisions based upon the data. A measurement system error can be classified into five categories: bias, repeatability, reproducibility, stability and linearity.
One of the objectives of a measurement system study is to obtain information relative to the amount and types of measurement variation associated with a measurement system when it interacts with its environment. This information is valuable, since, for the average production process, it is far more practical to recognize repeatability and calibration bias and establish reasonable limits for these than to provide extremely accurate gages with very high repeatability. Applications of such a study provide the following:

A criterion to accept new measuring equipment.
A comparison of one measuring device against another.
A basis for evaluating a gage suspected of being deficient.
A comparison of measuring equipment before and after repair.
A required component for calculating process variation, and the acceptability level for a production process.
Information necessary to develop a Gage Performance Curve (GPC), which indicates the probability of accepting a part of some true value.

The Effects of Measurement System Variability

Because the measurement system can be affected by various sources of variation, repeated readings on the same part do not yield the same, identical result. Readings vary from each other due to common and special causes. The effects of the various sources of variation on the measurement system should be evaluated over a short and long period of time. The measurement system capability is the measurement system (random) error over a short period of time. It is the combination of errors quantified by linearity, uniformity, repeatability and reproducibility. The measurement system performance, as with process performance, is the effect of all sources of variation over time. This is accomplished by determining whether our process is in statistical control (i.e., stable and consistent; variation is due only to common causes), on target (no bias), and has an acceptable variation (gage repeatability and reproducibility (GRR)) over the range of expected results. This adds stability and consistency to the measurement system capability.

Effect on Decisions:
After measuring a part, one of the actions that can be taken is to determine the status of that part. Historically, it would be determined if the part were acceptable (within specification) or unacceptable (outside specification). Another common scenario is the classification of parts into specific categories (e.g., piston sizes). Further classifications may be reworkable, salvageable or scrap. Under a product control philosophy, this classification activity would be the primary reason for measuring a part. But, with a process control philosophy, interest is focused on whether the part variation is due to common causes or special causes in the process.
Effect on Product Decisions:
In order to better understand the effect of measurement system error on product decisions, consider the case where all of the variability in multiple readings of a single part is due to the gage repeatability and reproducibility. That is, the measurement process is in statistical control and has zero bias. A wrong decision will sometimes be made whenever any part of the measurement distribution overlaps a specification limit. For example, a good part will sometimes be called “bad” (type I error, producer’s risk or false alarm), and a bad part will sometimes be called “good” (type II error, consumer’s risk or miss rate).

False Alarm Rate + Miss Rate = Error Rate.

That is, with respect to the specification limits, the potential to make the wrong decision about the part exists only when the measurement system error intersects the specification limits. This gives three distinct areas, where:
I Bad parts will always be called bad
II Potential wrong decision can be made
III Good parts will always be called good
Since the goal is to maximize CORRECT decisions regarding product status, there are two choices:
1. Improve the production process: reduce the variability of the process so that no parts will be produced in the II (“shaded”) areas.
2. Improve the measurement system: reduce the measurement system error to reduce the size of the II areas so that all parts being produced will fall within area III and thus minimize the risk of making a wrong decision.
This discussion assumes that the measurement process is in statistical control and on target. If either of these assumptions is violated then there is little confidence that any observed value would lead to a correct decision.
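
The false alarm and miss rates described above can be estimated by simulation. The sketch below is a rough illustration and not a prescribed method. It assumes normally distributed part values and GRR error with zero bias, and it expresses both rates as fractions of all parts measured so that they sum to the error rate. The function name, parameters and numeric values are hypothetical.

```python
import random


def simulate_decisions(lsl, usl, process_mean, process_sigma, grr_sigma,
                       n_parts=100_000, seed=0):
    """Estimate false alarm, miss and total error rates as fractions of all parts."""
    rng = random.Random(seed)
    false_alarms = 0  # good parts called bad (type I error)
    misses = 0        # bad parts called good (type II error)
    for _ in range(n_parts):
        true_value = rng.gauss(process_mean, process_sigma)
        observed = true_value + rng.gauss(0.0, grr_sigma)  # zero-bias gage error
        truly_good = lsl <= true_value <= usl
        accepted = lsl <= observed <= usl
        if truly_good and not accepted:
            false_alarms += 1
        elif not truly_good and accepted:
            misses += 1
    return {
        "false_alarm_rate": false_alarms / n_parts,
        "miss_rate": misses / n_parts,
        "error_rate": (false_alarms + misses) / n_parts,
    }


# Hypothetical process: target 10.0, specification 9.7 to 10.3.
print(simulate_decisions(9.7, 10.3, 10.0, process_sigma=0.10, grr_sigma=0.05))
print(simulate_decisions(9.7, 10.3, 10.0, process_sigma=0.10, grr_sigma=0.0))
```

With zero measurement error every part is classified correctly, which corresponds to the II areas shrinking to nothing. Wrong decisions appear only for parts whose true value lies close to a specification limit.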
Effect on Process Decisions
With process control, the following needs to be established:
- Statistical control
- On target
- Acceptable variability
As explained in the previous section, measurement error can cause incorrect decisions about the product. The impact on process decisions would be as follows:
- Calling a common cause a special cause
- Calling a special cause a common cause
Measurement system variability can affect the decision regarding the stability, target and variation of a process. The basic relationship between the actual and the observed process variation is:
σ²_obs = σ²_actual + σ²_msa
where:
σ²_obs = observed process variance
σ²_actual = actual process variance
σ²_msa = variance of the measurement system
The capability index C_p is defined as:
C_p = (tolerance range) / (6σ)
The relationship between the C_p index of the observed process and the C_p indices of the actual process and the measurement system is derived by substituting the equation for C_p into the observed variance equation above:
(C_p)^-2_obs = (C_p)^-2_actual + (C_p)^-2_msa
  Assuming the measurement system is in statistical control and on target, the actual process C_p can be compared graphically to the observed C_p. Therefore the observed process capability is a combination of the actual process capability plus the variation due to the measurement process. To reach a specific process capability goal would require factoring in the measurement variation. For example, if the measurement system C_p index were 2, the actual process would require a C_p index greater than or equal to 1.79 in order for the calculated (observed) index to be 1.33. If the measurement system C_p index were itself 1.33, the process would require no variation at all if the final result were to be 1.33, clearly an impossible situation.
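
The worked example above can be checked by rearranging the C_p relationship to solve for the actual process index. This is a minimal sketch. The function name is hypothetical, and the observed goal of 1.33 is taken as 4/3.

```python
import math


def required_actual_cp(observed_cp: float, msa_cp: float) -> float:
    """Solve (C_p)^-2_obs = (C_p)^-2_actual + (C_p)^-2_msa for the actual C_p."""
    inverse_square = observed_cp ** -2 - msa_cp ** -2
    if inverse_square <= 0:
        # The measurement system alone already uses up the whole observed index.
        raise ValueError("observed C_p is not reachable with this measurement system")
    return 1.0 / math.sqrt(inverse_square)


# Measurement system C_p of 2 and an observed goal of 1.33 (taken as 4/3).
print(round(required_actual_cp(4 / 3, 2.0), 2))  # 1.79
```

When the measurement system C_p equals the observed goal, the function raises an error, which matches the impossible situation described in the text: the process would need to have no variation at all.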
New Process Acceptance
When a new process such as machining, manufacturing, stamping, material handling, heat treating, or assembly is purchased, there often is a series of steps that are completed as part of the buy-off activity. Often this involves some studies done on the equipment at the supplier’s location and then at the customer’s location. If the measurement system used at either location is not consistent with the measurement system that will be used under normal circumstances then confusion may ensue. The most common situation involving the use of different instruments is the case where the instrument used at the supplier has higher-order discrimination than the production instrument (gage). For example, parts measured with a coordinate measuring machine during buy-off and then with a height gage during production; samples measured (weighed) on an electronic scale or laboratory mechanical scale during buy-off and then on a simple mechanical scale during production. In the case where the (higher-order) measurement system used during buy-off has a GRR of 10% and the actual process C_p is 2.0, the observed process C_p during buy-off will be 1.96. When this process is studied in production with the production gage, more variation (i.e., a smaller C_p) will be observed. For example, if the GRR of the production gage is 30% and the actual process C_p is still 2.0 then the observed process C_p will be 1.71. A worst-case scenario would be if a production gage has not been qualified but is used. If the measurement system GRR is actually 60% (but that fact is not known), then the observed C_p would be 1.28. The difference in the observed C_p of 1.96 versus 1.28 is due to the different measurement system. Without this knowledge, efforts may be spent, in vain, looking to see what went wrong with the new process.
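
The buy-off figures above can be reproduced from the C_p relationship given earlier, under one assumption that the text does not state explicitly: the GRR percentage is taken as a fraction of the tolerance, so that the measurement system C_p equals 1 / GRR. The function name below is hypothetical.

```python
import math


def observed_cp(actual_cp: float, grr_fraction_of_tolerance: float) -> float:
    """Observed C_p when GRR is expressed as a fraction of tolerance (assumption)."""
    # With C_p_msa = 1 / GRR, the term (C_p)^-2_msa is simply GRR squared.
    inverse_square = actual_cp ** -2 + grr_fraction_of_tolerance ** 2
    return 1.0 / math.sqrt(inverse_square)


# Actual process C_p of 2.0 seen through gages with GRR of 10%, 30% and 60%.
for grr in (0.10, 0.30, 0.60):
    print(f"GRR {grr:.0%}: observed C_p = {observed_cp(2.0, grr):.2f}")
```

Under that assumption the three results round to 1.96, 1.71 and 1.28, the values quoted in the text.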
Process Setup/Control (Funnel Experiment):

Often manufacturing operations use a single part at the beginning of the day to verify that the process is targeted. If the part measured is off-target, the process is then adjusted. Later, in some cases, another part is measured and again the process may be adjusted. Dr. Deming referred to this type of measurement and decision-making as tampering. Consider a situation where the weight of a precious metal coating on a part is being controlled to a target of 5.00 grams. Suppose that the results from the scale used to determine the weight vary ±0.20 grams but this is not known since the measurement system analysis was never done. The operating instructions require the operator to verify the weight at setup and every hour based on one sample. If the results are beyond the interval 4.90 to 5.10 grams then the operator is to set up the process again. At setup, suppose the process is operating at 4.95 grams but due to measurement error, the operator observes 4.85 grams. According to instructions, the operator attempts to adjust the process up by 0.15 grams. Now the process is running at 5.10 grams against a target of 5.00 grams. When the operator checks the setup this time, 5.08 grams is observed so the process is allowed to run. Over-adjustment of the process has added variation and will continue to do so. This is one example of the funnel experiment that Dr. Deming used to describe the effects of tampering. Four rules of the funnel experiment are:
Rule 1: Make no adjustment or take no action unless the process is unstable.
Rule 2: Adjust the process in an equal amount and in an opposite direction from where the process was last measured to be.
Rule 3: Reset the process to the target. Then adjust the process in an equal amount and in an opposite direction from the target.
Rule 4: Adjust the process to the point of the last measurement.
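
The effect of tampering can be illustrated with a small simulation comparing Rule 1 with Rule 2. This is a simplified sketch. It assumes the process itself has no variation, so every difference in the readings comes from a normally distributed scale error. The function name and the scale standard deviation of 0.10 grams are hypothetical.

```python
import random
import statistics


def simulate_setup(rule: int, n_checks=20_000, target=5.00, gauge_sigma=0.10, seed=1):
    """Return the variance of the observed weights under funnel Rule 1 or Rule 2."""
    if rule not in (1, 2):
        raise ValueError("only Rule 1 and Rule 2 are sketched here")
    rng = random.Random(seed)
    setting = target  # true process setting; starts on target
    observations = []
    for _ in range(n_checks):
        observed = setting + rng.gauss(0.0, gauge_sigma)  # reading = truth + scale error
        observations.append(observed)
        if rule == 2:
            # Compensate by the observed deviation, in the opposite direction.
            setting -= observed - target
        # Rule 1: leave the setting alone.
    return statistics.pvariance(observations)


variance_rule_1 = simulate_setup(rule=1)
variance_rule_2 = simulate_setup(rule=2)
print(f"variance ratio, Rule 2 / Rule 1: {variance_rule_2 / variance_rule_1:.2f}")
```

In this simplified model, adjusting after every reading roughly doubles the variance of the observed weights, because each adjustment feeds the previous scale error back into the process.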

Measurement Issues

Three fundamental issues must be addressed when evaluating a measurement system:

The measurement system must demonstrate adequate sensitivity.
- First, does the instrument (and standard) have adequate discrimination? Discrimination (or class) is fixed by design and serves as the basic starting point for selecting a measurement system. Typically, the Rule of Tens has been applied, which states that instrument discrimination should divide the tolerance (or process variation) into ten parts or more.
- Second, does the measurement system demonstrate effective resolution? Related to discrimination, determine if the measurement system has the sensitivity to detect changes in product or process variation for the application and conditions.
The measurement system must be stable.
- Under repeatability conditions, the measurement system variation is due to common causes only and not special (chaotic) causes.
- The measurement analyst must always consider the practical and statistical significance.
The statistical properties (errors) are consistent over the expected range and adequate for the purpose of measurement (product control or process control).

The long-standing tradition of reporting measurement error only as a percent of tolerance is inadequate for the challenges of the marketplace that emphasize strategic and continuous process improvement. As processes change and improve, a measurement system must be re-evaluated for its intended purpose. It is essential for the organization (management, measurement planner, production operator, and quality analyst) to understand the purpose of measurement and apply the appropriate evaluation.

Suggested Elements for a Measurement System Development Checklist

(This list should be modified based on the situation and type of measurement system. The development of the final checklist should be the result of the collaboration between the customer and the supplier.)

Measurement System Design and Development Issues:

What is to be measured? What type of characteristic is it? Is it a mechanical property? Is it dynamic or stationary? Is it an electrical property? Is there significant within-part variation?
For what purpose will the results (output) of the measurement process be used? Production improvement, production monitoring, laboratory studies, process audits, shipping inspection, receiving inspection, responses to a D.O.E.?
Who will use the process? Operators, engineers, technicians, inspectors, auditors?
Training required: Operator, maintenance personnel, engineers; classroom, practical application, OJT, apprenticeship period.
Have the sources of variation been identified? Build an error model (S.W.I.P.E.) using teams, brainstorming, profound process knowledge, cause & effect diagram or matrix.
Has an FMEA been developed for the measurement system?
Flexible vs. dedicated measurement systems: Measurement systems can either be permanent and dedicated or they can be flexible and have the ability to measure different types of parts; e.g., doghouse gages, fixture gaging, coordinate measurement machine, etc. Flexible gaging will be more expensive but can save money in the long run.
Contact vs. non-contact: Reliability, type of feature, sample plan, cost, maintenance, calibration, personnel skill required, compatibility, environment, pace, probe types, part deflection, image processing. This may be determined by the control plan requirements and the frequency of the measurement (full contact gaging may get excessive wear during continuous sampling). Full surface contact probes, probe type, air feedback jets, image processing, CMM vs. optical comparator, etc.
Environment: Dirt, moisture, humidity, temperature, vibration, noise, electromagnetic interference (EMI), ambient air movement, air contaminants, etc. Laboratory, shop floor, office, etc.? The environment becomes a key issue with tight tolerances at the micron level, and also where CMMs, vision systems, ultrasonic equipment, etc. are used. This could be a factor in auto-feedback in-process type measurements. Cutting oils, cutting debris, and extreme temperatures could also become issues. Is a cleanroom required?
Measurement and location points: Clearly define, using GD&T, the location of fixturing and clamping points and where on the part the measurements will be taken.
Fixturing method: Free state versus clamped part holding.
Part orientation: Body position versus others.
Part preparation: Should the part be clean, non-oily, temperature stabilized, etc. before measurement?
Transducer location: Angular orientation, distance from primary locators or nets.
Correlation issue #1 – duplicate gaging: Are duplicate (or more) gages required within or between plants to support requirements? Building considerations, measurement error considerations, maintenance considerations. Which is considered the standard? How will each be qualified?
Correlation issue #2 – methods divergence: Measurement variation resulting from different measurement system designs performing on the same product/process within accepted practice and operation limits (e.g., CMM versus manual or open-setup measurement results).
Automated vs. manual: on-line, off-line, operator dependencies.
Destructive versus non-destructive measurement (NDT): Examples: tensile test, salt spray testing, plating/paint coating thickness, hardness, dimensional measurement, image processing, chemical analysis, stress, durability, impact, torsion, torque, weld strength, electrical properties, etc.
Potential measurement range: size and expected range of conceivable measurements.
Effective resolution: Is the measurement's sensitivity to physical change (its ability to detect process or product variation) acceptable for the particular application?
Sensitivity: Is the size of the smallest input signal that results in a detectable (discernable) output signal for this measurement device acceptable for the application? Sensitivity is determined by inherent gage design and quality (OEM), in-service maintenance, and operating condition.

Measurement System Build Issues (equipment, standard, instrument):

Have the sources of variation identified in the system design been addressed? Design review; verify and validate.
Calibration and control system: Recommended calibration schedule and audit of equipment and documentation. Frequency, internal or external, parameters, in-process verification checks.
Input requirements: Mechanical, electrical, hydraulic, pneumatic, surge suppressors, dryers, filters, setup and operation issues, isolation, discrimination and sensitivity.
Output requirements: Analog or digital, documentation and records, file, storage, retrieval, backup.
Cost: Budget factors for development, purchase, installation, operation and training.
Preventive maintenance: Type, schedule, cost, personnel, training, documentation.
Serviceability: Internal and external, location, support level, response time, availability of service parts, standard parts list.
Ergonomics: Ability to load and operate the machine without injuries over time. Measurement device discussions need to focus on issues of how the measurement system is interdependent with the operator.
Safety considerations: Personnel, operation, environmental, lock-out.
Storage and location: Establish the requirements around the storage and location of the measurement equipment. Enclosures, environment, security, availability (proximity) issues.
Measurement cycle time: How long will it take to measure one part or characteristic? Measurement cycle integrated to process and product control.
Will there be any disruption to process flow, lot integrity, to capture, measure and return the part?
Material handling: Are special racks, holding fixtures, transport equipment or other material handling equipment needed to deal with parts to be measured or the measurement system itself?
Environmental issues: Are there any special environmental requirements, conditions, limitations, either affecting this measurement process or neighbouring processes? Is special exhausting required? Is temperature or humidity control necessary? Humidity, vibration, noise, EMI, cleanliness.
Are there any special reliability requirements or considerations? Will the equipment hold up over time? Does this need to be verified ahead of production use?
Spare parts: Common list, adequate supply and ordering system in place, availability, lead-times understood and accounted for. Is adequate and secure storage available? (bearings, hoses, belts, switches, solenoids, valves, etc.)
User instructions: Clamping sequence, cleaning procedures, data interpretation, graphics, visual aids, comprehensive. Available, appropriately displayed.
Documentation: Engineering drawings, diagnostic trees, user manuals, language, etc.
Calibration: Comparison to acceptable standards. Availability and cost of acceptable standards. Recommended frequency, training requirements. Down-time required?
Storage: Are there any special requirements or considerations regarding the storage of the measurement device? Enclosures, environment, security from damage/theft, etc.
Error/Mistake proofing: Can known measurement procedure mistakes be corrected easily (too easily?) by the user? Data entry, misuse of equipment, error proofing, mistake proofing.

Measurement System Implementation Issues (process):

Support: Who will support the measurement process? Lab technicians, engineers, production, maintenance, outside contracted service?
Training: What training will be needed for operators/inspectors/technicians/engineers to use and maintain this measurement process? Timing, resource and cost issues. Who will train? Where will the training be held? Lead time requirements? Coordinated with the actual use of the measurement process.
Data management: How will data output from this measurement process be managed? Manual, computerized, summary methods, summary frequency, review methods, review frequency, customer requirements, internal requirements. Availability, storage, retrieval, backup, security. Data interpretation.
Personnel: Will personnel need to be hired to support this measurement process? Cost, timing, availability issues. Current or new.
Improvement methods: Who will improve the measurement process over time? Engineers, production, maintenance, quality personnel? What evaluation methods will be used? Is there a system to identify needed improvements?
Long-term stability: Assessment methods, format, frequency, and need for long-term studies. Drift, wear, contamination, operational integrity. Can this long-term error be measured, controlled, understood, predicted?
Special considerations: Inspector attributes, physical limitations or health issues: colourblindness, vision, strength, fatigue, stamina, ergonomics.

Measurement Problem Analysis

An understanding of measurement variation and the contribution that it makes to total variation needs to be a fundamental step in basic problem-solving. When the variation in the measurement system exceeds all other variables, it will become necessary to analyze and resolve those issues before working on the rest of the system. In some cases, the variation contribution of the measurement system is overlooked or ignored. This may cause loss of time and resources as the focus is made on the process itself when the reported variation is actually caused by the measurement device.

Step 1: Identify the Issues

When working with measurement systems, as with any process, it is important to clearly define the problem or issue. In the case of measurement issues, it may take the form of accuracy, variation, stability, etc. The important thing to do is try to isolate the measurement variation and its contribution from the process variation (the decision may be to work on the process, rather than work on the measurement device). The issue statement needs to be an adequate operational definition, one that anyone would understand and be able to act on.

Step 2: Identify the Team

The problem-solving team, in this case, will be dependent on the complexity of the measurement system and the issue. A simple measurement system may only require a couple of people. But as the system and issue become more complex, the team may grow in size (maximum team size ought to be limited to 10 members). The team members and the function they represent need to be identified on the problem-solving sheet.

Step 3: Flowchart of Measurement System and Process

The team would review any historical flowcharting of the measurement system and the process. This would lead to a discussion of known and unknown information about the measurement and its interrelationship to the process. The flowcharting process may identify additional members to add to the team.

Step 4: Cause and Effect Diagram

The team would review any historical Cause and Effect Diagram on the Measurement System. This could, in some cases, result in the solution or a partial solution. This would also lead to a discussion on known and unknown information. The team would use subject matter knowledge to initially identify those variables with the largest contribution to the issue. Additional studies can be done to substantiate the decisions.

Step 5: Plan-Do-Study-Act (PDSA)

This would lead to a Plan-Do-Study-Act, which is a form of scientific study. Experiments are planned, data are collected, stability is established, hypotheses are made and proven until an appropriate solution is reached.

Step 6: Possible Solution and Proof of the Correction

The steps and solution are documented to record the decision. A preliminary study is performed to validate the solution. This can be done using some form of design of experiments. Additional studies can also be performed over time, including studies of environmental and material variation.

Step 7: Institutionalize the Change

The final solution is documented in the report; then the appropriate department and functions change the process so that the problem won’t recur in the future. This may require changes in procedures, standards, and training materials. This is one of the most important steps in the process. Most issues and problems have occurred at one time or another.

Assessing Measurement Systems:

Two important areas need to be assessed:
1) Verify the correct variable is being measured at the proper characteristic location. Verify fixturing and clamping if applicable. Also, identify any critical environmental issues that are interdependent with the measurement. If the wrong variable is being measured, then no matter how accurate or how precise the measurement system is, it will simply consume resources without providing benefit.
2) Determine what statistical properties the measurement system needs to have in order to be acceptable. In order to make that determination, it is important to know how the data are to be used, for, without that knowledge, the appropriate statistical properties cannot be determined. After the statistical properties have been determined, the measurement system must be assessed to see if it actually possesses these properties or not.

Phase 1 testing:

Phase 1 testing is an assessment to verify the correct variable is being measured at the proper characteristic location per the measurement system design specification. It also identifies any critical environmental issues that are interdependent with the measurement. Phase 1 could use a statistically designed experiment to evaluate the effect of the operating environment on the measurement system’s parameters (e.g., bias, linearity, repeatability, and reproducibility).
The knowledge gained during Phase 1 testing should be used as input to the development of the measurement system maintenance program as well as the type of testing which should be used during Phase 2.

Phase 2 testing:

Phase 2 testing provides ongoing monitoring of the key sources of variation for continued confidence in the measurement system (and the data being generated) and/or a signal that the measurement system has degraded over time.

When developing Phase 1 or Phase 2 test programs there are several factors that need to be considered:

What effect does the appraiser have on the measurement process? If possible, the appraisers who normally use the measurement device should be included in the study.
Is appraiser calibration of the measurement equipment likely to be a significant cause of variation? If so, the appraisers should recalibrate the equipment before each group of readings.
How many sample parts and repeated readings are required? The number of parts required will depend upon the significance of the characteristic being measured and upon the level of confidence required in the estimate of measurement system variation.

General issues to consider when selecting or developing an assessment procedure include:

Should standards, such as those traceable to NMI, be used in the testing and, if so, what level of standard is appropriate? Standards are frequently essential for assessing the accuracy of a measurement system. If standards are not used, the variability of the measurement system can still be assessed, but it may not be possible to assess its accuracy with reasonable credibility. Lack of such credibility may be an issue, for instance, if attempting to resolve an apparent difference between a producer’s measurement system and a customer’s measurement system.
For the ongoing testing in Phase 2, the use of blind measurements may be considered. Blind measurements are measurements obtained in the actual measurement environment by an operator who does not know that an assessment of the measurement system is being conducted.
The cost of testing.
The time required for the testing.
Any term for which there is no commonly accepted definition should be operationally defined. Examples of such terms include accuracy, precision, repeatability, reproducibility, etc.
Will the measurements made by the measurement system be compared with measurements made by another system? If so, one should consider using test procedures that rely on the use of standards such as those discussed in Phase 1 above. If standards are not used, it may still be possible to determine whether or not the two measurement systems are working well together. However, if the systems are not working well together, then it may not be possible, without the use of standards, to determine which system needs improvement.
How often should Phase 2 testing be performed? This decision may be based on the statistical properties of the individual measurement system and on the consequence, to the facility and to the facility’s customers, of a manufacturing process that, in effect, is not monitored because the measurement system is not performing properly.

Preparation for a Measurement System Study:

Typical preparation prior to conducting the study is as follows:

The approach to be used should be planned. For instance, determine by using engineering judgment, visual observations, or a gage study, if there is an appraiser influence in calibrating or using the instrument. There are some measurement systems where the effect of reproducibility can be considered negligible; for example, when a button is pushed and a number is printed out.
The number of appraisers, number of sample parts, and number of repeat readings should be determined in advance. Some factors to be considered in this selection are:
(a) Criticality of dimension – critical dimensions require more parts and/or trials, because a higher degree of confidence is desired for the gage study estimates.
(b) Part configuration – bulky or heavy parts may dictate fewer samples and more trials.
(c) Customer requirements.
Since the purpose is to evaluate the total measurement system, the appraisers chosen should be selected from those who normally operate the instrument.
Selection of the sample parts is critical for proper analysis and depends entirely upon the design of the MSA study, the purpose of the measurement system, and availability of part samples that represent the production process. When an independent estimate of process variation is not available, OR to determine process direction and continued suitability of the measurement system for process control, the sample parts must be selected from the process and represent the entire production operating range. The variation in sample parts (PV) selected for the MSA study is used to calculate the Total Variation (TV) of the study. The TV index (i.e., %GRR to TV) is an indicator of process direction and continued suitability of the measurement system for process control. If the sample parts DO NOT represent the production process, the TV must be ignored in the assessment. Ignoring TV does not affect assessments using tolerance (product control) or an independent estimate of process variation (process control). Samples can be selected by taking one sample per day for several days. Again, this is necessary because the parts will be treated in the analysis as if they represent the range of production variation in the process. Since each part will be measured several times, each part must be numbered for identification.
The instrument should have discrimination that allows at least one-tenth of the expected process variation of the characteristic to be read directly. For example, if the characteristic’s variation is 0.001, the equipment should be able to “read” a change of 0.0001.
Assure that the measuring method (i.e., appraiser and instrument) is measuring the dimension of the characteristic and is following the defined measurement procedure.
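
The one-tenth discrimination guideline above can be checked with a few lines of Python. This is a minimal sketch; the function name `has_adequate_discrimination` and its `ratio` parameter are illustrative names chosen here, not part of any standard or library.

```python
def has_adequate_discrimination(resolution, process_variation, ratio=10.0):
    """Return True if the smallest readable increment is no larger than
    process_variation / ratio (the one-tenth guideline when ratio = 10)."""
    if resolution <= 0 or process_variation <= 0:
        raise ValueError("resolution and process_variation must be positive")
    required = process_variation / ratio
    # Relative tolerance so that values sitting exactly on the boundary
    # are not rejected because of floating-point rounding.
    return resolution <= required * (1 + 1e-9)


# The example from the text: variation of 0.001, readable change of 0.0001.
print(has_adequate_discrimination(0.0001, 0.001))
# A coarser instrument (hypothetical value) fails the guideline.
print(has_adequate_discrimination(0.0005, 0.001))
```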

To minimize the likelihood of misleading results, the following steps need to be taken:

The measurements should be made in a random order to ensure that any drift or changes that could occur will be spread randomly throughout the study. The appraisers should be unaware of which numbered part is being checked in order to avoid any possible knowledge bias. However, the person conducting the study should know which numbered part is being checked and record the data accordingly, that is, Appraiser A, Part 1, first trial; Appraiser B, Part 4, second trial; etc.
In reading the equipment, measurement values should be recorded to the practical limit of the instrument discrimination. Mechanical devices must be read and recorded to the smallest unit of scale discrimination. For electronic readouts, the measurement plan must establish a common policy for recording the right-most significant digit of the display. Analog devices should be recorded to one-half the smallest graduation or limit of sensitivity and resolution. For analog devices, if the smallest scale graduation is 0.0001”, then the measurement results should be recorded to 0.00005”.
The study should be managed and observed by a person who understands the importance of conducting a reliable study.
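
One way to prepare the random measurement order described above is to shuffle the part sequence separately for every appraiser and trial. The sketch below assumes a crossed study in which every appraiser measures every part in every trial; `build_run_order` is a hypothetical helper, and the schedule is meant for the person running the study, not for the appraisers.

```python
import random


def build_run_order(appraisers, parts, trials, seed=None):
    """Return a list of (trial, appraiser, part) tuples in which the part
    order is reshuffled for every appraiser within every trial."""
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    schedule = []
    for trial in range(1, trials + 1):
        for appraiser in appraisers:
            order = list(parts)
            rng.shuffle(order)
            for part in order:
                schedule.append((trial, appraiser, part))
    return schedule


# Hypothetical study: 3 appraisers, 10 numbered parts, 2 trials.
schedule = build_run_order(["A", "B", "C"], range(1, 11), trials=2, seed=7)
for trial, appraiser, part in schedule[:5]:
    print(f"Trial {trial}: appraiser {appraiser} measures part {part}")
```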

Analysis of the Results

The results should be evaluated to determine if the measurement device is acceptable for its intended application. A measurement system should be stable before any additional analysis is valid.

1. Acceptability Criteria – Gage Assembly and Fixture Error

For measurement systems whose purpose is to analyze a process, a general guideline for measurement system acceptability is as follows:

Replicable Measurement Systems

The procedures are appropriate to use when:

Only two factors or conditions of measurement (i.e., appraisers and parts) plus measurement system repeatability are being studied
The effect of the variability within each part is negligible
There is no statistical interaction between appraisers and parts
The parts do not change functionally or dimensionally during the study, i.e., are replicable

Conducting the Study for Determining Stability:

Obtain a sample and establish its reference value(s) relative to a traceable standard. If one is not available, select a production part that falls in the mid-range of the production measurements and designate it as the master sample for stability analysis. The known reference value is not required for tracking measurement system stability. It may be desirable to have master samples for the low end, the high end, and the mid-range of the expected measurements. Separate measurements and control charts are recommended for each.
On a periodic basis (daily, weekly), measure the master sample three to five times. The sample size and frequency should be based on knowledge of the measurement system. Factors could include how often recalibration or repair has been required, how frequently the measurement system is used, and how stressful the operating conditions are. The readings need to be taken at differing times to represent when the measurement system is actually being used. This will account for warm-up, ambient or other factors that may change during the day.
Plot the data on an X̅ & R or X̅ & s control chart in time order.
Analysis of Results:
Establish control limits and evaluate for out-of-control or unstable conditions using standard control chart analysis.
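
As an illustration of the control limit step, the sketch below computes X̅ & R limits from subgroups of repeated readings on the master sample. It assumes subgroup sizes of 3 to 5 (matching "three to five times" above), uses the commonly tabulated Shewhart constants A2, D3 and D4, and applies only the simplest out-of-control rule (a point beyond the limits); run rules are not covered. The function names and the readings are hypothetical.

```python
# Commonly tabulated control chart constants for subgroup sizes 3 to 5.
CHART_CONSTANTS = {
    3: {"A2": 1.023, "D3": 0.0, "D4": 2.574},
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
}


def xbar_r_limits(subgroups):
    """Compute subgroup means and ranges plus X-bar and R chart limits."""
    m = len(subgroups[0])
    if m not in CHART_CONSTANTS or any(len(s) != m for s in subgroups):
        raise ValueError("subgroups must all have the same size (3, 4 or 5)")
    c = CHART_CONSTANTS[m]
    means = [sum(s) / m for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    x_double_bar = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "means": means,
        "ranges": ranges,
        "x_double_bar": x_double_bar,
        "r_bar": r_bar,
        "xbar_ucl": x_double_bar + c["A2"] * r_bar,
        "xbar_lcl": x_double_bar - c["A2"] * r_bar,
        "r_ucl": c["D4"] * r_bar,
        "r_lcl": c["D3"] * r_bar,
    }


def points_beyond_limits(subgroups):
    """Return (subgroup index, chart) pairs for points outside the limits."""
    lim = xbar_r_limits(subgroups)
    flagged = []
    for i, (mean, rng) in enumerate(zip(lim["means"], lim["ranges"])):
        if not lim["xbar_lcl"] <= mean <= lim["xbar_ucl"]:
            flagged.append((i, "mean"))
        if not lim["r_lcl"] <= rng <= lim["r_ucl"]:
            flagged.append((i, "range"))
    return flagged


# Hypothetical readings of a master sample, three per check.
readings = [[6.0, 6.1, 5.9], [6.2, 6.0, 6.1], [5.9, 6.0, 6.1]]
print(xbar_r_limits(readings)["x_double_bar"])
print(points_beyond_limits(readings))
```

In practice the limits would be based on 20 or more subgroups; three are used here only to keep the sketch short.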

Example of study of Stability

To determine if the stability of a new measurement instrument was acceptable, the process team selected a part near the middle of the range of the production process. This part was sent to the measurement lab to determine the reference value, which was 6.01. The team measured this part 5 times once a shift for four weeks (20 subgroups). After all the data were collected, X̅ & R charts were developed.

Analysis of the control charts indicates that the measurement process is stable since there are no obvious special cause effects visible.

Conducting the Study for Determining Bias by Independent Sample Method

The independent sample method for determining whether the bias is acceptable uses the Test of Hypothesis:
H₀: bias = 0
H₁: bias ≠ 0
The calculated average bias is evaluated to determine if the bias could be due to random (sampling) variation.

Obtain a sample and establish its reference value relative to a traceable standard. If one is not available, select a production part that falls in the midrange of the production measurements and designate it as the master sample for bias analysis. Measure the part n ≥ 10 times in the gage or tool room, and compute the average of the n readings. Use this average as the “reference value.”
Have a single appraiser measure the sample n ≥ 10 times in a normal manner.
Analysis of Results
Determine the bias of each reading:
bias_i = x_i – reference value
Plot the bias data as a histogram relative to the reference value. Review the histogram, using subject matter knowledge, to determine if any special causes or anomalies are present. If not, continue with the analysis. Special caution ought to be exercised for any interpretation or analysis when n < 30.
Compute the average bias of the n readings.
Compute the repeatability standard deviation.
Determine if the repeatability is acceptable by calculating the
%EV = 100 [EV/TV] = 100 [ σ_{repeatability} /TV]
Where the total variation (TV) is based on the expected process variation (preferred) or the specification range divided by 6.
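Determine the t statistic for the bias:
σ_b = σ_{repeatability} / √n
t_bias = average bias / σ_b

Bias is acceptable (statistically zero) at the α level if the p-value associated with t_bias is greater than α; or, equivalently, if zero falls within the 1-α confidence bounds based on the bias value:
average bias – σ_b · t_{ν, 1-α/2} ≤ 0 ≤ average bias + σ_b · t_{ν, 1-α/2}, where ν = n – 1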
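
The analysis steps for the independent sample method can be sketched from raw readings as follows. This is a minimal illustration, not a validated implementation: it assumes the repeatability standard deviation is the sample standard deviation of the n readings, that the standard error of the average bias is that value divided by √n, and that the t statistic has n – 1 degrees of freedom. The function name and the readings are hypothetical, and the critical t value is left to a t-table or statistical software.

```python
import math
import statistics


def bias_study(readings, reference_value, total_variation):
    """Summarize an independent-sample bias study from raw readings."""
    n = len(readings)
    if n < 2:
        raise ValueError("at least two readings are needed")
    biases = [x - reference_value for x in readings]
    avg_bias = statistics.mean(biases)
    sigma_r = statistics.stdev(readings)      # repeatability (assumed: sample std dev)
    if sigma_r == 0:
        raise ValueError("readings show no variation; check resolution")
    sigma_b = sigma_r / math.sqrt(n)          # standard error of the average bias
    return {
        "avg_bias": avg_bias,
        "sigma_repeatability": sigma_r,
        "percent_ev": 100 * sigma_r / total_variation,
        "t_bias": avg_bias / sigma_b,
        "df": n - 1,
    }


# Hypothetical readings of a master sample with reference value 6.0,
# and an expected process standard deviation of 2.0.
result = bias_study([6.1, 6.3, 6.2, 5.9, 6.0, 6.1, 5.8, 6.2, 6.0, 6.1],
                    reference_value=6.0, total_variation=2.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```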

Example – Determining Bias by Independent Sample Method

A manufacturing engineer was evaluating a new measurement system for monitoring a process. An analysis of the measurement equipment indicated that there should be no linearity concerns, so the engineer had only the bias of the measurement system evaluated. A single part was chosen within the operating range of the measurement system based upon documented process variation. The part was measured by layout inspection to determine its reference value. The part was then measured fifteen times by the lead operator.

Using a spreadsheet and statistical software, the supervisor generated the histogram and numerical analysis.

The histogram did not show any anomalies or outliers requiring additional analysis and review. The repeatability of 0.2120 was compared to an expected process variation (standard deviation) of 2.5. Since the %EV = 100(0.2120/2.5) = 8.5%, the repeatability is acceptable and the bias analysis can continue. Since zero falls within the confidence interval of the bias (–0.1107, 0.1241), the engineer can assume that the measurement bias is acceptable, provided that the actual use will not introduce additional sources of variation.
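
The figures quoted in this example can be cross-checked with a short calculation. The sketch assumes α = 0.05 (the text does not state it for this example), takes the average bias as the midpoint of the reported interval, and uses the tabulated two-sided critical value for 14 degrees of freedom; the variable names are illustrative.

```python
import math

sigma_r = 0.2120                    # repeatability reported in the example
process_sigma = 2.5                 # expected process standard deviation
n = 15                              # number of readings
ci_low, ci_high = -0.1107, 0.1241   # reported confidence interval of the bias

percent_ev = 100 * sigma_r / process_sigma
sigma_b = sigma_r / math.sqrt(n)    # standard error of the average bias
avg_bias = (ci_low + ci_high) / 2   # implied by the interval midpoint
t_crit = 2.14479                    # t-table value, 14 df, 0.975 (assumes alpha = 0.05)
half_width = t_crit * sigma_b

print(f"%EV = {percent_ev:.2f}")
print(f"average bias = {avg_bias:.4f}")
print(f"interval = ({avg_bias - half_width:.4f}, {avg_bias + half_width:.4f})")
```

Under these assumptions the recomputed interval agrees with the reported one to four decimal places, which suggests the example used n – 1 = 14 degrees of freedom.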

Conducting the Study for Determining Bias by Control Chart Method

If an X̅ & R chart is used to measure stability, the data can also be used to evaluate bias. The control chart analysis should indicate that the measurement system is stable before the bias is evaluated.

Obtain a sample and establish its reference value relative to a traceable standard. If one is not available, select a production part that falls in the mid-range of the production measurements and designate it as the master sample for bias analysis. Measure the part n ≥ 10 times in the gage or tool room, and compute the average of the n readings. Use this average as the “reference value.”
Conduct the stability study with g (subgroups) ≥ 20 subgroups of size m.

Analysis of Results.

If the control chart indicates that the process is stable and m = 1, use the analysis described for the independent sample method.
If m ≥ 2, plot the data as a histogram relative to the reference value. Review the histogram, using subject matter knowledge, to determine if any special causes or anomalies are present. If not, continue with the analysis.
Obtain the x double bar from the control chart
Compute the bias by subtracting the reference value from x double bar
bias = x double bar – reference value
Compute the repeatability standard deviation using the average range:
σ_{repeatability} = R bar / d₂*
where d₂* is taken from the d₂* table.
Determine if the repeatability is acceptable by calculating the
%EV = 100 [EV/TV] = 100 [σ_{repeatability} /TV]
Where the total variation (TV) is based on the expected process variation (preferred) or the specification range divided by 6.
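Determine the t statistic for the bias:
σ_b = σ_{repeatability} / √(gm)
t_bias = bias / σ_b
Bias is acceptable (statistically zero) at the α level if zero falls within the 1-α confidence bounds around the bias value:
bias – σ_b · t_{ν, 1-α/2} ≤ 0 ≤ bias + σ_b · t_{ν, 1-α/2}
where ν is the degrees of freedom read from the d₂* table.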
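
The control chart method can be sketched in the same way. The function below is an illustration under stated assumptions: the standard error of the bias is taken as the repeatability standard deviation divided by √(gm), and d₂* is supplied by the caller from a d₂* table because it depends on both g and m. The function name and the data are hypothetical.

```python
import math


def bias_from_control_chart(subgroups, reference_value, d2_star):
    """Estimate bias and its t statistic from g subgroups of size m."""
    g = len(subgroups)
    m = len(subgroups[0])
    if m < 2 or any(len(s) != m for s in subgroups):
        raise ValueError("subgroups must share one size m >= 2")
    means = [sum(s) / m for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    x_double_bar = sum(means) / g
    r_bar = sum(ranges) / g
    sigma_r = r_bar / d2_star                 # repeatability from the average range
    sigma_b = sigma_r / math.sqrt(g * m)      # standard error of the bias (assumed form)
    bias = x_double_bar - reference_value
    return {
        "x_double_bar": x_double_bar,
        "bias": bias,
        "sigma_repeatability": sigma_r,
        "t_bias": bias / sigma_b,
    }


# Hypothetical data: two subgroups of five readings. d2_star = 2.326 is the
# large-g limit (d2) for m = 5, used here only as a placeholder; look up
# d2* for the actual g and m of the study.
data = [[6.0, 6.1, 5.9, 6.0, 6.0], [6.1, 6.0, 6.2, 6.1, 6.1]]
print(bias_from_control_chart(data, reference_value=6.0, d2_star=2.326))
```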

Example – Determining Bias by Control Chart Method

Referring to the above table, the stability study was performed on a part which had a reference value of 6.01. The overall average of all the samples (20 subgroups of size 5 for n = 100 samples) was 6.021. The calculated bias is therefore 0.011. Using a spreadsheet and statistical software, the supervisor generated the numerical analysis.

Analysis of Bias Studies

If the bias is statistically non-zero, look for these possible causes:

Error in master or reference value. Check the mastering procedure.
Worn instrument. This can show up in the stability analysis and will suggest the maintenance or refurbishment schedule.
Instrument made to the wrong dimension.
Instrument measuring the wrong characteristic.
Instrument not calibrated properly. Review the calibration procedure.
Instrument used improperly by appraiser. Review measurement instructions.
Instrument correction algorithm incorrect.
If the measurement system has non-zero bias, where possible it should be recalibrated to achieve zero bias through the modification of the hardware, software or both. If the bias cannot be adjusted to zero, it still can be used through a change in procedure (e.g., adjusting each reading by the bias). Since this has a high risk of appraiser error, it should be used only with the concurrence of the customer.

Conducting the Study for Determining Linearity

Select g ≥ 5 parts whose measurements, due to process variation, cover the operating range of the gage.
Have each part measured by layout inspection to determine its reference value and to confirm that the operating range of the subject gage is encompassed.
Have each part measured m ≥ 10 times on the subject gage by one of the operators who normally use the gage. Select the parts at random to minimize appraiser “recall” bias in the measurements.
Analysis of Results.
Calculate the part bias for each measurement and the bias average for each part.
Plot the individual biases and the bias averages with respect to the reference values on a linear graph.
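Calculate and plot the best fit line and the confidence band of the line using the following equations.
For the best fit line, use:
ȳ_i = a·x_i + b
where x_i is the reference value, ȳ_i is the bias average, and
a = [Σxy – (1/gm)·Σx·Σy] / [Σx² – (1/gm)·(Σx)²] (slope)
b = ȳ – a·x̄ (intercept)
For a given x₀, the α level confidence bands are:
lower: b + a·x₀ – t_{gm–2, 1-α/2} · s · √[1/gm + (x₀ – x̄)² / Σ(x_i – x̄)²]
upper: b + a·x₀ + t_{gm–2, 1-α/2} · s · √[1/gm + (x₀ – x̄)² / Σ(x_i – x̄)²]
where s = √[(Σy_i² – b·Σy_i – a·Σx_i·y_i) / (gm – 2)]
The standard deviation of the variability of repeatability is:
σ_{repeatability} = s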
Determine if the repeatability is acceptable by calculating the
%EV = 100 [EV/TV] = 100 [σ_{repeatability} /TV]
Where the total variation (TV) is based on the expected process variation (preferred) or the specification range divided by 6.
Plot the “bias = 0” line and review the graph for indications of special causes and the acceptability of the linearity. For the measurement system linearity to be acceptable, the “bias = 0” line must lie entirely within the confidence bands of the fitted line.
If the graphical analysis indicates that the measurement system linearity is acceptable then the following hypothesis should be true:
H₀: a = 0 (slope = 0)
do not reject if |t| = |a| / [s / √Σ(x_i – x̄)²] ≤ t_{gm–2, 1-α/2}
If the above hypothesis is true then the measurement system has the same bias for all reference values. For the linearity to be acceptable this bias must be zero.
H₀: b = 0 (intercept (bias) = 0)
do not reject if |t| = |b| / [s · √(1/gm + x̄² / Σ(x_i – x̄)²)] ≤ t_{gm–2, 1-α/2}
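
The linearity analysis amounts to an ordinary least-squares fit of bias against reference value, followed by t tests on the slope and intercept. The sketch below assumes one (reference value, bias) pair per individual reading, so that the number of points is gm and the residual degrees of freedom are gm – 2. The function name and the data are hypothetical; critical values are left to a t-table.

```python
import math


def linearity_fit(reference_values, biases):
    """Least-squares fit bias = a * reference + b with t statistics."""
    n = len(reference_values)                 # n = g * m individual readings
    if n != len(biases) or n < 3:
        raise ValueError("need matching lists with at least three points")
    x_bar = sum(reference_values) / n
    y_bar = sum(biases) / n
    sxx = sum((x - x_bar) ** 2 for x in reference_values)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(reference_values, biases))
    a = sxy / sxx                             # slope
    b = y_bar - a * x_bar                     # intercept
    sse = sum((y - (a * x + b)) ** 2
              for x, y in zip(reference_values, biases))
    s = math.sqrt(sse / (n - 2))              # residual standard deviation
    t_a = a / (s / math.sqrt(sxx))
    t_b = b / (s * math.sqrt(1 / n + x_bar ** 2 / sxx))
    return {"slope": a, "intercept": b, "s": s,
            "t_slope": t_a, "t_intercept": t_b, "df": n - 2}


# Hypothetical data: three reference values, two readings each.
refs = [1, 1, 2, 2, 3, 3]
bias = [0.1, 0.3, 0.0, 0.2, -0.1, 0.1]
print(linearity_fit(refs, bias))
```

Each |t| is then compared with the tabulated value for the returned degrees of freedom at proportion 1 – α/2.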

Example – Determining Linearity

A plant supervisor was introducing a new measurement system to the process. Five parts were chosen throughout the operating range of the measurement system based upon documented process variation. Each part was measured by layout inspection to determine its reference value. Each part was then measured twelve times by the lead operator. The parts were selected at random during the study.

Graphical Analysis

The graphical analysis indicates that special causes may be influencing the measurement system. The data for reference value 4 appear to be bimodal. Even if the data for reference value 4 were not considered, the graphical analysis clearly shows that this measurement system has a linearity problem. The R² value indicates that a linear model may not be an appropriate model for these data. Even if the linear model is accepted, the “bias = 0” line intersects the confidence bounds rather than being contained by them. At this point, the supervisor ought to begin problem analysis and resolution on the measurement system, since the numerical analysis will not provide any additional insights. However, wanting to make sure no paperwork is left unmarked, the supervisor calculates the t-statistic for the slope and intercept:

t_a = -12.043

t_b = 10.158

Taking the default α = .05 and going to the t-tables with (gm – 2) = 58 degrees of freedom and a proportion of .975, the supervisor comes up with the critical value of:

t_58,.975 = 2.00172

Since | t_a | > t_58,.975 , the result obtained from the graphical analysis is reinforced by the numerical analysis – there is a linearity problem with this measurement system.

If the measurement system has a linearity problem, it needs to be recalibrated to achieve zero bias through the modification of the hardware, software or both. If the bias cannot be adjusted to zero bias throughout the measurement system range, it still can be used for product/process control but not analysis as long as the measurement system remains stable.

Conducting the Study for Determining Repeatability and Reproducibility:

The Variable Gage Study can be performed using:

Range method
Average and Range method (including the Control Chart method)
ANOVA method

Except for the Range method, the study data design is very similar for each of these methods. The ANOVA method is preferred because it measures the operator-to-part interaction gage error, whereas the Range and the Average and Range methods do not include this variation. As presented, all methods ignore within-part variation (such as roundness, diametric taper, flatness, etc.) in their analyses. However, the total measurement system includes not only the gage itself and its related bias, repeatability, etc., but also could include the variation of the parts being checked. The determination of how to handle within-part variation needs to be based on a rational understanding of the intended use of the part and the purpose of the measurement.

Range Method

The Range method is a modified variable gage study which will provide a quick approximation of measurement variability. This method will provide only the overall picture of the measurement system. It does not decompose the variability into repeatability and reproducibility. It is typically used as a quick check to verify that the GRR has not changed.

Average and Range Method

The Average and Range method (X̅ & R) is an approach which will provide an estimate of both repeatability and reproducibility for a measurement system. Unlike the Range method, this approach will allow the measurement system’s variation to be decomposed into two separate components, repeatability and reproducibility. However, variation due to the interaction between the appraiser and the part/gage is not accounted for in the analysis.

The detailed procedure is as follows:

1. Obtain a sample of n ≥ 10 parts that represent the actual or expected range of process variation.
2. Refer to the appraisers as A, B, C, etc. and number the parts 1 through n so that the numbers are not visible to the appraisers.
3. Calibrate the gage if this is part of the normal measurement system procedures. Let appraiser A measure n parts in random order and enter the results in row 1.
4. Let appraisers B and C measure the same n parts without seeing each other’s readings; then enter the results in rows 6 and 11, respectively.
5. Repeat the cycle using a different random order of measurement. Enter data in rows 2, 7 and 12. Record the data in the appropriate column. For example, if the first piece measured is part 7, then record the result in the column labelled part 7. If three trials are needed, repeat the cycle and enter data in rows 3, 8 and 13.
6. Steps 4 and 5 may be changed to the following when large part size or simultaneous unavailability of parts makes it necessary:
- Let appraiser A measure the first part and record the reading in row 1. Let appraiser B measure the first part and record the reading in row 6. Let appraiser C measure the first part and record the reading in row 11.
- Let appraiser A repeat the reading on the first part and record it in row 2, appraiser B record the repeat reading in row 7, and appraiser C record the repeat reading in row 12. Repeat this cycle and enter the results in rows 3, 8, and 13, if three trials are to be used.
7. An alternative method may be used if the appraisers are on different shifts. Let appraiser A measure all 10 parts and enter the readings in row 1. Then have appraiser A repeat the readings in a different order and enter the results in rows 2 and 3. Do the same with appraisers B and C.
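
The row bookkeeping in the procedure above can be sketched in a few lines. The example assumes three appraisers and a data sheet with five rows per appraiser (trial rows first, which is consistent with rows 1 to 3, 6 to 8 and 11 to 13 in the text). The function names and the choice to reshuffle the part order for every appraiser and trial are illustrative assumptions, not part of the original procedure.

```python
import random

APPRAISERS = ["A", "B", "C"]
ROWS_PER_APPRAISER = 5  # assumed sheet layout: A starts at row 1, B at 6, C at 11


def sheet_row(appraiser, trial):
    """Row on the data sheet for an appraiser and a 1-based trial number."""
    return APPRAISERS.index(appraiser) * ROWS_PER_APPRAISER + trial


def measurement_plan(n_parts=10, n_trials=3, seed=None):
    """Return a list of (trial, appraiser, row, part_order) tuples.

    Each appraiser gets a freshly shuffled part order on every trial, so
    readings are taken in random order but recorded under the part's column.
    """
    rng = random.Random(seed)
    plan = []
    for trial in range(1, n_trials + 1):
        for appraiser in APPRAISERS:
            order = list(range(1, n_parts + 1))
            rng.shuffle(order)
            plan.append((trial, appraiser, sheet_row(appraiser, trial), order))
    return plan


for trial, appraiser, row, order in measurement_plan(seed=1):
    print(f"trial {trial}, appraiser {appraiser}, row {row:2d}: measure parts {order}")
```

Passing a fixed `seed` makes the plan reproducible, which helps if the measurement sequence needs to be documented alongside the study.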

Analysis of Variance (ANOVA) Method:

Analysis of variance (ANOVA) is a standard statistical technique and can be used to analyze the measurement error and other sources of variability of data in a measurement systems study. In the analysis of variance, the variance can be decomposed into four categories: parts, appraisers, the interaction between parts and appraisers, and replication error due to the gage.

The following ANOVA procedure will show how total variability is partitioned. To construct this example, the following procedure will be followed:

1. Choose five parts at random and select a quality characteristic to measure.
2. Identify the parts by numbering them 1 through 5.
3. Pick three technicians/inspectors.
4. Have them measure the parts in random order using the same measuring instrument.
5. Repeat step 4, so that there are two replications for each technician/part combination.

Next, an ANOVA (Analysis of Variance) table will be constructed to partition total data variation into measurement error (repeatability), inspector-to-inspector variation (reproducibility), and part-to-part variation (process).

A ColSq is determined by squaring the Col total and dividing by Col n, e.g., 20²/10 = 40.
A RowSq is determined by squaring the Row total and dividing by Row n, e.g., 8²/6 = 10.667.
An interaction CellSq is determined by squaring the Cell total and dividing by cell sample size n, e.g., (2 + 1)² /2 = 4.5.
∑X² = 2² + 1² +…+ 1.5² + 0.5² = 114.75
∑X = 54.5
N = 30

CM = Correction for the Mean = (∑X)²/N = (54.5)²/30 = 99.008
∑TSqs = 99.625
∑PSqs = 108.875
∑CellSqs = 111.125
TotSS = ∑X² – CM = 114.75 – 99.008 = 15.742
TechSS = ∑TSqs – CM = 99.625 – 99.008 = 0.6167
PartSS = ∑PSqs – CM = 108.875 – 99.008 = 9.867
InterSS = ∑CellSqs – CM – TechSS – PartSS
= 111.125 – 99.008 – 0.6167 – 9.867 = 1.633
ErrorSS = TotSS – TechSS – PartSS – InterSS
= 15.742 – 0.6167 – 9.867 – 1.633 = 3.625
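
The same correction-for-the-mean arithmetic can be written as a small function. This is a minimal sketch that assumes a balanced study stored as `data[part][technician]`, where each entry is a list of replicate readings. The article's data table is not reproduced here, so the 2 x 2 x 2 dataset below is hypothetical and only shows the mechanics.

```python
def sums_of_squares(data):
    """Partition total variation for a balanced parts x technicians study.

    data[part][tech] is a list of replicate readings.
    """
    n_parts = len(data)
    n_techs = len(data[0])
    n_reps = len(data[0][0])

    values = [x for part in data for cell in part for x in cell]
    cm = sum(values) ** 2 / len(values)  # correction for the mean

    tech_totals = [sum(sum(data[p][t]) for p in range(n_parts)) for t in range(n_techs)]
    part_totals = [sum(sum(cell) for cell in part) for part in data]
    cell_totals = [sum(cell) for part in data for cell in part]

    # Squared totals divided by the number of readings behind each total.
    sum_t_sqs = sum(t ** 2 for t in tech_totals) / (n_parts * n_reps)
    sum_p_sqs = sum(p ** 2 for p in part_totals) / (n_techs * n_reps)
    sum_cell_sqs = sum(c ** 2 for c in cell_totals) / n_reps

    tot_ss = sum(x ** 2 for x in values) - cm
    tech_ss = sum_t_sqs - cm
    part_ss = sum_p_sqs - cm
    inter_ss = sum_cell_sqs - cm - tech_ss - part_ss
    error_ss = tot_ss - tech_ss - part_ss - inter_ss
    return {"TotSS": tot_ss, "TechSS": tech_ss, "PartSS": part_ss,
            "InterSS": inter_ss, "ErrorSS": error_ss}


# Hypothetical data: 2 parts x 2 technicians x 2 replicates.
example = [
    [[2.0, 1.0], [1.5, 1.5]],
    [[3.0, 2.5], [3.5, 3.0]],
]
ss = sums_of_squares(example)
for name, value in ss.items():
    print(f"{name} = {value:.3f}")
```

Because ErrorSS is obtained by subtraction, the four components always add back up to TotSS, which is a useful sanity check on hand calculations.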

Technician DF = Number of technicians – 1
Part DF = Number of parts – 1
Interaction DF = (Technician DF) x (Part DF)
Total DF = N – 1
Error DF = Total DF – Technician DF – Part DF – Interaction DF
MS = SS/DF
F = Effect MS/Error MS; for technicians, F = 0.3083/0.2417 = 1.28
Var (Variance) = (Effect MS – Error MS)/(Variance Coefficient)
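
To make these formulas concrete, the sketch below applies them to the sums of squares reported above for the three-technician, five-part, two-replication study. The dictionary names are illustrative; the numeric inputs are the article's own figures.

```python
# Sums of squares as reported in the worked example above.
SS = {"Technician": 0.6167, "Part": 9.867, "Interaction": 1.633, "Error": 3.625}
n_techs, n_parts, N = 3, 5, 30

# Degrees of freedom for each source of variation.
df = {"Technician": n_techs - 1, "Part": n_parts - 1}
df["Interaction"] = df["Technician"] * df["Part"]
df["Total"] = N - 1
df["Error"] = df["Total"] - df["Technician"] - df["Part"] - df["Interaction"]

# Mean squares and F ratios (each effect is tested against the error mean square).
ms = {source: SS[source] / df[source] for source in SS}
f_ratio = {source: ms[source] / ms["Error"]
           for source in ("Technician", "Part", "Interaction")}

for source in SS:
    f_text = f"{f_ratio[source]:.2f}" if source in f_ratio else "-"
    print(f"{source:12s} DF={df[source]:2d}  MS={ms[source]:.4f}  F={f_text}")
```

The technician and part F ratios produced this way agree with the values quoted in the text (1.28 and 10.21).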

The variance coefficient terms come from the original data table, where technician equals 10, part equals 6 and interaction equals 2. The three calculations follow:
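
Written out with the figures from this example, the three calculations would look roughly as follows. The technician and error mean squares are quoted in the text; the part and interaction mean squares are computed here as SS/DF (9.867/4 and 1.633/8), so treat the rounded values as a reconstruction rather than the article's original table.

```latex
\begin{align*}
\sigma^2_{\text{Tech}}  &= \frac{MS_{\text{Tech}} - MS_{\text{Error}}}{10}
                         = \frac{0.3083 - 0.2417}{10} \approx 0.0067 \\
\sigma^2_{\text{Part}}  &= \frac{MS_{\text{Part}} - MS_{\text{Error}}}{6}
                         = \frac{2.4667 - 0.2417}{6} \approx 0.3708 \\
\sigma^2_{\text{Inter}} &= \frac{MS_{\text{Inter}} - MS_{\text{Error}}}{2}
                         = \frac{0.2042 - 0.2417}{2} \approx -0.0188
\end{align*}
```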

The Adjusted Variance column converts the negative interaction variance to 0. The % column shows the percent contribution of each component based on the Adj Var Column.
SIGe (0.4916) is the square root of the Error MS (0.2417) and represents Repeatability. SIGtot (0.7368) is the sigma of total data. The difference between SIGe and SIGtot is due to the difference among technicians and the difference among parts.
Repeatability is the error variance and contributes 39.03% of the total variation in the data. Reproducibility is the variation among technicians which contributes 1.08% of the variation in the data. However, the F ratio test for technicians is 1.28 compared to an F critical value of 3.68 at the 95% confidence level. The null hypothesis that there is no difference among technicians is not rejected. This implies that a reduction in measurement variation cannot be achieved by directing improvement activities at the three technicians. There is no interaction. The interaction variance is effectively 0. This means that each technician measures each part the same way.
Because variances are additive, the total measurement contribution is technician variance + repeatability variance = 1.08% + 39.03% = 40.11%. If R&R variation is to be reduced, it is the repeatability source that must be addressed. Process variation accounts for 59.89% of the total variation in the data. Note that the null hypothesis of no difference between parts would be rejected: F_cal (10.21) is greater than F_α (3.06). Whether this is too much process variation requires a comparison of the total data with the specifications. The specifications themselves carry no information about the variance components of the product output measurements.
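
The percentages quoted above can be reproduced with a short calculation. This sketch starts from the mean squares (the part and interaction values are SS/DF from the earlier figures), sets negative variance estimates to zero, and expresses each component as a share of the adjusted total. Computing SIGtot as the square root of TotSS/(N – 1) is an assumption that happens to reproduce the quoted 0.7368.

```python
import math

# Mean squares from the worked example (Part and Interaction are SS/DF).
ms = {"Technician": 0.3083, "Part": 2.4667, "Interaction": 0.2042, "Error": 0.2417}
# Variance coefficients: number of readings behind each effect total.
coef = {"Technician": 10, "Part": 6, "Interaction": 2}

# Raw variance components; the error variance is the error mean square itself.
var = {source: (ms[source] - ms["Error"]) / coef[source] for source in coef}
var["Error"] = ms["Error"]

# Adjusted variance: a negative estimate is treated as zero.
adj = {source: max(value, 0.0) for source, value in var.items()}
total = sum(adj.values())
pct = {source: 100 * value / total for source, value in adj.items()}

sig_e = math.sqrt(ms["Error"])       # repeatability sigma
sig_tot = math.sqrt(15.742 / 29)     # assumed: sqrt(TotSS / (N - 1))

for source in ("Technician", "Part", "Interaction", "Error"):
    print(f"{source:12s} var={var[source]:8.4f}  adj={adj[source]:.4f}  {pct[source]:6.2f}%")
print(f"SIGe = {sig_e:.4f}, SIGtot = {sig_tot:.4f}")
print(f"R&R share = {pct['Technician'] + pct['Error']:.2f}%")
```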


If you need assistance or have any questions, contact us at preteshbiswas@gmail.com. You can also contribute to this discussion, and we shall be very happy to publish your contributions in this blog. Your comments and suggestions are also welcome.