Root Cause Analysis Guidance Document
Contents
1. Inadvertent SPR operation at transformer (local panel flag indication). Go to next page.
   [Figure D-2. Example of Cause and Effect Charting (continued). Box layout not recoverable; legible chart text follows.]
   - Internal transformer faults: performed gas and oil analysis, and evaluation showed no sign of problems (analysis and evaluation; independent evaluation also).
   - Cover gas pressure too high: found cover gas at higher than normal pressure but below setpoint. Operations set pressure higher to prevent negative pressures in the winter, and this was a hot summer day. Pressure within limits (post-trip inspection of gauges, operators' log, verbal discussion with Operations Department personnel).
   - Pressure integrity.
   - Short circuit in SPR.
   - SPR failure: SPR disassembled and found in perfect working order.
   - Found a relief-type test valve on the SPR with a design setpoint of 10 psig and an actual setpoint of 4 5 psig (bench testing per procedure, visual inspection, bench testing of relief-type test valves). From C above we know transformer pressure was approximately 5 psig: cause of trip.
   - SOLUTION: Replace test valve with pipe plug per standard practice.
   APPENDIX E CHANGE ANALYSIS C
2. Contributing Cause; R = Root Cause. Recommended Corrective Actions.
   2. Procedure Worksheet. Applicable / Not Applicable. Why was Procedures a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. Procedure Problem Subcategory: 2A Defective or Inadequate Procedure; 2B Lack of Procedure. Cause Descriptions. Recommended Corrective Actions.
   3. Personnel Error Worksheet. Applicable / Not Applicable. Why was Personnel Error a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. [Subcategory rows not legible.] Cause Description. Recommended Corrective Actions.
   4. Design Problem Worksheet. Applicable / Not Applicable. Why was Design a Cause? Subcategories include 4A Inadequate Man-Machine Interface. [Remaining rows not legible.] Cause Descriptions. Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. Recommended Corrective Actions.
   5. Training Deficiency Worksheet. Applicable / Not Applicable. Why was Training Deficiency a Cause? Training Deficiency Subcategories: 5B Insufficient Practice or Hands-On Experience; 5C Inadequate Content; 5D Insufficient Refresher Traini
3. [Mini-MORT chart. Box layout not recoverable. Most boxes carry the label LTA; legible headings include Controls, Barriers, Design and development, Concepts and requirements, Tech info, Operability, Maint, Inspection, Higher Supv, Design basis, Human factors, Maint plan, Inspection plan, General design process, Other support systems, 1st Line Supv, Operational spec, Monitor, Trending, Analysis, Corr action, Trigger, Motivation, Procedures, Qualifications, Supv, Training, Monitoring Points, Communication, Knowledge, Supv training, Time, Performance error, D/N detect, D/N correct, Task, Non-task, Emergency, D/N Use, Safety Analysis, Aberrant Behavior, and Selection.]
   MORT BASED ROOT CAUSE ANALYSIS FORM: Policy, Implementation, Risk Assessment, Bridge Elements, Specific Factors, Task Performance. Findings or Conclusions.
   APPENDIX H HUMAN PERFORMANCE EVALUATION
   a. Input detection b. Input understanding c. Action selection d. Action execution
   Facility and equipment operability, procedures and documentation, and management attitudes are all
4. Factor Analysis
   - Events and Causal Factor Analysis. Events and Causal Factor Analysis identifies the time sequence of a series of tasks and/or actions and the surrounding conditions leading to an occurrence. The results are displayed in an Events and Causal Factor chart that gives a picture of the relationships of the events and causal factors.
   - Change Analysis. Change Analysis is used when the problem is obscure. It is a systematic process that is generally used for a single occurrence and focuses on elements that have changed.
   - Barrier Analysis. Barrier Analysis is a systematic process that can be used to identify physical, administrative, and procedural barriers or controls that should have prevented the occurrence.
   - Management Oversight and Risk Tree (MORT) Analysis. MORT and Mini-MORT are used to identify inadequacies in barriers/controls, specific barrier and support functions, and management functions. It identifies specific factors relating to an occurrence and identifies the management factors that permitted these factors to exist.
   - Human Performance Evaluation. Human Performance Evaluation identifies those factors that influence task performance. The focus of this analysis method is on operability, work environment, and management factors. Man-machine interface studies to improve performance take precedence over disciplinary measures.
   - Kepner-Tregoe Problem Solving and Decision Making. Kepner-Tregoe provides a systematic framework for gathering, organizing, and e
5. [Contents entries, partly illegible] 5.2.4 Management Oversight and Risk Tree (MORT); 5.2.5 Human Performance Evaluation; 5.2.6 Kepner-Tregoe Problem Solving and Decision Making; 6. PHASE III CORRECTIVE ACTIONS; 7. PHASE IV INFORM; 8. PHASE V FOLLOW-UP; 9. REFERENCES; APPENDIX A CAUSE CODES; APPENDIX B CAUSAL FACTOR WORKSHEETS; APPENDIX C CAUSAL FACTOR ANALYSIS EXAMPLES; APPENDIX D EVENTS AND CAUSAL FACTOR ANALYSIS; APPENDIX E CHANGE ANALYSIS; APPENDIX F BARRIER ANALYSIS; APPENDIX G MANAGEMENT OVERSIGHT AND RISK TREE (MORT) ANALYSIS; APPENDIX H HUMAN PERFORMANCE EVALUATION
   ROOT CAUSE ANALYSIS GUIDANCE DOCUMENT
   1. SUMMARY
   This document is a guide for root cause analysis specified by DOE Order 5000.3A, Occurrence Reporting and Processing of Operations Information. Causal factors identify program control deficiencies and guide early corrective actions. As such, root cause analysis is central to DOE Order 5000.3A. The basic reason for investigating and reporting the causes of occurrences is to enable the identification of corrective a
6. of actions or happenings, while the conditions are anything that shapes the outcome and ranges from physical conditions, such as an open valve or noise, to attitude or safety culture. The events and conditions as given on the chart describe a causal factor chain. The direct, root, and contributing cause relationships in the causal factor chain are shown in Figure 3.
   [Figure 2. Summary of Root Cause Methods Flow Chart. Box layout not recoverable; legible chart text follows. Occurrence: Serious or Complex? Yes: use all applicable analytical models. No: use scaled-down methods or informal analysis. Obscure cause: Change Analysis. Organizational behavior breakdown: use concept for all cases. Complex barriers and controls, procedure or administrative problems: Barrier Analysis (built into MORT). Multi-faceted problems with long causal factor chains: Events and Causal Factor Charting and/or MORT. People problems: Human Performance Evaluation and/or MORT. Thorough analysis of both causes and corrective actions: Kepner-Tregoe Problem Solving and Decision Making.]
   TABLE 1. SUMMARY OF ROOT CAUSE METHODS
   METHOD: Events and Causal Factor Analysis; Change Analysis; Barrier Analysis; MORT/Mini-MORT; Human Performance Evaluations (HPE); Kepner-Tregoe.
   WHEN TO USE: Use for multi-faceted problems with long or complex causal factor chain. Use when cause is obscure. Especially useful in evaluating equipment f
7. system was not designed for frequent cycling and blew fuse during start. Recommended Corrective Actions: Evaluate and implement design or operational changes to eliminate fuse blowing.
   5. Training Deficiency Worksheet. Applicable / Not Applicable. Why was Training Deficiency a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. [Subcategory matrix not legible.]
   Cause Descriptions: 5A No Training Provided. The employee was not trained on high voltage. NOTE: The training program was adequate.
   Recommended Corrective Actions: Train employee on high voltage.
   6. Management Problem Worksheet. Applicable / Not Applicable. Why was Management Problem a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. [Matrix partly illegible.] 6D Improper Resource Allocation; 6E Policy Not Adequately Defined, Disseminated, or Enforced; 6F Other.
   Cause Descriptions: 6A Inadequate Administrative Control. Reporting and correcting system malfunction (fuse blowing) was inadequate. 6C Inadequate Supervision. The root cause was that the supervisor assigned an unqualified person to work on high voltage.
   Recommended Corrective Actions: 1. Train supervisors to verify qualifications when assigning personnel to hazar
8. that includes the following basic steps may be used:
   a. Identify the problem. Remember that actuation of a protective system constitutes the occurrence but is not the real problem; the unwanted, unplanned condition or action that resulted in actuation is the problem to be solved. For example, dust in the air actuates a false fire alarm. In this case, the occurrence is the actuation of an engineered safety feature. The smoke detector and alarm functioned as intended; the problem to be solved is the dust in the air, not the false fire alarm. Another example is when an operator follows a defective procedure and causes an occurrence. The real problem is the defective procedure; the operator has not committed an error. However, if the operator had been correctly trained to perform the task and therefore could reasonably have been expected to detect the defect in the procedure, then a personnel problem may also exist.
   b. Determine the significance of the problem. Were the consequences severe? Could they be next time? How likely is recurrence? Is the occurrence symptomatic of poor attitude, a safety culture problem, or other widespread program deficiency? Base the level of effort of subsequent steps of your assessment upon the estimation of the level of significance.
   c. Identify the causes (conditions or actions) immediately preceding and surrounding the problem (the reason the problem occurred).
   d. Identify the reasons why the causes in the precedi
9. to identify the cause. Each block is an effect and a cause, except for the first block, which is the primary effect, and the last block in each series, which is the root cause. For each cause, list in a block just below the cause two ways you know it to be true. If only one way is known or not firm, all possible causes should be evaluated as potential causes, and the bases for rejected and accepted causes should be stated. When this process gets to the point where a cause can be corrected to prevent recurrence in a way that allows meeting your objectives and is within your control, you have found the root cause or causes.
   [Figure D-1. Conceptual Process of Cause and Effect Charting. Each cause block sits above a "How do you know this?" block: list two or more ways that explain how you know each cause, e.g., alarm typer, Transient Data Acquisition System, personnel statement, etc.]
   [Figure D-2. Example of Cause and Effect Charting. Box layout not recoverable; legible chart text includes: Reactor SCRAM; Turbine Control Valve closure; RPS Relay Tripped; Relay SPR Actuation on TR N1; Pressure Increased; evidence entries such as Alarm Typer, Flag Set, Personnel Observations, Handle Cocked, Transient Data Acquisition System (TDAS), Indicator Light, and verification by electricians that logic train is functional; Go to next page.]
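   The rule that each cause should be backed by two ways of knowing it is true can be checked mechanically. The following is a minimal Python sketch, not part of the guidance itself; the function name `under_supported` and the sample chart entries are hypothetical and only illustrate the check.

```python
# Illustrative sketch only: flags causes on a cause-and-effect chart that have
# fewer than two independent "how do you know" entries. The chart data below
# is hypothetical sample data, not taken from Figure D-2.

def under_supported(chart, minimum=2):
    """Return the causes whose distinct evidence entries number below `minimum`."""
    return [cause for cause, evidence in chart if len(set(evidence)) < minimum]


sample_chart = [
    ("Relay tripped", ["Alarm typer", "Personnel statement"]),
    ("Pressure increased", ["Transient Data Acquisition System"]),
    ("Valve set too low", []),
]

# Causes listed here still need another way of knowing, or alternative
# causes should be evaluated and the bases for rejection stated.
print(under_supported(sample_chart))
```

   Counting distinct entries, rather than all entries, keeps the same source from being counted twice.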
10. 2. Findings included:
   - The regular electrician was sick, so a substitute who was not trained on high voltage was used (Cause Code 5A, No Training Provided).
   - The substitute did not follow procedures. The substitute tied out the interlocks and used the wrong meter (Cause Code 3C, Violation of Requirement or Procedure).
   - The fuse obtained from the storeroom was outdated and was no good (Cause Code 1A, Defective or Failed Part).
   - The large fan was not designed for cycling (frequent startups) and had been regularly blowing fuses (Cause Code 4B, Inadequate or Defective Design).
   - The supervisor knew the substitute was inexperienced but did not observe the substitute or give any special assistance (Cause Code 6C, Inadequate Supervision).
   - Known defects had not been corrected (Cause Code 6A, Inadequate Administrative Control).
   To correct these conditions, the following recommendations were made:
   - Investigate and repair the system so that it does not blow fuses.
   - Train supervisors to ensure that the worker is qualified for that task.
   - Provide high voltage training as needed.
   - Evaluate management response to safety problems and operation of malfunctioning equipment.
   As a result of the potential significance of this occurrence, a formal detailed root cause analysis was performed. A high level of effort was expended, but the effort was justified due to the consequences of a repeat occurrence. 4 2 1 Regular 4 1 Safety
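   The findings above each carry a cause code whose leading digit names one of the seven major cause categories. The following Python sketch groups the findings by category. It is an illustration only: the category names are those used on the worksheets, while the data layout and the function name `group_by_category` are assumptions made for this example.

```python
# Illustrative sketch only: groups the example findings by major cause
# category. The first character of a cause code (e.g. "6" in "6C") selects
# the category. Function and variable names are hypothetical.
from collections import defaultdict

CATEGORIES = {
    "1": "Equipment/Material Problem",
    "2": "Procedure Problem",
    "3": "Personnel Error",
    "4": "Design Problem",
    "5": "Training Deficiency",
    "6": "Management Problem",
    "7": "External Phenomenon",
}

FINDINGS = [
    ("Substitute was not trained on high voltage", "5A"),
    ("Substitute tied out the interlocks and used the wrong meter", "3C"),
    ("Fuse from the storeroom was outdated", "1A"),
    ("Large fan was not designed for frequent startups", "4B"),
    ("Supervisor did not observe or assist the substitute", "6C"),
    ("Known defects had not been corrected", "6A"),
]


def group_by_category(findings):
    """Map each category name to the (code, description) pairs filed under it."""
    grouped = defaultdict(list)
    for description, code in findings:
        grouped[CATEGORIES[code[0]]].append((code, description))
    return dict(grouped)


for category, items in group_by_category(FINDINGS).items():
    print(category, [code for code, _ in items])
```

   A grouping like this makes it easy to see which worksheets are applicable; here two findings fall under Management Problem and none under Procedure Problem or External Phenomenon.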
11. [Figure C-1. Events and Causal Factors Chart. Box layout not recoverable; legible box text includes: 6.1 Wanted to check it out; 2.6 Defect not corrected; designed to run continually; Meter not adequately installed; 11.0 Fire; Wanted to check voltage across fuse; 11.1 Used a 600 V meter; 9.5 Not experienced; 11.2 Didn't know he needed a higher range meter.]
   1. Equipment/Material Worksheet. XX Applicable / Not Applicable. Why was Equipment/Material a Cause? [Subcategory matrix not legible.]
   Cause Descriptions (rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause): 1A Defective or Failed Part. The replacement fuse was out of date and was no good.
   Recommended Corrective Actions: Evaluate the parts inventory and procurement system and, where needed, implement program to discard and replace outdated parts.
   2. Procedure Worksheet. Applicable / Not Applicable. Why was Procedures a Cause? Procedure Problem Subcategory: 2A Defective or Inadequate Procedure; 2B Lack of Procedure. Rate each subcategory cause: C = Contributing Cause, R = Root Cause. Caus
12. DOE-NE-STD-1004-92. DOE GUIDELINE: ROOT CAUSE ANALYSIS GUIDANCE DOCUMENT. February 1992. U.S. Department of Energy, Office of Nuclear Energy, Office of Nuclear Safety Policy and Standards, Washington, D.C. 20585.
   ABSTRACT
   DOE Order 5000.3A, Occurrence Reporting and Processing of Operations Information, requires the investigation and reporting of occurrences, including the performance of root cause analysis and the selection, implementation, and follow-up of corrective actions. The level of effort expended should be based on the significance attached to the occurrence. Most off-normal occurrences need only a scaled-down effort, while most emergency occurrences should be investigated using one or more of the formal analytical models. A discussion of methodologies, instructions, and worksheets in this document guides the analysis of occurrences as specified by DOE Order 5000.3A.
   CONTENTS
   ABSTRACT; 1. SUMMARY; 2. DEFINITIONS; 3. OVERVIEW OF OCCURRENCE INVESTIGATION; 4. PHASE I DATA COLLECTION; 5. PHASE II ASSESSMENT; 5.1 Assessment and Reporting Guidance; 5.2 Root Cause Methods; 5.2.1 Events and Causal Factor Analysis; 5.2.2 Change Analysis
13. ENTS AND CAUSAL FACTOR ANALYSIS
   Cause and Effects (Walk-through) Task Analysis
   Cause and Effects (Walk-through) Task Analysis is a method in which personnel conduct a step-by-step reenactment of their actions for the observer without carrying out the actual function. If appropriate, it may be possible to use a simulator for performing the walk-through rather than the actual work location.
   Objectives include:
   - Determining how a task was really performed.
   - Identifying problems in human factors design, discrepancies in procedural steps, training, etc.
   Preconditions are that participants must be the people who actually do the task.
   Steps in Cause and Effects Task Analysis are as follows:
   - Obtain preliminary information so you know what the person was doing when the problem or inappropriate action occurred.
   - Decide on a task of interest.
   - Obtain necessary background information.
   - Obtain relevant procedures.
   - Obtain system drawings, block diagrams, piping and instrumentation diagrams, etc.
   - Interview personnel who have performed the task (but not those who will be observed) to obtain understanding of how the task should be performed.
   - Produce a guide outlining how the task will be carried out. A procedure with key items underlined is the easiest way of doing this. The guide should indicate steps in performing the task and key controls and displays so that you will know what to look for and you will be able to record actions more easily.
   Th
15. The procedure was not sufficiently detailed to ensure adequate verification; the procedure did not state that the operator was to verify the correct hookup, only to verify the correct gas mixture in the cylinder.
   - The cylinders had been moved by maintenance personnel to facilitate other noncylinder work in the area and had been returned to the wrong position in the rack; management did not want the cylinders moved by maintenance but had not implemented any controls.
   - The cylinders were not color coded.
   This was classified as an off-normal occurrence related to nuclear safety. The problem was inadequate cooling and the resulting high temperature in the experiment loop. The direct cause was not verifying correct hookup because of inadequate startup procedures (Cause Code 2A, Procedure Problem, Defective or Inadequate Procedure). Contributing causes were maintenance personnel returning the cylinder to the wrong position (Cause Code 3B, Personnel, Inadequate Attention to Detail) and identical leads and colors of cylinders with different contents (Cause Code 4A, Design, Inadequate Man-Machine Interface). The root cause was determined to be the prevailing attitudes and culture that contributed to the maintenance errors and poor design (Cause Code 6E, Management, Policy Not Adequately Defined, Disseminated, or Enforced). In this case, personnel error is not a valid cause because the operator had not been trained to this requirement and could not reasonabl
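16. ailures.
   [TABLE 1 (continued). Column entries are listed in the row order of the METHOD column: Events and Causal Factor Analysis, Change Analysis, Barrier Analysis, MORT/Mini-MORT, Human Performance Evaluation, Kepner-Tregoe.]
   WHEN TO USE (continued):
   - Barrier Analysis: Use to identify barrier and equipment failures and procedural or administrative problems.
   - MORT/Mini-MORT: Use when there is a shortage of experts to ask the right questions and whenever the problem is a recurring one. Helpful in solving programmatic problems.
   - Human Performance Evaluation: Use whenever people have been identified as being involved in the problem cause.
   - Kepner-Tregoe: Use for major concerns where all aspects need thorough analysis.
   ADVANTAGES:
   - Events and Causal Factor Analysis: Provides visual display of analysis process. Identifies probable contributors to the condition.
   - Change Analysis: Simple 6-step process.
   - Barrier Analysis: Provides systematic approach.
   - MORT/Mini-MORT: Can be used with limited prior training. Provides a list of questions for specific control and management factors.
   - Human Performance Evaluation: Thorough analysis.
   - Kepner-Tregoe: Highly structured approach focuses on all aspects of the occurrence and problem resolution.
   DISADVANTAGES:
   - Events and Causal Factor Analysis: Time-consuming and requires familiarity with process to be effective.
   - Change Analysis: Limited value because of the danger of accepting wrong "obvious" answer.
   - Barrier Analysis: Requires familiarity with process to be effective.
   - MORT/Mini-MORT: May only identify area of cause, not specific causes.
   - Human Performance Evaluation: None if process is closely followed.
   - Kepner-Tregoe: More comprehensive than may be needed.
   [Final column, heading not legible:]
   - Events and Causal Factor Analysis: Requires a broad perspective of the event to identify unrelated problems. Helps to identify where deviations occurred from acceptable methods.
   - Change Analysis: A singular problem technique that can be used in support of a larger investigation. All root causes may not be identifi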
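   The "when to use" guidance in Table 1 amounts to a lookup from problem characteristics to candidate methods. The following Python sketch shows one way to encode it. It is an illustration under stated assumptions: the trait keys (such as `obscure_cause`) and the function name `suggest_methods` are invented for this example, and the pairing of traits with methods follows a reading of the flattened table rather than a verified layout.

```python
# Illustrative sketch only: maps problem traits to the root cause methods that
# Table 1 suggests for them. Trait keys are hypothetical labels chosen for
# this example, not terms defined by the guidance.

WHEN_TO_USE = {
    "long_or_complex_causal_chain": "Events and Causal Factor Analysis",
    "obscure_cause": "Change Analysis",
    "barrier_or_procedural_failure": "Barrier Analysis",
    "recurring_problem": "MORT/Mini-MORT",
    "people_involved": "Human Performance Evaluation",
    "major_concern": "Kepner-Tregoe",
}


def suggest_methods(traits):
    """Return candidate methods, in table order, for the given problem traits."""
    return [method for trait, method in WHEN_TO_USE.items() if trait in traits]


# An equipment failure with no evident cause that also involved an operator:
print(suggest_methods({"obscure_cause", "people_involved"}))
```

   The result is a list of candidates, not a decision; the level of effort should still be scaled to the significance of the occurrence.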
17. anges could have prevented the unwanted flow of energy? Why? What maintenance changes could have prevented the unwanted flow of energy? Why? Could the unwanted energy have been deflected or evaded? Why? What other controls are the barriers subject to? Why? Was this event foreseen by the designers, operators, maintainers, anyone? Is it possible to have foreseen the occurrence? Why? Is it practical to have taken further steps to have reduced the risk of the occurrence? Can this reasoning be extended to other similar systems/components? Were adequate human factors considered in the design of the equipment? What additional human factors could be added? Should be added? Is the system/component user friendly? Is the system/component adequately labeled for ease of operation? Is there sufficient technical information for operating the component properly? How do you know? Is there sufficient technical information for maintaining the component properly? How do you know? Did the environment mitigate or increase the severity of the occurrence? Why? What changes were made to the system/component immediately after the occurrence? What changes are planned to be made? What might be made? Have these changes been properly, adequately analyzed for effect? What related changes to operations and maintenance have to be made now? Are expected changes cost effective? Why? How do you know? What would you have done differently to have prevented the occurren
18. ation or materials
   6. Management Problem: 6A Inadequate administrative control; 6B Work organization/planning deficiency; 6C Inadequate supervision; 6D Improper resource allocation; 6E Policy not adequately defined, disseminated, or enforced; 6F Other management problem.
   7. External Phenomenon: 7A Weather or ambient condition; 7B Power failure or transient; 7C External fire or explosion; 7D Theft, tampering, sabotage, or vandalism.
   APPENDIX B CAUSAL FACTOR WORKSHEETS
   After an appropriate root cause model has been used to identify the direct cause, the root cause, and any applicable contributing cause, these findings can be related to the ORPS cause categories by using one or more of the worksheets in this appendix. Each of the seven major cause worksheets has a matrix to list the applicable subcategory cause for each finding. The same subcategory cause may be listed for up to four similar findings under columns I through IV. The Worksheet Summary can be used to list, from the individual worksheets, the one direct cause, the one root cause, and up to three contributing causes, their descriptions, and the corrective actions for electronic entry.
   Worksheet Instructions
   1. Check each worksheet as applicable or nonapplicable.
   2. List subcategory cause information on each applicable worksheet.
      a. List the applicable subcategory cause for the root cause, the contributing causes, and the direct cause by
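   The Worksheet Summary rule stated above (one direct cause, one root cause, and up to three contributing causes) can be checked before entry. The following Python sketch is illustrative only; the function name `check_summary` and the tuple layout are assumptions made for this example, and the D/R/C letters follow the rating key used on the worksheets.

```python
# Illustrative sketch only: checks a Worksheet Summary against the stated
# limits of one direct cause (D), one root cause (R), and up to three
# contributing causes (C). Names and data layout are hypothetical.

ROLE_LIMITS = {"D": (1, 1), "R": (1, 1), "C": (0, 3)}
ROLE_NAMES = {"D": "direct", "R": "root", "C": "contributing"}


def check_summary(causes):
    """Return a list of problems; an empty list means the summary fits the limits."""
    counts = {role: 0 for role in ROLE_LIMITS}
    problems = []
    for role, code in causes:
        if role not in counts:
            problems.append(f"unknown rating {role!r} for cause code {code}")
            continue
        counts[role] += 1
    for role, (low, high) in ROLE_LIMITS.items():
        if not low <= counts[role] <= high:
            problems.append(
                f"expected {low} to {high} {ROLE_NAMES[role]} cause(s), "
                f"found {counts[role]}"
            )
    return problems


# Sample summary: one direct, one root, two contributing causes.
summary = [("D", "3C"), ("R", "6C"), ("C", "1A"), ("C", "6A")]
print(check_summary(summary))
```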
19. ce, disregarding all economic considerations (as regards operation, maintenance, and design)? What would you have done differently to have prevented the occurrence, considering all economic concerns (as regards operation, maintenance, and design)?
   [Figure F-1. Examples of Barrier Analysis. Box layout not recoverable; legible chart text follows.]
   Work Task: Clean Relay Contact. Occurrence: Reactor Trip.
   Sequence of events: Maintenance assignment; system tagout requested; warning tag hung; electricians follow given procedure.
   Barriers analyzed: tagout process, tagout communications process, procedure.
   - MWR requests de-energizing two panels so relays can be cleaned. Operations will only allow one panel at a time to be tagged out. Electrical foreman told and agrees.
   - Tag hung on P689 only; P690 is still energized.
   - Electricians given MWR to work, which references a maintenance procedure, but not told of change in scope by foreman.
   - Electricians go to P690 and begin procedure. Procedure has no step to verify dead power supply before starting. They open first relay and plant trips.
   - Electricians never trained to always check power supply prior to working on electrical equipment.
   Barrier outcomes as labeled on the chart: Holds, Holds, Fails, Fails, Fails.
   APPENDIX G MANAGEMENT OVERSIGHT AND RISK TREE (MORT) ANALYSIS
   A Mini-MORT analysis char
20. ctions adequate to prevent recurrence and thereby protect the health and safety of the public, the workers, and the environment. Every root cause investigation and reporting process should include five phases. While there may be some overlap between phases, every effort should be made to keep them separate and distinct.
   Phase I. Data Collection. It is important to begin the data collection phase of root cause analysis immediately following the occurrence identification to ensure that data are not lost. (Without compromising safety or recovery, data should be collected even during an occurrence.) The information that should be collected consists of conditions before, during, and after the occurrence; personnel involvement (including actions taken); environmental factors; and other information having relevance to the occurrence.
   Phase II. Assessment. Any root cause analysis method may be used that includes the following steps:
   1. Identify the problem.
   2. Determine the significance of the problem.
   3. Identify the causes (conditions or actions) immediately preceding and surrounding the problem.
   4. Identify the reasons why the causes in the preceding step existed, working back to the root cause (the fundamental reason which, if corrected, will prevent recurrence of these and similar occurrences throughout the facility). This root cause is the stopping point in the assessment phase.
   The most common root cause analysis methods are: Events and Causal
21. cylinder brake. The Barrier Analysis Checklist asks: Were there unwanted energies present? Vibration was determined to be the cause of the broken solder connection. Using other questions in the Barrier Analysis Checklist, or by merely asking the next logical questions, we discover that vibration had not been considered in the design. Inspections had been conducted during the last shutdown. The installation had been according to design specifications and verified by quality assurance.
   This was classified as an unusual occurrence involving performance degradation of Class A equipment. The direct cause was Cause Code 1A, Equipment/Material Problem, defective or failed part (lacking something to perform its intended function). The joint was soldered adequately but lacked support. The root cause was Cause Code 4B, Design Problem (something essential was not included).
   Corrective actions included repair of the broken connection, inspection of the other connections, and installation of shrink tubing for structural support. In addition, a checklist including vibration was developed to avoid oversight in design considerations.
   EXAMPLE 3
   An experiment high temperature alarm occurred during reactor startup. Change Analysis, Mini-MORT, or Cause and Effects are all adequate for this investigation. It was revealed that:
   - The cooling gas lead was hooked to the wrong cylinder.
   - The operator had followed the startup procedure to verify correct hook up
22. d any involved personnel been working?
   Where did the condition occur? What were the physical conditions in the area? Where was the condition identified? Was location a factor in causing the condition?
   - Human factor: Lighting; Noise; Temperature; Equipment labeling; Radiation levels; Personal protective equipment required in the area; Radiological protective equipment required in the area; Accessibility; Indication availability; Other activities in the area; What position is required to perform tasks in the area?
   - Equipment factor: Humidity; Temperature; Cleanliness.
   Was the condition an inappropriate action or was it caused by an inappropriate action? An omitted action? An extraneous action? An action performed out of sequence? An action performed to too small a degree? To too large a degree?
   Was procedure use a factor in the condition?
   WHO
   Was there an applicable procedure? Was the correct procedure used? Was the procedure followed? Followed in sequence? Followed "blindly," without thought?
   Was the procedure: Legible? Misleading? Confusing? An approved, current revision? Adequate to do the task? In compliance with other applicable codes and regulations?
   Did the procedure: Have sufficient detail? Have sufficient warnings and precautions? Adequately identify techniques and components? Have steps in the proper sequence? Cover all involved systems? R
23. ding the physical barriers: designing, installation, signs/warnings, training, or procedures.
   - Providing planning/scheduling, administrative controls, resources, or constraints.
   - Verifying that the barriers/controls have been implemented and are being maintained by operational readiness, inspections, audits, maintenance, and configuration/change control.
   - Verifying that planning, scheduling, and administrative controls have been implemented and are adequate.
   - Policy and policy implementation (identification of requirements, assignment of responsibility, allocation of responsibility, accountability, vigor and example in leadership and planning).
   Cause definitions used with this method are similar to those in DOE Order 5000.3A.
   - A cause (causal factor) is any weakness or deficiency in the barrier/control functions or in the administration/management functions that implement and maintain the barriers/controls and the plans/procedures.
   - A causal factor chain (sequence or series) is a logical hierarchical chain of causal factors that extends from policy and policy implementation through the verification and implementation functions to the actual problem with the barrier/control or administrative functions.
   - A direct cause is a barrier/control problem that immediately preceded the occurrence and permitted the condition to exist or adverse event to occur. Since any element on the chart can be an occurrence, the next upstream condition or event o
24. dous tasks. 2. Implement procedures and controls to report and correct malfunctioning systems.
   7. External Phenomena Worksheet. Applicable / XX Not Applicable. Why was External Phenomena a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. [Subcategory matrix not legible.] Cause Descriptions. Recommended Corrective Actions.
   Worksheet Summary
   [Summary matrix of problem categories against Direct, Root, and Contributing columns; entries not recoverable.]
   Cause Description: The direct cause was that an untrained employee violated safety procedures by tying out an interlock and using the wrong meter to test a high voltage fuse. The root cause was that the supervisor assigned an unqualified substitute to work on high voltage. Contributing causes were failure to maintain up-to-date parts (fuse) and tolerance of an unsatisfactory operational system (frequent fuse blowing).
   Corrective Actions:
   1. Train supervision to verify qualifications when assigning personnel to hazardous tasks.
   2. Evaluate parts inventory and procurement system and, where needed, discard and replace outdated parts.
   3. Implement procedures and controls to report and correct malfunctioning systems.
   4. Train employees as needed on high voltage systems.
   APPENDIX D EV
25. dynamics experiments, windmills, radioactive waste disposal systems and burial grounds, testing laboratories, research laboratories, transportation activities, and accommodations for analytical examinations of irradiated and unirradiated components.
Reportable Occurrence: An event or condition to be reported according to the criteria defined in DOE Order 5000.3A.
Occurrence Report: A written evaluation of an event or condition that is prepared in sufficient detail to enable the reader to assess its significance, consequences, or implications and to evaluate actions being employed to correct the condition or to avoid recurrence.
Event: A real-time occurrence (e.g., pipe break, valve failure, loss of power). Note that an event is also anything that could seriously impact the intended mission of DOE facilities.
Condition: Any as-found state, whether or not resulting from an event, that may have adverse safety, health, quality assurance, security, operational, or environmental implications. A condition is usually programmatic in nature; for example, an existing error in analysis or calculation, an anomaly associated with or resulting from design or performance, or an item indicating a weakness in the management process are all conditions.
Cause (Causal Factor): A condition or an event that results in an effect (anything that shapes or influences the outcome). This may be anything from noise in an instrument channel, a pipe break, a
26. e
o Integrate information into the investigative process relevant to the causes of, or the contributors to, the undesirable consequences.
Change Analysis is a good technique to use whenever the causes of the condition are obscure, you do not know where to start, or you suspect a change may have contributed to the condition. Not recognizing the compounding of change (e.g., a change made five years previously combined with a change made recently) is a potential shortcoming of Change Analysis. Not recognizing the introduction of gradual change, as compared with immediate change, also is possible. This technique may be adequate to determine the root cause of a relatively simple condition. In general, though, it is not thorough enough to determine all the causes of more complex conditions. Figure E-1 shows the six steps involved in Change Analysis. Figure E-2 is the Change Analysis worksheet. The following questions help identify information required on the worksheet.
WHAT
What is the condition? What occurred to create the condition? What occurred prior to the condition? What occurred following the condition?
WHEN
What activity was in progress when the condition occurred? What activity was in progress when the condition was identified?
o Operational evolution in the work space: surveillance test; power increase/decrease; starting/stopping equipment
o Operational evolution outside the work space: valve line-up; Fue
27. e Descriptions. Recommended Corrective Actions.
3. Personnel Error Worksheet
Applicable / Not Applicable
Why was Personnel Error a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause.
3A Inadequate Work Environment
3B Inattention to Detail
3C Violation of Requirement or Procedure
3D Verbal Communication Problem
3E Other Human Error
Cause Description: 3C Violation of Requirement or Procedure. Untrained employee tied out interlocks in violation of procedure and used wrong meter. NOTE: Although an employee error was the direct cause, we do not blame the employee. See corrective action.
Recommended Corrective Actions:
1. Train supervisors to verify qualifications when assigning personnel to a hazardous task.
2. Reemphasize the need to obtain authorization prior to bypassing any interlock.
4. Design Problem Worksheet
Applicable / Not Applicable (marked: Applicable)
Why was Design a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause.
Design Problem Subcategories:
4A Inadequate Man-Machine Interface
4B Inadequate or Defective Design
4C Error in Equipment or Material Selection
4D Drawing, Specification, or Data Errors
Cause Descriptions: 4B Inadequate or Defective Design. The
28. e line of reasoning in the investigation process is:
o Outline what happened step by step.
o Begin with the occurrence and identify the problem (condition, situation, or action that was not wanted and not planned).
o Determine what program element was supposed to have prevented this occurrence. (Was it lacking or did it fail?)
o Investigate the reasons why this situation was permitted to exist.
This line of reasoning will explain why the occurrence was not prevented and what corrective actions will be most effective. This reasoning should be kept in mind during the entire root cause process.
Effective corrective action programs include the following:
o Management emphasis on the identification and correction of problems that can affect human and equipment performance, including assigning qualified personnel to effectively evaluate equipment/human performance problems, implementing corrective actions, and following up to verify corrective actions are effective
o Development of administrative procedures that describe the process, identify resources, and assign responsibility
o Development of a working environment that requires accountability for correction of impediments to error-free task performance and reliable equipment performance
o Development of a working environment that encourages voluntary reporting of deficiencies, errors, or omissions
o Training programs for individuals in root cause analysis
o Training of personnel and managers to recogn
29. ecklist on the form was developed without reviewing the hazard identified on the SAR (Cause Code 6B: Management, Work Organization/Planning Deficiency). Also, on the Mini-MORT chart under performance error, training is listed. Investigation of this factor revealed that a contributing cause was that neither the health physics technician nor the operator recognized the hazard (Cause Code 5A: Training Deficiency, No Training Provided).
Note that water in the pump was a condition. Some may feel that this condition was the direct cause of this occurrence, but "water in a pump" given as a cause of "water leaking from a pump" is too simplistic; there is a need to know why a pump containing water was removed from a hot cell. In addition, operator error should be listed as a cause only if the operator had been trained and reasonably could have been expected to recognize the hazard. Also note that full MORT analysis was not used for this off-normal occurrence; the Mini-MORT chart led to asking the few right questions, with a low level of effort required to perform the root cause analysis.
EXAMPLE 2
With the reactor at full power, the outer shim cylinder would not move when attempting to adjust power. While there was no immediate safety concern, the reactor was shut down. Since this was a physical barrier that did not perform its function, we use barrier analysis to ask why. Investigation revealed a broken connection in the wire that activates a solenoid to release the
30. ed. This process is based on the MORT Hazard/Target Concept. If this process fails to identify problem areas, seek additional help or use cause-and-effect analysis. Requires HPE training. Requires Kepner-Tregoe training.
[Figure 3. Causal Factor Relationships. Diagram labels: Condition, Root Cause, Contributing Cause. Condition: any as-found or existing state that influences the outcome of a particular task, process, or operation. Note a: conditions that may exist but are not identified.]
This diagram is a graphical display of what is known. Since all conditions are a result of prior actions, the diagram identifies what questions to ask to follow the path to the source or root cause. In real life, the causal factor chain will usually be complex, with many branches. In such cases, a diagram will be necessary to understand what happened and why. The cause-and-effect block diagram offers these advantages:
o It provides a means for organizing the occurrence data.
o It provides the investigator with a concise summary of what is known and what is unknown; thus, it serves as a guide to direct the course of the investigation.
o It results in a detailed display of the sequence of facts, conditions, and activities.
o It assists in organization of the report data and provides a picture format for b
31. element, the sum of all the findings is a measure of how widespread the element inadequacy is. The results guide the specific and generic corrective actions. A brief explanation of the "what" and "why" may assist in using Mini-MORT for causal analyses. When a target inadvertently comes in contact with a hazard and sustains damage, the event is an accident. A hazard is any condition, situation, or activity representing a potential for adversely affecting economic values or the health or quality of people's lives. A target can be any process, hardware, people, the environment, product quality, or schedule: anything that has economic or personal value.
What prevents accidents or adverse programmatic impact events?
o Barriers that surround the hazard and/or the target and prevent contact, or controls and procedures that ensure separation of the hazard from the target
o Plans and procedures that avoid conflicting conditions and prevent programmatic impacts
In a facility, what functions implement and maintain these barriers, controls, plans, and procedures?
o Identifying the hazards, targets, and potential contacts or interactions, and specifying the barriers/controls that minimize the likelihood and consequences of these contacts
o Identifying potential conflicts/problems in areas such as operations, scheduling, or quality, and specifying management policy, plans, and programs that minimize the likelihood and consequences of these adverse occurrences
o Provi
32. en Management Problem. Cause Description. Corrective Actions.
APPENDIX C CAUSAL FACTOR ANALYSIS EXAMPLES
EXAMPLE 1
Contaminated water leaked from a pump wrapped in plastic after the pump was removed from a hot cell. Investigation using Mini-MORT revealed:
o A safe work permit was obtained and properly signed off but did not contain adequate precautions against possible water involvement in the task.
o The safe work permit included a list of hazards but omitted liquid potential.
o A Safety Analysis Report (SAR) identified this particular hazard, but this information was not used in preparing the safe work permit checklist.
This occurrence was an off-normal release of radionuclides. Using Mini-MORT as a guide, "controls less than adequate" was identified. The problem was leakage of contaminated water. The direct cause was not draining the pump before removing it from the hot cell. Following down the Mini-MORT chart, Performance Error, Job Assignment Less Than Adequate (LTA) was found. The operator had not been instructed or trained on this hazard, and the safe work permit did not include this precaution (Cause Code 2A: Defective or Inadequate Procedure; lacks something essential to successfully perform activity). Continuing on the Mini-MORT chart, Technical Information, Communication, and Knowledge were found. Asking questions about these factors revealed that the root cause was the safe work permit form. The ch
33. ent, and personnel. These five elements must be managed; therefore, management is also a necessary element. Whenever there is an occurrence, one of these six program elements was inadequate to prevent the occurrence. External phenomena beyond operational control serves as a seventh cause category. These causal factors, specified in DOE Order 5000.3A, can be associated in a logical causal factor chain as shown in Figure 1. Note that a direct, contributing, or root cause can occur any place in the causal factor chain; that is, a root cause can be an operator error while a management problem can be a direct cause, depending on the nature of the occurrence. These seven cause categories are subdivided into a total of 32 subcategories. The direct cause, contributing causes, and root cause are all selected from these subcategories (see Appendix A).
[Figure 1. Causal Factor Categories Associated in a Logical Chain. Legible labels: Management Factors, Bridge or Transfer Factors, External Phenomena, Design, Training, Controls, Procedure, Personnel.]
5.1 Assessment and Reporting Guidance
To perform the assessment and report the causal factors and corrective actions:
Analyze and determine the events and causal factor chain. Any root cause analysis method
34. equire adequate work review?
Which personnel: Were involved with the condition? Observed the condition? Identified the condition? Reported the condition? Corrected the condition? Mitigated the condition? Missed the condition?
What were: The qualifications of these personnel? The experience levels of these personnel? The work groups of these personnel? The attitudes of these personnel? Their activities at the time of involvement with the condition?
Did the personnel involved: Have adequate instruction? Have adequate supervision? Have adequate training? Have adequate knowledge? Communicate effectively? Perform correct actions? Worsen the condition? Mitigate the condition?
[Figure E-1. Six Steps Involved in Change Analysis. Recoverable step labels: Occurrence with Undesirable Consequence; Comparable Activity without Undesirable Consequence; Set Down Differences; Analyze Differences for Effect on Undesirable Consequence; Integrate Information Relevant to the Causes of the Undesirable Consequence.]
Change Factor
What: Conditions, occurrence, activity, equipment
When: Occurred, identified, plant status, schedule
Where: Physical location, environmental conditions
How: Work practice, omission, extraneous action, out-of-sequence procedure
Who: Personnel involved, training, qualificat
35. hange Analysis looks at a problem by analyzing the deviation between what is expected and what actually happened. The evaluator essentially asks what differences occurred to make the outcome of this task or activity different from all the other times this task or activity was successfully completed. This technique consists of asking the questions: What? When? Where? Who? How? Answering these questions should provide direction toward answering the root cause determination question: Why? Primary and secondary questions included within each category will provide the prompting necessary to thoroughly answer the overall question. Some of the questions will not be applicable to any given condition. Some amount of redundancy exists in the questions to ensure that all items are addressed. Several key elements include the following:
o Consider the event containing the undesirable consequences.
o Consider a comparable activity that did not have the undesirable consequences.
o Compare the condition containing the undesirable consequences with the reference activity.
o Set down all known differences, whether they appear to be relevant or not.
o Analyze the differences for their effects in producing the undesirable consequences. This must be done with careful attention to detail, ensuring that obscure and indirect relationships are identified (e.g., a change in color or finish may change the heat transfer parameters and consequently affect system temperatur
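The comparison at the heart of Change Analysis can be sketched in a few lines of Python. This is only an illustration, assuming each activity is recorded as a dictionary keyed by change factor; the function name, factor names, and values are hypothetical and are not taken from any worksheet in this guide.

```python
# Illustrative sketch of the "set down all known differences" step.
# All factor names and values below are hypothetical examples.

def set_down_differences(occurrence, reference):
    """List every factor whose value differs between the occurrence and a
    comparable activity that did not have the undesirable consequence."""
    rows = []
    for factor in sorted(set(occurrence) | set(reference)):
        observed = occurrence.get(factor)
        expected = reference.get(factor)
        if observed != expected:
            # "effect" stays empty here: the analyst evaluates it afterwards.
            rows.append({"factor": factor, "reference": expected,
                         "occurrence": observed, "effect": None})
    return rows

reference_run = {"who": "qualified operator",
                 "when": "day shift",
                 "how": "procedure followed in sequence"}
occurrence_run = {"who": "substitute operator",
                  "when": "day shift",
                  "how": "step performed out of sequence"}

differences = set_down_differences(occurrence_run, reference_run)
for row in differences:
    print(row["factor"], ":", row["reference"], "->", row["occurrence"])
```

Differences are recorded whether or not they look relevant; judging their effect is a separate, manual step, which is why the sketch leaves the effect field empty.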
36. hen there is a shortage of experts to ask the right questions. However, because each of the management factors may apply to the specific barrier/control factors, the direct linkage or relationship is not shown but is left up to the analyst. For this reason, Events and Causal Factor Analysis and MORT should be used together for serious occurrences: one to show the relationship, the other to prevent oversight. A number of condensed versions of MORT, called Mini-MORT, have been produced. For a major occurrence justifying a comprehensive investigation, a full MORT analysis could be performed, while Mini-MORT would be used for most other occurrences. Appendix G describes the Mini-MORT technique.
5.2.5 Human Performance Evaluation
Human Performance Evaluation is used to identify factors that influence task performance. It is most frequently used for man-machine interface studies. Its focus is on operability and work environment rather than training operators to compensate for bad conditions. Also, human performance evaluation may be used for most occurrences, since many conditions and situations leading to an occurrence ultimately result from some task performance problem such as planning, scheduling, task assignment, analysis, maintenance, and inspections. Training in ergonomics and human factors is needed to perform adequate human performance evaluations, especially in man-machine interface situations. Appendix H discusses this technique.
5.2.6 Kep
37. ion, supervision
Worksheet columns: Difference/Change, Effect, Questions to Answer
Figure E-2. Change Analysis Worksheet
APPENDIX F BARRIER ANALYSIS
There are many things that should be addressed during the performance of a Barrier Analysis. NOTE: In this usage, a barrier is from Management Oversight and Risk Tree (MORT) terminology and is something that separates an affected component from an undesirable condition/situation. Figure F-1 provides an example of Barrier Analysis. The questions listed below are designed to aid in determining what barrier failed, thus resulting in the occurrence.
o What barriers existed between the second, third, etc. condition/situation and the second, third, etc. problems?
o If there were barriers, did they perform their functions? Why?
o Did the presence of any barriers mitigate or increase the occurrence severity? Why?
o Were any barriers not functioning as designed? Why?
o Was the barrier design adequate? Why?
o Were there any barriers in the condition/situation source(s)? Did they fail? Why?
o Were there any barriers on the affected component(s)? Did they fail? Why?
o Were the barriers adequately maintained?
o Were the barriers inspected prior to expected use?
o Why were any unwanted energies present?
o Is the affected system/component designed to withstand the condition/situation without the barriers? Why?
o What design changes could have prevented the unwanted flow of energy? Why?
o What operating ch
38. ize and report occurrences, including early identification of significant and generic problems
o Development of programs to ensure prompt investigation following an occurrence or identification of declining trends in performance, to determine root causes and corrective actions
o Adoption of a classification and trending mechanism that identifies those factors that continue to cause problems with generic implications
4. PHASE I - DATA COLLECTION
It is important to begin the data collection phase of the root cause process immediately following occurrence identification to ensure that data are not lost. (Without compromising safety or recovery, data should be collected even during an occurrence.) The information that should be collected consists of conditions before, during, and after the occurrence; personnel involvement (including actions taken); environmental factors; and other information having relevance to the condition or problem. For serious cases, photographing the area of the occurrence from several views may be useful in analyzing information developed during the investigation. Every effort should be made to preserve physical evidence such as failed components, ruptured gaskets, burned leads, blown fuses, spilled fluids, and partially completed work orders and procedures. This should be done despite operational pressures to restore equipment to service. Occurrence participants and other knowledgeable individuals should be identified. Once all
39. l handling; removing equipment from service; returning equipment to service
o Maintenance activity: surveillance; corrective maintenance; modification installation; troubleshooting
o Training activity
What equipment was involved in the condition? What equipment initiated the condition? What equipment was affected by the condition? What equipment mitigated the condition? What is the equipment's function? How does it work? How is it operated? What failed first? Did anything else fail due to the first problem? What form of energy caused the equipment problem? What are recurring activities associated with the equipment? What corrective maintenance has been performed on the equipment? What modifications have been made to the equipment?
What system or controls (barriers) should have prevented the condition? What barrier(s) mitigated the consequences of the condition?
When did the condition occur? What was the facility's status at the time of occurrence? When was the condition identified? What was the facility's status at the time of identification?
What effects did the time of day have on the condition? Did it affect: information availability? Personnel availability? Ambient lighting? Ambient temperature?
Did the condition involve shift-work personnel? If so: What type of shift rotation was in use? Where in the rotation were the personnel? For how many continuous hours ha
40. l implementation and continued effectiveness of the corrective actions?
o What impact will the development and implementation of the corrective actions have on other work groups?
o Is the implementation of the corrective actions measurable? (For example, "Revise step 6.2 of the procedure to reflect the correct equipment location" is measurable; "Ensure the actions of procedure step 6.2 are performed correctly in the future" is not measurable.)
7. PHASE IV - INFORM
Electronic reporting to ORPS is part of the inform process for all occurrences. (For those occurrences containing classified information, an unclassified version shall be entered into ORPS.) Effectively preventing recurrences requires the distribution of these reports, especially the lessons learned, to all personnel who might benefit. Methods and procedures for identifying personnel who have an interest are essential to effective communications. In addition, an internal self-appraisal report identifying management and control system defects should be presented to management for the more serious occurrences. The defective elements can be identified using MORT or Mini-MORT as described in Appendix G. Consideration should be given to directly sharing the details of root cause information with similar facilities where significant or long-standing problems may also exist.
8. PHASE V - FOLLOW-UP
Follow-up includes determining if corrective actions have been effective in resol
41. management should be involved in the process. Proposed corrective actions should be reviewed to ensure the above criteria have been met, and should be prioritized based on importance, scheduled (a change in priority or schedule should be approved by management), entered into a commitment tracking system, and implemented in a timely manner. A complete corrective action program should be based not only on specific causes of occurrences but also on items such as lessons learned from other facilities, appraisals, and employee suggestions. A successful corrective action program requires management that is involved at the appropriate level and is willing to take responsibility and allocate adequate resources for corrective actions.
Additional specific questions and considerations in developing and implementing corrective actions include:
o Do the corrective actions address all the causes?
o Will the corrective actions cause detrimental effects?
o What are the consequences of implementing the corrective actions?
o What are the consequences of not implementing the corrective actions?
o What is the cost of implementing the corrective actions (capital costs, operations and maintenance costs)?
o Will training be required as part of the implementation?
o In what time frame can the corrective actions reasonably be implemented?
o What resources are required for successful development of the corrective actions?
o What resources are required for successfu
42. n operator error, or a weakness or deficiency in management or administration. In the context of DOE Order 5000.3A, there are seven major cause (causal factor) categories. These major categories are subdivided into a total of 32 subcategories (see Appendix A).
Causal Factor Chain (Sequence of Events and Causal Factors): A cause-and-effect sequence in which a specific action creates a condition that contributes to or results in an event. This creates new conditions that, in turn, result in another event. Earlier events or conditions in a sequence are called upstream factors.
Direct Cause: The cause that directly resulted in the occurrence. For example, in the case of a leak, the direct cause could have been the problem in the component or equipment that leaked. In the case of a system misalignment, the direct cause could have been operator error in the alignment.
Contributing Cause: A cause that contributed to an occurrence but, by itself, would not have caused the occurrence. For example, in the case of a leak, a contributing cause could be lack of adequate operator training in leak detection and response, resulting in a more severe event than would have otherwise occurred. In the case of a system misalignment, a contributing cause could be excessive distractions to the operators during shift change, resulting in less than adequate attention to important details during system alignment.
Root Cause: The cause that, if corrected, would prevent recurre
43. n the chart is the direct cause and can be a management factor. Management is seldom a direct cause for a real-time loss event such as injury or property damage but may very well be a direct cause for conditions. A root cause is the fundamental cause which, if corrected, will prevent recurrence of this and similar events. This is usually not a barrier/control problem but a weakness or deficiency in the identification, provision, or maintenance of the barriers/controls or the administrative functions. In the context of DOE Order 5000.3A, a root cause is ordinarily control-related, involving such upstream elements as management and administration. In any case, it is the original or source cause. A contributing cause is any cause that had some bearing on the occurrence, on the direct cause, or on the root cause but is not the direct or the root cause.
[Mini-MORT fault tree chart. Key: LTA = less than adequate; OR gate = any input results in an output; AND gate = all inputs must be present to provide an output; n = number of inputs. Recoverable chart labels: Event; Oversights/Omissions; Assumed Risks; What; Why; Specifics LTA; Mgmt LTA; Accident; Policy LTA; Implementation LTA; Risk Assessment LTA; Hazard; Barrier/Controls; Target; Recovery; Emergency; Relations; Goals; Tech info; Hazard; Safety program]
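The gate definitions in the fault tree key (an OR gate produces an output when any input is present; an AND gate only when all inputs are present) can be illustrated with a short Python sketch. The tree below is a hypothetical three-input example built on the hazard/barrier/target idea; it is not a transcription of the Mini-MORT chart, and the event names are assumptions made for illustration.

```python
# Illustrative sketch of OR/AND gate evaluation in a fault tree.
# The tree and the event names are hypothetical, not the Mini-MORT chart.

def evaluate(node, findings):
    """Return True if the node's output is present.

    A node is either a basic-event name (str) or a tuple (gate, inputs),
    where gate is "OR" or "AND" and inputs is a list of nodes.
    """
    if isinstance(node, str):
        # Events with no recorded finding are treated as not present.
        return findings.get(node, False)
    gate, inputs = node
    results = [evaluate(child, findings) for child in inputs]
    if gate == "OR":
        return any(results)   # any input results in an output
    if gate == "AND":
        return all(results)   # all inputs must be present
    raise ValueError(f"unknown gate: {gate}")

tree = ("AND", ["hazard present",
                "target exposed",
                ("OR", ["barrier missing", "barrier failed"])])

findings = {"hazard present": True,
            "target exposed": True,
            "barrier failed": True}
print(evaluate(tree, findings))
```

Treating unrecorded events as absent is a simplification; in an investigation an unknown would normally prompt a question rather than be assumed false.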
44. nc., Idaho Falls, ID, March 1981.
N. W. Knox and R. W. Eicher, MORT User's Manual, SSDC-4, Rev. 2, System Safety Development Center, EG&G Idaho, Inc., Idaho Falls, ID, 1983.
G. J. Briscoe, WP-28 (SSDC), MORT-Based Risk Management, EG&G Idaho, Inc., Idaho Falls, ID, 199
J. L. Harbour and S. G. Hill, HSYS, A Methodology for Analyzing Human Performance in Operational Settings, Draft, EGG-HFRU-3306.
C. H. Kepner and B. B. Tregoe, The New Rational Manager, Princeton Research Press, Princeton, NJ, 1981.
APPENDIX A CAUSE CODES
1. Equipment/Material Problem
1A Defective or failed part
1B Defective or failed material
1C Defective weld, braze, or soldered joint
1D Error by manufacturer in shipping or marking
1E Electrical or instrument noise
1F Contamination
2. Procedure Problem
2A Defective or inadequate procedure
2B Lack of procedure
3. Personnel Error
3A Inadequate work environment
3B Inattention to detail
3C Violation of requirement or procedure
3D Verbal communication problem
3E Other human error
4. Design Problem
4A Inadequate man-machine interface
4B Inadequate or defective design
4C Error in equipment or material selection
4D Drawing, specification, or data errors
5. Training Deficiency
5A No training provided
5B Insufficient practice or hands-on experience
5C Inadequate content
5D Insufficient refresher training
5E Inadequate present
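Because each cause code is a category digit followed by a subcategory letter, the table lends itself to a simple lookup. The Python sketch below is illustrative only: the function name is hypothetical, the seven category names are those used in this guide, and the subcategory table is deliberately partial (a handful of entries from categories 1 through 5).

```python
# Illustrative lookup for cause codes such as "2A".
# SUBCATEGORIES is deliberately partial; describe() is a hypothetical helper.

CATEGORIES = {
    1: "Equipment/Material Problem",
    2: "Procedure Problem",
    3: "Personnel Error",
    4: "Design Problem",
    5: "Training Deficiency",
    6: "Management Problem",
    7: "External Phenomena",
}

SUBCATEGORIES = {
    "1A": "Defective or failed part",
    "2A": "Defective or inadequate procedure",
    "2B": "Lack of procedure",
    "3C": "Violation of requirement or procedure",
    "4B": "Inadequate or defective design",
    "5A": "No training provided",
}

def describe(code):
    """Expand a cause code into its category and subcategory text."""
    code = code.strip().upper()
    category = CATEGORIES[int(code[0])]  # first character is the category digit
    detail = SUBCATEGORIES.get(code, "subcategory not listed in this partial table")
    return f"{code} {category}: {detail}"

print(describe("2a"))
print(describe("6B"))
```

A code outside categories 1 through 7 raises a KeyError, which is a reasonable failure mode for a table that is meant to be exhaustive at the category level.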
45. nce of this and similar occurrences. The root cause does not apply to this occurrence only, but has generic implications to a broad group of possible occurrences, and it is the most fundamental aspect of the cause that can logically be identified and corrected. There may be a series of causes that can be identified, one leading to another. This series should be pursued until the fundamental, correctable cause has been identified. For example, in the case of a leak, the root cause could be management not ensuring that maintenance is effectively managed and controlled. This cause could have led to the use of improper seal material or missed preventive maintenance on a component, which ultimately led to the leak. In the case of a system misalignment, the root cause could be a problem in the training program, leading to a situation in which operators are not fully familiar with control room procedures and are willing to accept excessive distractions.
3. OVERVIEW OF OCCURRENCE INVESTIGATION
The objective of investigating and reporting the cause of occurrences is to enable the identification of corrective actions adequate to prevent recurrence and thereby protect the health and safety of the public, the workers, and the environment. Programs can then be improved and managed more efficiently and safely. The investigation process is used to gain an understanding of the occurrence, its causes, and what corrective actions are necessary to prevent recurrence. Th
46. ner-Tregoe Problem Solving and Decision Making
Kepner-Tregoe is used when a comprehensive analysis is needed for all phases of the occurrence investigation process. Its strength lies in providing an efficient, systematic framework for gathering, organizing, and evaluating information, and it consists of four basic steps:
a. Situation appraisal to identify concerns, set priorities, and plan the next steps.
b. Problem analysis to precisely describe the problem, identify and evaluate the causes, and confirm the true cause. (This step is similar to change analysis.)
c. Decision analysis to clarify purpose, evaluate alternatives, assess the risks of each option, and make a final decision.
d. Potential problem analysis to identify safety degradation that might be introduced by the corrective action, identify the likely causes of those problems, take preventive action, and plan contingent action. This final step provides assurance that the safety of no other system is degraded by changes introduced by proposed corrective actions.
These four steps cover all phases of the occurrence investigation process, and thus Kepner-Tregoe can be used for more than causal factor analysis. Separate worksheets (provided by Kepner-Tregoe) provide a specific focus on each of the four basic steps and consist of step-by-step procedures to aid in the analyses. This systems approach prevents overlooking any aspect of the concern. As formal Kepner-Tregoe training is needed for th
47. ng
5E Inadequate Presentation or Materials
Cause Descriptions. Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause. Recommended Corrective Actions.
6. Management Problem Worksheet
Applicable / Not Applicable
Why was Management Problem a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause.
6A Inadequate Administrative Control
6B Work Organization/Planning Deficiency
6D Improper Resource Allocation
6E Policy Not Adequately Defined, Disseminated, or Enforced
6F Other
Cause Descriptions. Recommended Corrective Actions.
7. External Phenomena Worksheet
Applicable / Not Applicable
Why was External Phenomena a Cause? Rate each subcategory cause: D = Direct Cause, C = Contributing Cause, R = Root Cause.
External Phenomena Subcategories:
7A Weather or Ambient Condition
7B Power Failure or Transient
7C External Fire or Explosion
7D Theft, Tampering, Sabotage, Vandalism
Cause Descriptions. Recommended Corrective Actions.
Worksheet Summary
[Summary chart marking Direct, Root, and Contributing causes. Legible labels: Equipment/Material Problem, Operational Readiness Problem, Personnel Error, Design, Management, Training.]
48. ng identification step existed, working your way back to the root cause: the fundamental reason that, if corrected, will prevent recurrence of this and similar occurrences throughout the facility and other facilities under your control. This root cause is the stopping point in the assessment of causal factors. It is the place where, with appropriate corrective action, the problem will be eliminated and will not recur.
2. Summarize findings, list the causal factors, and list corrective actions. Summarize your findings using the worksheets in Appendix B, and classify each finding or cause by the cause categories in Appendix A. Select the one (most) direct cause and the root cause (the one for which corrective action will prevent recurrence and have the greatest, most widespread effect). In cause selection, focus on programmatic and system deficiencies and avoid simple excuses such as blaming the employee. Note that the root cause must be an explanation (the why) of the direct cause, not a repeat of the direct cause. In addition, a cause description is not just a repeat of the category code description; it is a description specific to the occurrence. Also, up to three contributing causes may be selected. Describe the corrective actions selected to prevent recurrence, including the reason why they were selected and how they will prevent recurrence. Collect additional information as necessary. Appendix B includes instructions and worksheets that may be u
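The selection rules in this step (one direct cause, one root cause, no more than three contributing causes, and a root cause description that does not simply repeat the direct cause) are easy to check mechanically. The Python sketch below is a hypothetical helper, not part of any reporting system; the sample entries are loosely based on Example 1 in Appendix C, and the pairing of codes with descriptions is an assumption made for illustration.

```python
# Illustrative check of the cause-selection rules described above.
# check_cause_selection() and the sample entries are hypothetical.

def check_cause_selection(direct, root, contributing):
    """Each argument is a list of (cause_code, description) tuples.
    Returns a list of rule violations; an empty list means none found."""
    problems = []
    if len(direct) != 1:
        problems.append("select exactly one direct cause")
    if len(root) != 1:
        problems.append("select exactly one root cause")
    if len(contributing) > 3:
        problems.append("select at most three contributing causes")
    if len(direct) == 1 and len(root) == 1:
        same_text = direct[0][1].strip().lower() == root[0][1].strip().lower()
        if same_text:
            problems.append("root cause description repeats the direct cause")
    return problems

direct = [("2A", "Pump was not drained before removal from the hot cell")]
root = [("6B", "Permit checklist was developed without reviewing the SAR hazard")]
contributing = [("5A", "Technician and operator were not trained on the hazard")]

print(check_cause_selection(direct, root, contributing))
```

A text comparison can only catch a verbatim repeat; whether the root cause genuinely explains the direct cause still has to be judged by the analyst.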
49. nsider conducting a walk-through as part of this interview if time permits. Although preparing for the interview is important, it should not delay prompt contact with participants and witnesses. The first interview may consist solely of hearing their narrative. A second, more detailed interview can be arranged if needed. The interviewer should always consider the interviewee's objectivity and frame of reference. Interviewing others: Consider interviewing other personnel who have performed the job in the past. Consider using a walk-through as part of the interview. Reviewing records: Review relevant documents or portions of documents as necessary and reference their use in support of the root cause analysis. Record appropriate dates and times associated with the occurrence on the documents reviewed. Examples of documents include the following: operating logs; correspondence; inspection/surveillance records; maintenance records; meeting minutes; computer process data; procedures and instructions; vendor manuals; drawings and specifications; functional retest specification and results; equipment history records; design basis information; Safety Analysis Report (SAR)/Technical Specifications; related quality control evaluation reports; Operational Safety Requirements; Safety Performance Measurement System/Occurrence Reporting and Processing System (SPMS/ORPS) reports; radiological surveys; trend charts and graphs; facility parame
50. occurrences, and a relatively low-level effort should be adequate for most off-normal occurrences. In any case, the depth of analysis should be adequate to explain why the occurrence happened, determine how to prevent recurrence, and assign responsibility for corrective actions. An inordinate amount of effort to pursue the causal path is not expected if the significance of the occurrence is minor. A high-level effort includes use and documentation of formal root cause analysis to identify the upstream factors and the program deficiencies. Both Events and Causal Factor Analysis and MORT could be used together in an extensive investigation of the causal factor chain. An intermediate level might be a simple Barrier, Change, or Mini-MORT Analysis. A low-level effort may include only gathering information and drawing conclusions without documenting use of any formal analytical method. However, in most cases, a thorough knowledge and understanding of the root cause analytical methods is essential to conducting an adequate investigation and drawing correct conclusions, regardless of the selected level of effort. 5.2.1 Events and Causal Factor Analysis. Events and Causal Factor Analysis is used for multi-faceted problems or long, complex causal factor chains. The resulting chart is a cause-and-effects diagram that describes the time sequence of a series of tasks and/or actions and the surrounding conditions leading to an event. The event line is a time sequence
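The graded level of effort described in this guidance (high for most emergencies, intermediate for most unusual occurrences, low for most off-normal occurrences) can be sketched as a simple lookup. This is only an illustrative sketch: the names DEFAULT_LEVEL, METHODS, and suggest_effort are hypothetical and do not come from the guidance, and the mapping is a default that an investigator may override based on the significance of the occurrence.

```python
# Default effort level per occurrence category. The guidance says "most"
# occurrences of each category, so treat these as starting points only.
DEFAULT_LEVEL = {
    "emergency": "high",
    "unusual": "intermediate",
    "off-normal": "low",
}

# Example methods the guidance associates with each level of effort.
METHODS = {
    "high": ["Events and Causal Factor Analysis", "MORT"],
    "intermediate": ["Barrier Analysis", "Change Analysis", "Mini-MORT"],
    "low": ["Gather information and draw conclusions (no formal method documented)"],
}


def suggest_effort(category):
    """Return (level, candidate methods) for an occurrence category."""
    level = DEFAULT_LEVEL[category.lower()]
    return level, METHODS[level]


if __name__ == "__main__":
    for category in DEFAULT_LEVEL:
        level, methods = suggest_effort(category)
        print(f"{category}: {level} -> {', '.join(methods)}")
```

A near-fatal off-normal occurrence, for example, may still warrant a higher level than this default suggests.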
51. oroughly familiarize yourself with the guide and decide exactly what information you are going to record and how you will record it. You may want to check off each step and the controls or displays used as they occur. Discrepancies and problems may be noted in the margin or in a space provided for comments adjacent to the step. Select personnel who normally perform the task. If the task is performed by a crew, crew members should play the same role they fulfill when carrying out the task. Observe personnel walking through the task and record their actions and use of displays and controls. Note discrepancies and problem areas. You should observe the task as it is normally carried out; however, if necessary, you may stop the task to gain full understanding of all steps. Conducting the task as closely as possible to the conditions that existed when the event occurred will provide the best understanding of the event causal factors. Summarize and consolidate any problem areas noted. Identify probable contributors to the event. CAUSE AND EFFECT CHART. Figure D-1 shows the conceptual process of cause and effect charting. Figure D-2 shows a sample cause and effect chart. The primary effect given on the chart is the problem you are trying to prevent from recurring. To complete the cause and effect chart: 1. Identify the cause and effect, starting with the primary effect. For each effect there is a cause that then becomes the next effect for which you need
52. ose using this method, a further description is not included in this document. 6. PHASE III - CORRECTIVE ACTIONS. The root cause analysis enables the improvement of reliability and safety by selecting and implementing effective corrective actions. To begin, identify the corrective action for each cause; then apply the following criteria to the corrective actions to ensure they are viable. If the corrective actions are not viable, re-evaluate the solutions. 1. Will the corrective action prevent recurrence? 2. Is the corrective action feasible? 3. Does the corrective action allow meeting primary objectives or mission? 4. Does the corrective action introduce new risks? Are the assumed risks clearly stated? (The safety of other systems must not be degraded by the proposed corrective action.) 5. Were the immediate actions taken appropriate and effective? A systems approach, such as Kepner-Tregoe, should be used in determining appropriate corrective actions. It should consider not only the impact they will have on preventing recurrence, but also the potential that the corrective actions may actually degrade some other aspect of nuclear safety. Also, the impact the corrective actions will have on other facilities and their operations should be considered. The proposed corrective actions must be compatible with facility commitments and other obligations. In addition, those affected by or responsible for any part of the corrective actions, including
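The five viability criteria listed in this excerpt can be applied as a checklist. The sketch below is a hypothetical illustration only: the class and field names are assumptions made here, and criterion 4 is simplified into a single yes/no judgment that any new risks are clearly stated and do not degrade the safety of other systems.

```python
from dataclasses import dataclass


@dataclass
class CorrectiveActionReview:
    """Reviewer's yes/no answers for one proposed corrective action."""
    prevents_recurrence: bool
    feasible: bool
    meets_mission: bool
    new_risks_acceptable: bool  # assumption: risks stated, other systems not degraded
    immediate_actions_effective: bool


# Field name -> the criterion it answers, in the order given in the text.
CRITERIA = {
    "prevents_recurrence": "Will the corrective action prevent recurrence?",
    "feasible": "Is the corrective action feasible?",
    "meets_mission": "Does it allow meeting primary objectives or mission?",
    "new_risks_acceptable": "Are new risks clearly stated and acceptable?",
    "immediate_actions_effective": "Were the immediate actions appropriate and effective?",
}


def failed_criteria(review):
    """Return the criteria the action fails; an empty list means viable."""
    return [text for name, text in CRITERIA.items() if not getattr(review, name)]


if __name__ == "__main__":
    review = CorrectiveActionReview(True, False, True, True, True)
    for question in failed_criteria(review):
        print("Re-evaluate:", question)
```

Returning the failed questions, instead of a single pass/fail flag, shows which part of the solution needs re-evaluation.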
53. part of the work environment that needs to be evaluated for each of these steps. Common problems that need to be considered are: cognitive overload; cognitive underload (boredom); habit intrusion; lapse of memory/recall; spatial misorientation; mindset (preconceived idea); tunnel vision, or lack of big picture; unawareness; wrong assumptions made; reflex (instinctive) action; thinking and actions not coordinated; insufficient degree of attention applied; shortcuts evoked to complete job; complacency (lack of perceived need for concern); confusion; misdiagnosis; fear of failure/consequences; tired/fatigued. Where high risk is very sensitive to noncompliance with requirements, each of the human performance factors should be considered in order to achieve a high degree of reliability. These factors also should be considered in system design, control, and operator training, as well as in causal factor determination and corrective action decisions. CONCLUDING MATERIAL Review Activity Preparing Activity DOE Field Offices DOE EH 52 DP AL EH CH EM ID NE NV ER OAK Project Number EM OH NE OR 6910 0060 ER HF RL
54. placing an R, C, or D in the appropriate box. The same cause may be listed for up to four similar findings (for example, four different failed parts). b. Under cause description, reference each cause with the code and Roman numeral from the matrix and describe each cause (explain how it was related to the occurrence). c. Under recommended corrective actions, list the action intended to correct each cause to prevent recurrence. 3. Transfer the direct, the root, and up to three contributing causes and the corrective actions to the Worksheet Summary. When there are more than three contributing causes, select those that result in the greatest and most widespread improvement when corrected. Note that even though only three contributing causes may be reported, corrective actions should be made for all identified causes. Use the ORPS PC software to transmit the results to the ORPS database. Refer to Appendix C for an example of how to use the worksheets. 1. Equipment/Material Worksheet. [ ] Applicable [ ] Not Applicable. Why was Equipment/Material a Cause? Equipment/Material Problem Subcategories: 1A Defective or Failed Part; 1B Defective or Failed Material; 1C Defective Weld, Braze, or Soldered Joint; 1D Error by Manufacturer in Shipping or Marking; 1E Electrical or Instrument Noise. Cause Descriptions. Rate each subcategory cause: D = Direct Cause, C
55. riefing management. Appendix D describes this technique. 5.2.2 Change Analysis. Change Analysis is used when the problem is obscure. It is a systematic process that is generally used for a single occurrence and focuses on elements that have changed. It compares the previous trouble-free activity with the occurrence to identify differences. These differences are subsequently evaluated to determine how they contributed to the occurrence. Appendix E describes this technique. 5.2.3 Barrier Analysis. Barrier Analysis is a systematic process that can be used to identify physical, administrative, and procedural barriers or controls that should have prevented the occurrence. This technique should be used to determine why these barriers or controls failed and what is needed to prevent recurrence. Appendix F describes this technique. 5.2.4 Management Oversight and Risk Tree (MORT). MORT/Mini-MORT is used to prevent oversight in the identification of causal factors. It lists, on the left side of the tree, specific factors relating to the occurrence and, on the right side of the tree, the management deficiencies that permit specific factors to exist. The management factors all support each of the specific barrier/control factors. Included is a set of questions to be asked for each of the factors on the tree. As such, it is useful in preventing oversight and ensuring that all potential causal factors are considered. It is especially useful w
56. sed to collect and summarize data. Appendix C contains examples of root cause analyses. 3. Enter the occurrence report using ORPS. Enter the occurrence report into ORPS, using the ORPS User's Manual as necessary. When entering the cause code data using ORPS PC Software, match your direct cause, root cause, and each of the contributing causes with one of the cause categories given in Appendix A (also available through a HELP screen). 5.2 Root Cause Methods. A number of methods for performing root cause analysis are given in references 3 through 17. Many of these methods are specialized and apply to specific situations or objectives. Most have their own cause categorizations, but all are very effective when used within the scope for which they were designed. The most common methods are: Events and Causal Factor Analysis; Change Analysis; Barrier Analysis; Management Oversight and Risk Tree (MORT) Analysis; Human Performance Evaluation; Kepner-Tregoe Problem Solving and Decision Making. A summary of the most common root cause methods, when it is appropriate to use each method, and the advantages/disadvantages of each are given in Figure 2 and Table 1. The extent to which these methods are used and the level of analytical effort spent on root cause analysis should be commensurate with the significance of the occurrence. A high-level effort should be spent on most emergencies, an intermediate level should be spent on most unusual
57. t is shown in Figure G-1. This chart is a checklist of what happened (less-than-adequate specific barriers and controls) and why it happened (less-than-adequate management). To perform the MORT analysis: 1. Identify the problem associated with the occurrence and list it as the top event. 2. Identify the elements on the "what" side of the tree that describe what happened in the occurrence (what barrier or control problems existed). 3. For each barrier or control problem, identify the management elements on the "why" side of the tree that permitted the barrier/control problem. 4. Describe each of the identified inadequate elements (problems) and summarize your findings. These findings can then be related to the ORPS cause codes using the worksheets in Appendix B. For critical self-assessment (not an ORPS requirement), the findings can also be related to MORT elements given in Figure G-2, MORT-Based Root Cause Analysis Form. To do this, enter the findings in the left-hand column. Next, select the MORT elements from the top of the root cause form that most closely relate to the finding by placing a check in the column below the MORT elements and on the same line where the finding is listed (more than one element can be related to a single finding). Then sum the number of checks under each MORT element (the sum can be entered at the bottom of the page even though there is no place designated on the form). The relative number of checks under each MORT
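The check-and-sum step for the MORT-Based Root Cause Analysis Form can be sketched as a small tally. This is an illustrative sketch only: the finding labels and MORT element names below are placeholders invented for the example, not entries from Figure G-2, and tally_mort_checks is a hypothetical helper.

```python
from collections import Counter


def tally_mort_checks(form):
    """Sum the checks under each MORT element.

    `form` maps each finding (a row on the form) to the MORT elements
    (columns) checked for it. A finding may relate to several elements.
    """
    totals = Counter()
    for elements in form.values():
        totals.update(set(elements))  # at most one check per element per row
    return totals


if __name__ == "__main__":
    # Placeholder rows and columns, for illustration only.
    form = {
        "Finding A": ["Element X", "Element Y"],
        "Finding B": ["Element Y"],
        "Finding C": ["Element X", "Element Z"],
    }
    for element, checks in sorted(tally_mort_checks(form).items()):
        print(element, checks)
```

The column totals give the relative number of checks under each MORT element that the text refers to.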
58. ter readings; sample analysis and results (chemistry, radiological, air, etc.); work orders. Acquiring related information: Some additional information that an evaluator should consider when analyzing the causes includes the following: evaluating the need for laboratory tests, such as destructive/nondestructive failure analysis; viewing the physical layout of the system, component, or work area, developing layout sketches of the area, and taking photographs to better understand the condition; determining if operating experience information exists for similar events at other facilities; reviewing equipment supplier and manufacturer records to determine if correspondence has been received addressing this problem. 5. PHASE II - ASSESSMENT. The assessment phase includes analyzing the data to identify the causal factors, summarizing the findings, and categorizing the findings by the cause categories specified in DOE Order 5000.3A (see Appendix A). The major cause categories are: Equipment/Material Problem; Procedure Problem; Personnel Error; Design Problem; Training Deficiency; Management Problem; External Phenomena. These categories have been carefully selected with the intent to address all problems that could arise in conducting DOE operations. Those elements necessary to perform any task are equipment/material, procedures (instructions), and personnel. Design and training determine the quality and effectiveness of equipm
59. the data associated with this occurrence have been collected, the data should be verified to ensure accuracy. The investigation may be enhanced if some physical evidence is retained. Establishing a quarantine area, or the tagging and segregation of pieces and material, should be performed for failed equipment or components. The basic need is to determine the direct, contributing, and root causes so that effective corrective actions can be taken that will prevent recurrence. Some areas to be considered when determining what information is needed include: activities related to the occurrence; initial or recurring problems; hardware (equipment) or software (programmatic-type issues) associated with the occurrence; recent administrative program or equipment changes; physical environment or circumstances. Some methods of gathering information include: Conducting interviews/collecting statements: Interviews must be fact finding and not fault finding. Preparing questions before the interview is essential to ensure that all necessary information is obtained. The causal factor worksheets in Appendix B can be used as a tool to help gather information. Interviews should be conducted, preferably in person, with those people who are most familiar with the problem. Individual statements could be obtained if time or the number of personnel involved make interviewing impractical. Interviews can be documented using any format desired by the interviewer. Co
60. valuating information and applies to all phases of the occurrence investigation process. Its focus on each phase helps keep them separate and distinct. The root cause phase is similar to change analysis. Phase III, Corrective Actions: Implementing effective corrective actions for each cause reduces the probability that a problem will recur and improves reliability and safety. Phase IV, Inform: Entering the report on the Occurrence Reporting and Processing System (ORPS) is part of the inform process. Also included is discussing and explaining the results of the analysis, including corrective actions, with management and personnel involved in the occurrence. In addition, consideration should be given to providing information of interest to other facilities. Phase V, Follow-up: Follow-up includes determining if corrective action has been effective in resolving problems. An effectiveness review is essential to ensure that corrective actions have been implemented and are preventing recurrence. Management involvement and adequate allocation of resources are essential to successful execution of the five root cause investigation and reporting phases. 2. DEFINITIONS (See DOE Order 5000.3A, Section 5). Facility: Any equipment, structure, system, process, or activity that fulfills a specific purpose. Examples include accelerators, storage areas, fusion research devices, nuclear reactors, production or processing plants, coal conversion plants, magnetohydro
61. ving problems. First, the corrective actions should be tracked to ensure that they have been properly implemented and are functioning as intended. Second, a periodic structured review of the corrective action tracking system, normal process and change control system, and occurrence tracking system should be conducted to ensure that past corrective actions have been effectively handled. The recurrence of the same or similar events must be identified and analyzed. If an occurrence recurs, the original occurrence should be re-evaluated to determine why corrective actions were not effective. Also, the new occurrence should be investigated using change analysis. The process change control system should be evaluated to determine what improvements are needed to keep up with changing conditions. Early indications of deteriorating conditions can be obtained from tracking and trend analyses of occurrence information. In addition, the ORPS database should be reviewed to identify good practices and lessons learned from other facilities. Prompt corrective actions should be taken to reverse deteriorating conditions or to apply lessons learned. 9. REFERENCES 1. DOE Order 5000.3A, Occurrence Reporting and Processing of Operations Information, U.S. Department of Energy, May 30, 1990. 2. User's Manual, Occurrence Reporting and Processing System (ORPS), Draft, DOE/ID-10319, EG&G Idaho, Inc., Idaho Falls, ID, 1991. 3. Accident/Incident Investigation Manual, SSDC 27, D
62. y have been expected to take the extra precautions. Note that in this case, as a minimum, corrective action should include review and revision, as appropriate, of other procedures and training operators to the new procedures. Further corrective action would include installation of fittings that make it impossible to hook up the wrong cylinder, a review of other hookups within the facility to correct similar problems, and the use of human factors (ergonomics) in configuration design and control. EXAMPLE 4: A large 2400-volt fan system blew a fuse. The electrician obtained a fuse from the store room, tagged out the switch, and replaced the fuse. The system would not work, so the electrician bypassed a safety interlock and used a meter to check the fuse. A large fireball erupted, causing burns that required hospitalization and 50 lost workdays. This was classified as an off-normal personnel safety occurrence (in-patient hospitalization). However, because this was a near fatality and because there existed a potential for significant programmatic impact, the investigation used formal Cause and Effects Analysis with charting to identify all of the contributing conditions and any weaknesses in programmatic or operational control. A condensed version of the working chart is given in Figure C-1. The significant findings are given below. The worksheets following the chart illustrate transferring the findings to the ORPS cause subcategories on the worksheets C