Project problems? Use a fault tree analysis to get back on track

Georgina Guthrie

January 06, 2022

We all know that sinking feeling when we realize a project hasn’t gone to plan. But rather than wasting time beating yourself up, embrace the opportunity to work out what went wrong and improve. Doing this shows that you’re accepting accountability and working hard to get back on track, which is a sign of great leadership and resilience.

But what if you want to address potential obstacles before they occur? This is where a fault tree analysis (FTA) comes in handy.

What is a fault tree analysis?

A fault tree analysis is a top-down method for determining the current or potential causes of a failure. It starts from a single undesired outcome, the top-level event, and gives you a visual representation of the events that could lead to it, including the relationships between them. Presenting this information in a diagram helps you logically pinpoint weaknesses and identify sources of failure.

But it’s more than a reactive tool. A fault tree guides you through a series of events or systems, so you can identify areas where things could go wrong. Within this analysis, failure could be the product of hardware, software, or human error.

Fault trees rely on Boolean logic, a form of algebra that works with true and false values. The basic concept involves three operators (And, Or, and Not), which describe the conditions that must be met for an event in the series to occur.
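
To make the three operators concrete, here is a minimal Python sketch. The scenario (a missed deadline caused by a staffing gap or a missing requirements sign-off) and every name in it are hypothetical and chosen only for illustration.

```python
def deadline_missed(key_developer_unavailable: bool,
                    backup_developer_available: bool,
                    requirements_signed_off: bool) -> bool:
    """Return True if the hypothetical top-level event 'deadline missed' occurs."""
    # Not: the absence of a backup developer is itself a contributing condition.
    no_backup = not backup_developer_available
    # And: staffing only causes a failure if both conditions hold together.
    staffing_gap = key_developer_unavailable and no_backup
    # Or: either branch on its own is enough to trigger the top-level event.
    return staffing_gap or not requirements_signed_off


print(deadline_missed(True, False, True))   # staffing gap alone -> True
print(deadline_missed(True, True, True))    # backup covers the gap -> False
print(deadline_missed(False, True, False))  # missing sign-off alone -> True
```

Each line of the function corresponds to one gate you would draw in the diagram, which is why a fault tree can be read as a Boolean expression.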

If you aren’t clear on what this means yet, don’t worry. We’ll discuss this topic in more detail later.

Types of fault tree analysis

The type of evaluation method you should use largely depends on the application. Qualitative fault tree analysis examines the structure of the tree: which components the system depends on and which combinations of events are enough to cause the top-level failure. The goal is to increase system reliability by lowering the system’s dependency on high-risk components, such as single points of failure.

Quantitative fault tree analysis measures risk probability. This approach assigns a probability or projected failure rate to each event and considers how much a particular path contributes to the likelihood of the top-level failure. By weighing the quantitative value of a path or sequence of events, you can determine what conditions are most likely to cause the system to fail.
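
As a rough illustration of the quantitative approach, the sketch below combines event probabilities through And and Or gates. It assumes the input events are statistically independent, which real systems often are not, and the probabilities are made-up figures rather than data from any project.

```python
from math import prod


def and_gate_probability(probabilities):
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate_probability(probabilities):
    """At least one input occurs: 1 minus the chance that none of them occur."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical figures: key developer unavailable (0.10), no backup (0.20),
# and requirements changed late (0.05).
p_staffing_gap = and_gate_probability([0.10, 0.20])
p_deadline_missed = or_gate_probability([p_staffing_gap, 0.05])

print(f"{p_staffing_gap:.3f}")     # 0.020
print(f"{p_deadline_missed:.3f}")  # 0.069
```

In this example the late requirements change contributes more to the top-level probability than the staffing gap, so it would be the first path to mitigate.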

A brief history of fault tree analysis

FTAs originated as an engineering tool to forecast machine or software failure. In the early 1960s, a team at Bell Telephone Laboratories, led by H.A. Watson, developed fault tree analysis as a safety analysis method for the Minuteman missile system.

During the development of the intercontinental ballistic missile (ICBM), the U.S. Air Force needed a way to improve the reliability of handling this high-risk technology. The goal was to analyze a potential combination of events that could trigger the top-level event. The fault tree method made it possible to predict the likelihood of system failure in specific occurrences and then reduce the probability of that event actually happening.

Building on the research of Bell Labs, Dave Haasl of Boeing applied fault tree analysis as a general safety evaluation tool. Collectively, their work formed the basis of a handbook for system analysis in industries such as aerospace, nuclear power, chemical engineering, and manufacturing.

Fault tree analysis has evolved into a more mainstream risk management tool. For instance, software developers can use it to identify system errors while testing a product before it goes to market and for troubleshooting after it’s released to the public.

How can a fault tree analysis help you?

Forecasting helps you locate errors and then prioritize tasks to fix the issue(s).
In-depth analysis improves the viability of future projects. For example, if you’re repeating a task or developing a similar product, you can build upon a previous FTA to identify and avoid problems.
Identifying risks early allows you to design high-quality systems and maintenance procedures.
Presenting information in a diagram means everyone can understand the relationships between different events at a glance.
Creating a visual record of your analysis is useful for your team and stakeholders, who may need to refer to it at a later date.

Who needs a fault tree analysis?

FTAs are most useful for companies whose work involves complex systems with a high risk of danger or toxic contamination. This is especially true in industries where failure can have a massive impact, such as aeronautics, defense, petrochemicals, environmental protection, or nuclear power.

FTA is also increasingly valuable in industries that depend on well-maintained software systems, such as cybersecurity, tech, finance, healthcare, and software engineering. Although it could apply to broader business analysis, there are many other methodologies better suited for project or organizational assessments.

Fault tree diagram symbols explained

Fault tree diagram symbols fall into two categories: events and gates. You’ll find the different symbols and their meanings below.

Figure: Fault tree analysis symbols (created in Cacoo)

Events

A Basic Event is a failure or error at the lowest level of the tree that requires no additional analysis.
An External Event shows something that is normally expected to occur.
An Undeveloped Event is an event that isn’t analyzed further because information is unavailable or the event is considered unimportant.
A Conditioning Event indicates a restriction that’s applied to a logic gate.
An Intermediate Event is an event that results from a combination of other events, so it sits above a gate rather than at the bottom of the tree.
Transfer In/Out refers to a transfer to a related fault tree.

Gates

An Or Gate means the output event occurs if one or more of the input events occur.
The And Gate means the output event occurs only if all of the input events occur.
The Exclusive Or Gate means the output event occurs only if exactly one of the input events occurs.
The Priority And Gate means the output event occurs only if all of the input events occur in a specific order.
The Inhibit Gate means the output event occurs only if all input events take place and the condition described in the conditioning event is also met.
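
The gate definitions above can be sketched as small Python functions. This is one possible reading of each gate, not a standard library or an official notation. The function names are hypothetical, and the Priority And gate is modeled here by passing the time at which each input occurred, listed in the required order.

```python
from typing import Optional, Sequence


def or_gate(inputs: Sequence[bool]) -> bool:
    """Output occurs if one or more inputs occur."""
    return any(inputs)


def and_gate(inputs: Sequence[bool]) -> bool:
    """Output occurs only if all inputs occur."""
    return all(inputs)


def exclusive_or_gate(inputs: Sequence[bool]) -> bool:
    """Output occurs only if exactly one input occurs."""
    return sum(bool(value) for value in inputs) == 1


def priority_and_gate(occurred_at: Sequence[Optional[float]]) -> bool:
    """Output occurs only if every input occurred, in the listed order.

    occurred_at holds each input's time of occurrence; None means it never occurred.
    """
    if any(time is None for time in occurred_at):
        return False
    return all(earlier < later
               for earlier, later in zip(occurred_at, occurred_at[1:]))


def inhibit_gate(inputs: Sequence[bool], condition_met: bool) -> bool:
    """Output occurs only if all inputs occur and the conditioning event holds."""
    return all(inputs) and condition_met


print(exclusive_or_gate([True, True, False]))    # False: two inputs occurred
print(priority_and_gate([1.0, 2.5, 4.0]))        # True: inputs occurred in order
print(priority_and_gate([2.5, 1.0, 4.0]))        # False: first two are out of order
print(inhibit_gate([True], condition_met=False)) # False: condition not met
```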

How to make a fault tree analysis diagram

1. First, you need to define your system and work out what constitutes failure. This is especially important in software or engineering, where one element could fail, but the rest of the system could potentially continue operating. There should be one top-level failure per fault tree.

2. Next, add the input events that contribute, or could contribute, to the top-level fault. This is where you insert logic gate symbols. Use them to show which causes lead to the failure on their own and which must combine with other events before they cause a failure.

Tip: use diagramming software to create your fault tree quickly and easily with the help of a pre-made template.

3. Now analyze the fault tree diagram to determine the actual or potential root cause(s) of the top-level failure. Identify likely events that lead to failure or initiate paths that lead to issues. Then work out ways to resolve or mitigate these paths. You may want to combine this stage with a cause and effect diagram to help you determine each cause.

4. Finally, define and launch your plan. Alongside your basic symbols, you can note down the probability of each event contributing to the top-level failure, and then add actionable items or create a risk management plan.
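
The steps above can be sketched in code. The example below builds a small tree with And and Or gates, then lists its minimal cut sets, meaning the smallest combinations of basic events that are enough to cause the top-level failure. The `Event` class, the `minimal_cut_sets` function, and the example events are all hypothetical, and the approach is a simplified illustration rather than a production-grade algorithm.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional


@dataclass
class Event:
    name: str
    gate: Optional[str] = None  # "and", "or", or None for a basic event
    inputs: List["Event"] = field(default_factory=list)


def minimal_cut_sets(event: Event) -> set:
    """Return the minimal cut sets as a set of frozensets of basic-event names."""
    if event.gate is None:
        return {frozenset([event.name])}
    child_sets = [minimal_cut_sets(child) for child in event.inputs]
    if event.gate == "or":
        # Any single input is enough, so keep every input's cut sets.
        combined = set().union(*child_sets)
    elif event.gate == "and":
        # Every input is needed, so merge one cut set from each input.
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"Unsupported gate: {event.gate}")
    # Drop any cut set that strictly contains a smaller one.
    return {s for s in combined if not any(other < s for other in combined)}


# Step 1: one top-level failure. Step 2: inputs connected through gates.
staffing_gap = Event("Staffing gap", "and", [
    Event("Key developer unavailable"),
    Event("No backup developer"),
])
top = Event("Deadline missed", "or", [
    staffing_gap,
    Event("Requirements changed late"),
])

# Step 3: analyze. A cut set with a single event is a single point of failure.
for cut_set in sorted(minimal_cut_sets(top), key=len):
    print(sorted(cut_set))
# ['Requirements changed late']
# ['Key developer unavailable', 'No backup developer']
```

Cut sets with the fewest events usually deserve attention first, because fewer things have to go wrong for the top-level failure to occur.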

Final thoughts

You can create a fault tree analysis with a wide range of tools, but by far the easiest option is a dedicated online diagramming tool. Creating your diagram in the cloud puts the latest version at your team’s fingertips, along with real-time updates and feedback whenever someone adds a comment or edits the diagram. As a result, you can spend less time sending out status emails and more time collaborating with your team.

This post was originally published on July 5, 2019, and updated most recently on January 6, 2022.
