By ITIL® from Experience©
Logging (opening) a problem record for every incident whose cause is unknown is, in theory, a great idea. The arguments for doing so are compelling.¹ Knowing the root cause of every incident and addressing it can yield significant benefits, such as:
- Faster incident resolution through the use of known errors,² which reduces:
  - Impact to the business
  - Troubleshooting effort spent by I.T.
- A lower number of incidents, which:
  - Increases availability of services to the business
  - Reduces the effort I.T. spends maintaining services
Not surprisingly, organizations see Problem Management as a silver bullet for tilting the balance between running operations and meeting demand from the business, that is, for redistributing resources across what Gartner refers to as "run, grow, transform."³
Few organizations raise a problem for every unexplained incident. Those that do typically have rigid high-availability requirements or a corporate culture of uncompromising quality. Moreover, they have achieved a high level of operational proficiency and IT Service Management maturity.
ITIL® v3, in its 2011 Edition, states that when closing an incident we should ask whether it is an: "Ongoing or recurring problem? Determine (in conjunction with resolver groups) whether the incident was resolved without the root cause being identified. In this situation, it is likely that the incident could recur and require further preventive action to avoid this. In all such cases, determine if a problem record related to the incident has already been raised. If not, raise a new problem record in conjunction with the problem management process so that preventive action is initiated."⁴ (Emphasis is our own.)
However, most organizations should not open a problem for every unexplained incident if their people's problem-solving skills and their process capabilities are not adequately developed. Without the proper enablers in place, problem records will quickly accumulate and the "silver bullet" expectation will certainly be missed.
To enable the organization to grow its Problem Management capabilities, start small. Focus on a recurring pain point. For example, mandate that a problem be opened for every unexplained incident related to a specific Configuration Item (CI) or to a group of critical, high-impact, fragile CIs. However, if this would generate more problem records than the process has the capacity to address efficiently, consider doing it only for Major Incidents (see "Should a Problem be opened for every Major Incident"); hopefully these are few and far enough apart to allow the process to complete before another Major Incident occurs.
A well-defined approach with a set of decision criteria is recommended and is preferable to the suggestion in ITIL's 2011 Edition quoted above. In fact, this recommendation aligns with ITIL's 2007 Edition and ITIL v2, whereby the decision to create a problem record is made, if warranted, while closing the incident.⁵
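As an illustration only, the kind of decision criteria described above could be written down as a simple check applied when an incident is closed. This is a minimal sketch, not a prescribed ITIL rule set: the `Incident` fields, the `CRITICAL_CIS` names, and the `major_only` switch are all hypothetical and would be replaced by whatever scope the organization agrees on.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical scope: the CIs chosen as the initial pain point.
CRITICAL_CIS = {"payroll-db", "core-router-01"}


@dataclass
class Incident:
    ci: str                                # Configuration Item the incident relates to
    root_cause_known: bool                 # was the cause identified during resolution?
    is_major: bool = False                 # handled as a Major Incident?
    linked_problem: Optional[str] = None   # ID of an existing problem record, if any


def should_open_problem(incident: Incident, major_only: bool = False) -> bool:
    """Decide, while closing the incident, whether to raise a new problem record."""
    if incident.root_cause_known:
        return False                       # nothing left to investigate
    if incident.linked_problem is not None:
        return False                       # a problem record already exists
    if major_only:
        # Assumed fallback scope when process capacity is limited.
        return incident.is_major
    return incident.ci in CRITICAL_CIS     # assumed scope: selected critical CIs only
```

With these assumed criteria, an unexplained incident on `payroll-db` would raise a problem, while one on a CI outside the agreed scope would not. Switching `major_only` on narrows the scope to Major Incidents when the process cannot keep up.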
Being selective keeps Problem Management from becoming the overflow of Incident Management. A precise, well-defined scope also makes it easier to avoid taking on too much work and thus paralyzing the process. In addition, it prevents the emergence of an attitude where people give up on resolving incidents and simply create a problem to pass the work on to another team. Equally important, when the same resources are responsible for incidents and problems, too many open problems will affect the performance of the Incident Management process, as resources may be diverted to address problems.
In summary, organizations must have the right capabilities to efficiently address every problem opened for each unexplained incident. Organizations should therefore start with a well-defined implementation approach that phases in Problem Management at a pace matched to the process's capacity as these capabilities develop. An implementation roadmap helps to communicate the approach and set expectations.
- How to first introduce Problem Management
- Should a Problem be opened for every Major Incident
- Which problem management technique should we use
- The incident was resolved by replacing the hardware. Should the incident be kept open to manage the repair process
Copyright 2014 - ITIL® from Experience - D.Matte