Problem Management Process
ITIL® 2011 Service Operation Process and Policy Pack v2 ©CertiKit ITIL is a registered trade mark of AXELOS Limited
Implementation guidance
The header page and this section, up to and including Disclaimer, must be removed from the final version of the document. For more details on replacing the logo, yellow highlighted text and certain generic terms, see the Completion Instructions document.

Purpose of this document
This document sets out the problem management process including flowchart, activities, reporting and roles and responsibilities.

Areas of the ITIL® Framework addressed
The following areas of the ITIL Framework are addressed by this document:
• Service Operation: Problem Management

General guidance
Of all the ITIL processes, problem management probably has the greatest potential to directly affect the service you provide to users. There are many techniques that may be used to analyse and investigate problems and we would recommend that you become familiar with as many of them as possible.

In implementing problem management two different approaches are often taken. The first involves the creation of a central problem management team which is responsible for logging and managing problems. The second takes the view that problem management is an activity that everyone in the IT team should take responsibility for and encourages a more distributed process with less central co-ordination. As with many such decisions, whichever is chosen, the real key lies in the degree of commitment shown to making it work.

Review frequency
We would recommend that this document is reviewed annually.
Version 1
Page 2 of 37
[Insert date]
Document fields
This document may contain fields which need to be updated with your own information, including a field for Organization Name that is linked to the custom document property “Organization Name”. To update this field (and any others that may exist in this document):
1. Update the custom document property “Organization Name” by clicking File > Info > Properties > Advanced Properties > Custom > Organization Name.
2. Press Ctrl A on the keyboard to select all text in the document (or use Select, Select All via the Editing header on the Home tab).
3. Press F9 on the keyboard to update all fields.
4. When prompted, choose the option to just update TOC page numbers.

If you wish to permanently convert the fields in this document to text, for instance, so that they are no longer updateable, you will need to click into each occurrence of the field and press Ctrl Shift F9.

If you would like to make all fields in the document visible, go to File > Options > Advanced > Show document content > Field shading and set this to “Always”. This can be useful to check you have updated all fields correctly.

Further detail on the above procedure can be found in the toolkit Completion Instructions. This document also contains guidance on working with the toolkit documents with an Apple Mac, and in Google Docs/Sheets.

Copyright notice
Except for any specifically identified third-party works included, this document has been authored by CertiKit, and is ©CertiKit except as stated below. CertiKit is a company registered in England and Wales with company number 6432088.

Licence terms
This document is licensed on and subject to the standard licence terms of CertiKit, available on request, or by download from our website. All other rights are reserved. Unless you have purchased this product you only have an evaluation licence. If this product was purchased, a full licence is granted to the person identified as the licensee in the relevant purchase order. The standard licence terms include special terms relating to any third-party copyright included in this document.
Disclaimer
Please Note: Your use of and reliance on this document template is at your sole risk. Document templates are intended to be used as a starting point only from which you will create your own document and to which you will apply all reasonable quality checks before use. Therefore, please note that it is your responsibility to ensure that the content of any document you create that is based on our templates is correct and appropriate for your needs and complies with relevant laws in your country. You should take all reasonable and proper legal and other professional advice before using this document.

CertiKit makes no claims, promises, or guarantees about the accuracy, completeness or adequacy of our document templates; assumes no duty of care to any person with respect to its document templates or their contents; and expressly excludes and disclaims liability for any cost, expense, loss or damage suffered or incurred in reliance on our document templates, or in expectation of our document templates meeting your needs, including (without limitation) as a result of misstatements, errors and omissions in their contents.
Problem Management Process
Version 1
DOCUMENT REF      ITILSO0402
VERSION           1
DATED             [Insert date]
DOCUMENT AUTHOR   [Insert name]
DOCUMENT OWNER    [Insert name/role]
Revision history
VERSION   DATE   REVISION AUTHOR   SUMMARY OF CHANGES

Distribution
NAME   TITLE

Approval
NAME   POSITION   SIGNATURE   DATE
Contents
1 Introduction
  1.1 Vision statement
  1.2 Purpose
  1.3 Objectives
  1.4 Scope
2 Problem management process
  2.1 Overview and process diagram
  2.2 Process triggers
  2.3 Process inputs
  2.4 Process activities
    2.4.1 Problem detection
    2.4.2 Problem logging
    2.4.3 Problem categorisation
    2.4.4 Problem prioritisation
    2.4.5 Problem investigation and diagnosis
    2.4.6 Workaround
    2.4.7 Raise known error record if required
    2.4.8 Change Request
    2.4.9 Problem resolution and closure
    2.4.10 Major problem review
  2.5 Process outputs
  2.6 Problem management tools
    2.6.1 Service desk system
    2.6.2 Problem analysis and investigation tools
    2.6.3 Email
    2.6.4 Configuration management system
  2.7 Communication and training
    2.7.1 Communication with users
    2.7.2 Communication with customers
    2.7.3 Communication with IT teams
    2.7.4 Communication with suppliers
    2.7.5 Process performance
    2.7.6 Communication related to changes
    2.7.7 Training for problem management
3 Roles and responsibilities
  3.1 Operational roles
  3.2 RACI matrix
  3.3 Problem management process owner
  3.4 Problem management process manager
  3.5 Problem identifier
  3.6 Problem analyst
  3.7 Third-line resource
4 Associated documentation
5 Interfaces and dependencies
  5.1 Other service management processes
  5.2 Business processes
6 Process measurements and metrics
  6.1 Critical success factors
  6.2 Key performance indicators
  6.3 Process reviews and audits
7 Process reporting
  7.1 Process reports
  7.2 Operational reports
8 Glossary, abbreviations and references
  8.1 Glossary
  8.2 Abbreviations
  8.3 References

Figures
Figure 1: Problem management process

Tables
Table 1: Determination of priority
Table 2: Priority definitions
Table 3: RACI matrix
Table 4: Associated documentation
Table 5: Interfaces with other service management processes
Table 6: Interfaces with business processes
Table 7: Critical success factors
Table 8: Key performance indicators
Table 9: Process reports
Table 10: Operational reports
Table 11: Glossary of relevant terms
1 Introduction

1.1 Vision statement
The vision of [Service Provider] in the area of service management is as follows:
[Insert the vision statement defined as part of service strategy]
This process forms a key part of the realisation of that vision.

1.2 Purpose
In order to reduce the number and frequency of incidents and improve the level of service to users, it is essential that the causes of incidents are investigated and, through managed actions, permanently removed. Problem management has the potential to not only make services better but to reduce the support overhead of providing them, so minimising cost and maximising warranty. It is important therefore that it is carried out according to a clear, well designed process.

This document defines how the process of problem management is implemented within [Organization Name].

The purpose of the problem management process according to ITIL® is:
“… to manage the lifecycle of all problems from first identification through further investigation, documentation and eventual removal.”
Source: ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011. Reproduced under license from AXELOS.

A problem is defined as:
“… the underlying cause of one or more incidents.”
Source: ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011. Reproduced under license from AXELOS.
1.3 Objectives
The objectives of the problem management process are to:
• Proactively prevent incidents from occurring by identifying and fixing their root cause
• Minimise the impact of incidents that cannot be prevented by providing information about their causes and workarounds
• Define the way in which problems will be identified, logged, investigated, resolved and reported on so that consistency is achieved within the IT organization
• Ensure that the management and investigation of problems takes due account of business priorities and helps to maximise business productivity
• Foster an effective and efficient approach to the handling of problems that presents a positive image to the business and maintains user satisfaction
• Ensure that information about problems and their progress is communicated to the relevant parties in a timely and accurate manner at all times
1.4 Scope
The scope of this process is defined according to the following parameters:
• Organizational
  o [List organizations and parts of those organizations covered]
• Geographical
  o [List locations from which problems will be identified and managed]
• Services
  o [Define the services covered by the process]
• Technical
  o [If necessary, cover the technology that may give rise to problems managed via this process]

This process covers all problems identified by [Service Provider] in support of the customers and users of services defined in the service catalogue.

The following areas are specifically excluded from this process:
[Describe any areas that need to be clearly stated as outside the scope]
2 Problem management process

2.1 Overview and process diagram
The process of problem management is shown in Figure 1 and summarised below.

Problems may be identified from many sources, usually either as a result of ongoing or recent incidents (reactive problem management) or from a retrospective analysis of historical incidents (proactive problem management). Once detected, problems need to be logged, categorised and prioritised in a similar way to incidents.

A problem will then be investigated, potentially using a variety of available approaches and tools, including chronological analysis, pain value analysis, brainstorming and 5-Whys. The purpose of the investigation is to identify the root cause of the problem which, if fixed, will eliminate incidents arising from it, so improving service to users.

If a workaround is already known or becomes available as a result of the diagnosis of the problem, then this may be communicated to the incident management process for use until the problem is fixed. Once the problem has been successfully diagnosed and the root cause found, a known error record is raised in the known error database.

If a change is needed to resolve the root cause of the problem, then a change request will be raised via the change management process. The problem will then be resolved, and the resolution tested to ensure that it was successful. If fixed, the problem will then be closed and, if the severity of the problem justifies it, a major problem review will be carried out to identify lessons learned and any additional improvement actions.
Figure 1: Problem management process
2.2 Process triggers
The problem management process is initiated as a result of one or more of the following triggers:
• As a reaction to one or more incidents with similar symptoms occurring for which the cause is not currently known. This may be recognised by:
  o The service desk
  o Second line
  o Third line
  o Suppliers
  o Customers
  o Users
  o Other source or stakeholder
• From information provided by the service transition stage regarding problems that have not been resolved prior to live running e.g. bugs in software or issues with configuration items
• As a result of a proactive analysis of previous incidents or message logs carried out with the intention of identifying common factors and trends worth investigating
2.3 Process inputs
The process of problem management requires a number of inputs in order to be able to function effectively. These may not always be available but will ideally be:
• Details of incidents related to the problem, including
  o Number of incidents
  o Dates and times of incidents
  o Categorisations
  o Impacts
  o Symptoms
  o Actions carried out so far with results
• Configuration Management System (CMS) records for relevant CIs
• Technical and business input to investigation and diagnosis sessions such as brainstorming
• Details of completion of requested changes from the change management process
• Feedback from incident management, users and other parties regarding whether the problem resolution has been successful
• Information from internal development teams and external suppliers regarding software and hardware problems that are known about but not yet fixed in the version in use within the organization
2.4 Process activities
The individual process activities at each step are detailed as follows.

2.4.1 Problem detection
In order to make the problem management function as effective as it can be it is important that problems are identified as early as possible. Problems may be identified from any source, including:
• IT team members
• Suppliers
• Monitoring tools
• Customers
• Users
• Analysis of incident records
All of the above will be encouraged to provide feedback to the problem management team about potential problems, including user perception of specific areas of service which may indicate an underlying problem. The problem management team will provide advice and guidance about whether an issue represents a valid problem. Problem management will make regular contact with the business in order to get team members to put forward potential problems for investigation. Often the user is aware of things that are not right but perseveres with them because that is the way it has always been. The problem manager will make efforts to get the business perspective on what they see as the main IT-related problems. Initially the problem manager will contact the Business Relationship Managers (BRMs) to create a first pass list and ask them to pass on the problem identification concept to those areas of the business they are in regular contact with. The problem manager will then liaise with the BRMs on an ongoing basis to log, assess, prioritise and investigate those problems that are brought forward. In addition, on a quarterly basis, an analysis of logged incidents will be performed by the problem manager in order to identify areas in which possible problems exist.
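The periodic analysis of logged incidents described above could be supported by a simple count of incidents per category. The following is a minimal Python sketch under stated assumptions: the function name, the category labels and the threshold of three incidents are all hypothetical and are not taken from this process or from any particular service desk system.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_categories(
    incident_categories: Iterable[str], threshold: int = 3
) -> List[Tuple[str, int]]:
    """Return (category, count) pairs seen at least `threshold` times, most frequent first."""
    counts = Counter(incident_categories)
    return [(category, n) for category, n in counts.most_common() if n >= threshold]


# Example with made-up categories exported from the incident log.
categories = ["email"] * 4 + ["vpn"] * 3 + ["printing"]
candidates = recurring_categories(categories)
```

Categories returned by such a count are only candidates for investigation; whether they represent a valid problem remains a judgement for the problem management team.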
2.4.2 Problem logging
It is important that, once identified, problems are recorded so that effort can be allocated to resolving them.
Upon a problem being recognised, a problem record will be created by the problem management team within the service desk system and populated with the references of the related incidents and the details of the symptoms of the problem, including:
• The business impact of the problem (quantified where possible)
• Users and user groups affected
• Any relevant information about the timing of the problem
• Possible causes identified so far
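For illustration, the contents of a problem record listed above could be modelled as a simple data structure. This is a sketch only: the class name, field names and reference formats are hypothetical and do not describe the schema of any specific service desk system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProblemRecord:
    """Hypothetical problem record holding the details listed above."""

    reference: str  # assumed identifier format, e.g. "PRB-0001"
    symptoms: str
    business_impact: str = ""  # quantified where possible
    affected_users: List[str] = field(default_factory=list)
    timing_notes: str = ""
    possible_causes: List[str] = field(default_factory=list)
    related_incidents: List[str] = field(default_factory=list)

    def link_incident(self, incident_ref: str) -> None:
        """Record a related incident reference, ignoring duplicates."""
        if incident_ref not in self.related_incidents:
            self.related_incidents.append(incident_ref)


# Example usage with made-up references.
record = ProblemRecord(reference="PRB-0001", symptoms="Intermittent login failures")
record.link_incident("INC-1001")
record.link_incident("INC-1002")
record.link_incident("INC-1001")  # already linked, so not added again
```

Keeping the incident references on the record means the number of linked incidents can serve as a running indication of how often the problem occurs.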
2.4.3 Problem categorisation
Three levels of categorisation will be used for problems. These will be the same as those used for incidents so that a degree of cross-referencing can be performed. Category hierarchies will be available within the service desk system and will be reviewed on a regular basis as part of process improvement activities. Changes to the categories will be managed carefully so that the implications for SLA reporting are understood and catered for accordingly.

The process manager will review on a regular basis the use of categories in logging problems to ensure that they are used consistently by all parties.
2.4.4 Problem prioritisation
The priority of a problem will determine the order in which it is addressed by problem managers and subsequent teams involved in its investigation. This will be based on a combination of two factors:
• Impact: A measure of the effect of a problem on business processes
• Urgency: A measure of how quickly the business needs the problem to be fixed

The priority should consider the benefits that will be achieved if we manage to resolve it (obviously not all problems will be resolvable). These benefits may take a number of forms but the main questions to be asked will be:
• How much will business disruption be reduced? (e.g. no. of man-hours p.a.)
• What effect will this have on our customers?
• How many incidents will we prevent p.a.?
• How much time will be saved in the IT team?
• What direct costs will we avoid?
• What effect will solving this problem have on staff morale?
These questions will allow a benefit profile to be created for the problem which will indicate how much effort it makes sense to put in to get it solved.
Both impact and urgency will be assessed on a scale of high, medium and low. The priority of a problem will then be calculated based on the rating of its urgency and impact as follows:

IMPACT/URGENCY   HIGH   MEDIUM   LOW
High             1      2        3
Medium           2      3        4
Low              3      4        5

Table 1: Determination of priority
The priority of a problem will be calculated automatically by the service desk system based on the above rules.

The definitions of each priority level are as follows:

PRIORITY   TITLE      DESCRIPTION
1          Critical   Significant delay or disruption to the business until the problem is fixed
2          High       Significant delay or disruption to parts of the business until the problem is fixed
3          Medium     Localised delay or disruption affecting one or more users
4          Low        Localised inconvenience affecting single user
5          Planning   Very minor inconvenience or non-urgent problem

Table 2: Priority definitions
There may be circumstances where a problem affecting a single user has a significant business impact, particularly if the user is a member of the senior management team or a high-value financial transaction is involved. The priority should therefore be set in consultation with the user.
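As an illustration only, the rules in Table 1 and the titles in Table 2 could be expressed as a simple lookup. This is a minimal Python sketch, not a description of how any particular service desk system performs the calculation. The function and variable names are hypothetical, and the sketch assumes the first key is impact and the second is urgency (the matrix is symmetric, so the result is the same either way).

```python
# Ratings used for both impact and urgency.
RATINGS = ("high", "medium", "low")

# Table 1 as a lookup: PRIORITY_MATRIX[impact][urgency] -> priority.
PRIORITY_MATRIX = {
    "high": {"high": 1, "medium": 2, "low": 3},
    "medium": {"high": 2, "medium": 3, "low": 4},
    "low": {"high": 3, "medium": 4, "low": 5},
}

# Priority titles from Table 2.
PRIORITY_TITLES = {1: "Critical", 2: "High", 3: "Medium", 4: "Low", 5: "Planning"}


def calculate_priority(impact: str, urgency: str) -> int:
    """Return the priority (1 to 5) for the given impact and urgency ratings."""
    impact_key = impact.strip().lower()
    urgency_key = urgency.strip().lower()
    if impact_key not in RATINGS or urgency_key not in RATINGS:
        raise ValueError("impact and urgency must each be high, medium or low")
    return PRIORITY_MATRIX[impact_key][urgency_key]


def describe_priority(impact: str, urgency: str) -> str:
    """Return the priority number together with its Table 2 title."""
    priority = calculate_priority(impact, urgency)
    return f"{priority} ({PRIORITY_TITLES[priority]})"
```

A calculated value of this kind is a starting point. Where a problem affecting a single user has a significant business impact, the impact rating itself would be raised in consultation with the user rather than the lookup being bypassed.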
2.4.5 Problem investigation and diagnosis
Once a problem has been logged, all activities performed with respect to that problem should be recorded as actions in the problem record e.g. adding notes, referring to supplier.

Where appropriate, one or more of the following techniques will be used by the problem management team to define the problem and its possible causes in more detail:
• Chronological Analysis
• Pain Value Analysis
• Kepner and Tregoe
• Brainstorming
• Ishikawa Diagrams
• Pareto Analysis
If the problem management team cannot resolve the problem, they may opt to escalate it further e.g. to a third line team or an external supplier. In this case the problem remains with the problem management team and it is the problem team member’s responsibility to ensure that the problem is updated on a regular basis based on feedback from the support team or external supplier.
2.4.6 Workaround
Any workarounds found which reduce or eliminate the symptoms of the problem temporarily should be recorded in the problem record and made available to the service desk. Any instances that make use of a workaround should still be recorded as incidents and linked to the outstanding problem record. This gives a continuing indication of the frequency of the problem.

2.4.7 Raise known error record if required
Once investigations have been completed and the cause of the problem is diagnosed (or before this point if useful), the status of the problem will be moved to that of “known error”. This indicates that the cause is known but the problem is not yet “fixed”. A knowledgebase (Known Error Database) will be maintained within the service desk system into which known errors will be placed.

2.4.8 Change Request
Where a change to the live environment is required in order to fix the problem, a change request must be raised in accordance with change management procedures. The reference numbers of such changes must be recorded in the problem record and the problem reference listed in each change record.
2.4.9 Problem resolution and closure
Once the problem has been diagnosed and resolved, it may be closed. In some circumstances it may be decided to close the problem without it being resolved e.g. if the cost of resolving it is prohibitive or the service involved is about to be replaced or retired. In this case the reasons should be documented in the problem record. Related incidents that remain open and are resolved as part of the resolution of the problem should also be closed.

2.4.10 Major problem review
In the case of major problems which have had a significant impact upon service to users, a problem review will be carried out by the problem manager to identify lessons learned. The report produced will be made available to interested parties and any recommendations input to the service improvement plan.
2.5 Process outputs
The outputs of the problem management process will be the following:
• Closed problems
• Complete and accurate problem records
• Feedback from customers and users regarding levels of satisfaction
• Communication and feedback to other service management processes such as availability management, capacity management and change management
• Reports to management regarding problem volumes, impacts, resolution success rates and process effectiveness
2.6 Problem management tools
There are a number of key software tools that underpin an effective problem management process. These are subject to change as requirements and technology are updated and so specific systems are not described here. However, the main types of tools that play a significant part in the process within [Organization Name] are as follows.

2.6.1 Service desk system
The service desk system provides the workflow engine and database to implement the core activities within problem management. These include:
• Problem logging
• Routing and assignment of problems to teams and individuals
• Recording of actions against problems
• Updating of problem status from open through to closed
• Assessment of impact and urgency and auto-calculation of priority
• Email communication with users from within problem records
• Problem categorisation to multiple levels
• Reporting
• Knowledgebase of past incidents with search capability
• Known error database
The service desk system is integrated with the systems that support various other processes, including incident, change and configuration management.
2.6.2 Problem analysis and investigation tools
There are various techniques that may be used during the different stages of the investigation of a problem. Some of these, such as Pareto Analysis and Ishikawa Diagrams, may be supported by tools implemented using spreadsheets and mapping software.
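As an example of the kind of calculation a spreadsheet would perform for Pareto Analysis, the following Python sketch ranks causes by frequency and adds a cumulative percentage. It is illustrative only: the function name and the cause labels are hypothetical, and the input is assumed to be one cause label per incident.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pareto_table(causes: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (cause, count, cumulative percentage) rows, most frequent cause first."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, round(100 * running / total, 1)))
    return rows


# Example with made-up cause labels, one per incident.
incident_causes = ["disk"] * 5 + ["network"] * 3 + ["config"] * 2
table = pareto_table(incident_causes)
```

Reading down the cumulative percentage column shows how few causes account for most of the incidents, which helps decide where investigation effort is best spent.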
2.6.3 Email
The email system is key to communication between the problem management team and other involved groups such as users and suppliers.

2.6.4 Configuration management system
The CMS provides real-time information about the hardware and software within the IT environment and allows problem management to view any changes that have been implemented on key components that are under consideration with regard to a problem. It allows the installed software and its versions to be viewed without the need to access the user’s computer remotely as well as helping problem management understand the relationships between service components.

2.7 Communication and training
There are various forms of communication that must take place for the problem management process to be effective. These are described below.
2.7.1 Communication with users Many of the incidents that lead to the identification of a problem are likely to be reported by users. If such incidents cannot be closed by using a workaround, these users should be kept informed about the progress of the problem investigation. Where such incidents can be closed but recur regularly, users will still want to know when the underlying problem will be fixed so that the repeated incidents can be expected to cease.
Emails exchanged with the user should be incorporated into the problem record so that a full audit trail of all communication is kept and is available to whoever is working on the problem.
It may be appropriate to invite selected users to sessions organized to investigate problems using techniques such as brainstorming. Users who have first-hand knowledge of the symptoms and circumstances of a problem can provide valuable insight into its causes and may speed up its resolution.
2.7.2 Communication with customers Even where there is no formal SLA associated with the resolution of problems, customers should be kept informed about the progress of high priority problems affecting their business area, including what is being done to resolve them and the resources dedicated to their investigation.
2.7.3 Communication with IT teams Problem management needs the support of technical specialists to identify and resolve sometimes complex problems for the benefit of the business and often the IT team itself. The problem manager will foster close relationships with key teams within the IT organization so that the benefits of effective problem management are understood and demonstrated. IT specialists will be involved in investigative sessions and are likely to be key contributors to the use of techniques such as chronological analysis and fault isolation.
2.7.4 Communication with suppliers Often the input of suppliers will be critical to diagnose, test and resolve difficult problems. Their knowledge of the products and services they supply will usually exceed that available in-house and sometimes access to the developers of products may be needed to determine a resolution.
The internal supplier manager for the third party involved should be kept informed of the ongoing communication between problem management and supplier staff and may be useful in securing additional resource to speed up investigations.
2.7.5 Process performance The performance of the problem management process must be monitored and reported on regularly in order to assess whether the process is operating as expected. The content of performance reports is set out in section 7 of this document, but it is vital that the reports are not only produced but also communicated to the appropriate audience. This will include the customers of the IT service and IT management, particularly with regard to resource utilisation and allocation. Depending on the health of the process it may be appropriate to hold regular meetings with customers and IT management to discuss performance and agree any actions to improve it.
2.7.6 Communication related to changes The problem management process manager must have visibility of the change management schedule and ideally will be briefed on any changes with the potential to affect ongoing problems. This may be a regular meeting or carried out on an ad-hoc basis according to the frequency of occurrence of such changes. Problem management will also communicate with change management as part of the logging of changes to resolve problems and the review of these after the event.
2.7.7 Training for problem management In addition to a well-defined process and appropriate software tools it is essential that the people aspects of problem management are adequately addressed. The process requires that training be provided to all participants in order that it runs as smoothly as possible. The main areas in which training will be required for problem management are as follows.
• The problem management process itself, including the activities, roles and responsibilities involved
• Problem management software tools such as the service desk system and configuration management system
• Specific problem investigation techniques such as Kepner-Tregoe, 5-Whys and Affinity Mapping
• Soft skills such as customer service, dealing with difficult conversations and avoiding technical jargon
• The basics of the technology and how it is implemented within [Organization Name]
• The business, its structure, locations, priorities and people
In addition, training should be provided to the user population regarding how to identify and report a problem, including:
• The difference between an incident, a service request, a problem and a change proposal and how they are handled
• How to report a problem via the various means available
• What may be expected of them as part of problem investigation
This training may be provided via short workshops and supplemented by on-demand resources such as videos and user guides.
3 Roles and responsibilities This section describes the main operational roles involved in the problem management process, their interaction with the process and their detailed responsibilities.
3.1 Operational roles The following main roles participate in the problem management process:
• Problem Identifier (user, service desk analyst, supplier etc.)
• Problem Analyst
• Third-Line Resource (including suppliers)
• Process Manager
There will also be interaction with IT and business management at various points in the process.
3.2 RACI matrix The table below clarifies the responsibilities of these roles at each step of the problem management process using the RACI system, i.e.:
• R: Responsible
• A: Accountable
• C: Consulted
• I: Informed
STEP | PROBLEM IDENTIFIER | PROBLEM ANALYST | THIRD-LINE RESOURCE | PROCESS MANAGER
Problem detection | R | I | | A
Problem logging | C | R | C | A
Problem categorisation | C | R | C | A
Problem prioritisation | C | R | C | A
Problem investigation and diagnosis | I | R | C | A
Workaround | I | R | C | A
Raise known error record | | R | | A
Change request | | R | | A
Problem resolution and closure | I | R | I | A
Major problem review | C | R | C | A/C
Table 3: RACI matrix
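When the RACI matrix is tailored for [Organization Name], a quick consistency check can help confirm that every step still has a Responsible role and exactly one Accountable role. The sketch below is illustrative only: `raci_issues` is a hypothetical helper and the dictionary layout is an assumption about how the matrix might be held, for example after export from a spreadsheet.

```python
def raci_issues(matrix):
    """Return the steps that lack an R or do not have exactly one A."""
    issues = []
    for step, assignments in matrix.items():
        # A cell may hold combined codes such as "A/C".
        codes = [code for cell in assignments.values() for code in cell.split("/")]
        if codes.count("A") != 1 or "R" not in codes:
            issues.append(step)
    return issues


# Two rows copied from the matrix above, plus one deliberately incomplete row.
sample_matrix = {
    "Problem logging": {
        "Problem Identifier": "C",
        "Problem Analyst": "R",
        "Third-Line Resource": "C",
        "Process Manager": "A",
    },
    "Major problem review": {
        "Problem Identifier": "C",
        "Problem Analyst": "R",
        "Third-Line Resource": "C",
        "Process Manager": "A/C",
    },
    "Example incomplete step": {"Problem Analyst": "R"},
}

print(raci_issues(sample_matrix))
```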
3.3 Problem management process owner The responsibilities of the problem management process owner are:
• Sponsoring, designing and change managing the process and its metrics
• Defining the process strategy
• Assisting with process design
• Ensuring that appropriate process documentation is available and current
• Defining appropriate policies and standards to be employed throughout the process
• Periodically auditing the process to ensure compliance to policy and standards
• Periodically reviewing the process strategy to ensure that it is still appropriate and change as required
• Communicating process information or changes as appropriate to ensure awareness
• Providing process resources to support activities required throughout the service lifecycle
• Ensuring process technicians have the required knowledge and the required technical and business understanding to deliver the process and understand their role in the process
• Reviewing opportunities for process enhancements and for improving the efficiency and effectiveness of the process
• Addressing issues with the running of the process
• Identifying improvement opportunities for inclusion in the CSI register
• Making improvements to the process
• Working with other process owners to ensure there is an integrated approach to the design and implementation of request fulfilment, problem management, event management, access management and incident management
Source: ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011. Reproduced under license from AXELOS.
3.4 Problem management process manager The responsibilities of the problem management process manager are:
• Working with the process owner to plan and co-ordinate all process activities
• Ensuring all activities are carried out as required throughout the service lifecycle
• Appointing people to the required roles
• Managing resources assigned to the process
• Working with service owners and other process managers to ensure the smooth running of services
• Monitoring and reporting on process performance
• Identifying improvement opportunities for inclusion in the CSI register
• Working with the CSI manager and process owner to review and prioritise improvements in the CSI register
• Making improvements to the process implementation
• Planning and managing support for problem management tools and processes
• Coordinating interfaces between problem management and other service management processes
• Driving the efficiency and effectiveness of the problem management process
• Producing management information
• Managing the work of problem management staff
• Monitoring the effectiveness of problem management and making recommendations for improvement
• Developing and maintaining the problem management systems
• Developing and maintaining the problem management process and procedures
Source: ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011. Reproduced under license from AXELOS.
3.5 Problem identifier Due to the diverse nature of problem management the person initially identifying the problem may be located in one of many teams including the business, IT and a supplier. The main responsibilities of the person identifying the problem are:
• Inform the service desk or problem management team directly that they believe a problem exists
• Provide all relevant information about the potential problem including symptoms, timings, frequency, impact and urgency
• Participate in investigative sessions when required
• Assist in confirming the resolution of the problem
3.6 Problem analyst The responsibilities of the Problem Analyst with respect to the problem management process are to:
• Interact with users and other parties in a professional manner at all times
• Log problems in a timely manner according to the organization’s established procedures
• Ensure that all appropriate questions are asked to establish the impact, urgency and circumstances of the problem
• Collect all relevant information regarding the problem
• Record all relevant information accurately and promptly within the problem management system
• Use appropriate tools and training to analyse, investigate and diagnose the problem
• Involve other IT and business teams and suppliers where necessary
• Confirm with those affected by the problem that it has been successfully resolved prior to closure
3.7 Third-line resource The third-line resource may be an internal team or a supplier. The responsibilities of the third-line resource within the problem management process are as follows.
• Assist in the definition of the problem from a technical viewpoint
• Participate in the analysis and investigation of the problem when requested
• Perform actions to resolve the problem under change management
• Help to confirm the resolution of the problem
4 Associated documentation The following documentation is relevant to the problem management process and should be read in conjunction with it:
DOCUMENT | REFERENCE | VERSION | LOCATION
ITIL Service Operation Book | ISBN number 9780113313075 | 2011 | [Network drive location]
Problem logging and management procedure | | | [Network drive location]
Service management system user guide | | | [Network drive location]
Service management system administration guide | | | [Network drive location]
Incident Management Process | ITILSO0301 | V1.0 Final | [Network drive location]
Change Management Process | | V1.0 Final | [Network drive location]
Configuration Management Process | | V1.0 Final | [Network drive location]
Table 4: Associated documentation
In the event that any of these items is not available, please contact the Service Desk Supervisor.
5 Interfaces and dependencies The problem management process has a number of interfaces and dependencies with other processes within service management and the business. These are outlined here and are described in further detail in the relevant procedural documentation.
5.1 Other service management processes
ITIL LIFECYCLE STAGE | PROCESS | INPUTS TO PROBLEM MANAGEMENT FROM THE NAMED PROCESS | OUTPUTS FROM PROBLEM MANAGEMENT TO THE NAMED PROCESS
Service Strategy | Financial Management for IT Services | Cost information to help assess the relative priority of problems; costings of hardware and software components to be used to resolve problems | Cost of proposed problem resolutions for input to budget cycle
Service Design | Service Level Management | Service Level Agreements to determine impact of problems | Problem status information for inclusion in service level reports
Service Design | Availability Management | Areas in which availability needs to be improved | Resolved problems to improve availability
Service Design | Capacity Management | Performance information as part of investigation of capacity issues | Resolved performance problems
Service Design | IT Service Continuity Management | Details of service continuity plans for options appraisal | Invocations of service continuity plans in the event of major problems
Service Transition | Service Asset and Configuration Management | Configuration Management System (CMS) records | Linking of problems to CIs
Service Transition | Change Management | Information about changes that may have affected existing problems or created new ones | Changes raised to resolve problems
Service Transition | Release and Deployment Management | Known errors with new releases; release schedules for changes to fix problems | Information regarding priority of problems for which fixes are included in planned releases
Service Transition | Knowledge Management | Information about current services | Known errors for inclusion in the KEDB
Service Operation | Incident Management | Problems raised as a result of one or more incidents; ongoing information about incidents related to problems | Resolution of problems leading to closure of open incidents; workarounds; known errors
Continual Service Improvement | 7-Step Improvement Process | Problem management process improvements | Potential service improvements identified from resolved problems
Table 5: Interfaces with other service management processes
5.2 Business processes [Business processes will be numerous and highly industry- and organization-specific. We therefore recommend that you address only those that are closely linked to the process in question here.]
BUSINESS AREA | BUSINESS PROCESS | INPUTS TO PROBLEM MANAGEMENT FROM THE NAMED PROCESS | OUTPUTS FROM PROBLEM MANAGEMENT TO THE NAMED PROCESS
Human Resources | All | Notification of potential problems | Resolved problems
Finance | All | Notification of potential problems | Resolved problems
Sales and Marketing | All | Notification of potential problems | Resolved problems
Production/Operations | All | Notification of potential problems | Resolved problems
Legal and Compliance | All | Notification of potential problems | Resolved problems
Research and Development | All | Notification of potential problems | Resolved problems
Distribution and Logistics | All | Notification of potential problems | Resolved problems
Customer Services | All | Notification of potential problems | Resolved problems
Purchasing | All | Notification of potential problems | Resolved problems
Public Relations | All | Notification of potential problems | Resolved problems
Administration | All | Notification of potential problems | Resolved problems
[Insert further business processes here]
Table 6: Interfaces with business processes
6 Process measurements and metrics In order to determine whether the problem management process is working effectively and achieving what we want it to achieve, we must first define our critical success factors and identify how we will determine if they are being fulfilled.
6.1 Critical success factors The following factors are defined as critical to the success of the problem management process:
REF | CRITICAL SUCCESS FACTOR
CSF1 | Business impact and disruption is minimised
CSF2 | Service levels are being maintained
CSF3 | User satisfaction and confidence in IT services is maintained
CSF4 | The process provides value for money
Table 7: Critical success factors
Achievement of these critical success factors will be measured via the use of relevant Key Performance Indicators (KPIs).
6.2 Key performance indicators The following KPIs will be used on a regular basis to evidence the successful operation of the problem management process:
CSF REF | KPI REF | KEY PERFORMANCE INDICATOR
CSF1 | KPI1.1 | Number of problems resolved per month
CSF1 | KPI1.2 | Number of incidents prevented per month
CSF1 | KPI1.3 | Number of workarounds identified
CSF1 | KPI1.4 | Mean time to resolve problems by problem type
CSF2 | KPI2.1 | Number of major problems raised
CSF2 | KPI2.2 | Number of hours service lost due to identified problems
CSF3 | KPI3.1 | User satisfaction scores from user surveys
CSF3 | KPI3.2 | Customer satisfaction scores from customer surveys
CSF3 | KPI3.3 | Number of complaints about the problem management process
CSF3 | KPI3.4 | Number of open problems
CSF4 | KPI4.1 | Staff to problem ratio
CSF4 | KPI4.2 | Average cost per resolved problem
Table 8: Key performance indicators
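As an illustration of how a KPI such as KPI1.4 (mean time to resolve problems by problem type) might be derived from exported problem records, a minimal sketch follows. The record fields `type`, `logged` and `resolved` are assumed names for illustration; the actual export format of the service desk system is not specified in this document.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_resolution_days(problems):
    """Map each problem type to the mean days between logging and resolution."""
    durations = defaultdict(list)
    for problem in problems:
        if problem["resolved"] is None:
            continue  # Open problems have no resolution time yet.
        elapsed = problem["resolved"] - problem["logged"]
        durations[problem["type"]].append(elapsed.total_seconds() / 86400)
    return {ptype: mean(values) for ptype, values in durations.items()}


# Invented sample records using the assumed field names.
sample_problems = [
    {"type": "hardware", "logged": datetime(2024, 1, 1), "resolved": datetime(2024, 1, 11)},
    {"type": "hardware", "logged": datetime(2024, 1, 5), "resolved": datetime(2024, 1, 25)},
    {"type": "software", "logged": datetime(2024, 2, 1), "resolved": None},
]

print(mean_resolution_days(sample_problems))
```

Problem types with no resolved problems are omitted from the result rather than reported as zero, so that an absence of data is not mistaken for fast resolution.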
6.3 Process reviews and audits Reviews will be carried out by the process owner in conjunction with the process manager on a three-monthly basis to assess whether the problem management process is operating effectively and delivering the desired results. These reviews will have the following as input:
• Follow-up action list from previous reviews
• Relevant changes and developments within the business and IT
• KPI reports from the previous period
• Details of all complaints logged during the period
• Internal and external audit reports
• Feedback from users and customers
• Identified opportunities for improvement
Each review will be documented by the process owner and actions arising agreed and published.
Audits will be carried out on an annual basis by the internal auditing department. The scope and timing of the audit will be agreed in advance. Recommendations from the audit will be published and actions discussed and agreed with the process owner. All actions will be followed up by the internal auditor within the agreed timescales for each action.
7 Process reporting It is important that regular reports are produced for two main reasons:
1. To help to assess whether the problem management process is meeting its critical success factors (see section 6.1 above)
2. To assist operational supervisors in the day-to-day management of the problem management process and its resourcing
These two purposes may require different views of the information available and will need to be produced at varying frequencies for differing audiences. The format of the reports produced will also be subject to regular review and amendment as requirements become clearer and the available reporting technology within the business matures.
What must be avoided is the continued production of reports that are not read and serve no purpose. It is up to the process owner, in consultation with the process manager, to ensure that all reporting remains focussed and relevant.
The following tables show the reports that will be produced together with their purpose, method of production, data source, audience and frequency. Some of the reports listed will be used for multiple purposes.
7.1 Process reports The following reports are produced by the process manager and are intended to help the process owner assess whether the CSFs for problem management are being met.
REF | REPORT TITLE | DESCRIPTION | METHOD OF PRODUCTION | DATA SOURCE | FREQ | AUDIENCE
CSFR1 | Resolved problems | Number of problems resolved by month | Service desk system reporting tool | Service desk database | Weekly | Process owner
CSFR2 | Incidents prevented | Number of incidents logged per month prior to problem resolution minus number logged after resolution, by problem | Service desk system reporting tool | Service desk database | Weekly | Process owner
CSFR3 | Workarounds | Number of workarounds added to KEDB per month | Service desk system reporting tool | Service desk database | Weekly | Process owner
CSFR4 | Problem resolution time | Average resolution time per closed problem | Service desk system reporting tool | Service desk database | Monthly | Process owner
CSFR5 | Major problems | Number of major problems raised per month | Service desk system reporting tool | Service desk database | Monthly | Process owner
CSFR6 | Hours service lost | Number of hours service lost as a result of incidents related to open problems | Service desk system reporting tool | Service desk database | Monthly | Process owner
CSFR7 | User satisfaction | User satisfaction survey results | Survey tool report | Survey tool database | Quarterly | Process owner
CSFR8 | Customer satisfaction | Customer satisfaction survey results | Survey tool report | Survey tool database | Six-monthly | Process owner
CSFR9 | Problem management complaints | Number and listing of new complaints per month | Complaints system reports | Complaints database | Monthly | Process owner
CSFR10 | Open problems | Number of open problems at end of month | Service desk system reporting tool | Service desk database | Monthly | Process owner
CSFR11 | Resolution ratio | Staff to resolved problem ratio | Spreadsheet chart | Service desk database and staff attendance records | Monthly | Process owner
CSFR12 | Problem cost | Average cost per resolved problem by type | Spreadsheet costing model | Service desk system and costings from Finance | Quarterly | Process owner
CSFR13 | [Insert further reports] | | | | |
Table 9: Process reports
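The arithmetic behind two of the process reports, CSFR2 (incidents prevented) and CSFR12 (average cost per resolved problem), can be sketched as below. The function names, the problem references and the input shapes are assumptions made for illustration; the real figures would come from the service desk database and the costings supplied by Finance.

```python
def incidents_prevented(monthly_counts):
    """Map each problem reference to incidents per month before resolution
    minus incidents per month after resolution (CSFR2)."""
    return {
        ref: before - after
        for ref, (before, after) in monthly_counts.items()
    }


def average_cost_per_problem(total_cost, resolved_count):
    """Return the average cost per resolved problem (CSFR12), or None if
    no problems were resolved in the period."""
    if resolved_count == 0:
        return None
    return total_cost / resolved_count


# Invented sample: (incidents per month before, incidents per month after).
sample_counts = {"PRB001": (12, 2), "PRB002": (4, 4)}

print(incidents_prevented(sample_counts))
print(average_cost_per_problem(1500.0, 3))
```

A result of zero for a problem, as for the second sample entry, suggests that the resolution has not reduced the related incidents and may warrant a review.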
7.2 Operational reports The following reports provide further ongoing operational information to the process manager. They are in addition to the relevant process reports described above.
REF | REPORT TITLE | DESCRIPTION | METHOD OF PRODUCTION | DATA SOURCE | FREQ | AUDIENCE
OPR1 | Analyst productivity | Problems logged and closed by problem analyst | Service desk system reporting tool | Service desk database | Weekly | Process manager
OPR2 | Problem category | Open problems by category | Service desk system reporting tool | Service desk database | Monthly | Process manager
OPR3 | High priority | Problems of priority 1 and 2 that are currently open | Service desk system filter view | Service desk database | Weekly | Process manager
OPR4 | Problem dashboard | Dashboard of key problem status | Service desk system reporting tool | Service desk database | Weekly | Process manager
[Insert further reports]
Table 10: Operational reports
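A filter view such as OPR3 (problems of priority 1 and 2 that are currently open) amounts to a simple selection over problem records. The sketch below shows the logic under assumed field names (`ref`, `priority`, `status`) and an assumed `closed` status value; the real view would be configured within the service desk system.

```python
def open_high_priority(problems):
    """Return open problems with priority 1 or 2, highest priority first."""
    selected = [
        problem
        for problem in problems
        if problem["status"] != "closed" and problem["priority"] in (1, 2)
    ]
    return sorted(selected, key=lambda problem: problem["priority"])


# Invented sample records using the assumed field names.
sample_records = [
    {"ref": "PRB-1", "priority": 3, "status": "open"},
    {"ref": "PRB-2", "priority": 2, "status": "open"},
    {"ref": "PRB-3", "priority": 1, "status": "closed"},
    {"ref": "PRB-4", "priority": 1, "status": "in progress"},
]

for record in open_high_priority(sample_records):
    print(record["ref"], record["priority"])
```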
8 Glossary, abbreviations and references
8.1 Glossary For a full list of terms used and their definitions within ITIL, please refer to the back of any of the books in the ITIL Lifecycle Suite 2011. The following subset of terms is specifically relevant to this document:
TERM | MEANING
Category | A named group of things that have something in common
Change | The addition, modification or removal of anything that could have an effect on IT services
Closed | The final status in the lifecycle of an incident, problem, change etc. When the status is closed, no further action is taken
Configuration item | Any component or other service asset that needs to be managed in order to deliver an IT service
Configuration management system | A set of tools, data and information that is used to support service asset and configuration management
Continual service improvement | A stage in the lifecycle of a service. Continual service improvement ensures that services are aligned with changing business needs by identifying and implementing improvements to IT services that support business processes
Customer | Someone who buys goods or services. The customer of an IT service provider is the person or group who defines and agrees the service level targets
Escalation | An activity that obtains additional resources when these are needed to meet service level targets or customer expectations
Financial management | A generic term used to describe the function and processes responsible for managing an organization’s budgeting, accounting and charging requirements
Impact | A measure of the effect of an incident, problem or change on business processes
Incident | An unplanned interruption to an IT service or reduction in the quality of an IT service
Incident management | The process responsible for managing the lifecycle of all incidents
IT service | A service provided by an IT service provider. An IT service is made up of a combination of information technology, people and processes
Key performance indicator | A metric that is used to help manage an IT service, process, plan, project or other activity
Known error database | A database containing all known error records
Model | A representation of a system, process, IT service, configuration item etc. that is used to help understand or predict future behaviour
Operational level agreement | An agreement between an IT service provider and another part of the same organization
Priority | A category used to identify the relative importance of an incident, problem or change
Problem | A cause of one or more incidents
Service desk | The single point of contact between the service provider and the users
Service level agreement | An agreement between an IT service provider and a customer
Urgency | A measure of how long it will be until an incident, problem or change has a significant impact on the business
User | A person who uses the IT service on a day-to-day basis. Users are distinct from customers, as some customers do not use the IT service directly
Vision | A description of what the organization intends to become in the future
Workaround | Reducing or eliminating the impact of an incident or problem for which a full resolution is not yet available
Table 11: Glossary of relevant terms
Based on ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011. Reproduced under license from AXELOS.
8.2 Abbreviations The following abbreviations are used in this document:
• BRM: Business Relationship Manager
• CI: Configuration Item
• CMS: Configuration Management System
• CSF: Critical Success Factor
• CSI: Continual Service Improvement
• IT: Information Technology
• ITIL: Information Technology Infrastructure Library
• KEDB: Known Error Database
• OLA: Operational Level Agreement
• SKMS: Service Knowledge Management System
• SLA: Service Level Agreement
• UC: Underpinning Contract
8.3 References The following sources have been used in the creation of this process document and should be consulted for more information on particular aspects of it:
• ITIL Service Operation Book 2011. Copyright © AXELOS Limited 2011
• [Organization Name] IT organization structure, published dd/mm/yy
• [Organization Name] Business Strategy yyyy-yyyy
• [Organization Name] IT Strategy yyyy-yyyy
• [Organization Name] IT Service Management Strategy yyyy-yyyy