Identify and classify problems and their root causes and provide timely resolution to prevent recurring incidents. Provide recommendations for improvements.
Increase availability, improve service levels, reduce costs, and improve customer convenience and satisfaction by reducing the number of operational problems.
Percent of critical business processes, IT services and IT-enabled business programmes covered by risk assessment
Number of significant IT-related incidents that were not identified in risk assessment
Percent of enterprise risk assessments including IT-related risk
Frequency of update of risk profile
Number of business disruptions due to IT service incidents
Percent of business stakeholders satisfied that IT service delivery meets agreed-on service levels
Percent of users satisfied with the quality of IT service delivery
Frequency of capability maturity and cost optimization assessments
Trend of assessment results
Satisfaction levels of business and IT executives with IT-related costs and capabilities
Level of business user satisfaction with quality and timeliness (or availability) of management information
Number of business process incidents caused by non-availability of information
Ratio and extent of erroneous business decisions where erroneous or unavailable information was a key factor
Decrease in number of recurring incidents caused by unresolved problems
Percent of major incidents for which problems were logged
Percent of workarounds defined for open problems
Percent of problems logged as part of the proactive problem management activity
Number of problems for which a satisfactory resolution that addressed root causes was found
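The percentage metrics above can be computed directly from incident and problem records. The following is a minimal sketch, assuming a hypothetical record layout (the `Incident` and `Problem` classes and their field names are illustrative and not defined by COBIT 5); it covers two of the metrics: major incidents for which problems were logged, and open problems with a defined workaround.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    # Hypothetical incident record; field names are assumptions.
    incident_id: str
    major: bool
    problem_id: Optional[str] = None  # linked problem record, if one was logged


@dataclass
class Problem:
    # Hypothetical problem record; field names are assumptions.
    problem_id: str
    is_open: bool
    has_workaround: bool


def percent(part: int, whole: int) -> float:
    # Return 0.0 for an empty population instead of dividing by zero.
    return 100.0 * part / whole if whole else 0.0


def major_incidents_with_problem(incidents) -> float:
    # Percent of major incidents for which a problem was logged.
    major = [i for i in incidents if i.major]
    return percent(sum(1 for i in major if i.problem_id), len(major))


def open_problems_with_workaround(problems) -> float:
    # Percent of open problems that have a workaround defined.
    open_problems = [p for p in problems if p.is_open]
    return percent(sum(1 for p in open_problems if p.has_workaround), len(open_problems))
```

How "major" and "open" are determined would follow the enterprise's own incident and problem definitions; the sketch only shows the arithmetic.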
Define and implement criteria and procedures to report problems identified, including problem classification, categorization and prioritization.
Identify problems through the correlation of incident reports, error logs and other problem identification resources. Determine priority levels and categorization to address problems in a timely manner based on business risk and service definition.
Handle all problems formally with access to all relevant data, including information from the change management system and IT configuration/asset and incident details.
Define appropriate support groups to assist with problem identification, root cause analysis and solution determination to support problem management. Determine support groups based on pre-defined categories, such as hardware, network, software, applications and support software.
Define priority levels through consultation with the business to ensure that problem identification and root cause analysis are handled in a timely manner according to the agreed-on SLAs. Base priority levels on business impact and urgency.
Report the status of identified problems to the service desk so customers and IT management can be kept informed.
Maintain a single problem management catalogue to register and report problems identified and to establish audit trails of the problem management processes, including the status of each problem (i.e., open, reopen, in progress or closed).
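The activities above on priority levels and on the problem catalogue can be sketched together as a small record type. This is an illustration under stated assumptions, not a prescribed design: the three-level impact and urgency scale, the additive priority rule, and the allowed status transitions are all hypothetical choices; only the four status names come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum, IntEnum


class Level(IntEnum):
    # Assumed three-level scale for both business impact and urgency.
    HIGH = 1
    MEDIUM = 2
    LOW = 3


class Status(Enum):
    OPEN = "open"
    REOPEN = "reopen"
    IN_PROGRESS = "in progress"
    CLOSED = "closed"


# Hypothetical transition rules; the source names the statuses but not the transitions.
ALLOWED = {
    Status.OPEN: {Status.IN_PROGRESS, Status.CLOSED},
    Status.IN_PROGRESS: {Status.CLOSED},
    Status.CLOSED: {Status.REOPEN},
    Status.REOPEN: {Status.IN_PROGRESS, Status.CLOSED},
}


def derive_priority(impact: Level, urgency: Level) -> int:
    # Assumed rule: priority 1 (highest) to 5 (lowest) from impact plus urgency.
    return int(impact) + int(urgency) - 1


@dataclass
class ProblemRecord:
    problem_id: str
    impact: Level
    urgency: Level
    status: Status = Status.OPEN
    audit_trail: list = field(default_factory=list)

    @property
    def priority(self) -> int:
        return derive_priority(self.impact, self.urgency)

    def move_to(self, new_status: Status, actor: str) -> None:
        # Reject transitions outside the assumed rules, then record the change.
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        self.audit_trail.append((self.status, new_status, actor))
        self.status = new_status
```

Recording every status change on the record itself is one simple way to keep the audit trail with the catalogue entry; a real catalogue would normally also store timestamps and the agreed-on SLA targets.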
Investigate and diagnose problems using relevant subject matter experts to assess and analyze root causes.
Identify problems that may be known errors by comparing incident data with the database of known and suspected errors (e.g., those communicated by external vendors) and classify matching problems as known errors.
Associate the affected configuration items to the established/known error.
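Comparing incident data with the known-error database, as described above, can be sketched as a simple lookup. The matching rule here (same configuration item, and all of the known error's symptom keywords present in the incident description) is an assumption chosen for illustration; the `KnownError` structure and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownError:
    # Hypothetical known-error record; field names are assumptions.
    error_id: str
    configuration_item: str
    symptom_keywords: frozenset


def match_known_errors(configuration_item, description, known_errors):
    # Return the known errors on the same configuration item whose
    # symptom keywords all appear as words in the incident description.
    words = set(description.lower().split())
    return [
        ke
        for ke in known_errors
        if ke.configuration_item == configuration_item
        and ke.symptom_keywords <= words
    ]
```

A match only suggests that a problem may be a known error; confirming the classification still requires the investigation and diagnosis described above.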
Produce reports to communicate the progress in resolving problems and to monitor the continuing impact of problems not solved. Monitor the status of the problem-handling process throughout its life cycle, including input from change and configuration management.
As soon as the root causes of problems are identified, create known-error records and an appropriate workaround, and identify potential solutions.
As soon as the root causes of problems are identified, create known-error records and develop a suitable workaround.
Identify, evaluate, prioritize and process (via change management) solutions to known errors based on a cost-benefit business case and business impact and urgency.
Identify and initiate sustainable solutions addressing the root cause, raising change requests via the established change management process if required to resolve errors. Ensure that the personnel affected are aware of the actions taken and the plans developed to prevent future incidents from occurring.
Close problem records either after confirmation of successful elimination of the known error or after agreement with the business on how to alternatively handle the problem.
Inform the service desk of the schedule of problem closure, e.g., the schedule for fixing the known errors, the possible workaround or the fact that the problem will remain until the change is implemented, and the consequences of the approach taken. Keep affected users and customers informed as appropriate.
Throughout the resolution process, obtain regular reports from change management on progress in resolving problems and errors.
Monitor the continuing impact of problems and known errors on services.
Review and confirm the success of resolutions of major problems.
Make sure the knowledge learned from the review is incorporated into a service review meeting with the business customer.
Collect and analyze operational data (especially incident and change records) to identify emerging trends that may indicate problems. Log problem records to enable assessment.
Capture problem information related to IT changes and incidents and communicate it to key stakeholders. This communication could take the form of reports to, and periodic meetings among, the incident, problem, change and configuration management process owners to consider recent problems and potential corrective actions.
Ensure that process owners and managers from incident, problem, change and configuration management meet regularly to discuss known problems and future planned changes.
To enable the enterprise to monitor the total costs of problems, capture change efforts resulting from problem management process activities (e.g., fixes to problems and known errors) and report on them.
Produce reports to monitor the problem resolution against the business requirements and SLAs. Ensure the proper escalation of problems, e.g., escalation to a higher management level according to agreed-on criteria, contacting external vendors, or referring to the change advisory board to increase the priority of an urgent request for change (RFC) to implement a temporary workaround.
To optimize the use of resources and reduce workarounds, track problem trends.
Identify and initiate sustainable solutions (permanent fix) addressing the root cause, and raise change requests via the established change management processes.
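The proactive activities above call for analyzing operational data to identify emerging trends. One possible way to do this, shown here as a minimal sketch, is to compare incident counts per category between two reporting periods. The category labels and the `min_count` and `growth` thresholds are hypothetical parameters, not values taken from COBIT 5.

```python
from collections import Counter


def emerging_trends(previous, current, min_count=3, growth=2.0):
    # previous, current: lists of incident category labels for two periods.
    # A category is flagged when it has at least `min_count` incidents in the
    # current period and at least `growth` times its previous-period count.
    prev_counts = Counter(previous)
    cur_counts = Counter(current)
    flagged = [
        category
        for category, count in cur_counts.items()
        if count >= min_count and count >= growth * prev_counts.get(category, 0)
    ]
    return sorted(flagged)
```

Each flagged category would then be a candidate for logging a problem record so that it can be assessed.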
References:
ISACA. (2012). COBIT 5 Enabling Processes. USA: ISACA.