IT Service Management PV203
Vladimír Vágner, 24.04.2023

What shall we discuss today?
- RACI model
- Service Levels
- Measurements
- Reporting
Lesson 7

First introduced in the 1950s, RACI was originally called the "Decision Rights Matrix" and is also known as "Responsibility Charting." There are also other RACI variations, like RASCI, ARCI, and DACI. It is the only project management tool that deals with people and roles.

RACI model
A RACI chart, also known as a RACI matrix or RACI model, is a diagram that identifies the key roles and responsibilities of users against major tasks within a project. RACI charts serve as a visual representation of the functional role played by each person on a project team. It balances the workload and establishes the decision-maker.

A RACI matrix is a very important tool that can help in the implementation and correct functioning of a process. The RACI matrix is mostly used to align the human elements in the process. Usually there are many different people involved in any process and they have differing responsibilities. A RACI matrix documents this explicitly and serves as a ready reference to be used at different stages in the process. Here is how the RACI matrix can be utilized:

Responsible: This is the class of people who are ultimately responsible for getting the work done. This may refer to the individual workers that perform the given task, or it could refer to the system in case the task is automated.
Accountable: This is the class of people who are accountable for overseeing that the work gets done. This usually means the immediate manager overseeing the work.
Consulted: These may be subject matter experts who need to be consulted at the time of an exception. There is a possibility that an unanticipated scenario arises in a process. These are the people who will do the thinking and suggest any deviations from the Standard Operating Procedure (SOP).
Informed: This is the class of people who have some interest in the performance of a given task. This may be a manager trying to control the execution of the task at hand. It could also be an input signal to another process.

RACI model
RACI models are used to manage resources and roles for the delivery of a piece of work or task.
• Only one person can be ACCOUNTABLE for any task. The person who is accountable for the task has the overall authority for the task – but they may not carry out individual pieces of work themselves.
• Any number of people can be RESPONSIBLE as part of the RACI model. These are the workers who will get the actual tasks done, and they will report to the Accountable resource about their progress.
• Sometimes resources are CONSULTED to get a task done. This might be a person within the organisation who has specific knowledge, or it could be a document store, or even an internet search engine. These resources need to be tracked to ensure they are available when required.
• Other resources need to be INFORMED. These resources are stakeholders who need to track and understand exactly how the task is progressing, or they may need an output from the task. Business sponsors, for example, will typically be informed about progress as part of a project.
When RACI is applied to service management processes, the process owner will be accountable for all the process activities, even if they are not responsible for carrying them out.
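As a minimal sketch (not from the lecture materials), a RACI assignment can be modelled as a mapping from activities to role letters and checked programmatically against the basic rules that follow: exactly one Accountable and at least one Responsible per activity. The activity and role names below are invented for illustration.

```python
# Minimal RACI sanity check: one Accountable and at least one Responsible per activity.
# Activity and role names are invented examples, not part of any ITIL standard.
raci = {
    "Define SLA targets": {"Service Level Manager": "AR", "Service Owner": "C", "Service Desk": "I"},
    "Publish SLA report": {"Service Level Manager": "A", "Reporting Team": "R", "Customer": "I"},
}

def check_raci(matrix):
    for activity, assignments in matrix.items():
        letters = "".join(assignments.values())
        if letters.count("A") != 1:
            print(f"'{activity}': expected exactly one Accountable, found {letters.count('A')}")
        if "R" not in letters:
            print(f"'{activity}': no Responsible role assigned")

check_raci(raci)  # prints nothing when both rules hold
```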
Rules for using a RACI matrix

Only one Responsible and Accountable person: It is essential that only one person be assigned the R/A roles. Having more than one person responsible for the same task increases ambiguity and the chances of the work not being performed. It could also lead to duplication of work and wasted effort and cost. Having more than one accountable person leads to the same problem. However, having only one person accountable can also cause a problem: if the assigned person is incompetent, the whole process may fail. It is for this reason that there is often a hierarchy of accountable people in place.

Responsible-Accountable is mandatory: The Consulted or Informed roles are not mandatory for every activity. It is possible that some activities may not require them at all. But the Responsible and Accountable roles must be assigned. Even if the system is performing the tasks automatically, someone must be made accountable to see that they do get done.

Communication with the Consulted party: There must be a two-way channel of communication with the consulted party. This communication is itself a task and must be explicitly listed, with its own responsible and accountable persons. The important aspect is that the communication should be two-way. Hence one has to ensure that adequate follow-up is done and there is minimum time lag to complete the communication.

Inform the required stakeholders: This is a one-way channel of communication. It is usually meant to be a signal for some other process to begin, or a control metric to ensure smooth functioning of the same process. Usually this is automated, but it needs accountability like other automated tasks.

Sometimes the extended version is used – the RASCI matrix:
R - Responsible – Those who do the work to achieve a task. There is typically one role with a participation type of Responsible.
A - Accountable – Those who are ultimately accountable for the correct and thorough completion of the deliverable or task, and the one to whom Responsible is accountable. Typically, the Process Owner is Accountable for a process, and there must be only one Accountable specified for each task or deliverable.
S - Support – Resources allocated to Responsible. Unlike Consulted, who may provide input to the task, Support will assist in completing the task.
C - Consulted – Those who are not directly involved in a process but provide inputs and whose opinions are sought.
I - Informed – Those who receive outputs from a process or are kept up-to-date on progress, often only on completion of the task or deliverable.

Example RACI assignment:

            | Service Design Manager | Service Level Manager | Problem Manager | Security Manager | Procurement Manager
Activity 1  | A,R                    | C                     | I               | I                | C
Activity 2  | A                      | R                     | C               | C                | C
Activity 3  | I                      | A,I                   | R,C,I           | I                | C
Activity 4  | I                      | A                     | R               | I                | R,C,I
Activity 5  | I                      | I                     | A               | C                | I

RACI models map roles and responsibilities to processes and activities:
• Responsible – Execution
• Accountable – Results
• Consulted – Expertise and perspective
• Informed – Communication

ITIL Roles, Responsibilities, ITIL Processes
A role is a set of responsibilities, activities and authorities granted to a person or team. A role is defined in a process or function. One person or team may have multiple roles. For example, the roles of incident manager and problem manager may be carried out by a single person. Roles are often confused with job titles, but it is important to realize that they are not the same. Each organization will define appropriate job titles and job descriptions which suit their needs, and individuals holding these job titles can perform one or more of the required roles.
What is a role?
• A set of connected behaviors or actions performed by a person, team or group in a specific context
• One person or team may have multiple roles
• A process defines the scope and responsibilities of a role
• May or may not carry a formal title

Typical responsibilities per role:

PROCESS OWNER
• Defining the process strategy
• Assisting with the process design, including metrics
• Process documentation assurance
• Auditing the process
• Process improvement
• Policies and standards definition
• Sponsoring the process

PROCESS MANAGER
• Operational management of a process
• Works with the process owner
• Makes sure all process activities are carried out
• Monitoring and reporting the process performance
• Appointing staff
• Works with service owner(s)
• Identifies improvements
• Makes improvements to the process implementation

PROCESS PRACTITIONER
• Carries out the process activities
• Understands how their role links to services and creates value
• Works with other stakeholders
• Makes sure that inputs, outputs and interfaces are correct
• Creates and updates records of their activities

SERVICE OWNER
• Accountable for the delivery of a specific IT service
• Attends CAB
• Attends internal and external service review meetings
• Communicates with customers
• Serves as SPOC
• Participates in SLA and OLA negotiations

CAB – Change Advisory Board; SPOC – Single Point of Contact; SLA – Service Level Agreement; OLA – Operational Level Agreement
The RACI tool helps to identify the positioning of specific roles and the relationship setup.

Joint management model – customer/vendor – example
[Diagram: strategic, tactical and operational management layers mirrored on the services-user and services-vendor sides. User side: Contract Executive, Relationship Manager, Contract Transition Manager, Director of Service Delivery, Project Executive, Transition Manager. Vendor side: Account Manager (lead), Service Manager (lead), Service Managers, Program Management Office (lead), Contract Manager, Finance Manager, Architecture Manager, Delivery Project Executive, Service Delivery Team, Project Management Office, Contracts/Service Control/Finance, Resource Manager/HR Director, Chief Technologist/Chief Architect. Responsibilities span executive management, overall account management, service delivery, delivering services within the scope of the agreement, and transition-related activities.]
The latest Agile approaches influence the position of Line Management!

Accountable roles
• Process Owners
• Service Owners
• Line Management
• IT Steering Group
• Change Advisory Board
These must-have roles are accountable for quality, results, conformance and continual improvement. They may or may not be operational. There is only ONE accountable role for each activity.

Other RACI roles
Responsible roles – persons or groups that execute one or more activities (actually do the work).
Informed roles – receive communication about the activity.
Consulted roles – provide specific expertise or perspective.
Based on the circumstances, individuals or groups will likely play multiple "roles" for the same activity – sometimes simultaneously.

The ITIL 4 SLM practice defines the purpose of service level management as "…to set clear business-based targets for service levels, and to ensure that delivery of services is properly assessed, monitored, and managed against these targets." Service levels should relate to a measurable business outcome. All too often the IT industry has used SLA targets as a way to capture numbers and figures that make no sense in a business context. By focusing on what the business wants as an outcome, your organization will be able to deliver services that add value rather than simply hit arbitrary metrics and targets.
The Service Level Agreement is basically a contract between a service provider and a customer. The agreement ensures, for example, that all the computer equipment will be well maintained. An OLA, on the other hand, is an agreement between the internal support groups of an institution that supports the SLA. According to the Operational Level Agreement, each internal support group has certain responsibilities to the other groups. The OLA clearly depicts the performance and relationship of the internal service groups.

Service Levels
SLA – Service Level Agreement vs. OLA – Operational Level Agreement
Service level describes, usually in measurable terms, the services a service provider furnishes a customer within a given time period.

Service level is the metric by which a particular service is measured. Service level is mostly used in the service-based industries. Service level provides the expectations of quality and service type, and also remedies when requirements are not met. Service level is an important component of any vendor contract. Service level includes all elements of the particular service provided and the conditions of service availability. The exact measurement related to service levels depends upon the type of service provided, volume of work, quality of work and the service provider. In some cases, there are multiple approaches to determine the service levels. Service level is often documented with the help of a service-level agreement, which describes in detail the level of service anticipated by a customer from a vendor. Most service providers have service levels and standard level agreements. A service-level agreement deals with the reliability, responsiveness, monitoring and escalation procedures related to service levels. Service level measurement helps the involved parties to understand the level of service quality. A service-level agreement in place protects all parties involved in the agreement. Service level helps in understanding the measures of certain goals and indirectly helps in achieving those goals.

A Service Level Agreement (SLA) is the service contract component between a service provider and a customer. An SLA provides specific and measurable aspects related to service offerings. For example, SLAs are often included in signed agreements between Internet service providers (ISPs) and customers.

An SLA is a negotiated agreement between two or more parties designed to create a common understanding about the service. It is:
• A communication tool
• A conflict resolution tool
• A living document
• A method for gauging service effectiveness

An SLA is also known as an operational level agreement (OLA) when used in an organization without an established or formal provider-customer relationship (very often it is used as an internal agreement between different units). Adopted in the late 1980s, SLAs are currently used by most industries and markets. By nature, SLAs define service output but defer methodology to the service provider's discretion. Specific metrics vary by industry and SLA purpose.
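To make the SLA/OLA relationship described above more concrete, here is a hypothetical sketch (the group names and numbers are invented, not from the lecture): an SLA resolution target can only be met if the internal OLA commitments of the support groups handling a ticket fit inside it.

```python
# Hypothetical check that internal OLA targets fit within the customer-facing SLA target.
sla_resolution_target_hours = 8            # promised to the customer in the SLA (invented)

ola_targets_hours = {                      # internal commitments between support groups (invented)
    "Service Desk triage": 1,
    "Network team fix": 4,
    "Change deployment": 2,
}

total_internal = sum(ola_targets_hours.values())
if total_internal <= sla_resolution_target_hours:
    print(f"OLAs ({total_internal} h) support the SLA target ({sla_resolution_target_hours} h)")
else:
    print(f"OLAs ({total_internal} h) exceed the SLA target ({sla_resolution_target_hours} h) - renegotiate")
```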
SLA features include:
• Specific details and scope of provided services, including priorities, responsibilities and guarantees
• Specific, expected and measurable services at minimum or target levels
• Informal or legally binding
• Descriptive tracking and reporting guidelines
• Detailed problem management procedures
• Detailed fees and expenses
• Customer duties and responsibilities
• Disaster recovery procedures
• Agreement termination clauses

In outsourcing, a customer transfers partial business responsibilities to an external service provider. The SLA serves as an efficient contracting tool for current and continuous provider-customer work phases.

What Are Key Components of an SLA?
The SLA should include components in two areas: services and management. Service elements include specifics of services provided (and what's excluded, if there's room for doubt), conditions of service availability, standards such as the time window for each level of service (prime time and non-prime time may have different service levels, for example), responsibilities of each party, escalation procedures, and cost/service tradeoffs. Management elements should include definitions of measurement standards and methods, reporting process, contents and frequency, a dispute resolution process, an indemnification clause protecting the customer from third-party litigation resulting from service level breaches (this should already be covered in the contract, however), and a mechanism for updating the agreement as required. This last item is critical; service requirements and vendor capabilities change, so there must be a way to make sure the SLA is kept up-to-date.

Service Elements cover the „WHATs“; Management Elements cover the „HOWs“.

Service Elements
Service Elements communicate:
• (What) services are to be provided (and/or NOT to be provided)
• (What are) the conditions of service availability
• (What are) the service standards
• (What are) the responsibilities of both parties

Management Elements
Management Elements communicate:
• How service effectiveness will be tracked
• How information about service effectiveness will be reported and addressed
• How service-related disagreements will be resolved
• How the parties will review and revise the agreement

What should I consider when selecting metrics for my SLA?
Choose measurements that motivate the right behavior. The first goal of any metric is to motivate the appropriate behavior on behalf of the client and the service provider. Each side of the relationship will attempt to optimize its actions to meet the performance objectives defined by the metrics. First, focus on the behavior that you want to motivate. Then, test your metrics by putting yourself in the place of the other side. How would you optimize your performance? Does that optimization support the originally desired results?
Ensure that metrics reflect factors within the service provider's control. To motivate the right behavior, SLA metrics have to reflect factors within the outsourcer's control. A typical mistake is to penalize the service provider for delays caused by the client's lack of performance. For example, if the client provides change specifications for application code several weeks late, it is unfair and demotivating to hold the service provider to a prespecified delivery date. Making the SLA two-sided by measuring the client's performance on mutually dependent actions is a good way to focus on the intended results.
Choose measurements that are easily collected. Balance the power of a desired metric against its ease of collection. Ideally, the SLA metrics will be captured automatically, in the background, with minimal overhead, but this objective may not be possible for all desired metrics. When in doubt, compromise in favor of easy collection; no one is going to invest the effort to collect metrics manually.
Less is more. Despite the temptation to control as many factors as possible, avoid choosing an excessive number of metrics, or metrics that produce a voluminous amount of data that no one will have time to analyze.
Set a proper baseline. Defining the right metrics is only half of the battle. To be useful, the metrics must be set to reasonable, attainable performance levels. Unless strong historical measurement data is available, be prepared to revisit and readjust the settings at a future date through a predefined process specified in the SLA.

Factors that Affect the Timeline of SLA Implementation
• The service environment
• The proximity of the parties
• The span of impact of the SLA
• The relationship between the parties
• The availability of a model
• Prior SLA experience

The SLA should address the following …
• A brief service description
• Validity period and/or SLA change control mechanism
• Authorisation details
• A brief description of communications, including reporting
• Contact details of people authorized to act in emergencies, to participate in incidents and problem correction, recovery and workaround
• Business or service hours (e.g. 08:00 to 17:00), date exceptions (e.g. weekends, public holidays), critical business definitions, ..
• Scheduled and agreed service interruptions, including notice to be given and number per period
• Customer responsibilities (e.g. security)
• Service provider liability and obligations (e.g. security)
• Impact and priority guidelines
• Escalation and notification process
• Complaints procedure
• Service targets
• Workload limits (upper and lower), e.g. the ability of the service to support the agreed number of users/volume of work, system throughput
• High-level financial management details, e.g. charge codes etc.
• Actions to be taken in the event of service interruption
• Housekeeping procedures
• Glossary of terms
• Supporting and related services
• Any exceptions to the terms given in the SLA

What Kind of Metrics Should be Monitored?
Many items can be monitored as part of an SLA, but the scheme should be kept as simple as possible to avoid confusion and excessive cost on either side. In choosing metrics, examine your operation and decide what is most important. The more complex the monitoring (and associated remedy) scheme, the less likely it is to be effective, since no one will have time to properly analyze the data. When in doubt, opt for ease of collection of metric data; automated systems are best, since it is unlikely that costly manual collection of metrics will be reliable. Depending on the service, the types of metric to monitor may include:
Service availability: the amount of time the service is available for use. This may be measured by time slot, with, for example, 99.5 percent availability required between the hours of 8 am and 6 pm, and more or less availability specified during other times. E-commerce operations typically have extremely aggressive SLAs at all times; 99.999 percent uptime is not an uncommon requirement for a site that generates millions of dollars an hour.
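As a quick back-of-the-envelope reading of such availability targets, each percentage translates into a downtime budget per period. The snippet below is pure arithmetic, not figures from any particular SLA.

```python
# Translate availability targets into an allowed-downtime budget per year (simple arithmetic).
MINUTES_PER_YEAR = 365 * 24 * 60

for target in (99.5, 99.9, 99.99, 99.999):
    allowed = MINUTES_PER_YEAR * (1 - target / 100)
    print(f"{target}% availability -> about {allowed:.0f} minutes of downtime per year")
```

For example, 99.5 percent allows roughly 2,600 minutes (about 44 hours) per year, while 99.999 percent allows only about 5 minutes.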
Defect rates: counts or percentages of errors in major deliverables. Production failures such as incomplete backups and restores, coding errors/rework, and missed deadlines may be included in this category.
Technical quality: in outsourced application development, measurement of technical quality by commercial analysis tools that examine factors such as program size and coding defects.
Security: in these hyper-regulated times, application and network security breaches can be costly. Measuring controllable security measures such as anti-virus updates and patching is key to proving that all reasonable preventive measures were taken, in the event of an incident.

SLA Objectives – example
Your Company, Inc. IT Help Desk Service Level Agreement
Provider of Service: XXX IT Help Desk staff
Type of Service: IT Help Desk primary first-level support
Service Period: January 1, 20.. through December 31, 20..
Performance: In order to provide optimal first-level support service to all departments, all problem and repair calls must be received by the Help Desk. The company XXX IT HELP DESK will provide (Customer Name/Department Name) with the following support:
First-level problem determination, where
1. All problems will be recorded.
2. Problems will be resolved or assigned to the appropriate specialist.
3. Problems will be monitored.
4. Users will be notified of commitment times and any problems that occur in meeting the established commitment.
5. Problem resolution will be documented and available in report status.
6. Monthly reports will be provided.
A single point of contact with the XXX department for
1. Orders for new equipment.
2. Equipment moves, adds, and changes (equipment includes personal computers, printers, and telephones).
3. Services such as data entry, building access authorizations, new computer user IDs and passwords, voice mail, Centrex lines, mainframe connections, file server connections, reports, and application program problems and requests.
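The monthly reporting commitment in the sample agreement above could be produced from ticket records along the following lines. This is only a sketch; the field names and figures are hypothetical and not part of the sample SLA.

```python
# Hypothetical monthly help-desk report: first-level resolution rate and average resolution time.
tickets = [
    {"id": 1, "resolved_first_level": True,  "resolution_hours": 2.0},
    {"id": 2, "resolved_first_level": False, "resolution_hours": 9.5},
    {"id": 3, "resolved_first_level": True,  "resolution_hours": 0.5},
]

first_level = sum(t["resolved_first_level"] for t in tickets)
avg_hours = sum(t["resolution_hours"] for t in tickets) / len(tickets)

print(f"First-level resolution rate: {100 * first_level / len(tickets):.0f}%")
print(f"Average resolution time: {avg_hours:.1f} h")
```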
SMART service levels
S – Specific: Service levels should be straightforward and emphasise what you want to happen.
M – Measurable: If a service level cannot be measured, then you cannot determine whether it has been achieved.
A – Achievable/Acceptable: It must be possible to achieve the service level with an investment of time and resources.
R – Relevant: Achieving the service level must contribute to the overall business mission.
T – Timely: The service level must be something that can be achieved and measured over the reporting period of the SLA.

ITIL focuses on three types of options for structuring SLAs: service-based, customer-based, and multi-level or hierarchical SLAs. Many different factors will need to be considered when deciding which SLA structure is most appropriate for an organization to use.

Key Performance Indicators (KPIs) are the critical (key) indicators of progress toward an intended result. KPIs provide a focus for strategic and operational improvement, create an analytical basis for decision making and help focus attention on what matters most. As Peter Drucker famously said, "What gets measured gets done." Managing with the use of KPIs includes setting targets (the desired level of performance) and tracking progress against that target. Managing with KPIs often means working to improve leading indicators that will later drive lagging benefits. Leading indicators are precursors of future success; lagging indicators show how successful the organization was at achieving results in the past.

Good KPIs:
• Provide objective evidence of progress towards achieving a desired result
• Measure what is intended to be measured to help inform better decision making
• Offer a comparison that gauges the degree of performance change over time
• Can track efficiency, effectiveness, quality, timeliness, governance, compliance, behaviors, economics, project performance, personnel performance or resource utilization
• Are balanced between leading and lagging indicators

What are KPIs used for? What is a KPI (definition of Key Performance Indicators)?
A set of quantifiable measures that a company or industry uses to gauge or compare performance in terms of meeting their strategic and operational goals. KPIs are used by individuals and organisations to evaluate their success at reaching critical targets. High-level KPIs may focus on company-wide performance, while low-level KPIs may focus on processes within departments, teams or individuals. Use a KPI when you need to track progress toward a goal over a period of time.

The relative business intelligence value of a set of measurements is greatly improved when the organization understands how various metrics are used and how different types of measures contribute to the picture of how the organization is doing. KPIs can be categorized into several different types:
• Inputs measure attributes (amount, type, quality) of resources consumed in processes that produce outputs
• Process or activity measures focus on the efficiency, quality, or consistency of specific processes used to produce a specific output; they can also measure controls on that process, such as the tools/equipment used or process training
• Outputs are result measures that indicate how much work is done and define what is produced
• Outcomes focus on accomplishments or impacts, and are classified as Intermediate Outcomes, such as customer brand awareness (a direct result of, say, marketing or communications outputs), or End Outcomes, such as customer retention or sales (that are driven by the increased brand awareness)
• Project measures answer questions about the status of deliverables and milestone progress related to important projects or initiatives
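To illustrate "tracking progress toward a goal over a period of time", a KPI can be reduced to a name, a target and a series of measurements. The KPI name and the numbers below are invented for the example.

```python
# Sketch of tracking a KPI against its target over several months (invented data).
kpi_name = "First Call Resolution Rate (%)"
target = 75.0
actuals = {"Jan": 68.0, "Feb": 71.5, "Mar": 76.2}

for month, value in actuals.items():
    status = "on target" if value >= target else "below target"
    print(f"{month}: {kpi_name} = {value:.1f} ({status}, {value - target:+.1f} vs target)")
```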
KPIs vs. SLAs
KPIs and SLAs both provide beneficial information for organizations to use during the decision-making process. However, SLAs establish the baseline performance expectations and monitor the agreement and what is done to meet the expectations. Conversely, KPIs report on the efficiency or success in satisfying expectations or achieving organizational goals. SLAs and KPIs have unique specific purposes. SLAs ensure performance metrics stay above a specific level of success. KPIs, however, promote optimal performance and help ensure improvements occur to deliver the expected results.

Every organization needs both strategic and operational measures, and some typically already exist. The figure depicts strategic, operational and other measures as described below:
• Strategic Measures track progress toward strategic goals, focusing on intended/desired results of the End Outcome or Intermediate Outcome. When using a balanced scorecard, these strategic measures are used to evaluate the organization's progress in achieving its Strategic Objectives depicted in each of the following four balanced scorecard perspectives:
  o Customer/Stakeholder
  o Financial
  o Internal Processes
  o Organizational Capacity
• Operational Measures, which are focused on operations and tactics, and designed to inform better decisions around day-to-day product/service delivery or other operational functions
• Project Measures, which are focused on project progress and effectiveness
• Risk Measures, which are focused on the risk factors that can threaten our success
• Employee Measures, which are focused on the human behavior, skills, or performance needed to execute strategy
An entire family of measures, including those from each of these categories, can be used to help understand how effectively strategy is being executed.

Measurements: strategic, operational, project, risk, employee.
[Diagram: operational measurements – Inputs (e.g. FTEs, budgets, ..), Process (e.g. efficiency output/input, cycle time, cost per unit, ..), Outputs (e.g. widgets produced, brochures produced, …), Project (e.g. schedule, resources, scope, risk, …); strategic measurements – Intermediate Outcomes (e.g. widget sales, awareness, ...), Final Outcomes (e.g. profitability, program impact, ..). Business intelligence value and strategic impact increase from operational towards strategic measurements.]

IT metrics are quantitative assessments that enable organizations to understand the performance of their IT initiatives. Key Performance Indicators (KPIs) are a subset of metrics that illustrate how effectively specific business objectives associated with IT performance are achieved.

What do IT metrics track?
Even when technology is not a core business of your organization, you can take advantage of vast volumes and variety of data to make well-informed strategic decisions. The strategic significance of IT metrics can be described in two domains:
Technology Performance. Data generated by connected technologies, IT infrastructure, and technology systems can be gathered, processed, and analyzed to identify technology performance. This information can be used to maintain the efficiency of technology systems at lower costs.
Business Performance. This data also contains hidden insights on the impact of strategic choices on business performance. In order to drive the best business outcomes, IT must identify and analyze metrics that correlate highly with business performance.
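One hedged illustration of "metrics that correlate highly with business performance": given paired monthly observations of an IT metric and a business metric, the strength of the relationship can be estimated directly. The sample data below is invented, and the choice of metrics is only an example.

```python
# Estimate how strongly an IT metric tracks a business metric (invented monthly data).
from statistics import correlation  # Pearson correlation, available in Python 3.10+

uptime_pct = [99.2, 99.6, 99.9, 98.7, 99.8, 99.5]   # IT metric: monthly service uptime
orders_k   = [410,  455,  480,  360,  470,  440]    # business metric: monthly orders (thousands)

print(f"Correlation between uptime and orders: {correlation(uptime_pct, orders_k):.2f}")
```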
[Diagram: measurements and metrics link activities and actions to products or outcomes and to decision making, with direct and indirect influence.]

Some important IT metrics and KPIs fall into the following domains and categories:

Operational Metrics
Operational KPIs help organizations track performance over a pre-defined period or in real time. These metrics are associated with a range of business functions. But, in the domain of enterprise IT, operational metrics focus primarily on the performance of IT resources and functions. These resources include the workforce, technologies and services used to conduct business operations or enable products and services for end users. Examples of operational metrics include:
• HR: Workforce productivity, overtime hours, employee turnover rate, cost of hiring and training.
• IT Infrastructure: Infrastructure downtime, frequency of production deployments, number of workloads processed, capital and expense cost, resource availability.
• IT Solutions and Services: Service uptime, availability, reliability, cost per user, cost per user acquisition, network outages.
• ITSM and Service Desk: Service availability, First Call Resolution Rate, cost per contact, SLA breach rate, user satisfaction.
Operational metrics overlap with a range of categories that focus on unique aspects of organizational operations driven by technology. These metrics evaluate how the resources made available to various functions of the organization contribute to the overall business performance.

Metrics definition workflow (from the slide):
1. Understand requirements and outcomes – business requirements and expectations; service provider requirements.
2. Determine metrics – metrics providing evidence of actions and outcomes based on requirements.
3. Verify metrics – evaluate metrics against requirements and expectations. Do the metrics fulfil the need? If no, return to step 2; if yes, continue.
4. Determine tool(s) for measuring and collecting metrics – review tools already in-house or determine if a tool purchase is necessary.
5. Re-evaluate metrics with tool(s) – ensure the tool can collect all required metrics.
6. Implement metrics – gather the metrics, process the metrics, analyze the metrics, and find opportunities for improvement.

System Reliability
IT systems, including hardware infrastructure and applications, must operate reliably – every second of downtime incurs revenue losses. The metrics associated with system reliability help organizations evaluate historical performance and predict future performance. These metrics not only empower IT teams to perform upgrades and maintenance activities proactively but also give the business confidence to scale operations and pursue new business opportunities that rely on stable and reliable IT systems performance. These metrics are mostly focused on technology performance and require additional layers of analytics to correlate them with business performance.
Common examples of system reliability metrics include:
• Outages: Mean Time to Resolve (MTTR), Mean Time to Failure (MTTF), frequency and schedule of planned and unplanned outages, redundancy levels for power and utility supplies, hardware assets
• Network: Capacity, latency, incidents
• Procurement: Hardware resources that are not easily replaced by strategic suppliers and standard channels of procurement
• Cost: Operational and capital expenses, cost per user, cost per unit asset such as data storage
• Security: Data breaches and network infringements encountered and deflected, security policy adherence, cybersecurity awareness training drills and results

IT Support and Customer Expectations
Measuring user satisfaction helps organizations identify operational and performance issues within their organization and its IT resources. The IT service desk, for instance, is established to ensure that IT services are delivered effectively to internal and external end users. The performance of the IT service desk has a direct correlation with the organizational capacity to deliver the expected services and the satisfaction levels of end users. Some common examples of metrics that track IT support and customer experience include:
• Service Availability: How readily the promised service is made available to the end user, as per expected performance, quality and dependability. Repeated outages and recurring technical and security issues compromise service availability and hence customer expectations of the service quality.
• Common IT Metrics:
  o Mean Time to Resolve (MTTR) represents the average time taken to resolve a ticket.
  o Mean Time Between Failures (MTBF) represents the time between failures.
  o Mean Time to Failure (MTTF) represents the system uptime after a possible issue has been resolved.
  o These metrics must be evaluated collectively: a dependable service fails less frequently, resolves fast after a failure and remains available for a prolonged duration.
• Service Desk Metrics: First Call Resolution Rate, Cost per Contact, Customer Satisfaction, Net Promoter Score, Agent Satisfaction

Service Level Agreements (SLAs)
Most organizations procure a range of IT services and technology solutions. SLAs oblige vendors to deliver promised service levels defined by specific metrics. Failure to comply with these metrics not only penalizes vendors but also impacts end users and, therefore, revenue-generating opportunities. It is therefore critical to identify the metrics that best describe the required performance levels. Examples of SLA metrics include:
• SLA Compliance Ratio: The ratio between the number of incidents resolved in compliance with the SLA and the total number of incidents
• SLA Performance: Uptime, scale up/down capacity, response time, storage
• SLA Verification: Defect rates, technical quality, security, business results/KPIs, number of complaints
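A small sketch tying together two of the metrics listed above: steady-state availability estimated from MTBF and MTTR (a commonly used approximation, availability = MTBF / (MTBF + MTTR)), and the SLA compliance ratio. All figures are invented for illustration.

```python
# Availability estimated from MTBF/MTTR, plus an SLA compliance ratio (invented figures).
mtbf_hours = 700.0     # mean time between failures
mttr_hours = 2.5       # mean time to resolve/repair

availability = mtbf_hours / (mtbf_hours + mttr_hours)
print(f"Estimated availability: {availability:.4%}")

incidents_total = 120
incidents_within_sla = 111
print(f"SLA compliance ratio: {incidents_within_sla / incidents_total:.1%}")
```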
Financial Metrics
The performance of IT is a natural tradeoff with financial investments. It is therefore critical to evaluate the financial performance of an IT initiative. The optimal tradeoff between cost, performance, security and other ROIs should be considered when designing and evaluating IT metrics and KPIs. Some common financial metrics include:
• Cost: Cost of budget, budget variance, resource cost, maintenance and support expenses
• Scheduling: Scheduling variance and schedule overhead
• Risk: Legal, societal, and natural causes impacting business performance or the ability to scale product functionality and services to extended marketplaces

Finding the right IT metrics strategy is not about deploying sophisticated analytics technologies. It is about identifying the metrics that yield the most insightful knowledge that can help align IT with desired business goals. Specifically, the chosen IT metrics and KPIs should help organizations find accurate and actionable answers to the following questions:
• How productive is the IT staff using the available technology resources?
• Are end users and customers satisfied with the available services and support?
• How dependable is the performance of core products and technology solutions?
• Are the IT projects delivered efficiently and effectively?

[Diagram: cascade from Vision and Mission through Goals and Objectives (strategic – high level, business-focused), Critical Success Factors and Key Performance Indicators (tactical – detailed statements addressing requirements for success), down to Metrics and Measurements (operational – provide proof of execution); guidance and direction flow down as input, feedback and responses flow up as output.]

A metric framework is a set of definitions of metrics. There are hidden dangers here if both AST and DT are only loosely described. It is all too easy for differences in views to hamper understanding and this, in turn, may lead to an ambiguous set of commitments. AST is relatively easy to agree and from the outset should be revisited on a regular basis to ensure that it does actually mean what it says. There will often be times when the service provider has to rejig their service offering to support their customers, meaning that the original AST is now compromised. The onus would be on the SLM to review consistently how customers are using services in relation to set expectations and to instigate the necessary reworking of SLAs as appropriate.

If AST is relatively easy to agree and set, DT is a potential nightmare. Is it referring to a loss of service that affects all users? Is the scope intended to include a partial failure only affecting one person? Does it include maintenance time? How do we actually measure DT? Does the customer have the ability to measure this in the same way you can? Do they need to? There is, of course, no easy answer apart from the hard work involved in building a strong relationship with the customer. It is to no good end that an SLM creates a service level (for example, availability of service), measured only from a provider perspective, and then presents what appears to be a compliant service at the review meeting.

Measurement formula – example:
Availability [%] = ((AST - DT) / AST) × 100
where AST = agreed service time and DT = downtime.

It is better to agree in advance the criteria for measuring the service levels that can be accomplished by the service provider, no matter how high-level or simple they may be, then review them and make changes as appropriate. This approach of test-and-review will work for other commitments, too – reliability, security, support, capacity, throughput, response times and continuity. Service levels can be measured by both parties – and it must be stressed that both parties have a responsibility to do this – and create a base understanding to build on. I am not advocating changing service levels all the time; this would prove to be counterproductive. However, in the early life of an SLA, an approach of regular review and small changes on a frequent basis will provide long-term benefits for relationships and trust.
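Applying the measurement formula above to concrete numbers, as a quick sanity check (the AST and DT values are invented):

```python
# Availability [%] = (AST - DT) / AST * 100, with AST = agreed service time, DT = downtime.
ast_hours = 200.0   # agreed service time in the reporting period (invented)
dt_hours = 1.5      # measured downtime within agreed service time (invented)

availability_pct = (ast_hours - dt_hours) / ast_hours * 100
print(f"Availability: {availability_pct:.2f}%")   # 99.25%
```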
Results have to be reported – information is crucial for the overall success – ranging from simple reports to complex analysis, even with predictions. Modern ITSM tools should provide integrated reporting functionality. An online dashboard function is also required.

Example – Tableau SW: specific software for KPI and SLA results visualization and "dashboarding".

What shall we talk about next?
ITSM future directions (May 2nd). Do not forget about the visits to Kyndryl's center (April 18th and 25th at 16:00).

What about the exam?
Dates:
• May 23rd 16:00
• May 30th 16:00 (T.B.C.)
• June 20th 16:00