Metrics: How to Improve Key Business Results
Customer Satisfaction was another example of data that was easy to attain. The department utilized a third-party survey company that sent survey invitations (via e-mail) to the customer of every closed trouble call. The vendor collected the responses, tabulated the results, and sent a weekly report to the department. I was given a monthly copy of the results for inclusion in the Report Card. All of the measures we had identified were collectable unobtrusively, with no disruption of the department's workflow. This is not always possible, but it is always a goal.
Since most of the data was attainable through automated systems, the data was readily available and largely insulated from human error. In the few places where humans did interact with the data, the team's desire to produce accurate information made us very happy to have this service as our flagship.
Recall the Metric Development Plan. We realized it was important to identify the source, the data (each component), how and when to collect it, and how to analyze it. In the case of the Service Desk, much of this was already done. Table 9-3 takes the categories and measures identified for the Service Desk and further breaks them down into the data needed, where that data will be found, and some basic analysis of the data. This analysis can be programmed into a software tool for display.
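To make that last point concrete, here is a minimal sketch of how a breakdown like Table 9-3 could be encoded for a reporting tool. This is my illustration only; the field names, the sample entry, and the numbers are hypothetical, not the department's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class MeasureDefinition:
    category: str                     # e.g., Availability, Speed, Accuracy, Usage
    measure: str                      # the measure to report
    data_needed: Sequence[str]        # each data component
    source: str                       # where the data will be found
    analysis: Callable[[Sequence[float]], float]  # basic analysis to run

# Hypothetical entry: abandoned call rate pulled from the automated call system.
abandoned_rate = MeasureDefinition(
    category="Availability",
    measure="Abandoned call rate (%)",
    data_needed=["calls received", "calls abandoned"],
    source="automated call system logs",
    analysis=lambda counts: 100.0 * counts[1] / counts[0],
)

print(abandoned_rate.analysis([1200, 250]))  # roughly 20.8 percent
```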
This was our starting point. We identified these measures easily. Some of them were already being collected and analyzed. As with most things, there were other options to choose from. After looking at the analysis, we reevaluated our draft of the measures.
Let's return to the Metric Development Plan, which consists of the following: Purpose statement.
How the metrics will be used.
How the metrics won't be used.
Customers of the metrics.
Analysis.
Schedules.
A Picture for the Rest of Us.
Prose.
Purpose statement. Our purpose statement was defined for us: how can we communicate to our leadership how healthy our services and products are?
How it will be used. You might think this answer would be simple and obvious; the metrics would be used to answer the question. In this particular case, they would be used to communicate the health of the Service Desk, from the customer's point of view.
How it won't be used. Most people expect this to be obvious as well and hope not to have to answer the question. Of course, I had to answer it.
It would not be used to differentiate between analysts.
It would not be used for performance reviews.
It would not be used to push the team of analysts to reach different levels of performance; in other words, the measures wouldn't become targets to be achieved.
Customers of the metrics. The customers of this metric (Health of the Service Desk) were first and foremost the Service Desk itself. The manager and the analysts were the owners of the data, and they were the "rightful" owners of the information derived from it. Other customers were the director of our support services (to whom the manager answered), the CIO, and finally the executive. Each customer needed different levels of information.
The data owners (analysts and manager) could benefit from even the lowest levels of the data. The director would need to see the anomalies. She would want to know what the causes of those anomalies were. The CIO would want to know about anomalies that required his level of involvement. If the Service Desk determined that it needed an upgrade to its phone system, a new automated call system, or an expert system, the funding would have to be approved by the CIO. The data would help support these requests.
The CIO would also want to know about any trends (positive or negative), or anomalies that might reflect customer dissatisfaction. Basically, the CIO would want to know about anomalies that his boss (the executive) might ask about. Most of the time the executive would ask because a key customer or group of customers complained about a problem area. The CIO shouldn't hear about the anomaly from his boss.
The same can be said of the executive. If the service's health was below expectations, and it ended up reflecting back on the parent organization, the executive would rightly want to know why and what was being done to make things better (either repeat the positive experiences or eliminate the negative).
Analysis. Besides the planned analysis, the resulting information would have to be analyzed for trends and meaning. Now that we had the groundwork laid out, it was time to dive a little deeper. We had to collect the data and analyze it to ensure our initial guesses about what we'd use were on target.
Availability.
We started with the abandoned call rate for the service. When we looked at the data shown in Figure 9-1, I asked the manager (and the staff) to perform a simple litmus test: did she think the department was unresponsive to the customer? Was the abandoned rate too high? If it was higher than expected, was it accurate? If it was, why was it so high?
Figure 9-1. Abandoned call rate
The manager had heard many times before that abandoned rates were standard measures of performance for call centers. When she looked at the data she said it "didn't feel right." Not because it cast the department in an unfavorable light, but because she had confidence that her unit was more responsive to the needs of the customer than the rate showed (the data showed that the department was "dropping" more than two out of every ten calls).
This prompted the proper response to the measures: we investigated. We looked at two facets: the processes and procedures used to answer calls, and the raw data the system produced. The process showed that calls that were not answered within two rings were sent to an automated queue. This queue started with a recording, informing the caller that all analysts were busy and one would be with the caller shortly. Tellingly, the recording also provided information about any known issues with the IT services; for example, that the current network outage was being worked and service should be back shortly. Most days, the recording's first 30 seconds conveyed information that may have satisfied the callers' needs.
Upon further inspection, we found that the raw data included the length of the call (initiation time vs. abandoned time). This allowed us to pull another measure, as shown in Table 9-4.
The measure was charted in Figure 9-2. We looked at it compared to the total abandoned rate to see if it told a clearer story.
Figure 9-2. Percentage of abandoned calls less than 30 seconds in duration
As with all measures, a major question is how to communicate the measure (what graphical representation to use). In the case of Availability, we started with an Abandoned Call Rate in the form of Percentage of Calls Abandoned. When we added the more specific Calls Abandoned in Less Than 30 Seconds, we again used a percentage. We had a choice: show it in relation to the total, or show only the Percentage of Abandoned Calls with the qualification that "abandoned" was defined as calls abandoned after 30 seconds.
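For readers who want the arithmetic spelled out, here is a small sketch of how the two percentages could be computed from raw call records. The record fields and sample values are hypothetical; the department's phone system export looked different.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class CallRecord:
    answered: bool        # did an analyst pick up?
    wait_seconds: float   # time from call initiation to answer or hang-up

def availability_measures(calls: Iterable[CallRecord]) -> tuple[float, float]:
    """Return (total abandoned rate %, share of abandoned calls under 30 seconds %)."""
    calls = list(calls)
    abandoned = [c for c in calls if not c.answered]
    abandoned_rate = 100.0 * len(abandoned) / len(calls) if calls else 0.0
    under_30 = sum(1 for c in abandoned if c.wait_seconds < 30)
    under_30_share = 100.0 * under_30 / len(abandoned) if abandoned else 0.0
    return abandoned_rate, under_30_share

sample = [CallRecord(True, 12), CallRecord(False, 8), CallRecord(False, 95), CallRecord(True, 40)]
print(availability_measures(sample))  # (50.0, 50.0)
```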
After a year of looking at the measures in conjunction with improvements to the processes (including a shorter recording), the department chose to drop the Total Abandoned Rate and use only Abandoned Calls Less Than 30 Seconds. This was a better answer to the question of availability since it would allow for the following:
Wrong numbers.
Questions answered/problems resolved by the automatic recording.
Customers who changed their mind (they may have chosen to use the new e-mail or chat functions for assistance, or perhaps their problem solved itself while the customer was calling).
Granted, it was only an assumption that a caller who didn't wait more than 30 seconds was not disappointed by the wait, but we believed this would provide a more accurate account. This would have to be compared with Speed (see the additions to the Speed measures) and the customer satisfaction measure of "timeliness." Of course, the only customers filling out the survey were those who stayed on the phone long enough to have their call answered.
With a solid start on Availability, let's look at Speed. Speed started as Time to Resolve, which was known to be a concern with customers. Not that the organization was deficient in this aspect, but that the customer cared about how long it took to resolve an issue.
Speed.
We started with the open and close times for cases tracked in the trouble call tracking system. This data required human input. It required that the analyst be religious in his behaviors and adherence to the processes, procedures, and policies established around trouble-call tracking. If the manager were almost any other manager, I would have had to spend a considerable amount of time ensuring that the workforce understood that the information would not be misused and that it would be in the best interest of the department, each and every worker, and the manager for the data input into the system to be as accurate as possible, regardless of the story it told. If the analyst "fudged" the data so that it wouldn't "look so bad" or so that it "looked extra good," the information would be rendered useless. Wrong decisions could be made.
In this case, I trusted the manager and only spent a minimal amount of time communicating at a staff meeting the importance of accuracy in the data and how the resulting measures and information could be used to improve processes, and would not reflect on individual performance. The key point of this explanation was the same for all of the measures and all of the units I worked with:
The data, then measures, then information, and finally metrics should not reflect the performance of an individual.
The metrics, moreover, did not reflect on how efficiently the department was run.
What the data, measures, information, and metrics did clearly reflect was the customers' perception of the service. Regardless of what the "truth" was, the department would benefit from knowing the customers' perception. This is especially true in speed.
Depending on the tools you use for capturing speed, you can report the exact time it took to resolve a problem. For example, you could use the automated call system to identify the time the call was initiated (instead of when the analyst opened the case file in the trouble call system). You could also use the call system to log the day and time the analyst completed the final call to the customer, closing the case (assuming it took more than one call). You would have to have a means of connecting the case information to the call system. This level of accuracy is unnecessary in most cases; it is enough to log the time the case was opened and closed. But even if you had the accuracy described, the customer might perceive the resolution as taking longer than it actually did.
I can assure you, showing the customer "proof" that the case actually took less time than he perceived will do nothing to improve the customer's level of satisfaction or change his opinion of how fast the department works. To the customer, perception is reality. And while it is useful to know the objective reality, you also have to address the customers' perception.
Even if you meet your Service Level Agreements (SLAs), assuming you have them, a case may be perceived as taking longer than expected.
Note: If you can get SLAs for your services, they are great starting points for documenting the customers' expectations for a service or product. If you have them, stay alert. You may find that although the customer agreed to the SLA, it doesn't reflect their expectations as time passes. It may not even reflect their expectations the day it is written.
For our metric, I collected the time the case was opened and time closed. Our trouble call system had the capability of toggling a "stop clock" switch. This function captured the amount of time the switch was toggled in the "stop" position until it was toggled back to "active."
This was very useful as it allowed the worker to capture the span of time the customer did not expect work to be performed. This was not used to subtract evening hours or weekends when the desk was closed. The customer, while not necessarily expecting work to get done during this time, did consider "down time" as part of the total time to correct their problem. The stop clock was used for specific, well-defined instances, such as:
The customer was not available (on vacation, out of town), so the case, while resolved, could not be closed.
The customer requested that the resolution not be implemented until a specific time, such as delaying a software installation because she didn't want the upgrade to disrupt her current work.
In the case of a scheduled fix, I used another data point, scheduled resolution time, which was used as the start time in the formula. I rarely found the stop clock being used in this instance, although it could also work. If you use the stop clock or scheduled resolution times, you still must capture the start time and close time. You have to keep the source data because you never know when a customer will argue that the real time should have been from the time of the call, or you may find that you want to show how far in advance you're getting requests for future resolutions. Are you getting the request a day before? A week? Or a month before it's needed? What are your processes for dealing with these requests? How do you ensure the work gets done on schedule? The source data, including actual start and close times, will help in checking your processes.
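A small sketch of the formula just described follows. The function, field names, and timestamps are hypothetical; the trouble call system handled this internally.

```python
from datetime import datetime, timedelta

def time_to_resolve(opened, closed, stop_clock=timedelta(0), scheduled_start=None):
    """Elapsed resolution time, minus any stop-clock time.

    For a scheduled fix, the clock starts at the scheduled resolution time
    rather than when the case was opened; the source open/close times are
    still kept for checking the process.
    """
    start = scheduled_start if scheduled_start else opened
    return (closed - start) - stop_clock

# Example: case opened Monday 9:00, closed Wednesday 14:00,
# with 4 hours on the stop clock while the customer was unavailable.
opened = datetime(2010, 3, 1, 9, 0)
closed = datetime(2010, 3, 3, 14, 0)
print(time_to_resolve(opened, closed, stop_clock=timedelta(hours=4)))  # 2 days, 1:00:00
```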
So we started by looking at the Time to Resolve compared to the customers' expectations as shown in Figure 9-3.
Figure 9-3. Time to Resolve: the percentage of cases resolved within expectations
For Speed we also had to make a decision on how to represent the data. We could show the data as the number of calls resolved within expectations. This could also be a percentage. You may notice a trend here. Percentages are an easy way to depict data, especially when combined with the number of instances used to determine the percentage. Normally all that is required is the totals (the specifics can then be derived if necessary).
As with Availability, the service provider decided that part of the story was missing. Besides the time it takes to complete the work, the customer also cared about how long it took before they were able to talk to a living, breathing analyst instead of listening to the recording. So, we needed to collect data on time to respond. Table 9-5 shows the breakdown for Time to Respond. It's a good example of a measure that requires multiple data points to build.
During the definition of this measure, it became clear that there were other measures of Time to Respond. Besides the length of time before the analyst picked up the phone, there were also call backs for customers who left voicemail. Since the Service Desk was not open on weekends or after hours, customers leaving voicemail was a common occurrence. So, Time to Respond needed to include the time it took to call the customer back (and make contact). The expectations were based on work hours (not purely the time the message was left). If the customer left a message on a Friday at 5:15 p.m., the expectation wasn't that he'd be called back at 8:00 a.m. on Monday. The expectation, as always, would be a range: that the Service Desk would attempt to call him back within three work hours, for example. The tricky part was to determine if the call back had to be successful or if leaving a message on the customer's voicemail constituted contact, and therefore a response.
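Here is a sketch of how elapsed work hours could be computed for the call-back measure. The 8:00 a.m. to 5:00 p.m., Monday through Friday desk hours are an assumption for illustration; the actual hours and expectations were defined by the department.

```python
from datetime import datetime, time, timedelta

OPEN, CLOSE = time(8, 0), time(17, 0)   # assumed desk hours, Monday-Friday

def work_hours_between(start: datetime, end: datetime) -> float:
    """Work hours elapsed between two timestamps, skipping weekends and after hours."""
    total = timedelta(0)
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday (0) through Friday (4)
            window_open = datetime.combine(day, OPEN)
            window_close = datetime.combine(day, CLOSE)
            lo, hi = max(start, window_open), min(end, window_close)
            if hi > lo:
                total += hi - lo
        day += timedelta(days=1)
    return total.total_seconds() / 3600

# Voicemail left Friday 5:15 p.m.; analyst makes contact Monday 10:30 a.m.
left = datetime(2010, 3, 5, 17, 15)
called_back = datetime(2010, 3, 8, 10, 30)
print(work_hours_between(left, called_back))  # 2.5 work hours
```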
Time to Respond was added much later in the development of the metric. It wasn't used the first year. Flexibility is one of the keys to a meaningful metric program.
Accuracy.
The defects produced in and by the system are traditionally examined. Those caught before distribution are part of "efficiency." Those that reach the customer (and that the customer is aware of) are part of the measures we were developing. We needed ways to represent faulty production or service delivery. In the case of the Service Desk, I looked for a simple measure of accuracy.
Rework was an easy choice. Using the trouble-call tracking system, we could track the number of cases that were reopened after the customer thought they were resolved. As long as the customer saw the reopening of the case as rework, it would be counted. We found that occasionally cases were prematurely closed, by the Service Desk or by second-level support. Later the customer would call the Service Desk with the same problem that was believed to have been resolved. The analysts were doing an admirable job of reopening the case (rather than opening a new case, which would have made their total cases-handled numbers look better and kept accuracy from taking a hit). This honest accounting allowed the Service Desk to see themselves as their customers saw them. Table 9-6 shows the breakdown for Rework. You should notice that the analysis is very simple in all of these cases. In the appendix on tools I will briefly discuss the role statistics plays.
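The analysis really is that simple. As a sketch (with a made-up case schema, not the tracking system's actual fields), the Rework percentage boils down to this:

```python
def rework_rate(cases) -> float:
    """Percentage of closed cases that were later reopened.

    `cases` is any sequence of records with a boolean `reopened` flag;
    the field name is hypothetical, not the tracking system's actual schema.
    """
    if not cases:
        return 0.0
    reopened = sum(1 for case in cases if case["reopened"])
    return 100.0 * reopened / len(cases)

closed_cases = [{"reopened": False}, {"reopened": True}, {"reopened": False}, {"reopened": False}]
print(rework_rate(closed_cases))  # 25.0
```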
Figure 9-4 shows Rework. You may have noticed by now that the charts all look somewhat alike. If you look closer, you'll see they look exactly alike. The only difference so far has been the data (values) and the titles. This consistency should benefit you inasmuch as those reviewing the measures get used to the presentation method and how to read them.
Figure 9-4. Percentage of cases reopened.
Now is as good a time as any to point out how triangulation helps deter "chasing data." Organizations often find themselves chasing data to try to put a positive spin on everything, to have only good news. The analysts could have taken every legitimate rework call back and logged it in as a new case. Doing this could nearly (if not totally) eliminate rework. But it would artificially increase other data. The number of calls handled would increase. The number of cases worked would also increase. And customer satisfaction with the knowledge of the analyst would likely drop. If the customer knew it was rework but saw another auto-generated survey come across his e-mail for a new case, as if the worker didn't realize it was the same problem, the result would eventually be lower ratings for Skills/Knowledge of the analyst.
Of course, Speed to Resolve could also look better, since no cases would show the time it took to rework an issue. Without triangulation this could easily happen. Of course, I had an ace in the hole. Even without triangulation, the manager of the department and her workforce were all believers in serving their customers: not only providing the service, but doing so as well as possible. They were believers in continuous process improvement and in service excellence. Even without triangulation, I have total faith that this department would not chase the data.
But using triangulation, and not looking at measures in isolation, helps even less committed departments stay true to the customer's viewpoint.
If we saw spikes in other categories because of false reporting in one category, we'd find anomalies that would require explanation. Besides these anomalies, Rework would be so low (or non-existent) that it too would be an anomaly. This is another reason (besides wanting to replicate successes) we investigate results that exceed expectations, as well as those that fail to meet them.
Usage.
For Usage, we captured the number of unique customers each month and also ran it as a running total. Using the potential customer base, we were able to derive a percentage of unique customers using the Service Desk. I've heard arguments that some data is just impossible to get. In such cases I enjoy turning to Douglas W. Hubbard's book, How to Measure Anything (Wiley, 2007). Not because it has examples of all possible measures, but because the methods offered give readers confidence that they can literally measure anything.
You should be able to measure unique customers against the customer base, regardless of your service. If you are a national service desk, say for Microsoft or Amazon.com or Sam's Club, the customer base can still be determined. As you know by now, exact (factual) numbers may not be obtainable, but getting a very good and meaningful estimate is very feasible. The customer base can be estimated by determining how many sales of the software in question were made. If Microsoft sold 150,000 copies of a title, how many calls, from unique customers, were received about that software? Amazon.com has information on the total number of customers it has. The same can be said for Sam's Club, since it's a paid membership outlet. Walmart, McDonald's, and the neighborhood supermarket have a more difficult time.
In the case of Walmart and McDonald's, their national call centers can use the marketing data on "number of customers." Each has information that can be used. It may not identify unique customers, but even so, they have a good idea of how many repeat customers and total customers they have, so the numbers can be derived.
The neighborhood supermarket can determine either the number of customers or consider the populace of the neighborhood (based on a determined radius using the store as the center) as the potential customer base.
In the case of the Service Desk, we had identifying information for each customer (when provided). Of course, this data was only as accurate as the analyst's capture of it (misspellings of names could be an issue) and the honesty (or willingness) of the customer to provide it. One hundred percent accuracy wasn't necessary though. Good data was good enough.
Sometimes "good" data is good enough.
Looking at the data over a three-year period showed that customer base usage was pretty steady, and we felt more confident that anomalies would stand out. In Figure 9-5 you can see what we saw. The last year showed a steady increase, but not an unexpected one, since a lot of new technologies, software, and hardware were put into use that year.
Figure 9-5. Percentage of unique customers.
As with the two previous measurement areas, Usage was represented in percentages. Besides being easy to report, percentages are also easy to understand. Another benefit is consistency for the viewer. When you report multiple measures from various sources (triangulation), you can make it seem less complex by keeping the reporting of the measures as consistent as possible.
An immediate improvement to the Usage measure came through the inclusion of a survey result. We had revamped our annual customer satisfaction survey and added a question on how customers obtained assistance with information technology issues. We asked who the preferred source for help was: "When you have an IT problem, where do you turn to first?" The answer could be selected from the following choices:
Friend.
Coworker.
Internet (search engines, general information web sites, etc.).
Vendor/manufacturer web sites.
Hardware/software manufacturer service desk.
The IT service desk.
Other.
None of the above; I don't seek assistance.
While the data resulting from this survey didn't directly reflect actual usage, we used it in this category since it provided answers in the spirit of why we sought usage data. Were our customers happy with our service? Did they trust our Service Desk to provide what they needed? Was our Service Desk a preferred provider of trouble resolution?
As you'll see in Figure 9-6, one interesting result of the survey collection was the higher percentage in the first year.
Figure 9-6. Unique users as a percentage of customers who chose the Service Desk as their first choice for assistance
The only difference in the survey between 2007 and the following two years was the number of options the respondent was offered. In 2007, there were only three choices; in the two subsequent years there were eight.
Another Usage measure we could use to round out the picture was unique customers among the abandoned calls. Regardless of whether they were counted against availability (greater or less than 30 seconds), it would be useful to know how many unique customers were calling our Service Desk. In the case of those who didn't stay on the line to speak to an analyst and hung up before 30 seconds had expired, we were assuming that their need was satisfied. If they stayed on longer than 30 seconds and then hung up, we assumed their problems went unresolved.
So, for the calls under 30 seconds, we could count those as serviced customers. The abandoned calls could represent a significant number of customers missing from our data pool. The data would also be useful to the manager of the Service Desk in other ways. If the same number was calling and hanging up before an analyst responded, the manager could contact that number and see if there was a specific need that was going unfulfilled. This would require that the automated call system capture the calling number without the phone being answered by an analyst. At the time we created these metrics, this was not possible with the current phone system. So, the need was captured as a future requirement if and when the call system was changed.
Customer Satisfaction.
The last set of measures we used to round out our metric involved the classic customer satisfaction survey. Our organization had been using a third-party agency for the administration, collection, and tabulation of Trouble Call Resolution Satisfaction Surveys for a while, so the data was readily available. The questions were standardized for all users of the service, allowing us to compare our results to the average for the vendor's clients, both within our industry and overall. This pleased our management and higher leadership immensely. They thoroughly enjoyed the ability to compare their services against a benchmark. They liked it even more when their services compared favorably (as ours did).
Keeping to the same style of measure, percentages of a total, was possible. We could have reported the percentages of fives, fours, threes, twos, and ones, but this would have proven too complex, especially across four questions. Instead we opted to show the percentage of customers who were "satisfied." We defined satisfied as a 4 or 5 on the 5-point scale.
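As a final sketch, assuming a flat list of 1-to-5 ratings for a single survey question, the calculation is simply:

```python
def percent_satisfied(ratings, threshold: int = 4) -> float:
    """Share of responses at or above the 'satisfied' threshold (4 or 5 on a 5-point scale)."""
    if not ratings:
        return 0.0
    return 100.0 * sum(1 for r in ratings if r >= threshold) / len(ratings)

print(percent_satisfied([5, 4, 3, 5, 2, 4]))  # about 66.7 percent satisfied
```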