There is a framework we use for picking metrics called “Goals-Signals-Metrics.”
Basically, first you decide on the goals you want to achieve for your product or system. Then you decide on the signals you want to examine that would tell you whether you’re achieving those goals: essentially, these are what you would measure if you had perfect knowledge of everything. Finally, you choose metrics that give you some proxy or approximation of each signal, since few signals can be measured perfectly.
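To make the shape of this concrete, here is a minimal sketch in Python of how a team might write down a goal, its signals, and the proxy metrics chosen for each signal. The structure is the point; the class names and the example wording are purely illustrative and not part of the framework itself.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Metric:
    """A number, tracked over time, that can actually be measured."""
    name: str
    description: str


@dataclass
class Signal:
    """What you would measure with perfect knowledge; not necessarily measurable."""
    statement: str
    metrics: list[Metric] = field(default_factory=list)  # proxies for this signal


@dataclass
class Goal:
    """The end state you want to bring about, stated specifically enough to be measurable."""
    statement: str
    signals: list[Signal] = field(default_factory=list)


# Illustrative only: a reliability goal with one signal and one proxy metric.
reliability = Goal(
    statement="Our users' experience of our products is free from bugs, "
              "performance issues, and downtime.",
    signals=[
        Signal(
            statement="Users never have to wait a long time for our products to respond.",
            metrics=[Metric("p95_page_load_time",
                            "95th-percentile page load time, tracked per day")],
        ),
    ],
)
```

Nothing about the framework requires code, of course; the same structure works just as well written out in a doc or a spreadsheet.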
A goal should be framed as the thing that you want your team, product, or project to accomplish. It should not be framed as “I want to measure X.”
Most of the trouble that teams have in defining metrics comes from defining unclear goals.
Webster’s Third New International Dictionary defines a “goal” as:
The end toward which effort or ambition is directed; a condition or state to be brought about through a course of action.
This needs to be stated in a fashion specific enough that it could be measured. Not that you have to know in advance what all the metrics will be, but that conceptually, a person could know whether you were getting closer to (or further from) your goal.
In developing a goal, you can start with a vague statement of your desire. For example:
LinkedIn’s systems should be reliable.
However, that’s not measurable. So the first thing you do is clarify definitions.
First off, what does “LinkedIn’s systems” mean? What does “reliable” mean? How do we choose which of our systems we want to measure?
Well, to figure this out, we have to ask ourselves why we have this goal. The answer could be “We are the team that assures the reliability of all the products used by LinkedIn’s users.” In that case, our goal becomes:
The products that are used by LinkedIn’s users should be reliable.
That’s still not measurable. Nobody could tell you, concretely, if you were accomplishing that goal. Here’s the remaining problem: what does “reliable” mean?
Once again, we have to ask ourselves why we have this goal. To do this, we might look at the larger goals of the company. At LinkedIn, our top-level vision is: “Create economic opportunity for every member of the global workforce.” This gives us some context to define reliability: we somehow want to look at things that prevent us from accomplishing that vision. Of course, in a broad sense, there are many factors that could prevent us: social, cultural, human, economic, etc. So we say, well, our scope is what we can do technically with our software development and production-management processes, systems, and tools. What sort of technical issues would members experience as “unreliability” in that context? Probably bugs, performance issues, and downtime.
So we could update our goal to be:
LinkedIn’s users have an experience of our products that is free from bugs, performance issues, and downtime.
We could get more specific and define “bug,” “performance issue,” and “downtime,” if it’s not clear to the team what those specifically mean. The trick here is to get something that’s clear and measurable without it becoming very long. If you wanted to clarify those terms, I would recommend creating a sub-goal for each one: keep this as the overall goal, and then state three more goals, one for bugs, one for performance issues, and one for downtime, that spell out more precisely what each of those terms means.
One of the things that you’ll notice about this exercise is that it not only helps us define our metrics, it also helps clarify the most important work we should be doing. For example, this goal tells us that we should be paying more attention to the experience of our users than to the specific availability numbers of low-level services (even though we might care about those, too).
If you aren’t sure how to measure something, it’s very likely that you haven’t defined what it is that you are measuring. This was the problem with “developer productivity” measurements from the past: they didn’t define what “developer productivity” actually meant, concretely, exactly, in the physical universe. They attempted to measure an abstract nothing, so they had no real metrics. This is why it is so important to understand and clarify your goals before you start to think about metrics.
Signals are what you would measure if you had perfect knowledge—if you knew everything in the world, including everything that was inside of everybody else’s mind. These do not have to actually be measurable. They are a useful mental tool to help understand the areas one wants to measure.
Signals are the answer to the question, “How would you know you were achieving your goal(s)?”
For example, some signals around reliability might be:

- Our users never encounter bugs in our products.
- Our users never experience slowness or other performance problems in our products.
- Our products are always up and available when our users want to use them.
The concept of “signals” is also useful for distinguishing them from “metrics” (things that can actually be measured). Sometimes people will write down a signal in a doc and claim it is a metric; having a distinct term for each helps clear up that confusion.
Metrics are numbers over time that can actually be measured. A metric has the following qualities:

- It is a number (or a set of numbers).
- It is tracked over time.
- It can actually be measured.
All metrics are proxies for your signal. There are no perfect metrics. It is a waste of time to try to find the “one true metric” for anything. Instead, create multiple metrics and triangulate the truth by looking at them together. Every metric will have flaws, but a set of metrics can collectively provide insight.
Example metrics for our “reliability” goal from above might be something like:

- Number of bugs reported by users per week.
- 95th-percentile page load time per day.
- Percentage of requests served successfully per day (availability).
If you look at the signals above, you will see that some of these metrics map back to those signals.
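To show what “a number over time that can actually be measured” looks like in practice, here is a small sketch that computes two of these proxy metrics from a hypothetical request log. The log format (day, HTTP status, latency in milliseconds) and the thresholds are assumptions made up for the example, not a description of any real data source.

```python
from datetime import date
from statistics import quantiles

# Hypothetical request log: (day, http_status, latency_ms).
requests = [
    (date(2024, 1, 1), 200, 120),
    (date(2024, 1, 1), 200, 340),
    (date(2024, 1, 1), 500, 90),
    (date(2024, 1, 2), 200, 150),
    (date(2024, 1, 2), 200, 2100),
]


def daily_availability(log):
    """Percentage of requests served without a server error, per day (proxy for 'downtime')."""
    by_day = {}
    for day, status, _ in log:
        ok, total = by_day.get(day, (0, 0))
        by_day[day] = (ok + (status < 500), total + 1)
    return {day: 100.0 * ok / total for day, (ok, total) in by_day.items()}


def daily_p95_latency(log):
    """95th-percentile request latency per day (proxy for 'performance issues')."""
    by_day = {}
    for day, _, latency_ms in log:
        by_day.setdefault(day, []).append(latency_ms)
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    return {day: quantiles(latencies, n=20)[-1] for day, latencies in by_day.items()}


print(daily_availability(requests))
print(daily_p95_latency(requests))
```

Neither number is “reliability” on its own, but tracked together over time, they start to triangulate it.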
Overall, there is a lot to know about metrics, including what makes metrics good or bad, how to take action on them, what types of metrics to use in what situation, etc. It would be impossible to cover all of it in this doc, but we attempt to cover some of it in other docs on this site.