When you design metrics, systems, or processes involving data, it is important to understand what decisions they will drive and who will make those decisions. The data needs to be something that engineering leadership, senior ICs, front-line managers, or front-line engineers can act on effectively.
When you propose a metric, system, or process involving data, you should always explain how it will drive decisions.
Let’s look at some bad examples to demonstrate this.
Most of us know that “lines of code written per software engineer” is a bad metric. But it is especially bad by this standard: it does not effectively drive the right decisions for anybody.
Who does what with that number? It tells engineering leadership and senior engineers nothing; there is no specific action they can take based on it. The front-line engineers don’t care much, other than to look at their own stats and feel proud of having written a lot of code. Only the front-line managers can do anything with it, by talking to their engineers and trying to figure out why some of them are more “productive” than others.
Now, to be clear, it’s fine to make a metric, system, or process that only one of these groups can use. The problem with this metric is that when you give it to a front-line manager, it either (a) misleads them or (b) is useless. They may see that one engineer writes fewer lines of code than another, when in fact the engineer writing fewer lines has more impact on the users of the product, is doing more research, or is writing better-crafted code.
What you’ve actually done is send the front-line manager on a wild-goose chase that wastes their time and teaches them to distrust both your metrics and you! You drove no decision for that manager; in fact, you created a problem they now have to solve: constantly double-checking your metric and running their own investigations, because it is so often wrong. Worse yet, if the manager trusts the metric without question, it can lead to bad code, rewarding the wrong behavior, a bad engineering culture, and low morale.
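To make concrete how little this metric carries, here is a minimal sketch of how “lines written per engineer” is typically computed from git history. The function name and date format are my own illustration, not anything prescribed here; it assumes a local checkout and Python 3.9+.

```python
import subprocess
from collections import defaultdict

def lines_added_per_author(repo_path: str, since: str) -> dict[str, int]:
    """Count lines added per author in a git repository.

    Trivial to compute, which is part of the trap: the output says
    nothing about impact, research, or code quality.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--numstat", "--format=AUTHOR:%ae"],
        capture_output=True, text=True, check=True,
    ).stdout

    totals: dict[str, int] = defaultdict(int)
    author = None
    for line in out.splitlines():
        if line.startswith("AUTHOR:"):
            author = line[len("AUTHOR:"):]
        elif line.strip() and author:
            added = line.split("\t")[0]
            if added.isdigit():  # binary files report "-" instead of a count
                totals[author] += int(added)
    return dict(totals)
```

Note what the output cannot distinguish: a thousand lines of generated boilerplate and a ten-line fix that unblocks an entire team look like 1000 versus 10.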
Another way to get this wrong is to have a metric but no individual who will ever make a decision based on it. For example, measuring test coverage on a codebase that nobody works on is not useful: no one is going to act on the number.
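One mechanical guard against this, sketched below with hypothetical helper names, is to only report a metric for code that someone is actively changing, using recent git activity as a rough proxy for “somebody will make a decision here.”

```python
import subprocess

def actively_developed_files(repo_path: str,
                             since: str = "90 days ago") -> set[str]:
    """Files touched recently: a rough proxy for 'someone will act on this.'"""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--name-only", "--format="],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}

def coverage_worth_reporting(coverage: dict[str, float],
                             active: set[str]) -> dict[str, float]:
    """Drop per-file coverage numbers nobody is in a position to act on."""
    return {path: pct for path, pct in coverage.items() if path in active}
```

This doesn’t find the decision-maker for you; it only stops you from publishing numbers where there plainly isn’t one.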
At the level of front-line managers and front-line engineers, it is sufficient to provide information that shows where a problem is, so that engineers can track down the root cause and fix it. This can sometimes be more “data” and less “insights.”
At more senior levels, people need “insights” rather than raw data. In general, the more senior the person you are presenting to, the more work you should do up front to distill insights from the data instead of just showing raw numbers.
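As an illustration with made-up numbers, here is the same set of build-time measurements delivered both ways: the raw list is what a front-line engineer can dig into, and the summarizing work is done up front for the senior audience.

```python
from statistics import median, quantiles

# Hypothetical per-build durations in seconds (the raw "data").
# A front-line engineer can use these directly to hunt down the
# individual slow builds.
build_seconds = [412, 395, 388, 530, 401, 880, 392, 415, 905, 398]

# For a senior audience, do the summarizing work up front and
# deliver the "insight," not the list of numbers.
p50 = median(build_seconds)
p90 = quantiles(build_seconds, n=10)[-1]
print(f"Typical build: {p50:.0f}s. Slowest 10%: over {p90:.0f}s. "
      f"The tail, not the median, is what needs attention.")
```

The raw list and the one-sentence summary contain the same measurements; the difference is who has to do the thinking.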
When you deliver data or insights to people, it should actually be something that will influence their decisions. They should want data for their decisions, and be willing to act on it. It’s important to check this up front, before doing an analysis; otherwise, you can do a lot of analysis work that has no impact, because the recipient was never going to act on it in the first place.
For example, some teams have mandates, such as legal or policy compliance, where no matter what you say about the work, they still have to do it exactly the way they planned. It’s not useful to try to change their minds with data; your work will not result in any different decision.
In general, if a person has already made up their mind and no data or insight will realistically sway them, it’s not worth doing the work to provide them with data or insights. We might have opinions about the way the world should be, but that doesn’t matter: a person has to change their mind for action to happen. If there is no chance of changing somebody’s mind, we should not do the work.
It can be useful to ask people who request data:

1. What decision are you going to make based on this data?
2. Could the data realistically change that decision?
If the answer to question #2 is “no,” then it’s not worth working on that analysis.