"Experience, Empiricism, Excellence"
Please share with your colleagues and friends

The Metrics and Reporting View looks at the various feedback cycles, metrics and reports in Holistic Software Development.

Reports and metrics can be useful but they are only ever indicators of reality. We discuss what makes for good evidence in detail in the People chapter under Evidence Based Decision Making, but an important element is that we are confident of its accuracy. One way to improve confidence is to gather corroborating evidence from alternative sources. In simple terms that means we should always back up a metric or report with context and narrative from the people who actually know what’s going on. We strongly recommend that leaders Go See to understand their teams, products and business rather than sit behind a wall of reports and layers of management.
 
Analyzing a large number of reports against project history, we’ve found a number of systemic issues with metrics and reporting:
 
False Accuracy
This one bites twice! 
 
False accuracy is implied by numerical values, graphs and well-produced reporting. The most common general case is presenting data in an over-precise way (e.g. a 3.275% happiness improvement). The .275% implies a measurement precise enough to differentiate such subtle changes in happiness, which simply doesn’t exist.
 
Another issue is that numerical representation of non-exact states implies accuracy that doesn’t exist. An example here is agile maturity. As we described earlier, this is a difficult thing to measure anyway. Putting an exact figure on it, such as 3.5 out of 4, implies an accurate form of measurement which doesn’t exist. Worse, averaging such scores across teams to produce an organizational maturity score further spreads the misrepresentation of a subtle and subjective assessment as a numerical fact.
 
As we mentioned in Estimates, uncertainty should always be presented along with numbers, estimates and metrics.
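As a minimal sketch of what that looks like in practice (the survey numbers below are invented purely for illustration), a metric can be reported together with its uncertainty, at a precision the data actually supports:

import math
import statistics

# Hypothetical before/after happiness survey deltas for a handful of teams.
happiness_change = [0.2, -0.1, 0.4, 0.1, 0.0, 0.3, -0.2, 0.5]

mean = statistics.mean(happiness_change)
stderr = statistics.stdev(happiness_change) / math.sqrt(len(happiness_change))
ci95 = 1.96 * stderr  # rough 95% confidence interval half-width

# Round to a precision the data can actually support.
print(f"Happiness change: {mean:+.1f} ± {ci95:.1f} (n={len(happiness_change)})")
# Prints something like "Happiness change: +0.1 ± 0.2 (n=8)"

A reader seeing “+0.1 ± 0.2” immediately knows the change may well be noise; “3.275%” hides that entirely.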
 
Sampling Frames
Data is only representative when it covers a meaningful proportion of the population. We’ve seen many cases where people extrapolate from a sampling frame of 1 to the entire organization. This kind of projection is normal human behavior but one that we must be wary of when talking about reporting and metrics.
 
Just because one team experiences a problem doesn’t mean it’s a systemic issue in the organization. Even if 10 teams experience an issue, it might not be systemic in an organization of thousands.
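A minimal sketch shows just how wide the uncertainty is when only 10 teams out of roughly 1,000 have been asked (the numbers and the proportion_ci helper are ours, used only for illustration):

import math

def proportion_ci(successes, n, z=1.96):
    # Rough 95% confidence interval for a proportion (normal approximation).
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# 7 of 10 teams surveyed report the problem; the organization has ~1,000 teams.
low, high = proportion_ci(successes=7, n=10)
print(f"Teams affected: somewhere between {low:.0%} and {high:.0%}")
# With n=10 the interval spans roughly 42% to 98% of teams.

An interval that wide cannot justify a claim either way about the other 990 teams.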
 
Comparison Against Hypotheticals
This one is easy to spot but surprisingly common, especially in Business Cases. We’ve seen a number of examples where people make arguments or present evidence in terms of a change against a hypothetical situation. This is simply making up evidence to justify a belief. Obvious examples include “… as compared to a normal development team”.
 
Greenshifting
We’ve observed a tendency for people to only want to report good news, especially as information is filtered up through an organization via layers of management. Each layer often adds a subtle positive spin to messages until an escalated problem can end up reported as “Green” and healthy by the time it reaches senior management.
 
This can be prevented by reducing organizational layers, making all reports transparent and setting a clear cultural direction that Greenshifting is not helpful; it’s lying.
 
Correlation != Causation
Correlation does not equal, or even imply, causation. Just because two figures seem to have changed together doesn’t mean that one change caused the other. For example, if an organization notices that average lead time is falling across teams at the same time as more office coffee machines than normal have failed, that does not mean that lack of coffee decreases lead time. Others might note that during the same period more superhero movies have been released than previously. Again, that doesn’t indicate that superhero movies reduce software lead time.
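A minimal sketch makes the point (the monthly figures are invented; statistics.correlation requires Python 3.10+): any two series that happen to trend over the same period will correlate strongly, whether or not there is any causal link.

import statistics

# Twelve months of invented figures: lead time steadily falls while
# superhero releases steadily rise.
lead_time_days = [30, 29, 27, 26, 25, 23, 22, 21, 19, 18, 17, 15]
superhero_releases = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5]

r = statistics.correlation(lead_time_days, superhero_releases)
print(f"Pearson correlation: {r:.2f}")  # strongly negative, yet clearly not causal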
 
Silly examples are easy to spot, but when metrics such as happiness and lead time are presented next to each other they often imply a causal relationship. These implications should be challenged: what evidence is there linking the two metrics?
 
In this case there is academic evidence, for example “Happiness and Productivity” by Andrew J. Oswald, Eugenio Proto and Daniel Sgroi (University of Warwick).
 
Interpretation
No matter how good the evidence is, and how well presented it is, different people are liable to interpret numbers, graphs and written reports in different ways. Learning how to interpret data, statistics and reports is a necessary skill for Business Leaders, but we should never assume everyone will interpret the same metric the same way.
 
One example we saw in a client organization was a Cumulative Burnup that showed a cumulative 13% excess of created items over completed items (a minimal reconstruction follows the list below):
  1. One leader interpreted this as a nice healthy demand on development services. 
  2. Another interpreted this as a runaway over-demand problem in the organization. 
  3. Another interpreted it as evidence of significant failure demand in the organization.
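For illustration, here is a minimal sketch of how such a figure is derived from a cumulative burnup (the weekly counts are hypothetical):

from itertools import accumulate

created_per_week = [14, 14, 14, 14, 14, 15, 14, 14]    # total 113 items created
completed_per_week = [12, 13, 12, 13, 12, 13, 12, 13]  # total 100 items completed

cumulative_created = list(accumulate(created_per_week))
cumulative_completed = list(accumulate(completed_per_week))

gap = (cumulative_created[-1] - cumulative_completed[-1]) / cumulative_completed[-1]
print(f"Created items exceed completed items by {gap:.0%}")  # prints 13%

All three interpretations are consistent with the same 13%; only context and narrative from the teams involved can distinguish between them.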
 
