Agile Metrics: Lessons from your Headbone

First published on 15 December 2017 (syndicated).

Chart from 'The Phrenological Journal' ("Know Thyself"), print from Dr. E. Clark. Via Wikimedia Commons.
"That's the problem with so many organizations around entrepreneurship. They're driven by metrics that don't matter." - Brad Feld

It's alright Madam, I'm a Doctor

Have you ever had your bumps read? No no, don't be alarmed, it's your cranium I'm referring to. At one time it was thought possible to deduce interesting things about you just by its shape. In fact the practice of phrenology - the supposed determining of mental attributes by the contours of the skull - was held to be quite useful in the Victorian age. Thousands of measurements were taken of a great many people, in the hope that budding genius and latent criminal might be identified swiftly by means of calipers. Today, this method is dismissed as nothing more than pseudoscience. How fair is that assessment though, when phrenology's adherents clearly had such an abundance of wonderful metrics?

You see, in today's data-driven world we rely on metrics extensively. Large corporations and government departments will often take "Key Performance Indicators", and other such measurements, to inform them about conditions and trends. Few organizations of any real size now seem able to operate without them. However, we have also come to understand that almost any darned fool thing can be measured, and that the usefulness of a measurement must therefore be assessed critically.

We have come to appreciate the danger of measuring the wrong thing, or perhaps of measuring the right thing only to correlate it to the wrong conclusion. Hence in large enterprises and across the public sector, a badly thought-out KPI can be more a source of caustic amusement than edification. Some indicators might seem almost as crazy as bump-reading, in that no meaningful correlation can be drawn from the measure. Metrics concerning expenditure, for example, can easily tell us the price of everything and yet the value of nothing. Sometimes a correlation might be essentially valid, and yet it connects the measurement to the value so tenuously as to be worthless. I have chewed pencil-ends to pulp over the "number of jobs safeguarded", a quantity so overlapping with other factors that it was well-nigh impossible to determine at all.

To most of us, if not always the higher-ups who demand them, these screwball measurements can seem ludicrous. Hence phrenology amuses us now, instead of confounding us by its failure. We know perfectly well that the simple act of measuring things is not the same as a validated learning process in which we qualify, gather, and critically assess empirical evidence. We understand that we must choose our metrics carefully.

Focusing on empiricism

What kind of metrics should we look for in agile practice? As practitioners ourselves, we know that empirical evidence of progress can only be found in the delivery of working software, early and often. Any measurements we take should therefore be aligned as closely as possible to the ongoing delivery of working product. Value must always be held to lie in the increments delivered, and not in promissory notes such as requirements or design specifications, or assertions that certain stage-gates have been passed.

The Scrum Framework is founded on such empiricism, but it does not prescribe any particular metrics that we might use. No specific practices are asserted even though the few rules which define a valid implementation of Scrum are immutable. Many valid implementations are possible, and so all sorts of choices are left open to us regarding metrics.

This silence may seem unhelpful when clear guidance would be appreciated. Remember though that Scrum is not really an agile methodology, but rather a framework through which a team's agile process can be described. The essential rules for implementing such a process are laid out in the Scrum Guide, along with how to inspect and adapt it so it might be improved. Beyond that, the framework is minimally prescriptive.

Why Scrum is silent on metrics

A great many lean and agile practices, including some that are widely believed to be canonical, are therefore not even mentioned in Scrum. For example, no prescription is made for the use of information radiators, even though most teams do in fact utilize task boards of some sort in order to communicate the current state of play. Instead the framework speaks of a need for transparency regardless of how information is conveyed. Similarly, we find nothing in the Scrum Guide about metrics. There is nothing about latency, throughput, cycle time, or even burn-down rates. All we are told is that there must be some means of estimating how much work can be taken on, and of being able to estimate at any given time how much is thought likely to remain.

The prescription of specific practices, such as which tools or metrics to use, would constrain Scrum and reduce its general utility. It would become harder to select options by context, and therefore harder to use well by skilled teams. Scrum teams ought to follow the rules of the framework while inspecting and adapting their own selected practices, whatever those practices might be. The use of a task board can be an excellent option, as can the use of a burn-down chart from which projective measurements may be taken. However, teams must be trusted in the decision regarding whether or not to apply these things in their particular situation.

What not to measure

Unfortunately, the "implementation decisions" that ought to be entrusted to a team are all too often commandeered by a parent organization and its higher-ups. The irony of this is palpable. Executives might seem to be the authority least qualified to make decisions regarding agile practice, including metrics. A good Scrum Team can be expected to understand that the only true measure of progress lies in the ongoing delivery of business value. All other measures are secondary to that assessment. An organization's managers, however, may not grasp the importance of incremental delivery in reducing leaps of faith, nor of de-risking a project by means of empiricism and validated learning. Their culture may instead be aligned towards the attainment of traditional stage-gates vouchsafed by promissory notes, and by passing risk on to others. Failing that, when an agile initiative is on the boil and lip service must be paid to it, they may latch on to proxy measures for evidencing headway. One old chestnut is for managers to assess the story points supposedly "delivered" by teams. Although really only intended to help teams estimate how much work they can take on, story points can be easier numbers for managers to get hold of than measurements of actual value, and they are readily subverted for misuse as a sort of value currency in reports. Since Scrum does not prescribe specific measures at all, executives can flood the vacuum with their own cultural norms and expectations, and select and abuse measures in ways that are anathema to agile practice.

Of all bad metrics, perhaps none have been more damaging than that horrid fixation a great many executives have developed upon story points, and their wish to find in them a surrogate for value delivery. Indeed, some have come to view points not even as a surrogate, but rather as a true and accurate measure of team performance. They are oblivious to the nature of story points as relative estimates, the essential purpose being to help teams gauge how much work they can plan into a Sprint irrespective of value. To that end, different teams may reasonably use different systems and different measures, and they ought to be free to change them at will. The numbers can mean nothing to anyone else since they are drawn for that team's singular estimation purposes. There is no correlation with value or with productivity, either of which can only be evidenced by the increment produced. Yet if teams believe they are being assessed or compared in terms of story points, then story points are very likely what they will conspire to "deliver". Consequently, instead of working as developers providing value, they are coerced into becoming a species of shady corporate accountant, looking for new ways to play the numbers and cook the books.
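To make the point about relative estimates concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two teams, the backlog items, and the numbers are hypothetical. It assumes both teams size the same three items, one on a Fibonacci-style scale and the other in multiples of ten.

```python
# Hypothetical data: the same three backlog items, sized by two invented teams.
# Team A uses a Fibonacci-style scale; Team B happens to use multiples of ten.
estimates = {
    "team_a": {"login page": 3, "audit report": 5, "data export": 8},
    "team_b": {"login page": 30, "audit report": 50, "data export": 80},
}


def points_delivered(team):
    """Sum the story points a team would report for the same finished work."""
    return sum(estimates[team].values())


def relative_sizes(team):
    """Express each item as a share of that team's own total."""
    total = points_delivered(team)
    return {item: points / total for item, points in estimates[team].items()}


print(points_delivered("team_a"))  # 16
print(points_delivered("team_b"))  # 160
# The proportions between items match, even though the totals differ tenfold.
print(relative_sizes("team_a") == relative_sizes("team_b"))  # True
```

On paper Team B has "delivered" ten times as many points, yet the increment is identical. The only thing the numbers carry is each team's own sense of proportion between items, which is all they were ever meant to carry.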

The pressure for "story point productivity" may seem ridiculous in its artlessness and naiveté. It is hard to believe that it could have gained a grip on organizational thinking at a senior level. Unfortunately it has, and the bump-readers are back in town.


In the next post, we'll see that a particular measure is not at fault if it helps timely decision-making which puts value delivery at a premium. We'll look at how story points and other estimates can be usefully applied, by Scrum Teams themselves, in support of projective practice. We'll look at burn-up and burn-down charts and velocity, and contrast these with other techniques that put an emphasis on throughput, the potential for depreciation, and achieving transparency over waste.