This post examines Agile metrics at three different levels. Firstly, how well are we doing in our implementation of Agile? Secondly, how can we effectively measure how well our teams are doing on a day-to-day basis? Finally, what sorts of things can we measure when looking at how well we have done in the past?
It recommends a series of metrics for each of these levels, along with some templates you can use to get going.
The Purpose of Measuring
In the context of Scrum, the purpose of measuring is not to find out how big, heavy or long something is. Its purpose is to reduce uncertainty. This usually sits uncomfortably because software metrics are often concerned with exact data, so we must keep coming back to the reasons why we are measuring. Building on the idea of reducing uncertainty, we can ask two questions. First, how are we doing in our adoption of Scrum? And secondly, how are our teams performing?
How Are We Doing In Our Adoption of Scrum?
Typical questions we may want to ask include:
- Has our investment in adopting Scrum been worthwhile?
- What should we focus on improving next?
- Should we continue with Scrum?
- Are we better at software development than we were a year ago?
- Are we producing better products?
- Do our products have fewer defects?
- Are we faster than we used to be?
How Are Our Teams Performing?
Agile prescribes reporting that is easy and public. A good metrics system for teams should cover four areas:
- Resources. Keep track of key resources: planned versus actual; number of developers; number of customers assigned to the project; number of testers; number of computers in use, and so on.
- Scope. Keep track of the number of user stories over time: how many exist, how many are done, how many more are expected. Consider tracking total estimated time for the project and estimated versus actual scope consumption.
- Quality. Use an acceptance testing graph showing number of tests and number succeeding over time.
- Time. Track the results of each release plan. Graph schedule versus time. Discuss dropped or added functionality and its impact on time.
This post sets out some standard metrics that will enable us to answer these two questions and reduce uncertainty.
Ways of Measuring
At the simplest level we are interested in recording metrics at three different levels of the Agile ‘onion’:
- Background Noise. In the busy moments that fill the typical working day we often lose sight of the bigger perspective, and with it the goal of becoming Agile and sustaining Agile practices. This level strives to measure how Agile we are. It is probably the most neglected level, but if it were analysed it would yield information that would improve performance at the other two levels.
- Down at the coal face. This is the level that stakeholders typically focus on the most. It is where the question, “When will we be done?” is asked. There are certainly some useful metrics that can be deployed at this level, the most famous being the Sprint Burndown.
- Reflection: Agile, Is It Worth It? Measuring Earned Value. This third level is also often neglected, in much the same way that the traditional Post Implementation Review is ignored. Agile and Scrum find their roots in the teachings of Lean and the Toyota Production System. A culture of continuous improvement is prevalent in these systems. For our purposes, we can use Retrospectives and measure the Earned Value we deliver to our customers.
Background Noise: Measuring How Agile We Are
There are two immediate metrics we could introduce to measure how Agile we are: a Survey and a Balanced Scorecard. These instruments should be reviewed, but probably at different intervals. For instance, we may choose to run the survey after every release. The Scorecard, however, should be updated more regularly, probably weekly.
Scott Ambler has an excellent set of Agile Surveys and he updates them regularly. Of particular relevance is his 2019 survey, How Agile Are You? We recommend you run this survey on your teams and stakeholders as soon as possible and regularly thereafter.
A Balanced Scorecard
One of the main problems with metrics is that it is difficult to come up with measures that cannot be gamed by those being measured. Think of performance measures in HR. It really is difficult to convince a team that the reason you are gathering metrics is that you sincerely wish to nurture a self-organising culture where failure is examined constructively and continuous improvement is encouraged. The Balanced Scorecard attempts to address this problem.
Down at the coal face: Measuring Agility on a day-to-day basis
Typical day-to-day measures include the Sprint Burndown chart, an example of which is given below in Figure 1. This simply measures the rate at which a team is burning through the work it has committed to deliver in the current Sprint, in hours or story points. Two things complicate the picture: the accuracy of the team's estimates, and waste. Over a number of sprints both of these should improve, and the team will begin to improve its velocity (i.e. the number of story points the team delivers in a sprint or the number of hours it eats through) and eventually reach a stable velocity. Once this is achieved, delivery timelines become more predictable because the team is good at estimating user stories and knows what it can commit to in a given timeframe.
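To make the burndown and velocity ideas concrete, here is a minimal sketch in Python. The function names, the three-sprint averaging window and all of the figures are assumptions made up for illustration; they are not taken from any real team or tool.

```python
from statistics import mean


def burndown(committed, completed_per_day):
    """Return the work remaining at the end of each day of a sprint.

    `committed` is the work the team signed up for (hours or story
    points); `completed_per_day` lists the work finished on each day.
    """
    remaining = []
    left = committed
    for done in completed_per_day:
        left -= done
        remaining.append(left)
    return remaining


def average_velocity(points_per_sprint, window=3):
    """Average the work delivered over the most recent `window` sprints.

    The window of 3 is an arbitrary choice for this sketch.
    """
    return mean(points_per_sprint[-window:])


# Hypothetical sprint: 40 points committed, five days of progress.
print(burndown(40, [5, 3, 0, 8, 6]))            # [35, 32, 32, 24, 18]

# Hypothetical history of points delivered in each of five sprints.
print(average_velocity([18, 22, 26, 25, 27]))   # 26
```

Plotting the list returned by `burndown` against the day number gives the familiar chart. A velocity that stops changing much from one sprint to the next is the stable velocity described above.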
Other measures should include:
- Waste. This is a measure of how much time a Scrum team member is spending on activities unrelated to the sprint goal. It is a useful metric for two reasons. Firstly, when it is compared to a sprint burndown chart it highlights the accuracy of the team’s estimation versus its completion. For instance, if at the end of the sprint a team had 100 hours of work remaining but during the sprint the team had recorded 75 hours of waste then really the team would be just 25 hours out in its estimation. Secondly, it highlights the common causes that steal time from the Scrum team. Over time this data can be reviewed and the miscreants dealt with!
- Completed Stories. This is a useful metric as it measures how many user stories have been completed and how many are left to do. The metric should be updated at the end of each sprint. If the number of stories to do is growing within a particular release, then it may be worth finding out why. For instance, are existing stories being split into smaller stories to make them easier to implement? Or are more being added because progress is good?
- Unit Test Scores. An Agile team has Unit Tests for everything that could possibly break. This is a super measure because it tells you if your current build is working and it also tells you how much of your build is covered by tests. Ideally it will be integrated with a continuous build process such as CruiseControl.
- Acceptance Test Scores. This is a measure of the scope of testing: the ratio of the number of acceptance tests to the number of user stories, per sprint. This ratio should approach 1:1 as you near the release date.
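The arithmetic behind the Waste and Acceptance Test Scores measures is simple enough to sketch in a few lines of Python. The function names are hypothetical, and the acceptance test figures are invented; the waste figures are the ones used in the example above.

```python
def estimation_gap(hours_remaining, hours_of_waste):
    """Hours the team was out in its estimate once waste is discounted.

    Work left at the end of the sprint, minus the time recorded on
    activities unrelated to the sprint goal.
    """
    return hours_remaining - hours_of_waste


def acceptance_test_ratio(acceptance_tests, user_stories):
    """Ratio of acceptance tests to user stories for a sprint."""
    if user_stories == 0:
        raise ValueError("a sprint with no user stories has no ratio")
    return acceptance_tests / user_stories


# The waste example from the text: 100 hours left, 75 hours of waste.
print(estimation_gap(100, 75))        # 25

# Invented figures: 18 acceptance tests covering 20 user stories.
print(acceptance_test_ratio(18, 20))  # 0.9
```

A ratio below 1.0 late in a release suggests that some stories still have no acceptance test.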
Reflection: Agile, Is It Worth It? Measuring Earned Value
The third level involves metrics that capture how well we’ve done. These are useful as they provide a valuable seam of data that can be used to improve matters going forward. There are two main areas we should focus on to begin with: Retrospectives and Earned Value Management.
The retrospective is a critical Scrum ceremony that takes place at the end of every sprint. The data from it is largely subjective rather than quantitative and hence it is often overlooked as a source of metrics. Nonetheless there are a number of ‘games’ that can be played in retrospectives to keep them fresh and engaging for the Scrum team, including:
- Activities to Set the Stage. Games that can be played at the start of a retrospective include Check-In / Check-Out and ESVP (Explorers, Shoppers, Vacationers, Prisoners).
- Activities to Gather Data. Mine The Timeline (stimulate memories of what happened and find nuggets); Mad Sad Glad (surfaces feelings); Satisfaction Histogram (highlight how satisfied team members are with a focus area); Team Radar (help the team gauge how well it is doing on a variety of measures).
- Activities to Generate Insights. Five Whys (discover underlying conditions that contribute to an issue); Fishbone (look past symptoms to identify root causes); Learning Matrix (help team members find what’s significant in their data).
- Activities to Decide What to Do. Retrospective Planning Game (develop detailed plans for proposals); SMART goals (translate ideas into priorities and action plans); Circle of Questions (help team choose an experiment or action steps for the next Sprint).
- Activities to Close the Retrospective. Appreciations (allow team members to notice and appreciate each other); Temperature Reading (check on ‘where we are at’, a practical way to process what is happening for the team).
Earned Value Management
This is perhaps the most complex metric to implement, and instilling the discipline it requires would take some time and effort. However, it is extremely useful. Earned Value Management (EVM) seeks to provide dispassionate project reporting. Heroics are factored out in favour of straightforward, scientific reporting of how well we’re doing against what we said we would do, and at what cost. It can help you forecast the time and cost to complete an activity from now, as well as the final outturn cost and time. This helps you integrate the three critical elements of project management: scope, cost and time. To carry out EVM you need to know:
- What started
- What finished
- What you achieved
- What you spent
EVM can be taken further and used to produce other indices including a Cost Performance Index and Schedule Performance Index. These indices can be used to plot whether a team is:
- Behind Schedule and Under Budget
- Ahead of Schedule and Under Budget
- Behind Schedule and Over Budget
- Ahead of Schedule and Over Budget
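As a rough sketch of how the two indices place a team in one of these four quadrants, the Python below uses the standard EVM definitions, which this post does not spell out: Cost Performance Index = earned value / actual cost, and Schedule Performance Index = earned value / planned value. The class name, the field names and the figures are assumptions for illustration, and an index of exactly 1 is treated as favourable purely to keep the sketch to four outcomes.

```python
from dataclasses import dataclass


@dataclass
class EarnedValueSnapshot:
    """Hypothetical container for the three standard EVM inputs."""
    planned_value: float  # value of the work scheduled to be done by now
    earned_value: float   # value of the work actually completed
    actual_cost: float    # what has been spent so far

    @property
    def cpi(self):
        """Cost Performance Index: above 1 means under budget."""
        return self.earned_value / self.actual_cost

    @property
    def spi(self):
        """Schedule Performance Index: above 1 means ahead of schedule."""
        return self.earned_value / self.planned_value

    def quadrant(self):
        """Name the schedule/budget quadrant the team falls into.

        Simplification: an index of exactly 1 counts as favourable.
        """
        schedule = "Ahead of Schedule" if self.spi >= 1 else "Behind Schedule"
        budget = "Under Budget" if self.cpi >= 1 else "Over Budget"
        return f"{schedule} and {budget}"


# Invented figures: 100 planned, 80 earned, 70 spent.
snapshot = EarnedValueSnapshot(planned_value=100, earned_value=80, actual_cost=70)
print(snapshot.quadrant())  # Behind Schedule and Under Budget
```

The hard part in practice is not this arithmetic but agreeing on the value figures that feed into it.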
We usually have a good idea of how much our projects cost. We also know when they start and finish. To be able to implement EVM we would need to better understand the value the business team places on the user stories they are giving us. There’s the challenge – and the subject of a future post!
Always start small and build. An initial approach should consist of the following at each level:
- Background Noise. Run one of Scott Ambler’s surveys and implement a Balanced Scorecard.
- Down at the coal face. Implement Sprint Burndowns across all teams immediately. In the coming Sprints add in the other metrics outlined.
- Reflection: Agile, Is It Worth It? Measuring Earned Value. Mandate Sprint retrospectives going forward and, in parallel, work with the business to derive a measure of value so you can implement EVM.