Strategic Staffing, Or, How many people should I put on this (Agile) project?

One of the questions that comes up again and again in the context of Agile work is “How do I know how many people to put on this project?”

(I’m using the word ‘project’ in the loosest sense – think “piece of work”.)

My usual answer is: staff strategically – allocate people to a piece of work in proportion to how important that work is relative to everything else you are trying to do. Keep teams together for as long as possible, and make only small, gradual changes to teams over time.

This isn’t really a problem once work is underway: you simply sum the ball-park estimates on the work remaining (in abstract points), measure the velocity (again in abstract points), and divide the total by the velocity. If you get an answer you like then all is well. If the answer is not soon enough you can (gradually) add more people, and if the answer is very good you might even slow down.
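As a sketch, that forecast arithmetic looks like this (all numbers here are invented for illustration, not from any real project):

```python
# Ball-park forecast: iterations remaining = outstanding points / velocity.
# The backlog and velocity figures below are made up for illustration.

def iterations_remaining(outstanding_points, velocity_per_iteration):
    """A forecast only makes sense once velocity has been measured on real iterations."""
    if velocity_per_iteration <= 0:
        raise ValueError("need at least one completed iteration to measure velocity")
    # Round up: a partly used iteration still costs a whole one.
    return -(-outstanding_points // velocity_per_iteration)

backlog = [8, 5, 13, 3, 8, 5]   # remaining work, abstract points
velocity = 10                   # points completed per iteration, measured
print(iterations_remaining(sum(backlog), velocity))  # → 5
```

If five iterations is not soon enough, add people gradually and re-measure; the forecast is only ever as good as the velocity data behind it.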

However, the problem really comes when you are starting a piece of work. Particularly if you are bidding on a contract to do work for someone else. The problem occurs because without any data (velocity, sum of outstanding work) you can’t estimate when you will be done, and without doing the work you can’t get any data.

Rule 1: Without data you can’t forecast anything so don’t pretend you can.

So, what do you do?

Option 1 is to pretend you aren’t doing Agile: use whatever methods and techniques you’ve used before, get an estimate (or maybe more of a guess) of how many people you need for how long, then get started with those numbers. As soon as you start, throw away the plans. Wing it for a couple of iterations; by then you will have the data and can re-plan.

Not a perfect solution but one that could work.

Option 2 is to just start. Start as small as you can; I think two people is about as small as you can get – below that you are into the realm of micro-projects, which have their own dynamics.

Let this mini-team run for a few iterations and then, as if by magic, you have data.

Once you have data you can: keep the team as it is, grow it, shrink it, or cancel the work entirely.

Rule 2: Start small, get data, grow later

Starting small is essential because it reduces risk and because it minimises the momentum which can make it difficult to cancel work.

It also minimises the influence of Conway’s Law: “every organization will produce systems which are a copy of the communication paths in the organization.”

If you start with a big team you will get a big project – allocate a C# developer, a DBA and a UI designer and you’ll get a three-tier architecture. If you start small you should get a small project.

In the longer term how you staff a project is less a function of “How many people do I need to do this work?” and more a question of “How many people can I afford to do this work?”

Projects are never really staffed on a “how many people do I need” basis, we just pretend they are. Writing today, March 2011, you will probably put more people on a piece of work than you would 18 months ago, but fewer than you would have three years ago.

Rule 3: Staff work strategically relative to the other work to be done and corporate priorities, rebalance only occasionally

Strategic staffing means looking at how any one piece of work stacks up against the other pieces of work in play – or proposed – and the resources that are available to us. It means considering: what is the benefit of a piece of work? What is the risk? What work absolutely must be done, and what work can be left undone? What work is experimental?

The trick is to arrange work so teams can grow, or shrink, as resources and priorities change.

That is not a recipe for changing team composition every week. Indeed, the general principle should be: keep teams together and consistent for as long as possible. Only change teams occasionally, and only grow teams slowly.

However, things do change and work needs rebalancing. This should be an active, measured, and occasional process. Better to change staff allocations once a quarter in a review meeting than every week as a knee-jerk reaction.

That is what strategic staffing is. If you find yourself changing staff allocations every few weeks in response to events you aren’t working strategically.

As for deciding how many people to put on a piece of work when you are bidding for a contract, well, it’s another argument for constructing contracts as ongoing, rolling contracts.

Telling your client: “We think we need six people for six months” is little better than lying if you don’t have data to back it up. Have these six people worked together before? How productive will they be with the client’s environment? With the client’s requirements? What work might emerge?

Better to tell the client: “We propose to start with three people for three months; at the end of that time we will review progress with you and decide together, when we have data, whether to increase or decrease staffing. If you want to move faster then we can allocate four people and review after one month.”

It might not be what your client wants to hear but it sure beats guessing, or lying. Involve your customer, give them choices. It might not be a great sales technique but if the client won’t engage in this conversation there are probably other conversations they won’t have either.

Final roundup of facts from Capers Jones

In two previous entries I’ve reported some interesting statistics and findings – possibly facts – from Capers Jones’ book Applied Software Measurement (see Software Facts – well, numbers at least and More Facts and Figures from Capers Jones). I want to finish off with a few notes from the later chapters of the book.

On packaged software

  • Modifying an existing COTS package – I assume he includes SAP and Oracle here – is high risk with low productivity
  • When changes exceed 25% it may be cheaper to write from scratch. (I’m not sure what it is 25% of, total size of the package, 25% of function points, 25% of features?)
  • Packages over 100,000 function points (approximately 1m lines of code) usually have poor bug removal rates, below 90%

On management

  • Jones supports my frequent comment that when redundancies and downsizing come, Test and QA staff are among the first to be let go
  • Projects organised using matrix management have a much higher probability of being cancelled or running out of control
  • Up to 40% (even 50%) of effort is unrecorded in standard tracking systems

On defects/faults

  • “Defect removal for internal software is almost universally inadequate”
  • “Defect removal is often the most expensive single software activity”
  • Without improving software quality it is not possible to make significant improvements to productivity
  • “Modifying well-structured code is more than twice as productive as modifying older unstructured software” – when he says this I assume he doesn’t mean “structured programming” but rather “well designed code”
  • Code complexity is more likely to be because of poorly trained programmers rather than problem complexity
  • Errors tend to group together, some modules will be very buggy, others relatively bug free
  • Having specialist, dedicated, maintenance developers is more productive than giving general developers maintenance tasks. Interleaving new work and fixes slows things down
  • Each round of testing generally finds 30-35% of bugs, design and code reviews often find over 85%
  • Unit testing effectiveness is more difficult to measure than other forms of testing because developers perform this themselves before formal testing cuts in. From the studies available it is a less effective form of testing with only about 25% of defects found this way.
  • As far as I can tell, the “unit testing” Jones has examined isn’t of the Test Driven Development type supported by continuous integration and automatic test running. Such a low figure doesn’t seem consistent with other studies (e.g. the Nagappan, Maximilien, Bhat and Williams study I discussed in August last year.)
  • Formal design and code reviews are cheaper than testing.
  • SOA will only work if quality is high (i.e. few bugs)
  • Client-server applications have poor quality records, typically 20% more problems than mainframe applications
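Those removal percentages compound across rounds. A quick back-of-the-envelope calculation (the rates are Jones’ figures from the bullets above; the compounding arithmetic is my own sketch) shows why reviews compare so well with repeated testing:

```python
# Each round of testing finds a fraction of the defects still present.
# Rates come from Jones' figures; the compounding model is an assumption.

def cumulative_removal(rate_per_round, rounds):
    """Fraction of the original defects removed after n rounds at the given rate."""
    return 1 - (1 - rate_per_round) ** rounds

# Four rounds of testing at ~30% per round removes about 76% of defects...
print(round(cumulative_removal(0.30, 4), 2))  # → 0.76
# ...still short of the 85%+ that a single design/code review often achieves.
print(round(cumulative_removal(0.85, 1), 2))  # → 0.85
```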

Documentation

  • For a typical in-house development project paperwork will be 25-30% of the project cost, with about 30 words of English for every line of code
  • Requirements are one of the chief sources of defects – thus measuring “quality” as conformance to requirements is illogical

Agile/CMM/ISO

  • There is no evidence that companies adopting ISO 9000 in software development have improved quality
  • Jones considers ISO 9000, 9001, 9002, 9003 and 9004 to be subjective and ambiguous
  • Below 1,000 function points (approximately 10,000 lines of code) Agile methods are the most productive
  • Above 10,000 function points the CMMI approach seems to be more productive
  • I would suggest that as time goes by Agile is learning to scale and is pushing that 1,000 upwards

Jones also makes this comment: “large systems tend to be decomposed into components that match the organizational structures of the developing enterprise rather than components that match the needs of the software itself.”

In other words: Conway’s Law. (See also my own study on Conway’s Law.) It’s a shame Jones missed this reference; given how well referenced the book is on the whole, I’m surprised.

Elsewhere Jones is supportive of code reuse; he says successful companies can create software with as much as 85% reused code. This surprises me – generally I’m skeptical of code reuse. I don’t disbelieve Jones, but I’d like to know more about what these companies do. It has to be more about the organisational structure than just telling developers: “write reusable code”.

Overall the book is highly recommended although there are several things I would like to see improved for the next revision.

First, Jones does repeat himself frequently – sometimes exactly the same text. Removing some of the duplication would make for a shorter book.

Second, as noted above, Jones has no numbers on how automated unit testing, i.e. Test Driven Development and similar, stacks up against traditional unit testing and reviews. I’d like to see some numbers here. Although, to be fair, that depends on Jones’ clients asking him to examine TDD.

Finally, Jones is very very keen on function points as a measurement tool. I agree with him, lines of code is silly, the arguments for function points are convincing. But, I’m not convinced his definition of function points is the right one, primarily because it doesn’t account for algorithmic logic.

In my own context, Agile, I’d love to be able to measure function points. Jones rails against Agile teams for not counting function points. However, counting function points is expensive. Until it is cheap and fast Agile teams are unlikely to do it. Again, there is little Jones can do directly to fix this but I’d like him to examine the argument.

I want to finish my notes on Jones’ book with what I think is his key message:

“Although few managers realize it, reducing the quantity of defects during development is, in fact, the best way to shorten schedules and reduce costs.”

Humans can't estimate tasks

As I said in my last blog entry I’ve been looking at some of the academic research on task time estimation. Long long ago, well 1979, two researchers, Kahneman and Tversky described “The Planning Fallacy.”

The Planning Fallacy is now well established in the academic literature and there is even a Planning Fallacy Wikipedia page. All the other literature I looked at takes this fallacy as a starting point. What the fallacy says is two-fold:

  • Humans systematically underestimate how long it will take to do a task
  • Humans are overconfident in their own estimates

Breaking out of the fallacy is hard. Simply saying “estimate better” or “remember how long previous tasks took” won’t work. You might just succeed in increasing the estimate to something longer than the task takes, but you are still no more accurate.

Curiously, the literature does show that although humans can’t estimate how long a task will take, the estimates they produce do correlate with the actual time spent on a task. In other words: any time estimate is likely to be too small, but relative to other estimates it is good.
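To illustrate the distinction with entirely made-up numbers: every estimate below is too low, yet the ordering of tasks by estimate matches the ordering by actual time – which is why relative sizing survives the planning fallacy.

```python
# Invented data: each estimate undershoots, but relative sizes hold.
estimates = [2, 5, 8, 13]    # hours, as estimated
actuals = [3, 7, 11, 18]     # hours, as actually taken

# Systematic underestimation: every single estimate is too small.
print(all(e < a for e, a in zip(estimates, actuals)))  # → True

# But the rank ordering of the tasks is identical either way.
rank = lambda xs: sorted(range(len(xs)), key=lambda i: xs[i])
print(rank(estimates) == rank(actuals))  # → True
```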

Second, it seems the planning fallacy holds retrospectively. If you are asked to record how long you spend on a task you are quite likely to underestimate it. There seems no reason to believe retrospective estimation is significantly more accurate than future estimation.

Something else that comes out of the research is: psychologists and others who study this stuff still don’t completely understand what the brain is up to, some of the studies contradict each other and there are plenty of subtle differences which influence estimates.

Third, although we don’t like to admit it, deadlines play a big role in both estimating and doing. If a deadline is mentioned before an estimate is given, people tend to estimate within the deadline. People are also quite good at meeting deadlines (assuming no outside blocks, that is), partly because estimating and then doing is largely about time management, i.e. managing your time to fit within the estimate.

While deadlines seem like a good way of bringing work to a conclusion, it doesn’t seem a particularly good idea to base deadlines on estimates. Consider two scenarios:

  • Simple estimate becomes a deadline: If we ask people to estimate how long a piece of work will take they will probably underestimate. So if this estimate is then used as a deadline the deadline may well be missed.
  • Pessimistic estimate becomes deadline: If people are encouraged, coerced, or scared into giving pessimistic estimates the estimate will be too long. If this estimate is then used as a deadline work is likely to be completed inside the time but there will be “slack” time. The actual time spent on the task may be the same either way but the total elapsed (end-to-end) time will be longer.
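The two scenarios can be put into numbers (all invented for illustration): the work itself takes the same five hours either way, but the elapsed time differs.

```python
# Hypothetical task: 5 hours of actual work in both scenarios.
task_hours = 5

# Scenario 1: a naive estimate of 4 hours becomes the deadline - it is missed.
optimistic_deadline = 4
print(task_hours > optimistic_deadline)  # → True, deadline missed

# Scenario 2: a padded estimate of 8 hours becomes the deadline - it is met,
# but the work expands to occupy the window, adding slack to the elapsed time.
pessimistic_deadline = 8
print(pessimistic_deadline - task_hours)  # → 3 hours of slack
```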

I’ve written my findings down in full, albeit somewhat roughly, together with a full list of the articles I examined in depth. “Estimation and Retrospective Estimation” can be downloaded from my website.

I would love to spend more time digging into this subject but I can’t. Anyone else want to?

Apology, correction and the Estimation Project

Over the last few months I’ve been thinking about how people perform estimates when developing software. I’ve taken time – not enough – to look through some of the research on estimation and tried to understand how we can improve estimates.

I’ve written a long (11 pages, over 6,000 words), and rough, essay recording some of my findings. You can download “Estimation and Retrospective Estimation” from my website. I don’t intend to publish the essay anywhere else, but I expect some of the findings to be included in future pieces. In this blog entry and the next I want to report some of the motivation and findings.

To start, a correction and an apology.

In January Jon Jagger sat in on one of my Agile training courses and asked me about two references I appeared to cite.

Now if someone – me or anyone else – goes around saying “there is evidence for this” or “this is backed by research” they should be able to back it up. I have failed on this count: I can’t track down my references, and I should not have claimed there is research I can’t provide a reference for. I apologise to Jon and everyone else who heard me say these things.

I’m going to repeat my claims now for two reasons. First, I’m hoping that someone out there will be able to point me at the information. Second, if you’ve heard me say either of these things and claim there is research to back it up, then, well, I apologise to you too. And if you hear anyone else say similar things, please pin them down and find the original source.

The first thing Jon pulled me up on was in the context of planning poker. I normally introduce this as part of one of my course exercises. Jon heard me say: “Make your estimates fast, there is evidence that fast estimates are just as accurate as slow ones.”

Does anyone have any references for this?

I still believe this statement, however, I can’t find any references to it. I thought I had read it in several places but I can’t track them down now. I’m sorry.

Thinking about it now, I think I originally heard another Agile trainer say this during a game of planning poker a few years ago. So it’s possible that others have heard this statement and someone has the evidence.

Turning to my second dubious claim, I said: “I once saw some research that suggests we are out by up to 140% when judging actuals.” It is true that I read some research that appeared to show this – I think it was in 2003. However, I can’t find that research so I can’t validate my memory; therefore the claim is not really valid.

By “actuals” I mean: the actual amount of time it took to do a piece of work.

Again, does anyone know the research I have lost the reference to?

And again, I apologise to Jon and everyone who has heard me state this as fact. It may well be true but I can’t prove it and should not have cited it.

Why did I say it? Well, I find a lot of people get really concerned about the difference between estimates and actuals – whether some task was completed in the time estimated. To my mind this is a red herring; actuals only get in the way.

This is where things start to get interesting. I decided to look again at the research on estimation and specifically recording “actuals.” The first thing I learned was that what we call “actuals” is better called “retrospective estimation.”

The “Estimation and Retrospective Estimation” essay records my journey and findings in full. In my next blog entry I’ll record some of my conclusions.

Why Quality Must Come First & a comment on the 'System Error' report

On Monday night I presented “Why Quality Must Come First” at Skills Matter in London. The slides are now online (the previous link is to my website) and there is a podcast on the Skills Matter website.

This was a revised and updated version of my talk at the Agile Business Conference last October. There are two key arguments:

  • High quality is essential to achieve Agile: if you have poor quality you can’t close an iteration, you can’t mark work done, and schedules will be unpredictable.
  • High quality (fewer bugs) is the means to achieve shorter delivery time: better and faster are not alternatives, they are not even complementary – they are cause and effect.

That second point is one that I don’t think is appreciated enough; in fact I think a lot of people believe you can deliver faster if you turn down the quality. Many years ago I worked with a great developer called Roy. One day Roy went to our manager, and the conversation went something like this:

Roy: Would you rather have the software sooner with bugs or later without?
Manager: Sooner with bugs

Both Roy and the Manager were seeing a trade-off that does not exist.

Resetting this broken mental model, unlearning this trade-off, is one of the biggest challenges facing our industry.

Read my lips: High quality software is faster to develop. Invest in quality and you will finish sooner.

This is also one of the reasons I prefer the XP approach over the Scrum approach. It is also one of the reasons I was disappointed with the “System Error” report from the UK Institute for Government last week.

The reason the report got so much attention was because it recommended the UK Government adopt Agile and Iterative development in IT projects. By the way, the report is from a think tank, it is wrong to say this report means the UK Government will adopt Agile, or that the UK Government wants to adopt Agile, it just means some “thinkers” suggest it should.

I haven’t had a chance to do anything more than skim the report yet but what struck me was the lack of discussion about quality. The report continues the belief that quality is an effect not a cause.

The report lays out four principles for Agile projects:

  • Modularity
  • Iterative approach
  • Responsiveness to change
  • Putting users at the core

There is one big thing missing here: Quality.

Without high quality you can’t take an iterative approach because nothing is ever finished.

Without high quality you can’t be responsive to change because nothing is deployable and over time crud builds up which slows you down.

Without quality your users will not trust you, they will not believe you have changed, and they will spend a lot of their time dealing with bugs. Pushing bad, buggy software on to users is not a sign that you have put them at the core.

Leaving aside the fact that “modularity” is one of the most over-used, and consequently meaningless, words: how can anything buggy be modular? Do you want to (re)use a buggy module? Bugs are side effects, not something you want in modular code.