In this chapter we’ll explore which of the work we do is useful and which isn’t. There is a fair amount of misunderstanding about what counts as waste, because many people take the Lean concept of waste directly from the manufacturing world. In knowledge work, activities such as planning and estimation should not be considered waste, since they help shorten the cycle time of the work being done. Of course, it is still possible to overplan or to spend too much time estimating.
We’ll start with an exercise to get you thinking. Take out two sheets of paper. Title one “Useful work” and the other “Wasted work.” Consider the work you do that adds value, such as getting requirements, and write each item on the “Useful work” page. Then consider the work you do that is really wasted, such as fixing bugs, and write each item on the “Wasted work” page. Take 10 minutes and think of as many items as you can for each page.
Scroll down after completing the exercise.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Here are two lists of useful and wasted work that I’ve come up with. Of course, neither is complete, but both give an idea of what’s involved. If you have items on your wasted work page that are not listed below, do your best to analyze them in the way I’ll be analyzing my list.
There is a pattern to wasted work that is worth understanding. Let’s go through each of the items listed there and see what we can learn.
Useful work makes progress on the mission of your organization. Induced work is work that was created by making a mistake or having a misunderstanding. I’m not suggesting that mistakes and misunderstandings can be avoided. However, the amount of induced work grows greatly the longer the time between making the error and detecting it. I suggest we can usually vastly reduce the cost of mistakes even when we can’t avoid making them. The common theme in doing this will be to minimize the time from making the mistake until detecting it. The notion that delays increase our waste also applies to most of the other items on the right. Let’s see.
Re-doing requirements or working from old requirements happens when there is a delay between getting a requirement and needing to use it. Building the wrong feature is usually due to a miscommunication between the customer (or their proxy) and the development team. The greater the delay between getting the initial requirement and actually building it, the greater the amount of work involved. Building unneeded features is so commonplace in our industry that we think it unavoidable. However, if one builds features in stages, one can often learn that a feature isn’t needed by the time one gets ready to build it.
If we focus on building the most important features in small batches, we can use what we learn to see whether we actually need the pieces we deferred. This is another tenet of Lean: work in small batches. Doing so accelerates value delivery while shortening delays to feedback. All of this contributes to reducing induced work.
Let’s look at the other items on the list of induced work. You may have noticed that the “fixing” in “fixing” bugs is in quotes. The reason is that developers don’t actually spend a lot of time fixing bugs, even though it feels to them as if they do. Let me explain.
Consider this: imagine the worst bug you’ve ever had or, if you’ve never been a developer, the worst bug you’ve seen a developer have. Think of the time spent “fixing” it. Most likely, the first few hours went to investigating the problem, then trying something, then setting things back after that didn’t work. Notice that, up to this point, no fixing has been done. Investigating and relearning have taken place. The fix itself typically takes very little time.
Some people protest that this is just semantics. I disagree, but even if it were, the distinction would be important. There are two activities taking place here. The first is discovering what we have to do (finding) and the second is doing it (fixing).
Let’s take a look at this another way. Imagine a developer writes a bug. As a small aside, I’ve noticed that developers talk about bugs as if they don’t write them, but rather as if bugs either show up on their own or testers put them in. Notice how they often say “I found a bug!” or “testing found a bug!” as if they had nothing to do with it. (I noticed this by observing myself, so I’m not deriding anyone.) Anyway, now imagine that the developer is told about the bug immediately. How long does it take to fix? Let’s say an hour. Now imagine that they aren’t told about it for a couple of weeks, and further imagine that nothing else has changed. How long does fixing take now? Lots longer, maybe days longer. And it gets even worse if other work has been going on, so that the code has been changed by others or relies on code that others have modified since it was originally written.
The additional time required to find and fix in the second case compared to the first is not semantics, and it is of a different nature than fixing code. It is clearly additional relearning and discovery time. The reality is that we spend much more time finding our problems than fixing them, and the greater the delay from creating the error until detecting it, the greater this additional time is. Also notice that this is not task-switching time, to which it is often attributed: one might start working on the bug fix and concentrate on it alone, and this phenomenon will still occur.
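To make the finding-versus-fixing split concrete, here is a toy sketch in Python. It is only an illustration: the one-hour fix comes from the example above, but the linear growth of finding time and the half-hour-per-day rate are assumptions I made up for the sketch, not measurements.

```python
# Toy model only: every number here is a made-up assumption, not a measurement.
FIX_HOURS = 1.0               # hypothetical: the fix itself takes about an hour
RELEARN_HOURS_PER_DAY = 0.5   # hypothetical: finding effort added per day of delay


def bug_cost(delay_days):
    """Return (finding_hours, fixing_hours) for a bug detected after delay_days.

    Assumes finding time grows linearly with the detection delay while the
    fixing time stays constant. The linear shape is an illustrative guess.
    """
    finding_hours = RELEARN_HOURS_PER_DAY * delay_days
    return finding_hours, FIX_HOURS


if __name__ == "__main__":
    # Compare immediate detection, next-day detection, and a two-week delay.
    for delay in (0, 1, 14):
        finding, fixing = bug_cost(delay)
        print(f"delay={delay:>2} days  finding={finding:.1f}h  fixing={fixing:.1f}h")
```

Whatever the real curve looks like on your team, the point of the sketch is that the fixing term stays flat while the finding term is the one that grows with delay.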
Continuing down our list, I would suggest that ‘overbuilding frameworks’ and ‘essentially duplicating components’ are due more to a lack of technical skills, which can be improved through the use of design patterns and emergent design. Duplication is also exacerbated by delays, as people sometimes forget what has already been done.
The last work type on the right is “integration” errors. Again, note the quotes. I mark them that way because true integration errors are exceedingly rare. An integration error would be an error made in the act of integrating. More than 99.9% of the things I’ve seen called integration errors are actually errors that occurred well before integration. That is, the teams needing to integrate did not stay in sync in their understanding or their code. The integrator integrated just fine. The error lay in the fact that the components they integrated properly just don’t work together properly. Calling an error that occurred upstream an “integration” error is equivalent to calling a bug found in testing a “testing” error. Again we can see that the greater the delay between the error occurring and its detection in integration, the greater the work needed to fix it. Note that this is just another reason why continuous integration is good. Continuous integration isn’t about avoiding integration errors; it is about detecting miscommunications between groups working together as they occur.