Wednesday, November 19, 2008

Testing private methods, TDD and Test-Driven Refactoring

As a TDD--Test-Driven Development--coach, I am constantly asked about testing private methods. I finally decided to write about my experience with it.

When really doing TDD (i.e., writing the test code before writing the functional code, and then refactoring the code), private methods are guaranteed to be covered by tests. When driving the functional code design and implementation from the unit tests (aka Test-Driven Development), no method is created private. Instead, private methods are extracted--via the extract-method refactoring step--from a public or package-level method. Example 1 below presents a scenario of applying TDD and shows how a private method is created.

New code development following TDD

Example 1: new code development following TDD

Consider the development example following a TDD sequence for creating methodA and methodB.

  1. create testMethodA/ create methodA/ refactoring
  2. create testMethodB/ create methodB/ refactoring: extract methodC

In the first sequence, testMethodA is created to validate the functionality expected of methodA. methodA is successfully created: all tests, including testMethodA, are passing. Then you look for improvements in the code: refactoring.

In the second sequence, testMethodB is created to validate the functionality expected of methodB. methodB is successfully created: all tests, including testMethodB, are passing. Then you look for improvements in the code: refactoring.

While looking for improvements in the TDD refactoring step, you recognize that methodA and methodB share a common fragment of code (code duplication). You extract the common fragment into a private method whose name explains its purpose--methodC. Then methodA and methodB invoke methodC.

In this example, testMethodA covers methodA, which invokes the private methodC; therefore the private methodC is covered by tests.
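A minimal Java sketch of the result might look like the following. Only the method names come from the example; the class name, bodies, and return values are invented for illustration.

```java
// Result of Example 1 after the extract-method refactoring: the
// fragment duplicated in methodA and methodB now lives in private
// methodC. (Names follow the example; the bodies are invented.)
public class Invoice {

    public String methodA() {
        // behavior specific to methodA...
        return "A:" + methodC(10);
    }

    public String methodB() {
        // behavior specific to methodB...
        return "B:" + methodC(20);
    }

    // Extracted via the extract-method refactoring step; it is
    // covered indirectly by testMethodA and testMethodB.
    private String methodC(int amount) {
        return "total=" + amount;
    }
}
```

Note that testMethodA and testMethodB keep passing after the extraction, which is exactly why methodC needs no test of its own.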

Please keep in mind that this is a simplified example. Your methods and tests should not read like testMethodA / methodA, and testMethodB / methodB. Jeff Patton describes how test cases can describe design in his Test-Driven Development Isn’t Testing paper.

When doing TDD, the private methods emerge from your code. And because you test-drive your development, no functionality is added without a test. So, for code fully developed by following TDD you won’t have to test the private method separately: Private methods are already being tested by previously written tests.

Improving legacy code following Test-Driven Refactoring

Now let’s look at a more realistic example. The majority of the code I have been working on is not new code; therefore I am not doing pure TDD. Instead, I am doing Test-Driven Refactoring.

Test-Driven Refactoring

Test-Driven Refactoring is an evolutionary approach to improving legacy code which instructs you to have test-proven refactoring intent. Basically, you start by writing a passing test around the code to be improved, and then you refactor the code, improving its internals while still passing the test suite.

While doing Test-Driven Refactoring, I try to perform small refactoring steps and, at times, I find myself attempting to test a private method. This typically happens when I am working on an existing code base with very low test coverage, where the public methods are too complex to write tests for.

Example 2: Test-Driven Refactoring for an existing code base with low test coverage.

Consider that you want to improve the following code:

public void overstuffedMethodX() {

    // very complex code

    // invoke private method methodY()
    someString = methodY();
}

private String methodY() {
    // ...
}

In the scenario presented in Example 2, the public method does not have corresponding unit tests, and I don’t feel comfortable refactoring code which does not have tests around it. Therefore I will follow Test-Driven Refactoring to improve the code. Below I explain two different approaches for doing Test-Driven Refactoring on the code in Example 2.

Top down Test-Driven Refactoring

First you create tests for the complex public method:

public void testOverstuffedMethodX(){…}

At this point there is test coverage around the overstuffedMethodX() functionality, so you are able to refactor the overstuffedMethodX() code, including the private method methodY().

In the top down Test-Driven Refactoring approach, first the unit test for the public method is created, and then its internals (including the private methods) are refactored.
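A Java sketch of this approach could look like the following. Only overstuffedMethodX(), methodY(), someString and the test name come from Example 2; the class names, bodies and the "computed" value are invented for illustration.

```java
// Top-down sketch: a test pins down the public method's observable
// behavior first; afterwards the internals, including private
// methodY(), can be refactored safely.
class LegacyService {
    String someString;

    public void overstuffedMethodX() {
        // ...very complex code elided...
        someString = methodY();
    }

    private String methodY() {
        return "computed";
    }
}

class OverstuffedMethodXTest {
    public void testOverstuffedMethodX() {
        LegacyService service = new LegacyService();
        service.overstuffedMethodX();
        // Assert only on externally visible state; no private
        // method is invoked directly from the test.
        if (!"computed".equals(service.someString)) {
            throw new AssertionError("overstuffedMethodX failed");
        }
    }
}
```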

Let’s now look into another approach.

Bottom up Test-Driven Refactoring

No test for the complex public method is created.

Instead you look for smaller pieces of improvement.

You change the access level for the private method to make it accessible from a unit test.

private String methodY (){}

becomes

String methodY (){} // package level access in Java

Then you write a test for methodY():

public void testMethodY (){…}

Then you refactor methodY(), and verify that the improvement works as the testMethodY() test still passes.
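In Java, the whole bottom-up sequence might look like this sketch. Again, only overstuffedMethodX(), methodY(), someString and the test name come from Example 2; the class names, bodies and the "computed" value are invented.

```java
// Bottom-up sketch: methodY() is widened from private to
// package-private so that a test class in the SAME package can
// invoke it directly.
class LegacyService {
    String someString;

    public void overstuffedMethodX() {
        // ...still too complex to test directly...
        someString = methodY();
    }

    // was: private String methodY()
    String methodY() {  // package-private: reachable from the test
        return "computed";
    }
}

// Must live in the same package as LegacyService.
class MethodYTest {
    public void testMethodY() {
        // A provisional test; it is deleted again once coverage of
        // the public method makes it redundant.
        if (!new LegacyService().methodY().equals("computed")) {
            throw new AssertionError("methodY failed");
        }
    }
}
```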

In the bottom up Test-Driven Refactoring approach, you first improve the test coverage and the code for the private methods and the internals of the complex, overstuffed method. By doing so, the code becomes less complex; after that you can start moving up the chain, increasing and broadening the test coverage until you are able to take care of the public method.

When applying the bottom up Test-Driven Refactoring approach to Example 2, the private methodY() is made package level in order to be accessible by its corresponding unit test (consider the package access level of the Java language). Similarly to testMethodY(), other tests are added in a bottom up fashion. After increasing the test coverage and improving the code internals, it becomes easier to create testOverstuffedMethodX() and, finally, refactor the overstuffed complex method.

Top down versus Bottom up Test-Driven Refactoring

Even though I consider the top down approach more purist, as it does not change the private methods’ access level, its implementation is not always straightforward. The public method might be almost “untestable” (e.g., static, dependencies, long, complex) in its current state. For such cases, a bottom-up approach might be more suitable. As I see it today, the code refactoring activity is a combination of bottom-up and top-down approaches that enables you to take small, proven steps towards a cleaner solution which still keeps the same interface (the external behavior, which is verified by the test suite).

Bottom line: while doing Test-Driven Refactoring, I have provisionally added tests for private methods. In Java, I deliberately remove the private modifier from the method declaration, changing the method to package access level. This way, the corresponding test class--located in the same package--can invoke the method and test it.

But I won’t stop the refactoring until I remove the tests for the private methods. I have experienced two refactoring sequences which finish without explicit tests for private methods.

In the first refactoring sequence, some later refactoring step moves the private method to a different class. Basically, you realize that the method’s functionality is beyond the original class’s purpose; you identify and create the new class, and move the method, which now has public access.

In the second refactoring sequence, the method access level goes back to being private and you delete the test which was directly invoking it. Because of the increasing test coverage--a result of the Test-Driven Refactoring--new test scenarios are added for the public method which invokes the private method. Once you (perhaps assisted by test coverage and analysis tools) notice the extra test coverage on your private method, you perform my preferred refactoring step--unnecessary code deletion. The test for the private method can be safely deleted, as its functionality is being tested by another test method.

Should I add test for private methods?

So, here is my answer to the question of testing private methods: after successfully doing TDD or Test-Driven Refactoring, your code will not have specific tests for the private methods.

If you are developing new code, you are following TDD. You test-drive the development and no functionality is added without a test. Private methods are only created as the result of a refactoring step. And the path of code going through the private method is already being tested by previously written tests. So, after successfully doing TDD, your code will not have specific tests for the private methods.

If you are improving a legacy code, you should be following Test-Driven Refactoring. In this case, you may provisionally add tests for private methods. Gradually, with the increasing test coverage, the tests for the public methods will cover all the paths, including the paths going through the private methods. At this point, you don’t require the tests for private methods anymore. So, after successfully following Test-Driven Refactoring, your code will not have specific tests for the private methods.

Tuesday, October 14, 2008

Business Analysts are Important

I've always been convinced that User Stories are the way to describe feature requests. It just felt natural. IEEE 830 specifications are just too ambiguous. Use Cases, though far more verbose, are often too large, covering all possible flows and exceptional cases. User Stories just fit the bill.

Writing good User Stories can be a daunting exercise. A proven way is to use the

AS A persona/role
I WANT goal
SO THAT business value

template, with which the User Stories become much more tangible and, more importantly, of value to the customer. Watch out for 'and' or 'or' in the I WANT section; those often indicate a second User Story.
So far so good, but this still leaves the acceptance criteria out of the picture. Acceptance Criteria are a good guideline for developers to make sure they do the right thing right. They give them a chance to cover themselves with proper unit tests. QA also uses them as a testing foundation. Most importantly though, they are the definition of development done (see 'Measuring Progress'). Sure, the customer still has to like and accept them!

So that brought me to use:

GIVEN context
WHEN event
THEN expected outcome

With those kinds of user stories you would think you are in pretty good shape. Well, you probably are, in the rare case that you are developing a programming tool. What does this mean? The more the development team knows about the problem domain, the better off they are. Now, consider the case that you develop software for the financial, medical, nuclear or any other field foreign to you. Looking at the image below, you are OK as long as you stay in the green band, where the requirements are clearly understood. In the case of the programming tool mentioned above, your programmers will stay in that band.


[Agile Software Development with Scrum; Prentice Hall, Ken Schwaber, Mike Beedle, p. 93]

Once you enter the red band, the team enters an unknown and not well understood territory. The User Stories might read like this.

AS A wombut
I WANT to blabalam amplituded chromioms in a shamaluk
SO THAT I can implotude the defused mobolobs

Does that make any sense? If you are a post-doctoral wombut, it will. So, how can this knowledge be transferred to the mortal world of developers?
One option would be to have the wombuts and developers spend long hours together so that the developers get some understanding of the domain. The more knowledge is missing, the longer it will take. Most often, however, wombuts are so busy that they can only spare an hour here and there. There might be some short hallway chats, but often those raise more questions than they answer.
Over the long run, this will lead to an information deficit, and the resulting vacuum will be filled with assumptions. Don't assume -- if you assume you make an ASS of yoU and ME ;)



[The dotted line shows limited communication]

In the world of TOC (Theory of Constraints) you can mitigate a constraint like this by introducing a buffer. The buffer enables you to be productive even though an upstream step is blocked. What could this buffer be? As the title already reveals, it would be a Business Analyst (BA).



[BA as buffer with good communication flow]

Having a Business Analyst will give you two things. First, knowledge you can access all the time; usually the knowledge of the BA is a subset of the experts', but in general good enough. If the BA does not know an answer or isn't sure, she will follow up with the domain expert, report back, and grow her knowledge by doing so. Second, the BA can help bridge the communication barrier between the developers and domain experts. She is used to working with developers and understands how they think. Still, the lingua franca between both sides would be the User Stories with their acceptance criteria. If both the developers and the domain expert are happy with the resulting User Stories, then we have an agreement about how the customer value will be delivered.

What about the blue vertical band in the requirements image above? The green/blue area is a well understood domain that is technically challenging. Think about the really fancy programming tool mentioned before; for that, strong programmers will be able to do the job. The red/blue area is when you need both: a BA and really good programmers.

If I were leading a project in the plain red area (not red/blue) and had to choose between a really good programmer and a BA, I would opt for the BA without hesitation.

Wednesday, September 17, 2008

Whose Fault Is It?

I recently read on a newsgroup about different opinions on who is at fault, and therefore responsible. The issue was about bugs which are discovered at the end of an iteration or during a following one.

One position was the following: this is the nature of software development, bugs do come up, and they should be handled like user stories and processed in an upcoming iteration. The other was that the developers messed up, and since it is their fault, they have to fix the bugs in their spare time.

What does each statement mean? The first means that the customer will get less value in subsequent iterations, as they include bug fixes. The developers get immunity and are not held accountable for their failures. The customer suffers.
Whereas the second one is on the side of the customer, as it clearly pushes the fault onto the developers, who need to fix it in their spare time -- which means long nights and/or weekends.

First, it should be differentiated whether this is a green field project or an existing legacy system, which by its very nature tends to be more brittle. Regardless, in my opinion both approaches are too polarized, and one is possibly illegal in certain countries. The high goal should be a customer getting all the value they can expect in a fair way, and programmers who can have a social life. Another solution to the problem is required.

In the last couple of months I've been reading quite a lot about Lean and how Toyota manages to make a $16B profit while the US car makers are asking for federal help. Lean has some great approaches, and one of them is the 5 Whys, or Root Cause Analysis. The idea is to ask Why five times; the fifth answer provides the root cause. By fixing the root cause you fix all discovered symptoms, and the problem should disappear for good.

So, I applied the 5 Whys in retrospective to a project I've been working on a while ago and which had exactly those problems.

Q: Why do we have bugs at the end of the iteration?
A: QA does not accept some user stories as they don't fulfill the acceptance criteria.

Q: Why don't the user stories fulfill the acceptance criteria?
A: QA and the domain expert don't have enough time to specify them on time, and therefore the developers don't have access to them during development.

Q: Why don't QA and the domain expert have enough time?
A: They work on other project as well and share their time between those.

Q: Why do they work on several projects at the same time?
A: Matrix Management

Q: Why Matrix Management?
A: Upper Management believes that this improves efficiency.

With this process we were able to identify one root cause -- Matrix Management. By removing this cause we should see improvement. In this scenario the solution is structural -- I dare say political -- rather than technical. Upper Management needs to be convinced that having dedicated QA and domain experts per project is necessary. Don't assume that this will be easy. I strongly recommend providing hard data and generating some statistics describing the current situation and how the change would improve productivity.

I want to re-emphasize the connection between Matrix Management and the problem of not having dedicated resources per team. The resourcing problem was obvious after answer number three, but the root cause is the management structure. With that knowledge, you most likely will be able to identify and address further issues.

Once you get this sorted out and you see improvement don't let inertia slow you down. Look out for the next problem and apply the 5 Whys once more.

Just a final thought at the end: often, too many people working on too many projects in parallel is a symptom of prioritization gone wrong. Fear of postponing the wrong projects causes more projects to be underway than should be.

Monday, July 21, 2008

Agile Bridge Analogy

It is quite common to make analogies between the IT industry and Civil engineering. Developers often compare software development and design with construction projects; for example, the importance of having a blueprint and following known successful practices. I have some hesitations about using analogies between these industries. I find the resources used by both industries fairly different and for that reason the analogies can create false expectations and reasoning. Nevertheless, I have been successfully using the analogy of building a bridge for explaining Agile development.

When I was a teenager, I spent my summer vacations in a small beach town near Rio de Janeiro. In that city (Rio das Ostras) there was a little river which separated two parts of the town. Back then there was no bridge to cross the river in a specific section of the town.

Over a few years, I was able to experience the bridge being built. When teaching Agile software development practices, I have been using the construction process for that bridge as an example of how Agile delivers business value incrementally and iteratively.

To better explain the analogy, I will take a look into the traditional approach for building bridges and then I will explain the Agile counterpart.

The Traditional development

Traditionally, a bridge construction will be planned in detail. The blueprint will be produced, the budget allocated and the schedule created (perhaps several months of construction).

'Big-bang release' is the term used to describe the common software development release style: at the very end, the software is released in one big (bang) release. But what happens if it is not successful?

Below are some photos from unfinished bridges.

Half-built bridges or long running construction does not deliver any business value: nobody is able to cross the river!

The Agile Development

Let’s now look into the Agile approach for building a bridge.

The figure below shows a first version of the bridge (maybe after a first iteration). Even though it is very simple (and fast to plan and build), the bridge delivers business value: one person at a time can cross the river.

At the next iteration, another piece of wood is added to the bridge. Now, the bridge handles more load and two people can cross the bridge at a time, as well as bicycles and motorbikes.

Perhaps, for the next iteration(s), the bridge is reinforced and now people, bikes, motorbikes and light vehicles are able to cross the river.


Following an iterative and incremental development style, extra reinforcements are added to the bridge during the next iteration(s) and now people, bicycles, motorbikes, light and heavier vehicles can cross the river.

Even though it took a while to reach the current stage of the bridge, business value was delivered all along.

In this blog entry, I used an analogy between building bridges and developing software. I wrote about an Agile approach for building bridges. In fact, in this approach I have been a satisfied end user (someone wanting to cross a river). Furthermore, I used this analogy to illustrate how Agile focuses on delivering business value incrementally (and in shorter release cycles) through an iterative development process.

Even though in this example I compared software development to building a small bridge, Agile has been proven to work for large projects as well. I have witnessed large successful Agile projects for which business value was iteratively and incrementally delivered.

Wednesday, June 11, 2008

Resetting the 'Shitty' Counter

Again and again I hear the following phrase: 'This code is so bad that it has to be rewritten'. That statement is right in some cases. Even though I prefer to refactor, there are situations where it does not make sense any more. Most code bases can be salvaged with major and persistent refactoring efforts. Over time your refactoring skills get better and better, and more systems become curable, but still, some cases are what they are: terminal.
What does it really mean to rewrite a system? It is the defeat of the software team. Often the existing team inherited the system, or, worse, they are the original authors. In either case, no team was able to make the code base more maintainable, or at a minimum not make it worse. The project lacked fundamental skills like team communication and best coding, configuration and testing practices. They failed -- the code rotted in their hands!
Next, this team decides to rewrite the system and promises to make everything better the next time. Sounds good? To me, it is more like a broken record. Why? They will repeat most of their mistakes and eventually call for another rewrite. Sure, the team learned from some of the old mistakes, but this still does not suffice to prevent another decaying code base. The rewrite basically means 'Resetting the Shitty Counter' [1].
Now your counter is at zero, but unless you take matters into your own hands and actively manage the risks of your old mistakes, your 'Shitty Counter' will slowly inch upwards. Well, ... you know the story!
How can this be achieved? There are many best practices from the agile world and even more books about those. It would be foolish for me to try to explain each of them and their dependencies and how they interact.

However, I will give you my elevator pitch:
  • Test Driven Development (TDD)
  • Pair Programming
  • Continuous Integration
  • Co-location of programmers, testers and domain experts
  • Scrum (do the Nokia Scrum Test to make sure you really do it [2])
  • Never ever change the length of your iteration
  • Determine your velocity
  • Retrospectives after every iteration [3]
From a code point of view, one should aim for:
  • Short methods (no more than 10 lines per method)
  • Single Responsibility Principle per Class
  • Low complexity (McCabe or CCN)
  • Loose Coupling and High Cohesion
  • Law of Demeter
  • Injectability (e.g. Spring, Google Guice, PicoContainer)
Static code metrics to watch:
  • Test Coverage (use with caution unless you have many small tests)
  • Testability of Code (Testability Explorer) [4]
  • Percentage of duplicated code (CPD) [5]
These are the tools of the trade of an agile, test-infected code warrior. If you use those best practices on a daily basis and refactor your code mercilessly, over time your code base will actually become better. It is not easy; it is a skill you learn by doing, day in, day out.

[1] This term was coined by a colleague of mine
[2] http://www.agilecollab.com/the-nokia-test
[3] Agile Retrospectives (Esther Derby & Johanna Rothman)
[4] http://www.testabilityexplorer.org
[5] http://pmd.sourceforge.net/cpd.html