The Dependency Injection Debate

Hacker News is a great place. It keeps people involved and interested, with articles, questions, answers and debates. One debate going on right now is about Dependency Injection (DI). One post declared it a non-virtue, and someone else subsequently declared it a virtue. Virtue or not, one thing is clear: it is important. People have strong opinions about it, and I have a few things to say as well. The point being contested seems to be:

Dependency Injection breaks encapsulation

Since I have mainly been a Java developer, I will use Java examples in this post. The details here may have nothing to do with the original posts mentioned above; this is my understanding of the topic and of some of the concerns I have seen raised about DI.

So what is DI really? For an excellent and well-written explanation, please refer to Martin Fowler’s article. In very plain terms: whenever you use the new operator in your Java code, you are creating a dependency on the class of the object being created. For example,

public class Car {  
    private Engine engine; 
    public Car() { 
        engine = new Engine(); 
    } 
}

This is a very simple example. The Engine created in the constructor is a dependency. Such a dependency can appear anywhere in the code: in a constructor, in a method, in a block. The problem with hard-wired dependencies like this is that the code using them is difficult to test in isolation. The caller has no knowledge of the dependency and no way to control its behaviour. Let's see why the code above is not good.

While testing code, one is concerned about:

  • Unit testing
  • Integration testing

Unit testing a method that uses new becomes tricky. Imagine that new is used to create a DAO object or a service object. Your unit test now depends on the availability of the DAO or the service, and running many such unit tests could take a long time. Besides that, test cases cannot switch or change the implementation when new is used. For integration tests, using new may still be acceptable in cases where you do not need to change configs, database locations and so on.

This is where DI comes in. DI is a mechanism for passing all the required dependencies to an object from outside, which removes the use of new on those dependencies from within the class.

public class Car {  
    private Engine engine; 
    public Car(Engine engine) 
    { 
        this.engine = engine; 
    } 
}

With the code above, a test case can pass a derived engine or a dummy engine to the Car object and control its behaviour.
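A minimal sketch of what such a test double might look like. The post does not show Engine's members, so the start() and drive() methods and the DummyEngine class below are assumptions made purely for illustration.

```java
// Stand-in for the post's Engine; start() is a hypothetical method.
class Engine {
    public String start() {
        return "real engine started";
    }
}

// A dummy engine derived from Engine, whose behaviour the test controls.
class DummyEngine extends Engine {
    int startCalls = 0; // lets a test verify how the Car used its engine

    @Override
    public String start() {
        startCalls++;
        return "dummy engine started";
    }
}

class Car {
    private final Engine engine;

    // The dependency is injected instead of being created with new.
    Car(Engine engine) {
        this.engine = engine;
    }

    // Hypothetical method that delegates to the injected engine.
    public String drive() {
        return "Car: " + engine.start();
    }
}

class CarDemo {
    public static void main(String[] args) {
        DummyEngine dummy = new DummyEngine();
        Car car = new Car(dummy);
        System.out.println(car.drive());
        System.out.println("start() calls: " + dummy.startCalls);
    }
}
```

Because the test owns the DummyEngine instance, it can both decide what start() returns and check afterwards how often the Car called it.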

Using DI has a few effects:

  • The code becomes testable with full control on how dependencies behave. Unit tests become easier and faster.
  • Since dependencies are being passed, it becomes easier to change implementations.
  • The API of the class in question is no longer as sound and robust: the class now exposes its dependencies, which breaks encapsulation in a sense.

It is clear that we have gained testability, but we have broken encapsulation from the perspective of the class's API. Let's take another example to understand this better.

public class DataFetcher {
    private final Account account;
    private final Query query;

    public DataFetcher(Account account, Query query) {
        this.account = account;
        this.query = query;
    }

    public Result fetch() {
        // get the initial set of records
        ExternalService externalService;
        if (account instanceof AccountType1)
            externalService = new ExternalService1(account);
        else
            externalService = new ExternalService2(account);
        externalService.login();
        Result results = externalService.fetch();
        // the call above returns the first 10,000 records; we may have more
        if (externalService.hasMoreData()) {
            // modify the query to advance it to the next set of results
            // and submit it as a background job
            Backgrounder.getInstance().addBackgroundTask(new Task(externalService));
        }
        return results;
    }
}

Before you jump out of your seat about the singleton: yes, there are two problems in the code above:

  1. It uses new to create the ExternalService
  2. It uses a singleton

Let's walk through the problem that this class is trying to address. It is a DataFetcher service that takes an Account object and a Query object and returns results. The account can be a Google account or an Omniture account, and the query can be a Google Analytics query or an Omniture SiteCatalyst query. The service returns the first available batch of results to the caller and then keeps fetching data in the background.

This is not testable, because we have used new. To make it slightly more testable, we can use DI and inject the external service through the constructor, as below:

public DataFetcher(Account account, Query query, ExternalService externalService) {
    this.account = account;
    this.query = query;
    this.externalService = externalService;
}

We have now removed one dependency, and we can control the external service. This rests on the assumption that ExternalService1 and ExternalService2 share a common interface and that we own both of them.

What if these two external services came from two different jar files supplied by third parties, for example a Google Analytics client and an Omniture client? We can still define a common interface and get things working with the constructor above, but the calling class now needs to know about those two jars and the corresponding classes, or else we end up creating a factory or service locator that returns the appropriate service based on account type. Besides, the API of DataFetcher changes. It is no longer “give me an account and a query to get results”. It is “give me an account, a query and an external service to get results”.
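A rough sketch of the factory option mentioned above. ExternalServiceFactory and AccountType2 are hypothetical names, and the type bodies are empty stand-ins; only the selection by account type mirrors the DataFetcher code shown earlier.

```java
// Stand-ins for the post's types; their members are assumptions.
class Account { }
class AccountType1 extends Account { }
class AccountType2 extends Account { }

// Common interface that we define over the two third-party clients.
interface ExternalService {
    void login();
}

class ExternalService1 implements ExternalService {
    private final Account account;
    ExternalService1(Account account) { this.account = account; }
    @Override public void login() { /* would delegate to the first vendor's client */ }
}

class ExternalService2 implements ExternalService {
    private final Account account;
    ExternalService2(Account account) { this.account = account; }
    @Override public void login() { /* would delegate to the second vendor's client */ }
}

// Hypothetical factory: picks the service from the account type,
// so callers do not have to reference the vendor-specific classes.
final class ExternalServiceFactory {
    private ExternalServiceFactory() { }

    static ExternalService forAccount(Account account) {
        if (account instanceof AccountType1) {
            return new ExternalService1(account);
        }
        return new ExternalService2(account);
    }
}
```

A caller would then write something like new DataFetcher(account, query, ExternalServiceFactory.forAccount(account)), which is exactly the extra knowledge the caller did not need before.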

If DataFetcher were bundled into a jar for use by third parties, those third parties would need to know about one more class: the factory, service locator or utility class. For me, this breaks encapsulation. The fact that DataFetcher uses an external service is no longer hidden. The caller has nothing to do with the external service; it merely gets it from somewhere and passes it to the DataFetcher.

Now, let's say we make these changes and pass an ExternalService to the DataFetcher. The method is still not testable, because of the singleton. These days most people would say that using a singleton is the sign of a bad programmer. Global state, after all, is bad. Indeed it is. Let's see what is happening in this particular use case.

DataFetcher returns an initial result set and starts background processing (threads) to handle all the remaining data. The backgrounder object may use an internal Java queue or an external queue. The queue is the shared, global state here. It makes sense not to create a new backgrounder object for each task submission. DataFetcher does not interact with the queue at all, other than submitting a task to it.
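The post does not show the Backgrounder class itself, so the following is only a guess at its shape: a singleton wrapping one in-memory queue. Using Runnable as the task type and the pendingTasks() method are assumptions for illustration (the DataFetcher code above submits a Task).

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a singleton backgrounder around one shared queue.
final class Backgrounder {
    private static final Backgrounder INSTANCE = new Backgrounder();

    // The shared, global state: every caller submits to this one queue.
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();

    private Backgrounder() { }

    static Backgrounder getInstance() {
        return INSTANCE;
    }

    void addBackgroundTask(Runnable task) {
        queue.add(task);
    }

    // Assumed helper so the amount of queued work can be observed.
    int pendingTasks() {
        return queue.size();
    }
}
```

Every call to getInstance() returns the same object, so a task submitted in one test is still queued when the next test runs. That leakage between tests is the practical cost of the global state.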

To make DataFetcher fully testable, one needs to get around the backgrounder. One solution is to define an interface and pass an implementation of it into the DataFetcher constructor.

public interface IBackgrounder { /* ........ */ }

The calling class now needs to know about the internals of DataFetcher: it needs to know that DataFetcher uses some kind of backgrounder. The interface above exists mainly to fulfil a testing requirement and serves little other purpose in the system today. Admittedly, it does guard us against future changes where we may need to replace our backgrounder class altogether.
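A sketch of how that might look in a test. The addBackgroundTask method mirrors the call made in DataFetcher above, but the body of IBackgrounder is elided in the post, so the method signature, the RecordingBackgrounder fake and the trimmed-down SimplifiedDataFetcher are all assumptions.

```java
// Stand-in for the post's Task; the description field is hypothetical.
class Task {
    final String description;
    Task(String description) { this.description = description; }
}

// Assumed shape of the interface, based on the call DataFetcher makes.
interface IBackgrounder {
    void addBackgroundTask(Task task);
}

// Test double: records submissions instead of running anything.
class RecordingBackgrounder implements IBackgrounder {
    int submitted = 0;
    Task last;

    @Override
    public void addBackgroundTask(Task task) {
        submitted++;
        last = task;
    }
}

// Trimmed-down fetcher: only the background hand-off is modelled.
class SimplifiedDataFetcher {
    private final IBackgrounder backgrounder;
    private final boolean hasMoreData; // stands in for externalService.hasMoreData()

    SimplifiedDataFetcher(IBackgrounder backgrounder, boolean hasMoreData) {
        this.backgrounder = backgrounder;
        this.hasMoreData = hasMoreData;
    }

    String fetch() {
        if (hasMoreData) {
            backgrounder.addBackgroundTask(new Task("fetch next batch"));
        }
        return "first batch";
    }
}
```

With the fake injected, a test can assert that exactly one task was submitted when more data is available and none otherwise, without starting any threads or touching a real queue.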

So what's the conclusion? Is DI bad or not? Well, it does change the way one codes and it does break encapsulation, but that is the price we pay for a high level of testability. It also depends on how rigorous one wants to be about testing and continuous integration. In times when we talk about frequent code check-ins and shorter release cycles, DI becomes essential.