Are you leaking this?

Regarding memory management, Java is in general a much safer language than C or C++. However, it is still possible to get in trouble if you are not careful. One common pitfall is to leak the this reference from a constructor, which can produce strange behaviour, since the this reference points to an object that is not fully constructed yet.

Here is an example of how this can happen (pun intended). Consider a typical implementation of the Observer pattern. We have a class of objects that are observable:

public class Observable {
    private final Collection<Observer> observers = new ArrayList<>();

    public void addObserver(Observer observer) {
        observers.add(observer);
    }

    public void removeObserver(Observer observer) {
        observers.remove(observer);
    }

    public void notifyObservers() {
        for (Observer observer : observers) {
            observer.onEvent();
        }
    }
}

And we have the observers, which are objects that want to be notified when something happens:

public class Observer {
    private final int id;
    private int eventsObserved = 0;

    public Observer(int id, Observable observable) {
        observable.addObserver(this);
        this.id = id;
    }

    public void onEvent() {
        System.out.println("Event received by observer " + id);
        eventsObserved++;
    }

    public int getEventsObserved() {
        return eventsObserved;
    }

    // autogenerated by the IDE
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Observer observer = (Observer) o;
        return id == observer.id;
    }

    // autogenerated by the IDE
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

The whole reason the observers exist is to observe the observable, so it makes sense that they register themselves with the observable in the constructor.

Finally, let’s write a small test to see if everything works as expected:

class ObserverTest {
    @Test
    void testObserver() {
        // given
        Observable observable = new Observable();
        Observer observer0 = new Observer(0, observable);
        Observer observer1 = new Observer(1, observable);

        // when
        observable.notifyObservers();

        // then
        assertEquals(1, observer0.getEventsObserved());
        assertEquals(1, observer1.getEventsObserved());
    }
}

The test passes, and the print statements produce the expected output:

Event received by observer 0
Event received by observer 1

Now let’s introduce a small change in the Observable class. We change the ArrayList of observers to a HashSet:

    private final Collection<Observer> observers = new HashMap<>();

We run the test again, and this time it fails:

Expected :1
Actual   :0

and the print statements produce the following output:

Event received by observer 0

Looks like somehow the observer 1 did not receive the event. Our first thought might be that we made a mistake implementing the equals/hashCode. Let’s add a couple of assertions to the test:

assertNotEquals(observer0, observer1);
assertNotEquals(observer0.hashCode(), observer1.hashCode());

Those assertions pass. Given the two observers are different, how come only one of them received the event? The answer is that the two observers were not always different. In particular, they were equal when they were added to the HashSet, and they only became different later. Impossible! The equals/hashCode implementation uses the id field, which is final. It cannot change its value!

Let’s look again at the constructor of the Observer:

    public Observer(int id, Observable observable) {
        observable.addObserver(this);
        this.id = id;
    }

The this reference is leaked from the constructor and passed to the addObserver method of another object. At this point, the object is only partially constructed. The id field has not received its “final” value yet. However, the observable calls equals/hashCode to insert the observers into a set, and those methods operate on partially constructed objects.

Note that neither the Java compiler nor the JVM issue a warning or an error when a method accesses an un-initialised field, even if the field is final.

You may wonder: if id has not been initialised yet, what is its value? Programmers with experience in languages like C or C++ may be getting cold sweats at this point. However, in Java the value of a field is always the default value for its type. Since id is an integer, its default value is 0. This is why the two observers were equal when they were added to the HashSet. However, after the constructor finished, the id field was initialised to the definitive value, and the observers became different. In other words, despite id being a final field, in practice it had two values during its lifetime!

I know what you are thinking: the constructor was purposely written to leak the this reference before the id field was initialised. This is a contrived example. In practice, programmers are unlikely to make that obvious mistake. Just initialise the fields before calling any other method, or avoid calling methods of other objects from the constructor altogether, for example, by separating the registration of the observer from the constructor. Easy!

Unfortunately, the problem is not always that obvious. What if I told you that it is possible to leak a reference to a partially constructed this even without using the this keyword or calling a different object from the constructor?

Let’s see a different example. Consider the following class:

public class Customer {
    private final int age;
    private final double discountRate;

    public Customer(int age) {
        this.age = age;
        this.discountRate = calculateDiscountRate();
    }

    protected double calculateDiscountRate() {
        if (age > 65) {
            return 0.5;
        } else {
            return 0.0; // pay full price
        }
    }

    public double getDiscountRate() {
        return discountRate;
    }
}

Customers get a discount based on their age. However, we plan to have other types of customers with other types of discounts, therefore we have declared the calculateDiscountRate method protected.

We now introduce a subclass of customers, the FrequentCustomer. Customers who visit the shop more than 5 times get an extra 10% discount:

public class FrequentCustomer extends Customer {
    public static final int MINIMUM_VISITS = 5;
    private final int visits;

    public FrequentCustomer(int age, int visits) {
        super(age);
        this.visits = visits;
    }

    @Override
    protected double calculateDiscountRate() {
        if (visits >= MINIMUM_VISITS) {
            return Math.min(1.0, super.calculateDiscountRate() + 0.10); // frequent customers get 10% extra discount
        } else {
            return super.calculateDiscountRate();
        }
    }
}

We write a test to check that the discount rate is calculated correctly:

class FrequentCustomerTest {
    static final int MANY_VISITS = 8;
    static final int FEW_VISITS = 0;
    static final int YOUNG_AGE = 20;
    static final int SENIOR_AGE = 70;

    @Test
    void youngFrequentCustomerShouldGet10PercentDiscount() {
        FrequentCustomer frequentCustomer = new FrequentCustomer(YOUNG_AGE, MANY_VISITS);
        assertEquals(0.1, frequentCustomer.getDiscountRate()); // 10% for frequent visits
    }

    @Test
    void youngInfrequentCustomerShouldGetNoDiscount() {
        FrequentCustomer frequentCustomer = new FrequentCustomer(YOUNG_AGE, FEW_VISITS);
        assertEquals(0.0, frequentCustomer.getDiscountRate());
    }

    @Test
    void elderFrequentCustomerShouldGet60PercentDiscount() {
        FrequentCustomer frequentCustomer = new FrequentCustomer(SENIOR_AGE, MANY_VISITS);
        assertEquals(0.6, frequentCustomer.getDiscountRate()); // 50% for being a senior + 10% for frequent visits
    }

    @Test
    void elderFrequentCustomerShouldGet50PercentDiscount() {
        FrequentCustomer frequentCustomer = new FrequentCustomer(SENIOR_AGE, FEW_VISITS);
        assertEquals(0.5, frequentCustomer.getDiscountRate()); // 50% for being a senior
    }
}

Two of the test cases fail:

youngFrequentCustomerShouldGet10PercentDiscount: expected 0.1, actual 0.0.
elderFrequentCustomerShouldGet60PercentDiscount: expected 0.6, actual 0.5.

Clearly the 10% extra discount is not being applied. But why?

The problem is that the calculateDiscountRate method of the FrequentCustomer class accesses the visits field, but at the time that code is executed, the field is not initialised yet. The constructor of FrequentCustomer calls the constructor of Customer, which in turn calls the overridden calculateDiscountRate method of FrequentCustomer.

In this case, there is no obvious this reference being leaked. In fact, the this keyword does not even appear other than on the left hand side of assignments. There is also no obvious “inversion” of the natural order of a constructor. Fields are initialised before calling methods. At a first glance, the code looks correct.

The morale is that it is not safe to assume that when the constructor of Customer finishes its execution, the object is fully initialised. The constructor of FrequentCustomer still has to be executed in order to finish the initialisation of the object. In fact, it is possible that the calculateDiscountRate method of FrequentCustomer is called before the constructor of FrequentCustomer has executed!

It is clear that omitting the keyword this or avoiding invoking methods of other classes is not sufficient to prevent this class of errors. How can we prevent them, then? A safe recipe is to never call member methods from the constructor. All methods called from the constructor should be static. Another option is to only call final or private methods, but this is still prone to errors because depending on the order of the calls, the method may be called before all the fields are initialised.