Regarding memory management, Java is in general a much safer language than C or C++.
However, it is still possible to get in trouble if you are not careful. One common pitfall
is to leak the this
reference from a constructor, which can produce strange behaviour,
since the this
reference points to an object that is not fully constructed yet.
Here is an example of how this can happen (pun intended). Consider a typical implementation of the Observer pattern. We have a class of objects that are observable:
public class Observable {
private final Collection<Observer> observers = new ArrayList<>();
public void addObserver(Observer observer) {
observers.add(observer);
}
public void removeObserver(Observer observer) {
observers.remove(observer);
}
public void notifyObservers() {
for (Observer observer : observers) {
observer.onEvent();
}
}
}
And we have the observers, which are objects that want to be notified when something happens:
public class Observer {
private final int id;
private int eventsObserved = 0;
public Observer(int id, Observable observable) {
observable.addObserver(this);
this.id = id;
}
public void onEvent() {
System.out.println("Event received by observer " + id);
eventsObserved++;
}
public int getEventsObserved() {
return eventsObserved;
}
// autogenerated by the IDE
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Observer observer = (Observer) o;
return id == observer.id;
}
// autogenerated by the IDE
@Override
public int hashCode() {
return Objects.hashCode(id);
}
}
The whole reason the observers exist is to observe the observable, so it makes sense that they register themselves with the observable in the constructor.
Finally, let’s write a small test to see if everything works as expected:
class ObserverTest {
@Test
void testObserver() {
// given
Observable observable = new Observable();
Observer observer0 = new Observer(0, observable);
Observer observer1 = new Observer(1, observable);
// when
observable.notifyObservers();
// then
assertEquals(1, observer0.getEventsObserved());
assertEquals(1, observer1.getEventsObserved());
}
}
The test passes, and the print statements produce the expected output:
Event received by observer 0
Event received by observer 1
Now let’s introduce a small change in the Observable
class. We change the ArrayList
of
observers to a HashSet
:
private final Collection<Observer> observers = new HashMap<>();
We run the test again, and this time it fails:
Expected :1
Actual :0
and the print statements produce the following output:
Event received by observer 0
Looks like somehow the observer 1 did not receive the event. Our first thought might be that
we made a mistake implementing the equals/hashCode
. Let’s add a couple of assertions to the
test:
assertNotEquals(observer0, observer1);
assertNotEquals(observer0.hashCode(), observer1.hashCode());
Those assertions pass. Given the two observers are different, how come only one of them
received the event? The answer is that the two observers were not always different. In particular,
they were equal when they were added to the HashSet
, and they only became different later.
Impossible! The equals/hashCode
implementation uses the id
field, which is final. It cannot
change its value!
Let’s look again at the constructor of the Observer
:
public Observer(int id, Observable observable) {
observable.addObserver(this);
this.id = id;
}
The this
reference is leaked from the constructor and passed to the addObserver
method
of another object. At this point, the object is only partially constructed. The id
field
has not received its “final” value yet. However, the observable calls equals/hashCode
to
insert the observers into a set, and those methods operate on partially constructed objects.
Note that neither the Java compiler nor the JVM issue a warning or an error when a method accesses an un-initialised field, even if the field is final.
You may wonder: if id
has not been initialised yet, what is its value? Programmers with
experience in languages like C or C++ may be getting cold sweats at this point. However,
in Java the value of a field is always the default value for its type. Since id
is an integer,
its default value is 0. This is why the two observers were equal when they were added to the
HashSet
. However, after the constructor finished, the id
field was initialised to the
definitive value, and the observers became different. In other words, despite id
being a final
field, in practice it had two values during its lifetime!
I know what you are thinking: the constructor was purposely written to leak the this
reference
before the id
field was initialised. This is a contrived example. In practice, programmers
are unlikely to make that obvious mistake. Just initialise the fields before calling any other
method, or avoid calling methods of other objects from the constructor altogether, for example,
by separating the registration of the observer from the constructor. Easy!
Unfortunately, the problem is not always that obvious. What if I told you that it is possible
to leak a reference to a partially constructed this
even without using the this
keyword or
calling a different object from the constructor?
Let’s see a different example. Consider the following class:
public class Customer {
private final int age;
private final double discountRate;
public Customer(int age) {
this.age = age;
this.discountRate = calculateDiscountRate();
}
protected double calculateDiscountRate() {
if (age > 65) {
return 0.5;
} else {
return 0.0; // pay full price
}
}
public double getDiscountRate() {
return discountRate;
}
}
Customers get a discount based on their age. However, we plan to have other
types of customers with other types of discounts,
therefore we have declared the calculateDiscountRate
method protected.
We now introduce a subclass of customers, the FrequentCustomer
. Customers who
visit the shop more than 5 times get an extra 10% discount:
public class FrequentCustomer extends Customer {
public static final int MINIMUM_VISITS = 5;
private final int visits;
public FrequentCustomer(int age, int visits) {
super(age);
this.visits = visits;
}
@Override
protected double calculateDiscountRate() {
if (visits >= MINIMUM_VISITS) {
return Math.min(1.0, super.calculateDiscountRate() + 0.10); // frequent customers get 10% extra discount
} else {
return super.calculateDiscountRate();
}
}
}
We write a test to check that the discount rate is calculated correctly:
class FrequentCustomerTest {
static final int MANY_VISITS = 8;
static final int FEW_VISITS = 0;
static final int YOUNG_AGE = 20;
static final int SENIOR_AGE = 70;
@Test
void youngFrequentCustomerShouldGet10PercentDiscount() {
FrequentCustomer frequentCustomer = new FrequentCustomer(YOUNG_AGE, MANY_VISITS);
assertEquals(0.1, frequentCustomer.getDiscountRate()); // 10% for frequent visits
}
@Test
void youngInfrequentCustomerShouldGetNoDiscount() {
FrequentCustomer frequentCustomer = new FrequentCustomer(YOUNG_AGE, FEW_VISITS);
assertEquals(0.0, frequentCustomer.getDiscountRate());
}
@Test
void elderFrequentCustomerShouldGet60PercentDiscount() {
FrequentCustomer frequentCustomer = new FrequentCustomer(SENIOR_AGE, MANY_VISITS);
assertEquals(0.6, frequentCustomer.getDiscountRate()); // 50% for being a senior + 10% for frequent visits
}
@Test
void elderFrequentCustomerShouldGet50PercentDiscount() {
FrequentCustomer frequentCustomer = new FrequentCustomer(SENIOR_AGE, FEW_VISITS);
assertEquals(0.5, frequentCustomer.getDiscountRate()); // 50% for being a senior
}
}
Two of the test cases fail:
youngFrequentCustomerShouldGet10PercentDiscount
: expected 0.1, actual 0.0.elderFrequentCustomerShouldGet60PercentDiscount
: expected 0.6, actual 0.5.
Clearly the 10% extra discount is not being applied. But why?
The problem is that the calculateDiscountRate
method of the FrequentCustomer
class
accesses the visits
field, but at the time that code is executed, the field is not
initialised yet. The constructor of FrequentCustomer
calls the constructor of Customer
,
which in turn calls the overridden calculateDiscountRate
method of FrequentCustomer
.
In this case, there is no obvious this
reference being leaked. In fact, the this
keyword
does not even appear other than on the left hand side of assignments. There is also no obvious
“inversion” of the natural order of a constructor. Fields are initialised before calling methods.
At a first glance, the code looks correct.
The morale is that it is not safe to assume that when the constructor of Customer
finishes
its execution, the object is fully initialised. The constructor of FrequentCustomer
still
has to be executed in order to finish the initialisation of the object. In fact, it is
possible that the calculateDiscountRate
method of FrequentCustomer
is called before
the constructor of FrequentCustomer
has executed!
It is clear that omitting the keyword this
or avoiding invoking methods of other
classes is not sufficient to prevent this class of errors. How can we prevent them, then?
A safe recipe is to never call member methods from the constructor. All methods called
from the constructor should be static
. Another option is to only call final
or private
methods, but this is still prone to errors because depending on the order of the calls,
the method may be called before all the fields are initialised.