Don’t Get Caught out by Covariance!

Downcasting is bad, you shouldn’t do it, it’s code smell and it’s an anti-pattern. Unfortunately, in real world code you will see plenty of it. So you need to be aware of the pottential pitfalls of using downcasting. One that caught me out recently is covariance.

In C++, you can quite freely cast between types. You can take a pointer to some memory and cast it to any type you like, and read out what you get. Generally speaking you won’t get anything useful. If you’re not careful you’ll get “undefined behaviour”. If you have data stored in memory, the assumption is that you know that type that data is, and you are responsible for using it appropriately.

In languages like Java and C#, things are different. Here, the runtime checks our casts and will throw an exception if it thinks you have gone wrong. The consequences of this difference can sometimes be surprising.

Let’s look at an example. Suppose we have a base class Security defined like so:

public class Security
{
    int Id;
    public Security(int id)
    {
        Id = id;
    }
}

and two subclasses, Stock:

public class Stock : Security
{
    string Name;
    public Stock(int id, string name) : base(id)
    {
        Name = name;
    }
}

and Bond:

public class Bond : Security
{
    double Rate;
    public Bond(string name, double rate) : base(id)
    {
        Rate = rate;
    }
}

Now if we have a reference of type Security, that actually points to a Stock, we can happily downcast it to a reference of type Stock, like this:

Security s1 = new Stock(1, "GOOG");
Stock AsStock = (Stock)s1;
Console.WriteLine(s1.Name);

Because, Stock is a subtype of Security we can cast a reference to a Stock to a reference to a Security. This works because every Stock will have all the fields of a Security, in the same relative locations in memory.

However, if you have a reference of type Security that points to a Security or a Bond, and try and cast it to a Stock, you’ll have trouble. If we run the following code

Security s1 = new Bond(1, 0.02);
Stock AsStock = (Stock)s1;
Console.WriteLine(s1.Name);

we will see a run time exception of the form:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Bond' to type 'Casting.Stock'.

This makes sense, the runtime knows that the object we have a reference to is not a Stock. It knows that we can’t cast this object to a stock in a sensible way. So the runtime stops us by throwing an exception.

Let’s look at a slightly more complicated example. Suppose, instead of a single object, we had a whole array of them. The following code:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

will run happily. We can cast a reference of type Security[] that points to a Stock[], to a reference of type Stock[]. However if we try

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

we will get an InvalidCastException:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Security[]' to type 'Casting.Stock[]'.

This might seem a little surprising. The objects in our array are still actually Stocks. We know that we can cast a reference of type Security that points to a Stock to a reference of type Security. Why can’t we cast the Security[] reference to a Stock[] reference?

It’s a subtle one. When we cast an array reference, we are not casting the objects in the array, we are casting the array itself. So in the first array example, we are casting a reference of type Security[] to a reference of type Stock[]. The runtime knows that the reference actually points to an object of type Stock[], so this is fine. There will only ever be Stock objects in this array. Even though we have a reference of type Security[] pointing to this array, we can’t do something like:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);

the runtime knows that our array is of type Stock[], to it throws an exception:

Unhandled exception. System.ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.

However, in the second example, we have a reference of type Security[] that points to an array of type Security[]. Although this array only contains stocks, the runtime cannot in general say whether that is true or not. Suppose we had done something like this instead:

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);
Stock[] stocks = (Stock[]) securities; 

We can of course add a Bond to an array of type Security[], the runtime doesn’t keep track of all this, which is why it has no way of knowing if the securities array really does contain objects of type Stock or something else.

The name of this type of casting is Covariance. Eric Lippert, one of the original designers of C#, has a pretty good blog about it. The really important point is that when we have an array of some base type, even if we are able to downcast the individual members of that Array, the runtime might stop us from downcasting the entire array.

This tripped me up in my day job last week. I was performing a data load that returned an array of type Security[], I knew this array contained only objects of type Stock and so I cast it to Stock[]. I merged this into master, but then our regression tests failed. Thankfully my mistake was caught before making it into prod.

Leave a Reply

Your email address will not be published. Required fields are marked *