05 Nov 2020

Variance For Dummies Like Me

I hang out in the C# guild on Discord, and the community is fortunate enough to have several Microsoft employees as regulars in the guild. In more advanced conversations, the topic of variance comes up often enough that I’ve looked it up and had trouble understanding some of the more technical documents on it. I ended up asking for some help on it, and I got a wonderful explanation that I’d like to fine-tune and record for others.

If one were to look up variance on a search engine, they’d get a lot of results. Variance is a large and complex topic and I’m definitely going to miss the finer points of variance. My aim is to give people new to this topic (like I was) a good starting point so they don’t feel entirely lost in conversations.

This article will cover variance as it occurs in C#. It won’t get into any true depth, but the contents will be in as much plain English as possible.

What’s “variance”? I thought it was just covariance and contravariance?

Variance is a language-agnostic ability to vary the type of input or output for some kind of operation or unit of work. The different types of variance determine how strict or loose the requirements are when it comes time for an operation to receive input or provide output.

For statically typed languages that support the concept of generics, such as Java and C#, variance allows for more robust type safety for certain pieces of code or language features. For example, in C#, variance comes up frequently when talking about interfaces and delegates.

The types of variance developers most often talk or hear about are covariance, contravariance, and invariance. In C#, covariance and contravariance correspond to the out and in keywords available in the language, respectively. Invariance doesn’t have a specific keyword, but it is the default for generic type parameters in C#.
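To make the out and in keywords concrete, here’s a minimal sketch of what declaring them looks like. The IProducer and IConsumer interfaces, the two classes that implement them, and KeywordDemo are all names I made up for illustration; none of them come from the .NET libraries.

```csharp
using System;

namespace VarianceSketch
{
    // Hypothetical covariant interface: T only appears in output positions.
    public interface IProducer<out T>
    {
        T Produce();
    }

    // Hypothetical contravariant interface: T only appears in input positions.
    public interface IConsumer<in T>
    {
        string Consume(T item);
    }

    public class StringProducer : IProducer<string>
    {
        public string Produce() => "hello";
    }

    public class ObjectConsumer : IConsumer<object>
    {
        public string Consume(object item) => "consumed " + item;
    }

    public static class KeywordDemo
    {
        public static string Run()
        {
            // out T: a producer of strings can stand in for a producer of objects.
            IProducer<object> producer = new StringProducer();

            // in T: a consumer of objects can stand in for a consumer of strings.
            IConsumer<string> consumer = new ObjectConsumer();

            return consumer.Consume((string)producer.Produce());
        }
    }
}
```

Calling KeywordDemo.Run() from any method exercises both conversions. If you drop the out or in keyword from either interface, the matching assignment in Run should stop compiling, which is invariance kicking back in.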

Covariance

Covariance means that some “thing” (a type) and less specific versions of that “thing” are the only acceptable outputs of an operation. Consider IEnumerables in C#:

IEnumerable<string> collection = new List<string>();

IEnumerable<out T>

Notice that in the angle brackets where we’d normally define the type of IEnumerable, it says out T instead of just T. What out T tells us is that for an IEnumerable of string, the items we get back from that IEnumerable can be treated as string or as a less specific type. In this case, a less specific type could be object. Yes, the base Object class for C#!

Covariance makes the following assignment legal:

IEnumerable<object> objCollection = collection;

Covariance guarantees that an IEnumerable will be able to return its specified type as well as its parent types to the caller.
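Here’s a small sketch of the same idea at a method boundary, which is where I run into it most. CovarianceDemo and Describe are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class CovarianceDemo
{
    // Accepts any sequence of objects and joins their string forms.
    public static string Describe(IEnumerable<object> items)
    {
        return string.Join<object>(", ", items);
    }

    public static string Run()
    {
        List<string> names = new List<string> { "Ryu", "Ken" };

        // Allowed because IEnumerable<out T> is covariant and string is a reference type.
        // A List<int> would not work here: variance only applies to reference types.
        return Describe(names);
    }
}
```

The list is never copied or converted. The same List<string> instance is simply viewed through a less specific interface.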

Contravariance

Contravariance means that some “thing” (a type) and more specific versions of that “thing” are the only acceptable inputs of an operation. The example for contravariance uses lambda expressions, so although the usage is very simple, ensure you’re equipped before reading.

An easy way to demonstrate contravariance is by using the Action type in C#.

Action<in T>

The angle brackets here say in T rather than just T. What in T tells us is that for Actions of type T, the only acceptable input will be that of T or of a more specific type.

We’ll need some custom code to demonstrate this behavior, however. I’ve defined the following code:

namespace Variance
{
    // Less specific interface
    public interface IVideoGameCharacter
    {
        public string CharacterName { get; set; }
        public string GameSeries { get; set; }
    }

    // More specific classes that implement the less specific interface
    public class FightingGameCharacter : IVideoGameCharacter
    {
        public string CharacterName { get; set; }
        public string GameSeries { get; set; }
        public string TrademarkMove { get; set; }

        public FightingGameCharacter() { }
    }

    public class FirstPersonShooterCharacter : IVideoGameCharacter
    {
        public string CharacterName { get; set; }
        public string GameSeries { get; set; }
        public string FavoriteWeapon { get; set; }

        public FirstPersonShooterCharacter() { }
    }
}

Using the classes and interface above, I’ve created the following snippet of code:

// This is perfectly fine
Action<IVideoGameCharacter> foo = (i) => Console.WriteLine(i.CharacterName);
Action<FightingGameCharacter> bar = foo;
Action<FirstPersonShooterCharacter> baz = foo;

// Declaring alpha is fine on its own, but the two assignments after it trigger compiler errors
Action<FirstPersonShooterCharacter> alpha = (i) => Console.WriteLine(i.CharacterName);
// The following two lines will have their right-hand assignments highlighted in Visual Studio
Action<IVideoGameCharacter> beta = alpha;
Action<object> gamma = alpha;

The first example is contravariance on display: we can define an Action that will accept any IVideoGameCharacter as input, which includes the interface IVideoGameCharacter as well as more specific implementations of IVideoGameCharacter, such as FightingGameCharacter and FirstPersonShooterCharacter.

The second example goes in the direction that contravariance doesn’t allow. We can define an Action that will accept any FirstPersonShooterCharacter as input, but we cannot supply less specific types, such as IVideoGameCharacter or the base Object class, as input.

Contravariance guarantees that an Action<T> will be able to accept its specified type as well as its child types as input.
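To see why this direction is safe, it helps to actually invoke the converted delegate. This is a trimmed-down sketch with its own tiny type hierarchy (ICharacter and Fighter are stand-ins I made up so the snippet compiles on its own), and it records names in a list instead of writing to the console.

```csharp
using System;
using System.Collections.Generic;

namespace ContravarianceSketch
{
    public interface ICharacter { string Name { get; } }

    public class Fighter : ICharacter
    {
        public string Name { get; set; }
        public string TrademarkMove { get; set; }
    }

    public static class ContravarianceDemo
    {
        public static List<string> Run()
        {
            var log = new List<string>();

            // Handles any ICharacter, so it only relies on members of the interface.
            Action<ICharacter> logName = c => log.Add(c.Name);

            // Allowed by Action<in T>: every Fighter is an ICharacter, so the general
            // handler is safe to use wherever a Fighter handler is expected.
            Action<Fighter> logFighter = logName;

            logFighter(new Fighter { Name = "Chun-Li", TrademarkMove = "Spinning Bird Kick" });
            return log;
        }
    }
}
```

The general handler never touches TrademarkMove, so it can’t be surprised by anything a Fighter brings along. That’s the whole safety argument in one line.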

Invariance

Invariance means that when we define a type on a given interface or delegate, we can only provide that type as input or get that type back as output when interacting with the interface/delegate.

For those familiar with C#, we’re all used to the notion that when we make a List<T>, we add items of type T to the list, and we get items of type T out of it. This is invariance; one type in, one type out. It also means a List<FightingGameCharacter> can’t be used where a List<IVideoGameCharacter> is expected, or the other way around. Consider the following list:

List<FightingGameCharacter> fightingGameCharacters = new List<FightingGameCharacter>()
{
    new FightingGameCharacter 
    {
        CharacterName = "Chun-Li",
        GameSeries = "Street Fighter",
        TrademarkMove = "Spinning Bird Kick"
    },

    new FightingGameCharacter
    {
        CharacterName = "Sol Badguy",
        GameSeries = "Guilty Gear",
        TrademarkMove = "Volcanic Viper"
    }
};

If we wanted to retrieve an item from this list, we’d be guaranteed to get an object of type FightingGameCharacter.

If we wanted to add

var doomGuy = new FirstPersonShooterCharacter
{
    CharacterName = "Doomguy",
    GameSeries = "DOOM",
    FavoriteWeapon = "Double-barrel Shotgun"
};

to the list fightingGameCharacters, we wouldn’t be able to. A FirstPersonShooterCharacter can’t go in a List<FightingGameCharacter>; it’ll trigger a compiler error (specifically CS1503)!
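Invariance also shows up at the level of the list type itself, not just its items. Here’s a minimal sketch using plain strings; InvarianceDemo is a hypothetical name, and the line that doesn’t compile is left commented out so the rest can run.

```csharp
using System.Collections.Generic;

public static class InvarianceDemo
{
    public static int Run()
    {
        List<string> names = new List<string> { "Chun-Li", "Sol Badguy" };

        // Does not compile: List<T> is invariant, so a List<string> is not a List<object>.
        // List<object> objects = names;
        // If it were allowed, objects.Add(42) would put an int into a list of strings.

        // A read-only view is covariant, so this is fine: nothing can be added through it.
        IReadOnlyList<object> readOnly = names;

        return readOnly.Count;
    }
}
```

The difference comes down to what the type lets you do. List<T> both takes T in and hands T out, so neither out nor in would be safe for it, while IReadOnlyList<out T> only hands items out.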

A brief note about variance

While covariance and contravariance constrain the output and input of a specific operation, they’re not strict drill sergeants dictating what can and cannot be output or input. One goal of variance in C# is to ensure that there are as many safe conversions between types as possible.

For example, in the contravariance example that triggers a compiler error, adding an explicit cast to the variable being assigned makes the code compile. Visual Studio’s IntelliSense will also suggest this as a potential fix. Be careful, though: the cast is checked at run time, and because alpha really is an Action<FirstPersonShooterCharacter> and not an Action<IVideoGameCharacter>, it throws an InvalidCastException there:

Action<FirstPersonShooterCharacter> alpha = (i) => Console.WriteLine(i.CharacterName);
// With this cast the assignment compiles, but it throws InvalidCastException at run time
Action<IVideoGameCharacter> beta = (Action<IVideoGameCharacter>)alpha;
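One caveat worth checking for yourself: as far as I can tell, the explicit cast only satisfies the compiler. The runtime still checks it, and a delegate created as an Action of a more specific type fails that check with an InvalidCastException. The sketch below demonstrates this and shows one possible alternative, a small adapter that checks the type on every call. ICharacter, Shooter, Fighter, CastDemo, and Adapt are all hypothetical names for this example.

```csharp
using System;

namespace CastSketch
{
    public interface ICharacter { string Name { get; } }
    public class Shooter : ICharacter { public string Name { get; set; } }
    public class Fighter : ICharacter { public string Name { get; set; } }

    public static class CastDemo
    {
        // Returns true if the explicit delegate cast fails at run time.
        public static bool CastThrows()
        {
            Action<Shooter> alpha = s => Console.WriteLine(s.Name);
            try
            {
                // Compiles, but alpha is not actually an Action<ICharacter>.
                Action<ICharacter> beta = (Action<ICharacter>)alpha;

                // If the cast had succeeded, this call would hand a Fighter
                // to code that expects a Shooter.
                beta(new Fighter { Name = "Ken" });
                return false;
            }
            catch (InvalidCastException)
            {
                return true;
            }
        }

        // Hypothetical adapter: wraps the specific handler and checks the type per call.
        public static Action<ICharacter> Adapt(Action<Shooter> handler)
        {
            return c =>
            {
                if (c is Shooter shooter) handler(shooter);
            };
        }
    }
}
```

The adapter silently ignores characters that aren’t a Shooter. Whether that’s the right behavior (versus throwing) depends on the situation, so treat it as one option rather than the fix.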

A more in-depth look at variance

Variance is given life by the “magic” of Category Theory, a branch of mathematics that serves many purposes. To define it for mathematical laypeople like myself, category theory is a common language for discussing various concepts that occur in different branches of mathematics.

Tomas Petricek wrote a stellar in-depth article in 2012 about covariance and contravariance when the ideas were introduced in C# 4.0.