Joe Duffy recently introduced the “C# for Systems Programming” project. I spent most of the past four years working on this project, and eagerly awaiting the day that I could talk about it with the community. There have been some requests to talk about the error model, which happens to be one of the areas I was most involved with. So here we go! :)
Like the rest of the language, the error model was designed with the goal of being safe, fast, and easy. By safe, I mean that the language should prevent as many bugs as possible. By fast, I mean that the error model should impose the smallest possible burden on the runtime performance of programs that use it. By easy, I mean that the error model should help programmers be more productive, rather than getting in their way.
C++ is the textbook example of a language with an unsafe error model. There are plenty of error conditions that lead to undefined behavior. Rather than preventing bugs, C++ causes them.
C is an example of a language with a slow error model. In C, programs are littered with branches that check for and/or recover from error conditions. This makes the fast path slower than it needs to be, although good compilers are familiar with C idioms, and can often arrange code so that CPU branch prediction does the right thing.
Java is an example of a language with a complex error model. Programmers frequently resort to using ‘RuntimeException’ subclasses, so as to avoid defining methods that throw ten different exception types. I think it’s fair to say that most programmers feel like Java exceptions are more burdensome than helpful, and this is a big part of why C# doesn’t have checked exceptions at all.
In my opinion, the defining characteristic of our error model (and this extends somewhat beyond the language) is the distinction between recoverable and unrecoverable errors. This distinction underlies a whole host of useful properties that I will discuss throughout this post.
Unrecoverable errors are designed for conditions that can’t really be handled appropriately from within a software component. Generally speaking, if an unrecoverable error is raised, there is a bug in your code. Null dereferences, out-of-bounds array accesses, bad downcasts, out-of-memory, contract/assertion violations… the list goes on. In all of these cases, when the error hits, your component immediately terminates.
Recoverable errors are designed for conditions that your program should be resilient to. In our language, as in C#, recoverable errors are handled with exceptions. In general, recoverable errors originate from interaction between multiple components or with a user. You wouldn’t want your web browser to terminate if it can’t reach a website because your network cable is unplugged! Nor would you want your parser to crash when given invalid input.
It’s possible to convert between these two worlds. For example, test cases can convert recoverable errors into unrecoverable ones, if those errors are truly unexpected. More practically, if one component fails in an unrecoverable way, an external component can observe and/or recover from the failure of that component. (Another way to put it is that all failures are recoverable, but at a much coarser granularity than in traditional systems.)
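To make the distinction concrete, here is a sketch in our language’s C#-like syntax. (The method and type names are illustrative, not from any real API.)

```csharp
// Unrecoverable: an out-of-bounds access here is a bug; when it
// hits, the component is immediately torn down.
int Get(int[] a, int i) {
    return a[i];
}

// Recoverable: the signature advertises that failure is possible,
// and callers are forced to acknowledge it.
Document Parse(string input) throws {
    if (!IsWellFormed(input)) {
        throw new ParseException();
    }
    ...
}
```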
Here are some of the things that this distinction buys us.
- By definition, unrecoverable failures cannot be observed. This frees programmers from the burden of handling or compensating for these failures. You can write your programs assuming that none of these error conditions will happen. You don’t need to handle out-of-memory. You don’t need to use tricks to avoid catching NullReferenceExceptions. It’s much the same as how GCs make programs simpler by avoiding the need to free memory.
- Because unrecoverable failures lead to immediate component termination, we can preserve the precise state of a component at the time that it terminates (when compiled in debug mode), down to the very last register. This makes unrecoverable failures very easy to debug and fix.
- An optimizing compiler can take advantage of the semantics of unrecoverable failures to generate better code. If a branch leads to an unconditional failure, it can assume that branch is not taken. It can also violate the invariants of a failing program left and right, so long as those violations cannot be observed by external components.
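As an illustrative sketch of that last point (the exact codegen is of course compiler-dependent, and “AbandonProcess” is a made-up name for the unrecoverable-failure primitive):

```csharp
void M(Node n) {
    if (n == null) {
        AbandonProcess();  // unrecoverable; never returns
    }
    // The compiler may now assume n != null on every path below,
    // eliding later null checks and reordering freely, because no
    // external component can observe the failing program’s state.
    n.DoWork();
}
```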
Unrecoverable errors are ubiquitous. We do not impose any restrictions on their use. In fact, because they are so useful for debugging, we positively encourage them.
In contrast, recoverable errors are tightly controlled. If a method might raise a recoverable error, it must be annotated as throwing (the keyword we use is “throws”). In fact, if you call a method that might raise a recoverable error, the call must be annotated to indicate that it might throw (the keyword we use is “try”).
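In sketch form, using the “throws” and “try” keywords mentioned above (the method names and the Config type are made up for illustration):

```csharp
// A method that might raise a recoverable error must say so:
Config LoadConfig(string path) throws;

// A caller must annotate the call, and either handle the error...
Config LoadOrDefault() {
    try {
        return try LoadConfig("app.cfg");
    } catch {
        return Config.Default;
    }
}

// ...or be marked ‘throws’ itself, propagating the failure upward:
Config Load() throws {
    return try LoadConfig("app.cfg");
}
```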
When introducing this error model, we discovered something that we didn’t really anticipate. Some types of failures are ubiquitous. But in our system, all of those ubiquitous failures are unrecoverable. As a result, recoverable failures -- i.e. errors that programmers need to handle -- simply don’t happen all that often. I would estimate that fewer than 10% of methods written in our language are “throws” methods, which means that over 90% of methods in our language cannot be observed to fail. Compared to traditional languages, where virtually any line of code can produce an observable failure, I think this is a remarkable result.
It might sound burdensome to have to annotate every throwing method and method call. Isn’t that even worse than Java’s already verbose system? But because so few methods are throwing, we’ve found that developers don’t mind the annotations. In fact, they come to really appreciate being able to tell, at a glance, which parts of their program might fail.
There’s another big way that we were able to make our language less verbose than Java. Because recoverable failures are so rare, most methods that can fail can only fail in a single way. Therefore, there’s no need for complex exception hierarchies. You can simply declare that your method throws, without needing to list any exception types. If your method throws, your caller will know why. We do allow you to be more specific about the exception types that your methods throw, but we’ve found that only a tiny portion of throwing methods (which are already a tiny fraction of all methods) use this functionality. The general case is not burdened by this, in the way that it is in Java.
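Sketching both forms (again with made-up method and exception names; naming specific types is the rare case):

```csharp
// The common case: a single, anonymous failure mode.
Document Parse(string input) throws;

// The rare case: naming exception types, for the few methods
// whose callers genuinely need to distinguish between failures.
void Transfer(Account from, Account to, long amount)
    throws InsufficientFundsException, AccountClosedException;
```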
Next, I want to shift gears and talk about the contract system. Methods may declare that they have preconditions and/or postconditions. Also, programmers may declare point-in-time assertions. Preconditions are checked before a method starts running; postconditions are checked after it finishes running (assuming that it does not terminate exceptionally). Preconditions and postconditions may also be discharged statically by an optimizing compiler; however, under no circumstances will a precondition or postcondition be bypassed.
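A sketch of what this might look like. (I’m using “requires”, “ensures”, “return”, and “assert” as the contract keywords here purely for illustration; the method itself is made up.)

```csharp
int IndexOf(int[] a, int value)
    requires a != null                         // checked before the body runs
    ensures return >= -1 && return < a.Length  // checked on normal exit only
{
    for (int i = 0; i < a.Length; i++) {
        if (a[i] == value) {
            return i;
        }
    }
    assert a.Length >= 0;  // a point-in-time assertion
    return -1;
}
```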
Preconditions and postconditions are treated as part of a method signature, not as part of its implementation. They follow the usual C# visibility rules; any types or entities that are referred to by a contract predicate must be visible to any callers of the method. (Note that this rule permits some implementation flexibility. A compiler is free to generate contract checks in the callers, rather than in the callee. In certain circumstances, this can lead to faster code, or contract checks that are easier to optimize away.)
As described in the “Uniqueness and Reference Immutability” paper, our language has excellent support for understanding side effects at compile time. Most contract systems demand that contract predicates be free of side effects, but have no way of enforcing this property. We do. If a programmer tries to write a mutating predicate, they will get a compile-time error. When we first enabled this feature, we were shocked at the number of places where people wrote the equivalent of “Debug.Assert(WriteToDisk())”. So, practically speaking, this checking has been very valuable.
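For example, under these rules a mutating predicate is rejected at compile time. (A sketch, using “WriteToDisk” as in the anecdote above and the illustrative “requires” keyword.)

```csharp
bool WriteToDisk() { ... }   // mutates state; not side-effect free

void Save()
    requires WriteToDisk()   // compile-time error: contract predicates
                             // must be free of side effects
{
    ...
}
```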
Also, because of the strength of the type system with respect to side effects, typical contracts in our language are different than they are in other languages. For example, a common pattern in some contract languages is “ensures old(x) == x”, i.e. an assertion that some value did not change during the execution of a method. In our language, those types of properties can be guaranteed statically. Instead, contracts are largely used to constrain inputs and outputs in ways that the type system is not rich enough to describe, such as by requiring that numbers are within a certain range, or that references are non-null.
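A sketch of the contrast (the “immutable” permission is in the spirit of the reference-immutability paper; the “requires”/“ensures” keywords and all names here are illustrative):

```csharp
// No need for an ‘ensures old(b) == b’-style framing condition:
// an immutable parameter cannot change, by typing alone.
int Measure(immutable Buffer b);

// Contracts instead express what the type system cannot:
void SetVolume(int level)
    requires level >= 0 && level <= 100;

string Render(Widget w)
    requires w != null
    ensures return != null;
```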
To match programmer expectations, preconditions and postconditions are designed to satisfy the Liskov substitution principle. An override of a method cannot strengthen preconditions, and it cannot weaken postconditions. In practice, we found that it was simpler for overrides to just inherit the preconditions and postconditions of the method being overridden. Most languages don’t support contravariant inputs or covariant outputs/returns, even though it would be sound to allow it; in the same way, we found that programmers generally want their contracts to match, and that the benefits of variance weren’t worth the complexity cost.
Note that there are some subtleties with respect to interface methods. One method on a concrete class might satisfy multiple interface methods. However, what happens if those interface methods have different contracts? For postconditions, there is no problem -- we can simply check both sets. For preconditions, we could check the logical “or” of the preconditions, but that’s unlikely to be what anyone wants. Instead, we employ the following rule:
- When such a method is called through an interface, we check the contracts for that interface method.
- When such a method is called directly, we allow the user to specify what contracts should be checked.
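To illustrate the rule (interface names, members, and contracts are all made up; the syntax for choosing contracts at a direct call site is elided, since I’m only sketching the semantics):

```csharp
interface IReader  { int Read(byte[] buf) requires buf.Length > 0;   }
interface IScanner { int Read(byte[] buf) requires buf.Length >= 16; }

class File : IReader, IScanner {
    public int Read(byte[] buf) { ... }  // satisfies both interfaces
}

IReader r = file;
r.Read(buf);     // called through IReader: only IReader’s precondition
                 // (buf.Length > 0) is checked

file.Read(buf);  // direct call: the user specifies which of the two
                 // contracts should be checked
```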
This is not an entirely satisfying answer. A better approach might be to require -- and statically check -- that the methods all have the same contracts. However, it turns out to be very difficult to verify the equality of two contract predicates in a repeatable way. I’m happy to describe the challenges here if anyone is curious; maybe one of my readers will have a solution that we didn’t think of!
We played around with object invariants, but they are not currently supported by the language. As the joke goes, if you ask ten software engineers when invariants should be checked, you’ll get twelve different answers. Here are some approaches we considered:
- Invariants should be checked on every field write.
- Invariants should be checked on entry and exit to every method.
- Invariants should be checked on entry and exit to every method with a certain accessibility.
- Some combination of the above; e.g., invariants should have accessibility qualifiers, which designate how and when they’re checked.
Every one of these choices was surprising to at least half of the developers we queried. Therefore, by the principle of least surprise, we decided to omit the feature.
Whew! Let’s recap:
- Errors can be either unrecoverable or recoverable.
- Errors never lead to undefined behavior.
- Unrecoverable errors help programmers find and fix bugs.
- Recoverable errors help programs observe and recover from failures that are beyond their control.
- Unrecoverable errors are ubiquitous. Recoverable errors are tightly controlled and sparingly used.
- Preconditions, postconditions, and assertions help programmers specify their program behavior.
- The type system helps to avoid common pitfalls when writing contracts and assertions.
- Compilers can use the semantics of the error model to generate highly optimized code.
- Every feature is designed to be as reliable and as boring as possible.
As always, I eagerly await your questions and feedback. :)