OranLooney - LBYL vs. EAFP

There are basically two error handling strategies:

Look Before You Leap (LBYL)
It's Easier to Ask Forgiveness than Permission (EAFP)

I'm going to assume that you're familiar with both these approaches - most programmers have worked with both - and I'm going to talk about the why and wherefore of the two. This is my perspective on the subject, which is somewhat idiosyncratic.

LBYL

LBYL is the 'old' way. We validate every piece of data before we use it. It has some problems. Chiefly, we have to know, and say, what 'valid' means. That requires a lot of knowledge of the function/object we're trying to validate, so it can violate encapsulation. Idioms help a little: in C, a NULL pointer generally indicates an error. Unfortunately, it doesn't carry any information about why the error occurred, and there is no equally universal idiom for functions that return non-pointer types.

We have to write all the validation code, which in practice means writing lots of boilerplate (not to mention frequently forgetting to check some arcane condition). This is particularly annoying when it prevents us from chaining expressions naturally. Instead of

int key_length = strlen(getval(map,key));

We are constrained to write

int key_length = 0;
char* value = getval(map,key);
if (value != NULL) {
    key_length = strlen(value);
}

lest strlen() cause a seg fault.
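The same trade-off shows up outside C. Here is a minimal Python sketch of it, assuming a hypothetical `settings` dict standing in for `map`; the function names are illustrative only. The chained form fails with a TypeError instead of a seg fault, but it fails all the same.

```python
# Hypothetical data standing in for `map` in the C example.
settings = {"name": "oran"}

def value_length_chained(mapping, key):
    # Chained form: dict.get() returns None for a missing key,
    # and len(None) raises TypeError.
    return len(mapping.get(key))

def value_length_checked(mapping, key):
    # LBYL form: look at the lookup result before using it.
    value = mapping.get(key)
    if value is None:
        return 0
    return len(value)
```

The checked version is three lines longer and the caller still cannot tell a missing key from an empty string, which is the same loss of information as the NULL idiom.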

EAFP

The 'new' way is EAFP, typically through the use of exceptions. This solves pretty much all the problems of LBYL, but unfortunately introduces its own set. Let's talk about the strengths of EAFP first.

EAFP rocks!

Let's try a thought experiment: imagine we're wrapping a LBYL library with an EAFP API. We would implement the LBYL checks in the wrapper functions, raising an exception if something is wrong and otherwise proceeding normally. If the LBYL functions return an error, we raise an exception for that, too. Clearly, what we're doing here is centralizing and formalizing the knowledge about the valid cases for calling the LBYL library. We are:

providing a well defined, consistent interface for communicating errors back to the client
creating functions/classes that have exactly one responsibility (validation)
adhering to the DRY principle: Don't Repeat Yourself.
making the code more readable by separating out the successful case from the exceptional case

EAFP (implemented with exceptions) in fact solves every problem with LBYL that we found above.
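The thought experiment might look something like the sketch below. Everything in it is an assumption made for illustration: `legacy_getval` plays the part of the LBYL library call, and `LookupFailed` and `getval` are hypothetical names for the wrapper layer.

```python
class LookupFailed(Exception):
    """Hypothetical exception raised by the wrapper on any failure."""

def legacy_getval(table, key):
    # Stand-in for an LBYL-style library call: returns None on failure.
    return table.get(key)

def getval(table, key):
    """EAFP wrapper: all the validation lives here, in one place."""
    if not isinstance(table, dict):
        raise LookupFailed("table must be a dict, got %s" % type(table).__name__)
    value = legacy_getval(table, key)
    if value is None:
        raise LookupFailed("no value for key %r" % (key,))
    return value

def value_length(table, key):
    # Callers can chain naturally; failures surface as LookupFailed.
    return len(getval(table, key))
```

The exception message also carries the "why" that a bare NULL or None cannot.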

EAFP sucks!

Unfortunately, it also introduces its own set of problems. The root of the problem seems to be that an exception is basically an invisible goto. Take a look at this Python method:

def persistSomehow(self):
    'Commit the object to the file it is bound to.'
    try:
        self.saveCount += 1
        if self.fileName:
            f = open(self.fileName,'w')
        else:
            f = bindToBlobInDatabase(self.tableName,'w')
        f.write(self.externalId)
        f.write(self.serialize())
    except:
        logError(self,"persistSomehow")
        return False
    finally:
        f.close()
        return True

This looks ok at first glance. But in fact, if there's a problem opening the file, it will raise an error (an unhelpful UnboundLocalError at that), because the finally block calls f.close() on a name that was never bound. Even worse, if the error happens on the write - perhaps the disk is full - then the file will be corrupted and not corrected. Even the local copy will have incorrectly incremented its saveCount. Even the return value is wrong: the function always returns True, because the return statement in the finally always executes!
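Two of these failure modes are easy to reproduce in isolation. The snippet below is a stripped-down sketch, not the original method; `failing_open` is a hypothetical stand-in for an `open()` call that fails.

```python
def always_true():
    try:
        raise IOError("disk full")   # simulate the failed write
    except IOError:
        return False                 # this value is computed, then discarded...
    finally:
        return True                  # ...because the return in finally wins

def failing_open():
    # Hypothetical stand-in for open() failing.
    raise OSError("cannot open file")

def close_unopened():
    try:
        f = failing_open()           # raises, so f is never bound
    except OSError:
        return False
    finally:
        f.close()                    # raises UnboundLocalError
```

Calling `always_true()` returns True even though the except branch ran, and calling `close_unopened()` raises UnboundLocalError instead of returning False.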

The point is not that it's incorrect - I'd like to think I could easily write incorrect code in any language - but that it doesn't look wrong. Moreover, in order to make it correct, I would need a complicated set of nested try-except-finally blocks that would be more confusing than the equivalent LBYL code. The cyclomatic complexity of this method is not 2, but rather 6, since any call that can raise an exception is a branch point. That's a lot of hidden complexity! In this simple example, we can mentally trace the various execution paths through the code fairly easily. In complex code - and by complex I mean only the typical complexity of production code - it's not possible for any programmer to see the many possible execution paths through the code that exceptions allow for. By hiding the complexity of branching on exceptional cases, the language has tricked the programmer into writing code too complex to be proved, tested, or debugged.
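For comparison, here is one possible repair, sketched under several assumptions: it covers only the file branch (no database blob), the `Record` class and its attribute names are hypothetical, and it leans on a temporary file plus `os.replace` so the old file stays intact until the new one is completely written. Note that each failure point now gets explicit handling of its own.

```python
import os
import tempfile

class Record:
    """Hypothetical object bound to a file; names are illustrative only."""

    def __init__(self, file_name, external_id, payload):
        self.file_name = file_name
        self.external_id = external_id
        self.payload = payload
        self.save_count = 0

    def serialize(self):
        return self.payload

    def persist(self):
        """Write to a temporary file, then swap it in. True on success."""
        directory = os.path.dirname(os.path.abspath(self.file_name))
        try:
            fd, tmp_name = tempfile.mkstemp(dir=directory)
        except OSError:
            return False              # nothing was opened, nothing to clean up
        try:
            with os.fdopen(fd, "w") as f:
                f.write(self.external_id)
                f.write(self.serialize())
            os.replace(tmp_name, self.file_name)   # old file untouched until here
        except OSError:
            if os.path.exists(tmp_name):
                os.remove(tmp_name)   # discard the partial write
            return False
        self.save_count += 1          # count only saves that actually happened
        return True
```

Even this version is not airtight (the cleanup itself can fail), which rather supports the point about hidden branches.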

Raymond Chen has discussed this at some length:

I believe it's important to document and track how much thought, testing, and debugging has gone into a piece of code. (For example, unit tests are good in part because they document exactly which cases have been tested.) If Raymond is right and it's hard to tell if EAFP code is correct, we're doing the exact opposite. Peer code review won't catch as many bugs and any future maintainer will have to spend more time understanding the code before making a modification.

Even in languages that primarily use exceptions to communicate error conditions, we can still use the LBYL paradigm when we want to. We just "program into" the language instead of "programming in" the language.

Lack of Conclusion

Now would be the time to make a call one way or the other, EAFP or LBYL. But there's no point in doing so. Both paradigms are deeply embedded in common languages, so programmers have got to know and use both. No exceptions. But sometimes we do get to choose, and there are some principles to guide us:

Use EAFP when you don't actually expect anything to go wrong and there is a single, clear "normal" case.
Use LBYL for complex cases. That will be more work, but the program will not be correct unless we deal with that complexity directly.
We can also deal with complex cases by breaking them into smaller cases and using EAFP on each one. That's even more work, but might be more maintainable.
Use EAFP when it's ok (or even desirable) to bubble an error several layers; a single try-catch on the higher layer can save us tons of work on the lower levels.
Use LBYL when dealing with persistence, files, and any other irrevocable action. We need the extra control to avoid corrupting data.
Use EAFP when working with exception safe local objects that follow the Resource Acquisition Is Initialization idiom. Most C++ STL classes are exception safe.
When working with a file system or database that supports transactions, EAFP is almost always correct because of the safety net rollbacks provide. Also, LBYL tends to require several trips to the server, which can be a significant performance concern.
All else being equal, we should use the preferred paradigm of the language, even if it's not our preferred paradigm. This will make the poor saps who maintain our code happy.
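The guideline about transactions can be sketched with Python's standard `sqlite3` module standing in for the transactional store. The `accounts` table and the `transfer` function are assumptions made up for this example. There is no up-front balance check; the CHECK constraint and the rollback do the work.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move `amount` between two accounts, or change nothing at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft
    return True

def balance(conn, name):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()
```

If the debit fails, the credit that already ran is rolled back with it, so a failed `transfer` leaves both balances as they were. An LBYL version would need a SELECT before the UPDATEs, which is the extra round trip mentioned above.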

One final point:

Test all error handling code.

Error handling code is paradoxically among the least tested code, probably because by definition it deals with rare cases. Hard-code raise statements, override API functions with stubs that simply return errors, do whatever it takes to test those cases. The reason is very simple: QA probably can't create most of these exceptional cases, so if we don't test them ourselves, they will get released untested.
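As a small illustration of the stub approach, here is a sketch in Python. The `save_text` function and its `opener` parameter are hypothetical; the parameter exists only so a test can swap in a stub that fails on demand.

```python
def save_text(path, text, opener=open):
    """Return True if `text` was written to `path`, False on any I/O error."""
    try:
        with opener(path, "w") as f:
            f.write(text)
    except OSError:
        return False
    return True

class FullDiskFile:
    """Stub file object whose write() always fails."""

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False  # do not swallow the exception

    def write(self, text):
        raise OSError("disk full")

def full_disk_opener(path, mode):
    # Stub replacing open(): the "file" opens fine but cannot be written.
    return FullDiskFile()

def test_save_reports_disk_full():
    assert save_text("ignored.txt", "data", opener=full_disk_opener) is False
```

No real disk has to fill up for the except branch to run, which is exactly the case QA is unlikely to reproduce by hand.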

- Oran Looney May 23rd 2007
