Comments: Asserting Oneself

Nice rebuttal.

Something that many people who've commented on my original posting seem to miss is that I was indicating problems with the traditional "standard" assert. Use of non standard asserts can and does deal with many of the issues but may raise other issues. My position is that often real error handling should be used instead of an assertion and if that's not the case then you should try and fix the problem another way before resorting to an assertion. Your position seems to be that you may not have time to fix it another way and the assertion helps you remember it wasn't fixed... That seems fair enough.

In the kind of code I write I don't ever want the program to continue onwards after a problem could have been detected; you don't want a financial transaction to occur that is incorrect just because your asserts are compiled out. Likewise you don't want a database update to occur with invalid data simply because an assertion didn't fire in the release build. If a test stays in the code all the time then I don't see it as an assert, it's just normal error checking and it needs some form of handling even if that's just to do the best you can to shut down cleanly. I work with a lot of multi-threaded code so, for me, checks that are important enough for assertions in debug builds are probably more important in release builds. Some race conditions hardly ever show up in debug builds and yet could occur all the time in release builds. For me, these errors are important in release builds and if the system shuts down it's better than it continuing. I can see that this is different in games. I find that once you get used to the pain of the system shutting down when there's an error left in it you quickly get into a mode of working where you don't allow errors to remain. I assume that you do considerable testing with your release builds as well as with your debug builds and I guess you probably have a release build that keeps your assertions in and the real public release that removes them? I can see that with your kind of software there's no point in reporting errors after you have shipped because it's unlikely that you'll fix them and ship v1.1. For me, I find that we end up with more reliable systems if you endure the pain of a slightly less forgiving system.

In your example of a function that takes a pointer that can never be null. If it can never be null then the test should stay in all the time or you should pass it in as a reference. It doesn't matter if you wish to take ownership, you can dereference it after. I find that passing in as a reference is the most reliable way; the interface clearly states "can not be null". The caller cannot pass in null, they cannot fail to check any error return and you don't have to test for null. I agree that in the case of a function that takes ownership of an object the interface could cause misunderstanding. Do your performance requirements really preclude the use of a null pointer check in this situation in the release build? Or is it just a case of there's nothing that you could do so you opt to ignore the issue and pray?

I agree that assert is still a useful tool from time to time but with TDD you need it less. For me, assert is a tool of last resort and I don't find that it's required that often.

Posted by Len Holgate at October 3, 2005 01:33 AM

"So what if asserts go away in release builds? We're writing "shrink-wrapped" software, so that's a good thing because there will be no programmers around with their handy debugger to fix the problem."
Hmm.. I don't know about console games, but for application software it's quite a bit more useful to have to ability to save a core dump and get the user to send an email to you with the core dump. It's a hell of a lot more informative than 'My program just crashes.' :)

Posted by Factory at October 3, 2005 03:34 AM

Len Wrote:
"It doesn't matter if you wish to take ownership, you can dereference it after. I find that passing in as a reference is the most reliable way; the interface clearly states "can not be null"."

Whoa.. To me.. Passing in that reference clearly states "can not take ownership". If a function wants to own a pointer then the function parameters should not hide its intentions. The assert inside the function checking for null is just a workaround for a short-coming of c++. The right thing here is a precondition, which of course you can't do in c++, so you do the next best thing.. An assert.

Posted by Billy Zelsnack at October 3, 2005 07:17 AM

I agree with you 100%. In fact I see unit tests as just more complicated asserts :-p

minor comment

Overriding assert is not always a good thing. If your dev environment catches the assert and allows you to break into the debugger it's much nicer to be able to inspect variables, memory, the call stack, etc. In my current project the lead programmer decided to override assert with his own so when an assert happens it prints the filename and line number. That's useless compared to letting the compiler / debugger deal with it.

Posted by gman at October 3, 2005 07:54 AM

Billy,

As I said, it's not always the right thing to do where you transfer ownership, but, if you allow a pointer to be passed in and that pointer must not be null then I'd prefer to have proper error handling rather than an assert which may or may not be included in the build. If the function doesn't take ownership then I'd always prefer a reference to a pointer that the documentation states "should not be null".

Posted by Len Holgate at October 3, 2005 10:23 AM

the D language contains asserts as part of the language spec. its pretty nice.

Posted by DrMagic at October 4, 2005 03:12 PM

DrMagic

How are these asserts reported in D?

Are they runtime or compile time or both?

Posted by Len Holgate at October 5, 2005 03:22 PM

gman, about overriding assert, you will probably always want to break in the debugger if you're debugging. So just because you provide your own custom implementation, it doesn't mean you lose any of the functionality of the default assert (that's what _asm int 3 will do on an Intel processor).

Also, when you provide your own implementation, you can have it break in your code, and not in the assert implementation, which is a lot nicer for debugging (otherwise you're always having to backtrack a couple levels in the call stack).

Check out the article in Game Programming Gems 1. It's really worthwhile if you're doing C++ development.

Posted by Noel Llopis at October 9, 2005 04:48 PM

Len, I think you might want to checkout the idiom "Programming By Contract" and
also Miro Samek's article on using assertions in C/C++ Users' Journal. It'll
tell you why you would prefer asserts to error checking everywhere and when to
use the error checking and when to assert. Infact, Noel's explainations come
close to what Miro Samek has written about.

Posted by Terence Ou at October 10, 2005 06:55 PM

Terence

What issue was it in?

I've read a lot of stuff about when to use assert and I still disagree with most of it. I think it's possibly just that the nature of the work that I do, and the way that I do it, doesn't really allow for the 'human interaction' involved with most uses of assert.

Posted by Len Holgate at October 12, 2005 03:24 PM

Terence, and others who might be interested,

The article by Miro Samek can be found here http://www.quantum-leaps.com/writings/samek0308.pdf

I'm off to read it now ;)

Posted by Len Holgate at October 12, 2005 03:30 PM

Terence

Interesting article; thanks for the recommendation. It's nice to see an embedded development perspective on it all.

I'm surprised that you seem to think that I disagree with it. After all, the article clearly states that the author thinks that asserts should remain in the release build - which was item number 1 on my list of reasons that the standard assert macro was evil. He believes that these asserts are first class error checking code that deals with errors that do not need to be handled; which I agree with. Since he's working on embedded devices he can simply have these asserts reboot the machine, I'm not that lucky, I need to ensure that locks are released and database transactions are rolled back and whatever so rather than terminate the application at the point of failure it works better for me to unroll the stack and allow the RAII idiom to do my clean up for me before catching the problem at a process/thread/other boundary and shutting down.

His views on programming by contract and defensive programming are very similar to mine. I want and get contract violations to fail reliably all the time and cause a large amount of pain. This gets the problem fixed rather than simply making the code "robust" to poor usage.

He doesn't mention actual C++ exceptions at all really, but then I wouldn't expect him to as the article is a C/C++ article and even if he's using C++, exceptions are not always available on embedded devices. The key point, for me, is that he compares his style of programmig using design by contract and assert to the style where all error values are passed back to the caller. Replacing his 'assert and reboot' style with a throw and don’t catch style doesn't change a great deal.

Still, if all of the stuff that I wrote about my views on this haven't convinced you by now then they're probably not going to. But just to ram the point home ;) I'm actually advocating MORE checking for things that can't happen rather than less and I'm suggesting that these checks are first class code and should be viewed as such...

Anyway, enough already...

Posted by Len Holgate at October 13, 2005 02:03 AM

Len

I'm really glad that we could actually have a friendly discussion regarding the use of asserts.

Based on your previous post after reading Miro Samek's article, errors such as those encountered from database transactions, etc don't fall under the same category as that of the first class errors. Infact, they are no different from that of, user data entry validation, file reading, etc. The point was if whatever was invalid or might cause a problem came from an external source such as that of files, information for textboxes (data entry through forms, etc ) then they should be handled by manual means (error handling code). External source equals dangerous, internal = safe which is why we assert because internal pretty much means programmer code which should be fixed immediately by the programmer (thus the contract).

By doing so, we really optimize our applications because error handling is placed where it should be (information from dangerous sources) at the forefront while not requiring unnecessary error handling code behind which basically just serves as overhead in release builds.

Just to summarise a little. Error handling code is a must. Only thing is it should be placed where it should be at the very specific locations where the errors occur and not further down because if its already been handled, it shouldn't have to be handled further down. If it would be, it has to be a programmer's error. Thats why we use the asserts as contracts basically against programmer and not against external sources.

What's your take on that?

Posted by Terence Ou at October 20, 2005 01:36 AM

Terrance

I guess my issue is the distinction between external and internal and how you define the difference and how that difference is affected by changes to the overal structure of the code.

I may be wrong but you appear to be saying that you should have different error handling policies for internal code and external code. I disagree with this as it means that the error handling policy would need to be changed as the internal/external code boundary moves.

Let’s take a set of libraries that are maintained by a single group of developers as an example. 3 libraries: A, B and C. Initially only A has an API that faces the users of the library (other developers) and B and C are both used by A. As such some proponents of assert would suggest that any error checking done where B or C are called into from A should be done using assertions as breaking the contract is a programmer error and as the same team maintains A, B and C they should be alerted via an assert so that they can fix the bug. Likewise any calls internal to A (or B or C) that are not part of the public API could use asserts rather than other error handling as, again, it's a programmer error. I disagree with this approach because it introduces two levels of error checking, one for the public API and one for the 'internal' API. This makes the error checking fragile in the face of changes to the structure of the code. So, my suggestion has been that you should avoid two error handling policies and simply adopt the external error handling policy across the whole codebase.

Suppose libraries B and/or C grow larger and more important than they were before and therefore their development and maintenance is moved to a separate team. The interface between A and B/C has now gone from 'internal' to 'public'. This means that, theoretically, the error handling policy needs to be revisited and code potentially needs to be changed just because the codebase has moved from the care of one team to another... To me this seems wrong. Likewise, if a previously internal method of A is needed to be exposed from the public API then its error handling policy needs to be reviewed.

Since this distinction between 'internal' and public APIs cannot be enforced in the languages that I use I tend to try and stay away from it and, instead, have a single error reporting strategy that I use consistently across all of the code.

I notice that you mention that you 'really optimize' the application by using asserts. I'd suggest that you only optimise the areas that need it and that you discover what needs it by profiling. As I've said before, some internal programmer errors (I'm thinking of threading and timing issues here mainly) may not show up as often in debug builds as in release builds and if your internal error checking goes away in release builds you may inadvertently trash data because of it...

So, in summary, my position is that by deciding to handle errors only at the obvious point of entry into your codebase (ie the public API) and assuming that errors internal to your codebase can be managed by asserts you're setting yourself up for either code changes or inconsistent error handling when code shifts from being internal to being public.

I agree completely on your idea that you should handle errors at the point of failure but I disagree that internal errors are any different from external errors. If there's a contract I want it to be enforced all the time.

Posted by Len Holgate at October 21, 2005 10:48 AM

Terence, sorry, just realised that I spelt your name wrong in the previous comment!

Posted by Len Holgate at October 21, 2005 10:49 AM

Len,

I apologize for not having made things clear. Infact, your reply was totally off which was really my fault. I didn't mean to say internal regarding internal or external developers. What I meant was for examples like data. Pointers passed into another function/method is data (internal if we don't consider COM or distributed computing, etc), just like information from file or database is data, and how they should be handled differently. If the function or method is receiving an invalid pointer, its the violation caused by the programmer who is using the interface because he/she is not fulfilling the contract. But who can have control over a person entering the wrong password or username or can you control the success of a database transaction? You can't and so you can't assert but instead use validation and error handling code. And it should happen there and then where it happens and not have to handle it again by passing in something invalid like a null pointer just because a transaction failed and so some new couldn't be done and another method is still allowed to be called using the invalid pointer(Of course exception handling might be abit different but we leave that out for now).

As for optimization, just to keep the both of us insync with what optimization is all about. Optimization is not all about profiling first before you decide what is wrong and optimitize. Profiling first is a practice for optimizing code but you don't have to go through a profile just to get your code optimized where possible, it can happen in your own coding habits. Does the optimization from i++ to ++i require a profiling before you do that? It doesn't because as long as you don't need the initial temporary returned by i++ you can just keep it optimized as a habit. The optimization might be small or insignificant, but hey if its got no implications and can be apart of your habits, why not. Its delibrately thinking up all sorts of "smart" optimization techniques without first coding out working code that is the problem that leads to pre-optimization. But for some things, you just know and you just make it a totally harmless habit that totally doesn't violate "Profile First Before Optimization".

Posted by Terence Ou at October 23, 2005 07:07 PM

Terence

No, I think you're missing MY point. ;)

The 'internal' regarding development argument is one of the arguments used by several reference works that people have used to defend their use of assert. However, it doesn't actually make any difference; it's equally applicable to your data example. Lets say you have a function, A(), that accepts a user's password. You have error checking here to ensure that the data is consistent with what you expect because you can't control the source and may have to report errors to the user. We're both in agreement on this.

Now, A() calls B(). Right now nothing except A() calls B() and the data that is passed to B() is checked in A() so you would say that you could simply assert it is valid. My view is that although nothing calls B() with untrusted data YET it may do one day and it's not always easy to know when that day occurs. Sure, code reviews will likely spot it but not everyone has those... As soon as B() can be called with potentially unchecked data then your use of assert is less useful. Sure you might happen to trigger the assert during your development and testing but you might not. If you don't and you build a release build and your asserts happen to be compiled out then users of B() can pass in invalid data. If you leave the assert in then users of B() have to accept that you have two different ways of dealing with and reporting errors. A() reports one way because you expected it to perhaps receive invalid data and B() reports another way because you thought it could never get invalid data. You were wrong, your interface is more complicated and/or broken.

So my approach is to suggest that we avoid an assert and instead look at other ways to ensure that the data passed to B() is "always valid". It may be that we simply repeat the tests that we have in A(). Sometimes this is appropriate. Sometimes it is not. Personally I don't think that's actually what I'd do in this case. I expect I'd change the design of B() so that it accepts a different data type. This is where my original complaints about asserts simply being sticking plasters over bad design comes in... Pass B() a "validated password" object rather than a string and it's now the responsibility of the validated password object to ensure that it's always valid. The checks that happened in A() may change, they may move into constructor of the validated password object or whatever. Does this take longer to write? Possibly. However, the more you work in this way the less you even consider designing the code in the potentially broken way. The point being the data that is passed to B() is always valid. As a reader of the code you don't need to worry about where the data came from, you have tests that prove your "validated password" object can only ever contain a valid password and you forget about it everywhere else. You don't need to look at an assert at the top of every function that uses a password that must be "valid". You don't need to duplicate that assert all over your code. You don't need to worry about it anymore. IMHO the code has become far easier to understand and much more robust.

As for optimisation. Yeah. I agree that you don't need to deliberately unoptimise. But I don't agree that compiling out error checking is a good idea.

I'm not sure we're actually getting anywhere here. I think we both just have different viewpoints and that neither of us is going to change the other's mind. :)

Posted by Len Holgate at October 24, 2005 03:18 AM