LastCraft Home Page

The Last Craft?

This blog...
  • Syndicate this site with RSS 2.0
Related...
  • A lot of this blog material will be rehashes of stuff I've posted on SitePoint.
  • This blog is powered by Wordpress

9/5/2004

Logging is evil

Every serious program has logs. Logging errors is part of programming after all, as natural as breathing. You probably have half a dozen services or deamons writing messages to your hard drive right now and you feel safer for it I’m sure. And of course you often send log messages to yourself.

Just the errors at first.

A logger class is pretty simple, it just timestamps messages and shoves them in a file, so hardly much development overhead. OK, so the class that created the log message needs to know about the logger and that’s a little more awkward. We don’t want any Singletons around here thanks very much, so we’ll pass the logger into the constructor of the class doing the logging. Of course we don’t want our client code affected by changing the constructor signature, but if we are lucky we get most of our main entities through factories anyway. That’s handy because it will save the client code having to know about the Logger class as well, otherwise…er…things could get complicated.

The factories have to know where the log file is on this specific machine, so the factories will need to take a configuration object in their constructor. The configuration file itself, and the parser to go with it, can just be in the source tree. There might be one these already and we weren’t planning to use this code in any other projects anyway (were we?), so the extra configuration dependency shouldn’t be that bad.

Errors arrive a bit late though. Really we want to know what happened just before the error and that means logging any suspect behaviour. We don’t want to trigger any kind of alarm on these so we’ll call them LOG_WARNING level or some such and add filtering to our Logger. We’ll have to get the logging level from configuration file of course, but the real problem is that our original class we are monitoring is now getting a bit cluttered. That’s OK, we know what we are doing. We’ll create a decorator for each and place the log calls in that. We can even tell the factories to apply the decorators or not according to the configuration file, and well, shoot me if my middle name isn’t “Aspect Oriented".

And life just doesn’t seem so easy anymore. It seems to take longer than ever to produce ever more code, new developers have to be taught all about our error handling and monitoring systems and to make things worse, today we have a bug.

Debugging is worse than the dentist. Even with watches, breakpoints and single stepping our forty lines a day productivity can drop to fixing a single line over an entire day. That is bad use of resources, resources that could be better spent rewriting that section of code, or writing better tests, or clearer design, or code review, or a good night’s sleep or any other activity that would prevent the bug in the first place. What can be worse than staring at a debugger? Try doing the same thing by staring at log files. It’s like trying to find a gunshot wound with keyhole surgery. To make matters worse, all the other developers have dumped their LOG_WARNING, LOG_NOTICE and LOG_DEBUG messages (where did they come from?) all over your now mission critical log output. It’s a desperate mess. All that extra code and configuration for what? So we could now begin the most spectacularily inefficient process in the whole of software development…?

So, why did we log that first error?

If there really is a chance of error at that point then we should fix the probable root cause. Inspect the code to see if an error really is possible. Wrap tests around the components that could possibly cause the error and confirm that it cannot every time you check in the code. If it’s too complicated to make that assertion then break the code down. Rewrite it if you have to. If you distrust the code that much it shouldn’t be in your system, otherwise every other module has to code defensively around it. That’s clutter you don’t want.

An external library or system? Wrap it. Simulate failure of the external component with mocks to make sure your error trapping is well defined. That error is a possible input to your program and should be as well defined as when a user types in their own log-in details. Building on a firm foundation may feel like you are going slower, but you won’t be. You certainly won’t be if you can save yourself a couple of days debugging time.

You win in other ways. Your code will be clearer with the “debugging” calls taken out and that means fewer bugs because people will understand the code more easily. You code will be simpler to deploy now it has been freed from yet another file dependency. A file that is security sensitive by the way, which gets you into permissions and so platform dependence. But I think the real gain is that a crutch will have been taken away from the developers.

Logs become a dumping ground. Issues that would have provoked a much needed discussion can be too easily glossed over just by logging data “for later". For example you may be logging usage of a web site for later analysis. The factors that effect the analysis will be known by the client and you would save time by finding these out. Perhaps you don’t therefore have to log every last drop of information. Perhaps you only need overall totals or trends. This is something that should be cleared up with the client straight away, because logging everything is a sign that the goals are not very clear.

You also get the right degree of coupling between analysis components and their data. If analysis happens straight away within the same program all of the code pieces will be local to each other. With a more obvious connection we can prevent the inevitable bloat where data is added to the log, but no one feels safe about taking it away. Assuming you need to log it all. Why not place the result straight into the database and give yourself a real time solution? A log is a form of user interface and as a design choice it has to be subject to the same degree of rigour as your other design choices. Once you do you start to throw up cleaner alternatives.

A log is a symptom of a half solved problem.

6 Comments »

The URI to TrackBack this entry is: http://www.lastcraft.com/blog/wp-trackback.php/4

  1. Thanks for blogging Marcus! FYI wordpress 1.2 trucates your blog entry in the rss feed. I posted in the forums and they say it is fixed in CVS. Thought you should know.

    Comment by culley — 9/6/2004 @ 6:02 pm

  2. I wish we’d kept going with our argument discussion about this over on SitePoint a while back, but I have to say you’re starting to win me over. On a small team of one or two, I think logs tend to be more decipherable, but even so I’m sitting here trying to debug something by logging – and thinking I should instead improve my unit tests. The money quote for me: “If you distrust the code that much it shouldn’t be in your system….”

    Glad to see you blogging!

    Comment by John Gray — 9/8/2004 @ 11:39 pm

  3. Marcus, your post is insane. First off, almost everyone is using log4j, commons-logging or java.util.logging. The patterns are reasonably well understood and reasonably simple, so life should be pretty easy. Next, and this is fundamental, is that ANYTHING can fail. it’s less likely within a vm (you could still get OutOfMemoryExceptions, bad memory, unexpected vm crashes etc.) but once you’re outside your vm e.g. database connections, any network activity, disk etc. you’re almost guaranteed that there will be unexpected errors. Add the fact that it’s virtually impossible to guarantee that any reasonably sized project is error free (and by definition these are unexpected… otherwise why would they be there) and you’d have to be out of your mind not to use logging to capture as much information as possible about potential unexpected errors.

    Comment by Fletch — 9/22/2004 @ 3:33 pm

  4. G’Day. Interesting. I had a 30,000 line MFC C++ prog fail in Belgium last week.
    Exact same code ran without a hitch in the U.K. What was wrong?
    I’ve got 28 years C experience between my ears and I took two days to
    solve it.

    I had to send the prog repeatedly over to the client for them to crash it and
    return a gradually more densely detailed log file output to me for me to
    eventually find it was because Belgium was an hour difference and that makes
    all the difference to the CTime() object (bloody useless thing).

    Without logging I would have been stuffed. Could I have unit tested the program
    to cater for out-of-bounds errors on every MFC object? Could I have used
    Right-BICEP? Well… Yes… I would have loved to. But I had a hellish deadline
    (nothing new there) to write the thing and ship it within two weeks.

    So… No unit tests. No use cases. No mock objects. Hack code quick.
    Suits loitering about each afternoon wanting to know how soon they can
    bill the client.

    “Can we bill yet? Can we bill yet?”

    So. I think that testing when a team is involved and you have adequate time
    and opportunities to build use cases, test cases, create mock objects and the
    like before you start is a laudable goal, poor hard pressed muggins like
    myself often don’t have the option. So a manageable logging scheme is
    at least some help…

    My 2p.
    Kimbo

    Comment by Ms Kimberley Scott — 9/25/2004 @ 11:44 am

  5. I had a complex installer program that had to run on many systems and manipulate NT services and such.
    (these programs are notoriously hard to get right for ALL system configurations and OS versions)
    I learned from a senior programmer to create assertion macros that run even in production code. If
    the assertion failed it would log the error.

    When I was developing I was constantly hit by my own assertions and got good explanations in the log
    for each error I made. So once the code was checked in, it just didn’t have any bugs… well, there
    was one system configuration dependent bug. But when QA found the bug I looked at my log for half
    a second and knew what the problem was and what the fix would be.

    So logging makes development and maintanence faster, not slower as you claim. They help you when
    the you cannot reproduce the customer’s environment. With a log you can find the source of the bug
    even without ever touching the machine where it happened.

    As for defensive programming.. it’s a must. It makes code more maintainable.

    What you say about factories and such is weird. I usually just use a private static logger for each
    class (see log4j docs). Why corrupt the interface of my class with such things?

    Comment by dog — 10/11/2004 @ 12:20 pm

  6. Debugging is the process of inducing more and more bugs, just reshaping the perfect code and changing the
    look and feel of the application and then sigining off the entire lot with your signature and taking the
    credit of making the entire application..

    Comment by sunil — 10/12/2004 @ 7:41 am

RSS feed for comments on this post.

Leave a comment

Line and paragraph breaks automatic, e-mail address never displayed, HTML allowed: <a href="" title="" rel=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>

(required)

(required)