The Last Craft?

11/22/2005

Sarbanes-Oxley versus Agile

Agile development assumes that the participants are professionals. We take it for granted that everyone will do the best for the team, that everyone is skilled and capable, and that the team is focused on the project being successful for the sponsor. It’s the assumption of doing the right thing that allows us to skimp on the paperwork and excessive formality. Trust is presumed.

Then came Enron.

OK, so there is no trust to be had when it comes to large sums of money, but we developers don’t make financial decisions. What’s Enron got to do with us?

Well, quite a lot as it turns out. You see, the defence of the main culprits was ignorance. They “innocently” lost control of the finances and didn’t know what was going on. Negligent leadership yes, criminal act no, they pleaded. It’s to remove that defence that the US government introduced the Sarbanes-Oxley legislation. A financial officer accidentally losing control of the finances is no longer a satisfactory excuse for staying out of jail. Not only that, but auditors have the duty and right to report any loss of financial control to the shareholders, and for a public company to the public, to give early warning. That’s not a small detail. An international bank, say, that received a bad audit on this score would not welcome the publicity. After Enron, it gives the market the jitters.

So what? This is a programming blog, so stick to the point Marcus. OK, I will. The point is that the loss of trust extends all the way down to our daily working practices. Some examples, the obvious one first.

Development time costs money. That money should be tied back to a specific project or task. Free floating development costs are a black hole, and black holes are no longer acceptable. You can employ some kind of standard accounting tool to keep track of this, or you can invent your own. Trouble is, if you invent your own you are subject to an audit every year. Accountants probably won’t be very impressed with a homebrew system that works on index cards, and so using some kind of standard will save a lot of trouble. The change management tool vendors love this of course. Tying version control check-ins to requirements tools is the kind of stuff that cash cows are made of. Tool vendors are spreading the word.

Now if you are very lucky, you have an enlightened project manager who understands the value of refactoring. You see a problem unrelated to your current work, you fix it right then. Everyone wins…er…except you. You have to shoehorn this piece of opportunism through the change management tool. From the demonstrations I have seen so far, these tools don’t look too agile.

Suppose you are writing some simple database code that generates a report. Not much need to refactor here, this has been done hundreds of times. Well, the data in this report probably gets used in the performance indicators of the company, the profit and loss figures upon which the big decisions are made. If there is suspicion of that data, the company is losing control of its finances. It could be deceiving the shareholders too. This is not about testing, it’s about having the authority to work on that report. The software as well as the data must be secure and no one person may hold all the keys. Your handling of passwords becomes subject to external policy. Opening the version control system to shared code ownership may not be part of that policy. That’s a big loss for an agile team.

None of these problems are insurmountable of course. A dash of politics, some technical adjustments and a dose of guerilla refactoring may get you through the day, but it’s still friction. Fear spreads.

6/3/2005

Listen kids, AJAX is not cool

If you are writing a user interface, make sure it responds in 1/10th of a second. That’s a pretty simple rule, and if you break it, you will distract the user. This rule has pretty much become law, never mind lore. You find it in books such as “The Humane Interface” by Jef Raskin and many other user interface guides. If you write GUI software, you are well aware of it.

I cannot find this rule anywhere in Jakob Nielsen’s “Designing Web Usability”. It’s not a bad book, in fact it’s an excellent one. It’s just that web interfaces are a different usability problem, one of organising fairly static information. You don’t need stopwatches or video cameras to study users in this environment. Just five test subjects and perhaps the Apache logs. This is stuff anyone can do. You only have to state that all links have to be underlined blue to start out in this community. Until now, the low level neurology has been left to the writers of web browsers. The two communities have been separated.

Except now we have AJAX.

AJAX is JavaScript based and JavaScript is usually used to add convenience and to pretty up web pages. Because sites had to work without JavaScript, usage was limited to extending the HTML or saving on the odd page request. Then came GMail. Suddenly we have lots of web developers “enhancing” the browser experience with behind the scenes XML fetching back to the original site. I cannot think of a worse collision of technologies than low level user interfaces with requests over the internet. The delays and failures of internet traffic are especially painful in this environment and, from the AJAX demos I’ve seen, the developers aren’t helping.

A typical demo is form validation. The first field is usually one where the user can select a new user name. Of course that username could be taken, and so this initiates an AJAX request to the server. Meanwhile I have carried on typing and am a few fields in when a dialog pops up. You don’t need a GOMS analysis to know that this is going to be extremely annoying. I dismiss the dialog, it said something about a database I think, and retype the second half of my word. I go to submit the form and find the submit button is greyed out. I eventually work out that the username has been cleared and I retype it and quickly click submit. Oh joy.

OK, this is a badly designed example and it could be improved in several ways. For example, within 1/10th of a second, I could highlight the field in some way. Maybe I could grey out the text or highlight the border of the field with a pale yellow. When the response comes back I could highlight the border as red on failure and only then disable the submit button. The highlighted field had better not have scrolled off the screen and I had better have a helpful message next to it by then. If the user can type faster than my server can respond, likely if the user is habituated to a form, then they should be allowed to submit. Otherwise habituation is lost and the interface starts invading their short term memory.
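
The improved flow can be sketched as a small state model. It is written in Python rather than JavaScript purely to keep the logic readable, and every name in it (UsernameField, FieldState, on_blur, on_response) is a made-up illustration rather than part of any real toolkit. The points it tries to capture are that the immediate feedback is local, that the user's text is never cleared, and that only a confirmed failure blocks the submit button.

```python
from dataclasses import dataclass
from enum import Enum


class FieldState(Enum):
    IDLE = "idle"          # check not started
    CHECKING = "checking"  # request in flight; pale highlight only
    OK = "ok"              # server confirmed the name is free
    TAKEN = "taken"        # server rejected the name; red border


@dataclass
class UsernameField:
    """Hypothetical model of one remotely validated form field."""
    value: str = ""
    state: FieldState = FieldState.IDLE
    message: str = ""

    def on_blur(self, value: str) -> None:
        # Purely local work, so it can finish well inside 1/10th of a second.
        self.value = value
        self.state = FieldState.CHECKING
        self.message = ""

    def on_response(self, checked_value: str, available: bool) -> None:
        # Ignore stale answers about a value the user has since changed.
        if checked_value != self.value:
            return
        if available:
            self.state = FieldState.OK
        else:
            self.state = FieldState.TAKEN
            # Shown next to the field, not in a dialog; the text stays put.
            self.message = "That name is taken"

    def can_submit(self) -> bool:
        # A fast typist may submit while the check is still pending.
        # Only a known failure disables the button.
        return self.state is not FieldState.TAKEN
```

The server still has to validate on submit, since the form can be sent before the answer arrives.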

Even when the process is improved, there is no guarantee that we have enhanced the user experience. Entering a form is a familiar operation. We can do it whilst answering the phone or explaining something to the kids. The extra time fetching a separate error page may be time that I am putting to other uses anyway. Perhaps I just needed a rest. An interface that jars me out of my familiar path is probably not helping me at all. I don’t know this of course, because I haven’t measured it. But if you design such an interface, you don’t know either. You need to measure it, and this kind of usability is a lot more work than watching people click on pages.

I read more web pages than I do books and I spend more time doing it than I watch TV. I don’t think I am alone in being habituated to the way the web behaves as pages. When you write AJAX applications you drive a horse and cart through one of the most successful metaphors of all time. GMail can get away with it, because it’s very close to a related metaphor, the mail application. Being the new way has a price.

AJAX does have some uses. If you are exploring a dataset, you don’t want to fetch the core data again on every attempt to expand just a portion of the information. I have seen an excellent demo by a colleague with a trading system. Using a rollover it is possible to see trading history for each market indicator. Because this is an intranet system, it is responsive enough, and because the information is embedded in other explanatory content, it makes sense to use a web interface. That demo was cool, but it was a pet project, not a deployed application. Deployed applications put AJAX on a collision course with another issue in software development, automated tests.

We have it easy as web developers. We just shovel text around and text is easy to test. Unsurprisingly there are a lot of tools to test web content. Testing GUIs is far harder, so hard that it isn’t usually attempted. Instead a thin presentation layer is written and the calls to it are intercepted. The presentation layer still has to be tested by hand and that will delay a rollout. GUI applications are usually shrinkwrapped items, so that’s no problem for them. A web server may get rolled out twice a day. That’s a big problem for us.
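
For anyone who hasn't seen the thin presentation layer trick, here is a minimal sketch in Python using the standard library's unittest.mock. LoginPresenter and the view methods show_error and show_welcome are invented for the illustration. The logic only ever talks to a view object, so a test can swap in a mock and inspect the intercepted calls without drawing a single widget.

```python
from unittest.mock import Mock


class LoginPresenter:
    """Hypothetical logic class; all drawing is delegated to `view`."""

    def __init__(self, view):
        self.view = view

    def submit(self, username: str, password: str) -> None:
        if not username or not password:
            self.view.show_error("Username and password are required")
        else:
            self.view.show_welcome(username)


def test_empty_password_shows_error():
    view = Mock()  # stands in for the real widgets and records every call
    LoginPresenter(view).submit("marcus", "")
    view.show_error.assert_called_once_with("Username and password are required")
    view.show_welcome.assert_not_called()


test_empty_password_shows_error()
```

What this cannot tell you is whether the real view draws the error anywhere sensible. That part is still a job for a person with a mouse.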

More advanced tools may help a little here. Selenium and WaTiR at last make JavaScript testing possible, but it’s not easy to set up for integrated testing and you still need a browser. I haven’t yet seen an AJAX demo tested with Selenium. If anyone tries it, I’d like to know.

AJAX has possibilities, but it’s not there yet. Not as a community and not with the tools. Web developers cannot become GUI developers overnight. We need time.

2/9/2005

How did Google get it wrong?

If you run a blog or Wiki you will be only too aware of the Google PageRank™ system. In case you have been on a rather extended holiday and/or in a long coma, it’s a system whereby your site climbs the search engine results page if lots of other people are linking to you. It’s not quite that simple, but that’s the gist. In competition with each other to promote sales of Viagra, or to get people hooked on gambling, various crooked characters deface public sites with gay abandon. They leave a trail of links pointing at their own sites, often with Chinese titles, all to boost their own PageRank. All to climb Google.

These comment spammers are not nice people.

They will happily destroy the content of a Wiki and overwrite every page. If they don’t get every one, then it’s usually because their script is too stupid to keep track of the pages it’s already written over and so cannot get to the now newly orphaned pages. These scripts hammer the site while they operate. Not only that, but the frequency of attacks is now at epidemic proportions. I get about three separate attacks a day on my blog and about five major attacks a day on this PageRank 7 wiki. Faced with the brutality and increasing frequency of these incursions, ISPs can take down servers believing that they are under a denial of service attack. Even if they understand the phenomenon, such attacks cause too much server load for the value of having the small blog customer. ISPs are starting to ban the use of tools like WordPress and Movable Type on their end user accounts.

OK, it’s not just Google to blame here, but all of the search engines. It’s just that Google’s system is the most well known and this has historically made it the main spam target. In a tacit acknowledgement of this, Google have decided to help the bloggers. Er…sort of.

Their solution is to allow you to take away the PageRank value of selected links. If you are maintaining Wiki/blog software then comment field links should have a “rel” attribute (uh?) set to “nofollow”. That way the spammers will lose the incentive to spam you, because they will get no benefit from the links they leave. “Drat” they say, as they abandon their get rich quick scheme and go off to earn an honest wage.
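
For the record, here is roughly what the blog software is being asked to do. This is a sketch only, using Python's standard html.parser. The function name add_nofollow and the class NoFollowRewriter are my own inventions, and a real blog engine would do this inside its own comment sanitiser. It re-emits the comment markup and forces rel="nofollow" onto every anchor, replacing any rel value the commenter supplied.

```python
from html import escape
from html.parser import HTMLParser


class NoFollowRewriter(HTMLParser):
    """Re-emits comment HTML, forcing rel="nofollow" on every <a> tag.

    HTML comments and doctype declarations are silently dropped, which is
    an assumption that suits untrusted comment fields.
    """

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be echoed back.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open_tag(self, tag, attrs, closer=">"):
        if tag == "a":
            # Discard whatever rel the commenter wrote, then add our own.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.out.append(f"<{tag}{rendered}{closer}")

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, closer=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_nofollow(comment_html):
    parser = NoFollowRewriter()
    parser.feed(comment_html)
    parser.close()
    return "".join(parser.out)
```

Note that this has to run on every comment, in every installation, forever.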

The plan is so idiotic it’s almost surreal. It obviously in no way penalises the spammers, who are playing a percentage game anyway. So what if a few spams are ploughed into stony ground? It does make the engine spider’s life a little easier of course, because it can spend less time indexing blogs. Lucky old engines, poor old webmasters who are expected to upgrade all of their software. Software that has been heavily customised and, given that few of these applications are design masterpieces, heavily hacked. I certainly won’t be upgrading when there is zero benefit. Even if I do, the new attribute has to survive RSS feeds and some old and not so smart news aggregators. Really I won’t have time anyway because I am too busy fighting spam.

What’s even more surreal though, is that the software authors are jumping on board and working on adding this as a feature. There is even talk of making it part of the HTML standard. This attribute is about as useful as the blink tag.

Suppose the engines had tackled it differently. Suppose that when your site was spammed, you could dispatch the content of the spam straight to Google, Yahoo, etc. They could then ban all of the links promoted with the dubious posting. A sort of “SpamBack”. This changes the market forces significantly from the peddlers’ point of view. Far from ploughing on less fertile ground, they are now ploughing a minefield. Rather than one hundred percent of everybody having to manually instruct the GoogleBot, all it would take would be a small percentage of spam aware applications to fight back. The spammers could not risk dumb spamming for fear of tripping these alarms.

I bet there are other simple solutions as well.

So how did Google get it wrong? There are smart people in Google, so did they not allocate enough time to this? Perhaps they lost touch? Can you see blogger peons working in a lowly office from the hallowed windows of a “plex”? Perhaps the Google blog could explain as it’s hardly a public relations coup. Whole sites have sprung up against “nofollow”.

11/27/2004

CEOs are chickens

The following analogy comes from Scrum. In fact I am going to quote Ken Schwaber and Mike Beedle’s “Agile Software Development with Scrum”…

A chicken and a pig are together when the chicken says, “Let’s start a restaurant!” The pig thinks it over and says “What would we call this restaurant?” The chicken says “Ham n’ Eggs!” The pig says, “No thanks. I’d be committed, but you’d only be involved!”

The Scrum meeting rule says that pigs, committed project members, are allowed to talk. Chickens, people whose careers would be unscarred by project failure, can only listen. Opinions from pigs will have the needs of the project at the top of their concerns. They cannot afford to put self interest first, leading to balanced and rational compromise. So in a small web based firm, besides the web developers, who are the pigs?

The sales managers are usually piggies. They have sales targets, so usability, customer profiling and conversion rates are vital to them. Missing those targets is financially limiting, and possibly career limiting too.

Marketing are also in the pig pen. They will need a constant stream of information from the developers, usually in the form of processed log files. They also need to post process content for search engine optimisation and usually have link building programs in play. If the developers cannot supply these services then the marketing plan can be severely disrupted. Even if marketing’s jobs are safe, someone’s head will eventually roll.

Another porker is the content manager. An unpublished author has achieved nothing and will complain loudly. There may have been expensive copyright negotiations beforehand that won’t repay themselves until publication. Also content ages. A delayed appearance on the web site could invalidate it. If the content manager has a problem, the development team will hear about it in about the time it takes to walk down the corridor.

The support staff slopping around in the mud are utterly dependent on IT. They can have a miserable job, so let’s not make it worse for them.

By contrast the CEO can take project scale action. That action could be as drastic as firing everyone responsible and outsourcing the whole project to India, and they might do it anyway if it’s perceived to be in the interest of the company. More usually intervention comes in the form of long term strategy changes that affect the other stakeholders. The mission statement of the project will shift accordingly, or the project may fragment or be allowed fewer resources. The original plans can be changed to the point of mutilation by them. The CEO is committed to the company, not the project.

If you are using iterative development then you have supplied your CEO with options other than cancellation or expensive change requests. The CEO already has sufficient influence over the warring parties that there is no need for them to write stories or micromanage priorities.

So, can you keep the CEO from interfering in iteration meetings? Good cluck…

10/27/2004

No URLs in my blog

It’s an experiment. I want to see if URLs are dying.

The idea is that the search engines are now so good, there is no need for a hard link to another site I have no control over. I’m a programmer and so I want loose coupling from my blog. Link rot is a good example of a dangling pointer to me and so I am refactoring back to English descriptions. Back to a query rather than a reference if you like. The plan is that, given sufficient key phrases, you will almost certainly find the same reading material I based the blog entry on. Not exactly the same maybe, but blogging is about news and ideas rather than specific documents so I have room for flexibility.

There are going to be problems with this. The first is that the blog entries will be less eye catching. Web users start with a title and then start looking for blue underlined text. Well I don’t have any, just a wall of black and white, so the posts look rather boring. Hopefully I can make up for it with controversial titles that annoy people into reading on.

A more serious problem is that sight impaired users are more dependent on significant text than sighted users. It’s common for screen reader users to use the tab key to cycle through titles and links. With fewer tab hits on the content, but still the same number on the navigation, I am reducing the signal to noise ratio for these users when they browse.

Another slim possibility is that I may know a subject better than my reader and supply insufficient context to the subject without realising. Well if that happens then probably I have lost the plot on the whole posting anyway. I don’t even want to think about that scenario.

One positive benefit is that I plan to refer to books by full title and author. That means you can choose which online bookshop you prefer to use. If you are in the Amazon camp and feel your obligatory quick fix is missing, then you can still select the text and use the A9 metacrawler and arrive almost as quickly. Little is lost I think and I am not usually sympathetic to majorities. You have it too easy.

Of course all of this saves me time as well, so it’s also an experiment in lean blogging. Tracking down URLs and making sure they are the permanent ones, not temporary front pages, is as much effort as an extra paragraph. For something that is only a half dozen paragraphs, that’s a lot of overhead.

It would be nice if links could be created in HTML that just contained keywords or just bounded text. You could set your favourite engine into your browser, or be stuck with MSN if you were using IE, and have the browser submit the query. The browser could even pop up a menu of results as you hovered the mouse over. Handy as I hate having to leave the page I’m on when following an article, but want to look up a reference. All that would be necessary then would be for web content writers to mark text as significant. Perhaps the bold tag would do as a temporary measure.

Anyway, adding anchors seems too inflexible these days. I’m also rather lazy. Actually that’s the real reason.

10/11/2004

assert All Swans Are White()

You cannot prove anything by testing. No matter how hard you try, there could always be something wrong, or some combination of conditions or some external event that you haven’t thought of. You never know absolutely for sure that your project is working correctly and you never will.

The nature of proof is an old issue and one that was faced by the scientific community in its debate with religion. We are scientific, the scientists would say, because we do experiments to prove things. Aha, the religious leaders would say, you ain’t proving anything as long as there is another experiment you can do. Another experiment is another unknown and if something is unknown it isn’t proven. Therefore you are a religion too because your conviction is based on faith in your unproven theories, Q.E.D. For the scientists this must have smarted a bit, but fortunately Karl Popper came to the rescue in 1934 with his book “The Logic of Scientific Discovery”. Here is his illustration that separates the two camps…

You live in a village and all you have seen today are white swans, from which you conclude all swans are white. A little rash perhaps, but logical. Now this isn’t yet a convincing theory, so we wander down to the village pond and look for swans. We are not looking for white swans though, we are looking for non-white ones. This asymmetry is important, because although more white swans advance our case only a little, a single black swan will kill it stone dead. Because the threat to our theory is so devastating, every time it survives it increases our confidence. If we want to persuade the world that our theory is correct we want to attack it as often as possible. Ideally we will search high and low for black swans, but fail to find any. We want to test the theory in as many novel ways as possible, and with notoriety others will test it too and they will think differently from us. And so collectively we never prove it, but we do get ever more confident.
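
In code the asymmetry looks something like this. It is a toy sketch in Python; the function names and the lists of swans are invented for the illustration. The test that matters is the hunt for a counterexample, not the tally of agreeable sightings.

```python
def all_swans_are_white(swans):
    """The theory under test: every observed swan is white."""
    return all(colour == "white" for colour in swans)


def find_counterexample(swans):
    """Hunt for a non-white swan rather than counting white ones."""
    for index, colour in enumerate(swans):
        if colour != "white":
            return index, colour
    return None  # the theory survives this attempt; it is still not proven


# A thousand agreeable sightings only nudge our confidence upwards.
village_pond = ["white"] * 1000
assert find_counterexample(village_pond) is None

# One disagreeable sighting kills the theory stone dead.
far_away_lake = ["white", "white", "black"]
assert find_counterexample(far_away_lake) == (2, "black")
assert not all_swans_are_white(far_away_lake)
```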

For this to work our theory must be disprovable. The counter theory that there exists in the universe at least one black swan is not disprovable, and so not scientific. Note that scientists don’t have to be scientific, only the theories. Actually it helps if they are as mad as hatters, because that way we get a greater variety of testing. This process also allows us to have a single scientific truth. If two theories differ by prediction then we can determine which is correct by experiment. If they don’t then we hack away with Occam’s razor until the theories are identical anyway.

Back to the code. We had a theory that it isn’t broken.

Life is a little simpler for developers because we are not usually dealing with an infinite black box. It’s as if we could see the cogs of a small part of mother nature laid out before us and we are just checking the workmanship. This gives us an alternative approach. If the code is simple enough then we can completely understand it and won’t have any bugs. The whole project will probably never be in that state, but small sections of code will. The extra clarity we get doesn’t just give us our first theories of how the code behaves, but also allows us to eliminate vast swathes of possible tests as too trivial to bother with.

These areas of understanding are also fed by the tests themselves. I think we hop between theory and understanding in short cycles, and have a mix at any one time. Our areas of complete understanding are temporarily demolished during refactoring and are reduced to theories that our code still works as before. During and just after these transitions we add tests to resolve conflicting models in our heads and, with further change, turn unclear parts into areas of understanding again. I think this gets to the heart of testing with refactoring. Because the tests act as a cushion that allows us to drop back to theorising, they allow us to leap into the unknown. They are an agent of change.

On the project scale we haven’t a hope, but more people help. Inspection and pairing help produce more understandable and more fully understood code. For those parts that are just assumed working, more varied people with an antagonistic attitude will shore up those assumptions with novel and challenging tests. As many as resources allow, so perhaps the many eyeballs theory has some merit.

We still haven’t proved it’s working of course, but we can exude as much confidence as we want.

10/2/2004

Install me

I am not the slightest bit interested in your program.

I am surrounded by problems and have a to-do list as long as my arm. The only reason I am at your web site right now is because I have heard an unlikely rumour that one of my problems will be completely eliminated by your software. It is going to positively leap out of the computer and start resolving issues while I put my feet up and start to enjoy life. At least that’s what I’ve heard. You’ll forgive me if I’m sceptical.

First impressions mean a lot. We hate to believe this, but it’s true. When I used to teach I would find that the tone of the lesson was set within the first five minutes. The tone of the first five minutes would be set by how the children entered the classroom and the tone of that would be set by how I greeted them in the corridor. It’s difficult to turn things around after a bad start.

My first contact with your software is likely the web site with the download link. If the eyeball tracking studies are correct, I will read the title first and then start scanning for blue underlined text. I am already looking for the link marked “download now”. As an aside, if I arrived at this page with a Linux browser from a UK IP, chances are I would like the Linux version from a European mirror, so please don’t ask. Assuming the file dialog opens straight away I can consign the thing to my home download folder and carry on reading your project landing page. This is where the fun begins.

You have to hold my hand until the benefits of your project are obvious enough to warrant self study and experimentation, and I’m an unenthusiastic slow learner. We all constantly perform cost benefit analysis on everything we do, and if your project drops below my threshold for even a second I will ditch it and go on to something else. Instant gratification is best.

The first and most difficult hurdle is clicking “install”. Don’t think that’s much of a problem? Go to your personal download folder now and have a look around. Full of tar and zip files, right? What percentage of those have you unpacked? Of those, how many have you installed? If you are anything like me, likely a third at most are doing more than acting as hard drive filler, and yet all I had to do was two clicks. If your landing page has a long list of install instructions I will even click the browser cancel button right now. The thought of any extra work is just too frightening.

I may want doorstep convenience, but I don’t want you entering my house uninvited. Before you perform any install operation I would like to know exactly where you are putting stuff. It’s my computer and I like to keep it tidy when I can. I also want to be able to remove your program the instant I am disenchanted with it, and if I don’t think that’s possible I won’t install in the first place. My machine is stable right now and I want to keep it that way.

If your program is GUI based then I’ll run it now. I want to do something straight away and I want to see a result. Wizards don’t help, because they do stuff that I don’t understand anyway. Chances are I want to read a file, or write a very simple one. I don’t want to create projects, import directories or fill in loads of personal preferences. Once I know that your software is working I will start on the tutorial.

If your software is a programmer library then things are actually easier. I am going to carry on reading your web page and will read the “quick start” guide. I am going to follow the instructions on your page to the letter and I am not going to engage my brain in any way at all. I want to see the equivalent of “Hello world” in five lines of code or less with exactly the output described by your website. No big XML configuration files or templates to fill out, just a single script. Remember I have also downloaded your rival’s framework. You know, the one who always claims that his version is better than yours in the newsgroups? If everything seems to be working I’ll start on the tutorial.

There is a tutorial isn’t there? One that talks to me at a level and in language I can understand?

And if the tutorial starts to tell me how to solve my problem I’ll cheer up a bit. Once I am reading about the things I can now do it starts to get interesting, even fun. I’ll lean back and sip my tea - did I mention I was from the UK? - and I’ll play with your examples and learn to use your creation to solve my problem. If it does I’ll definitely send you a thank you e-mail. I’ll even send you bug reports when it crashes and suggestions for new features. And when you tell me that the feature already exists I’ll kick myself for not reading your manual and apologise to you profusely. I’ll tell all my friends how great your software is too, even though I never did try that other one from your rival. And it all happened because you had the care to help me through my first tentative step.

How could I ever have doubted you?

9/28/2004

The best software book ever

Books should change people. This one does.

This is not an easy thing to do, but it gets even harder with mental distance between the author and reader. When you first learn something you are in an ideal position to explain it to someone on your own level. I think that’s why learning in small groups is so effective. The group is less likely to get stuck as explanations cross-pollinate the group. Once you are a few rungs further up the ladder, though, too much seems obvious. Your explanation will skip vital steps, or simply not give enough time for your pupil to take things in. Once you learn a topic a little more thoroughly, you do start to teach it better and bridge the divide. That’s not enough in itself though. It takes a higher level of care to get sufficiently under the skin of a subject not just to understand it, but to know where the cognitive difficulties are too.

John Holt, in his classic book “How Children Fail”, describes a lecture where an educationist professor is teaching maths to children who have had extreme learning difficulties. I mean really extreme, to the point of being “retarded”. He conducted this maths lesson with coloured rods, each colour corresponding to a particular length. The test he set the children was so trivial you probably wouldn’t think it a problem at all.

Both he and all the children had a tray of different coloured rods. He took two length seven rods and sandwiched between them a length four rod. One end was flush with the other two, leaving a length three gap. The problem he set them was to find this length three rod. For them that was no easy task, but the clever part was next. He turned the ensemble upside down and let the length four rod drop out. The next task? Find the rod that fits in the gap. Yes, that was all. I don’t want to spoil the ending, but I defy you to read it and not have a tear in your eye.

Fellow programmers don’t usually have that much of a gulf between them, but they do have more complex hurdles. One that is common to just about every programmer is understanding object oriented programming. Not just knowing encapsulation, polymorphism and inheritance, but actually being able to write their first program with objects.

You don’t have to trawl forums and newsgroups very much to realise that this is a popular topic. You see a constant stream of cries for help from people all at sea. They don’t know how to start. They worry that their code is not “reusable”. They don’t understand that something they could have written a simple function for seems to be taking three times as much code with objects. Likely they end up with one big class, or lots of classes that don’t seem to do anything. Or lots of classes that really don’t do anything. And then they add something and it’s all not reusable at all. Some give up and go back to scripting and some of these proclaim that OO is all hype and no one else should bother either.

With such a difficult and well known barrier to cross you would have thought there would be plenty of books to help. There are, but many inject as much fear as they do information. Firing off patterns is not enough you see, we have to explain it a rung or two lower. That’s difficult with such an abstract and subtle topic. I have seen only one book that does this.

Explaining a rung or two lower is clever enough, but this book does more. It doesn’t just take the explanation down to a level that every developer can understand, it turns the subject into a puzzle. Puzzles are fun. You can try them one way and you can try them another and see what happens. That’s the secret of a good puzzle. You want the player to be able to see ahead, but not so far that they see the whole solution. What was once overwhelming, now becomes a wealth of possibilities. That takes away the fear and fear is bad for learning. Fun is what you need and also the confidence that the tools you are using are the same tools as the experts.

You wouldn’t think at first that the book is to do with fun. A large part of it is a rather tedious catalogue and I doubt anyone has read it all the way through. Luckily, I am not measuring success by pages read, but by impact. I can fling this book at an up and coming developer and know that three weeks later our conversation will resume on an altogether higher plane. It works every time and that’s astonishing.

That book is “Refactoring” by Martin Fowler. Someone give that man an OBE.

9/21/2004

I want ENTP libraries

I like working in an agile environment and I like working in small companies. I like smaller simpler languages which are powerful in their own domain rather than all purpose sledge hammers. I like some pieces of code and not others, but it’s not just quality. I like some libraries and APIs, but not others just because I do. This got me thinking.

If code has a personality I thought I had better measure mine. A bit of research later (Ok, trawling Teoma and Google) and amongst others I tried the Myers-Briggs test for team building. Fantastic! I am ENTP, Extrovert-Intuitive-Thinking-Perceiving apparently, and extremely so. Here is how it relates…

Extrovert-Perceiving means I can work in groups and try to lead by consensus without telling people what to do. Ironically, smaller departments and companies are more suitable here because of the loose hierarchies. You will likely work with the same people day in day out fairly closely as well. It also means that I can change direction easily. So that sorts that out and explains why I also have so many projects on the go. Introverts (I) will finish things quietly without support, and Judgers (J) will try to organise their work environment, often around themselves. I am certainly not organised. Maybe that is why I like less formal languages such as scripting languages these days. And probably why I like face to face decisions in small or agile companies.

The Intuitive-Thinker is more interesting. Intuitives don’t have to fully understand things before embarking on an action and favour creative solutions. They also delight in the abstract and are incurably optimistic. I think this is another reason why I like the agile approaches to development. Design is less formalised and more blended with the coding process, allowing free creative rein and exploration rather than controlling that process. So far so uncanny.

Now, any interface has a personality and I think code does too. Some libraries will directly solve your problem and nothing else, but are flexible for being small. Some will be more like frameworks, taking over your code, but supplying rich services. Some will have defaults filled and you will have to subclass. Others give you pieces of a puzzle you assemble yourself. Some give docs, some give tutorials. Some make you write configuration files, others do everything by writing code, and so on. I cannot yet identify these characteristics or axes, but I have started thinking about them. Any ideas?

If code does have a personality that would explain why it is so difficult to write code on bulletin board threads. I’ve seen this fail spectacularly many a time, eventually bogging down with everybody ignoring everybody else’s interfaces. It also presents problems for class libraries that only have one try at each function point. It also explains why they are best written by a small group of people, as if you like one part of a library, you will probably like the rest as well.

In-house a possible result may be islands of code that don’t talk to each other because the developers have different world views. Usually in development teams there is no shortage of someone to take charge, so more likely one constituency will be reduced to guerrilla warfare on the fringes. Another characteristic of these types of test is that this outlook isn’t permanently fixed. Either the software development attitudes or your personality type can change according to mood and environment. We are not talking any kind of underlying programmer genes here. So maybe your antagonists will come around.

One prediction that caught my eye is your ideal spouse. Mine should be INTJ and she is! They like to plan and organise and make more pragmatic assessments of the ENTP’s rush of ideas. Ideal support and I’d be lost without her. Especially when dealing with the home finances.

I think you marry your methodology. For me the Extreme Programming style handles the important small things that I just won’t do myself. Acceptance tests keep me from going off at a tangent, unit tests keep me from writing messy code (and I really rely on testing) and iterations make sure I actually manage to finish things. XP doesn’t get in the way of small groups coming up with solutions, which suits me down to the ground.

The ideal developer is supposed to be INTJ or ISTJ, but I didn’t read how they worked that out. The conspiracy theory in me has the more ISTJ style engineers running the programming world in recent times. As a result we have more formalised, logical, step by step systems, but less emphasis on controlling the detail that is supposed to look after itself. Well good for them maybe, but that’s not for me. Us ENTPs are fighting back.

Anyway I am curious to see how people measure up and how that tallies with their coding style. The Myers-Briggs tests are fairly quick and there are a lot on the net. Go on, have a go…

9/20/2004

Be your own consultant

I’ve come across three big time suckers in web development, or for that matter software projects in general. The first is bugs of course and I am pretty sure that one would be in your list too. Next up, not understanding the requirements is a big one. Get the requirements wrong and from then on almost everything you do is a waste of time. The last one is getting new developers up to speed. Not just bit part coding, but actively designing and sharing responsibility. This always seems to take six months, regardless of the skill of the incoming developer.

It struck me that these correspond to our three main communication channels. These are developer-machine, developer-client and developer-developer. Is this an anatomy of software troubles? Can we use this as a basis for diagnosis and cure?

Bugs are pretty much a solved problem these days unless you are building nuclear reactors or rocket ships. To rid yourself of them you first need lots of testing and you need to do that testing early. If the code is testable, you are mostly there already. You also need code review, of which my favourite method is pair programming. Having to constantly explain your code highlights areas that are vague. This helps to prevent more subtle bugs caused by convoluted solutions rather than clear expressive ones. The clearer one is easier to test too, so they reinforce each other. Bugs don’t really worry me anymore.

Communicating with the clients is tricky. The ideal solution is a customer representative on hand, or at least within earshot. Trouble is the customer all too often has better things to do. There are also several of them, as the CEO may not be as informed about, say, marketing as the marketing department. Getting all of these people in sync can involve either documentation, clear demarcation or frequent meetings. Will documentation, use cases preferably, be enough? Acceptance testing gives tighter control, but requires a more involved and technically aware client. Also customers don’t always know their own users, so does the user interaction design fall within the development team or is that another customer? Either way, for the lowly developers it’s still another source of conflicting goals. If your company is building a project for another company then likely the communication channel is stretched further. Face to face communication is the most powerful technique, so how do we want to ration it?

The client communication problem is rather insurmountable I feel, and so the most effective strategy becomes incremental delivery. Once you know you are almost certainly off course from day one, you have a great incentive to get your software in front of the stakeholders as quickly as possible. At least that way you don’t swerve too far off the road.

I am having a hard time finding information on developer to developer communication. Code comments don’t really cut it for me except as a desperate measure when all other discipline has broken down. A Wiki adds little. The problem of getting a developer fully up to speed is not solved by giving them a small subproject, either. If they do this then they still don’t learn much outside of their area and the rest of the team then don’t know anything about the subproject. Pairing helps the team here, but not the new arrival, although if they are on a six month contract anyway then the team doesn’t care. Well, until the contract is extended. Really they have to get stuck right in and this means having a few artifacts on hand.

It’s worth having diagrams for the physical architecture of the system and any layering or package diagrams. These give an instant map. Part of the problem is giving the new arrival confidence so it’s worth making these 100% accurate before they start work. The main thing an incoming team member lacks is history. The reasons why a decision was made rather than just the final result. I now feel that the first lesson to a new arrival should be a history one.

For an existing team things are easier. Design sessions with everybody involved are the most powerful. I keep forgetting and typing this now has reminded me again. Whiteboard or CRC sessions are fine. Even if the design is obvious, these discussions are very memorable. Their role is not to refine the design, although that’s a nice side effect, but to disseminate knowledge. Pairing does this too, but doesn’t seem to work as quickly.

So, if that’s our medicine cabinet (yours is stocked differently I’m sure), what are our symptoms? Furious clients engaged in legal battles are obvious, as is a bug ridden mess. More interesting are subtle signs of depression or lack of enthusiasm.

Do developers feel they know what’s going on? Perhaps they don’t understand the goal of the project, the mission statement if you will. If a developer does not have direct access to a client, have they set up an informal link to their opposite number in the client’s company? Should they? Sounds like they need some client interaction.

Do they know how other members of the team are getting on? Are they specialising in certain areas of the system? Are they withdrawing into isolated parts of the application because they don’t feel confident elsewhere? Are developers aware of usability issues in the system? Was anyone held up today with issues? What for? Are they setting up parallel work for themselves so that if they get stuck they can work on something else? Context switching is bad news and this behaviour is suboptimal when trying to push stuff out to the client for quick feedback. These symptoms are all grist for the team communication mill.

What are the developers’ views on the code? Are they confident in the code they are using, or are they error checking everything and coding defensively? When was the last time your system was rolled out? Will it work on all the servers in your system? Are all the configuration files up to date or full of legacy junk? Not sure? Time to schedule some man-machine interaction in the form of some testing perhaps?

So how does your team fare? Which is your best and which is your worst channel?

9/11/2004

Programming has nothing to do with computers

In 1977 we sent a message into space.

This one was physical. The Voyager spacecraft passed Pioneer in 1998 to become the most distant object we have ever made. Sadly, signals are now too weak for any more spectacular images, although it’s still taking measurements of the stellar bow wave as we pass through interstellar space. Soon though, it will be lifeless.

If aliens ever do catch up with Voyager I don’t think the gold disc on board will be all that interesting. Granted, it’s got some nice music if they happen to have ears. There is no fast forward though, so they also have to sit through the word “hello” in fifty five languages. We’ve sent interstellar boredom. There’s a little map saying where we are, but if they were space faring then surely they could plot back a trajectory anyway. I suppose it does say good things about the optimism of the human race, but even that seems dated now. Something to put in the alien trophy cabinet maybe, or a rather fetching alien wall ornament. So, no, that’s not the message I am thinking of.

Would they be interested in the Voyager technology? As anthropology, yes. Certainly we would want to know if all civilisations followed the same technological route. They could hazard a guess at the mineral composition of the planet from the materials used, but that would be a game for alien armchair pundits. It doesn’t say much about being human. The earliest chance of intercepting the spacecraft is coming up in about 40,000 years time, so everything else about it would be ancient history. Except for one thing.

Voyager has a computer and that computer carries software.

Do you program a computer? No, you don’t, you program an abstraction of a computer. Not only do you not know how closely that abstraction maps to physical components, you don’t care much either. Ok, you can argue that optimisation gets you closer to the machine, but unless you can tell me you completely understand SCSI caching optimisation, I would argue that even this area is treated as a black box. It’s even rather irritating when performance issues intrude and ruin the fun. Really a mindset of solving crossword puzzles is a closer analogy to coding. Now, you wouldn’t train to solve crosswords by reading up on the mechanics of a printing press, and similarly developers aren’t that fascinated by the innards of semiconductor physics. The original Bletchley Park code breakers were selected on this basis, puzzles rather than mechanics. If only IT recruiters were this enlightened today.

We design all these abstractions to be malleable in our minds. For our code to work effectively we must juggle them, but only enough to be below the threshold of what we are capable of holding in our mind’s eye. The code that we produce illustrates the height of our mental agility and points out the protection we need against the depths of our stupidity. It very much comes from us, not the machine. Perhaps a little mangled by more refined tools, but with the same limits. It’s not even the code itself, but the data structures. The images that we can see and interpret give a deeper window on our minds. The complexity and degree of nesting of the structures give another. How large is our memory? Look at the size of a subroutine or the children in a data node. Chances are these sizes are well below the hardware capabilities of the time. Voyager is a record of how we think.

In 1977 the code was created directly with the human mind and imprinted straight into the hardware, a vintage year for an alien biologist. That mental fingerprint has clocked over 8 billion miles on the way to a destiny we will never know and if it is intercepted, it truly will show what we are capable of. It’s travelled another 2000 miles while you’ve been reading this.

9/5/2004

Logging is evil

Every serious program has logs. Logging errors is part of programming after all, as natural as breathing. You probably have half a dozen services or daemons writing messages to your hard drive right now and you feel safer for it I’m sure. And of course you often send log messages to yourself.

Just the errors at first.

A logger class is pretty simple, it just timestamps messages and shoves them in a file, so hardly much development overhead. OK, so the class that created the log message needs to know about the logger and that’s a little more awkward. We don’t want any Singletons around here thanks very much, so we’ll pass the logger into the constructor of the class doing the logging. Of course we don’t want our client code affected by changing the constructor signature, but if we are lucky we get most of our main entities through factories anyway. That’s handy because it will save the client code having to know about the Logger class as well, otherwise…er…things could get complicated.

The factories have to know where the log file is on this specific machine, so the factories will need to take a configuration object in their constructor. The configuration file itself, and the parser to go with it, can just be in the source tree. There might be one of these already and we weren’t planning to use this code in any other projects anyway (were we?), so the extra configuration dependency shouldn’t be that bad.

Errors arrive a bit late though. Really we want to know what happened just before the error and that means logging any suspect behaviour. We don’t want to trigger any kind of alarm on these so we’ll call them LOG_WARNING level or some such and add filtering to our Logger. We’ll have to get the logging level from the configuration file of course, but the real problem is that the original class we are monitoring is now getting a bit cluttered. That’s OK, we know what we are doing. We’ll create a decorator for each and place the log calls in that. We can even tell the factories to apply the decorators or not according to the configuration file, and well, shoot me if my middle name isn’t “Aspect Oriented”.
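To see how much machinery that is, here is a minimal sketch in Python of the arrangement so far. Every name in it (Logger, Account, LoggingAccount, AccountFactory, the "logging" and "log_level" configuration keys) is made up for illustration, and an in-memory stream stands in for the log file, so treat it as one possible shape rather than a recommendation.

```python
import io
from datetime import datetime

# Hypothetical severity levels, named after the LOG_WARNING in the text.
LOG_ERROR, LOG_WARNING = 0, 1


class Logger:
    """Timestamps messages and writes them to a stream, filtered by level."""

    def __init__(self, stream, level=LOG_ERROR):
        self.stream = stream
        self.level = level

    def log(self, level, message):
        if level <= self.level:
            self.stream.write(f"{datetime.now().isoformat()} {message}\n")


class Account:
    """The class we actually care about; it knows nothing about logging."""

    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return self.balance


class LoggingAccount:
    """Decorator that wraps an Account and holds all of the log calls."""

    def __init__(self, account, logger):
        self.account = account
        self.logger = logger

    def withdraw(self, amount):
        if amount > self.account.balance / 2:
            self.logger.log(LOG_WARNING, f"large withdrawal: {amount}")
        try:
            return self.account.withdraw(amount)
        except ValueError as error:
            self.logger.log(LOG_ERROR, f"withdrawal failed: {error}")
            raise


class AccountFactory:
    """Reads the configuration so that client code never sees the Logger."""

    def __init__(self, config, stream):
        self.config = config
        self.stream = stream

    def create(self, balance):
        account = Account(balance)
        if not self.config.get("logging", False):
            return account
        logger = Logger(self.stream, self.config.get("log_level", LOG_ERROR))
        return LoggingAccount(account, logger)


log_output = io.StringIO()  # stands in for the log file
factory = AccountFactory({"logging": True, "log_level": LOG_WARNING}, log_output)
account = factory.create(100)
account.withdraw(80)  # writes a LOG_WARNING line to log_output
```

That is four classes and a configuration dependency to support one that does any real work.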

And life just doesn’t seem so easy anymore. It seems to take longer than ever to produce ever more code, new developers have to be taught all about our error handling and monitoring systems and to make things worse, today we have a bug.

Debugging is worse than the dentist. Even with watches, breakpoints and single stepping our forty lines a day productivity can drop to fixing a single line over an entire day. That is bad use of resources, resources that could be better spent rewriting that section of code, or writing better tests, or clearer design, or code review, or a good night’s sleep or any other activity that would prevent the bug in the first place. What can be worse than staring at a debugger? Try doing the same thing by staring at log files. It’s like trying to find a gunshot wound with keyhole surgery. To make matters worse, all the other developers have dumped their LOG_WARNING, LOG_NOTICE and LOG_DEBUG messages (where did they come from?) all over your now mission critical log output. It’s a desperate mess. All that extra code and configuration for what? So we could now begin the most spectacularly inefficient process in the whole of software development…?

So, why did we log that first error?

If there really is a chance of error at that point then we should fix the probable root cause. Inspect the code to see if an error really is possible. Wrap tests around the components that could possibly cause the error and confirm that it cannot every time you check in the code. If it’s too complicated to make that assertion then break the code down. Rewrite it if you have to. If you distrust the code that much it shouldn’t be in your system, otherwise every other module has to code defensively around it. That’s clutter you don’t want.

An external library or system? Wrap it. Simulate failure of the external component with mocks to make sure your error trapping is well defined. That error is a possible input to your program and should be as well defined as when a user types in their own log-in details. Building on a firm foundation may feel like you are going slower, but you won’t be. You certainly won’t be if you can save yourself a couple of days debugging time.
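As a rough sketch of what wrapping and simulating failure might look like, here is a Python example using the standard unittest.mock module. The payment gateway, the PaymentWrapper class and the checkout function are all hypothetical, and the choice of ConnectionError and TimeoutError as the failures worth trapping is an assumption made for the example.

```python
from unittest import mock


class PaymentGatewayDown(Exception):
    """Our own, well-defined error for any failure of the external system."""


class PaymentWrapper:
    """Thin wrapper so the rest of the code never touches the client directly."""

    def __init__(self, client):
        self.client = client  # hypothetical third-party client object

    def charge(self, amount):
        try:
            return self.client.charge(amount)
        except (ConnectionError, TimeoutError) as error:
            raise PaymentGatewayDown(str(error)) from error


def checkout(payments, amount):
    """Treats gateway failure as just another input with a defined outcome."""
    try:
        payments.charge(amount)
    except PaymentGatewayDown:
        return "try again later"
    return "paid"


# Simulate the external component failing, then working.
broken_client = mock.Mock()
broken_client.charge.side_effect = ConnectionError("gateway unreachable")
working_client = mock.Mock()

result_when_down = checkout(PaymentWrapper(broken_client), 25)
result_when_up = checkout(PaymentWrapper(working_client), 25)
```

The failure path now has a defined outcome that a test can pin down on every check-in, with no log file to read afterwards.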

You win in other ways. Your code will be clearer with the “debugging” calls taken out and that means fewer bugs because people will understand the code more easily. Your code will be simpler to deploy now it has been freed from yet another file dependency. A file that is security sensitive by the way, which gets you into permissions and so platform dependence. But I think the real gain is that a crutch will have been taken away from the developers.

Logs become a dumping ground. Issues that would have provoked a much needed discussion can be too easily glossed over just by logging data “for later”. For example you may be logging usage of a web site for later analysis. The factors that affect the analysis will be known by the client and you would save time by finding these out. Perhaps you don’t therefore have to log every last drop of information. Perhaps you only need overall totals or trends. This is something that should be cleared up with the client straight away, because logging everything is a sign that the goals are not very clear.

You also get the right degree of coupling between analysis components and their data. If analysis happens straight away within the same program all of the code pieces will be local to each other. With a more obvious connection we can prevent the inevitable bloat where data is added to the log, but no one feels safe about taking it away. Assuming you need to log it all. Why not place the result straight into the database and give yourself a real time solution? A log is a form of user interface and as a design choice it has to be subject to the same degree of rigour as your other design choices. Once you do you start to throw up cleaner alternatives.
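For the web site usage example, a minimal sketch of keeping running totals in the database instead of appending to a log might look like this in Python with the standard sqlite3 module. The page_totals table and the function names are invented for the example, and it assumes the client only wants hit counts per page.

```python
import sqlite3


def open_stats(path=":memory:"):
    """Creates the (hypothetical) totals table if it is not already there."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS page_totals ("
        "page TEXT PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def record_hit(db, page):
    """Updates the running total at once instead of appending to a log."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT OR IGNORE INTO page_totals (page) VALUES (?)", (page,))
        db.execute("UPDATE page_totals SET hits = hits + 1 WHERE page = ?", (page,))


def totals(db):
    """The analysis is now a query, available in real time."""
    rows = db.execute("SELECT page, hits FROM page_totals ORDER BY page")
    return dict(rows.fetchall())


db = open_stats()
for page in ["/home", "/pricing", "/home"]:
    record_hit(db, page)
summary = totals(db)
```

Nothing accumulates here that nobody dares delete, because the only data kept is the data the analysis uses.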

A log is a symptom of a half solved problem.

8/31/2004

Design drool

Programmers are distinguished by their ability to foresee abstract scenarios. We are actually rather good at it, examining several “what if” scenarios every minute. This requires clarity of vision, not just seeing ahead, but also being able to backtrack. It’s a bit like chess players examining a tree of variations, alternately going down one branch, deciding they don’t like the look of it and so trying out another instead. It’s all food for developer thought.

We got better at this not just by practice, but also by examining our own thought processes. We don’t just keep track of our decisions, but also of the reasons for those decisions. When one of the underlying assumptions becomes false we quickly reassess and change tack. It’s quite astonishing sometimes to sit back and watch a group of developers discussing a solution around a whiteboard. It’s frantic, a real verbal hive mind. Each developer trashes the assumptions of the other at lightning speed, opportunities are noted, forgotten, returned to and abandoned until the whole design has been attacked from every angle. We see it all clearly now. And it’s great fun.

In fact we love it.

We love the expanse of possibilities, being able to visualise one really clever solution after another and see them all working in our mind’s eye. Our ability to learn to react to the solution before it has really happened is like Pavlov’s dogs drooling at the bell. We enjoy seeing these imaginary solutions, almost as if we had implemented them already. Trouble is we haven’t implemented them. Right now they are no use to anybody.

Finishing them is actually a rather tedious process involving testing, talking to users, coding and endless details. Worse, our plan may turn out to not be immediately useful, despite its obvious elegance. That won’t stop us though and we’ll make it work the way it was always supposed to. Because it’s “right” after all. Good Pavlovian dogs that we are, we seek to maximise the fun stuff. This means that a fun design session will go on a little longer. We will add more power to our grand scheme and with it more abstraction. Once we do this it takes longer to implement. Perhaps it never gets fully implemented and the numbers of abandoned projects go up. Oh well, that gets us designing the next program sooner and wouldn’t it be nice if we had a UML code generating tool to save us all of that text based nonsense, or perhaps an underling or two to do the coding.

It’s a vicious circle. More unfinished projects leading to grander designs, not just for the solution, but for the methodology as well. After all our next project is so vast we will need to manage an entire team of developers, so the design method has to be perfect. Yippee, the design could take months!

And right now it’s no use to anybody.

Remember when we first wrote a program for someone else? Or just for ourselves? We would get our space invader moving across the screen (of course yours was an office application? - yeah right) and see it working. From there we would get our bullets moving upward. Flickered too much? Well we fix that first, long before getting to the AI of the baddies. That’s feature driven iterative development and it’s the agreed way to write software. Sure the program would be abandoned due to lack of resources, probably machine horsepower or your boredom level, but all the while you would have working code. And you learned from that and you learned fast. You could even play your own game in the meantime.

Once you have seen enough grand designs fail to materialise, you climb up the animal kingdom. It’s all feedback of course. We need to chase the real calories of working solutions before we forget exactly what they are like, because we don’t get handed plates for very long. Luckily all we have to do is identify our true sources of feedback and remember that design is a means to an end. It’s just that sometimes we don’t look ahead.

8/28/2004

Face the ugly demon

We were faced with a choice: work with code that had become a complete mess or start again.

When I say it was a mess I really do mean a mess. The application had suffered a year of being edited directly on the server that ran it. PHP, SQL, Perl and HTML were all intermingled, there was no architecture and little data normalisation, patches had been layered on patches and some scripts even had multiple duplicate versions. There was no CVS, not even a whiteboard to design with, just a stream of customer requests that drove the next skin graft. That’s not to say that the application wasn’t working, it most certainly was. In fact it was earning a comfortable living for the authors, a successful program by any standards. The problem was that it was now impossible to maintain.

“Maintain” is a funny word for the software industry to have for this situation. Software doesn’t rust after all and to maintain it we just leave it in the corner and let it carry on running, right? What’s there to do, oil the floppy drive?

Really we mean it was now impossible to change. That’s one step short of death for software because in real life programs have to change all of the time. Not just to fix mistakes, but because the users want more, or the market changes or just because adding something might be fun. To keep software alive takes constant churn. A program is a rare beast, one where rigor mortis sets in long before its last gasp.

Of course we went for a rewrite. Our efforts have been successful by the standards of these endeavours. The old code is now rapidly disappearing and the new OO architecture is clear, tested, flexible and spreading to several other projects. We still have some data porting issues and we are not quite there yet, but the light at the end of the tunnel is shining brightly. So would I go for a rewrite if I had that choice again?

Not a chance.

That rewrite was started over two years ago! It says a lot that the market lead was so great that we could afford such a luxury. It was exactly this period of time that Netscape showed no external progress after version 4.x of their browser and look what happened to them. We were lucky to finish at all, never mind the lost opportunities that have probably been squandered in this process.

Everyone underestimates project times. This isn’t too bad when you find out quickly on minor project increments. It can even be a running joke how slow we really move. It’s anything but, though, when the whole business sits idle waiting for competitors. I warn you now, that planned rewrite will take longer than you think. A lot longer. Ask your boss what would happen to his business if the rewrite takes four times as long as he has allowed for. He wouldn’t take the risk, right? Well it isn’t just a risk - it actually does take four times longer. Yours will too.

Ok you say, it’s not that simple because there is some time for maintenance of the old system and some new code percolates upward sooner. Is having two jobs to do really going to speed things up? It is probably this split focus that helps to cause the delays, although mostly I think it is the lack of urgency. If someone reports a problem and it’s not a quick legacy fix it will likely get put into the “fixed in 6 months after the rewrite” box. They never ever see the fix. Pretty soon they stop talking to you altogether and the feeling of unreality takes hold.

So look at that ugly code base again. Think it’s worse? It probably isn’t. It doesn’t actually take that long to get control of code. A simple wander around fixing things at random will actually go a long way. Copy it to another machine. How much still works? You will be surprised how easy this turns out to be. In the worst case it may take a week to reattach all the different components, but that is still nothing compared to rewriting it.

Cloning the system on to a development box and automating back the deployment will be the tipping point. After that you are free to delete things and see what breaks. That’s vital. A good programmer will wear down the delete key before any other key on the keyboard. With the deadwood gone it is easy to see commonality. Forget architecture at first and just pull stuff out that makes sense. Feel free to get it completely wrong and have to refactor it all back another way. If you are tidying your house you will create intermediate piles after all, ones you plan to sort through later. Sounds like even more work? Believe me, it will still be quicker than that rewrite and each step will get easier. Any improvements you make will go straight out to the users as well.

Of course any breakages will go straight out too, so you might want to invest in some test tools at this point to save unnecessary effort. The main thing is that you are connected to this rather ill, but still living and breathing, software package.

As the newer parts of our rewritten system have come on stream I have started receiving feelings of reinforcement that I had almost forgotten. If an issue arises we queue the work and that issue is resolved in a matter of days. Some of these requests are big features. That’s a nice feeling, to know a thing is “done” ready for the next one. Keep ‘em coming.

Our development processes are really rather good for such a small company and we usually know how we are doing. I don’t think we could have sustained ourselves through the rewrite period without that, but despite this I have often felt things appeared frustratingly slow. They weren’t, it’s just that every item was left pending rather than totally complete. That’s changed at last, and I now realise how much more effective we could have been.

Oh well, at least now we get the chance to show the world what we can do.
