Tuesday 19 August 2008

The Three Legged Dog

One of my bugbears – and everyone in IT seems to have at least one – is performance.

Badly performing applications and operating systems really annoy me – often more than other bugs, due to their persistence – and, more importantly, reduce my productivity. Yet such poor performance seems commonplace, and I for one do not buy the “it’s the hardware” argument any more than I would buy the “it’s the road” argument for a badly performing car.

I should distinguish between apps run on recommended hardware and those run on an old desktop next to the broom cupboard in an office far, far away... – a not uncommon experience.

I am sure many of us can catalogue the applications and operating systems that fail to perform everyday functions. I have previously referred to Vista as poorly performing (and, in my view, too fat to live), and even my brand new HTC PDA – sporting Windows Mobile 6.1 – runs like a snail.

As well as not buying the “it would work well on a CRAY supercomputer” argument, I don’t buy the “it’s part of testing” argument either. Sure, testing should report the issue and indeed FAIL the release because of it, but testers are hardly responsible for the performance (though I have to question why they do not exercise this veto more often).

Finally, I do not buy into the “it’s a bug” argument. Sure, it is a bug, but not in the sense in which I usually hear performance referred to, i.e. that performance issues will become apparent during performance testing which, according to the V-model, occurs during system testing.

Of course, this is a reasonable place for verification, BUT not the first instance of this testing, in the same way that you would not expect it to be the first place to discover bugs that would otherwise be caught during unit testing.

I also do not subscribe to the idea that code performs well by default and that it is only the random occurrence of bugs that interferes with this natural state of affairs. I assert that the reverse is actually true: unless performance is architected and coded in from the start, the result will always perform badly. I would also assert that this statement better explains the reality of our experience of applications.

In my experience, more often than not, the functionality has had to be re-written from scratch – rather than bug-fixed – because of its poor performance. It has not been a “bug that crept in” but simply the absence of code architected and designed for performance. This also explains why poor performance often isn’t rectified pre-release: the cost of re-writing the code, and the effect on the release date, are too great.

However, this is all the more reason why code needs to be architected and designed for performance and not just functionality. It would appear that “non-functional requirements” – such as performance – are treated as “soft” requirements rather than “hard” requirements, the implication being that they are optional extras.

I do not buy into this, or into the “we ran out of time” argument either, since performance has to be considered up front, not last. If you leave it until last, you have left it too late.

The example I have had quoted to me on several occasions is the cash machine. Accepting all the functionality of the cash machine, if it took 10–15 minutes to dispense the cash, just how often would you use it?

I also understand that performance (and also scalability) is harder to achieve these days because of “The Stack”. In the “good ol’ days” you coded in machine code/assembler directly against the CPU. Now you go through numerous stacks, APIs, classes etc. which you didn’t write, with no real idea of how they perform under different circumstances. But that only supports my previous assertion that code does not perform by default, but only through careful design, architecture and unit testing.

It is also true that code may have different performance profiles depending on the task required. For instance, a piece of code relating to train schedules may be very quick to list all the trains departing on a given day, but horrendously slow to list the trains on a given line. This dimensional performance characteristic can be present in many different areas.
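By way of illustration – a minimal Python sketch, with entirely hypothetical names – the store below is indexed only by departure date, so listing the trains on a given day is a single lookup while listing the trains on a given line has to scan the lot:

    from collections import defaultdict

    # Hypothetical schedule store, indexed only by departure date.
    schedule_by_date = defaultdict(list)

    def add_train(date, line, train_id):
        schedule_by_date[date].append((line, train_id))

    def trains_on_date(date):
        # Fast: one dictionary lookup, independent of the total number of trains.
        return schedule_by_date[date]

    def trains_on_line(line):
        # Slow: there is no index on line, so every date bucket must be scanned.
        return [(date, train_id)
                for date, trains in schedule_by_date.items()
                for (ln, train_id) in trains
                if ln == line]

Both functions are “correct”, but only one of them was designed for the question it is asked.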

Another mistake concerns data processing: developers responsible for data-entry logic can skew the schema towards data-entry performance at the expense of data access, yet reads often account for 80% of database access in the wild. Thus data entry is fast (since it dictates the schema initially) but data access is slow, and the overall effect is poor performance.
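To make the trade-off concrete, here is a small Python sketch (again with hypothetical names, and an in-memory stand-in for a real schema): the write path pays a little extra to maintain a read-oriented index, so that the read path which dominates real usage stays fast:

    from collections import defaultdict

    # Hypothetical order store. The write path maintains a second, read-oriented
    # index, paying a small cost on entry so the common read path stays fast.
    orders_by_id = {}                        # primary, write-friendly structure
    orders_by_customer = defaultdict(list)   # secondary index for the dominant read case

    def record_order(order_id, customer, amount):
        order = {"id": order_id, "customer": customer, "amount": amount}
        orders_by_id[order_id] = order              # cheap insert
        orders_by_customer[customer].append(order)  # small extra write cost

    def orders_for_customer(customer):
        # Fast read path: one lookup rather than a scan of every order.
        return orders_by_customer[customer]

The same reasoning applies to a database schema: an extra index costs something on every insert, but it is the reads that users actually sit and wait for.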

So my TEN principles for improving performance are:

1. Use cases must include performance profiles/response times.
2. Design must include reporting, scope and response times.
3. DBA must be involved in database design and focus on data access/reporting performance.
4. Architecture must be involved in development start-up meetings with performance a key agenda item.
5. Developers need regular performance training.
6. Unit Testing MUST include performance profiling, including use cases which affect data dimensions/reporting etc. (see the sketch after this list).
7. Developers/architects need to include performance strategies as part of their training/coding approach.
8. Unit Testing must have success/failure criteria including performance.
9. Architecture must be involved in reviewing code, not just in terms of formation/patterns/quality but also performance.
10. QA departments must be given a veto over poorly performing releases.
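To illustrate principles 6 and 8, here is a minimal Python sketch of a unit test with an explicit performance pass/fail criterion; the function under test, the line name and the half-second budget are all hypothetical and would, in practice, come from the use case:

    import time
    import unittest

    def list_trains_on_line(line):
        # Stand-in for the code under test; a real suite would import it instead.
        return []

    class TrainQueryPerformanceTest(unittest.TestCase):
        # Illustrative budget only; a real figure would come from the use case.
        RESPONSE_BUDGET_SECONDS = 0.5

        def test_trains_on_line_meets_response_budget(self):
            start = time.perf_counter()
            list_trains_on_line("East Coast Main Line")
            elapsed = time.perf_counter() - start
            self.assertLess(elapsed, self.RESPONSE_BUDGET_SECONDS,
                            f"query took {elapsed:.3f}s against a budget of "
                            f"{self.RESPONSE_BUDGET_SECONDS}s")

    if __name__ == "__main__":
        unittest.main()

A test like this turns a “soft” requirement into a hard one: if the query blows its budget, the build fails, not the user’s patience.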
