Madman.com.au is Temporarily Unavailable

OK, so I've made some comments about the Madman site in the past but lately I have been getting more and more upset by outages in the service. Right now that has really reached boiling point as I've been trying to get to the site for the last 7 hours. Read on for more ranting on the subject, along with some (hopefully) constructive criticism.

For the last month I've been seeing the following on an increasingly frequent basis:

Madman 503 error page

In the lead-up to Christmas, which coincides with Madman's various sales and promotions, I have been seeing this error page more often than not. When I see it I tend to get visions of a lone server sitting in a drab room in a Collingwood office building, quietly struggling to keep up with the number of connections and serve out hundreds of pages, images, scripts and, now, high-quality streaming video.

I think one thing has been made painfully obvious at this point, and it's something that should have been learned during the 10th Anniversary Sale: Madman need to upgrade their web service! I mean, really, there are ways to deal with higher than normal loads and they don't need to be terribly expensive.

Optimise the web application.

This should be the first port of call. The number of database queries should be minimised, output size should be reduced, Java code optimisations should be researched, caching should be implemented and the whole thing should generally be given a thorough going over.
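To make the caching point a little more concrete, here is a minimal sketch of a time-limited cache that could sit in front of an expensive database query. I have no idea how the Madman code is actually structured, so the class name `TtlCache`, the loader function and the expiry time are all my own assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper: remembers the result of an expensive lookup for a fixed time. */
final class TtlCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Function<K, V> loader; // stands in for the real database query

    TtlCache(long ttlMillis, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
    }

    V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.value; // still fresh, no query needed
        }
        // Missing or expired: run the query once and remember the result.
        // Two threads may occasionally both load the same key; that is
        // wasteful but harmless for read-only page data.
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }
}
```

With something like this wrapped around a product listing query, repeated hits on a popular page within the expiry window would be answered from memory rather than from the database.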

Break up the processing responsibilities.

Since the site runs on Java I'm going to assume they're running a Tomcat server. I hope this same server is not used for serving static content like images – the Apache HTTP Server is a much better fit for that. At the same time, it's probably best to use a few servers to perform load balancing and to run the database server(s) on different hardware to the web server(s). This should ensure that no single machine is put under too much load.
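In practice this split would be done in the web server or load balancer configuration rather than in application code, but the idea is simple enough to sketch in Java. The server names, the list of file suffixes and the `RequestRouter` class below are all made up for illustration; the sketch just sends static files to one machine and spreads everything else across the application servers in turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router: static files go to one server, the rest is round-robined. */
final class RequestRouter {
    // Assumed set of static file types; a real site would have its own list.
    private static final String[] STATIC_SUFFIXES = {".jpg", ".png", ".gif", ".css", ".js"};

    private final String staticServer;
    private final List<String> appServers;
    private final AtomicInteger next = new AtomicInteger();

    RequestRouter(String staticServer, List<String> appServers) {
        if (appServers.isEmpty()) {
            throw new IllegalArgumentException("need at least one application server");
        }
        this.staticServer = staticServer;
        this.appServers = new ArrayList<>(appServers);
    }

    /** Picks a server for a request path (query strings are not handled here). */
    String route(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        for (String suffix : STATIC_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return staticServer;
            }
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), appServers.size());
        return appServers.get(index);
    }
}
```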

Use a Content Delivery Network.

This would probably result in the most drastic improvement for the least amount of work. Using a CDN like [Akamai][6], or even a hosted storage service like Amazon S3, would allow all the static content (and there's a lot of it) to be served from servers that are designed to handle almost any load and bandwidth requirement. I'm thinking the streaming trailers would benefit the most from being moved to a CDN, if they're not on one already. On top of the benefit of reducing bandwidth needs and the number of active connections required to the Madman server, Akamai and possibly others provide services to increase the availability of web applications ("Web Application Acceleration") – particularly Java-based ones. I can't say how well these would work here and whether they'd be necessary once other optimisations are performed, but it's definitely something to look into.
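On the application side, moving static content to a CDN can be as small a change as building asset URLs through one helper, so the host can be swapped in a single place. The sketch below assumes a made-up CDN hostname (`cdn.example.com`) and a hypothetical `AssetUrls` class; it is not based on anything in the actual Madman site.

```java
/** Hypothetical helper: prefixes static asset paths with a CDN base URL. */
final class AssetUrls {
    private final String cdnBase;

    AssetUrls(String cdnBase) {
        // Drop a trailing slash so the result never contains a doubled "//".
        this.cdnBase = cdnBase.endsWith("/")
                ? cdnBase.substring(0, cdnBase.length() - 1)
                : cdnBase;
    }

    /** Returns the full URL a page template should use for the given asset path. */
    String url(String path) {
        String normalized = path.startsWith("/") ? path : "/" + path;
        return cdnBase + normalized;
    }
}
```

A page template would then call something like `assets.url("images/cover.jpg")` instead of hard-coding a path on the main server, and the browser would fetch the image from the CDN.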

Well, I think that about sums it up. A few things that could be done, the last being quite simple, to improve the availability of the Madman site and ensure potential customers get turned into actual customers and the raving fans are placated with their anime crack.

[6]: http://www.akamai.com/