August 10, 2016 by dannybishopcreative
But on the night, the biggest online flop in Australia’s history played out. The site first became unresponsive, then it failed altogether, with even DNS becoming unavailable. When the front page of census.abs.gov.au was reachable, it was almost always impossible to reach the pages that contained the 2016 census form.
The ABS itself had been batting away concerns around privacy, and in doing so had proudly announced it was ready to protect Australia’s privacy, and also ready to meet the expected demand.
Of course the knives were out immediately. IBM, who were providing hosting, became one target. The business that had won the tender to perform load testing also came under fire on Twitter.
So where was the fault?
Of course the technology failed. There’s no doubting that. Ultimately that’s the cause of everything.
But someone, somewhere doomed it all to fail when they estimated the traffic.
In the week leading up to Census night, spokespeople for the ABS stated that they had tested the site to 1,000,000 form submissions per hour – twice the load they expected.
So what that tells us is that the ABS actually expected 500,000 form submissions per hour, and tested to 200% of that figure.
How did the ABS come up with that figure?
Australia has somewhere near to 9 million households. Perhaps the ABS used some basic maths to come up with that number… 24 hours in a day, but let’s assume people are asleep for 8 hours of the available 24, that means 16 hours. 9 Million households, but a bunch will do the paper version, so maybe 8 Million online submissions. 8 Million households divided by 16 hours… voila – 500,000 visitors per hour. 500,000 visitors spread evenly, perfectly across every waking hour of August 9th.
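To make that guess concrete, here is a minimal Python sketch of the flat-spread arithmetic. The inputs are the rough figures assumed in the paragraph above (they are this article's guesses, not published ABS numbers), and the names are purely illustrative.

```python
# Assumed inputs from the back-of-envelope reasoning above (not ABS figures).
HOUSEHOLDS_ONLINE = 8_000_000  # ~9M households, minus a guess at paper forms
WAKING_HOURS = 24 - 8          # assume everyone sleeps for 8 hours


def flat_rate_per_hour(submissions: int, hours: float) -> float:
    """Average hourly rate if submissions are spread perfectly evenly."""
    return submissions / hours


expected = flat_rate_per_hour(HOUSEHOLDS_ONLINE, WAKING_HOURS)
tested = 2 * expected  # the ABS said it tested to twice the expected load

print(f"expected: {expected:,.0f} submissions per hour")
print(f"tested:   {tested:,.0f} submissions per hour")
```

The only way to land on 500,000 per hour with these inputs is to assume nobody has a preferred time of day.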
But let’s look at it another way. There are almost 12.5 Million people just in Brisbane, Sydney, Canberra, Melbourne and Hobart. Given the assumption of 2.6 people per household that brings us to 4.7 M households in those cities. Add in the regional centres and you’ve got more than 5 Million households on the east coast of Australia.
That’s important because those are 5M households likely to want to complete the Census at a similar time. Let’s assume lots of people work, have kids, or eat dinner (not bad assumptions, I think you would agree). Maybe they wanted to do the census earlier in the day, but realised during the afternoon at work that they didn’t have the special letter with the code on it.
So let’s assume that 7pm to 9pm is when most people on the east coast are going to do the census. That’s 2 hours for 5M households. That’s 2,500,000 submissions per hour, 5 times the estimate that the ABS counted on.
If you’re feeling kind and spread that over 3 hours, then things get both better and worse. As we widen the window, we’ve got to account for more of the country. Even with just a 2 hour window we’ve left off Adelaide, whose window opens half an hour after the east coast cities. If you extend the window to 3 hours then you have to include at least a third of Western Australia’s population as well, which pushes us up to about 8.3M households wanting to do the census online in a 3 hour period.
So if we push the window to 7pm-10pm then the average traffic over that period is more than 2.78M form submissions per hour.
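The two evening scenarios can be compared against the ABS estimate with a short Python sketch. The household counts are the rounded guesses from the text, so the 3 hour result lands slightly under the 2.78M quoted above; the scenario labels and function name are illustrative only.

```python
ABS_EXPECTED_PER_HOUR = 500_000  # the figure implied by the ABS statements


def average_rate(households: int, window_hours: float) -> float:
    """Average submissions per hour if everyone submits inside the window."""
    return households / window_hours


# (households, window length in hours); both are this article's rough guesses.
scenarios = {
    "east coast, 7pm-9pm": (5_000_000, 2),
    "most of the country, 7pm-10pm": (8_300_000, 3),
}

for label, (households, hours) in scenarios.items():
    rate = average_rate(households, hours)
    ratio = rate / ABS_EXPECTED_PER_HOUR
    print(f"{label}: {rate:,.0f} per hour ({ratio:.1f}x the ABS estimate)")
```

Widening the window lowers the ratio a little, but either way the average evening load is more than five times what the site was sized for.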
But even then we’re ignoring peak demand. An average ignores the possibility of clustering due to factors like kids’ bedtimes generally falling on the half hour (7, 7:30 etc), or advertising on TV (or even the end of an Olympic event on TV).
Clustering might mean that instead of the load being evenly spread across every minute of the 3 hour window, perhaps at 7:35 there will be a peak that lasts 5 minutes but delivers 20 minutes’ worth of traffic. That would mean 4 times the average traffic, or more than 11,137,000 form submissions per hour. If the load testing was meant to give confidence by delivering 200% of that traffic, it would have needed to mimic more than 22 Million form submissions per hour.
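Here is the clustering arithmetic as a rough Python sketch. It starts from the rounded 2.78M per hour average, so the results come out a touch under the figures quoted above; the burst shape (20 minutes of traffic arriving in 5) is the hypothetical from the text, not a measurement.

```python
AVERAGE_PER_HOUR = 2_780_000     # rounded 7pm-10pm average from the text
ABS_EXPECTED_PER_HOUR = 500_000  # the figure implied by the ABS statements


def peak_multiplier(traffic_minutes: float, burst_minutes: float) -> float:
    """How much a burst raises the rate over the average."""
    return traffic_minutes / burst_minutes


multiplier = peak_multiplier(traffic_minutes=20, burst_minutes=5)
peak_per_hour = AVERAGE_PER_HOUR * multiplier
test_target = 2 * peak_per_hour  # testing to 200% of the peak

print(f"peak multiplier:    {multiplier:.0f}x")
print(f"peak rate:          {peak_per_hour:,.0f} per hour")
print(f"200% test target:   {test_target:,.0f} per hour")
print(f"vs the ABS estimate: {test_target / ABS_EXPECTED_PER_HOUR:.0f}x")
```

The point of the sketch is the multiplier: a short burst turns an average that is already too high into a peak rate several times higher again.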
And the ABS were building to 500,000 per hour.
The 500,000 per hour estimate is the real failure here. The website would have been built to that spec, and it failed because of it.