Why was Spamavert down?

Sunday May 24, 2009

At first we didn’t notice that Spamavert had gone down; it had been running smoothly for so long that we didn’t know until someone told us.

Our maintenance script that deletes old e-mails had been failing for a while, resulting in a backlog of multiple gigabytes of e-mails on the system. Since the system has a small disk quota (it runs on a VPSLink virtual machine), it ran out of disk space (we actually overran our quota by about 300%) and stopped working.
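As an illustration only, here is a minimal sketch of a cleanup job of this kind that reports problems instead of failing quietly. It is not the script Spamavert actually runs; the function names, the idea of judging age by file modification time, and the exit codes are all assumptions made for the example.

```python
import os
import sys
import time
from pathlib import Path


def delete_old_mail(mail_dir, max_age_days, now=None):
    """Delete regular files under mail_dir older than max_age_days.

    Returns (deleted, failed) so the caller can react to failures.
    Age is judged by modification time (an assumption of this sketch).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    deleted = failed = 0
    for root, _dirs, files in os.walk(mail_dir):
        for name in files:
            path = Path(root) / name
            try:
                if path.stat().st_mtime < cutoff:
                    path.unlink()
                    deleted += 1
            except OSError:
                # Count the failure instead of silently skipping it.
                failed += 1
    return deleted, failed


def run(mail_dir, max_age_days):
    """Run the cleanup and return a process exit code (0 means success)."""
    # os.walk yields nothing for a missing directory, so check explicitly.
    if not os.path.isdir(mail_dir):
        print(f"cleanup: {mail_dir} is not a directory", file=sys.stderr)
        return 2
    deleted, failed = delete_old_mail(mail_dir, max_age_days)
    if failed:
        print(f"cleanup: {failed} file(s) could not be removed", file=sys.stderr)
        return 1
    print(f"cleanup: removed {deleted} file(s)")
    return 0
```

A scheduled job could call `sys.exit(run(path, days))` with a hypothetical mail directory and retention period, so that whatever runs the job can act on a non-zero exit status.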

The outage was partly due to the VPSLink downtime last week. Even after service was restored to their other customers, we could not get our virtual machine to run properly. We deleted around 18GB of e-mails (that’s several million files), which took around six hours.

However, our disk quota did not update to reflect the freed space. We got the service up and running, but it ran out of disk space again after a few hours. We did not discover this until the morning after. We e-mailed VPSLink, but due to the time difference the fix was slow to arrive. When our disk quota was finally corrected, everything was back up.

We are working on procedures that will ensure that such extended outages do not happen again. We apologize to anyone who was affected by the outage.
