Incident Report re: Service Outage 29th October 16:42 - 17:37 GMT
By Walt on Friday, October 31 2008, 18:35 - Service status - Permalink
At 16:42 on the 29th October we were alerted to the fact our main file server appeared to be offline. This meant that customers' uploaded files were unavailable. Once it was confirmed that the file server was not accessible, we immediately instigated our recovery plan and brought the standby file server online. We then attached the network storage to it, thus restoring access to customer uploaded files. Services were brought back up by 17:32.
On further investigation, we found that the server had detected that it had entered an inconsistent state and, as a result, halted itself as a safety measure. This is an extremely rare occurrence and is the first time we have encountered this behaviour since SiteMaker began. However, this is the reason we have redundant hardware: it enables us to fail over quickly to our standby system and recover services rapidly.
As a precaution, we have updated the operating systems of the file servers and, although we have no reason to suspect any damage, we are running full diagnostics on the hardware.
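For readers curious about what failing over to a standby involves, here is a minimal sketch of the idea in Python. It is illustrative only and is not our actual recovery tooling: the `FileServer` class and the `fail_over` function are hypothetical names, and the "network storage" is modelled as a simple flag that only one server holds at a time.

```python
from dataclasses import dataclass


@dataclass
class FileServer:
    """Hypothetical stand-in for a file server in a primary/standby pair."""
    name: str
    online: bool = True
    storage_attached: bool = False


def fail_over(primary: FileServer, standby: FileServer) -> FileServer:
    """Return the server that should be serving customer files.

    If the primary is still online, nothing changes. Otherwise the standby
    is brought online and the network storage is moved across to it.
    """
    if primary.online:
        return primary

    # Bring the standby up first, then move the storage over to it.
    standby.online = True
    primary.storage_attached = False
    standby.storage_attached = True
    return standby


# Example: the primary has halted itself, so the standby takes over.
primary = FileServer("primary", online=False, storage_attached=True)
standby = FileServer("standby", online=False)
active = fail_over(primary, standby)
print(active.name)
```

The storage is detached from the halted server before it is attached to the standby, so that at no point do both servers claim it.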
Comments
Good shout Walt.
I for one have left my job of 9 years to set out in website design - using Sitemaker software.
So with my self-employment now rather pinned on the reliability of Moonfruit - it's good to know that it's taken seriously!
Arif
Is there a problem with the server? My website is not showing. I get this message:
The page cannot be displayed
The page you are looking for is currently unavailable. The Web site might be experiencing technical difficulties, or you may need to adjust your browser settings.
Please advise a.s.a.p
Hi Keepsakek,
We are not experiencing any errors, problems or slow-downs. Suggest you contact support and give us some details about your site and account.
If you have problems with accessing support on Moonfruit, please use our alternate site:
http://www.sitemakerlive.com/en/sup...
Thanks
Moonfruit is not working very well tonight (3rd Nov). Around 6.35 - 7.20pm it was very slow, and now websites will not load up; I keep getting this: (500 Internal Server Error)
Hi Andy,
Can you please give us a time zone.
We've checked our logs for Monday and have no reported issues. We do run some backup scripts at around midnight GMT, but these are extremely unlikely to cause severe slowdowns or, for that matter, Internal Server Errors.
Thanks
Hi there, I can access my site but when I go to edit, it won't load the page. I have had the problem before and didn't get a lot of help, and now it's really crucial I get the sites edited. I have cleared cookies etc. but it still won't work.
HELP!
Hi Emma,
This will not be related to the incident mentioned above. You also haven't given us any clues as to which sites you are referring to. Is it just one site or all of your sites?
The only way we can resolve this is if you contact support and give us as much detail as possible, and we will investigate. We will ensure that if the problem is a serious one it will get addressed and resolved.
Thanks
once again moonfruit is offline. anyone else got any issues. Moonfruit what is going on?
Hi, moonfruit and sitemakerlive are down as well as all the sub sites. Do you have an eta of when this will be corrected. much appreciated
Rory
fingers crossed, it seems it may be back
may have spoken too soon
I was beginning to think it was just me!
moonfruit are you able to tell us what is going on and an eta on when it will be fixed please.
Anyone been getting a 502 Bad Gateway?
I just got this a few seconds ago, both on my website, which strangely enough is now loading properly, and on moonfruit, which I bet is working too
has this got anything to do with the outage which occurred at the end of last month?
yeah, we are getting that message on our site too. It will go in and out of working properly. Waiting to hear from moonfruit; no details from them on what is happening or an eta on when it will be fixed as yet
well, I'm getting all sites up now, although a little slow... is this a regular thing with moonfruit??
actually i take that back.. they are loading fast now.. everything is A OK!
it has happened a few times but does not last for long, and they normally will let us know on here why it happened
cheers gavin... I won't be best pleased if it's a regular thing!
Hi Gavin
A 502 Bad Gateway is a server fault possibly on moonfruit's side
seeing how fast they fixed the last outage expect to see an update from them on what caused this 502 error
thanks for that Andrew
Hi guys, things are all back up and running now. There was a problem with the webservers that required us to restart the entire server farm, which was unfortunate and time consuming. The slow response time was caused by the servers reloading their cache files as they started up again. We will publish a full status report on this as soon as we have finished our analysis. Thanks,
Joe
Just been over to the forum over at moonfruit hoping to see a few siteleaders have a moan at why their sites weren't working ( this 502 bad gateway ) and guess what nothing at all
saying that some one said about their's being offline
Thanks Joe
for keeping us informed
has this latest small outage got anything to do with what happened at the end of last month?
Hi Andrew,
No, that was our service provider having a failure of its uninterruptible power supply (the irony...). This was a corrupted image bringing down the image conversion software on our webservers. We've released a patch to stop it happening again, but given we convert millions of images each day, it's not a common occurrence. It's always frustrating for us too when these things happen close together, but I can assure you they are not linked. Thanks,
Joe
Thanks for that Joe
I come across quite a few 500 Internal Server Errors due to being on a few other art sites/forums
but seeing this 502 Bad Gateway was a first for me
Seeing how many sites moonfruit actually host, would there ever come a time where you would look at changing service provider?
Hi Andrew,
Yes, that is something we would consider, though in the 8 years we've been with our service provider, they've only had 2 problems, both of which were in the last year. So we're keen to understand why it won't happen again, but we also know that they had 7 years of unblemished service, so it's not always easy to judge. The reality is that in the industry we're in there will always be technical problems. Every provider from Google downwards has service disruptions. I think what sets you apart as a company is how you deal with them, and how you respond to the customers, which I hope we do well. We always try to be open and honest about problems we have and address whatever issues we can to prevent future downtime.
I hope that helps,
Joe