100% uptime

Monday, September 8th, 2008

It's a pretty common question in hosting, whether it be shared, reseller, VPS or dedicated hosting: is 100% uptime possible? Why do all hosts offer 99.8% or 99.99% uptime and not 100%? In this post I will discuss this, and give some suggestions on how it is possible.

Is 100% uptime possible?

Well, the simple answer is no, and that is what most people will say. At the end of the day, how can you offer 100% uptime if your upstream providers (datacenters, network providers, hardware vendors and so on) cannot offer a 100% uptime guarantee? Then of course there is the fact that servers need to be rebooted every now and then when there are important updates. So the question should be: is 100% uptime possible when you don't include scheduled downtime?

Well, let's take a look at the numbers:

Uptime %    Downtime per month    Downtime per year
100         0 mins                0 mins
99.99       4.32 mins             52.56 mins
99.9        43.2 mins             525.6 mins
99.8        86.4 mins             1051.2 mins

(Based on a 30-day month and a 365-day year.)

OK, so at first look 43.2 mins of downtime a month (the average that datacenters offer) is a lot, but when you look at it over a whole year (525.6 mins, or just under 9 hours) it seems much better. Anyway, that's diverting from the main topic; I only meant to show these numbers to give an idea of what you are dealing with in terms of guarantees.
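
If you want to work out the same figures for other uptime percentages, the arithmetic is simple enough to script. Here is a minimal sketch in Python, assuming a 30-day month and a 365-day year (the function name downtime_minutes is just something I made up for this example):

```python
MINUTES_PER_DAY = 24 * 60


def downtime_minutes(uptime_percent: float, days: int) -> float:
    """Minutes of allowed downtime over `days` days at the given uptime %."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The downtime share is whatever is left over after the uptime share.
    return (100 - uptime_percent) / 100 * days * MINUTES_PER_DAY


# Print the same rows as the table: a 30-day month and a 365-day year.
for pct in (100, 99.99, 99.9, 99.8):
    per_month = downtime_minutes(pct, 30)
    per_year = downtime_minutes(pct, 365)
    print(f"{pct:>6}%  {per_month:8.2f} mins/month  {per_year:8.2f} mins/year")
```

Change the tuple of percentages to whatever guarantee your host is quoting and you can see straight away how many minutes you are really being promised.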

What can you do to maximise uptime?

There are quite a few things you can do to reduce downtime and make sure you offer the best uptime possible. I am going to split this into two: there are some very easy things you can do that are mainly common sense and won't cost you anything (but time), and then there are the more costly options that squeeze out that last fraction of a percent so you can truly achieve 100% uptime.

So let's start with the easier and cheaper methods:

  • Monitoring – This is the easiest and probably the most effective way to minimise your downtime. Monitor your servers so that if one goes down you can get to it quickly and fix it. This can save hours of downtime.
  • Checking Logs – It may sound very time consuming, but once you get an eye for it you can scan the relevant logs (messages, httpd error log etc.) quickly and look out for lines that warn you about problems.
    A simple command like grep fail /var/log/messages is a good start.
  • Constant Security – Securing your server properly can help keep your server up. I shouldn't need to state why. This includes keeping both the software on the server and the site software (e.g. blog software) up to date.
  • Optimization – Make sure your server is running as well as possible, so that small spikes in traffic will not harm you.
  • Monitoring Load – Make sure you log in during peak times to check the load, memory and general performance of the server. Do not overfill the server, and take heed of the signs that you may need to upgrade.
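
To give a rough idea of what the monitoring points above can look like in practice, here is a minimal sketch in Python. It is not a replacement for a proper monitoring service: the function names, the 3 second timeout and the "load of 1.0 per CPU" limit are all assumptions I picked for the example, so tune them for your own servers.

```python
import os
import socket


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def load_is_high(load_1min: float, cpu_count: int,
                 per_cpu_limit: float = 1.0) -> bool:
    """Return True if the 1-minute load average exceeds the per-CPU limit."""
    # per_cpu_limit is an assumed rule of thumb, not a universal threshold.
    return load_1min > per_cpu_limit * max(cpu_count, 1)


def current_load_is_high(per_cpu_limit: float = 1.0) -> bool:
    """Check this machine's load (Unix only: os.getloadavg is not on Windows)."""
    load_1min, _, _ = os.getloadavg()
    return load_is_high(load_1min, os.cpu_count() or 1, per_cpu_limit)
```

You would typically call something like tcp_check("your.server.name", 80) from a different machine every minute or so (cron works fine) and send yourself an email or text when it returns False. Running the check from the server being monitored tells you nothing when the whole server is down.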

Okay, so that was the cheap and easy option. It's pretty simple really, and it is generally a good idea to do all of the above just to keep your server running well.
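
If you would rather script the log checking than eyeball it, something along these lines works as a starting point. This is only a sketch: the keywords "fail" and "error" are my own guess at what is worth flagging, and the path in the usage note is just the usual location of the messages log, so adjust both for your setup.

```python
from typing import Iterable, List


def find_warnings(lines: Iterable[str],
                  keywords=("fail", "error")) -> List[str]:
    """Return the lines that contain any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]


def scan_log(path: str, keywords=("fail", "error")) -> List[str]:
    """Scan a log file on disk and return the suspicious lines."""
    # errors="replace" stops odd bytes in the log from aborting the scan.
    with open(path, "r", errors="replace") as handle:
        return find_warnings(handle, keywords)
```

Calling scan_log("/var/log/messages") does much the same job as the grep command above, except that it matches regardless of case and can look for several keywords in one pass.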

So now onto the more expensive solutions for the people who require that extra 0.0001%. I would not recommend this unless every minute of downtime costs you, as these options are by no means cheap and are not easy to do.

I want to keep away from a bullet-pointed list for this, as there is no “set” list of things you do to keep the server up 100% of the time. It can range from small things, such as installing a more efficient web server like lighttpd, to creating a dedicated database server. This really should be customized on a site-by-site basis: you need to analyse where the server's weaknesses are and fix them. That is the basic way of doing it.

However, if you want 100% uptime guaranteed, well, that's the interesting bit, isn't it? You then need to account for all the different types of downtime, including network downtime. For this sort of solution you are looking at geo load balancing, that is, servers based in multiple locations around the world. To make this option viable you are looking at having to spend a lot of money, and having to custom code syncing solutions so that the website and databases are as up to date as possible in every location.

As for the geo load balancing itself, there are multiple ways of doing it, and the best is getting a “portable” IP which allows you to “move” it between locations. That's pretty complicated, and portable IPs are hard to get, so the other way is round-robin DNS. I will let you google that, and maybe I will explain it in a later post... but right now I think I have blogged too much and you're probably falling asleep. So yeah, if you have any questions go ahead and ask :D
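
P.S. For a rough idea of what round-robin DNS means from the client's side: the hostname has several address records, one per server, and a client can move on to the next address if the first does not answer. Here is a minimal sketch in Python. The function names are made up for this example, and the probe argument is an assumption standing in for whatever health check you use. Keep in mind that plain DNS does not notice when a server dies, so the failover here only happens because the client tries the other addresses.

```python
import socket
from typing import Callable, Iterable, List, Optional


def resolve_all(hostname: str, port: int = 80) -> List[str]:
    """Return every distinct IP address the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def first_reachable(addresses: Iterable[str],
                    probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first address for which probe(address) is True, else None."""
    for ip in addresses:
        if probe(ip):
            return ip
    return None
```

In use you would call resolve_all on your own hostname and pass the result to first_reachable along with a probe that attempts a short TCP connection to port 80 on each address.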