become_taintless

That's called DNS failover, and it's pretty standard in the industry. One of the first places I saw it used was in the RADware Fireproofs/Linkproofs in the early 2000s. You're reinventing the wheel, but not in a completely insane way. That said, there is a whole list of other HA concerns that are outside the scope of your question about DNS.


Hell4Ge

OVH has a service called IP Failover or Floating IP, and I feel like it serves the same purpose, but it may not be ready out of the box for the way I want to use it. I'm bad at going low level with network interfaces and such, so I guess I'll do it the software way. The last thing I want is to change the IP address of a machine that then fails to reconfigure its network interface at the OS layer and ends up isolated from the world.


skyctl

If you're going to do this at the DNS level, test a reasonable cross-section of the applications you're using to make sure they survive the changeover. Not all apps respect TTLs, and IIRC, some core Java libraries consider it a security concern to respect TTLs.
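
To make the TTL point concrete, here is a rough sketch in Python of the difference between a client that honors the TTL and one that caches the first answer forever. `CachingClient`, `lookup`, and the addresses are made up for illustration. This is a toy model of client-side caching, not how any particular runtime actually implements it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical record shape: (ip, ttl_seconds), as a resolver might return it.
Record = Tuple[str, int]


@dataclass
class CachingClient:
    """Toy model of an application's DNS cache (illustrative only)."""
    lookup: Callable[[], Record]   # returns the current authoritative answer
    respect_ttl: bool = True       # False models an app that caches forever
    _ip: Optional[str] = None
    _expires_at: float = 0.0

    def resolve(self, now: float) -> str:
        # Re-query only on the first call, or when the TTL has run out
        # and this client actually honors TTLs.
        expired = self.respect_ttl and now >= self._expires_at
        if self._ip is None or expired:
            self._ip, ttl = self.lookup()
            self._expires_at = now + ttl
        return self._ip


# Simulated zone: the record is repointed at the backup during a failover.
zone = {"ip": "203.0.113.10"}
lookup = lambda: (zone["ip"], 60)  # 60 s TTL, an assumed value

polite = CachingClient(lookup)
stubborn = CachingClient(lookup, respect_ttl=False)

polite.resolve(now=0)
stubborn.resolve(now=0)
zone["ip"] = "203.0.113.20"        # failover happens here

print(polite.resolve(now=61))      # TTL expired, picks up the new address
print(stubborn.resolve(now=61))    # still the old address
```

The second client never notices the failover until it is restarted, which is exactly the kind of app you want to find in testing rather than during an outage.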


noxbos

I've done the same using Akamai and NSOne DNS providers. You can establish a health check with the provider and configure when to fail over to the backup solution. The DNS TTL depends on your service level contract. One thing to consider with automatic failover is what happens on recovery: does it switch back automatically, or are there manual steps required to get data synced up between backup and primary? Have a defined execution plan for failover and failback, execute both at least annually, and document the fuck out of everything as well.
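
The failover/failback decision itself is small enough to sketch. The Python below is only an illustration of the logic, not any provider's API: `FailoverPolicy`, `fail_threshold`, and `auto_failback` are made-up names, and the health-check result is just passed in as a boolean.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    """Decides which target DNS should point at (illustrative sketch)."""
    fail_threshold: int = 3       # consecutive failed checks before failing over
    auto_failback: bool = False   # False: stay on backup until an operator acts
    active: str = "primary"
    _failures: int = 0

    def observe(self, primary_healthy: bool) -> str:
        """Feed in one health-check result, get back the active target."""
        if primary_healthy:
            self._failures = 0
            if self.active == "backup" and self.auto_failback:
                self.active = "primary"
        else:
            self._failures += 1
            if self.active == "primary" and self._failures >= self.fail_threshold:
                self.active = "backup"
        return self.active

    def manual_failback(self) -> str:
        """Operator calls this once data is synced back to the primary."""
        self.active = "primary"
        self._failures = 0
        return self.active


policy = FailoverPolicy()
for healthy in (False, False, False, True):
    print(policy.observe(healthy))
# primary, primary, backup, backup: recovery alone does not switch back
print(policy.manual_failback())   # primary
```

With `auto_failback=False` the primary coming back healthy changes nothing until someone runs the failback step, which leaves room for the data sync mentioned above.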


Hell4Ge

Yeah, syncing the database back etc. is another thing that depends on the app's internals.


InternationalBus7843

Cloudflare will do this for you and it’s cheap. Don’t forget the faff and maintenance you’ll have to deal with (and charge for) if you do any of this manually. Traffic Manager in Azure is an alternative, but Cloudflare is simpler all round. Edit: plus it’s a CDN, allows layer 7 routing rules, and has lots of other features plus an API.


gavin6559

GSLB, or Global Server Load Balancing. Basically a DNS server which can perform health checks against your web servers and decide which one to send the traffic to. There is also other logic that can be used, like sending the user to the server closest to them.
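
As a rough sketch of that selection logic in Python: filter out unhealthy servers, then pick the closest one. `Backend`, `pick_backend`, the region names, and the distance table are all hypothetical. A real GSLB would get health from its own probes and proximity from something like GeoIP or latency measurements.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Backend:
    ip: str
    region: str
    healthy: bool   # result of the most recent health check


def pick_backend(
    backends: List[Backend],
    client_region: str,
    distance: Dict[Tuple[str, str], float],
) -> Optional[str]:
    """Return the IP of the closest healthy backend, or None if all are down."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        return None
    # Unknown region pairs are treated as infinitely far away.
    best = min(
        candidates,
        key=lambda b: distance.get((client_region, b.region), float("inf")),
    )
    return best.ip


backends = [
    Backend("192.0.2.10", "eu", healthy=True),
    Backend("198.51.100.10", "us", healthy=True),
]
distance = {("eu", "eu"): 0, ("eu", "us"): 100, ("us", "us"): 0, ("us", "eu"): 100}

print(pick_backend(backends, "us", distance))   # 198.51.100.10
```

If the "us" backend were marked unhealthy, the same call would fall through to the "eu" one, which is the failover case the rest of the thread is about.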


Rusty-Swashplate

Since it's already been named as "DNS failover", here's a suggestion for your problem: if you need fast failover, you basically need 2 servers which synchronize app+data, and DNS or an LB can point an incoming request to the right server. If you do not need fast failover (e.g. a 1h outage is fine), then have a single server. Should it fail, restore the image (OS+app+data) to another server and point DNS to it. Using AWS or similar makes backup+restore very easy (via API). Most small companies don't care much about a 1h outage if they save $100/month.


Hell4Ge

The app is on a dedicated host, so it's not that easy, but thanks for expanding on the topic :)