|
Knightrous
Site Admin
Joined: 15 Jun 2004
Posts: 8511
Location: NSW
|
Disaster Recovery Plan
This is more related to work, and I know a few people on here will be knowledgeable. At the moment I'm pretty sure we don't have a brilliant Disaster Recovery Plan at work, and since it kinda lies in my lap, I want to look at working towards a solution.
At the moment we have 8 locations, with around 60-70 users who all connect back to HQ via VPNs to our mini data center of 4 servers (domain controller, database server and 2 terminal servers). Backups of the whole servers are done daily to magnetic tape and stored at our parent company.
Pretty much, if the HQ gets burnt to the ground, we're in some trouble, as the remote locations won't be able to function without us. I'm trying to find a simple way that, if we lost the servers in a fire, we can restore everything back onto new hardware ASAP. I've seen some information on Symantec Backup Exec System Recovery, which is supposed to take cold images of the servers and be able to copy it all onto new hardware without driver failures, hardware mismatches, recreation of all security groups, etc.
I've also thought about the possibility of using a data center, having everything hosted externally.
A little lost at where to go; it's a fair bit bigger than my disaster recovery plan for at home, which consists of SFA _________________ https://www.halfdonethings.com/
|
Wed Oct 14, 2009 3:25 pm |
|
Nick
Experienced Roboteer
Joined: 16 Jun 2004
Posts: 11802
Location: Sydney, NSW
|
DR is one of my main jobs for a 200+ server data centre. I love BESR and can walk you through any scenarios. Version 8.5 is pretty slick, but I wouldn't recommend it for backing up vast amounts of data. The best combination is to use BESR on the system partition and back up databases and large file systems to tape. For recovery, you can restore the system drive to new hardware and get the big data back off tape.
You also need to have a way to move the BESR images off site; I like to use one server as a repository and have all the other servers create images into separate shares to keep things neat. I then back up all that to tape and send it off-site. BESR can also use FTP to make off-site backups directly if you have some bandwidth.
BESR can keep multiple backups in not much space by doing a full image and then doing much smaller incremental backups - I can keep a month's worth of weekly backups in hardly any more space than one full image. For system backups, there's not much point keeping more than a month.
Don't believe all the hype about driver replacement - it's pretty good, but you absolutely have to test it out to be sure! I have had it fail on occasion, and it's not fun to tell the boss that you have the data but it's not usable.
What's your main tape backup program? I might be able to suggest something for that as well.
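To put rough numbers on the full-plus-incremental point, here is a small Python sketch. The sizes (a 40 GB system image, 2 GB of change per week) are made-up figures for illustration, not measurements from BESR, and the function names are hypothetical.

```python
def chain_size_gb(full_gb, incremental_gb, incrementals):
    """Space used by one full image plus a number of incrementals."""
    return full_gb + incremental_gb * incrementals


def full_only_size_gb(full_gb, copies):
    """Space used if every restore point were a separate full image."""
    return full_gb * copies


if __name__ == "__main__":
    # Assumed figures: 4 weekly restore points, 40 GB image, 2 GB weekly change.
    weekly_points = 4
    chain = chain_size_gb(40, 2, weekly_points - 1)
    fulls = full_only_size_gb(40, weekly_points)
    print(f"full + incrementals: {chain} GB, four full images: {fulls} GB")
```

With those assumed figures the chain comes to 46 GB against 160 GB for four full images; the real ratio depends entirely on how much the system partition changes between backups.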
Most DR plans do an OK job of backing the data up, but a poor job with getting everything going again. At the last place I worked, the DR plan largely consisted of begging HP to deliver 20 servers on the next day - like that would ever happen!
I don't know if your management has any budget in mind, but if you can twist their arm, I would get a DR-only server that sits off-site (either switched off in a crate or live in a hosted data centre). You spec some fast disks and a huge amount of RAM then set it up as a virtual server host. If the main site goes down you can convert the BESR images directly to virtual servers (works for VMware, not sure about Hyper-V).
That's about as much as I can fit into a forum post, but happy to take it up in emails.
|
Wed Oct 14, 2009 6:31 pm |
|
Valen
Experienced Roboteer
Joined: 07 Jul 2004
Posts: 4436
Location: Sydney
|
I'm with brett on this one. In non-performance-critical systems (i.e. you aren't "bound" by anything; CPU or disk I/O isn't pegged) I virtualise the servers.
You then snapshot them and rsync the snapshots, or just raw disk images, to the backup site.
If your head office splodificates, boot the VMs up at the backup site, do some creative routing and you're set.
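The snapshot-and-rsync step could be scripted along these lines. This is only a sketch: the image path, host name and target directory are hypothetical, and it assumes the snapshot or disk image has already been taken in a consistent state (guest shut down or snapshotted by the hypervisor) before it is copied.

```python
import subprocess


def build_rsync_command(image_path, remote_host, remote_dir, compress=True):
    """Build the rsync argument list for pushing one disk image off-site."""
    cmd = [
        "rsync",
        "--archive",   # preserve permissions, times, etc.
        "--partial",   # keep partly transferred files so a retry can resume
        "--inplace",   # update the existing remote image rather than rewriting it
    ]
    if compress:
        cmd.append("--compress")  # trade CPU for bandwidth on a slow link
    cmd += [image_path, f"{remote_host}:{remote_dir}/"]
    return cmd


def push_image(image_path, remote_host, remote_dir):
    """Run rsync and raise CalledProcessError if the transfer fails."""
    subprocess.run(build_rsync_command(image_path, remote_host, remote_dir), check=True)


if __name__ == "__main__":
    # Hypothetical names; only prints the command, does not run it.
    print(" ".join(build_rsync_command(
        "/var/lib/vm/dc01.img", "backup.example.com", "/srv/dr-images")))
```

Because rsync only sends the parts of the image that changed, the nightly transfer is usually much smaller than the image itself, which is what makes this workable over a VPN link.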
If you want more "rapid" recovery or a more up-to-date "backup" (assuming you have a decent internet connection), you can use something like DRBD to run what is effectively a RAID mirror to the backup site. Depending on your needs, you can decide whether a write blocks until it's confirmed as having arrived at the backup site, until it's sent, or just until it's saved locally.
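The trade-off between those three choices is mostly about write latency versus how much data you can lose. The toy model below is an assumption-laden sketch, not DRBD's actual behaviour or configuration: the mode names are my own labels, and it simply assumes the local write and the replication happen in parallel.

```python
from enum import Enum


class AckMode(Enum):
    LOCAL_ONLY = "local"          # return once the local disk has the write
    SENT = "sent"                 # return once it is also handed to the network
    REMOTE_CONFIRMED = "remote"   # return once the backup site confirms it


def write_latency_ms(mode, local_disk_ms, send_ms, round_trip_ms, remote_disk_ms):
    """Rough per-write latency seen by the application under each mode."""
    if mode is AckMode.LOCAL_ONLY:
        return local_disk_ms
    if mode is AckMode.SENT:
        return max(local_disk_ms, send_ms)
    # Remote confirmation costs a network round trip plus the remote disk write.
    return max(local_disk_ms, round_trip_ms + remote_disk_ms)


if __name__ == "__main__":
    # Assumed figures: 5 ms disks, 1 ms to queue the packet, 30 ms round trip.
    for mode in AckMode:
        print(mode.name, write_latency_ms(mode, 5, 1, 30, 5), "ms")
```

With those made-up numbers, waiting for remote confirmation turns a 5 ms write into a 35 ms one, which is why the link's round-trip time matters more than its raw bandwidth for the fully synchronous option.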
You then run OCFS or similar as the file system on that partition, and you can access the files from both sites simultaneously. Whilst this is good for "backup" of files, the big thing it gets you is the ability to "live migrate" your virtual machines.
Basically you can move which physical machine the guest is running on without disruption to the guest. So if you want to upgrade the physical hardware on a machine, you migrate its guests onto another machine (where they will possibly be somewhat slower, as it's now doing loads more work), do your upgrade and migrate them back, all with no downtime. (A delay of a few hundred ms is typical, but if your routing fabric can switch fast enough you won't even drop connections.)
Same goes for when the bush fire comes: migrate it on over the wire ;->
The only big issue is that you need a really phat pipe to do that, as you need to transfer the contents of the VM's RAM over the connection. Compression helps a lot there. _________________ Mechanical engineers build weapons, civil engineers build targets
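For a feel of how phat the pipe needs to be, here is a back-of-envelope Python sketch. The RAM size, link speeds and compression ratio are assumed figures, and the result is a lower bound: a real live migration also has to resend pages the guest dirties while the copy is running.

```python
def ram_transfer_seconds(ram_gib, link_mbit_per_s, compression_ratio=1.0):
    """Minimum time to push a guest's RAM over a link, ignoring overhead."""
    if link_mbit_per_s <= 0 or compression_ratio <= 0:
        raise ValueError("link speed and compression ratio must be positive")
    bits = ram_gib * 1024**3 * 8                 # GiB of RAM -> bits
    effective_bits = bits / compression_ratio    # 2.0 means data halves in size
    return effective_bits / (link_mbit_per_s * 1_000_000)


if __name__ == "__main__":
    # Assumed: a 4 GiB guest over a 10 Mbit/s WAN link and a 1 Gbit/s LAN.
    for mbit in (10, 1000):
        plain = ram_transfer_seconds(4, mbit)
        squeezed = ram_transfer_seconds(4, mbit, compression_ratio=2.0)
        print(f"{mbit} Mbit/s: {plain:.0f} s raw, {squeezed:.0f} s at 2:1 compression")
```

On those assumptions a 4 GiB guest takes the best part of an hour over 10 Mbit/s but about half a minute over gigabit, so live migration between sites only really makes sense with a fast link.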
|
Wed Oct 14, 2009 8:58 pm |
|