Thursday, 27 October 2011
A New Computer -- Backups
I'd love to find a new backup solution, but the reality is I have Bacula working reasonably consistently right now, and it's the easiest thing to get set up quickly. So I:
- Installed the bacula-client and bacula-traymonitor packages (sudo apt-get install bacula-client bacula-traymonitor)
- Copied /etc/bacula/bacula-fd.conf and /etc/bacula/tray-monitor.conf from the old laptop
- Changed the host name in both of those files
- Added my new laptop to /etc/bacula/bacula-dir.conf on the Bacula director host by copying the job definition for the old laptop and renaming it (a sketch of what that looks like follows)
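For reference, the renamed bits in bacula-dir.conf boil down to a Client resource and a Job that points at it. This is only a sketch with placeholder names, address, and password, not my actual configuration:
Client {
  Name = newlaptop-fd                 # placeholder name for the new laptop's file daemon
  Address = newlaptop.example.com     # placeholder FQDN
  FDPort = 9102
  Catalog = MyCatalog
  Password = "Client-Password"        # must match the Director password in the laptop's bacula-fd.conf
  AutoPrune = yes
}
Job {
  Name = "BackupNewLaptop"            # placeholder job name
  Client = newlaptop-fd
  JobDefs = "DefaultJob"              # assumes a JobDefs resource like the one in the stock config
  FileSet = "Full Set"
}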
Sunday, 14 November 2010
Configuring Bacula Tray Monitor on Ubuntu
I use Bacula to back up my servers and desktop/laptop computers. It's always bugged me that I didn't have a little icon on my Ubuntu desktop showing the status of the backup: whether it was running or not and some indication of progress. Most backup systems have this. In Bacula it's called the tray monitor. The configuration file documentation seemed straightforward, but it took a lot of fiddling to get it right.
I think I have a fairly typical situation:
- A backup server with a direct attached backup storage device (in my case, two: a USB-connected 1 TB hard drive, and a DAT-72 tape drive)
- Several clients being backed up on a regular schedule
- One client is the laptop I use as my normal workstation. This is the one I want to put the tray monitor on
- I'm already successfully backing up this configuration, so all my passwords in my Bacula configuration files are correct, and all my firewalls are configured to allow the backup to work
- The laptop and the backup server are both running Ubuntu 10.04
- I installed the tray monitor software on my laptop:
sudo apt-get install bacula-traymonitor
- On my laptop I changed the tray monitor configuration file (/etc/bacula/tray-monitor.conf) to look like this:
Monitor {
  Name = backup02-mon
  Password = "Monitor-Password"
  RefreshInterval = 5 seconds
}
Client {
  Name = pacal-mon
  Address = pacal.pender.jadesystems.ca
  FDPort = 9102
  Password = "Monitor-Password"
}
- Still on the laptop, I added the following to the file daemon, aka backup client, configuration file (/etc/bacula/bacula-fd.conf):
# Restricted Director, used by tray-monitor to get the
# status of the file daemon
Director {
  Name = backup02-mon
  Password = "Monitor-Password"
  Monitor = yes
}
- I restarted the file daemon on the laptop (don't forget this or you'll confuse yourself horribly):
sudo service bacula-fd restart
- On the backup server, I added the following to the director configuration file (/etc/bacula/bacula-dir.conf):
# Restricted console used by tray-monitor to get the status of the director
Console {
  Name = backup02-mon
  Password = "Monitor-Password"
  CommandACL = status, .status
}
- Finally, I reloaded the configuration file on the backup server:
sudo bconsole
reload
exit
- Now all I had to do was start the tray monitor. The command line is:
bacula-tray-monitor -c /etc/bacula/tray-monitor.conf
- Select System-> Preferences-> Main Menu
- Select "System Tools" on the left side of the window
- Click on the "New Item" button on the right side of the window
- Fill in the "Name:" box with "Bacula Tray Monitor" and the "Command:" box with the command line above
- Click "OK"
- Click "Close" in the "Main Menu" window
- I used a separate password specifically for the monitor. The tray monitor's configuration file has to be readable by an ordinary user without special privileges. So anyone can see the password. Don't use the same password for the monitor as you use for the director or the file daemons, or you'll be making it easy for anyone who gets access to your computer to read all the files on your network.
- You have to change the above bits of configuration to match your particular setup. Change "pacal.pender.jadesystems.ca" to the fully qualified domain name of the computer on which you're installing the tray monitor, and change "Monitor-Password" to something more secure that everyone who reads this blog doesn't know about.
- "backup02-mon" and "pacal-mon" are both names you can change to be anything you want them to be. In my case, "backup02-mon" means the monitor on the backup server (hostname: backup02), and "pacal-mon" means the monitor on my laptop (hostname: pacal)
Monday, 21 April 2008
Tape Rotation with Bacula
I love the topic of backups. I say that because it's IT's dirty secret. No one should keep data in one place only, yet it's very difficult to set up a backup solution. Different organizations have different needs, and so backup software has to provide a lot of options. But the need for options means when you just want to get basic backup running quickly, it's a challenge.
This post is part of a series about rolling your own backup solution. There are other ways to do it, but I wanted to do my own solution one more time...
I'm backing up a Windows XP desktop and a Windows XP laptop, a Dell SC440 which is the VMWare host, plus a number of Linux VMs that provide my basic infrastructure: DNS, DHCP, file server, Subversion server, test platforms for software development, and the backup server itself.
I chose tape in part because I can take the backup off-site. I'll take a tape off-site once a week. That means I might lose a week's worth of work if my house burns down, but I'm not ready to invest in the time and effort to swap tapes every day, either.
The Bacula documentation has a good section on backup strategies, but none of them include mine. I'll have to figure it out myself.
Bacula manages tapes in a tape pool. A pool is just a group of tapes. (Bacula calls tapes "volumes".) I want to let Bacula fill up one tape per week before it uses another, which is the default behaviour. At the end of the week, I want to eject the tape and use another. I'll let Bacula automatically recycle the tapes, meaning that after a week (in my case), Bacula will reuse a tape, overwriting the old backups on it.
Anyway, I started with a rotation to do a full backup Sunday night, incremental backups all week, and then eject the tape Saturday night after the last incremental. With three tapes I would always have last week's tape off site, except on Sunday.
I had only just got started when I realized that's a lot of tape wear, given that the off-site swap happens once a week and that I have a fair bit of disk space on my main server. So my next idea is:
Take a full backup Monday night to disk, and incrementals up to Sunday night. Then, Monday morning write the whole disk volume to tape and take it off-site. That way I only run the tape once a week, and hopefully in a scenario that minimizes the chance of shoe-shining. I'll write the data to disk without compression, and let hardware compression compress the data to tape.
This also has the nice property that last week's backups are also on the disk (if I have enough disk space), so if I need a file I can get it from disk rather than retrieving the tape.
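A minimal sketch of the director resources this plan implies -- the pool and schedule names are mine, the retention is a guess, and the Monday-morning copy from disk to tape isn't shown here:
Pool {
  Name = WeeklyDisk            # placeholder name for the disk-based pool
  Pool Type = Backup
  Recycle = yes                # let Bacula reuse (overwrite) old volumes
  AutoPrune = yes
  Volume Retention = 6 days    # old enough to be recycled the following week
  Label Format = "Disk-"
}
Schedule {
  Name = "WeeklyDiskCycle"
  Run = Level=Full Pool=WeeklyDisk mon at 23:05
  Run = Level=Incremental Pool=WeeklyDisk tue-sat at 23:05
  Run = Level=Incremental Pool=WeeklyDisk sun at 23:05
}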
Sunday, 13 April 2008
Bacula Catalog Job and MySQL
To make the Bacula catalog job work:
- Edit /etc/bacula/bacula-dir.conf on the backup server
- Change where it says -u ... -p ... to -u bacula (the password will instead come from ~bacula/.my.cnf)
- Edit ~bacula/.my.cnf and put this:
[client]
password=your_secret_password
- chmod 400 ~bacula/.my.cnf ; chown bacula:bacula ~bacula/.my.cnf
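A quick way to check that the credentials in ~bacula/.my.cnf actually work (assuming the catalog database is also named bacula, which is the usual default):
sudo -H -u bacula mysqldump -u bacula bacula > /dev/null && echo "catalog dump OK"
The -H makes sudo set HOME to the bacula user's home directory, so mysqldump picks up the .my.cnf you just created.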
Bacula Notes
The Bacula documentation is good, but given the complex and interdependent nature of the backup problem, it's pretty overwhelming at first.
One thing that's not immediately obvious is where the configuration files are. The bacula-fd.conf file for Windows XP clients is at C:\Documents and Settings\All Users\bacula\bacula-fd.conf. On Ubuntu using the packages installed from universe, the configuration files are in /etc/bacula.
If you get errors that the server can't connect to the client, make sure the director definition in the client's bacula-fd.conf allows the director to connect, and that the client's password matches the server's password in the client resource of /etc/bacula/bacula-dir.conf. There's a helpful picture of what you need to do in the Bacula documentation.
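To make the password matching concrete, here's a sketch of the two resources that have to agree -- the names and password are placeholders, not anything from my configuration:
# In /etc/bacula/bacula-dir.conf on the backup server:
Client {
  Name = myclient-fd
  Address = myclient.example.com
  Catalog = MyCatalog
  Password = "Shared-Client-Password"   # what the director presents to this client
}
# In bacula-fd.conf on the client:
Director {
  Name = backup-dir                     # must match the Name of the Director resource in bacula-dir.conf
  Password = "Shared-Client-Password"   # must be the same string as above
}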
Friday, 11 April 2008
Accessing a SCSI Tape Drive from a VM
I ordered my Dell SC440 with an internal DAT tape drive. lsscsi reports it as a Seagate DAT72-052. I'm pretty sure that the Ubuntu 6.06 installation picked it up automatically -- I flailed around a bit to get this working but I don't think at the end of the day that I did anything on the host to get the tape drive working.
I'm creating a VM to run my backup. For large installations you won't want to do this, but for me I see no reason not to. And a big part of the reason I'm doing this is to see what's possible, so onward.
To enable the tape on a VM, you have to shut down the VM. Then, in the VMWare Console select VM > Settings > Hardware > Generic SCSI, and specify the physical device to connect to. In my case it was /dev/sg0. You also have to specify the controller and target for the tape drive.
I had no idea what the controller and target were, so on the VMWare host, I did:
sudo apt-get install lsscsi
lsscsi -c
and got:
Attached devices:
Host: scsi1 Channel: 00 Target: 06 Lun: 00
  Vendor: SEAGATE Model: DAT DAT72-052 Rev: A16E
  Type: Sequential-Access ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Target: 00 Lun: 00
  Vendor: ATA Model: WDC WD1600YS-18S Rev: 20.0
  Type: Direct-Access ANSI SCSI revision: 05
I took the channel as the controller: 0, and the target: 6. I entered all that into the VMWare Console and clicked enough okays to get out of the configuration. (I couldn't find the link in VMWare's on-line documentation for configuring generic SCSI devices, but if you type "SCSI" in the "Index" tab of the VMWare Console's help window you can find slightly more detailed instructions.)
When I started the VM, I got a message that said, among other things: "Insufficient permissions to access the file." Since it looked like everything else was right, I did ls -l /dev/sg0 on the VMWare host (not the VM) and got:
crw-rw---- 1 root tape 21, 0 2008-03-23 17:23 /dev/sg0
Since VMWare was running as user vmware, I added the vmware user to the tape group:
sudo adduser vmware tape
Then I restarted the VM and it worked fine. It pays to read the error message closely.
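Once the VM was back up, a quick sanity check that the guest really can talk to the drive (assuming the tape shows up inside the VM as /dev/st0 and the mt utility is installed):
sudo mt -f /dev/st0 status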
Thursday, 10 April 2008
Another SCSI Package
This is useful:
sudo apt-get install lsscsi
It shows you what SCSI devices you have attached to a machine and some important values.
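For example, plain lsscsi lists each device's [host:channel:target:lun] address, vendor, model, and device node, and lsscsi -g also shows the generic /dev/sg* node, which is the one VMWare's generic SCSI pass-through wants:
lsscsi
lsscsi -g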
Wednesday, 9 April 2008
Installing Bacula
To install bacula with MySQL (after adding the universe repositories -- see the "Bacula: Backups" post below):
sudo apt-get install mysql-server bacula-director-mysql
Then you have to set up exim4, the mail system. Choose:
mail sent by smarthost; no local mail
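If the installer doesn't prompt you, or you want to change the answer later, the same exim4 questions can be re-run with the standard Debian/Ubuntu tool (nothing Bacula-specific):
sudo dpkg-reconfigure exim4-config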
After you install the MySQL version of the bacula director, you can install the rest of bacula this way, and also install some recommended packages:
sudo apt-get install bacula
sudo apt-get install dds2tar scsitools sg3-utils
I had these notes from an earlier set-up of exim4:
Look into setting up /etc/aliases later to redirect mail to more useful places. Also, make sure the domain of the outgoing address is one known to the outside world (e.g. jadesystems.ca) or the SMTP server will probably reject the message.
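A sketch of the kind of /etc/aliases redirection I have in mind -- the address here is made up, so substitute one that actually reaches you:
# /etc/aliases -- redirect root's mail (Bacula job reports often go to root) to a real mailbox
root: someone@jadesystems.ca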
Bacula: Backups
To install bacula on Ubuntu, you need to add the universe repositories to /etc/apt/sources.list. It's just a matter of uncommenting four lines:
deb http://ca.archive.ubuntu.com/ubuntu/ dapper universe
deb-src http://ca.archive.ubuntu.com/ubuntu/ dapper universe
...
deb http://security.ubuntu.com/ubuntu dapper-security universe
deb-src http://security.ubuntu.com/ubuntu dapper-security universe
Then:
sudo apt-get update
The standard install of bacula uses SQLite, which the Bacula guy reports as having problems...
Tuesday, 8 April 2008
Copying VMs
I tried copying my tiny Ubuntu VM, and it ran, except eth0 wouldn't come up, and of course the host name was wrong.
To fix eth0, you have to update /etc/iftab with the new VMWare-generated MAC address for the Ethernet interface. I added a script to the base VM in /usr/local/sbin/changemac to make it easier:
sudo vi /usr/local/sbin/changemac
And add:
#!/bin/sh
mac=`ifconfig -a | grep "HWaddr" | cut -d " " -f 11`
echo "eth0 mac $mac arp 1" > /etc/iftab
Then do:
sudo chmod u+x /usr/local/sbin/changemac
Note that you're adding the script to the "template" VM, so you'll only have to create the script once for each template you create, not each time you create a new VM.
Now you can copy the "template" VM. Make sure the "template" VM isn't running. Log in to the VMWare host, change to the directory where you have the VMs, and copy the VM:
cd /usr/local/vmware/Virtual\ Machines
sudo cp -R --preserve=permissions,owner old_VM_directory new_VM_directory
Now in the VMWare console:
- Import the new VM and start it.
- Log in at the console and run /usr/local/sbin/changemac.
- Change /etc/hostname, /etc/dhcp3/dhclient.conf, and /etc/hosts to have the host name you want for the new machine (the dhclient.conf line in question is shown after this list).
- Reboot.
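For reference, the piece of /etc/dhcp3/dhclient.conf that needs the new name is normally a single send host-name line; this is a sketch with a placeholder name:
send host-name "new-vm-name";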
If you forget to change the host name in /etc/dhcp3/dhclient.conf the first time around:
- Change it
- Type sudo date and then enter your password. This is just to make sure that sudo isn't going to prompt you for passwords
- Type sudo ifdown eth0 && sudo ifup eth0
Friday, 31 August 2007
iSCSI vs. Fibre Channel - You're Both Right
Reading another article from an expert who provides less than useful information has finally prompted me to try to provide useful guidance for IT managers of 50 to 1,000 diverse servers running a variety of applications.
iSCSI vs. fibre channel (FC) is a classic technology debate with two camps bombarding each other mercilessly with claims that one or the other is right. The reason the debate is so heated and long lived is because there isn't a right answer: there are different situations in which each one is better than the other. Here's how to figure out what's best for you:
Start with the assumption that you'll use iSCSI. It's less expensive, so if it does what you need, it should be your choice. It's less expensive at all levels: The switches and cables enjoy the economy of scale of the massive market for IP networking. You already have staff who know how to manage IP networks. You already have a stock of Cat 6 cables hanging in your server rooms or network closets.
If you have mostly commodity servers, they transfer data to and from direct-attached storage at less than gigabit speeds. Gigabit iSCSI is fine. If you have a lot of servers, you have to size the switches correctly, but you have to do that with FC as well, and the FC switch will be more expensive. Implement jumbo frames so backups go quickly.
Just because you're using iSCSI doesn't mean you're running your storage network over the same cables and switches as your data centre LAN. In fact, you probably aren't. The cost saving doesn't come from sharing the existing LAN, it comes from the lower cost per port and the reduced people cost (skill sets, training, availability of administrators in the labour market) of using the same technology. As long as your storage and general-purpose networks are not sharing the same physical network, a lot of the criticisms of iSCSI evaporate.
If you have large, specialized servers that can and do need to sustain high data transfer rates, then definitely look at FC. Be sure you're measuring (not just guessing) that you need the data transfer rates.
If you have a large farm of physical servers running a huge number of virtual machines (VMs), look at FC. My experience is that virtual machine infrastructures tend to be limited by RAM on the physical servers, but your environment may be different. You may especially want to think about how you back up your VMs. You may not need the FC performance during the day, but when backups start, watch out. It's often the only time of day when your IT infrastructure actually breaks a sweat.
You might look at a FC network between your backup media servers and backup devices, especially if you already have an FC network for one of the reasons above.
Yes, FC will give you higher data transfer rates, but only if your servers and storage devices can handle it, and few today go much beyond one gigabit. FC will guarantee low latency so your servers won't do their equivalent of "Device not ready, Abort, Retry, Ignore?"
The challenge for an IT manager, even (or especially) those like me who have a strong technical background, is that it's easy to get talked into spending too much money because you might need the performance or low latency. The problem with that thinking is that you spend too much money on your storage network, and you don't have the money left over to, for example, mirror your storage, which may be far more valuable to your business.
A final warning: neither technology is as easy to deal with as the vendor would have you believe (no really?). Both will give you headaches for some reason along the way. If it wasn't hard, we wouldn't get the big bucks, would we?
Friday, 27 April 2007
Virtualization: There's Gotta be a Catch
Virtualization solves lots of problems for many if not most organizations that have more than a rack of servers. On an earlier assignment I calculated a worst-case saving of C$380 per month for a virtual server over a physical server (using ESX 2.5 and 3.0 from VMWare). But there's a catch to virtualization, and that catch is backups.
Virtualization introduces wrinkles on your backup approach. Fortunately, to start off you're probably okay doing your backups the same way you always have been. The backup wrinkles are not enough to stop you from embarking on virtualization.
Here are some of the things you need to watch for as you add virtual machines (VMs) to your virtualization platform:
- Do you have highly-tuned start and stop times for your backup jobs, for example when you have inter-dependencies between external events and your backup jobs?
- Do the servers you plan to virtualize have file server-like data, in other words, does it consist of a lot of small files that mostly don't change?
- If you had a fire in your data centre today, before virtualizing, how soon would you have to have all the servers rebuilt?
- Is your backup infrastructure really only being used to half capacity or less?
As you add VMs to your infrastructure, you may run into decreasing backup performance. The reason: many servers today are at their busiest during their backup. You may be able to run 20 VMs comfortably on one physical server, but if you try to back up all those VMs at once you'll run into bottlenecks because the physical server has a certain number of network interfaces, and all the data is coming from the same storage device, or at least through the same storage interface. Again, the solution is to watch backup performance as you virtualize and make adjustments.
Be aware that you might have to make changes to your backup infrastructure to deal with real challenges of backup performance introduced by virtualization. If your backups are already a problem, you might want to look into this in more detail. (The problems and solutions are beyond the scope of this post.)
How long do you have to rebuild servers after a data centre fire? You may not have even thought of the problem (don't be embarrassed. Many of us haven't). With virtualization you have to think about it because an equivalent event is more likely to happen: The storage device that holds all your VMs may lose its data, and you're faced with rebuilding all your VMs. I have second-hand experience (e.g. the guys down the street) with storage devices eating all the VMs, but I've never directly known anyone who had a serious fire in the data centre.
If the backups on your physical servers can be restored to bare metal, then you don't have to worry about your storage device eating the VMs. You may have to make some changes to your bare-metal backup -- I have no experience with that topic so I don't know for sure -- but once you do you should be able to restore your VMs relatively quickly.
If you can't or don't have backups that can be restored to bare metal, then you have a challenge. I doubt that most general purpose data centres are full of identically configured servers, with detailed rebuild procedures and air-tight configuration management so every server can be rebuilt exactly like the one that was running before. If you had to rebuild 200 VMs from installation disks, you'd probably be working a lot of long nights.
If most of your servers that you plan to virtualize have database-like data (large files that change every day), I'd recommend looking at changing your backup approach for those servers to a product like ESX Ranger, or look for some of the user-built solutions on the Internet. These products will back up the entire virtual machine every time they run, and may not allow individual file (within the VM) restores. However, for a database server you're probably backing up the whole server every night anyway, so that won't be a significant change to your backup workload.
If you want to virtualize file server-like servers, there isn't really a good solution that I or anyone I know has found at this time. If your backup infrastructure has enough room to take the additional load, simply back up with ESX Ranger or one of the other solutions once a week (or however frequently you do a full backup), along with your current full and incremental backup schedule. If you have to rebuild the VM, you restore the most recent ESX Ranger backup first. If you just have to restore files on the server, because a user deleted an important document, for example, just use the regular backups.
If you have the budget to change backup infrastructures, ESX Ranger can probably provide a pretty good overall solution. However, you have to provide backup and restore for physical servers as well, so the staff who do restores have to be able to deal with two backup systems.
One final gotcha that I've run across: There are some great devices out there from companies like Data Domain that provide excellent compression of exactly the type of data you're backing up when you back up an entire VM. Unfortunately, ESX Ranger compresses the data too, which messes up the storage device's compression. Whatever solution you put together, make sure your vendor commits to performance targets based on the entire solution, not on individual products.
As with so much of what we do in IT, it's really hard to summarize everything in a way that makes sense in a blog post. Comment on this post if you'd like more details or reasons why I make the recommendations I make.