In 2006, I was project manager on a VMware implementation for a health care organization. We virtualized 200 servers in six weeks, after a planning phase of about 2 months. Out of that experience I wondered, "Did virtualization have anything to offer a smaller business?" So I set up a box at home and converted my home "data centre" into a virtualized data centre using VMware's Server product, which was the free product at the time.
After five years it's been an interesting experience and I've learned a lot. At the end of the day, I'm pretty convinced that a small business with a few servers running in an office closet doesn't have a lot to gain from virtualizing within the "closet". (I'm still a big fan of virtualization in a medium or large organization.) I'm going to switch back to running all my basic services (backup, file share, DNS, DHCP, NTP) on a single physical server.
I had one experience where the VM approach benefited me: As newer desktops and laptops came into the house, the version of the backup client installed on them by default was newer than the backup master on my backup server (I use Bacula). Rather than play around with installing and updating different versions of the backup client or master, I simply upgraded the backup master VM to a new version of Ubuntu and got the newer version of Bacula. I didn't have to worry about what other parts of my infrastructure I was going to affect by doing the upgrade.
The downside was that I spent a lot of time fooling around with VMware to make it work. Most kernel upgrades required a recompile of the VMware tools on each VM, which was a pain. I also spent a fair bit of time working through a timekeeping issue between the guests and the VMware host that periodically caused my VMs to slow to a crawl.
Connecting to the web management interface and console plug-in always seemed to be a bit of a black art, and it got worse over time. Even now, I don't think modern versions of Firefox can connect to a running VM's console, so I keep an old version around for when I need to work on a VM's console (before ssh comes up).
My set-up wasn't very robust in the face of power failures. When the power went off, the VMs would leave their lock files behind. Then, when the power came back, the physical machine would restart but the VMs wouldn't. I would have to go in by hand and clean up the lock files. And often I wouldn't even know there'd been a power failure, so I'd waste a bit of time trying to figure out what was wrong. I should have had a UPS, but that wouldn't solve all the instances where something would crash leaving a lock file behind.
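I never automated that cleanup, but a boot-time script could have. Here's a sketch, demonstrated against a throwaway directory -- the VM_ROOT path and the *.lck naming are assumptions about VMware Server's on-disk layout, so check them against your own installation before using anything like this:

```shell
#!/bin/sh
# Sketch: clean up stale VMware lock files after a crash or power failure.
# Demonstrated against a throwaway directory; in real use point VM_ROOT at
# the directory holding your VMs.
VM_ROOT=$(mktemp -d)

# Simulate a VM directory with a leftover lock from a crash.
mkdir -p "$VM_ROOT/myvm"
touch "$VM_ROOT/myvm/myvm.vmx"
touch "$VM_ROOT/myvm/myvm.vmdk.lck"

# The cleanup itself: remove every *.lck entry under the VM root.
find "$VM_ROOT" -depth -name '*.lck' -exec rm -rf {} \;

ls "$VM_ROOT/myvm"
```

After the cleanup the VM's own files are untouched and only the stale locks are gone, so VMware can start the VMs again.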
All in all, and even if I had automated some of that, the extra level of complexity didn't buy me anything. In fact, it cost me a lot of time.
Some of these problems would have been solved by using the ESX family of VMware products, but the license fees guarantee that the economics don't work for a small business.
I originally started out planning to give Xen a try, but it turned out not to work with the current (at the time) version of Ubuntu. Today I would try KVM. I played around with it a bit last year and it looked fine for a server VM platform. I needed better USB support, so I switched to VirtualBox. VirtualBox worked fine for me to run the Windows XP VM I used to need to run my accounting program, but it has the free version/enterprise version split that makes me uncomfortable for business use.
So my next home IT project will be to move everything back to a simpler, non-virtualized platform. I'll still keep virtualization around for my sandbox. It's been great to be able to spin up a VM to run, say, an instance of Drupal to test upgrades before rolling out to my web site, for example, or to try out Wordpress, or anything else I need to try.
My blog posts about interesting steps along the virtualization road are here.
Friday, 11 November 2011
Sunday, 18 July 2010
Can't Run VMWare Server 2 Management Interface
I filled the disk on my VMware Server 2 host, which caused all sorts of grief. Part of the grief was that I couldn't get to the management interface at https://vmhost:8333/ui. I solved that problem by killing the VMware hostd process (after freeing up some space on the disk):
- Look up the process ID: ps -ea | grep hostd
- Kill the process: sudo kill <pid> (the process ID from the previous step)
- Remove the old lock file: sudo rm /var/run/vmware/vmware-hostd.PID
- Restart VMWare management: sudo /etc/init.d/vmware-mgmt restart
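The sequence above can be scripted. Here's a sketch demonstrated against a dummy background process and a throwaway lock file, since killing the real vmware-hostd needs root and a VMware install -- on the real host you'd substitute the hostd process, the /var/run/vmware/vmware-hostd.PID lock file, and add sudo:

```shell
#!/bin/sh
# Sketch of the recovery steps, using stand-ins for the real pieces.
lock=$(mktemp)              # stands in for /var/run/vmware/vmware-hostd.PID
sleep 60 &                  # stands in for the wedged hostd process
pid=$!
echo "$pid" > "$lock"

kill "$pid"                 # step 2: kill the process
rm -f "$lock"               # step 3: remove the stale lock file
# step 4 on the real host: sudo /etc/init.d/vmware-mgmt restart

wait "$pid" 2>/dev/null || true
echo "cleaned up"
```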
Thursday, 14 May 2009
Ubuntu VMs and Time
Ubuntu 8.04 guests under VMware Server 2.0.1 need the kernel parameter "clocksource=acpi_pm" when they boot. Edit /boot/grub/menu.lst and add "clocksource=acpi_pm" to the end of the line that starts "# kopt=". Don't remove the "#"; in this context it's not a comment. After you save the file, run "sudo update-grub" and then reboot.
If the guest doesn't have this kernel parameter specified, time runs backwards on the guest O/S, or time hangs or gets stuck. This was showing up especially on my backup server running bacula, and the backup clients that had large amounts of data. I suspect that high loads exacerbate the problem.
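The menu.lst edit can be scripted. Here's a sketch run against a made-up sample fragment -- on a real guest the target is /boot/grub/menu.lst, followed by "sudo update-grub" and a reboot:

```shell
# Append clocksource=acpi_pm to the "# kopt=" line of a sample menu.lst.
# The sample content below is invented for the demo.
f=$(mktemp)
cat > "$f" <<'EOF'
## additional kernel options
# kopt=root=/dev/sda1 ro
title Ubuntu 8.04, kernel 2.6.24-16-server
EOF

# Note the leading "#" stays: update-grub parses that line, it's not a comment.
sed -i '/^# kopt=/ s/$/ clocksource=acpi_pm/' "$f"

grep '^# kopt=' "$f"
```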
Monday, 6 April 2009
VMware Server 2 and Ubuntu 8.04 LTS
Some irritating behaviour that I couldn't resolve pushed me into the big upgrade of my virtual infrastructure. I was running VMware Server 1.0.5 on Ubuntu 6.06 on a Dell SC440. Periodically all the VMs would just lock up: they would freeze for about five minutes, and even the time (as reported by date) would fall back by five minutes. The host ran just fine and reported that nothing was happening (e.g. top(1) showed 100% idle).
I was having to endure too many frantic calls from my son that he couldn't get to lego.com, so it was time to do something.
Some Internet research turned up that I would need VMware Server 1.0.6 at a minimum for Ubuntu 8.04, so that meant I would need to do VMware first, and therefore go to VMware Server 2.
The upgrade to VMware Server 2 went fairly smoothly, but I had a couple of problems that sucked far more time than the solution eventually warranted:
- The management interface didn't work. When I connected to vmhost:8222 I got the grey VMware background, but no login window. I solved it by some combination of restarting the VMware management server on the host (sudo /etc/init.d/vmware-mgmt restart) and clearing the cache in Firefox.
- Once the management interface was up, I couldn't open the console on many of my VMs. The error window told me to look at log files and report the problem. The VMs were at version 4 of the virtual "hardware"; I upgraded them to the current version (7, I think) from the management interface, and was then able to open the console.
- When I was installing VMware Tools, I was unable to bring the network interfaces back up. I spent much time trying to figure out what was wrong with the tools before I realized it was as simple as my DHCP server having gone away. My DHCP server is in a VM, but I don't know whether anything about the upgrade shut down the DHCP server software.
P.S. The reason I decided to try the upgrade is that time handling in newer Linux kernels is much friendlier to running in a VM, whereas the Ubuntu 6.06-era kernel wasn't. I haven't been running long enough to know if the upgrade has fixed the freezing problem.
Monday, 21 April 2008
Tape Rotation with Bacula
I love the topic of backups, because it's IT's dirty secret: no one should keep data in only one place, yet it's very difficult to set up a backup solution. Different organizations have different needs, so backup software has to provide a lot of options. But all those options mean that when you just want to get basic backup running quickly, it's a challenge.
This post is part of a series about rolling your own backup solution. There are other ways to do it, but I wanted to do my own solution one more time...
I'm backing up a Windows XP desktop and a Windows XP laptop, a Dell SC440 which is the VMWare host, plus a number of Linux VMs that provide my basic infrastructure: DNS, DHCP, file server, Subversion server, test platforms for software development, and the backup server itself.
I chose tape in part because I can take the backup off-site. I'll take a tape off-site once a week. That means I might lose a week's worth of work if my house burns down, but I'm not ready to invest in the time and effort to swap tapes every day, either.
The Bacula documentation has a good section on backup strategies, but none of them include mine. I'll have to figure it out myself.
Bacula manages tapes in a tape pool. A pool is just a group of tapes. (Bacula calls tapes "volumes".) I want to let Bacula fill up one tape per week before it uses another, which is the default behaviour. At the end of the week, I want to eject the tape and use another. I'll let Bacula automatically recycle the tapes, meaning that after a week (in my case), Bacula will reuse a tape, overwriting the old backups on it.
Anyway, I started with a rotation to do a full backup Sunday night, incremental backups all week, and then eject the tape Saturday night after the last incremental. With three tapes I would always have last week's tape off site, except on Sunday.
I had only just started when I realized that's a lot of tape wear, given that the off-site swap happens once a week and I have a fair bit of disk space on my main server. So my next idea is:
Take a full backup Monday night to disk, and incrementals up to Sunday night. Then, Monday morning write the whole disk volume to tape and take it off-site. That way I only run the tape once a week, and hopefully in a scenario that minimizes the chance of shoe-shining. I'll write the data to disk without compression, and let hardware compression compress the data to tape.
This also has the nice property that last week's backups are also on the disk (if I have enough disk space), so if I need a file I can get it from disk rather than retrieving the tape.
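For what it's worth, that rotation maps fairly naturally onto Bacula's Schedule and Pool resources. This is only a sketch: the resource names, times, and retention are invented, and the directives should be checked against the Bacula manual for the version you're running.

```
# Sketch of the disk-then-tape rotation (names and times are invented).
Schedule {
  Name = "WeeklyDiskCycle"
  # Full backup to the disk pool Monday night...
  Run = Level=Full Pool=DiskPool mon at 23:05
  # ...incrementals to disk the rest of the week.
  Run = Level=Incremental Pool=DiskPool tue-sun at 23:05
}

# The disk pool recycles after a week, so last week's backups stay on disk.
Pool {
  Name = DiskPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 days
}
```

Copying the week's disk volume to tape Monday morning would then be a separate step -- later Bacula versions have Copy/Migration jobs for this, or it can be done outside Bacula entirely.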
Friday, 11 April 2008
Accessing a SCSI Tape Drive from a VM
I ordered my Dell SC440 with an internal DAT tape drive; lsscsi reports it as a Seagate DAT72-052. I'm pretty sure the Ubuntu 6.06 installation picked it up automatically -- I flailed around a bit, but I don't think I actually had to do anything on the host to get the tape drive working.
I'm creating a VM to run my backup. For large installations you won't want to do this, but for me I see no reason not to. And a big part of the reason I'm doing this is to see what's possible, so onward.
To enable the tape on a VM, you have to shut down the VM. Then, in the VMWare Console select VM > Settings > Hardware > Generic SCSI, and specify the physical device to connect to. In my case it was /dev/sg0. You also have to specify the controller and target for the tape drive.
I had no idea what the controller and target were, so on the VMWare host, I did:
sudo apt-get install lsscsi
lsscsi -c

and got:

Attached devices:
Host: scsi1 Channel: 00 Target: 06 Lun: 00
  Vendor: SEAGATE  Model: DAT DAT72-052  Rev: A16E
  Type: Sequential-Access  ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Target: 00 Lun: 00
  Vendor: ATA  Model: WDC WD1600YS-18S  Rev: 20.0
  Type: Direct-Access  ANSI SCSI revision: 05

I took the channel as the controller: 0, and the target: 6. I entered all that into the VMware Console and clicked enough okays to get out of the configuration. (I couldn't find the link in VMware's on-line documentation for configuring generic SCSI devices, but if you type "SCSI" in the "Index" tab of the VMware Console's help window you can find slightly more detailed instructions.)
When I started the VM, I got a message that said, among other things: "Insufficient permissions to access the file." Since it looked like everything else was right, I did ls -l /dev/sg0 on the VMWare host (not the VM) and got:
crw-rw---- 1 root tape 21, 0 2008-03-23 17:23 /dev/sg0

Since VMware was running as user vmware, I added the vmware user to the tape group:

sudo adduser vmware tape

Then I restarted the VM and it worked fine. It pays to read the error message closely.
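The general version of that check -- what group owns the device, and is the user in it? -- can be scripted. In this sketch /dev/null and the current user stand in for /dev/sg0 and the vmware user, since the test box won't have the tape drive:

```shell
# Check whether a user is in the group that owns a device node.
# /dev/null and the current user are stand-ins for /dev/sg0 and vmware.
dev=/dev/null
user=$(id -un)

grp=$(stat -c '%G' "$dev")
echo "device group: $grp"

if id -nG "$user" | tr ' ' '\n' | grep -qx "$grp"; then
    echo "$user is in group $grp"
else
    echo "$user is NOT in group $grp -- try: sudo adduser $user $grp"
fi
```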
Tuesday, 8 April 2008
Copying VMs
I tried copying my tiny Ubuntu VM, and it ran, except eth0 wouldn't come up, and of course the host name was wrong.
To fix eth0, you have to update /etc/iftab with the new VMWare-generated MAC address for the Ethernet interface. I added a script to the base VM in /usr/local/sbin/changemac to make it easier:
sudo vi /usr/local/sbin/changemac
And add:
#!/bin/sh
mac=`ifconfig -a | grep "HWaddr" | cut -d " " -f 11`
echo "eth0 mac $mac arp 1" > /etc/iftab
Then do:
sudo chmod u+x /usr/local/sbin/changemac
Note that you're adding the script to the "template" VM, so you'll only have to create the script once for each template, not each time you create a new VM.
Now you can copy the "template" VM. Make sure the "template" VM isn't running. Log in to the VMWare host, change to the directory where you have the VMs, and copy the VM:
cd /usr/local/vmware/Virtual\ Machines
sudo cp -R --preserve=permissions,owner old_VM_directory new_VM_directory
Now in the VMWare console:
- Import the new VM and start it.
- Log in at the console and run /usr/local/sbin/changemac.
- Change /etc/hostname, /etc/dhcp3/dhclient.conf, and /etc/hosts to have the host name you want for the new machine.
- Reboot.
If you forget to change the host name in /etc/dhcp3/dhclient.conf the first time around:
- Change it
- Type sudo date and enter your password. This is just to make sure sudo won't prompt for a password in the middle of the next step
- Type sudo ifdown eth0 && sudo ifup eth0
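The host-name changes in the list above can be scripted too. Here's a sketch demonstrated against copies of the files in a temp directory -- on a real guest you'd edit /etc/hostname, /etc/hosts, and /etc/dhcp3/dhclient.conf in place and reboot, and oldvm/newvm are placeholder names:

```shell
# Rename a cloned guest: substitute the new host name in the three files.
# Demonstrated on throwaway copies; oldvm/newvm are invented names.
old=oldvm
new=newvm
d=$(mktemp -d)

echo "$old" > "$d/hostname"
printf '127.0.0.1 localhost\n127.0.1.1 %s\n' "$old" > "$d/hosts"
printf 'send host-name "%s";\n' "$old" > "$d/dhclient.conf"

# Replace whole-word occurrences of the old name in each file.
sed -i "s/\b$old\b/$new/g" "$d/hostname" "$d/hosts" "$d/dhclient.conf"

cat "$d/hostname"
```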
Monday, 7 April 2008
Firewall on the VM Quick Reference
Here's how I set up the firewall on the VM. This is my /etc/iptables.rules:
*filter
:INPUT ACCEPT [273:55355]
:FORWARD ACCEPT [0:0]
:LOGNDROP - [0:0]
:OUTPUT ACCEPT [92376:20668252]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# Accept SSH so we can manage the VM
-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i lo -j ACCEPT
# Allow ping (Zenoss uses it to see if you're up).
-A INPUT -p icmp --icmp-type echo-request -j ACCEPT
# Allow SNMP.
-A INPUT -p udp -s 0/0 --sport 1024:65535 --dport 161:162 -j ACCEPT
# Silently block NetBIOS because we don't want to hear about Windows
-A INPUT -p udp --dport 137:139 -j DROP
-A INPUT -j LOGNDROP
# Drop and log the rest.
-A LOGNDROP -p tcp -m limit --limit 5/min -j LOG --log-prefix "Denied TCP: " --log-level 7
-A LOGNDROP -p udp -m limit --limit 5/min -j LOG --log-prefix "Denied UDP: " --log-level 7
-A LOGNDROP -p icmp -m limit --limit 5/min -j LOG --log-prefix "Denied ICMP: " --log-level 7
-A LOGNDROP -j DROP
COMMIT
More on this later.
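One way to load these rules at boot -- a common approach on Debian/Ubuntu of that era, though not necessarily what I'll end up doing -- is a small hook script that runs before the interface comes up. The path is the conventional one, an assumption here:

```
#!/bin/sh
# /etc/network/if-pre-up.d/iptables (assumed path): restore the saved
# rules before any interface comes up. Must be executable (chmod +x).
iptables-restore < /etc/iptables.rules
```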
ntp on the VM
Bringing up the firewall on the "template" VM, I noticed more ntp traffic than I expected. It turned out that, in my ignorance, I had set my local ntp server to broadcast, which I don't need. I commented out the broadcast line, and everything still works.
I also found a good post on ntp that answered one of my long-time questions: what should I look at to see whether the ntp client is actually working? Run ntpq -p. In the resulting listing, "the delay and offset values should be non-zero and the jitter value should be under 100." (The post is Red Hat based, but the information specifically about ntp is distro-agnostic.)
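That check can be automated. Here's a sketch run against a canned sample of ntpq -p output -- the server name and numbers are invented, and on a real host you'd pipe ntpq -pn into the awk instead:

```shell
# Apply the health rule (delay/offset non-zero, jitter under 100) to
# ntpq -p output. The sample output below is invented for the demo.
sample='     remote           refid      st t when poll reach   delay   offset  jitter
*ntp.example.com 10.0.0.1         2 u   33   64  377    1.234   -0.567   0.089'

result=$(echo "$sample" | awk 'NR > 1 {
    # Fields: remote refid st t when poll reach delay offset jitter
    if ($8 + 0 != 0 && $9 + 0 != 0 && $10 + 0 < 100)
        print $1 ": looks healthy"
}')
echo "$result"
```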
Sunday, 30 March 2008
SNMP on the VM
Setting up SNMP on a machine so it can be monitored by Zenoss seems to mess me up every time. This time the problem was the -i option of snmpconf. It's advertised to put the configuration file where the SNMP programs will find it, but it doesn't put it at the front of the list of paths where the programs look, at least not on Ubuntu 6.06.
The solution: don't use snmpconf -i. Run snmpconf to set the access. Make sure it matches what you've set up in Zenoss, particularly the version of SNMP and therefore the access model. When you're done, do sudo mv snmpd.conf /etc/snmp/.
Friday, 28 March 2008
SNMP
The basic VM needs to have SNMP running on it, because there's no point having a server if you're not monitoring it. I had Zenoss set up a year ago monitoring some of my computers, but I was getting "bad oid" messages on the new VM template I was setting up.
The solution: Zenoss defaults to SNMP version 1 for Linux systems, but I had set up SNMP on the new VM for version 2c. In Zenoss 2.0 I navigated to the /Devices/Server/Linux page, selected the zProperties tab, scrolled down to zSnmpVer, and set it to v2c.
Tuesday, 25 March 2008
Basic Tiny VM Part 1
The basic tiny VM needs:
- Ubuntu 6.06.1 Server (the basic install, not LAMP)
- VMTools
- SNMP so you can monitor it (I'm using Zenoss)
- ssh so you can administer it
- ntp as a client so it keeps time. For now I'll sync to my existing ntp server
- basic firewall rules that allow the above
First, make an ISO of the install CD on the VMware host so the VM can boot from it:

mount /dev/cdrom
sudo dd if=/dev/cdrom0 of=/usr/local/vmware/ISOs/Ubuntu-6.06.1.iso
The VMware Tools ISOs are in /tmp/vmware-server-distrib/lib/isoimages:
sudo cp /tmp/vmware-server-distrib/lib/isoimages/*.iso /usr/local/vmware/ISOs
Install VMTools. Here are some good instructions.
sudo apt-get install ssh ntp-simple snmpd snmp
(snmp is the package that contains snmpconf, which you need to set up snmp, and snmpwalk, which is useful for debugging.)
Configure the ntp server. I've set up an ntp server in the DNS, so I set the "server" line in /etc/ntp.conf to the following:
server ntp
And then restart ntp:
/etc/init.d/ntp-server restart
Run snmpconf to set up SNMP. That's probably a whole post in itself.
I'll do the firewall later. I've ignored my family for too long tonight.
Can't Connect to Console of VMs
I had everything built and running VMWare Server. Good. So I copied all the VMs I'd built when I was running VMWare on my desktop over to the new server. I started a few, and they were running fine. I could connect to the Zenoss console on one of them, and could ping both. However, all I got was a black screen when I tried to look at the console of the VM using VMWare Console.
The VMware documentation recommended using the version of the VMware Console program specific to the server you're running. I grumbled a bit and re-installed (which was actually quite easy), then tried viewing the console of my VMs again. I still got a black screen, but I also got an error message saying that the .vmx file had to have execute permission for the user running the console. I checked the .vmx files and, sure enough, because of the way I'd copied them everything had 0644 permissions.
So I cd'd to the directory where all the VM directories were and typed:
find . -name "*.vmx" -exec chmod u+x \{} \;
That worked because the user connecting with VMWare Console is the same one that owned all the files. You'll have to do something slightly different if that's not the case.
Now they work fine.
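The find one-liner generalizes. Here it is demonstrated against a throwaway tree (the directory and file names are invented), so you can see what it touches before running it on your real VM directory:

```shell
# Give .vmx files execute permission for their owner, leaving other
# files alone. Demonstrated on an invented throwaway tree.
d=$(mktemp -d)
mkdir -p "$d/vm1" "$d/vm2"
touch "$d/vm1/a.vmx" "$d/vm1/a.vmdk" "$d/vm2/b.vmx"
chmod 644 "$d"/vm1/* "$d"/vm2/*

# The fix from the post, scoped to the throwaway tree:
find "$d" -name '*.vmx' -exec chmod u+x {} \;

ls -l "$d/vm1"
```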
Monday, 24 March 2008
VMWare Server on Ubuntu 6.06.1
The install went smoothly. I created a user "vmware" and added it to the admin group. Then I had to:
sudo apt-get install xinetd
sudo apt-get install libx11-6 libx11-dev libxtst6 xlibs-dev
The last line was thanks to this post. Without it, it wouldn't validate my serial number (and I'm sure I would have run into other problems).
The only default I changed was to put my virtual machines in /usr/local/vmware/Virtual Machines, because /usr/local is the big partition I made for VMs.
Sunday, 23 March 2008
Virtualization So Far
As should be obvious from my recent posts, I've been trying to set up a host for virtual machines. I need to be able to try things out easily, and virtual machines are great for that. I'd also like to retire my old boxes that are running core network infrastructure. It's not so much that I want to get rid of them as that the risk of continuing to rely on them is a problem. I have an 11-year-old Macintosh Performa that's the DHCP and DNS server for my whole network; if it breaks, I'm scrambling to replace it unless I have something new built. If those services run on a computer with a 1 GB hard drive and 32 MB of memory, I should certainly be able to run them in a VM.
Anyway, being cheap, I wasn't sure I wanted to pay for VMware. They have a free version, of course, but XenSource's free version upgrades to the paid one with just a license key, whereas going from VMware Server to Virtual Infrastructure (AKA ESX) is a complete software upgrade. So I thought I'd try XenSource, especially since they seemed to be saying it could run any OS if you bought a CPU with virtualization support.
So I carefully researched the chips I was looking for and bought a Dell SC440 with an Intel Xeon 3050. A low-price server but with the right parts, or so I thought.
The install of XenSource was easy, as was the install of XenCenter, the control program on Windows. Unfortunately, there was a problem with the shortcut to install XenCenter. I posted a question on the Xen community boards and got no help; I found the solution myself a few days later, but not before noticing that there was very, very little activity on the boards. I wonder whether anyone is using Xen, or at least whether anyone is using it without paying Citrix for support.
Also, it turns out you can't run anything you want as a VM. I tried to run Ubuntu Server 6.06.1 and it gets disk errors. This is a known problem, apparently. Okay, I know it's hard to support every Linux distro, but Ubuntu should be one you support. Look at the numbers.
Anyway, worse than not supporting Ubuntu is that Citrix's answer seemed to be, "use one of our supported distros." They'll always be niche with that approach. The market for virtualization is the world of heterogeneous data centres that need to shrink their power and A/C footprint; you're not going to win that market unless you can run anything an off-the-shelf PC can run. So I decided to try VMware.
Installing a 60-day evaluation copy of ESX 3i didn't work. Neither did installing an evaluation copy of ESX 3.5, but at least it told me that the network card wasn't supported. So I tried Ubuntu 6.06.1, and the network card wasn't supported there, either. Broadcom, what are you doing releasing a NIC that doesn't work with older drivers? I found how to get Ubuntu installed, and so I'll continue with installing the free version of VMWare Server. This is not what I wanted to be doing.
I guess the lesson is you really have to check the hardware compatibility list, but I didn't even know I was going to go this path. I'm interested in how many other problems I'm going to have.
Even though I'm not yet up and running with VMware Server, I have to say it's the preferable approach. At least you have an underlying OS you can work with, and my experience with VMware elsewhere says it's going to run whatever I try to put on it. Too bad the thinner versions (ESX) don't work on my hardware.
Anyway, being cheap I wasn't sure I wanted to pay for VMWare. They have a free version of course, but XenSource's free version is a license-key upgrade, whereas VMWare Server to Virtual Infrastructure (AKA ESX) is a complete software upgrade. So I thought I'd try XenSource, especially since they seemed to be saying that they could run any OS if you bought a CPU with virtualization support.
So I carefully researched the chips I was looking for and bought a Dell SC440 with an Intel Xeon 3050: a low-priced server, but with the right parts, or so I thought.
The install of XenSource was easy, as was the install of XenCenter, the control program on Windows. Unfortunately, there was a problem with the shortcut to install XenCenter. I posted a question on the Xen community boards and got no help. I found the solution myself a few days later, but not before noticing that there was very, very little activity on the community boards. I wonder if anyone is using Xen, or at least whether anyone is using it without paying Citrix for support.
Also, it turns out you can't run anything you want as a VM. I tried to run Ubuntu Server 6.06.1 and it got disk errors. This is a known problem, apparently. Okay, I know it's hard to support every Linux distro, but Ubuntu should be one you support. Look at the numbers.
Anyway, worse than not supporting Ubuntu is that the answer from Citrix seemed to be, "use one of our supported distros." They'll always be niche if that's their approach. The market for virtualization is the world of heterogeneous data centres that need to shrink their power and A/C footprint. You're not going to get into that market unless you can run anything that an off-the-shelf PC can run. So, I decided to try VMware.
Installing a 60-day evaluation copy of ESX 3i didn't work. Neither did installing an evaluation copy of ESX 3.5, but at least it told me that the network card wasn't supported. So I tried Ubuntu 6.06.1, and the network card wasn't supported there, either. Broadcom, what are you doing releasing a NIC that doesn't work with older drivers? I found out how to get Ubuntu installed, so I'll continue with installing the free version of VMware Server. This is not what I wanted to be doing.
I guess the lesson is that you really have to check the hardware compatibility list, but I didn't even know I was going to go down this path. I'm curious how many other problems I'm going to have.
Even though I'm not yet up and running with VMware Server, I have to say that it's the preferable approach. At least you have an underlying OS you can work with, and my experience with VMware elsewhere says it's going to run whatever I try to put on it. Too bad the thinner versions (ESX) don't work on my hardware.
Friday, 27 April 2007
Virtualization: There's Gotta be a Catch
Virtualization solves lots of problems for many, if not most, organizations that have more than a rack of servers. On an earlier assignment I calculated a worst-case saving of C$380 per month for a virtual server over a physical server (using ESX 2.5 and 3.0 from VMware). But there's a catch to virtualization, and that catch is backups.
Virtualization introduces wrinkles into your backup approach. Fortunately, to start off, you're probably okay doing your backups the same way you always have. The backup wrinkles are not enough to stop you from embarking on virtualization.
Here are some of the things you need to watch for as you add virtual machines (VMs) to your virtualization platform:
- Do you have highly tuned start and stop times for your backup jobs, for example, because of dependencies between your backup jobs and external events?
- Do the servers you plan to virtualize have file server-like data, in other words, does it consist of a lot of small files that mostly don't change?
- If you had a fire in your data centre today, before virtualizing, how soon would you have to have all the servers rebuilt?
- Is your backup infrastructure really only being used to half capacity or less?
As you add VMs to your infrastructure, you may run into decreasing backup performance. The reason: many servers today are at their busiest during their backup. You may be able to run 20 VMs comfortably on one physical server, but if you try to back up all those VMs at once you'll run into bottlenecks, because the physical server has only so many network interfaces, and all the data is coming from the same storage device, or at least through the same storage interface. The solution is to watch backup performance as you virtualize and make adjustments.
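One common adjustment is simply to stagger backup start times so that only a few VMs per host are backing up at once. Here's a minimal sketch of the idea in Python; the slot counts and window lengths are made-up illustrative numbers, not recommendations:

```python
def stagger_backups(vms, concurrent_slots, window_minutes):
    """Assign each VM a start offset (in minutes after the backup
    window opens) so that at most `concurrent_slots` backups are
    hitting the same host's NICs and storage interface at once."""
    schedule = {}
    for i, vm in enumerate(vms):
        wave = i // concurrent_slots          # which batch this VM falls in
        schedule[vm] = wave * window_minutes  # batches start one window apart
    return schedule

# 20 VMs on one host, 4 at a time, 30-minute windows:
schedule = stagger_backups([f"vm{n}" for n in range(20)], 4, 30)
```

With those numbers, vm0 through vm3 start immediately, vm4 through vm7 start 30 minutes in, and the last batch starts two hours into the window. You'd tune the slot count against your actual NIC and storage throughput.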
Be aware that you might have to make changes to your backup infrastructure to deal with real challenges of backup performance introduced by virtualization. If your backups are already a problem, you might want to look into this in more detail. (The problems and solutions are beyond the scope of this post.)
How long do you have to rebuild servers after a data centre fire? You may not have even thought of the problem (don't be embarrassed; many of us haven't). With virtualization you have to think about it, because an equivalent event is more likely to happen: the storage device that holds all your VMs may lose its data, and you're faced with rebuilding all your VMs. I have second-hand experience (the guys down the street, for example) with storage devices eating all the VMs, but I've never directly known anyone who had a serious fire in the data centre.
If the backups on your physical servers can be restored to bare metal, then you don't have to worry about your storage device eating the VMs. You may have to make some changes to your bare-metal backup -- I have no experience with that topic so I don't know for sure -- but once you do you should be able to restore your VMs relatively quickly.
If you can't or don't have backups that can be restored to bare metal, then you have a challenge. I doubt that most general purpose data centres are full of identically configured servers, with detailed rebuild procedures and air-tight configuration management so every server can be rebuilt exactly like the one that was running before. If you had to rebuild 200 VMs from installation disks, you'd probably be working a lot of long nights.
If most of your servers that you plan to virtualize have database-like data (large files that change every day), I'd recommend looking at changing your backup approach for those servers to a product like ESX Ranger, or looking for some of the user-built solutions on the Internet. These products back up the entire virtual machine every time they run, and may not allow individual file (within the VM) restores. However, for a database server you're probably backing up the whole server every night anyway, so that won't be a significant change to your backup workload.
If you want to virtualize file server-like servers, there isn't really a good solution that I or anyone I know has found at this time. If your backup infrastructure has enough room to take the additional load, simply back up with ESX Ranger or one of the other solutions once a week (or however frequently you do a full backup), along with your current full and incremental backup schedule. If you have to rebuild the VM, you restore the most recent ESX Ranger backup first. If you just have to restore files on the server, because a user deleted an important document, for example, just use the regular backups.
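A rough back-of-the-envelope model shows why full-image backups are a fine fit for database-like servers but a big jump in load for file-server-like ones. The sizes and change rates below are illustrative assumptions, not measurements:

```python
def nightly_backup_gb(total_gb, daily_change_fraction, image_backup):
    """Rough nightly backup volume for one server.

    A file-level incremental backs up only what changed that day;
    a full-image backup copies the whole VM every night."""
    if image_backup:
        return total_gb
    return total_gb * daily_change_fraction

# Database-like server: big files rewritten daily, so an incremental
# is nearly a full backup anyway -- the image backup adds little.
db_incremental = nightly_backup_gb(100, 0.9, image_backup=False)   # ~90 GB
db_image       = nightly_backup_gb(100, 0.9, image_backup=True)    # 100 GB

# File-server-like data: mostly static small files, so going to a
# nightly full image multiplies the backup volume many times over.
fs_incremental = nightly_backup_gb(100, 0.02, image_backup=False)  # ~2 GB
fs_image       = nightly_backup_gb(100, 0.02, image_backup=True)   # 100 GB
```

That's why the weekly-image-plus-daily-incremental compromise above is about the best available: you pay the full-image cost only once a week.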
If you have the budget to change backup infrastructures, ESX Ranger can probably provide a pretty good overall solution. However, you have to provide backup and restore for physical servers as well, so the staff who do restores have to be able to deal with two backup systems.
One final gotcha that I've run across: There are some great devices out there from companies like Data Domain that provide excellent compression of exactly the type of data you're backing up when you back up an entire VM. Unfortunately, ESX Ranger compresses the data too, which messes up the storage device's compression. Whatever solution you put together, make sure your vendor commits to performance targets based on the entire solution, not on individual products.
As with so much of what we do in IT, it's really hard to summarize everything in a way that makes sense in a blog post. Comment on this post if you'd like more details or reasons why I make the recommendations I make.