
SAN versus Local Storage and High Availability

I have two problems with buying a SAN. An entry-level SAN has a single point of failure: even if you have redundant power and dual controllers, the backplane can fail. If all your applications are running off that SAN, you’re screwed. But having two SANs for full redundancy is twice the price. I’m looking at £20k for a P4300 versus £10k for the entry-level HP SAN (P2000). That’s a lot for a small business like ours. My supplier tells me I’d be crazy to go for a single SAN.

But are there any benefits of virtualisation without shared storage? Well, a lot of people seem to go for two physical hosts with local storage only and replicate between them using Veeam replication. This doesn’t provide automatic failover, but it seems a lot better than waiting for a dead server to be repaired and then possibly having to rebuild it from scratch, which could take a couple of days. If one host dies, you spend a few minutes getting everything back up and running on the other host. And it’s cheap. So I’m attracted to this. But my supplier, once again, tells me I’d be crazy.

I followed a debate on a forum that went:

“Yes, shared storage is good, but going with cheap shared storage is risky. You are putting all eggs in one basket, and if such storage dies, HA will not work. In fact, nothing will work. So unless you can afford good fault-tolerant shared storage (read “expensive”), 2 ESX hosts with local storage and replication between them is significantly more fault tolerant solution. Secondly, HA is much worse than replication, because replication provides transactionally-consistent images, while HA does cold restart of crash-consistent image, so such recovery rarely does any good for transactional applications and databases. With Microsoft Exchange for example, out of 2 times I had this situation in production, MDB got corrupted in both cases, so I still had to manually restore from earlier backup.”

“Veeam Backup & Replication product provides functionality of replicating virtual machines between ESX hosts with local storages.”

“HA only works with shared storage and there is no replication involved. HA will power up guests on the second host when the first host dies. It is the equivalent of pulling power to a server and restarting it. Any decent SAN will have dual controllers and dual PS. So the only single point of failure is a complete failure of the backplane. That’s a risk that most will take. Let’s face it, if that small risk is too great, you are not currently investing enough in DR.”

“HA is pretty simplistic one, it requires shared storage, and does not guarantee successful recovery due to performing simple crash-consistent restart. Replication is much more advanced than that, does not require shared storage, and provides guaranteed application recovery. As for automatic failover with replication, actually there is such capability. With Veeam Essential suite, you also get Veeam Monitor product, that has built-in alerts for VM heartbeat, and ability to automatically trigger response action based on alert. This response action can be simple script that automatically starts up the corresponding replica VM on the standby host.”
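The "alert triggers a script" idea in that last quote can be sketched in a few lines of Python. This is only my own rough illustration of the logic, not how Veeam Monitor actually does it: `is_alive` and `start_replica` are hypothetical callables standing in for whatever checks the VM heartbeat and powers on the replica, and the three-missed-beats threshold is an assumption.

```python
from typing import Callable


def watch_and_fail_over(
    is_alive: Callable[[], bool],        # hypothetical heartbeat check
    start_replica: Callable[[], None],   # hypothetical "power on replica VM" action
    max_missed: int = 3,                 # assumed threshold before failing over
    max_checks: int = 100,
    wait: Callable[[], None] = lambda: None,  # e.g. lambda: time.sleep(10)
) -> bool:
    """Poll the heartbeat and start the replica after max_missed
    consecutive missed beats. Returns True if failover was triggered."""
    missed = 0
    for _ in range(max_checks):
        if is_alive():
            missed = 0  # a single good beat resets the counter
        else:
            missed += 1
            if missed >= max_missed:
                start_replica()
                return True
        wait()
    return False
```

Requiring several consecutive misses avoids starting the replica because of one dropped heartbeat, which matters because you don't want both copies of a VM running at once.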

“Scenario 1. Production ESX host fails, HA does its job restarting VM with Exchange server on another ESX host in the cluster. VM restarts fine, but Exchange is failing due to MDB corruption caused by improper shutdown. I had personally suffered twice from exact same situation. No fun.

Scenario 2. Shared storage goes down. You have to perform full VM restore to local ESX storage now.

Both of these scenarios will result in:

1. A few hours of down time while you are restoring Exchange VM from backup.

2. Up to 24 hours data loss because you have to roll back to your nightly Exchange backup.

Now, compare this with replication between local storage (very popular scenario with our customers). Whether your production ESX host or shared storage fail,

1. Down-time will be less than a few minutes (time it takes for replica VM to start up on standby ESX host).

2. Maximum loss of data will be less than your chosen replication period.

I am not saying HA does not have its uses, most applications will be fine after crash-consistent image restart. However, for some applications this often causes data corruption. On the other hand, applications like Exchange or databases are also the most often used apps (and always mission-critical too).”
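To put rough numbers on the comparison in that quote, here is a minimal Python sketch. Only the 24-hour figure (a nightly backup) comes from the quote; the 3-hour restore, 5-minute replica start-up and 60-minute replication period are my own illustrative assumptions, and `RecoveryPlan` is just a name I made up for the example.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    name: str
    downtime_minutes: float       # time until the service is usable again
    copy_interval_minutes: float  # gap between backups or replication passes

    @property
    def worst_case_loss_minutes(self) -> float:
        # Anything written since the last good copy is lost.
        return self.copy_interval_minutes


# Illustrative figures only: 24 h is the nightly backup from the quote;
# the 3 h restore, 5 min replica start and 60 min replication period are assumptions.
restore_from_backup = RecoveryPlan("HA fails, restore nightly backup", 3 * 60, 24 * 60)
start_replica = RecoveryPlan("Start replica on standby host", 5, 60)

for plan in (restore_from_backup, start_replica):
    print(f"{plan.name}: ~{plan.downtime_minutes:.0f} min down, "
          f"up to {plan.worst_case_loss_minutes:.0f} min of data lost")
```

Swap in your own restore time and replication period; the point is simply that the worst-case data loss is bounded by how often the copy is taken.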

The guy above pushing replication over high availability is a Director at Veeam, so not exactly unbiased. Then again, in the sales meetings I’ve had, no-one has mentioned that a crash could corrupt your Exchange database (and presumably our ERP SQL Server database). It may not be likely, but I’d be really annoyed if I spent over fifty grand on a High Availability solution only for it to fail because of database corruption. And replication is probably cheaper.

As usual, it comes down to who to believe. Everyone is selling a product to me.

Steve Albini on cookery show hosts

Frathouse cocksuckers with gimmick hairdos and catch phrases, hooting and hi-fiving, “bringing it,” celebrating gluttonous sports bar chow. Dipshits abbreviating their ingredients and making childish, cutesy-poo “comfort food” full of “yummy veggies,” shit like that. Detestable. You can spot the people who have their shit together because they don’t have to tell you how delicious their food is.

When Rebooting is not enough

Had a Windows 7 laptop today where the USB mouse was giving a USB malfunction error (device not recognized etc etc). Tried all the usual tactics, rebooting, logging in as Administrator, deleting the USB devices in Device Manager and then re-installing them, replacing the mouse…. Everything failed.

After a tip from Google, I powered down the laptop, removed the battery, waited a couple of minutes, then booted up. All worked perfectly.

Apparently, the USB controller on the motherboard can get into a corrupt state that isn’t fixed by rebooting; the motherboard needs to power down completely in order to fix itself. Amazing.

Making a Smoothwall Advanced Firewall VPN connection

When I try to make an L2TP Road Warrior connection from a laptop connected to the Internet via a Vodafone 3G connection the Smoothwall server gets as far as saying “ISAKMP SA established” but then fails with “Cannot respond to IPSEC SA request because no connection is known from [Smoothwall RED IP address]”.

The 3G connection is a NAT’ed connection and I wonder if this is the problem, though NAT traversal should prevent any such issues.

I tried connecting through a PC connected directly to the internet through the same ADSL modem that the Smoothwall box is connected to and it works, giving the message “IPsec SA established”. But I don’t know if this proves that it is a Vodafone issue, or a NAT issue or what.

Smoothwall on VMware ESXi

Installing Smoothwall (Express or Advanced Firewall) is a doddle with vSphere. There are only two things that I needed to change in the profile.
1. Remove the SCSI disk, add a new disk and select IDE, as Smoothwall won’t find the SCSI disk.
2. Remove the network adapters and add new ones with an adapter type of E1000.

Shades of blue

Spent ages trying to decide on the best colour of blue to use for hyperlinks on the intranet. Light blue? Dark blue? Currently settled on #0063DC as used by Flickr. Looks great on Flickr, but now I’m thinking it’s a bit bright on the intranet.

I am obsessed with website colour. This can’t be healthy.
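One way to put a number on a link colour, rather than squinting at it, is the WCAG contrast ratio against the page background. The sketch below is my own small Python version of that formula, and it assumes the intranet background is plain white (#FFFFFF). It measures legibility rather than "brightness" as such, so treat it as a sanity check, not a verdict.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(colour_a: str, colour_b: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Assumes a white page background.
print(round(contrast_ratio("#0063DC", "#FFFFFF"), 2))
```

WCAG AA asks for at least 4.5:1 for normal body text, so any candidate blue can be compared against that threshold the same way.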

Microsoft DirectAccess

My consultant worked for 9 days to implement Microsoft DirectAccess and at the end of this he walked away having failed.

It works with UAG, but he can’t get it working without UAG. And I have no desire to pay thousands of pounds to purchase UAG.

Now working on Plan B, a Smoothwall VPN solution.