Tuesday, October 28, 2008

Today, I broke a server!

Today, I broke a server with eight 64-bit Intel Xeon processors and 3 TB of hard drive space: six 500 GB SCSI hard drives in two RAID-0 arrays of 1.5 TB each. It also had 16 GB of memory, and it was a Dell 2950 with DRAC.

Previously, there was a bug: if the target had a physical hard drive with more than 1 TB of space, all Linux server migrations (which we call conversions) would fail because of an integer overflow during the discovery of the target's specifications (it was funny to see my hard drives detected as having -652.2 GB of space). Hence, no one had ever performed such a Linux conversion. The bug was fixed recently, so initially I was planning to do a SuSE Linux Enterprise Server (SLES) 8 64-bit conversion to it (I was thinking of using GA or SP4), but we only supported AMD64 targets for it. Since Intel is non-Athlon, there is no support... although that is ridiculous, given that we support the Intel architecture for newer SLESes.
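To illustrate how a drive can end up with a negative size, here is a minimal sketch of one plausible cause: a sector count stored in a signed 32-bit integer. This is an assumption for illustration only; the post does not say how the discovery code stored sizes, and the 512-byte sector size and the names `to_int32` and `detected_gib` are hypothetical. With 512-byte sectors, a signed 32-bit counter tops out at 2^31 sectors, which is exactly 1 TiB, so anything larger wraps around to a negative number.

```python
SECTOR_BYTES = 512  # assumed sector size; not stated in the post


def to_int32(value: int) -> int:
    """Wrap an integer into the signed 32-bit range, the way a C int32 would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def detected_gib(capacity_bytes: int) -> float:
    """Size in GiB as seen by code that keeps the sector count in an int32."""
    sectors = to_int32(capacity_bytes // SECTOR_BYTES)
    return sectors * SECTOR_BYTES / 2**30


if __name__ == "__main__":
    disks = [("500 GB disk", 500 * 10**9), ("1.5 TB array", 1500 * 10**9)]
    for label, size in disks:
        # The 500 GB disk reports a sane size; the 1.5 TB array goes negative.
        print(f"{label}: {detected_gib(size):.1f} GiB")
```

A fix along these lines would keep the sector count (or the byte count) in a 64-bit integer, so that sizes above 1 TiB no longer wrap.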

I decided to just try a SLES10 SP2 conversion to see if it would work (I also decided to charter this as part of my Linux 64-bit testing). Both the SLES10 SP2 and SLES8 were located ... Unfortunately, halfway through the conversion, while the machine was restarting, the boot failed. I restarted the computer, but I was not able to get into the BIOS, PXE boot, utility mode, or configuration setup. I could only get into the BIOS / configuration utilities of the SCSI controller and the network card. After discovering the hard drives, the white cursor would blink indefinitely, waiting for a ghastly hand to press the button to give birth, only to die again.

My first thought, though unlikely, was that the conversion software had messed around with some of the BIOS or low-level settings, but I was told that the conversion happens purely on the software side, so it should not affect them. One of my co-workers told me that it was merely an unfortunate case of the computer dying at the wrong time.
