It was certainly an experience! And not one that I have any desire to repeat …
I decided that it would be a good idea to upgrade my web and email server to OS X Server 10.4, partly because I was struggling to get spam filtering working as well as I wanted, and partly because I wanted to upgrade to PHP 5.1.4 and the excellent packages for PHP produced by Marc Liyanage no longer support 10.3. Also, I figured that by 10.4.6, the OS is stable enough for a production environment. It also helped that I was able to procure a cheap copy via eBay!
So, off I went to London to sit in the data centre and do the upgrade. I booted the Xserve into FireWire target disk mode and booted my laptop off that. I then loaded in the new disk and started the upgrade. The upgrade went fine, so I booted the machine up and logged in via VNC to make sure I had all the latest system updates. This is where the problems began…
I installed the upgrades and restarted, but the machine didn't come back up. My guess is that the 10.4.6 upgrade took so long to complete after the restart that I assumed something had gone wrong and manually restarted the machine again, thus mashing the upgrade.
There I was with my web server, with a mashed OS. OK, thought I, we'll do a clean archive and install of the OS; no dice - that option doesn't exist for OS X Server. What to do?
A bit of negotiating later, and I had procured a new hard disk to fit into the second bay. A complete reinstall later, and the machine was running once more. This time I did the OS upgrades whilst booted from the laptop. It really does take a long time to restart (twice) after the 10.4.6 upgrade. And from there I was able to copy all of the web sites and so on over to the new disk.
I went through and set up all the sites once more and was finally left with a working web server. I decided I would get email running properly when I got home. This was the second mistake. More to the point, the second mistake was wanting to copy the NetInfo database from the old install to the new one to save me recreating all the users. This process can only be done in single user mode, and if anything goes wrong, you need to be in single user mode in order to recover. And the problem with single user mode is that it requires you to be on site! This is not something you want to get wrong, as breaking the NetInfo database means you are locked out - very very not good.
My second trip to London to fix the server was much briefer. A quick trip into single user mode confirmed that the NetInfo database from the old install was completely borked, but I was at least able to restore access once again.
Finally, back home once again, I was able to recreate all the users and get email running correctly.
So, here are my hard learned top tips for doing an upgrade like this …
- Start the process with a second drive; whether you decide to do a full install from scratch or copy your existing setup to the new drive and do an upgrade, it's so much better to have your original (working) copy of the system on a second drive that you can roll back to.
- If you can, build the whole new system on a drive before you even get to the data centre - you'll still need to copy across any changed files, but at least you are not taking out the machine for any length of time. This does raise one question though - how can you work on an Xserve drive without having a second Xserve?
- Do any NetInfo work whilst you are at the data centre and sat in front of a terminal - if you can't, don't try!!
- Have you covered all bases in terms of backups? I thought my backups were pretty comprehensive, but I've had to re-evaluate. For instance, it is possible to create plists of the setup from Server Admin (and via the command line), and also of the user setups; these will take some of the pain out of having to rebuild from scratch.
- If possible, do system upgrades via FireWire target disk mode … or at least be very patient before restarting the machine! I knew, for instance, that it restarted twice after the 10.4.6 upgrade, but I wasn't prepared for it to take so long.
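
On the "be very patient" point, one way to stop yourself reaching for the power button too early is to let a script do the waiting. What follows is only a minimal sketch in Python, not something I used at the time: it polls a TCP port on the server until something answers or a generous deadline passes. The function name `wait_for_port`, the one-hour default and the 30-second polling interval are all my own assumptions; pick values that suit how long your upgrade really takes.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=3600.0, interval_s=30.0, connect_timeout_s=5.0):
    """Poll host:port until a TCP connection succeeds or timeout_s elapses.

    Returns True if the port answered, False if we gave up waiting.
    All the default durations are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the service is listening again.
            with socket.create_connection((host, port), timeout=connect_timeout_s):
                return True
        except OSError:
            # Refused, unreachable or timed out: the machine is not back yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical usage: wait up to an hour for VNC (port 5900) to come back.
# "server.example.com" is a placeholder host name.
# wait_for_port("server.example.com", 5900)
```

Only once this returns False after a suitably long wait would I start thinking about a manual restart.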
Fortunately, in the end the server wasn't out for hugely long, and everything is running really nicely now. I have learned some good lessons and my inbox is vastly less spam filled now.
SteamSHIFT out.