AIX 6.1 -> 7.1

So, here is this little IntelliStation 275 that has been running 6.1 TL5 for a year. I decided that it was time to upgrade to 7.1. The process is very easy: insert the 7.1 DVD, perform a migration installation, reboot and check the system. In my case, I have a WPAR in the global system (think of it as a kind of jail, but with a separate init and an entire process structure under it), which requires special attention.

Here are the steps and conclusions from yesterday. Many of them are trivial for AIX admins.

Preparation, backup

First, I wanted to create a cloned system on the second HDD of the rootvg mirror to make sure I have something to boot if the migration fails. I have a lot of stuff (logs, source code and compiled programs) I wouldn't like to lose. In a well-managed AIX system, the rootvg resides on a mirror of two (sets of) disks (in theory, there may be a third copy of the mirror), and since these are in sync, the system remains bootable if one of them fails. So I freed up one of my two disks by removing the mirror, then removing the disk itself from the VG, and finally rebuilding the boot image on the remaining disk (bosboot does not touch the boot list in NVRAM; that is what 'bootlist' is for):

# unmirrorvg rootvg hdisk1
# reducevg rootvg hdisk1
# bosboot -ad /dev/hdisk0

The command to create a clone on the free disk:

# alt_disk_install -C -B hdisk1

In fact, the alt_disk_install command is deprecated, so the actual command will be:

# alt_disk_copy -B -d hdisk1

Either form creates a clone (-C in the old syntax; alt_disk_copy clones by default) but doesn't adjust the bootlist to the clone (-B), as I want to keep running from the old disk (to avoid losing the logs created after the clone, for example).

This way, I will have hdisk0 available for upgrade and hdisk1 with the state of the system from Tuesday, at about 17:30.

Well, I was quite surprised to see that there is no 'alt_disk_install' command on my system because the fileset 'bos.alt_disk_install.rte' was not installed by default...!

I grabbed it from our patch server and copied it to the target, so it wasn't a big problem, but having the install filesets on the server would be a better solution... a memo to myself ;)

Next, I noticed that if the WPAR is not running, the filesystems are not mounted and alt_disk_install will ignore any unmounted filesystem for a good reason: it cannot create a file list for copying...

So start any WPARs that need to be migrated (or at least mount their filesystems by hand)!

Then I encountered this error message:

0505-129 alt_disk_install: The rootvg contains logical volume name(s):
 wp_xxxx.hd10,
which exceed the 11 character limitation. To correct this problem, unmount the logical volume(s).  Then, rename and mount the logical volume(s) and retry the command.

Luckily, AIX allows renaming LVs on the fly (the new name comes first, then the existing one):

# chlv -n wp_xxxx.hdo wp_xxxx.hd10
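
To spot such names before starting the clone, a small filter like the one below could help. It is only a sketch: the function name long_lv_names is made up for illustration, and the limit of 11 is taken from the error message above.

```shell
# Print every name read from stdin that is longer than the limit (default: 11).
# 'long_lv_names' is a hypothetical helper, not an AIX command.
long_lv_names() {
    awk -v max="${1:-11}" 'length($1) > max { print $1 }'
}

# On AIX it could be fed with the first column of 'lsvg -l rootvg', e.g.
#   lsvg -l rootvg | awk 'NR > 2 { print $1 }' | long_lv_names
```

Any name it prints needs a chlv rename before the clone is started.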

It was time to run the clone command again, this time successfully. The clone process took almost exactly two hours on a 36GB disk...

On modern systems, you can expect better disk performance.

Here are a few lines about what the command does - it may be interesting for people who don't know AIX:

It creates a file with the new VG, LV and FS parameters based on the existing ones:

Calling mkszfile to create new /image.data file.

Checks if the clone will fit on the new disk:

Checking disk sizes.

Creates the VG and the new LVs and FSs so that the names don't conflict:

Creating cloned rootvg volume group and associated logical volumes.
Creating logical volume alt_hd5.
Creating logical volume alt_hd6.
Creating /alt_inst/ file system.
Creating /alt_inst/app file system.

Then it runs a simple 'find /' and stores the result in a text file...

Generating a list of files
for backup and restore into the alternate file system...

It processes the file list with the tar-alike 'backup' and 'restore' commands

Backing-up the rootvg files and restoring them to the alternate file system...

Finally, it adjusts the descriptor database and writes the proper names directly into the LVM metadata:

Modifying ODM on cloned disk.
Building boot image on cloned disk.
Changing logical volume names in volume group descriptor area.
Fixing LV control blocks...
Fixing file system superblocks...

There may be warnings that can be ignored in most cases. There is a log about the process: /var/adm/ras/alt_disk_inst.log

Should you decide to interrupt, Ctrl-C triggers a cleanup: the new structure is removed even if you are halfway through the cloning.

And here we go. Looking at the volume groups now:

# lsvg
rootvg
altinst_rootvg

we have a renamed VG. The magic is that the 'sleeping' VG actually identifies itself as rootvg and the LV names within it are the same as on the running system, but until you boot from the cloned disk, you won't see them, as they would conflict with the existing VG and LV names and mount points. Once you boot from the new disk, more magic renames the old VG to old_rootvg.

You can use the other VG either from the old or the new system by calling

# alt_rootvg_op -W -d hdiskX

so the old/new filesystems will be accessible, but under different names than the live filesystems (their root is /alt_inst/).

Once you think that the old VG can be discarded, run

# alt_rootvg_op -X altinst_rootvg

The same applies if you revert to the old system, but replace the VG name with 'old_rootvg', of course.

The upgrade

I used a serial console with screen.

Insert the DVD, reboot and select the DVD as the boot device from the firmware menus (SMS). Wait a few minutes until the installer loads stuff in the ramdisk and the kernel extensions are initialized.

Installation flavour: Since I did a version migration, I selected the migration option instead of the other two: the clean install or the preservation installation, which would result in a reset of the system configuration.

The installer examines the disks and finds rootvg automatically; there is really nothing much to do here.

Note that if you choose more than one disk during install, they will not be arranged in a mirror. Instead, the LVs will be spread evenly over every disk and you will have to move PVs around - avoid that!

As with a normal installation, you have a few options to select from: language, system management software. One option is notable: whether to install every driver. This is really a huge set of packages including EVERY driver, but if you want to use a new piece of hardware, you will most probably need the missing drivers - you decide. On a simple, standalone box there is no need for other drivers than the ones already installed. Yet, I decided to install all drivers now.

With AIX 7.1, three editions were introduced which offer different licensing plans. Express is the default, but you may choose any other edition later with 'chedition'. Note that the enterprise edition installs and starts agent programs (FIXME). See also AIX Version 7.1 Editions

After selecting the options, you will see another screen where replaced files and removed filesets can be viewed prior to starting the actual installation.

The actual process is rather boring: you will see the progress as the number of packages processed and the time elapsed, plus an overall progress percentage.

On this system, 533 filesets were updated in more than one hour. I guess the bottleneck was the DVD-ROM and the lack of IDE DMA, as the entire system occupies less than 2GB.

The installer will adjust the bootlist to the disk and reboot, and within a few minutes the AIX banner will appear and the system will go multiuser.

 Saving Base Customize Data to boot disk
 Starting the sync daemon
 Starting the error daemon
 System initialization completed.
 in sinpolhndlr OFF
 TE=OFF
 CHKEXEC=OFF
 CHKSHLIB=OFF
 CHKSCRIPT=OFF
 CHKKERNEXT=OFF
 STOP_UNTRUSTD=OFF
 STOP_ON_CHKFAIL=OFF
 LOCK_KERN_POLICIES=OFF
 TSD_FILES_LOCK=OFF
 TSD_LOCK=OFF
 TEP=OFF
 TLP=OFF
 Successfully updated the Kernel Authorization Table.
 Successfully updated the Kernel Role Table.
 Successfully updated the Kernel Command Table.
 Successfully updated the Kernel Device Table.
 Successfully updated the Kernel Object Domain Table.
 Successfully updated the Kernel Domains Table.
 OPERATIONAL MODE Security Flags
 ROOT                      :    ENABLED
 TRACEAUTH                 :   DISABLED
 System runtime mode is now OPERATIONAL MODE.
 complete
 Starting Multi-user Initialization
  Performing auto-varyon of Volume Groups
  Activating all paging spaces
 0517-075 swapon: Paging device /dev/hd6 is already active.
 The current volume is: /dev/hd1
 Primary superblock is valid.
 The current volume is: /dev/hd10opt
 Primary superblock is valid.
  Performing all automatic mounts
 Multi-user initialization completed

Then the network stack starts and the login prompt appears.

The firstboot inittab entry renames /etc/firstboot to /etc/fb_* - useful information if you want to check when the system was installed.

Here, the first is the original 6.1 install date and the second is the version update:

# ls -l /etc/fb_*
-rwxrwxr-x    1 root     system           24 Sep 08 2010  /etc/fb_17_17_09_07
-rwxrwxr-x    1 root     system          185 Sep 13 20:25 /etc/fb_20_31_09_13
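
Judging only from the two files listed above, the name seems to encode hour, minute, month and day (fb_HH_MM_mm_DD). Under that assumption, which I have not verified against the documentation, a name can be turned into something more readable. The function name fb_stamp is made up for this sketch.

```shell
# Turn an /etc/fb_HH_MM_mm_DD file name into "mm-DD HH:MM".
# The field order is an assumption inferred from the listing above.
fb_stamp() {
    name=${1##*/}    # strip the directory part
    echo "$name" | awk -F_ '{ printf "%s-%s %s:%s\n", $4, $5, $2, $3 }'
}

# Example: for f in /etc/fb_*; do fb_stamp "$f"; done
```

The year is not part of the name, so the file's modification time is still needed for that.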

Post-upgrade

Check base system functions: logins, network, mount points, third-party software.

There is a list of modified files in

/etc/check_config.files

and some of the old files can be found under

/tmp/bos/ .

My migration results are not bad so far; I have yet to recompile programs linked to perl due to the 5.8.8 => 5.10.1 change.

I applied the latest fixes to the compiler filesets as well to make sure I'm not running into stupid bugs again.

I had to apply FPM settings again, as these seemed to have been lost during the update.

Rebuild the rootvg mirror as necessary:

# alt_rootvg_op -X altinst_rootvg
# extendvg -f rootvg hdiskY
# mirrorvg -S rootvg hdiskY
# bosboot -ad /dev/hdiskY
# bootlist -m normal -o
# bootlist -m normal hdiskX hdiskY

AIX OS level update

Run an update as soon as possible if you didn't use a recent media kit - I recommend using the 'fixget' tool.

Here is the URL to get a raw list of all available updates for 7.1 (no oslevel specified = 7.1 is selected automatically):

http://www7b.software.ibm.com/webapp/set2/fixget?t=L&of=clean

You can then compare this list to the filesets on your system if you don't want to download unnecessary stuff.

# wget -O download.lst "http://www7b.software.ibm.com/webapp/set2/fixget?t=L&of=clean"

Extract only fileset names from the FTP URLs:

# sed 's/.*\///g;s/\(\.[[:digit:]]*\)\{4\}\.bff//g' download.lst
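
To see what the two substitutions do, here is the same sed expression wrapped in a function and tried on a sample line. The function name fileset_names and the URL are invented for illustration; only the expression itself comes from the command above.

```shell
# Strip everything up to the last slash, then the four-part version and
# the .bff suffix, leaving only the fileset name.
fileset_names() {
    sed 's/.*\///g;s/\(\.[[:digit:]]*\)\{4\}\.bff//g'
}

# Example with a made-up URL:
#   echo 'ftp://example.invalid/aix/bos.rte.install.7.1.0.3.bff' | fileset_names
#   prints: bos.rte.install
```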

You can then sort them into installed and not installed groups by piping the sed output to lslpp:

| while read fs; do lslpp -Ou -lcq "$fs" 2>> donthave.lst | awk -F: '{print $2}' >> have.lst; done

Then download the list of updates to filesets which are already installed.

You must filter for the needed filesets from the above download list which includes every available update.

# while read fs; do grep "$fs\.[[:digit:]]" download.lst >> final.lst; done < have.lst

You must append the "dot number" after the fileset name in grep because the name of some filesets is a subset of the name of other filesets as well, and it would result in multiple matches for these filesets (bos.rte -> bos.rte.aio bos.rte.boot etc.)
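
The difference is easy to see on a small sample. The three URLs below are invented, and the helper names sample_list and urls_for_fileset exist only in this sketch; the grep pattern is the one used above.

```shell
# A made-up excerpt of download.lst (hypothetical URLs).
sample_list() {
    cat <<'EOF'
ftp://example.invalid/aix71/bos.rte.7.1.0.3.bff
ftp://example.invalid/aix71/bos.rte.aio.7.1.0.2.bff
ftp://example.invalid/aix71/bos.rte.boot.7.1.0.3.bff
EOF
}

# Keep only the lines where the fileset name is followed by ".<digit>",
# i.e. where the version starts right after the name.
urls_for_fileset() {
    grep "$1\.[[:digit:]]"
}

# sample_list | grep 'bos.rte'            matches all three lines
# sample_list | urls_for_fileset bos.rte  matches only the bos.rte line
```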

# wget -q -i final.lst

You may run into a situation where a fileset update requires a fileset that is not installed yet, but this is rather rare and the preview mode will show it to you anyway.

Finally, when the filesets are downloaded, run the update for the installer (bos.rte.install) first:

# install_all_updates -piYd.
# install_all_updates -iYd.

where -p is preview mode, -d names the directory where the filesets were downloaded (in my case, $PWD), -Y accepts the licenses and -i updates the install utilities only.

Update the whole system the same way (prepare for reboot!)

# install_all_updates -pYd.

Look at the summary before letting it run:

FILESET STATISTICS
------------------
  273  Selected to be installed, of which:
      273  Passed pre-installation verification
  ----
  273  Total to be installed

then go for it:

# install_all_updates -Yd.

Since these filesets are mostly binary diffs, the update goes fast... sort of:

Finished processing all filesets.  (Total time:  22 mins 6 secs).

Here is the result you should see:

install_all_updates: Checking for recommended maintenance level 7100-00.
install_all_updates: Executing /usr/bin/oslevel -rf, Result = 7100-00
install_all_updates: Verification completed.
install_all_updates: Log file is /var/adm/ras/install_all_updates.log
install_all_updates: Result = SUCCESS
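
The preview-then-install pattern used above can be wrapped in a few lines. This is only a sketch: the function name preview_then_update is made up, and it assumes that install_all_updates exits with a non-zero status when the preview fails, which I have not checked on every level.

```shell
# Run the preview first and start the real update only if the preview
# succeeded. The directory defaults to the current one, as in the text.
# Assumption: install_all_updates returns non-zero on a failed preview.
preview_then_update() {
    dir=${1:-.}
    install_all_updates -pYd "$dir" || return 1
    install_all_updates -Yd "$dir"
}

# Example: preview_then_update /tmp/fixes
```

It does not replace reading the preview summary; it only stops the real run when the preview itself reports an error.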

Then just reboot.

After the system is up, check the OS level and make sure it matches the latest:

# oslevel -s
7100-00-03-1115
# oslevel -sq
Known Service Packs
-------------------
7100-00-03-1115
7100-00-02-1041
7100-00-01-1037
7100-00-00-0000
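
The comparison can be scripted as well. The sketch below picks the highest entry from the 'oslevel -sq' output; the function name latest_known_level is made up, and it relies on the level strings having a fixed width so that a plain text sort orders them correctly.

```shell
# Read 'oslevel -sq' output on stdin and print the highest known level.
# Header lines are skipped because they do not start with a digit.
latest_known_level() {
    grep '^[0-9]' | sort -r | head -n 1
}

# On AIX:
#   [ "$(oslevel -s)" = "$(oslevel -sq | latest_known_level)" ] && echo up to date
```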

When you are not on a production system, you can commit the updates once the system is up, freeing some space (in my case, more than 600MB).

# installp -c all

Of course, the SUMA update tool should be able to do this for you automatically, but I had problems with it in the past.

Also, the use of SUMA makes no sense on systems not connected to the Internet.

WPAR upgrade and update

See this subpage with detailed notes.

Notes

The alt_ commands are ksh scripts...

The migration installation cannot be reverted.

From AIX 6.1 onwards, you can determine if you are in a WPAR with 'uname -W': it prints 0 if you are in the global system, 1 or higher if it is a WPAR.
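
In a script, that number can be classified like this. The function name wpar_kind is made up for the sketch; the "unknown" branch covers systems where 'uname -W' is not available and prints nothing useful.

```shell
# Classify the value printed by 'uname -W'.
wpar_kind() {
    case "$1" in
        0)            echo global ;;   # the global system
        ''|*[!0-9]*)  echo unknown ;;  # empty or not a number
        *)            echo wpar ;;     # 1 or higher: inside a WPAR
    esac
}

# Example: wpar_kind "$(uname -W 2>/dev/null)"
```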

Here is the schedule of the upgrade:

  • 17:00 alt_disk_install
  • 19:00 alt_disk_install is complete; reboot; boot from DVD; wait for installer; select options
  • 19:15 start installing/updating filesets; reboot
  • 20:30 system is up, login

Recommended reading

IBM AIX Version 7.1 Differences Guide - 3.6 WPAR migration to AIX Version 7.1 (PDF/HTML)

AIX 7.1 migwpar Command (InfoCenter)

Migrating to AIX Version 7.1 (InfoCenter)

...