Wednesday 9 September 2009

Minimal Downtime Checkpoint Upgrade

This is an old process I worked out for an R55->R65 upgrade on a Nokia VRRP cluster to provide minimal downtime. In lab tests this worked with under a second of lost packets. In reality I wouldn't bet on a busy cluster being upgraded without losing traffic under any circumstances!

My preferred method is to wipe everything completely and re-build from config backups rather than try to do software upgrades.


Step 0 is to make sure you have printed copies of all documents (in case you kill the internet), copies of both old and new IPSO, Checkpoint and HFAs, any CDs you need, and all passwords.


The first steps are to upgrade the Server, since a newer version server can manage old enforcement modules.

1 - Run upgrade_export on the Smart Center Server (SCS) and save the file remotely.
2 - If you have a backup SCS then run cpstop on it, keep it for backout!
3 - Run cpclean on the SCS to remove all Checkpoint software. Run the CD install.
4 - Use upgrade_import to load the config exported in step 1.
5 - Sort out the licenses (test this thoroughly in advance or allow hours/days/weeks for this step)
6 - Check the server is talking to the enforcement modules correctly.
7 - The backup server can be updated at any point from here on. Personally I'd leave it until much later, to be sure we don't need to roll back to it.

- SCS now updated -

8 - Run the IPSO backups on all enforcement modules (firewalls).
9 - Turn off "Firewall Monitoring" in the VRRP options on all firewalls.
10 - Set the cluster version to be R65 in Smart Dashboard.
11 - Reboot a backup firewall and run the IPSO install from the boot manager, also install the R65 wrapper.
* This may not work with anonymous FTP, best to set a username/pass on FTP server *
12 - When the box is rebuilt, log in via SSH, edit the file $FWDIR/lib/webgui_client.def and add the IP address of the PC you want to use to access Voyager initially.
13 - Run comp_init_policy -g to regenerate the initial firewall policy with the new client definition.
14 - Reboot the box.
15 - Install any HFAs, and check each time whether webgui_client.def needs editing again (an HFA may overwrite it).
16 - Log into Voyager and load the IPSO backup file from earlier. Turn off "Firewall Monitoring" in the VRRP options again (the backup from step 8 was taken before step 9 disabled it).
* Check the package and IPSO image configuration: it may have reverted to the old versions, and you may need to re-install the Checkpoint packages to get these settings back to sensible values *
17 - Push policy from Smart Dashboard, it should install to the updated firewall only.

* OUTAGE COMING UP!! *
You can use cphaprob state to see whether you have a working cluster, but it's unlikely due to the version differences.

18 - Force VRRP to fail over, either by editing the priority values or by disabling/disconnecting a monitored interface on the master unit.

Make sure you edit the priorities anyway on the upgraded unit so it remains the master when you restore the IPSO config later onto the next unit.

Don't just turn the priority down on the (old) master as the IPSO restore will reset them and you may end up with a VRRP master that has no firewall software loaded and just acts as a router!
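The warning above can be illustrated with a small model. This is only a sketch, assuming the unit with the highest VRRP priority takes over as master and that restoring an IPSO backup puts the backed-up priority back on a box that has no working firewall yet. The unit names and priority values are made up for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Unit:
    name: str
    priority: int          # VRRP priority: the highest value wins the election
    firewall_loaded: bool  # False while a rebuilt box has no working firewall yet


def elect_master(units):
    """Return the unit with the highest priority (tie-breaking is ignored here)."""
    return max(units, key=lambda u: u.priority)


# Hypothetical values, as stored in the IPSO backups taken in step 8.
BACKED_UP_PRIORITY = {"fw-old-master": 100, "fw-upgraded": 95}


def restore_ipso_backup(unit):
    """Model the restore on a rebuilt box: priority reverts, no firewall yet."""
    return replace(unit, priority=BACKED_UP_PRIORITY[unit.name], firewall_loaded=False)


def master_after_rebuild(old_master, upgraded):
    """Rebuild the old master, restore its backup, then re-run the election."""
    return elect_master([restore_ipso_backup(old_master), upgraded])


# Risky: only the old master was turned down to force the failover.
risky = master_after_rebuild(
    Unit("fw-old-master", 90, True), Unit("fw-upgraded", 95, True)
)

# Safer: the upgraded unit was raised above the backed-up value of its peer.
safe = master_after_rebuild(
    Unit("fw-old-master", 100, True), Unit("fw-upgraded", 110, True)
)

print(risky.name, risky.firewall_loaded)
print(safe.name, safe.firewall_loaded)
```

In the risky case the restored box gets its old priority of 100 back and wins the election with no firewall loaded. In the safer case the upgraded unit stays master because its raised priority is not touched by the restore on the other box.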

19 - Run any traffic tests you have, decide whether the new version is working properly and whether you want to go past the point of no return! You can still fail back to the old version at this stage.

20 - Run steps 11-17 on the remaining firewall.

21 - Push policy to the entire cluster and verify all units accept it. Check that the cluster is talking with "cphaprob state".
22 - Set the VRRP values back to the original state and re-enable "Firewall Monitoring".

23 - Test out the new firewalls. Include

Hopefully it's all worked and you're now on the new versions.
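
For the traffic tests in step 19, it helps to put a number on the outage rather than eyeball it. The following is a minimal sketch, assuming you already have a list of probe results (True for a reply, False for a loss) sent at a fixed interval through the cluster during the failover. How the probes are sent is up to you, and the function name and sample data are made up for the example.

```python
def longest_outage_ms(probes, interval_ms):
    """Return the longest run of consecutive lost probes, in milliseconds.

    probes: booleans in send order, True = reply received, False = lost.
    interval_ms: time between probes, in milliseconds.
    """
    longest = current = 0
    for ok in probes:
        current = 0 if ok else current + 1  # count the current run of losses
        longest = max(longest, current)
    return longest * interval_ms


# Hypothetical capture: one probe every 100 ms, 7 lost in a row at failover.
sample = [True] * 20 + [False] * 7 + [True] * 20

print(longest_outage_ms(sample, 100))
```

With the sample data this reports 700, i.e. a 0.7 second gap. The result is only as precise as the probe interval, so a shorter interval gives a better estimate.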