Steps Required to Use ONBAR/Legato backup to initialise HDR
This document explains the steps required to initialise Informix HDR
using an ONBAR backup and restore through Legato NetWorker.
Terminology:
HDR: Informix's High-Availability Data Replication
NSR: Legato's NetWorker Server.
Source machine: Host machine that contains the current instance
you want to replicate. This is usually your
primary server once HDR is operational.
Target machine: Host machine intended to be used as secondary
in the HDR pair. A physical restore must
take place on this machine to initialise HDR.
Preliminary:
1. Install appropriate software on both machines.
Informix: I would recommend no version less than 7.24.UC5
If you are planning a parallel physical restore to init
HDR, you will require a version with bug 82349 fixed.
Legato: This testing was done with NSR 5.0 build 63. To
integrate onbar and NSR, you also need the Legato BMI
add-on. You must also be using the appropriate XBSA
shared library, currently provided as a patch.
The Legato bug referenced here is LGTpa09766
Platform: Supported platform of above products.
My testing was done using Solaris 2.4
2. Make sure both the source and target OnLine instances can communicate
over the network. One way to verify this is to create a small instance
on the target, configured to recognise the source. On the primary,
add entries into your network hosts file (sqlhosts) to recognise
the target instance. With both instances operational, verify remote
access by running a few simple remote SQL statements in dbaccess.
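As a sketch of what the cross-recognition might look like, with entirely
hypothetical instance, host, and service names (check the nettype and
service entries against your own sqlhosts conventions):

```shell
# sqlhosts entries (hypothetical names) -- each instance lists the other:
#   dbservername  nettype   hostname  servicename
#   monster_724   ontlitcp  monster   sqlexec_monster
#   solo_724      ontlitcp  solo      sqlexec_solo
#
# From the source, run a trivial query against the target instance:
echo 'SELECT COUNT(*) FROM systables;' | dbaccess sysmaster@solo_724
```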
3. Configure NSR
NSR architecture gives you a few options to allow this exercise
to work:
1. NSR servers on source and target
2. NSR server on either source or target. Whichever machine
does not contain the NSR server must act as a client-only.
3. Both hosts are remote clients to a different server running
the NSR server.
These steps assume #1 above: NSR servers on both the target and source.
*Note: You may need to add each server to each other's authorized
client list. Although this example illustrates a complete import,
the XBSA object may still require the source client definition.
4. Set up onbar and make sure it works on each instance.
This includes:
- running NSR's bmi_config (to create Pool templates, ...)
- Defining devices
- these steps have been verified using both cooked
and tape devices (8mm)
- Labeling media with proper pool names
- Mounting device
- Proper definition of the XBSA library.
At version 7.24.UC5, onbar expects the shared library
to be at '/usr/lib/ibsad001.so'. This can be a UNIX link
to NSR's /usr/lib/libbsa.so.1
- NSR-Identifying row in Informix database 'sysutils'
In dbaccess, select database sysutils, and perform the
following query:
SELECT * FROM bar_version;
Should return:
bar_version bsa_version bar_sm sm_version
1 1.0.1 nwbsa 1
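Since the XBSA library location above is easy to get wrong, here is the
link creation spelled out as a sketch (paths are the Solaris ones quoted
in this note; verify them on your platform):

```shell
# At 7.24.UC5, onbar expects the XBSA shared library at this path;
# point it at NSR's library via a UNIX link:
ln -s /usr/lib/libbsa.so.1 /usr/lib/ibsad001.so
ls -l /usr/lib/ibsad001.so    # confirm the link resolves
```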
Initialise Informix HDR
I. Take a level-0 archive on primary
This archive must contain all Informix dbspaces.
This archive can be an online archive.
In this case, the following NSR env vars were set for the onbar
session:
NSR_LOG_VOLUME_POOL=DBMIData
NSR_SERVER=
NSR_DATA_VOLUME_POOL=DBMIData # sending log to same place
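As a sketch, the archive step with the environment above might look like
this (the server and pool names are the example values from this note,
not requirements):

```shell
export NSR_SERVER=solo                # NSR server host (example value)
export NSR_DATA_VOLUME_POOL=DBMIData
export NSR_LOG_VOLUME_POOL=DBMIData   # logs to the same pool, as above
# Online level-0 archive of all dbspaces:
onbar -b -L 0
```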
II. Migrate NSR objects.
- Client indexes (which object IDs are stored on the NSR server)
- Media Management database (what media, and where on the media
it's stored)
- The backup media containing the Informix backup.
*Note: The following are the steps I used to migrate the NSR data.
I'm confident the crudeness of using "tar" to migrate the data
can be replaced with appropriate NSR utilities (e.g. scanner).
1. Shut down nsr on both instances
$ nsr_shutdown
2. On primary, copy the /nsr/index and the /nsr/mm directories to the
target.
Example: (on primary)
# cd /nsr
# tar cvf NSR_primary.tar index mm
3. Transfer the tar file and install on target:
Example: (on target)
# cd /nsr
# ftp primary_host_name
ftp> cd /nsr
ftp> bin
ftp> get NSR_primary.tar
# rm -rf index mm (remove existing index and mm)
# tar xvf NSR_primary.tar
# nsrck -c (recreates client indexes)
# nsrd (restart NSR server on target)
4. Mount transferred volume(s)
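For the scanner-based alternative alluded to above, a rough sketch would
be the following (the device name is hypothetical, and the exact
invocation should be checked against your NSR release):

```shell
# With the volume mounted on the target, rebuild media database and
# client file index entries directly from the backup volume:
scanner -i /dev/rmt/0hbn
```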
III. Perform Informix Restore on Target.
From an Informix device/chunk perspective, every Informix chunk
must match exactly in size and offset on the source and target machines
for the restore to complete.
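One way to sanity-check the size/offset requirement before restoring is
to compare chunk listings against the target's devices (the device path
below is purely hypothetical):

```shell
# On the source, 'onstat -d' reports the path, offset and size of each chunk:
onstat -d
# On the target, confirm each corresponding device or cooked file is at
# least as large and uses the same offset (for cooked files, compare sizes
# with ls -l; the path here is an example only):
ls -l /dev/informix/*
```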
1. Copy OnLine device bootstrap file.
The device bootstrap file is found in $INFORMIXDIR/etc, and has the
following name:
oncfg_<DBSERVERNAME>.<SERVERNUM>
Once the file has been transferred to the target location, rename it
to match the noted convention; that is, change the file name to match
the target instance's DBSERVERNAME and SERVERNUM.
Onbar needs this file to know which dbspaces to retrieve.
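The renaming convention can be sketched as follows; the DBSERVERNAME and
SERVERNUM values here are purely illustrative:

```shell
# Target instance identity (hypothetical values):
DBSERVERNAME=solo_724
SERVERNUM=1
# File name onbar will look for on the target:
TARGET_CFG="oncfg_${DBSERVERNAME}.${SERVERNUM}"
echo "$TARGET_CFG"    # oncfg_solo_724.1
# e.g. after transferring the source's file into $INFORMIXDIR/etc:
#   mv oncfg_monster_724.0 "$INFORMIXDIR/etc/$TARGET_CFG"
```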
2. Copy ONCONFIG, and change parameters in the target's ONCONFIG to match
the target instance.
3. Create onbar bootstrap file on target machine.
This file is found in $INFORMIXDIR/etc, and should be named:
ixbar.<SERVERNUM>
The only entries that need to be in this file are the entries
put into this same file on the source machine when performing
the initial HDR primary level-0 archive.
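As a sketch of extracting those entries (the SERVERNUM suffixes and the
line count are hypothetical; the level-0 archive appends its entries at
the end of the source's file):

```shell
# On the source: grab the lines appended by the HDR level-0 archive
# (adjust the count to the number of objects in that archive):
tail -10 $INFORMIXDIR/etc/ixbar.0 > /tmp/ixbar.for_target
# Transfer and install as the target's ixbar file (SERVERNUM 1 is an example):
#   cp /tmp/ixbar.for_target $INFORMIXDIR/etc/ixbar.1
```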
4. Before beginning actual restore on target, instruct source
instance to attempt to become 'primary'. This obviously
may cause some troubling messages to appear in online.log,
but the following are expected:
19:28:15 DR: new type = primary, secondary server name = solo_724
19:28:15 DR: Trying to connect to secondary server ...
19:28:18 DR: Primary server connected
19:28:18 DR: Receive error
19:28:18 DR: Failure recovery error (2)
19:28:19 DR: Turned off on primary server
19:28:20 Checkpoint Completed: duration was 0 seconds.
19:28:20 DR: Cannot connect to secondary server
19:28:31 DR: Primary server connected
19:28:31 DR: Receive error
19:28:31 DR: Failure recovery error (2)
19:28:32 DR: Turned off on primary server
19:28:33 Checkpoint Completed: duration was 0 seconds.
19:28:33 DR: Cannot connect to secondary server
To make the source the primary, use the 'onmode' command:
# onmode -d primary
5. Perform the restore; be sure to include the 'p' option for a physical
restore only:
# onbar -r -p
Typical online messages:
19:35:39 Event alarms enabled. ALARMPROG = '$INFORMIXDIR/etc/log_full.sh'
19:35:44 DR: DRAUTO is 0 (Off)
19:35:44 INFORMIX-OnLine Initialized -- Shared Memory Initialized.
19:35:44 Dataskip is now OFF for all dbspaces
19:35:44 Recovery Mode
19:35:44 Physical Restore of rootdbs started.
19:36:19 Physical Restore of rootdbs Completed.
19:36:19 Checkpoint Completed: duration was 0 seconds.
19:36:21 Physical Restore of dbs2 started.
Typical BAR_ACT_LOG messages:
1998-07-30 19:33:15 5434 595 onbar -r -p
1998-07-30 19:33:53 5434 595 Successfully connected to Storage Manager.
1998-07-30 19:35:36 5434 595 Begin cold level 0 restore rootdbs.
1998-07-30 19:36:16 5434 595 Completed cold level 0 restore rootdbs.
1998-07-30 19:36:20 5464 5434 Process 5464 5434 successfully forked.
1998-07-30 19:36:21 5464 5434 Successfully connected to Storage Manager.
1998-07-30 19:36:34 5464 5434 Begin cold level 0 restore dbs2.
1998-07-30 19:36:46 5464 5434 Completed cold level 0 restore dbs2.
1998-07-30 19:36:50 5464 5434 Process 5464 5434 completed.
1998-07-30 19:36:52 5434 595 WARNING: Physical restore complete. Logical
restore required before work can continue.
*Note: here are the active NSR env vars during restore
NSR_CLIENT=monster
NSR_LOG_VOLUME_POOL=DBMIData
NSR_SERVER=solo
NSR_DATA_VOLUME_POOL=DBMIData
6. Instruct the target instance to become the HDR secondary:
# onmode -d secondary
Typical online messages while sync'ing:
19:37:10 DR: Server type incompatible
19:37:23 DR: Server type incompatible
19:37:31 DR: new type = secondary, primary server name = monster_724
19:37:31 DR: Trying to connect to primary server ...
19:37:36 DR: Secondary server connected
19:37:36 DR: Failure recovery from disk in progress ...
19:37:37 Logical Recovery Started.
19:37:37 Start Logical Recovery - Start Log 11, End Log ?
19:37:37 Starting Log Position - 11 0x629c
19:37:44 Checkpoint Completed: duration was 0 seconds.
19:37:45 Checkpoint Completed: duration was 0 seconds.
... May require many logs/checkpoints for the secondary to sync with the primary ...
19:37:47 Checkpoint Completed: duration was 0 seconds.
19:37:48 DR: Secondary server operational
19:37:49 Checkpoint Completed: duration was 0 seconds.
NOTES:
I found the built-in debug switches for both onbar and NSR to be very
insightful.
To turn on onbar debugging, just put the following line into your
ONCONFIG file (no need to cycle shared memory):
BAR_DEBUG num # where 'num' = 0-9; 9 producing heaps of info
output defaults to /tmp/bar_dbug.log
To turn on NSR XBSA debugging, set the following env var before
running onbar:
NSR_DEBUG_LEVEL num # where 'num' = 0-9 ; 9 producing heaps of info
output file defaults to /nsr/applogs/xbsa.messages
I currently only have the Solaris XBSA library patch mentioned above.
I don't yet know how to positively identify it other than with the
following UNIX commands. Make sure yours matches:
> sum /usr/lib/ibsad001.so
35382 1208 /usr/lib/ibsad001.so
> ls -l /usr/lib/ibsad001.so
lrwxrwxrwx 1 root other 11 Jul 23 11:34 /usr/lib/ibsad001.so -> libbsa.so.1*
> ls -l /usr/lib/libbsa.so.1
-rwxr-xr-x 1 root other 618296 Jul 29 07:17 /usr/lib/libbsa.so.1*
Got this error during recovery:
bar_act.log:
1998-08-11 11:05:50 9520 9519 /export/home/730/bin/onbar_d -r -p
1998-08-11 11:06:20 9520 9519 Successfully connected to Storage Manager.
1998-08-11 11:06:21 9520 9519 XBSA Error (BSAGetObject): Attempt to authorize
root failed.
and in /nsr/applogs/xbsa.messages:
XBSA-1.0.1 dbmi-1.1.1 11920 Tue Aug 11 15:31:56 1998 _nwbsa_auth_index
_session: received a network error (Severity 5 Number 13): user root on machine
hammerhead.mp.informix.com is not on blobby's remote access list
The fix for this is to add the user root (and its domain derivatives) to the
secondary NetWorker server's list of remote clients (Whoo ...)
1. Start up nwadmin in the secondary NSR environment. Click on client setup,
highlight the client name where the original archive was created (primary server)