6. Common Questions about the Engine

6.30 Why is shared memory not freed when my session ends?

On 9th Feb 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

The memory used by your cursors and prepared statements (4GL implicitly prepares statements even if you did not explicitly prepare any) is freed when the cursor is freed. The communications areas allocated to support your query are kept for reuse while your program runs and are freed when it exits. The connection-specific data structures allocated in shared memory, however, are not freed when your program exits; they are kept for reuse by the next process that connects. If you run another copy of the process after the first has exited, you should see that memory use does not keep increasing and that the allocated structures are reused.

6.31 Do detached indexes take up more space?

On 18th Feb 1998 mcollins@us.dhl.com (Mark Collins) wrote:-

Detached indexes actually take more space. In an attached index, each key includes a one-byte delete flag and a four-byte row location. With a detached index, you also have a four-byte tablespace identifier that points to the base table's tablespace.

This can be verified by running oncheck against an attached index vs. a detached index. Samples of such onchecks are included here:

 
       Attached index
   Level 2 Node 28 Prev 0 Next 27
   Key:    4900001:
   Rowids:       201
   Key:    4900002:
   Rowids:       202

 
       Detached index
   Level 2 Node 3 Prev 0 Next 2
   Key:    4900001:
   Fragids/Rowids:    30001f/     101
   Key:    4900002:
   Fragids/Rowids:    30001f/     102

Note that instead of just "Rowids", we now have "Fragids/Rowids." The fragid value points to the tablespace number of the base table. Regardless of the name "fragids", let me assure you that the table used in this case was not fragmented. This was a simple table defined in a specific dbspace, with the index defined with an "in dbspace" clause as well.

For experimental purposes, I tried the "create index" directing the index into the same dbspace as the base table, and also into a separate dbspace. The oncheck results were the same in both cases. Thus, anytime you specify "in dbspace" when creating an index, even if you specify the same dbspace as for the table, the index is considered detached.

As a result, you may want to modify the space calculations for estimating index pages. I know I wasn't able to calculate index space correctly until I discovered the extra four bytes.
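
The adjustment to the space calculation can be sketched as follows. This is a minimal illustration using only the byte counts quoted above (one-byte delete flag, four-byte row location, four extra bytes for a detached index); the function names and the one-entry-per-row model are assumptions made for the example, not the official sizing formula.

```python
# Per-key overhead in bytes, taken from the figures quoted in the text.
DELETE_FLAG_BYTES = 1   # one-byte delete flag
ROWID_BYTES = 4         # four-byte row location
FRAGID_BYTES = 4        # extra four-byte tablespace identifier (detached only)


def key_entry_bytes(key_bytes, detached):
    """Bytes for one index entry pointing at a single row (simplified model)."""
    overhead = DELETE_FLAG_BYTES + ROWID_BYTES
    if detached:
        overhead += FRAGID_BYTES
    return key_bytes + overhead


def extra_bytes_for_detached(key_bytes, rows):
    """Additional bytes a detached index needs compared with an attached one."""
    per_row = key_entry_bytes(key_bytes, True) - key_entry_bytes(key_bytes, False)
    return per_row * rows


# Hypothetical example: a 4-byte integer key on a table of one million rows.
print(key_entry_bytes(4, detached=False))
print(key_entry_bytes(4, detached=True))
print(extra_bytes_for_detached(4, 1_000_000))
```

The extra bytes then need to be converted into pages using whatever page-fill assumptions your existing estimate already makes.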

6.32 How do I reduce bufwaits?

On 12th May 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

First, bufwaits is NEVER zero for a REAL working server. That said, it is desirable to keep bufwaits to a minimum. A rule of thumb I use is to keep bufwaits to 2-10% of either dskreads or bufwrits (or <3% of the total), since both of these operations require locking an LRU and are most likely to cause a bufwait. There are several things that affect bufwaits:

1) More buffers == fewer bufwaits but with your cache %ages this seems OK.

2) More LRUS == fewer bufwaits. This is critical! The more LRUS the better. Avoid LRUS=64 and LRUS=96 as there seems to be a bug in the LRU hash algorithm that causes bufwaits to skyrocket at these particular values (and LRUS=128 triggers a harmless but scary bug at engine startup that complains about zero length Physical and Logical log files). I know that the following values are fine: 4, 16, 32, 127.

3) Lowering LRU_MAX_DIRTY and LRU_MIN_DIRTY can ensure that there are sufficient clean buffers available when needed. You can see if this is the problem by looking at FG Writes on the onstat -F report. Foreground writes should be zero or a fraction of a percent of the total buffer writes. If not, you need to be flushing more frequently or you need more buffers; again, your buffer count looks fine.
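
The foreground write check in point 3 can be sketched like this. It assumes that "total buffer writes" means the sum of the foreground, LRU and chunk write counters from onstat -F; the function name and the sample numbers are made up for illustration.

```python
def foreground_write_pct(fg_writes, lru_writes, chunk_writes):
    """Foreground writes as a percentage of all buffer writes.

    Assumes the total is the sum of the three write counters.
    """
    total = fg_writes + lru_writes + chunk_writes
    if total == 0:
        return 0.0  # nothing written yet, so nothing to worry about
    return 100.0 * fg_writes / total


# Hypothetical counters copied by hand from an onstat -F report.
print(foreground_write_pct(5, 495, 500))
```

A result of zero or a small fraction of a percent matches the guideline above; anything larger suggests more frequent flushing or more buffers.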

On 31st August 2001 (Art Kagel) wrote:-

If you have many users attached and performing updates more LRU queues prevent spins and BUFWAITS. Hmmm, seems that I was so focused on this annoying bug that I failed to suggest the obvious, that being that there may not be ENOUGH LRU queues to begin with which will show similar symptoms. Let me explain in detail for the benefit of anyone who is not familiar with the LRU hashing algorithm.

When a thread needs a buffer to update and the page is not already in the buffer pool, the thread uses a random hash to select an LRU to get the buffer from. If this LRU is not locked by another process doing the same thing, the thread acquires the lock on that LRU queue, takes the Least Recently Used clean buffer, moves it to the LRU queue's dirty list, locks the buffer page and releases the lock on the LRU. If the LRU is already locked, the thread spins (BUFWAIT) for a while, trying a few more times to get a lock on that LRU; if that fails, it rehashes to another queue. If there are many updaters and few LRUs, it is likely that that LRU will also be locked already, so our poor thread spins again, and so on. Eventually it gets a lock on some LRU and continues. If there are more LRUs then the probability of finding a free one improves greatly, BUFWAITS go down and throughput goes up.

If the page to be updated is already in the cache but has not yet been modified the number of LRUs is still important as the size of the buffer cache increases. More cache means it is more likely that a page that is about to be modified is already in the clean cache (or the dirty cache for that matter but that is not affected by the number of LRUs). With more LRUs it is less likely that some other process is in the process of moving some other already cached clean page to the dirty list from the same LRU that contains the page that your thread needs. Again this reduces BUFWAITS and improves throughput.
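
To get a feel for why more LRUs help, here is a toy probability model of the first hash attempt. It is only a sketch: it assumes each competing thread holds one queue chosen uniformly at random, which is not the engine's actual hash or rehash algorithm, and the function name is invented for the example.

```python
def collision_probability(num_lrus, busy_threads):
    """Chance that a randomly chosen LRU queue is already locked.

    Toy model: each of `busy_threads` other threads holds one queue picked
    uniformly at random. This is NOT the real engine algorithm.
    """
    if num_lrus < 1:
        raise ValueError("need at least one LRU queue")
    return 1.0 - (1.0 - 1.0 / num_lrus) ** busy_threads


# Compare a few LRUS settings for a hypothetical 20 concurrent updaters.
for lrus in (4, 16, 32, 127):
    print(lrus, round(collision_probability(lrus, 20), 3))
```

Even in this crude model the chance of hitting a locked queue drops steadily as the number of queues grows.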

The bug I suspect is that for certain numbers of LRUs the rehash algorithm causes many threads to rehash to the same sequence of LRUs, so that the acquisition of LRUs becomes a race condition and is effectively single threaded. One thread at a time gets an LRU and drops out of the race. The symptoms are that updates/inserts/deletes do not scale (i.e. adding another updater does not improve throughput significantly) and BUFWAITS skyrocket. We were seeing both of these symptoms.

With 128 LRUs (the maximum) we can have 100 update server tasks running constantly serving update requests from clients and still get near linear gains from adding up to 40 data sync or cleanup tasks concurrent with normal production load.

An additional benefit of many LRUs is that it helps to keep more buffers flushed to disk by the LRU_MIN_DIRTY/LRU_MAX_DIRTY parameters. Because of the hashing used to select LRUs the LRUs fill unevenly. More LRUs fill less evenly than fewer LRUs meaning fewer dirty buffers at CHECKPOINT time and faster checkpoints. Also more LRUs can take advantage of more CLEANERS for faster checkpoints.

On 29th August 2000 spamwillemsspam@soudal.com (Daniel Willems) wrote:-

Here is a correction for the script which measures bufwaits and readaheads


onstat -p | /usr/xpg4/bin/awk '

/[a-zA-Z]/{ 
 for (i=1; i <= NF ; i++ ) { 
               name[i]=$i ;
 }

 } 
/[0-9]/{ 
 for (i=1; i <= NF ; i++ ) { 
               content[name[i]]=$i ;
 }
 }
END { 
 ixdaRA = content["ixda-RA"];
 idxRA = content["idx-RA"];
 daRA = content["da-RA"];
 RApgsused = content["RA-pgsused"];
 print "Read Utilization (UR): ",((RApgsused /( ixdaRA + idxRA + daRA) )*100),"%";

 bufwaits = content["bufwaits"];
 bufwrits = content["bufwrits"];
 pagreads = content["pagreads"];
 print "BufWaits Ratio(BR): ", ((bufwaits/(pagreads + bufwrits)) *100),"%";
 
 }
'

6.33 What status codes does onstat return?

On 9th November 2001 mdstock@mydas.freeserve.co.uk (Mark D. Stock) wrote:-

As a point of interest, you can check the current mode of IDS by checking the status ($?) after onstat -.

The values returned by 'onstat -' are:

IDS 7 and 9

-1 Offline (-1 = 255)
0 Initialisation
1 Quiescent
2 Recovery
3 Backup
4 Shutdown
5 Online
6 Abort

IDS 8

0 Initialisation
1 Quiescent
2 Micro kernel
3 Recovery
4 Backup
5 Shutdown
6 Online
7 Abort
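
The tables above can be turned into a small lookup, sketched here. The dictionaries simply restate the values listed; the function name and the "Unknown" fallback are assumptions for the example.

```python
# Exit status of 'onstat -' as listed above. -1 is reported by the shell as 255.
IDS_7_9_MODES = {
    255: "Offline", 0: "Initialisation", 1: "Quiescent", 2: "Recovery",
    3: "Backup", 4: "Shutdown", 5: "Online", 6: "Abort",
}
IDS_8_MODES = {
    0: "Initialisation", 1: "Quiescent", 2: "Micro kernel", 3: "Recovery",
    4: "Backup", 5: "Shutdown", 6: "Online", 7: "Abort",
}


def describe_mode(exit_status, ids8=False):
    """Map an 'onstat -' exit status to a mode name."""
    status = exit_status & 0xFF  # normalise -1 to 255
    table = IDS_8_MODES if ids8 else IDS_7_9_MODES
    return table.get(status, "Unknown")


print(describe_mode(5))             # IDS 7 or 9
print(describe_mode(5, ids8=True))  # the same code means something else on IDS 8
```

In a real script the status could come from something like subprocess.run(["onstat", "-"]).returncode, assuming onstat is on the PATH and the Informix environment is set.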

6.34 Why does ontape dislike norewind devices?

On 11th June 1998 davek@summitdata.com (David Kosenko) wrote:-

It is primarily an issue with multiple-volume backups, and with restores. When ontape switches to writing the second volume of an archive set, it first reads the header from the tape to be sure that the tape was actually switched (i.e. it won't write the second volume over the first volume), then "rewinds" the tape. It does the "rewind" by closing the device, which, on a normal (i.e. not no-rewind) driver will cause the tape to rewind. If you use a norewind driver in this case, ontape will read a bit of the tape, corresponding to the header it would write. It would then close the device, reopen it and start writing the new header. With the norewind driver, that header would be one header's worth into the tape, rather than at the start. If you tried to restore from that tape, the restore would fail.

I don't recall offhand if the norewind is an issue with the first tape in a volume or not. It's been a while since I dove into the gritty details.

6.35 Any known issues with shared memory connections?

Yes. In your sqlhosts file, shared memory and TCP connections should not use the same service name.

On 16th June 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

Yes, as I said, there will be a noticeable increase in system calls per second if you do this. With only one CPU VP and one NET VP you may not be able to discern the effect but we run 8 NET VPs and have shared memory listeners in all 28 CPU VPs that we run. In our case, not only is there a noticeable, and quantifiable, increase in system calls when the network and shared memory connection share a service, there is a noticeable slowdown in system responsiveness. This is true even though we have 4 out of 32 CPUs reserved for UNIX services. I'd call that a drawback!

On 16th June 1998 davek@summitdata.com (David Kosenko) wrote:-

In fact, the "servername" acting as a "placeholder" in the shared memory connection entry is used as the basis for a filename, /INFORMIXTMP/.inf.XXXXXX (where XXXXXX is the servicename you specified for the shared memory connection) that is used by clients accessing the server via the shared memory connection (the file contains a count of the number of shared memory poll threads running, among other stuff I forget now). So you can only use that "servicename" for one shared memory connection on a single Unix box.

6.36 Any fast backup methods under IDS 7.3?

On 16th June 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

This was intended for use as a VERY fast method, namely in combination with 7.30's support for multiple mirrors. You can have two mirrors defined for each chunk, take the engine to external backup mode, disable one set of mirrors, put the engine back online, back up the disabled mirrors offline, then reenable the secondary mirror. Alternatively, you can just leave the additional mirrors offline and use them as the backup directly, then disable the other mirror to use as backup and bring the original backup mirrors back online to be caught up. This is great for EXTREMELY large databases as an "archive" only takes about 2 minutes, the time needed to disable one of the mirrors and reenable the other, and if it can be scheduled at a time when there will be no updates there will be no effect on production.

On 2nd April 2001 AHamm@sanderson.net.au (Andrew Hamm) wrote:-

Try changing TAPEBLK and LTAPEBLK to large numbers like 512 and you should see a big jump in the speed of tape activity.

6.37 Is IDS 7.3 more robust when crashes occur?

On 17th Jun 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

Version 7.3 adds another event handler that traps all of those fatal crashes that the event alarm handler could not handle. The problem was that the event alarms were triggered by the thread that crashed, and for certain kinds of crashes it could not trap itself. I beat them up about that for over a year and the result is the monitoring thread in V7.30+ that watches for another thread to go down. In addition, 7.30 will usually not go down in these cases; only the one affected oninit will crash and the engine will stay up (a side effect of the monitor thread is that it can clean things up and determine if it is safe to continue or if the engine needs to be brought down). Also, if you have any version before 7.21 there were many trappable crashes that were simply not caught. The ultimate answer? Upgrade!

On 9th Jan 1999 david@smooth1.co.uk (David Williams) wrote:-

I believe Art is talking about the new ONCONFIG parameter SYSALARMPROGRAM.

6.38 Is raw disk faster than cooked files?

On 22nd Jun 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

....................the safety issue of cooked files is no longer a problem. The big problem with cooked files still is performance. All writes and reads to/from cooked files MUST go through the UNIX buffer cache. This means an additional copy from the server's output buffer to the UNIX buffer page then a synchronous write to disk. This is opposed to a write to a raw file where the server's output buffer is written directly to disk without the intervening copy. This is just faster. Anyone who has written anything that can test this can attest to the difference in speed. Here are my test results:


FileType         Sync?   Times (real/user/system), 2-run avg
---------------  -----   ------------------------------------
Filesystem file    N     14.40/3.70/2.52
                   Y     15.02/3.61/2.63
Cooked disk        N     12.81/3.74/2.24
                   Y     13.42/3.84/2.43
Raw disk           N      9.32/3.67/1.52
                   Y      9.40/3.66/1.44

From this you can clearly see the cost of Cooked files and of synced cooked files. The tests were done with a version of my ul.ec utility modified to optionally open the output file O_SYNC. Cooked disk partition is almost 50% slower than raw disk partition and cooked filesystem files are almost 60% slower. The penalty for O_SYNC is an additional 5% for cooked files and negligible for RAW files (as expected). The test file was 2.85MB written using 4K I/O pages (the default fopen application buffer size) which should simulate Informix performance. The Cooked and Raw disk partition tests were conducted to a singleton 9GB Fast Wide SCSI II drive using the raw and cooked device files corresponding to the same drive.
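
For anyone who wants to redo the arithmetic, the comparison can be sketched as below. The figures are the "real" times from the table above; the dictionary layout and function name are just for the example. Note that the exact percentage depends on which rows you compare (synchronous or not).

```python
# "real" elapsed seconds from the table above, keyed by (file type, O_SYNC?).
REAL_SECONDS = {
    ("filesystem", False): 14.40,
    ("filesystem", True): 15.02,
    ("cooked", False): 12.81,
    ("cooked", True): 13.42,
    ("raw", False): 9.32,
    ("raw", True): 9.40,
}


def pct_slower(slow, fast):
    """How much longer `slow` took than `fast`, as a percentage."""
    return (slow / fast - 1.0) * 100.0


# Compare each cooked variant against raw disk with the same sync setting.
for kind in ("cooked", "filesystem"):
    for sync in (False, True):
        raw = REAL_SECONDS[("raw", sync)]
        label = "O_SYNC" if sync else "no sync"
        print(kind, label, round(pct_slower(REAL_SECONDS[(kind, sync)], raw), 1))
```

The same function applied to the synced and unsynced rows of one file type gives the O_SYNC penalty.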

On 21st January 2002 "Toni Arte" wrote:-

Sun E450, with 9G disk (with ufs file system): 0.65 seconds

Sun E450, chunk on NetApp 840 over 100M ethernet: 0.18 seconds

I also tried to write the same amount of data to the chunk location, and got these results:

9G disk: 1.0 seconds

NFS: 0.9 seconds

But even this does not describe the true difference. The sequential read speed or sequential write speed does not have anything to do with the real performance on Informix use. In my opinion, the read performance has little impact, as we regularly reach 98+ % read hit cache ratio. Also only the thread waiting for that data is blocked.

The true problem lies in the random write performance, i.e. the duration of the (blocking in Informix 7.31) checkpoints. I have seen terrible performance with file systems, where the random write performance is less than 10% of the random write performance of underlying disk device. The rest is file system overhead.

On 12th September 2001 c.bull@videonetworks.com (Colin Bull) wrote:-

The tests were fairly crude, but we had a short window of time with plenty of disks to do some very quick comparisons.

We loaded the data in for a database (about 3.5GB) and ran a typical query that uses a lot of resource. We compared a raidset on 5 x 18 GB disks using raw disk against the same setup using filesystem space, so everything was the same apart from storage. All data and indexes were loaded into datadbs1 on both.

The tests all came out consistently 20-25% faster on raw disks.

I particularly wanted to do this test, as our hardware suppliers and Compaq said we could not use raw disk on these devices when we were specifying and ordering. When we set them up, it looked obvious to us that we could. Our Informix rep was ambivalent about it.

Coincidentally yesterday I was shown a document from an Informix consultant regarding disk layout for OLTP systems.

I quote ---

Raw Or Cooked?

Another hotly contested issue is whether the database should reside in raw, unmounted disk or in "cooked" operating system files. Once again, the official Informix position is that we don't really care what the underlying disk technology is, as long as we can access it, we will use it and use it as well as it can be used.

Operating system vendors say that their performance matches raw disk, but actual end user benchmarking reveals that there is a potential 15-25% performance penalty overall using cooked (block) devices versus using raw (character) devices.

This is caused by the overhead of copying the data a second time to/from the Informix buffers and the operating system buffer cache before/after writing/reading disk. This extra step has a fixed cost that varies based on hardware and no high-speed file system or operating system can improve it.

In addition, there is another potential 5-30% performance penalty if Informix chunks are created from file system files instead of device files due to the additional overhead of file system operations and the non-contiguous nature of file system files. This last is very variable and here more modern, tuneable, database aware, file systems can minimize the hit somewhat but you still bear the cooked device "hit" of the underlying mounted block device and the system cache.

Another potential cost is that most operating systems use a dynamic operating system buffer cache management system. As the database will most likely be heavily used, database disk pages will be cached in the operating system's buffers as well, and may also cause these to grow, leading to a memory bottleneck and potential swapping.

End of quote.

Quite convincing I feel!

6.39 Should online run as root or informix?

On 1st July 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

The ownership of the running oninit processes should be root if the engine is started from the system startup and informix if started by user informix from the command line. It really does not matter much in general except that on different platforms there may be a requirement for the owner to be root in order for processor affinity, core dumping, NOAGE, etc to function properly. Since oninit is an SUID program its real and effective userid may differ if not started as root, which will cause problems core dumping and changing process status such as aging and affinity on certain systems.

6.40 Should online chunks be owned by root or informix?

On 1st July 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

DATABASE CHUNK FILES SHOULD ALWAYS BE OWNED BY USER INFORMIX GROUP INFORMIX AND PERMISSIONS 660! ALWAYS! PERIOD. On certain platforms, DG/UX among them, the volume managers create device files for logical devices on the fly at boot time so that the permissions of these files are reset to root root 666 on reboot. In this case the rc startup script for Informix, which must run after the volume manager script and the filesystems are mounted, should modify the permissions on all active database chunk files before starting the engine. This can be done with a manually maintained chunk list or using the following loop:


for fil in `$INFORMIXDIR/bin/oncheck -pr|awk '/Chunk path/{print $3;}'`
do
	echo "Fixing permissions for $fil"
	chown informix "$fil"
	chgrp informix "$fil"
	chmod 660 "$fil"
done

This will work for all versions of online (substituting tbcheck for 5.x) except for 5.06 which fails if you have an empty chunk slot in the chunk table page from having deleted a chunk.

6.41 Can archives hang the system waiting for a tape?

On 9th Jan 1999 david@smooth1.co.uk (David Williams) wrote:-

In the next section I believe Art is talking about Online 5.x archives, which wait for a tape change and end up holding a latch which stops checkpoints from completing and hence can hang the system.

On 1st Jul 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

In 5.xx the description that Jonathan Leffler and Scott Black give is just about right, and indeed if the tape needs changing or is hung the checkpoint will wait for the tape change to complete.

However, this was changed in 7.xx. Actually, the physical log pages are copied into a set of temp tables, one per dbspace being backed up, at each checkpoint, before the physical log is cleared. When the archive thread completes a dbspace it copies all of the preimage pages from the corresponding temp table out to the tape. This prevents the archive from stopping the checkpoint except momentarily if a tape change is needed, and is also why IDS 7.x uses DBSPACETEMP space during archives. This change is one main reason that 7.xx archives run much faster than 5.xx archives did.

6.42 Is it worth binding Online VPs myself?

On 18th Aug 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

There can be great benefit. Sometimes, on some systems (especially NUMA architecture systems), there are performance gains above those achieved by the ONINIT affinity parameters to be had by affining the VPs yourself. I use a set of scripts that run onstat -g glo to a file and scan the report for the pids of VPs one VP class at a time, affining each class, round robin, across the available processors. Doing this and switching back to AIO VPs we achieved 50% better performance than with ONINIT affinity and KAIO threads, which Informix reports is 50% faster than ONINIT affinity and AIO VPs. One trick is to affine AIO VPs to the CPUs in the opposite order to the CPU VPs (i.e. affine CPU VP#1 to CPU#1 but AIO VP#1 to CPU #12 etc). Since the lowest numbered VPs of each class do most of the work this keeps the busiest CPU VPs from interfering with the busiest AIO VPs.
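
The assignment order described above can be sketched as a planning function. This only computes which VP would go to which processor; it does not bind anything, and the function name, the data layout and the 12-CPU example are assumptions for illustration. The actual binding command is platform specific and is not shown.

```python
def affinity_plan(num_cpu_vps, num_aio_vps, cpus):
    """Return {(vp_class, vp_number): cpu} using round-robin assignment.

    CPU VPs walk the processor list forwards, AIO VPs walk it backwards,
    so the busiest (lowest numbered) VPs of each class land far apart.
    """
    if not cpus:
        raise ValueError("need at least one processor")
    plan = {}
    for i in range(num_cpu_vps):
        plan[("CPU", i + 1)] = cpus[i % len(cpus)]
    backwards = list(reversed(cpus))
    for i in range(num_aio_vps):
        plan[("AIO", i + 1)] = backwards[i % len(cpus)]
    return plan


# Hypothetical 12-processor box with 4 CPU VPs and 4 AIO VPs.
for vp, cpu in sorted(affinity_plan(4, 4, list(range(1, 13))).items()):
    print(vp, "->", cpu)
```

The VP process ids needed to apply such a plan would come from the onstat -g glo report mentioned above.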

6.43 Subqueries are not working under IDS 7.3 - help?

On 9th Jan 1999 david@smooth1.co.uk (David Williams) wrote:-

IDS 7.3 includes a new feature called subquery flattening where some subqueries are converted into joins instead for performance reasons. This functionality contains some bugs so Informix introduced an environment variable NO_SUBQF. Set this to 1 before starting Online to disable the subquery flattening feature.

6.44 What is a .FCx release of Online?

On 21st Oct 1998 com@netinfo-moldova.com (Octav Chiriac) wrote:-

The IDS 7.3 SQL documentation (I don't remember whether it is the Guide or the Reference) says:

dbinfo('version','os') The operating system identifier within the version string:

T = Windows NT
U = UNIX 32-bit running on a 32-bit operating system
H = UNIX 32-bit running on a 64-bit operating system
F = UNIX 64-bit running on a 64-bit operating system

On 30th Oct 1998 tgirsch@iname.com (Thomas J. Girsch) wrote:-

The second letter goes:

A - Alpha Test version
B - Beta Test version
C - Customer Release (first)

Subsequent releases increment the letter to D, E, etc.
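
Putting the two letters together, a version string can be decoded roughly as follows. This is a sketch: the lookup tables restate the lists above, while the function name, the regular expression and the sample strings "7.30.UC1" and "9.21.FD3" are assumptions made for the example.

```python
import re

OS_CODES = {
    "T": "Windows NT",
    "U": "UNIX 32-bit on a 32-bit OS",
    "H": "UNIX 32-bit on a 64-bit OS",
    "F": "UNIX 64-bit on a 64-bit OS",
}
RELEASE_CODES = {
    "A": "Alpha test",
    "B": "Beta test",
    "C": "Customer release (first)",
}


def decode_version(version):
    """Decode the two letters that follow the second dot, e.g. the 'UC' in 7.30.UC1."""
    match = re.search(r"\.([A-Z])([A-Z])", version)
    if match is None:
        raise ValueError("no two-letter code found in %r" % version)
    os_letter, rel_letter = match.groups()
    if rel_letter in RELEASE_CODES:
        release = RELEASE_CODES[rel_letter]
    else:
        # D, E, ... are later customer releases according to the text above.
        release = "Customer release (subsequent)"
    return {"os": OS_CODES.get(os_letter, "Unknown"), "release": release}


print(decode_version("7.30.UC1"))
```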

6.45 How many semaphores does online need?

On 28th Oct 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

The actual number of semaphores is equal to the number of possible shared memory connections and is allocated in semaphore groups (on my system the groups each contain 100 semaphores; the actual number per group depends on the kernel configuration). The engine will allocate as many maximum-size groups as needed, plus a smaller group if needed, to complete the needs of each listening VP separately. For example on my development machine I have:

NETTYPE ipcshm,8,250,CPU

That is 8 listeners with 250 connections each for a total of 2000 possible connections. There are 24 semaphore groups allocated on the system (see ipcs -s): each of the 8 listeners has two 100-semaphore groups and one 50-semaphore group to make up the 250 needed.
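
The group arithmetic in this example can be sketched as a small function. The function name and argument names are made up; the group size of 100 is the kernel setting quoted for this particular machine.

```python
def semaphore_groups(listeners, conns_per_listener, max_group_size):
    """List the semaphore group sizes allocated, one listener at a time.

    Each listener gets as many full-size groups as fit, plus one smaller
    group for any remainder.
    """
    full, rest = divmod(conns_per_listener, max_group_size)
    per_listener = [max_group_size] * full + ([rest] if rest else [])
    return per_listener * listeners


# NETTYPE ipcshm,8,250,CPU with 100 semaphores per group.
groups = semaphore_groups(8, 250, 100)
print(len(groups), "groups,", sum(groups), "semaphores")
```

With these inputs the function reproduces the 24 groups and 2000 semaphores described above.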

You can configure the maximum available semaphores and the maximum size of each group in the kernel configuration files, often /etc/system but that is system dependent. Be sure to allow for additional semaphores beyond your Informix requirements for other applications when you configure the kernel. It is best to set these values to AT LEAST the values recommended in the release notes for your platform even if you need fewer semaphores, but don't be shy to increase these values beyond the release notes numbers if you need them.

6.46 Does IDS support chunks more than 2Gb in size?

On 28th Oct 1998 june_t@hotmail.com (June Tong) wrote:-

IDS *CANNOT* handle chunks over 2GB in size, on a 2K pagesize machine, (4GB on a 4K pagesize platform) and to tell people that it will is irresponsible. Your customer is asking for trouble, and is going to end up with corruption that is completely irretrievable. Everything will be "almost fine" until they write to the next chunk (e.g. writing to chunk 4 if your 15GB chunk is chunk 3), and the new chunk REPLACES THE PAGES past 2GB in your big chunk, or your big chunk overwrites the pages in the next chunk. And then, your only option will be to restore from an archive taken BEFORE this overwriting took place, and THEN export all your data to ASCII format and re-create your entire instance with 2GB chunks.

(sigh, something else for the FAQ)

If IDS is letting you create chunks over 2GB (or 4GB on a 4K pagesize platform), then this is a MAJOR bug that someone out there ought to report and get fixed ASAP.
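
A quick sanity check for planned chunk sizes can be sketched as below. It assumes the limit is a fixed number of pages per chunk, which is only an inference from the two figures quoted above (2GB at 2K pages and 4GB at 4K pages both come to 1,048,576 pages); the names are invented for the example.

```python
# 2GB / 2K pages = 1,048,576 pages; 4GB / 4K pages gives the same count.
# Treating this as a fixed per-chunk page limit is an inference, not a quoted fact.
MAX_PAGES_PER_CHUNK = (2 * 1024**3) // (2 * 1024)


def max_chunk_bytes(page_size):
    """Largest chunk size in bytes for a given page size, under the assumption above."""
    return page_size * MAX_PAGES_PER_CHUNK


def chunk_is_safe(chunk_bytes, page_size):
    """True if the planned chunk does not exceed the limit."""
    return chunk_bytes <= max_chunk_bytes(page_size)


# The 15GB chunk from the example above, on a 2K pagesize machine.
print(chunk_is_safe(15 * 1024**3, 2048))
```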

On 23rd October 2000 obnoxio@hotmail.com (Obnoxio The Clown) wrote:-

7.x and 9.2x, the chunk size is still 2GB. 8.3x has a platform dependent (but much larger) chunk size.

Rumour has it that 9.30 will have a bigger chunk size. Allegedly.

6.47 Why am I getting lots of busy waits?

On 26th Jan 1999 rdavis@rmi.net (Bob) wrote:-

This is probably not a big deal.

When a cpu vp has no work to do, it will go into a busy wait so that UNIX does not swap the process out. After roughly 1000 spins, the vp will check again. At this point, if there is still no work to be done, a semop is performed and UNIX swaps the cpu vp process out to the UNIX queue.

If your system is relatively inactive, this is nothing to worry about as your primary cpu vp is handling most of the work. If your system is loaded, however, this could be indicative of a lower level problem.

6.48 Why do checkpoints sometimes hang?

On 17th Mar 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

The checkpoint starts by issuing a block that stops new transactions beginning until the checkpoint completes, then it waits for all threads in critical section to free themselves and block waiting for the checkpoint thread. Only when all threads requiring critical section latches are blocked can the checkpoint continue. Here you are waiting for a thread in a, probably HUGE, rollback which is a critical section itself, before the checkpoint can start.

6.49 What is this mt.c mentioned in online crashes?

On 17th March 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

I think that mt.c is the source for the mainline of the monitor thread in the misc VP that watches to make sure the other VPs stay online.

6.50 Why does onstat sometimes fail with a changing data structure error?

On 15th July 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

Onstat reads dynamic data structures in shared memory without latching them to avoid affecting online operations. Since many of these structures involve links and pointers and are in a constant state of flux, it is possible that while following a set of links onstat may find itself in a loop or a dead-end. In this case it aborts gracefully, having given up any hope of continuing, with the very true message: "Changing data structure forced command termination." Running the onstat report again usually works. Also this was more common in the earlier 7.2x versions, so upgrading may alleviate the problem somewhat.

6.51 What are the restrictions with Workgroup Server?

On 29th July 1999 murray@quanta.co.nz (Murray Wood) wrote:-

In the Informix (Workgroup Server) box I have received there is a small piece of paper with the following restrictions listed:

CPU: No more than 4 Intel 32-bit or 2 RISC 32-bit CPUs
Components: You are restricted from using ... PDQ, data partitioning, parallel backup, parallel restore, parallel load, parallel unload, optical disks, advanced replication, VLM (more than 2Gb RAM), Microsoft Cluster Server.
Product Options: These options are not available: UDO, ADS, EPO

6.52 How can I tell how well Online buffering is working per table?

On 6th August 1999 martyn.hodgson@eaglestar.co.uk (Martyn Hodgson) wrote:-

I've been looking at trying to analyse our IDS buffer effectiveness, in terms of page reuse by object. The query:


select a.dbsname, a.tabname, round(sum(c.reusecnt) / count(*),2) as avg_reuse, 
count(*) as num_buffs
from systabnames a, sysptnext b, sysbufhdr c
where c.pagenum between b.pe_phys and 
b.pe_phys + b.pe_size - 1
and   a.partnum = b.pe_partnum
group by 1,2
order by 4,1,2

seems to give the number of pages of each object in the buffer, and the average number of times each buffer is reused. onstat -p simply gives this percentage for the whole of the buffer, not by object.

6.53 What can I do if Online ends up hung?

On 30th September 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

Sounds like one or more VPs are hung.

Time to trash the engine and restart. Check onstat -R and onstat -l to make sure that all data and log buffers are safe on disk (if not, try onmode -c to force a checkpoint, but that is likely to hang as well). Run onstat -g glo and find the PID of the master oninit process (it will be VP #1, a CPU VP). When you have found it, do kill -TERM followed by kill -PIPE and the master oninit will exit. After a minute or two the admin VP will notice that the master VP is deceased and initiate a forced shutdown.

Now restart the engine. Fast recovery should work fine even if you could not force the checkpoint.

Obviously this is a last resort thing. OH, I have not tried this on 7.3x with the "stay online" feature enabled. If you have it enabled the engine may stay up after you do this instead of crashing. If it was the master VP that was hung you may now be OK to try a normal shutdown, otherwise you may have to kill -TERM, kill -PIPE all of the VPs.

6.54 How do I fix permission on Informix binaries?

On 18th August 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

Here are my scripts. The shell script must be run by root and it will in turn invoke the awk script. Put both scripts in your path and make them executable:

fixperms.sh:


cd $INFORMIXDIR
for lists in etc/*files; do
	fixperms.awk $lists | sh
done

fixperms.awk:

#! /usr/bin/awk -f
# Adjust above for location of your favorite version of awk

BEGIN {
	notyet=1;
}
notyet==1 {
	if ($1 == ".") {
		notyet=0;
	} else {
		next;
	}
}
{
	file=$1;
	owner=$2;
	group=$3;
	perms=$4;

	if (length( file ) > 0) {
		printf "chown %s %s\n", owner, file;
		printf "chgrp %s %s\n", group, file;
		printf "chmod %s %s\n", perms, file;
	}
}

6.55 How do I detect network errors with Online?

On 17th August 1999 y_dovgart@tci.ukrtel.net (Yuri Dovgart) wrote:-

Another problem can be in your network: monitor your connections through 'netstat -e' and look at the 'Errors' field. At the Informix level use 'onstat -g ntd' to see accepted/rejected entries.

6.56 What if I get a mt_notifyvp time out error with Online?

On 17th December 1998 kagel@bloomberg.net (Art S. Kagel) wrote:-

There are several bugs causing an mt_notifyvp time out. This indicates that the admin VP, which monitors the other VPs, has determined that one of the VPs has not responded to its current job or heartbeated its continued busy status in too long, so the engine assumes that that VP is hung or crashed and tries to shut down the engine. Sometimes that cannot happen because things are just too badly hosed. Other times the engine comes down but the master oninit is the one hung, and since it is also the one that normally removes shared memory, the memory stays around and onstat can query it but, as the onstat -d messages indicated, it cannot get a message from the oninit processes. Running onmode -ky will usually bring down any remaining oninits and shared memory.

6.57 How do I tune buffer waits?

On 14th October 1999 kagel@bloomberg.net (Art S. Kagel) wrote:-

BUFWAITS can have three causes:

1) Severe buffer starvation (i.e. need lots more buffers)
2) Excessive read ahead, or RA_PAGES and RA_THRESHOLD too large or close together
3) Too few LRUS (most common cause!)

To determine if your BUFWAITS is reasonable given your load (some waits are unavoidable as many processes may just need the same buffers at the same time and so will have to wait) calculate the BUFWAIT Ratio (BR):

BR = (bufwaits / (dskreads + bufwrits)) * 100.00

A reasonable BR value is under 10% with the ideal at 7% or less. Anything over 10% is usually death.
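
The ratio and the thresholds above can be sketched as follows. The function names and the sample counters are invented for the example, and treating exactly 10% as "too high" is a choice made here since the text only speaks of under and over 10%.

```python
def bufwait_ratio(bufwaits, dskreads, bufwrits):
    """BR = (bufwaits / (dskreads + bufwrits)) * 100, as defined above."""
    denom = dskreads + bufwrits
    if denom == 0:
        return 0.0  # no I/O recorded yet
    return 100.0 * bufwaits / denom


def assess(br):
    """Classify a BR value using the thresholds quoted in the text."""
    if br <= 7.0:
        return "ideal"
    if br < 10.0:
        return "reasonable"
    return "too high"


# Hypothetical counters read from onstat -p.
br = bufwait_ratio(50, 600, 400)
print(round(br, 2), assess(br))
```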