FriarTuck
7th December 2004, 15:46
Hello All,

I've got some questions I'd like to ask, ones that seem almost too stupid to ask. I come from the school of thought that UNIX systems are unlike many Windows incarnations in that I can have a *NIX server up well over 500 days without bouncing the server.

Now, the server might be online for > 500 days, but there is considerable insistence that we bounce the Baan software (rc.stop/start) weekly. What is the opinion of folks here? Are weekly bounces of the Baan software necessary for performance? If so, why? Are there that many memory leaks?

Cheers, and thank you!
FT

dave_23
7th December 2004, 16:02
Remember, Baan being "up" simply means that it has allocated some shared memory at the OS level. I've never heard of a problem with shared memory due to a leak in Baan.

If there were a memory leak, it would be in the bshell or driver processes, which only exist for the duration of a user's login. (If you set ds_timeout_detect, even ghost ones will clean themselves up.)

If you have processes that run all the time, like the job daemon or openworld, then maybe bouncing them periodically is a good idea, but you don't have to take Baan or the server down to do that.

I don't know if you're just talking about Baan or the database as well, but for Oracle I prefer to keep it running as long as possible. I find that my system runs considerably slower for a few days after I bounce Oracle.

Dave

FriarTuck
7th December 2004, 16:24
Hi Dave,

I would wholeheartedly agree with you about Oracle performance after a shutdown. The database takes at least a few days to stabilize when it's brought up cold.

I must admit that in my haste to post I wasn't too clear putting my thoughts to screen. I should say that I refer to the Baan software itself, as opposed to the client apps such as bshell6.1 and api6.1.

I think I've convinced myself that there isn't even a slow leak in any software since memory is freed up at the end of the day.

Indicators on the operating system (via glance and sar) simply show a system that is being used. The database doesn't appear to be too terribly taxed either. The word from the Baan administrator is that people are seeing a general slowdown. There is concern that I've not bounced the Baan software (i.e. $BSE/etc/rc.stop) for my weekly backup.

Of course, Oracle is backed up in hot backup...

I'm bouncing back and forth with so many things today, what I've typed is probably no use as a clarifying statement.

Cheers,
FT

Francesco
7th December 2004, 16:26
My final verdict is no. There is no _valid_ reason to bounce Baan on a UNIX box.

There are a few issues that are "solved" by the weekly bounce that Baan recommends. Mainly this concerns the IPC queues, hanging bshells, and shared memory corruption (Dave is right, no leaks, but it does get nasty sometimes).

My advice is to tackle these issues rather than following Baan's unfounded recommendation. After all, curing the disease is better than curing the symptoms. :)

dave_23
7th December 2004, 16:57
Francesco -

Baan recommends it??

Dave

FriarTuck
7th December 2004, 17:18
My final verdict is no. There is no _valid_ reason to bounce Baan on a UNIX box.

Go, Unix! :cool:

There are a few issues that are "solved" by the weekly bounce that Baan recommends. Mainly this concerns the IPC queues, hanging bshells, and shared memory corruption (Dave is right, no leaks, but it does get nasty sometimes).

(my emphasis above)
I have noticed that after I've executed $BSE/etc/rc.stop, running ipcs -q shows an almighty gaggle of queues still held. Now, I could simply run through and ipcrm each one, but (wanting to take your advice) I've gone on a Google mission to see how I can fix the problem.

My advice is to tackle these issues rather than following Baan's unfounded recommendation. After all, curing the disease is better than curing the symptoms. :)

Maybe this answer is right in front of me, but how can I determine/verify shared memory corruption?

Hung bshells are nuked via kill -TERM on bshell pid (or -9 on the Oracle connector). :rolleyes:

You's guys are the best!
FT

Francesco
7th December 2004, 20:30
I have noticed that after I've executed $BSE/etc/rc.stop, running ipcs -q shows an almighty gaggle of queues still held. Now, I could simply run through and ipcrm each one, but (wanting to take your advice) I've gone on a Google mission to see how I can fix the problem.

It's your lucky day.
Credit for this one goes to Mike King.

#!/bin/ksh
# **************************************************************************
# * File : baan_ipcrm.ksh
# * Usage : baan_ipcrm.ksh
# * Purpose : This script will remove all unused message queues created
# * by Baan processes. It does so by locating all message
# * queues created by users in group bsp. It then checks
# * whether the last send and last receive processes still
# * exist on the server. If they are both gone, it should be
# * a good bet that the queue is unused.
# * Author : Michael King
# * Date : July 2001
# **************************************************************************

full_usage()
{
    echo 'USAGE:'
    echo ' baan_ipcrm.ksh <no parameters>'
}

if [ $# -eq 1 ] ;then
    if [ "$1" = '-u' ] ;then
        full_usage
        exit 1
    fi
fi

# print all queues, filter for those in group bsp, then grab queue #, last sender, last receiver
queues=$(ipcs -qp |tr -s ' ' ' '| nawk '$6 ~ /bsp/ {printf("%s|%s|%s\n",$2,$7,$8)}')

for i in $queues
do
    q_num=$(echo $i | cut -f1 -d'|')
    sen=$(echo $i | cut -f2 -d'|')
    rec=$(echo $i | cut -f3 -d'|')
    # the queue is a removal candidate only if BOTH the last sender
    # and the last receiver processes are gone
    if [ -z "$(ps -f -o pid -p ${sen}|grep -v PID)" -a -z "$(ps -f -o pid -p ${rec}|grep -v PID)" ] ;then
        # echo we can kill queue $q_num
        # for testing purposes, simply echo the queues to be killed;
        # swap the two lines below once the list looks right
        #ipcrm -q $q_num
        echo ipcrm -q $q_num
    #else
    #    echo Queue $q_num is still in use
    fi
done
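
Before removing anything, it may help to count how many queues each owner holds, once before and once after rc.stop. What follows is only a rough sketch, not part of Mike's script. It assumes that ipcs -q prints data rows starting with "q" and that OWNER is the fifth column (the same T ID KEY MODE OWNER GROUP layout the script above relies on; other platforms print different columns). The name count_queues_by_owner is made up for illustration. It reads stdin, so it can be tried on saved ipcs output first.

```shell
#!/bin/sh
# Hypothetical helper: count message queues per owner from `ipcs -q` output on stdin.
# Assumption: data rows start with "q" and OWNER is the 5th whitespace-separated field.
count_queues_by_owner()
{
    awk '$1 == "q" { n[$5]++ } END { for (o in n) printf("%s %d\n", o, n[o]) }' | sort
}

# Usage (run once before and once after rc.stop, then compare):
#   ipcs -q | count_queues_by_owner
```

If the count for the Baan owner stays high after rc.stop, those are the leftovers the cleanup script is meant to find.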


Maybe this answer is right in front of me, but how can I determine/verify shared memory corruption?

I have a thread out here somewhere on memory corruption. It shows the pains I went through to identify it as a cause and the solution. Without any indicators you can safely assume that you do not have corruption.

Hung bshells are nuked via kill -TERM on bshell pid (or -9 on the Oracle connector). :rolleyes:
The newer porting sets are actually doing a pretty good job in managing (lost) connectivity and even licenses (hung licenses, another invalid reason for bouncing Baan come to think of it).
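
For older porting sets where hung bshells still need a manual push, the kill -TERM approach quoted above can be wrapped so that -9 is only used when TERM is ignored. This is a minimal sketch under assumptions: stop_pid is a hypothetical helper name, the 10 second default grace period is arbitrary, and finding the right bshell pid is left to you.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to exit with TERM, escalate to KILL only if it lingers.
# Usage: stop_pid <pid> [grace_seconds]
stop_pid()
{
    pid=$1
    grace=${2:-10}

    # If TERM cannot be delivered, the process is already gone (or not ours to signal).
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after TERM
        sleep 1
        grace=$((grace - 1))
    done

    kill -KILL "$pid" 2>/dev/null                # last resort, same as kill -9
    return 0
}
```

Giving TERM a few seconds first lets the process detach from its queues and shared memory on its own, which should leave less behind for ipcrm to clean up.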

Francesco
7th December 2004, 20:32
Francesco -

Baan recommends it??

Dave

They certainly have in the past. If they changed their stance, it's news to me.