r_nagu
23rd August 2003, 00:47
Hi,
Some of the users in our office get disconnected from Baan when they are in the middle of running a session. Sometimes they get the error message "Detected database server termination" and sometimes they don't get any error message at all.
When I looked in the log.bshell file, I saw the following error message just after a user got kicked out of baan.
Start error message
2003-08-22[17:01:45]:E:stagerdj: ******* S T A R T of Error message *******
2003-08-22[17:01:45]:E:stagerdj: Log message called from /port.6.1c.03.01/vobs/tt/lib/al_1/al_sig.c: #182 keyword: CORE DUMPED
2003-08-22[17:01:45]:E:stagerdj: Pid 22318 Uid 1889 Euid 1889 Gid 101 Egid 101
End error message
When I looked in their home directory there was a core file created.
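To see how often this happens and to whom, the log can be filtered for the CORE DUMPED keyword. A rough sketch, assuming every entry uses the date[time]:level:user: prefix seen in the excerpt above; the sample lines are copied from that excerpt, and in practice the input would be log.bshell itself.

```shell
# Sample lines copied from the excerpt above; normally read log.bshell instead.
log_sample='2003-08-22[17:01:45]:E:stagerdj: ******* S T A R T of Error message *******
2003-08-22[17:01:45]:E:stagerdj: Log message called from /port.6.1c.03.01/vobs/tt/lib/al_1/al_sig.c: #182 keyword: CORE DUMPED
2003-08-22[17:01:45]:E:stagerdj: Pid 22318 Uid 1889 Euid 1889 Gid 101 Egid 101'

# Keep only the CORE DUMPED entries and reduce each to "timestamp user".
# Assumption: the prefix layout is date[time]:level:user: on every line.
core_dumps=$(printf '%s\n' "$log_sample" |
    grep 'keyword: CORE DUMPED' |
    sed 's/^\([^]]*]\):[^:]*:\([^:]*\):.*/\1 \2/')
echo "$core_dumps"
```

The Pid line that follows each entry in the log can then be matched against the core file's owner and timestamp.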
Any ideas what's going on?
Appreciate your help.
Thanks,
NS
dave_23
23rd August 2003, 01:16
Check the data and stack segments in the output of your ulimit -a command. Maybe post 'em so we can take a look.
Dave
r_nagu
23rd August 2003, 03:08
Dave,
Here is the output of ulimit -a command.
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 1024
vmemory(kbytes) unlimited
Is the stack(kbytes) option supposed to be unlimited?
Appreciate your help.
Thanks,
NS
dave_23
23rd August 2003, 07:05
Not unlimited, but higher than 8192 KB (8 MB). Try something like 32768 or 65536 KB (32 MB or 64 MB).
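A minimal sketch of raising the soft stack limit in the shell that starts the Baan process, taking the smaller suggested value as 32768 kbytes (an assumption about the units). The new limit only applies to processes started from this shell; how bshell is launched on your server is not stated in this thread, so where this belongs (login profile, startup script) is something to check locally.

```shell
want=32768                       # target in kbytes (assumed reading of the smaller suggestion)
soft=$(ulimit -S -s)             # current soft stack limit
hard=$(ulimit -H -s)             # ceiling the soft limit may be raised to
echo "before: soft=$soft hard=$hard"

# Only raise the limit; never lower one that is already high enough.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$want" ]; then
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$want" ]; then
        ulimit -S -s "$want"
    else
        echo "hard limit $hard is below $want; root has to raise it first"
    fi
fi
echo "after: soft=$(ulimit -S -s)"
```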
Dave
OmeLuuk
25th August 2003, 10:30
My suggestion would be: Go to Baan Support with this issue. The application should not core-dump.
Alick Wilson
27th August 2003, 16:53
The causes of core dumps can be hard to find, so this is not a solution, just something to try before going to BaaN (the first thing they may suggest is to update to the latest version of ostpstandard and ostpapihand). In Unix, run the command:
strings core | grep -v CORE | more
I have used this successfully only once to discover it was unix file permissions in a subdirectory of $BSE/api. If nothing looks obvious in the first few pages, listen to OmeLuuk and go to BaaN.
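If the strings output does point at a path under $BSE/api, a quick permissions sweep may help. This is only a sketch: the function name list_restricted_dirs is made up for illustration, and the mode 555 threshold is an assumption, since which permissions Baan actually needs there is not stated here.

```shell
# list_restricted_dirs DIR   (hypothetical helper)
# Prints directories below DIR that lack read or search permission
# for user, group, or other, i.e. anything short of mode 555.
list_restricted_dirs() {
    find "$1" -type d ! -perm -555 -print
}

# Only run against $BSE/api when BSE is set and the directory exists.
if [ -n "${BSE:-}" ] && [ -d "$BSE/api" ]; then
    list_restricted_dirs "$BSE/api"
fi
```

Anything it prints is worth comparing against a working installation with ls -ld.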
r_nagu
27th August 2003, 23:53
Hi,
Thanks for the feedback. We did a reboot and everything seems to be working fine. But, just today we had a new problem. I am not sure if this is related to the one we had earlier.
Today, when users were trying to open sessions (most of them), they got an error message saying "search domain tctano failed" and could not open the session at all. The domain in the error message kept changing from session to session, but the domains all belonged to package tc. I looked at the domain definition file (dtc.pd) and nothing had changed. We could not figure out what was going on. Finally, we did a "create runtime DD" for the domains of package tc and everything started working again. The strange thing is that we haven't touched the domain definitions for package tc in a long time.
Any ideas what might have happened?
As a side note, we don't have Baan support anymore, so we can't get help from them.
Appreciate your help.
Thanks,
NS
jpvdgiessen
11th September 2003, 21:21
I found the following about debugging a core:
Generate a core dump using qptool6.1.
Execution:
qptool6.1 q select * from tiitm001 c617
When the output appears, press <Ctrl-\> and the process will stop with a core dump.
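Ctrl-\ sends SIGQUIT with the default terminal settings, but a core file is only written when the core size limit of the shell is not zero. A small check to run first in the same shell:

```shell
# Read the soft core-file size limit of this shell.
core_limit=$(ulimit -S -c)
if [ "$core_limit" = "0" ]; then
    # No core will be written; raising the limit needs a non-zero hard limit.
    echo "core dumps are disabled here; try 'ulimit -c unlimited' if the hard limit allows"
else
    echo "core size limit: $core_limit"
fi
```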
First it is necessary to see which binary is responsible for the core dump.
This can be done with the next command:
strings core | pg
<HP-UX
daytona
B.10.20
9000/889
216894362
qptool6.1
UUUUU
UUUUU
"$Revision: 92453-07 linker linker crt0.o A.10.44 951031 $
/usr/lib/dld.sl
This will give a long output, but somewhere around line 6 you get the info that is needed; in this example it is qptool6.1. Now the right function within this binary has to be found, which can be done with a debugger.
Several debuggers exist, and which one is available depends on the system. Possible debuggers are xdb, dbx, adb, gdb.
xdb $BSE/bin/<binary> core
Start this in the directory where the core is. The debugger will start up. Now a trace has to be made. This can be done with the t option or bt option.
0xc0149fe8 _writev BL _writev,r2
File: unknown Procedure: write + 0x00000040 Line: unknown
Procedures: 0
Files: 0
>t
0 write + 0x00000040 (0, 0, 0, 0)
1 TMEM + 0x3fe8505c (Address not found (UE302)
>
The trace lines (underlined in the original) contain the function names within the object. This information can be used to analyze the problem by looking in the source code of the binary, which is something only Baan Tech or PEG Tools can do.
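Since the available debugger differs per system, here is a small sketch that looks for one of the four named above and prints the command to run. The helper name pick_debugger and the search order are arbitrary choices for illustration; all four are normally started as debugger, binary, core, but check the man page on your platform.

```shell
# pick_debugger (hypothetical helper): print the first debugger found on PATH.
# Note: on a modern workstation "adb" may be an unrelated Android tool,
# so it is tried last.
pick_debugger() {
    for dbg in xdb dbx gdb adb; do
        if command -v "$dbg" >/dev/null 2>&1; then
            echo "$dbg"
            return 0
        fi
    done
    return 1
}

binary=qptool6.1    # name taken from the strings output in the example
if dbg=$(pick_debugger); then
    echo "run in the directory holding the core: $dbg \$BSE/bin/$binary core"
else
    echo "no debugger (xdb, dbx, gdb, adb) found on PATH"
fi
```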
Hope it is helpful.
r_nagu
11th September 2003, 22:38
Jan,
Thanks for your feedback. I will give it a try.
Thanks,
NS