Server timeout on BW client [Archive]

wricks

5th June 2003, 17:24

Hi all,

We have just completed a move from one server to another server.

We moved from HP-UX 10.20 with Oracle 7.3.4 and Baan 4c4 to HP-UX 11 with Oracle 8.1.7 and Baan 4c4. We also upgraded our standard program and porting sets to the latest versions.

I have been working with Baan Support on this problem, but so far there are no answers. Almost all users have this happen to them, across all functions in the company. Any session can time out on the user at random.

The problem appears to be that the server stops communicating with the client. It appears mostly when you start a sub-session or hold down the scroll-down key in a session.

But mostly it is a random occurrence. We have tried many things but the problem persists. If anyone has had this problem or knows of something to look at or try, please let me know.

Thanks in advance for your help.
:)

James

5th June 2003, 17:30

Hi Steve,

Please post the contents of your db_resource file. A setting may need tweaking in there.

Also, have you checked the log files for any errors?

James

wricks

5th June 2003, 17:40

The following is my db_resource:

dbsinit:01
#lock_retry:2*500,3*1000,5*5000
rds_full:2
ssts_set_rows:2
lock_retry:0
baan_oracle_prefetch:0
ora_timeout:{120,60,60,60,60}
nls_lang:american_america.we8iso8859p1
nls_sort:binary
ora_init:0101000
ora_max_array_fetch:2
ora_max_array_insert:1
oracle_client_home:/apps/baan4c/bse/lib/ora/oracle_home
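For comparing settings between the old and new servers, it can help to load a file like the one above into a dictionary first. The following is only a minimal sketch: it assumes every active line has the form name:value and that a leading # marks a commented-out line, which is what the posted file suggests but is not taken from Baan documentation. The function name parse_db_resource is made up for illustration.

```python
# Sketch: parse db_resource-style "name:value" lines into a dict.
# Assumption (not from Baan documentation): '#' starts a commented-out line
# and the first ':' separates the setting name from its value.

def parse_db_resource(text):
    """Return a dict of active settings, skipping blanks and '#' lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no ':' on this line, so not a name:value pair
        settings[name.strip()] = value.strip()
    return settings


# The file contents exactly as posted in the thread.
SAMPLE = """\
dbsinit:01
#lock_retry:2*500,3*1000,5*5000
rds_full:2
ssts_set_rows:2
lock_retry:0
baan_oracle_prefetch:0
ora_timeout:{120,60,60,60,60}
nls_lang:american_america.we8iso8859p1
nls_sort:binary
ora_init:0101000
ora_max_array_fetch:2
ora_max_array_insert:1
oracle_client_home:/apps/baan4c/bse/lib/ora/oracle_home
"""

if __name__ == "__main__":
    for name, value in sorted(parse_db_resource(SAMPLE).items()):
        print(f"{name} = {value}")
```

Running the same function over the db_resource from the old server and comparing the two dictionaries would show which settings changed during the move.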

wricks

5th June 2003, 20:24

We currently run Baan and Oracle on MC/ServiceGuard cluster software.

We used the virtual IP address of the package to point the BW client to the host.

When I enter the real (physical) IP address of the server instead, the problem goes away. Of course, this is a problem for failover. HP is looking into this problem.

If anybody has experience setting up MC/ServiceGuard and knows of this problem, please let me know.

victor_cleto

5th June 2003, 20:33

Weird, we also have an MC/SG (2 nodes, Oracle on one and Baan on the other - bad setup) and we also did an Oracle 7.3.4 to 8.1.7 and HP-UX 10.2 to 11.0 migration without problems.
(BaanIVc4 + 6.1c.06.02 + SP6 + a lot of patches on top)

It is definitely not a Baan problem, more likely an MC/SG setup problem; MC/SG behaves differently regarding certain parameters on HP11 than on HP10.2, maybe this was overlooked during the migration...

I'm moving this to the OS thread.

Dikkie Dik

6th June 2003, 10:12

I expect I'm kicking in an open door here, but have you analysed your tables after the migration?

Dick

James

6th June 2003, 12:03

That's strange.

You are correct in using the virtual (floating) IP address of the Baan Serviceguard package. This is completely normal and essential for Serviceguard to function.

When the problem occurs, have you tried to ping or telnet to this Virtual IP address from the client?

Also, try failing the Baan package over to the other node. Would be interesting to see if the problem then still occurs.

Are there any Services running inside Serviceguard? Maybe a running Service is causing this problem.
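The ping/telnet check suggested above can be repeated automatically against both the virtual and the physical address. The following is a rough sketch, not a diagnosis tool: the port and host addresses are whatever you pass on the command line, and the function names probe and failure_count are made up for illustration. Note that it only tests whether new TCP connections succeed, so it would not by itself show an already established session stalling.

```python
# Sketch: repeatedly try to open a TCP connection to a host and port and
# count how many attempts fail. Hosts and port are supplied by the caller;
# nothing here is specific to Baan or Serviceguard.
import socket
import sys
import time


def probe(host, port, timeout=3.0):
    """Return the connect time in seconds, or None if the connect failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections, timeouts, lookup failures
        return None


def failure_count(host, port, attempts=5, pause=1.0, timeout=3.0):
    """Probe host:port several times and return the number of failures."""
    failures = 0
    for i in range(attempts):
        if probe(host, port, timeout) is None:
            failures += 1
        if i < attempts - 1:
            time.sleep(pause)  # wait between attempts, not after the last
    return failures


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("usage: probe.py PORT HOST [HOST ...]")
    else:
        port = int(sys.argv[1])
        for host in sys.argv[2:]:
            failed = failure_count(host, port)
            print(f"{host}:{port} failed {failed} of 5 attempts")
```

Running it with the telnet port and both addresses while users are seeing timeouts would show whether only the virtual address drops connection attempts.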

FriarTuck

6th June 2003, 16:50

We're running a setup very similar to yours.

But:

We don't fail over the Baan app servers themselves. We have two app servers, so we are willing to take a hit if one were to fail.

---

Our Baan app servers point to what we call the 'prodora' host (the virtual IP for the Oracle package). It is the Oracle server that is SG'ed. Should the Oracle server fail, it will do a package switch to our secondary Oracle server (which serves as our development DB). In a failover situation, we are quite willing to stop application development.

The IPs of our production and development Oracle servers are also SG'ed, so that in a failover the virtual IP of production can move to the physical development box. The disks remount via the SAN (a layout that would require a page in itself...)

One thing I didn't see mentioned in your post: how many heartbeat links do you have? We have two:

1) between our two oracle servers on the normal DB->appserver network (which is a separate "private UNIX subnet" from normal corporate traffic), and

2) between the two oracle servers via a crossover.

What we noticed is that in the event of a network storm/flood on the subnetted "private UNIX network" the heartbeat would "cardiac arrest". So our crossover cable is our primary heartbeat, the other is backup should the crossover fail for some reason...

Perhaps this might give some other ideas? FWIW.

This is an interesting problem and one I'd like to follow. It may bite us, you just never know...