en@frrom
12th October 2010, 13:12
We suddenly encounter this error at one of our sites. I have looked around the board but haven't found a solution yet. $BSE/tmp seems fine. Here is a copy of the error logged in log.sort:

2010-10-12[08:51:20(UTC+00:00)]:E:i90070: ******* E N D of Error message *******
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ******* S T A R T of Error message *******
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: Log message called from /BAAN/view/port.7.1d.16/vobs/tt/lib/al_1/al_log.c: #1245 keyword: stack trace
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: Pid 10393 Uid 202457 Euid 202457 Gid 125 Egid 125 Pset i90070@corerppap:10393
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: user_type S language 2 user_name i90070 tty locale ISO88591/NULL
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: Errno 2 (No such file or directory) bdb_errno 0
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 10393: /afs2/baanvc/bse/bin/sort6.2 -t +0 -1 +1n -2 +2n -3 +3 -4 +4n -5 +90n
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1cc8dc waitid (0, 289b, ffbfe0d0, 3)
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1bc264 waitpid (289b, ffbfe224, 0, 0, ffbfe27c, ff260240) + 60
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1af3d0 system (ffbfe378, ff247b44, 20000, 1, ff23e3a8, ffbfe27c) + 2ec
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 00024568 create_stack_trace (9d800, c2c00, ff262a00, b8400, 9d800, 1) + 98
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 000243f8 log_stack_trace (0, 0, 0, ff243ec0, ffbfe5dc, 1) + 4
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 00024580 crash_notification (0, 0, 0, b, ffbffeff, b3c00) + 4
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 00028534 do_emergency (0, c55d4, c55f8, 0, 2457c, c5400) + 28
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 0002869c core_dumped (b, 0, ffbfe9b8, ff23e3a8, 0, b9400) + 1c
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1c89d8 __sighndlr (b, 0, ffbfe9b8, 28680, 0, 0) + c
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1bd0b8 call_user_handler (b, 0, 12, 0, ff262a00, ffbfe9b8) + 3b8
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ff1bd28c sigacthandler (b, 0, ffbfe9b8, 1, ff262a00, 0) + 4c
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: --- called from signal handler with signal 11 (SIGSEGV) ---
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 00025a14 ncopystr (c2c30, 100, 0, ffbfecd8, b6ba4, 1) + c
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 0001ecd8 _init_user (2, cea11, c2d54, 65, ce800, c2c30) + 294
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 00019848 main (11, ffbffc14, ffbffc5c, b0000, ff260100, b3000) + 14
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: 000197c8 _start (0, 0, 0, 0, 0, 0) + 108
2010-10-12[08:53:41(UTC+00:00)]:E:i90070:
2010-10-12[08:53:41(UTC+00:00)]:E:i90070: ******* E N D of Error message *******

bdittmar
12th October 2010, 13:39
Hello,

Error 11 is EAGAIN ("No more processes").

This indicates that a fork has failed, either because the system's process table is full or because the user is not allowed to create any more processes.

Maybe these hints help. Things to check:

The size of the system's process table (system parameter).
Whether the user is restricted in the number of processes.

Regards

marnix
14th October 2010, 21:41
This is not error code 11 (so this is not 'no more processes'), but signal 11, which is SIGSEGV.
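The two are easy to mix up because errno values and signal numbers are separate namespaces that happen to overlap. A minimal Python sketch, assuming a POSIX system (the helper name `describe` is made up for illustration and is not part of any Baan tooling), looks the same number up in both:

```python
import errno
import os
import signal


def describe(number):
    """Show how `number` reads as an errno and as a signal on this platform."""
    # errno namespace: errorcode maps numbers to symbolic names.
    errno_name = errno.errorcode.get(number, "unknown errno")
    errno_text = os.strerror(number)
    # Signal namespace: numbered independently of errno.
    try:
        signal_name = signal.Signals(number).name
    except ValueError:
        signal_name = "unknown signal"
    return {"errno": errno_name, "errno_text": errno_text, "signal": signal_name}


if __name__ == "__main__":
    # The actual numeric values are platform-dependent, so look them up
    # instead of hard-coding 11.
    print(describe(int(signal.SIGSEGV)))
    print(describe(errno.EAGAIN))
```

In the log above, the line to read is "called from signal handler with signal 11 (SIGSEGV)"; the separate "Errno 2" line is the errno value.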

It looks like this is an internal Porting Set error during the start-up of the sort6.2 binary. Since 7.1d.16 is a really old Porting Set (3 years old by now), this could very well be a bug that has since been fixed in the 8.x line of Porting Sets.

If you have a similar set-up that does work, you may want to check for differences in the environment variables, and see how they affect the behavior.
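One way to do that comparison is to capture the output of `env` on both systems and diff the two. The following is only a sketch: the function names and the sample values are invented for illustration, and it assumes one NAME=value pair per line.

```python
def parse_env(text):
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            env[name] = value
    return env


def diff_env(working, failing):
    """Return (only in working, only in failing, present in both but different)."""
    only_working = sorted(set(working) - set(failing))
    only_failing = sorted(set(failing) - set(working))
    changed = sorted(
        name for name in set(working) & set(failing)
        if working[name] != failing[name]
    )
    return only_working, only_failing, changed


if __name__ == "__main__":
    # Made-up sample data; in practice, read the two captured `env` dumps.
    working = parse_env("BSE=/example/bse\nHOME=/home/user1\n")
    failing = parse_env("BSE=/example/bse\nHOME=/old/home/user1\n")
    print(diff_env(working, failing))
```

Variables that show up in the third list are the first candidates to look at.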

Han Brinkman
15th October 2010, 13:52
I do happen to know what caused this error: the admin moved the users' home directories. Until the users logged in again, they had this problem.
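For anyone who wants to check for this condition, here is a rough Python sketch that lists accounts whose registered home directory does not exist. It assumes a Unix system where the passwd database is what matters; how the Porting Set actually resolves the home directory is not confirmed here, and `missing_home_dirs` and `FakeEntry` are names made up for this example.

```python
import os
import pwd
from collections import namedtuple

# Stand-in for a passwd entry, used only for the demonstration below.
FakeEntry = namedtuple("FakeEntry", "pw_name pw_dir")


def missing_home_dirs(entries=None, exists=os.path.isdir):
    """Return (user, home) pairs whose home directory does not exist."""
    if entries is None:
        entries = pwd.getpwall()  # all accounts known to the passwd database
    return [(e.pw_name, e.pw_dir) for e in entries if not exists(e.pw_dir)]


if __name__ == "__main__":
    # Demonstration with invented entries and a fake existence check.
    demo = [FakeEntry("user1", "/home/user1"), FakeEntry("user2", "/old/user2")]
    print(missing_home_dirs(demo, exists=lambda path: path.startswith("/home/")))
```

Calling `missing_home_dirs()` without arguments checks the real accounts on the machine it runs on.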

@en@frrom: it would be nice to contribute back the solution. I assume you didn't have the time yet.

Regards,