Upgraded to 4.0.3 Std: Consistent 503s

jp_n9

Active Member
#1
This appears to have been introduced with the upgrade to 4.0.3 (from 3.3.24).

First request to a Rails Virtual host succeeds. Next request to same Vhost 503s. Next, succeeds. Next, 503s. Repeat.

If you wait a full 5 minutes between the first successful request and 2nd request, then it does not 503, but succeeds. Looking at the error listed below, perhaps a MySQL related timeout issue?

Linux 32bit
Litespeed 4.0.3 Standard
Ruby / Rails 1.2.3

Gems installed:
actionmailer (1.3.3, 1.2.5, 1.2.0)
actionpack (1.13.3, 1.12.5, 1.12.0)
actionwebservice (1.2.6, 1.2.3, 1.1.6, 1.1.0)
activerecord (1.15.3, 1.14.4, 1.14.0)
activesupport (1.4.2, 1.3.1, 1.3.0)
capistrano (2.2.0, 2.1.0, 1.4.1)
daemons (1.0.10, 1.0.9, 1.0.5)
eventmachine (0.12.8, 0.10.0)
gem_plugin (0.2.3, 0.2.2)
highline (1.5.1, 1.4.0)
mysql (2.7)
needle (1.3.0)
net-sftp (1.1.1, 1.1.0)
net-ssh (1.1.2, 1.1.0, 1.0.10)
rack (1.0.0, 0.3.0)
rails (1.2.3, 1.1.6, 1.1.0)
rake (0.8.7, 0.8.1, 0.7.3)
rmagick (2.2.2, 1.15.2)
ruby-lsapi (3.5)
rubygems-update (1.3.4, 1.0.1, 0.9.2)
sources (0.0.1)
thin (1.2.2, 0.7.1)

ruby 1.8.6 (2007-03-13 patchlevel 0) [i686-linux]

mysql Ver 14.12 Distrib 5.0.77, for pc-linux-gnu (i686) using readline 5.1

From error.log (foo sub'd in for account name):

2009-06-03 19:55:27.366 [INFO] [24.155.38.130:57020-0#foo] connection to [uds://tmp/lshttpd/foo:_.sock] on request #0, confirmed: 1, Connection reset by peer!
2009-06-03 19:55:27.366 [NOTICE] [24.155.38.130:57020-0#foo] [uds://tmp/lshttpd/foo:_.sock] Request in process stage, fail with 503
2009-06-03 19:55:27.366 [NOTICE] [24.155.38.130:57020-0#foo] oops! 503 Service Unavailable
2009-06-03 19:55:27.366 [NOTICE] [24.155.38.130:57020-0#foo] Content len: 0, Request line:
(null)
2009-06-03 19:55:27.366 [NOTICE] [24.155.38.130:57020-0#foo] Redirect: #1, URL: /dispatch.cgi

From stderror.log:

2009-06-03 19:55:27.227 [STDERR] /opt/lsws/fcgi-bin/RailsRunner.rb:92: [BUG] Segmentation fault
 
Last edited:

jp_n9

Active Member
#3
Just for the sake of experimenting, tried upgrading ruby, no change in behavior. Now running:

ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]
 

jp_n9

Active Member
#4
Here's the strace showing a few lines before and after the request hits and 503s:

select(1, [0], NULL, NULL, {0, 194000}) = 0 (Timeout)
kill(13513, SIG_0) = -1 EPERM (Operation not permitted)
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
kill(13513, SIG_0) = -1 EPERM (Operation not permitted)
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
kill(13513, SIG_0) = -1 EPERM (Operation not permitted)
select(1, [0], NULL, NULL, {1, 0}) = ? ERESTARTNOHAND (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, [WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT && WCOREDUMP(s)], WNOHANG|WUNTRACED, NULL) = 28015
wait4(-1, 0xbfb4ebec, WNOHANG|WUNTRACED, NULL) = -1 ECHILD (No child processes)
sigreturn() = ? (mask now [RTMIN])
select(1, [0], NULL, NULL, {0, 442000}) = 0 (Timeout)
kill(13513, SIG_0) = -1 EPERM (Operation not permitted)
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
kill(13513, SIG_0) = -1 EPERM (Operation not permitted)
 

jp_n9

Active Member
#5
Appears that I upgraded to 4.0.3 on May 29 17:12 as that is the timestamp in /opt/lsws/autoupdate for

May 29 17:12 lsws-4.0.3/

First 503 error shows up in error.log:

2009-05-29 18:42:11.734 [NOTICE] [xxx.xxx.xxx.xx:51618-0#foo] oops! 503 Service Unavailable

Timing would seem highly coincidental.
 

jp_n9

Active Member
#6
Ok, there are core dumps... but I didn't realize that because I set a cron job to delete them 2 years ago. That was issue:

http://litespeedtech.com/support/forum/showthread.php?t=1007

At that time I was on Ruby 1.8.4. Now I'm on 1.8.6 and the output of GDB/bt is slightly different, but still awfully similar:

(gdb) bt
#0 0xb7e60da1 in kill () from /lib/libc.so.6
#1 0xb6c32f4a in pthread_kill () from /lib/libpthread.so.0
#2 0xb6c332f9 in raise () from /lib/libpthread.so.0
#3 0xb7e60b7a in raise () from /lib/libc.so.6
#4 0xb7e61d95 in abort () from /lib/libc.so.6
#5 0x080c0f16 in rb_bug (fmt=0x80db90a "Segmentation fault") at error.c:214
#6 0x080a3fc4 in sigsegv (sig=11) at signal.c:622
#7 <signal handler called>
#8 0x080a6252 in st_lookup (table=0xb7a3a84c, key=3967, value=0xbfdca204) at st.c:250
#9 0x08052e2b in search_method (klass=3080955980, id=3967, origin=0xbfdca204) at eval.c:475
#10 0x08052e75 in rb_get_method_body (klassp=0xbfdca240, idp=0xbfdca244, noexp=0xbfdca234) at eval.c:496
#11 0x08052fdb in rb_method_node (klass=3080955980, id=3967) at eval.c:552
#12 0x080597a4 in rb_obj_respond_to (obj=3080956000, id=6881, priv=0) at eval.c:4208
#13 0x0805985f in rb_respond_to (obj=3080956000, id=6881) at eval.c:4225
#14 0x08080638 in convert_type (val=3080956000, tname=0x80c46f2 "Hash", method=0x80c62fb "to_hash", raise=2) at object.c:2158
#15 0x08080716 in rb_convert_type (val=3080956000, type=11, tname=0x80c46f2 "Hash", method=0x80c62fb "to_hash") at object.c:2183
#16 0x0806da5e in to_hash (hash=3080956000) at hash.c:361
#17 0x0806e1d3 in rb_hash_replace (hash=3080955400, hash2=1) at hash.c:991
#18 0x08066c3f in call_cfunc (func=0x806e1c0 <rb_hash_replace>, recv=3080955400, len=1, argc=1, argv=0x6) at eval.c:5688
#19 0x0805be73 in rb_call0 (klass=3085109520, recv=3080955400, id=4361, oid=0, argc=1, argv=0xbfdca550, body=0xb7e30294, flags=0) at eval.c:5847
#20 0x0805c88d in rb_call (klass=3085109520, recv=3080955400, mid=4361, argc=1, argv=0xbfdca550, scope=1, self=6) at eval.c:6094
#21 0x0805cb02 in vafuncall (recv=3080955400, mid=4361, n=1, ar=0xbfdca594) at ruby.h:679
#22 0x0805cb6e in rb_funcall (recv=3080955400, mid=4361, n=1) at eval.c:6188
#23 0xb73f9fa8 in clear_env () at lsruby.c:90
#24 0xb73f9fcd in setup_cgi_env (data=0x8a65320) at lsruby.c:96
#25 0xb73fa5e4 in lsapi_s_accept (self=3080952880) at lsruby.c:323
#26 0x08066c3f in call_cfunc (func=0xb73fa488 <lsapi_s_accept>, recv=3080952880, len=1, argc=0, argv=0x6) at eval.c:5688
#27 0x0805be73 in rb_call0 (klass=3080952100, recv=3080952880, id=10745, oid=0, argc=0, argv=0x0, body=0xb7a39898, flags=0) at eval.c:5847
#28 0x0805c88d in rb_call (klass=3080952100, recv=3080952880, mid=10745, argc=0, argv=0x0, scope=0, self=3085138340) at eval.c:6094
#29 0x08057ae7 in rb_eval (self=3085138340, n=0x0) at eval.c:3475
#30 0x080579e2 in rb_eval (self=3085138340, n=0x0) at eval.c:3467
#31 0x08056d70 in rb_eval (self=3085138340, n=0x0) at eval.c:3162
#32 0x080543f8 in ruby_exec_internal () at eval.c:1634
#33 0x0805442c in ruby_exec () at eval.c:1654
#34 0x08054451 in ruby_run () at eval.c:1664
#35 0x08052828 in main (argc=2, argv=0xbfdcc3f4, envp=0xbfdcc400) at main.c:48
#36 0xb7e4f62d in __libc_start_main () from /lib/libc.so.6

However, looking through the logs it would appear that these seg faults have sadly been continuing regularly since that initial post 2 years ago.

Which would seem to indicate that, although ugly, the seg faults are not absolutely linked to this new behavior.
 

jp_n9

Active Member
#7
>> If you wait a full 5 minutes between the first successful request and 2nd request, then it does not 503, but succeeds.

I can reduce this period of time in between requests from 5 minutes to 30 seconds by adjusting the value of Max Idle Time from 300 to 30 in Configuration > Server > Ruby Rails.
 
Last edited:

jp_n9

Active Member
#8
Blech. For the moment I have set that value to 0, which now prevents the 503s... but of course totally kills performance.

At least that should hold me over until you are able to check this out. Thanks!
 

mistwang

LiteSpeed Staff
#9
The stack trace does not match the latest ruby-lsapi code, so you probably is using a earlier version of ruby-lsapi, not the latest.

If you have installed ruby-lsapi via tgz package, you may need to remove .../lib/ruby/1.8/i586-linux/lsapi.so, otherwise, it will override the ruby-lsapi installed via gem.
 

jp_n9

Active Member
#10
Wow. That was a nice and easy solution (wish I'd thought of it last night).

I renamed the non-gem lsapi.so to lsapi.so.BAK and now I can bump the Max Idle Time back up to 300 without a single 503.

Thanks so much!
 
Top