Rails/lsapi spawns too many processes under heavy load

#1
I understand how it's supposed to work: it spawns up to max_processes processes that stay around for a while. And if things are slow for some reason, it spawns up to another max_processes to deal with the temporary slowdown.

My problem is that when the site gets slow, it's generally because it's running short of RAM. So spawning 4 more interpreters (even if they share some copy-on-write memory) actually ends up hurting things. In this case I'd prefer to be able to either 1) limit the number of 'temporary' processes that get spawned, or 2) disable this altogether and just have it queue requests.

Also, I'm still having problems with some ruby/lsapi processes that hang waiting for something and can't be killed without a kill -9, which lshttpd never does. So sometimes I need to go in and clean up old processes by hand.

Also, the 'general' board is still called 3.0 pre-release; perhaps its name should be updated.

Thanks for a great product!
-Kevin
 

mistwang

LiteSpeed Staff
#2
Hi Kevin,
You can manually patch lsapilib.c to turn it off.
Around line 1692, change
Code:
if ( pServer->m_iCurChildren >= (pServer->m_iMaxChildren << 1 ) )
to
Code:
if ( pServer->m_iCurChildren >= (pServer->m_iMaxChildren ) )
Recompile and install; this should disable all temporary processes.
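
In case the shift is unclear: `<< 1` moves the bits one place to the left, which doubles the value, so the original check allows up to twice the configured maximum and the patched check allows only the maximum itself. A minimal sketch of the difference follows. The function `at_child_limit` and its parameters are made-up names for illustration, not code from lsapilib.c; only the two comparisons are taken from the lines above.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of lsapilib.c): returns true when the
 * child count has reached the limit, so no further process is spawned. */
bool at_child_limit(int cur_children, int max_children, bool allow_temporary)
{
    /* max_children << 1 is a left shift by one bit, i.e. max_children * 2.
     * Unpatched check: limit is twice the maximum (temporary processes allowed).
     * Patched check:   limit is the maximum itself. */
    int limit = allow_temporary ? (max_children << 1) : max_children;
    return cur_children >= limit;
}
```

With a maximum of 4, `at_child_limit(4, 4, true)` is false because the unpatched limit is 8, while `at_child_limit(4, 4, false)` is true because the patched limit is 4.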

For the processes that need to be killed with '-9', can you strace one before killing it?
Also, please check whether it belongs to a process group (child process) or is a group leader (parent process).

LSWS and the parent process do send SIGKILL (-9) to the processes they manage if a process won't die after multiple attempts.
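
To illustrate the general idea of escalating from a polite signal to SIGKILL, here is a rough sketch. It is not the LSWS implementation: `stop_child`, the retry count, and the 100 ms pause are all assumptions made for the example, and it only works on children of the calling process because it relies on waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: send SIGTERM up to `attempts` times, checking
 * after each try whether the child has exited; if it never does, send
 * SIGKILL. Returns the signal that ended the wait (SIGTERM or SIGKILL),
 * or -1 on error. */
int stop_child(pid_t pid, int attempts)
{
    struct timespec pause_time = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int status;

    for (int i = 0; i < attempts; i++) {
        if (kill(pid, SIGTERM) == -1)
            return -1;
        nanosleep(&pause_time, NULL);

        pid_t r = waitpid(pid, &status, WNOHANG); /* non-blocking check */
        if (r == pid)
            return SIGTERM;  /* child went away after the polite request */
        if (r == -1)
            return -1;
    }

    /* The child ignored every SIGTERM; SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return SIGKILL;
}
```

Note that a process stuck in uninterruptible sleep (state D in ps) does not act on SIGKILL until the blocking system call returns, which is one reason the strace output would help.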
 
#3
Hmm, that seems to imply there's only 1 additional slot, based on looking at your code? Or is <<1 really a bit shift, huh? Man, I don't remember my C classes :)

Though, here's a strange thing. On my small sites, "Max connections" is set to 2 under the "rails" tab, but then the 'real time monitor' shows Max CONN = 3, and Eff Max = 3. Odd!


I have far more than 1 extra process running. On my big site, I have Max Connections == 4, but 4 are shown as child processes.

Here's something strange: when I add the f option for "ASCII forest" (to see what the parent processes are), ps only shows some of the processes.

root 22322 0.0 0.0 4440 1300 ? S 14:22 0:01 lshttpd
root 22323 0.0 0.0 1468 196 ? S 14:22 0:00 \_ lscgid
webuser 22324 2.5 0.4 12560 9356 ? S 14:22 3:59 \_ lshttpd
webuser 22325 0.0 0.3 92960 7048 ? SNs 14:22 0:04 \_ rails mysite
webuser 26591 15.1 6.5 174316 135028 ? SN 16:14 6:58 \_ rails mysite
webuser 2013 15.8 6.5 172848 133896 ? RN 16:18 6:36 \_ rails mysite
webuser 6633 12.0 3.4 109960 70668 ? SN 16:57 0:24 \_ rails mysite
webuser 10677 17.3 3.1 104208 65092 ? SN 16:59 0:10 \_ rails mysite
webuser 10678 14.7 3.7 117140 77824 ? SN 16:59 0:08 \_ rails mysite


But a ps auxww | grep rails shows two other processes.

webuser 10677 16.6 3.1 104208 65092 ? SN 16:59 0:12 rails mysite
webuser 10678 14.1 3.7 117140 77876 ? SN 16:59 0:10 rails mysite
webuser 12820 16.3 3.0 101448 61976 ? SN 17:00 0:01 rails mysite
webuser 13100 37.0 2.9 100648 61136 ? SN 17:00 0:01 rails mysite
webuser 2013 15.8 6.5 172848 133896 ? SN 16:18 6:39 rails mysite
webuser 22325 0.0 0.3 92960 7048 ? SNs 14:22 0:04 rails mysite
webuser 26591 15.1 6.5 174316 135032 ? SN 16:14 7:00 rails mysite
webuser 6633 12.3 3.4 109544 70456 ? RN 16:57 0:26 rails mysite

Calling ps auxwwwf again later shows even more, though!

webuser 22325 0.0 0.3 92960 7048 ? SNs 14:22 0:04 \_ rails mysite
webuser 26591 15.0 6.5 174316 135100 ? SN 16:14 7:14 | \_ rails mysite
webuser 2013 15.9 6.5 172848 133896 ? SN 16:18 6:57 | \_ rails mysite
webuser 6633 12.4 3.4 110416 71324 ? SN 16:57 0:39 | \_ rails mysite
webuser 16385 0.0 3.3 110412 69288 ? RN 17:02 0:00 | | \_ rails mysite
webuser 15373 18.4 3.2 105844 66364 ? SN 17:02 0:06 | \_ rails mysite
webuser 15935 15.1 3.0 100616 61904 ? RN 17:02 0:02 | \_ rails mysite
webuser 16084 17.1 3.0 101132 61828 ? SN 17:02 0:01 | \_ rails mysite

-> seems to show that my process is spawning a new sub-rails process? I use backticks in a few places for some memory debugging right now, but no forking - perhaps lsapi is doing something strange? Hmm, any ideas?

This is with lsws 3.0, and ruby-lsapi 2.3
 

mistwang

LiteSpeed Staff
#4
Have you patched the code to stop spawning extra processes?
Some Ruby functions may fork, like popen() or popen3(). It should not be lsapi forking again from a child process. Maybe you can strace that process to figure something out. :)
 
#5
Yes, it turns out any backticks are really forks. I was doing some `ps` testing to debug my memory leak, so those were the extra fork() calls I was seeing.
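
For anyone who wants to see the same effect outside Ruby, here is a small C sketch using popen(), which also runs its command through a shell in a child process. `parent_pid_seen_by_shell` is a name made up for this example, and it assumes a POSIX shell is available as sh.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: runs `echo $PPID` through popen() and returns the
 * parent PID reported by the spawned shell, or -1 on error. */
long parent_pid_seen_by_shell(void)
{
    /* popen() starts a shell as a child of this process. */
    FILE *cmd = popen("echo $PPID", "r");
    if (cmd == NULL)
        return -1;

    long ppid = -1;
    if (fscanf(cmd, "%ld", &ppid) != 1)
        ppid = -1;

    pclose(cmd); /* waits for the child shell to finish */
    return ppid;
}
```

The value returned should match getpid() in the caller, which shows the command ran in a separate child process. That child is what turns up as an extra entry under the worker in the ps forest view while the command is running.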

Still, any idea why it always shows a max-processes value 1 higher than what my max-connections is actually set to in the rails config setup?

Is that the code you had suggested I patch? (I haven't yet.) Is <<1 really -1, or is it more complicated than that?
 
#7
Thanks, I've applied the patch and it seems to be working great.

My number of processes is still one more than I specify though. TWO if you count the 'parent' process.

I.e. I have max-processes: 2, but ps shows 3 child processes on each parent process. The web console also shows '3 processes', '3 idle'.
 