Why do sleeping processes use high CPU, and how can I limit it?
I have a Flask app set up with gunicorn, with nginx serving as a reverse proxy. I wanted to load test my website to see how much traffic it can handle, and since I can't really afford to pay for load testing, I simulated visits with aiohttp (async requests to my website).

When I first start gunicorn, whether with the gevent or the gthread worker type, CPU usage according to htop is around 1%, which is reasonable since no one is visiting the website. When I visit the website from the browser it works fine; CPU and RAM usage jump by about 3% for one visit, and then CPU usage drops back to almost 0%.

The issue appears when I simulate high load by sending 5000 requests asynchronously to load test the site. The CPU jumps to 100% and stays there for about 20 seconds, which is fine, since I can see that the requests are being served. However, once all the requests are served, CPU usage drops, but not to the idle level of around 1%. It stays at about 35%, as if the CPU were still being used, which it shouldn't be. htop suggests that the sleeping processes started by gunicorn are the ones using that much CPU.

Since I am not yet very experienced, I thought it might be the gunicorn configuration. To rule this out, I turned off nginx, started the development server, ran the Flask app directly, and repeated the test. At first CPU usage is about 1%; with a couple of individual visits it increases and then drops back to about 1% once the requests are served. But as soon as I apply high load and CPU usage jumps to 100% for about 20 seconds, it never drops back to normal levels once all the requests are served. So it isn't about the gunicorn or nginx configuration.

Why do sleeping processes use that much CPU? How can we identify what is going on?
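For anyone who wants to reproduce this kind of test, here is a minimal sketch of a load driver that keeps the number of in-flight requests bounded. It is not the script used above. The names `run_load`, `request_once`, and `fake_request` are made up for illustration, and `fake_request` is only a placeholder that sleeps instead of talking to a server.

```python
import asyncio
import time


async def run_load(request_once, total, concurrency):
    """Call request_once() `total` times with at most `concurrency` in flight."""
    gate = asyncio.Semaphore(concurrency)
    latencies = []
    errors = 0

    async def one_call():
        nonlocal errors
        async with gate:  # blocks here while `concurrency` calls are active
            started = time.perf_counter()
            try:
                await request_once()
            except Exception:
                errors += 1
                return
            latencies.append(time.perf_counter() - started)

    await asyncio.gather(*(one_call() for _ in range(total)))
    return {"ok": len(latencies), "errors": errors, "latencies": latencies}


async def fake_request():
    # Placeholder standing in for one HTTP GET; swap in a real client call.
    await asyncio.sleep(0.001)


if __name__ == "__main__":
    result = asyncio.run(run_load(fake_request, total=200, concurrency=20))
    print(result["ok"], "ok,", result["errors"], "errors")
```

With aiohttp, `request_once` would presumably wrap a `session.get(url)` call on one shared `aiohttp.ClientSession`. Capping concurrency makes it easier to tell whether leftover CPU usage and threads on the server depend on how many connections were open at once.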
It is also worth mentioning that with each round of load testing (about 5000 asynchronous requests), the CPU usage left over after all the requests are served goes up by about 30%, until after a couple of rounds it stays at 100%. To rule out other causes, I removed every part that relies on I/O or database connections. It also doesn't matter whether I use gevent or gthread.

With gthread, if I use 2 workers and 2 threads, I would assume that at most 4 threads can be created. However, when I load test the website, htop shows thousands of threads being created and never released once all the requests are served. For example, when I start gunicorn I usually have 21 threads, and after the test completes I have maybe 1400 threads! Aren't threads created by gunicorn closed and terminated once they have done their job?
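One way to check the thread count independently of htop is to read `/proc/<pid>/status`, which on Linux contains a `State:` line and a `Threads:` line for the process. The sketch below is only an illustration; the function names `parse_proc_status` and `read_proc_status` are made up here, and you would pass in the PID of a gunicorn worker.

```python
import os


def parse_proc_status(text):
    """Extract the state letter and thread count from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    state = fields.get("State", "?")[:1]  # e.g. "S (sleeping)" becomes "S"
    threads = int(fields.get("Threads", "0"))
    return state, threads


def read_proc_status(pid):
    """Read and parse the status file of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    # Inspect this Python process itself as a quick check.
    print(read_proc_status(os.getpid()))
```

Running this against each worker PID before and after a test round would show whether the thread count per process really grows from about 21 toward 1400, or whether htop is listing the same threads in a misleading way.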
If anyone understands why Linux behaves like this and how to get around that sort of behavior, please let me know before I pull all of my hair out :)
2 Replies
If anyone understands why Linux behaves like this and how to get around that sort of behavior, please let me know before I pull all of my hair out :)
Your hair is safe!
Your problem is not Linux…it's top(1)/htop(1)…and, to some extent, the architecture of gunicorn… According to Wikipedia, gunicorn uses a pre-fork worker model…similar to the apache2(8) pre-fork MPM.
Here are some articles that explain why you shouldn't regard what top(1)/htop(1) is telling you about CPU usage as the absolute truth™:
- https://stackoverflow.com/questions/10628037/cpu-utilization-high-for-sleeping-processes
- https://askubuntu.com/questions/15620/htop-does-not-show-the-correct-cpu-but-top-does
- https://serverfault.com/questions/500525/how-can-sleeping-processes-in-top-be-using-a-percentage-of-cpu
Also, I believe that when a process is swapped out, top(1)/htop(1) just tells you it's swapped out…it doesn't drop the CPU usage to zero.
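To make the point about top(1)/htop(1) concrete: tools like these generally derive %CPU from how much CPU time a process accumulated between two refreshes, while the state column is a snapshot taken at the instant of the refresh. A process can therefore be caught in state S and still have burned CPU during the interval. Here is a rough sketch of that calculation from two `/proc/<pid>/stat` samples. The function names are made up for illustration, and the default of 100 ticks per second is an assumption (the real value comes from `os.sysconf("SC_CLK_TCK")`).

```python
def cpu_ticks_from_stat(stat_line):
    """Return (state, utime + stime) from one /proc/<pid>/stat line."""
    # The command name is in parentheses and may contain spaces,
    # so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    state = rest[0]                      # field 3: process state
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15, in clock ticks
    return state, utime + stime


def cpu_percent(ticks_before, ticks_after, interval_seconds, ticks_per_second=100):
    """CPU usage over the interval, as a percentage of one core."""
    used_ticks = ticks_after - ticks_before
    return 100.0 * used_ticks / (ticks_per_second * interval_seconds)


if __name__ == "__main__":
    # Two hypothetical samples taken 2 seconds apart; both show state S.
    first = "4242 (gunicorn: worker) S 1 4242 4242 0 -1 4194304 100 0 0 0 350 50 0 0 20 0 21 0"
    second = "4242 (gunicorn: worker) S 1 4242 4242 0 -1 4194304 100 0 0 0 400 70 0 0 20 0 21 0"
    state_a, ticks_a = cpu_ticks_from_stat(first)
    state_b, ticks_b = cpu_ticks_from_stat(second)
    print(state_a, state_b, cpu_percent(ticks_a, ticks_b, 2.0))
```

In this made-up example the process is sleeping at both sample points, yet it used 70 ticks in 2 seconds. So a nonzero %CPU next to state S is not a contradiction in itself. It means the process keeps waking up and doing work between refreshes.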
-- sw