503 (Service Unavailable)
I built a Django application about 5 months ago. It worked fine until yesterday, but now it sometimes shows this error message:
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Server: Ubuntu 18 LTS
Server provider: Linode
Plan: Linode 4GB
Thank you for any help.
2 Replies
I don't know anything about Django or Python, but my first guess would be to investigate the "capacity problems" part of the report.
-- sw
About how much traffic are you getting, in terms of hits per minute? Has it been growing? What was the level of traffic before this started happening and what is it now? Are there maximum "peaks" of traffic level you are hitting that correlate with the 503s happening?
If so, you are probably hitting capacity. If not, it is some other, stranger problem.
If you cannot answer those questions easily, you won't be in a position to solve this, so the first thing you want to do is set up some kind of monitoring, even if it's just scraping log files. You can do quite a bit with log file scraping alone: access logs usually include the HTTP status code of each response, so you can look for 5xx errors specifically.
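As a rough illustration of the log scraping idea, here is a minimal Python sketch that counts total hits and 5xx responses per minute. It assumes your web server writes the usual "combined" access log format (the nginx and Apache default); the function name and the regular expression are my own, so adjust them if your log lines look different.

```python
import re
from collections import Counter

# Assumed format: a combined-style access log line such as
# 1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 503 197 "-" "curl/8.0"
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3}) ')


def count_by_minute(lines):
    """Return two Counters: total hits per minute and 5xx hits per minute."""
    hits, errors = Counter(), Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        # Keep "10/Oct/2023:13:55" (day, hour, minute) and drop the seconds.
        minute = match.group("ts")[:17]
        hits[minute] += 1
        if match.group("status").startswith("5"):
            errors[minute] += 1
    return hits, errors
```

To use it, open your access log (the path depends on your setup), pass the file object to count_by_minute, and print the minutes where the error count is non-zero next to the hit count for the same minute. If the 5xx minutes line up with the busiest minutes, that points toward capacity.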
If it looks like you are hitting capacity, there are many ways to add capacity. Usually the easiest is to keep scaling "vertically" (add more CPU/RAM/etc. to the host) until there's no further to go, at which point you'd have to scale "horizontally" (rearchitect so there are more hosts, which could be easy or hard depending on your app -- for a typical Django app this should be easy, but just making your VM bigger is even easier) and/or make the code more efficient.
To scale vertically, first figure out what the bottleneck is: CPU? RAM? Disk I/O? Take a look at the output of free -tm, ps -ef, top, and other common tools around the times of peak load and see whether it's CPU or RAM that's maxing out. This will help you select the next bigger plan to upgrade your Linode to.
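If you would rather record these numbers over time than watch the tools by hand, something like the following Python sketch could be run from cron around peak hours. It is only a sketch: it assumes a Linux host with /proc/meminfo (including the MemAvailable field), and the function names are made up for illustration.

```python
import os


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_used_fraction(info):
    """Fraction of RAM in use, counting MemAvailable as free headroom."""
    return 1.0 - info["MemAvailable"] / info["MemTotal"]


def load_per_cpu():
    """1-minute load average divided by the number of CPUs."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)


def snapshot(path="/proc/meminfo"):
    """Collect one sample of memory, swap, and load figures."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {
        "mem_used": memory_used_fraction(info),
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
        "load_per_cpu": load_per_cpu(),
    }
```

As a rule of thumb, a memory fraction close to 1.0 together with growing swap use suggests RAM is the limit, while a load per CPU that stays near or above 1.0 suggests CPU (or, on Linux, processes waiting on disk I/O) is the limit.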
Then you can test -- upgrade to the bigger plan and wait for the peak time -- are things better?
It can also help to write a script to automatically send a lot of fake requests so you can test your change on a test Linode first and find exactly the capacity of different plans before selecting one for production use.
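For the fake-request script, a minimal Python sketch using only the standard library might look like this. The function names and default numbers are placeholders I picked, not recommendations, and you should only point it at a test Linode you own.

```python
import http.client
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=10):
    """Send one GET request and return the HTTP status code, or 'error' on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # urllib raises 4xx/5xx responses as HTTPError
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return "error"  # connection refused, timeout, DNS failure, and so on


def load_test(url, total=200, concurrency=20, fetch=fetch_status):
    """Send `total` requests with up to `concurrency` in flight; tally the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return Counter(pool.map(fetch, [url] * total))
```

Calling load_test with the URL of your test server returns a tally of status codes, so you can raise the concurrency step by step and note the level at which 503s (or "error" entries) start to appear on each plan. A dedicated load testing tool will give more realistic numbers, but this is enough to compare plans roughly.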
Or, honestly? If you are short on time and don't want to do all that work, you can just blindly try upgrading to a plan with about twice as much RAM and CPU and see if your problems go away. They probably will. If that doesn't work, you can do all the troubleshooting I mentioned above.
Hope this helps
DG