Process mysteriously dying after ~20 min... Help?
When I was testing with only 60 images, the script completed successfully in about 25 minutes. I usually ran it in the background with:
nohup python test.py > test.out 2> test.err < /dev/null &
I have since added more images, so the directory now holds 200+. Now the process keeps dying before completion! It stops at a random point while classifying an image, keeps running at 1-2% CPU for a while, and then usually dies (sometimes it doesn't: I just noticed one that has been running for 16 hours at 1% CPU and 80% memory). I've tried it both backgrounded with nohup and running in the foreground. I'm starting to think this might be a problem with Linode.
Anybody encountered anything like this before? Any solutions?
4 Replies
You can attach strace to a background process to see the syscalls it's executing.
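For anyone who wants to try this, here is a rough sketch of attaching strace to an already-running process. The helper name attach_strace and the output file name are made up for illustration; the strace flags themselves (-f, -tt, -p, -o) are standard. Attaching normally requires being the same user as the process or root, and some distributions restrict it further.

```shell
# Hypothetical helper: attach strace to a running process by PID.
attach_strace() {
    pid=$1
    out=${2:-trace.out}   # assumed default output file name

    # kill -0 sends no signal; it only checks that we may signal the process.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot attach: process $pid is not running or not yours" >&2
        return 1
    fi

    # -f   follow child processes and threads
    # -tt  prefix each syscall with a wall-clock timestamp
    # -o   write the trace to a file instead of the terminal
    strace -f -tt -p "$pid" -o "$out"
}
```

Assuming the script was started as "python test.py", something like attach_strace "$(pgrep -f 'python test.py' | head -n 1)" test.strace should work. Ctrl-C detaches strace and leaves the process running. The last few lines of the trace show which syscall the process was in when it stalled or died.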
You can do similar things if you compile python with debug symbols and run it under gdb.
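A minimal sketch of the gdb route, attaching to the live process rather than starting it under gdb. The helper name dump_backtrace is hypothetical; -p, -batch and -ex are standard gdb options. Without debug symbols the output is much less readable.

```shell
# Hypothetical helper: print a backtrace of every thread in a running process.
dump_backtrace() {
    pid=$1

    # Check that the process exists and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot attach: process $pid is not running or not yours" >&2
        return 1
    fi

    # -batch runs the given commands and exits, which also detaches gdb.
    # "thread apply all bt" prints the C-level stack of each thread.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}
```

This shows C-level frames only. If the CPython gdb extensions happen to be loaded on your system, replacing the command with py-bt should give a Python-level traceback instead, but that depends on how your python was built and packaged.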
James