CPU usage as seen from linode and host don't match; need help
I have a linode with 1 vCPU and 2 GB RAM. When I look at the dashboard CPU graph, it shows 100% CPU utilization all the time.
However, on the linode itself, CPU utilization as seen by top and zabbix shows 90% idle time on average. Based on experience running the same load on bare metal, this is the expected figure.
Thanks in advance for any insight into what is going on and for clues on how to debug this issue.
2 Replies
Since the move to KVM, the CPU graphs in the Linode Manager can differ from CPU usage measurements taken from within the Linode. The host looks at the CPU usage of the QEMU process, which can be higher because things like networking and I/O require host CPU time that is not visible to the guest.
Our problem has been solved and I am posting it here to document the solution for others.
As stated previously, we encountered a situation where our linode would see low load and low CPU utilization, but the host would see it running at 100% CPU utilization all the time.
We traced this host-reported high CPU utilization to a single custom daemon.
After some trial and error, the following patch solved the issue:
  // select on sockets for specific actions
  struct timeval t;
  t.tv_sec = 0;
- t.tv_usec = 100;
+ t.tv_usec = 10000;
This is the timeout passed to a single select() call; the patch raises it from 100 microseconds to 10 milliseconds. With this small change, the host now reports the same CPU utilization we were seeing inside the linode.
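To put rough numbers on the change, here is a minimal sketch in C of a loop that uses select() purely as a timed wait. It is not our daemon's code: the function names count_timeouts and max_wakeups_per_second are made up for illustration, and the real daemon selects on actual sockets rather than on an empty set. It only shows that a 100 microsecond timeout allows up to 10000 wakeups per second when nothing is ready, while 10000 microseconds allows up to 100.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: call select() with no descriptors `iterations`
 * times and return how many of those calls ended by timing out.
 * With no descriptors to watch, select() acts as a plain timed wait. */
int count_timeouts(long timeout_usec, int iterations)
{
    int timeouts = 0;

    for (int i = 0; i < iterations; i++) {
        struct timeval t;

        /* Re-arm the timeout on every pass: on Linux, select() may
         * modify the struct to reflect the time that was left. */
        t.tv_sec = 0;
        t.tv_usec = timeout_usec;

        /* A return value of 0 means the timeout expired. */
        if (select(0, NULL, NULL, NULL, &t) == 0)
            timeouts++;
    }
    return timeouts;
}

/* Upper bound on wakeups per second implied by the timeout alone,
 * assuming timeout_usec > 0 and that no descriptor ever becomes ready. */
long max_wakeups_per_second(long timeout_usec)
{
    return 1000000L / timeout_usec;
}
```

Calling max_wakeups_per_second(100) gives 10000 and max_wakeups_per_second(10000) gives 100, so the patch cuts the worst-case wakeup rate by a factor of 100. How much host CPU each wakeup costs depends on the hypervisor setup, which this sketch does not model.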
However, we have not been able to investigate further, as we have no access to either the host kernel configuration or the qemu/kvm configuration.
My personal guess is that the timeout was lower than some configured tick timer on the host, and that caused qemu to busy-loop on it.