Breaks in reporting graphs?

I'm trying to pinpoint the source of recent high I/O warnings and CPU usage since the forced security update in early March.

Our Linode actually failed at reboot during that maintenance window, and I wasn't aware of it. The Host Job Queue reported a failed job, which I only noticed 8 days ago when I updated cPanel and the system packages. I rebooted the Linode after those updates.

Since then, I've received a lot more I/O warnings, like: "has exceeded the notification threshold (1000) for disk io rate by averaging 1763.29 for the last 2 hours."

The only thing I can correlate with this issue is the "breaks" in my Linode reporting graphs, which aren't happening on my other Linode. See this screenshot: https://www.evernote.com/shard/s9/sh/116e73e2-22d0-4967-b781-e47b47776882/982e368fd93286407050d066744a3687

January reports no breaks

February reports 1

March shows 15+

Last 24 hours shows 8

We haven't added any significant resource offenders over this time span. Could there be some incompatibility between our version of Linux (CentOS 5.11), the kernel, and the recent Linode upgrades?

Here is my load average output from the sar -q command:

12:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15

12:10:01 AM 1 179 0.27 0.29 0.32

12:20:01 AM 2 180 0.14 0.30 0.33

12:30:01 AM 3 183 0.37 0.39 0.35

12:40:01 AM 1 181 0.31 0.32 0.32

12:50:01 AM 1 180 0.29 0.38 0.34

01:00:01 AM 3 189 0.20 0.26 0.31

01:10:01 AM 1 179 0.24 0.39 0.37

01:20:01 AM 2 181 0.34 0.42 0.42

01:30:01 AM 1 178 0.25 0.32 0.37

01:40:01 AM 1 228 0.54 0.46 0.42

01:50:01 AM 1 185 0.50 0.41 0.41

02:00:01 AM 4 185 0.21 0.61 0.55

02:10:01 AM 1 178 0.38 0.52 0.54

02:20:01 AM 2 179 0.12 0.36 0.47

02:30:01 AM 3 181 0.16 0.25 0.37

02:40:01 AM 2 182 0.16 0.24 0.31

02:50:01 AM 3 180 0.36 0.28 0.31

03:00:01 AM 3 197 0.42 0.31 0.31

03:10:01 AM 2 177 0.17 0.23 0.28

03:20:01 AM 2 180 0.14 0.36 0.38

03:30:01 AM 2 191 0.56 0.46 0.42

03:40:01 AM 1 178 0.34 0.48 0.47

03:50:01 AM 1 184 0.49 0.48 0.47

04:00:01 AM 3 184 0.26 0.32 0.39

04:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15

04:10:01 AM 1 186 0.97 0.53 0.43

04:20:01 AM 2 182 0.26 0.29 0.35

04:30:01 AM 3 187 0.27 0.36 0.39

04:40:01 AM 1 178 0.36 0.32 0.36

04:50:01 AM 1 181 0.44 0.59 0.48

05:00:01 AM 3 187 0.33 0.32 0.40

05:10:01 AM 1 204 0.46 0.63 0.54

05:20:01 AM 3 203 0.80 0.75 0.64

05:30:01 AM 3 190 1.19 1.11 0.88

05:40:01 AM 2 186 1.29 1.35 1.11

05:50:01 AM 2 195 1.14 1.11 1.09

06:00:01 AM 4 192 0.87 1.04 1.09

06:10:01 AM 1 179 0.21 0.69 0.95

06:20:01 AM 2 187 0.42 0.49 0.72

06:30:01 AM 2 187 0.91 0.95 0.87

06:40:01 AM 1 180 0.47 0.48 0.65

06:50:01 AM 1 178 0.41 0.62 0.66

07:00:01 AM 2 193 0.35 0.32 0.47

07:10:01 AM 3 181 0.89 0.59 0.51

07:20:01 AM 2 179 0.27 0.33 0.41

07:30:01 AM 1 183 0.22 0.33 0.38

07:40:01 AM 2 181 1.03 0.67 0.51

07:50:01 AM 3 192 1.11 1.07 0.80

08:00:01 AM 4 188 0.25 0.60 0.74

08:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15

08:10:01 AM 1 198 0.62 1.12 1.23

08:20:01 AM 2 210 0.69 0.51 0.82

08:30:01 AM 4 183 0.12 0.38 0.63

08:40:01 AM 1 187 0.66 0.46 0.55

08:50:01 AM 1 188 0.45 0.47 0.53

09:00:01 AM 4 193 0.73 0.47 0.48

09:10:01 AM 1 190 0.49 0.64 0.57

09:20:01 AM 3 198 0.49 0.51 0.52

09:30:01 AM 2 184 0.56 0.54 0.52

09:40:01 AM 1 190 0.54 0.49 0.50

09:50:01 AM 2 208 0.53 0.61 0.56

10:00:01 AM 3 198 0.66 0.82 0.75

10:10:01 AM 2 193 0.30 0.61 0.70

10:20:01 AM 3 203 0.54 0.51 0.59

10:30:01 AM 2 196 1.06 0.87 0.72

Average: 2 188 0.49 0.53 0.54
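To make the table above easier to scan, here is a minimal sketch that parses pasted sar -q output and lists the samples where the 1-minute load average crosses a cutoff. It assumes Python 3 (not the default on CentOS 5, so it may be easier to run on another machine with the output pasted in). The names `parse_sar_q` and `busiest` and the cutoff of 1.0 are illustrative choices, not anything provided by sar or Linode.

```python
from typing import List, NamedTuple


class LoadSample(NamedTuple):
    time: str
    runq_sz: int
    plist_sz: int
    ldavg_1: float
    ldavg_5: float
    ldavg_15: float


def parse_sar_q(text: str) -> List[LoadSample]:
    """Parse 12-hour-clock `sar -q` output into LoadSample rows.

    Blank lines, repeated header rows, and the trailing "Average:" line
    are skipped.
    """
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: HH:MM:SS AM|PM runq plist ld1 ld5 ld15
        if len(fields) != 7 or fields[1] not in ("AM", "PM"):
            continue
        try:
            samples.append(LoadSample(
                fields[0] + " " + fields[1],
                int(fields[2]), int(fields[3]),
                float(fields[4]), float(fields[5]), float(fields[6]),
            ))
        except ValueError:
            # Header rows have column names where the numbers would be.
            continue
    return samples


def busiest(samples: List[LoadSample], cutoff: float = 1.0) -> List[LoadSample]:
    """Return the samples whose 1-minute load average is at or above cutoff."""
    return [s for s in samples if s.ldavg_1 >= cutoff]


# A few rows copied from the output above, including a header and the average.
SAMPLE = """\
04:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15

05:20:01 AM 3 203 0.80 0.75 0.64

05:30:01 AM 3 190 1.19 1.11 0.88

05:40:01 AM 2 186 1.29 1.35 1.11

Average: 2 188 0.49 0.53 0.54
"""

for sample in busiest(parse_sar_q(SAMPLE)):
    print(sample.time, sample.ldavg_1)
```

Load average only hints at I/O pressure indirectly, so this is a way to line up busy periods with the warning emails rather than a measurement of disk activity itself.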

2 Replies

The breaks in the graphs come from times when the server that generates the graphs couldn't communicate with the host for some reason. They happen sometimes and have nothing to do with your node. As for the I/O warnings, 1763 isn't that high, but you could try using iotop to watch the I/O usage. Swap usage is also often a cause of increased I/O.
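As a rough way to check the swap side of this, here is a minimal sketch that works out how much swap is in use from the contents of /proc/meminfo, using the SwapTotal and SwapFree fields that Linux reports in kB. It assumes Python 3 is available; the function names and the sample figures are made up for illustration and are not taken from this server.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # The first token after the colon is the number; a unit may follow.
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def swap_used_kb(meminfo: Dict[str, int]) -> int:
    """Swap currently in use, in kB (SwapTotal minus SwapFree)."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


# Hypothetical figures, only to show the expected input format.
EXAMPLE_MEMINFO = """\
MemTotal:        2048000 kB
MemFree:          150000 kB
SwapTotal:        262144 kB
SwapFree:         200000 kB
"""

info = parse_meminfo(EXAMPLE_MEMINFO)
print("swap used (kB):", swap_used_kb(info))
```

On the server itself, the same functions can be fed the real file with `parse_meminfo(open("/proc/meminfo").read())`. Swap in use is only a proxy: it is active swapping in and out that generates disk I/O, so a steadily growing number across several checks is more telling than a single reading.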

That's what Linode support said as well. While I appreciate that, it just seems like a massive amount suddenly. Especially when looking at past months where this happened maybe once or twice – now we're well past 30.
