HE network problems?
We've had two other reports from people on Bell Sympatico in Ontario. With info provided by one, I was able to determine the problem was a Telia link in Chicago. I've contacted the HE NOC and they confirmed that the Telia link in Equinix Chicago was saturated and they are working to upgrade the link capacity.
# traceroute he.net
traceroute to he.net (216.218.186.2), 30 hops max, 52 byte packets
1 192.168.1.1 (192.168.1.1) 0.690 ms 0.681 ms 0.886 ms
2 10.33.128.1 (10.33.128.1) 7.265 ms 7.511 ms 7.544 ms
3 gig2-1.austtxk-rtr2.austin.rr.com (66.68.0.53) 8.432 ms 8.755 ms 7.943 ms
4 srp0-0.austtxrdc-rtr2.austin.rr.com (24.27.12.34) 8.657 ms 7.768 ms 7.789 ms
5 pos10-3.austtxrdc-rtr4.texas.rr.com (66.68.1.106) 63.898 ms 65.673 ms 65.490 ms
6 son0-1-3.hstqtxl3-rtr1.texas.rr.com (24.93.37.61) 65.750 ms 64.234 ms 63.387 ms
7 pop1-hou-P0-2.atdn.net (66.185.133.153) 17.900 ms pop1-hou-P0-1.atdn.net (66.185.133.145) 19.072 ms pop1-hou-P0-0.atdn.net (66.185.133.157) 17.586 ms
8 bb1-hou-P2-0.atdn.net (66.185.150.148) 64.928 ms 62.528 ms 76.737 ms
9 bb1-atm-P7-0.atdn.net (66.185.152.184) 164.928 ms 276.444 ms 202.914 ms
10 pop2-atm-P0-0.atdn.net (66.185.147.209) 36.849 ms 37.650 ms 36.511 ms
11 Telia-atm.atdn.net (66.185.138.46) 87.784 ms 86.298 ms 83.688 ms
12 dls-bb1-link.telia.net (213.248.80.146) 104.869 ms 103.197 ms 105.358 ms
13 hurricane-113208-dls-bb1.c.telia.net (213.248.92.22) 106.121 ms 104.466 ms 103.746 ms
14 pos5-0.gsr12012.lax.he.net (66.160.184.5) 433.108 ms 379.999 ms 449.236 ms
15 * * *
16 * * *
17 * *
@miallen:
My node is totally unresponsive right now. Actually it comes and goes. Is this what you're seeing? How do you know it's HE? What kind of diagnostic are you using?
That's exactly what I'm seeing, yes; it comes and goes, and when it comes it's really slow (~500ms pings). Tracing that route to he.net makes me believe it's a network congestion issue at the datacenter.
$ mtr --report host18.linode.com
HOST: local Loss% Snt Last Avg Best Wrst StDev
1. 192.168.1.1 0.0% 10 0.6 0.8 0.6 1.8 0.4
2. 10.33.128.1 0.0% 10 7.5 16.7 7.4 89.5 25.6
3. gig2-1.austtxk-rtr2.austin.r 0.0% 10 7.9 8.2 7.6 9.9 0.7
4. srp0-0.austtxrdc-rtr2.austin 0.0% 10 8.2 9.6 8.0 12.7 2.0
5. pos10-3.austtxrdc-rtr4.texas 0.0% 10 65.1 64.7 63.4 66.4 1.0
6. son0-0-0.hstqtxl3-rtr1.texas 0.0% 10 105.8 70.9 63.5 105.8 12.9
7. pop1-hou-P0-1.atdn.net 0.0% 10 16.6 16.9 16.2 18.6 0.7
8. bb1-hou-P2-0.atdn.net 10.0% 10 64.7 65.7 63.8 72.3 2.6
9. bb1-atm-P7-0.atdn.net 20.0% 10 84.2 85.4 83.5 87.6 1.4
10. pop2-atm-P0-3.atdn.net 0.0% 10 37.0 38.7 36.8 44.7 2.8
11. Telia-atm.atdn.net 0.0% 10 85.0 87.3 83.5 100.0 5.1
12. dls-bb1-link.telia.net 0.0% 10 103.4 103.8 103.0 105.6 1.0
13. hurricane-113208-dls-bb1.c.t 0.0% 10 103.7 104.0 102.2 112.0 2.8
14. pos5-0.gsr12012.lax.he.net 10.0% 10 389.7 393.7 377.5 413.4 12.1
15. pos3-2.gsr12416.pao.he.net 0.0% 10 130.6 248.6 127.4 344.7 79.7
16. pos5-0.gsr12416.fmt.he.net 10.0% 10 355.2 250.3 154.4 381.4 87.9
17. pos8-0.gsr12012.fmt.he.net 10.0% 10 130.0 132.8 129.2 155.0 8.3
18. host18.fremont.linode.com 10.0% 10 130.2 121.2 78.9 139.2 24.0
But don't let them think they got away without us noticing, Mike!
@Xan:
No, I'm in Texas, actually. Here's a traceroute to he.net.
# traceroute he.net
traceroute to he.net (216.218.186.2), 30 hops max, 52 byte packets
<snip>
13 hurricane-113208-dls-bb1.c.telia.net (213.248.92.22) 106.121 ms 104.466 ms 103.746 ms
14 pos5-0.gsr12012.lax.he.net (66.160.184.5) 433.108 ms 379.999 ms 449.236 ms
</snip>
I had assumed hop 13 here was in Chicago based on the path of the traffic from the other customer in Ontario. It seems from your traceroute it is more likely that it is in Equinix, Los Angeles. At any rate, this is the same router that the Ontario customers were having issues with so it looks like your problems are related after all.
# mtr --report host18.linode.com
HOST: layer0 Loss% Snt Last Avg Best Wrst StDev
1. 241.160.92.64.reverse.layere 0.0% 10 0.5 2.3 0.5 16.7 5.1
2. 10.1.3.13 0.0% 10 0.6 0.6 0.6 0.8 0.1
3. 216.39.69.53 0.0% 10 0.5 0.6 0.4 0.8 0.1
4. bhr2-po-1.fortworthda1.savvi 0.0% 10 0.4 0.5 0.4 0.6 0.1
5. dcr2-so-3-2-0.dallas.savvis. 0.0% 10 1.6 1.6 1.4 1.7 0.1
6. dcr1-so-7-2-0.losangeles.sav 0.0% 10 34.1 34.1 33.9 34.9 0.3
7. bpr3-so-7-0-0.losangelesequi 0.0% 10 34.2 34.2 34.0 34.3 0.1
8. 208.174.196.54 0.0% 10 33.8 38.0 33.8 75.2 13.1
9. hurricane-108839-las-bb1.c.t 0.0% 10 359.7 342.5 318.3 359.8 15.3
10. ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
mtr --report 66.220.1.142
HOST LOSS RCVD SENT BEST AVG WORST
??? 100% 0 16 0.00 0.00 0.00
bogus.union.nj.panjde.comcast.net 0% 16 16 11.37 13.35 22.42
po10-ur02.union.nj.panjde.comcast.net 0% 16 16 11.38 13.11 15.44
po10-ur01.jerseycity.nj.panjde.comcast.net 0% 16 16 12.01 13.50 17.35
po10-ur02.jerseycity.nj.panjde.comcast.net 0% 16 16 11.84 13.76 15.93
po10-ur01.narlington.nj.panjde.comcast.net 0% 16 16 12.75 14.32 17.82
po70-ar01.verona.nj.panjde.comcast.net 0% 16 16 13.13 24.51 178.02
ar01.plainfield.nj.panjde.comcast.net 0% 16 16 13.60 15.59 23.77
GE-2-0-cr01.plainfield.nj.core.comcast.net 0% 16 16 13.46 14.83 17.13
12.118.149.9 0% 16 16 14.38 16.95 23.59
tbr2-p014001.n54ny.ip.att.net 0% 16 16 15.51 17.22 19.74
12.123.0.93 0% 16 16 14.96 16.29 18.34
nyk-b2-link.telia.net 0% 16 16 14.29 17.00 30.07
nyk-bb2-pos1-1-0.telia.net 0% 16 16 14.57 15.81 17.08
chi-bb1-pos7-0-0-0.telia.net 0% 16 16 35.42 38.68 48.66
dls-bb1-pos6-0-0.telia.net 0% 16 16 56.44 61.10 108.59
hurricane-113208-dls-bb1.c.telia.net 0% 16 16 56.19 57.69 58.84
pos5-0.gsr12012.lax.he.net 0% 16 16 89.36 92.35 100.59
??? 100% 0 16 0.00 0.00 0.00
@miallen:
My node is totally unresponsive right now. Actually it comes and goes. Is this what you're seeing? How do you know it's HE? What kind of diagnostic are you using?
The traceroute and mtr output is what one would expect from a saturated link. Saturation can happen when a link lacks the capacity to handle normal peak traffic at the busiest part of the day. Another frequent cause is an outage on one link pushing much more traffic than normal onto another link that is still up. Upon finding this I contacted the HE NOC, and they confirmed that the problem was indeed a saturated link and that they were working with Telia to get it upgraded.
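For anyone who wants to compare several of the reports pasted in this thread side by side, here is a minimal Python sketch that turns `mtr --report` text into per-hop records. It assumes the column layout seen in most of the reports here (Loss%, Snt, Last, Avg, Best, Wrst, StDev); the names `HOP_LINE` and `parse_mtr_report` are made up for illustration, and the other layout posted in this thread (LOSS RCVD SENT BEST AVG WORST) is not handled.

```python
import re

# Matches one hop line of the `mtr --report` layout pasted in this thread, e.g.
#   "13. pos5-0.gsr12012.lax.he.net 10.0% 10 366.4 349.0 319.9 366.4 15.3"
# The "%" is optional because unresponsive hops appear here as "100.0".
HOP_LINE = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([\d.]+)%?\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)

def parse_mtr_report(text):
    """Return one dict per hop line; the header and any other lines are skipped."""
    hops = []
    for line in text.splitlines():
        m = HOP_LINE.match(line)
        if not m:
            continue
        hops.append({
            "hop": int(m.group(1)),
            "host": m.group(2),
            "loss_pct": float(m.group(3)),
            "sent": int(m.group(4)),
            "last_ms": float(m.group(5)),
            "avg_ms": float(m.group(6)),
            "best_ms": float(m.group(7)),
            "worst_ms": float(m.group(8)),
            "stdev_ms": float(m.group(9)),
        })
    return hops

# Two hops copied from the Florida report in this thread.
sample = """\
HOST: ????? Loss% Snt Last Avg Best Wrst StDev
12. hurricane-113208-dls-bb1.c.t 0.0% 10 48.7 49.1 45.5 60.0 4.0
13. pos5-0.gsr12012.lax.he.net 10.0% 10 366.4 349.0 319.9 366.4 15.3
"""

for hop in parse_mtr_report(sample):
    print(hop["hop"], hop["host"], hop["loss_pct"], hop["avg_ms"])
```

Once the hops are in this form it is easy to line up loss and average latency per hop across reports from different locations.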
HOST: ????? Loss% Snt Last Avg Best Wrst StDev
1. router.local 0.0% 10 1.1 1.1 0.9 1.6 0.3
2. 10.125.192.1 0.0% 10 8.3 10.4 6.9 22.1 4.4
3. hsrp1-cs.myacc.net 0.0% 10 7.9 10.0 7.5 15.5 2.4
4. border5.3-5.advancecable-7.m 0.0% 10 9.5 13.1 9.5 29.4 6.7
5. core2.pc1.bbnet1.mia003.pnap 0.0% 10 16.2 11.8 9.6 16.5 2.6
6. 12.118.175.81 0.0% 10 9.3 10.8 9.3 13.0 1.3
7. tbr2-p013603.ormfl.ip.att.ne 0.0% 10 26.8 29.3 26.7 38.1 3.5
8. tbr1-cl1474.attga.ip.att.net 0.0% 10 27.0 28.5 26.9 33.4 2.0
9. ggr2-ge00.attga.ip.att.net 0.0% 10 26.4 44.9 25.8 181.8 48.8
10. 192.205.33.42 0.0% 10 28.7 30.7 25.3 55.8 9.4
11. dls-bb1-link.telia.net 0.0% 10 54.4 53.9 47.3 91.1 13.3
12. hurricane-113208-dls-bb1.c.t 0.0% 10 48.7 49.1 45.5 60.0 4.0
13. pos5-0.gsr12012.lax.he.net 10.0% 10 366.4 349.0 319.9 366.4 15.3
14. pos3-2.gsr12416.pao.he.net 10.0% 10 239.0 112.2 88.0 239.0 48.7
15. pos5-0.gsr12416.fmt.he.net 10.0% 10 169.0 212.9 122.3 359.2 83.5
16. pos10-0.gsr12012.fmt.he.net 0.0% 10 90.8 90.8 87.7 106.2 5.5
17. host16.fremont.linode.com 10.0% 10 88.8 89.5 88.5 91.2 1.2
Excuse my ignorance but why do the high latencies show up on the second router after the Telia link?
@raman:
This is the trace from Coral Springs, Florida:
HOST: ????? Loss% Snt Last Avg Best Wrst StDev
<snip>
12. hurricane-113208-dls-bb1.c.t 0.0% 10 48.7 49.1 45.5 60.0 4.0
13. pos5-0.gsr12012.lax.he.net 10.0% 10 366.4 349.0 319.9 366.4 15.3
</snip>
Excuse my ignorance but why do the high latencies show up on the second router after the Telia link?
Number 12 is a Telia router. Number 13 is an HE router. The latency and packet loss start at hop 13 and continue. This indicates the problem is the link between these two routers.
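The same reasoning can be written down as a rough heuristic. The Python sketch below is only an illustration, not a tool anyone in this thread used: it looks for the first pair of adjacent hops where average latency jumps sharply, and reports that pair only if the trouble is still visible at the final hop. The function name `suspect_link`, the 100 ms default for `min_jump_ms`, and the "still visible at the destination" rule are all assumptions chosen for this example. The input rows are the last eight hops of the Florida report, as (host, loss %, average ms).

```python
def suspect_link(hops, min_jump_ms=100.0):
    """Return (near_host, far_host) for the first adjacent pair of hops where
    average latency rises by at least min_jump_ms and the destination still
    shows loss or similarly elevated latency. Return None if no pair fits.

    hops: list of (host, loss_pct, avg_ms) tuples in path order.
    """
    # Drop hops that never answered (100% loss); their 0.0 ms averages
    # would otherwise look like a huge jump at the next hop.
    hops = [h for h in hops if h[1] < 100.0]
    if len(hops) < 2:
        return None
    _, dest_loss, dest_avg = hops[-1]
    for i in range(1, len(hops)):
        near_host, _, near_avg = hops[i - 1]
        far_host, _, far_avg = hops[i]
        if far_avg - near_avg < min_jump_ms:
            continue
        # "Starts here and continues": the problem must reach the last hop.
        if dest_loss > 0 or dest_avg - near_avg >= min_jump_ms:
            return near_host, far_host
    return None

# Hops 10-17 of the Florida report: (host, loss %, average latency in ms).
raman_hops = [
    ("192.205.33.42", 0.0, 30.7),
    ("dls-bb1-link.telia.net", 0.0, 53.9),
    ("hurricane-113208-dls-bb1.c.t", 0.0, 49.1),
    ("pos5-0.gsr12012.lax.he.net", 10.0, 349.0),
    ("pos3-2.gsr12416.pao.he.net", 10.0, 112.2),
    ("pos5-0.gsr12416.fmt.he.net", 10.0, 212.9),
    ("pos10-0.gsr12012.fmt.he.net", 0.0, 90.8),
    ("host16.fremont.linode.com", 10.0, 89.5),
]

print(suspect_link(raman_hops))
```

On this data the function picks out the Telia router at hop 12 and the HE router at hop 13. A single slow hop whose effect does not reach the destination is skipped, since a router can be slow to answer probes while still forwarding traffic normally. With only 10 probes per hop the loss figures are noisy, so treat the result as a hint rather than proof.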
@mikegrb:
Number 12 is a Telia router. Number 13 is an HE router. The latency and packet loss start at hop 13 and continue. This indicates the problem is the link between these two routers.
Got it, I guess the router name got chopped off in the report…
Cheers,
Raman
traceroute to www.taupehat.com (64.62.231.41), 30 hops max, 40 byte packets
1 * * *
2 eugn-dsl-gw01-193.eugn.qwest.net (0.0.0.0) 41.127 ms 40.991 ms 41.097 ms
3 eugn-agw1.inet.qwest.net (67.42.192.93) 41.322 ms 41.303 ms 56.599 ms
4 egn-core-01.inet.qwest.net (205.171.150.33) 41.102 ms 41.082 ms 40.921 ms
5 stl-core-01.inet.qwest.net (205.171.5.125) 48.698 ms 47.446 ms 58.311 ms
6 svx-core-02.inet.qwest.net (67.14.1.58) 64.940 ms 64.945 ms 72.331 ms
7 sjp-brdr-01.inet.qwest.net (205.171.214.138) 64.826 ms 64.921 ms 65.471 ms
8 sjo-bb1-geth1-2-0.telia.net (213.248.86.13) 65.049 ms 64.498 ms 73.814 ms
9 las-bb1-pos7-0-0-0.telia.net (213.248.80.17) 104.238 ms 76.053 ms 76.322 ms
10 hurricane-108839-las-bb1.c.telia.net (213.248.94.42) 454.248 ms 461.861 ms 506.956 ms
11 pos3-2.gsr12416.pao.he.net (65.19.129.1) 708.738 ms 637.735 ms 458.944 ms
12 pos5-0.gsr12416.fmt.he.net (216.218.229.33) 608.642 ms 475.644 ms 531.779 ms
13 pos8-0.gsr12012.fmt.he.net (66.220.20.138) 458.224 ms 461.637 ms 456.687 ms
14 taupehat.com (64.62.231.41) 451.295 ms 478.868 ms 483.318 ms
$ mtr --report host18.linode.com
HOST: newton.betadome.net Loss% Snt Last Avg Best Wrst StDev
1. up3.linode.com 0.0% 10 0.9 0.7 0.6 0.9 0.1
2. vl1.dsr02.dllstx2.theplanet. 0.0% 10 0.7 18.8 0.6 117.3 40.1
3. vl22.dsr02.dllstx3.theplanet 0.0% 10 0.9 10.2 0.9 93.2 29.2
4. 25.7f.5546.static.theplanet. 0.0% 10 1.1 1.0 0.9 1.1 0.1
5. dal-ix.he.net 0.0% 10 1.0 0.9 0.9 1.1 0.1
6. pos5-0.gsr12012.lax.he.net 0.0% 10 35.4 35.4 35.3 35.6 0.1
7. pos3-2.gsr12416.pao.he.net 0.0% 10 49.1 98.6 48.9 160.6 48.5
8. pos5-0.gsr12416.fmt.he.net 0.0% 10 49.5 82.6 49.4 303.5 81.3
9. pos10-0.gsr12012.fmt.he.net 0.0% 10 49.8 49.5 49.4 49.8 0.1
10. host18.fremont.linode.com 0.0% 10 49.9 50.0 49.6 50.5 0.3
Something's up at HE, that's for sure.
anubis cdrom # mtr --report host18.linode.com
HOST: anubis Loss% Snt Last Avg Best Wrst StDev
1. 192.168.0.1 0.0% 10 0.8 1.1 0.8 3.2 0.7
2. ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
3. GE-1-6-ur04.beaverton.or.bve 0.0% 10 10.9 10.3 8.9 11.3 0.8
4. 10g-9-2-ur03.beaverton.or.bv 0.0% 10 10.1 13.1 9.0 25.9 5.1
5. 10g-9-2-ar01.beaverton.or.bv 50.0% 10 10.3 10.3 9.6 12.2 1.1
6. 12.119.199.21 0.0% 10 32.3 31.6 29.9 34.1 1.2
7. 12.127.6.42 0.0% 10 34.8 34.1 31.5 39.1 2.1
8. tbr2-cl10.sffca.ip.att.net 0.0% 10 32.9 32.2 30.6 34.8 1.1
9. 12.122.82.169 0.0% 10 33.3 37.4 30.1 54.3 7.6
10. 192.205.33.26 0.0% 10 35.8 33.5 31.7 35.8 1.2
11. las-bb1-pos7-0-0-0.telia.net 0.0% 10 42.7 54.3 40.9 162.4 38.0
12. hurricane-108839-las-bb1.c.t 10.0% 10 487.6 467.0 428.4 487.6 19.9
13. pos3-2.gsr12416.pao.he.net 10.0% 10 698.3 500.0 431.1 698.3 77.6
14. pos5-0.gsr12416.fmt.he.net 10.0% 10 503.3 482.8 440.5 522.5 28.5
15. pos10-0.gsr12012.fmt.he.net 10.0% 10 500.8 478.4 449.8 502.2 23.0
16. host18.fremont.linode.com 10.0% 10 500.6 476.6 456.5 500.6 20.0
# mtr --report 66.220.1.142
HOST LOSS RCVD SENT BEST AVG WORST
??? 100% 0 16 0.00 0.00 0.00
ge-2-1-ur01.union.nj.panjde.comcast.net 0% 16 16 11.56 13.15 14.73
po10-ur02.union.nj.panjde.comcast.net 0% 16 16 11.05 13.14 16.52
po10-ur01.jerseycity.nj.panjde.comcast.net 0% 16 16 11.93 13.81 19.66
po10-ur02.jerseycity.nj.panjde.comcast.net 0% 16 16 11.57 19.96 110.82
po10-ur01.narlington.nj.panjde.comcast.net 0% 16 16 11.93 14.28 21.52
po70-ar01.verona.nj.panjde.comcast.net 0% 16 16 12.64 14.78 18.15
ar01.plainfield.nj.panjde.comcast.net 0% 16 16 13.81 15.51 19.57
GE-2-0-cr01.plainfield.nj.core.comcast.net 0% 16 16 13.99 15.49 17.04
12.118.149.9 0% 16 16 14.25 15.51 16.85
tbr2-p014001.n54ny.ip.att.net 0% 16 16 15.42 17.04 19.49
12.123.0.93 0% 16 16 14.86 18.43 36.70
nyk-b2-link.telia.net 0% 16 16 14.83 16.61 25.72
nyk-bb1-pos7-2-0.telia.net 0% 16 16 14.74 16.54 18.19
chi-bb1-pos6-0-0-0.telia.net 0% 16 16 35.28 37.17 41.15
dls-bb1-pos6-0-0.telia.net 0% 16 16 56.30 58.28 64.94
hurricane-113208-dls-bb1.c.telia.net 0% 16 16 56.29 57.77 59.05
pos5-0.gsr12012.lax.he.net 7% 15 16 329.76 349.14 372.88
pos3-2.gsr12416.pao.he.net 0% 16 16 100.80 186.87 364.52
pos5-0.gsr12416.fmt.he.net 0% 16 16 97.25 210.64 462.22
pos10-0.gsr12012.fmt.he.net 7% 15 16 93.03 95.69 98.28
li4-142.members.linode.com 7% 15 16 94.39 96.84 105.06