Saturday, April 16, 2011

Looking into 3G - and why my Skype failed

I spent a frustrating morning trying to upload/restore a friend's backup to his server and Skype my cousin at the same time. I am currently using the Safaricom 3G service as my primary connection but had to fall back to an alternate provider's broadband for this upload (I required a consistent, uninterrupted service for it). 3G has always served me well since I moved my lab - more like sold some of mine to use the work lab...

In the meantime, I started messing around with some tools, trying to figure out this 3G 'issue' and the effect of large buffers, more out of curiosity than anything else. It (3G) really serves me well when it's working; that, and I was bored...

A few things to note:
- Today is a Saturday, so I expect more contention since the sites around here serve residential/home users. Which means that, with my large files, TCP is wreaking havoc as usual.

Buffers on all the network elements are shared and distributed among all clients, the radio controllers are shared and obviously we share the internet backhaul networks. That initial connection to the Radio is what I was curious about.

We have gone through cycles of high capacity at the edge, then at the core, then at the edge again. In the past, dial-up users in Kenya rarely filled an ISP's capacity between them. Newer technologies like DSL, frame relay and PPP multilink saved the consumer but moved the bottleneck to the core.

The internet has essentially one method of signalling congestion: dropping packets. This is the only way a sender notices that 'hey, that packet never arrived' and does something about it. Windowing (TCP) is built around this mechanism. The other mechanism, far less widely used, is Explicit Congestion Notification (ECN). It's like a friend driving in the opposite direction telling you on your way to work, 'Hey, the road is flooded back there, use another route or don't go at all.'
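
To make the 'back off when a packet is lost' idea concrete, here is a rough sketch of additive-increase/multiplicative-decrease, the textbook behaviour behind TCP's congestion window. It is a simplification under my own assumptions (no slow start, no timeouts, one loss signal per round trip), and the function name and parameters are made up for illustration; it is not how any real TCP stack is written.

```python
def aimd(events, cwnd=1.0, increase=1.0, decrease=0.5, floor=1.0):
    """Return the congestion window (in segments) after each round trip.

    events: iterable of booleans, True meaning a loss was detected
    during that round trip. All parameters are illustrative defaults.
    """
    history = []
    for lost in events:
        if lost:
            # Loss is the congestion signal: cut the window sharply.
            cwnd = max(floor, cwnd * decrease)
        else:
            # No loss: probe for more bandwidth, one segment per RTT.
            cwnd += increase
        history.append(cwnd)
    return history


if __name__ == "__main__":
    # Six clean round trips, one loss, then three clean round trips.
    print(aimd([False] * 6 + [True] + [False] * 3))
```

If the network never drops (or marks) a packet, the `lost` branch never runs and the window just keeps growing, which is the problem with oversized buffers described next.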

The best solution is always more capacity; however, you can only get so much with 3G/EDGE/GPRS. What most computers and home routers have nowadays is huge buffers. Buffers increase delay because the packet is held longer. Which means some packets get to their destination pretty much useless. It's like being stuck in a traffic jam past your doctor's appointment time: getting there late is useless. So the very solution you build in (a longer jam controlled by a traffic cop) tends to break the network more.

Remember, the internet and our networks rely on packets dropping to deal with congestion. Excessive buffering breaks that.

So back to 3G. Please note that most of what powers 3G and EDGE (actually, let's focus on 3G) was designed at a time when telecommunication networks didn't care much about data. So obviously transmitting 1500 bytes as a single packet is pretty much impossible (i.e. the MTU on most of those systems is much, much lower). This obviously calls for a lot of what TCP is known for: fragment, transmit, reorder and ---- buffering.

Unfortunately I decided on this article at a time when the 3G network seems to be okay. At least the RTTs are not as bad as earlier in the day.
C:\Documents and Settings\jgitau>ping 196.201.208.2

Pinging 196.201.208.2 with 32 bytes of data:

Reply from 196.201.208.2: bytes=32 time=84ms TTL=56
Reply from 196.201.208.2: bytes=32 time=104ms TTL=56
Reply from 196.201.208.2: bytes=32 time=83ms TTL=56
Reply from 196.201.208.2: bytes=32 time=111ms TTL=56

Ping statistics for 196.201.208.2:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 83ms, Maximum = 111ms, Average = 95ms


So what could possibly have been happening when things were not working out for me? When you are served by a busy RNC, you have to wait for some time to retransmit the damaged packets, or for the RNC to retransmit them back to you (TCP 101). Most of these are buffered awaiting completion (remember each packet is fragmented, then put back together for onward transmission). Also remember TCP is end to end; however, on a 3G network the said 'end points' are actually multiple endpoints. You probably use up about 8 - 10 IP addresses for each connection - RNC to the core, SGSN to GGSN, GGSN to the internet etc. - and each of those elements has to bring up a session for you to transmit....

Because congestion is never signalled, the endpoints never back off and the buffers fill up. The buffers stay full until the load lessens. Suddenly all of you 'clients' are suffering and complaining, but the RNC can't really do much for you now, can it?... So is buffering bad?

This whole thing becomes worse when you try tuning stuff and realize that the bandwidth for 3G is variable. I say pick an amount, let's say a conservative 128K, and tune your system with that if you are so inclined.
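
One way to read 'tune your system' is to size the TCP window to the bandwidth-delay product (BDP). The sketch below assumes '128K' means 128 kbit/s and borrows the 95 ms average from the ping above as the round-trip time; both are assumptions, and the RTT to a real destination will be higher than the RTT to a nearby address.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes.

    rate_bps: link rate in bits per second (assumed, not measured)
    rtt_ms:   round-trip time in milliseconds
    """
    # bits/s * ms = millibits; divide by 1000 for bits, by 8 for bytes.
    return rate_bps * rtt_ms / 8000


if __name__ == "__main__":
    for rtt in (95, 300, 1000):
        print(f"128 kbit/s at {rtt} ms RTT: {bdp_bytes(128_000, rtt):.0f} bytes in flight")
```

At 128 kbit/s and 95 ms that is 1520 bytes, about one full-sized packet. Anything queued beyond that is not filling the pipe, it is just waiting in a buffer somewhere.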

I have no point here today other than to say that 3G networks are not easy to predict. The RNC is the first bit that actually deals with your packet and more often than not is going to be the first culprit when congestion occurs. Everything else from there on is able to handle larger packet sizes. Ehh, no wait, there's an SGSN just after that :-).... End-to-end QoS could help but I know of no one implementing it... I do however look forward to LTE and maybe a technology like HSUPA - what that does is reduce the number of buffers you have to deal with.

Sooo tools I use frequently or would like to use more of: - I put some of them here just so I remember where to find them....:-)
tstat
M-Lab has a set of tools
xplot
tcptrace
netalyzr and a sample output from my 3g connection

sample output:
Network buffer measurements (?): Uplink 3500 ms, Downlink 430 ms
We estimate your uplink as having 3500 msec of buffering. This is quite high, and you may experience substantial disruption to your network performance when performing interactive tasks such as web-surfing while simultaneously conducting large uploads. With such a buffer, real-time applications such as games or audio chat can work quite poorly when conducting large uploads at the same time.
We estimate your downlink as having 430 msec of buffering. This level may serve well for maximizing speed while minimizing the impact of large transfers on other traffic. 
 

Note the uplink buffer above. So obviously my Skype suffered if I uploaded the 'huge' files on one computer while skyping on another.
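
For a feel of what 3500 ms of buffering means, here is a back-of-the-envelope sketch relating a queued backlog to the time a newly arriving packet (say, a Skype voice frame) waits behind it. I did not measure the uplink rate, so the 384 kbit/s below is purely an assumed figure for illustration, and the function names are mine.

```python
def queue_delay_ms(backlog_bytes, rate_bps):
    """Time in ms for a FIFO queue of backlog_bytes to drain at rate_bps."""
    return backlog_bytes * 8000 / rate_bps


def backlog_bytes(delay_ms, rate_bps):
    """Inverse: bytes that must be queued to cause delay_ms of waiting."""
    return rate_bps * delay_ms / 8000


if __name__ == "__main__":
    assumed_uplink_bps = 384_000  # assumption, not a measurement
    queued = backlog_bytes(3500, assumed_uplink_bps)
    print(f"3500 ms at the assumed rate is about {queued:.0f} bytes queued")
    # If the radio drops to a lower rate, the same backlog takes longer to drain.
    print(f"Same backlog at 128 kbit/s: {queue_delay_ms(queued, 128_000):.0f} ms")
```

Under that assumption a bulk upload only needs to park about 168,000 bytes in the uplink queue for every voice packet behind it to arrive three and a half seconds late.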

Wednesday, April 13, 2011

why designing networks is cool!

When companies engage a network designer, be it in-house or a consultant, one of the most beautiful things is that post-implementation feeling: there's always a change, mostly for the better. The value is visible, the ROI immediate - well, almost.

Quite a number of design recommendations are left out and compromises are made; it's very engaging. Also, as was the case with the last major design work I undertook, some companies do actually get into the design process with a clear understanding of the role they must play, what is required, the support they must accord and a willingness to let their networks be transformed by it (the process). I'll also add that having the right engineering team to push good decisions that look unnecessary to management is sometimes necessary.

There has to be a management solidly behind innovation, new technologies and new techniques of doing things. For instance, the choice between EIGRP, OSPF or IS-IS should really not start a debate with management; ditto anycast vs load balancers for some services. Let the guys decide and justify their design.

Good design also happens to be a single element in the overall system. It has to be supported by the business. It has to influence the business, it has to be fed by the business, it has to fit into its culture and support its products.

In the end it's fun watching a good design get implemented; it's even better watching others work on it, change things, enhance it, grow it. It's very satisfying.

Tuesday, April 12, 2011

More exams!!

Several things are slowly taking shape. One is annoying:

I still can't find a credible Cisco learning partner to work with towards getting a CCSI.


It could however prove to be an interesting opportunity too...

Monday, April 11, 2011

8 June, 2011 - World IPv6 Day

I hate that this blog hasn't focused a lot more on IPv6. I take solace in the fact that mobile networks are not going to IPv6 soon (mainly out of ignorance, if you ask me). In fact I suspect they will have to be forced to use it, since no one will be thinking about it if the decision is left to the guys I see making current decisions in the telco space (imagine if Apple released an IPv6-only iPhone).

Mobile operators stand to benefit the most from IPv6, mainly from M2M applications/communications. Incidentally, people so afraid of change are unfortunately in charge of moving us forward (from the regulator to the operators). Focus on mobile number portability has wasted lots of time. A few people saw it as the dead end it seems to be.

It's a clear case of the blind leading the sighted :-) I see it in the whole industry: there's a lot of talk in mailing lists about 'issues' but no action *Please read disclaimer below if you're about to rant*. Politics doesn't get work done.

It will be a consultant's field day :-) when IPv6 gets forced on the networks. Closer to home, we have some internet peering but don't have a single service on IPv6 (2c0f:fe38::/32): from the Cable and Wireless looking glass you'll find us represented :-) I would really like to have some IPv6 PDP contexts activated and an IPv6 DMZ, to test end-to-end mobile IPv6.

inet6.0: 5546 destinations, 31745 routes (5535 active, 0 holddown, 14 hidden)
+ = Active Route, - = Last Active, * = Both

2c0f:fe38::/32     *[BGP/170] 2w3d 09:10:07, MED 0, localpref 80
                      AS path: 6453 33771 I
                    > to 2001:5002:100:4::2 via ae0.1404

* So yes, our network is IPv6 ready and we can definitely provide IPv6 connectivity, but again we haven't really tested any service - yet - and you won't have many places to 'go' to that are IPv6 enabled. I do however wish you'd begin testing. Believe me, you'll save money in the near future.

We haven't progressed the IPv6 initiative as much as we should have in Kenya either, though the network guys seem ready. The local exchange point has a bunch of us peering over IPv6, but as yet we have no applications running on it - apart from DNS and, hmm, I wonder if the Google Global Cache reachable through KIXP is IPv6 enabled.


Tracing to ipv6.google.com uses our international link so I guess not, or I used the wrong FQDN.

Primary#traceroute ipv6 ipv6.google.com
Type escape sequence to abort.
Tracing the route to 2A00:1450:8002::93

  1 2001:5A0:C00:100::35 [AS 6453] 292 msec
    2001:5A0:C00:100::15 224 msec
    2001:5A0:C00:100::35 248 msec
  2 2001:5A0:2A00:100::1 [AS 6453] 180 msec 180 msec 180 msec
  3 2001:5A0:2000:400::2 [AS 6453] 188 msec 188 msec 184 msec
  4 2A01:3E0:FFF0:400::D [AS 6453] 188 msec 188 msec 188 msec
  5 2A01:3E0:FF80:100::9 [AS 6453] 200 msec 196 msec 196 msec
  6 2A01:3E0:FF20::3A [AS 6453] 196 msec 220 msec 196 msec
  7 2001:7F8::3B41:0:1 [AS 6453] 200 msec 228 msec 200 msec
  8 2001:4860::1:0:10 [AS 6453] 228 msec 200 msec 200 msec
  9 2001:4860::1:0:8 [AS 6453] 208 msec 208 msec 204 msec
 10 2001:4860::8:0:2AC3 [AS 6453] 212 msec 212 msec 212 msec
 11 2001:4860::2:0:87D [AS 6453] 212 msec 208 msec 220 msec
 12 2001:4860:0:1::25 [AS 6453] 216 msec
    2001:4860:0:1::23 212 msec
    2001:4860:0:1::25 220 msec
 13 2A00:1450:8002::93 [AS 6453] 208 msec 212 msec 208 msec


I hope and wish to have a full IPv6 DMZ (DNS, SMTP, NTP, POP, WWW, WAP, looking glass etc.) by World IPv6 Day.
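
Just to sketch what numbering such a DMZ could look like, the snippet below uses Python's standard `ipaddress` module to carve one /64 out of the 2c0f:fe38::/32 allocation and hand out addresses to the services listed above. The layout (one /48 per site, one /64 per LAN, hosts numbered from ::1) is entirely hypothetical and is not an actual addressing plan.

```python
import ipaddress
from itertools import islice

ALLOCATION = ipaddress.ip_network("2c0f:fe38::/32")  # prefix mentioned in this post
SERVICES = ["dns", "smtp", "ntp", "pop", "www", "wap", "looking-glass"]


def dmz_plan(allocation, services, site_index=0, lan_index=0):
    """Return (lan, {service: address}) for a hypothetical DMZ LAN.

    Assumed layout: the site_index-th /48 of the allocation, the
    lan_index-th /64 inside it, and hosts numbered from ::1 upwards.
    """
    site = next(islice(allocation.subnets(new_prefix=48), site_index, None))
    lan = next(islice(site.subnets(new_prefix=64), lan_index, None))
    return lan, {name: lan[i] for i, name in enumerate(services, start=1)}


if __name__ == "__main__":
    lan, hosts = dmz_plan(ALLOCATION, SERVICES)
    print("DMZ LAN:", lan)
    for name, addr in hosts.items():
        print(f"  {name:14s} {addr}")
```

A /32 holds 65,536 /48s, so address space is not the constraint here; getting services actually listening on those addresses is.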

So... scoot over to the isc. It's important to note here that, whether we like it or not, Facebook, Google, Yahoo, Cisco, Akamai Technologies, Limelight Networks, W3C, Bing (Microsoft), Tom's Hardware, Rackspace, Verizon and Juniper, among others, have committed to participating in the experiment (Wikipedia). We will all participate if our users visit sites affiliated with the networks above, so we might as well do something about our infrastructure.

What are you doing about it?

I am not directly responsible for this infrastructure at work anymore but I'll definitely make a concerted effort to ensure our customers don't get caught off guard. And now I'm sleepy :-)

Sunday, April 10, 2011

Software-Defined Networking (SDN) and other things I'm catching up on

Sundays tend to find me at home just hanging out with friends. Today was extra great: I did just that, with a bonus. I've met someone new (to me) who might very well join my 'circle of trust'.

We (who happened to all be CCIEs; note Kenya has 8 CCIEs, so getting more than 3 together is always quite interesting) basically threw ideas around and discussed the current networking trends, opportunities, where we are, what we are, who we are, how things are done here vs how they happen elsewhere, whether there's opportunity to do better than others, etc.... Well, obviously this paragraph has nothing to do with SDN...

SDN (software-defined networking) is a change in the way networks are run and managed, promoted by a non-profit body, the Open Networking Foundation.

It's based on OpenFlow, a relatively new protocol, and it's supported by some of the biggest users and buyers of networking equipment. Looking at the list of members this evening tells me that this will be a definite game changer in the future.

Soon I hope to get to test the protocol. Indigo have a list of supported hardware. The opengear sounds like something I might just have. If I get at least two, we'll give it a test drive. Either way, the idea of commoditizing networking gear is very appealing.

Anyhow, here's a list of places to check on OpenFlow:
  1. this podcast here is a good start
  2. the OpenFlow networking website
  3. Ivan's analysis of the same
  4. on Network World
  5. a company actually making and hoping to sell (and I believe it will) the switches
  6. and another one
Also, there's a Linux Software Reference System which lets you run OpenFlow on a Linux PC with multiple NICs. Expect something on OpenFlow here at some point in the future. When working with SMEs, I expect cheap networking gear like this to feature prominently. Mikrotik is so far my favorite; we'll see how OpenFlow and SDN fare.

*Other areas I'm trying to catch up on:-
  • IOS-XR - on CRS-1's
  • NX-OS - this one will be tricky. Rumor has it that our new data center (an area I'm weak in) will be running a couple of Nexus. I might have to make new alliances to get a hold of some switches running NX-OS. I am totally clueless on this and can't wait to just power one up.
  • LTE - I just ordered three books on LTE (Safari doesn't have much on this). So in a month's time I'll be focusing on it. I might very well move to the section dealing with LTE at work, if only to get a grasp of what the vendors are doing. The base-level knowledge will have to be read up on, though.