Saturday, April 16, 2011

Looking into 3G - and why my Skype failed

I spent a frustrating morning trying to upload/restore a friend's backup to his server and Skype my cousin at the same time. I currently use the Safaricom 3G service as my primary connection, but had to fall back to an alternate provider's broadband for this upload (I needed a consistent, uninterrupted service for it). 3G has always served me well since I moved my lab - or rather, sold some of mine to use the work lab...

In the meantime, I started messing around with some tools, trying to figure out this 3G 'issue' and the effect of large buffers, more out of curiosity than anything else. It (3G) really serves me well when it's working; that, and I was bored...

A few things to note:
- Today is a Saturday, so I expect more contention, since the sites around here serve residential/home users. That means that with my large files, TCP is wreaking havoc as usual.

Buffers on all the network elements are shared among all clients, the radio controllers are shared, and obviously we share the internet backhaul networks. That initial connection to the radio is what I was curious about.

We have gone through cycles of high capacity at the edge, then at the core, then at the edge again. In the past, dial-up users in Kenya rarely filled an ISP's capacity between them. Newer technologies like DSL, frame relay and PPP multilink saved the consumer but moved the bottleneck to the core.

The internet has one main method of signalling congestion: dropping packets. This is how a sender notices that 'hey, that packet never arrived' and does something about it. Windowing (TCP) is built around this mechanism. The other mechanism is known as Explicit Congestion Notification (ECN). It's like a friend driving in the opposite direction telling you on your way to work, 'Hey, the road is flooded back there; use another route or don't go at all.'
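To make the windowing idea a bit more concrete, here is a toy sketch of additive-increase/multiplicative-decrease (AIMD), the basic idea behind TCP congestion control. It is heavily simplified and the function name and parameters are my own for illustration: no slow start, no timeouts, and a drop is simply declared in given round trips.

```python
def aimd_window(loss_rounds, rounds=12, start=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window (in segments) at the start of each round trip.

    loss_rounds: set of round indices in which a packet drop is detected.
    Toy model only: one window adjustment per round trip.
    """
    window = start
    history = []
    for rnd in range(rounds):
        history.append(window)
        if rnd in loss_rounds:
            # A drop was noticed: back off multiplicatively.
            window = max(1.0, window * decrease)
        else:
            # No drop: probe for a little more capacity.
            window += increase
    return history


if __name__ == "__main__":
    # Drops detected in round trips 5 and 9 give the familiar sawtooth.
    print(aimd_window(loss_rounds={5, 9}))
```

The point of the sawtooth is that the sender only slows down once it sees a drop. If nothing ever drops, the window just keeps growing.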

The best solution is always more capacity; however, you can only get so much with 3G/EDGE/GPRS. What most computers and home routers have nowadays is huge buffers. Buffers increase delay, because you hold the packet longer. That means some packets get to their destination pretty much useless. It's like being stuck in a traffic jam past your doctor's appointment time: getting there late is useless. So the very solution you build in (a longer jam controlled by a traffic cop) tends to break the network more.

Remember, the internet and our networks rely on packets dropping to deal with congestion. Excessive buffering breaks that.
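A rough back-of-the-envelope for why big buffers hurt: the worst-case delay a full buffer adds is just its size divided by the rate of the link draining it. The sketch below assumes a single FIFO buffer in front of a fixed-rate link; the buffer size and link rates are made-up illustrative numbers, not measurements from any real device.

```python
def queuing_delay_ms(buffer_bytes, link_kbps):
    """Milliseconds needed to drain a full buffer at a given link rate.

    bits / (kbit/s) works out to milliseconds, so no extra scaling is needed.
    Assumes one FIFO queue and a constant link rate (a simplification).
    """
    if link_kbps <= 0:
        raise ValueError("link rate must be positive")
    return buffer_bytes * 8 / link_kbps


if __name__ == "__main__":
    buffer_bytes = 64 * 1024  # hypothetical 64 KiB buffer
    for rate in (128, 384, 1024):  # hypothetical uplink rates in kbit/s
        delay = queuing_delay_ms(buffer_bytes, rate)
        print(f"{rate} kbit/s -> {delay:.0f} ms of queuing delay when full")
```

The same buffer that is harmless on a fast link adds seconds of delay on a slow one, which is exactly the situation on a variable-rate radio link.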

So back to 3G; please note most of what powers 3G and EDGE (actually, let's focus on 3G) was designed at a time when telecommunication networks didn't care much about data. So transmitting 1500 bytes as a single packet is pretty much impossible (i.e. the MTU on most of those systems is much, much lower). This calls for a lot of what TCP is known for - fragment, transmit, reorder and ----buffering.

Unfortunately I decided on this article at a time when the 3G network seems to be okay. At least the RTTs are not as bad as earlier in the day.
C:\Documents and Settings\jgitau>ping 196.201.208.2

Pinging 196.201.208.2 with 32 bytes of data:

Reply from 196.201.208.2: bytes=32 time=84ms TTL=56
Reply from 196.201.208.2: bytes=32 time=104ms TTL=56
Reply from 196.201.208.2: bytes=32 time=83ms TTL=56
Reply from 196.201.208.2: bytes=32 time=111ms TTL=56

Ping statistics for 196.201.208.2:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 83ms, Maximum = 111ms, Average = 95ms
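If you want to keep an eye on this over a whole day rather than four pings, something like the sketch below can summarize saved ping output. It assumes the Windows `time=NNms` reply format shown above; the function name and the "spread" figure (max minus min, a crude stand-in for jitter) are my own choices.

```python
import re
import statistics

# The four replies from the ping run above.
PING_OUTPUT = """\
Reply from 196.201.208.2: bytes=32 time=84ms TTL=56
Reply from 196.201.208.2: bytes=32 time=104ms TTL=56
Reply from 196.201.208.2: bytes=32 time=83ms TTL=56
Reply from 196.201.208.2: bytes=32 time=111ms TTL=56
"""


def rtt_summary(text):
    """Extract round-trip times (ms) from Windows-style ping replies."""
    rtts = [int(value) for value in re.findall(r"time[=<](\d+)ms", text)]
    if not rtts:
        raise ValueError("no ping replies found")
    return {
        "min": min(rtts),
        "max": max(rtts),
        "mean": statistics.mean(rtts),
        "spread": max(rtts) - min(rtts),  # crude jitter indicator
    }


if __name__ == "__main__":
    print(rtt_summary(PING_OUTPUT))
```

The mean comes out at 95.5 ms, which Windows reports as 95 ms. On a bad day it is the max and the spread that blow up first.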


So what could possibly have been happening when things were not working out for me? When you are served by a busy RNC, you have to wait some time to retransmit damaged packets, or for the RNC to retransmit them back to you (TCP 101). Most of these are buffered awaiting completion (remember each packet is fragmented, then put back together for onward transmission). Also remember TCP is end to end; however, on a 3G network, the path between the said 'end points' actually crosses multiple elements. You probably use up about 8 - 10 IP addresses for each connection - RNC to the core, SGSN to GGSN, GGSN to the internet, etc. - and each of those elements has to bring up a session for you to transmit...

Because congestion is not signalled, the endpoints never back off and the buffers fill up. The buffers stay full until the load lessens. Suddenly all of you 'clients' are suffering and complaining, but the RNC can't really do much for you now, can it? ... So is buffering bad?

This whole thing becomes worse when you try tuning stuff and realize that the bandwidth for 3G is variable. I say pick an amount, let's say a conservative 128K, and tune your system with that if you are so inclined.
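If you do go the tuning route, the usual starting point is the bandwidth-delay product (BDP): how many bytes need to be in flight to keep the link busy. A minimal sketch, assuming '128K' means 128 kbit/s and using the roughly 95 ms average RTT from the ping above; both numbers are assumptions to plug your own values into.

```python
def bdp_bytes(rate_kbps, rtt_ms):
    """Bandwidth-delay product in bytes.

    kbit/s multiplied by ms gives bits; divide by 8 for bytes.
    """
    return rate_kbps * rtt_ms / 8


if __name__ == "__main__":
    rate_kbps = 128  # assumed conservative 3G rate in kbit/s
    rtt_ms = 95      # average RTT from the ping run above
    print(f"BDP at {rtt_ms} ms RTT: {bdp_bytes(rate_kbps, rtt_ms):.0f} bytes")
```

At these numbers the BDP is about 1520 bytes, roughly one full-size packet. Anything queued well beyond that is not adding throughput, only delay.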

I have no point here today other than to say that 3G networks are not easy to predict. The RNC is the first bit that actually deals with your packet and, more often than not, is going to be the first culprit when congestion occurs. Everything else from there on is able to handle larger packet sizes. Ehh, no, wait, there's an SGSN just after that :-) ... End-to-end QoS could help, but I know of no one implementing it... I do, however, look forward to LTE and maybe a technology like HSUPA - what that does is reduce the number of buffers you have to deal with.

Sooo, tools I use frequently or would like to use more of - I put some of them here just so I remember where to find them... :-)
tstat
M-Lab has a set of tools
xplot
tcptrace
Netalyzr, and a sample output from my 3G connection

sample output:
Network buffer measurements: Uplink 3500 ms, Downlink 430 ms
We estimate your uplink as having 3500 msec of buffering. This is quite high, and you may experience substantial disruption to your network performance when performing interactive tasks such as web-surfing while simultaneously conducting large uploads. With such a buffer, real-time applications such as games or audio chat can work quite poorly when conducting large uploads at the same time.
We estimate your downlink as having 430 msec of buffering. This level may serve well for maximizing speed while minimizing the impact of large transfers on other traffic. 
 

Note the uplink buffer above. So obviously my Skype suffered when I uploaded the 'huge' files on one computer while Skyping on another.
