The SC 09 Conference took place early this month in Portland. The Bandwidth Challenge (BWC) is a friendly rivalry among research groups to develop high performance network protocols and interesting applications that use them. The Bandwidth Challenge was started ten years ago at SC 99, which also took place in Portland.
Some of the history is available at the web site scinet.supercomputing.org. For example, in 2000, two OC-48 (2.5 Gbps) circuits connected the research exhibits at the conference to external research networks, and the challenge was to develop network protocols and applications that could fill these circuits. The winner of the BWC (called the Network Bandwidth Challenge in 2000) was a scientific visualization application called Visapult that reached a peak of 1.48 Gbps and transferred 262 GB in 1 hour (a sustained bandwidth utilization of 582 Mbps).
This year, approximately twenty-four 10 GE circuits and one 40 GE circuit connected the research exhibits to external research networks, and one of the applications reached a bandwidth utilization of over 114 Gbps.
I have had an interest in the BWC over the years because you cannot analyze data without accessing it, and accessing and transporting large remote datasets has always been a challenge. To put it slightly differently, for large datasets and high performance networks, network transport protocols are an important element of the analytic infrastructure.
It is useful to know the bandwidth delay product of a network, which is the network capacity (in Mbps, say) multiplied by the round trip time (RTT) of a packet (in seconds). This measures the amount of data that has been transmitted but not yet acknowledged. For wide area high performance networks, this can be many megabytes of data. The sender must buffer this data so that it can be resent if a packet is not received.
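To make the definition concrete, here is a minimal sketch of the calculation in Python. The function name is my own, and the example assumes a 10 Gbps link with a 200 ms RTT, which are figures of the kind discussed in this post rather than a measurement of any particular network.

```python
def bandwidth_delay_product_bytes(capacity_bps: float, rtt_s: float) -> float:
    """Return the bandwidth delay product in bytes.

    capacity_bps: network capacity in bits per second
    rtt_s: round trip time in seconds
    """
    bits_in_flight = capacity_bps * rtt_s
    return bits_in_flight / 8  # 8 bits per byte


# Assumed example: a 10 Gbps wide area link with a 200 ms round trip time.
bdp = bandwidth_delay_product_bytes(10e9, 0.200)
print(f"{bdp / 1e6:.0f} MB")  # prints "250 MB"
```

Under these assumptions, a sender has to hold roughly 250 MB of unacknowledged data to keep the link full, which is far more than the default buffer sizes on most systems.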
Challenges that have been worked out over the past decade include:
- Improving TCP so that it is effective over networks with high bandwidth delay products. One of the successes is the development of FAST TCP, a variant of the TCP protocol.
- Developing reliable and friendly UDP-based protocols that are effective over networks with high bandwidth delay products. For example, the open source UDT protocol has proved over time to be quite effective. (Disclosure: I have been involved in the development of the UDT protocol.)
- Developing architectures that are effective for high end-to-end performance for transporting large datasets, from disks at one end to disks at the other end.
For the past several years, it has been relatively routine for applications using FAST TCP or UDT to fill a wide area 10 Gbps network link or multiple 10 Gbps network links, if these are available.
Today’s problems include:
- Connecting data intensive devices and applications to high performance networks. For example, with high throughput sequencing, biology is becoming data intensive, yet very few high throughput sequencing devices are connected to high performance research networks.
- Incorporating the appropriate network protocols into data intensive applications. For example, one of the reasons the Sector/Sphere cloud is effective over wide area networks is that it is based upon UDT and not TCP. (Disclosure: I have been involved in the development of the Sector/Sphere cloud.)
I ran into the first problem just after I got back from SC 09. At SC 09, we ran a number of wide area data intensive applications, and in fact won the 2009 BWC for these applications. For example, a new variant of UDT called UDX reached 9.2 Gbps over a network link with 200 ms RTT. In contrast, as soon as I got back to Chicago, I spent a couple of days trying to get access to 200 GB of sequence data, because the sequencing instrument that produced it was not connected to a high performance network. With the device connected to a high performance research network, the data would have been available in a few minutes.
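The "few minutes" estimate is easy to check with a short Python sketch. The function names are my own, and the calculation assumes decimal units (1 GB = 10^9 bytes), a rate sustained end to end, and no disk or protocol overhead, so it is a best case rather than a prediction.

```python
def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Best-case time to move size_bytes at a sustained rate of rate_bps."""
    return size_bytes * 8 / rate_bps


def sustained_rate_bps(size_bytes: float, elapsed_s: float) -> float:
    """Average rate achieved when size_bytes are moved in elapsed_s seconds."""
    return size_bytes * 8 / elapsed_s


# 200 GB of sequence data at the 9.2 Gbps reached by UDX.
t = transfer_time_seconds(200e9, 9.2e9)
print(f"{t / 60:.1f} minutes")  # prints "2.9 minutes"

# Sanity check against the Visapult figures from 2000: 262 GB in 1 hour.
r = sustained_rate_bps(262e9, 3600)
print(f"{r / 1e6:.0f} Mbps")  # prints "582 Mbps"
```

Even at a tenth of that rate, the transfer would take about half an hour instead of a couple of days.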
To summarize, today network experts are comfortable designing systems that can easily fill wide area 10 GE networks, but most analytic applications are not designed to use the required protocols or to take advantage of high performance networks, and most do not have access to the required networks, even if the applications could benefit from them.
In disciplines such as biology that are becoming data intensive, this type of analytic infrastructure will provide distinct competitive advantages.