The Internet as a Tap

What if the Internet worked like a tap?

This question has been at the back of my mind for some time now, and I have finally decided to put pen to paper.

Since its inception, the Internet has changed dramatically. The basic arrangement, though, has stayed the same: a server has data that a client needs.

In this blog post, I draw an analogy between downloading data from a server and fetching water from a well, and use that analogy to explain some technical concepts.

The Bucket

One way to fetch water from a well is to take a bucket, make the journey to the well and carry the water back with you in the bucket.

If your aim is to collect water as efficiently as possible, there are two main factors to consider: how far away the well is, and how much water you can carry in one trip.

Round-trip Latency: The time taken to get from your starting point to the well and back (excluding the time to fetch the water from the well).

Bandwidth: How much water you can carry in one trip, i.e. the size of the bucket.

In Internet terms, this is analogous to the request/response cycle. The client sends a request to the server, waits and gets a response back. [1]

This is one of the primary strategies for transferring data over the Internet and is the basis of HTTP/1.1 as defined in RFC 2616. (RFC 2616 has since been superseded by later RFCs, but the request/response model is unchanged.)

The HTTP protocol is a request/response protocol. A client sends a
request to the server [...]. The server responds [...]. [2]

Whilst this request/response protocol works well for many scenarios and has many benefits, it has its shortcomings.

Just like with the well, if the server is far away, it can take a considerable amount of time to reach the server and get a response back. The problem is exacerbated when multiple request/response cycles have to be made.
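To put rough numbers on this, here is a minimal sketch of a simplified cost model. It assumes the cycles run one after another and that each one costs a full round trip plus the time to move the payload. The function name and the figures are illustrative assumptions, not measurements, and bandwidth is treated here as bytes per second rather than as bucket size.

```python
def sequential_fetch_time(round_trip_s: float, payload_bytes: int,
                          bandwidth_bytes_per_s: float, cycles: int) -> float:
    """Estimate the total time for `cycles` sequential request/response cycles.

    Simplified model: every cycle pays one full round trip (the walk to the
    well and back) plus the time to move the payload (filling the bucket).
    """
    transfer_s = payload_bytes / bandwidth_bytes_per_s
    return cycles * (round_trip_s + transfer_s)


# Illustrative numbers: 100 ms round trip, 10 KB per response, 1 MB/s link.
one_trip = sequential_fetch_time(0.1, 10_000, 1_000_000, cycles=1)
ten_trips = sequential_fetch_time(0.1, 10_000, 1_000_000, cycles=10)
```

With these numbers the round trip, not the payload, dominates: each cycle spends 100 ms travelling and only 10 ms carrying data, so ten cycles take roughly ten times as long as one.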

The Hose

Now, instead of using a bucket and having to make journeys back and forth, you can lay a hose from the water source to its destination.

The advantage of this is that you no longer have to make multiple trips to the well and bandwidth can be increased by using a hose with a wider diameter.

This is analogous to the WebSocket protocol.

The protocol has two parts: a handshake and the data transfer. [...] Once the client and server have both sent their handshakes, and if
the handshake was successful, then the data transfer part starts. [3]

However, you still have to wait for the first drops of water to travel from the well through the hose.
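A rough way to see both the gain and the remaining wait is to model the hose as a single long-lived stream. The sketch below is an assumption-laden simplification: it charges one round trip for the handshake and half a round trip for the first data to arrive, and it ignores connection setup below the WebSocket layer. The function name is hypothetical.

```python
from typing import Tuple


def hose_times(round_trip_s: float, total_bytes: int,
               bandwidth_bytes_per_s: float) -> Tuple[float, float]:
    """Return (time_to_first_drop_s, total_time_s) for one long-lived stream.

    Assumptions: one round trip for the handshake, then half a round trip
    before the first data reaches the client; after that, data flows
    continuously at the link's bandwidth.
    """
    first_drop_s = round_trip_s + round_trip_s / 2
    total_s = first_drop_s + total_bytes / bandwidth_bytes_per_s
    return first_drop_s, total_s


# Illustrative numbers: 100 ms round trip, 100 KB in total, 1 MB/s link.
first_drop, total = hose_times(0.1, 100_000, 1_000_000)
```

Under these assumptions the latency is paid once rather than once per trip, whereas ten sequential request/response cycles of 10 KB each would pay the 100 ms round trip ten times over. The wait for the first drop does not go away, though, which is the gap the tap is meant to close.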

The Tap

Imagine the inconvenience you would face if, every time you turned on your tap, you had to wait for the water to flow from a water treatment plant many miles away. This is what the hose-based system is like.

What if, instead of the hose system, we used a tap (faucet)? With a tap, the flow of water is controlled by a valve: when the valve is closed no water flows, and when it is open water flows. Most importantly, the water is available, on demand, at its point of use.

Currently, as far as I'm aware, there is no Internet protocol analogous to this.

In spite of this, there are many techniques that try to come as close to this as possible.

Prefetching

With prefetching, content that might be accessed is 'optimistically' loaded in the background so that when the user requests the content, it's instantly available. [4]

Google search uses this technique to improve the user experience by prefetching the first result when it is very likely to be the one the user wants.
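The idea behind prefetching can be sketched in a few lines. This is a minimal illustration, not how any browser or search engine implements it: the `PrefetchCache` class and the `slow_fetch` function are hypothetical, and the prefetch here runs synchronously rather than in the background to keep the example short.

```python
from typing import Callable, Dict, Iterable, List


class PrefetchCache:
    """Hypothetical cache that loads likely-needed content ahead of time."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # the slow call to the origin (the trip to the well)
        self._cache: Dict[str, str] = {}

    def prefetch(self, urls: Iterable[str]) -> None:
        """Optimistically load content the user might ask for next."""
        for url in urls:
            if url not in self._cache:
                self._cache[url] = self._fetch(url)

    def get(self, url: str) -> str:
        """Serve from the cache if possible; otherwise fetch on demand."""
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


calls: List[str] = []  # records every trip made to the origin


def slow_fetch(url: str) -> str:
    """Stand-in for a real network request (an assumption for this sketch)."""
    calls.append(url)
    return f"content of {url}"


cache = PrefetchCache(slow_fetch)
cache.prefetch(["/result-1"])  # done before the user asks for anything
page = cache.get("/result-1")  # served from the cache, no new trip
```

The trade-off is that a wrong guess still costs a trip to the origin, so prefetching only pays off when the prediction is usually right.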

Edge networks and CDNs

Sometimes, the original source of the data is just too far away.

Content Delivery Networks (CDNs) are globally distributed networks of servers that aim to serve data from the edge node closest to the user. This reduces latency, which in turn improves response times and performance.

For example, Facebook Live Video uses CDNs and edge networks extensively. [5] [6]

Using the water fetching analogy, this is akin to having multiple smaller sources of water that feed from the original source. This is like the tap except the valve is not at the point of use but at some less optimal location (e.g. at the end of your street).
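As a minimal sketch of "closest edge node", the snippet below picks the node with the lowest measured latency. The node names and latency figures are made up for illustration, and the client-side choice is a simplification: real CDNs typically steer traffic through mechanisms such as DNS or anycast routing rather than asking the client to choose.

```python
from typing import Dict


def pick_nearest_node(latency_ms_by_node: Dict[str, float]) -> str:
    """Return the name of the node with the lowest measured latency."""
    return min(latency_ms_by_node, key=latency_ms_by_node.get)


# Hypothetical measurements from one client to the origin and two edge nodes.
measured = {"origin": 180.0, "edge-london": 12.0, "edge-frankfurt": 25.0}
nearest = pick_nearest_node(measured)
```

Here the client would be served from "edge-london", the tap at the end of the street, instead of making the long trip to the origin.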

Caveats of the tap

Internet usage is often metered by bandwidth, or 'how much of the pipe you use'. For the tap to be fully realized, the pipes would have to be full at all times.

Whilst such a system might sound absurd right now, I do see a future where the Internet pathways are 'always full' and at least some data is instantly available at the point of use.

Conclusion

As I mentioned at the outset, the analogy between data distribution on the Internet and water distribution has been at the back of my mind for some time.

Whilst I describe the tap as a concept that could be applied to data distribution on the Internet, I have not proposed any technical solution for realizing it. Nonetheless, it is still interesting and worthwhile to explore the different techniques currently used on the Internet.


  1. Dissecting a request/response cycle https://www.w3.org/wiki/How_does_the_Internet_work#Dissecting_a_request.2Fresponse_cycle

  2. Hypertext Transfer Protocol -- HTTP/1.1 - 1.4 Overall Operation https://www.ietf.org/rfc/rfc2616.txt

  3. The WebSocket Protocol - 1.2. Protocol Overview https://tools.ietf.org/html/rfc6455#section-1.2

  4. Link prefetching https://en.wikipedia.org/wiki/Link_prefetching

  5. Under the hood: Broadcasting live video to millions https://code.facebook.com/posts/1653074404941839/under-the-hood-broadcasting-live-video-to-millions/

  6. How Facebook Live Streams To 800,000 Simultaneous Viewers http://highscalability.com/blog/2016/6/27/how-facebook-live-streams-to-800000-simultaneous-viewers.html