Protocols: Understanding Network Software


Goals

This module gives a quick, high-level overview of how networks work: that is, how protocols are set up so that machines connected in a network can exchange data.

The goal is for you to understand the key concepts behind the three core protocol layers: (1) the datalink layer, (2) the network layer, and (3) the transport layer.

Before we start, let's identify a few major themes and also clarify what we are not going to cover:

Warning:

Life after this module: If you want the next level of detail (or skill), we suggest some next steps at the end.


Meet the players: the key components that make up a network

Consider two PCs that communicate across the internet and the various components involved.


Packets: the basic ingredient

As mentioned earlier, large files are broken into smaller units called packets:

  • By way of analogy, it's as if a book that is to be mailed has its chapters torn out and mailed separately, only to be glued back into a book at the destination.

  • An important goal is to make this packet decomposition seamless to the applications, which shouldn't need to know it even happened.

  • Why is it done this way? In the early days, machines did not have much memory and so even a modest-sized file would easily overwhelm memory. Thus, files were broken into smaller units and sent one by one. Clearly, there's some inefficiency here since each packet has a packet header (some bookkeeping info) that we should count as overhead.
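
The book-and-chapters analogy can be sketched in code. The following is a minimal illustration, not a real protocol: the class name PacketizerSketch, the Packet fields, and the 8-byte header size are all assumptions made for the example. It splits data into numbered packets, reassembles them regardless of arrival order, and counts the header overhead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: class, field names and header size are assumptions.
class PacketizerSketch {

    static final int HEADER_BYTES = 8; // assumed header: two 4-byte ints (seqNum, total)

    static final class Packet {
        final int seqNum;      // position of this piece within the original data
        final int total;       // how many pieces the data was split into
        final byte[] payload;  // this piece's share of the data

        Packet(int seqNum, int total, byte[] payload) {
            this.seqNum = seqNum;
            this.total = total;
            this.payload = payload;
        }
    }

    // Tear the "book" into "chapters" of at most maxPayload bytes each.
    static List<Packet> split(byte[] data, int maxPayload) {
        int total = (data.length + maxPayload - 1) / maxPayload; // ceiling division
        List<Packet> packets = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int from = i * maxPayload;
            int to = Math.min(from + maxPayload, data.length);
            packets.add(new Packet(i, total, Arrays.copyOfRange(data, from, to)));
        }
        return packets;
    }

    // Glue the pieces back together, whatever order they arrived in.
    static byte[] reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt(p -> p.seqNum));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Packet p : ordered) {
            out.write(p.payload, 0, p.payload.length);
        }
        return out.toByteArray();
    }

    // Bytes spent on headers rather than on the data itself.
    static int overheadBytes(List<Packet> packets) {
        return packets.size() * HEADER_BYTES;
    }

    public static void main(String[] args) {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        List<Packet> packets = split(data, 4);
        Collections.reverse(packets); // pretend the packets arrived backwards
        System.out.println(packets.size() + " packets, " + overheadBytes(packets)
                + " header bytes, reassembled: "
                + new String(reassemble(packets), StandardCharsets.US_ASCII));
    }
}
```

Note how the overhead grows as the payload size shrinks: the same data split into smaller pieces needs more headers.
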

The layered approach to network protocol software

While large software packages these days are organized as a complex hierarchy of objects, network software is organized as a simple chain of objects: in layers.

The first set of networking protocols (called TCP/IP, which still dominates today) had four layers. Soon after, the International Organization for Standardization (ISO) defined a 7-layer model (the OSI model) that extended the 4-layer TCP/IP:

  • Note: We will not describe all layers here.
  • Layers 2-7 are generally all in software (although hardware is often used to optimize aspects of these layers).

  • Layer 1 is designed to be the interface to actual hardware. For convenience of discussion, we often include the hardware in this layer. It does have some software (to handle some types of garbled transmissions, for example).

  • Layers 1-4 are more or less equivalent to the 4-layer internet protocol that dominates today's networking. These contain what most people consider the core ideas in networking.

  • Layers 5-7 are often directly implemented by applications.

  • Layers 5-7 are still useful as abstractions because they help classify and organize the myriad of protocols in use today.

  • Some protocols defined above Layer 4 have now created their own special niche, such as HTTP or SOAP.

Note: in what follows, we will greatly simplify the functionality of the layers to convey core concepts.

In particular, we will condense the layers to

where the "application layer" will refer to everything higher than the transport layer.


The transport layer

Main responsibilities of the transport layer:
  • Provide the connection abstraction to application programs.

  • Handle lost packets and packets that come out of order.

  • Other responsibilities: slow down the rate of packet transmission if the network is too congested.

What might a transport layer API look like? Consider this simple spec:

public interface TransportLayer {

    // An application says: "I want to create a connection to destination dest"
    // The transport layer returns a connection ID that the application can use to 
    // refer to this connection in further interactions.
    public int openConnection (int dest);

    // Connection setup may take time, so an application calls this 
    // repeatedly to find out whether the connection is set up.
    public boolean isReady (int connID);

    // Send a packet of data over the connection.
    public void sendPacket (TransportPacket packet);

    // This method will be called by the layer below to tell the 
    // transport layer, "Hey, a packet has arrived for you".
    public void receivePacket (TransportPacket packet);

    // Close a particular connection when done.
    public void closeConnection (int connID);
}
  

An application might use the transport layer as follows (similar to how sockets are used):

  int connID = transport.openConnection (destNodeID);
  while (! transport.isReady (connID)) {
     // ... sleep a little ...
  }
  TransportPacket packet = new TransportPacket (" ... some data ...");
  transport.sendPacket (packet);
  

What should the transport layer do in openConnection()?

  • It should probably send a special packet to the destination's transport layer (its peer on the other side) saying "I need to set up a connection with you".

  • The destination transport layer replies with something like "Sure, let's both use ID=173 for this particular connection".

  • Upon receiving this acknowledgement, the source transport layer then indicates that the connection is ready.
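
The three steps above can be sketched as a toy simulation. This is only one possible reading: the names HandshakeSketch, Peer, Wire and ControlPacket are hypothetical, the Wire class stands in for everything below the transport layer, and openConnection() here returns a local handle (an assumption) because the shared ID is not known until the peer replies.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch only: all names here are assumptions.
class HandshakeSketch {

    enum Kind { CONNECT_REQUEST, CONNECT_ACCEPT }

    static final class ControlPacket {
        final Kind kind;
        final int handle;  // requester's local name for the pending connection
        final int connID;  // shared ID chosen by the destination (-1 in a request)

        ControlPacket(Kind kind, int handle, int connID) {
            this.kind = kind;
            this.handle = handle;
            this.connID = connID;
        }
    }

    // Stand-in for the lower layers: holds packets that are "in flight".
    static final class Wire {
        private final Queue<Runnable> inFlight = new ArrayDeque<>();

        void send(Peer to, ControlPacket packet) {
            inFlight.add(() -> to.receivePacket(packet));
        }

        // Deliver everything, including replies generated along the way.
        void deliverAll() {
            while (!inFlight.isEmpty()) {
                inFlight.remove().run();
            }
        }
    }

    static final class Peer {
        private final Wire wire;
        Peer remote;                       // the transport layer on the other side
        private int nextHandle = 0;
        private int nextConnID = 173;      // arbitrary starting ID, echoing the text
        private final Map<Integer, Integer> ready = new HashMap<>(); // handle -> connID
        final Set<Integer> accepted = new HashSet<>();               // IDs we agreed to

        Peer(Wire wire) {
            this.wire = wire;
        }

        // Step 1: ask the peer for a connection; completes asynchronously.
        int openConnection() {
            int handle = nextHandle++;
            wire.send(remote, new ControlPacket(Kind.CONNECT_REQUEST, handle, -1));
            return handle;
        }

        boolean isReady(int handle) {
            return ready.containsKey(handle);
        }

        int connID(int handle) {
            return ready.getOrDefault(handle, -1);
        }

        void receivePacket(ControlPacket packet) {
            if (packet.kind == Kind.CONNECT_REQUEST) {
                // Step 2: the destination picks an ID and sends it back.
                int id = nextConnID++;
                accepted.add(id);
                wire.send(remote, new ControlPacket(Kind.CONNECT_ACCEPT, packet.handle, id));
            } else {
                // Step 3: the acknowledgement arrived, so the connection is ready.
                ready.put(packet.handle, packet.connID);
            }
        }
    }

    public static void main(String[] args) {
        Wire wire = new Wire();
        Peer source = new Peer(wire);
        Peer destination = new Peer(wire);
        source.remote = destination;
        destination.remote = source;

        int handle = source.openConnection();
        System.out.println("ready before delivery? " + source.isReady(handle));
        wire.deliverAll();
        System.out.println("ready after delivery? " + source.isReady(handle)
                + ", shared ID = " + source.connID(handle));
    }
}
```

The gap between calling openConnection() and isReady() returning true is why the application in the earlier snippet has to poll.
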

Port numbers:
  • What happens when an application opens two different connections to the same destination machine? For example, one for a web service and the other for FTP?

  • An additional "ID" (called a port number) is used to distinguish between multiple connections between the same source-destination pair.

  • Thus, the openConnection() call should really look like:
    public interface TransportLayer {
    
        public int openConnection (int dest, int portNum);
    
        // ...
    }
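
As a rough sketch of why the extra ID helps, the transport layer could key its connection table on the (destination, port) pair. The class PortDemux and its lookup() method are hypothetical names for this illustration; ports 80 and 21 are the conventional web and FTP control ports.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only: class and method names are assumptions.
class PortDemux {

    // A connection is identified by destination machine AND port number.
    static final class Endpoint {
        final int dest;
        final int portNum;

        Endpoint(int dest, int portNum) {
            this.dest = dest;
            this.portNum = portNum;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Endpoint)) {
                return false;
            }
            Endpoint that = (Endpoint) other;
            return dest == that.dest && portNum == that.portNum;
        }

        @Override
        public int hashCode() {
            return Objects.hash(dest, portNum);
        }
    }

    private final Map<Endpoint, Integer> connections = new HashMap<>();
    private int nextConnID = 1;

    // Same destination with a different port yields a different connection.
    int openConnection(int dest, int portNum) {
        Endpoint key = new Endpoint(dest, portNum);
        Integer existing = connections.get(key);
        if (existing != null) {
            return existing;
        }
        int connID = nextConnID++;
        connections.put(key, connID);
        return connID;
    }

    // Find the connection a packet belongs to, or -1 if there is none.
    int lookup(int dest, int portNum) {
        return connections.getOrDefault(new Endpoint(dest, portNum), -1);
    }

    public static void main(String[] args) {
        PortDemux transport = new PortDemux();
        int web = transport.openConnection(7, 80); // web traffic to machine 7
        int ftp = transport.openConnection(7, 21); // FTP traffic to the same machine
        System.out.println("web connection " + web + ", ftp connection " + ftp);
    }
}
```
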
      

What is the sendPacket() method supposed to do?
  • This method should divide a large piece of data into smaller packets if needed and send them down to the network layer.

  • If a file is broken into multiple packets, they need to be numbered.
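
Numbering the packets is what lets the receiving transport layer cope with packets that arrive out of order. A minimal sketch of that idea follows, assuming a hypothetical ReorderBuffer class that holds early arrivals until the gap before them is filled; handling of lost packets (timeouts, retransmission requests) is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the class and its methods are assumptions.
class ReorderBuffer {

    private final Map<Integer, String> waiting = new HashMap<>(); // early arrivals
    private int nextExpected = 0; // lowest sequence number not yet delivered

    // Accept one packet; return the payloads that can now be handed to the
    // application in order (possibly none, possibly several).
    List<String> receive(int seqNum, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (seqNum < nextExpected) {
            return deliverable; // duplicate of something already delivered
        }
        waiting.put(seqNum, payload);
        while (waiting.containsKey(nextExpected)) {
            deliverable.add(waiting.remove(nextExpected));
            nextExpected++;
        }
        return deliverable;
    }

    // The sequence number the receiver is still waiting for.
    int firstMissing() {
        return nextExpected;
    }

    public static void main(String[] args) {
        ReorderBuffer buffer = new ReorderBuffer();
        System.out.println(buffer.receive(1, "B")); // held back: packet 0 is missing
        System.out.println(buffer.receive(0, "A")); // releases both packets in order
    }
}
```
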

A transport layer's view of the network:

  • The transport layer only sees the layer above (application) and the layer below (network).

  • It has no understanding of the network itself, no understanding of how the network is connected (topology).

The network layer

Main responsibilities:
  • At the end host (a PC), the network layer does not really do much other than take in a transport packet, make a network packet out of it, and send that network packet down to the datalink layer.

  • However, the network layers on nodes inside the network do most of the core work we associate with complex networks: routing.

  • The network layer is the only layer that "sees the network".

  • To understand a network layer's functions, we'll examine the three most important functions separately.

  • Job #1: route packets using a routing table:

    • Every node's network layer has a routing table that determines the "rule of the moment" for routing packets.
    • A routing table is merely a data structure for storing some local routing information.
    • A packet's journey is determined by routing tables at the nodes visited by the packet:

  • Job #2: maintain stats about link usage.
    • Every node monitors its outgoing links and maintains some statistics about usage and performance.
    • These numbers are then used in constructing the next set of routing tables.

  • Job #3: compute routes periodically
    • As conditions change, some links get more congested than others. Yet other links are always slow.
    • Accordingly, nodes engage in a distributed computation and update their routing tables.
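
To make Job #1 concrete, here is a small sketch in which each node's routing table is a map from destination to the neighbor to forward to. The class RoutingTableSketch, the node numbers, and the example tables are all made up for illustration. No node knows the whole path: the packet's journey emerges from one local lookup per node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and the example topology are assumptions.
class RoutingTableSketch {

    // tables.get(node).get(dest) is the neighbor that node forwards to for dest.
    static List<Integer> journey(Map<Integer, Map<Integer, Integer>> tables,
                                 int source, int dest) {
        List<Integer> path = new ArrayList<>();
        int current = source;
        path.add(current);
        while (current != dest) {
            Map<Integer, Integer> table = tables.get(current);
            Integer next = (table == null) ? null : table.get(dest);
            // More hops than there are tables means some node was visited twice.
            if (next == null || path.size() > tables.size()) {
                throw new IllegalStateException(
                        "no route (or a routing loop) from " + current + " to " + dest);
            }
            current = next;
            path.add(current);
        }
        return path;
    }

    // Routing tables for a chain 1 - 2 - 3 - 4, covering destination 4 only.
    static Map<Integer, Map<Integer, Integer>> exampleTables() {
        Map<Integer, Map<Integer, Integer>> tables = new HashMap<>();
        for (int node = 1; node <= 3; node++) {
            Map<Integer, Integer> table = new HashMap<>();
            table.put(4, node + 1); // to reach 4, forward to the next node in the chain
            tables.put(node, table);
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(journey(exampleTables(), 1, 4));
    }
}
```
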

A network layer might look like this:

public interface NetworkLayer {

    // This method is called by the transport layer above when
    // that layer has a packet to send. The packet is assumed to
    // contain information about the destination. This method is
    // only called at the end hosts.
    public void sendPacket (NetworkPacket packet);

    // The network layer receives a packet from below (datalink layer).
    // This packet could be destined for this node (an end host) or
    // needs to be forwarded along according to the dictates of the routing table.
    public void receivePacket (NetworkPacket packet);

    // This will be called by the independent process that's doing
    // the link measurement to update the stats for the link to
    // a particular neighbor. Here is where a routing computation
    // is initiated.
    public void updateLinkStatus (int neighbor, LinkStatus status);
}
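
The notes do not say which routing computation is used. As one well-known possibility, the sketch below uses a simplified distance-vector exchange: each node repeatedly merges the distance tables advertised by its neighbors. Everything here is an assumption for illustration, including the class DistanceVectorNode, the use of a plain integer cost in place of LinkStatus, and the three-node example. The sketch only ever improves routes, so it does not handle links getting worse.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: distance-vector routing is a swapped-in technique,
// and all names are assumptions.
class DistanceVectorNode {
    final int id;
    final Map<Integer, Integer> linkCost = new HashMap<>(); // neighbor -> measured cost
    final Map<Integer, Integer> distance = new HashMap<>(); // dest -> best known cost
    final Map<Integer, Integer> nextHop = new HashMap<>();  // routing table: dest -> neighbor

    DistanceVectorNode(int id) {
        this.id = id;
        distance.put(id, 0); // reaching yourself costs nothing
    }

    // Called when the link-measurement process reports on a neighbor.
    void updateLinkStatus(int neighbor, int cost) {
        linkCost.put(neighbor, cost);
    }

    // Merge the distances a neighbor advertises; returns true if any route improved.
    boolean mergeFrom(int neighbor, Map<Integer, Integer> advertised) {
        Integer costToNeighbor = linkCost.get(neighbor);
        if (costToNeighbor == null) {
            return false; // not actually a neighbor
        }
        boolean changed = false;
        for (Map.Entry<Integer, Integer> entry : advertised.entrySet()) {
            int dest = entry.getKey();
            int candidate = costToNeighbor + entry.getValue();
            Integer best = distance.get(dest);
            if (best == null || candidate < best) {
                distance.put(dest, candidate);
                nextHop.put(dest, neighbor);
                changed = true;
            }
        }
        return changed;
    }

    // Keep exchanging distance tables between neighbors until nothing changes.
    static void exchangeUntilStable(Map<Integer, DistanceVectorNode> nodes) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (DistanceVectorNode node : nodes.values()) {
                for (int neighbor : node.linkCost.keySet()) {
                    Map<Integer, Integer> advertised =
                            new HashMap<>(nodes.get(neighbor).distance);
                    if (node.mergeFrom(neighbor, advertised)) {
                        changed = true;
                    }
                }
            }
        }
    }

    // Triangle: links 1-2 and 2-3 are fast (cost 1), link 1-3 is slow (cost 5).
    static Map<Integer, DistanceVectorNode> exampleNetwork() {
        Map<Integer, DistanceVectorNode> nodes = new HashMap<>();
        for (int id = 1; id <= 3; id++) {
            nodes.put(id, new DistanceVectorNode(id));
        }
        int[][] links = { {1, 2, 1}, {2, 3, 1}, {1, 3, 5} };
        for (int[] link : links) {
            nodes.get(link[0]).updateLinkStatus(link[1], link[2]);
            nodes.get(link[1]).updateLinkStatus(link[0], link[2]);
        }
        return nodes;
    }

    public static void main(String[] args) {
        Map<Integer, DistanceVectorNode> nodes = exampleNetwork();
        exchangeUntilStable(nodes);
        System.out.println("node 1 reaches node 3 via node " + nodes.get(1).nextHop.get(3));
    }
}
```

In the example, node 1 ends up routing traffic for node 3 through node 2, because two fast links beat the single slow direct link.
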
  

The datalink layer

Main responsibilities:
  • A datalink layer's view is limited to only a single link.

  • Each datalink layer is responsible for reliable packet transmission across a link.
    • If a packet is lost or corrupted, it needs to be retransmitted.
  • In the old days (and on some satellite links even today), one worried about overwhelming the node on the other side of the link. Thus, a node limits the number of packets it sends without getting an acknowledgement from the other side.
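
The simplest version of both ideas is to allow only one unacknowledged packet at a time and to resend it until an acknowledgement arrives (often called stop-and-wait). The sketch below illustrates this under heavy assumptions: StopAndWaitSketch and LossyLink are hypothetical names, the link drops transmissions on a fixed schedule rather than at random, and lost acknowledgements (which would require sequence numbers to detect duplicates) are not modeled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: all names and the loss model are assumptions.
class StopAndWaitSketch {

    // A toy link that loses the transmissions whose 1-based count is in dropOn.
    static final class LossyLink {
        private final Set<Integer> dropOn;
        private int transmissions = 0;
        final List<String> delivered = new ArrayList<>();

        LossyLink(Set<Integer> dropOn) {
            this.dropOn = dropOn;
        }

        // Returns true if an acknowledgement comes back, false on a timeout.
        boolean transmit(String frame) {
            transmissions++;
            if (dropOn.contains(transmissions)) {
                return false; // frame lost: the sender never hears back
            }
            delivered.add(frame);
            return true;
        }

        int transmissions() {
            return transmissions;
        }
    }

    // Send one frame, retransmitting until acknowledged or out of attempts.
    // Nothing else is sent in the meantime, so at most one frame is outstanding.
    static boolean sendReliably(LossyLink link, String frame, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (link.transmit(frame)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<Integer> dropOn = new HashSet<>();
        dropOn.add(1);
        dropOn.add(2);
        LossyLink link = new LossyLink(dropOn);
        boolean ok = sendReliably(link, "frame-A", 5);
        System.out.println("acknowledged: " + ok
                + " after " + link.transmissions() + " transmissions");
    }
}
```
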

Possible structure of a datalink layer:

public interface DatalinkLayer {

    // The network layer calls this to send a packet across. It's
    // assumed that the packet contains the destination (neighbor) ID.
    public void sendPacket (DatalinkPacket packet);

    // This method is called by the physical layer when a packet has come in.
    // The datalink layer sends it up to the network layer.
    public void receivePacket (DatalinkPacket packet);
}
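
The three interfaces each work with their own packet type, which the notes leave undefined. One way to picture how the types relate is that each layer wraps the packet handed down from above and adds its own bookkeeping. The fields below (connID, seqNum, source, dest, neighbor) are guesses for illustration only.

```java
// Illustrative sketch only: the packet fields are assumptions, not the
// definitions used by the interfaces in these notes.
class EncapsulationSketch {

    static final class TransportPacket {
        final int connID;   // which connection this belongs to
        final int seqNum;   // position within the original data
        final String data;  // the application's data

        TransportPacket(int connID, int seqNum, String data) {
            this.connID = connID;
            this.seqNum = seqNum;
            this.data = data;
        }
    }

    static final class NetworkPacket {
        final int source;               // originating end host
        final int dest;                 // final destination, used for routing
        final TransportPacket payload;  // carried unopened by the network layer

        NetworkPacket(int source, int dest, TransportPacket payload) {
            this.source = source;
            this.dest = dest;
            this.payload = payload;
        }
    }

    static final class DatalinkPacket {
        final int neighbor;           // node at the other end of this one link
        final NetworkPacket payload;  // carried unopened by the datalink layer

        DatalinkPacket(int neighbor, NetworkPacket payload) {
            this.neighbor = neighbor;
            this.payload = payload;
        }
    }

    // Going down the stack at the sender: each layer wraps what it was given.
    static DatalinkPacket sendDown(String data, int connID, int seqNum,
                                   int source, int dest, int nextHop) {
        TransportPacket segment = new TransportPacket(connID, seqNum, data);
        NetworkPacket packet = new NetworkPacket(source, dest, segment);
        return new DatalinkPacket(nextHop, packet);
    }

    // Going up the stack at the destination: each layer unwraps one level.
    static String receiveUp(DatalinkPacket frame) {
        NetworkPacket packet = frame.payload;
        TransportPacket segment = packet.payload;
        return segment.data;
    }

    public static void main(String[] args) {
        DatalinkPacket frame = sendDown("hello", 173, 0, 1, 4, 2);
        System.out.println("next hop " + frame.neighbor
                + ", final destination " + frame.payload.dest
                + ", data " + receiveUp(frame));
    }
}
```

This mirrors the limited view each layer has: the datalink layer only looks at the neighbor, the network layer only at the destination, and only the transport layer looks at the connection and sequence number.
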
  

Where to go from here

We have only provided a quick, high-level view of how network protocol software is organized and what the key ideas are.

Important network-related concepts we haven't covered:

  • Physical layer ideas. How these devices work, coding theory, modulation.

  • Higher-level protocols. For example: HTTP, Bluetooth.

  • Network services. For example, DNS.

  • Special types of networks. Wireless networks, satellite communications, ad-hoc networks.

  • Distributed computing: remote-procedure calls, distributed data management, network caching.

  • Network security. Secure communication, attacks on networks, authentication.

Next steps:

  • Understand the protocols. The best way is to write your own protocols. Note that we haven't said anything about how their functionality is implemented.

  • Learn more about how the 7 layers work (conceptually). There are several "next level" details to learn about current protocols, once you've had a crack at designing your own. Two good books to try are:
    • Computer Networks, by A. S. Tanenbaum, Prentice-Hall, 2002.
    • Computer Networking: A Top-Down Approach, by J. F. Kurose and K. W. Ross, Addison-Wesley, 2009.

  • Examine code. Read implementations of the layers. There are also some books that walk through TCP/IP code, all of which is written in C.