Understanding UDP vs. TCP for Network Programming: Structures, Applications, and Performance Considerations

Introduction

When exploring the internet, you may assume that every website you visit or email you send is simply part of the unlimited access your device has to the internet. In reality, every digital interaction and byte of data transferred over a network must abide by a working set of rules. The Transmission Control Protocol/Internet Protocol (TCP/IP) suite serves as this governance through a set of communication protocols that interconnect network devices [1]. The suite governs how data is broken down, addressed, transmitted, routed, and received over a network [1]. We will focus on comparing two protocols within the Transport Layer of TCP/IP, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP), before extending our discussion to QUIC, a more recently developed transport protocol built on top of UDP. This examination will be both qualitative and technical, providing network theory along with insight into the fundamental data structures.

Background on TCP/IP Model

The TCP/IP model underlies every network domain, overseeing efficient and error-free transmission. It consists of four "layers," each of which performs a specific function as data travels from start to finish. These layers are the Application, Transport, Internet, and Network Interface layers.

The application layer is responsible for understanding the type of data being used, as its protocols are the programs that use network services. Application-layer protocols include the Domain Name System (DNS), the File Transfer Protocol (FTP), and the Hypertext Transfer Protocol (HTTP), among many others. The transport layer is responsible for establishing the end-to-end connection between the sender and the receiver, and for deciding how to divide the application's data into packets. The internet layer routes this packetized data based on the Internet Protocol (the reason IP addresses are assigned to each device) and ensures there is a path for the transmission. The network access layer is responsible for the actual transfer of the data as raw binary over the physical communication paths in the network channel [2].

(Figure 1: Application Layer of TCP/IP Model [2])

(Figure 2: Transport Layer of TCP/IP Model [2])

(Figure 3: Internet Layer of TCP/IP Model [2])

(Figure 4: Network Access Layer of TCP/IP Model [2])

The TCP/IP model involves communication between two parties: the client and the server. A client (often the user's machine) is provided a service, such as access to a file, by a server in the network (another computer). When you type the address of a website (e.g., https://www.youtube.com/), the browser first queries a DNS server to find the IP address where the website is hosted, then sends an HTTP request message asking the server for a copy of the website. If the server approves the request, it sends the client an approval notice and then begins transmitting small packets of data that your browser assembles and displays for you [3]. All of this is accomplished over the internet connection established by TCP/IP.
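
The same request flow can be reproduced directly with sockets. The sketch below is a minimal illustration, assuming a plain-HTTP host (example.com on port 80 is used here instead of the HTTPS-only youtube.com): resolve the name with DNS, open a TCP connection, send a GET request, and reassemble the response.

```python
import socket

# Step 1: ask DNS for the IP address of the host (the browser does this first).
host = "example.com"  # illustrative host; real browsing would also involve TLS on port 443
ip_address = socket.gethostbyname(host)
print(f"DNS resolved {host} to {ip_address}")

# Step 2: open a TCP connection to the web server and send an HTTP request.
with socket.create_connection((ip_address, 80), timeout=5) as conn:
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    conn.sendall(request.encode())

    # Step 3: the server streams the page back in packets, which we reassemble.
    response = b""
    while chunk := conn.recv(4096):
        response += chunk

print(response.decode(errors="replace")[:200])  # status line and the first few headers
```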

Methods of TCP and UDP

The Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) are the two primary protocols that make up the transport layer of the TCP/IP model. Depending on the type of communication, an application will choose one or the other for end-to-end connectivity with the server.

TCP is a connection-oriented protocol that provides reliable transmission of data through a set of parameters agreed upon by the client and the server before the connection is established [5].

To establish a connection, there is a three-way handshake (Figure 5):

(Figure 5: Steps For Connection in TCP [5])

  1. Host A must send a synchronize (SYN) message to Host B
  2. Host B responds with an acknowledgment (ACK) message along with a SYN message
  3. Host A responds to the SYN message with its own ACK message

From this point there is a bidirectional state of communication between the two hosts (client and server). Once this connection is established, data can be exchanged over the network. Regardless of the size of the data, TCP divides it into smaller packets and assigns each a sequence number so that the data can be reassembled after being received. This sequencing also helps identify lost packets so they can be retransmitted. How much data may be sent before an acknowledgment is required is determined by the receiving host's TCP window size, a flow-control measure that prevents the receiver's buffer from being overwhelmed [5].
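
In a sockets API, the handshake and the sequencing happen beneath the programmer's view. The sketch below, a minimal TCP echo pair assuming localhost and an arbitrary port (50007), shows where each step occurs: connect() and accept() complete the three-way handshake, while sendall() and recv() ride on TCP's sequencing and acknowledgments.

```python
import socket
import threading
import time

def echo_server(port: int) -> None:
    """Accept one connection and echo back whatever arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _ = srv.accept()            # handshake completes here (SYN, SYN-ACK, ACK)
        with conn:
            data = conn.recv(1024)        # TCP delivers these bytes intact and in order
            conn.sendall(data)            # echo them back over the same connection

PORT = 50007                              # arbitrary unprivileged port for this sketch
threading.Thread(target=echo_server, args=(PORT,), daemon=True).start()
time.sleep(0.1)                           # give the server a moment to start listening

with socket.create_connection(("127.0.0.1", PORT)) as client:  # client sends the SYN here
    client.sendall(b"hello over TCP")
    print(client.recv(1024))              # b'hello over TCP'
```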

The process of sending and acknowledging is continuous and ensures that the client and server stay in sync. As illustrated below (Figure 6), the sequence number corresponds with the acknowledgement number based on the size of the receiver's (Host B) TCP window. This allows the sender and receiver to know which bytes are expected next [5].

(Figure 6: Sequence and Acknowledgement Windowing [5])

In contrast, UDP is a simpler, connectionless protocol that requires no three-way handshake, sequence numbering, or acknowledgments confirming that data was received. Essentially, UDP packages and transmits the data without concern for the ensuing outcome. While UDP includes a checksum to detect corrupted data, it has no mechanism to recover lost or incorrectly ordered packets.
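
A short sketch of that fire-and-forget behavior, assuming two local sockets and an arbitrary port (50008): each sendto() hands one datagram to the network with no handshake first, and the receiver simply takes whatever happens to arrive.

```python
import socket

# UDP is connectionless: no handshake, no sequencing, no delivery guarantee.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 50008))        # arbitrary port chosen for this sketch
receiver.settimeout(1.0)                   # don't block forever if a datagram never arrives

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame 1", ("127.0.0.1", 50008))   # no connection established first
sender.sendto(b"frame 2", ("127.0.0.1", 50008))

try:
    while True:
        data, addr = receiver.recvfrom(1024)      # datagrams may arrive out of order or not at all
        print(data, "from", addr)
except socket.timeout:
    pass                                          # on a real network, missing datagrams simply stay missing
finally:
    sender.close()
    receiver.close()
```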

TCP and UDP Structures

Because countless applications can run on a given machine, both TCP and UDP use port numbers to identify the service being requested on a given client or server. A device's IP address in combination with a port number is known as a socket. This allows different network services to operate on the same device. Thus, when a host receives a packet, the port number tells the transport layer which application to deliver the packet to [5].

Each packet in a TCP or UDP exchange is led by a header that contains the information the protocol needs to do its job. The TCP header includes the source and destination ports, sequence number, acknowledgement number, window size, and several more fields (Figure 8). Because UDP is far more lenient, its header contains only the source and destination ports, the length of the datagram, and a checksum for error detection (Figure 7).

(Figure 7: UDP Header Fields [5])

(Figure 8: TCP Header Fields [5])
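
These fixed-size headers are straightforward to decode by hand. The sketch below unpacks the fields shown in Figures 7 and 8 using Python's struct module; the header bytes are made-up sample values, not captured traffic.

```python
import struct

# All header fields are big-endian ("network byte order"), hence the "!" prefix.

# UDP header: 8 bytes -> source port, destination port, length, checksum.
udp_header = bytes.fromhex("0035 d431 0020 1a2b")
src, dst, length, checksum = struct.unpack("!HHHH", udp_header)
print(f"UDP  src={src} dst={dst} len={length} checksum=0x{checksum:04x}")

# TCP header: 20 bytes (without options) -> ports, sequence and acknowledgment
# numbers, data offset/flags, window size, checksum, urgent pointer.
tcp_header = bytes.fromhex("d431 0050 0000 1000 0000 2000 5018 7210 91fc 0000")
(src, dst, seq, ack, offset_flags,
 window, checksum, urgent) = struct.unpack("!HHIIHHHH", tcp_header)
print(f"TCP  src={src} dst={dst} seq={seq} ack={ack} window={window}")
```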

Case Examples for TCP vs. UDP

The stark differences between TCP and UDP connections reveal a clear trade-off between reliability and speed. With smaller headers and less digital oversight, UDP connections in some cases have an advantage over TCP in transmission time. For other applications, however, there is no room for missing packets, and reliability is paramount.

TCP is best for direct communication where a reliable connection is necessary. When sending emails or transferring files, it is more important to guarantee the integrity of every packet. Whether it's a critical business proposal, a legal document, or vital medical records, TCP ensures the completeness of the data. Some examples of protocols which utilize TCP are the File Transfer Protocol (FTP), Hypertext Transfer Protocol (HTTP), and Simple Mail Transfer Protocol (SMTP) [7]. All of these use cases require perfect byte-level integrity - otherwise the transferred file would be corrupted. TCP is a natural fit to ensure this outcome.

UDP is best used for real-time applications, like audio and video streaming. The stateless nature of UDP data transmission makes online multiplayer gaming possible because it is more beneficial to receive data quickly even if accuracy suffers [6]. To the human eye, an occasional missed game frame or moment of lag is negligible if it allows us to remain connected and continue playing uninterrupted. Some specific protocols built on UDP are Voice over IP (VoIP) and the Domain Name System (DNS) [7].

DNS is a particularly interesting use case of UDP because it still requires reliable data transmission. DNS uses UDP over TCP because the communication between a user and a DNS server is inherently very brief. TCP's three-way handshake becomes especially costly in this scenario because there is not a long enough transfer of data to amortize the startup cost. DNS instead achieves reliability at the application layer, for example by retransmitting queries that go unanswered [7]. Implementing reliable data transfer in a less heavy-handed protocol provides the core benefits of both UDP and TCP without either protocol's primary downsides. This is the motivation behind the development of the QUIC protocol.
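
That application-layer pattern is simple to sketch. The hypothetical helper below is not a real DNS client; it only illustrates the idea of a resolver getting reliability out of UDP by setting a timeout and re-sending the query itself when no answer arrives.

```python
import socket

def query_with_retries(message: bytes, server: tuple,
                       attempts: int = 3, timeout: float = 2.0) -> bytes:
    """Send a request over UDP and retry on timeout.

    This mirrors, in spirit, how DNS resolvers use UDP: the transport offers
    no guarantees, so the application re-sends the query if no reply arrives.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        for _ in range(attempts):
            sock.sendto(message, server)           # one datagram, no handshake
            try:
                response, _addr = sock.recvfrom(4096)   # wait briefly for the reply
                return response
            except socket.timeout:
                continue                           # lost query or lost reply: just ask again
        raise TimeoutError(f"no response after {attempts} attempts")
    finally:
        sock.close()
```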

The QUIC Protocol

QUIC is an encrypted-by-default internet transport protocol built on UDP with the intention of accelerating HTTP traffic and making it more secure [8]. QUIC was initially developed at Google in 2012 and stood for Quick UDP Internet Connections before the protocol was adopted by the IETF and the acronym was dropped [9]. Although still connection-oriented, QUIC offers major improvements over TCP in terms of speed. Instead of requiring a combination of TCP and TLS to guarantee secure transmission, QUIC provides security features entirely on its own. This benefits both speed and security, as the handshaking stage of QUIC connection establishment takes only one round trip rather than the two needed by TCP and TLS.

(Figure 9: QUIC in the Network Stack [9])

QUIC achieves additional HTTP performance improvements over base TCP through design choices targeted at improving HTTP latency. Although the decoupled nature of the TCP/IP stack is ideal for allowing independent improvement of protocols at each layer, cross-stack design offers more potential for optimization. QUIC specifically accelerates HTTP by mitigating the head-of-line blocking problem. Prior to QUIC, HTTP required packets from different streams of data to all enter the same queue on a connection. This resulted in slowdowns when a packet loss on one stream blocked all data transmission due to TCP's packet recovery mechanism. QUIC alleviates this problem by allowing multiple transport streams: each HTTP data stream is mapped to a QUIC transport stream, thereby preventing a lost packet on one HTTP stream from stalling data transfer on the entire connection [8]. By incorporating design elements specifically geared toward boosting HTTP performance, QUIC shows the utility of cross-stack design in improving the efficiency of network applications. TCP and UDP provide base functionality, but their extensibility is inherently limiting if the goal is to maximize efficiency for a specific application.

(Figure 10: QUIC’s Solution to Head of Line Blocking[10])

Conclusion

In the digital world, TCP/IP and its Transport Layer protocols, TCP and UDP, are vital to the operation and reliability of internet services. Without UDP, streaming and broadcasts would lack the speed necessary for real-time experiences, while without TCP, everyday messaging would be riddled with missing data. These capabilities stem from the underlying data structures and protocols that balance speed, reliability, and data integrity. TCP and UDP connections are invaluable in the modern age of communication, as they allow a wide variety of applications to communicate in different manners.

SOURCES

[1] https://www.techtarget.com/searchnetworking/definition/TCP-IP

[2] https://www.simplilearn.com/tutorials/cyber-security-tutorial/what-is-tcp-ip-model

[3] https://developer.mozilla.org/en-US/docs/Learn/Getting_started_with_the_web/How_the_Web_works

[4] https://www.pearsonhighered.com/assets/samplechapter/0/1/3/0/0130322202.pdf

[5] http://routeralley.com/guides/tcp_udp.pdf

[6] https://www.avast.com/c-tcp-vs-udp-difference

[7] https://www.geeksforgeeks.org/examples-of-tcp-and-udp-in-real-life/

[8] https://blog.cloudflare.com/the-road-to-quic

[9] https://www.auvik.com/franklyit/blog/what-is-quic-protocol/

[10] https://www.cdnetworks.com/media-delivery-blog/what-is-quic/