Error Detection in Data Communication - 180D-FW-2024/Knowledge-Base-Wiki GitHub Wiki

Introduction

In digital communication systems, data is represented as a sequence of binary bits—ones and zeros—that are transmitted across various mediums, such as copper wires, optical fibers, or wireless networks. These channels, however, are not always reliable, as they are subject to interference, noise, and other disruptions that can alter the bits during transmission. When a bit changes from a one to a zero, or vice versa, it results in what is known as a bit error. Even minor errors in data transmission can lead to corrupted files, garbled messages, or system malfunctions.

To address these challenges, error detection and correction techniques have been developed to ensure data reliability in digital communication. These techniques add redundancy to the transmitted message, allowing the receiver to check for potential errors. Some methods are designed to simply detect errors, prompting the system to resend the data if an error is found, while others can correct errors on the spot, eliminating the need for retransmission. The selection of an error-handling method often depends on the communication system’s requirements: simple error detection may suffice for short, low-stakes messages, but applications that demand high reliability and low latency may benefit from more sophisticated error correction.

Error detection and correction techniques can broadly be classified into two categories: Backward Error Correction (BEC), which involves detecting errors and then resending the data if errors are detected, and Forward Error Correction (FEC), where errors are corrected at the receiver without requiring retransmission. This article explores several of the most common error detection and correction techniques—including parity checks, checksums, and Hamming codes—that enable digital communication systems to maintain accuracy and robustness despite transmission challenges.

Types of errors

Single-bit flip

A single-bit flip is the simplest type of error. It occurs when exactly one bit in a message changes during transmission. This kind of error is typically easy to detect, and a straightforward parity check is often sufficient to identify it.

Single-Bit Flip

Multiple-bit flip

A multiple-bit flip occurs when more than one bit in a message changes during transmission. These flips can either be scattered randomly throughout the message or occur in a contiguous sequence, known as a burst error. Though rarer than single-bit flips, multiple-bit flips can still happen, especially in high-noise environments. Detecting multiple-bit flips requires more robust techniques, such as checksums or cyclic redundancy checks.

Random Multiple-Bit Flip

Burst Error
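As a rough illustration of the three error types above, the sketch below compares a sent and a received bit string and labels the difference. The function names `error_pattern` and `classify` are hypothetical, and treating only strictly consecutive flips as a burst is a simplifying assumption (definitions of burst errors vary).

```python
def error_pattern(sent: str, received: str) -> str:
    """Return a bit string with a 1 wherever sent and received differ."""
    if len(sent) != len(received):
        raise ValueError("bit strings must have equal length")
    return "".join("1" if s != r else "0" for s, r in zip(sent, received))


def classify(pattern: str) -> str:
    """Label an error pattern (illustrative categories only)."""
    flips = [i for i, bit in enumerate(pattern) if bit == "1"]
    if not flips:
        return "no error"
    if len(flips) == 1:
        return "single-bit flip"
    # Assumption: a burst is a run of consecutive flipped positions.
    if flips[-1] - flips[0] + 1 == len(flips):
        return "burst error"
    return "random multiple-bit flip"


pattern = error_pattern("10110010", "10100010")
print(pattern, classify(pattern))  # 00010000 single-bit flip
```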

Backward Error Detection

One-Dimensional Parity Check

In a parity check, an additional bit (the parity bit) is appended to the message. Under even parity, this bit is set so that the total number of ones in the message, including the parity bit itself, is even. When the receiver counts the ones in the received message, an even total suggests either that no error occurred or that an even number of bits flipped. If the count is odd, however, an error is flagged, and the message can be resent.

Even Parity

A parity check works best when dealing with simple, single-bit errors, as it can easily detect them. However, if an even number of bits flip during transmission, the parity will still appear correct, and the error will go undetected. This limitation makes parity checks most suitable for detecting isolated errors in low-noise environments, where the likelihood of multiple-bit errors is minimal. Although parity checks were very widely used in the past for applications such as checking data on magnetic hard drives, they have fallen largely out of favor in modern computing. Instead, more robust methods such as checksums and cyclic redundancy checks are used.
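The even-parity scheme described above can be sketched in a few lines. This is a minimal illustration that represents messages as strings of "0" and "1"; the helper names `add_even_parity` and `check_even_parity` are assumptions made for this example.

```python
def add_even_parity(data: str) -> str:
    """Append a parity bit so the total number of ones is even."""
    parity = data.count("1") % 2
    return data + str(parity)


def check_even_parity(frame: str) -> bool:
    """Return True when the frame (data plus parity bit) has an even number of ones."""
    return frame.count("1") % 2 == 0


frame = add_even_parity("1011001")          # four ones, so the parity bit is 0
print(frame, check_even_parity(frame))      # 10110010 True
print(check_even_parity("00110010"))        # one bit flipped: False (detected)
print(check_even_parity("01110010"))        # two bits flipped: True (missed)
```

The last line shows the limitation discussed above: two flips leave the count of ones even, so the check passes even though the data is corrupted.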

Two-Dimensional Parity Check

In a two-dimensional parity check, parity bits are computed for each row, similar to a basic parity check. Additionally, parity bits are determined for all columns, and both sets of parity bits are transmitted alongside the data. Upon reception, these parity bits are compared with those calculated from the received data to verify accuracy.

Advantages of Two-Dimensional Parity Check

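In comparison to a one-dimensional parity check, a two-dimensional parity check can detect and correct all single-bit errors: a single flipped data bit causes exactly one row parity and one column parity to mismatch, and their intersection identifies the bit
The two-dimensional parity check can also detect two- or three-bit errors that occur anywhere in the matrix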
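The sketch below shows one way the row and column parities might be computed and then used to locate a single flipped data bit. It assumes even parity and a data matrix given as a list of equal-length rows; the function names and the sample matrix are made up for illustration.

```python
def parity_2d(rows):
    """Return (row_parities, column_parities) using even parity."""
    row_parity = [sum(row) % 2 for row in rows]
    col_parity = [sum(col) % 2 for col in zip(*rows)]
    return row_parity, col_parity


def locate_single_error(rows, row_parity, col_parity):
    """Return (row, col) of a single flipped data bit, or None if all parities match."""
    new_rows, new_cols = parity_2d(rows)
    bad_rows = [i for i, (a, b) in enumerate(zip(new_rows, row_parity)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(new_cols, col_parity)) if a != b]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error detected, but not a single data-bit flip")


data = [[1, 0, 1, 1],
        [0, 1, 1, 0],
        [1, 1, 0, 0]]
row_p, col_p = parity_2d(data)           # sent alongside the data

received = [row[:] for row in data]
received[1][2] ^= 1                      # simulate one flipped bit
print(locate_single_error(received, row_p, col_p))  # (1, 2)
```

Once the position is known, the receiver can correct the error by flipping that bit back. This sketch assumes the parity bits themselves arrive intact.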

Checksum

The checksum method helps ensure data integrity by adding a summary value to the end of a message, allowing the receiver to verify that the data has not been altered during transmission. In this method, the message is divided into smaller substrings of n bits, which are then summed together using one’s complement arithmetic—a method that simplifies binary addition by allowing carries to be added back to the result. The one’s complement of the resulting sum, called the checksum, is appended to the message as an n-bit value before transmission.

When the message arrives, the receiver performs the same steps: it divides the message (including the checksum) into n-bit segments, sums them with one’s complement arithmetic, and checks if the final sum’s complement is zero. A zero result indicates that no errors were detected, while a non-zero result flags an error in transmission, prompting retransmission of the original message.

Checksum is effective at detecting both single-bit errors and random multiple-bit errors, particularly those distributed across different parts of the message. However, certain patterns of bit flips can cancel each other out and leave the checksum unchanged. For example, if a bit flips from 0 to 1 in one segment and the bit at the same position flips from 1 to 0 in another segment, the sum is the same as before. Reordering whole segments also goes unnoticed, because addition does not depend on order. This means that while checksums are robust for general-purpose error detection, they are not foolproof.
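The following is a minimal sketch of the one's complement checksum procedure described above, assuming the message has already been split into n-bit integers (n = 8 here). The function names and the sample words are illustrative choices, not part of any particular protocol.

```python
def ones_complement_sum(words, n=8):
    """Add n-bit words, folding any carry out of bit n back into the sum."""
    mask = (1 << n) - 1
    total = 0
    for word in words:
        total += word
        total = (total & mask) + (total >> n)  # end-around carry
    return total


def make_checksum(words, n=8):
    """Checksum is the one's complement of the one's complement sum."""
    return ~ones_complement_sum(words, n) & ((1 << n) - 1)


def verify(words_with_checksum, n=8):
    """Receiver side: the complemented sum over data plus checksum must be zero."""
    mask = (1 << n) - 1
    return (~ones_complement_sum(words_with_checksum, n) & mask) == 0


words = [0b10011001, 0b11100010, 0b00100100, 0b10000100]
checksum = make_checksum(words)
print(f"{checksum:08b}")                                   # 11011010
print(verify(words + [checksum]))                          # True
print(verify([words[0] ^ 1] + words[1:] + [checksum]))     # False: one flip is caught
# Opposite flips at the same bit position in two words cancel out:
print(verify([words[0] ^ 1, words[1] ^ 1] + words[2:] + [checksum]))  # True
```

The last call shows an undetected error of the kind mentioned above: one word loses a 1 in its lowest bit while another gains it, so the sum is unchanged.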

Cyclic Redundancy Check (CRC)

CRC operates similarly to a checksum, but instead of using addition, it relies on binary (modulo-2) polynomial division. The sender and receiver agree on a generator polynomial. The sender treats the entire message as one long binary number, appends as many zero bits as the degree of the generator, and divides the result by the generator, using XOR in place of subtraction. The remainder of this division is the CRC code, which is appended to the end of the message in place of the zero bits. When the receiver gets the message, it performs the same division on the full frame using the same polynomial. If the remainder is zero, it indicates no error was detected. CRC is highly effective for detecting single-bit errors, as well as multiple-bit and burst errors, and is widely used in digital communication for its robustness. It is particularly well-suited for applications where error detection must be reliable across high-noise channels.
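To make the division step concrete, here is a small sketch of a CRC computed by bitwise modulo-2 long division on bit strings. The generator `1011` and the sample message are arbitrary choices for illustration; real protocols use standardized, longer generators and usually table-driven implementations.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of bits divided by generator, using XOR instead of subtraction."""
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - len(gen) + 1):
        if work[i] == 1:                      # divide only when the leading bit is 1
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-(len(gen) - 1):])


def crc_encode(message: str, generator: str) -> str:
    """Append the CRC: the remainder of the zero-padded message."""
    padded = message + "0" * (len(generator) - 1)
    return message + mod2_remainder(padded, generator)


def crc_check(frame: str, generator: str) -> bool:
    """A received frame passes when its remainder is all zeros."""
    return "1" not in mod2_remainder(frame, generator)


frame = crc_encode("11010011101100", "1011")
print(frame)                                  # 11010011101100100
print(crc_check(frame, "1011"))               # True
corrupted = frame[:4] + ("1" if frame[4] == "0" else "0") + frame[5:]
print(crc_check(corrupted, "1011"))           # False
```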

Forward Error Correction

Why use FEC?

Each of the methods above detects errors, but they require the entire message to be resent if an error is detected, a process known as Backward Error Correction (BEC). This approach is inefficient when only a small portion of the message is corrupted. To avoid retransmission, we can use Forward Error Correction (FEC) techniques, which correct errors at the receiver without needing the message to be resent.

Hamming Codes

Hamming codes were invented by Richard Hamming in the late 1940s at Bell Labs and represent the first practical method of forward error correction. These codes work by adding multiple parity bits to a message in a specific pattern, such that each data bit is covered by multiple parity bits, and each parity bit covers multiple data bits. This structure gives any two valid codewords a minimum distance of three bits, which enables a Hamming code to correct any single-bit error. Alternatively, if no correction is attempted, it can detect any two-bit error; a two-bit error cannot be corrected, and a decoder that tries to correct it will flip the wrong bit. Adding one extra overall parity bit allows single-bit errors to be corrected and two-bit errors to be detected at the same time. This innovation marked a significant advancement in digital communication, allowing for more reliable data transmission without the need for retransmission.

The number of parity bits $p$ required for a given number of data bits $d$ is determined by the formula: $2^p \geq d + p + 1$. For example, if you need to transmit 7 data bits $(d=7)$, you’ll need 4 parity bits $(p=4)$, making the total message length 11 bits. This makes Hamming codes ideal for applications where data accuracy is critical, and an increase in message length is acceptable. Hamming codes are still used today for long distance satellite communication.
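The inequality above can be checked by simply trying successive values of p until it holds. The helper name `parity_bits_needed` is hypothetical and used only for this sketch.

```python
def parity_bits_needed(d: int) -> int:
    """Smallest p satisfying 2**p >= d + p + 1 for d data bits."""
    if d < 1:
        raise ValueError("need at least one data bit")
    p = 0
    while 2 ** p < d + p + 1:
        p += 1
    return p


for d in (4, 7, 11):
    p = parity_bits_needed(d)
    print(f"d={d}: p={p}, total length={d + p}")
# d=4: p=3, total length=7
# d=7: p=4, total length=11
# d=11: p=4, total length=15
```

Note how the overhead shrinks in relative terms as the message grows: 3 parity bits protect 4 data bits, while 4 parity bits can protect up to 11.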

Step-by-Step Example: Encoding "1011" with Hamming Codes

Calculate the total number of bits needed. Since we have 4 data bits, the number of parity bits required is $p=3$, as $2^3 = 8 \geq 4 + 3 + 1$. So, the total message will be 7 bits (4 data bits + 3 parity bits).
Number each bit position in binary, and designate positions that are powers of two as parity bits. The positions are as follows:
- Positions: 1, 2, 3, 4, 5, 6, 7
- Bits: P1, P2, D3, P4, D5, D6, D7

Hamming blank parity

Here, P1, P2, and P4 are parity bits, while D3, D5, D6, and D7 are the data bits (containing "1011").

Assign each parity bit to monitor specific data bits. Each parity bit will cover data bits whose positions in binary contain a "1" in the position of that parity bit:
- P1 (covers bits 1, 3, 5, 7)
- P2 (covers bits 2, 3, 6, 7)
- P4 (covers bits 4, 5, 6, 7)
Calculate parity bits.
- P1: Covers D3 (1), D5 (0), D7 (1). Total ones = 2 (even), so P1 = 0.
- P2: Covers D3 (1), D6 (1), D7 (1). Total ones = 3 (odd), so P2 = 1.
- P4: Covers D5 (0), D6 (1), D7 (1). Total ones = 2 (even), so P4 = 0.

Hamming parity added
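The encoding steps above can be sketched as follows for the 7-bit case with even parity. The function name `hamming74_encode` is an illustrative choice; the bit layout (parity in positions 1, 2, and 4, data in positions 3, 5, 6, and 7) follows the example.

```python
def hamming74_encode(data: str) -> str:
    """Encode 4 data bits into a 7-bit codeword laid out as P1 P2 D3 P4 D5 D6 D7."""
    if len(data) != 4 or set(data) - {"0", "1"}:
        raise ValueError("expected exactly 4 bits")
    code = [0] * 8                       # index 0 unused so indices match positions
    for pos, bit in zip((3, 5, 6, 7), data):
        code[pos] = int(bit)
    for p in (1, 2, 4):
        # A parity bit at position p covers every position whose binary
        # representation has the bit p set (excluding p itself here).
        covered = [i for i in range(1, 8) if i & p and i != p]
        code[p] = sum(code[i] for i in covered) % 2
    return "".join(str(b) for b in code[1:])


print(hamming74_encode("1011"))  # 0110011
```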

Error Detection and Correction

Suppose an error occurs during transmission, and the receiver gets 1110011 instead of 0110011 (P1 is flipped from 0 to 1).

The receiver recalculates the parity bits using the same method.
The recalculated parity bits indicate an inconsistency: P1 should be 0 (since the sum of bits it covers is still even), but it’s received as 1.
The positions of incorrect parity bits pinpoint the exact location of the error. In this case, P1 is flagged as incorrect, indicating that the bit in position 1 is flipped.
By correcting position 1 from 1 back to 0, the original message 0110011 is restored, corresponding to "1011".
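The receiver-side procedure can be sketched in the same style. Each parity group is rechecked, and the positions of the failing groups are added together to give the position of the flipped bit (often called the syndrome). The function name `hamming74_correct` is hypothetical, and the sketch assumes at most one bit was flipped.

```python
def hamming74_correct(received: str):
    """Return (syndrome, corrected codeword, data bits) for a 7-bit even-parity Hamming word."""
    if len(received) != 7 or set(received) - {"0", "1"}:
        raise ValueError("expected exactly 7 bits")
    code = [0] + [int(b) for b in received]   # index 0 unused
    syndrome = 0
    for p in (1, 2, 4):
        # Each group (parity bit plus the bits it covers) should have even parity.
        if sum(code[i] for i in range(1, 8) if i & p) % 2:
            syndrome += p
    if syndrome:
        code[syndrome] ^= 1                   # flip the bit the syndrome points to
    corrected = "".join(str(b) for b in code[1:])
    data = "".join(str(code[i]) for i in (3, 5, 6, 7))
    return syndrome, corrected, data


print(hamming74_correct("1110011"))  # (1, '0110011', '1011')
print(hamming74_correct("0110001"))  # (6, '0110011', '1011')
```

The second call flips data position 6 instead of a parity bit: groups P2 and P4 both fail, and 2 + 4 = 6 points at the corrupted bit.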

Advantages of Error Detection

Greater Data Reliability: Error detection helps maintain the integrity of transmitted data, ensuring it remains accurate and free of errors so that the recipient receives exactly what the sender intended
Optimized Network Performance: By identifying and isolating network issues that cause errors, error detection enhances overall network efficiency and minimizes downtime
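Stronger Data Security: Error detection also plays a role in safeguarding transmitted data by revealing when it has been altered in transit. Parity bits, checksums, and CRCs only guard against accidental corruption, however; protection against deliberate tampering requires cryptographic integrity checks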

Disadvantages of Error Detection

Increased Overhead: Implementing error detection consumes extra resources and processing power, potentially leading to higher network load, slower performance, and increased latency
False Positives: Error detection systems may occasionally misidentify errors, triggering unnecessary data retransmissions and adding to network congestion
Restricted Error Correction: While error detection can pinpoint errors, it does not fix them. The recipient must request a retransmission from the sender, causing potential delays and further network strain

Future of Error Detection

Machine Learning and Artificial Intelligence

As ChatGPT and DeepSeek make large strides, the possibility of integrating AI with error detection techniques opens up. By training AI models on enormous amounts of data, it may be possible to identify patterns that indicate errors in data transmission. Such models could then detect and correct errors automatically, reducing the need for manual fixing. According to a blog post from Faster Capital, researchers have already demonstrated a proof of this concept in DNA sequencing. While this sounds promising on paper, AI's intensive operations may only be worthwhile for very specific use cases, such as satellite communication, where there is a lot of downtime between transmissions.

Quantum Error Detection

Quantum computers can perform certain complex computations exponentially faster than classical computers. However, qubits are fragile and highly sensitive to noise, so quantum computers encounter errors far more often. This presents new opportunities and challenges for error detection methods. There is preliminary research into this, such as the work of the startup Error Corp, whose mission is to pioneer a "new approach to detect and correct quantum errors more efficiently, promising to reduce the number of qubits required for quantum error correction drastically" (Wolba 3). According to Error Corp's founder, the company has the "potential to propel quantum computing beyond its current limitations and address some of the most challenging problems in science and technology" (3). In addition to Error Corp, Google has also taken strides in this space with Willow, its quantum chip, which shares Error Corp's aim of improving quantum computing's efficiency through error correction. According to the Willow article, the team has achieved stable progress, growing its qubit grids from 3x3 to 7x7, substantially dropping the logical error probability, and increasing the lifespan of the qubits by 24 times since its first emergence.

References

- https://en.wikipedia.org/wiki/Parity_bit
- https://www.geeksforgeeks.org/error-detection-code-checksum/
- https://en.wikipedia.org/wiki/Cyclic_redundancy_check
- https://www.youtube.com/watch?v=X8jsijhllIA
- https://www.geeksforgeeks.org/hamming-code-in-computer-network/
- https://research.google/blog/making-quantum-error-correction-work/
- https://fastercapital.com/topics/future-developments-in-error-detection-techniques.html
- https://www.future-of-computing.com/error-corp-shaping-the-future-of-error-control-for-quantum-computers/