Connection Dropped in one Direction One Sided Trace - microsoft/CSS_SQL_Networking_Tools GitHub Wiki

Connection Dropped in One Direction - One-Sided Trace

In this example, we have only the trace from the client machine.
In general, both a client trace and a server trace are required to confirm a packet drop issue. However, if you know the expected sequence of packets, you can sometimes identify the drop from a one-sided trace.

The application error is: A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)

Note: "A transport-level error" means the error occurred at the network (TCP) level rather than inside SQL Server or the application.

Frame  Delta     Source IP   Dest IP     Description
------ --------- ----------- ----------- --------------------------------------------------------------------------------------------------
...
--- Traffic is okay up to frame 203606
203606 0.0000273 10.10.33.94 10.10.30.16 TCP: [Bad CheckSum]Flags=...A...., SrcPort=55705, DstPort=63414, PayloadLen=0, Seq=3020705548, Ack=49355660, Win=64860 (scale factor 0x0) = 64860

--- The problem starts with the next four packets.
--- The client sends several packets to SQL Server as part of a long query, and SQL Server ACKs the first packet (frame 203607) in frame 203613. Note that Ack=49357040 in frame 203613 matches the end of the range Seq=49355660 - 49357040 in frame 203607.
--- SQL Server then sends a Dup Ack (frame 203615), indicating it thinks some packets were lost.
203607 0.0000146 10.10.30.16 10.10.33.94 TCP:[Continuation to #203602]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49355660 - 49357040, Ack=3020705548, Win=64860
203608 0.0000030 10.10.30.16 10.10.33.94 TDS:Data, Version = 7.300000(No version information available, using the default version), Reassembled Packet
203610 0.0000303 10.10.30.16 10.10.33.94 TDS:Data, Version = 7.3 (0x730a0003), Reassembled Packet
203611 0.0000026 10.10.30.16 10.10.33.94 TCP:[Continuation to #203610]Flags=...AP..., SrcPort=63414, DstPort=55705, PayloadLen=1100, Seq=49359800 - 49360900, Ack=3020705548, Win=64860
203613 0.0000265 10.10.33.94 10.10.30.16 TCP: [Bad CheckSum]Flags=...A...., SrcPort=55705, DstPort=63414, PayloadLen=0, Seq=3020705548, Ack=49357040, Win=64860 (scale factor 0x0) = 64860
203615 0.0000167 10.10.33.94 10.10.30.16 TCP:[Dup Ack #203613] [Bad CheckSum]Flags=...A...., SrcPort=55705, DstPort=63414, PayloadLen=0, Seq=3020705548, Ack=49357040, Win=64860 (scale factor 0x0) = 64860

--- The client sends a new packet.
--- SQL Server sends another Dup Ack of frame 203613, which still only ACKs frame 203607, indicating there is a gap in the data it has received.
--- SQL Server then receives nothing further and, after waiting 30 seconds, sends a Keep-Alive packet (frame 523289).
203616 0.0002455 10.10.30.16 10.10.33.94 TDS:Continuous RPCRequest, Version = 7.3 (0x730a0003), SPID = 0, PacketID = 12, Flags=...AP..., SrcPort=63414, DstPort=55705, PayloadLen=1149, Seq=49360900 - 49362049, Ack=3020705548, Win=64860
203618 0.0000184 10.10.33.94 10.10.30.16 TCP:[Dup Ack #203613] [Bad CheckSum]Flags=...A...., SrcPort=55705, DstPort=63414, PayloadLen=0, Seq=3020705548, Ack=49357040, Win=64860 (scale factor 0x0) = 64860

--- The client retransmits the packet from frame 203608, which is the packet immediately after the last one the Dup Ack indicated was received. Note that the delay between retransmits roughly doubles each time.
207100 0.3191863 10.10.30.16 10.10.33.94 TCP:[ReTransmit #203608] [Bad CheckSum]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49357040 - 49358420, Ack=3020705548, Win=64860 (scale factor 0x0) = 64860
231520 0.5941605 10.10.30.16 10.10.33.94 TCP:[ReTransmit #203608] [Bad CheckSum]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49357040 - 49358420, Ack=3020705548, Win=64860 (scale factor 0x0) = 64860
239901 1.2184355 10.10.30.16 10.10.33.94 TCP:[ReTransmit #203608] [Bad CheckSum]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49357040 - 49358420, Ack=3020705548, Win=64860 (scale factor 0x0) = 64860
258472 2.4065393 10.10.30.16 10.10.33.94 TCP:[ReTransmit #203608] [Bad CheckSum]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49357040 - 49358420, Ack=3020705548, Win=64860 (scale factor 0x0) = 64860
350891 4.8438963 10.10.30.16 10.10.33.94 TCP:[ReTransmit #203608] [Bad CheckSum]Flags=...A...., SrcPort=63414, DstPort=55705, PayloadLen=1380, Seq=49357040 - 49358420, Ack=3020705548, Win=64860 (scale factor 0x0) = 64860

--- The server, having not received anything, sends a Keep-Alive packet (ACK + Payload length of 1)
523289 20.624419 10.10.33.94 10.10.30.16 TCP:[Keep alive] [Bad CheckSum]Flags=...A...., SrcPort=55705, DstPort=63414, PayloadLen=1, Seq=3020705547 - 3020705548, Ack=49357040, Win=64860 (scale factor 0x0) = 64860

--- The client terminates the connection by sending a reset (note the R flag).
523290 0.0002219 10.10.30.16 10.10.33.94 TCP:[Keep alive ack]Flags=...A.R.., SrcPort=63414, DstPort=55705, PayloadLen=0, Seq=49357040, Ack=3020705548, Win=0
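
The timing in the trace above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the original analysis; it assumes the Delta column is the time in seconds since the previous frame shown in the listing. It computes the ratio between successive retransmit delays and the total time from the last Dup Ack (frame 203618) to the Keep-Alive (frame 523289). The variable names are illustrative only.

```python
# Delta values copied from the trace for the five retransmits
# (frames 207100, 231520, 239901, 258472, 350891).
retransmit_deltas = [0.3191863, 0.5941605, 1.2184355, 2.4065393, 4.8438963]

# Delta value of the Keep-Alive packet (frame 523289).
keep_alive_delta = 20.624419

# Ratio of each retransmit delay to the previous one.
# Values near 2 indicate the delay roughly doubles on every attempt.
ratios = [later / earlier
          for earlier, later in zip(retransmit_deltas, retransmit_deltas[1:])]

# Total elapsed time from the last Dup Ack to the Keep-Alive,
# assuming each Delta is relative to the previous listed frame.
total_wait = sum(retransmit_deltas) + keep_alive_delta

print("backoff ratios:", [round(r, 2) for r in ratios])
print("seconds until Keep-Alive:", round(total_wait, 2))
```

Under that assumption, the ratios come out close to 2 and the total is approximately 30 seconds, which is consistent with the 30-second wait described in the annotations.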

The client appears to be receiving packets from SQL Server without problems, but SQL Server is not receiving the client's packets. Note that the Acknowledgement number on the server packets remains at 49357040, much less than the client's Sequence number 49358420.
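
The same observation can be made mechanically. The following is a minimal Python sketch with the Ack and Seq values typed in by hand from the listing above; it is an illustration only, not a feature of any trace tool, and `trailing_ack_run` is a hypothetical helper name. It finds the run of server frames at the end of the trace that all carry the same Ack value, then compares that value with the highest client sequence number visible in the listing.

```python
# (frame, Ack) for the server-to-client frames, copied from the trace.
server_acks = [
    (203606, 49355660),
    (203613, 49357040),
    (203615, 49357040),
    (203618, 49357040),
    (523289, 49357040),
]

# (frame, end of Seq range) for client-to-server frames that show
# an explicit Seq range in the listing.
client_seq_ends = [
    (203607, 49357040),
    (203611, 49360900),
    (203616, 49362049),
    (207100, 49358420),  # first retransmit of frame 203608
]


def trailing_ack_run(acks):
    """Return the final Ack value and the consecutive frames at the
    end of the list that carry that same value."""
    last_ack = acks[-1][1]
    run = []
    for frame, ack in reversed(acks):
        if ack != last_ack:
            break
        run.append(frame)
    return last_ack, run[::-1]


last_ack, stalled_frames = trailing_ack_run(server_acks)
highest_sent = max(end for _, end in client_seq_ends)
unacked_bytes = highest_sent - last_ack

print("server Ack stalled at", last_ack, "in frames", stalled_frames)
print("client bytes never acknowledged:", unacked_bytes)
```

A server Ack that stays fixed across several frames while the client's sequence numbers keep advancing is the pattern to look for when only the client-side trace is available.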

This was eventually discovered to be a smart switch dropping the packets from the client to the server.