VPN Delays Packets Causing Connection to Reset - microsoft/CSS_SQL_Networking_Tools GitHub Wiki

VPN Delays Packets Causing Connection to Reset

In this case, a VPN between data centers did not drop packets but queued them because of congestion. The resulting delay was long enough for the Windows TCP/IP stack (TCPIP.SYS) to conclude that the connection was bad.

This trace was taken on the server. We do not have the client-side trace; with it, the issue would be more obvious.

Frame Time Offset Source IP   Dest IP     Description
----- ----------- ----------- ----------- --------------------------------------------------------------------------------------
23103 176.1043234 10.10.10.30 10.10.20.50 TCP:[Continuation to #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=13
23104 176.1044389 10.10.20.50 10.10.10.30 TCP: [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, PayloadLen=0, Seq=25
23105 176.1044656 10.10.10.30 10.10.20.50 TCP:[Continuation to #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=13
23106 176.1046604 10.10.20.50 10.10.10.30 TCP: [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, PayloadLen=0, Seq=25

--- The client has not received an acknowledgement for Frame 23085 and retransmits it
23107 176.1047174 10.10.10.30 10.10.20.50 TCP:[ReTransmit #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=1310, S
23108 176.1047174 10.10.10.30 10.10.20.50 TCP:[ReTransmit #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=1310, S
23109 176.1047469 10.10.20.50 10.10.10.30 TCP: [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, PayloadLen=0, Seq=25
23110 176.1047691 10.10.20.50 10.10.10.30 TCP:[Dup Ack #23106] [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, Payl
23111 176.1048097 10.10.10.30 10.10.20.50 TCP:[ReTransmit #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=1310, S
23112 176.1048302 10.10.20.50 10.10.10.30 TCP:[Request Fast-Retransmit from Seq583988460] [Bad CheckSum]Flags=...A...., SrcPor
23113 176.1048628 10.10.10.30 10.10.20.50 TCP:[ReTransmit #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=1310, S

--- After retransmitting the packet four times, the client closes the connection with an ACK+RESET packet.
--- Normally we should see no packets after this, or maybe one, if it is in flight.
23114 176.1048628 10.10.10.30 10.10.20.50 TCP:Flags=...A.R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583972740, Ack=2574

23115 176.1048819 10.10.20.50 10.10.10.30 TCP: [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, PayloadLen=0, Seq=25

--- However, the client sends 17 RESET packets, indicating that 17 packets from the
--- SQL Server were in flight at the time it closed the connection.
23116 176.1195189 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839
23117 176.1195519 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839
23118 176.1195519 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839
23119 176.1195519 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839
23120 176.1245539 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583974050, Ack=5839
23121 176.1263881 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583976670, Ack=5839
23122 176.1313331 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583979290, Ack=5839
23123 176.1313910 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583981910, Ack=5839
23124 176.1319531 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583981910, Ack=5839
23125 176.1321113 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583988460, Ack=5839
23126 176.1323034 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583988460, Ack=5839
23127 176.1326310 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583988460, Ack=5839
23128 176.1327701 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583988460, Ack=5839
23129 176.1330567 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583981910, Ack=5839
23130 176.1335751 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583983220, Ack=5839
23131 176.1340293 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583984530, Ack=5839
23132 176.1343805 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583988460, Ack=5839

Explanation

  1. An ACK+RESET packet is sent while the connection is still open, to abort any further processing.
  2. A RESET packet without ACK is sent in response to a packet that arrives after the connection has been closed.
  3. The 17 RESET packets indicate that 17 packets from the server arrived at the client after it had closed the connection.
  4. Only one packet was sent from the server after the ACK+RESET packet (frame 23115), so the other 16 packets were already in flight before then. The top of the trace segment shows two of them; the others go back to before frame 23085, the frame that was retransmitted, which indicates that the ACK for that frame would have been delayed as well.
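
The counting described in the list above can be sketched in Python. This is a minimal illustration, not part of the original tooling: it assumes the trace has been exported as text in the column layout shown above and that the flags field is the 8-character Network Monitor form (`...A.R..`). The names `classify_resets` and `SAMPLE_TRACE` are hypothetical, and the sample lines are copied from the trace as shown, truncation included.

```python
import re

# Frame number, time offset, source IP, destination IP, then the TCP summary.
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+TCP:")
# Eight flag positions, e.g. "...A.R.." (ACK+RESET) or ".....R.." (RESET only).
FLAGS_RE = re.compile(r"Flags=([A-Z.]{8})")

# A few lines copied from the trace above (hypothetical variable name).
SAMPLE_TRACE = [
    "23113 176.1048628 10.10.10.30 10.10.20.50 TCP:[ReTransmit #23085]Flags=...A...., SrcPort=54321, DstPort=1455, PayloadLen=1310, S",
    "23114 176.1048628 10.10.10.30 10.10.20.50 TCP:Flags=...A.R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583972740, Ack=2574",
    "23115 176.1048819 10.10.20.50 10.10.10.30 TCP: [Bad CheckSum]Flags=...A...., SrcPort=1455, DstPort=54321, PayloadLen=0, Seq=25",
    "23116 176.1195189 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839",
    "23117 176.1195519 10.10.10.30 10.10.20.50 TCP:Flags=.....R.., SrcPort=54321, DstPort=1455, PayloadLen=0, Seq=583971430, Ack=5839",
]


def classify_resets(lines, client_ip):
    """Return (ack_reset_frames, bare_reset_frames) sent by client_ip.

    ACK+RESET frames abort an open connection; RESET frames without ACK
    answer packets that arrived after the connection was closed.
    """
    ack_reset, bare_reset = [], []
    for line in lines:
        frame_match = FRAME_RE.match(line)
        flags_match = FLAGS_RE.search(line)
        if not frame_match or not flags_match:
            continue  # annotation line, blank line, or unparseable row
        frame, _offset, src, _dst = frame_match.groups()
        flags = flags_match.group(1)
        if src != client_ip or "R" not in flags:
            continue
        if "A" in flags:
            ack_reset.append(int(frame))
        else:
            bare_reset.append(int(frame))
    return ack_reset, bare_reset


ack_reset, bare_reset = classify_resets(SAMPLE_TRACE, "10.10.10.30")
print("ACK+RESET frames:", ack_reset)
print("RESET-only frames:", len(bare_reset))
```

Run against the full trace segment, the count of RESET-only frames is the number of late-arriving server packets, which is the figure that items 3 and 4 reason about.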

The two computers were in different data centers. Local connections never showed the problem; only connections over the VPN did.

We could also reproduce this issue using PSPING.EXE from www.sysinternals.com, which showed that the problem was not specific to SQL Server and could affect any application.
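
Where PsPing is not available, a comparable application-independent check can be approximated with a short Python script. This is only a sketch under stated assumptions: it is not PsPing and does not reproduce its options or output. It times the TCP handshake alone, so queuing on the path shows up as latency spikes or timeouts rather than as resets. The function name `tcp_connect_latency_ms` and its parameters are hypothetical.

```python
import socket
import time


def tcp_connect_latency_ms(host, port, count=5, timeout=2.0):
    """Time `count` TCP connects to host:port.

    Returns a list with one entry per attempt: the connect time in
    milliseconds, or None if the attempt failed or timed out.
    """
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        try:
            # create_connection returns once the TCP handshake completes.
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:  # includes timeouts and refused connections
            samples.append(None)
    return samples


def summarize(samples):
    """Return (failed_count, min_ms, max_ms) for a list of samples."""
    ok = [s for s in samples if s is not None]
    failed = len(samples) - len(ok)
    if not ok:
        return failed, None, None
    return failed, min(ok), max(ok)
```

For example, `summarize(tcp_connect_latency_ms("10.10.20.50", 1455, count=100))` would probe the server address and port that appear in the trace above. Comparing the result across the VPN with the result from a machine in the same data center gives a rough indication of whether the delay is on the network path.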