1. Viewing TCP Connection Information with Wireshark
You are probably already familiar with how TCP connections are established and used. If not, you can refer to the article I previously wrote on this topic:
Transmission Control Protocol – TCP
1.1 Establishing Connection
As shown in the image below, these three lines represent the process of the TCP three-way handshake:
First, the client sends a SYN packet. Wireshark shows its sequence number (Seq) as 0 because it displays relative sequence numbers by default; the real initial sequence number is a 32-bit value chosen by the client. Wireshark also shows more detailed information, such as the MSS, Selective ACK, and other options:
Among these options, you might be particularly interested in:
Maximum Segment Size (MSS): the largest amount of data the sender of the option is willing to receive in a single TCP segment.
Window Scale (WSopt): a shift count applied to the 16-bit window field, which allows receive windows larger than 65,535 bytes.
SACK: Selective ACK, which allows retransmission of only the individual lost segments when needed. This feature is enabled only when both ends support it.
Timestamps option (TSopt): timestamps carried in each segment, which both ends use to measure the round-trip time between client and server.
The second packet is the server’s SYN-ACK: it acknowledges the client’s SYN and also carries the server’s own SYN.
This packet includes the server’s initial sequence number and the server’s window size.
The third packet is the client’s ACK. In addition to the client’s sequence number, it also specifies the client’s window size.
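To make the options in the SYN packet more concrete, here is a minimal Python sketch that decodes the options field of a TCP header by hand. The function name parse_tcp_options and the sample bytes are my own illustration, not something taken from the capture above; the option kind numbers are the standard ones.

```python
import struct

# Standard TCP option kind numbers.
KIND_EOL = 0        # end of option list
KIND_NOP = 1        # padding
KIND_MSS = 2        # Maximum Segment Size
KIND_WSCALE = 3     # Window Scale (WSopt)
KIND_SACK_PERM = 4  # SACK Permitted
KIND_TIMESTAMP = 8  # Timestamps (TSopt)


def parse_tcp_options(raw: bytes) -> dict:
    """Decode the options field of a TCP header into a dictionary."""
    options = {}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == KIND_EOL:
            break
        if kind == KIND_NOP:
            i += 1
            continue
        if i + 1 >= len(raw):
            break  # truncated option, stop parsing
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break  # malformed option, stop parsing
        value = raw[i + 2:i + length]
        if kind == KIND_MSS and len(value) == 2:
            options["mss"] = struct.unpack("!H", value)[0]
        elif kind == KIND_WSCALE and len(value) == 1:
            options["window_scale_shift"] = value[0]
        elif kind == KIND_SACK_PERM:
            options["sack_permitted"] = True
        elif kind == KIND_TIMESTAMP and len(value) == 8:
            options["tsval"], options["tsecr"] = struct.unpack("!II", value)
        i += length
    return options


# Hypothetical SYN options: MSS 1460, SACK permitted, timestamps,
# one NOP for padding, window scale shift 7.
sample = bytes.fromhex(
    "020405b4" "0402" "080a0000000100000000" "01" "030307"
)
syn_options = parse_tcp_options(sample)
print(syn_options)
```

A window scale shift of 7 means the advertised 16-bit window value is multiplied by 128, which is why the raw window field in a packet can look much smaller than the window Wireshark calculates.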
1.2 Troubleshooting
Troubleshooting is fairly simple. If the capture shows the client sending a SYN packet and the server either not replying or answering with an RST packet, the server’s port is probably not listening, the connection is being actively rejected, or a firewall is blocking it. An RST usually points to a closed port or an active rejection, while no reply at all usually points to a firewall silently dropping the packet or an unreachable host.
After confirming that both the client and server are running normally, check the firewall configuration, make sure the IP address and port you are trying to access are correct, and, if the application authenticates, verify the username and password you are providing.
You might try to check the server’s status with the ping command, but in many cases the server’s firewall blocks ICMP packets, so the ping fails. This doesn’t mean the server is down.
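Since ping is not a reliable indicator, a direct TCP connection attempt is often more informative. The following is a rough Python sketch of that idea; probe_tcp_port is a hypothetical helper name, and the mapping from outcome to cause is a rule of thumb, not a guarantee.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and describe the outcome."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"       # three-way handshake completed
    except ConnectionRefusedError:
        return "refused"    # an RST came back: port closed or rejected
    except socket.timeout:
        return "no reply"   # SYN unanswered: possibly dropped by a firewall
    except OSError as exc:
        return f"error: {exc.strerror or exc}"
    finally:
        sock.close()


# Demonstration against a temporary listener on the loopback interface,
# so the example does not depend on any real server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
open_result = probe_tcp_port("127.0.0.1", demo_port)
listener.close()
print(open_result)
```

Running the same probe against the real server address and port while capturing in Wireshark lets you match each outcome to the SYN, SYN-ACK, or RST packets you see in the capture.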
2. TCP Retransmissions
One of the most common issues in the TCP communication process is TCP retransmission.
Retransmission is an important mechanism that TCP uses to recover from damaged or lost packets, and it works together with sequence numbers, which let the receiver deal with duplicated and out-of-order packets. If the sender doesn’t receive an acknowledgment for sent data within a certain time, it retransmits that data.
As a rule of thumb, once retransmitted packets reach about 0.5% of the traffic, performance suffers noticeably. At around 5%, the TCP connection may be disrupted.
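As a quick illustration of these thresholds, the sketch below computes the retransmission percentage from two packet counts and classifies it. The function names and the labels "ok", "degraded", and "critical" are my own; the 0.5% and 5% values are the ones quoted above.

```python
def retransmission_rate(retransmitted: int, total: int) -> float:
    """Return retransmitted packets as a percentage of all packets."""
    if total <= 0:
        raise ValueError("total must be a positive packet count")
    return 100.0 * retransmitted / total


def classify_rate(rate_percent: float,
                  warn: float = 0.5,
                  critical: float = 5.0) -> str:
    """Map a retransmission percentage to a rough severity label."""
    if rate_percent >= critical:
        return "critical"
    if rate_percent >= warn:
        return "degraded"
    return "ok"


# Hypothetical counts, e.g. read from Wireshark's status bar with and
# without the retransmission display filter applied.
rate = retransmission_rate(retransmitted=12, total=2000)
print(f"{rate:.2f}% -> {classify_rate(rate)}")
```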
In Wireshark, retransmitted packets are labeled as TCP Retransmission.
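To list all retransmitted packets in the current capture, apply the following display filter (in recent Wireshark versions, tcp.analysis.retransmission also selects the packets that the TCP analysis has flagged as retransmissions):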
expert.message == "Retransmission (suspected)"
Illustrated in the diagram below:
2.1 TCP retransmissions occurring to multiple destination addresses
Just like in the image above, you’ll notice that the ‘Destination’ is not concentrated and is distributed among multiple destination servers. This is often indicative of a network issue, possibly due to high load on your network interface card (NIC).
By using Wireshark’s ‘Statistics’ menu and selecting the ‘IO Graph’ option, you can open Wireshark’s IO load monitoring feature. This will allow you to see whether the communication on your current machine has reached the bottleneck of the NIC’s load.
If, as in the diagram above, the NIC load is not high, the cause may be a fault in the NIC or the link, or a heavily loaded link elsewhere on the path that is congesting the bandwidth.
You can log in to the communication devices within the link to check the packet loss rate.
2.2 TCP retransmissions only occur to the same destination address
In a situation similar to the one depicted in the diagram above, where all TCP retransmissions are concentrated on a single destination address, the cause is often poor processing performance in the application itself.
To further confirm if this is the cause, you can follow these steps:
- As explained in the previous section, use the IO Graph provided by Wireshark to check if the network load is excessively high.
- Use the ‘Conversations’ option under the ‘Statistics’ menu to open the Network Conversations window. Under the IPv4 tab, check the ‘Limit to display filter’ checkbox. This shows only the sessions that contain retransmissions, which helps you confirm the diagnosis.
- In the Network Conversations window, click on the TCP tab. Similarly, check the ‘Limit to display filter’ checkbox to view the specific ports involved in the retransmissions. This way, you can identify the application causing the issue and pinpoint the specific problem.
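The distinction between retransmissions spread across many destinations and retransmissions concentrated on one can also be checked with a few lines of Python once the destination addresses have been exported from the capture. This is only a sketch: summarize_destinations is a hypothetical helper, the 80% cutoff is an arbitrary assumption, and the addresses are documentation examples.

```python
from collections import Counter


def summarize_destinations(retransmission_dsts, dominant_share=0.8):
    """Report whether retransmissions cluster on a single destination."""
    counts = Counter(retransmission_dsts)
    total = sum(counts.values())
    if total == 0:
        return {"pattern": "none", "top": None, "share": 0.0}
    top, top_count = counts.most_common(1)[0]
    share = top_count / total
    if share >= dominant_share:
        pattern = "single destination"     # suspect the application
    else:
        pattern = "multiple destinations"  # suspect the network or NIC
    return {"pattern": pattern, "top": top, "share": share}


# Hypothetical destination column of the filtered retransmission packets.
concentrated = ["192.0.2.10"] * 9 + ["192.0.2.20"]
scattered = ["192.0.2.10", "192.0.2.20", "192.0.2.30", "192.0.2.40"]

print(summarize_destinations(concentrated))
print(summarize_destinations(scattered))
```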
It is also worth checking whether the TCP retransmissions occur at regular intervals or are triggered by certain events. For instance, in the diagram below, TCP retransmissions occur approximately every 30ms, coinciding with a specific operation executed by the client software. This suggests that the operation might be triggering slow requests.
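One way to test for such regularity is to look at the gaps between retransmission timestamps. The sketch below treats the gaps as periodic when their standard deviation is small relative to their mean; the function name, the 20% tolerance, and the sample timestamps are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def retransmission_period(timestamps, tolerance=0.2):
    """Return the mean gap in seconds if the gaps are regular, else None."""
    if len(timestamps) < 3:
        return None  # too few samples to judge regularity
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average <= 0:
        return None
    # Coefficient of variation: spread of the gaps relative to their mean.
    if pstdev(gaps) / average <= tolerance:
        return average
    return None


# Hypothetical capture times (seconds) of retransmitted packets.
regular = [0.000, 0.030, 0.061, 0.090, 0.121]
irregular = [0.00, 0.01, 0.50, 0.52, 2.00]

print(retransmission_period(regular))
print(retransmission_period(irregular))
```

If a period is found, compare it with timers or scheduled operations in the client software to see whether any of them runs at the same interval.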
2.3 TCP retransmissions caused by an unresponsive application
If multiple TCP retransmissions occur shortly after sending SYN or ACK packets during connection establishment, with increasing intervals between the TCP retransmissions, this is often due to application unresponsiveness.
In such a scenario, while you are investigating why the application is unresponsive, the application may attempt to re-establish the connection after 15 to 20 seconds. You can also restart the application manually to re-establish the connection.
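The increasing intervals described in this section are the signature of TCP’s retransmission backoff, where each wait is typically about double the previous one. A small sketch of how that pattern could be recognized from capture timestamps follows; looks_like_backoff and its 1.5 growth factor are assumptions of mine, chosen to tolerate some timing noise.

```python
def looks_like_backoff(timestamps, min_growth=1.5):
    """Return True if every gap is clearly longer than the previous gap."""
    if len(timestamps) < 3:
        return False  # need at least two gaps to compare
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return all(
        prev > 0 and cur / prev >= min_growth
        for prev, cur in zip(gaps, gaps[1:])
    )


# Hypothetical times (seconds) of an original SYN and its retransmissions.
syn_retries = [0.0, 1.0, 3.0, 7.0, 15.0]   # gaps of 1, 2, 4, 8 seconds
steady = [0.00, 0.03, 0.06, 0.09]          # roughly constant gaps

print(looks_like_backoff(syn_retries))
print(looks_like_backoff(steady))
```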
2.4 TCP retransmissions caused by network jitter
The TCP protocol itself employs methods such as the Nagle algorithm, sliding window protocol, slow start, congestion avoidance, and fast recovery to prevent network congestion.
However, network jitter remains a significant problem for TCP and often leads to retransmissions.
To diagnose this issue, you can perform a ping to the destination address and observe the fluctuation in the time values to determine if there is any instability.
You can check:
- Whether the network link is congested and if the link status is stable.
- If the server hosting the application lacks resources, experiences hardware issues, or has inadequate configurations.
- Whether there is device overload or insufficient resources within the network link.
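To put a number on the fluctuation in ping times mentioned above, you can paste the round-trip times into a short script. The sketch below reports the average RTT and the mean absolute difference between consecutive samples as a simple jitter figure; jitter_stats and the sample values are hypothetical, and other definitions of jitter exist.

```python
from statistics import mean


def jitter_stats(rtts_ms):
    """Summarize a series of ping round-trip times in milliseconds."""
    if len(rtts_ms) < 2:
        raise ValueError("need at least two RTT samples")
    # Jitter here is the mean absolute change between consecutive samples.
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {
        "avg_rtt_ms": mean(rtts_ms),
        "min_rtt_ms": min(rtts_ms),
        "max_rtt_ms": max(rtts_ms),
        "jitter_ms": mean(diffs),
    }


# Hypothetical time= values copied from ping output.
stable = [20.1, 20.3, 19.9, 20.2]
unstable = [20, 85, 22, 140, 25]

print(jitter_stats(stable))
print(jitter_stats(unstable))
```

A jitter figure that is large compared with the average RTT, as in the second sample, suggests the link is unstable and worth investigating with the checks listed above.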
3. Conclusion
Overall, the types of issues described above can be approached with the following strategies:
- Inductive Analysis: Determine if the problem is associated with a specific host, a particular TCP connection, or a specific behavior.
- Methodical Investigation: Check whether the network link is overloaded or losing packets, whether the servers or client hosts have performance issues, and whether the application itself is facing performance problems.
- Identifying the Root Issue: Determine if the problem is caused by network jitter.
Based on my experience, the majority of performance issues stem from the business logic layer, meaning they are caused by the application code. Therefore, the first thing to investigate is whether the application code was modified around the time the performance issues appeared, since those modifications might be contributing to the problems. Only after ruling out this scenario should you turn to packet capture tools to analyze network issues. Otherwise, there’s a risk of heading in the wrong direction and seeking solutions in the wrong places.
Usually, the problem is not solely due to network jitter, even though it’s the easiest attribution. In most cases, attributing issues to network jitter is simply a form of laziness.