Being aware of the myriad ways in which network I/O can fail goes a long way toward creating a robust system and diagnosing problems. This post contains some Windows specifics, but there are equivalents to most categories in any OS.
- Server response codes
- Wikipedia’s list of HTTP status codes is handy, as well as the RFC 7231 summary of each class.
- httpstat.us provides a helpful service for testing your reaction to status codes.
- In .NET, a WebException exposes the server's reply via its Response property; cast it to HttpWebResponse to read the StatusCode.
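As a rough illustration of the status classes that RFC 7231 summarizes, the sketch below maps a numeric code to its class. It is written in Python purely for brevity, and `status_class` and `describe` are hypothetical helper names, not part of any library.

```python
from http import HTTPStatus

# Status classes as summarized in RFC 7231, keyed by the first digit.
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class name for an HTTP status code."""
    return STATUS_CLASSES.get(code // 100, "unknown")


def describe(code: int) -> str:
    """Combine the code with its standard reason phrase, when one exists."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes Python does not know about still get classified by first digit.
        phrase = "unregistered code"
    return f"{code} {phrase} ({status_class(code)})"
```

Logging the class alongside the raw code makes it easier to decide quickly whether a retry is worthwhile (5xx) or the request itself needs fixing (4xx).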
- Timeouts
- Timeouts are often thought of as a single type of error, but there are more specific subtypes. Depending on your API tools, you may or may not have visibility into all of them. For large transfers or poor networks you may need to dive deeper. When debugging, it also helps to know which particular timeout has occurred.
- Connection timeout: how long it takes to establish the TCP connection.
- Send timeout: how long it takes for a set of bytes to be sent (not all bytes intended to be sent, just one block).
- Receive timeout: how long it takes for a set of bytes to be received (not all bytes intended to be received, just one block).
- Overall timeout: how long the overall connection is allowed to remain open.
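To make the distinction concrete, here is a minimal sketch in Python (not .NET) that applies each timeout separately over a raw TCP socket and names the one that fired. The function name `fetch_with_timeouts` and the default values are assumptions for illustration only.

```python
import socket
import time


def fetch_with_timeouts(host, port, payload, connect_timeout=5.0,
                        io_timeout=10.0, overall_timeout=60.0):
    """Send payload and read until the peer closes, naming whichever timeout fires."""
    deadline = time.monotonic() + overall_timeout
    try:
        # Connection timeout: covers only establishing the TCP connection.
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except socket.timeout:
        raise TimeoutError("connection timeout") from None

    with sock:
        # Send/receive timeout: applies to each blocking socket call,
        # not to the transfer as a whole.
        sock.settimeout(io_timeout)
        try:
            sock.sendall(payload)
        except socket.timeout:
            raise TimeoutError("send timeout") from None

        chunks = []
        while True:
            # Overall timeout: enforced by hand against a monotonic deadline.
            if time.monotonic() > deadline:
                raise TimeoutError("overall timeout")
            try:
                block = sock.recv(4096)
            except socket.timeout:
                raise TimeoutError("receive timeout") from None
            if not block:
                return b"".join(chunks)
            chunks.append(block)
```

A slow but steadily trickling transfer never trips the receive timeout, which is why the overall deadline is tracked separately.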
- Port already in use or blocked by Windows firewall
- These two conditions manifest as the same error message in .NET:
An attempt was made to access a socket in a way forbidden by its access permissions {IP:Port}
- To determine if the cause is a port conflict and which process to blame, try TCPView or netstat -o on the command line.
- To determine if a Windows firewall rule is blocking the connection, you will need to inspect its log.
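A quick way to check the port-conflict half of this is to try binding the port yourself. The Python sketch below does that; `probe_port` is a hypothetical helper, and it says nothing about firewall rules, which still require the log.

```python
import errno
import socket


def probe_port(port: int, host: str = "127.0.0.1") -> str:
    """Try to bind a TCP socket to the port and report why it failed, if it did."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            # The OS refused outright (for example a reserved port or missing rights).
            return "forbidden"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Another socket already holds this address and port.
                return "in use"
            raise
        return "free"
```

If the result is "in use", TCPView or netstat -o will tell you which process owns the port.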
- DNS failures
- The .NET exception for name resolution failure does not distinguish among timeouts, name not found, and no live network adapter. Even the HResult is the same.
WebException: 'The remote name could not be resolved: '{X}''
- A DNS timeout can take a surprisingly long time in Windows and is a system-wide setting. The DnsClient.NET library provides more control.
- You can flush Windows DNS cache entries on the command line with ipconfig /flushdns.
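Because the error itself may not reveal the cause, it can help to record how long the lookup took. The following Python sketch wraps a resolver call with a timer; `timed_resolve` and its `resolver` parameter are assumptions made for illustration (the parameter exists only so the function can be exercised without a network).

```python
import socket
import time


def timed_resolve(host, resolver=socket.getaddrinfo):
    """Resolve host and report the outcome together with the elapsed time."""
    start = time.monotonic()
    try:
        infos = resolver(host, None)
        outcome = {"ok": True,
                   "addresses": sorted({info[4][0] for info in infos})}
    except socket.gaierror as exc:
        # The error code may not say much, so the elapsed time
        # is kept as an extra clue for the logs.
        outcome = {"ok": False, "error": exc.errno, "message": exc.strerror}
    outcome["elapsed_ms"] = round((time.monotonic() - start) * 1000, 1)
    return outcome
```

Logging `elapsed_ms` for every failure gives you a baseline, so an unusually fast or slow failure stands out.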
- Windows hosts file
- Located at `C:\Windows\System32\drivers\etc`, this file allows you to override domain name queries to use the IP you desire.
- Makes a nice ad blocker and is helpful in a pinch to modify the behavior of code you do not control. But it can be evil if others are not aware it is modified.
- On the command line, `nslookup` bypasses the hosts file and will return an outside query result. If your code is seeing a different DNS result than nslookup, suspect your hosts file.
- If UAC is enabled on your system you will need to run your editor as admin or edit a copy and manually overwrite the hosts file.
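When you suspect the hosts file, listing its active overrides is quicker than reading it by eye. This is a small Python sketch; `parse_hosts` is a hypothetical helper, and treating the first entry for a name as the winner is an assumption of the sketch.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname in hosts-file text to the IP address it is pinned to."""
    overrides = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        entry = line.split("#", 1)[0].split()
        if len(entry) < 2:
            continue  # blank line, comment, or an address with no names
        ip, names = entry[0], entry[1:]
        for name in names:
            # First entry wins here; an assumption for this sketch.
            overrides.setdefault(name.lower(), ip)
    return overrides
```

To use it, read the file named `hosts` in the directory above and pass its text to `parse_hosts`, then look up the domain your code is calling.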
- API misuse
- Some network APIs will stop you if you attempt to make a request that the local code knows is invalid without even touching the network.
- A .NET `HttpWebRequest` can throw `InvalidOperationException` and `ProtocolViolationException` in these cases. `WebRequest` can also complain before the network call with a few exceptions.
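The same behavior exists outside .NET. As an analogous sketch in Python, `urllib.request.Request` rejects a URL without a scheme while the object is being constructed, before any network I/O happens; `build_request` is a hypothetical wrapper name.

```python
import urllib.request


def build_request(url: str):
    """Return (request, None), or (None, reason) if rejected before any network I/O."""
    try:
        # Constructing the Request only parses the URL; nothing is sent yet.
        return urllib.request.Request(url), None
    except ValueError as exc:
        return None, str(exc)
```

Catching these local failures separately keeps them out of your retry logic, since retrying a malformed request will never succeed.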
- Locally bad pipe
- This category includes all of the things like a disabled network adapter, cable not plugged in, or not connected to a WiFi access point.
- In .NET you can check if any network interface is up, but once you make a network call you will not get an explicit exception message that distinguishes this scenario from other causes. However, if your logs are detailed, you may notice that the time between request and failure is shorter in this situation than in others. In the case of an HTTP request by domain name, you will see a WebException “The remote name could not be resolved” (a DNS error) happen very quickly (under 100ms), whereas the same failure ordinarily takes over 500ms. A request made directly to an IP address will quickly return a WebException “Unable to connect to the remote server”.
- Connection pooling, port exhaustion
- TCP connections are often pooled for efficiency reasons. This is desirable but potentially limiting if you are load testing or have high server-to-server loads. At the OS level, review MaxUserPort and TCPTimedWaitDelay.
- Failing to apply `using` statements or call `Dispose` can exhaust connection pools.
- Some .NET APIs enforce ServicePointManager limits on connections per domain, the default of which is only 2. This originates from the HTTP/1.1 spec for clients (which most clients no longer respect). This is ugly if you are making HTTP calls from a server. It can be changed within app/web config.
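The idea behind `using` and `Dispose` is deterministic release of the underlying connection. As a loose analogy in Python (not .NET), a `with` block plays the same role for a socket; `open_and_release` is a hypothetical name for this sketch.

```python
import socket


def open_and_release() -> int:
    """Open a socket inside a with-block and return its descriptor after the block exits."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # While inside the block the socket holds a real OS handle.
        assert sock.fileno() != -1
    # Leaving the block closed the socket, and would have done so
    # even if an exception had been raised inside it.
    return sock.fileno()
```

A closed socket reports a descriptor of -1, so the handle is back with the OS as soon as the block ends rather than whenever the garbage collector gets to it.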
- HTTPS handshake
- Certificates becoming invalid due to lack of upkeep is sadly common.
- An invalid certificate should fail the request, unless some evil person has modified your ServicePointManager ServerCertificateValidationCallback. This is a process-wide setting that can disable HTTPS validation. Code it to always return true anywhere (including a 3rd party DLL) and the entire process will ignore certificate errors. It could also be used for development/test purposes, but this is such an evil setting that I would avoid it like the plague. If you must use it, apply it narrowly.
- If your local system clock is way off it may fall outside the valid date range of the certificate. This may seem unlikely until you start pulling devices from storage with dead CMOS batteries.
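The clock problem is easy to check for once you know a certificate's validity dates. Here is a minimal Python sketch; `clock_within_validity` is a hypothetical helper, and it assumes the date strings use the format that Python's `ssl` module reports for peer certificates.

```python
import ssl
import time
from typing import Optional


def clock_within_validity(not_before: str, not_after: str,
                          now: Optional[float] = None) -> bool:
    """Check whether a clock reading falls inside a certificate's validity window.

    Dates use the format found in ssl.SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    if now is None:
        now = time.time()  # local system clock, seconds since the epoch
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now <= end
```

A device whose clock has reset to a date decades in the past will fail this check against any current certificate, which points at the clock rather than the certificate.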
- Captive portals
- A captive portal is that annoying web page you see at a hotel or other business that requires you to do something before gaining access to a network.
- I have had more than one cellular M2M router present a captive portal page when there is no connection. Depending on how it is implemented, you may get a seemingly valid HTTP reply but good luck parsing it.
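One defensive habit is to verify that a reply looks like what your API returns before parsing it. The Python sketch below assumes a hypothetical API that always answers 200 with a JSON body; `parse_api_reply` and those expectations are illustrative assumptions, not a general captive portal detector.

```python
import json


def parse_api_reply(status: int, content_type: str, body: bytes):
    """Return decoded JSON, or raise ValueError if the reply does not look like ours."""
    if status != 200:
        raise ValueError(f"unexpected status {status}")
    # A portal page is typically HTML, so check the declared type first.
    if not content_type.lower().startswith("application/json"):
        raise ValueError(
            f"unexpected content type {content_type!r}; possible captive portal")
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("body is not valid JSON; possible captive portal") from exc
```

Failing with a message that names the suspicion saves the next person from chasing a parser bug that is really a network problem.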
- Bad close of Fiddler
- An esoteric situation, but one that can drive a QA team insane. In older versions of Fiddler (an HTTP/S sniffer), closing it down abruptly would break the system’s network adapter and require opening and cleanly closing it again. I have not been able to reproduce this on the latest version (v5).
Featured Image Credit
Buffalo Gap National Grassland, South Dakota. Photo by Nick Bushby.