It is an external server. One should always design and code with the expectation that external calls will fail.
Calls can fail intermittently for the following known reasons.
1. The server is down.
2. The network is having issues.
They can fail repeatedly for the following reasons.
1. Something is incorrectly configured.
2. Your code is not setting up the call correctly (credentials, a wrong message, an invalid message, etc.).
There can be processing issues.
1. The server never responds.
2. The server takes too long (a timeout is exceeded).
3. The server has an internal error and returns an error code.
4. The server returns an error code that suggests a retry is possible.
5. The server returns an error code that suggests a retry is not possible.
There can be other known/unknown reasons not in the lists above.
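A common way to act on the distinction above is to classify each error as "a retry may help" or "a retry is pointless" before deciding what to do. This is a minimal sketch in Python; the status codes and the split between transient and permanent are illustrative assumptions, not a definitive list.

```python
# Hypothetical classification of failures from an external call.
# Transient: the condition may clear up on a later attempt.
# Permanent: retrying the same call will keep failing.
TRANSIENT_STATUS = {429, 502, 503, 504}   # throttled, gateway errors, server down
PERMANENT_STATUS = {400, 401, 403, 404}   # bad request, bad credentials, not found

def is_retryable(status_code: int) -> bool:
    """Return True only for errors that might clear up on a later attempt."""
    if status_code in TRANSIENT_STATUS:
        return True
    if status_code in PERMANENT_STATUS:
        return False
    # Unknown codes: be conservative and do not retry automatically.
    return False
```

A caller would check `is_retryable` before scheduling another attempt; anything classified as permanent should instead surface an error to the code that set up the call.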
You can choose to implement a retry strategy, but it can only help in some of the cases above.
The problems with retry strategies are the following.
1. Are there situations where a call must not be retried? For example, if you just attempted to update the inventory by removing 10 items, do you want to keep retrying that again and again for every possible error? That could be a problem if the server is in fact succeeding (the 10 were removed) but then fails when attempting to format a correct response back to you.
2. Are there situations where it will never work, so retries are pointless?
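Both problems above can be encoded in the retry wrapper itself. The sketch below (Python; the `idempotent` flag and helper name are assumptions for illustration) gives a non-idempotent operation, like "remove 10 items", exactly one attempt, since the server may have succeeded even though the response was lost.

```python
import time

def call_with_retries(operation, *, idempotent, max_attempts=3, delay_seconds=0.0):
    """Retry `operation` only when it is safe and potentially useful.

    `operation` is any zero-argument callable that raises on failure.
    Non-idempotent operations get exactly one attempt: blind retries
    could apply the same change twice.
    """
    attempts = max_attempts if idempotent else 1
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay_seconds)  # back off before the next attempt
    raise last_error
```

A stronger design makes the operation itself idempotent (for example, by sending a unique request ID the server deduplicates), which lets every call be retried safely; the flag here is the minimal alternative.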
One must also evaluate what retry strategies do to the enterprise as a whole. For example, if simple retries are in place and there is a chain of 5 services that keep retrying (service A retries B, which retries C, etc.), what happens to the original caller while they wait?
Even more complicated: what happens with timeouts? If service B makes three attempts at 90 seconds each, and service A also enforces a timeout, then a single call to B requires a minimum timeout of 270 seconds in A. And if A retries three times as well, the wait compounds again. That presumes, of course, that A even knows B is using timeouts like that.
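The timeout arithmetic above is just multiplication up the chain, which is easy to make explicit. This is a small sketch (Python; the function name and argument shapes are assumptions) that computes the minimum timeout each upstream service needs, innermost hop first.

```python
def required_timeouts(per_attempt_timeout, attempts_per_hop):
    """Minimum timeout budget at each level of a retrying service chain.

    `per_attempt_timeout` is the timeout (seconds) the innermost service
    uses per attempt; `attempts_per_hop[i]` is the number of attempts the
    i-th service up the chain makes. Each level must wait for all of the
    attempts made by the level below it.
    """
    budget = per_attempt_timeout
    budgets = []
    for attempts in attempts_per_hop:
        budget *= attempts          # this level waits for every attempt below
        budgets.append(budget)
    return budgets

# Mirroring the example: B makes 3 attempts at 90 s each, and A makes
# 3 attempts to B.
# required_timeouts(90, [3, 3]) -> [270, 810]
```

With five services each making three 90-second attempts, the original caller's worst case is 90 × 3⁵ = 21,870 seconds, which is why retry budgets are usually set end to end rather than per hop.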