In this example, if error code 101503 occurs when trying to connect to the first endpoint, that endpoint is not retried, whereas the second endpoint is always retried when error code 101503 occurs. For a given endpoint you can specify either enabled or disabled error codes, but not both.
With leaf endpoints, a message is lost if an error occurs while it is being transmitted; the failed message is not retried. Such errors are rare, but they do happen. Some applications can tolerate these losses. If even rare message failures are not acceptable, use a failover endpoint.
Here is the configuration for failover endpoints. At the configuration level, a failover is a logical grouping of one or more leaf endpoints.
<failover>
    <endpoint .../>+
</failover>
When a message reaches a failover endpoint, the failover goes through its list of endpoints and picks the first one that is in the Active or Timeout state. It then sends the message using that endpoint. If an error occurs while sending the message, the failover goes through the endpoint list again from the beginning and tries to send the message using the first endpoint that is in the Active or Timeout state.
Some errors put the endpoint into the Timeout state and some keep the endpoint in the Active state. In these cases, the retry can happen using the same endpoint. If the failure occurs with the first endpoint in the failover group and the error does not put the endpoint into the Suspended state, the retry happens using the same endpoint.
Failover gives priority to the first endpoint that is not in the Suspended state, so it sends messages through the first endpoint in the failover group as long as that endpoint is not suspended. When the first endpoint is suspended, it sends requests using the second endpoint. When the first endpoint becomes ready to send again, the failover switches back to the first endpoint, even though the second endpoint is still active.
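The priority rule described above can be illustrated with a failover group that contains two address endpoints. This is a minimal sketch, not a recommended configuration: the endpoint names (TwoNodeFailover, Primary, Secondary) and the URIs are hypothetical, and the suspension values are only placeholders.

```xml
<!-- Hypothetical names and URIs, for illustration only -->
<endpoint name="TwoNodeFailover">
    <failover>
        <!-- First in the list: used whenever it is not in the Suspended state -->
        <endpoint name="Primary">
            <address uri="http://localhost:9001/services/myservice">
                <suspendOnFailure>
                    <initialDuration>1000</initialDuration>
                    <progressionFactor>2</progressionFactor>
                    <maximumDuration>64000</maximumDuration>
                </suspendOnFailure>
            </address>
        </endpoint>
        <!-- Second in the list: used only while Primary is in the Suspended state -->
        <endpoint name="Secondary">
            <address uri="http://localhost:9002/services/myservice"/>
        </endpoint>
    </failover>
</endpoint>
```

With a group like this, messages would go to Primary until it is suspended, then to Secondary, and back to Primary as soon as its suspension period ends.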
Failover can also be used with a single endpoint. If there is only one service endpoint and message failure is not tolerable, wrap that endpoint in a failover group so that failed messages can be retried.
A sample failover with one address endpoint:
<endpoint name="SampleFailover">
    <failover>
        <endpoint name="Sample_First" statistics="enable">
            <address uri="http://localhost/myendpoint" statistics="enable" trace="disable">
                <timeout>
                    <duration>60000</duration>
                </timeout>
                <markForSuspension>
                    <errorCodes>101504, 101505, 101500</errorCodes>
                    <retriesBeforeSuspension>3</retriesBeforeSuspension>
                    <retryDelay>1</retryDelay>
                </markForSuspension>
                <suspendOnFailure>
                    <initialDuration>1000</initialDuration>
                    <progressionFactor>2</progressionFactor>
                    <maximumDuration>64000</maximumDuration>
                </suspendOnFailure>
            </address>
        </endpoint>
    </failover>
</endpoint>
The Sample_First endpoint is marked as Timeout if a connection times out, a connection closes, or an I/O error occurs while sending (the error codes listed under markForSuspension). For all other errors, it is marked as Suspended. When one of the Timeout errors occurs, the failover retries using the first non-suspended endpoint, which in this case is the same endpoint (Sample_First). It retries until the retry count becomes 0. Retries happen in parallel: because messages reach this endpoint on many threads, the same message may not be retried three times, since another message may fail and reduce the retry count.
The retry count is per endpoint, not per message.
In this configuration, we assume that these errors are rare and that it is acceptable to retry when they happen once in a while. If they happen frequently and continuously, the endpoint requires immediate attention to bring it back to a normal state.
For information on configuring a failover endpoint to handle errors, see Configuring Failover Endpoints.