Asynchronous P2P message queues (retry attempts)
From the receiving target_cage's point of view, it might appear to make no
difference whether a call arrived via a synchronous RPC or as a message.
There is one difference, however, and it is important to understand before
using queued calls in Pythomnic: error handling. With a synchronous RPC call
the caller waits for the callee, so if the call fails, the caller becomes
aware of it right away and is able to react. This is not the case with a
message queue: if message processing fails, the caller is not aware of it,
and the call would simply be lost. To remedy this, calls that arrive in
queued messages are retried upon failure. For example,
    # on target cage, in target_module.py
    def target_method(*args, **kwargs):
        ...
        # if this throws, the call is retried again in a while
        ...
You are therefore supposed to catch any exceptions you expect, process, log
and silence them, and let the method return normally.
    # on target cage, in target_module.py
    def target_method(*args, **kwargs):
        try:
            ...
        except ExpectedExceptionOne:
            pass  # do something, but don't rethrow
        except ExpectedExceptionTwo:
            pass  # do something, but don't rethrow
This way, you will be able to gracefully handle the expected exceptions,
while unexpected ones will cause a retry. Should the failure condition
be transient, as it often is, the retried attempt will succeed. Should the
call have encountered a permanent failure condition, it will keep being retried
pointlessly until an administrator or developer notices. In this case the
reloadability of Pythomnic application modules serves well: while the call
is kept in storage, waiting for its next retry attempt, the developer
can intervene and modify the faulty (or unprepared for the fault) target_module.py.
Upon the next retry attempt the module is reloaded and the call succeeds.
This behaviour helps enormously in dealing with unexpected problems and
your own bugs.
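The contract described above can be illustrated with a toy stand-in for the queue. This is only a sketch, not Pythomnic code: deliver, ExpectedError, flaky_method and tolerant_method are hypothetical names, and the sketch retries immediately where a real queue would persist the call and wait between attempts.

```python
import logging

log = logging.getLogger("target_module")

class ExpectedError(Exception):
    """Hypothetical error that the target method knows how to handle."""

def deliver(method, *args, max_attempts=5, **kwargs):
    """Toy stand-in for the queue: retry until the method returns normally.

    Returns the number of attempts it took, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            method(*args, **kwargs)
            return attempt
        except Exception:
            # Any exception escaping the method means "retry later".
            log.exception("attempt %d failed, will retry", attempt)
    return None

calls = {"flaky": 0}

def flaky_method(value):
    # Fails twice with an unexpected (transient) error, then succeeds.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient failure")

def tolerant_method(value):
    # Expected errors are caught, logged and silenced, so no retry happens.
    try:
        raise ExpectedError("bad input")
    except ExpectedError:
        log.warning("ignoring bad input %r", value)
```

Here deliver(flaky_method, 1) needs three attempts because the first two raise, while deliver(tolerant_method, 1) completes on the first attempt because the expected error never leaves the method.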
The final question with respect to retries is how many retries are enough.
Will the call be retried forever? How long will it sleep between
retry attempts? The answer is that each P2P queue has a number of parameters
controlling the retry behaviour. You can examine them in config/config_execute_on.py,
but for now it is sufficient to describe the default behaviour. By default, a
P2P queue keeps retrying each call for three full days, just enough
time for someone to come in on Monday morning and find out that something broke
on Friday night. The first retry attempt occurs after a minute, the second slightly
over a minute later, and so on, with the delay growing exponentially until it reaches
an hour. After that the call is retried once an hour, resulting in a total of 80 attempts
over three days.
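The shape of such a schedule can be sketched in a few lines. The function and parameter names below are hypothetical and are not the names used in config/config_execute_on.py. In particular, the growth factor of 1.5 is a guess made for illustration; the actual default may well be different.

```python
def retry_schedule(initial_delay=60.0, growth=1.5,
                   max_delay=3600.0, expires_in=3 * 24 * 3600.0):
    """Return the delays (in seconds) preceding each retry attempt.

    All defaults except the one-minute start, the one-hour cap and the
    three-day lifetime are assumptions made for this sketch.
    """
    delays = []
    elapsed = 0.0
    delay = initial_delay
    # Keep scheduling retries while the next one still fits in the lifetime.
    while elapsed + delay <= expires_in:
        delays.append(delay)
        elapsed += delay
        # Exponential growth, capped at the maximum delay.
        delay = min(delay * growth, max_delay)
    return delays

delays = retry_schedule()
print(len(delays), "retries; first delays:", delays[:4])
```

With these assumed values the sketch yields 80 retries, which is in line with the figure quoted above, but that agreement should not be read as confirmation of the real parameters.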
The described P2P messaging is very useful as it is, and you may also want to use
an extension mechanism to it - the call chains described in the next section.