This change alters how timeouts are calculated for threads added to the queue
when the lock cannot be acquired immediately, and adds logic to reduce the
timeouts when a thread leaves the queue, either by acquiring the lock or by
timing out while waiting. Tests have been added to verify that the added and
altered Lua code is necessary to provide the documented behavior of the fair
lock, and that the changes do not break existing desired behavior.
The timeout drift issue is resolved by decreasing the timeouts in the
redisson_lock_timeout sorted set when a thread is removed from the queue. This
logic was added to the tryLockInnerAsync Lua code (both variations) in the
branch where the lock is successfully acquired; in that case every timeout
other than the one being removed from the queue is decreased by threadWaitTime.
Additionally, the existing Lua code in acquireFailedAsync was changed to always
decrease the timeouts regardless of where the removed thread is in the queue.
This requires traversing the queue to find the position of the thread being
removed, so that only the threads after it have their timeouts decreased. The
existing code also had a flaw: if the 1st and 2nd threads in the queue were
removed via acquireFailedAsync, the TTL for the 3rd thread would equal the lock
TTL, and it would not be able to acquire the lock fairly if the lock expired.
Fixing both the timeout drift and the unfair timeout decrease in the existing
code depends on the change to the timeout calculation.
Previously, the timeout calculation at the end of the Lua code for
tryLockInnerAsync in the tryLock with waitTime call path produced one of two
values: the lock timeout value + 5s for the first thread in the queue, or, for
every other thread in the queue, essentially the first thread's timeout + 5s.
The second rule is correct for the 2nd thread per the documentation. For the
3rd through Nth threads, however, the timeout is the same as the 2nd thread's,
so if the 1st and 2nd threads died, these threads could not acquire the lock
fairly within 5s after the prior thread. This is contrary to the documentation,
which provides 5s per thread in the queue. The new code sets the timeout for a
thread added to the queue to 5s plus the timeout of the thread at the end of
the queue. The Lua script now also always checks whether the thread that failed
to acquire the lock is already in the queue; if it is, the script returns the
approximate TTL based on that thread's current timeout (timeout - 5s).
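
To make the difference between the two calculations concrete, here is a
simplified Python model of the old and new rules. It is a sketch rather than
the actual Lua: the function names, the 30000 ms lock timeout, and the use of
plain millisecond values without the current time are all assumptions, as is
treating the empty-queue case as unchanged (lock timeout + 5s) under the new
rule.

```python
THREAD_WAIT_TIME_MS = 5000  # assumed: the "5s" granted per queued thread


def old_timeout(queue, timeouts, lock_timeout):
    """Previous rule: every thread after the first gets first timeout + 5s."""
    if not queue:
        return lock_timeout + THREAD_WAIT_TIME_MS
    return timeouts[queue[0]] + THREAD_WAIT_TIME_MS


def new_timeout(queue, timeouts, lock_timeout):
    """New rule: 5s plus the timeout of the thread at the end of the queue."""
    if not queue:
        # Assumption: the empty-queue case is still lock timeout + 5s.
        return lock_timeout + THREAD_WAIT_TIME_MS
    return timeouts[queue[-1]] + THREAD_WAIT_TIME_MS


def enqueue(queue, timeouts, thread_id, lock_timeout, rule):
    """Queue thread_id unless it is already queued; return its approximate TTL."""
    if thread_id not in timeouts:
        # The rule is evaluated before the thread joins the queue.
        timeouts[thread_id] = rule(queue, timeouts, lock_timeout)
        queue.append(thread_id)
    # Approximate TTL derived from the thread's current timeout (timeout - 5s).
    return timeouts[thread_id] - THREAD_WAIT_TIME_MS


def simulate(rule, thread_ids, lock_timeout=30000):
    """Queue the given threads in order and return the resulting timeouts."""
    queue, timeouts = [], {}
    for thread_id in thread_ids:
        enqueue(queue, timeouts, thread_id, lock_timeout, rule)
    return timeouts


old = simulate(old_timeout, ["t1", "t2", "t3"])  # t3 ends up equal to t2
new = simulate(new_timeout, ["t1", "t2", "t3"])  # each thread is 5s after the prior one
```

Under the old rule the 2nd and 3rd threads share a timeout, while the new rule
spaces every thread 5s after the one ahead of it.
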
Note that the "remove stale threads" while loop was not altered even though it
also removes threads from the queue. This is the expected behavior and was
preserved, and some added tests now check the timeout expiration behavior.