This is just an experiment for now, there are a couple of race
conditions, mostly harmless for the performance gain experiment that
this commit represents so far.
The general idea here is to take Redis single threaded and instead
fan-out on expansive kernel calls: write(2) in this case, but the same
concept could be easily implemented for read(2) and protcol parsing.
However just threading writes like in this commit, is enough to evaluate
if the approach is sounding.
when redis appends the blocked client reply list to the real client, it didn't
bother to check if it is in fact the master client. so a slave executing that
module command will send replies to the master, causing the master to send the
slave error responses, which will mess up the replication offset
(slave will advance it's replication offset, and the master does not)
Fake clients are used in special situations and are not linked to the
normal clients list, freeing them will always result in Redis crashing
in one way or the other.
It's not common to send replies to fake clients, but we have one usage
in the modules API. When a client is blocked, we associate to the
blocked client object (that is safe to manipulate in a thread), a fake
client that accumulates replies. So because of this bug there was
the problem described in issue #5443.
The fix was verified to work with the provided example module. To write
a regression is very hard and unlikely to be triggered in the future.
Related to #4840.
Note that when we re-enter the event loop with aeProcessEvents() we
don't process timers, nor before/after sleep callbacks, so we should
never end calling freeClientsInAsyncFreeQueue() when re-entering the
loop.
If we are going to read a large object from network
try to make it likely that it will start at c->querybuf
boundary so that we can optimize object creation
avoiding a large copy of data.
But only when the data we have not parsed is less than
or equal to ll+2. If the data length is greater than
ll+2, trimming querybuf is just a waste of time, because
at this time the querybuf contains not only our bulk.
It's easy to reproduce the that:
Time1: call `client pause 10000` on slave.
Time2: redis-benchmark -t set -r 10000 -d 33000 -n 10000.
Then slave hung after 10 seconds.