The AOF tail of a combined RDB+AOF is based on the premise of applying
the AOF commands to the exact state that there was in the server while
the RDB was persisted. By expiring keys while loading the RDB file, we
change the state, so applying the AOF tail later may change the state.
Test case:
* Time1: SET a 10
* Time2: EXPIREAT a $time5
* Time3: INCR a
* Time4: PERSIT A. Start bgrewiteaof with RDB preamble. The value of a is 11 without expire time.
* Time5: Restart redis from the RDB+AOF: consistency violation.
Thanks to @soloestoy for providing the patch.
Thanks to @trevor211 for the original issue report and the initial fix.
Check issue #4950 for more info.
See issue #2819 for details. The gist is that when we want to send INFO
because we are over the time, we used to send only INFO commands, no
longer sending PING commands. However if a master fails exactly when we
are about to send an INFO command, the PING times will result zero
because the PONG reply was already received, and we'll fail to send more
PINGs, since we try only to send INFO commands: the failure detector
will delay until the connection is closed and re-opened for "long
timeout".
This commit changes the logic so that we can send the three kind of
messages regardless of the fact we sent another one already in the same
code path. It could happen that we go over the message limit for the
link by a few messages, but this is not significant. However now we'll
not introduce delays in sending commands just because there was
something else to send at the same time.
problems fixed:
* failing to read fragmentation information from jemalloc
* overflow in jemalloc fragmentation hint to the defragger
* test suite not triggering eviction after population
Usually blocking operations make a lot of sense with multiple keys so
that we can listen to multiple queues (or whatever the app models) with
a single connection. However in the synchronous case it is more useful
to be able to ask for N elements. This is a change that I also wanted to
perform soon or later in the blocking list variant, but here it is more
natural since there is no reply type difference.
Some times it was not released on error, sometimes it was released two
times because the error path expected the "di" var to be NULL if the
iterator was already released. Thanks to @oranagra for pinging me about
potential problems of this kind inside rdb.c.