When implementing the code that saves and loads these aux fields we used rdb
format that was added for that in redis 5.0, but then we added the 'when' field
which meant that the old redis-check-rdb won't be able to skip these.
this fix adds an opcode as if that 'when' is part of the module data.
Without such change, the diskless replicas, when loading RDB files from
the socket will not abort when a broken RDB file gets loaded. This is
potentially unsafe, because right now Redis is not able to guarantee
that encoding errors are safe from the POV of memory corruptions (for
instance the LZF library may not be safe against untrusted data?) so
better to abort when the RDB file we are going to load is corrupted.
Instead I/O errors are still returned to the caller without aborting,
so that in case of short read the diskless replica can try again.
now that replica can read rdb directly from the socket, it should avoid exiting
on short read and instead try to re-sync.
this commit tries to have minimal effects on non-diskless rdb reading.
and includes a test that tries to trigger this scenario on various read cases.
* create module API for forking child processes.
* refactor duplicate code around creating and tracking forks by AOF and RDB.
* child processes listen to SIGUSR1 and dies exitFromChild in order to
eliminate a valgrind warning of unhandled signal.
* note that BGSAVE error reply has changed.
valgrind error is:
Process terminating with default action of signal 10 (SIGUSR1)
The implementation of the diskless replication was currently diskless only on the master side.
The slave side was still storing the received rdb file to the disk before loading it back in and parsing it.
This commit adds two modes to load rdb directly from socket:
1) when-empty
2) using "swapdb"
the third mode of using diskless slave by flushdb is risky and currently not included.
other changes:
--------------
distinguish between aof configuration and state so that we can re-enable aof only when sync eventually
succeeds (and not when exiting from readSyncBulkPayload after a failed attempt)
also a CONFIG GET and INFO during rdb loading would have lied
When loading rdb from the network, don't kill the server on short read (that can be a network error)
Fix rdb check when performed on preamble AOF
tests:
run replication tests for diskless slave too
make replication test a bit more aggressive
Add test for diskless load swapdb
Fix#5790 and 5878.
Maybe a better option was to have such fields named with the first
byte '%' as those are info fields for specification, however now to
break it in a backward incompatible way is not an option, so let's use
the fields actively to provide info when sensible, otherwise ignore
when they are not really helpful.
RESTORE now supports:
1. Setting LRU/LFU
2. Absolute-time TTL
Other related changes:
1. RDB loading will not override LRU bits when RDB file
does not contain the LRU opcode.
2. RDB loading will not set LRU/LFU bits if the server's
maxmemory-policy does not match.
This way we let big endian systems to still load old RDB versions.
However newver versions will be saved and loaded in a way that make RDB
expires cross-endian again. Thanks to @oranagra for the reporting and
the discussion about this problem, leading to this fix.
Again thanks to @oranagra. The object idle time does not fit into an int
sometimes: use the native type that the serialization function will get
as argument, which is uint64_t.
The AOF tail of a combined RDB+AOF is based on the premise of applying
the AOF commands to the exact state that there was in the server while
the RDB was persisted. By expiring keys while loading the RDB file, we
change the state, so applying the AOF tail later may change the state.
Test case:
* Time1: SET a 10
* Time2: EXPIREAT a $time5
* Time3: INCR a
* Time4: PERSIT A. Start bgrewiteaof with RDB preamble. The value of a is 11 without expire time.
* Time5: Restart redis from the RDB+AOF: consistency violation.
Thanks to @soloestoy for providing the patch.
Thanks to @trevor211 for the original issue report and the initial fix.
Check issue #4950 for more info.
Some times it was not released on error, sometimes it was released two
times because the error path expected the "di" var to be NULL if the
iterator was already released. Thanks to @oranagra for pinging me about
potential problems of this kind inside rdb.c.
This is a big win for caching use cases, since on reloading Redis will
still have some idea about what is worth to evict and what not.
However this only solves part of the problem because the information is
only partially propagated to slaves (on write operations). Reads will
not affect slaves LFU and LRU counters, so after a failover the eviction
decisions are kinda random until keys start to collect some aging/freq info.
However since new slaves are initially populated via RDB file transfer,
this means that if we spin up a new slave from a master, and perform an
immediate manual failover (for instance in order to upgrade the master),
the slave will have eviction informations to use for some time.
The LFU/LRU info is persisted only if the maxmemory policy is set to one
of the relevant type, even if no actual "maxmemory" memory limit is
set.