The new ROLE command is designed to give clients information about
replication in a fast and easy-to-use way, compared to the INFO command
where the same information is also available.
Since there are ways to alter the configEpoch outside of the failover
procedure (for example CLUSTER SET-CONFIG-EPOCH and the configEpoch
collision resolution algorithm), always make sure, before replacing our
configEpoch with a new one, that it is greater than the current one.
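A minimal sketch of the guard described above. The struct and function
names are hypothetical and only illustrate the check, they are not the
actual Redis Cluster code:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a cluster node. */
typedef struct sketchNode {
    uint64_t configEpoch;
} sketchNode;

/* Replace the node's configEpoch only if the candidate is strictly
 * greater than the current one. Returns 1 if updated, 0 otherwise. */
int sketchSetConfigEpoch(sketchNode *n, uint64_t candidate) {
    if (candidate <= n->configEpoch) return 0;
    n->configEpoch = candidate;
    return 1;
}
```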
SET-CONFIG-EPOCH, used by redis-trib at cluster creation time, failed to
update the currentEpoch, making it possible after a failover for a
server to set its configEpoch to a value smaller than the current one
(since configEpochs are obtained using currentEpoch).
The bug totally breaks the Redis Cluster algorithms and protocols,
allowing for permanent split brain conditions in the slots
configuration, as shown in issue #1799.
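A rough sketch of the idea behind the fix, assuming a simplified state
with only the two epochs. All names here are hypothetical: when the
configEpoch is set by hand, the currentEpoch is raised too, so that
epochs obtained later from currentEpoch cannot be smaller.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the local cluster state. */
typedef struct sketchEpochs {
    uint64_t currentEpoch;   /* cluster-wide logical clock */
    uint64_t myConfigEpoch;  /* configEpoch of this node */
} sketchEpochs;

/* Set the configEpoch by hand and keep currentEpoch >= configEpoch. */
void sketchSetConfigEpochByHand(sketchEpochs *e, uint64_t epoch) {
    e->myConfigEpoch = epoch;
    if (e->currentEpoch < epoch) e->currentEpoch = epoch;
}
```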
I'm not sure whether pointing to 'dbuf' is a problem, given that its
visibility is limited to the inner block; the stack var is probably
guaranteed to live until the function returns. However, obvious code is
better anyway.
The lua_to*string() family of functions uses a non-optimal format
specifier when converting integers to strings. This has both the problem
of the number being converted in exponential notation, which we don't
use as a Redis return value when floating point numbers are involved,
and, moreover, there is a loss of precision since the default format
specifier is not able to represent numbers that must be represented
exactly in the IEEE 754 number mantissa.
The new code handles it as a special case using a saner conversion.
This fixes issue #1118.
If we are in the signal handler, we don't want to handle
the signal again. In extreme cases, this can cause a stack overflow
and segfault Redis.
Fixes #1771
For a time defined by REDIS_CLUSTER_WRITABLE_DELAY after startup as a
master, the fail -> ok switch is not possible; however, the contrary
(ok -> fail) should always be possible.
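The rule can be sketched as follows. The names and the delay value are
assumptions made for illustration only:

```c
#include <stdint.h>

#define SKETCH_STATE_OK   0
#define SKETCH_STATE_FAIL 1
#define SKETCH_WRITABLE_DELAY_MS 2000 /* assumed value */

/* Given the current state and the newly computed one, return the state
 * to adopt. Only the fail -> ok switch is delayed after startup as a
 * master; ok -> fail is always applied immediately. */
int sketchNextState(int current, int computed, int is_master,
                    int64_t now_ms, int64_t start_ms) {
    if (current == SKETCH_STATE_FAIL && computed == SKETCH_STATE_OK &&
        is_master && now_ms - start_ms < SKETCH_WRITABLE_DELAY_MS)
        return current; /* too early to become writable again */
    return computed;
}
```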
Every log contains, just after the pid, a single character that provides
information about the role of an instance:
S - Slave
M - Master
C - Writing child
X - Sentinel
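A small sketch of how the character could be selected. The function
name and the order in which the conditions are checked are assumptions:

```c
/* Return the role character to print after the pid in a log line. */
char sketchLogRoleChar(int sentinel_mode, int is_child, int has_master) {
    if (sentinel_mode) return 'X';   /* Sentinel */
    if (is_child) return 'C';        /* Writing child */
    return has_master ? 'S' : 'M';   /* Slave or Master */
}
```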
Behrad Zari discovered [1] and Josiah reported [2]: if you block
and wait for a list to exist, but the list is created by
a non-push command, the blocked client never gets notified.
This commit moves notification of blocked clients into
the DB layer and away from individual commands.
Lists can be created by [LR]PUSH, SORT..STORE, RENAME, MOVE,
and RESTORE. Previously, blocked client notifications were
only triggered by [LR]PUSH. Your client would never get
notified if a list were created by SORT..STORE or RENAME or
a RESTORE, etc.
Blocked client notification now happens in one unified place:
- dbAdd() triggers notification when adding a list to the DB
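The idea can be sketched like this, with a toy DB that only counts the
notifications. Every name below is hypothetical and the keyspace itself
is omitted:

```c
#include <stddef.h>

enum sketchType { SKETCH_STRING, SKETCH_LIST };

/* Toy DB: only counts how many "list ready" signals were sent. */
typedef struct sketchDb {
    int ready_signals;
} sketchDb;

static void sketchSignalListAsReady(sketchDb *db, const char *key) {
    (void)key;
    db->ready_signals++;
}

/* Single entry point used by every command that creates a key, so
 * blocked clients are notified no matter which command made the list. */
void sketchDbAdd(sketchDb *db, const char *key, enum sketchType type) {
    /* ... store the key in the keyspace (omitted) ... */
    if (type == SKETCH_LIST) sketchSignalListAsReady(db, key);
}
```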
Two new tests are added that fail prior to this commit.
All tests pass.
Fixes #1668
[1]: https://groups.google.com/forum/#!topic/redis-db/k4oWfMkN1NU
[2]: #1668
When scanning the argument list inside of a redis.call() invocation
for pre-cached values, there was no check being done that the
argument we were on was in fact within the bounds of the cache size.
So if a redis.call() command was ever executed with more than 32
arguments (current cache size #define setting) redis-server could
segfault.
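A sketch of the missing check, using a hypothetical cache array and
accessor (only the size of 32 comes from the description above):

```c
#include <stddef.h>

#define SKETCH_CACHED_OBJECTS 32

/* Hypothetical per-argument cache, indexed by argument position. */
static void *sketch_cached[SKETCH_CACHED_OBJECTS];

/* Return the cached object for argument j, or NULL when j is outside
 * the cache bounds or the slot is empty. */
void *sketchGetCached(int j) {
    if (j < 0 || j >= SKETCH_CACHED_OBJECTS) return NULL;
    return sketch_cached[j];
}
```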
Thanks to this change, when there is some code like:
clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|...);
... and later before returning to the event loop ...
clusterUpdateState();
The clusterUpdateState() function will clear the flag and will not be
repeated in the clusterBeforeSleep() function. This is especially
important for config save/fsync flags which are slow to execute and not
a good idea to repeat without a good reason.
This is implemented for all the CLUSTER_TODO flags.
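A simplified sketch of the mechanism for a single flag. The names are
hypothetical and the real functions do much more than counting calls:

```c
#define SKETCH_TODO_UPDATE_STATE (1<<0)

static int sketch_todo = 0;           /* pending before-sleep work */
static int sketch_state_updates = 0;  /* how many times we updated */

/* Schedule work to be done before returning to the event loop. */
void sketchDoBeforeSleep(int flags) { sketch_todo |= flags; }

/* Doing the work directly also clears the pending flag. */
void sketchUpdateState(void) {
    sketch_todo &= ~SKETCH_TODO_UPDATE_STATE;
    sketch_state_updates++;
}

/* Before sleeping, run only the work that is still pending. */
void sketchBeforeSleep(void) {
    if (sketch_todo & SKETCH_TODO_UPDATE_STATE) sketchUpdateState();
    sketch_todo = 0;
}
```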
The new command is able to reset a cluster node so that it starts again
as a fresh node. By default the command performs a soft reset (the same
as calling it as CLUSTER RESET SOFT), and the following steps are
performed:
1) All slots are set as unassigned.
2) The list of known nodes is flushed.
3) Node is set as master if it is a slave.
When a hard reset is performed with CLUSTER RESET HARD the following
additional operations are performed:
4) A new Node ID is created at random.
5) Epochs are set to 0.
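The five steps can be sketched on a toy node structure. The struct, the
function and the 40 character hex ID are assumptions made for the
example, and rand() is used only to keep it short:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_SLOTS 16384
#define SKETCH_ID_LEN 40

/* Hypothetical, simplified node state. */
typedef struct sketchResetNode {
    char id[SKETCH_ID_LEN+1];
    unsigned char owns_slot[SKETCH_SLOTS];
    int known_nodes;
    int is_slave;
    uint64_t currentEpoch, configEpoch;
} sketchResetNode;

void sketchClusterReset(sketchResetNode *n, int hard) {
    int i;
    memset(n->owns_slot, 0, sizeof(n->owns_slot)); /* 1) unassign slots */
    n->known_nodes = 0;                            /* 2) flush nodes */
    n->is_slave = 0;                               /* 3) turn into master */
    if (hard) {
        static const char hex[] = "0123456789abcdef";
        for (i = 0; i < SKETCH_ID_LEN; i++)        /* 4) new random ID */
            n->id[i] = hex[rand() % 16];
        n->id[SKETCH_ID_LEN] = '\0';
        n->currentEpoch = n->configEpoch = 0;      /* 5) epochs to 0 */
    }
}
```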
CLUSTER RESET is useful both when the sysadmin wants to reconfigure a
node with a different role (for example turning a slave into a master)
and for testing purposes.
It may also play a role in automatically provisioned Redis Clusters,
since it allows resetting a node back to the initial state in order to
be reconfigured.
The previous code handling a lost slot (taken by another master with a
higher configuration for the slot) was defensive, considering it an
error and putting the cluster in an odd state requiring a redis-cli fix.
This was changed, because actually this only happens either in a
legitimate way, with failovers, or when the admin messed with the config
in order to reconfigure the cluster. So the new code instead will try to
make sure that the keys stored match the new slots map, by removing all
the keys in the slots we lost ownership of.
The function that deletes the keys from the lost slots is called only
if the node does not lose all its slots (resulting in a reconfiguration
as a slave of the node that got ownership). This is an optimization
since the replication code will anyway flush all the instance data in
a faster way.
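A sketch of the decision described above, where the keyspace is reduced
to a per-slot key counter. All names are hypothetical:

```c
/* Drop the keys of the slots we lost, unless no slot is left to us:
 * in that case we become a slave and replication flushes everything.
 * Returns the number of slots that were cleaned. */
int sketchDropKeysInLostSlots(const int *lost, int lost_count,
                              int slots_still_owned, int *keys_in_slot) {
    int i;
    if (slots_still_owned == 0) return 0;
    for (i = 0; i < lost_count; i++)
        keys_in_slot[lost[i]] = 0;
    return lost_count;
}
```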
Using CLUSTER FAILOVER FORCE it is now possible to failover a master in
a forced way, which means:
1) No check to understand if the master is up is performed.
2) No data age of the slave is checked. Even a slave with very old data
can manually failover a master in this way.
3) No chat with the master is attempted to reach its replication offset:
the master can just be down.
Automatic failovers only happen in Redis Cluster if the slave trying to
be elected was disconnected from its master for no more than 10 times
the node-timeout value. However there should be no such check for
manual failovers, since these are initiated by the sysadmin who, in
theory, knows what she is doing when a slave is selected to be promoted.
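The check can be sketched as below. The function and macro names are
hypothetical, and the real condition may include more terms than the
plain 10 times node-timeout limit mentioned above:

```c
#include <stdint.h>

#define SKETCH_VALIDITY_MULT 10

/* Return 1 if the slave may try to get elected, 0 otherwise. Manual
 * failovers skip the data age check entirely. */
int sketchCanStartFailover(int manual, int64_t data_age_ms,
                           int64_t node_timeout_ms) {
    if (manual) return 1;
    return data_age_ms <= node_timeout_ms * SKETCH_VALIDITY_MULT;
}
```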