3241 Commits

Author SHA1 Message Date
antirez
a6fe4ca321 CLUSTER SLOTS: main loop should skip only slaves and zero slot masters. 2014-06-25 15:08:33 +02:00
Matt Stancliff
e14829de30 Cluster: Add CLUSTER SLOTS command
CLUSTER SLOTS returns a Redis-formatted mapping from
slot ranges to IP/Port pairs serving that slot range.

The outer return elements group return values by slot ranges.

The first two entires in each result are the min and max slots for the range.

The third entry in each result is guaranteed to be either
an IP/Port of the master for that slot range - OR - null
if that slot range, for some reason, has no master

The 4th and higher entries in each result are replica instances
for the slot range.

Output comparison:
127.0.0.1:7001> cluster nodes
f853501ec8ae1618df0e0f0e86fd7abcfca36207 127.0.0.1:7001 myself,master - 0 0 2 connected 4096-8191
5a2caa782042187277647661ffc5da739b3e0805 127.0.0.1:7005 slave f853501ec8ae1618df0e0f0e86fd7abcfca36207 0 1402622415859 6 connected
6c70b49813e2ffc9dd4b8ec1e108276566fcf59f 127.0.0.1:7007 slave 26f4729ca0a5a992822667fc16b5220b13368f32 0 1402622415357 8 connected
2bd5a0e3bb7afb2b56a2120d3fef2f2e4333de1d 127.0.0.1:7006 slave 32adf4b8474fdc938189dba00dc8ed60ce635b0f 0 1402622419373 7 connected
5a9450e8279df36ff8e6bb1c139ce4d5268d1390 127.0.0.1:7000 master - 0 1402622418872 1 connected 0-4095
32adf4b8474fdc938189dba00dc8ed60ce635b0f 127.0.0.1:7002 master - 0 1402622419874 3 connected 8192-12287
5db7d05c245267afdfe48c83e7de899348d2bdb6 127.0.0.1:7004 slave 5a9450e8279df36ff8e6bb1c139ce4d5268d1390 0 1402622417867 5 connected
26f4729ca0a5a992822667fc16b5220b13368f32 127.0.0.1:7003 master - 0 1402622420877 4 connected 12288-16383

127.0.0.1:7001> cluster slots
1) 1) (integer) 0
   2) (integer) 4095
   3) 1) "127.0.0.1"
      2) (integer) 7000
   4) 1) "127.0.0.1"
      2) (integer) 7004
2) 1) (integer) 12288
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 7003
   4) 1) "127.0.0.1"
      2) (integer) 7007
3) 1) (integer) 4096
   2) (integer) 8191
   3) 1) "127.0.0.1"
      2) (integer) 7001
   4) 1) "127.0.0.1"
      2) (integer) 7005
4) 1) (integer) 8192
   2) (integer) 12287
   3) 1) "127.0.0.1"
      2) (integer) 7002
   4) 1) "127.0.0.1"
      2) (integer) 7006
2014-06-25 15:03:41 +02:00
antirez
f29b12d0bf Cluster: myself->ip autodiscovery.
Instead of having an hardcoded IP address in the node configuration, we
autodiscover it via MEET messages for automatic update when the node is
restarted with a different IP address.

This mechanism was discussed in the context of PR #1782.
2014-06-25 11:28:57 +02:00
antirez
46319094db Old form of CLIENT KILL should still allow suicide. 2014-06-24 12:49:28 +02:00
antirez
e3bae84606 Sentinel implementation of ROLE. 2014-06-23 12:07:41 +02:00
antirez
be8f4d49d4 Silence different signs comparison warning in sds.c. 2014-06-23 11:50:24 +02:00
Matt Stancliff
5cd83ef539 Sentinel: bind source address
Some deployments need traffic sent from a specific address.  This
change uses the same policy as Cluster where the first listed bindaddr
becomes the source address for outgoing Sentinel communication.

Fixes #1667
2014-06-23 11:44:35 +02:00
Matt Stancliff
d830dcb12d Add REDIS_BIND_ADDR access macro
We need to access (bindaddr[0] || NULL) in a few places, so centralize
access with a nice macro.
2014-06-23 11:44:34 +02:00
Matt Stancliff
ef897a41e8 Cancel SHUTDOWN if initial AOF is being written
Fixes #1826 (and many other reports of the same problem)
2014-06-23 11:44:34 +02:00
antirez
fb2f637c4a Allow to call ROLE in LOADING state. 2014-06-21 11:39:43 +02:00
antirez
7970d53997 ROLE command: array len fixed for slave output. 2014-06-21 11:17:18 +02:00
antirez
22d17bc14f Cluster: clear NOADDR flag when updating node address. 2014-06-20 09:32:47 +02:00
antirez
41f12ac988 Sentinel: send hello messages ASAP after config change.
Eventual configuration convergence is guaranteed by our periodic hello
messages to all the instances, however when there are important notices
to share, better make a phone call. With this commit we force an hello
message to other Sentinal and Redis instances within the next 100
milliseconds of a config update, which is practically better than
waiting a few seconds.
2014-06-19 15:17:06 +02:00
antirez
94bc467328 Sentinel: handle SRI_PROMOTED flag correctly.
Lack of check of the SRI_PROMOTED flag caused Sentienl to act with the
promoted slave turned into a master during failover like if it was a
normal instance.

Normally this problem was not apparent because during real failovers the
old master is down so the bugged code path was not entered, however with
manual failovers via the SENTINEL FAILOVER command, the problem was
easily triggered.

This commit prevents promoted slaves from getting reconfigured, moreover
we now explicitly check that during a failover the slave turning into a
master is the one we selected for promotion and not a different one.
2014-06-19 10:28:27 +02:00
Alex Suraci
9f8dcfe69a add missing signal.h include 2014-06-17 21:59:12 -07:00
Matt Stancliff
20c2a38ad0 Add SIGINT handler to cli --intrinsic-latency
If we run a long latency session and want to Ctrl-C out of it,
it's nice to still get the summary output at the end.
2014-06-17 10:12:57 -04:00
antirez
2c17591224 Sentinel: send SLAVEOF with MULTI, CLIENT KILL, CONFIG REWRITE.
This implements the new Sentinel-Client protocol for the Sentinel part:
now instances are reconfigured using a transaction that ensures that the
config is rewritten in the target instance, and that clients lose the
connection with the instance, in order to be forced to: ask Sentinel,
reconnect to the instance, and verify the instance role with the new
ROLE command.
2014-06-17 11:03:21 +02:00
antirez
bb2011d992 CLIENT KILL API modified.
Added a new SKIPME option that is true by default, that prevents the
client sending the command to be killed, unless SKIPME NO is sent.
2014-06-16 14:50:15 +02:00
antirez
e06b3819ea CLIENT KILL: fix closing link of the current client. 2014-06-16 14:28:23 +02:00
antirez
e7affd266c New features for CLIENT KILL. 2014-06-16 14:24:28 +02:00
antirez
f26f79ea37 Assign an unique non-repeating ID to each new client.
This will be used by CLIENT KILL and is also a good way to ensure a
given client is still the same across CLIENT LIST calls.

The output of CLIENT LIST was modified to include the new ID, but this
change is considered to be backward compatible as the API does not imply
you can do positional parsing, since each filed as a different name.
2014-06-16 14:22:55 +02:00
antirez
56d26c2380 Client types generalized.
Because of output buffer limits Redis internals had this idea of type of
clients: normal, pubsub, slave. It is possible to set different output
buffer limits for the three kinds of clients.

However all the macros and API were named after output buffer limit
classes, while the idea of a client type is a generic one that can be
reused.

This commit does two things:

1) Rename the API and defines with more general names.
2) Change the class of clients executing the MONITOR command from "slave"
   to "normal".

"2" is a good idea because you want to have very special settings for
slaves, that are not a good idea for MONITOR clients that are instead
normal clients even if they are conceptually slave-alike (since it is a
push protocol).

The backward-compatibility breakage resulting from "2" is considered to
be minimal to care, since MONITOR is a debugging command, and because
anyway this change is not going to break the format or the behavior, but
just when a connection is closed on big output buffer issues.
2014-06-16 10:43:05 +02:00
antirez
96e0fe6232 Fix semantics of Lua calls to SELECT.
Lua scripts are executed in the context of the currently selected
database (as selected by the caller of the script).

However Lua scripts are also free to use the SELECT command in order to
affect other DBs. When SELECT is called frm Lua, the old behavior, before
this commit, was to automatically set the Lua caller selected DB to the
last DB selected by Lua. See for example the following sequence of
commands:

    SELECT 0
    SET x 10
    EVAL "redis.call('select','1')" 0
    SET x 20

Before this commit after the execution of this sequence of commands,
we'll have x=10 in DB 0, and x=20 in DB 1.

Because of the problem above, there was a bug affecting replication of
Lua scripts, because of the actual implementation of replication. It was
possible to fix the implementation of Lua scripts in order to fix the
issue, but looking closely, the bug is the consequence of the behavior
of Lua ability to set the caller's DB.

Under the old semantics, a script selecting a different DB, has no simple
ways to restore the state and select back the previously selected DB.
Moreover the script auhtor must remember that the restore is needed,
otherwise the new commands executed by the caller, will be executed in
the context of a different DB.

So this commit fixes both the replication issue, and this hard-to-use
semantics, by removing the ability of Lua, after the script execution,
to force the caller to switch to the DB selected by the Lua script.

The new behavior of the previous sequence of commadns is to just set
X=20 in DB 0. However Lua scripts are still capable of writing / reading
from different DBs if needed.

WARNING: This is a semantical change that will break programs that are
conceived to select the client selected DB via Lua scripts.

This fixes issue #1811.
2014-06-12 16:05:52 +02:00
antirez
73fefd0bc0 Scripting: Fix for a #1118 regression simplified.
It is more straightforward to just test for a numerical type avoiding
Lua's automatic conversion. The code is technically more correct now,
however Lua should automatically convert to number only if the original
type is a string that "looks like a number", and not from other types,
so practically speaking the fix is identical AFAIK.
2014-06-11 10:10:58 +02:00
Matt Stancliff
76efe1225f Scripting: Fix regression from #1118
The new check-for-number behavior of Lua arguments broke
users who use large strings of just integers.

The Lua number check would convert the string to a number, but
that breaks user data because
Lua numbers have limited precision compared to an arbitrarily
precise number wrapped in a string.

Regression fixed and new test added.

Fixes #1118 again.
2014-06-10 14:26:13 -04:00
antirez
8ef79e72ac Cluster: fix an error message when logging failover auth denied. 2014-06-10 17:39:42 +02:00
antirez
58799718be Cluster: better comment for clusterSendFailoverAuthIfNeeded() epoch test. 2014-06-10 17:20:21 +02:00
antirez
61eb0eae83 Cluster: log granted failover authorizations. 2014-06-10 16:56:08 +02:00
antirez
d5d92deb6c Cluster: log configEpoch updates to myself. 2014-06-10 16:38:36 +02:00
antirez
8204ab0098 Cluster: log when a master denies a failover auth. 2014-06-10 16:07:26 +02:00
antirez
9b3bc82c1a Cluster: cluster_my_epoch added to CLUSTER INFO output. 2014-06-10 11:35:40 +02:00
Salvatore Sanfilippo
08c7363647 Merge pull request #1743 from mattsta/cygwin-compile-fix
Cygwin compile fix
2014-06-09 11:42:14 +02:00
Salvatore Sanfilippo
c7f93143f6 Merge pull request #1669 from mattsta/blpop-internally-added-keys
Fix blocking operations from missing new lists
2014-06-09 11:37:28 +02:00
antirez
6a13193d8f ROLE output improved for slaves.
Info about the replication state with the master added.
2014-06-07 17:38:20 +02:00
antirez
d34c2fa3bb ROLE command added.
The new ROLE command is designed in order to provide a client with
informations about the replication in a fast and easy to use way
compared to the INFO command where the same information is also
available.
2014-06-07 17:27:49 +02:00
antirez
32d0a79f78 Cluster: check that configEpoch never goes back.
Since there are ways to alter the configEpoch outside of the failover
procedure (for exampel CLUSTER SET-CONFIG-EPOCH and via the configEpoch
collision resolution algorithm), make always sure, before replacing our
configEpoch with a new one, that it is greater than the current one.
2014-06-07 14:37:09 +02:00
antirez
a2c2ef7de5 Cluster: SET-CONFIG-EPOCH should update currentEpoch.
SET-CONFIG-EPOCH, used by redis-trib at cluster creation time, failed to
update the currentEpoch, making it possible after a failover for a
server to set its configEpoch to a value smaller than the current one
(since configEpochs are obtained using currentEpoch).

The bug totally break the Redis Cluster algorithms and protocols
allowing for permanent split brain conditions about the slots
configuration as shown in issue #1799.
2014-06-07 14:25:47 +02:00
Salvatore Sanfilippo
a2403227c7 Merge pull request #1772 from andygrunwald/typo-avarege-average
Fixed typo in word avarege in result message of --intrinsic-latency analyzer
2014-06-06 11:19:21 +02:00
Salvatore Sanfilippo
113be48221 Merge pull request #1780 from badboy/patch-8
Small typo fixed
2014-06-06 10:45:00 +02:00
Salvatore Sanfilippo
1e221d101c Merge pull request #1788 from zionwu/unstable
fix issue 1787
2014-06-06 10:33:11 +02:00
antirez
14fb0ac649 Don't process min-slaves-to-write for slaves.
Replication is totally broken when a slave has this option, since it
stops accepting updates from masters.

This fixes issue #1434.
2014-06-05 10:48:05 +02:00
antirez
3758f27bc1 Fixed dbuf variable scope in luaRedisGenericCommand().
I'm not sure if while the visibility is the inner block, the fact we
point to 'dbuf' is a problem or not, probably the stack var isx
guaranteed to live until the function returns. However obvious code is
better anyway.
2014-06-04 18:57:12 +02:00
antirez
072982d83c Scripting: better Lua number -> string conversion in luaRedisGenericCommand().
The lua_to*string() family of functions use a non optimal format
specifier when converting integers to strings. This has both the problem
of the number being converted in exponential notation, which we don't
use as a Redis return value when floating point numbers are involed,
and, moreover, there is a loss of precision since the default format
specifier is not able to represent numbers that must be represented
exactly in the IEEE 754 number mantissa.

The new code handles it as a special case using a saner conversion.

This fixes issue #1118.
2014-06-04 18:33:24 +02:00
zionwu
dc8584696a fix issue 1787 2014-06-01 02:23:24 +08:00
antirez
8a588ac14d More trailing spaces in sentinel.c removed. 2014-05-28 15:46:05 +02:00
Jan-Erik Rediger
b187c591e3 Small typo fixed 2014-05-28 09:46:01 +02:00
Matt Stancliff
7a0c5fdf12 Disable recursive watchdog signal handler
If we are in the signal handler, we don't want to handle
the signal again.  In extreme cases, this can cause a stack overflow
and segfault Redis.

Fixes #1771
2014-05-26 17:53:33 +02:00
antirez
88c2307535 Cluster: always allow ok -> fail switch in clusterUpdateState().
There is a time defined by REDIS_CLUSTER_WRITABLE_DELAY where fail -> ok
switch is not possible after startup as a master for some time, however
the contrary (ok -> fail) should always be possible.
2014-05-26 16:24:12 +02:00
Andy Grunwald
94e3bb568a Fixed typo in word avarege in result message of --intrinsic-latency analyzer 2014-05-22 20:01:12 +02:00
antirez
b239a32aae redisLogFromHandler() format changed to match new logs format. 2014-05-22 19:24:35 +02:00