Commit Graph

1622 Commits

Author SHA1 Message Date
antirez
97ffcd351b Cluster: use the failure report API to reimplement failure detection.
The new system detects a failure only when there is quorum from masters.
2013-02-26 14:58:39 +01:00
antirez
6356cf6808 Set process name in ps output to make operations safer.
This commit allows Redis to set a process name that includes the binding
address and the port number in order to make operations simpler.

Redis children processes doing AOF rewrites or RDB saving change the
name into redis-aof-rewrite and redis-rdb-bgsave respectively.

This in general makes harder to kill the wrong process because of an
error and makes simpler to identify saving children.

This feature was suggested by Arnaud GRANAL in the Redis Google Group,
Arnaud also pointed me to the setproctitle.c implementation includeed in
this commit.

This feature should work on all the Linux, OSX, and all the three major
BSD systems.
2013-02-26 11:52:12 +01:00
antirez
1b1b3f6c06 Cluster: invert two functions declarations in more natural order. 2013-02-26 11:19:48 +01:00
antirez
d5e8b0a47f Cluster: cleanup idle failure reports every time we remove one.
This is not very important as anyway when the function counting the
number of reports is called the cleanup is performed. However with this
change if only part of the nodes that reported the failure will report
the node is back ok, we'll cleanup the older entries ASAP. In complex
split net split scenarios, and when we are dealing with clusters having
nodes in the order of ~ 1000, this can save some CPU.
2013-02-26 11:15:18 +01:00
antirez
9cb578ced0 Cluster: new function clusterNodeDelFailureReport() for failure reports.
This is the missing part of the API that will be used to reimplement
failure detection of Cluster nodes.
2013-02-25 19:13:22 +01:00
Arnaud Granal
3d588b37ce Fix error "repl-backlog-size must be 1 or greater"
The parameter repl-backlog-size is not parsed correctly in the configuration file. argv[0] is parsed instead of argv[1].
2013-02-25 19:25:48 +02:00
antirez
18f537083a Cluster: no limits for the count parameter of CLUSTER GETKEYSINSLOT.
Not sure why I set a limit to 1 million keys, there is no reason for
this artificial limit, and anyway this is s a stupid limit because it is
already high enough to create latency issues. So let's the users shoot
on their feet because maybe they just actually know what they are doing.
2013-02-25 12:41:13 +01:00
antirez
544bbe5387 Cluster: validate slot number in CLUSTER COUNTKEYSINSLOT. 2013-02-25 12:40:32 +01:00
antirez
996a643752 Cluster: use O(log(N)) algo for countKeysInSlot(). 2013-02-25 12:37:50 +01:00
antirez
d4fa40655d Cluster: new sub-command CLUSTER COUNTKEYSINSLOT.
The new sub-command uses the new countKeysInSlot() API and allows a
cluster client to get the number of keys for a given hashslot.
2013-02-25 12:04:31 +01:00
antirez
a517c89321 Cluster: verifyClusterConfigWithData() implemented. 2013-02-25 11:43:49 +01:00
antirez
9947b0d96d A comment in main() clarified. 2013-02-25 11:40:21 +01:00
antirez
d2154254be Cluster: fix case for getKeysInSlot() and countKeysInSlot().
Redis functions start in low case. A few functions about cluster were
capitalized the wrong way.
2013-02-25 11:25:40 +01:00
antirez
c2eb4a606f Cluster: use CountKeysInSlot() when we just need the count. 2013-02-25 11:23:04 +01:00
antirez
ad3bca1fdf Cluster: added stub for verifyClusterConfigWithData().
See the top-comment for the function in this commit for details about
what the function is supposed to do.
2013-02-25 11:20:17 +01:00
antirez
1abce14611 Cluster: added new API countKeysInSlot().
This is similar to getKeysInSlot() but just returns the number of keys
found in a given hash slot, without returning the keys.
2013-02-25 11:15:03 +01:00
0x20h
977035b7b5 suppress external diff program when using git diff. 2013-02-24 18:17:46 +01:00
antirez
825e07f2fd Cluster: if no previous config exists, create the myself node as master. 2013-02-22 19:24:01 +01:00
antirez
f4093753e4 Cluster: add cluster_size field in CLUSTER INFO output. 2013-02-22 19:20:38 +01:00
antirez
d218a4e244 Cluster: new state information, cluster size.
The definition of cluster size is: the number of known nodes in the
cluster that are masters and serving at least an hash slot.
2013-02-22 19:18:30 +01:00
antirez
5c55ed9388 Cluster: remove warning adding clusterNodeSetSlotBit() prototype. 2013-02-22 17:45:49 +01:00
antirez
974929770b Cluster: introduced a failure reports system.
A §Redis Cluster node used to mark a node as failing when itself
detected a failure for that node, and a single acknowledge was received
about the possible failure state.

The new API will be used in order to possible to require that N other
nodes have a PFAIL or FAIL state for a given node for a node to set it
as failing.
2013-02-22 17:43:35 +01:00
antirez
36af851550 redis-trib: check that all the nodes agree about the slots configuration. 2013-02-22 12:25:16 +01:00
antirez
51b5058d04 redis-trib: skeleton of coverage fix for "keys in multiple nodes" case. 2013-02-22 11:33:10 +01:00
antirez
a81c598f95 redis-trib: handle slot coverage fix in the "no nodes with keys" case. 2013-02-22 10:23:53 +01:00
antirez
392e0fa7eb Cluster: fix case of slotToKey...() functions. 2013-02-22 10:16:21 +01:00
antirez
d04770988d Cluster: empty the internal sorted set mapping keys to slots on FLUSHDB/ALL. 2013-02-22 10:15:32 +01:00
antirez
efe51dfff5 redis-trib: specify single node address when fixing coverage. 2013-02-22 10:05:07 +01:00
antirez
915e81335e redis-trib: ability to fix uncovered slots for the trivial case. 2013-02-21 18:10:06 +01:00
antirez
619d3945f8 redis-trib: fixed typo in method name. 2013-02-21 16:58:27 +01:00
antirez
07b6322735 Cluster: more correct update of slots state when PONG is received. 2013-02-21 16:52:06 +01:00
antirez
c6da9d9fac Call clusterUpdateState() after CLUSTER SETSLOT too. 2013-02-21 16:31:22 +01:00
antirez
3a99d1228a Aesthetic change to make a line more 80-cols friendly. 2013-02-21 16:24:48 +01:00
antirez
5fd9f701da redis-trib: move instance vars in the right class. 2013-02-21 13:06:59 +01:00
antirez
7898bf4b7e redis-trib: some refactoring and skeleton of the "fix" command. 2013-02-21 13:00:41 +01:00
antirez
dc4af60628 Cluster: clusterAddSlot() was not doing what stated in the comment. 2013-02-21 11:51:17 +01:00
antirez
fdb57233e2 Cluster: always use cluster(Add|Del)Slot to modify the cluster slots table. 2013-02-21 11:44:58 +01:00
antirez
786b8d6c87 Use RESTORE-ASKING for MIGRATE when in cluster mode. 2013-02-20 17:36:54 +01:00
antirez
ea7fc82a4a Cluster: new command flag forcing implicit ASKING.
Also using this new flag the RESTORE-ASKING command was implemented that
will be used by MIGRATE.
2013-02-20 17:28:35 +01:00
antirez
9ec1b709f5 Cluster: ASKING command fixed, state was not retained. 2013-02-20 17:07:52 +01:00
antirez
b8d8b9ec41 redis-trib: set the migrating slot in the correct way when resharding. 2013-02-20 15:29:53 +01:00
antirez
9a04e12cc0 Cluster: I/O errors are now logged at DEBUG level. 2013-02-20 13:18:51 +01:00
antirez
917dd53216 redis-trib: make a few comments 80-cols friendly. 2013-02-15 17:11:55 +01:00
antirez
455da35c7f Cluster: specific error code for cluster down condition. 2013-02-15 16:53:24 +01:00
antirez
522c3255db Merge branch 'unstable' of github.com:antirez/redis into unstable 2013-02-15 16:45:04 +01:00
antirez
02796ba7a7 Cluster: sanity checks on the cluster bus message length. 2013-02-15 16:44:39 +01:00
antirez
8853698a6f Removed useless newlines from hashTypeCurrentObject(). 2013-02-15 13:12:55 +01:00
antirez
6b9c661838 Cluster: make valgrind happy initializing all the bytes of the node IP. 2013-02-15 12:58:35 +01:00
antirez
7371d5e248 Remove wrong decrRefCount() from getNodeByQuery().
This fixes issue #607.
2013-02-15 11:57:53 +01:00
antirez
20f52b5b78 Top comment for getNodeByQuery() improved. 2013-02-15 11:50:54 +01:00
antirez
6b641f3aeb redis-cli: update prompt on cluster redirection. 2013-02-14 18:49:08 +01:00
antirez
e0e15bd06d Cluster: with 16384 slots we need bigger buffers. 2013-02-14 15:36:33 +01:00
antirez
1a32d99b28 Cluster: move cluster config file out of config state.
This makes us able to avoid allocating the cluster state structure if
cluster is not enabled, but still we can handle the configuration
directive that sets the cluster config filename.
2013-02-14 15:20:02 +01:00
antirez
1649e509c3 Cluster: the cluster state structure is now heap allocated. 2013-02-14 13:20:56 +01:00
antirez
9dfd11c3da Cluster: Initialize ip and port in createClusterNode(). 2013-02-14 13:01:28 +01:00
antirez
a26690e8b5 Cluster: redis-trib updated to use 16384 hash slots. 2013-02-14 12:55:34 +01:00
antirez
ebd666db47 Cluster: from 4096 to 16384 hash slots. 2013-02-14 12:49:16 +01:00
antirez
072c91fe13 PSYNC: another change to unexpected reply from PSYNC. 2013-02-13 18:43:40 +01:00
antirez
0e1be5347b PSYNC: More robust handling of unexpected reply to PSYNC. 2013-02-13 18:33:33 +01:00
antirez
7404b95833 Avoid compiler warning by casting to match printf() specifier. 2013-02-13 13:38:20 +01:00
antirez
3419c8ce70 Replication: more strict error checking for master PING reply. 2013-02-12 16:53:27 +01:00
antirez
dc24a6b132 Return a specific NOAUTH error if authentication is required. 2013-02-12 16:25:41 +01:00
antirez
24f258360b Replication: added new stats counting full and partial resynchronizations. 2013-02-12 15:33:54 +01:00
antirez
04bdb3a2a4 Add missing bracket removed for error after rebase of PSYNC. 2013-02-12 12:56:32 +01:00
antirez
3af478e9ef PSYNC: debugging printf() calls are now logs at DEBUG level. 2013-02-12 12:52:22 +01:00
antirez
89b48f0825 Remove harmless warning in slaveTryPartialResynchronization(). 2013-02-12 12:52:21 +01:00
antirez
0ed6daa48b PSYNC: don't use the client buffer to send +CONTINUE and +FULLRESYNC.
When we are preparing an handshake with the slave we can't touch the
connection buffer as it'll be used to accumulate differences between
the sent RDB file and what arrives next from clients.

So in short we can't use addReply() family functions.

However we just use write(2) because we know that the socket buffer is
empty, since a prerequisite for SYNC to work is that the static buffer
and the output list are empty, and in general it is not expected that a
client SYNCs after doing some heavy I/O with the master.

However a short write connection is explicitly handled to avoid
fragility (we simply close the connection and the slave will retry).
2013-02-12 12:52:21 +01:00
antirez
d2a0348a49 SYNC not allowed with pending data on the static output buffer. 2013-02-12 12:52:21 +01:00
antirez
da315d3325 Log the unexpected string received in place of the SYNC payload length. 2013-02-12 12:52:21 +01:00
antirez
41d64a7516 After SLAVEOF <newslave> don't allow chained slaves to PSYNC. 2013-02-12 12:52:21 +01:00
antirez
078882025e PSYNC: work in progress, preview #2, rebased to unstable. 2013-02-12 12:52:21 +01:00
antirez
e34a35a511 Use the new unified protocol to send SELECT to slaves.
SELECT was still transmitted to slaves using the inline protocol, that
is conceived mostly for humans to type into telnet sessions, and is
notably not understood by redis-cli --slave.

Now the new protocol is used instead.
2013-02-12 12:50:28 +01:00
antirez
4b83ad4e1f Use replicationFeedSlaves() to send PING to slaves.
A Redis master sends PING commands to slaves from time to time: doing
this ensures that even if absence of writes, the master->slave channel
remains active and the slave can feel the master presence, instead of
closing the connection for timeout.

This commit changes the way PINGs are sent to slaves in order to use the
standard interface used to replicate all the other commands, that is,
the function replicationFeedSlaves().

With this change the stream of commands sent to every slave is exactly
the same regardless of their exact state (Transferring RDB for first
synchronization or slave already online). With the previous
implementation the PING was only sent to online slaves, with the result
that the output stream from master to slaves was not identical for all
the slaves: this is a problem if we want to implement partial resyncs in
the future using a global replication stream offset.

TL;DR: this commit should not change the behaviour in practical terms,
but is just something in preparation for partial resynchronization
support.
2013-02-12 12:50:28 +01:00
antirez
7465ac7ab1 Emit SELECT to slaves in a centralized way.
Before this commit every Redis slave had its own selected database ID
state. This was not actually useful as the emitted stream of commands
is identical for all the slaves.

Now the the currently selected database is a global state that is set to
-1 when a new slave is attached, in order to force the SELECT command to
be re-emitted for all the slaves.

This change is useful in order to implement replication partial
resynchronization in the future, as makes sure that the stream of
commands received by slaves, including SELECT commands, are exactly the
same for every slave connected, at any time.

In this way we could have a global offset that can identify a specific
piece of the master -> slaves stream of commands.
2013-02-12 12:50:28 +01:00
antirez
1a27d41156 Makefile: valgrind target added (forces -O0 and libc malloc). 2013-02-11 12:11:28 +01:00
antirez
76a5b21c3a Tcp keep-alive: send three probes before detectin an error.
Otherwise we end with less reliable connections because it's too easy
that a single packet gets lost.
2013-02-08 17:06:01 +01:00
antirez
124a635bc5 Set SO_KEEPALIVE on client sockets if configured to do so. 2013-02-08 16:40:59 +01:00
antirez
ee21c18e5d Add SO_KEEPALIVE support to anet.c. 2013-02-08 16:30:26 +01:00
antirez
a6c2f9012f Make all WATCHers dirty when the slave reloads the DB. 2013-02-08 10:26:19 +01:00
antirez
46dd4c62b3 LASTSAVE is a "random" command. 2013-02-07 19:13:00 +01:00
antirez
b70b459b0e TCP_NODELAY after SYNC: changes to the implementation. 2013-02-05 12:04:30 +01:00
charsyam
c85647f354 Turn off TCP_NODELAY on the slave socket after SYNC.
Further details from @antirez:

It was reported by @StopForumSpam on Twitter that the Redis replication
link was strangely using multiple TCP packets for multiple commands.
This wastes a lot of bandwidth and is due to the TCP_NODELAY option we
enable on the socket after accepting a new connection.

However the master -> slave channel is a one-way channel since Redis
replication is asynchronous, so there is no point in trying to reduce
the latency, we should aim to reduce the bandwidth. For this reason this
commit introduces the ability to disable the nagle algorithm on the
socket after a successful SYNC.

This feature is off by default because the delay can be up to 40
milliseconds with normally configured Linux kernels.
2013-02-05 12:04:25 +01:00
Rock Li
8063155cd0 retval doesn't initalized
If each if conditions are all fail, variable retval will under uninitlized
2013-02-05 15:56:04 +08:00
Gengliang Wang
002747336a Fix a bug in srandmemberWithCountCommand()
In CASE 2, the call sunionDiffGenericCommand will involve the string "srandmember" 
> sadd foo one
(integer 1)
> sadd srandmember two
(integer 2)
> srandmember foo 3
1)"one"
2)"two"
2013-02-04 14:01:08 +08:00
antirez
089cbe643f Sentinel: advertise the promoted slave address only after successful setup. 2013-01-31 17:19:21 +01:00
Salvatore Sanfilippo
aca005c246 Merge pull request #914 from catwell/unstable
fix comments forgotten in #285 (zipmap -> ziplist)
2013-01-31 03:37:48 -08:00
antirez
ad297a1a67 Z*STORE event fixed: generate del only if resulting sorted set is empty. 2013-01-29 13:50:01 +01:00
antirez
e41d1d77e3 Generate del events when S*STORE commands delete the destination key. 2013-01-29 13:43:13 +01:00
antirez
4dfb5752e0 Send 'expired' events when a key expires by lookup. 2013-01-28 13:15:19 +01:00
antirez
562b2bd6a7 Keyspace notifications: fixed a leak and a bug introduced in the latest commit. 2013-01-28 13:15:16 +01:00
antirez
fce016d31b Keyspace events: it is now possible to select subclasses of events.
When keyspace events are enabled, the overhead is not sever but
noticeable, so this commit introduces the ability to select subclasses
of events in order to avoid to generate events the user is not
interested in.

The events can be selected using redis.conf or CONFIG SET / GET.
2013-01-28 13:15:12 +01:00
antirez
1c0c551776 decrRefCount -> decrRefCountVoid in list constructor. 2013-01-28 13:15:08 +01:00
antirez
da04e6ed44 Keyspace events added for more commands. 2013-01-28 13:14:56 +01:00
antirez
8766e81079 Fix decrRefCount() prototype from void to robj pointer.
decrRefCount used to get its argument as a void* pointer in order to be
used as destructor where a 'void free_object(void*)' prototype is
expected. However this made simpler to introduce bugs by freeing the
wrong pointer. This commit fixes the argument type and introduces a new
wrapper called decrRefCountVoid() that can be used when the void*
argument is needed.
2013-01-28 13:14:53 +01:00
antirez
2d20e68fe4 notifyKeyspaceEvent(): release channel names using the right pointers. 2013-01-28 13:14:49 +01:00
antirez
5b9357a6b3 Initial test events for the new keyspace notification API. 2013-01-28 13:14:46 +01:00
antirez
2ea9518a53 Fixed over-80-cols comment in db.c 2013-01-28 13:14:42 +01:00
antirez
08d8bb57df Two fixes to initial keyspace notifications API. 2013-01-28 13:14:39 +01:00
antirez
4cdbce341e Keyspace events notification API. 2013-01-28 13:14:36 +01:00
Pierre Chapuis
50d43a9823 fix comments forgotten in #285 (zipmap -> ziplist) 2013-01-28 11:07:17 +01:00