Commit Graph

3892 Commits

Author SHA1 Message Date
antirez
8e553ec67c Sentinel test: basic failover tested. Framework improvements. 2014-02-18 16:31:52 +01:00
antirez
c7b7439528 Sentinel test: basic tests for MONITOR and auto-discovery. 2014-02-18 11:53:54 +01:00
antirez
c4fbc1d336 Sentinel test: info fields, master-slave setup, fixes. 2014-02-18 11:38:49 +01:00
antirez
19b863c7fa Prefix test file names with numbers to force exec order. 2014-02-18 11:07:42 +01:00
antirez
141bac4c79 Sentinel test: provide basic commands to access instances. 2014-02-18 11:04:55 +01:00
antirez
7cec9e48ce Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT
Rename define to match the new meaning.
2014-02-18 10:27:38 +01:00
antirez
18b8bad53c Sentinel: fix slave promotion timeout.
If we can't reconfigure a slave in time during failover, go forward as
anyway the slave will be fixed by Sentinels in the future, once they
detect it is misconfigured.

Otherwise a failover in progress may never terminate if for some reason
the slave is uncapable to sync with the master while at the same time
it is not disconnected.
2014-02-18 08:50:57 +01:00
antirez
af788b5852 Sentinel: initial testing framework.
Nothing tested at all so far... Just the infrastructure spawning N
Sentinels and N Redis instances that the test will use again and again.
2014-02-17 17:38:04 +01:00
antirez
34c404e069 Test: colorstr moved to util.tcl. 2014-02-17 17:36:50 +01:00
antirez
a1dca2efab Test: code to test server availability refactored.
Some inline test moved into server_is_up procedure.
Also find_available_port was moved into util since it is going
to be used for the Sentinel test as well.
2014-02-17 16:44:57 +01:00
antirez
ede33fb912 Get absoulte config file path before processig 'dir'.
The code tried to obtain the configuration file absolute path after
processing the configuration file. However if config file was a relative
path and a "dir" statement was processed reading the config, the absolute
path obtained was wrong.

With this fix the absolute path is obtained before processing the
configuration while the server is still in the original directory where
it was executed.
2014-02-17 16:44:53 +01:00
antirez
e1b77b61f3 Sentinel: better specify startup errors due to config file.
Now it logs the file name if it is not accessible. Also there is a
different error for the missing config file case, and for the non
writable file case.
2014-02-17 16:44:49 +01:00
antirez
51bd9da1fd Update cached time in rdbLoad() callback.
server.unixtime and server.mstime are cached less precise timestamps
that we use every time we don't need an accurate time representation and
a syscall would be too slow for the number of calls we require.

Such an example is the initialization and update process of the last
interaction time with the client, that is used for timeouts.

However rdbLoad() can take some time to load the DB, but at the same
time it did not updated the time during DB loading. This resulted in the
bug described in issue #1535, where in the replication process the slave
loads the DB, creates the redisClient representation of its master, but
the timestamp is so old that the master, under certain conditions, is
sensed as already "timed out".

Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and
analysis.
2014-02-13 15:13:26 +01:00
antirez
7e8abcf693 Log when CONFIG REWRITE goes bad. 2014-02-13 14:32:44 +01:00
antirez
f2bdf601be Test: regression for issue #1549.
It was verified that reverting the commit that fixes the bug, the test
no longer passes.
2014-02-13 12:26:38 +01:00
antirez
21e6b0fbe9 Fix script cache bug in the scripting engine.
This commit fixes a serious Lua scripting replication issue, described
by Github issue #1549. The root cause of the problem is that scripts
were put inside the script cache, assuming that slaves and AOF already
contained it, even if the scripts sometimes produced no changes in the
data set, and were not actaully propagated to AOF/slaves.

Example:

    eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0

Then:

    evalsha <sha1 step 1 script> 1 0

At this step sha1 of the script is added to the replication script cache
(the script is marked as known to the slaves) and EVALSHA command is
transformed to EVAL. However it is not dirty (there is no changes to db),
so it is not propagated to the slaves. Then the script is called again:

    evalsha <sha1 step 1 script> 1 1

At this step master checks that the script already exists in the
replication script cache and doesn't transform it to EVAL command. It is
dirty and propagated to the slaves, but they fail to evaluate the script
as they don't have it in the script cache.

The fix is trivial and just uses the new API to force the propagation of
the executed command regardless of the dirty state of the data set.

Thank you to @minus-infinity on Github for finding the issue,
understanding the root cause, and fixing it.
2014-02-13 12:10:43 +01:00
antirez
fc08c8599f AOF write error: retry with a frequency of 1 hz. 2014-02-12 16:27:59 +01:00
antirez
fe8352540f AOF: don't abort on write errors unless fsync is 'always'.
A system similar to the RDB write error handling is used, in which when
we can't write to the AOF file, writes are no longer accepted until we
are able to write again.

For fsync == always we still abort on errors since there is currently no
easy way to avoid replying with success to the user otherwise, and this
would violate the contract with the user of only acknowledging data
already secured on disk.
2014-02-12 16:11:36 +01:00
antirez
db6d628c3e Cluster: clusterDelNode(): remove node from master's slaves. 2014-02-11 10:34:25 +01:00
antirez
5e0e03be41 Cluster: UPDATE messages are the norm and verbose.
Logging them at WARNING level was of little utility and of sure disturb.
2014-02-11 10:18:24 +01:00
antirez
8251d2d150 Cluster: redis-trib fix: handling of another trivial case. 2014-02-11 10:13:18 +01:00
antirez
4a64286c36 Cluster: configEpoch assignment in SETNODE improved.
Avoid to trash a configEpoch for every slot migrated if this node has
already the max configEpoch across the cluster.

Still work to do in this area but this avoids both ending with a very
high configEpoch without any reason and to flood the system with fsyncs.
2014-02-11 10:09:17 +01:00
antirez
72f7abf6a2 Cluster: clusterSetStartupEpoch() made more generally useful.
The actual goal of the function was to get the max configEpoch found in
the cluster, so make it general by removing the assignment of the max
epoch to currentEpoch that is useful only at startup.
2014-02-11 10:00:14 +01:00
antirez
44f7afe28a Cluster: always increment the configEpoch in SETNODE after import.
Removed a stale conditional preventing the configEpoch from incrementing
after the import in certain conditions. Since the master got a new slot
it should always claim a new configuration.
2014-02-11 09:50:37 +01:00
antirez
a1349728ea Cluster: on resharding upgrade version of receiving node.
The node receiving the hash slot needs to have a version that wins over
the other versions in order to force the ownership of the slot.

However the current code is far from perfect since a failover can happen
during the manual resharding. The fix is a work in progress but the
bottom line is that the new version must either be voted as usually,
set by redis-trib manually after it makes sure can't be used by other
nodes, or reserved configEpochs could be used for manual operations (for
example odd versions could be never used by slaves and are always used
by CLUSTER SETSLOT NODE).
2014-02-11 00:36:05 +01:00
antirez
6dc26795aa Cluster: fsync at every SETSLOT command puts too pressure on disks.
During slots migration redis-trib can send a number of SETSLOT commands.
Fsyncing every time is a bit too much in production as verified
empirically.

To make sure configs are fsynced on all nodes after a resharding
redis-trib may send something like CLUSTER CONFSYNC.

In this case fsyncs were not providing too much value since anyway
processes can crash in the middle of the resharding of an hash slot, and
redis-trib should be able to recover from this condition anyway.
2014-02-10 23:54:08 +01:00
antirez
218358bbbd Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed.
If the slot is manually assigned to another node, clear the migrating
status regardless of the fact it was previously assigned to us or not,
as long as we no longer have keys for this slot.

This avoid a race during slots migration that may leave the slot in
migrating status in the source node, since it received an update message
from the destination node that is already claiming the slot.

This way we are sure that redis-trib at the end of the slot migration is
always able to close the slot correctly.
2014-02-10 23:51:47 +01:00
antirez
3107e7ca60 Cluster: remove debugging xputs from redis-trib. 2014-02-10 19:14:05 +01:00
antirez
1ae50a9b1d Cluster: redis-trib fix: cover new case of open slot.
The case is the trivial one a single node claiming the slot as
migrating, without nodes claiming it as importing.
2014-02-10 19:10:23 +01:00
antirez
59e03a8f35 redis-trib: log event after we have reference to 'master'. 2014-02-10 18:48:40 +01:00
antirez
bf670e0745 Cluster: don't update slave's master if we don't know it.
There is no way we can update the slave's node->slaveof pointer if we
don't know the master (no node with such an ID in our tables).
2014-02-10 18:33:34 +01:00
antirez
a3755ae9ee Cluster: ignore slot config changes if we are importing it. 2014-02-10 18:04:43 +01:00
antirez
6fc53e16ad Cluster: update configEpoch after manually messing with slots. 2014-02-10 18:01:58 +01:00
antirez
be0bb19fd3 Cluster: redis-trib, more info about open slots error. 2014-02-10 17:44:16 +01:00
antirez
1a73c992a3 Cluster: fixed inverted arguments in logging function call. 2014-02-10 17:21:10 +01:00
antirez
32563b4a5f Cluster: clear the FAIL status for masters without slots.
Masters without slots don't participate to the cluster but just do
redirections, no need to take them in FAIL state if they are back
reachable.
2014-02-10 17:18:27 +01:00
antirez
5b2082ead3 Cluster: replica migration should only work for masters serving slots. 2014-02-10 17:08:37 +01:00
antirez
f106a79309 Cluster: redis-trib del-node variable typo fixed. 2014-02-10 16:59:09 +01:00
antirez
f885fa8bac Cluster: clusterReadHandler() fixed to work with new message header. 2014-02-10 16:27:37 +01:00
antirez
344a065d51 Cluster: don't propagate PUBLISH two times.
PUBLISH both published messages via Cluster bus and replication when
cluster was enabled, resulting in duplicated message in the slave.
2014-02-10 16:00:27 +01:00
antirez
7bf7b7350c Cluster: signature changed to "RCmb" (Redis Cluster message bus).
Sounds better after all.
2014-02-10 15:55:21 +01:00
antirez
dced9c0619 Cluster: discard bus messages with version != 0. 2014-02-10 15:54:22 +01:00
antirez
007e1c7cb2 Cluster: added signature + version in bus packets. 2014-02-10 15:53:09 +01:00
antirez
100cd7b21e Added a release notes file good for "unstable". 2014-02-10 15:38:54 +01:00
antirez
b1c3c6e518 Old Changelog file removed from unstable branch. 2014-02-10 15:19:12 +01:00
antirez
dca95f241c Cluster: redis-trib: options table entry for add-node fixed. 2014-02-10 12:34:21 +01:00
antirez
6df4ffe639 Don't count time to feed MONITORs in SLOWLOG. 2014-02-07 18:29:20 +01:00
antirez
142281dc79 Cluster: keys slot computation now supports hash tags.
Currently this is marginally useful, only to make sure two keys are in
the same hash slot when the cluster is stable (no rehashing in
progress).

In the future it is possible that support will be added to run
mutli-keys operations with keys in the same hash slot.
2014-02-07 17:39:01 +01:00
antirez
2d6eb68993 Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:24 +01:00
antirez
970de3e9c0 Check for EAGAIN in sendBulkToSlave().
Sometime an osx master with a Linux server over a slow link caused
a strange error where osx called the writable function for
the socket but actually apparently there was no room in the socket
buffer to accept the write: write(2) call returned an EAGAIN error,
that was not checked, so we considered write(2) == 0 always as a connection
reset, which was unfortunate since the bulk transfer has to start again.

Also more errors are logged with the WARNING level in the same code path
now.
2014-02-05 16:38:10 +01:00