61 Commits

Author SHA1 Message Date
antirez
3e9ce38b0a Sentinel: check Slave INFO state more often when disconnected.
During the initial handshake with the master a slave will report to have
a very high disconnection time from its master (since technically it was
disconnected since forever, so the current UNIX time in seconds is
reported).

However when the slave is connected again the Sentinel may re-scan the
INFO output again only after 10 seconds, which is a long time. During
this time Sentinels will consider this instance unable to failover, so
a useless delay is introduced.

Actaully this hardly happened in the practice because when a slave's
master is down, the INFO period for slaves changes to 1 second. However
when a manual failover is attempted immediately after adding slaves
(like in the case of the Sentinel unit test), this problem may happen.

This commit changes the INFO period to 1 second even in the case the
slave's master is not down, but the slave reported to be disconnected
from the master (by publishing, last time we checked, a master
disconnection time field in INFO).

This change is required as a result of an unrelated change in the
replication code that adds a small delay in the master-slave first
synchronization.
2016-07-22 10:51:25 +02:00
antirez
d614f1c37e Sentinel: CKQUORUM tests 2015-05-19 12:26:09 +02:00
antirez
65090401b7 Sentinel / Cluster test: exit with non-zero error code on failures. 2015-03-30 14:29:01 +02:00
Matt Stancliff
28343966a4 Spell software correctly 2014-09-29 06:49:07 -04:00
antirez
e21e0ba3dc Sentinel test: more correct sentinels config reset.
In the initialization test for each instance we used to unregister the
old master and register it again to clear the config.
However there is a race condition doing this: as soon as we unregister
and re-register "mymaster", another Sentinel can update the new
configuration with the old state because of gossip "hello" messages.

So the correct procedure is instead, unregister "mymaster" from all the
sentinel instances, and re-register it everywhere again.
2014-06-23 14:07:47 +02:00
antirez
f62dfa0f50 Sentinel test: tolerate larger delays in init tests. 2014-06-19 15:58:45 +02:00
antirez
d06d8d6ffa Sentinel test: unit 02, avoid some time related false positives. 2014-06-19 15:56:28 +02:00
antirez
f16ad11c71 Sentinel test: add manual failover test. 2014-06-19 10:33:12 +02:00
Matt Stancliff
f7d9827330 Add correct exit value to failed tests 2014-06-18 08:10:04 -04:00
antirez
e8631a6991 Cluster / Sentinel test: instances count moved to run.tcl. 2014-04-29 16:17:15 +02:00
antirez
897adc1c8c Sentinel test files / directories layout improved.
The test now runs in a self-contained directory.
The general abstractions to run the tests in an environment where
mutliple instances are executed at the same time was extrapolated into
instances.tcl, that will be reused to test Redis Cluster.
2014-04-24 11:08:22 +02:00