304 Commits

Author SHA1 Message Date
antirez
8a588ac14d More trailing spaces in sentinel.c removed. 2014-05-28 15:46:05 +02:00
antirez
01e3f9ba1d Remove trailing spaces from sentinel.c. 2014-05-20 14:22:42 +02:00
antirez
2102778606 Sentinel: log when a failover will be attempted again.
When a Sentinel performs a failover (successful or not), or when a
Sentinel votes for a different Sentinel trying to start a failover, it
sets a min delay before it will try to get elected for a failover.

While not strictly needed, because if multiple Sentinels will try
to failover the same master at the same time, only one configuration
will eventually win, this serialization is practically very useful.
Normal failovers are cleaner: one Sentinel starts to failover, the
others update their config when the Sentinel performing the failover
is able to get the selected slave to move from the role of slave to the
one of master.

However currently this timeout was implicit, so users could see
Sentinels not reacting, after a failed failover, for some time, without
giving any feedback in the logs to the poor sysadmin waiting for clues.

This commit makes Sentinels more verbose about the delay: when a master
is down and a failover attempt is not performed because the delay has
still not elaped, something like that will be logged:

    Next failover delay: I will not start a failover
    before Thu May  8 16:48:59 2014
2014-05-08 16:38:53 +02:00
antirez
931beae9b0 Sentinel: generate +config-update-from event when a new config is received.
This event makes clear, before the switch-master event is generated,
that a Sentinel received a configuration update from another Sentinel.
2014-05-08 15:59:34 +02:00
antirez
35667d75c3 Fixed undefined variable value with certain code paths.
In sentinelFlushConfig() fd could be undefined when the following if
statement was true:

        if (rewrite_status == -1) goto werr;

This could cause random file descriptors to get closed.
2014-03-24 21:07:44 +01:00
Matt Stancliff
4290455145 Sentinel: Notify user when config can't be saved 2014-03-24 13:54:14 -04:00
Salvatore Sanfilippo
906c4d77c0 Merge pull request #1617 from mattsta/remove-unused-warning
Cluster: remove variable causing warning
2014-03-24 18:33:22 +01:00
Matt Stancliff
67ed5f00aa Cluster: remove variable causing warning
GCC-4.9 warned about this, but clang didn't.

This commit fixes warning:
sentinel.c: In function 'sentinelReceiveHelloMessages':
sentinel.c:2156:43: warning: variable 'master' set but not used [-Wunused-but-set-variable]
     sentinelRedisInstance *ri = c->data, *master;
2014-03-18 15:35:09 -04:00
antirez
b9e90a70fa Sentinel: sentinelRefreshInstanceInfo() minor refactoring.
Test sentinel.tilt condition on top and return if it is true.
This allows to remove the check for the tilt condition in the remaining
code paths of the function.
2014-03-18 15:35:47 +01:00
antirez
218cc5fc39 Sentinel: propagate down-after-ms changes to slaves and sentinels. 2014-03-18 14:37:44 +01:00
antirez
bb6d850160 Sentinel: down-after-milliseconds is not master-specific.
addReplySentinelRedisInstance() modified so that this field is displayed
for all the kind of instances: Sentinels, Masters, Slaves.
2014-03-18 11:21:17 +01:00
antirez
ae0b7680b3 Sentinel failure detection implementation improved.
Failure detection in Sentinel is ping-pong based. It used to work by
remembering the last time a valid PONG reply was received, and checking
if the reception time was too old compared to the current current time.

PINGs were sent at a fixed interval of 1 second.

This works in a decent way, but does not scale well when we want to set
very small values of "down-after-milliseconds" (this is the node
timeout basically).

This commit reiplements the failure detection making a number of
changes. Some changes are inspired to Redis Cluster failure detection
code:

* A new last_ping_time field is added in representation of instances.
  If non zero, we have an active ping that was sent at the specified
  time. When a valid reply to ping is received, the field is zeroed
  again.
* last_ping_time is not reset when we reconnect the link or send a new
  ping, so from our point of view it represents the time we started
  waiting for the instance to reply to our pings without receiving a
  reply.
* last_ping_time is now used in order to check if the instance is
  timed out. This means that we can have a node timeout of 100
  milliseconds and yet the system will work well since the new check is
  not bound to the period used to send pings.
* Pings are now sent every second, or often if the value of
  down-after-milliseconds is less than one second. With a lower limit of
  10 HZ ping frequency.
* Link reconnection code was improved. This is used in order to try to
  reconnect the link when we are at 50% of the node timeout without a
  valid reply received yet. However the old code triggered unnecessary
  reconnections when the node timeout was very small. Now that should be
  ok.

The new code passes the tests but more testing is needed and more unit
tests stressing the failure detector, so currently this is merged only
in the unstable branch.
2014-03-17 18:33:45 +01:00
antirez
3a2ff55617 Sentinel: use CLIENT SETNAME when connecting to Redis.
This makes debugging / monitoring of Sentinels simpler since you can
identify sentinels in CLIENT LIST output of Redis instances.
2014-03-15 14:59:23 +01:00
Matt Stancliff
584052ee6b Fix segfault from accessing array out of bounds
argc == 2; argv[2] == crash
2014-03-14 17:38:05 -04:00
antirez
ed813863f0 Sentinel: be safe under crash-recovery assumptions.
Sentinel's main safety argument is that there are no two configurations
for the same master with the same version (configuration epoch).

For this to be true Sentinels require to be authorized by a majority.
Additionally Sentinels require to do two important things:

* Never vote again for the same epoch.
* Never exchange an old vote for a fresh one.

The first prerequisite, in a crash-recovery system model, requires to
persist the master->leader_epoch on durable storage before to reply to
messages. This was not the case.

We also make sure to persist the current epoch in order to never reply
to stale votes requests from other Sentinels, after a recovery.

The configuration is persisted by making use of fsync(), this is
considered in the context of this code a good enough guarantee that
after a restart our durable state is restored, however this may not
always be the case depending on the kind of hardware and operating
system used.
2014-03-14 14:58:44 +01:00
antirez
365094028b Sentinel: fake PUBLISH command to receive HELLO messages.
Now the way HELLO messages are received is unified.
Now it is no longer needed for Sentinels to converge to the higher
configuration for a master to be able to chat via some Redis instance,
the are able to directly exchanges configurations.

Note that this commit does not include the (trivial) change needed to
send HELLO messages to Sentinel instances as well, since for an error I
committed the change in the previous commit that refactored hello
messages processing into a separated function.
2014-03-14 11:07:42 +01:00
antirez
9dfe426fc8 Sentinel: HELLO processing refactored into sentinelProcessHelloMessage(). 2014-03-14 11:07:42 +01:00
Jan-Erik Rediger
5f5118bdad Small typo fixed 2014-03-05 00:41:02 +01:00
antirez
47750998a6 Sentinel: more aggressive failover start desynchronization.
Sentinel needs to avoid split brain conditions due to multiple sentinels
trying to get voted at the exact same time.

So far some desynchronization was provided by fluctuating server.hz,
that is the frequency of the timer function call. However the
desynchonization provided in this way was not enough when using many
Sentinel instances, especially when a large quorum value is used in
order to force a greater degree of agreement (more than N/2+1).

It was verified that it was likely to trigger a split brain
condition, forcing the system to try again after a timeout.
Usually the system will succeed after a few retries, but this is not
optimal.

This commit desynchronizes instances in a more effective way to make it
likely that the first attempt will be successful.
2014-03-04 17:09:36 +01:00
antirez
b15411df98 Sentinel: log quorum with +monitor event. 2014-02-24 17:10:20 +01:00
antirez
6b373edb77 Sentinel: generate +monitor events at startup. 2014-02-24 16:33:55 +01:00
antirez
3b7a757468 Sentinel: log +monitor and +set events.
Now that we have a runtime configuration system, it is very important to
be able to log how the Sentinel configuration changes over time because
of API calls.
2014-02-24 16:33:43 +01:00
antirez
25cebf7285 Sentinel: added missing exit(1) after checking for config file. 2014-02-24 16:22:52 +01:00
antirez
b1c1386374 Sentinel: IDONTKNOW error removed.
This error was conceived for the older version of Sentinel that worked
via master redirection and that was not able to get configuration
updates from other Sentinels via the Pub/Sub channel of masters or
slaves.

This reply does not make sense today, every Sentinel should reply with
the best information it has currently. The error will make even more
sense in the future since the plan is to allow Sentinels to update the
configuration of other Sentinels via gossip with a direct chat without
the prerequisite that they have at least a monitored instance in common.
2014-02-22 17:34:46 +01:00
antirez
7d7b3810e7 Sentinel: report instances role switch events.
This is useful mostly for debugging of issues.
2014-02-20 12:13:52 +01:00
antirez
7cec9e48ce Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT
Rename define to match the new meaning.
2014-02-18 10:27:38 +01:00
antirez
18b8bad53c Sentinel: fix slave promotion timeout.
If we can't reconfigure a slave in time during failover, go forward as
anyway the slave will be fixed by Sentinels in the future, once they
detect it is misconfigured.

Otherwise a failover in progress may never terminate if for some reason
the slave is uncapable to sync with the master while at the same time
it is not disconnected.
2014-02-18 08:50:57 +01:00
antirez
e1b77b61f3 Sentinel: better specify startup errors due to config file.
Now it logs the file name if it is not accessible. Also there is a
different error for the missing config file case, and for the non
writable file case.
2014-02-17 16:44:49 +01:00
antirez
2d6eb68993 Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:24 +01:00
antirez
3ff1bb4b2e Sentinel: check arity for SENTINEL MASTER command.
This fixes issue #1530.
2014-01-31 10:13:38 +01:00
antirez
d5763dceaf SENTINEL SET master quorum implemented. 2014-01-14 09:23:26 +01:00
antirez
fe86f890b0 SENTINEL SET: error on bad option name + flush config on error. 2014-01-13 11:55:57 +01:00
antirez
f822516e43 SENTINEL SET implemented.
The new command allows to change master-specific configurations
at runtime. All the settable parameters can be retrivied via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
2014-01-13 11:53:29 +01:00
antirez
3cdcaff069 Sentinel: fix wrong arity error message. 2014-01-13 11:05:13 +01:00
antirez
964f6b17e9 Sentinel: SENTINEL REMOVE command added.
The command totally removes a monitored master.
2014-01-10 15:39:36 +01:00
antirez
cf2835519e Sentinel: releaseSentinelRedisInstance() top comment fixed.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.
2014-01-10 15:33:42 +01:00
antirez
9d0f46c6f5 Sentinel: flush config on disk when new master is added. 2014-01-10 15:22:06 +01:00
antirez
39f9f449b0 Sentinel: SENTINEL MONITOR command implemented.
It allows to add new masters to monitor at runtime.
2014-01-10 15:18:24 +01:00
antirez
c42e4bd0b6 Sentinel: added SENTINEL MASTER <name> command.
With SENTINEL MASTERS it was already possible to list all the configured
masters, but not a specific one.
2014-01-10 14:41:52 +01:00
antirez
2bb9cd464e Add all the configurable fields to addReplySentinelRedisInstance().
Note: the auth password with the master is voluntarily not exposed.
2014-01-10 14:31:41 +01:00
antirez
5a7d04ee7b Trip comment to 80 cols in SentinelCommand(). 2014-01-10 14:13:04 +01:00
antirez
5320148883 Sentinel: dead code removed. 2013-12-13 11:01:13 +01:00
antirez
2eb781b35b dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
antirez
c590549e40 Sentinel: fix reported role info sampling.
The way the role change was recoded was not sane and too much
convoluted, causing the role information to be not always updated.

This commit fixes issue #1445.
2013-12-06 12:46:56 +01:00
antirez
2b414a4b5f Sentinel: fix reported role fields when master is reset.
When there is a master address switch, the reported role must be set to
master so that we have a chance to re-sample the INFO output to check if
the new address is reporting the right role.

Otherwise if the role was wrong, it will be sensed as wrong even after
the address switch, and for enough time according to the role change
time, for Sentinel consider the master SDOWN.

This fixes isue #1446, that describes the effects of this bug in
practice.
2013-12-06 11:37:46 +01:00
antirez
11e81a1e9a Fixed grammar: before H the article is a, not an. 2013-12-05 16:35:32 +01:00
antirez
f80cf7363a Sentinel: don't write HZ when flushing config.
See issue #1419.
2013-12-02 15:56:10 +01:00
antirez
dffebbc904 Sentinel: better time desynchronization.
Sentinels are now desynchronized in a better way changing the time
handler frequency between 10 and 20 HZ. This way on average a
desynchronization of 25 milliesconds is produced that should be larger
enough compared to network latency, avoiding most split-brain condition
during the vote.

Now that the clocks are desynchronized, to have larger random delays when
performing operations can be easily achieved in the following way.
Take as example the function that starts the failover, that is
called with a frequency between 10 and 20 HZ and will start the
failover every time there are the conditions. By just adding as an
additional condition something like rand()%4 == 0, we can amplify the
desynchronization between Sentinel instances easily.

See issue #1419.
2013-12-02 12:29:42 +01:00
antirez
0addf8aff1 Sentinel: log vote received from other Sentinels. 2013-11-28 15:23:46 +01:00
huangz1990
86a540a66e fix a bug in sentinel.c about pub/sub link 2013-11-26 19:55:51 +08:00