The ability of "SENTINEL SET" to change the reconfiguration script at
runtime is a problem even in the security model of Redis: any client
inside the network may set any executable to be ran once a failover is
triggered.
This option adds protection for this problem: by default the two
SENTINEL SET subcommands modifying scripts paths are denied. However the
user is still able to rever that using the Sentinel configuration file
in order to allow such a feature.
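A minimal sketch of how such a guard could look in the SENTINEL SET handler follows; the flag name, directive name and option strings are assumptions for illustration, not necessarily the exact identifiers used in sentinel.c:

    /* Illustrative guard: refuse runtime changes of script paths unless the
     * user explicitly re-enabled them in the Sentinel configuration file. */
    if (sentinel.deny_scripts_reconfig &&
        (!strcasecmp(option,"notification-script") ||
         !strcasecmp(option,"client-reconfig-script")))
    {
        addReplyError(c,
            "Reconfiguration of scripts path is denied for security reasons");
        return;
    }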
See issue #2819 for details. The gist is that when we want to send INFO
because the INFO period elapsed, we used to send only INFO commands, no
longer sending PING commands. However if a master fails exactly when we
are about to send an INFO command, the PING time will read as zero
because the PONG reply was already received, and we'll fail to send more
PINGs, since we only try to send INFO commands: the failure detector
will be delayed until the connection is closed and re-opened because of
the "long timeout".
This commit changes the logic so that we can send the three kinds of
messages regardless of the fact that we already sent another one in the
same code path. It could happen that we go over the message limit for
the link by a few messages, but this is not significant. However now we
will not introduce delays in sending commands just because there was
something else to send at the same time.
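A minimal sketch of the resulting periodic sender, with field and helper names that are illustrative rather than the exact ones in sentinel.c: each message kind gets its own independent check instead of an if/else chain that sends at most one command per call.

    mstime_t now = mstime();

    /* INFO is sent to masters and slaves once the INFO period elapsed. */
    if ((ri->flags & (SRI_MASTER|SRI_SLAVE)) &&
        (now - ri->info_refresh) > info_period)
        sentinelSendInfo(ri);

    /* PING is evaluated on its own, so a pending INFO no longer starves
     * the failure detector. */
    if ((now - ri->link->last_pong_time) > ping_period)
        sentinelSendPing(ri);

    /* Hello messages via Pub/Sub are also evaluated independently. */
    if ((now - ri->last_pub_time) > SENTINEL_PUBLISH_PERIOD)
        sentinelSendHello(ri);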
This change attempts to switch to a hash function which mitigates
the effects of the HashDoS attack (a denial of service attack trying
to force data structures into worst case behavior) while at the same
time providing Redis with a hash function that does not expect the
input data to be word aligned, a condition no longer true now that
sds.c strings have a variable length header.
Note that it is possible that, even using a hash function for which
collisions cannot be generated without knowing the seed,
special implementation details or the exposure of the seed in an
indirect way (for example the ability to add elements to a Set and
check the order in which Redis returns them with SMEMBERS) may
make the attacker's life simpler in the process of trying to guess
the correct seed. However the next step would be to switch to a
log(N) data structure when too many items in a single bucket are
detected: this seems like overkill in the case of Redis.
SPEED REGRESSION TESTS:
In order to verify that switching from MurmurHash to SipHash had
no impact on speed, a set of benchmarks involving fast insertion
of 5 million keys was performed.
The result shows Redis with SipHash in high pipelining conditions
to be about 4% slower compared to using the previous hash function.
However this could partially be related to the fact that the current
implementation does not attempt to hash whole words at a time but
reads single bytes, in order to have an output which is endian-neutral
and at the same time works on systems where unaligned memory accesses
are a problem.
Further x86-specific optimizations should be tested; the function
may easily reach the same level as MurmurHash2 if a few optimizations
are performed.
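For reference, the byte-at-a-time loading mentioned above boils down to assembling each 64-bit block like this (a generic sketch, not the exact code used by the SipHash implementation):

    #include <stdint.h>

    /* Load 8 bytes as a little-endian 64-bit word without assuming that the
     * pointer is aligned or that the host is little endian. Reading single
     * bytes is what makes the output endian-neutral and safe on platforms
     * where unaligned word accesses are a problem, at some speed cost. */
    static uint64_t load_le64(const uint8_t *p) {
        return ((uint64_t)p[0])       | ((uint64_t)p[1] << 8)  |
               ((uint64_t)p[2] << 16) | ((uint64_t)p[3] << 24) |
               ((uint64_t)p[4] << 32) | ((uint64_t)p[5] << 40) |
               ((uint64_t)p[6] << 48) | ((uint64_t)p[7] << 56);
    }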
During the initial handshake with the master a slave will report having
a very high disconnection time from its master (since technically it was
disconnected forever, so the current UNIX time in seconds is reported).
However when the slave is connected again the Sentinel may re-scan the
INFO output only after 10 seconds, which is a long time. During
this time Sentinels will consider this instance unable to failover, so
a useless delay is introduced.
Actually this hardly happened in practice because when a slave's
master is down, the INFO period for slaves changes to 1 second. However
when a manual failover is attempted immediately after adding slaves
(like in the case of the Sentinel unit test), this problem may happen.
This commit changes the INFO period to 1 second even in the case where
the slave's master is not down, but the slave reported being disconnected
from the master (by publishing, last time we checked, a master
disconnection time field in INFO).
This change is required as a result of an unrelated change in the
replication code that adds a small delay in the master-slave first
synchronization.
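A minimal sketch of the period selection described above (flag and field names follow the usual sentinel.c conventions but are meant as illustration):

    /* Slaves get the fast 1 second INFO period not only when their master
     * is failing or a failover is in progress, but also when the slave
     * itself reported, in the last INFO we parsed, to be disconnected
     * from its master. */
    mstime_t info_period;
    if ((ri->flags & SRI_SLAVE) &&
        ((ri->master->flags & (SRI_O_DOWN|SRI_FAILOVER_IN_PROGRESS)) ||
         (ri->master_link_down_time != 0)))
    {
        info_period = 1000;
    } else {
        info_period = SENTINEL_INFO_PERIOD;
    }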
This commit both fixes the crash reported with issue #3364 and
also properly closes the old links after the Sentinel address for the
other masters gets updated.
The two problems were:
1. The Sentinel that switched address may not monitor all the masters,
it is possible that there is no match, and the 'match' variable is
NULL. Now we check for no match and 'continue' to the next master.
2. By inspecting the code because of issue "1" I noticed that there was a
problem in the code that disconnects the link of the Sentinel that
needs the address update. Basically link->disconnected is non-zero
even if just *a single link* (cc -- command link or pc -- pubsub
link) is disconnected, so checking with if (link->disconnected)
in order to close the links risks leaving one link connected (see
the sketch below).
I was able to manually reproduce the crash at "1" and verify that the
commit resolves the issue.
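A minimal sketch of the link-closing part of the fix (point "2"); the helper and field names are assumed from the surrounding sentinel.c code:

    /* Close both connections explicitly instead of relying on the shared
     * link->disconnected flag, which is non-zero even when only one of
     * the two connections dropped. This way the link is fully
     * re-established with the updated address. */
    if (match->link->cc)
        instanceLinkCloseConnection(match->link, match->link->cc);
    if (match->link->pc)
        instanceLinkCloseConnection(match->link, match->link->pc);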
Close #3364.
This bug's most visible effect was an inability of Redis to
reconfigure back old masters to slaves after they become reachable again
after a failover. This was due to failing to reset the count of the
pending commands properly, so the master appeared forever down.
The bug was introduced with the Redis 3.2 Sentinel connection sharing
feature, which is a lot more complex than the 3.0 code, but more
scalable.
Many thanks to people reporting the issue, and especially to
@sskorgal for investigating the issue in depth.
Hopefully closes #3285.
Fix a possible race condition in sdown event detection when the Sentinel's connection to a master/slave/sentinel became disconnected just after the last PONG and before the next PING.
1. Bug #3035 is fixed (NULL pointer access). This was happening with the
following set of conditions:
* For some reason one of the Sentinels, let's call it Sentinel_A, changed ID (reconfigured from scratch), but has the same address at which it used to be.
* Sentinel_A performs a failover and/or has a newer configuration compared to another Sentinel, which we call Sentinel_B.
* Sentinel_B receives a HELLO message from Sentinel_A, where the address and/or ID is mismatched, but it is reporting a newer configuration for the master they are both monitoring.
2. Sentinels now must have an ID, otherwise they are not loaded nor persisted in the configuration. This allows having conflicting Sentinels with the same address, since now the master->sentinels dictionary is indexed by Sentinel ID.
3. The code now detects if a Sentinel is announcing itself with an IP/port pair already in use (by another Sentinel). The old Sentinel that had the same IP/port pair is set as having port 0, which means the address is invalid. We may discover the right address later via HELLO messages.
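A minimal sketch of the conflict detection in point "3", using the Redis dict API and illustrative field names:

    /* When a Sentinel announces an ip/port pair already used by another
     * known Sentinel, invalidate the old entry's address by setting its
     * port to 0. HELLO messages may fix the address later. */
    dictIterator *di = dictGetIterator(master->sentinels);
    dictEntry *de;
    while ((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *other = dictGetVal(de);
        if (other != ri &&
            other->addr->port == ri->addr->port &&
            !strcmp(other->addr->ip, ri->addr->ip))
        {
            other->addr->port = 0; /* Mark the old address as invalid. */
        }
    }
    dictReleaseIterator(di);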
We have a check to rewrite the config properly when a failover is in
progress, in order to add the current (already failed over) master as
a slave, and not to include in the slave list the promoted slave itself.
However there was an issue: the variable with the right address was
computed but never used when the code was modified (see the sketch
below), and no tests are available for this feature, for two reasons:
1. The Sentinel unit test currently does not test Sentinel's ability to
persist its state at all.
2. It is a state that is very hard to trigger, since it lasts for little
time in the context of the testing framework.
However this feature should be covered in the test in some way.
The bug was found by @badboy using the clang static analyzer.
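A minimal sketch of the intended rewrite behavior (function and field names are assumptions based on sentinel.c, shown only to illustrate the logic):

    /* The persisted master address is the *current* one, which during a
     * failover is the promoted slave's address. When that same address
     * shows up among the slaves, the old master address is written in
     * its place, so the old master is listed as a known-slave and the
     * promoted slave never lists itself. */
    sentinelAddr *master_addr = sentinelGetCurrentMasterAddress(master);

    dictIterator *di = dictGetIterator(master->slaves);
    dictEntry *de;
    while ((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *slave = dictGetVal(de);
        sentinelAddr *slave_addr = slave->addr;

        if (sentinelAddrIsEqual(slave_addr, master_addr))
            slave_addr = master->addr; /* Replace the promoted slave with
                                          the old master in the list. */
        /* ... emit "sentinel known-slave <name> <ip> <port>" here ... */
    }
    dictReleaseIterator(di);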
Effects of the bug on safety of Sentinel
===
This bug results in severe issues in the following case:
1. A Sentinel is elected leader.
2. During the failover, it persists a wrong config with a known-slave
entry listing the master address.
3. The Sentinel crashes and restarts, reading invalid configuration from
disk.
4. It sees that the slave now does not obey the logical configuration
(it should replicate from the current master), so it sends a SLAVEOF
command to the master (since the listed slave address is actually the
master's own address), creating a replication loop (an attempt to
replicate from itself) which Redis is currently unable to detect.
5. This means that the master is no longer available because of the bug.
However the lack of availability should be only transient (at least
in my tests, but other states could be possible where the problem
is not recovered automatically) because:
6. Sentinels treat masters reporting to be slaves as failing.
7. A new failover is triggered, and a slave is promoted to master.
Bug lifetime
===
The bug is there forever. Commit 16237d78 actually tried to fix the bug
but in the wrong way (the computed variable was never used! My fault).
So this bug is there basically since the start of Sentinel.
Since the bug is hard to trigger, I remember only a few reports matching
this condition. Also in automated tests where instances were stopped and
restarted multiple times automatically I remember hitting this issue,
however I was not able to reproduce it nor to determine, with the
information I had at the time, what was causing the issue.
The PING trigger was improved again by using two fields instead of a
single one to remember when the last ping was sent:
1. The "active" ping is the time at which we sent the last ping that
still received no reply. However we continue to ping non replying
instances even if they have an old active ping: the link may be
disconnected and reconencted in the meantime so the older pings may get
lost even if it's a TCP socket.
2. The "last" ping is the time at which we really sent the last ping
on the wire, and this is used in order to throttle the amount of pings
we send during failures (when no pong is received).
All in all the failure detector effectiveness should be identical, but we
avoid flooding instances with pings during failures or when they are
slow.
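A minimal sketch of the two-field bookkeeping on the sending side; names mirror the description above but are meant as illustration:

    int sentinelSendPing(sentinelRedisInstance *ri) {
        int retval = redisAsyncCommand(ri->link->cc,
            sentinelPingReplyCallback, ri, "PING");
        if (retval == C_OK) {
            ri->link->pending_commands++;
            /* "last" ping: the last time we actually wrote a PING. */
            ri->link->last_ping_time = mstime();
            /* "active" ping: only updated if the previous ping was already
             * acknowledged; the PONG callback resets it to zero. */
            if (ri->link->act_ping_time == 0)
                ri->link->act_ping_time = ri->link->last_ping_time;
            return 1;
        }
        return 0;
    }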
It's OK to ping as soon as the ping period has elapsed since we received
the last PONG, but it's not good that we ping again if there is a
pending ping. With this change, when a ping is still pending, we only
send a new ping if twice the ping period has elapsed since the pending
ping was sent.
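The resulting send condition could be sketched as follows (illustrative names, same assumptions as the sketch above):

    /* Ping when the ping period elapsed since the last PONG, but if a
     * ping is still pending, only re-ping after twice the ping period
     * since it was sent, to avoid flooding an unresponsive instance. */
    if ((now - ri->link->last_pong_time) > ping_period &&
        (ri->link->act_ping_time == 0 ||
         (now - ri->link->act_ping_time) > ping_period*2))
    {
        sentinelSendPing(ri);
    }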
This is useful for debugging and logging activities: given a
sentinelRedisInstance object, it returns a C string representing the
instance type: master, slave, or sentinel.
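A plausible shape for such a helper (the exact function name in sentinel.c may differ):

    const char *sentinelRedisInstanceTypeStr(sentinelRedisInstance *ri) {
        /* Map the instance flags to a constant string usable in logs. */
        if (ri->flags & SRI_MASTER) return "master";
        else if (ri->flags & SRI_SLAVE) return "slave";
        else if (ri->flags & SRI_SENTINEL) return "sentinel";
        else return "unknown";
    }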