redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 16:48:27 -05:00

Author	SHA1	Message	Date
antirez	2692339138	Cluster: forced failover implemented. Using CLUSTER FAILOVER FORCE it is now possible to failover a master in a forced way, which means: 1) No check to understand if the master is up is performed. 2) No data age of the slave is checked. Even a slave with very old data can manually failover a master in this way. 3) No chat with the master is attempted to reach its replication offset: the master can just be down.	2014-05-12 16:34:20 +02:00
antirez	005f564eb3	Cluster: bypass data_age check for manual failovers. Automatic failovers only happen in Redis Cluster if the slave trying to be elected was disconnected from its master for no more than 10 times the node-timeout value. However there should be no such check for manual failovers, since these are initiated by the sysadmin who, in theory, knows what she is doing when a slave is selected to be promoted.	2014-05-12 16:12:12 +02:00
antirez	5c78f87666	RESTORE: reply with -BUSYKEY special error code. The error when the target key is busy was a generic one, while it makes sense to be able to distinguish between the target key busy error and the others easily.	2014-05-12 10:01:59 +02:00
antirez	71d0e7e0ea	CLUSTER MEET: better error messages when address is invalid. Fixes issue #1734.	2014-05-09 16:36:59 +02:00
antirez	8a170c817d	Cluster: bulk-accept new nodes connections. The same change was operated for normal client connections. This is important for Cluster as well, since when a node rejoins the cluster, when a partition heals or after a restart, it gets flooded with new connection attempts by all the other nodes trying to form a full mesh again.	2014-05-09 11:52:59 +02:00
antirez	3625b52791	Cluster: clusterAcceptHandler() comments updated to match the code.	2014-05-09 11:44:46 +02:00
antirez	11d9ecb71d	CLUSTER SET-CONFIG-EPOCH implemented. Initially Redis Cluster accepted that after cluster creation all the nodes were at configEpoch 0, evolving from zero as failovers happen. However later the semantics were made stricter in order to make sure a cluster always has all the master nodes with a different configEpoch, which is more robust in some corner cases (especially those resulting from errors by the system administrator). To assign different configEpochs to different nodes at startup was a task performed naturally by the config conflicts resolution algorithm (see the Cluster specification). However this works well only for small clusters or when there are actually just a few collisions, since it is designed for exceptional cases. When a large cluster is created hundreds of nodes can be at epoch 0, so the conflict resolution code is slow to provide a unique config to each node. For this reason this new command was introduced. It can be called only when a node is totally fresh: no other nodes known, and configEpoch set to zero, so it is safe even against misuse. redis-trib will use the new command in order to start the cluster already setting an incremental unique config to every node.	2014-04-29 19:15:16 +02:00
antirez	e3cf812c9e	clusterLoadConfig() REDIS_ERR retval semantics refined. We should return REDIS_ERR to signal we can't read the configuration because there is no config file only after checking errno, otherwise we risk rewriting an existing file that was not accessible for some other reason.	2014-04-24 16:23:03 +02:00
antirez	db06108bc1	Lock nodes.conf to avoid multiple processes using the same file. This was a common source of problems among users. The solution adopted is not bullet-proof as if the user deletes the nodes.conf file manually, and starts a new instance with the same nodes.conf file path, two instances will use the same file. However following this reasoning the user may drop a nuclear bomb into the datacenter as well.	2014-04-24 16:04:10 +02:00
kingsumos	a69178fdd2	fix cluster node description showing wrong slot allocation	2014-04-22 11:44:53 -04:00
antirez	67bb2c46b2	Add casting to match printf format. adjustOpenFilesLimit() and clusterUpdateSlotsConfigWith() were assuming uint64_t is the same as unsigned long long, which is probably true for all the systems out there that we target, but still GCC emitted a warning since technically they are two different types.	2014-04-07 08:58:06 +02:00
antirez	8f52173b2c	Cluster: last_vote_epoch -> lastVoteEpoch. Use camel case for epochs that are persisted on disk.	2014-03-27 15:01:24 +01:00
antirez	7fb14b73ba	Cluster: save/restore vars that must persist after recovery. This fixes issue #1479.	2014-03-27 14:56:29 +01:00
antirez	6dd2dbbd36	Cluster: handshake "already known" error logged to VERBOSE. This is not really an error but something that always happens for example when creating a new cluster, or if the sysadmin rejoins manually a node that is already known. Since useless logs don't help, moved to VERBOSE level.	2014-03-26 16:35:38 +01:00
antirez	3cf6f1f54f	Cluster: clusterHandleConfigEpochCollision() fixed. New config epochs must always be obtained incrementing the currentEpoch, that is itself guaranteed to be >= the max configEpoch currently known to the node.	2014-03-26 12:31:28 +01:00
antirez	80d4c52cdf	Cluster: better logging for clusterUpdateSlotsConfigWith().	2014-03-26 12:09:38 +01:00
antirez	eb746ec408	Cluster: CLUSTER SETSLOT implementation comment updated. Update the comment since the implementation details changed.	2014-03-25 17:50:46 +01:00
antirez	6c527a89a0	Cluster: configEpoch collisions resolution. The slave election in Redis Cluster guarantees that slaves promoted to masters always end with unique config epochs, however failures during manual reshardings, software bugs and operational errors may in theory cause two nodes to have the same configEpoch. This commit introduces a mechanism to eventually always end with different configEpochs if a collision ever happens. As a (wanted) side effect, this also ensures that after a new cluster is created, all nodes will end with a different configEpoch automatically.	2014-03-25 17:19:58 +01:00
antirez	c1041c570f	Cluster: stay within 80 cols.	2014-03-25 16:07:14 +01:00
antirez	82b53c650c	struct dictEntry -> dictEntry.	2014-03-20 16:20:37 +01:00
antirez	e26f4486b0	Cluster: update node configEpoch on UPDATE messages. The UPDATE message contains the configEpoch of the node configuration advertised in the packet. Update it if needed.	2014-03-11 11:53:09 +01:00
antirez	a2ff90919f	Cluster: set slot error if we receive an update for a busy slot. By manually modifying nodes configurations in random ways, it is possible to create the following scenario: A is serving keys for slot 10 B is manually configured to serve keys for slot 10 A receives an update from B (or another node) where it is informed that the slot 10 is now claimed by B with a greater configuration epoch, however A still has keys from slot 10. With this commit A will put the slot in error setting it in IMPORTING state, so that redis-trib can detect the issue.	2014-03-11 11:49:47 +01:00
antirez	1ed0ad77f0	Cluster: clarified a comment in clusterUpdateSlotsConfigWith().	2014-03-11 11:32:40 +01:00
antirez	8287945ff8	Cluster: flush importing/migrating state when master is turned into slave.	2014-03-11 11:22:06 +01:00
antirez	2e8e0ad44e	Cluster: clusterCloseAllSlots() added.	2014-03-11 11:16:18 +01:00
antirez	787b297046	Cluster: getKeysFromCommand() API cleaned up. This API originated from the "diskstore" experiment, not for Redis Cluster itself, so there were legacy/useless things trying to differentiate between keys that are going to be overwritten and keys that need to be fetched from disk (preloaded). All useless with Cluster, so removed with the result of code simplification.	2014-03-10 13:18:41 +01:00
antirez	c1a7d3e61f	Cluster: abort on port too high error. It also fixes multi-line comment style to be consistent with the rest of the code base. Related to #1555.	2014-03-10 10:41:27 +01:00
Salvatore Sanfilippo	442b06db54	Merge pull request #1555 from mattsta/cluster-port-error-out Cluster port error out	2014-03-10 10:37:50 +01:00
antirez	ed8c55237b	Cluster: be explicit about passing NULL as bind addr for connect. The code was already correct but it was relying on bindaddr[0] being set to NULL as a side effect of the current implementation if no bind address is configured. This is not guaranteed to hold true in the future.	2014-03-10 10:33:53 +01:00
antirez	3e8a92ef8d	Cluster: log error when anetTcpNonBlockBindConnect() fails.	2014-03-10 10:32:28 +01:00
Salvatore Sanfilippo	3b0edb80ec	Merge pull request #1567 from mattsta/fix-cluster-join Bind source address for cluster communication	2014-03-10 10:28:32 +01:00
antirez	0f1f25784f	Cluster: better timeout and retry time for failover. When node-timeout is too small, in the order of a few milliseconds, there is no way the voting process can terminate during that time, so we set a lower limit for the failover timeout of two seconds. The retry time is set to two times the failover timeout time, so it is at least 4 seconds.	2014-03-10 09:57:52 +01:00
antirez	6984692060	Cluster: fix conditional generating TRYAGAIN error.	2014-03-07 16:18:00 +01:00
antirez	36676c2318	Redis Cluster: support for multi-key operations.	2014-03-07 13:19:09 +01:00
Matt Stancliff	385c25f70f	Remove redundant IP length definition. REDIS_CLUSTER_IPLEN had the same value as REDIS_IP_STR_LEN. They were both #define'd to the same INET6_ADDRSTRLEN.	2014-03-06 17:55:43 +01:00
Matt Stancliff	d2040ab9b1	Remove some redundant code. Function nodeIp2String in cluster.c is exactly anetPeerToString with a pre-extracted fd.	2014-03-06 17:55:39 +01:00
Matt Stancliff	59cf0b1902	Fix return value check for anetTcpAccept. anetTcpAccept returns ANET_ERR, not AE_ERR. This isn't a physical error since both ANET_ERR and AE_ERR are -1, but better to be consistent.	2014-03-06 17:55:31 +01:00
Matt Stancliff	e5b1e7be64	Bind source address for cluster communication. The first address specified as a bind parameter (server.bindaddr[0]) gets used as the source IP for cluster communication. If no bind address is specified by the user, the behavior is unchanged. This patch allows multiple Redis Cluster instances to communicate when running on the same interface of the same host.	2014-03-04 17:36:45 -05:00
antirez	8dea2029a4	Fix configEpoch assignment when a cluster slot gets "closed". This is still code to rework in order to use agreement to obtain a new configEpoch when a slot is migrated, however this commit handles the special case that happens when the nodes are just started and everybody has a configEpoch of 0. In this special condition having the maximum configEpoch is not enough as the special epoch 0 is not unique (all the others are). This does not fix the intrinsic race condition of a failover happening while we are resharding, that will be addressed later.	2014-03-03 11:12:11 +01:00
Matt Stancliff	ce68caea37	Cluster: error out quicker if port is unusable. The default cluster control port is 10,000 ports higher than the base Redis port. If Redis is started on a too-high port, Cluster can't start and everything will exit later anyway.	2014-02-19 17:30:07 -05:00
antirez	db6d628c3e	Cluster: clusterDelNode(): remove node from master's slaves.	2014-02-11 10:34:25 +01:00
antirez	5e0e03be41	Cluster: UPDATE messages are the norm and verbose. Logging them at WARNING level was of little utility and certainly a nuisance.	2014-02-11 10:18:24 +01:00
antirez	4a64286c36	Cluster: configEpoch assignment in SETNODE improved. Avoid trashing a configEpoch for every slot migrated if this node already has the max configEpoch across the cluster. Still work to do in this area but this avoids both ending with a very high configEpoch without any reason and flooding the system with fsyncs.	2014-02-11 10:09:17 +01:00
antirez	72f7abf6a2	Cluster: clusterSetStartupEpoch() made more generally useful. The actual goal of the function was to get the max configEpoch found in the cluster, so make it general by removing the assignment of the max epoch to currentEpoch that is useful only at startup.	2014-02-11 10:00:14 +01:00
antirez	44f7afe28a	Cluster: always increment the configEpoch in SETNODE after import. Removed a stale conditional preventing the configEpoch from incrementing after the import in certain conditions. Since the master got a new slot it should always claim a new configuration.	2014-02-11 09:50:37 +01:00
antirez	a1349728ea	Cluster: on resharding upgrade version of receiving node. The node receiving the hash slot needs to have a version that wins over the other versions in order to force the ownership of the slot. However the current code is far from perfect since a failover can happen during the manual resharding. The fix is a work in progress but the bottom line is that the new version must either be voted as usual, set by redis-trib manually after it makes sure it can't be used by other nodes, or reserved configEpochs could be used for manual operations (for example odd versions could be never used by slaves and are always used by CLUSTER SETSLOT NODE).	2014-02-11 00:36:05 +01:00
antirez	6dc26795aa	Cluster: fsync at every SETSLOT command puts too much pressure on disks. During slots migration redis-trib can send a number of SETSLOT commands. Fsyncing every time is a bit too much in production as verified empirically. To make sure configs are fsynced on all nodes after a resharding redis-trib may send something like CLUSTER CONFSYNC. In this case fsyncs were not providing too much value since anyway processes can crash in the middle of the resharding of a hash slot, and redis-trib should be able to recover from this condition anyway.	2014-02-10 23:54:08 +01:00
antirez	218358bbbd	Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed. If the slot is manually assigned to another node, clear the migrating status regardless of the fact it was previously assigned to us or not, as long as we no longer have keys for this slot. This avoids a race during slots migration that may leave the slot in migrating status in the source node, since it received an update message from the destination node that is already claiming the slot. This way we are sure that redis-trib at the end of the slot migration is always able to close the slot correctly.	2014-02-10 23:51:47 +01:00
antirez	bf670e0745	Cluster: don't update slave's master if we don't know it. There is no way we can update the slave's node->slaveof pointer if we don't know the master (no node with such an ID in our tables).	2014-02-10 18:33:34 +01:00
antirez	a3755ae9ee	Cluster: ignore slot config changes if we are importing it.	2014-02-10 18:04:43 +01:00


377 Commits