redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
Oran Agra	689b64c3ad	PSYNC2 fix - promoted slave should hold on to its backlog. After a slave is promoted (assuming it has no slaves and it booted over an hour ago), it will lose its replication backlog at the next replication cron, rather than waiting for slaves to connect to it. So on a simple master/slave failover, if the new slave doesn't connect immediately, it may be too late and PSYNC2 will fail.	2018-01-16 10:10:42 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid introducing errors when modifying one of the two places and forgetting the other. The function is also a good place to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
antirez	4d063bb6ba	PSYNC2: reorganize comments related to recent fixes. Related to PR #4412 and issue #4407.	2017-11-24 11:08:29 +01:00
zhaozhao.zz	6ddf0ea293	PSYNC2: safe free backlog when reach the time limit. When we free the backlog, we should use a new replication ID and clear the ID2. Since without backlog we can not increment master_repl_offset even when executing write commands, that may lead to inconsistency when we try to connect to a "slave-before" master (if this master was our slave before, our replid equals the master's replid2). As the master has our history, we can match the master's replid2 and second_replid_offset, which makes partial sync work, but the data is inconsistent.	2017-11-01 17:32:27 +08:00
antirez	bb3b5ddd19	PSYNC2: More refinements related to #4316.	2017-09-20 11:28:13 +02:00
zhaozhao.zz	b541ccef25	PSYNC2: make persisting replication info more solid. This commit is a reinforcement of commit `c1c99e9`. 1. Replication information can be stored when the RDB file is generated by a master using server.slaveseldb when server.repl_backlog is not NULL, or set repl_stream_db to -1. That's safe, because NULL server.repl_backlog will trigger full synchronization, then master will send SELECT command to replication stream. 2. Only do rdbSave* when rsiptr is not NULL, if we do rdbSave* without rdbSaveInfo, slave will miss repl-stream-db. 3. Save the replication information also in the case of SAVE command, FLUSHALL command and DEBUG reload.	2017-09-20 11:18:10 +02:00
antirez	c1c99e9f4e	PSYNC2: Fix the way replication info is saved/loaded from RDB. This commit attempts to fix a number of bugs reported in #4316. They are related to the way replication info like replication ID, offsets, and currently selected DB in the master client, are stored and loaded by Redis. In order to avoid inconsistencies the changes in this commit try to enforce that: 1. Replication information is only stored when the RDB file is generated by a slave that has a valid 'master' client, so that we can always extract the currently selected DB. 2. When replication information is persisted in the RDB file, all the info for a successful PSYNC or nothing is persisted. 3. The RDB replication information is only loaded if the instance is configured as a slave, otherwise a master can start with IDs that relate to a different history of the data set, and still retain such IDs in the future while receiving unrelated writes.	2017-09-19 23:03:39 +02:00
antirez	b75ae0bbea	PSYNC2: Create backlog on slave partial sync as well. A slave may be started with an RDB file able to provide enough information to perform a successful partial SYNC with its master. However in such a case, as outlined in issue #4268, the slave backlog will not be started, since it was only initialized on full sync attempts. This creates different problems with successive PSYNC attempts that will always result in full synchronizations. Thanks to @fdingiit for discovering the issue.	2017-09-19 10:33:14 +02:00
antirez	469d6e2b37	PSYNC2: fix master cleanup when caching it. The master client cleanup was incomplete: resetClient() was missing and the output buffer of the client was not reset, so pending commands related to the previous connection could be still sent. The first problem caused the client argument vector to be, at times, half populated, so that when the correct replication stream arrived the protocol got mixed with the arguments, creating invalid commands that nobody called. Thanks to @yangsiran for also investigating this problem, after already providing important design / implementation hints for the original PSYNC2 issues (see referenced GitHub issue). Note that this commit adds a new function to the list library of Redis in order to be able to reset a list without destroying it. Related to issue #3899.	2017-04-27 17:08:37 +02:00
antirez	189a12afb4	PSYNC2: discard pending transactions from cached master. During the review of the fix for #3899, @yangsiran identified an implementation bug: given that the offset is now relative to the applied part of the replication log, when we cache a master, the successive PSYNC2 request will be made in order to include the transaction that was not completely processed. This means that we need to discard any pending transaction from our replication buffer: it will be re-executed.	2017-04-19 14:02:52 +02:00
antirez	22be435efe	Fix PSYNC2 incomplete command bug as described in #3899. This bug was discovered by @kevinmcgehee and constituted a major hidden bug in the PSYNC2 implementation, caused by the propagation from the master of incomplete commands to slaves. The bug had several results: 1. Borrowing from Kevin's text in the issue: "Given that slaves blindly copy over their master's input into their own replication backlog over successive read syscalls, it's possible that with large commands or small TCP buffers, partial commands are present in this buffer. If the master were to fail before successfully propagating the entire command to a slave, the slaves will never execute the partial command (since the client is invalidated) but will copy it to replication backlog which may relay those invalid bytes to its slaves on PSYNC2, corrupting the backlog and possibly other valid commands that follow the failover. Simple command boundaries aren't sufficient to capture this, either, because in the case of a MULTI/EXEC block, if the master successfully propagates a subset of the commands but not the EXEC, then the transaction in the backlog becomes corrupt and could corrupt other slaves that consume this data." 2. As identified by @yangsiran later, there is another effect of the bug. For the same mechanism of the first problem, a slave having another slave, could receive a full resynchronization request with an already half-applied command in the backlog. Once the RDB is ready, it will be sent to the slave, and the replication will continue sending to the sub-slave the other half of the command, which is not valid. 
The fix, designed by @yangsiran and @antirez, and implemented by @antirez, uses a secondary buffer in order to feed the sub-masters and update the replication backlog and offsets, only when a given part of the query buffer is actually applied to the state of the instance, that is, when the command gets processed and the command is not pending in the Redis transaction buffer because of CLIENT_MULTI state. Given that now the backlog and offsets representation are in agreement with the actual processed commands, both issues 1 and 2 should no longer be possible. Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in identifying and designing a fix for this problem.	2017-04-19 10:25:45 +02:00
antirez	104584b95e	Fix typo in feedReplicationBacklog() top comment.	2017-04-12 12:28:05 +02:00
antirez	76d87f47c7	Don't leak file descriptor on syncWithMaster(). Close #3804.	2017-02-20 10:18:41 +01:00
antirez	8e390a62ad	Hopefully improve code comments for issue #3616. This commit also contains other changes in order to conform the code to the Redis core style, specifically 80 chars max per line, smart conditionals in the same line: if (that) do_this();	2016-12-16 17:48:38 +01:00
Salvatore Sanfilippo	ca4ca5073e	Merge pull request #3616 from oranagra/stop_aofrw_before_rdbload CoW improvement, stop AOFRW before flushing and parsing slave RDB	2016-12-16 17:43:20 +01:00
antirez	434e6b2da3	PSYNC2: Do not accept WAIT in slave instances. No longer makes sense since writable slaves only do local writes now: writes are no longer passed to sub-slaves in the stream.	2016-12-02 10:21:20 +01:00
antirez	6eb720ff2d	PSYNC2: Minor memory leak reading -NOMASTERLINK master reply fixed.	2016-11-29 10:25:00 +01:00
antirez	eab865a0a1	PSYNC2: stop sending newlines to sub-slaves when master is down. This actually includes two changes: 1) No newlines to keep the master-slave link up when the upstream master is down. Doing this is dangerous because the sub-slave has often received the replication protocol for a half command, so it can't receive newlines without desyncing the replication link, even with the code in order to cancel out the bytes that PSYNC2 was using. Moreover this is probably also not needed/sane, because anyway the slave can keep serving requests, and because if it's configured not to serve stale data, it's a good idea, actually, to break the link. 2) When a +CONTINUE with a different ID is received, we now break connection with the sub-slaves: they need to be notified as well. This was part of the original specification but for some reason it was not implemented in the code, and was later found as a PSYNC2 bug in the integration testing.	2016-11-28 17:54:04 +01:00
antirez	e09e31b12e	PSYNC2: on transient error jump to error, not write_error.	2016-11-24 15:48:18 +01:00
antirez	5b7d42fff3	PSYNC2: bugfixing pre release. 1. Master replication offset was cleared after switching configuration to some other slave, since it was assumed you can't PSYNC after a switch. This is not the case anymore, and when we successfully PSYNC we need to have our offset untouched. 2. Secondary replication ID was not reset to "000..." pattern at startup. 3. Master in error state replying -LOADING or other transient errors forced the slave to discard the cached master and full resync. This is now fixed. 4. Better logging of what's happening on failed PSYNCs.	2016-11-23 17:36:45 +01:00
Salvatore Sanfilippo	5b83fa482c	Merge pull request #3612 from deep011/unstable fix a possible bug for 'replconf getack'	2016-11-18 10:45:09 +01:00
oranagra	e3a61950a2	when a slave loads an RDB, stop an AOFRW fork before flushing db and parsing rdb file, to avoid a CoW disaster.	2016-11-16 21:30:59 +02:00
deep011	13a92a5bb1	fix a possible bug for 'replconf getack'	2016-11-16 11:04:33 +08:00
antirez	28c96d73b2	PSYNC2: Save replication ID/offset on RDB file. This means that stopping a slave and restarting it will still make it able to PSYNC with the master. Moreover the master itself will retain its ID/offset, in case it gets turned into a slave, or if a slave will try to PSYNC with it with an exactly updated offset (otherwise there is no backlog). This change was possible thanks to PSYNC v2 that makes saving the current replication state much simpler.	2016-11-10 12:35:29 +01:00
antirez	4e5e366ed2	PSYNC2: Wrap debugging code with if(0)	2016-11-09 15:37:15 +01:00
antirez	2669fb8364	PSYNC2: different improvements to Redis replication. The gist of the changes is that now, partial resynchronizations between slaves and masters (without the need of a full resync with RDB transfer and so forth), work in a number of cases when it was impossible in the past. For instance: 1. When a slave is promoted to master, the slaves of the old master can partially resynchronize with the new master. 2. Chained slaves (slaves of slaves) can be moved to replicate to other slaves or the master itself, without requiring a full resync. 3. The master itself, after being turned into a slave, is able to partially resynchronize with the new master, when it joins replication again. In order to obtain this, the following main changes were made: * Slaves also take a replication backlog, not just masters. * Same stream replication for all the slaves and sub slaves. The replication stream is identical from the top level master to its slaves and is also the same from the slaves to their sub-slaves and so forth. This means that if a slave is later promoted to master, it has the same replication backlog, and can partially resynchronize with its slaves (that were previously slaves of the old master). * A given replication history is no longer identified by the `runid` of a Redis node. There is instead a `replication ID` which changes every time the instance has a new history no longer coherent with the past one. So, for example, slaves publish the same replication history of their master, however when they are turned into masters, they publish a new replication ID, but still remember the old ID, so that they are able to partially resynchronize with slaves of the old master (up to a given offset). * The replication protocol was slightly modified so that a new extended +CONTINUE reply from the master is able to inform the slave of a replication ID change. 
* REPLCONF CAPA is used in order to notify masters that a slave is able to understand the new +CONTINUE reply. * The RDB file was extended with an auxiliary field that is able to select a given DB after loading in the slave, so that the slave can continue receiving the replication stream from the point it was disconnected without requiring the master to insert "SELECT" statements. This is useful in order to guarantee the "same stream" property, because the slave must be able to accumulate an identical backlog. * Slave pings to sub-slaves are now sent in a special form, when the top-level master is disconnected, in order not to interfere with the replication stream. We just use out of band "\n" bytes as in other parts of the Redis protocol. An old design document is available here: https://gist.github.com/antirez/ae068f95c0d084891305 However the implementation is not identical to the description because during the work to implement it, different changes were needed in order to make things work well.	2016-11-09 15:37:15 +01:00
charsyam	ca6fc4f031	Simple change just using slaves instead of server.slaves	2016-09-24 15:53:57 +09:00
Qu Chen	d982f44372	Fix a bug to delay bgsave while AOF rewrite in progress for replication	2016-08-02 10:44:33 +02:00
antirez	55385f99de	Ability of slave to announce arbitrary ip/port to master. This feature is useful, especially in deployments using Sentinel in order to setup Redis HA, where the slave is executed with NAT or port forwarding, so that the auto-detected port/ip addresses, as listed in the "INFO replication" output of the master, or as provided by the "ROLE" command, don't match the real addresses at which the slave is reachable for connections.	2016-07-27 17:32:15 +02:00
antirez	03f5b508e5	Replication: when possible start RDB saving ASAP. In a previous commit the replication code was changed in order to centralize the BGSAVE for replication trigger in replicationCron(), however after further testing, the 1 second delay imposed by this change is not acceptable. So now the BGSAVE is only delayed if the AOF rewriting process is active. However past comments made sure that replicationCron() is always able to trigger the BGSAVE when needed, making the code generally more robust. The new code is more similar to the initial @oranagra patch where the BGSAVE was delayed only if an AOF rewrite was in progress. Trivia: delaying the BGSAVE uncovered a minor Sentinel issue that is now fixed.	2016-07-22 17:03:18 +02:00
antirez	780a8b1d76	Replication: start BGSAVE for replication always in replicationCron(). This makes the replication code conceptually simpler by removing the synchronous BGSAVE trigger in syncCommand(). This also means that socket and disk BGSAVE targets are handled by the same code.	2016-07-21 12:10:56 +02:00
antirez	acc2336fd1	Centralize slave replication handshake aborting. Now we have a single function to call in any state of the slave handshake, instead of using different functions for different states which is error prone. Change performed in the context of issue #2479 but does not fix it, since it should be functionally identical to the past. Just an attempt to make replication.c simpler to follow.	2015-12-03 10:38:56 +01:00
antirez	ed6228851c	PR 2813 fix ported to unstable.	2015-10-15 10:20:09 +02:00
antirez	252cfa0a39	Lazyfree: cond vars to enable/disable it based on DEL context.	2015-10-02 15:27:57 +02:00
antirez	c69c6c80fb	Lazyfree: ability to free whole DBs in background.	2015-10-01 13:02:26 +02:00
antirez	1e7153831d	Refactoring: unlinkClient() added to lower freeClient() complexity.	2015-09-30 17:10:03 +02:00
antirez	fdb3be939e	Refactoring: new function to test if client has pending output.	2015-09-30 16:41:48 +02:00
antirez	1c7d87df0c	Avoid installing the client write handler when possible.	2015-09-30 16:29:41 +02:00
antirez	d036abe27d	Log client details on SLAVEOF command having an effect.	2015-08-21 15:29:07 +02:00
antirez	f18e5b634d	startBgsaveForReplication(): handle waiting slaves state change. Before this commit, after triggering a BGSAVE it was up to the caller of startBgsaveForReplication() to handle slaves in WAIT_BGSAVE_START in order to update them accordingly. However when the replication target is the socket, this is not possible since the process of updating the slaves and sending the FULLRESYNC reply must be coupled with the process of starting an RDB save (the reason is, we need to send the FULLSYNC command and spawn a child that will start to send RDB data to the slaves ASAP). This commit moves the responsibility of handling slaves in WAIT_BGSAVE_START to startBgsaveForReplication() so that for both diskless and disk-based replication we have the same chain of responsibility. In order to accommodate such change, the syncCommand() also needs to put the client in the slave list ASAP (just after the initial checks) and not at the end, so that startBgsaveForReplication() can find the new slave already in the list. Another related change is what happens if the BGSAVE fails because of fork() or other errors: we now remove the slave from the list of slaves and send an error, scheduling the slave connection to be terminated. As a side effect of this change the following errors found by Oran Agra are fixed (thanks!): 1. rdbSaveToSlavesSockets() on failed fork will get the slaves cleaned up, otherwise they remain in a wrong state forever since we setup them for full resync before actually trying to fork. 2. updateSlavesWaitingBgsave() with replication target set as "socket" was broken since the function changed the slaves state from WAIT_BGSAVE_START to WAIT_BGSAVE_END via replicationSetupSlaveForFullResync(), so later rdbSaveToSlavesSockets() will not find any slave in the right state (WAIT_BGSAVE_START) to feed.	2015-08-20 17:39:48 +02:00
antirez	bea1259190	slaveTryPartialResynchronization and syncWithMaster: better synergy. It is simpler if removing the read event handler from the FD is up to slaveTryPartialResynchronization, after all it is only called in the context of syncWithMaster. This commit also makes sure that on error all the event handlers are removed from the socket before closing it.	2015-08-07 12:04:37 +02:00
antirez	88c716a0f5	syncWithMaster(): non blocking state machine.	2015-08-06 18:12:20 +02:00
antirez	ce5761e061	startBgsaveForReplication(): log what you really do.	2015-08-06 09:49:38 +02:00
antirez	3e6d4d599a	Replication: add REPLCONF CAPA EOF support. Add the concept of slaves capabilities to Redis, the slave now presents to the Redis master with a set of capabilities in the form: REPLCONF capa SOMECAPA capa OTHERCAPA ... This has the effect of setting slave->slave_capa with the corresponding SLAVE_CAPA macros that the master can test later to understand if the slave will understand certain formats and protocols of the replication process. This makes it much simpler to introduce new replication capabilities in the future in a way that doesn't break old slaves or masters. This patch was designed and implemented together with Oran Agra (@oranagra).	2015-08-06 09:23:23 +02:00
antirez	55ba772703	Fix replication slave pings period. For PINGs we use the period configured by the user, but for the newlines of slaves waiting for an RDB to be created (including slaves waiting for the FULLRESYNC reply) we need to ping with frequency of 1 second, since the timeout is fixed and needs to be refreshed.	2015-08-05 16:49:16 +02:00
antirez	15de6b108b	Make sure we re-emit SELECT after each new slave full sync setup. In previous commits we moved the FULLRESYNC to the moment we start the BGSAVE, so that the offset we provide is the right one. However this also means that we need to re-emit the SELECT statement every time a new slave starts to accumulate the changes. To obtain this effect in a cleaner way, the function that sends the FULLRESYNC reply was overloaded with a more important role of also doing this and changing the slave state. So it was renamed to replicationSetupSlaveForFullResync() to better reflect what it does now.	2015-08-05 13:34:46 +02:00
antirez	a5a06a8ecd	Don't send SELECT to slaves in WAIT_BGSAVE_START state.	2015-08-05 11:23:22 +02:00
antirez	62b5c60ead	syncCommand() comments improved.	2015-08-05 08:41:57 +02:00
antirez	292fec058a	PSYNC initial offset fix. This commit attempts to fix a bug involving PSYNC and diskless replication (currently experimental) found by Yuval Inbar from Redis Labs and that was later found to have even more far reaching effects (the bug also exists when diskstore is off). The gist of the bug is that, a Redis master replies with +FULLRESYNC to a PSYNC attempt that fails and requires a full resynchronization. However, the baseline offset sent along with FULLRESYNC was always the current master replication offset. This is not ok, because there are many reasons that may delay the RDB file creation. And... guess what, the master offset we communicate must be the one of the time the RDB was created. So for example: 1) When the BGSAVE for replication is delayed since there is one already but is not good for replication. 2) When the BGSAVE is not needed as we attach one currently ongoing. 3) When because of diskless replication the BGSAVE is delayed. In all the above cases the PSYNC reply is wrong and the slave may reconnect later claiming to need a wrong offset: this may cause data corruption later.	2015-08-04 17:06:10 +02:00


207 Commits