redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
antirez	2c66c525f9	ACL: configure the master connection without user.	2019-01-17 18:33:36 +01:00
antirez	709a6612eb	RESP3: addReplyString() -> addReplyProto(). The function naming was totally nuts. Let's fix it as we break PRs anyway with RESP3 refactoring and changes.	2019-01-09 17:00:30 +01:00
antirez	07bce54093	RESP3: Use new deferred len API in replication.c.	2019-01-09 17:00:29 +01:00
antirez	06a4acb7d3	When replica kills a pending RDB save during SYNC, log it. This logs what happens in the context of the fix in PR #5367.	2018-10-31 11:47:10 +01:00
Salvatore Sanfilippo	6204d8c139	Merge pull request #5367 from nUl1/fullresync-stopbgsave Prevent RDB autosave from overwriting full resync results	2018-10-31 11:42:04 +01:00
antirez	3d07ed983e	Fix typo in replicationCron() comment.	2018-10-05 18:30:45 +02:00
Andrey Bugaevskiy	466c277b4f	Move child termination to readSyncBulkPayload	2018-09-27 19:38:58 +03:00
Andrey Bugaevskiy	98a64523c4	Prevent RDB autosave from overwriting full resync results During the full database resync we may still have unsaved changes on the receiving side. This causes a race condition between synced data rename/load and the rename of rdbSave tempfile.	2018-09-19 19:58:39 +03:00
antirez	61b7a176ef	Slave removal: replication.c logs fixed.	2018-09-11 15:32:28 +02:00
antirez	ef2c7a5bbb	Slave removal: SLAVEOF -> REPLICAOF. SLAVEOF is now an alias.	2018-09-11 15:32:28 +02:00
Oran Agra	d55598988b	fix rare replication stream corruption with disk-based replication The slave sends \n keepalive messages to the master while parsing the rdb, and later sends REPLCONF ACK once a second. rarely, the master receives both a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n... and it tries to trim two chars (\r\n) from the query buffer, trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n... then the master tries to process a command starting with '3' and replies to the slave a bunch of -ERR and one +OK. although the slave silently ignores these (prints a log message), this corrupts the replication offset at the slave since the slave increases the replication offset, and the master did not. other than the fix in processInlineBuffer, i did several other improvements while hunting this very rare bug. - when redis replies with "unknown command" it includes a portion of the arguments, not just the command name. so it would be easier to understand what was received, in my case, on the slave side, it was -ERR, but the "arguments" were the interesting part (containing info on the error). - about a year ago i added code in addReplyErrorLength to print the error to the log in case of a reply to master (since this string isn't actually transmitted to the master), now changed that block to print a similar log message to indicate an error being sent from the master to the slave. note that the slave is marked as CLIENT_SLAVE only after PSYNC was received, so this will not cause any harm for REPLCONF, and will only indicate problems that are gonna corrupt the replication stream anyway. - two places where c->reply was emptied, and i wanted to reset sentlen this is a precaution (i did not actually see such a problem), since a non-zero sentlen will cause corruption to be transmitted on the socket.	2018-07-17 12:51:49 +03:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a38737). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node from, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
Jack Drogon	93238575f7	Fix typo	2018-07-03 18:19:46 +02:00
antirez	677d10b2a8	Set repl_down_since to zero on state change. PR #5081 fixes an "interesting" bug about Redis Cluster failover but in general about the updating of repl_down_since, that is used in order to count the time a slave was left disconnected from its master. While the fix provided resolves the specific issue, in general the validity of repl_down_since is limited to states that are different than the state CONNECTED, and the disconnected time is set when the state is DISCONNECTED. However from CONNECTED to other states, the state machine must always go to DISCONNECTED first. So it makes sense to set the field to zero (since it is meaningless in that context) when the state is set to CONNECTED.	2018-07-03 12:42:14 +02:00
WuYunlong	2e167f7d0e	fix server.repl_down_since resetting, so that slaves could failover automatically as expected.	2018-06-30 09:39:08 +08:00
antirez	27178a3fde	Fix type of argslen in sendSynchronousCommand(). Related to #5037.	2018-06-26 14:38:35 +02:00
antirez	1f1e724f47	Remove black space.	2018-06-26 14:37:22 +02:00
Madelyn Olson	45731edc4b	Addressed comments	2018-06-26 00:57:35 +00:00
Madelyn Olson	e8d68b6b72	Fixed replication authentication with whitespace in password	2018-06-26 00:48:37 +00:00
shenlongxing	c85ae56edc	Fix write() errno error	2018-06-06 13:06:42 +02:00
Wander Hillen	dcffca0a31	Fix typos, add some periods	2018-03-16 09:59:14 +01:00
Salvatore Sanfilippo	4aa2ecd98b	Merge pull request #4269 from jianqingdu/unstable fix not call va_end() when syncWrite() failed	2018-01-24 10:55:25 +01:00
antirez	b23927b240	Hopefully more clear comment to explain the change in #4607 .	2018-01-16 15:52:13 +01:00
Oran Agra	689b64c3ad	PSYNC2 fix - promoted slave should hold on to its backlog after a slave is promoted (assuming it has no slaves and it booted over an hour ago), it will lose its replication backlog at the next replication cron, rather than waiting for slaves to connect to it. so on a simple master/slave failover, if the new slave doesn't connect immediately, it may be too late and PSYNC2 will fail.	2018-01-16 10:10:42 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid incurring in errors when modifying one of the two places forgetting the other. The function is also a good place where to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
antirez	4d063bb6ba	PSYNC2: reorganize comments related to recent fixes. Related to PR #4412 and issue #4407.	2017-11-24 11:08:29 +01:00
zhaozhao.zz	6ddf0ea293	PSYNC2: safely free the backlog when the time limit is reached When we free the backlog, we should use a new replication ID and clear the ID2. Without a backlog we cannot increment master_repl_offset even when executing write commands, which may lead to inconsistency when we try to connect to a "slave-before" master (if this master was our slave before, our replid equals the master's replid2). Since the master has our history, we can match the master's replid2 and second_replid_offset, which makes partial sync work, but the data is inconsistent.	2017-11-01 17:32:27 +08:00
antirez	bb3b5ddd19	PSYNC2: More refinements related to #4316 .	2017-09-20 11:28:13 +02:00
zhaozhao.zz	b541ccef25	PSYNC2: make persisting replication info more solid This commit is a reinforcement of commit `c1c99e9`. 1. Replication information can be stored when the RDB file is generated by a master using server.slaveseldb when server.repl_backlog is not NULL, or set repl_stream_db to be -1. That's safe, because a NULL server.repl_backlog will trigger full synchronization, then the master will send a SELECT command to the replication stream. 2. Only do rdbSave* when rsiptr is not NULL, if we do rdbSave* without rdbSaveInfo, the slave will miss repl-stream-db. 3. Save the replication information also in the case of SAVE command, FLUSHALL command and DEBUG reload.	2017-09-20 11:18:10 +02:00
antirez	c1c99e9f4e	PSYNC2: Fix the way replication info is saved/loaded from RDB. This commit attempts to fix a number of bugs reported in #4316. They are related to the way replication info like replication ID, offsets, and currently selected DB in the master client, are stored and loaded by Redis. In order to avoid inconsistencies the changes in this commit try to enforce that: 1. Replication information is only stored when the RDB file is generated by a slave that has a valid 'master' client, so that we can always extract the currently selected DB. 2. When replication information is persisted in the RDB file, all the info for a successful PSYNC or nothing is persisted. 3. The RDB replication information is only loaded if the instance is configured as a slave, otherwise a master can start with IDs that relate to a different history of the data set, and still retain such IDs in the future while receiving unrelated writes.	2017-09-19 23:03:39 +02:00
antirez	b75ae0bbea	PSYNC2: Create backlog on slave partial sync as well. A slave may be started with an RDB file able to provide enough information to perform a successful partial SYNC with its master. However in such a case, as outlined in issue #4268, the slave backlog will not be started, since it was only initialized on full sync attempts. This creates different problems with successive PSYNC attempts that will always result in full synchronizations. Thanks to @fdingiit for discovering the issue.	2017-09-19 10:33:14 +02:00
jianqingdu	498f65ffb7	fix not call va_end when syncWrite() failed fix not call va_end when syncWrite() failed in sendSynchronousCommand()	2017-08-30 21:20:14 -05:00
antirez	469d6e2b37	PSYNC2: fix master cleanup when caching it. The master client cleanup was incomplete: resetClient() was missing and the output buffer of the client was not reset, so pending commands related to the previous connection could be still sent. The first problem caused the client argument vector to be, at times, half populated, so that when the correct replication stream arrived the protocol got mixed to the arguments creating invalid commands that nobody called. Thanks to @yangsiran for also investigating this problem, after already providing important design / implementation hints for the original PSYNC2 issues (see referenced Github issue). Note that this commit adds a new function to the list library of Redis in order to be able to reset a list without destroying it. Related to issue #3899.	2017-04-27 17:08:37 +02:00
antirez	189a12afb4	PSYNC2: discard pending transactions from cached master. During the review of the fix for #3899, @yangsiran identified an implementation bug: given that the offset is now relative to the applied part of the replication log, when we cache a master, the successive PSYNC2 request will be made in order to include the transaction that was not completely processed. This means that we need to discard any pending transaction from our replication buffer: it will be re-executed.	2017-04-19 14:02:52 +02:00
antirez	22be435efe	Fix PSYNC2 incomplete command bug as described in #3899 . This bug was discovered by @kevinmcgehee and constituted a major hidden bug in the PSYNC2 implementation, caused by the propagation from the master of incomplete commands to slaves. The bug had several results: 1. Borrowing from Kevin text in the issue: "Given that slaves blindly copy over their master's input into their own replication backlog over successive read syscalls, it's possible that with large commands or small TCP buffers, partial commands are present in this buffer. If the master were to fail before successfully propagating the entire command to a slave, the slaves will never execute the partial command (since the client is invalidated) but will copy it to replication backlog which may relay those invalid bytes to its slaves on PSYNC2, corrupting the backlog and possibly other valid commands that follow the failover. Simple command boundaries aren't sufficient to capture this, either, because in the case of a MULTI/EXEC block, if the master successfully propagates a subset of the commands but not the EXEC, then the transaction in the backlog becomes corrupt and could corrupt other slaves that consume this data." 2. As identified by @yangsiran later, there is another effect of the bug. For the same mechanism of the first problem, a slave having another slave, could receive a full resynchronization request with an already half-applied command in the backlog. Once the RDB is ready, it will be sent to the slave, and the replication will continue sending to the sub-slave the other half of the command, which is not valid. 
The fix, designed by @yangsiran and @antirez, and implemented by @antirez, uses a secondary buffer in order to feed the sub-masters and update the replication backlog and offsets, only when a given part of the query buffer is actually applied to the state of the instance, that is, when the command gets processed and the command is not pending in the Redis transaction buffer because of CLIENT_MULTI state. Given that now the backlog and offsets representation are in agreement with the actual processed commands, both issue 1 and 2 should no longer be possible. Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in identifying and designing a fix for this problem.	2017-04-19 10:25:45 +02:00
antirez	104584b95e	Fix typo in feedReplicationBacklog() top comment.	2017-04-12 12:28:05 +02:00
antirez	76d87f47c7	Don't leak file descriptor on syncWithMaster(). Close #3804.	2017-02-20 10:18:41 +01:00
antirez	8e390a62ad	Hopefully improve code comments for issue #3616 . This commit also contains other changes in order to conform the code to the Redis core style, specifically 80 chars max per line, smart conditionals in the same line: if (that) do_this();	2016-12-16 17:48:38 +01:00
Salvatore Sanfilippo	ca4ca5073e	Merge pull request #3616 from oranagra/stop_aofrw_before_rdbload CoW improvement, stop AOFRW before flushing and parsing slave RDB	2016-12-16 17:43:20 +01:00
antirez	434e6b2da3	PSYNC2: Do not accept WAIT in slave instances. No longer makes sense since writable slaves only do local writes now: writes are no longer passed to sub-slaves in the stream.	2016-12-02 10:21:20 +01:00
antirez	6eb720ff2d	PSYNC2: Minor memory leak reading -NOMASTERLINK master reply fixed.	2016-11-29 10:25:00 +01:00
antirez	eab865a0a1	PSYNC2: stop sending newlines to sub-slaves when master is down. This actually includes two changes: 1) No newlines to take the master-slave link up when the upstream master is down. Doing this is dangerous because the sub-slave has often received replication protocol for a half-command, so can't receive newlines without desyncing the replication link, even with the code in order to cancel out the bytes that PSYNC2 was using. Moreover this is probably also not needed/sane, because anyway the slave can keep serving requests, and because if it's configured not to serve stale data, it's a good idea, actually, to break the link. 2) When a +CONTINUE with a different ID is received, we now break connection with the sub-slaves: they need to be notified as well. This was part of the original specification but for some reason it was not implemented in the code, and was later found as a PSYNC2 bug in the integration testing.	2016-11-28 17:54:04 +01:00
antirez	e09e31b12e	PSYNC2: on transient error jump to error, not write_error.	2016-11-24 15:48:18 +01:00
antirez	5b7d42fff3	PSYNC2: bugfixing pre release. 1. Master replication offset was cleared after switching configuration to some other slave, since it was assumed you can't PSYNC after a switch. This is not the case anymore and when we successfully PSYNC we need to have our offset untouched. 2. Secondary replication ID was not reset to "000..." pattern at startup. 3. Master in error state replying -LOADING or other transient errors forced the slave to discard the cached master and full resync. This is now fixed. 4. Better logging of what's happening on failed PSYNCs.	2016-11-23 17:36:45 +01:00
Salvatore Sanfilippo	5b83fa482c	Merge pull request #3612 from deep011/unstable fix a possible bug for 'replconf getack'	2016-11-18 10:45:09 +01:00
oranagra	e3a61950a2	when a slave loads an RDB, stop an AOFRW fork before flushing db and parsing rdb file, to avoid a CoW disaster.	2016-11-16 21:30:59 +02:00
deep011	13a92a5bb1	fix a possible bug for 'replconf getack'	2016-11-16 11:04:33 +08:00
antirez	28c96d73b2	PSYNC2: Save replication ID/offset on RDB file. This means that stopping a slave and restarting it will still make it able to PSYNC with the master. Moreover the master itself will retain its ID/offset, in case it gets turned into a slave, or if a slave will try to PSYNC with it with an exactly updated offset (otherwise there is no backlog). This change was possible thanks to PSYNC v2 that makes saving the current replication state much simpler.	2016-11-10 12:35:29 +01:00
antirez	4e5e366ed2	PSYNC2: Wrap debugging code with if(0)	2016-11-09 15:37:15 +01:00
antirez	2669fb8364	PSYNC2: different improvements to Redis replication. The gist of the changes is that now, partial resynchronizations between slaves and masters (without the need of a full resync with RDB transfer and so forth), work in a number of cases when it was impossible in the past. For instance: 1. When a slave is promoted to master, the slaves of the old master can partially resynchronize with the new master. 2. Chained slaves (slaves of slaves) can be moved to replicate to other slaves or the master itself, without requiring a full resync. 3. The master itself, after being turned into a slave, is able to partially resynchronize with the new master, when it joins replication again. In order to obtain this, the following main changes were operated: * Slaves also take a replication backlog, not just masters. * Same stream replication for all the slaves and sub slaves. The replication stream is identical from the top level master to its slaves and is also the same from the slaves to their sub-slaves and so forth. This means that if a slave is later promoted to master, it has the same replication backlog, and can partially resynchronize with its slaves (that were previously slaves of the old master). * A given replication history is no longer identified by the `runid` of a Redis node. There is instead a `replication ID` which changes every time the instance has a new history no longer coherent with the past one. So, for example, slaves publish the same replication history of their master, however when they are turned into masters, they publish a new replication ID, but still remember the old ID, so that they are able to partially resynchronize with slaves of the old master (up to a given offset). * The replication protocol was slightly modified so that a new extended +CONTINUE reply from the master is able to inform the slave of a replication ID change.
* REPLCONF CAPA is used in order to notify masters that a slave is able to understand the new +CONTINUE reply. * The RDB file was extended with an auxiliary field that is able to select a given DB after loading in the slave, so that the slave can continue receiving the replication stream from the point it was disconnected without requiring the master to insert "SELECT" statements. This is useful in order to guarantee the "same stream" property, because the slave must be able to accumulate an identical backlog. * Slave pings to sub-slaves are now sent in a special form, when the top-level master is disconnected, in order not to interfere with the replication stream. We just use out of band "\n" bytes as in other parts of the Redis protocol. An old design document is available here: https://gist.github.com/antirez/ae068f95c0d084891305 However the implementation is not identical to the description because during the work to implement it, different changes were needed in order to make things work well.	2016-11-09 15:37:15 +01:00
charsyam	ca6fc4f031	Simple change just using slaves instead of server.slaves	2016-09-24 15:53:57 +09:00
Qu Chen	d982f44372	Fix a bug to delay bgsave while AOF rewrite in progress for replication	2016-08-02 10:44:33 +02:00
antirez	55385f99de	Ability of slave to announce arbitrary ip/port to master. This feature is useful, especially in deployments using Sentinel in order to setup Redis HA, where the slave is executed with NAT or port forwarding, so that the auto-detected port/ip addresses, as listed in the "INFO replication" output of the master, or as provided by the "ROLE" command, don't match the real addresses at which the slave is reachable for connections.	2016-07-27 17:32:15 +02:00
antirez	03f5b508e5	Replication: when possible start RDB saving ASAP. In a previous commit the replication code was changed in order to centralize the BGSAVE for replication trigger in replicationCron(), however after further testings, the 1 second delay imposed by this change is not acceptable. So now the BGSAVE is only delayed if the AOF rewriting process is active. However past comments made sure that replicationCron() is always able to trigger the BGSAVE when needed, making the code generally more robust. The new code is more similar to the initial @oranagra patch where the BGSAVE was delayed only if an AOF rewrite was in progress. Trivia: delaying the BGSAVE uncovered a minor Sentinel issue that is now fixed.	2016-07-22 17:03:18 +02:00
antirez	780a8b1d76	Replication: start BGSAVE for replication always in replicationCron(). This makes the replication code conceptually simpler by removing the synchronous BGSAVE trigger in syncCommand(). This also means that socket and disk BGSAVE targets are handled by the same code.	2016-07-21 12:10:56 +02:00
antirez	acc2336fd1	Centralize slave replication handshake aborting. Now we have a single function to call in any state of the slave handshake, instead of using different functions for different states which is error prone. Change performed in the context of issue #2479 but does not fix it, since should be functionally identical to the past. Just an attempt to make replication.c simpler to follow.	2015-12-03 10:38:56 +01:00
antirez	ed6228851c	PR 2813 fix ported to unstable.	2015-10-15 10:20:09 +02:00
antirez	252cfa0a39	Lazyfree: cond vars to enabled/disable it based on DEL context.	2015-10-02 15:27:57 +02:00
antirez	c69c6c80fb	Lazyfree: ability to free whole DBs in background.	2015-10-01 13:02:26 +02:00
antirez	1e7153831d	Refactoring: unlinkClient() added to lower freeClient() complexity.	2015-09-30 17:10:03 +02:00
antirez	fdb3be939e	Refactoring: new function to test if client has pending output.	2015-09-30 16:41:48 +02:00
antirez	1c7d87df0c	Avoid installing the client write handler when possible.	2015-09-30 16:29:41 +02:00
antirez	d036abe27d	Log client details on SLAVEOF command having an effect.	2015-08-21 15:29:07 +02:00
antirez	f18e5b634d	startBgsaveForReplication(): handle waiting slaves state change. Before this commit, after triggering a BGSAVE it was up to the caller of startBgsaveForReplication() to handle slaves in WAIT_BGSAVE_START in order to update them accordingly. However when the replication target is the socket, this is not possible since the process of updating the slaves and sending the FULLRESYNC reply must be coupled with the process of starting an RDB save (the reason is, we need to send the FULLSYNC command and spawn a child that will start to send RDB data to the slaves ASAP). This commit moves the responsibility of handling slaves in WAIT_BGSAVE_START to startBgsaveForReplication() so that for both diskless and disk-based replication we have the same chain of responsibility. In order to accommodate such change, the syncCommand() also needs to put the client in the slave list ASAP (just after the initial checks) and not at the end, so that startBgsaveForReplication() can find the new slave already in the list. Another related change is what happens if the BGSAVE fails because of fork() or other errors: we now remove the slave from the list of slaves and send an error, scheduling the slave connection to be terminated. As a side effect of this change the following errors found by Oran Agra are fixed (thanks!): 1. rdbSaveToSlavesSockets() on failed fork will get the slaves cleaned up, otherwise they remain in a wrong state forever since we setup them for full resync before actually trying to fork. 2. updateSlavesWaitingBgsave() with replication target set as "socket" was broken since the function changed the slaves state from WAIT_BGSAVE_START to WAIT_BGSAVE_END via replicationSetupSlaveForFullResync(), so later rdbSaveToSlavesSockets() will not find any slave in the right state (WAIT_BGSAVE_START) to feed.	2015-08-20 17:39:48 +02:00
antirez	bea1259190	slaveTryPartialResynchronization and syncWithMaster: better synergy. It is simpler if removing the read event handler from the FD is up to slaveTryPartialResynchronization, after all it is only called in the context of syncWithMaster. This commit also makes sure that on error all the event handlers are removed from the socket before closing it.	2015-08-07 12:04:37 +02:00
antirez	88c716a0f5	syncWithMaster(): non blocking state machine.	2015-08-06 18:12:20 +02:00
antirez	ce5761e061	startBgsaveForReplication(): log what you really do.	2015-08-06 09:49:38 +02:00
antirez	3e6d4d599a	Replication: add REPLCONF CAPA EOF support. Add the concept of slaves capabilities to Redis, the slave now presents to the Redis master with a set of capabilities in the form: REPLCONF capa SOMECAPA capa OTHERCAPA ... This has the effect of setting slave->slave_capa with the corresponding SLAVE_CAPA macros that the master can test later to understand if the slave will understand certain formats and protocols of the replication process. This makes it much simpler to introduce new replication capabilities in the future in a way that doesn't break old slaves or masters. This patch was designed and implemented together with Oran Agra (@oranagra).	2015-08-06 09:23:23 +02:00
antirez	55ba772703	Fix replication slave pings period. For PINGs we use the period configured by the user, but for the newlines of slaves waiting for an RDB to be created (including slaves waiting for the FULLRESYNC reply) we need to ping with frequency of 1 second, since the timeout is fixed and needs to be refreshed.	2015-08-05 16:49:16 +02:00
antirez	15de6b108b	Make sure we re-emit SELECT after each new slave full sync setup. In previous commits we moved the FULLRESYNC to the moment we start the BGSAVE, so that the offset we provide is the right one. However this also means that we need to re-emit the SELECT statement every time a new slave starts to accumulate the changes. To obtain this effect in a more clean way, the function that sends the FULLRESYNC reply was overloaded with a more important role of also doing this and changing the slave state. So it was renamed to replicationSetupSlaveForFullResync() to better reflect what it does now.	2015-08-05 13:34:46 +02:00
antirez	a5a06a8ecd	Don't send SELECT to slaves in WAIT_BGSAVE_START state.	2015-08-05 11:23:22 +02:00
antirez	62b5c60ead	syncCommand() comments improved.	2015-08-05 08:41:57 +02:00
antirez	292fec058a	PSYNC initial offset fix. This commit attempts to fix a bug involving PSYNC and diskless replication (currently experimental) found by Yuval Inbar from Redis Labs and that was later found to have even more far reaching effects (the bug also exists when diskstore is off). The gist of the bug is that, a Redis master replies with +FULLRESYNC to a PSYNC attempt that fails and requires a full resynchronization. However, the baseline offset sent along with FULLRESYNC was always the current master replication offset. This is not ok, because there are many reasons that may delay the RDB file creation. And... guess what, the master offset we communicate must be the one of the time the RDB was created. So for example: 1) When the BGSAVE for replication is delayed since there is one already but is not good for replication. 2) When the BGSAVE is not needed as we attach one currently ongoing. 3) When because of diskless replication the BGSAVE is delayed. In all the above cases the PSYNC reply is wrong and the slave may reconnect later claiming to need a wrong offset: this may cause data corruption later.	2015-08-04 17:06:10 +02:00
antirez	c1e94b6b9c	Force slaves to resync after unsuccessful PSYNC. Using chained replication where C is slave of B which is in turn slave of A, if B reconnects the replication link with A but discovers it is no longer possible to PSYNC, slaves of B must be disconnected and PSYNC not allowed, since the new B dataset may be completely different after the synchronization with the master. Note that there are various semantical differences in the way this is handled now compared to the past. In the past the semantics was: 1. When a slave lost connection with its master, disconnected the chained slaves ASAP. Which is not needed since after a successful PSYNC with the master, the slaves can continue and don't need to resync in turn. 2. However after a failed PSYNC the replication backlog was not reset, so a slave was able to PSYNC successfully even if the instance did a full sync with its master, containing now an entirely different data set. Now instead chained slaves are not disconnected when the slave loses the connection with its master, but only when it is forced to full SYNC with its master. This means that if the slave having chained slaves does a successful PSYNC all its slaves can continue without troubles. See issue #2694 for more details.	2015-07-28 16:35:02 +02:00
antirez	278ea9d16b	replicationHandleMasterDisconnection() belongs to replication.c.	2015-07-28 14:36:50 +02:00
antirez	32f80e2f1b	RDMF: More consistent define names.	2015-07-27 14:37:58 +02:00
antirez	40eb548a80	RDMF: REDIS_OK REDIS_ERR -> C_OK C_ERR.	2015-07-26 23:17:55 +02:00
antirez	2d9e3eb107	RDMF: redisAssert -> serverAssert.	2015-07-26 15:29:53 +02:00
antirez	14ff572482	RDMF: OBJ_ macros for object related stuff.	2015-07-26 15:28:00 +02:00
antirez	554bd0e7bd	RDMF: use client instead of redisClient, like Disque.	2015-07-26 15:20:52 +02:00
antirez	424fe9afd9	RDMF: redisLog -> serverLog.	2015-07-26 15:17:43 +02:00
antirez	cef054e868	RDMF (Redis/Disque merge friendliness) refactoring WIP 1.	2015-07-26 15:17:18 +02:00
antirez	8366907bed	Use best effort address binding to connect to the master We usually want to reach the master using the address of the interface Redis is bound to (via the "bind" config option). That's useful since the master will get (and publish) the slave address getting the peer name of the incoming socket connection from the slave. However, when this is not possible, for example because the slave is bound to the loopback interface but replicates from a master accessed via an external interface, we want to still connect with the master even from a different interface: in this case it is not really important that the master will provide any other address, while it is vital to be able to replicate correctly. Related to issues #2609 and #2612.	2015-06-11 14:34:38 +02:00
antirez	6c60526db9	Net: improve prepareClientToWrite() error handling and comments. When we fail to setup the write handler it does not make sense to take the client around, it is missing writes: whatever is a client or a slave anyway the connection should terminated ASAP. Moreover what the function does exactly with its return value, and in which case the write handler is installed on the socket, was not clear, so the functions comment are improved to make the goals of the function more obvious. Also related to #2485.	2015-04-01 10:07:45 +02:00
Oran Agra	159875b5a3	fixes to diskless replication. master was closing the connection if the RDB transfer took long time. and also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.	2015-03-31 23:42:08 +03:00
antirez	c3ad70901f	Replication: disconnect blocked clients when switching to slave role. Bug as old as Redis and blocking operations. It's hard to trigger since only happens on instance role switch, but the results are quite bad since an inconsistency between master and slave is created. How to trigger the bug is a good description of the bug itself. 1. Client does "BLPOP mylist 0" in master. 2. Master is turned into slave, that replicates from New-Master. 3. Client does "LPUSH mylist foo" in New-Master. 4. New-Master propagates write to slave. 5. Slave receives the LPUSH, the blocked client get served. Now Master "mylist" key has "foo", Slave "mylist" key is empty. Highlights: * At step "2" above, the client remains attached, basically escaping any check performed during command dispatch: read only slave, in that case. * At step "5" the slave (that was the master), serves the blocked client consuming a list element, which is not consumed on the master side. This scenario is technically likely to happen during failovers, however since Redis Sentinel already disconnects clients using the CLIENT command when changing the role of the instance, the bug is avoided in Sentinel deployments. Closes #2473.	2015-03-24 16:00:09 +01:00
antirez	c5dd686ecb	Replication: put server.master client creation into separated function.	2015-02-04 11:26:20 +01:00
antirez	ce269ad3c5	AnetFormatIP(): renamed, commented, now sticks to IP:port format. A few code style changes + consistent format: not nice for humans but better for parsers.	2014-12-11 18:20:30 +01:00
Matt Stancliff	491881e13b	Cleanup all IP formatting code Instead of manually checking for strchr(n,':') everywhere, we can use our new centralized IP formatting functions.	2014-12-11 10:12:18 -05:00
antirez	1b732c09d0	Network bandwidth tracking + refactoring. Track bandwidth used by clients and replication (but diskless replication is not tracked since the actual transfer happens in the child process). This includes a refactoring that makes tracking new instantaneous metrics simpler.	2014-12-03 12:16:25 +01:00
antirez	bb7fea0d5c	Diskless SYNC: fix RDB EOF detection. RDB EOF detection was relying on the final part of the RDB transfer to be a magic 40 bytes EOF marker. However as the slave is put online immediately, and because of sockets timeouts, the replication stream is actually contiguous with the RDB file. This means that to detect the EOF correctly we should either: 1) Scan all the stream searching for the mark. Sucks CPU-wise. 2) Start to send the replication stream only after an acknowledge. 3) Implement a proper chunked encoding. For now solution "2" was picked, so the master does not start to send ASAP the stream of commands in the case of diskless replication. We wait for the first REPLCONF ACK command from the slave, that certifies us that the slave correctly loaded the RDB file and is ready to get more data.	2014-11-11 17:12:12 +01:00
antirez	f5c6ebbfe3	Disconnect timedout slave: regression introduced with diskless repl.	2014-11-11 15:10:58 +01:00
Matt Stancliff	0014966c1e	Networking: add more outbound IP binding fixes Same as the original bind fixes (we just missed these the first time around). This helps Redis not automatically send connections from the first IP on an interface if we are bound to a specific IP address (e.g. with multiple IP aliases on one interface, you want to send from _your_ IP, not from the first IP on the interface).	2014-10-29 15:09:09 -04:00
antirez	9ec22d9223	Diskless replication: missing listRewind() added. This caused BGSAVE to be triggered a second time without any need when we switch from socket to disk target via the command CONFIG SET repl-diskless-sync no and there is already a slave waiting for the BGSAVE to start. Also comments clarified about what is happening.	2014-10-29 12:48:22 +01:00
antirez	4b8f4b90b9	Log slave ip:port in more log messages.	2014-10-27 12:30:07 +01:00
antirez	8a416ca46e	Added a function to get slave name for logs.	2014-10-27 11:58:20 +01:00
antirez	a27befc495	Diskless replication: log BGSAVE delay only when it is non-zero.	2014-10-27 10:48:39 +01:00
antirez	707352439c	Diskless sync delay is now configurable.	2014-10-27 10:36:30 +01:00
antirez	c4dbc7cdec	Remove duplicated log message about starting BGSAVE.	2014-10-24 10:38:42 +02:00
antirez	456003af25	Diskless replication: less debugging printfs around.	2014-10-17 17:11:48 +02:00
antirez	525c488f63	rio fdset target: handle short writes. While the socket is set in blocking mode, we still can get short writes writing to a socket.	2014-10-17 16:45:53 +02:00
antirez	4b16263bd9	Diskless replication: don't send "\n" pings to slaves. This is useful for normal replication in order to refresh the slave when we are persisting on disk, but for diskless replication the child is already receiving data while in WAIT_BGSAVE_END state.	2014-10-17 10:23:44 +02:00
antirez	25a3d9965e	Diskless replication: remove 40 bytes EOF mark from end of RDB file.	2014-10-17 10:23:11 +02:00
antirez	0c5a06f6bb	Diskless replication: swap inverted branches to compute read len.	2014-10-17 10:22:29 +02:00
antirez	80f7f63b64	Diskless replication: don't enter the read-payload branch forever.	2014-10-17 10:21:18 +02:00
antirez	5ee2ccf48e	Diskless replication: EOF:<mark> streaming support slave side.	2014-10-16 17:09:35 +02:00
antirez	43ae606430	Diskless replication: redis.conf and CONFIG SET/GET support.	2014-10-16 10:22:02 +02:00
antirez	42951ab301	Diskless replication: trigger a BGSAVE after a config change. If we turn from diskless to disk-based replication via CONFIG SET, we need a way to start a BGSAVE if there are slaves already waiting for a BGSAVE to start. Normally with disk-based replication we do it as soon as the previous child exits, but when there is a configuration change via CONFIG SET, we may have slaves in WAIT_BGSAVE_START state without an RDB background process currently active.	2014-10-16 10:15:18 +02:00
antirez	5f8360eb21	Diskless replication flag renamed repl_diskless -> repl_diskless_sync.	2014-10-16 10:00:50 +02:00
antirez	e9e007555e	Diskless replication: trigger diskless RDB transfer if needed.	2014-10-16 09:03:52 +02:00
antirez	3730d118a3	Diskless replication: handle putting the slave online.	2014-10-15 15:31:19 +02:00
antirez	75f0cd6520	Diskless replication: RDB -> slaves transfer draft implementation.	2014-10-14 10:11:29 +02:00
antirez	16546f5aca	Add some comments in syncCommand() to clarify RDB target.	2014-10-10 16:25:58 +02:00
Aaron Rutkovsky	3a82b8ac64	Fix typos Closes #1513	2014-09-29 06:49:07 -04:00
Jan-Erik Rediger	9f98b29cef	Fix typo: ad -> and Closes #1537	2014-09-29 06:49:06 -04:00
antirez	95b1979c32	No more trailing spaces in Redis source code.	2014-06-26 18:48:40 +02:00
antirez	7970d53997	ROLE command: array len fixed for slave output.	2014-06-21 11:17:18 +02:00
antirez	6a13193d8f	ROLE output improved for slaves. Info about the replication state with the master added.	2014-06-07 17:38:20 +02:00
antirez	d34c2fa3bb	ROLE command added. The new ROLE command is designed in order to provide a client with informations about the replication in a fast and easy to use way compared to the INFO command where the same information is also available.	2014-06-07 17:27:49 +02:00
antirez	0bcc7cb4bf	CLIENT LIST speedup via peerid caching + smart allocation. This commit adds peer ID caching in the client structure plus an API change and the use of sdsMakeRoomFor() in order to improve the reallocation pattern to generate the CLIENT LIST output. Both the changes account for a very significant speedup.	2014-04-28 17:36:57 +02:00
antirez	970de3e9c0	Check for EAGAIN in sendBulkToSlave(). Sometime an osx master with a Linux server over a slow link caused a strange error where osx called the writable function for the socket but actually apparently there was no room in the socket buffer to accept the write: write(2) call returned an EAGAIN error, that was not checked, so we considered write(2) == 0 always as a connection reset, which was unfortunate since the bulk transfer has to start again. Also more errors are logged with the WARNING level in the same code path now.	2014-02-05 16:38:10 +01:00
antirez	6f54032080	Cluster: function clusterGetSlaveRank() added. Return the number of slaves for the same master having a better replication offset of the current slave, that is, the slave "rank" used to pick a delay before the request for election.	2014-01-29 16:39:04 +01:00
antirez	abd6308d27	Set server.repl_down_since to 0 when changing master. When an instance is potentially set to replicate with another master, it is conceptually disconnected forever, since we have no old copy of the dataset for this master in memory.	2014-01-17 18:20:31 +01:00
antirez	90a81b4ebb	Don't send REPLCONF ACK to old masters. Masters not understanding REPLCONF ACK will reply with errors to our requests causing a number of possible issues. This commit detects a global replication offset set to -1 at the end of the replication, and marks the client representing the master with the REDIS_PRE_PSYNC flag. Note that this flag was called REDIS_PRE_PSYNC_SLAVE but now it is just REDIS_PRE_PSYNC as it is used for both slaves and masters starting with this commit. This commit fixes issue #1488.	2014-01-08 14:28:16 +01:00
antirez	3f92e05637	Clarify a comment in slaveTryPartialResynchronization().	2014-01-08 14:28:13 +01:00
antirez	94e8c9e77e	Make new masters inherit replication offsets. Currently replication offsets could be used into a limited way in order to understand, out of a set of slaves, what is the one with the most updated data. For example this comparison is possible of N slaves were replicating all with the same master. However the replication offset was not transferred from master to slaves (that are later promoted as masters) in any way, so for instance if there were three instances A, B, C, with A master and B and C replication from A, the following could happen: C disconnects from A. B is turned into master. A is switched to master of B. B receives some write. In this context there was no way to compare the offset of A and C, because B would use its own local master replication offset as replication offset to initialize the replication with A. With this commit what happens is that when B is turned into master it inherits the replication offset from A, making A and C comparable. In the above case assuming no inconsistencies are created during the disconnection and failover process, A will show to have a replication offset greater than C. Note that this does not mean offsets are always comparable to understand what is, in a set of instances, since in more complex examples the replica with the higher replication offset could be partitioned away when picking the instance to elect as new master. However this in general improves the ability of a system to try to pick a good replica to promote to master.	2013-12-22 11:43:25 +01:00
antirez	11120689c4	Slaves heartbeats during sync improved. The previous fix for false positive timeout detected by master was not complete. There is another blocking stage while loading data for the first synchronization with the master, that is, flushing away the current data from the DB memory. This commit uses the newly introduced dict.c callback in order to make some incremental work (to send "\n" heartbeats to the master) while flushing the old data from memory. It is hard to write a regression test for this issue unfortunately. More support for debugging in the Redis core would be needed in terms of functionalities to simulate a slow DB loading / deletion.	2013-12-10 18:47:31 +01:00
antirez	2eb781b35b	dict.c: added optional callback to dictEmpty(). Redis hash table implementation has many non-blocking features like incremental rehashing, however while deleting a large hash table there was no way to have a callback called to do some incremental work. This commit adds this support, as an optional callback argument to dictEmpty() that is currently called at a fixed interval (one time every 65k deletions).	2013-12-10 18:46:24 +01:00
antirez	2c4ab8a534	Log empty DB + Loading data into two separated messages.	2013-12-10 18:43:25 +01:00
antirez	11e81a1e9a	Fixed grammar: before H the article is a, not an.	2013-12-05 16:35:32 +01:00
antirez	c5618e7fdd	WAIT command: synchronous replication for Redis.	2013-12-04 16:20:03 +01:00
antirez	b2f834390c	Log to what master a slave is going to connect to.	2013-11-11 09:25:36 +01:00
antirez	1461422ce6	Replication: install the write handler when reusing a cached master. Sometimes when we resurrect a cached master after a successful partial resynchronization attempt, there is pending data in the output buffers of the client structure representing the master (likely REPLCONF ACK commands). If we don't reinstall the write handler, it will never be installed again by addReply*() family functions as they'll assume that if there is already data pending, the write handler is already installed. This bug caused some slaves after a successful partial sync to never send REPLCONF ACK, and continuously being detected as timing out by the master, with a disconnection / reconnection loop.	2013-10-04 16:12:25 +02:00
antirez	37e06bd952	PSYNC: safer handling of PSYNC requests. There was a bug that overestimated the amount of backlog available, however this could only happen when a slave was asking for an offset that was in the "future" compared to the master replication backlog. Now this case is handled well and logged as an incident in the master log file.	2013-10-04 12:25:09 +02:00
antirez	707ff0f714	Make clear that runids are not cluster node IDs.	2013-09-30 11:48:09 +02:00
Maxim Zakharov	70e82e5c79	A mistype fixed	2013-09-03 15:15:48 +02:00
antirez	c06de115af	replicationFeedSlaves() func name typo: feedReplicationBacklogWithObject -> feedReplicationBacklog.	2013-08-12 12:50:45 +02:00
antirez	dcc48a8143	replicationFeedSlave() reworked for correctness and speed. The previous code using a static buffer as an optimization was lame: 1) Premature optimization, actually it was slower than naive code because resulted into the creation / destruction of the object encapsulating the output buffer. 2) The code was very hard to test, since it was needed to have specific tests for command lines exceeding the size of the static buffer. 3) As a result of "2" the code was bugged as the current tests were not able to stress specific corner cases. It was replaced with easy to understand code that is safer and faster.	2013-08-12 12:50:29 +02:00
antirez	aa05128f51	Fix a PSYNC bug caused by a variable name typo.	2013-08-12 11:51:35 +02:00
antirez	89ffba9133	Replication: better way to send a preamble before RDB payload. During the replication full resynchronization process, the RDB file is transferred from the master to the slave. However there is a short preamble to send, that is currently just the bulk payload length of the file in the usual Redis form $..length..<CR><LF>. This preamble used to be sent with a direct write call, assuming that there was always room in the socket output buffer to hold the few bytes needed, however this does not scale in case we'll need to send more stuff, and is not very robust code in general. This commit introduces a more general mechanism to send a preamble up to 2GB in size (the max length of an sds string) in a non blocking way.	2013-08-12 10:29:14 +02:00
antirez	c151eb6d92	Fix replicationFeedSlaves() off-by-one bug. This fixes issue #1221.	2013-07-28 12:49:34 +02:00
antirez	a31693417d	Fix replicationFeedSlaves() to use sdsEncodedObject() macro.	2013-07-22 10:36:27 +02:00
Ted Nyman	f39a0bdb77	Make sure the log standardizes on 'timeout'	2013-07-12 14:06:27 -07:00
antirez	d1cbad6d14	Use getClientPeerId() for MONITOR implementation.	2013-07-09 16:21:21 +02:00
antirez	90038906f4	Fix old anetPeerToString() API call in replication.c	2013-07-08 16:11:52 +02:00
Geoff Garside	ee5a6df101	Update calls to anetPeerToString to include ip_len.	2013-07-08 15:57:22 +02:00
antirez	8ca265cdb7	Don't disconnect pre PSYNC replication clients for timeout. Clients using SYNC to replicate are older implementations, such as redis-cli --slave, and are not designed to acknowledge the master with REPLCONF ACK commands, so we don't have any feedback and should not disconnect them on timeout.	2013-06-26 10:11:20 +02:00
antirez	f0bf5fd8c7	Use the RSC to replicate EVALSHA unmodified. This commit uses the Replication Script Cache in order to avoid translating EVALSHA into EVAL whenever possible for both the AOF and slaves.	2013-06-24 18:57:31 +02:00
antirez	94ec7db470	Replication of scripts as EVALSHA: sha1 caching implemented. This code is only responsible to take an LRU-evicted fixed length cache of SHA1 that we are sure all the slaves received. In this commit only the implementation is provided, but the Redis core does not use it to actually send EVALSHA to slaves when possible.	2013-06-24 10:26:04 +02:00
antirez	1a54d5963e	Refresh good slaves count when setting slave state as online.	2013-05-30 12:13:25 +02:00
antirez	ed599d3aca	min-slaves-to-write: don't accept writes with less than N replicas. This feature allows the user to specify the minimum number of connected replicas having a lag less or equal than the specified amount of seconds for writes to be accepted.	2013-05-30 11:30:04 +02:00
antirez	3c82c85fcf	Close connection with timedout slaves. Now masters, using the time at which the last REPLCONF ACK was received, are able to explicitly disconnect slaves that are no longer responding. Previously the only chance was to see a very long output buffer, that was highly suboptimal.	2013-05-27 11:42:42 +02:00
antirez	e06a560466	Send ACK to master once every second. ACKs can be also used as a base for synchronous replication. However in that case they'll be explicitly requested by the master when the client sends a request that needs to be replicated synchronously.	2013-05-27 11:42:38 +02:00
antirez	efd87031d0	Don't ACK the master after every command. Sending an ACK is now moved into the replicationSendAck() function.	2013-05-27 11:42:35 +02:00
antirez	dd0adbb777	Make sure that REPLCONF ACK really has no return value.	2013-05-27 11:42:30 +02:00
antirez	6b4635f4f5	REPLCONF ACK command. This special command is used by the slave to inform the master the amount of replication stream it currently consumed. it does not return anything so that we not need to consume additional bandwidth needed by the master to reply something. The master can do a number of things knowing the amount of stream processed, such as understanding the "lag" in bytes of the slave, verify if a given command was already processed by the slave, and so forth.	2013-05-27 11:42:17 +02:00
antirez	b7d085fc0d	Cluster: SLAVEOF command not allowed in cluster mode.	2013-03-05 12:39:41 +01:00
antirez	3be893123f	Make sure replicationSetMaster() works when ip argument is not an sds.	2013-03-04 15:39:55 +01:00
antirez	7bead003e2	SLAVEOF command refactored into a proper API. We now have replicationSetMaster() and replicationUnsetMaster() that can be called in other contexts (for instance Redis Cluster).	2013-03-04 13:22:21 +01:00
antirez	f9b5ca29fd	Use GCC printf format attribute for redisLog(). This commit also fixes redisLog() statements producing warnings.	2013-02-27 12:27:15 +01:00
antirez	072c91fe13	PSYNC: another change to unexpected reply from PSYNC.	2013-02-13 18:43:40 +01:00
antirez	0e1be5347b	PSYNC: More robust handling of unexpected reply to PSYNC.	2013-02-13 18:33:33 +01:00
antirez	3419c8ce70	Replication: more strict error checking for master PING reply.	2013-02-12 16:53:27 +01:00
antirez	24f258360b	Replication: added new stats counting full and partial resynchronizations.	2013-02-12 15:33:54 +01:00
antirez	3af478e9ef	PSYNC: debugging printf() calls are now logs at DEBUG level.	2013-02-12 12:52:22 +01:00
antirez	89b48f0825	Remove harmless warning in slaveTryPartialResynchronization().	2013-02-12 12:52:21 +01:00
antirez	0ed6daa48b	PSYNC: don't use the client buffer to send +CONTINUE and +FULLRESYNC. When we are preparing an handshake with the slave we can't touch the connection buffer as it'll be used to accumulate differences between the sent RDB file and what arrives next from clients. So in short we can't use addReply() family functions. However we just use write(2) because we know that the socket buffer is empty, since a prerequisite for SYNC to work is that the static buffer and the output list are empty, and in general it is not expected that a client SYNCs after doing some heavy I/O with the master. However a short write connection is explicitly handled to avoid fragility (we simply close the connection and the slave will retry).	2013-02-12 12:52:21 +01:00
antirez	d2a0348a49	SYNC not allowed with pending data on the static output buffer.	2013-02-12 12:52:21 +01:00
antirez	da315d3325	Log the unexpected string received in place of the SYNC payload length.	2013-02-12 12:52:21 +01:00
antirez	41d64a7516	After SLAVEOF <newslave> don't allow chained slaves to PSYNC.	2013-02-12 12:52:21 +01:00
antirez	078882025e	PSYNC: work in progress, preview #2 , rebased to unstable.	2013-02-12 12:52:21 +01:00
antirez	e34a35a511	Use the new unified protocol to send SELECT to slaves. SELECT was still transmitted to slaves using the inline protocol, that is conceived mostly for humans to type into telnet sessions, and is notably not understood by redis-cli --slave. Now the new protocol is used instead.	2013-02-12 12:50:28 +01:00
antirez	4b83ad4e1f	Use replicationFeedSlaves() to send PING to slaves. A Redis master sends PING commands to slaves from time to time: doing this ensures that even if absence of writes, the master->slave channel remains active and the slave can feel the master presence, instead of closing the connection for timeout. This commit changes the way PINGs are sent to slaves in order to use the standard interface used to replicate all the other commands, that is, the function replicationFeedSlaves(). With this change the stream of commands sent to every slave is exactly the same regardless of their exact state (Transferring RDB for first synchronization or slave already online). With the previous implementation the PING was only sent to online slaves, with the result that the output stream from master to slaves was not identical for all the slaves: this is a problem if we want to implement partial resyncs in the future using a global replication stream offset. TL;DR: this commit should not change the behaviour in practical terms, but is just something in preparation for partial resynchronization support.	2013-02-12 12:50:28 +01:00
antirez	7465ac7ab1	Emit SELECT to slaves in a centralized way. Before this commit every Redis slave had its own selected database ID state. This was not actually useful as the emitted stream of commands is identical for all the slaves. Now the the currently selected database is a global state that is set to -1 when a new slave is attached, in order to force the SELECT command to be re-emitted for all the slaves. This change is useful in order to implement replication partial resynchronization in the future, as makes sure that the stream of commands received by slaves, including SELECT commands, are exactly the same for every slave connected, at any time. In this way we could have a global offset that can identify a specific piece of the master -> slaves stream of commands.	2013-02-12 12:50:28 +01:00
antirez	a6c2f9012f	Make all WATCHers dirty when the slave reloads the DB.	2013-02-08 10:26:19 +01:00
antirez	b70b459b0e	TCP_NODELAY after SYNC: changes to the implementation.	2013-02-05 12:04:30 +01:00
charsyam	c85647f354	Turn off TCP_NODELAY on the slave socket after SYNC. Further details from @antirez: It was reported by @StopForumSpam on Twitter that the Redis replication link was strangely using multiple TCP packets for multiple commands. This wastes a lot of bandwidth and is due to the TCP_NODELAY option we enable on the socket after accepting a new connection. However the master -> slave channel is a one-way channel since Redis replication is asynchronous, so there is no point in trying to reduce the latency, we should aim to reduce the bandwidth. For this reason this commit introduces the ability to disable the nagle algorithm on the socket after a successful SYNC. This feature is off by default because the delay can be up to 40 milliseconds with normally configured Linux kernels.	2013-02-05 12:04:25 +01:00
guiquanz	9d09ce3981	Fixed many typos.	2013-01-19 10:59:44 +01:00
antirez	ef99e146a8	Undo slave-master handshake when SLAVEOF sets a new slave. Issue #828 shows how Redis was not correctly undoing a non-blocking connection attempt with the previous master when the master was set to a new address using the SLAVEOF command. This was also a result of lack of refactoring, so now there is a function to cancel the non blocking handshake with the master. The new function is now used when SLAVEOF NO ONE is called or when SLAVEOF is used to set the master to a different address.	2013-01-15 13:33:24 +01:00
antirez	d7740fc8f3	Better error reporting when fd event creation fails.	2013-01-03 14:29:34 +01:00
antirez	f1481d4a03	serverCron() frequency is now a runtime parameter (was REDIS_HZ). REDIS_HZ is the frequency our serverCron() function is called with. A more frequent call to this function results into less latency when the server is trying to handle very expensive background operations like mass expires of a lot of keys at the same time. Redis 2.4 used to have an HZ of 10. This was good enough with almost every setup, but the incremental key expiration algorithm was working a bit better under extreme pressure when HZ was set to 100 for Redis 2.6. However for most users a latency spike of 30 milliseconds when millions of keys are expiring at the same time is acceptable, on the other hand a default HZ of 100 in Redis 2.6 was causing idle instances to use some CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3% for an idle instance, however this is a shame as more energy is consumed by the server, if not important resources. This commit introduces HZ as a runtime parameter, that can be queried by INFO or CONFIG GET, and can be modified with CONFIG SET. At the same time the default frequency is set back to 10. In this way we default to a sane value of 10, but allows users to easily switch to values up to 500 for near real-time applications if needed and if they are willing to pay this small CPU usage penalty.	2012-12-14 17:10:40 +01:00
antirez	4365e5b2d3	BSD license added to every C source and header file.	2012-11-08 18:31:32 +01:00
antirez	2ea41242f6	Unix socket clients properly displayed in MONITOR and CLIENT LIST. This also fixes issue #745.	2012-11-01 22:10:45 +01:00
antirez	f0b9f80345	"Timeout receiving bulk data" error message modified. The new message now contains an hint about modifying the repl-timeout configuration directive if the problem persists. This should normally not be needed, because while the master generates the RDB file it makes sure to send newlines to the replication channel to prevent timeouts. However there are times when masters running on very slow systems can completely stop for seconds during the RDB saving process. In such a case enlarging the timeout value can fix the problem. See issue #695 for an example of this problem in an EC2 deployment.	2012-10-04 11:52:16 +02:00
antirez	d310fbedab	Fix compilation on FreeBSD. Thanks to @koobs on twitter.	2012-09-17 12:46:06 +02:00
Salvatore Sanfilippo	24bc807b5c	Merge pull request #576 from saj/fix-slave-ping-period Bug fix: slaves being pinged every second	2012-09-05 06:59:37 -07:00
antirez	bb66fc3120	Send an async PING before starting replication with master. During the first synchronization step of the replication process, a Redis slave connects with the master in a non blocking way. However once the connection is established the replication continues sending the REPLCONF command, and sometimes the AUTH command if needed. Those commands are sent in a partially blocking way (blocking with timeout in the order of seconds). Because it is common for a blocked master to accept connections even if it is actually not able to reply to the slave requests, it was easy for a slave to block if the master had serious issues, but was still able to accept connections in the listening socket. For this reason we now send an asynchronous PING request just after the non blocking connection ended in a successful way, and wait for the reply before continuing with the replication process. It is very unlikely that a master replying to PING can't reply to the other commands. This solution was proposed by Didier Spezia (Thanks!) so that we don't need to turn all the replication process into a non blocking affair, but still the probability of a slave blocked is minimal even in the event of a failing master. Also we now use getsockopt(SO_ERROR) in order to check errors ASAP in the event handler, instead of waiting for actual I/O to return an error. This commit fixes issue #632.	2012-09-02 12:24:38 +02:00
antirez	784b93087c	Incrementally flush RDB on disk while loading it from a master. This fixes issue #539. Basically if there is enough free memory the OS may buffer the RDB file that the slave transfers on disk from the master. The file may actually be flushed on disk at once by the operating system when it gets closed by Redis, causing the close system call to block for a long time. This patch is a modified version of one provided by yoav-steinberg of @garantiadata (the original version was posted in the issue #539 comments), and tries to flush the OS buffers incrementally (every 8 MB of loaded data).	2012-08-28 12:47:33 +02:00
Saj Goonatilleke	9edfe63553	Bug fix: slaves being pinged every second REDIS_REPL_PING_SLAVE_PERIOD controls how often the master should transmit a heartbeat (PING) to its slaves. This period, which defaults to 10, is measured in seconds. Redis 2.4 masters used to ping their slaves every ten seconds, just like it says on the tin. The Redis 2.6 masters I have been experimenting with, on the other hand, ping their slaves every second. (master_last_io_seconds_ago never approaches 10.) I think the ping period was inadvertently slashed to one-tenth of its nominal value around the time REDIS_HZ was introduced. This commit reintroduces correct ping schedule behaviour.	2012-07-05 14:29:27 +10:00
antirez	36def8fd9a	Typo in comment.	2012-06-27 11:26:44 +02:00
antirez	3a32897856	REPLCONF internal command introduced. The REPLCONF command is an internal command (not designed to be directly used by normal clients) that allows a slave to set some replication related state in the master before issuing SYNC to start the replication. The initial motivation for this command, and the only reason currently it is used by the implementation, is to let the slave instance communicate its listening port to the master, so that the master can show all the slaves with their listening ports in the "replication" section of the INFO output. This allows clients to auto discover and query all the slaves attached into a master. Currently only a single option of the REPLCONF command is supported, and it is called "listening-port", so the slave now starts the replication process with something like the following chat: REPLCONF listening-port 6380 SYNC Note that this works even if the master is an older version of Redis and does not understand REPLCONF, because the slave ignores the REPLCONF error. In the future REPLCONF can be used for partial replication and other replication related features where there is the need to exchange information between master and slave. NOTE: This commit also fixes a bug: the INFO output already carried information about slaves, but the port was broken, and was obtained with getpeername(2), so it was actually just the ephemeral port used by the slave to connect to the master as a client.	2012-06-27 09:43:57 +02:00
antirez	ef37997608	Dead code removed from replication.c. The user @jokea noticed that the following line of code into replication.c made little sense: addReplySds(slave,sdsempty()); Investigating a bit I found that this was introduced by commit `6208b3a7` three years ago in the early stages of Redis. The code apparently is not useful at all, so I'm removing it. This change will not be backported into 2.4 so that in the rare case this should introduce a bug, we'll have a chance to detect it into the development branch. However following the code path it seems like the code is not useful at all, so the risk is truly small.	2012-05-24 11:35:21 +02:00
antirez	299290d3a4	Remove useless trailing space in SYNC command sent to master.	2012-05-02 21:47:53 +02:00
David Tran	31788f50b7	Spelling: s/synchrnonization/synchronization	2012-04-25 12:21:56 -07:00
antirez	9157549fad	syncio.c calls in replication.c fixed for the new millisecond timeout API.	2012-03-31 11:23:30 +02:00
antirez	c2672a06cd	Purely aesthetic code change.	2012-03-30 10:39:34 +02:00
Joseph Jang	f892797e1b	Fixed a memory leak with replication occurs when two or more dbs are replicated and at least one of them is >db10	2012-03-30 10:34:29 +02:00
antirez	179e54d2a9	Fix for slaves chains. Force resync of slaves (simply disconnecting them) when SLAVEOF turns a master into a slave.	2012-03-29 09:24:02 +02:00
Premysl Hruby	d194905449	use server.unixtime instead of time(NULL) where possible (cluster.c not checked though)	2012-03-27 17:39:58 +02:00

381 Commits