redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
antirez	0e9b5adbd3	Replication: clarify why repl_put_online_on_ack exists at all.	2019-08-05 17:38:15 +02:00
antirez	7c9f6ebc8f	Diskless replica: fix disklessLoadRestoreBackups() bug.	2019-07-10 12:36:26 +02:00
antirez	3bbb9a1413	Diskless replica: refactoring of DBs backups.	2019-07-10 11:42:26 +02:00
antirez	81b18fa3a0	Diskless replica: a few aesthetic changes to replication.c.	2019-07-08 18:32:47 +02:00
Oran Agra	2de544cfcc	diskless replication on slave side (don't store rdb to file), plus some other related fixes The implementation of the diskless replication was currently diskless only on the master side. The slave side was still storing the received rdb file to the disk before loading it back in and parsing it. This commit adds two modes to load rdb directly from socket: 1) when-empty 2) using "swapdb" the third mode of using diskless slave by flushdb is risky and currently not included. other changes: -------------- distinguish between aof configuration and state so that we can re-enable aof only when sync eventually succeeds (and not when exiting from readSyncBulkPayload after a failed attempt) also a CONFIG GET and INFO during rdb loading would have lied When loading rdb from the network, don't kill the server on short read (that can be a network error) Fix rdb check when performed on preamble AOF tests: run replication tests for diskless slave too make replication test a bit more aggressive Add test for diskless load swapdb	2019-07-08 15:37:48 +03:00
antirez	074d24df1e	Narrow the effects of PR #6029 to the exact state. CLIENT PAUSE may be used, in other contexts, for a long time making all the slaves time out. Better for now to be more specific about what should disable sending PINGs. An alternative to that would be to virtually refresh the slave interactions when clients are paused, however for now I went for this more conservative solution.	2019-05-15 12:16:43 +02:00
Salvatore Sanfilippo	caf74e507e	Merge pull request #6029 from chendq8/clientpause fix cluster failover time out	2019-05-15 12:03:19 +02:00
chendianqiang	11f2c6b115	stop ping when client pause	2019-04-17 21:20:10 +08:00
Salvatore Sanfilippo	fcac342955	Merge pull request #3830 from oranagra/diskless_capa_pr several bugfixes to diskless replication	2019-03-22 17:41:40 +01:00
antirez	b3408e9a9b	More sensible name for function: restartAOFAfterSYNC(). Related to #3829.	2019-03-21 17:21:29 +01:00
antirez	9588fd52ac	Mostly aesthetic changes to restartAOF(). See #3829.	2019-03-21 17:18:24 +01:00
Oran Agra	b2e03f8329	diskless replication - notify slave when rdb transfer failed in diskless replication - master was not notifying the slave that rdb transfer terminated on error, and lets slave wait for replication timeout	2019-03-20 17:46:19 +02:00
oranagra	c9e2900efc	bugfix to restartAOF, exit will never happen since retry will get negative. also reduce an excess sleep	2019-03-20 17:20:07 +02:00
antirez	14b17c3615	replicaofCommand() refactoring: stay into 80 cols.	2019-03-18 11:34:40 +01:00
antirez	8a46d32be2	Make comment in #5911 stay inside 80 cols.	2019-03-10 09:48:06 +01:00
John Sully	5b52bc738b	Replicas aren't allowed to run the replicaof command	2019-03-09 11:04:48 -05:00
zhaozhao.zz	ea9d3aefec	ACL: add masteruser configuration for replication In mostly production environment, normal user's behavior should be limited. Now in redis ACL mechanism we can do it like that: user default on +@all ~* -@dangerous nopass user admin on +@all ~* >someSeriousPassword Then the default normal user can not execute dangerous commands like FLUSHALL/KEYS. But some admin commands are in dangerous category too like PSYNC, and the configurations above will forbid replica from sync with master. Finally I think we could add a new configuration for replication, it is masteruser option, like this: masteruser admin masterauth someSeriousPassword Then replica will try AUTH admin someSeriousPassword and get privilege to execute PSYNC. If masteruser is NULL, replica would AUTH with only masterauth like before.	2019-02-12 17:12:37 +08:00
ArkayZheng	76f20729fc	Fix the output bug in rename exceptions.	2019-01-25 21:48:23 +08:00
antirez	4dc69497f5	Refactoring: always kill AOF/RDB child via helper functions.	2019-01-21 11:28:44 +01:00
Salvatore Sanfilippo	adfaf548e3	Merge branch 'unstable' into fixChildInfoPipeFdLeak	2019-01-21 11:20:56 +01:00
Salvatore Sanfilippo	9f939610f3	Merge pull request #5797 from trevor211/fixUpdateDictResizePolicy Fix update dict resize policy	2019-01-21 11:14:48 +01:00
WuYunlong	440385de14	Fix child info pipe fd leak when child process gets killed.	2019-01-21 17:48:45 +08:00
WuYunlong	f004a3e7ff	Update dict resize policy when rdb child process gets killed.	2019-01-21 17:33:18 +08:00
antirez	2c66c525f9	ACL: configure the master connection without user.	2019-01-17 18:33:36 +01:00
antirez	709a6612eb	RESP3: addReplyString() -> addReplyProto(). The function naming was totally nuts. Let's fix it as we break PRs anyway with RESP3 refactoring and changes.	2019-01-09 17:00:30 +01:00
antirez	07bce54093	RESP3: Use new deferred len API in replication.c.	2019-01-09 17:00:29 +01:00
antirez	06a4acb7d3	When replica kills a pending RDB save during SYNC, log it. This logs what happens in the context of the fix in PR #5367.	2018-10-31 11:47:10 +01:00
Salvatore Sanfilippo	6204d8c139	Merge pull request #5367 from nUl1/fullresync-stopbgsave Prevent RDB autosave from overwriting full resync results	2018-10-31 11:42:04 +01:00
antirez	3d07ed983e	Fix typo in replicationCron() comment.	2018-10-05 18:30:45 +02:00
Andrey Bugaevskiy	466c277b4f	Move child termination to readSyncBulkPayload	2018-09-27 19:38:58 +03:00
Andrey Bugaevskiy	98a64523c4	Prevent RDB autosave from overwriting full resync results During the full database resync we may still have unsaved changes on the receiving side. This causes a race condition between synced data rename/load and the rename of rdbSave tempfile.	2018-09-19 19:58:39 +03:00
antirez	61b7a176ef	Slave removal: replication.c logs fixed.	2018-09-11 15:32:28 +02:00
antirez	ef2c7a5bbb	Slave removal: SLAVEOF -> REPLICAOF. SLAVEOF is now an alias.	2018-09-11 15:32:28 +02:00
Oran Agra	d55598988b	fix rare replication stream corruption with disk-based replication The slave sends \n keepalive messages to the master while parsing the rdb, and later sends REPLCONF ACK once a second. rarely, the master receives both a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n... and it tries to trim two chars (\r\n) from the query buffer, trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n... then the master tries to process a command starting with '3' and replies to the slave a bunch of -ERR and one +OK. although the slave silently ignores these (prints a log message), this corrupts the replication offset at the slave since the slave increases the replication offset, and the master did not. other than the fix in processInlineBuffer, i did several other improvements while hunting this very rare bug. - when redis replies with "unknown command" it includes a portion of the arguments, not just the command name. so it would be easier to understand what was received, in my case, on the slave side, it was -ERR, but the "arguments" were the interesting part (containing info on the error). - about a year ago i added code in addReplyErrorLength to print the error to the log in case of a reply to master (since this string isn't actually transmitted to the master), now changed that block to print a similar log message to indicate an error being sent from the master to the slave. note that the slave is marked as CLIENT_SLAVE only after PSYNC was received, so this will not cause any harm for REPLCONF, and will only indicate problems that are gonna corrupt the replication stream anyway. - two places where c->reply was emptied, and i wanted to reset sentlen this is a precaution (i did not actually see such a problem), since a non-zero sentlen will cause corruption to be transmitted on the socket.	2018-07-17 12:51:49 +03:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a38737). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node from, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
Jack Drogon	93238575f7	Fix typo	2018-07-03 18:19:46 +02:00
antirez	677d10b2a8	Set repl_down_since to zero on state change. PR #5081 fixes an "interesting" bug about Redis Cluster failover but in general about the updating of repl_down_since, that is used in order to count the time a slave was left disconnected from its master. While the fix provided resolves the specific issue, in general the validity of repl_down_since is limited to states that are different than the state CONNECTED, and the disconnected time is set when the state is DISCONNECTED. However from CONNECTED to other states, the state machine must always go to DISCONNECTED first. So it makes sense to set the field to zero (since it is meaningless in that context) when the state is set to CONNECTED.	2018-07-03 12:42:14 +02:00
WuYunlong	2e167f7d0e	fix server.repl_down_since resetting, so that slaves could failover automatically as expected.	2018-06-30 09:39:08 +08:00
antirez	27178a3fde	Fix type of argslen in sendSynchronousCommand(). Related to #5037.	2018-06-26 14:38:35 +02:00
antirez	1f1e724f47	Remove black space.	2018-06-26 14:37:22 +02:00
Madelyn Olson	45731edc4b	Addressed comments	2018-06-26 00:57:35 +00:00
Madelyn Olson	e8d68b6b72	Fixed replication authentication with whitespace in password	2018-06-26 00:48:37 +00:00
shenlongxing	c85ae56edc	Fix write() errno error	2018-06-06 13:06:42 +02:00
Salvatore Sanfilippo	4aa2ecd98b	Merge pull request #4269 from jianqingdu/unstable fix not call va_end() when syncWrite() failed	2018-01-24 10:55:25 +01:00
antirez	b23927b240	Hopefully more clear comment to explain the change in #4607 .	2018-01-16 15:52:13 +01:00
Oran Agra	689b64c3ad	PSYNC2 fix - promoted slave should hold on to its backlog after a slave is promoted (assuming it has no slaves and it booted over an hour ago), it will lose its replication backlog at the next replication cron, rather than waiting for slaves to connect to it. so on a simple master/slave failover, if the new slave doesn't connect immediately, it may be too late and PSYNC2 will fail.	2018-01-16 10:10:42 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid incurring in errors when modifying one of the two places forgetting the other. The function is also a good place where to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
antirez	4d063bb6ba	PSYNC2: reorganize comments related to recent fixes. Related to PR #4412 and issue #4407.	2017-11-24 11:08:29 +01:00
zhaozhao.zz	6ddf0ea293	PSYNC2: safe free backlog when reach the time limit When we free the backlog, we should use a new replication ID and clear the ID2. Since without backlog we can not increment master_repl_offset even do write commands, that may lead to inconsistency when we try to connect a "slave-before" master (if this master is our slave before, our replid equals the master's replid2). As the master have our history, so we can match the master's replid2 and second_replid_offset, that make partial sync work, but the data is inconsistent.	2017-11-01 17:32:27 +08:00


253 Commits
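
The trimming bug described in commit d55598988b above can be illustrated with a minimal sketch. This is not the Redis source: the function names `consumed_buggy` and `consumed_fixed` are hypothetical, and the code only models how many bytes an inline-protocol parser would drop from the front of the query buffer after finding the first line terminator. The buggy variant always assumes a two-byte "\r\n" terminator; the fixed variant counts the terminator bytes it actually saw.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Redis function): number of bytes to drop from
 * the front of buf for its first line, assuming the terminator is always
 * "\r\n". Returns 0 if buf holds no complete line. */
size_t consumed_buggy(const char *buf) {
    const char *newline = strchr(buf, '\n');
    if (newline == NULL) return 0;
    /* Step back over an optional '\r' so it is not counted as content. */
    if (newline != buf && *(newline - 1) == '\r') newline--;
    size_t querylen = (size_t)(newline - buf);
    /* Wrong when the line ended in a bare '\n': one extra byte is eaten. */
    return querylen + 2;
}

/* Hypothetical helper (not a Redis function): same contract, but the
 * terminator length is tracked: 1 for "\n", 2 for "\r\n". */
size_t consumed_fixed(const char *buf) {
    const char *newline = strchr(buf, '\n');
    size_t linefeed_chars = 1;
    if (newline == NULL) return 0;
    if (newline != buf && *(newline - 1) == '\r') {
        newline--;
        linefeed_chars++;
    }
    size_t querylen = (size_t)(newline - buf);
    return querylen + linefeed_chars;
}
```

With the buffer from the commit message, "\n*3\r\n$8\r\nREPLCONF\r\n", the buggy variant drops two bytes and leaves a command starting with '3', while the fixed variant drops only the keepalive linefeed and leaves the '*' in place. For a line that really ends in "\r\n", both variants agree.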