redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00

Author	SHA1	Message	Date
Oran Agra	b587c54c24	fix #5580, display fragmentation and rss overhead bytes as signed. These metrics become negative when RSS is smaller than used_memory. This can easily happen when the program has allocated a lot of memory and hasn't written to it yet, in which case the kernel doesn't allocate any pages to the process	2018-12-02 15:29:20 +02:00
antirez	e3446fea9e	Streams: XSTREAM SETID -> XSETID. Keep vanilla stream commands at toplevel, see #5426.	2018-10-16 13:17:14 +02:00
Salvatore Sanfilippo	af09df08d7	Merge pull request #5426 from soloestoy/feature-xstream Bugfix data inconsistency after aof rewrite, and add XSTREAM command.	2018-10-16 13:10:36 +02:00
antirez	c9d9ae7baa	Fix propagation of consumer groups last ID. Issue #5433.	2018-10-10 12:51:02 +02:00
antirez	69c30965eb	Introduce protectClient() + some refactoring. The idea is to have an API for the cases like -BUSY state and DEBUG RELOAD where we have to manually deinstall the read handler. See #4804.	2018-10-09 13:15:41 +02:00
zhaozhao.zz	ec511fa709	Streams: add a new command XSTREAM. XSTREAM CREATE <key> <id or *> -- Create a new empty stream. XSTREAM SETID <key> <id or $> -- Set the current stream ID.	2018-10-09 13:11:04 +08:00
antirez	744fe7f348	Module cluster flags: initial vars / defines added.	2018-09-19 11:20:52 +02:00
antirez	43385c4375	LOLWUT: wrap it into a proper command.	2018-09-12 11:34:10 +02:00
antirez	ef2c7a5bbb	Slave removal: SLAVEOF -> REPLICAOF. SLAVEOF is now an alias.	2018-09-11 15:32:28 +02:00
antirez	6c001bfc0d	Unblocked clients API refactoring. See #4418 .	2018-09-03 18:39:18 +02:00
antirez	3e7349fdaf	Make pending buffer processing safe for CLIENT_MASTER client. Related to #5305.	2018-09-03 18:17:31 +02:00
antirez	067647a783	Introduce repl_slave_ignore_maxmemory flag internally. Note: this breaks backward compatibility with Redis 4, since now slaves by default are exact copies of masters and do not try to evict keys independently.	2018-08-27 12:20:27 +02:00
Salvatore Sanfilippo	19880ab851	Merge pull request #5248 from soloestoy/rewrite-brpoplpush rewrite BRPOPLPUSH as RPOPLPUSH to propagate	2018-08-26 16:31:24 +02:00
zhaozhao.zz	8a1219d93b	block: rewrite BRPOPLPUSH as RPOPLPUSH to propagate	2018-08-14 20:58:58 +08:00
zhaozhao.zz	14c4ddb5a6	pipeline: do not sdsrange querybuf unless all commands processed. This is an optimization for pipeline processing. We discussed a problem in issue #5229: clients may be paused if we apply the `CLIENT PAUSE` command, and then querybuf may grow too large, so the cost of memmove in sdsrange after parsing each completed command will be horrible. The optimization is to parse all commands in querybuf first, after which we call sdsrange only once.	2018-08-14 00:43:42 +08:00
Salvatore Sanfilippo	92b39a0abf	Merge pull request #5189 from soloestoy/refactor-dbOverwrite refactor dbOverwrite to make lazyfree work	2018-07-31 16:40:35 +02:00
antirez	5401fe7fb9	Introduce writeCommandsDeniedByDiskError().	2018-07-31 13:09:38 +02:00
zhaozhao.zz	fddeeae724	refactor dbOverwrite to make lazyfree work	2018-07-31 12:07:57 +08:00
antirez	c426d85c4c	Control dynamic HZ via server configuration.	2018-07-30 13:37:30 +02:00
antirez	4e9c30a6ca	Merge branch 'dynamic-hz' into unstable	2018-07-30 13:31:23 +02:00
Salvatore Sanfilippo	445a2a2b1b	Merge pull request #4883 from itamarhaber/lua_scripts-in-info-memory Adds memory information about the scripts' cache to INFO	2018-07-23 18:43:05 +02:00
antirez	b65ddfb16a	Dynamic HZ: adapt cron frequency to number of clients.	2018-07-23 14:21:04 +02:00
antirez	e6ea603ad3	Dynamic HZ: separate hz from the configured hz. This way we can remember what the user configured HZ is, but change the actual HZ dynamically if needed in the dynamic HZ feature implementation.	2018-07-23 14:13:58 +02:00
Itamar Haber	993716c351	Adds Lua overheads to MEMORY STATS, smartens the MEMORY DOCTOR	2018-07-22 21:16:00 +03:00
antirez	f9c84d6d39	Hopefully improve commenting of #5126 . Reading the PR gave me the opportunity to better specify what the code was doing in places where I was not immediately sure about what was going on. Moreover I documented the structure in server.h so that people reading the header file will immediately understand what the structure is useful for.	2018-07-16 17:56:54 +02:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a38737). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. What I changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. Other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
dejun.xdj	61f12973f7	Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option. Save NOACK option into client.blockingState structure.	2018-07-09 13:40:29 +02:00
antirez	81778d91bf	Cache timezone and daylight active flag for safer logging. With such information we will be able to use a private localtime() implementation in serverLog(), which does not use any locking and is both thread and fork() safe.	2018-07-04 16:45:00 +02:00
Jack Drogon	93238575f7	Fix typo	2018-07-03 18:19:46 +02:00
antirez	94b3ee6142	Clarify the pending_querybuf field of clients.	2018-07-03 13:25:41 +02:00
chendianqiang	cbb2ac0799	Merge branch 'unstable' into pending-querybuf	2018-07-03 10:07:26 +08:00
antirez	2edcafb35d	addReplySubSyntaxError() renamed to addReplySubcommandSyntaxError().	2018-07-02 18:49:34 +02:00
Salvatore Sanfilippo	bc6a004588	Merge pull request #4998 from itamarhaber/module_command_help Module command help	2018-07-02 18:46:56 +02:00
chendianqiang	7de1ada070	limit the size of pending-querybuf in masterclient	2018-07-01 14:43:53 +08:00
zhaozhao.zz	b9cbd04b57	clients: add type option for client list	2018-06-28 17:43:05 +08:00
antirez	fb39bfd7af	Take clients in a ID -> Client handle dictionary.	2018-06-27 14:08:42 +02:00
Guy Benoish	b5197f1fc9	Enhance RESTORE with RDBv9 new features RESTORE now supports: 1. Setting LRU/LFU 2. Absolute-time TTL Other related changes: 1. RDB loading will not override LRU bits when RDB file does not contain the LRU opcode. 2. RDB loading will not set LRU/LFU bits if the server's maxmemory-policy does not match.	2018-06-20 15:11:08 +07:00
Oran Agra	482785ac62	add malloc_usable_size for libc malloc. This reduces the extra 8 bytes we save before each pointer, but more importantly maybe, it makes the valgrind runs more similar to our normal runs. Note: the change in malloc_stats struct in server.h is to eliminate a name conflict. Structs that are not typedefed are resolved from a separate name space.	2018-06-19 18:18:23 +03:00
antirez	bd92389c2d	Refactor createObjectFromLongLong() to be suitable for value objects.	2018-06-18 16:55:16 +02:00
Salvatore Sanfilippo	94658303e9	Merge pull request #4758 from soloestoy/rdb-save-incremental-fsync Rdb save incremental fsync	2018-06-16 10:59:37 +02:00
Itamar Haber	e654b68d1f	Merge branch 'unstable' into module_command_help	2018-06-09 21:10:53 +03:00
Salvatore Sanfilippo	be899b824e	Merge pull request #4519 from soloestoy/zset-int-problem Zset int problem	2018-06-08 12:45:11 +02:00
Itamar Haber	76ad23d012	Adds MODULE HELP and implements addReplySubSyntaxError	2018-06-07 18:34:58 +03:00
antirez	19a438e2c0	Streams: use non static macro node limits. Also add the concept of size/items limit, instead of just having as limit the number of bytes.	2018-06-07 14:24:49 +02:00
antirez	56bbab238a	ZPOP: change sync ZPOP to have a count argument instead of N keys. Usually blocking operations make a lot of sense with multiple keys so that we can listen to multiple queues (or whatever the app models) with a single connection. However in the synchronous case it is more useful to be able to ask for N elements. This is a change that I also wanted to perform sooner or later in the blocking list variant, but here it is more natural since there is no reply type difference.	2018-05-11 18:00:32 +02:00
antirez	6efb6c1e06	ZPOP: renaming to have explicit MIN/MAX score idea. This commit also adds a top comment about a subtle behavior of mixing blocking operations of different types in the same key.	2018-05-11 17:31:53 +02:00
Itamar Haber	49890c8ee9	Adds memory information about the script's cache to INFO Implementation notes: as INFO is "already broken", I didn't want to break it further. Instead of computing the server.lua_script dict size on every call, I'm keeping a running sum of the body's length and dict overheads. This implementation is naive as it does not take into consideration dict rehashing, but that inaccuracy pays off in speed ;) Demo time: ```bash $ redis-cli info memory | grep "script" used_memory_scripts:96 used_memory_scripts_human:96B number_of_cached_scripts:0 $ redis-cli eval "" 0 ; redis-cli info memory | grep "script" (nil) used_memory_scripts:120 used_memory_scripts_human:120B number_of_cached_scripts:1 $ redis-cli script flush ; redis-cli info memory | grep "script" OK used_memory_scripts:96 used_memory_scripts_human:96B number_of_cached_scripts:0 $ redis-cli eval "return('Hello, Script Cache :)')" 0 ; redis-cli info memory | grep "script" "Hello, Script Cache :)" used_memory_scripts:152 used_memory_scripts_human:152B number_of_cached_scripts:1 $ redis-cli eval "return redis.sha1hex(\"return('Hello, Script Cache :)')\")" 0 ; redis-cli info memory | grep "script" "1be72729d43da5114929c1260a749073732dc822" used_memory_scripts:232 used_memory_scripts_human:232B number_of_cached_scripts:2 ✔ 19:03:54 redis [lua_scripts-in-info-memory L ✚…⚑] $ redis-cli evalsha 1be72729d43da5114929c1260a749073732dc822 0 "Hello, Script Cache :)" ```	2018-04-30 19:33:01 +03:00
Itamar Haber	438125b47c	Implements [B]Z[REV]POP and the respective unit tests An implementation of the [Ze POP Redis Module](https://github.com/itamarhaber/zpop) as core Redis commands. Fixes #1861.	2018-04-30 02:10:42 +03:00
antirez	e6b0e8d9ec	Streams: XTRIM command added.	2018-04-19 16:25:29 +02:00
antirez	aba76320d5	Streams: XDEL command.	2018-04-18 13:12:09 +02:00
antirez	de7de53e64	getMaxmemoryState() fixed and improved.	2018-04-11 12:48:26 +02:00
antirez	f97efe0cac	Modules: context flags now include OOM flag. Plus freeMemoryIfNeeded() refactoring to improve legibility. Please review this commit for sanity.	2018-04-09 17:44:30 +02:00
antirez	b2868c7b9c	Modules API: RM_GetRandomBytes() / GetRandomHexChars().	2018-04-05 13:24:22 +02:00
antirez	a97df1a6e1	Modules Cluster API: make node IDs pointers constant.	2018-03-30 13:16:07 +02:00
antirez	0701cad3de	Modules Cluster API: message bus implementation.	2018-03-29 15:13:31 +02:00
antirez	28d28ef3cf	AOF: enable RDB-preamble rewriting by default. There are too many advantages in doing this, RDB is faster to persist, more compact, much faster to load back. The main issues here are that the code is less tested because this was not the old default (so we are enabling it for the new 5.0 release), and that the AOF is no longer a trivially parsable format from now on. However the non-preamble mode will be supported in the future as well, if new data types will be added.	2018-03-25 11:43:30 +02:00
Salvatore Sanfilippo	da621783f0	Merge pull request #4691 from oranagra/active_defrag_v2 Active defrag v2	2018-03-22 09:16:32 +01:00
antirez	0b58ad301e	CG: Replication WIP 1: XREADGROUP and XCLAIM propagated as XCLAIM.	2018-03-19 18:02:19 +01:00
zhaozhao.zz	54cae05ea7	rdb: incremental fsync when redis saves rdb	2018-03-16 00:44:50 +08:00
antirez	0cf6b1e3ae	CG: XINFO CONSUMERS implemented.	2018-03-15 12:54:10 +01:00
antirez	b26f03bd69	CG: XCLAIM now updates the idle time of the message.	2018-03-15 12:54:10 +01:00
antirez	1bc31666da	CG: XPENDING without start/stop variant implemented.	2018-03-15 12:54:10 +01:00
antirez	388c69fe4e	CG: XACK implementation.	2018-03-15 12:54:10 +01:00
antirez	ccdae09046	CG: add & populate group+consumer in the blocking state.	2018-03-15 12:54:10 +01:00
antirez	58f0c000a5	CG: data structures design + XGROUP CREATE implementation.	2018-03-15 12:54:10 +01:00
antirez	432bf4770e	Cluster: ability to prevent slaves from failing over their masters. This commit, in some parts derived from PR #3041 which is no longer possible to merge (because the user deleted the original branch), implements the ability of slaves to have a special configuration preventing that they try to start a failover when the master is failing. There are multiple reasons for wanting this, and the feature was requested in issue #3021 some time ago. The differences between this patch and the original PR are the following: 1. The flag is saved/loaded on the nodes configuration. 2. The 'myself' node is now flag-aware, the flag is updated as needed when the configuration is changed via CONFIG SET. 3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent with existing NOADDR. 4. The redis.conf documentation was rewritten. Thanks to @deep011 for the original patch.	2018-03-14 14:01:38 +01:00
Oran Agra	806736cdf9	Adding real allocator fragmentation to INFO and MEMORY command + active defrag test other fixes / improvements: - LUA script memory isn't taken from zmalloc (taken from libc malloc) so it can cause high fragmentation ratio to be displayed (which is false) - there was a problem with "fragmentation" info being calculated from RSS and used_memory sampled at different times (now sampling them together) other details: - adding a few more allocator info fields to INFO and MEMORY commands - improve defrag test to measure defrag latency of big keys - increasing the accuracy of the defrag test (by looking at real frag info) this way we can use an even lower threshold and still avoid false positives - keep the old (total) "fragmentation" field unchanged, but add new ones for specific things - add these to the MEMORY DOCTOR command - deduct LUA memory from the rss in case of non jemalloc allocator (one for which we don't "allocator active/used") - reduce sampling rate of the rss and allocator info	2018-03-12 15:08:52 +02:00
Oran Agra	be1b4aa9aa	active defrag v2 - big keys are not defragged in one go from within the dict scan instead they are scanned in parts after the main dict hash bucket is done. - add latency monitor sample for defrag - change default active-defrag-cycle-min to induce lower latency - make active defrag start a new scan right away if needed, so it's easier (for the test suite) to detect when it's done - make active defrag quit the current cycle after each db / big key - defrag some non key long term global allocations - some refactoring for smaller functions and more reusable code - during dict rehashing, one scan iteration of the dict, can end up scanning one bucket in the smaller dict and many many buckets in the larger dict. so waiting for 16 scan iterations before checking the time, may be much too long.	2018-03-12 15:07:43 +02:00
antirez	ffde73c57d	Track number of logically expired keys still in memory. This commit adds two new fields in the INFO output, stats section: expired_stale_perc:0.34 expired_time_cap_reached_count:58 The first field is an estimate of the number of keys that are yet in memory but are already logically expired. The reason why those keys are yet not reclaimed is because the active expire cycle can't spend more time on the process of reclaiming the keys, and at the same time nobody is accessing such keys. However as the active expire cycle runs, while it will eventually have to return to the caller, because of time limit or because there are less than 25% of keys logically expired in each given database, it collects the stats in order to populate this INFO field. Note that expired_stale_perc is a running average, where the current sample accounts for 5% and the history for 95%, so you'll see it changing smoothly over time. The other field, expired_time_cap_reached_count, counts the number of times the expire cycle had to stop, even if still it was finding a sizeable number of keys yet to expire, because of the time limit. This allows people handling operations to understand if the Redis server, during mass-expiration events, is able to collect keys fast enough usually. It is normal for this field to increment during mass expires, but normally it should very rarely increment. When instead it constantly increments, it means that the current workload is using a very important percentage of CPU time to expire keys. This feature was created thanks to the hints of Rashmi Ramesh and Bart Robinson from Twitter. In private email exchanges, they noted how it was important to improve the observability of this parameter in the Redis server. Actually in big deployments, the amount of keys that are yet to expire in each server, even if they are logically expired, may account for a very big amount of wasted memory.	2018-02-19 11:12:49 +01:00
Dvir Volk	3aab12414f	Remove the NOTIFY_MODULE flag and simplify the module notification flow if there aren't subscribers	2018-02-14 21:40:10 +02:00
Dvir Volk	2136035e47	finished implementation of notifications. Tests unfinished	2018-02-14 21:38:58 +02:00
antirez	8075572207	New config options about protocol prefixed with "proto". Related to #4568.	2018-01-11 11:27:41 +01:00
Oran Agra	b509a14c3e	Add config options for max-bulk-len and max-querybuf-len mainly to support RESTORE of large keys	2017-12-29 12:43:48 +02:00
zhaozhao.zz	109ee497be	zset: change the span of zskiplistNode to unsigned long	2017-12-08 16:09:27 +08:00
zhaozhao.zz	e8901b2fe4	zset: fix the int problem	2017-12-08 15:37:08 +08:00
Itamar Haber	8b51121998	Merge remote-tracking branch 'upstream/unstable' into help_subcommands	2017-12-05 18:14:59 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid incurring in errors when modifying one of the two places forgetting the other. The function is also a good place where to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
Salvatore Sanfilippo	03cfc8bf3a	Merge pull request #4497 from soloestoy/optimize-unlink-client networking: optimize unlinkClient() in freeClient()	2017-12-05 15:51:15 +01:00
antirez	60d26acfc8	Refactoring: improve luaCreateFunction() API. The function in its initial form, and after the fixes for the PSYNC2 bugs, required code duplication in multiple spots. This commit modifies it in order to always compute the script name independently, and to return the SDS of the SHA of the body: this way it can be used in all the places, including for SCRIPT LOAD, without duplicating the code to create the Lua function name. Note that this requires to re-compute the body SHA1 in the case of EVAL seeing a script for the first time, but this should not change scripting performance in any way because new scripts definition is a rare event happening the first time a script is seen, and the SHA1 computation is anyway not a very slow process against the typical Redis script and compared to the actual Lua byte compiling of the body. Note that the function used to assert() if a duplicated script was loaded, however actually now two times over three, we want the function to handle duplicated scripts just fine: this happens in SCRIPT LOAD and in RDB AUX "lua" loading. Moreover the assert was not defending against some obvious failure mode, so now the function always tests against already defined functions at start.	2017-12-04 11:25:20 +01:00
antirez	65a9740fa8	Fix loading of RDB files lua AUX fields when the script is defined. In the case of slaves loading the RDB from master, or in other similar cases, the script is already defined, and the function registering the script should not fail in the assert() call.	2017-12-01 16:01:10 +01:00
antirez	9bb18e5438	Streams: XRANGE REV option -> XREVRANGE command.	2017-12-01 10:24:25 +01:00
antirez	01ea018c40	Streams: export iteration API.	2017-12-01 10:24:24 +01:00
antirez	19b06935d5	Streams: fix XADD API and keyspace notifications. XADD was suboptimal in the first incarnation of the command, not being able to accept an ID (very useful for replication), nor options for having capped streams. The keyspace notification for streams was not implemented.	2017-12-01 10:24:24 +01:00
antirez	6468cb2e82	Streams: fix XREAD ready-key signaling. With lists we need to signal only on key creation, but streams can provide data to clients listening at every new item added. To make this slightly more efficient we now track different classes of blocked clients to avoid signaling keys when there is nobody listening. A typical case is when the stream is used as a time series DB and accessed only by range with XRANGE.	2017-12-01 10:24:24 +01:00
antirez	2cacdcd6f8	Streams: XREAD related code to serve blocked clients.	2017-12-01 10:24:24 +01:00
antirez	110041825c	Streams: XREAD get-keys method.	2017-12-01 10:24:24 +01:00
antirez	4086dff477	Streams: augment client.bpop with XREAD specific fields.	2017-12-01 10:24:24 +01:00
antirez	f80dfbf464	Streams: more internal preparation for blocking XREAD.	2017-12-01 10:24:24 +01:00
antirez	4a377cecd8	Streams: initial work to use blocking lists logic for streams XREAD.	2017-12-01 10:24:24 +01:00
antirez	439120c620	Streams: implement stream object release.	2017-12-01 10:24:24 +01:00
antirez	ec9bbe96bf	Streams: XLEN command.	2017-12-01 10:24:24 +01:00
antirez	100d43c1ac	Streams: assign value of 6 to OBJ_STREAM + some refactoring.	2017-12-01 10:24:24 +01:00
antirez	79866a6361	Streams: 12 commits squashed into the initial Streams implementation.	2017-12-01 10:24:24 +01:00
antirez	f11a7585a8	PSYNC2: Save Lua scripts state into RDB file. This is currently needed in order to fix #4483, but this can be useful in other contexts, so maybe later we may want to remove the conditionals and always save/load scripts. Note that we are using the "lua" AUX field here, in order to guarantee backward compatibility of the RDB file. The unknown AUX fields must be discarded by past versions of Redis.	2017-11-30 18:37:52 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
Itamar Haber	59d52f7fab	Standardizes the 'help' subcommand This adds a new `addReplyHelp` helper that's used by commands when returning a help text. The following commands have been touched: DEBUG, OBJECT, COMMAND, PUBSUB, SCRIPT and SLOWLOG. WIP Fix entry command table entry for OBJECT for HELP option. After #4472 the command may have just 2 arguments. Improve OBJECT HELP descriptions. See #4472. WIP 2 WIP 3	2017-11-28 21:15:45 +02:00
zhaozhao.zz	583c314725	LFU: do some changes about LFU to find hotkeys. Firstly, use the access time to replace the decrement time of LFU. The function LFUDecrAndReturn should only compute the decremented counter, not update the LFU fields; we update them explicitly. The counter is halved once for each server.lfu_decay_time period that has elapsed. Every time a key is accessed, we should update the LFU, including updating the access time and incrementing the counter after calling LFUDecrAndReturn. If a key is overwritten, the LFU should also be updated. Then we can use the `OBJECT freq` command to get a key's frequency, and LFUDecrAndReturn should be called in `OBJECT freq` in case the key has not been accessed for a long time, because we update the access time only when the key is read or overwritten.	2017-11-27 18:39:22 +01:00
zhaozhao.zz	53cea97204	LFU: change lfu* parameters to int	2017-11-27 18:38:55 +01:00
antirez	e74f0aa6d1	Fix replication of SLAVEOF inside transaction. In Redis 4.0 replication, with the introduction of PSYNC2, masters and slaves replicate commands to cascading slaves and to the replication backlog itself in a different way compared to the past. Masters actually replicate the effects of client commands. Slaves just propagate what they receive from masters. This mechanism can cause problems when the configuration of an instance is changed from master to slave inside a transaction. For instance we could send to a master instance the following sequence: MULTI SLAVEOF 127.0.0.1 0 EXEC SLAVEOF NO ONE Before the fixes in this commit, the MULTI command used to be propagated into the replication backlog, however after the SLAVEOF command the instance is a slave, so the EXEC implementation failed to also propagate the EXEC command. When the slaves of the above instance reconnected, they were incrementally synchronized just sending a "MULTI". This put the master client (in the slaves) into MULTI state, breaking the replication. Notably even Redis Sentinel uses the above approach in order to guarantee that configuration changes are always performed together with rewrites of the configuration and with clients disconnection. Sentinel does: MULTI SLAVEOF ... CONFIG REWRITE CLIENT KILL TYPE normal EXEC So this was a really problematic issue. However even with the fix in this commit, that will add the final EXEC to the replication stream in case the instance was switched from master to slave during the transaction, the result would be to increment the slave replication offset, so a successive reconnection with the new master, will not permit a successful partial resynchronization: no way the new master can provide us with the backlog needed, we incremented our offset to a value that the new master cannot have.
However the EXEC implementation waits to emit the MULTI, so that if the commands inside the transaction actually do not need to be replicated, no commands propagation happens at all. From multi.c: if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) { execCommandPropagateMulti(c); must_propagate = 1; } The above code is already modified by this commit you are reading. Now also ADMIN commands do not trigger the emission of MULTI. It is actually not clear why we do not just check for CMD_WRITE... Probably I wrote it this way in order to make the code more reliable: better to over-emit MULTI than not emitting it in time. So this commit should indeed fix issue #3836 (verified), however it looks like some reconsideration of this code path is needed in the long term. BONUS POINT: The reverse bug. Even in a read only slave "B", in a replication setup like: A -> B -> C There are commands without the READONLY nor the ADMIN flag, that are also not flagged as WRITE commands. An example is just the PING command. So if we send B the following sequence: MULTI PING SLAVEOF NO ONE EXEC The result will be the reverse bug, where only EXEC is emitted, but not the previous MULTI. However this apparently does not create problems in practice but it is yet another acknowledgment of the fact that some work is needed here in order to make this code path less surprising. Note that there are many different approaches we could follow. For instance MULTI/EXEC blocks containing administrative commands may be allowed ONLY if all the commands are administrative ones, otherwise they could be denied. When allowed, the commands could simply never be replicated at all.	2017-07-12 11:07:28 +02:00
antirez	fc7ecd8d35	AOF check utility: ability to check files with RDB preamble.	2017-07-10 13:38:23 +02:00
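
Commit ffde73c57d above describes expired_stale_perc as a running average in which the current sample accounts for 5% and the history for 95%. A minimal C sketch of that update rule follows. The function name `update_stale_perc` and its parameters are hypothetical, chosen only for illustration, and are not taken from the Redis source.

```c
/* Sketch of the running average described in commit ffde73c57d:
 * the newest sample counts for 5%, the accumulated history for 95%.
 * The function and parameter names are illustrative assumptions,
 * not the actual Redis identifiers. */
double update_stale_perc(double history, double current_sample) {
    return current_sample * 0.05 + history * 0.95;
}
```

Starting from a history of 0, a single sample of 1.0 (every sampled key stale) moves the average only to about 0.05, which is consistent with the commit's remark that the field changes smoothly over time.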


285 Commits