redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-24 00:59:02 -05:00

Author	SHA1	Message	Date
Mota	81fe7a4733	redis-benchmark: default value size usage update. default size of SET/GET value in usage should be 3 bytes as in main code.	2017-07-25 23:43:46 +08:00
Salvatore Sanfilippo	6b64cc47a0	Merge pull request #2259 from badboy/fix-2258 Check that the whole first argument is a number	2017-07-24 15:19:53 +02:00
Salvatore Sanfilippo	964224b77f	Merge pull request #4124 from lamby/proceding-proceeding-typo Correct proceding -> proceeding typo.	2017-07-24 15:19:21 +02:00
Salvatore Sanfilippo	ae40e5f362	Merge pull request #4125 from trevor211/fixAutoAofRewirteMinSize fix rewrite config: auto-aof-rewrite-min-size	2017-07-24 15:18:56 +02:00
Salvatore Sanfilippo	25c231c4c1	Merge pull request #1998 from grobe0ba/unstable Fix missing '-' in redis-benchmark help output (Issue #1996)	2017-07-24 15:18:08 +02:00
Salvatore Sanfilippo	d9565379da	Merge pull request #4128 from leonchen83/unstable fix mismatch argument and return wrong value of clusterDelNodeSlots	2017-07-24 14:18:28 +02:00
liangsijian	ffbbe5a720	Fix lua ldb command log	2017-07-24 19:24:06 +08:00
antirez	314043552b	Modules: don't crash when Lua calls a module blocking command. Lua scripting does not support calling blocking commands, however all the native Redis commands are flagged as "s" (no scripting flag), so this is not possible at all. With modules there is no such mechanism in order to flag a command as non callable by the Lua scripting engine, moreover we cannot trust module users to comply all the time: it is likely that modules will be released with blocking commands that are not flagged correctly, even if we provide a way to signal this fact. This commit attempts to address the problem in a short term way, by detecting that a module is trying to block in the context of the Lua scripting engine client, and preventing it from doing so. The module will actually believe it is blocking as usual, but what happens is that the Lua script receives an error immediately, and the background call is ignored by the Redis engine (if not for the cleanup callbacks, once it unblocks). Long term, the more likely solution is to introduce a new call called RedisModule_GetClientFlags(), so that a command can detect if the caller is a Lua script, and return an error, or avoid blocking at all. Since the blocking API is experimental right now, more work is needed in this regard in order to reach a level where blocking module commands and all the other Redis subsystems interact peacefully. Now the effect is like the following: 127.0.0.1:6379> eval "redis.call('hello.block',1,5000)" 0 (error) ERR Error running script (call to f_b5ba35ff97bc1ef23debc4d6e9fd802da187ed53): @user_script:1: ERR Blocking module command called from Lua script This commit fixes issue #4127 in the short term.	2017-07-23 12:55:37 +02:00
antirez	5bfdfbe174	Fix typo in unblockClientFromModule() top comment.	2017-07-23 12:41:26 +02:00
antirez	a3778f3b0f	Make representClusterNodeFlags() more robust. This function failed when an internal-only flag was set as an only flag in a node: the string was trimmed expecting a final comma before exiting the function, causing a crash. See issue #4142. Moreover generation of flags representation only needed at DEBUG log level was always performed: a waste of CPU time. This is fixed as well by this commit.	2017-07-20 15:17:35 +02:00
antirez	b1c2e1a19c	Fix two bugs in moduleTypeLookupModuleByID(). The function cache was not working at all, and the function returned wrong values if there were two or more modules exporting native data types. See issue #4131 for more details.	2017-07-20 14:59:42 +02:00
Leon Chen	9e7a8c0207	fix return wrong value of clusterDelNodeSlots	2017-07-20 17:24:38 +08:00
Leon Chen	2cdf4cc656	fix mismatch argument	2017-07-18 02:28:24 -05:00
WuYunlong	c32c690de6	fix rewrite config: auto-aof-rewrite-min-size	2017-07-15 10:20:56 +08:00
Chris Lamb	7560d347da	Correct proceding -> proceeding typo.	2017-07-14 22:53:14 +01:00
antirez	bd1782fa0a	Modules: fix thread safe context DB selection. Before this fix the DB currently selected by the blocked client was not respected and operations were always performed on DB 0.	2017-07-14 13:02:15 +02:00
antirez	8eefc9323d	Allow certain modules APIs only defining REDISMODULE_EXPERIMENTAL_API. Those calls may be subject to changes in the future, so the user should acknowledge it is using non stable API.	2017-07-14 12:07:52 +02:00
antirez	f03947a676	Modules documentation removed from source. Moving to redis-doc repository to publish via Redis.io.	2017-07-14 11:33:59 +02:00
antirez	43aaf96163	Markdown generation of Redis Modules API reference improved.	2017-07-14 11:29:31 +02:00
antirez	e74f0aa6d1	Fix replication of SLAVEOF inside transaction. In Redis 4.0 replication, with the introduction of PSYNC2, masters and slaves replicate commands to cascading slaves and to the replication backlog itself in a different way compared to the past. Masters actually replicate the effects of client commands. Slaves just propagate what they receive from masters. This mechanism can cause problems when the configuration of an instance is changed from master to slave inside a transaction. For instance we could send to a master instance the following sequence: MULTI SLAVEOF 127.0.0.1 0 EXEC SLAVEOF NO ONE Before the fixes in this commit, the MULTI command used to be propagated into the replication backlog, however after the SLAVEOF command the instance is a slave, so the EXEC implementation failed to also propagate the EXEC command. When the slaves of the above instance reconnected, they were incrementally synchronized just sending a "MULTI". This put the master client (in the slaves) into MULTI state, breaking the replication. Notably even Redis Sentinel uses the above approach in order to guarantee that configuration changes are always performed together with rewrites of the configuration and with clients disconnection. Sentinel does: MULTI SLAVEOF ... CONFIG REWRITE CLIENT KILL TYPE normal EXEC So this was a really problematic issue. However even with the fix in this commit, that will add the final EXEC to the replication stream in case the instance was switched from master to slave during the transaction, the result would be to increment the slave replication offset, so a successive reconnection with the new master, will not permit a successful partial resynchronization: no way the new master can provide us with the backlog needed, we incremented our offset to a value that the new master cannot have. 
However the EXEC implementation waits to emit the MULTI, so that if the commands inside the transaction actually do not need to be replicated, no commands propagation happens at all. From multi.c: if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) { execCommandPropagateMulti(c); must_propagate = 1; } The above code is already modified by this commit you are reading. Now also ADMIN commands do not trigger the emission of MULTI. It is actually not clear why we do not just check for CMD_WRITE... Probably I wrote it this way in order to make the code more reliable: better to over-emit MULTI than not emitting it in time. So this commit should indeed fix issue #3836 (verified), however it looks like some reconsideration of this code path is needed in the long term. BONUS POINT: The reverse bug. Even in a read only slave "B", in a replication setup like: A -> B -> C There are commands without the READONLY nor the ADMIN flag, that are also not flagged as WRITE commands. An example is just the PING command. So if we send B the following sequence: MULTI PING SLAVEOF NO ONE EXEC The result will be the reverse bug, where only EXEC is emitted, but not the previous MULTI. However this apparently does not create problems in practice but it is yet another acknowledgment of the fact that some work is needed here in order to make this code path less surprising. Note that there are many different approaches we could follow. For instance MULTI/EXEC blocks containing administrative commands may be allowed ONLY if all the commands are administrative ones, otherwise they could be denied. When allowed, the commands could simply never be replicated at all.	2017-07-12 11:07:28 +02:00
antirez	e1b8b4b6da	CLUSTER GETKEYSINSLOT: avoid overallocating. Close #3911.	2017-07-11 15:49:09 +02:00
antirez	5bd46d33db	Fix isHLLObjectOrReply() to handle integer encoded strings. Close #3766.	2017-07-11 12:44:59 +02:00
antirez	e203a46cf3	Clients blocked in modules: free argv/argc later. See issue #3844 for more information.	2017-07-11 12:33:01 +02:00
antirez	14c32c3569	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2017-07-11 09:46:58 +02:00
antirez	54e4bbeabd	Event loop: call after sleep() only from top level. In general we do not want before/after sleep() callbacks to be called when we re-enter the event loop, since those calls are only designed in order to perform operations every main iteration of the event loop, and re-entering is often just a way to incrementally serve clients with error messages or other auxiliary operations. However, if we call the callbacks, we are then forced to think of before/after sleep callbacks as re-entrant, which is much harder without any good need. However here there was also a clear bug: beforeSleep() was actually never called when re-entering the event loop. But the new afterSleep() callback was. This is broken and in this instance re-entering afterSleep() caused a modules GIL deadlock.	2017-07-11 00:13:52 +02:00
Salvatore Sanfilippo	58104d8327	Merge pull request #4113 from guybe7/module_io_bytes Modules: Fix io->bytes calculation in RDB save	2017-07-10 19:14:34 +02:00
antirez	11182a1a58	redis-check-aof: tell users there is a --fix option.	2017-07-10 16:41:25 +02:00
Guy Benoish	dfb68cd235	Modules: Fix io->bytes calculation in RDB save	2017-07-10 14:41:57 +03:00
antirez	fc7ecd8d35	AOF check utility: ability to check files with RDB preamble.	2017-07-10 13:38:23 +02:00
Salvatore Sanfilippo	6b0670daad	Merge pull request #3853 from itamarhaber/issue-3851 Sets up fake client to select current db in RM_Call()	2017-07-06 15:02:11 +02:00
Salvatore Sanfilippo	38dd30af42	Merge pull request #4105 from spinlock/unstable-networking Optimize addReplyBulkSds for better performance	2017-07-06 14:31:08 +02:00
Salvatore Sanfilippo	2d5aa00959	Merge pull request #4106 from petersunbag/unstable minor fix in listJoin().	2017-07-06 14:29:37 +02:00
sunweinan	87f771bff1	minor fix in listJoin().	2017-07-06 19:47:21 +08:00
antirez	2b36950e9b	Free IO context if any in RDB loading code. Thanks to @oranagra for spotting this bug.	2017-07-06 11:20:49 +02:00
antirez	51ffd062d3	Modules: DEBUG DIGEST interface.	2017-07-06 11:04:46 +02:00
spinlock	10db81af71	update Makefile for test-sds	2017-07-05 14:32:09 +00:00
spinlock	ea31a4eae3	Optimize addReplyBulkSds for better performance	2017-07-05 14:25:05 +00:00
antirez	f9fac7f777	Avoid closing invalid FDs to make Valgrind happier.	2017-07-05 15:40:25 +02:00
antirez	413c2bc180	Modules: no MULTI/EXEC for commands replicated from async contexts. They are technically like commands executed from external clients one after the other, and do not constitute a single atomic entity.	2017-07-05 10:10:20 +02:00
Salvatore Sanfilippo	09dd7b5ff0	Merge pull request #4101 from dvirsky/fix_modules_reply_len Proposed fix to #4100	2017-07-04 12:01:51 +02:00
antirez	eddd8d34c4	Add symmetrical assertion to track c->reply_buffer infinite growth. Redis clients need to have an instantaneous idea of the amount of memory they are consuming (if the number is not exact should at least be proportional to the actual memory usage). We do that adding and subtracting the SDS length when pushing / popping from the client->reply list. However it is quite simple to add bugs in such a setup, by not keeping the objects in the list and the count in sync. For such reason, Redis has an assertion to track counts near 2^64: those are always the result of the counter wrapping around because we subtract more than we add. This commit adds the symmetrical assertion: when the list is empty since we sent everything, the reply_bytes count should be zero. Thanks to the new assertion it should be simple to also detect the other problem, where the count slowly increases because of over-counting. The assertion adds a conditional in the code that sends the buffer to the socket but should not create any measurable performance slowdown, listLength() just accesses a structure field, and this code path is totally dominated by write(2). Related to #4100.	2017-07-04 11:55:05 +02:00
Dvir Volk	86e564e9ff	fixed #4100	2017-07-04 00:02:19 +03:00
antirez	b2cd9fcab6	Fix GEORADIUS edge case with huge radius. This commit closes issue #3698, at least for now, since the root cause was not fixed: the bounding box function, for huge radiuses, does not return a correct bounding box, there are points still within the radius that are left outside. So when using GEORADIUS queries with radiuses in the order of 5000 km or more, it was possible to see, at the edge of the area, certain points not correctly reported. Because the bounding box for now was used just as an optimization, and such huge radiuses are not common, for now the optimization is just switched off when the radius is near such magnitude. Three test cases found by the Continuous Integration test were added, so that we can easily trigger the bug again, both for regression testing and in order to properly fix it at some point in the future.	2017-07-03 19:38:31 +02:00
antirez	26e638a8e9	redis-cli --latency: ability to run non interactively. This feature was proposed by @rosmo in PR #2643 and later redesigned in order to fit better with the other options for non-interactive modes of redis-cli. The idea is basically to allow collecting latency information in scripts, cron jobs or whatever, just running for a limited time and then producing a single output.	2017-06-30 15:41:58 +02:00
antirez	7bad78bd2f	Fix abort typo in Lua debugger help screen.	2017-06-30 12:12:00 +02:00
antirez	f8547e53f0	Added GEORADIUS(BYMEMBER)_RO variants for read-only operations. Issue #4084 shows how, due to a design error, GEORADIUS is a write command because of the STORE option. Because of this it does not work on readonly slaves, gets redirected to masters in Redis Cluster even when the connection is in READONLY mode and so forth. To break backward compatibility at this stage, with Redis 4.0 to be in advanced RC state, is problematic for the user base. The API can be fixed into the unstable branch soon if we'll decide to do so in order to be more consistent, and release Redis 5.0 with this incompatibility in the future. This is still unclear. However, the ability to scale GEO queries in slaves easily is too important so this commit adds two read-only variants to the GEORADIUS and GEORADIUSBYMEMBER command: GEORADIUS_RO and GEORADIUSBYMEMBER_RO. The commands are exactly as the original commands, but they do not accept the STORE and STOREDIST options.	2017-06-30 10:03:37 +02:00
antirez	01a4b9892d	HMSET and MSET implementations unified. HSET now variadic. This is the first step towards getting rid of HMSET which is a command that does not make much sense once HSET is variadic, and has a saner return value.	2017-06-29 17:38:46 +02:00
Salvatore Sanfilippo	634c64dd18	Merge pull request #4075 from sgn1/brpop_keys Fix Issues in blocking commands in cluster mode.	2017-06-27 17:51:19 +02:00
antirez	365dd037dc	RDB modules values serialization format version 2. The original RDB serialization format was not parsable without the module loaded, because the structure was managed only by the module itself. Moreover RDB is a streaming protocol in the sense that it is both produced in an append-only fashion, and is also sometimes directly sent to the socket (in the case of diskless replication). The fact that modules values cannot be parsed without the relevant module loaded is a problem in many ways: RDB checking tools must have loaded modules even for doing things not involving the value at all, like splitting an RDB into N RDBs by key or alike, or just checking the RDB for sanity. In theory module values could be just a blob of data with a prefixed length in order for us to be able to skip it. However prefixing the values with a length would mean one of the following: 1. To be able to write some data at a previous offset. This breaks streaming. 2. To bufferize values before outputting them. This breaks performances. 3. To have some chunked RDB output format. This breaks simplicity. Moreover, the above solution, still makes module values a totally opaque matter, with the following problems: 1. The RDB check tool can just skip the value without being able to at least check the general structure. For datasets composed mostly of modules values this means to just check the outer level of the RDB not actually doing any check on most of the data itself. 2. It is not possible to do any recovering or processing of data for which a module no longer exists in the future, or is unknown. So this commit implements a different solution. The modules RDB serialization API is composed of well defined calls to store integers, floats, doubles or strings. After this commit, the parts generated by the module API have a one-byte prefix for each of the above emitted parts, and there is a final EOF byte as well. 
So even if we don't know exactly how to interpret a module value, we can always parse it at a high level, check the overall structure, understand the types used to store the information, and easily skip the whole value. The change is backward compatible: older RDB files can be still loaded since the new encoding has a new RDB type: MODULE_2 (of value 7). The commit also implements the ability to check RDB files for sanity taking advantage of the new feature.	2017-06-27 13:19:16 +02:00
antirez	c3998728a2	ARM: Fix stack trace generation on crash.	2017-06-26 10:36:16 +02:00
antirez	c9097393bf	Issue #4027 : unify comment and modify return value in freeMemoryIfNeeded(). It looks safer to return C_OK from freeMemoryIfNeeded() when clients are paused because returning C_ERR may prevent success of writes. It is possible that there is no difference in practice since clients cannot execute writes while clients are paused, but it looks more correct this way, at least conceptually. Related to PR #4028.	2017-06-23 11:42:25 +02:00
Salvatore Sanfilippo	936ade80b2	Merge pull request #4028 from zintrepid/prevent_expirations_while_paused Prevent expirations and evictions while paused	2017-06-23 11:39:02 +02:00
Suraj Narkhede	f85f36f50d	Fix following issues in blocking commands: 1. brpop last key index, thus checking all keys for slots. 2. Memory leak in clusterRedirectBlockedClientIfNeeded. 3. Remove while loop in clusterRedirectBlockedClientIfNeeded.	2017-06-23 00:30:21 -07:00
Suraj Narkhede	d303bca587	Fix brpop command table entry and redirect blocked clients.	2017-06-22 23:52:00 -07:00
antirez	8b768e8ea4	Aesthetic changes to #4068 PR to conform to Redis coding standard. 1. Inline if ... statement if short. 2. No lines over 80 columns.	2017-06-22 11:00:34 +02:00
Salvatore Sanfilippo	6476f1a979	Merge pull request #4068 from FreedomU007/unstable Fix set with ex/px option when propagated to aof	2017-06-22 10:46:58 +02:00
xuzhou	86e9f48a0c	Optimize set command with ex/px when updating aof.	2017-06-22 11:06:40 +08:00
Salvatore Sanfilippo	ef446bf16d	Merge pull request #3802 from flowly/bugfix-calc-stat-net-output-bytes Bugfix calc stat net output bytes	2017-06-20 17:01:16 +02:00
Salvatore Sanfilippo	1d857a99d5	Merge pull request #4056 from season89/unstable Fixed comments of slowlog duration	2017-06-20 16:55:29 +02:00
Salvatore Sanfilippo	0a03187ac4	Merge pull request #3659 from cbgbt/cli-elapsed cli: Only print elapsed time on OUTPUT_STANDARD.	2017-06-20 16:53:56 +02:00
antirez	2a84927f35	redis-benchmark: add -t hset target.	2017-06-19 09:41:11 +02:00
xuzhou	530fcf8687	Fix set with ex/px option when propagated to aof	2017-06-16 17:51:38 +08:00
antirez	53cb27b1d7	SLOWLOG: log offending client address and name.	2017-06-15 12:57:54 +02:00
antirez	ab9d398835	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2017-06-14 18:29:53 +02:00
Qu Chen	4740424049	Implement getKeys procedure for georadius and georadiusbymember commands.	2017-06-14 18:15:48 +02:00
xuchengxuan	3fc4bf07cc	Fixed comments of slowlog duration	2017-06-14 16:42:21 +08:00
Salvatore Sanfilippo	d3b32ca48d	Merge pull request #4034 from amallia/patch-1 Fixed comment in clusterMsg version field	2017-06-13 06:28:23 -07:00
Salvatore Sanfilippo	33035cad04	Merge pull request #4035 from amallia/patch-2 Removed duplicate 'sys/socket.h' include	2017-06-13 06:27:31 -07:00
antirez	5877c02c51	Fix PERSIST expired key resuscitation issue #4048 .	2017-06-13 10:35:51 +02:00
Antonio Mallia	2d1d57eb47	Removed duplicate 'sys/socket.h' include	2017-06-04 15:26:53 +01:00
Antonio Mallia	591dba8055	Fixed comment in clusterMsg version field	2017-06-04 15:09:05 +01:00
Zachary Marquez	a3e53cf9bc	Prevent expirations and evictions while paused Proposed fix to https://github.com/antirez/redis/issues/4027	2017-06-01 16:28:40 -05:00
antirez	e91b81c612	More informative -MISCONF error message.	2017-05-19 12:03:30 +02:00
antirez	e498d9ee3e	Collect fork() timing info only if fork succeeded.	2017-05-19 11:10:36 +02:00
antirez	78211aaaaf	redis-cli --bigkeys: show error when TYPE fails. Close #3993.	2017-05-15 11:22:28 +02:00
antirez	1f598fc2bb	Modules TSC: use atomic var for server.unixtime. This avoids Helgrind complaining, but we are actually not using atomicGet() to get the unixtime value for now: too many places where it is used and given that time_t is word-sized it should be safe in all the archs we support as it is. On the other hand, Helgrind, when Redis is compiled with "make helgrind" in order to force the __sync macros, will detect the write in updateCachedTime() as a read (because atomic functions are used) and will not complain about races. This commit also includes minor refactoring of mutex initializations and a "helgrind" target in the Makefile.	2017-05-10 10:04:16 +02:00
antirez	de786186a5	atomicvar.h: show used API in INFO. Add macro to force __sync builtin. The __sync builtin can be correctly detected by Helgrind so to force it is useful for testing. The API in the INFO output can be useful for debugging after problems are reported.	2017-05-10 09:33:49 +02:00
antirez	6eb51bf1ec	zmalloc.c: remove thread safe mode, it's the default way.	2017-05-09 16:59:51 +02:00
antirez	9390c384b8	Modules TSC: Add mutex for server.lruclock. Only useful for when no atomic builtins are available.	2017-05-09 16:32:49 +02:00
antirez	ece658713b	Modules TSC: Improve inter-thread synchronization. More work to do with server.unixtime and similar. Need to write Helgrind suppression file in order to suppress the false positives.	2017-05-09 11:57:09 +02:00
antirez	2a51bac44e	Simplify atomicvar.h usage by having the mutex name implicit.	2017-05-04 17:01:00 +02:00
antirez	52bc74f221	Lazyfree: fix lazyfreeGetPendingObjectsCount() race reading counter.	2017-05-04 10:35:40 +02:00
antirez	7d9326b1f3	Modules TSC: HELLO.KEYS reply format fixed.	2017-05-03 23:43:49 +02:00
antirez	9b01b64430	Modules TSC: put the client in the pending write list.	2017-05-03 14:54:48 +02:00
antirez	e67fb915eb	adlist: fix final list count in listJoin().	2017-05-03 14:54:14 +02:00
antirez	79226cb9fa	adlist: fix listJoin() to handle empty lists.	2017-05-03 14:15:25 +02:00
antirez	6798736909	Modules: remove unused var in example module.	2017-05-03 14:10:21 +02:00
antirez	1ed2ff5570	Modules TSC: HELLO.KEYS example draft finished.	2017-05-03 14:08:12 +02:00
antirez	7127f15ebe	Module: fix RedisModule_Call() "l" specifier to create a raw string.	2017-05-03 14:07:10 +02:00
antirez	3fcf959e60	Modules TSC: Release the GIL for all the time we are blocked. Instead of giving the module background operations just a small time to run in the beforeSleep() function, we can have the lock released for all the time we are blocked in the multiplexing syscall.	2017-05-03 11:26:21 +02:00
antirez	ba4a5a3255	Modules TSC: Export symbols of the new API.	2017-05-02 15:19:28 +02:00
antirez	275905b328	Modules TSC: Handling of RM_Reply* functions.	2017-05-02 15:05:39 +02:00
antirez	9c500b89fb	Modules TSC: Basic TS context creation and handling.	2017-05-02 12:53:10 +02:00
antirez	59b06b14c9	Modules TSC: GIL and cooperative multi tasking setup.	2017-04-28 18:41:10 +02:00
antirez	469d6e2b37	PSYNC2: fix master cleanup when caching it. The master client cleanup was incomplete: resetClient() was missing and the output buffer of the client was not reset, so pending commands related to the previous connection could be still sent. The first problem caused the client argument vector to be, at times, half populated, so that when the correct replication stream arrived the protocol got mixed with the arguments, creating invalid commands that nobody called. Thanks to @yangsiran for also investigating this problem, after already providing important design / implementation hints for the original PSYNC2 issues (see referenced Github issue). Note that this commit adds a new function to the list library of Redis in order to be able to reset a list without destroying it. Related to issue #3899.	2017-04-27 17:08:37 +02:00
antirez	238cebdd5e	Check event loop creation return value. Fix #3951 . Normally we never check for OOM conditions inside Redis since the allocator will always return a pointer or abort the program on OOM conditions. However we cannot have control on epoll_create(), that may fail for kernel OOM (according to the manual page) even if all the parameters are correct, so the function aeCreateEventLoop() may indeed return NULL and this condition must be checked.	2017-04-21 16:27:38 +02:00
Salvatore Sanfilippo	3773c06d28	Merge pull request #3950 from kensou97/unstable update block->free after some diff data are written to the child process	2017-04-20 07:55:51 +02:00
antirez	7d9dd80db3	Fix getKeysUsingCommandTable() in cluster mode. Close #3940.	2017-04-19 16:17:08 +02:00
antirez	189a12afb4	PSYNC2: discard pending transactions from cached master. During the review of the fix for #3899, @yangsiran identified an implementation bug: given that the offset is now relative to the applied part of the replication log, when we cache a master, the successive PSYNC2 request will be made in order to include the transaction that was not completely processed. This means that we need to discard any pending transaction from our replication buffer: it will be re-executed.	2017-04-19 14:02:52 +02:00
antirez	22be435efe	Fix PSYNC2 incomplete command bug as described in #3899 . This bug was discovered by @kevinmcgehee and constituted a major hidden bug in the PSYNC2 implementation, caused by the propagation from the master of incomplete commands to slaves. The bug had several results: 1. Borrowing from Kevin text in the issue: "Given that slaves blindly copy over their master's input into their own replication backlog over successive read syscalls, it's possible that with large commands or small TCP buffers, partial commands are present in this buffer. If the master were to fail before successfully propagating the entire command to a slave, the slaves will never execute the partial command (since the client is invalidated) but will copy it to replication backlog which may relay those invalid bytes to its slaves on PSYNC2, corrupting the backlog and possibly other valid commands that follow the failover. Simple command boundaries aren't sufficient to capture this, either, because in the case of a MULTI/EXEC block, if the master successfully propagates a subset of the commands but not the EXEC, then the transaction in the backlog becomes corrupt and could corrupt other slaves that consume this data." 2. As identified by @yangsiran later, there is another effect of the bug. For the same mechanism of the first problem, a slave having another slave, could receive a full resynchronization request with an already half-applied command in the backlog. Once the RDB is ready, it will be sent to the slave, and the replication will continue sending to the sub-slave the other half of the command, which is not valid. 
The fix, designed by @yangsiran and @antirez, and implemented by @antirez, uses a secondary buffer in order to feed the sub-masters and update the replication backlog and offsets, only when a given part of the query buffer is actually applied to the state of the instance, that is, when the command gets processed and the command is not pending in the Redis transaction buffer because of CLIENT_MULTI state. Given that now the backlog and offsets representation are in agreement with the actual processed commands, both issue 1 and 2 should no longer be possible. Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in identifying and designing a fix for this problem.	2017-04-19 10:25:45 +02:00
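
The tagged layout described in commit 365dd037dc above (a one-byte prefix per emitted part, plus a final EOF byte) is what lets a tool walk a module value without loading the module. A minimal sketch of that idea in C follows. It is not the Redis implementation: the tag values, the fixed payload widths, the one-byte string length and the function name skip_module_value() are all hypothetical, chosen only to keep the example short, and the real RDB encoding differs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-byte tags for this sketch only; the real values are
 * defined in the Redis source and are not reproduced here. */
enum { TAG_EOF = 0, TAG_INT = 1, TAG_FLOAT = 2, TAG_DOUBLE = 3, TAG_STRING = 4 };

/* Walk one tagged module value starting at buf[0] without interpreting it.
 * Assumed toy layout: INT and DOUBLE carry 8 payload bytes, FLOAT carries 4,
 * STRING carries a one-byte length followed by that many bytes.
 * Returns the number of bytes consumed, including the EOF tag, or -1 if the
 * value is truncated or contains an unknown tag. */
long skip_module_value(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    size_t pos = 0;
    while (pos < len) {
        unsigned char tag = buf[pos++];
        size_t need;
        switch (tag) {
        case TAG_EOF:
            return (long)pos;          /* structure is complete */
        case TAG_INT:
        case TAG_DOUBLE:
            need = 8;
            break;
        case TAG_FLOAT:
            need = 4;
            break;
        case TAG_STRING:
            if (len - pos < 1) return -1;   /* missing length byte */
            need = buf[pos++];
            break;
        default:
            return -1;                 /* unknown tag: cannot skip safely */
        }
        if (len - pos < need) return -1;    /* payload is truncated */
        pos += need;
    }
    return -1;                         /* ran out of bytes before EOF tag */
}
```

A checker built this way can validate the outer structure of a value and skip it in a single forward pass, which matches the streaming constraint the commit message spells out: nothing has to be written at a previous offset and nothing has to be buffered.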

4196 Commits