redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 00:28:26 -05:00

Author	SHA1	Message	Date
Binbin	58a1d16ff6	Fix timing issue in replication test (#9719) So it looks like sampling set loglines [count_log_lines -2] was executed too late, and the replication managed to complete before that. ``` *** [err]: diskless no replicas drop during rdb pipe in tests/integration/replication.tcl log message of '"Diskless rdb transfer, done reading from pipe, 2 replicas still up"' not found in ./tests/tmp/server.6124.69/stdout after line: 52 till line: 52 ``` Changes: 1. when we search the master log file, we start to search from before we sent the REPLICAOF command, to prevent a race in which the replication completed before we sampled the log line count. 2. we don't need to sample the replica loglines since it's a fresh replica that's just been started, so the message we're looking for is the first occurrence in the log, we can start search from 0. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-02 10:32:01 +02:00
Binbin	cea7809cea	Fix race condition in psync2-pingoff test (#9712 ) Test failed on freebsd: ``` *** [err]: Make the old master a replica of the new one and check conditions in tests/integration/psync2-pingoff.tcl Expected '162' to be equal to '176' (context: type eval line 18 cmd {assert_equal [status $R(0) master_repl_offset] [status $R(1) master_repl_offset]} proc ::test) ``` There are two possible race conditions in the test. 1. The code waits for sync_full to increment, and assumes that means the master did the fork. But in fact there are cases the master will increment that sync_full counter (after replica asks for sync), but will see that there's already a fork running and will delay the fork creation. In this case the INCR will be executed before the fork happens, so it'll not be in the command stream. Solve that by waiting for `master_link_status: up` on the replica before the INCR. 2. The repl-ping-replica-period is still high (1 second), so there's a chance the master will send an additional PING between the two calls to INFO (the line that fails is the one that samples INFO from both servers). So there's a chance one of them will have an incremented offset due to PING and the other won't have it yet. In theory we can wait for the repl_offset to match, but then we risk facing a situation where that race will hide an offset mis-match. so instead, i think we should just change repl-ping-replica-period to prevent further pings from being pushed. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-01 16:07:08 +02:00
Oran Agra	f1f3cceb50	fix valgrind issues with long double module test (#9709) The module test in reply.tcl was introduced by #8521 but didn't run until recently (see #9639) and then it started failing with valgrind. This is because valgrind uses 64 bit long double (unlike most other platforms that have at least 80 bits). But besides valgrind, the tests were also incompatible with ARM32, which also uses 64 bit long doubles. We now use appropriate values to avoid issues with either valgrind or ARM32. In all the double tests, i use 3.141, which is safe since addReplyDouble uses `%.17Lg` which is able to represent this value without adding any digits due to precision loss. In the long double, since we use `%.17Lf` in ld2string, it preserves 17 digits after the decimal point, rather than 17 significant digits (like in `%.17Lg`). So to make these similar, i use a value lower than 1 (no digits left of the period). Lastly, we have the same issue with TCL (no long doubles) so we read raw protocol in that test. Note that the only error before this fix (in both valgrind and ARM32) is this: ``` *** [err]: RM_ReplyWithLongDouble: a float reply in tests/unit/moduleapi/reply.tcl Expected '3.141' to be equal to '3.14100000000000001' (context: type eval line 2 cmd {assert_equal 3.141 [r rw.longdouble 3.141]} proc ::test) ``` so the changes to debug.c and scripting.tcl aren't really needed, but i consider them a cleanup (i.e. scripting.c validated a different constant than the one that's sent to it from debug.c). Another unrelated change is to add the RESP version to the repeated tests in reply.tcl	2021-11-01 13:41:35 +02:00
Oran Agra	48d54265ce	Fix failing cluster tests (#9707) Fix failures introduced by #9695 which was an attempt to solve failures introduced by #9679. An alternative to #9703 (i didn't like the extra argument to kill_instance). Reverting #9695. Instead of stopping AOF on all terminations, stop it only on the two which need it. Do it as part of the test rather than the infra (it was odd that kill_instance used `R` to communicate to the instance). Note that the original purpose of these tests was to trigger a crash, but that upsets valgrind so in redis 6.2 i changed it to use SIGTERM, so i now rename the tests (remove "kill" and "crash"). Also add some colors to failures, and the word "FAILED" so that it's searchable. And solve a semi-related race condition in 14-consistency-check.tcl	2021-10-31 19:22:21 +02:00
Yossi Gottlieb	f26e90be0c	Use 'gcc' instead of 'ld' to link test modules. (#9710 ) This solves several problems in a more elegant way: * No need to explicitly use `-lc` on x86_64 when building with `-m32`. * Avoids issues with undefined floating point emulation funcs on ARM.	2021-10-31 16:25:57 +02:00
Binbin	033578839b	Fix multiple COUNT in LMPOP/BLMPOP/ZMPOP/BZMPOP (#9701 ) The previous code did not check whether COUNT is set. So we can use `lmpop 2 key1 key2 left count 1 count 2`. This situation can occur in LMPOP/BLMPOP/ZMPOP/BZMPOP commands. LMPOP/BLMPOP introduced in #9373, ZMPOP/BZMPOP introduced in #9484.	2021-10-31 16:10:29 +02:00
Wang Yuan	68886de085	Fix timing issue in replication buffer test (#9697 ) Introduced in #9166	2021-10-29 08:04:12 +03:00
Oran Agra	22a778c880	fix failing cluster tests (#9695) When stopping an instance in the cluster tests, disable appendonly first, so that SIGTERM won't be ignored. Recently in #9679 i changed the test infra to use SIGSEGV to kill servers that refuse the SIGTERM rather than do SIGKILL directly. This surfaced an issue that i added in #7725, which changed SIGKILL to SIGTERM (to resolve valgrind issues). So the situation in the past months was that sometimes servers refused the SIGTERM and waited 10 seconds for the SIGKILL, and this commit resolves that (faster termination).	2021-10-28 12:16:27 +03:00
Wen Hui	5fb4adba65	New Cluster Command: CLUSTER DELSLOTSRANGE and CLUSTER ADDSLOTSRANGE (#9445 )	2021-10-26 21:44:33 -07:00
Wang Yuan	37dc2f13b4	Fix not waiting for data loading to complete in AOF tests (#9683 ) Fix timing issue of a new test introduced in #9326	2021-10-26 14:08:09 +03:00
Oran Agra	37559ca79f	Fix race condition in lazy free test (#9682 ) The first test exited before all the memory was reclaimed, so when the second test sampled used_memory, it was too early.	2021-10-26 13:02:31 +03:00
Oran Agra	665e428435	Testsuite: attempt to find / avoid valgrind warnings of killed processes (#9679) I recently started seeing a lot of empty valgrind reports in the daily CI. i.e. prints showing valgrind header but no leak report, which causes the tests to fail https://github.com/redis/redis/runs/3991335416?check_suite_focus=true This commit changes 2 things: * first, considering valgrind is just slow, we used to give processes 60 seconds timeout on shutdown instead of 10 seconds we give normally. this commit changes that to 120. * secondly, when we reach the timeout, we first try to use SIGSEGV so that maybe we'll get a stack trace indicating where redis is hung, and we only resort to SIGKILL if double that time passed. note that if there are indeed hung processes, we will normally not see that in the non-valgrind runs, since the tests didn't use to detect any failure in that case, and now they will since `crashlog_from_file` is run after `kill_server`.	2021-10-26 08:34:30 +03:00
Wang Yuan	9ec3294b97	Add timestamp annotations in AOF (#9326) Add timestamp annotation in AOF, one part of #9325. Enabled with the new `aof-timestamp-enabled` config option. Timestamp annotation format is "#TS:${timestamp}\r\n". "TS" is short for timestamp, and this format saves extra bytes in the AOF. We can use timestamp annotation for some special functions. - know the executing time of commands - restore data to a specific point-in-time (by using redis-check-aof to truncate the file)	2021-10-25 13:08:34 +03:00
Guy Korland	6cf6c36937	Replace deprecated REDISMODULE_POSTPONED_ARRAY_LEN in module tests and examples (#9677 ) REDISMODULE_POSTPONED_ARRAY_LEN is deprecated, use REDISMODULE_POSTPONED_LEN instead	2021-10-25 12:00:43 +03:00
Shaya Potter	12ce2c3925	Add RM_ReplyWithBigNumber module API (#9639) Let modules use additional type of RESP3 response (unused by redis so far) Also fix tests that were introduced in #8521 but didn't actually run. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-25 11:31:20 +03:00
Wang Yuan	c1718f9d86	Replication backlog and replicas use one global shared replication buffer (#9166)
## Background
For a redis master, each replica uses its own copy of the replication buffer, which is a big waste of memory: the more replicas, the more waste, and allocating/freeing memory for every reply list also costs a lot. If we set client-output-buffer-limit small and write traffic is heavy, the master may disconnect replicas and fail to finish synchronization with them. If we set client-output-buffer-limit big, the master may run out of memory when there are many replicas that each hold a lot of memory separately. Because the replication buffers of different replica clients are the same, one simple idea is for all replicas to use a single replication buffer, which effectively saves memory. Since the replication backlog content is the same as the replicas' output buffers, we can now discard the replication backlog memory and use the global shared replication buffer to implement the replication backlog mechanism.
## Implementation
I create one global "replication buffer" which contains the content of the replication stream. The structure of the "replication buffer" is similar to the reply list that exists in every client. But the node of the list is `replBufBlock`, which has `id, repl_offset, refcount` fields.
```c
/* Replication buffer blocks is the list of replBufBlock.
 *
 * +--------------+       +--------------+       +--------------+
 * | refcount = 1 |  ...  | refcount = 0 |  ...  | refcount = 2 |
 * +--------------+       +--------------+       +--------------+
 *      |                                            /       \
 *      |                                           /         \
 *      |                                          /           \
 *  Repl Backlog                               Replica_A    Replica_B
 *
 * Each replica or replication backlog increments only the refcount of the
 * 'ref_repl_buf_node' which it points to. So when replica walks to the next
 * node, it should first increase the next node's refcount, and when we trim
 * the replication buffer nodes, we remove node always from the head node which
 * refcount is 0. If the refcount of the head node is not 0, we must stop
 * trimming and never iterate the next node. */

/* Similar with 'clientReplyBlock', it is used for shared buffers between
 * all replica clients and replication backlog. */
typedef struct replBufBlock {
    int refcount;           /* Number of replicas or repl backlog using. */
    long long id;           /* The unique incremental number. */
    long long repl_offset;  /* Start replication offset of the block. */
    size_t size, used;
    char buf[];
} replBufBlock;
```
So now when we feed the replication stream into the replication backlog and all replicas, we only need to feed the stream into the replication buffer via `feedReplicationBuffer`. In this function, we set some fields of the replication backlog and replicas to references of the global replication buffer blocks. We also need to check each replica's output buffer limit and free the replica if it exceeds `client-output-buffer-limit`, and trim the replication backlog if it exceeds `repl-backlog-size`. When sending replies to replicas, we also need to iterate the replication buffer blocks and send their content; once a block has been fully sent to a replica, we decrease the current node's refcount and increase the next node's refcount, and then free blocks whose refcount is 0 from the head of the replication buffer blocks. Since we now use a linked list to manage the replication backlog, it may take a long time to iterate all linked list nodes to find the corresponding replication buffer node. So we create a rax tree to store some nodes as an index, but to avoid the rax tree occupying too much memory, i record one per 64 nodes for the index. Currently, to make partial resynchronization possible as often as we can, we always let the replication backlog be the last reference of the replication buffer blocks. The backlog size may exceed our setting if slow replicas reference a large number of replication buffer blocks, but this doesn't increase memory usage since they share the replication buffer.
To avoid freezing the server while freeing unreferenced replication buffer blocks when the backlog needs to be trimmed for exceeding the backlog size setting, we trim the backlog incrementally (free 64 blocks per call now), and make it faster in `beforeSleep` (free 640 blocks).
### Other changes
- `mem_total_replication_buffers`: we add this field to the INFO command; it is the total memory used by replication buffers.
- `mem_clients_slaves`: even when a replica is slow to replicate and its output buffer memory is not 0, this field may still be 0. Since the replication backlog and replicas share one global replication buffer, only if the replication buffer memory is more than the repl backlog setting size do we count the excess as replicas' memory. Otherwise, we consider the replication buffer memory to be consumed by the repl backlog.
- Key eviction: since all replicas and the replication backlog share the global replication buffer, we count only the part exceeding the backlog size as extra separate consumption of replicas. Because we trim the backlog incrementally in the background, the backlog size may exceed our setting if slow replicas that reference a large number of replication buffer blocks disconnect. To avoid a massive eviction loop, we don't count the delayed freed replication backlog into used memory even if there are no replicas, i.e. we also regard this memory as replicas' memory.
- `client-output-buffer-limit` check for replica clients: it doesn't make sense to set the replica clients output buffer limit lower than the repl-backlog-size config (partial sync will succeed and then the replica will get disconnected). Such a configuration is ignored (the size of repl-backlog-size will be used). This doesn't have memory consumption implications since the replica client will share the backlog buffers memory.
- Drop replication backlog after loading data if needed: we always create the replication backlog if the server is a master, because we put DELs in it when loading expired keys from the RDB. But if the RDB doesn't have replication info or there is no RDB, partial resynchronization is not possible, so we drop the backlog to avoid the extra memory.
- Multi IO threads: since all replicas and the replication backlog use the global replication buffer, if I/O threads are enabled, the main thread must handle sending the output buffer to all replicas in order to keep data access thread safe. Before this change, other IO threads could handle sending the output buffers of replicas.
## Other optimizations
This solution resolves some other problems:
- When replicas are disconnected from the master for exceeding the output buffer limit, releasing the output buffer of replicas could freeze the server if we set a big `client-output-buffer-limit` for replicas, but now it doesn't cause freezing.
- This implementation may mitigate the reply list copy time (which also freezes the server) when one replica has a huge reply buffer and another replica can copy that buffer for full synchronization. Now we just copy reference info, which is very light.
- If we set the replication backlog size big, it also could take a long time to copy the replication backlog into a replica's output buffer. This commit eliminates this problem.
- Resizing replication backlog size doesn't empty current replication backlog content.	2021-10-25 09:24:31 +03:00
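The refcount rule described in the commit message above (take the next block's reference before dropping the current one, and trim only from the head while its refcount is 0) can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the Redis implementation: `block`, `block_list`, `list_append`, `reader_advance`, and `trim_head` are hypothetical names, and the payload, offsets, and rax index are omitted.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a shared replication buffer block. */
typedef struct block {
    int refcount;          /* number of readers (replicas or backlog) pointing here */
    struct block *next;
} block;

typedef struct {
    block *head, *tail;
} block_list;

/* Append a new, unreferenced block at the tail. Returns NULL on allocation failure. */
block *list_append(block_list *l) {
    block *b = calloc(1, sizeof(*b));
    if (!b) return NULL;
    if (l->tail) l->tail->next = b; else l->head = b;
    l->tail = b;
    return b;
}

/* Move a reader to the next block: take the new reference first,
 * then drop the old one, so the next block is referenced before
 * the current one can become eligible for trimming. */
void reader_advance(block **pos) {
    block *next = (*pos)->next;
    if (!next) return;
    next->refcount++;
    (*pos)->refcount--;
    *pos = next;
}

/* Free blocks from the head while they are unreferenced. Stops at the
 * first block with a non-zero refcount and never looks past it.
 * Returns the number of blocks freed. */
int trim_head(block_list *l) {
    int freed = 0;
    while (l->head && l->head->refcount == 0) {
        block *b = l->head;
        l->head = b->next;
        if (!l->head) l->tail = NULL;
        free(b);
        freed++;
    }
    return freed;
}
```

In this sketch a slow reader pins the head block, so nothing behind it is freed until that reader advances, which mirrors the "stop trimming at the first referenced head" rule in the quoted comment.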
Itamar Haber	48e4d77099	Fixes `CLUSTER COUNTKEYSINSLOT` (#9672 ) Introduced via typo in #9504. Also adds a sanity test for coverage.	2021-10-24 12:32:53 +03:00
Shaya Potter	cf860df599	Fix module blocked clients RESP version (#9634 ) Before this commit, module blocked clients did not carry through the original RESP version, resulting with RESP3 clients receiving unexpected RESP2 replies.	2021-10-21 14:01:10 +03:00
Yossi Gottlieb	8bf4c2e38c	Fix test modules build issue on OS X 11. (#9658 )	2021-10-20 21:01:30 +03:00
Oran Agra	7d6744c739	fix new cluster tests issues (#9657 ) Following #9483 the daily CI exposed a few problems. * The cluster creation code (uses redis-cli) is complicated to test with TLS enabled. for now i'm just skipping them since the tests we run there don't really need that kind of coverage * cluster port binding failures note that `find_available_port` already looks for a free cluster port but the code in `wait_server_started` couldn't detect the failure of binding (the text it greps for wasn't found in the log)	2021-10-20 15:40:28 +03:00
guybe7	43e736f79b	Treat subcommands as commands (#9504) ## Intro The purpose is to allow having different flags/ACL categories for subcommands (Example: CONFIG GET is ok-loading but CONFIG SET isn't) We create a small command table for every command that has subcommands and each subcommand has its own flags, etc. (same as a "regular" command) This commit also unites the Redis and the Sentinel command tables ## Affected commands CONFIG Used to have "admin ok-loading ok-stale no-script" Changes: 1. Dropped "ok-loading" in all except GET (this doesn't change behavior since there were checks in the code doing that) XINFO Used to have "read-only random" Changes: 1. Dropped "random" in all except CONSUMERS XGROUP Used to have "write use-memory" Changes: 1. Dropped "use-memory" in all except CREATE and CREATECONSUMER COMMAND No changes. MEMORY Used to have "random read-only" Changes: 1. Dropped "random" in PURGE and USAGE ACL Used to have "admin no-script ok-loading ok-stale" Changes: 1. Dropped "admin" in WHOAMI, GENPASS, and CAT LATENCY No changes. MODULE No changes. SLOWLOG Used to have "admin random ok-loading ok-stale" Changes: 1. Dropped "random" in RESET OBJECT Used to have "read-only random" Changes: 1. Dropped "random" in ENCODING and REFCOUNT SCRIPT Used to have "may-replicate no-script" Changes: 1. Dropped "may-replicate" in all except FLUSH and LOAD CLIENT Used to have "admin no-script random ok-loading ok-stale" Changes: 1. Dropped "random" in all except INFO and LIST 2. Dropped "admin" in ID, TRACKING, CACHING, GETREDIR, INFO, SETNAME, GETNAME, and REPLY STRALGO No changes. PUBSUB No changes. CLUSTER Changes: 1. Dropped "admin" in countkeysinslots, getkeysinslot, info, nodes, keyslot, myid, and slots SENTINEL No changes. 
(note that DEBUG also fits, but we decided not to convert it since it's for debugging and anyway undocumented) ## New sub-command This commit adds another element to the per-command output of COMMAND, describing the list of subcommands, if any (in the same structure as "regular" commands) Also, it adds a new subcommand: ``` COMMAND LIST [FILTERBY (MODULE <module-name>|ACLCAT <cat>|PATTERN <pattern>)] ``` which returns a set of all commands (unless filters are given), but excluding subcommands. ## Module API A new module API, RM_CreateSubcommand, was added, in order to allow module writers to define subcommands ## ACL changes: 1. Now that each subcommand is actually a command, each has its own ACL id. 2. The old mechanism of allowed_subcommands is redundant (blocking/allowing a subcommand is the same as blocking/allowing a regular command), but we had to keep it, to support the widespread usage of allowed_subcommands to block commands with certain args that aren't subcommands (e.g. "-select +select|0"). 3. I have renamed allowed_subcommands to allowed_firstargs to emphasize the difference. 4. Because subcommands are commands in ACL too, you can now use "-" to block subcommands (e.g. "+client -client|kill"), which wasn't possible in the past. 5. It is also possible to use the allowed_firstargs mechanism with subcommands. For example: `+config -config|set +config|set|loglevel` will block all CONFIG SET except for setting the log level. 6. All of the ACL changes above required some amount of refactoring. ## Misc 1. There are two approaches: Either each subcommand has its own function or all subcommands use the same function, determining what to do according to argv[0]. For now, I took the former approach only with CONFIG and COMMAND, while other commands use the latter approach (for smaller blamelog diff). 2. Deleted memoryGetKeys: It is no longer needed because MEMORY USAGE now uses the "range" key spec. 3. Bugfix: GETNAME was missing from CLIENT's help message. 
4. Sentinel and Redis now use the same table, with the same function pointer. Some commands have a different implementation in Sentinel, so we redirect them (these are ROLE, PUBLISH, and INFO). 5. Command stats now show the stats per subcommand (e.g. instead of stats just for "config" you will have stats for "config|set", "config|get", etc.) 6. It is now possible to use COMMAND directly on subcommands: COMMAND INFO CONFIG|GET (The pipe syntax was inspired by ACL, and can be used in functions lookupCommandBySds and lookupCommandByCString) 7. STRALGO is now a container command (has "help") ## Breaking changes: 1. Command stats now show the stats per subcommand (see (5) above)	2021-10-20 11:52:57 +03:00
qetu3790	4962c5526d	Release clients blocked on module commands in cluster resharding and down state (#9483 ) Prevent clients from being blocked forever in cluster when they block with their own module command and the hash slot is migrated to another master at the same time. These will get a redirection message when unblocked. Also, release clients blocked on module commands when cluster is down (same as other blocked clients) This commit adds basic tests for the main (non-cluster) redis test infra that test the cluster. This was done because the cluster test infra can't handle some common test features, but most importantly we only build the test modules with the non-cluster test suite. note that rather than really supporting cluster operations by the test infra, it was added (as dup code) in two files, one for module tests and one for non-modules tests, maybe in the future we'll refactor that. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-19 11:50:37 +03:00
Wen Hui	1c2b5f5318	Make Cluster-bus port configurable with new cluster-port config (#9389 ) Make Cluster-bus port configurable with new cluster-port config	2021-10-18 22:28:27 -07:00
Viktor Söderqvist	b7f2a1a217	Add RedisModule_KeyExists (#9600) The LRU of the key is not touched. Logically expired keys logically do not exist, so they're treated as such.	2021-10-18 22:21:19 +03:00
yoav-steinberg	81095b1bd9	Skip Active-defrag edge case test until we fix it. (#9645 ) Test started failing consistently in 32bit builds after upgrading to jemalloc 5.2.1 (#9623).	2021-10-18 13:28:52 +03:00
Oran Agra	276b460ea9	Attempt to fix a valgrind test failure due to timing (#9643 ) in the past few days i've seen two failures in the valgrind daily test. *** [err]: slave fails full sync and diskless load swapdb recovers it in tests/integration/replication.tcl Replica didn't get into loading mode can't reproduce it, but i'm hoping it's just too slow (to start loading within 5 seconds)	2021-10-18 10:45:45 +03:00
Hanna Fadida	61bb044156	Modify mem_usage2 module callback to enable to take sample_size argument (#9612 ) This is useful for approximating size computation of complex module types. Note that the mem_usage2 callback is new and has not been released yet, which is why we can modify it.	2021-10-17 17:31:06 +03:00
Yossi Gottlieb	6d5a911707	Fix daily failures due to macos-latest change. (#9637 ) * Fix test modules linking on macOS 11.x. * Use macOS 10.x for FreeBSD VM as VirtualBox is not yet supported on 11.	2021-10-17 00:07:27 +03:00
Madelyn Olson	a6b5d518a9	Improved the reliability of cluster replica sync tests (#9628 ) Improved the reliability of cluster replica sync tests	2021-10-13 00:06:53 -07:00
Bjorn Svensson	54d01e363a	Move config `cluster-config-file` to generic configs (#9597 )	2021-10-07 22:32:40 -07:00
yoav-steinberg	834e8843de	obuf based eviction tests run until eviction occurs (#9611) obuf based eviction tests run until eviction occurs instead of assuming a certain amount of writes will fill the obuf enough for eviction to occur. This handles the kernel buffering written data and emptying the obuf even though no one actually reads from it. The tests have a new timeout of 20sec: if the test doesn't pass after 20 sec it'll fail. Hopefully this is enough for our slow CI targets. This also eliminates the need to skip some tests in TLS.	2021-10-07 15:43:48 +03:00
Huang Zhw	fd135f3e2d	Make tracking invalidation messages always after command's reply (#9422 ) Tracking invalidation messages were sometimes sent in inconsistent order, before the command's reply rather than after. In addition to that, they were sometimes embedded inside other commands responses, like MULTI-EXEC and MGET.	2021-10-07 15:13:42 +03:00
GutovskyMaria	d98d1ad574	Hide empty and loading replicas from CLUSTER SLOTS responses (#9287 ) Hide empty and loading replicas from CLUSTER SLOTS responses	2021-10-06 22:22:27 -07:00
yoav-steinberg	123cc1a1bc	Test fails when flushdb triggers a bgsave (#9535 ) Flush db and then wait for the bgsave to complete.	2021-10-06 11:50:47 +03:00
yoav-steinberg	897c7bddf5	Attempt to fix rare pubsub obuf maxmemory eviction test failure (#9603) * Reduce delay between publishes to allow less time to write the obufs. * More subscribed clients to buffer more data per publish. * Make sure main connection isn't evicted (it has a large qbuf).	2021-10-05 18:00:19 +03:00
yoav-steinberg	83478e6102	argv mem leak during multi command execution. (#9598) Changes in #9528 led to a memory leak if the command implementation used rewriteClientCommandArgument inside MULTI-EXEC. Adding an explicit test for that case since the test that uncovered it didn't specifically target this scenario	2021-10-05 12:17:36 +03:00
Meir Shpilraien (Spielrein)	0f8b634cd5	Fix invalid memory write on lua stack overflow (CVE-2021-32626) (#9591) When LUA calls our C code, by default, the LUA stack has room for 10 elements. In most cases, this is more than enough but sometimes it's not and the caller must verify the LUA stack size before pushing elements. In 3 places in the code, there was no verification of the LUA stack size. On specific inputs this missing verification could have led to an invalid memory write: 1. On 'luaReplyToRedisReply', one might return a nested reply that will explode the LUA stack. 2. On 'redisProtocolToLuaType', the Redis reply might be deep enough to explode the LUA stack (notice that currently there is no such command in Redis that returns such a nested reply, but modules might do it) 3. On 'ldbRedis', one might give a command with enough arguments to explode the LUA stack (all the arguments will be pushed to the LUA stack) This commit solves all 3 issues by calling 'lua_checkstack' and verifying that there is enough room in the LUA stack to push elements. In case 'lua_checkstack' returns an error (there is not enough room in the LUA stack and it's not possible to increase the stack), we will do the following: 1. On 'luaReplyToRedisReply', we will return an error to the user. 2. On 'redisProtocolToLuaType' we will exit with panic (we assume this scenario is rare because it can only happen with a module). 3. On 'ldbRedis', we return an error.	2021-10-04 15:17:50 +03:00
Oran Agra	b0ca3be2bb	Fix protocol parsing on 'ldbReplParseCommand' (CVE-2021-32672) (#9590 ) The protocol parsing on 'ldbReplParseCommand' (LUA debugging) Assumed protocol correctness. This means that if the following is given: *1 $100 test The parser will try to read additional 94 unallocated bytes after the client buffer. This commit fixes this issue by validating that there are actually enough bytes to read. It also limits the amount of data that can be sent by the debugger client to 1M so the client will not be able to explode the memory. Co-authored-by: meir@redislabs.com <meir@redislabs.com>	2021-10-04 12:14:12 +03:00
Oran Agra	c5e6a6204c	Fix ziplist and listpack overflows and truncations (CVE-2021-32627, CVE-2021-32628) (#9589) - fix possible heap corruption in ziplist and listpack resulting from trying to allocate more than the maximum size of 4GB. - prevent ziplist (hash and zset) from reaching a size above 1GB; it will be converted to HT encoding, since that's not a useful size. - prevent listpack (stream) from reaching a size above 1GB. - XADD will start a new listpack if the new record may cause the previous listpack to grow over 1GB. - XADD will respond with an error if a single stream record is over 1GB - List type (ziplist in quicklist) was truncating strings that were over 4GB, now it'll respond with an error. Co-authored-by: sundb <sundbcn@gmail.com>	2021-10-04 12:11:02 +03:00
Oran Agra	fba15850e5	Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675) (#9588 ) This change sets a low limit for multibulk and bulk length in the protocol for unauthenticated connections, so that they can't easily cause redis to allocate massive amounts of memory by sending just a few characters on the network. The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)	2021-10-04 12:10:31 +03:00
YaacovHazan	5becb7c9c6	improve the stability and correctness of "Test child sending info" (#9562 ) Since we measure the COW size in this test by changing some keys and reading the reported COW size, we need to ensure that the "dismiss mechanism" (#8974) will not free memory and reduce the COW size. For that, this commit changes the size of the keys to 512B (less than a page). and because some keys may fall into the same page, we are modifying ten keys on each iteration and check for at least 50% change in the COW size.	2021-10-04 10:32:26 +03:00
yoav-steinberg	6f4f31f167	decrby LLONG_MIN caused negation overflow. (#9577) Note that this breaks compatibility because in the past doing: DECRBY x -9223372036854775808 would succeed (and create an invalid result) and now this returns an error.	2021-10-03 09:38:05 +03:00
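The overflow behind the DECRBY fix above comes from negating the decrement: `-LLONG_MIN` is not representable in a `long long`. A minimal sketch of such a guard follows, assuming a hypothetical helper `checked_decrby` (this is an illustration, not the function or code path used in Redis):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not the Redis implementation: computes
 * value - decr into *out and returns false instead of overflowing. */
bool checked_decrby(long long value, long long decr, long long *out) {
    /* -LLONG_MIN does not fit in a long long, so reject it up front. */
    if (decr == LLONG_MIN) return false;
    long long incr = -decr;
    /* Standard signed-addition overflow check, done before adding. */
    if ((incr > 0 && value > LLONG_MAX - incr) ||
        (incr < 0 && value < LLONG_MIN - incr)) return false;
    *out = value + incr;
    return true;
}
```

Both checks run before any arithmetic that could overflow, so the sketch never relies on undefined signed-overflow behavior.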
yoav-steinberg	93e8534713	Remove argument count limit, dynamically grow argv. (#9528 ) Remove hard coded multi-bulk limit (was 1,048,576), new limit is INT_MAX. When client sends an m-bulk that's higher than 1024, we initially only allocate the argv array for 1024 arguments, and gradually grow that allocation as arguments are received.	2021-10-03 09:13:09 +03:00
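The lazy allocation described in the argv commit above can be sketched as follows. This is a minimal illustration under stated assumptions: `arg_vec`, `arg_vec_init`, `arg_vec_push`, and `INITIAL_ARGV_CAP` are hypothetical names, and the doubling growth policy is an assumption (the commit message only says the allocation grows gradually).

```c
#include <stdlib.h>

/* Hypothetical initial allocation, mirroring the 1024 figure in the commit message. */
#define INITIAL_ARGV_CAP 1024

typedef struct {
    char **argv;    /* argument slots allocated so far */
    int argc;       /* arguments received so far */
    int cap;        /* allocated capacity of argv */
    int declared;   /* argument count announced by the client */
} arg_vec;

/* Allocate at most INITIAL_ARGV_CAP slots, however many arguments were announced. */
int arg_vec_init(arg_vec *v, int declared) {
    v->argv = NULL;
    v->argc = 0;
    v->cap = 0;
    v->declared = declared;
    if (declared < 0) return -1;
    v->cap = declared < INITIAL_ARGV_CAP ? declared : INITIAL_ARGV_CAP;
    v->argv = malloc(sizeof(char *) * (size_t)(v->cap > 0 ? v->cap : 1));
    return v->argv ? 0 : -1;
}

/* Store one received argument, growing the array only when it is full.
 * Capacity doubles (assumed policy) but never exceeds the announced count. */
int arg_vec_push(arg_vec *v, char *arg) {
    if (v->argc >= v->declared) return -1;   /* more arguments than announced */
    if (v->argc == v->cap) {
        int new_cap = v->cap > v->declared / 2 ? v->declared : v->cap * 2;
        char **grown = realloc(v->argv, sizeof(char *) * (size_t)new_cap);
        if (!grown) return -1;
        v->argv = grown;
        v->cap = new_cap;
    }
    v->argv[v->argc++] = arg;
    return 0;
}
```

The point of the design is that a client announcing a huge argument count only costs memory in proportion to the arguments it actually sends.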
Hanna Fadida	ffafb434fb	Modules: add RM_LoadDataTypeFromStringEncver (#9537) adding an advanced api to enable loading data that was serialized with a specific encoding version	2021-09-30 11:21:32 +03:00
yoav-steinberg	d715655f16	verbose debug print in test to debug rare CI failure. (#9563 )	2021-09-29 17:10:05 +03:00
Oran Agra	5a4ab7c7d2	Fix stream sanitization for non-int first value (#9553) This was recently broken in #9321 when we validated stream IDs to be integers but did that after stepping to the next record instead of before.	2021-09-26 18:46:22 +03:00
yoav-steinberg	6600253046	Client eviction ci issues (#9549 ) Fixing CI test issues introduced in #8687 - valgrind warnings in readQueryFromClient when client was freed by processInputBuffer - adding DEBUG pause-cron for tests not to be time dependent. - skipping a test that depends on socket buffers / events not compatible with TLS - making sure client got subscribed by not using deferring client	2021-09-26 17:45:02 +03:00
Yossi Gottlieb	0af7fe2cab	Add --skipfile and --skiptest regex support. (#9555 ) Empty patterns are not considered and skipped. Also, improve help text.	2021-09-26 15:12:37 +03:00
Huang Zhw	bdecbd30df	Fix test randstring, compare string and int is wrong. (#9544 ) This will cause the generated string containing "\". Fixes a broken change in #8687	2021-09-24 16:58:38 +03:00
Yossi Gottlieb	bebc7f8470	Add RM_TrimStringAllocation(). (#9540 ) This commit makes it possible to explicitly trim the allocation of a RedisModuleString. Currently, Redis automatically trims strings that have been retained by a module command when it returns. However, this is not thread safe and may result with corruption in threaded modules. Supporting explicit trimming offers a backwards compatible workaround to this problem.	2021-09-23 15:00:37 +03:00

1552 Commits