redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
Binbin	a7726cdf51	Fix SENTINEL subcommands's arity (#9909 ) For `SENTINEL SET`, we can use in these ways: 1. SENTINEL SET mymaster quorum 3 2. SENTINEL SET mymaster quorum 5 parallel-syncs 1 For `SENTINEL SIMULATE-FAILURE`, although it is only used for testing: 1. SENTINEL SIMULATE-FAILURE CRASH-AFTER-ELECTION 2. SENTINEL SIMULATE-FAILURE CRASH-AFTER-ELECTION CRASH-AFTER-PROMOTION	2021-12-08 08:59:02 +02:00
丽媛自己动	0b26e389d8	Update documentation of _dictReset (#9908 )	2021-12-07 11:14:18 -08:00
yoav-steinberg	1736fa4d22	Don't write oom score adj to proc unless we're managing it. (#9904) When disabling redis oom-score-adj management we restore the base value read before enabling oom-score-adj management. This fixes an issue introduced in #9748 where updating `oom-score-adj-values` while `oom-score-adj` was set to `no` would write the base oom score adj value read on startup to `/proc`. This is a bug since while `oom-score-adj` is disabled we should never write to proc and should let external processes manage it. Added appropriate tests.	2021-12-07 16:05:51 +02:00
sundb	1808618f5d	Sanitize dump payload: fix invalid listpack entry start with EOF (#9889) When an invalid listpack entry starts with EOF, we will skip it when we verify it in the loop.	2021-12-04 16:43:08 +02:00
Oran Agra	64f6159646	Merge Redis Functions PR (#9780 ) # Redis Function This PR added the Redis Functions capabilities that were suggested on #8693. The PR also introduce a big refactoring to the current Lua implementation (i.e `scripting.c`). The main purpose of the refactoring is to have better code sharing between the Lua implementation that exists today on Redis (`scripting.c`) and the new Lua engine that is introduced on this PR. The refactoring includes code movements and file name changes as well as some logic changes that need to be carefully reviewed. To make the review easier, the PR was split into multiple commits. Each commit is deeply described later on but the main concept is that some commits are just moving code around without making any logical changes, those commits are less likely to cause any issues or regressions and can be reviewed fast. Other commits, which perform code and logic changes, need to be reviewed carefully, but those commits were created after the code movements so it's pretty easy to see what was changed. To sum up, it is highly recommended to review this PR commit by commit as it will be easier to see the changes, it is also recommended to read each commit description (written below) to understand what was changed on the commit and whether or not it's just a huge code movement or a logic changes. ## Terminology Currently, the terminology in Redis is not clearly defined. Scripts refer to Lua scripts and eval also refers only to Lua. Introducing Redis Function requires redefining those terms to be able to clearly understand what is been discussed on each context. * eval - legacy Lua script implementation. * Function - new scripting implementation (currently implemented in Lua but in the future, it might be other languages like javascript). * Engine - the component that is responsible for executing functions. 
* Script - Function or legacy Lua (executed with `eval` or `evalsha`) ## Refactoring New Structure Today, the entire scripting logic is located on `scripting.c`. This logic can be split into 3 main groups: 1. Script management - responsible for storing the scripts that were sent to Redis and retrieving them when they need to be run (based on the script sha on the current implementation). 2. Script invocation - invoke the script given on `eval` or `evalsha` command (this part includes finding the relevant script, preparing the arguments, ..) 3. Interact back with Redis (command invocation) Those 3 groups are tightly coupled on `scripting.c`. Redis Functions also need to use the logic of those groups, for example, to interact back with Redis or to execute Lua code. The refactoring attempts to split those 3 groups and define APIs so that we can reuse the code both on legacy Lua scripts and Redis Functions. In order to do so we define the following units: 1. script.c: responsible for interaction with Redis from within a script. 2. script_lua.c: responsible for executing Lua code, uses `script.c` to interact with Redis from within the Lua code. 3. function_lua.c: contains the Lua engine implementation, uses `script_lua.c` to execute the Lua code. 4. functions.c: contains Redis Functions implementation (`FUNCTION` command), uses `function_lua.c` if the function it wants to invoke needs the Lua engine. 5. eval.c: the original `scripting.c` contains the Lua legacy implementation and was refactored to use `script_lua.c` to invoke the Lua code. ## Commits breakdown Notice: Some small commits are omitted from this list as they are small and insignificant (for example build fixes) ### First commit - code movements This commit renames `scripting.c` -> `eval.c` and introduces the new `script_lua.c` unit. 
The commit moves relevant code from `eval.c` (`scripting.c`) to `script_lua.c`, the purpose of moving the code is so that later we will be able to re-use the code on the Lua engine (`function_lua.c`). The commit only moves the code without modifying even a single line, so there is a very low risk of breaking anything and it also makes it much easier to see the changes on the following commits. Because the commit does not change the code (only moves it), it does not compile. But we do not care about it as the only purpose here is to make the review processes simpler. ### Second commit - move legacy Lua variables into `eval.c` Today, all Lua-related variables are located on the server struct. The commit attempt to identify those variable and take them out from the server struct, leaving only script related variables (variables that later need to be used also by engines) The following variable where renamed and left on the server struct: * lua_caller -> script_caller * lua_time_limit -> script_time_limit * lua_timedout -> script_timedout * lua_oom -> script_oom * lua_disable_deny_script -> script_disable_deny_script * in_eval -> in_script The following variables where moved to lctx under eval.c * lua * lua_client * lua_cur_script * lua_scripts * lua_scripts_mem * lua_replicate_commands * lua_write_dirty * lua_random_dirty * lua_multi_emitted * lua_repl * lua_kill * lua_time_start * lua_time_snapshot This commit is in a low risk of introducing any issues and it is just moving variables around and not changing any logic. ### Third commit - introducing script unit This commit introduces the `script.c` unit. Its purpose (as described above) is to provide an API for scripts to interact with Redis. Interaction includes mostly executing commands, but also other functionalities. The interaction is done using a `ScriptRunCtx` object that needs to be created by the user and initialized using `scriptPrepareForRun`. A detailed list of functionalities expose by the unit: 1. 
Calling commands (including all the validation checks such as acl, cluster, read only run, ...) 2. Set Resp 3. Set Replication method (AOF/REPLICATION/NONE) 4. Call Redis back on long-running scripts to allow Redis to reply to clients and perform script kill The commit introduces the new unit and uses it on eval commands to interact with Redis. ### Fourth commit - Moved functionality of invoke Lua code to `script_lua.c` This commit moves the logic of invoking the Lua code into `script_lua.c` so later it can be used also by Lua engine (`function_lua.c`). The code is located on `callFunction` function and assumes the Lua function is already located on the top of the Lua stack. This commit also changes `eval.c` to use the new functionality to invoke Lua code. ### Fifth commit - Added Redis Functions unit (`functions.c`) and Lua engine (`function_lua.c`) Added Redis Functions unit under `functions.c`, included: 1. FUNCTION command: * FUNCTION CREATE * FUNCTION CALL * FUNCTION DELETE * FUNCTION KILL * FUNCTION INFO * FUNCTION STATS 2. Register engines In addition, this commit introduces the first engine that uses the Redis Functions capabilities, the Lua engine (`function_lua.c`) ## API Changes ### `lua-time-limit` configuration was renamed to `script-time-limit` (keep `lua-time-limit` as alias for backward compatibility). ### Error log changes When integrating with Redis from within a Lua script, the `Lua` term was removed from all the error messages and instead we write only `script`. For example: `Wrong number of args calling Redis command From Lua script` -> `Wrong number of args calling Redis command From script` ### `info memory` changes: Before stating all the changes made to memory stats we will try to explain the reason behind them and what we want to see on those metrics: * memory metrics should show both totals (for all scripting frameworks), as well as a breakdown per framework / vm. * The totals metrics should have "human" metrics while the breakdown shouldn't. 
* We did try to maintain backward compatibility in some way, that said we did make some repurpose to existing metrics where it looks reasonable. * We separate between memory used by the script framework (part of redis's used_memory), and memory used by the VM (not part of redis's used_memory) A full breakdown of `info memory` changes: * `used_memory_lua` and `used_memory_lua_human` was deprecated, `used_memory_vm_eval` has the same meaning as `used_memory_lua` * `used_memory_scripts` was renamed to `used_memory_scripts_eval` * `used_memory_scripts` and `used_memory_scripts_human` were repurposed and now return the total memory used by functions and eval (not including vm memory, only code cache, and structs). * `used_memory_vm_function` was added and represents the total memory used by functions vm's * `used_memory_functions` was added and represents the total memory by functions (not including vm memory, only code cache, and structs) * `used_memory_vm_total` and `used_memory_vm_total_human` was added and represents the total memory used by vm's (functions and eval combined) ### `functions.caches` `functions.caches` field was added to `memory stats`, representing the memory used by engines that are not functions (this memory includes data structures like dictionaries, arrays, ...) ## New API ### FUNCTION CREATE Usage: FUNCTION CREATE `ENGINE` `NAME` `[REPLACE]` `[DESC <DESCRIPTION>]` `<CODE>` * `ENGINE` - The name of the engine to use to create the script. * `NAME` - the name of the function that can be used later to call the function using `FUNCTION CALL` command. * `REPLACE` - if given, replace the given function with the existing function (if exists). * `DESCRIPTION` - optional argument describing the function and what it does * `CODE` - function code. The command will return `OK` if created successfully or error in the following cases: * The given engine name does not exist * The function name is already taken and `REPLACE` was not used. 
* The given function failed to compile. ### FCALL and FCALL_RO Usage: FCALL/FCALL_RO `NAME` `NUM_KEYS key1 key2` … ` arg1 arg2` Call and execute the function specified by `NAME`. The function will receive all arguments given after `NUM_KEYS`. The return value from the function will be returned to the user as a result. * `NAME` - Name of the function to run. * The rest is as today with EVALSHA command. The command will return an error in the following cases: * `NAME` does not exist * The function itself returned an error. The `FCALL_RO` is equivalent to `EVAL_RO` and allows only read-only commands to be invoked from the script. ### FUNCTION DELETE Usage: FUNCTION DELETE `NAME` Delete a function identified by `NAME`. Return `OK` on success or error on one of the following: * The given function does not exist ### FUNCTION INFO Usage: FUNCTION INFO `NAME` [WITHCODE] Return information about a function by function name: * Function name * Engine name * Description * Raw code (only if WITHCODE argument is given) ### FUNCTION LIST Usage: FUNCTION LIST Return general information about all the functions: * Function name * Engine name * Description ### FUNCTION STATS Usage: FUNCTION STATS Return information about the currently running function: * Function name * Command that was used to invoke the function * Duration in MS that the function has already been running If no function is currently running, this section is just a RESP nil. Additionally, return a list of all the available engines. ### FUNCTION KILL Usage: `FUNCTION KILL` Kill the currently executing function. The command will fail if the function already initiated a write command. ## Notes Note: Function creation/deletion is replicated to AOF but AOFRW is not implemented since it is going to be removed: #9794	2021-12-02 21:41:58 +02:00
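The `FCALL` usage line above can be illustrated with a small model of how the arguments split into keys and plain arguments. This is only a sketch in Python, not the code from `functions.c`; the helper name `split_fcall_args` and its error messages are hypothetical.

```python
from typing import List, Tuple


def split_fcall_args(argv: List[str]) -> Tuple[str, List[str], List[str]]:
    """Split ['FCALL', NAME, NUM_KEYS, key..., arg...] into (name, keys, args).

    Hypothetical helper: it only models the argument layout described in the
    usage line, not the real command implementation.
    """
    if len(argv) < 3:
        raise ValueError("wrong number of arguments")
    name = argv[1]
    try:
        num_keys = int(argv[2])
    except ValueError:
        raise ValueError("NUM_KEYS must be an integer") from None
    if num_keys < 0:
        raise ValueError("NUM_KEYS can't be negative")
    rest = argv[3:]
    if num_keys > len(rest):
        raise ValueError("NUM_KEYS is greater than the number of arguments")
    # The first NUM_KEYS values are key names, everything after is an argument.
    return name, rest[:num_keys], rest[num_keys:]


name, keys, args = split_fcall_args(["FCALL", "myfunc", "2", "k1", "k2", "a1"])
print(name, keys, args)
```

Keeping the key names separate from the other arguments mirrors the `EVALSHA` convention that the commit message refers to.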
meir@redislabs.com	cbd463175f	Redis Functions - Added redis function unit and Lua engine Redis function unit is located inside functions.c and contains Redis Function implementation: 1. FUNCTION commands: * FUNCTION CREATE * FCALL * FCALL_RO * FUNCTION DELETE * FUNCTION KILL * FUNCTION INFO 2. Register engine In addition, this commit introduce the first engine that uses the Redis Function capabilities, the Lua engine.	2021-12-02 19:35:52 +02:00
Binbin	e3c0ea1cb4	Fix a harmless bug when using monitor in redis-cli with wrong reply (#9875) When we use monitor in redis-cli but encounter an error reply, we will get stuck until we press Ctrl-C to quit. This is a harmless bug. It might be useful if we add parameters to monitor in the future, such as monitoring only selected db. before: ``` 127.0.0.1:6379> monitor wrong (error) ERR wrong number of arguments for 'monitor' command or subcommand ^C(9.98s) 127.0.0.1:6379> ``` after: ``` 127.0.0.1:6379> monitor wrong (error) ERR wrong number of arguments for 'monitor' command or subcommand 127.0.0.1:6379> ```	2021-12-02 10:41:50 +02:00
meir@redislabs.com	f21dc38a6e	Redis Functions - Moved invoke Lua code functionality to script_lua.c The functionality was moved to script_lua.c under callFunction function. Its purpose is to call the Lua function already located on the top of the Lua stack. Used the new function on eval.c to invoke Lua code. The function will also be used to invoke Lua code on the Lua engine.	2021-12-01 23:57:06 +02:00
meir@redislabs.com	fc731bc67f	Redis Functions - Introduce script unit. Script unit is a new unit located on script.c. Its purpose is to provide an API for functions (and eval) to interact with Redis. Interaction includes mostly executing commands, but also functionalities like calling Redis back on long scripts or checking if the script was killed. The interaction is done using a scriptRunCtx object that needs to be created by the user and initialized using scriptPrepareForRun. Detailed list of functionalities exposed by the unit: 1. Calling commands (including all the validation checks such as acl, cluster, read only run, ...) 2. Set Resp 3. Set Replication method (AOF/REPLICATION/NONE) 4. Call Redis back on long-running scripts to allow Redis to reply to clients and perform script kill The commit introduces the new unit and uses it on eval commands to interact with Redis.	2021-12-01 23:54:23 +02:00
meir@redislabs.com	e0cd580aef	Redis Functions - Move Lua related variable into luaCtx struct The following variables were renamed: 1. lua_caller -> script_caller 2. lua_time_limit -> script_time_limit 3. lua_timedout -> script_timedout 4. lua_oom -> script_oom 5. lua_disable_deny_script -> script_disable_deny_script 6. in_eval -> in_script The following variables were moved to lctx under eval.c 1. lua 2. lua_client 3. lua_cur_script 4. lua_scripts 5. lua_scripts_mem 6. lua_replicate_commands 7. lua_write_dirty 8. lua_random_dirty 9. lua_multi_emitted 10. lua_repl 11. lua_kill 12. lua_time_start 13. lua_time_snapshot This commit has a low risk of introducing any issues, as it is just moving variables around and not changing any logic.	2021-12-01 23:31:08 +02:00
meir@redislabs.com	22aab1ce94	Redis Functions - Move code to make review process easier. This commit only moves code around without changing it. The reason behind this is to make the review process easier by allowing the reviewer to simply ignore all code movements. changes: 1. rename scripting.c to eval.c 2. introduce a new file, script_lua.c, and move parts of Lua functionality to this new file. script_lua.c will eventually contain the shared code between legacy lua and lua engine. This commit does not compile on purpose. Its only purpose is to move code and rename files.	2021-12-01 23:30:59 +02:00
yoav-steinberg	0e5b813ef9	Multiparam config set (#9748 ) We can now do: `config set maxmemory 10m repl-backlog-size 5m` ## Basic algorithm to support "transaction like" config sets: 1. Backup all relevant current values (via get). 2. Run "verify" and "set" on everything, if we fail run "restore". 3. Run "apply" on everything (optional optimization: skip functions already run). If we fail run "restore". 4. Return success. ### restore 1. Run set on everything in backup. If we fail log it and continue (this puts us in an undefined state but we decided it's better than the alternative of panicking). This indicates either a bug or some unsupported external state. 2. Run apply on everything in backup (optimization: skip functions already run). If we fail log it (see comment above). 3. Return error. ## Implementation/design changes: * Apply function are idempotent (have no effect if they are run more than once for the same config). * No indication in set functions if we're reading the config or running from the `CONFIG SET` command (removed `update` argument). * Set function should set some config variable and assume an (optional) apply function will use that later to apply. If we know this setting can be safely applied immediately and can always be reverted and doesn't depend on any other configuration we can apply immediately from within the set function (and not store the setting anywhere). This is the case of this `dir` config, for example, which has no apply function. No apply function is need also in the case that setting the variable in the `server` struct is all that needs to be done to make the configuration take effect. Note that the original concept of `update_fn`, which received the old and new values was removed and replaced by the optional apply function. * Apply functions use settings written to the `server` struct and don't receive any inputs. 
* I take care that for the generic (non-special) configs if there's no change I avoid calling the setter (possible optimization: avoid calling the apply function as well). * Passing the same config parameter more than once to `config set` will fail. You can't do `config set my-setting value1 my-setting value2`. Note that getting `save` in the context of the conf file parsing to work here as before was a pain. The conf file supports an aggregate `save` definition, where each `save` line is added to the server's save params. This is unlike any other line in the config file where each line overwrites any previous configuration. Since we now support passing multiple save params in a single line (see top comments about `save` in https://github.com/redis/redis/pull/9644) we should deprecate the aggregate nature of this config line and perhaps reduce this ugly code in the future.	2021-12-01 10:15:11 +02:00
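The "transaction like" algorithm described above (backup, verify and set, apply, restore on failure) can be sketched with a toy model. This is not the code from `config.c`: it is written in Python for brevity, and `config_set_many`, the `values`, `verifiers` and `appliers` dictionaries, and the `log` list are all hypothetical stand-ins for the `server` struct and the per-config callbacks.

```python
from typing import Any, Callable, Dict, List, Tuple


def _run_appliers(appliers: Dict[str, Callable[[], None]], names: List[str]) -> None:
    done = set()
    for name in names:
        fn = appliers.get(name)
        # Optimization from the description: skip apply functions already run.
        if fn is not None and fn not in done:
            fn()
            done.add(fn)


def config_set_many(values: Dict[str, Any],
                    verifiers: Dict[str, Callable[[Any], bool]],
                    appliers: Dict[str, Callable[[], None]],
                    updates: List[Tuple[str, Any]],
                    log: List[str]) -> None:
    names = [name for name, _ in updates]
    if len(set(names)) != len(names):
        raise ValueError("duplicate parameter")
    # 1. Back up all relevant current values.
    backup = {name: values[name] for name in names}
    try:
        # 2. Verify and set everything.
        for name, new_value in updates:
            check = verifiers.get(name)
            if check is not None and not check(new_value):
                raise ValueError(f"invalid value for {name}")
            values[name] = new_value
        # 3. Apply everything; appliers read the stored values, they take no input.
        _run_appliers(appliers, names)
    except Exception:
        # restore: set the backup values, re-apply, log failures and carry on.
        values.update(backup)
        try:
            _run_appliers(appliers, names)
        except Exception as exc:
            log.append(f"restore failed: {exc}")
        raise


values = {"maxmemory": 0, "repl-backlog-size": 1}
verifiers = {"maxmemory": lambda v: v >= 0}
config_set_many(values, verifiers, {}, [("maxmemory", 10), ("repl-backlog-size", 5)], [])
print(values)
```

In this model a failed verification on any parameter leaves every parameter at its previous value, which is the property the multi-parameter `CONFIG SET` aims for.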
Itamar Haber	21aa1d4b91	Adds auto-seq-only-generation via `XADD ... <ms>-*` (#9217 ) Adds the ability to autogenerate the sequence part of the millisecond-only explicit ID specified for `XADD`. This is useful in case added entries have an externally-provided timestamp without sub-millisecond resolution.	2021-11-30 19:56:39 +02:00
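The sequence auto-generation for `XADD ... <ms>-*` can be sketched as follows. This is a Python model of how the next ID could be derived, assuming the usual stream rule that IDs must be strictly increasing; `next_stream_id` is a hypothetical name and the error messages are illustrative, not the ones Redis returns.

```python
from typing import Tuple

MAX_SEQ = 2**64 - 1  # stream ID parts are unsigned 64-bit integers


def next_stream_id(last_id: Tuple[int, int], ms: int) -> Tuple[int, int]:
    """Pick the sequence part for an explicit millisecond part `ms`."""
    last_ms, last_seq = last_id
    if ms > last_ms:
        # New millisecond: the sequence starts again from zero.
        return (ms, 0)
    if ms == last_ms:
        if last_seq == MAX_SEQ:
            raise ValueError("sequence part exhausted for this millisecond")
        # Same millisecond as the top entry: bump the sequence.
        return (ms, last_seq + 1)
    raise ValueError("ID would be smaller than the stream's top entry")


print(next_stream_id((1638295000000, 3), 1638295000000))
print(next_stream_id((1638295000000, 3), 1638295000001))
```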
Wen Hui	2afa41f628	Sentinel master reboot fix (#9438 ) Add master-reboot-down-after-period as a configurable parameter, to make it possible to trigger a failover from a master that is responding with `-LOADING` for a long time after being restarted.	2021-11-30 18:46:15 +02:00
丽媛自己动	af072c26bc	modify misleading note in comment (#9865) rdbSaveInfo is now used in both ways, so the previous notes should be updated to avoid misleading readers.	2021-11-30 14:10:49 +02:00
Viktor Söderqvist	bdf531e38d	Extend output of DEBUG HELP for POPULATE (#9869 )	2021-11-30 12:36:52 +02:00
Meir Shpilraien (Spielrein)	b8e82d205b	Swap '\r\n' with spaces when returning a big number reply from Lua script. (#9870) The issue can only happen with a bad Lua script that claims to return a big number while actually returning data which is not a big number (contains chars that are not digits). Such a thing will not cause an issue unless the big number value contains `\r\n`, in which case it messes up the resp3 structure. The fix replaces all the appearances of '\r\n' with spaces. Such an issue can also happen on simple string or error replies but those already handle it the same way this PR does (replace `\r\n` with spaces). Other reply types are not vulnerable to this issue because they do not rely on free text that is terminated with `\r\n` (either the reply contains the bulk length, like a string reply, or it is a typed reply that can not inject free text, like boolean or number). The issue only exists on the unstable branch; big number reply on Lua script was not yet added to any official release.	2021-11-30 12:27:05 +02:00
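The reason a stray `\r\n` matters is that a RESP3 big number is a single line: a `(` prefix, the digits, then the terminating `\r\n`. The sketch below models the fix in Python, assuming each CR or LF character is replaced by one space; `sanitize_line_reply` and `encode_big_number` are hypothetical names, not functions from the Redis source.

```python
def sanitize_line_reply(payload: str) -> str:
    # Assumption: every CR and LF character becomes a single space, so the
    # payload can no longer terminate the reply line early.
    return payload.replace("\r", " ").replace("\n", " ")


def encode_big_number(payload: str) -> str:
    # RESP3 big number: '(' + text + CRLF, with no length prefix.
    return "(" + sanitize_line_reply(payload) + "\r\n"


# A misbehaving script tries to smuggle a second reply after the number.
print(repr(encode_big_number("123\r\n+OK")))
```

Without the sanitizing step the client would read `(123` as a complete reply and then treat `+OK` as the start of another one.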
Binbin	3119a3aeb5	Fix CLIENT KILL kill all clients with id 0 (#9853 ) * Fix CLIENT KILL kill all clients with id 0 or with skipme CLIENT KILL with ID argument should only kill the client with the provided ID. In old code, CLIENT KILL with id 0 will kill all the connected clients. Co-authored-by: Ofir Luzon <ofirluzon@gmail.com>	2021-11-29 13:35:36 -08:00
OfirMos	9f9c78578f	fixed mem leak on rdb load error (#9860 ) a rare case of short read that can happen when breaking the master-replica connection on diskless load mode,	2021-11-29 12:09:08 +02:00
Binbin	980bb3ae19	Add REDIS_CFLAGS='-Werror' to CI tests (#9828 ) Update CI so that warnings cause build failures. Also fix a warning in `test-sanitizer-address`: ``` In function ‘strncpy’, inlined from ‘clusterUpdateMyselfIp’ at cluster.c:545:13: /usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound 46 equals destination size [-Werror=stringop-truncation] 106 \| return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors ```	2021-11-29 10:30:35 +02:00
sundb	494ee2f1fc	Fix abnormal compression due to out-of-control recompress (#9849) This pr is following #9779. ## Description of the feature Now when we turn on the `list-compress-depth` configuration, the list will compress the ziplist between `[list-compress-depth, -list-compress-depth]`. When we need to use the compressed data, we will first decompress it, then use it, and finally compress it again. It's controlled by `quicklistNode->recompress`, which is designed to avoid the need to re-traverse the entire quicklist for compression after each decompression: we only need to recompress the quicklistNode being used. In order to ensure the correctness of recompressing, we should normally let quicklistDecompressNodeForUse and quicklistCompress appear in pairs, otherwise, it may lead to the head and tail being compressed or the middle ziplist not being compressed correctly, which is exactly the problem this pr needs to solve. ## Solution 1. Reset `quicklistIter` after insert and replace. The quicklist node will be compressed in `quicklistInsertAfter`, `quicklistInsertBefore`, `quicklistReplaceAtIndex`, so we can safely reset the quicklistIter to avoid it being used again 2. `quicklistIndex` will return an iterator that can be used to recompress the current node after use. ## Test 1. In the `Stress Tester for #3343-Similar Errors` test, when the server crashes or when `valgrind` or `asan` error is detected, print violating commands. 2. Add a crash test due to wrongly recompressing after `lrem`. 3. Remove `insert before with 0 elements` and `insert after with 0 elements`. Now we forbid any operation on a NULL quicklistIter.	2021-11-29 07:57:01 +02:00
Meir Shpilraien (Spielrein)	6b0b04f1b2	Clean Lua stack before parsing call reply to avoid crash on a call with many arguments (#9809) This commit `0f8b634cd` (CVE-2021-32626 released in 6.2.6, 6.0.16, 5.0.14) fixes an invalid memory write issue by using `lua_checkstack` API to make sure the Lua stack does not overflow. This fix was added in 3 places: 1. `luaReplyToRedisReply` 2. `ldbRedis` 3. `redisProtocolToLuaType` On the first 2 functions, `lua_checkstack` is handled gracefully while the last is handled with an assert and a statement that this situation can not happen (only with a misbehaving module): > the Redis reply might be deep enough to explode the LUA stack (notice that currently there is no such command in Redis that returns such a nested reply, but modules might do it) The issue that was discovered is that user arguments are also considered part of the stack, and so the following script (for example) makes the assertion reachable: ``` local a = {} for i=1,7999 do a[i] = 1 end return redis.call("lpush", "l", unpack(a)) ``` This is a regression because such a script would have worked before and now it is crashing Redis. The solution is to clear the function arguments from the Lua stack which makes the original assumption true and the assertion unreachable.	2021-11-28 11:59:39 +02:00
Viktor Söderqvist	acf3495eb8	Sort out the mess around writable replicas and lookupKeyRead/Write (#9572 ) Writable replicas now no longer use the values of expired keys. Expired keys are deleted when lookupKeyWrite() is used, even on a writable replica. Previously, writable replicas could use the value of an expired key in write commands such as INCR, SUNIONSTORE, etc.. This commit also sorts out the mess around the functions lookupKeyRead() and lookupKeyWrite() so they now indicate what we intend to do with the key and are not affected by the command calling them. Multi-key commands like SUNIONSTORE, ZUNIONSTORE, COPY and SORT with the store option now use lookupKeyRead() for the keys they're reading from (which will not allow reading from logically expired keys). This commit also fixes a bug where PFCOUNT could return a value of an expired key. Test modules commands have their readonly and write flags updated to correctly reflect their lookups for reading or writing. Modules are not required to correctly reflect this in their command flags, but this change is made for consistency since the tests serve as usage examples. Fixes #6842. Fixes #7475.	2021-11-28 11:26:28 +02:00
sundb	4d8700786e	Fix COMMAND GETKEYS on LCS (#9852 ) Remove lcsGetKeys to clean up the remaining STRALGO after #9733. i.e. it still used a getkeys_proc which was still looking for the KEYS or STRINGS arguments	2021-11-28 09:02:38 +02:00
Huang Zhw	596635fa0c	Cleanup: fix typo and remove some obsoleting definitions. (#9851 )	2021-11-27 14:47:51 -08:00
uriyage	fa48fb2d0f	Do not watch keys for dirty client (#9829 ) Currently, the watching clients are marked as dirty when a watched key is touched, but we continue watching the keys for no reason. Then, when the same key is touched again, we iterate again on the watching clients list and mark all clients as dirty again. Only when the exec/unwatch command is issued will the client be removed from the key->watching_clients list. The same applies when a dirty client calls the WATCH command. The key will be added to be watched by the client even if it has no effect. In the field, no performance degradation was observed as a result of the current implementation; it is merely a cleanup with possible memory and performance gains in some situations. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-25 15:23:15 +02:00
Pavel Melkozerov	9630ded313	Fix for bad log messages in rdbSave (#9842) (#9843) The log message printed the wrong file name when opening the temporary file failed. The log could also report an error that occurred in getcwd, since the same errno was used to report the error. Co-authored-by: Pavel Melkozerov <pavel.melkozerov@nokia.com>	2021-11-24 16:01:39 +02:00
sundb	4512905961	Replace ziplist with listpack in quicklist (#9740 ) Part three of implementing #8702, following #8887 and #9366 . ## Description of the feature 1. Replace the ziplist container of quicklist with listpack. 2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation. ## Interface changes 1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`. 2. Replace `debug ziplist` command with `debug listpack`. ## Internal changes 1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`) 2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`) 3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following #9357 ). It represent that a quicklistNode is a packed node, as opposed to a plain node. 4. Remove `createZiplistObject` method, which is never used. 5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`. We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k. ## Improvements 1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at #9366. 2. Optimize `quicklistAppendPlainNode` to avoid memcpy data. ## Bugfix 1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from #9366. ## Test 1. Add unittest for `lpMerge`. 2. Modify the old quicklist ziplist corrupt dump test. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-24 13:34:13 +02:00
Oran Agra	a3a014294f	fix invalid read on corrupt ziplist (#9831 ) If the last bytes in ziplist are corrupt and we decode from tail to head, we may reach slightly outside the ziplist.	2021-11-23 14:56:52 +02:00
guybe7	b161cff5f9	QUIT is a command, HOST: and POST are not (#9798 ) Some people complain that QUIT is missing from help/command table. Not appearing in COMMAND command, command stats, ACL, etc. and instead, there's a hack in processCommand with a comment that looks outdated. Note that it is [documented](https://redis.io/commands/quit) At the same time, HOST: and POST are there in the command table although these are not real commands. They would appear in the COMMAND command, and even in commandstats. Other changes: 1. Initialize the static logged_time static var in securityWarningCommand 2. add `no-auth` flag to RESET so it can always be executed.	2021-11-23 10:38:25 +02:00
Oran Agra	f07dedf73f	Fix invalid access in lpFind on corrupted listpack (#9819 ) Issue found by corrupt-dump-fuzzer test with ASAN. The problem was that lpSkip and lpGetWithSize could read the next listpack entry without validating that it's in range. Similarly even the memcmp in lpFind could do that and possibly crash on segfault and now they'll crash on assert first. The naive fix of using lpAssertValidEntry every time, resulted in 30% degradation in the lpFind benchmark of the unit test. The final fix with the condition at the bottom has no performance implications.	2021-11-22 15:30:00 +02:00
Oran Agra	183b90a625	Fix false positive leak reported by GCC ASAN (#9816 ) Leak found by the corrupt-dump-fuzzer when using GCC ASAN, which seems to falsely report leaks on pointers kept only on the stack when calling exit. Instead we now use _exit on panic / assert to skip these leak checks. Additionally, check for sanitizer warnings in the corrupt-dump-fuzzer between iterations, so that when something is found we know which test to relate it too (and it prints reproduction command list)	2021-11-21 18:47:10 +02:00
Yossi Gottlieb	fd0ca74763	Fix occasional RM_Call() crashes. (#9805 ) With dynamically growing argc (#9528), it is necessary to initialize argv_len. Normally createClient() handles that, but in the case of a module shared_client, this needs to be done explicitly. This also addresses an issue with rewriteClientCommandArgument() which doesn't properly handle the case where the new element extends beyond argc but not beyond argv_len.	2021-11-21 15:54:14 +02:00
Oran Agra	1417648469	Prevent LCS from allocating temp memory over proto-max-bulk-len (#9817 ) LCS can allocate an immense amount of memory (the sizes of the two inputs multiplied by each other). In the past this caused some possible security issues due to overflows, which we solved, and we also added use of `trymalloc` to return "Insufficient memory" instead of the zmalloc OOM panic. But in case overcommit is enabled, it could be that we won't get the OOM panic, and zmalloc will succeed, and then we can get OOM killed by the kernel. The solution here is to prevent LCS from allocating transient memory that's bigger than the `proto-max-bulk-len` config. This config is not directly related to transient memory, but using a hard coded value as well as introducing a specific config seems wrong. This solves an error in the corrupt-dump-fuzzer test that started appearing in the daily CI (see #9799).	2021-11-21 14:30:20 +02:00
Madelyn Olson	362b3b02e6	Initialize manual failover replica target (#9814 )	2021-11-19 19:56:44 -08:00
RuiliF	b567d3694c	fix memory leak in example module hellocluster (#9813 ) Missing RedisModule_FreeCallReply	2021-11-20 00:48:08 +02:00
Wen Hui	75d50e5de9	Fix module doc - missing function due to blank line (#9807 )	2021-11-18 21:07:53 +02:00
perryitay	0c10f0e1c0	Fix crashes when list-compress-depth is used. (#9779 ) Recently we started using list-compress-depth in tests (it was completely untested till now). It turns out this triggered test failures in external mode, since the tests left the setting enabled and then it was used in other tests (specifically the fuzzer named "Stress tester for #3343-alike bugs"). This PR fixes the issue of the `recompress` flag being left set by mistake, which caused the code to later compress the head or tail nodes (which should never be compressed). The solution is to reset the recompress flag when it should have been reset (when it was decided not to compress). Additionally we're adding some assertions and improving the tests in order to catch other similar bugs.	2021-11-18 18:09:30 +02:00
Eduardo Semprebon	1a255e3150	Reject PING with MASTERDOWN when replica-serve-stale-data=no (#9757 ) Currently PING returns a different status when the server is not serving data, for example when `LOADING` or `BUSY`. But the same was not true for `MASTERDOWN`. This commit makes PING reply with `MASTERDOWN` when replica-serve-stale-data=no and the link to the master is down.	2021-11-18 10:53:17 +02:00
guybe7	af7489886d	Obliterate STRALGO! add LCS (which only works on keys) (#9799 ) Drop the STRALGO command, now LCS is a command of its own and it only works on keys (not input strings). The motivation is that STRALGO's syntax was really messed-up... - assumes all (future) string algorithms will take similar arguments - mixes command that takes keys and one that doesn't in the same command. - make it nearly impossible to expose the right key spec in COMMAND INFO (issues cluster clients) - hard for cluster clients to determine the key names (firstkey, lastkey, etc) - hard for ACL / flags (is it a read command?) This is a breaking change.	2021-11-18 10:47:49 +02:00
Binbin	91e77a0cfb	Fixes ZPOPMIN/ZPOPMAX wrong replies when count is 0 with non-zset (#9711 ) Moves ZPOP ... 0 fast exit path after type check to reply with WRONGTYPE. In the past it will return an empty array. Also now count is not allowed to be negative. see #9680 before: ``` 127.0.0.1:6379> set zset str OK 127.0.0.1:6379> zpopmin zset 0 (empty array) 127.0.0.1:6379> zpopmin zset -1 (empty array) ``` after: ``` 127.0.0.1:6379> set zset str OK 127.0.0.1:6379> zpopmin zset 0 (error) WRONGTYPE Operation against a key holding the wrong kind of value 127.0.0.1:6379> zpopmin zset -1 (error) ERR value is out of range, must be positive ```	2021-11-18 10:13:16 +02:00
Madelyn Olson	32215e7889	Unpause clients earlier during manual cluster failover (#9676 ) Unpause clients after manual failover ends instead of the timed offset	2021-11-17 21:28:13 -08:00
Axlgrep	5f8ee3815c	optimize the judgment logic of use_pattern in scanGenericCommand() (#9789 ) In the `scan 0 match ""` case, pat is an empty sds (patlen is 0). I don't think we should access the first character directly in this case (even though the first character is '\0'), so for code readability I swapped the order of the two conditions.	2021-11-16 17:03:24 +02:00
harleyliao	992a610fe8	set aof rewrite status err, when fork fail (#5606 ) When an AOF rewrite fails at fork(), the failure is now indicated by the aof_last_bgrewrite_status INFO field, the same as when the fork child fails later on.	2021-11-16 13:59:03 +02:00
sundb	985430b4fc	Change lzf to handle values larger than UINT32_MAX (#9776 ) Redis supports inserting data over 4GB into a string (and recently into lists too, see #9357), but the LZF compression used in RDB files (see `rdbcompression` config) and in quicklist (see `list-compress-depth` config) does not support compressing/decompressing data over UINT32_MAX, which will result in corrupting the rdb after compression. Internal changes: 1. Modify the `unsigned int` parameter of `lzf_compress/lzf_decompress` to `size_t`. 2. Modify the variable types in `lzf_compress` involving offsets and lengths to `size_t`. 3. Set LZF_USE_OFFSETS to 0. When LZF_USE_OFFSETS is 1, lzf stores offsets in `LZF_HSLOT` (32-bit). Even on 64-bit, `LZF_USE_OFFSETS` defaults to 1, because lzf assumes that it only compresses and decompresses data smaller than UINT32_MAX. But now that we need lzf to support 64-bit, turning on `LZF_USE_OFFSETS` would make it impossible to store 64-bit offsets or pointers. BTW, disabling LZF_USE_OFFSETS also brings a few performance improvements. Tests: 1. Add a test for compressing/decompressing strings larger than UINT32_MAX. 2. Add a unit test for compressing/decompressing a quicklistNode.	2021-11-16 13:12:25 +02:00
sundb	e725d737fb	Add --large-memory flag for REDIS_TEST to enable tests that consume more than 100mb (#9784 ) This is a preparation step in order to add a new test in quicklist.c see #9776	2021-11-16 08:55:10 +02:00
guoxiang1996	aba70df48f	insufficient size for cached client flags in call() (#9783 ) The client flags field is a 64 bit integer, but the temporary cached value on the stack of call() is 32 bit. Luckily this doesn't lead to any bugs, since the only flags checked against this variable are below bit 32.	2021-11-16 08:21:23 +02:00
Ozan Tezcan	b91d8b289b	Add sanitizer support and clean up sanitizer findings (#9601 ) - Added sanitizer support. `address`, `undefined` and `thread` sanitizers are available. - To build Redis with the desired sanitizer: `make SANITIZER=undefined` - There were some sanitizer findings, cleaned up the codebase - Added tests with address and undefined behavior sanitizers to daily CI. - Added tests with address sanitizer to the per-PR CI (smoke out mem leaks sooner). Basically, there are three types of issues: 1- Unaligned load/store: most probably, this issue may cause a crash on a platform that does not support unaligned access. Redis does unaligned access only on supported platforms. 2- Signed integer overflow: although signed overflow can be problematic from time to time and change how the compiler generates code, the current findings are mostly about signed shift or simple addition overflow. For most platforms Redis can be compiled for, this wouldn't cause any issue as far as I can tell (checked generated code on godbolt.org). 3- Minor leak (redis-cli), use-after-free (just before calling exit()). UB means nothing is guaranteed and it is risky to reason about program behavior, but I don't think any of the fixes here are worth backporting. As sanitizers are now part of the CI, preventing new issues will be the real benefit.	2021-11-11 13:51:33 +02:00
Jim Brunner	cd0f710be9	fix minor memory leak in rewriteConfigSds (#9762 )	2021-11-09 10:35:22 -08:00
Eduardo Semprebon	c22d3684ba	Fix diskless load handling on broken EOF marker (#9752 ) During diskless replication, the check for broken EOF mark is misplaced and should be earlier. Now we do not swap db, we do proper cleanup and correctly raise module events on this kind of failure. This issue existed prior to #9323, but before, the side effect was not restoring backup and not raising the correct module events on this failure.	2021-11-09 11:46:10 +02:00
Wen Hui	2ce29e032b	Sentinel tls memory leak (#9753 ) There was a memory leak when tls is used in Sentinels. The memory leak is noticed when some of the replicas are offline.	2021-11-08 22:23:31 +02:00
Yossi Gottlieb	a1aba4bf75	Fix EINTR test failures. (#9751 ) * Clean up EINTR handling so EINTR will not change connection state to begin with. * On TLS, catch EINTR and return it as-is before going through OpenSSL error handling (which seems to not distinguish it from EAGAIN).	2021-11-08 16:09:33 +02:00
Huang Zhw	48d870aed1	Move config from clusterCron to config update (#9580 )	2021-11-07 18:56:03 -08:00
yoav-steinberg	79ac57561f	Refactor config.c for generic setter interface (#9644 ) This refactors all `CONFIG SET`s and conf file loading arguments to go through the generic config handling interface. Refactoring changes: - All config params go through the `standardConfig` interface (some stuff which is only related to the config file and not the `CONFIG` command still has special handling for rewrite/config file parsing, `loadmodule`, for example). - Added a `MULTI_ARG_CONFIG` flag for configs to signify they receive a variable number of arguments instead of a single argument. This is used to break up space separated arguments to `CONFIG SET` so the generic setter interface can pass multiple arguments to the setter function. When parsing the config file we also break up anything after the config name into multiple arguments to the setter function. Interface changes: - A side effect of the above interface is that the `bind` argument in the config file can be empty (no argument at all); this is treated the same as passing a single empty string argument (same as `save` already used to work). - Support rewrite and setting `watchdog-period` from the config file (was only supported by the CONFIG command till now). - Another side effect is that the `save T X` config argument now supports multiple Time-Changes pairs in a single line like its `CONFIG SET` counterpart. So in the config file you can either do: ``` save 3600 1 save 600 10 ``` or do ``` save 3600 1 600 10 ``` Co-authored-by: Bjorn Svensson <bjorn.a.svensson@est.tech>	2021-11-07 13:40:08 +02:00
Binbin	ddb508c0a2	Fixing import of sys/errno (#9739 )	2021-11-05 07:50:25 +02:00
Eduardo Semprebon	91d0c758e5	Replica keep serving data during repl-diskless-load=swapdb for better availability (#9323 ) For diskless replication in swapdb mode, considering we already spend replica memory having a backup of current db to restore in case of failure, we can have the following benefits by instead swapping database only in case we succeeded in transferring db from master: - Avoid `LOADING` response during failed and successful synchronization for cases where the replica is already up and running with data. - Faster total time of diskless replication, because now we're moving from Transfer + Flush + Load time to Transfer + Load only. Flushing the tempDb is done asynchronously after swapping. - This could be implemented also for disk replication with similar benefits if consumers are willing to spend the extra memory usage. General notes: - The concept of `backupDb` becomes `tempDb` for clarity. - Async loading mode will only kick in if the replica is syncing from a master that has the same repl-id the one it had before. i.e. the data it's getting belongs to a different time of the same timeline. - New property in INFO: `async_loading` to differentiate from the blocking loading - Slot to Key mapping is now a field of `redisDb` as it's more natural to access it from both server.db and the tempDb that is passed around. - Because this is affecting replicas only, we assume that if they are not readonly and write commands during replication, they are lost after SYNC same way as before, but we're still denying CONFIG SET here anyways to avoid complications. 
Considerations for review: - We have many cases where the server.loading flag is used and even though I tried my best, there may be cases where async_loading should be checked as well and cases where it shouldn't (this would require a very good understanding of the whole code) - Several places that had different behavior depending on the loading flag were actually meant to just handle commands coming from the AOF client differently than ones coming from real clients; these were changed to check CLIENT_ID_AOF instead. Additional for Release Notes - Bugfix - server.dirty was not incremented for any kind of diskless replication, so it didn't contribute to triggering the next database SAVE - New flag for RM_GetContextFlags module API: REDISMODULE_CTX_FLAGS_ASYNC_LOADING - Deprecated RedisModuleEvent_ReplBackup. Starting from Redis 7.0, we don't fire this event. Instead, we have the new RedisModuleEvent_ReplAsyncLoad holding 3 sub-events: STARTED, ABORTED and COMPLETED. - New module flag REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD for RedisModule_SetModuleOptions to allow modules to declare they support the diskless replication with async loading (when absent, we fall back to disk-based loading). Co-authored-by: Eduardo Semprebon <edus@saxobank.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-04 10:46:50 +02:00
Itamar Haber	06dd202a05	Fixes LPOP/RPOP wrong replies when count is 0 (#9692 ) Introduced in #8179, this fixes the command's replies in the 0 count edge case. [BREAKING] changes the reply type when count is 0 to an empty array (instead of nil) Moves LPOP ... 0 fast exit path after type check to reply with WRONGTYPE	2021-11-04 09:43:08 +02:00
menwen	ccf8a651f3	Retry when a blocked connection system call is interrupted by a signal (#9629 ) When repl-diskless-load is enabled, the connection is set to the blocking state. The connection may be interrupted by a signal during a system call. This would have resulted in a disconnection and possibly a reconnection loop. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-04 09:09:28 +02:00
perryitay	f27083a4a8	Add support for list type to store elements larger than 4GB (#9357 ) Redis lists are stored in quicklist, which is currently a linked list of ziplists. Ziplists are limited to storing elements no larger than 4GB, so when bigger items are added they're getting truncated. This PR changes quicklists so that they're capable of storing large items in quicklist nodes that are plain string buffers rather than ziplist. As part of the PR there were a few other changes in redis: 1. new DEBUG sub-commands: - QUICKLIST-PACKED-THRESHOLD - set the threshold for the node type to be plain or ziplist. default (1GB) - QUICKLIST <key> - Shows low level info about the quicklist encoding of <key> 2. rdb format change: - A new type was added - RDB_TYPE_LIST_QUICKLIST_2 . - container type (packed / plain) was added to the beginning of the rdb object (before the actual node list). 3. testing: - Tests that require over 100MB will be skipped by default. A new flag was added to 'runtest' to run the large memory tests (not used by default) Co-authored-by: sundb <sundbcn@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-03 20:47:18 +02:00
guybe7	f11a2d4dd7	Fix COMMAND GETKEYS on EVAL without keys (#9733 ) Add new no-mandatory-keys flag to support COMMAND GETKEYS of commands which have no mandatory keys. In the past we would have got this error: ``` 127.0.0.1:6379> command getkeys eval "return 1" 0 (error) ERR Invalid arguments specified for command ```	2021-11-03 14:38:26 +02:00
perryitay	77d3c6bff3	fix: lookupKey on SETNX and SETXX only once (#9640 ) When using SETNX and SETXX we could end up doing key lookup twice. This presents a small inefficiency price. Also once we have statistics of write hit and miss they'll be wrong (recording the same key hit twice)	2021-11-03 14:12:33 +02:00
yiyuaner	78025c4a26	Add checks for overflow in redis-check-aof and loadAppendOnlyFile (#9669 ) Co-authored-by: guoyiyuan <guoyiyuan@sbrella.com>	2021-11-02 17:03:07 +02:00
Wang Yuan	526cbb5cff	Fix not updating backlog histlen when trimming repl backlog (#9713 ) Since the loop in incrementalTrimReplicationBacklog checks the size of histlen, we cannot afford to update it only when the loop exits. Doing so may delete many more replication blocks than intended, and the replication backlog may end up smaller than the configured size. Introduced in #9166. Co-authored-by: sundb <sundbcn@gmail.com>	2021-11-02 11:04:11 +02:00
zhaozhao.zz	d08f0552ee	rebuild replication backlog index when master restart (#9720 ) After PR #9166, the replication backlog is not a real block of memory; it just contains a reference that points to a replication buffer block, plus the blocks index (to accelerate searching for an offset on partial sync). So we need to update both the offsets of the replication buffer blocks and the offsets in the replication backlog blocks index when the master restarts from RDB, since `server.master_repl_offset` is changed. The implication of this bug was just a slow search, not a replication failure.	2021-11-02 10:53:52 +02:00
Oran Agra	f1f3cceb50	fix valgrind issues with long double module test (#9709 ) The module test in reply.tcl was introduced by #8521 but didn't run until recently (see #9639) and then it started failing with valgrind. This is because valgrind uses 64 bit long double (unlike most other platforms that have at least 80 bits). But besides valgrind, the tests were also incompatible with ARM32, which also uses 64 bit long doubles. We now use an appropriate value to avoid issues with either valgrind or ARM32. In all the double tests, i use 3.141, which is safe since addReplyDouble uses `%.17Lg` which is able to represent this value without adding any digits due to precision loss. In the long double, since we use `%.17Lf` in ld2string, it preserves 17 digits after the decimal point, rather than 17 significant digits (like in `%.17Lg`). So to make these similar, i use a value lower than 1 (no digits left of the period). Lastly, we have the same issue with TCL (no long doubles) so we read raw protocol in that test. Note that the only error before this fix (in both valgrind and ARM32) is this: ``` *** [err]: RM_ReplyWithLongDouble: a float reply in tests/unit/moduleapi/reply.tcl Expected '3.141' to be equal to '3.14100000000000001' (context: type eval line 2 cmd {assert_equal 3.141 [r rw.longdouble 3.141]} proc ::test) ``` so the changes to debug.c and scripting.tcl aren't really needed, but i consider them a cleanup (i.e. scripting.c validated a different constant than the one that's sent to it from debug.c). Another unrelated change is to add the RESP version to the repeated tests in reply.tcl	2021-11-01 13:41:35 +02:00
罗泽轩	155c291006	Remove duplicate SET_OP_XX definitions in t_set.c. (#4326 ) These definitions already exist in server.h.	2021-11-01 11:09:29 +02:00
Binbin	033578839b	Fix multiple COUNT in LMPOP/BLMPOP/ZMPOP/BZMPOP (#9701 ) The previous code did not check whether COUNT is set. So we can use `lmpop 2 key1 key2 left count 1 count 2`. This situation can occur in LMPOP/BLMPOP/ZMPOP/BZMPOP commands. LMPOP/BLMPOP introduced in #9373, ZMPOP/BZMPOP introduced in #9484.	2021-10-31 16:10:29 +02:00
lijinliang	215b909c1f	fix typo in db.c: synchroneus -> synchronous(2 places) (#9702 ) Co-authored-by: lijinliang <lijl@newdt.cn>	2021-10-31 16:01:54 +02:00
Rafi Einstein	734cde7e38	Fix memory leak when there's a read error of module aux data from rdb. (#9705 )	2021-10-31 15:59:48 +02:00
guybe7	975f51fe16	Add new SLOTSRANGE to subcommands table (#9689 )	2021-10-27 10:44:14 +03:00
Wen Hui	5fb4adba65	New Cluster Command: CLUSTER DELSLOTSRANGE and CLUSTER ADDSLOTSRANGE (#9445 )	2021-10-26 21:44:33 -07:00
Wen Hui	43b22f17dc	Sentinel: don't log auth-pass value for better security (#9652 )	2021-10-26 13:13:12 +03:00
Wang Yuan	9ec3294b97	Add timestamp annotations in AOF (#9326 ) Add timestamp annotation in AOF, one part of #9325. Enabled with the new `aof-timestamp-enabled` config option. Timestamp annotation format is "#TS:${timestamp}\r\n". "TS" is short for timestamp, and this method could save extra bytes in AOF. We can use timestamp annotation for some special functions. - know the executing time of commands - restore data to a specific point-in-time (by using redis-check-rdb to truncate the file)	2021-10-25 13:08:34 +03:00
Oran Agra	085615af97	Improve code doc of allowed_firstargs following #9504 (#9674 ) Improve code doc for allowed_firstargs (used to be allowed_subcommands before #9504). I don't think the text in the code needs to refer to the history (it's not there just for backwards compatibility). Instead it should just describe what it does.	2021-10-25 13:01:25 +03:00
Guy Korland	6cf6c36937	Replace deprecated REDISMODULE_POSTPONED_ARRAY_LEN in module tests and examples (#9677 ) REDISMODULE_POSTPONED_ARRAY_LEN is deprecated, use REDISMODULE_POSTPONED_LEN instead	2021-10-25 12:00:43 +03:00
Itamar Haber	00362f2a94	Removes admin acl category from CLIENT TRACKINGINFO (#9662 ) overlooked in #9504	2021-10-25 11:33:37 +03:00
Shaya Potter	12ce2c3925	Add RM_ReplyWithBigNumber module API (#9639 ) Let modules use an additional type of RESP3 response (unused by redis so far). Also fix tests that were introduced in #8521 but didn't actually run. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-25 11:31:20 +03:00
Wang Yuan	c1718f9d86	Replication backlog and replicas use one global shared replication buffer (#9166 ) ## Background For redis master, one replica uses one copy of replication buffer, that is a big waste of memory, more replicas more waste, and allocate/free memory for every reply list also cost much. If we set client-output-buffer-limit small and write traffic is heavy, master may disconnect with replicas and can't finish synchronization with replica. If we set client-output-buffer-limit big, master may be OOM when there are many replicas that separately keep much memory. Because replication buffers of different replica client are the same, one simple idea is that all replicas only use one replication buffer, that will effectively save memory. Since replication backlog content is the same as replicas' output buffer, now we can discard replication backlog memory and use global shared replication buffer to implement replication backlog mechanism. ## Implementation I create one global "replication buffer" which contains content of replication stream. The structure of "replication buffer" is similar to the reply list that exists in every client. But the node of list is `replBufBlock`, which has `id, repl_offset, refcount` fields. ```c /* Replication buffer blocks is the list of replBufBlock. * * +--------------+ +--------------+ +--------------+ * \| refcount = 1 \| ... \| refcount = 0 \| ... \| refcount = 2 \| * +--------------+ +--------------+ +--------------+ * \| / \ * \| / \ * \| / \ * Repl Backlog Replia_A Replia_B * * Each replica or replication backlog increments only the refcount of the * 'ref_repl_buf_node' which it points to. So when replica walks to the next * node, it should first increase the next node's refcount, and when we trim * the replication buffer nodes, we remove node always from the head node which * refcount is 0. If the refcount of the head node is not 0, we must stop * trimming and never iterate the next node. 
*/ /* Similar with 'clientReplyBlock', it is used for shared buffers between * all replica clients and replication backlog. */ typedef struct replBufBlock { int refcount; /* Number of replicas or repl backlog using. */ long long id; /* The unique incremental number. */ long long repl_offset; /* Start replication offset of the block. */ size_t size, used; char buf[]; } replBufBlock; ``` So now when we feed the replication stream into the replication backlog and all replicas, we only need to feed the stream into the replication buffer via `feedReplicationBuffer`. In this function, we set some fields of the replication backlog and replicas to references of the global replication buffer blocks. We also need to check the replicas' output buffer limit to free them if exceeding `client-output-buffer-limit`, and trim the replication backlog if exceeding `repl-backlog-size`. When sending a reply to replicas, we also need to iterate the replication buffer blocks and send their content. When one block has been completely sent to a replica, we decrease the current node's refcount and increase the next node's refcount, and then free the blocks whose refcount is 0 from the head of the replication buffer blocks. Since we now use a linked list to manage the replication backlog, it may cost much time to iterate all linked list nodes to find the corresponding replication buffer node. So we create a rax tree to store some nodes as an index, but to avoid the rax tree occupying too much memory, i record one per 64 nodes in the index. Currently, to make partial resynchronization possible as often as we can, we always keep the replication backlog as the last reference of the replication buffer blocks. The backlog size may exceed our setting if slow replicas reference vast replication buffer blocks, and this method doesn't increase memory usage since they share the replication buffer. 
To avoid freezing server for freeing unreferenced replication buffer blocks when we need to trim backlog for exceeding backlog size setting, we trim backlog incrementally (free 64 blocks per call now), and make it faster in `beforeSleep` (free 640 blocks). ### Other changes - `mem_total_replication_buffers`: we add this field in INFO command, it means the total memory of replication buffers used. - `mem_clients_slaves`: now even replica is slow to replicate, and its output buffer memory is not 0, but it still may be 0, since replication backlog and replicas share one global replication buffer, only if replication buffer memory is more than the repl backlog setting size, we consider the excess as replicas' memory. Otherwise, we think replication buffer memory is the consumption of repl backlog. - Key eviction Since all replicas and replication backlog share global replication buffer, we think only the part of exceeding backlog size the extra separate consumption of replicas. Because we trim backlog incrementally in the background, backlog size may exceeds our setting if slow replicas that reference vast replication buffer blocks disconnect. To avoid massive eviction loop, we don't count the delayed freed replication backlog into used memory even if there are no replicas, i.e. we also regard this memory as replicas's memory. - `client-output-buffer-limit` check for replica clients It doesn't make sense to set the replica clients output buffer limit lower than the repl-backlog-size config (partial sync will succeed and then replica will get disconnected). Such a configuration is ignored (the size of repl-backlog-size will be used). This doesn't have memory consumption implications since the replica client will share the backlog buffers memory. 
- Drop replication backlog after loading data if needed We always create the replication backlog if the server is a master. We need it because we put DELs in it when loading expired keys from the RDB, but if the RDB doesn't have replication info or there is no rdb, it is not possible to support partial resynchronization, so to avoid the extra memory of the replication backlog, we drop it. - Multi IO threads Since all replicas and the replication backlog use the global replication buffer, if I/O threads are enabled, to guarantee thread-safe data access, we must let the main thread handle sending the output buffer to all replicas. Before this change, other IO threads could handle sending the output buffer of all replicas. ## Other optimizations This solution resolves some other problems: - When replicas disconnect from the master because they exceeded the output buffer limit, releasing the output buffer of the replicas could freeze the server if we set a big `client-output-buffer-limit` for replicas, but now it doesn't cause freezing. - This implementation may mitigate the reply list copy time (which also freezes the server) when one replica has a huge reply buffer and another replica can copy the buffer for full synchronization. Now we just copy reference info, which is very light. - If we set a big replication backlog size, it could also cost much time to copy the replication backlog into a replica's output buffer. This commit eliminates this problem. - Resizing the replication backlog size doesn't empty the current replication backlog content.	2021-10-25 09:24:31 +03:00
Oran Agra	6b297cd646	Improve errno reporting on fork and fopen rdbLoad failures (#9649 ) I moved a bunch of stats in redisFork to be executed only on successful fork, since they seem wrong to be done when it failed. I guess when fork fails it does that immediately, no latency spike.	2021-10-24 16:52:44 +03:00
Itamar Haber	48e4d77099	Fixes `CLUSTER COUNTKEYSINSLOT` (#9672 ) Introduced via typo in #9504. Also adds a sanity test for coverage.	2021-10-24 12:32:53 +03:00
Shaya Potter	cf860df599	Fix module blocked clients RESP version (#9634 ) Before this commit, module blocked clients did not carry through the original RESP version, resulting with RESP3 clients receiving unexpected RESP2 replies.	2021-10-21 14:01:10 +03:00
guybe7	8f745da159	Fix sentinel commands, ACL dictIter leak (#9661 )	2021-10-21 12:50:58 +03:00
Oran Agra	7d6744c739	fix new cluster tests issues (#9657 ) Following #9483 the daily CI exposed a few problems. * The cluster creation code (uses redis-cli) is complicated to test with TLS enabled. for now i'm just skipping them since the tests we run there don't really need that kind of coverage * cluster port binding failures note that `find_available_port` already looks for a free cluster port but the code in `wait_server_started` couldn't detect the failure of binding (the text it greps for wasn't found in the log)	2021-10-20 15:40:28 +03:00
guybe7	43e736f79b	Treat subcommands as commands (#9504 ) ## Intro The purpose is to allow having different flags/ACL categories for subcommands (Example: CONFIG GET is ok-loading but CONFIG SET isn't) We create a small command table for every command that has subcommands and each subcommand has its own flags, etc. (same as a "regular" command) This commit also unites the Redis and the Sentinel command tables ## Affected commands CONFIG Used to have "admin ok-loading ok-stale no-script" Changes: 1. Dropped "ok-loading" in all except GET (this doesn't change behavior since there were checks in the code doing that) XINFO Used to have "read-only random" Changes: 1. Dropped "random" in all except CONSUMERS XGROUP Used to have "write use-memory" Changes: 1. Dropped "use-memory" in all except CREATE and CREATECONSUMER COMMAND No changes. MEMORY Used to have "random read-only" Changes: 1. Dropped "random" in PURGE and USAGE ACL Used to have "admin no-script ok-loading ok-stale" Changes: 1. Dropped "admin" in WHOAMI, GENPASS, and CAT LATENCY No changes. MODULE No changes. SLOWLOG Used to have "admin random ok-loading ok-stale" Changes: 1. Dropped "random" in RESET OBJECT Used to have "read-only random" Changes: 1. Dropped "random" in ENCODING and REFCOUNT SCRIPT Used to have "may-replicate no-script" Changes: 1. Dropped "may-replicate" in all except FLUSH and LOAD CLIENT Used to have "admin no-script random ok-loading ok-stale" Changes: 1. Dropped "random" in all except INFO and LIST 2. Dropped "admin" in ID, TRACKING, CACHING, GETREDIR, INFO, SETNAME, GETNAME, and REPLY STRALGO No changes. PUBSUB No changes. CLUSTER Changes: 1. Dropped "admin" in countkeysinslots, getkeysinslot, info, nodes, keyslot, myid, and slots SENTINEL No changes. 
(note that DEBUG also fits, but we decided not to convert it since it's for debugging and anyway undocumented) ## New sub-command This commit adds another element to the per-command output of COMMAND, describing the list of subcommands, if any (in the same structure as "regular" commands) Also, it adds a new subcommand: ``` COMMAND LIST [FILTERBY (MODULE <module-name>\|ACLCAT <cat>\|PATTERN <pattern>)] ``` which returns a set of all commands (unless filters), but excluding subcommands. ## Module API A new module API, RM_CreateSubcommand, was added, in order to allow module writer to define subcommands ## ACL changes: 1. Now, that each subcommand is actually a command, each has its own ACL id. 2. The old mechanism of allowed_subcommands is redundant (blocking/allowing a subcommand is the same as blocking/allowing a regular command), but we had to keep it, to support the widespread usage of allowed_subcommands to block commands with certain args, that aren't subcommands (e.g. "-select +select\|0"). 3. I have renamed allowed_subcommands to allowed_firstargs to emphasize the difference. 4. Because subcommands are commands in ACL too, you can now use "-" to block subcommands (e.g. "+client -client\|kill"), which wasn't possible in the past. 5. It is also possible to use the allowed_firstargs mechanism with subcommand. For example: `+config -config\|set +config\|set\|loglevel` will block all CONFIG SET except for setting the log level. 6. All of the ACL changes above required some amount of refactoring. ## Misc 1. There are two approaches: Either each subcommand has its own function or all subcommands use the same function, determining what to do according to argv[0]. For now, I took the former approaches only with CONFIG and COMMAND, while other commands use the latter approach (for smaller blamelog diff). 2. Deleted memoryGetKeys: It is no longer needed because MEMORY USAGE now uses the "range" key spec. 4. Bugfix: GETNAME was missing from CLIENT's help message. 5. 
Sentinel and Redis now use the same table, with the same function pointer. Some commands have a different implementation in Sentinel, so we redirect them (these are ROLE, PUBLISH, and INFO). 6. Command stats now show the stats per subcommand (e.g. instead of stats just for "config" you will have stats for "config|set", "config|get", etc.) 7. It is now possible to use COMMAND directly on subcommands: COMMAND INFO CONFIG|GET (The pipe syntax was inspired by ACL, and can be used in functions lookupCommandBySds and lookupCommandByCString) 8. STRALGO is now a container command (has "help") ## Breaking changes: 1. Command stats now show the stats per subcommand (see (6) above)	2021-10-20 11:52:57 +03:00
qetu3790	4962c5526d	Release clients blocked on module commands in cluster resharding and down state (#9483 ) Prevent clients from being blocked forever in cluster when they block with their own module command and the hash slot is migrated to another master at the same time. These will get a redirection message when unblocked. Also, release clients blocked on module commands when cluster is down (same as other blocked clients) This commit adds basic tests for the main (non-cluster) redis test infra that test the cluster. This was done because the cluster test infra can't handle some common test features, but most importantly we only build the test modules with the non-cluster test suite. note that rather than really supporting cluster operations by the test infra, it was added (as dup code) in two files, one for module tests and one for non-modules tests, maybe in the future we'll refactor that. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-19 11:50:37 +03:00
Bjorn Svensson	c9fabc2ef0	Move config `unixsocketperm` to generic configs (#9607 ) Since the size of mode_t is platform-dependent, we handle the `unixsocketperm` configuration as a generic int type. mode_t is either an unsigned int or unsigned short (macOS), and the range limits allow for a simple cast to a mode_t.	2021-10-18 23:58:52 -07:00
Wen Hui	1c2b5f5318	Make Cluster-bus port configurable with new cluster-port config (#9389 ) Make Cluster-bus port configurable with new cluster-port config	2021-10-18 22:28:27 -07:00
Viktor Söderqvist	b7f2a1a217	Add RedisModule_KeyExists (#9600 ) The LRU of the key is not touched. Logically expired keys are logically not existing, so they're treated as such.	2021-10-18 22:21:19 +03:00
DarrenJiang13	aa6deff01e	add missed error counting (#9646 ) * add: add missed error counting in sentinel.c and cluster.c	2021-10-18 15:53:10 +03:00
七飒	afd8c4e007	sdstrim remove excessive check (#4045 ) there is no need to compare the value of ep and sp ``` sp = start = s; // the only way to get sp > ep is sdslen(s) == 0 // so when sp > ep, ep-sp must be -1 ep = end = s+sdslen(s)-1; while(sp <= end && strchr(cset, *sp)) sp++; while(ep > sp && strchr(cset, *ep)) ep--; // -1 + 1 already equals 0 len = (sp > ep) ? 0 : ((ep-sp)+1); ``` Signed-off-by: Bo Cai <charpty@gmail.com>	2021-10-17 20:37:52 +03:00
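The argument in the sdstrim commit above can be checked with a small standalone sketch. It works on a plain NUL-terminated buffer rather than a real `sds` string, and `trim_in_place` is a hypothetical name, not a Redis function. The point it illustrates: when every character is trimmed, `ep - sp` is exactly -1, so `(ep - sp) + 1` is already 0 without a separate `sp > ep` branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sdstrim() on a plain C string (not the sds API).
 * Removes every leading and trailing character found in cset, in place,
 * and returns the new length. */
size_t trim_in_place(char *s, const char *cset) {
    assert(s != NULL && cset != NULL);
    size_t n = strlen(s);
    if (n == 0) return 0;          /* avoid forming the pointer s - 1 */

    char *sp = s;
    char *end = s + n - 1;
    char *ep = end;
    while (sp <= end && strchr(cset, *sp)) sp++;
    while (ep > sp && strchr(cset, *ep)) ep--;

    /* If everything was trimmed, sp == end + 1 and ep == end,
     * so ep - sp == -1 and len becomes 0 with no extra check. */
    ptrdiff_t len = (ep - sp) + 1;
    memmove(s, sp, (size_t)len);
    s[len] = '\0';
    return (size_t)len;
}
```

Calling `trim_in_place` on a buffer holding `"xxhelloxx"` with `cset` set to `"x"` leaves `"hello"` in the buffer.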
Ilya Shipitsin	94fded4f4f	Code cleanup, resolve an issue identified by cppcheck (#4373 ) [src/bitops.c:512] -> [src/bitops.c:507]: (warning) Either the condition 'if(o&&o->encoding==1)' is redundant or there is possible null pointer dereference: o. This function has checks for `o` to be null or non-null, so it is odd that it accesses it first..	2021-10-17 18:48:15 +03:00
Hanna Fadida	61bb044156	Modify mem_usage2 module callback to enable to take sample_size argument (#9612 ) This is useful for approximating size computation of complex module types. Note that the mem_usage2 callback is new and has not been released yet, which is why we can modify it.	2021-10-17 17:31:06 +03:00
Oran Agra	e7864a2b70	fix typos in module doc / header (#9641 ) the RedisModule_ReplyWithPush prototype was merged by mistake (no such API yet)	2021-10-17 17:15:27 +03:00
guoxiang1996	3c9e5271c6	Use fcntl(fd,F_FULLFSYNC) instead of fsync on OSX, improve power failure safety (#9545 ) On MacOS calling fsync does not guarantee the cache on the disk itself is flushed.	2021-10-15 08:44:25 +03:00
Shaya Potter	24b67d5520	Add RM_ReplyWithVerbatimStringType that takes an ext/type arg (#9632 ) Verbatim Strings in RESP3 have a type/extension. The existing redismodule reply function hard-coded it to "txt".	2021-10-14 09:53:46 +03:00
Ofir Luzon	49d26a9658	Add sleep interval to redis-cli --scan option (#3751 ) Adding -i option (sleep interval) of repeat and bigkeys to redis-cli --scan. When the keyspace contains many already expired keys scanning the dataset with redis-cli --scan can impact the performance Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-13 16:54:35 +03:00
Madelyn Olson	a6b5d518a9	Improved the reliability of cluster replica sync tests (#9628 ) Improved the reliability of cluster replica sync tests	2021-10-13 00:06:53 -07:00
Ning Xie	075ac34545	Fix redis-cli SCAN sleep interval for big/hot keys (could have been skipped) (#9624 ) bigkeys sleep is defined per 100 scanned keys, and it is checked only between scan cycles. In cases where scan does not return exactly 10 keys, it will never sleep. In addition, the comment said sleep every 100 SCANs, but it was every 100 scanned keys.	2021-10-12 23:00:49 +03:00
yoav-steinberg	252981914f	XADD - skip rewrite the id arg if it was given and is valid. (#9599 ) When calling `XADD` with a predefined id (instead of `*`) there's no need to run the code which replaces the supplied id with itself. Only when we pass a wildcard id we need to do this. For apps which always supply their own id this is a slight optimization.	2021-10-11 13:09:18 +03:00
zhaozhao.zz	484a1ad67e	master client should ignore proto_max_bulk_len in bitops (#9626 )	2021-10-11 13:58:42 +08:00
menwen	7ff7536e2c	Delete unused 'time' fields from struct bio_job (#9622 ) looks like this field was never actually used and the call to time() is excessive.	2021-10-10 08:17:54 +03:00
Bjorn Svensson	b874c6f1fc	Move config logfile to generic config (#9592 ) Move config `logfile` to generic configs	2021-10-07 22:33:08 -07:00
Bjorn Svensson	54d01e363a	Move config `cluster-config-file` to generic configs (#9597 )	2021-10-07 22:32:40 -07:00
Huang Zhw	fd135f3e2d	Make tracking invalidation messages always after command's reply (#9422 ) Tracking invalidation messages were sometimes sent in inconsistent order, before the command's reply rather than after. In addition to that, they were sometimes embedded inside other commands responses, like MULTI-EXEC and MGET.	2021-10-07 15:13:42 +03:00
GutovskyMaria	d98d1ad574	Hide empty and loading replicas from CLUSTER SLOTS responses (#9287 ) Hide empty and loading replicas from CLUSTER SLOTS responses	2021-10-06 22:22:27 -07:00
Andy Pan	2391aefd82	Implement anetPipe() to combine creating pipe and setting flags (#9511 ) Implement createPipe() to combine creating pipe and setting flags, also reduce system calls by prioritizing pipe2() over pipe(). Without createPipe(), we have to call pipe() to create a pipe and then call some functions (like anetCloexec() and anetNonBlock()) of anet.c to set flags respectively, which leads to some extra system calls, now we can leverage pipe2() to combine them and make the process of creating pipe more convergent in createPipe(). Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-10-06 16:08:13 +03:00
yoav-steinberg	5725088ff2	Avoid argv memcpy when queuing a multi command. (#9602 ) When queuing a multi command we duplicated the argv (meaning an alloc and a memcpy). This isn't needed since we can use the previously allocated argv and just reset the client object's argv to NULL. This should save some memory and is a minor optimization in heavy MULTI/EXEC traffic, especially if there are lots of arguments.	2021-10-06 11:39:09 +03:00
Meir Shpilraien (Spielrein)	4fb39b6700	Added module-acquire-GIL latency stats (#9608 ) The new value indicates how long Redis waits to acquire the GIL after sleep. This can help identify problems where a module performs some background operation for a long time (with the GIL held) and blocks the Redis main thread.	2021-10-06 11:33:01 +03:00
tzongw	f5160ed0aa	improve latency when a client is unblocked by module timer (#9593 ) Scenario: 1. A client blocks on the command `XREAD BLOCK 0 STREAMS mystream $` 2. A module calls `XADD mystream * field value` via Lua from a timer callback 3. The client receives the response after a latency of up to 100ms Reason: When `XADD` signals the key `mystream` as ready, `beforeSleep` in the next event loop calls `handleClientsBlockedOnKeys` to unblock the client and add pending data to write, but does not actually install a write handler. Redis then blocks in `aeApiPoll` for up to 100ms (given the default `hz` of 10), and the pending data is only sent in the following event loop by `handleClientsWithPendingWritesUsingThreads`. Calling `handleClientsBlockedOnKeys` before `handleClientsWithPendingWritesUsingThreads` in `beforeSleep` solves the problem.	2021-10-06 10:15:03 +03:00
yoav-steinberg	83478e6102	argv mem leak during multi command execution. (#9598 ) Changes in #9528 lead to memory leak if the command implementation used rewriteClientCommandArgument inside MULTI-EXEC. Adding an explicit test for that case since the test that uncovered it didn't specifically target this scenario	2021-10-05 12:17:36 +03:00
Meir Shpilraien (Spielrein)	0f8b634cd5	Fix invalid memory write on lua stack overflow (CVE-2021-32626) (#9591 ) When LUA calls our C code, by default, the LUA stack has room for 10 elements. In most cases, this is more than enough, but sometimes it's not, and the caller must verify the LUA stack size before pushing elements. In 3 places in the code, there was no verification of the LUA stack size. On specific inputs this missing verification could have led to an invalid memory write: 1. On 'luaReplyToRedisReply', one might return a nested reply that will explode the LUA stack. 2. On 'redisProtocolToLuaType', the Redis reply might be deep enough to explode the LUA stack (notice that currently there is no such command in Redis that returns such a nested reply, but modules might do it) 3. On 'ldbRedis', one might give a command with enough arguments to explode the LUA stack (all the arguments will be pushed to the LUA stack) This commit solves all 3 issues by calling 'lua_checkstack' and verifying that there is enough room in the LUA stack to push elements. In case 'lua_checkstack' returns an error (there is not enough room in the LUA stack and it's not possible to increase the stack), we will do the following: 1. On 'luaReplyToRedisReply', we will return an error to the user. 2. On 'redisProtocolToLuaType' we will exit with panic (we assume this scenario is rare because it can only happen with a module). 3. On 'ldbRedis', we return an error.	2021-10-04 15:17:50 +03:00
Oran Agra	9e3dca8bef	Fix mem leak in loading AOF, introduced by #9528 (#9582 ) A recently merged PR introduced a leak when loading AOF files. This was because argv_len wasn't set, so rewriteClientCommandArgument would shrink the argv array and update argc to a small value.	2021-10-04 12:17:22 +03:00
Oran Agra	b0ca3be2bb	Fix protocol parsing on 'ldbReplParseCommand' (CVE-2021-32672) (#9590 ) The protocol parsing on 'ldbReplParseCommand' (LUA debugging) assumed protocol correctness. This means that if the following is given: *1 $100 test The parser will try to read additional 94 unallocated bytes after the client buffer. This commit fixes this issue by validating that there are actually enough bytes to read. It also limits the amount of data that can be sent by the debugger client to 1M so the client will not be able to explode the memory. Co-authored-by: meir@redislabs.com <meir@redislabs.com>	2021-10-04 12:14:12 +03:00
Oran Agra	c5e6a6204c	Fix ziplist and listpack overflows and truncations (CVE-2021-32627, CVE-2021-32628) (#9589 ) - fix possible heap corruption in ziplist and listpack resulting by trying to allocate more than the maximum size of 4GB. - prevent ziplist (hash and zset) from reaching size of above 1GB, will be converted to HT encoding, that's not a useful size. - prevent listpack (stream) from reaching size of above 1GB. - XADD will start a new listpack if the new record may cause the previous listpack to grow over 1GB. - XADD will respond with an error if a single stream record is over 1GB - List type (ziplist in quicklist) was truncating strings that were over 4GB, now it'll respond with an error. Co-authored-by: sundb <sundbcn@gmail.com>	2021-10-04 12:11:02 +03:00
Oran Agra	fba15850e5	Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675) (#9588 ) This change sets a low limit for multibulk and bulk length in the protocol for unauthenticated connections, so that they can't easily cause redis to allocate massive amounts of memory by sending just a few characters on the network. The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)	2021-10-04 12:10:31 +03:00
Oran Agra	7cb89a5a1c	Fix Integer overflow issue with intsets (CVE-2021-32687) (#9586 ) The vulnerability involves changing the default set-max-intset-entries configuration parameter to a very large value and constructing specially crafted commands to manipulate sets	2021-10-04 12:09:25 +03:00
yiyuaner	24cc0b984d	Fix integer overflow in _sdsMakeRoomFor (CVE-2021-41099) (#9558 ) The existing overflow checks handled the greedy growing, but didn't handle a case where the addition of the header size is what causes the overflow.	2021-10-04 11:11:09 +03:00
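The overflow described in the `_sdsMakeRoomFor` commit above (the header size, not the payload growth, pushing the sum past `SIZE_MAX`) can be illustrated with a checked addition. This is a minimal sketch under assumptions: `checked_alloc_size` is a hypothetical helper, and it reports the overflow through a return code, which is not necessarily how the actual patch handles it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of sds.c): computes the allocation size
 * for a string of newlen bytes plus a header of hdrlen bytes plus the
 * NUL terminator. Returns 0 on success, -1 if the sum would wrap size_t. */
int checked_alloc_size(size_t newlen, size_t hdrlen, size_t *total) {
    assert(total != NULL);
    if (hdrlen > SIZE_MAX - 1) return -1;       /* hdrlen + 1 would wrap */
    size_t extra = hdrlen + 1;
    if (newlen > SIZE_MAX - extra) return -1;   /* newlen + extra would wrap */
    *total = newlen + extra;
    return 0;
}
```

Each comparison is done against `SIZE_MAX` minus the other operand, so the check itself never performs the addition that might overflow.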
yoav-steinberg	6f4f31f167	decrby LLONG_MIN caused negation overflow. (#9577 ) Note that this breaks compatibility because in the past doing: DECRBY x -9223372036854775808 would succeed (and create an invalid result) and now this returns an error.	2021-10-03 09:38:05 +03:00
yoav-steinberg	93e8534713	Remove argument count limit, dynamically grow argv. (#9528 ) Remove hard coded multi-bulk limit (was 1,048,576), new limit is INT_MAX. When client sends an m-bulk that's higher than 1024, we initially only allocate the argv array for 1024 arguments, and gradually grow that allocation as arguments are received.	2021-10-03 09:13:09 +03:00
Binbin	dd3ac97ffe	Cleanup typos, incorrect comments, and fixed small memory leak in redis-cli (#9153 ) 1. Remove forward declarations from header files to functions that do not exist: hmsetCommand and rdbSaveTime. 2. Minor phrasing fixes in #9519 3. Add missing sdsfree(title) and fix typo in redis-benchmark. 4. Modify some error comments in some zset commands. 5. Fix copy-paste bug comment in syncWithMaster about `ip-address`.	2021-10-02 22:19:33 -07:00
Viktor Söderqvist	9a3bd07e9f	Unify dbSyncDelete and dbAsyncDelete (#9573 ) Just a cleanup to make the code easier to maintain and reduce the risk of something being overlooked.	2021-10-01 15:49:33 +03:00
Yunier Pérez	12e4f31d94	Allow to override OPENSSL_PREFIX (#9567 ) While the original issue was on Linux, this should work for other platforms as well.	2021-09-30 15:51:19 +03:00
Hanna Fadida	ffafb434fb	Modules: add RM_LoadDataTypeFromStringEncver (#9537 ) adding an advanced api to enable loading data that was serialized with a specific encoding version	2021-09-30 11:21:32 +03:00
Wen Hui	2c38caa176	adding missing error check for fstat (#9532 )	2021-09-28 21:10:33 -07:00
Ozan Tezcan	4be2dd6ab9	Use __common__ attribute in redismodule.h for Clang C builds (#9541 )	2021-09-27 09:51:33 +03:00
Oran Agra	5a4ab7c7d2	Fix stream sanitization for non-int first value (#9553 ) This was recently broken in #9321 when we validated stream IDs to be integers but did that after stepping to the next record instead of before.	2021-09-26 18:46:22 +03:00
yoav-steinberg	6600253046	Client eviction ci issues (#9549 ) Fixing CI test issues introduced in #8687 - valgrind warnings in readQueryFromClient when client was freed by processInputBuffer - adding DEBUG pause-cron for tests not to be time dependent. - skipping a test that depends on socket buffers / events not compatible with TLS - making sure client got subscribed by not using deferring client	2021-09-26 17:45:02 +03:00
chenyang8094	7c1f9ef503	Fix obtain the AOF file length error when load AOF (#9510 ) this was a regression from #9012 (not released yet)	2021-09-26 10:24:52 +03:00
Ozan Tezcan	3ff56a6dde	Fix crash due to free() call for a string literal in redis-benchmark (#9546 )	2021-09-24 22:03:19 +03:00
sundb	9967a53f4c	Use dictGetFairRandomKey() for HRANDFIELD,SRANDMEMBER,ZRANDMEMBER (#9538 ) In the `HRANDFIELD`, `SRANDMEMBER` and `ZRANDMEMBER` commands, there are some strategies that could in some rare cases return an unfair random result. These cases are where a small dict happens to be hashed unevenly. Specifically when `count*ZRANDMEMBER_SUB_STRATEGY_MUL > size`, using `dictGetRandomKey` to randomize from a dict will result in an unfair random result.	2021-09-24 17:36:26 +03:00
Huang Zhw	f0fab99d6f	Minor optimize getMaxmemoryState, when server.maxmemory is not set (#9533 ) Minor optimize getMaxmemoryState, when server.maxmemory is not set, don't count AOF and replicas buffers. Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>	2021-09-23 17:12:11 +03:00
Yossi Gottlieb	bebc7f8470	Add RM_TrimStringAllocation(). (#9540 ) This commit makes it possible to explicitly trim the allocation of a RedisModuleString. Currently, Redis automatically trims strings that have been retained by a module command when it returns. However, this is not thread safe and may result with corruption in threaded modules. Supporting explicit trimming offers a backwards compatible workaround to this problem.	2021-09-23 15:00:37 +03:00
yoav-steinberg	2753429c99	Client eviction (#8687 ) ### Description A mechanism for disconnecting clients when the sum of all connected clients is above a configured limit. This prevents eviction or OOM caused by accumulated used memory between all clients. It's a complementary mechanism to the `client-output-buffer-limit` mechanism which takes into account not only a single client and not only output buffers but rather all memory used by all clients. #### Design The general design is as follows: * We track memory usage of each client, taking into account all memory used by the client (query buffer, output buffer, parsed arguments, etc...). This is kept up to date after reading from the socket, after processing commands and after writing to the socket. * Based on the used memory we sort all clients into buckets. Each bucket contains all clients using up to 2x the memory of the clients in the bucket below it. For example up to 1m clients, up to 2m clients, up to 4m clients, ... * Before processing a command and before sleep we check if we're over the configured limit. If we are we start disconnecting clients from larger buckets downwards until we're under the limit. #### Config `maxmemory-clients` max memory all clients are allowed to consume, above this threshold we disconnect clients. This config can either be set to 0 (meaning no limit), a size in bytes (possibly with MB/GB suffix), or as a percentage of `maxmemory` by using the `%` suffix (e.g. setting it to `10%` would mean 10% of `maxmemory`). #### Important code changes * During the development I encountered yet more situations where our io-threads access global vars, and needed to fix them. I also had to handle keeping the clients sorted into the memory buckets (which are global) while their memory usage changes in the io-thread. To achieve this I decided to simplify how we check if we're in an io-thread and make it much more explicit. 
I removed the `CLIENT_PENDING_READ` flag used for checking if the client is in an io-thread (it wasn't used for anything else) and just used the global `io_threads_op` variable the same way to check during writes. * I optimized the cleanup of the client from the `clients_pending_read` list on client freeing. We now store a pointer in the `client` struct to this list so we don't need to search in it (`pending_read_list_node`). * Added `evicted_clients` stat to `INFO` command. * Added `CLIENT NO-EVICT ON|OFF` subcommand to exclude a specific client from the client eviction mechanism. Added corresponding 'e' flag in the client info string. * Added `multi-mem` field in the client info string to show how much memory is used up by buffered multi commands. * Client `tot-mem` now accounts for buffered multi-commands, pubsub patterns and channels (partially), tracking prefixes (partially). * CLIENT_CLOSE_ASAP flag is now handled in a new `beforeNextClient()` function so clients will be disconnected between processing different clients and not only before sleep. This new function can be used in the future for work we want to do outside the command processing loop but don't want to wait for all clients to be processed before we get to it. Specifically I wanted to handle output-buffer-limit related closing before we process client eviction in case the two race with each other. * Added a `DEBUG CLIENT-EVICTION` command to print out info about the client eviction buckets. * Each client now holds a pointer to the client eviction memory usage bucket it belongs to and listNode to itself in that bucket for quick removal. * Global `io_threads_op` variable now can contain a `IO_THREADS_OP_IDLE` value indicating no io-threading is currently being executed. * In order to track memory used by each client in real-time we can't rely on updating these stats in `clientsCron()` alone anymore. 
So now I call `updateClientMemUsage()` (used to be `clientsCronTrackClientsMemUsage()`) after command processing, after writing data to pubsub clients, after writing the output buffer and after reading from the socket (and maybe other places too). The function is written to be fast. * Clients are evicted if needed (with appropriate log line) in `beforeSleep()` and before processing a command (before performing oom-checks and key-eviction). * All clients memory usage buckets are grouped as follows: * All clients using less than 64k. * 64K..128K * 128K..256K * ... * 2G..4G * All clients using 4g and up. * Added client-eviction.tcl with a bunch of tests for the new mechanism. * Extended maxmemory.tcl to test the interaction between maxmemory and maxmemory-clients settings. * Added an option to flag a numeric configuration variable as a "percent", this means that if we encounter a '%' after the number in the config file (or config set command) we consider it as valid. Such a number is stored internally as a negative value. This way an integer value can be interpreted as either a percent (negative) or absolute value (positive). This is useful for example if some numeric configuration can optionally be set to a percentage of something else. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-09-23 14:02:16 +03:00
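The bucket grouping listed in the client eviction commit above (below 64K, then power-of-two ranges up to 2G..4G, then 4G and up) can be sketched as a mapping from a client's memory usage to a bucket index. The macro and function names here are hypothetical and the boundaries follow the commit description only; the constants in the actual source may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants derived from the description above. */
#define BUCKET_MIN_LOG 16   /* 2^16 = 64K, upper bound of the first bucket */
#define BUCKET_MAX_LOG 32   /* 2^32 = 4G, lower bound of the last bucket */
#define BUCKET_COUNT (BUCKET_MAX_LOG - BUCKET_MIN_LOG + 2)

/* Maps a memory usage in bytes to a bucket index in [0, BUCKET_COUNT). */
int mem_usage_bucket(uint64_t mem) {
    if (mem < (UINT64_C(1) << BUCKET_MIN_LOG)) return 0;
    if (mem >= (UINT64_C(1) << BUCKET_MAX_LOG)) return BUCKET_COUNT - 1;

    /* floor(log2(mem)) selects the power-of-two range mem falls into. */
    int log2 = 0;
    while ((mem >>= 1) != 0) log2++;

    int idx = log2 - BUCKET_MIN_LOG + 1;
    assert(idx > 0 && idx < BUCKET_COUNT - 1);
    return idx;
}
```

Because each bucket covers a doubling of memory, a client only changes buckets when its usage crosses a power of two, which keeps re-sorting cheap when usage is updated frequently.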
YaacovHazan	a56d4533b7	Adding ACL support for modules (#9309 ) This commit introduced a new flag to the RM_Call: 'C' - Check if the command can be executed according to the ACLs associated with it. Also, three new APIs were added to check if a command, key, or channel can be executed or accessed by a user, according to the ACLs associated with it. - RM_ACLCheckCommandPerm - RM_ACLCheckKeyPerm - RM_ACLCheckChannelPerm The user for these APIs is a RedisModuleUser object, which for a module user is returned by the RM_CreateModuleUser API, and for a general ACL user can be retrieved with these two new APIs: - RM_GetCurrentUserName - Retrieve the user name of the client connection behind the current context. - RM_GetModuleUserFromUserName - Get a RedisModuleUser from a user name As a result of getting a RedisModuleUser from name, it can now also access the general ACL users (not just ones created by the module). This means the already existing API RM_SetModuleUserACL() can be used to change the ACL rules for such users.	2021-09-23 08:52:56 +03:00
Binbin	14d6abd8e9	Add ZMPOP/BZMPOP commands. (#9484 ) This is similar to the recent addition of LMPOP/BLMPOP (#9373), but zset. Syntax for the new ZMPOP command: `ZMPOP numkeys [<key> ...] MIN|MAX [COUNT count]` Syntax for the new BZMPOP command: `BZMPOP timeout numkeys [<key> ...] MIN|MAX [COUNT count]` Some background: - ZPOPMIN/ZPOPMAX take only one key, and can return multiple elements. - BZPOPMIN/BZPOPMAX take multiple keys, but return only one element from just one key. - ZMPOP/BZMPOP can take multiple keys, and can return multiple elements from just one key. Note that although ZMPOP/BZMPOP can take multiple keys, it eventually operates on just one key. And it will propagate as ZPOPMIN or ZPOPMAX with the COUNT option. As these are new commands, if we cannot pop any elements, the response is: - ZMPOP: Return a NIL in both RESP2 and RESP3, unlike ZPOPMIN/ZPOPMAX which return emptyarray. - BZMPOP: Return a NIL in both RESP2 and RESP3 when timeout is reached, like BZPOPMIN/BZPOPMAX. The normal response is nested arrays in both RESP2 and RESP3: ``` ZMPOP/BZMPOP 1) keyname 2) 1) 1) member1 2) score1 2) 1) member2 2) score2 In RESP2: 1) "myzset" 2) 1) 1) "three" 2) "3" 2) 1) "two" 2) "2" In RESP3: 1) "myzset" 2) 1) 1) "three" 2) (double) 3 2) 1) "two" 2) (double) 2 ```	2021-09-23 08:34:40 +03:00
chenyang8094	59c9716f96	Fix redis-check-rdb rdb_type_string initialization error (#9534 )	2021-09-22 10:40:31 +03:00
Wen Hui	572abee6f1	Add const for relevant method parameters in redis-benchmark (#9516 )	2021-09-20 09:32:33 -07:00
郭伟光	67fcbdf8a1	replace redis-trib to redis-cli since it is no longer available (#9525 )	2021-09-20 09:30:22 -07:00
Wen Hui	7c376398b1	CLIENT LIST / INFO show resp version (#9508 )	2021-09-19 12:13:46 +03:00
Binbin	f898a9e97d	Adds limit to SINTERCARD/ZINTERCARD. (#9425 ) Implements the [LIMIT limit] variant of SINTERCARD/ZINTERCARD. Now with the LIMIT, we can stop the searching when cardinality reaches the limit, and return the cardinality ASAP. Note that in SINTERCARD, the old syntax was: `SINTERCARD key [key ...]` In order to add an optional parameter, we must break the old syntax. So the new syntax of SINTERCARD will be consistent with ZINTERCARD. New syntax: `SINTERCARD numkeys key [key ...] [LIMIT limit]`. Note that this means that SINTERCARD has a different syntax than SINTER and SINTERSTORE (taking numkeys argument) As for ZINTERCARD, we can easily add an optional parameter to it. New syntax: `ZINTERCARD numkeys key [key ...] [LIMIT limit]`	2021-09-16 14:07:08 +03:00
guybe7	08f4e1335c	createSharedObjects: zpopmin and zpopmax appear twice (#9505 ) Introduced by https://github.com/redis/redis/pull/9502	2021-09-15 15:29:35 +03:00
guybe7	7759ec7c43	Cleanup: propagate and alsoPropagate do not need redisCommand (#9502 ) The `cmd` argument was completely unused, and all the code that bothered to pass it was unnecessary. This is a preparation for a future commit that treats subcommands as commands	2021-09-15 12:53:42 +03:00
guybe7	03fcc211de	A better approach for COMMAND INFO for movablekeys commands (#8324 ) Fix #7297 The problem: Today, there is no way for a client library or app to know the key name indexes for commands such as ZUNIONSTORE/EVAL and others with "numkeys", since COMMAND INFO returns no useful info for them. For cluster-aware redis clients, this requires to 'patch' the client library code specifically for each of these commands or to resolve each execution of these commands with COMMAND GETKEYS. The solution: Introducing key specs other than the legacy "range" (first,last,step) The 8th element of the command info array, if exists, holds an array of key specs. The array may be empty, which indicates the command doesn't take any key arguments or may contain one or more key-specs, each one may leads to the discovery of 0 or more key arguments. A client library that doesn't support this key-spec feature will keep using the first,last,step and movablekeys flag which will obviously remain unchanged. A client that supports this key-specs feature needs only to look at the key-specs array. If it finds an unrecognized spec, it must resort to using COMMAND GETKEYS if it wishes to get all key name arguments, but if all it needs is one key in order to know which cluster node to use, then maybe another spec (if the command has several) can supply that, and there's no need to use GETKEYS. Each spec is an array of arguments, first one is the spec name, the second is an array of flags, and the third is an array containing details about the spec (specific meaning for each spec type) The initial flags we support are "read" and "write" indicating if the keys that this key-spec finds are used for read or for write. clients should ignore any unfamiliar flags. In order to easily find the positions of keys in a given array of args we introduce keys specs. There are two logical steps of key specs: 1. `start_search`: Given an array of args, indicate where we should start searching for keys 2. 
`find_keys`: Given the output of start_search and an array of args, indicate all possible indices of keys. ### start_search step specs - `index`: specify an argument index explicitly - `index`: 0 based index (1 means the first command argument) - `keyword`: specify a string to match in `argv`. We should start searching for keys just after the keyword appears. - `keyword`: the string to search for - `start_search`: an index from which to start the keyword search (can be negative, which means to search from the end) Examples: - `SET` has start_search of type `index` with value `1` - `XREAD` has start_search of type `keyword` with value `[“STREAMS”,1]` - `MIGRATE` has start_search of type `keyword` with value `[“KEYS”,-2]` ### find_keys step specs - `range`: specify `[count, step, limit]`. - `lastkey`: index of the last key. relative to the index returned from begin_search. -1 indicating till the last argument, -2 one before the last - `step`: how many args should we skip after finding a key, in order to find the next one - `limit`: if count is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on. - “keynum”: specify `[keynum_index, first_key_index, step]`. - `keynum_index`: is relative to the return of the `start_search` spec. - `first_key_index`: is relative to `keynum_index`. 
- `step`: how many args should we skip after finding a key, in order to find the next one Examples: - `SET` has `range` of `[0,1,0]` - `MSET` has `range` of `[-1,2,0]` - `XREAD` has `range` of `[-1,1,2]` - `ZUNION` has `start_search` of type `index` with value `1` and `find_keys` of type `keynum` with value `[0,1,1]` - `AI.DAGRUN` has `start_search` of type `keyword` with value `[“LOAD“,1]` and `find_keys` of type `keynum` with value `[0,1,1]` (see https://oss.redislabs.com/redisai/master/commands/#aidagrun) Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key args of the vast majority of commands. If one of the above specs can’t describe the key positions, the module writer can always fall back to the `getkeys-api` option. Some keys cannot be found easily (`KEYS` in `MIGRATE`: Imagine the argument for `AUTH` is the string “KEYS” - we will start searching in the wrong index). The guarantee is that the specs may be incomplete (`incomplete` will be specified in the spec to denote that) but we never report false information (assuming the command syntax is correct). For `MIGRATE` we start searching from the end - `startfrom=-1` - and if one of the keys is actually called "keys" we will report only a subset of all keys - hence the `incomplete` flag. Some `incomplete` specs can be completely empty (i.e. UNKNOWN begin_search) which should tell the client that COMMAND GETKEYS (or any other way to get the keys) must be used (Example: For `SORT` there is no way to describe the STORE keyword spec, as the word "store" can appear anywhere in the command). We will expose these key specs in the `COMMAND` command so that clients can learn, on startup, where the keys are for all commands instead of holding hardcoded tables or use `COMMAND GETKEYS` in runtime. Comments: 1. Redis doesn't internally use the new specs, they are only used for COMMAND output. 2. 
In order to support the current COMMAND INFO format (reply array indices 4, 5, 6) we created a synthetic range, called legacy_range, that, if possible, is built according to the new specs. 3. Redis currently uses only getkeys_proc or the legacy_range to get the keys indices (in COMMAND GETKEYS for example). "incomplete" specs: the commands we have issues with are MIGRATE, STRALGO, and SORT. For MIGRATE, because the token KEYS, if it exists, must be the last token, we can search in reverse. If one of the keys is actually the string "keys", we will return just a subset of the keys (hence, it's "incomplete"). For SORT and STRALGO we can't use this heuristic (the keys can be anywhere in the command) and therefore we added a key spec that is both "incomplete" and of "unknown type". If a client encounters an "incomplete" spec it means that it must find a different way (either COMMAND GETKEYS or have its own parser) to retrieve the keys. Please note that all commands, apart from the three mentioned above, have "complete" key specs	2021-09-15 11:10:29 +03:00
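The key specs described in the commit above can be illustrated with a small sketch. This is not Redis code: the function names `begin_search`, `find_keys` and `key_indices`, and the tuple encoding of a spec, are assumptions made for illustration. The `keynum` branch follows the description above, where the first key index is relative to `keynum_index`.

```python
def begin_search(argv, spec):
    """Return the index where key lookup starts, or None if not found."""
    kind, value = spec
    if kind == "index":
        return value
    keyword, start = value  # kind == "keyword"
    if start >= 0:
        candidates = range(start, len(argv))
    else:
        # Negative start: search backwards from the end of argv.
        candidates = range(len(argv) + start, 0, -1)
    for i in candidates:
        if argv[i].upper() == keyword.upper():
            return i + 1  # keys start just after the keyword
    return None


def find_keys(argv, first, spec):
    """Return the list of key indices, given the start index `first`."""
    kind, value = spec
    if kind == "range":
        lastkey, step, limit = value
        if lastkey >= 0:
            last = first + lastkey
        elif lastkey == -1 and limit > 1:
            # Only 1/limit of the remaining args are keys.
            last = first + (len(argv) - first) // limit - 1
        else:
            last = len(argv) + lastkey
        return list(range(first, last + 1, step))
    keynum_index, first_key_index, step = value  # kind == "keynum"
    keynum_pos = first + keynum_index
    numkeys = int(argv[keynum_pos])
    first_key = keynum_pos + first_key_index
    return [first_key + i * step for i in range(numkeys)]


def key_indices(argv, start_spec, find_spec):
    """Combine both steps; an empty list means no keys were located."""
    first = begin_search(argv, start_spec)
    return [] if first is None else find_keys(argv, first, find_spec)
```

With the `XREAD` example from the text, `key_indices(["XREAD", "COUNT", "2", "STREAMS", "s1", "s2", "0", "0"], ("keyword", ("STREAMS", 1)), ("range", (-1, 1, 2)))` yields the indices of `s1` and `s2`. For `MIGRATE`, a key literally named "keys" makes the backwards keyword search stop early, so only a subset of the keys is reported, which is the situation the `incomplete` flag describes.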
filipe oliveira	b5a879e1c2	Added URI support to redis-benchmark (cli and benchmark share the same uri-parsing methods) (#9314 ) - Add `-u <uri>` command line option to support `redis://` URI scheme. - included server connection information object (`struct cliConnInfo`), used to describe an ip:port pair, db num user input, and user:pass to avoid a large number of function arguments. - Using sds on connection info strings for redis-benchmark/redis-cli Co-authored-by: yoav-steinberg <yoav@monfort.co.il>	2021-09-14 19:45:06 +03:00
Viktor Söderqvist	ea36d4de17	Modules: Add remaining list API functions (#8439 ) List functions operating on elements by index: * RM_ListGet * RM_ListSet * RM_ListInsert * RM_ListDelete Iteration is done using a simple for loop over indices. The index based functions use an internal iterator as an optimization. This is explained in the docs: ``` * Many of the list functions access elements by index. Since a list is in * essence a doubly-linked list, accessing elements by index is generally an * O(N) operation. However, if elements are accessed sequentially or with * indices close together, the functions are optimized to seek the index from * the previous index, rather than seeking from the ends of the list. * * This enables iteration to be done efficiently using a simple for loop: * * long n = RM_ValueLength(key); * for (long i = 0; i < n; i++) { * RedisModuleString *elem = RedisModule_ListGet(key, i); // Do stuff... * } ```	2021-09-14 17:48:06 +03:00
zhaozhao.zz	794442b130	PSYNC2: make partial sync possible after master reboot (#8015 ) The main idea is how to allow a master to load replication info from RDB file when rebooting, if master can load replication info it means that replicas may have the chance to psync with master, it can save much traffic. The key point is we need guarantee safety and consistency, so there are two differences between master and replica: 1. master would load the replication info as secondary ID and offset, in case other masters have the same replid. 2. when master loading RDB, it would propagate expired keys as DEL command to replication backlog, then replica can receive these commands to delete stale keys. p.s. the expired keys when RDB loading is useful for users, so we show it as `rdb_last_load_keys_expired` and `rdb_last_load_keys_loaded` in info persistence. Moreover, after load replication info, master should update `no_replica_time` in case loading RDB cost too long time.	2021-09-13 15:39:11 +08:00
Huang Zhw	75dd230994	bitpos/bitcount add bit index (#9324 ) Make bitpos/bitcount support bit index: ``` BITPOS key bit [start [end [BIT|BYTE]]] BITCOUNT key [start end [BIT|BYTE]] ``` The default behavior is `BYTE`, so these commands are still compatible with old.	2021-09-12 11:31:22 +03:00
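To make the difference between the `BYTE` and `BIT` units concrete, here is a minimal sketch of a bit counter over a byte string. The function name `bitcount` and its signature are hypothetical, and the sketch assumes bit 0 is the most significant bit of the first byte and that out-of-range offsets are clamped; it is not the server implementation.

```python
def bitcount(data, start=0, end=-1, unit="BYTE"):
    """Count set bits in data[start..end], both ends inclusive.

    unit="BYTE" treats start/end as byte offsets, unit="BIT" as bit offsets.
    Negative offsets count from the end, as in the command syntax above.
    """
    if unit not in ("BYTE", "BIT"):
        raise ValueError("unit must be BYTE or BIT")
    total = len(data) * (8 if unit == "BIT" else 1)
    if start < 0:
        start = max(start + total, 0)
    if end < 0:
        end = max(end + total, 0)
    end = min(end, total - 1)
    if start > end:
        return 0
    if unit == "BYTE":
        return sum(bin(b).count("1") for b in data[start:end + 1])
    # Assumption: bit 0 is the most significant bit of byte 0.
    return sum((data[i >> 3] >> (7 - (i & 7))) & 1 for i in range(start, end + 1))
```

Calling `bitcount(b"foobar", 1, 1)` counts the bits of the second byte only, while `bitcount(b"foobar", 5, 30, "BIT")` counts a range that starts and ends in the middle of a byte.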
David CARLIER	418c2e7931	TLS build fix on OpenBSD when built with LibreSSL. (#9486 )	2021-09-11 22:54:09 +03:00
zhaozhao.zz	d7fa44f4da	init client pause value in more appropriate place (#9479 )	2021-09-10 14:02:45 +08:00
sundb	3ca6972ecd	Replace all usage of ziplist with listpack for t_zset (#9366 ) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to improve performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete an element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplistCompare`. 4) Add empty listpack zset corrupt dump test.	2021-09-09 18:18:53 +03:00
Madelyn Olson	86b0de5c41	Remove redundant validation and prevent duplicate users during ACL load (#9330 ) Throw an error when a user is provided multiple times on the command line instead of silently throwing one of them away. Remove unneeded validation for validating users on ACL load.	2021-09-09 07:40:33 -07:00
yancz2000	47c001dde6	Add make test-cluster option (#9478 ) Add make test-cluster option	2021-09-09 06:52:21 -07:00
yvette903	f560531d5b	Fix: client pause uses an old timeout (#9477 ) A write request may be paused unexpectedly because `server.client_pause_end_time` is old. To reproduce: redis-cli -p 6379 127.0.0.1:6379> client pause 500000000 write OK 127.0.0.1:6379> client unpause OK 127.0.0.1:6379> client pause 10000 write OK 127.0.0.1:6379> set key value The write request `set key value` is paused until the timeout of 500000000 milliseconds is reached. Fix: reset `server.client_pause_end_time` to 0 in `unpauseClients`	2021-09-09 13:44:48 +03:00
Binbin	c50af0aeba	Add LMPOP/BLMPOP commands. (#9373 ) We want to add COUNT option for BLPOP. But we can't do it without breaking compatibility due to the command arguments syntax. So this commit introduces two new commands. Syntax for the new LMPOP command: `LMPOP numkeys [<key> ...] LEFT|RIGHT [COUNT count]` Syntax for the new BLMPOP command: `BLMPOP timeout numkeys [<key> ...] LEFT|RIGHT [COUNT count]` Some background: - LPOP takes one key, and can return multiple elements. - BLPOP takes multiple keys, but returns one element from just one key. - LMPOP can take multiple keys and return multiple elements from just one key. Note that although LMPOP/BLMPOP can take multiple keys, it eventually operates on just one key. And it will propagate as LPOP or RPOP with the COUNT option. As a new command, it still returns NIL if we can't pop any elements. The normal response is a nested array in both RESP2 and RESP3, like: ``` LMPOP/BLMPOP 1) keyname 2) 1) element1 2) element2 ``` I.e. unlike BLPOP that returns a key name and one element so it uses a flat array, and LPOP that returns multiple elements with no key name, and again uses a flat array, this one has to return a nested array, and it does so for both RESP2 and RESP3 (like SCAN does). For related discussion, see: #766 #8824	2021-09-09 12:02:33 +03:00
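The reply shape described in the commit above (a key name plus a nested array of elements, or NIL) can be sketched with an in-memory model. The `lmpop` function and the dict-of-lists store below are hypothetical stand-ins for illustration and make no claim about how the server implements the command.

```python
def lmpop(db, keys, direction, count=1):
    """Pop up to `count` elements from the first non-empty list in `keys`.

    Returns [key, [elements...]] (a nested array), or None when every list
    is empty or missing, mirroring the NIL reply.
    """
    if direction not in ("LEFT", "RIGHT"):
        raise ValueError("direction must be LEFT or RIGHT")
    if count < 1:
        raise ValueError("count must be positive")
    for key in keys:
        items = db.get(key)
        if not items:
            continue  # missing or empty list: try the next key
        n = min(count, len(items))
        if direction == "LEFT":
            popped = items[:n]
            del items[:n]
        else:
            popped = items[-n:][::-1]  # tail element comes first
            del items[-n:]
        if not items:
            del db[key]  # assumption: an emptied list key is removed
        return [key, popped]
    return None
```

Even though several keys are passed, only one of them is touched per call, which matches the note above that the command eventually operates on just one key.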
Huang Zhw	216f168b2b	Add INFO total_active_defrag_time and current_active_defrag_time (#9377 ) Add two INFO metrics: ``` total_active_defrag_time:12345 current_active_defrag_time:456 ``` `current_active_defrag_time` if greater than 0, means how much time has passed since active defrag started running. If active defrag stops, this metric is reset to 0. `total_active_defrag_time` means total time the fragmentation was over the defrag threshold since the server started. This is a followup PR for #9031	2021-09-09 11:38:10 +03:00
Wang Yuan	cee3d67f50	Delay to discard cached master when full synchronization (#9398 ) * Delay to discard cache master when full synchronization * Don't disconnect with replicas before loading transferred RDB when full sync Previously, once replica need to start full synchronization with master, it will discard cached master whatever full synchronization is failed or not. Now we discard cached master only when transferring RDB is finished and start to change data space, this make replica could start partial resynchronization with another new master if new master is failed during full synchronization.	2021-09-09 11:32:29 +03:00
chenyang8094	bc0c22fabc	Fix callReplyParseCollection memleak when use AutoMemory (#9446 ) When parsing an array type reply, ctx will be lost when recursively parsing its elements, which will cause a memory leak in automemory mode. This is a result of the changes in #9202 Add test for callReplyParseCollection fix	2021-09-09 11:03:05 +03:00
chenyang8094	7a0e668560	Add stdlib.h for RedisModule_Assert (#9470 )	2021-09-08 23:22:01 +03:00
zhaozhao.zz	1b83353dc3	Fix wrong offset when replica pause (#9448 ) When a replica is paused, it does not apply any commands, even if the command comes from the master. If we feed the non-applied command to the replication stream, the replication offset would be wrong, and data would be lost after failover (since the replica's `master_repl_offset` grows but the command is not applied). To fix it, here are the changes: * Don't update replica's replication offset or propagate commands to sub-replicas when it's paused in `commandProcessed`. * Show `slave_read_repl_offset` in info reply. * Add an assert to make sure master client should never be blocked unless pause or module (some modules may use block way to do background (parallel) processing and forward original block module command to the replica, it's not a good way but it can work, so the assert excludes module now, but someday in future all modules should rewrite block command to propagate like what `BLPOP` does).	2021-09-08 16:07:25 +08:00
Viktor Söderqvist	547c3405d4	Optimize quicklistIndex to seek from the nearest end (#9454 ) Until now, giving a negative index seeks from the end of a list and a positive seeks from the beginning. This change makes it seek from the nearest end, regardless of the sign of the given index. quicklistIndex is used by all list commands which operate by index. LINDEX key 999999 in a list of 1M elements is greatly optimized by this change. Latency is cut by 75%. LINDEX key -1000000 in a list of 1M elements, likewise. LRANGE key -1 -1 is affected by this, since LRANGE converts the indices to positive numbers before seeking. The tests for corrupt dumps are updated to make sure the corrupt data is sought in the same direction as before.	2021-09-06 09:12:38 +03:00
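The idea of seeking from the nearest end can be sketched independently of the quicklist node layout. The `seek_plan` helper below is hypothetical and counts individual elements, whereas the real structure walks nodes that each hold several entries, so treat it only as an illustration of the direction choice.

```python
def seek_plan(index, count):
    """Pick the end to start from and the number of steps to take.

    `index` may be negative (counting from the tail). Returns a tuple
    (end, steps), or None if the index is out of range.
    """
    pos = index if index >= 0 else count + index
    if pos < 0 or pos >= count:
        return None
    from_head = pos
    from_tail = count - 1 - pos
    # The sign of `index` no longer matters, only the distance does.
    if from_head <= from_tail:
        return ("head", from_head)
    return ("tail", from_tail)
```

Under this model, index 999999 in a list of 1M elements starts at the tail with zero steps, and index -1000000 starts at the head with zero steps, which are the two cases the commit message singles out.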
guybe7	6aa2285e32	Fix two minor bugs (MIGRATE key args and getKeysUsingCommandTable) (#9455 ) 1. MIGRATE has a potential key arg in argv[3]. It should be reflected in the command table. 2. getKeysUsingCommandTable should never free getKeysResult (it is always freed by the caller). The reason we never encountered this double-free bug is that getKeysResult almost always uses the static buffer and doesn't allocate a new one.	2021-09-02 17:19:27 +03:00
sundb	306a5ccd2d	Fix the timing of read and write events under kqueue (#9416 ) Normally we execute the read event first and then the write event. When the barrier is set, we will do it reverse. However, under `kqueue`, if an `fd` has both read and write events, reading the event using `kevent` will generate two events, which will result in uncontrolled read and write timing. This also means that the guarantees of AOF `appendfsync` = `always` are not met on MacOS without this fix. The main change to this pr is to cache the events already obtained when reading them, so that if the same `fd` occurs again, only the mask in the cache is updated, rather than a new event is generated. This was exposed by the following test failure on MacOS: ``` *** [err]: AOF fsync always barrier issue in tests/integration/aof.tcl Expected 544 != 544 (context: type eval line 26 cmd {assert {$size1 != $size2}} proc ::test) ```	2021-09-02 11:07:51 +03:00
Viktor Söderqvist	f24c63a292	Slot-to-keys using dict entry metadata (#9356 ) * Enhance dict to support arbitrary metadata carried in dictEntry Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> * Rewrite slot-to-keys mapping to linked lists using dict entry metadata This is a memory enhancement for Redis Cluster. The radix tree slots_to_keys (which duplicates all key names prefixed with their slot number) is replaced with a linked list for each slot. The dict entries of the same cluster slot form a linked list and the pointers are stored as metadata in each dict entry of the main DB dict. This commit also moves the slot-to-key API from db.c to cluster.c. Co-authored-by: Jim Brunner <brunnerj@amazon.com>	2021-08-30 23:25:36 -07:00
Wang Yuan	9a0c0617f1	Use sync_file_range to optimize fsync if possible (#9409 ) We implement incremental data sync in rio.c by call fsync, on slow disk, that may cost a lot of time, sync_file_range could provide async fsync, so we could serialize key/value and sync file data at the same time. > one tip for sync_file_range usage: http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html Additionally, this change avoids a single large write to be used, which can result in a mass of dirty pages in the kernel (increasing the risk of someone else's write to block). On HDD, current solution could reduce approximate half of dumping RDB time, this PR costs 50s for dump 7.7G rdb but unstable branch costs 93s. On NVME SSD, this PR can't reduce much time, this PR costs 40s, unstable branch costs 48s. Moreover, I find calling data sync every 4MB is better than 32MB.	2021-08-30 10:24:53 +03:00
Binbin	aefbc23451	Better error handling for updateClientOutputBufferLimit. (#9308 ) This one follow #9313 and goes deeper (validation of config file parsing) Move the check/update logic to a new updateClientOutputBufferLimit function. So that it can be used in CONFIG SET and config file parsing.	2021-08-29 15:03:05 +03:00
Viktor Söderqvist	97dcf95cc8	redis-benchmark: improved help and warnings (#9419 ) 1. The output of --help: * On the Usage line, just write [OPTIONS] [COMMAND ARGS...] instead listing only a few arbitrary options and no command. * For --cluster, describe that if the command is supplied on the command line, the key must contain "{tag}". Otherwise, the command will not be sent to the right cluster node. * For -r, add a note that if -r is omitted, all commands in a benchmark will use the same key. Also align the description. * For -t, describe that -t is ignored if a command is supplied on the command line. 2. Print a warning if -t is present when a specific command is supplied. 3. Print all warnings and errors to stderr. 4. Remove -e from calls in redis-benchmark test suite.	2021-08-29 14:31:08 +03:00
Huang Zhw	b375f5919e	redis-benchmark: show throughput in only one thread. (#9146 ) In multi-thread mode, every thread outputs throughput info. This may cause some problems: - Bug in https://github.com/redis/redis/pull/8615; - showThroughput is called too frequently; - showThroughput, which updates a shared variable, lacks a synchronization mechanism. This commit also reverts changes in #8615 and changes time event interval to macro.	2021-08-25 14:58:35 +03:00
Garen Chan	945a83d406	Fix boundary problem of adjusting open files limit. (#5722 ) When `decr_step` is greater than `oldlimit`, the final `bestlimit` may be invalid. For example, oldlimit = 10, decr_step = 16. Current bestlimit = 15 and setrlimit() failed. Since bestlimit is less than decr_step , then exit the loop. The final bestlimit is larger than oldlimit but is invalid. Note that this only matters if the system fd limit is below 16, so unlikely to have any actual effect.	2021-08-24 22:54:21 +03:00
Wen Hui	641780a9c6	config memory limits: handle values larger than (signed) LLONG_MAX (#9313 ) This aims to solve the issue that CONFIG SET maxmemory can only set maxmemory to up to 9223372036854775807 (2^63 - 1) while maxmemory should be ULLONG. Added a memtoull function to convert a string representing an amount of memory into the number of bytes (similar to memtoll but for ull). Also added ull2string to convert a ULLong to string (similar to ll2string).	2021-08-23 21:00:40 +03:00
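A rough sketch of converting a memory string into a byte count with a configurable upper bound shows why the signed limit matters. The name `mem_to_bytes` is hypothetical, and the unit table follows the convention documented in redis.conf (1k = 1000 bytes, 1kb = 1024 bytes); this is an illustration, not the `memtoull` implementation.

```python
ULLONG_MAX = 2**64 - 1
LLONG_MAX = 2**63 - 1

# Assumed unit table: decimal for k/m/g, binary for kb/mb/gb.
UNITS = {
    "": 1, "b": 1,
    "k": 1000, "kb": 1024,
    "m": 1000**2, "mb": 1024**2,
    "g": 1000**3, "gb": 1024**3,
}


def mem_to_bytes(text, max_value=ULLONG_MAX):
    """Parse strings such as '100mb' into bytes, rejecting overflow."""
    text = text.strip().lower()
    digits = text.rstrip("abcdefghijklmnopqrstuvwxyz")
    unit = text[len(digits):]
    if not digits.isdigit() or unit not in UNITS:
        raise ValueError("invalid memory value: %r" % text)
    value = int(digits) * UNITS[unit]
    if value > max_value:
        raise ValueError("memory value out of range: %r" % text)
    return value
```

Passing `max_value=LLONG_MAX` models the old signed behavior, where 9223372036854775808 is rejected, while the default unsigned bound accepts it.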
Viktor Söderqvist	74590f8345	redis-cli: Assert > 0 before dividing, to silence warning by tool (#9396 ) Also make sure function can't return NULL by another assert.	2021-08-22 13:57:18 +03:00
Binbin	0835f596b8	SETBIT and BITFIELD SET only propagate command when the value changed. (#9403 ) Previously, we always increased server.dirty in SETBIT and BITFIELD SET, even when the command didn't really change anything. This commit makes sure SETBIT and BITFIELD SET only increase dirty when the value changed. As a result, if the value is not changed, there are some other implications: - Avoid adding useless AOF - Reduce replication traffic - Will not trigger keyspace notifications (setbit) - Will not invalidate WATCH - Will not send the invalidation message to the tracking client	2021-08-22 10:20:53 +03:00
Viktor Söderqvist	8f59c1ecae	Let CONFIG GET * show both replicaof and its alias (#9395 )	2021-08-21 19:43:18 -07:00
sundb	492d8d0961	Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399 )	2021-08-20 10:37:45 +03:00
yoav-steinberg	0e8d469f82	More generic crash report for unsupported archs (#9385 ) Following compilation warnings on s390x.	2021-08-18 15:46:11 +03:00
Yossi Gottlieb	fe359cbfc2	Fix: don't assume char is unsigned. (#9375 ) On systems that have unsigned char by default (s390x, arm), redis-server could crash as soon as it populates the command table.	2021-08-15 21:37:44 +03:00
Wang Yuan	8edc3cd62c	Fix the wrong detection of sync_file_range system call (#9371 ) If we want to check `defined(SYNC_FILE_RANGE_WAIT_BEFORE)`, we should include fcntl.h. Otherwise, SYNC_FILE_RANGE_WAIT_BEFORE is not defined, and the `sync_file_range` system call is never detected. Introduced by #8532	2021-08-14 23:52:44 +03:00
Madelyn Olson	0cf2df84d4	Added additional validation for cluster SETSLOT (#9360 )	2021-08-12 14:59:17 -07:00
Madelyn Olson	2402f5a7a1	Update cluster debug log to include human readable packet type (#9361 )	2021-08-12 14:50:09 -07:00
Yossi Gottlieb	1221f7cd5e	Improve setup operations order after fork. (#9365 ) The order of setting things up follows some reasoning: Setup signal handlers first because a signal could fire at any time. Adjust OOM score before everything else to assist the OOM killer if memory resources are low. The trigger for this is a valgrind test failure which resulted with the child catching a SIGUSR1 before initializing the handler.	2021-08-12 14:31:12 +03:00
sundb	5705cec68e	Fix missing dismiss hash listpack memory due to ziplist->listpack migration (#9353 )	2021-08-10 16:54:19 +03:00
Huang Zhw	82528d2678	Redis-cli monitor and pubsub can be aborted with Ctrl+C, keeping the cli alive (#9347 ) Abort cli blocking modes with SIGINT without exiting the cli. Co-authored-by: charsyam <charsyam@gmail.com>	2021-08-10 15:03:49 +03:00
DarrenJiang13	8ab33c18e4	fix a compilation error around madvise when make with jemalloc on MacOS (#9350 ) We only use MADV_DONTNEED on Linux, that's where it was tested.	2021-08-10 11:32:27 +03:00
Meir Shpilraien (Spielrein)	8f8117f78e	Format fixes and naming. SentReplyOnKeyMiss -> addReplyOrErrorObject (#9346 ) Following the comments on #8659, this PR fix some formatting and naming issues.	2021-08-10 10:19:21 +03:00
sundb	02fd76b97c	Replace all usage of ziplist with listpack for t_hash (#8887 ) Part one of implementing #8702 (taking hashes first before other types) ## Description of the feature 1. Change ziplist encoded hash objects to listpack encoding. 2. Convert existing ziplists on RDB loading time. an O(n) operation. ## Rdb format changes 1. Add RDB_TYPE_HASH_LISTPACK rdb type. 2. Bump RDB_VERSION to 10 ## Interface changes 1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`) 2. OBJECT ENCODING will return `listpack` instead of `ziplist` ## Listpack improvements: 1. Support direct insert, replace integer element (rather than convert back and forth from string) 3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such) 4. Optimize element length fetching, avoid multiple calculations 5. Use inline to avoid function call overhead. ## Tests 1. Add a new test to the RDB load time conversion 2. Adding the listpack unit tests. (based on the one in ziplist.c) 3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-10 09:18:49 +03:00
sundb	cbda492909	Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349 ) This commit mainly fixes empty keys due to RDB loading and restore command, which was omitted in #9297. 1) When loading quicklist, if all the ziplists in the quicklist are empty, NULL will be returned. If only some of the ziplists are empty, then we will skip the empty ziplists silently. 2) When loading hash zipmap, if zipmap is empty, sanitization check will fail. 3) When loading hash ziplist, if ziplist is empty, NULL will be returned. 4) Add RDB loading test with sanitize.	2021-08-09 17:13:46 +03:00
Qu Chen	0b643e930d	Cleanup: createAOFClient uses createClient to avoid overlooked mismatches (#9338 ) AOF fake client creation (createAOFClient) was doing similar work as createClient, with some minor differences, most of which unintended, this was dangerous and meant that many changes to createClient should have always been reflected to aof.c This cleanup changes createAOFClient to call createClient with NULL, like we do in module.c and elsewhere.	2021-08-09 11:03:59 +03:00
Eduardo Semprebon	d3356bf614	Add SORT_RO command (#9299 ) Add a readonly variant of the STORE command, so it can be used on read-only workloads (replica, ACL, etc)	2021-08-09 09:40:29 +03:00
Qu Chen	e8eeba7bee	Allow master to replicate command longer than replica's query buffer limit (#9340 ) Replication client no longer checks incoming command length against the client-query-buffer-limit. This makes the master able to replicate commands longer than replica's configured client-query-buffer-limit	2021-08-08 17:34:11 -07:00
Yossi Gottlieb	3307958bd0	Propagate OPENSSL_PREFIX to hiredis. (#9345 )	2021-08-08 18:30:17 +03:00
Binbin	e1dc979054	sds.c: Fix potential overflow in sdsll2str. (#8910 ) Fixes an undefined behavior, same way as our `ll2string` does.	2021-08-08 14:30:47 +03:00
Binbin	563ba7a3f0	Fix the wrong method used in quicklistTest. (#8951 ) The test tries to test `insert before 1 element`, but it uses quicklistInsertAfter, a copy-paste typo. The commit also adds an assert to verify results in some tests to make sure they are as expected.	2021-08-08 09:03:52 +03:00
DarrenJiang13	43eb0ce3bf	[BUGFIX] Add some missed error statistics (#9328 ) add error counting for some missed behaviors.	2021-08-06 19:27:24 -07:00
yoav-steinberg	0a9377535b	Ignore resize threshold on idle qbuf resizing (#9322 ) Also update qbuf tests to verify both idle and peak based resizing logic. And delete unused function: getClientsMaxBuffers	2021-08-06 20:50:34 +03:00
Oran Agra	0c90370e6d	Improvements to corrupt payload sanitization (#9321 ) Recently we found two issues in the fuzzer tester: #9302 #9285 After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them. Here's a list of the fixes - Prevent an overflow when allocating a dict hashtable - Prevent OOM when attempting to allocate a huge string - Prevent a few invalid accesses in listpack - Improve sanitization of listpack first entry - Validate integrity of stream consumer groups PEL - Validate integrity of stream listpack entry IDs - Validate ziplist tail followed by extra data which start with 0xff Co-authored-by: sundb <sundbcn@gmail.com>	2021-08-05 22:56:14 +03:00
sundb	8ea777a6a0	Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297 ) When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key. This could either be a corrupt payload, or a result of a bug (see #8453 ) This PR mainly fixes the following: 1) When restore command will return `Bad data format` error. 2) When loading RDB, we will silently discard the key. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-05 22:42:20 +03:00
Madelyn Olson	39a4a44d7d	Add debug config flag to print certain config values on engine crash (#9304 ) Add debug config flag to print certain config values on engine crash	2021-08-05 11:59:12 -07:00
Wen Hui	63e2a6d212	Add sentinel debug option command (#9291 ) This makes it possible to tune many parameters that were previously hard coded. We don't intend these to be user configurable, but only used by tests to accelerate certain conditions which would otherwise take a long time and slow down the test suite. Co-authored-by: Lucas Guang Yang <l84193800@china.huawei.com>	2021-08-05 11:12:55 +03:00
menwen	ca559819f7	Add latency monitor sample when key is deleted via lazy expire (#9317 ) Fix that there is no sample latency after the key expires via expireIfNeeded(). Some refactoring for shared code.	2021-08-05 11:09:24 +03:00
yoav-steinberg	d32f8641ed	fix dict access broken by #9228 (#9319 )	2021-08-05 09:02:30 +03:00
yoav-steinberg	5e908a290c	dict struct memory optimizations (#9228 ) Reduce dict struct memory overhead: on 64bit, dict size goes down from jemalloc's 96 byte bin to its 56 byte bin. Summary of changes: - Remove `privdata` from callbacks and dict creation. (this affects many files, see "Interface change" below). - Meld `dictht` struct into the `dict` struct to eliminate struct padding. (this affects just dict.c and defrag.c) - Eliminate the `sizemask` field, can be calculated from size when needed. - Convert the `size` field into `size_exp` (exponent), utilizes one byte instead of 8. Interface change: pass dict pointer to dict type call back functions. This is instead of passing the removed privdata field. In the future if we'd like to have private data in the callbacks we can extract it from the dict type. We can extend dictType to include a custom dict struct allocator and use it to allocate more data at the end of the dict struct. This data can then be used to store private data later accessed by the callbacks.	2021-08-05 08:25:58 +03:00
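The `size_exp` change in the commit above relies on table sizes being powers of two, so both the size and the mask can be derived from a one-byte exponent. The helpers below are a hypothetical sketch of that arithmetic; the `-1` sentinel for an unallocated table and the minimum exponent of 2 are assumptions for illustration.

```python
def table_size(size_exp):
    """Number of buckets for a given exponent (-1 means no table yet)."""
    return 0 if size_exp == -1 else 1 << size_exp


def size_mask(size_exp):
    """Mask that replaces the stored `sizemask` field."""
    return 0 if size_exp == -1 else table_size(size_exp) - 1


def next_exp(min_size, initial_exp=2):
    """Smallest exponent whose table holds at least `min_size` buckets."""
    exp = initial_exp  # assumption: the smallest table has 4 buckets
    while (1 << exp) < min_size:
        exp += 1
    return exp


def bucket_index(hash_value, size_exp):
    """Map a hash to a bucket by masking its low bits."""
    return hash_value & size_mask(size_exp)
```

Because the mask is recomputed with a shift and a subtraction, storing it separately saves nothing in speed that would justify the extra eight bytes per table, which is the trade-off the commit describes.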

7909 Commits