redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00

Author	SHA1	Message	Date
Oran Agra	ae89958972	Set repl-diskless-sync to yes by default, add repl-diskless-sync-max-replicas (#10092 ) 1. enable diskless replication by default 2. add a new config named repl-diskless-sync-max-replicas that enables replication to start before the full repl-diskless-sync-delay was reached. 3. put replica online sooner on the master (see below) 4. test suite uses repl-diskless-sync-delay of 0 to be faster 5. a few tests that use multiple replicas on a pre-populated master are now using the new repl-diskless-sync-max-replicas 6. fix possible timing issues in a few cluster tests (see below) put replica online sooner on the master ---------------------------------------------------- there were two tests that failed because they needed the master to realize that the replica is online, but the test code was actually only waiting for the replica to realize it's online, and in diskless it could have been before the master realized it. changes include two things: 1. the tests wait on the right thing 2. issues in the master, putting the replica online in two steps. the master used to put the replica online in 2 steps. the first step was to mark it as online, and the second step was to enable the write event (only after getting ACK), but in fact the first step didn't contain some of the tasks to put it online (like updating good slave count, and sending the module event). this meant that if a test was waiting to see that the replica is online from the point of view of the master, and then confirm that the module got an event, or that the master has enough good replicas, it could fail due to timing issues. so now the full effect of putting the replica online happens at once, and only the part about enabling the writes is delayed till the ACK. fix cluster tests -------------------- I added some code to wait for the replica to sync and avoid race conditions. later realized the sentinel and cluster tests were using the original 5 seconds delay, so changed it to 0. 
this means the other changes are probably not needed, but i suppose they're still better (avoid race conditions)	2022-01-17 14:11:11 +02:00
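The interaction between `repl-diskless-sync-delay` and `repl-diskless-sync-max-replicas` described above can be sketched roughly as follows. This is an illustration only, not the server's code: the function name and parameters are hypothetical, and treating a `max_replicas` value of 0 as "no early start" is an assumption.

```python
def should_start_diskless_sync(waiting_replicas: int,
                               seconds_waited: float,
                               sync_delay: float,
                               max_replicas: int) -> bool:
    """Decide whether the master should start streaming the RDB now.

    Hypothetical helper: names and the meaning of max_replicas == 0
    (assumed to disable the early start) are not taken from the source.
    """
    if waiting_replicas == 0:
        # Nobody is waiting for a full sync, nothing to start.
        return False
    if seconds_waited >= sync_delay:
        # The full delay elapsed, start with whoever is attached.
        return True
    # Early start: enough replicas are already attached.
    return max_replicas > 0 and waiting_replicas >= max_replicas
```

With a delay of 5 seconds and `max_replicas` of 2, a second replica attaching would trigger the sync immediately instead of waiting out the delay.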
Meir Shpilraien (Spielrein)	4db4b43417	Function Flags support (no-writes, no-cluster, allow-stale, allow-oom) (#10066 ) # Redis Functions Flags Following the discussion on #10025, added Functions Flags support. The PR is divided into 2 sections: * Add named argument support to `redis.register_function` API. * Add support for function flags ## `redis.register_function` named argument support The first part of the PR adds support for named arguments on `redis.register_function`, example: ``` redis.register_function{ function_name='f1', callback=function() return 'hello' end, description='some desc' } ``` The positional arguments are also kept, which means that it is still possible to write: ``` redis.register_function('f1', function() return 'hello' end) ``` But notice that it is no longer possible to pass the optional description argument on the positional argument version. The positional argument version was changed to allow passing only the mandatory arguments (function name and callback). To pass more arguments the user must use the named argument version. As with positional arguments, `function_name` and `callback` are mandatory and an error will be raised if those are missing. Also, an error will be raised if an unknown argument name is given or an argument type is wrong. Tests were added to verify the new syntax. ## Functions Flags The second part of the PR is adding functions flags support. 
Flags are given to Redis when the engine calls `functionLibCreateFunction`, supported flags are: * `no-writes` - indicates that the function performs no writes, which means that it is OK to run it on: * read-only replica * Using FCALL_RO * If disk error detected It will not be possible to run a function in those situations unless the function turns on the `no-writes` flag * `allow-oom` - indicates that it's OK to run the function even if Redis is in OOM state. If the function does not turn on this flag it will not be possible to run it if OOM is reached (even if the function declares `no-writes` and even if `fcall_ro` is used). If this flag is set, any command will be allowed on OOM (even those that are marked with CMD_DENYOOM). The assumption is that this flag is for advanced users that know its meaning and understand what they are doing, and Redis trusts them to not increase the memory usage. (e.g. it could be an INCR or a modification on an existing key, or a DEL command) * `allow-stale` - indicates that it's OK to run the function on a stale replica, in this case we will also make sure the function only performs `stale` commands and raise an error if not. * `no-cluster` - indicates that running the function is disallowed if cluster is enabled. Default behaviour of functions (if no flags are given): 1. Allow functions to read and write 2. Do not run functions on OOM 3. Do not run functions on stale replica 4. Allow functions on cluster ### Lua API for functions flags On the Lua engine, it is possible to give functions flags as the `flags` named argument: ``` redis.register_function{function_name='f1', callback=function() return 1 end, flags={'no-writes', 'allow-oom'}, description='description'} ``` The function flags argument must be a Lua table that contains all the requested flags. The following will result in an error: * Unknown flag * Wrong flag type Default behaviour is the same as if no flags are used. 
Tests were added to verify all flags functionality ## Additional changes * mark FCALL and FCALL_RO with CMD_STALE flag (unlike EVAL), so that they can run if the function was registered with the `allow-stale` flag. * Verify `CMD_STALE` on `scriptCall` (`redis.call`), so it will not be possible to call commands from a script while stale unless the command is marked with the `CMD_STALE` flag, so that even if the function is allowed while stale we do not allow it to bypass the `CMD_STALE` flag of commands. * Flags section was added to `FUNCTION LIST` command to provide the set of flags for each function: ``` > FUNCTION list withcode 1) 1) "library_name" 2) "test" 3) "engine" 4) "LUA" 5) "description" 6) (nil) 7) "functions" 8) 1) 1) "name" 2) "f1" 3) "description" 4) (nil) 5) "flags" 6) (empty array) 9) "library_code" 10) "redis.register_function{function_name='f1', callback=function() return 1 end}" ``` * Added API to get the Redis version from within a script. The redis version can be provided using: 1. `redis.REDIS_VERSION` - string representation of the redis version in the format of MAJOR.MINOR.PATCH 2. `redis.REDIS_VERSION_NUM` - number representation of the redis version in the format of `0x00MMmmpp` (`MM` - major, `mm` - minor, `pp` - patch). The number version can be used to check if a version is greater or less than another version. The string version can be used to return to the user or print as logs. This new API is provided to eval scripts and functions, it is also possible to use this API during the functions loading phase.	2022-01-14 14:02:02 +02:00
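The `0x00MMmmpp` layout of `redis.REDIS_VERSION_NUM` described above packs one byte each for major, minor and patch, which is why plain integer comparison orders versions correctly. A minimal sketch of that encoding (the helper names are hypothetical and not part of any Redis API):

```python
def version_num(major: int, minor: int, patch: int) -> int:
    """Pack a version into the 0x00MMmmpp layout (one byte per field)."""
    for part in (major, minor, patch):
        if not 0 <= part <= 0xFF:
            raise ValueError("each version component must fit in one byte")
    return (major << 16) | (minor << 8) | patch


def parse_version_num(num: int) -> tuple:
    """Unpack a 0x00MMmmpp number back into (major, minor, patch)."""
    return ((num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF)
```

For example, `version_num(7, 0, 0)` is `0x00070000`, which compares greater than `version_num(6, 2, 14)`.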
chenyang8094	e9bff7978a	Always create base AOF file when redis start from empty. (#10102 ) Force create a BASE file (use a foreground `rewriteAppendOnlyFile`) when redis starts from an empty data set and `appendonly` is yes. The reasoning is that normally, after redis is running for some time, and the AOF has gone though a few rewrites, there's always a base rdb file. and the scenario where the base file is missing, is kinda rare (happens only at empty startup), so this change normalizes it. But more importantly, there are or could be some complex modules that are started with some configuration, when they create persistence they write that configuration to RDB AUX fields, so that can can always know with which configuration the persistence file they're loading was created (could be critical). there is (was) one scenario in which they could load their persisted data, and that configuration was missing, and this change fixes it. Add a new module event: REDISMODULE_SUBEVENT_PERSISTENCE_SYNC_AOF_START, similar to REDISMODULE_SUBEVENT_PERSISTENCE_AOF_START which is async. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-01-13 08:49:26 +02:00
chenyang8094	bd46a2abf4	Support whitespace characters in appendfilename, and ban them in appenddirname (#10049 ) 1. Ban whitespace characters in `appenddirname` 2. Handle the case where `appendfilename` contains spaces (for backwards compatibility)	2022-01-10 09:09:39 +02:00
Meir Shpilraien (Spielrein)	885f6b5ceb	Redis Function Libraries (#10004 ) # Redis Function Libraries This PR implements Redis Functions Libraries as described on: https://github.com/redis/redis/issues/9906. The purpose of libraries is to provide better code sharing between functions by allowing multiple functions to be created in a single command. Functions that were created together can safely share code between each other without worrying about compatibility issues and versioning. Creating a new library is done using the 'FUNCTION LOAD' command (full API is described below) This PR introduces a new struct called libraryInfo, libraryInfo holds information about a library: * name - name of the library * engine - engine used to create the library * code - library code * description - library description * functions - the functions exposed by the library When Redis gets the `FUNCTION LOAD` command it creates a new empty libraryInfo. Redis passes the `CODE` to the relevant engine alongside the empty libraryInfo. As a result, the engine will create one or more functions by calling 'libraryCreateFunction'. The new function will be added to the newly created libraryInfo. So far everything is happening locally on the libraryInfo so it is easy to abort the operation (in case of an error) by simply freeing the libraryInfo. After the library info is fully constructed we start the joining phase, in which we join the new library to the other libraries that currently exist on Redis. The joining phase makes sure there is no function collision and adds the library to the librariesCtx (renamed from functionCtx). LibrariesCtx is used all around the code in the exact same way as functionCtx was used (with respect to RDB loading, replication, ...). The only difference is that apart from the function dictionary (maps function name to functionInfo object), the librariesCtx also contains a libraries dictionary that maps library name to libraryInfo object. 
## New API ### FUNCTION LOAD `FUNCTION LOAD <ENGINE> <LIBRARY NAME> [REPLACE] [DESCRIPTION <DESCRIPTION>] <CODE>` Create a new library with the given parameters: * ENGINE - Engine name to use to create the library. * LIBRARY NAME - The new library name. * REPLACE - If the library already exists, replace it. * DESCRIPTION - Library description. * CODE - Library code. Return "OK" on success, or error on the following cases: * Library name already taken and REPLACE was not used * Name collision with another existing library (even if REPLACE was used) * Library registration failed by the engine (usually compilation error) ## Changed API ### FUNCTION LIST `FUNCTION LIST [LIBRARYNAME <LIBRARY NAME PATTERN>] [WITHCODE]` Command was modified to also allow getting libraries code (so the `FUNCTION INFO` command is no longer needed and was removed). In addition the command gets an optional argument, `LIBRARYNAME`, which allows you to only get libraries that match the given `LIBRARYNAME` pattern. By default, it returns all libraries. ### INFO MEMORY Added number of libraries to `INFO MEMORY` ### Commands flags `DENYOOM` flag was set on `FUNCTION LOAD` and `FUNCTION RESTORE`. We consider those commands as commands that add new data to the dataset (functions are data) and so we want to disallow running those commands on OOM. ## Removed API * FUNCTION CREATE - Decided on https://github.com/redis/redis/issues/9906 * FUNCTION INFO - Decided on https://github.com/redis/redis/issues/9899 ## Lua engine changes When the Lua engine gets the code given on the `FUNCTION LOAD` command, it immediately runs it, we call this run the loading run. The loading run is not a usual script run, it is not possible to invoke any Redis command from within the load run. Instead there is a new API provided by the `library` object. 
The new APIs: * `redis.log` - behaves the same as `redis.log` * `redis.register_function` - register a new function to the library The purpose of the loading run is to register functions using the new `redis.register_function` API. Any attempt to use any other API will result in an error. In addition, the load run has a time limit of 500ms, an error is raised on timeout and the entire operation is aborted. ### `redis.register_function` `redis.register_function(<function_name>, <callback>, [<description>])` This new API allows users to register a new function that will be linked to the newly created library. This API can only be called during the load run (see definition above). Any attempt to use it outside of the load run will result in an error. The parameters passed to the API are: * function_name - Function name (must be a Lua string) * callback - Lua function object that will be called when the function is invoked using fcall/fcall_ro * description - Function description, optional (must be a Lua string). ### Example The following example creates a library called `lib` with 2 functions, `f1` and `f2`, which return 1 and 2 respectively: ``` local function f1(keys, args) return 1 end local function f2(keys, args) return 2 end redis.register_function('f1', f1) redis.register_function('f2', f2) ``` Notice: Unlike `eval`, functions inside a library get the KEYS and ARGV as arguments to the functions and not as globals. ### Technical Details On the load run we only want the user to be able to call a white list of APIs. This way, in the future, if new APIs are added, they will not be available to the load run unless specifically added to this white list. We put the white list on the `library` object and make sure the `library` object is only available to the load run by using the [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv) API. This API allows us to set the `globals` of a function (and all the functions it creates). 
Before starting the load run we create a new fresh Lua table (call it `g`) that only contains the `library` API (we make sure to set global protection on this table just like the general global protection already exists today), then we use [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv) to set `g` as the global table of the load run. After the load run finished we update `g` metatable and set `__index` and `__newindex` functions to be `_G` (Lua default globals), we also pop out the `library` object as we do not need it anymore. This way, any function that was created on the load run (and will be invoke using `fcall`) will see the default globals as it expected to see them and will not have the `library` API anymore. An important outcome of this new approach is that now we can achieve a distinct global table for each library (it is not yet like that but it is very easy to achieve it now). In the future we can decide to remove global protection because global on different libraries will not collide or we can chose to give different API to different libraries base on some configuration or input. Notice that this technique was meant to prevent errors and was not meant to prevent malicious user from exploit it. For example, the load run can still save the `library` object on some local variable and then using in `fcall` context. To prevent such a malicious use, the C code also make sure it is running in the right context and if not raise an error.	2022-01-06 13:39:38 +02:00
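The two-phase load described in this commit (build the library locally, then join it to the existing libraries only if no function name collides) can be sketched as below. This is a simplified model in Python, not the C implementation: the class and method names are hypothetical, and functions are represented as plain dictionary entries.

```python
class LibrariesCtx:
    """Toy model of the libraries context: two dictionaries kept in sync."""

    def __init__(self):
        self.libraries = {}  # library name -> {function name: callback}
        self.functions = {}  # function name -> owning library name

    def load(self, lib_name, new_functions, replace=False):
        """Join a locally built library; raise without side effects on error."""
        if lib_name in self.libraries and not replace:
            raise ValueError("library already exists")
        # Joining phase, step 1: check collisions against OTHER libraries.
        # Nothing has been modified yet, so aborting here is trivial.
        for fname in new_functions:
            owner = self.functions.get(fname)
            if owner is not None and owner != lib_name:
                raise ValueError("function already exists: " + fname)
        # Joining phase, step 2: drop the replaced library, then commit.
        for fname in self.libraries.get(lib_name, {}):
            del self.functions[fname]
        self.libraries[lib_name] = dict(new_functions)
        for fname in new_functions:
            self.functions[fname] = lib_name
```

Because all checks run before any dictionary is touched, a failed load leaves the context exactly as it was, mirroring the "easy to abort" property the commit message describes.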
sundb	4d3c4cfac7	Show the elapsed time of single test and speed up some tests (#10058 ) Following #10038. This PR introduces two changes. 1. Show the elapsed time of a single test in the test output, in order to have a more detailed understanding of the changes in test run time. 2. Speed up two tests related to the `key-load-delay` configuration. Other tests do not seem to be affected by #10003.	2022-01-05 13:49:01 +02:00
chenyang8094	87789fae0b	Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788 ) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 
9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-01-03 19:14:13 +02:00
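The AOFRW limiting measure in item 8 is an exponential backoff: no delay below the failure threshold, then 1, 2, 4, 8, 16 minutes and so on, capped at 60. A small sketch of that schedule follows; the function name and parameters are hypothetical, and the values between 16 and the cap are an assumption that the doubling simply continues.

```python
def aofrw_delay_minutes(consecutive_failures: int,
                        threshold: int = 3,
                        max_delay: int = 60) -> int:
    """Minutes to wait before the next automatic AOF rewrite.

    Assumes the delay keeps doubling past 16 minutes until it hits the cap.
    """
    if consecutive_failures < threshold:
        # Below the threshold the rewrite is retried without delay.
        return 0
    # 1 minute at the threshold, doubling with every further failure.
    return min(2 ** (consecutive_failures - threshold), max_delay)
```

Under these assumptions, failures 3 through 9 give delays of 1, 2, 4, 8, 16, 32 and 60 minutes.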
Binbin	b8ba942ac2	Add DUMP RESTORE tests for redis-cli -x and -X options (#10041 ) This commit adds DUMP RESTORE tests for the -x and -X options. I wanted to add them in #9980, which introduced the -X option, but back then I failed due to some errors (related to the redis-cli call).	2022-01-02 13:58:22 +02:00
Viktor Söderqvist	45a155bd0f	Wait for replicas when shutting down (#9872 ) To avoid data loss, this commit adds a grace period for lagging replicas to catch up the replication offset. Done: * Wait for replicas when shutdown is triggered by SIGTERM and SIGINT. * Wait for replicas when shutdown is triggered by the SHUTDOWN command. A new blocked client type BLOCKED_SHUTDOWN is introduced, allowing multiple clients to call SHUTDOWN in parallel. Note that they don't expect a response unless an error happens and shutdown is aborted. * Log warning for each replica lagging behind when finishing shutdown. * CLIENT_PAUSE_WRITE while waiting for replicas. * Configurable grace period 'shutdown-timeout' in seconds (default 10). * New flags for the SHUTDOWN command: - NOW disables the grace period for lagging replicas. - FORCE ignores errors writing the RDB or AOF files which would normally prevent a shutdown. - ABORT cancels ongoing shutdown. Can't be combined with other flags. * New field in the output of the INFO command: 'shutdown_in_milliseconds'. The value is the remaining maximum time to wait for lagging replicas before finishing the shutdown. This field is present in the Server section only during shutdown. Not directly related: * When shutting down, if there is an AOF saving child, it is killed even if AOF is disabled. This can happen if BGREWRITEAOF is used when AOF is off. * Client pause now has end time and type (WRITE or ALL) per purpose. The different pause purposes are CLIENT PAUSE command, failover and shutdown. If clients are unpaused for one purpose, it doesn't affect client pause for other purposes. For example, the CLIENT UNPAUSE command doesn't affect client pause initiated by the failover or shutdown procedures. A completed failover or a failed shutdown doesn't unpause clients paused by the CLIENT PAUSE command. Notes: * DEBUG RESTART doesn't wait for replicas. * We already have a warning logged when a replica disconnects. 
This means that if any replica connection is lost during the shutdown, it is either logged as disconnected or as lagging at the time of exit. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-01-02 09:50:15 +02:00
yoav-steinberg	1bf6d6f11e	Generate RDB with Functions only via redis-cli --functions-rdb (#9968 ) This is needed in order to ease the deployment of functions for ephemeral cases, where user needs to spin up a server with functions pre-loaded. #### Details: * Added `--functions-rdb` option to _redis-cli_. * Functions only rdb via `REPLCONF rdb-filter-only functions`. This is a placeholder for a space separated inclusion filter for the RDB. In the future can be `REPLCONF rdb-filter-only "functions db:3 key-patten:user"` and a complementing `rdb-filter-exclude` `REPLCONF` can also be added. Handle "slave requirements" specification to RDB saving code so we can use the same RDB when different slaves express the same requirements (like functions-only) and not share the RDB when their requirements differ. This is currently just a flags `int`, but can be extended to a more complex structure with various filter fields. * make sure to support filters only in diskless replication mode (not to override the persistence file), we do that by forcing diskless (even if disabled by config) other changes: * some refactoring in rdb.c (extract portion of a big function to a sub-function) * rdb_key_save_delay used in AOFRW too * sendChildInfo takes the number of updated keys (incremental, rather than absolute) Co-authored-by: Oran Agra <oran@redislabs.com>	2022-01-02 09:39:01 +02:00
sundb	888e92eb57	Fix a valgrind test failure due to slow shutdown (#10038 ) This PR mainly solves the problem that the redis process cannot exit normally, due to changes in #10003. When a test uses the `key-load-delay` config to delay loading, but does not reset it at the end of the test, the server will wait for the loading to reach the event loop (once in 2mb) before actually shutting down.	2022-01-01 17:45:13 +02:00
Binbin	4836ae32c7	redis-cli: Add -X option and extend --cluster call take arg from stdin (#9980 ) There are two changes in this commit: 1. Add -X option to redis-cli. Currently `-x` can only be used to provide the last argument, so you can do `redis-cli dump keyname > key.dump`, and then do `redis-cli -x restore keyname 0 < key.dump`. But what if you want to add the replace argument (which comes last)? oran suggested adding such usage: `redis-cli -X <tag> restore keyname <tag> replace < key.dump` i.e. you're able to provide a string in the arguments that's gonna be substituted with the content from stdin. Note that the tag name should not conflict with other non-replaced args. And the -x and -X options conflict with each other. Some usages: ``` [root]# echo mypasswd | src/redis-cli -X passwd_tag mset username myname password passwd_tag OK [root]# echo username > username.txt [root]# head -c -1 username.txt | src/redis-cli -X name_tag mget name_tag password 1) "myname" 2) "mypasswd\n" ``` 2. Handle the combination of both `-x` and `--cluster` or `-X` and `--cluster` Extend the broadcast option to receive the last arg or <tag> arg from the stdin. Now we can use `redis-cli -x --cluster call <host>:<port> cmd`, or `redis-cli -X <tag> --cluster call <host>:<port> cmd <tag>`. (support part of #9899)	2021-12-30 12:10:04 +02:00
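The `-X <tag>` substitution described above amounts to swapping a placeholder argument for the bytes read from stdin. The sketch below models that idea only; it is not redis-cli's code, the function name is hypothetical, and replacing every argument equal to the tag is an assumption about the behavior.

```python
def substitute_tag(args, tag, stdin_data):
    """Return a copy of args with each argument equal to tag replaced.

    Assumption: every exact match is replaced, which is why the tag
    should not collide with any argument meant to be sent literally.
    """
    return [stdin_data if arg == tag else arg for arg in args]


# Mirrors the first usage in the commit message: stdin carries "mypasswd\n".
command = substitute_tag(
    ["mset", "username", "myname", "password", "passwd_tag"],
    "passwd_tag",
    "mypasswd\n",
)
```

The trailing newline from `echo` ends up in the stored value, which matches the `"mypasswd\n"` reply shown in the commit message.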
Binbin	e84ccc3f56	sanitize dump payload: fix crash when zset with NAN score (#10002 ) `zslInsert` with a NAN score will crash the server. This one was found by the `corrupt-dump-fuzzer`.	2021-12-26 11:40:11 +02:00
Oran Agra	b7567394e1	resolve replication test timing sensitivity - 2nd attempt (#9988 ) issue started failing after #9878 was merged (made an existing test more sensitive) looks like #9982 didn't help, tested this one and it seems to work better. this commit does two things: 1. reduce the extra delay i added earlier and instead add more keys, the effect on the duration of replication is the same, but the intervals in which the server is responsive to the tcl client is higher. 2. improve the test infra to print context when assert_error fails.	2021-12-22 23:37:12 +02:00
Oran Agra	e33e0295bb	resolve replication test timing sensitivity (#9982 ) issue started failing after #9878 was merged (made an existing test more sensitive)	2021-12-22 16:05:53 +02:00
Oran Agra	41e6e05dee	Allow most CONFIG SET during loading, block some commands in async-loading (#9878 ) ## background Till now CONFIG SET was blocked during loading. (In the not so distant past, GET was disallowed too) We recently (not released yet) added an async-loading mode, see #9323, and during that time it'll serve CONFIG SET and any other command. And now we realized (#9770) that some configs, and commands are dangerous during async-loading. ## changes * Allow most CONFIG SET during loading (both on async-loading and normal loading) * Allow CONFIG REWRITE and CONFIG RESETSTAT during loading * Block a few config during loading (`appendonly`, `repl-diskless-load`, and `dir`) * Block a few commands during loading (list below) ## the blocked commands: * SAVE - obviously we don't wanna start a foregreound save during loading 8-) * BGSAVE - we don't mind to schedule one, but we don't wanna fork now * BGREWRITEAOF - we don't mind to schedule one, but we don't wanna fork now * MODULE - we obviously don't wanna unload a module during replication / rdb loading (MODULE HELP and MODULE LIST are not blocked) * SYNC / PSYNC - we're in the middle of RDB loading from master, must not allow sync requests now. * REPLICAOF / SLAVEOF - we're in the middle of replicating, maybe it makes sense to let the user abort it, but he couldn't do that so far, i don't wanna take any risk of bugs due to odd state. * CLUSTER - only allow [HELP, SLOTS, NODES, INFO, MYID, LINKS, KEYSLOT, COUNTKEYSINSLOT, GETKEYSINSLOT, RESET, REPLICAS, COUNT_FAILURE_REPORTS], for others, preserve the status quo ## other fixes * processEventsWhileBlocked had an issue when being nested, this could happen with a busy script during async loading (new), but also in a busy script during AOF loading (old). this lead to a crash in the scenario described in #6988	2021-12-22 14:11:16 +02:00
zhugezy	1b0968df46	Remove EVAL script verbatim replication, propagation, and deterministic execution logic (#9812 ) # Background The main goal of this PR is to remove relevant logics on Lua script verbatim replication, only keeping effects replication logic, which has been set as default since Redis 5.0. As a result, Lua in Redis 7.0 would be acting the same as Redis 6.0 with default configuration from users' point of view. There are lots of reasons to remove verbatim replication. Antirez has listed some of the benefits in Issue #5292: >1. No longer need to explain to users side effects into scripts. They can do whatever they want. >2. No need for a cache about scripts that we sent or not to the slaves. >3. No need to sort the output of certain commands inside scripts (SMEMBERS and others): this both simplifies and gains speed. >4. No need to store scripts inside the RDB file in order to startup correctly. >5. No problems about evicting keys during the script execution. When looking back at Redis 5.0, antirez and core team decided to set the config `lua-replicate-commands yes` by default instead of removing verbatim replication directly, in case some bad situations happened. 3 years later now before Redis 7.0, it's time to remove it formally. # Changes - configuration for lua-replicate-commands removed - created config file stub for backward compatibility - Replication script cache removed - this is useless under script effects replication - relevant statistics also removed - script persistence in RDB files is also removed - Propagation of SCRIPT LOAD and SCRIPT FLUSH to replica / AOF removed - Deterministic execution logic in scripts removed (i.e. don't run write commands after random ones, and sorting output of commands with random order) - the flags indicating which commands have non-deterministic results are kept as hints to clients. 
- `redis.replicate_commands()` & `redis.set_repl()` changed - now `redis.replicate_commands()` does nothing and return an 1 - ...and then `redis.set_repl()` can be issued before `redis.replicate_commands()` now - Relevant TCL cases adjusted - DEBUG lua-always-replicate-commands removed # Other changes - Fix a recent bug comparing CLIENT_ID_AOF to original_client->flags instead of id. (introduced in #9780) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-12-21 08:32:42 +02:00
Oran Agra	6add1b7217	Add external test that runs without debug command (#9964 ) - add needs:debug flag for some tests - disable "save" in external tests (speedup?) - use debug_digest proc instead of debug command directly so it can be skipped - use OBJECT ENCODING instead of DEBUG OBJECT to get encoding - add a proc for OBJECT REFCOUNT so it can be skipped - move a bunch of tests in latency_monitor tests to happen later so that latency monitor has some values in it - add missing close_replication_stream calls - make sure to close the temp client if DEBUG LOG fails	2021-12-19 17:41:51 +02:00
sundb	7f0fae947a	Sanitize dump payload: fix crash when stream with duplicate consumers (#9918 ) When rdb creates a consumer without determining whether it exists in advance, it may return NULL and crash if it encounters corrupt data with duplicate consumers.	2021-12-08 18:11:57 +02:00
Binbin	b947049f85	Fix timing issue in logging.tcl with FreeBSD (#9910 ) A test failure was reported in Daily CI. `Crash report generated on SIGABRT` with FreeBSD. ``` *** [err]: Crash report generated on SIGABRT in tests/integration/logging.tcl Expected [string match crashed by signal ### Starting...(logs) in tests/integration/logging.tcl] ``` It look like `tail -1000` was executed too early, before it printed out all the crash logs. We can give it a few more chances by using `wait_for_log_messages`. Other changes: 1. In `Server is able to generate a stack trace on selected systems`, use `wait_for_log_messages`to reduce the lines of code. And if it fails, there are more detailed logs that can be printed. 2. In `Crash report generated on DEBUG SEGFAULT`, we also use `wait_for_log_messages` to avoid possible timing issues.	2021-12-07 12:02:58 +02:00
sundb	1808618f5d	Sanitize dump payload: fix invalid listpack entry start with EOF (#9889 ) When an invalid listpack entry starts with EOF, we will skip it when we verify it in the loop.	2021-12-04 16:43:08 +02:00
meir@redislabs.com	cbd463175f	Redis Functions - Added redis function unit and Lua engine Redis function unit is located inside functions.c and contains Redis Function implementation: 1. FUNCTION commands: * FUNCTION CREATE * FCALL * FCALL_RO * FUNCTION DELETE * FUNCTION KILL * FUNCTION INFO 2. Register engine In addition, this commit introduces the first engine that uses the Redis Function capabilities, the Lua engine.	2021-12-02 19:35:52 +02:00
Viktor Söderqvist	acf3495eb8	Sort out the mess around writable replicas and lookupKeyRead/Write (#9572 ) Writable replicas now no longer use the values of expired keys. Expired keys are deleted when lookupKeyWrite() is used, even on a writable replica. Previously, writable replicas could use the value of an expired key in write commands such as INCR, SUNIONSTORE, etc.. This commit also sorts out the mess around the functions lookupKeyRead() and lookupKeyWrite() so they now indicate what we intend to do with the key and are not affected by the command calling them. Multi-key commands like SUNIONSTORE, ZUNIONSTORE, COPY and SORT with the store option now use lookupKeyRead() for the keys they're reading from (which will not allow reading from logically expired keys). This commit also fixes a bug where PFCOUNT could return a value of an expired key. Test modules commands have their readonly and write flags updated to correctly reflect their lookups for reading or writing. Modules are not required to correctly reflect this in their command flags, but this change is made for consistency since the tests serve as usage examples. Fixes #6842. Fixes #7475.	2021-11-28 11:26:28 +02:00
sundb	4512905961	Replace ziplist with listpack in quicklist (#9740 ) Part three of implementing #8702, following #8887 and #9366 . ## Description of the feature 1. Replace the ziplist container of quicklist with listpack. 2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation. ## Interface changes 1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`. 2. Replace `debug ziplist` command with `debug listpack`. ## Internal changes 1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`) 2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`) 3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following #9357 ). It represent that a quicklistNode is a packed node, as opposed to a plain node. 4. Remove `createZiplistObject` method, which is never used. 5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`. We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k. ## Improvements 1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at #9366. 2. Optimize `quicklistAppendPlainNode` to avoid memcpy data. ## Bugfix 1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from #9366. ## Test 1. Add unittest for `lpMerge`. 2. Modify the old quicklist ziplist corrupt dump test. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-24 13:34:13 +02:00
Binbin	fb4f7be22c	Wait for `async_loading` to stop in `short read` test (#9841 ) In #9323, when `repl-diskless-load` is enabled and set to `swapdb`, if the master replication ID hasn't changed, we can load the data set asynchronously and serve read commands during the full resync. In the `diskless loading short read` test, after a successful load, we wait for the loading to stop and then continue the for loop. After the introduction of `async_loading`, we also need to check it. Otherwise the next loop iteration will start too soon, which may trigger a timing issue.	2021-11-24 12:46:43 +02:00
Oran Agra	a3a014294f	fix invalid read on corrupt ziplist (#9831 ) If the last bytes in ziplist are corrupt and we decode from tail to head, we may reach slightly outside the ziplist.	2021-11-23 14:56:52 +02:00
Oran Agra	f07dedf73f	Fix invalid access in lpFind on corrupted listpack (#9819 ) Issue found by corrupt-dump-fuzzer test with ASAN. The problem was that lpSkip and lpGetWithSize could read the next listpack entry without validating that it's in range. Similarly even the memcmp in lpFind could do that and possibly crash on segfault and now they'll crash on assert first. The naive fix of using lpAssertValidEntry every time, resulted in 30% degradation in the lpFind benchmark of the unit test. The final fix with the condition at the bottom has no performance implications.	2021-11-22 15:30:00 +02:00
Oran Agra	f00a8ad93c	fix string escaping in corrupt-dump test to support TCL8.5 (#9824 ) TCL8.5 can't handle cases where part of the string is escaped and part of it isn't, if there's a single char that needs escaping, we need to escape the whole string.	2021-11-22 12:30:06 +02:00
Oran Agra	183b90a625	Fix false positive leak reported by GCC ASAN (#9816 ) Leak found by the corrupt-dump-fuzzer when using GCC ASAN, which seems to falsely report leaks on pointers kept only on the stack when calling exit. Instead we now use _exit on panic / assert to skip these leak checks. Additionally, check for sanitizer warnings in the corrupt-dump-fuzzer between iterations, so that when something is found we know which test to relate it to (and it prints the reproduction command list).	2021-11-21 18:47:10 +02:00
Oran Agra	1417648469	Prevent LCS from allocating temp memory over proto-max-bulk-len (#9817 ) LCS can allocate an immense amount of memory (sizes of two inputs multiplied by each other). In the past this caused some possible security issues due to overflows, which we solved and also added use of `trymalloc` to return "Insufficient memory" instead of OOM panic zmalloc. But in case overcommit is enabled, it could be that we won't get the OOM panic, and zmalloc will succeed, and then we can get OOM killed by the kernel. The solution here is to prevent LCS from allocating transient memory that's bigger than the `proto-max-bulk-len` config. This config is not directly related to transient memory, but using a hard coded value as well as introducing a specific config seems wrong. This comes to solve an error in the corrupt-dump-fuzzer test that started in the daily CI, see #9799	2021-11-21 14:30:20 +02:00
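The guard described in the LCS commit above can be sketched roughly as follows. This is an illustration only, not the Redis implementation: the function names and the 4-byte cell size are assumptions, and the limit is passed in as a plain integer standing in for `proto-max-bulk-len`.

```python
# Hypothetical sketch of an up-front size check for LCS.
# CELL_SIZE and both function names are assumptions for illustration.
CELL_SIZE = 4  # assumed bytes per entry of the dynamic-programming table


def lcs_table_bytes(a_len: int, b_len: int) -> int:
    """Transient memory for an (a_len + 1) x (b_len + 1) LCS table."""
    return (a_len + 1) * (b_len + 1) * CELL_SIZE


def lcs_allowed(a_len: int, b_len: int, proto_max_bulk_len: int) -> bool:
    """Reject the request before allocating, instead of relying on malloc failing."""
    return lcs_table_bytes(a_len, b_len) <= proto_max_bulk_len
```

Checking the product before allocating matters when overcommit is enabled, because the allocation itself may succeed and the process is only killed later.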
Ozan Tezcan	b91d8b289b	Add sanitizer support and clean up sanitizer findings (#9601 ) - Added sanitizer support. `address`, `undefined` and `thread` sanitizers are available. - To build Redis with desired sanitizer : `make SANITIZER=undefined` - There were some sanitizer findings, cleaned up codebase - Added tests with address and undefined behavior sanitizers to daily CI. - Added tests with address sanitizer to the per-PR CI (smoke out mem leaks sooner). Basically, there are three types of issues : 1- Unaligned load/store : Most probably, this issue may cause a crash on a platform that does not support unaligned access. Redis does unaligned access only on supported platforms. 2- Signed integer overflow. Although, signed overflow issue can be problematic time to time and change how compiler generates code, current findings mostly about signed shift or simple addition overflow. For most platforms Redis can be compiled for, this wouldn't cause any issue as far as I can tell (checked generated code on godbolt.org). 3 -Minor leak (redis-cli), use-after-free(just before calling exit()); UB means nothing guaranteed and risky to reason about program behavior but I don't think any of the fixes here worth backporting. As sanitizers are now part of the CI, preventing new issues will be the real benefit.	2021-11-11 13:51:33 +02:00
Oran Agra	0927a0dd24	Try solving test timeout on freebsd CI (#9768 ) First, avoid using --accurate on the freebsd CI, we only care about systematic issues there due to being different platform, but not accuracy Secondly, when looking at the test which timed out it seems silly and outdated: - it used KEYS to attempt to trigger lazy expiry, but KEYS doesn't do that anymore. - it used some hard coded sleeps rather than waiting for things to happen and exiting ASAP	2021-11-10 19:39:26 +02:00
YaacovHazan	03406fcb6c	fix short timeout in replication short read tests (#9763 ) In both tests, "diskless loading short read" and "diskless loading short read with module", the timeout of waiting for the replica to respond to a short read and log it, is too short. Also, add --dump-logs in runtest-moduleapi for valgrind runs.	2021-11-09 22:37:18 +02:00
Eduardo Semprebon	91d0c758e5	Replica keep serving data during repl-diskless-load=swapdb for better availability (#9323 ) For diskless replication in swapdb mode, considering we already spend replica memory having a backup of current db to restore in case of failure, we can have the following benefits by instead swapping database only in case we succeeded in transferring db from master: - Avoid `LOADING` response during failed and successful synchronization for cases where the replica is already up and running with data. - Faster total time of diskless replication, because now we're moving from Transfer + Flush + Load time to Transfer + Load only. Flushing the tempDb is done asynchronously after swapping. - This could be implemented also for disk replication with similar benefits if consumers are willing to spend the extra memory usage. General notes: - The concept of `backupDb` becomes `tempDb` for clarity. - Async loading mode will only kick in if the replica is syncing from a master that has the same repl-id the one it had before. i.e. the data it's getting belongs to a different time of the same timeline. - New property in INFO: `async_loading` to differentiate from the blocking loading - Slot to Key mapping is now a field of `redisDb` as it's more natural to access it from both server.db and the tempDb that is passed around. - Because this is affecting replicas only, we assume that if they are not readonly and write commands during replication, they are lost after SYNC same way as before, but we're still denying CONFIG SET here anyways to avoid complications. 
Considerations for review: - We have many cases where server.loading flag is used and even though I tried my best, there may be cases where async_loading should be checked as well and cases where it shouldn't (would require very good understanding of whole code) - Several places that had different behavior depending on the loading flag where actually meant to just handle commands coming from the AOF client differently than ones coming from real clients, changed to check CLIENT_ID_AOF instead. Additional for Release Notes - Bugfix - server.dirty was not incremented for any kind of diskless replication, as effect it wouldn't contribute on triggering next database SAVE - New flag for RM_GetContextFlags module API: REDISMODULE_CTX_FLAGS_ASYNC_LOADING - Deprecated RedisModuleEvent_ReplBackup. Starting from Redis 7.0, we don't fire this event. Instead, we have the new RedisModuleEvent_ReplAsyncLoad holding 3 sub-events: STARTED, ABORTED and COMPLETED. - New module flag REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD for RedisModule_SetModuleOptions to allow modules to declare they support the diskless replication with async loading (when absent, we fall back to disk-based loading). Co-authored-by: Eduardo Semprebon <edus@saxobank.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-04 10:46:50 +02:00
menwen	ccf8a651f3	Retry when a blocked connection system call is interrupted by a signal (#9629 ) When repl-diskless-load is enabled, the connection is set to the blocking state. The connection may be interrupted by a signal during a system call. This would have resulted in a disconnection and possibly a reconnection loop. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-04 09:09:28 +02:00
perryitay	f27083a4a8	Add support for list type to store elements larger than 4GB (#9357 ) Redis lists are stored in quicklist, which is currently a linked list of ziplists. Ziplists are limited to storing elements no larger than 4GB, so when bigger items are added they're getting truncated. This PR changes quicklists so that they're capable of storing large items in quicklist nodes that are plain string buffers rather than ziplist. As part of the PR there were a few other changes in redis: 1. new DEBUG sub-commands: - QUICKLIST-PACKED-THRESHOLD - set the threshold for the node type to be plain or ziplist. default (1GB) - QUICKLIST <key> - Shows low level info about the quicklist encoding of <key> 2. rdb format change: - A new type was added - RDB_TYPE_LIST_QUICKLIST_2 . - container type (packed / plain) was added to the beginning of the rdb object (before the actual node list). 3. testing: - Tests that require over 100MB will be skipped by default. a new flag was added to 'runtest' to run the large memory tests (not used by default) Co-authored-by: sundb <sundbcn@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-03 20:47:18 +02:00
Binbin	58a1d16ff6	Fix timing issue in replication test (#9719 ) So it looks like sampling set loglines [count_log_lines -2] was executed too late, and the replication managed to complete before that. ``` *** [err]: diskless no replicas drop during rdb pipe in tests/integration/replication.tcl log message of '"Diskless rdb transfer, done reading from pipe, 2 replicas still up"' not found in ./tests/tmp/server.6124.69/stdout after line: 52 till line: 52 ``` Changes: 1. when we search the master log file, we start to search from before we sent the REPLICAOF command, to prevent a race in which the replication completed before we sampled the log line count. 2. we don't need to sample the replica loglines since it's a fresh replica that's just been started, so the message we're looking for is the first occurrence in the log, we can start the search from 0. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-02 10:32:01 +02:00
Binbin	cea7809cea	Fix race condition in psync2-pingoff test (#9712 ) Test failed on freebsd: ``` *** [err]: Make the old master a replica of the new one and check conditions in tests/integration/psync2-pingoff.tcl Expected '162' to be equal to '176' (context: type eval line 18 cmd {assert_equal [status $R(0) master_repl_offset] [status $R(1) master_repl_offset]} proc ::test) ``` There are two possible race conditions in the test. 1. The code waits for sync_full to increment, and assumes that means the master did the fork. But in fact there are cases the master will increment that sync_full counter (after replica asks for sync), but will see that there's already a fork running and will delay the fork creation. In this case the INCR will be executed before the fork happens, so it'll not be in the command stream. Solve that by waiting for `master_link_status: up` on the replica before the INCR. 2. The repl-ping-replica-period is still high (1 second), so there's a chance the master will send an additional PING between the two calls to INFO (the line that fails is the one that samples INFO from both servers). So there's a chance one of them will have an incremented offset due to PING and the other won't have it yet. In theory we can wait for the repl_offset to match, but then we risk facing a situation where that race will hide an offset mis-match. so instead, i think we should just change repl-ping-replica-period to prevent further pings from being pushed. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-01 16:07:08 +02:00
Wang Yuan	68886de085	Fix timing issue in replication buffer test (#9697 ) Introduced in #9166	2021-10-29 08:04:12 +03:00
Wang Yuan	37dc2f13b4	Fix not waiting for data loading to complete in AOF tests (#9683 ) Fix timing issue of a new test introduced in #9326	2021-10-26 14:08:09 +03:00
Wang Yuan	9ec3294b97	Add timestamp annotations in AOF (#9326 ) Add timestamp annotation in AOF, one part of #9325. Enabled with the new `aof-timestamp-enabled` config option. Timestamp annotation format is "#TS:${timestamp}\r\n". "TS" is short for timestamp, and this method saves extra bytes in the AOF. We can use timestamp annotations for some special functions. - know the executing time of commands - restore data to a specific point-in-time (by using redis-check-rdb to truncate the file)	2021-10-25 13:08:34 +03:00
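To illustrate how the `#TS:${timestamp}\r\n` annotations above could support point-in-time restore, here is a minimal sketch that finds where to cut an AOF byte string. The function name is hypothetical, and the scan is a simplification: it uses a plain regex and does not parse RESP, so it would be fooled by a value that happens to contain `#TS:`.

```python
import re

# Matches an annotation of the form "#TS:<digits>\r\n" (format from the commit message).
TS_LINE = re.compile(rb"#TS:(\d+)\r\n")


def truncate_offset(aof: bytes, target_ts: int) -> int:
    """Return the byte offset at which to cut so that only commands
    annotated at or before target_ts remain (hypothetical helper)."""
    for match in TS_LINE.finditer(aof):
        if int(match.group(1)) > target_ts:
            return match.start()
    return len(aof)  # no later annotation: keep the whole file
```

Usage would be along the lines of `aof[:truncate_offset(aof, 150)]` to keep everything recorded up to timestamp 150.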
Wang Yuan	c1718f9d86	Replication backlog and replicas use one global shared replication buffer (#9166 ) ## Background For redis master, one replica uses one copy of replication buffer, that is a big waste of memory, more replicas more waste, and allocate/free memory for every reply list also cost much. If we set client-output-buffer-limit small and write traffic is heavy, master may disconnect with replicas and can't finish synchronization with replica. If we set client-output-buffer-limit big, master may be OOM when there are many replicas that separately keep much memory. Because replication buffers of different replica client are the same, one simple idea is that all replicas only use one replication buffer, that will effectively save memory. Since replication backlog content is the same as replicas' output buffer, now we can discard replication backlog memory and use global shared replication buffer to implement replication backlog mechanism. ## Implementation I create one global "replication buffer" which contains content of replication stream. The structure of "replication buffer" is similar to the reply list that exists in every client. But the node of list is `replBufBlock`, which has `id, repl_offset, refcount` fields.
```c
/* Replication buffer blocks is the list of replBufBlock.
 *
 * +--------------+       +--------------+       +--------------+
 * | refcount = 1 |  ...  | refcount = 0 |  ...  | refcount = 2 |
 * +--------------+       +--------------+       +--------------+
 *      |                                            /       \
 *      |                                           /         \
 *      |                                          /           \
 *  Repl Backlog                               Replica_A     Replica_B
 *
 * Each replica or replication backlog increments only the refcount of the
 * 'ref_repl_buf_node' which it points to. So when replica walks to the next
 * node, it should first increase the next node's refcount, and when we trim
 * the replication buffer nodes, we remove node always from the head node which
 * refcount is 0. If the refcount of the head node is not 0, we must stop
 * trimming and never iterate the next node. */

/* Similar with 'clientReplyBlock', it is used for shared buffers between
 * all replica clients and replication backlog. */
typedef struct replBufBlock {
    int refcount;           /* Number of replicas or repl backlog using. */
    long long id;           /* The unique incremental number. */
    long long repl_offset;  /* Start replication offset of the block. */
    size_t size, used;
    char buf[];
} replBufBlock;
```
So now when we feed replication stream into replication backlog and all replicas, we only need to feed stream into replication buffer `feedReplicationBuffer`. In this function, we set some fields of replication backlog and replicas to references of the global replication buffer blocks. And we also need to check replicas' output buffer limit to free if exceeding `client-output-buffer-limit`, and trim replication backlog if exceeding `repl-backlog-size`. When sending reply to replicas, we also need to iterate replication buffer blocks and send its content, when totally sending one block for replica, we decrease current node count and increase the next current node count, and then free the block which reference is 0 from the head of replication buffer blocks. Since now we use linked list to manage replication backlog, it may cost much time for iterating all linked list nodes to find corresponding replication buffer node. So we create a rax tree to store some nodes for index, but to avoid rax tree occupying too much memory, I record one per 64 nodes for index. Currently, to make partial resynchronization as likely as possible, we always let replication backlog as the last reference of replication buffer blocks, backlog size may exceed our setting if slow replicas that reference vast replication buffer blocks, and this method doesn't increase memory usage since they share replication buffer. 
To avoid freezing server for freeing unreferenced replication buffer blocks when we need to trim backlog for exceeding backlog size setting, we trim backlog incrementally (free 64 blocks per call now), and make it faster in `beforeSleep` (free 640 blocks). ### Other changes - `mem_total_replication_buffers`: we add this field in INFO command, it means the total memory of replication buffers used. - `mem_clients_slaves`: now even replica is slow to replicate, and its output buffer memory is not 0, but it still may be 0, since replication backlog and replicas share one global replication buffer, only if replication buffer memory is more than the repl backlog setting size, we consider the excess as replicas' memory. Otherwise, we think replication buffer memory is the consumption of repl backlog. - Key eviction Since all replicas and replication backlog share global replication buffer, we think only the part of exceeding backlog size the extra separate consumption of replicas. Because we trim backlog incrementally in the background, backlog size may exceeds our setting if slow replicas that reference vast replication buffer blocks disconnect. To avoid massive eviction loop, we don't count the delayed freed replication backlog into used memory even if there are no replicas, i.e. we also regard this memory as replicas's memory. - `client-output-buffer-limit` check for replica clients It doesn't make sense to set the replica clients output buffer limit lower than the repl-backlog-size config (partial sync will succeed and then replica will get disconnected). Such a configuration is ignored (the size of repl-backlog-size will be used). This doesn't have memory consumption implications since the replica client will share the backlog buffers memory. 
- Drop replication backlog after loading data if needed We always create replication backlog if server is a master, we need it because we put DELs in it when loading expired keys in RDB, but if RDB doesn't have replication info or there is no rdb, it is not possible to support partial resynchronization, to avoid extra memory of replication backlog, we drop it. - Multi IO threads Since all replicas and replication backlog use global replication buffer, if I/O threads are enabled, to guarantee data accessing thread safe, we must let main thread handle sending the output buffer to all replicas. But before, other IO threads could handle sending output buffer of all replicas. ## Other optimizations This solution resolve some other problem: - When replicas disconnect with master since of out of output buffer limit, releasing the output buffer of replicas may freeze server if we set big `client-output-buffer-limit` for replicas, but now, it doesn't cause freezing. - This implementation may mitigate reply list copy cost time(also freezes server) when one replication has huge reply buffer and another replica can copy buffer for full synchronization. now, we just copy reference info, it is very light. - If we set replication backlog size big, it also may cost much time to copy replication backlog into replica's output buffer. But this commit eliminates this problem. - Resizing replication backlog size doesn't empty current replication backlog content.	2021-10-25 09:24:31 +03:00
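The refcount and head-trimming rules from the shared replication buffer commit above can be sketched as a small model. This is a simplified illustration, not the Redis code: the class and method names are hypothetical, blocks hold whole byte strings, and the rax index, output buffer limits and incremental trimming are left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Block:
    id: int            # unique incremental number, as in replBufBlock
    data: bytes
    refcount: int = 0  # number of readers (replicas or backlog) pointing here


class SharedReplBuffer:
    """Hypothetical model of one buffer shared by the backlog and all replicas."""

    def __init__(self) -> None:
        self.blocks = deque()  # oldest block at the left (head)
        self.next_id = 0
        self.pos = {}          # reader name -> Block it currently references

    def feed(self, data: bytes) -> Block:
        block = Block(self.next_id, data)
        self.next_id += 1
        self.blocks.append(block)
        return block

    def attach(self, reader: str, block: Block) -> None:
        block.refcount += 1
        self.pos[reader] = block

    def advance(self, reader: str) -> bool:
        """Move a reader forward: reference the next block first, then release the current one."""
        cur = self.pos[reader]
        idx = cur.id - self.blocks[0].id  # ids are consecutive, so this is the deque index
        if idx + 1 >= len(self.blocks):
            return False
        nxt = self.blocks[idx + 1]
        nxt.refcount += 1
        cur.refcount -= 1
        self.pos[reader] = nxt
        return True

    def trim(self) -> int:
        """Free blocks from the head only, stopping at the first one still referenced."""
        freed = 0
        while self.blocks and self.blocks[0].refcount == 0:
            self.blocks.popleft()
            freed += 1
        return freed
```

Because trimming stops at the first referenced head block, a slow reader pins everything after it, which is the behavior the commit message describes for slow replicas.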
Oran Agra	276b460ea9	Attempt to fix a valgrind test failure due to timing (#9643 ) in the past few days i've seen two failures in the valgrind daily test. *** [err]: slave fails full sync and diskless load swapdb recovers it in tests/integration/replication.tcl Replica didn't get into loading mode can't reproduce it, but i'm hoping it's just too slow (to start loading within 5 seconds)	2021-10-18 10:45:45 +03:00
YaacovHazan	5becb7c9c6	improve the stability and correctness of "Test child sending info" (#9562 ) Since we measure the COW size in this test by changing some keys and reading the reported COW size, we need to ensure that the "dismiss mechanism" (#8974) will not free memory and reduce the COW size. For that, this commit changes the size of the keys to 512B (less than a page). and because some keys may fall into the same page, we are modifying ten keys on each iteration and check for at least 50% change in the COW size.	2021-10-04 10:32:26 +03:00
Oran Agra	5a4ab7c7d2	Fix stream sanitization for non-int first value (#9553 ) This was recently broken in #9321 when we validated stream IDs to be integers but did that after stepping to the next record instead of before.	2021-09-26 18:46:22 +03:00
Binbin	14d6abd8e9	Add ZMPOP/BZMPOP commands. (#9484 ) This is similar to the recent addition of LMPOP/BLMPOP (#9373), but for zsets. Syntax for the new ZMPOP command: `ZMPOP numkeys [<key> ...] MIN|MAX [COUNT count]` Syntax for the new BZMPOP command: `BZMPOP timeout numkeys [<key> ...] MIN|MAX [COUNT count]` Some background: - ZPOPMIN/ZPOPMAX take only one key, and can return multiple elements. - BZPOPMIN/BZPOPMAX take multiple keys, but return only one element from just one key. - ZMPOP/BZMPOP can take multiple keys, and can return multiple elements from just one key. Note that although ZMPOP/BZMPOP can take multiple keys, it eventually operates on just one key. And it will propagate as ZPOPMIN or ZPOPMAX with the COUNT option. As these are new commands, if we can not pop any elements, the response is: - ZMPOP: Return a NIL in both RESP2 and RESP3, unlike ZPOPMIN/ZPOPMAX which return emptyarray. - BZMPOP: Return a NIL in both RESP2 and RESP3 when timeout is reached, like BZPOPMIN/BZPOPMAX. The normal response is nested arrays in RESP2 and RESP3:
```
ZMPOP/BZMPOP
1) keyname
2) 1) 1) member1
      2) score1
   2) 1) member2
      2) score2

In RESP2:
1) "myzset"
2) 1) 1) "three"
      2) "3"
   2) 1) "two"
      2) "2"

In RESP3:
1) "myzset"
2) 1) 1) "three"
      2) (double) 3
   2) 1) "two"
      2) (double) 2
```
	2021-09-23 08:34:40 +03:00
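The ZMPOP selection rule described above (many keys accepted, elements popped from only the first non-empty one) can be modeled in a few lines. This is a sketch over plain dictionaries, not the server implementation: `zmpop` and the `db` layout (key to a member/score dict) are assumptions, and blocking, RESP encoding and propagation are not modeled.

```python
def zmpop(db: dict, keys: list, where: str, count: int = 1):
    """Pop up to `count` members from the first non-empty sorted set in `keys`.

    Returns [key, [[member, score], ...]], or None when nothing can be popped
    (the NIL reply in the commit message).
    """
    if where not in ("MIN", "MAX"):
        raise ValueError("where must be MIN or MAX")
    for key in keys:
        zset = db.get(key)
        if not zset:
            continue  # missing or empty key: try the next one
        ordered = sorted(
            zset.items(),
            key=lambda kv: (kv[1], kv[0]),  # by score, then member
            reverse=(where == "MAX"),
        )
        popped = ordered[:count]
        for member, _score in popped:
            del zset[member]
        if not zset:
            del db[key]  # an emptied sorted set no longer exists
        return [key, [[member, score] for member, score in popped]]
    return None
```

With `db = {"myzset": {"one": 1.0, "two": 2.0, "three": 3.0}}`, calling `zmpop(db, ["myzset"], "MAX", 2)` pops `three` and `two`, matching the example reply in the commit message.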
Oran Agra	16be742b08	fix replication test failure, probing the wrong log file (#9513 )	2021-09-19 12:07:04 +03:00
filipe oliveira	b5a879e1c2	Added URI support to redis-benchmark (cli and benchmark share the same uri-parsing methods) (#9314 ) - Add `-u <uri>` command line option to support `redis://` URI scheme. - included server connection information object (`struct cliConnInfo`), used to describe an ip:port pair, db num user input, and user:pass to avoid a large number of function arguments. - Using sds on connection info strings for redis-benchmark/redis-cli Co-authored-by: yoav-steinberg <yoav@monfort.co.il>	2021-09-14 19:45:06 +03:00
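As a rough picture of what a `redis://` URI carries (an ip:port pair, a db number, and user:pass, per the commit above), here is a small parser built on the Python standard library. It is not the shared C parser from redis-cli and redis-benchmark; the function name, the returned dict layout, and the fallback values `127.0.0.1`, `6379` and db `0` are assumptions for this sketch.

```python
from urllib.parse import unquote, urlsplit


def parse_redis_uri(uri: str) -> dict:
    """Split redis://[user[:password]@]host[:port][/db] into its parts (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme != "redis":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    db = parts.path.lstrip("/")
    return {
        "host": parts.hostname or "127.0.0.1",  # assumed default
        "port": parts.port or 6379,             # assumed default
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "db": int(db) if db else 0,             # assumed default
    }
```

Grouping the result in one object mirrors the motivation given in the commit: passing a single connection-info value around instead of a long list of function arguments.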
zhaozhao.zz	794442b130	PSYNC2: make partial sync possible after master reboot (#8015 ) The main idea is to allow a master to load replication info from the RDB file when rebooting. If the master can load replication info, replicas may have the chance to psync with the master, which can save much traffic. The key point is that we need to guarantee safety and consistency, so there are two differences between master and replica: 1. the master loads the replication info as secondary ID and offset, in case other masters have the same replid. 2. when the master loads the RDB, it propagates expired keys as DEL commands to the replication backlog, so replicas can receive these commands and delete stale keys. p.s. the count of expired keys during RDB loading is useful for users, so we show it as `rdb_last_load_keys_expired` and `rdb_last_load_keys_loaded` in info persistence. Moreover, after loading replication info, the master should update `no_replica_time` in case loading the RDB takes too long.	2021-09-13 15:39:11 +08:00
sundb	3ca6972ecd	Replace all usage of ziplist with listpack for t_zset (#9366 ) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to improve performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete an element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplistCompare`. 4) Add empty listpack zset corrupt dump test.	2021-09-09 18:18:53 +03:00

284 Commits