redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
yoav-steinberg	2753429c99	Client eviction (#8687 ) ### Description A mechanism for disconnecting clients when the total memory used by all connected clients is above a configured limit. This prevents eviction or OOM caused by accumulated used memory between all clients. It's a complementary mechanism to the `client-output-buffer-limit` mechanism which takes into account not only a single client and not only output buffers but rather all memory used by all clients. #### Design The general design is as follows: * We track memory usage of each client, taking into account all memory used by the client (query buffer, output buffer, parsed arguments, etc...). This is kept up to date after reading from the socket, after processing commands and after writing to the socket. * Based on the used memory we sort all clients into buckets. Each bucket contains all clients using up to x2 the memory of the clients in the bucket below it. For example: clients using up to 1m, clients using up to 2m, clients using up to 4m, ... * Before processing a command and before sleep we check if we're over the configured limit. If we are, we start disconnecting clients from larger buckets downwards until we're under the limit. #### Config `maxmemory-clients` max memory all clients are allowed to consume, above this threshold we disconnect clients. This config can either be set to 0 (meaning no limit), a size in bytes (possibly with MB/GB suffix), or as a percentage of `maxmemory` by using the `%` suffix (e.g. setting it to `10%` would mean 10% of `maxmemory`). #### Important code changes * During the development I encountered yet more situations where our io-threads access global vars, and needed to fix them. I also had to handle keeping the clients sorted into the memory buckets (which are global) while their memory usage changes in the io-thread. To achieve this I decided to simplify how we check if we're in an io-thread and make it much more explicit. 
I removed the `CLIENT_PENDING_READ` flag used for checking if the client is in an io-thread (it wasn't used for anything else) and just used the global `io_threads_op` variable the same way to check during writes. * I optimized the cleanup of the client from the `clients_pending_read` list on client freeing. We now store a pointer in the `client` struct to this list so we don't need to search in it (`pending_read_list_node`). * Added `evicted_clients` stat to `INFO` command. * Added `CLIENT NO-EVICT ON|OFF` sub command to exclude a specific client from the client eviction mechanism. Added corresponding 'e' flag in the client info string. * Added `multi-mem` field in the client info string to show how much memory is used up by buffered multi commands. * Client `tot-mem` now accounts for buffered multi-commands, pubsub patterns and channels (partially), tracking prefixes (partially). * CLIENT_CLOSE_ASAP flag is now handled in a new `beforeNextClient()` function so clients will be disconnected between processing different clients and not only before sleep. This new function can be used in the future for work we want to do outside the command processing loop but don't want to wait for all clients to be processed before we get to it. Specifically I wanted to handle output-buffer-limit related closing before we process client eviction in case the two race with each other. * Added a `DEBUG CLIENT-EVICTION` command to print out info about the client eviction buckets. * Each client now holds a pointer to the client eviction memory usage bucket it belongs to and a listNode to itself in that bucket for quick removal. * Global `io_threads_op` variable now can contain a `IO_THREADS_OP_IDLE` value indicating no io-threading is currently being executed. * In order to track memory used by each client in real-time we can't rely on updating these stats in `clientsCron()` alone anymore. 
So now I call `updateClientMemUsage()` (used to be `clientsCronTrackClientsMemUsage()`) after command processing, after writing data to pubsub clients, after writing the output buffer and after reading from the socket (and maybe other places too). The function is written to be fast. * Clients are evicted if needed (with appropriate log line) in `beforeSleep()` and before processing a command (before performing oom-checks and key-eviction). * All clients memory usage buckets are grouped as follows: * All clients using less than 64k. * 64K..128K * 128K..256K * ... * 2G..4G * All clients using 4g and up. * Added client-eviction.tcl with a bunch of tests for the new mechanism. * Extended maxmemory.tcl to test the interaction between maxmemory and maxmemory-clients settings. * Added an option to flag a numeric configuration variable as a "percent", this means that if we encounter a '%' after the number in the config file (or config set command) we consider it as valid. Such a number is stored internally as a negative value. This way an integer value can be interpreted as either a percent (negative) or absolute value (positive). This is useful for example if some numeric configuration can optionally be set to a percentage of something else. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-09-23 14:02:16 +03:00
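The bucket grouping listed in the client eviction commit above (under 64K, then doubling ranges up to 2G..4G, then 4G and up) can be sketched as a small mapping from memory usage to bucket index. This is only an illustration: the names `BUCKET_MIN_LOG`, `BUCKET_MAX_LOG`, `BUCKET_COUNT` and `bucket_index` are hypothetical and are not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical constants matching the ranges in the commit message. */
#define BUCKET_MIN_LOG 16 /* 2^16 = 64K: everything below goes to bucket 0 */
#define BUCKET_MAX_LOG 32 /* 2^32 = 4G: everything from here up shares the last bucket */
#define BUCKET_COUNT (BUCKET_MAX_LOG - BUCKET_MIN_LOG + 2)

/* Map a client's memory usage in bytes to a bucket index.
 * Bucket k (for 1 <= k <= BUCKET_COUNT-2) covers [64K * 2^(k-1), 64K * 2^k). */
static int bucket_index(uint64_t mem) {
    int lg = 0;
    if (mem < (1ULL << BUCKET_MIN_LOG)) return 0;
    if (mem >= (1ULL << BUCKET_MAX_LOG)) return BUCKET_COUNT - 1;
    while (mem >>= 1) lg++; /* lg = floor(log2(mem)) */
    return lg - BUCKET_MIN_LOG + 1;
}
```

With such a mapping, eviction can walk the buckets from the highest index downwards, which matches the "larger buckets downwards" order described in the design section.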
YaacovHazan	a56d4533b7	Adding ACL support for modules (#9309 ) This commit introduced a new flag to the RM_Call: 'C' - Check if the command can be executed according to the ACLs associated with it. Also, three new API's added to check if a command, key, or channel can be executed or accessed by a user, according to the ACLs associated with it. - RM_ACLCheckCommandPerm - RM_ACLCheckKeyPerm - RM_ACLCheckChannelPerm The user for these API's is a RedisModuleUser object, that for a Module user returned by the RM_CreateModuleUser API, or for a general ACL user can be retrieved by these two new API's: - RM_GetCurrentUserName - Retrieve the user name of the client connection behind the current context. - RM_GetModuleUserFromUserName - Get a RedisModuleUser from a user name As a result of getting a RedisModuleUser from name, it can now also access the general ACL users (not just ones created by the module). This mean the already existing API RM_SetModuleUserACL(), can be used to change the ACL rules for such users.	2021-09-23 08:52:56 +03:00
Binbin	14d6abd8e9	Add ZMPOP/BZMPOP commands. (#9484 ) This is similar to the recent addition of LMPOP/BLMPOP (#9373), but for sorted sets. Syntax for the new ZMPOP command: `ZMPOP numkeys [<key> ...] MIN|MAX [COUNT count]` Syntax for the new BZMPOP command: `BZMPOP timeout numkeys [<key> ...] MIN|MAX [COUNT count]` Some background: - ZPOPMIN/ZPOPMAX take only one key, and can return multiple elements. - BZPOPMIN/BZPOPMAX take multiple keys, but return only one element from just one key. - ZMPOP/BZMPOP can take multiple keys, and can return multiple elements from just one key. Note that although ZMPOP/BZMPOP can take multiple keys, it eventually operates on just one key. And it will propagate as ZPOPMIN or ZPOPMAX with the COUNT option. As these are new commands, if no elements can be popped the responses are: - ZMPOP: Return a NIL in both RESP2 and RESP3, unlike ZPOPMIN/ZPOPMAX, which return an empty array. - BZMPOP: Return a NIL in both RESP2 and RESP3 when timeout is reached, like BZPOPMIN/BZPOPMAX. The normal response is a nested array in both RESP2 and RESP3: ``` ZMPOP/BZMPOP 1) keyname 2) 1) 1) member1 2) score1 2) 1) member2 2) score2 In RESP2: 1) "myzset" 2) 1) 1) "three" 2) "3" 2) 1) "two" 2) "2" In RESP3: 1) "myzset" 2) 1) 1) "three" 2) (double) 3 2) 1) "two" 2) (double) 2 ```	2021-09-23 08:34:40 +03:00
Oran Agra	5f7789d329	tune lazyfree test timeout (#9527 ) i've seen this CI failure a couple of times on MacOS: *** [err]: lazy free a stream with all types of metadata in tests/unit/lazyfree.tcl lazyfree isn't done only reason i can think of is that 500ms is sometimes not enough on slow systems.	2021-09-22 09:48:44 +03:00
Oran Agra	16be742b08	fix replication test failure, probing the wrong log file (#9513 )	2021-09-19 12:07:04 +03:00
Binbin	f898a9e97d	Adds limit to SINTERCARD/ZINTERCARD. (#9425 ) Implements the [LIMIT limit] variant of SINTERCARD/ZINTERCARD. Now with the LIMIT, we can stop searching once the cardinality reaches the limit, and return the cardinality ASAP. Note that in SINTERCARD, the old syntax was: `SINTERCARD key [key ...]` In order to add an optional parameter, we must break the old syntax. So the new syntax of SINTERCARD will be consistent with ZINTERCARD. New syntax: `SINTERCARD numkeys key [key ...] [LIMIT limit]`. Note that this means that SINTERCARD has a different syntax than SINTER and SINTERSTORE (taking numkeys argument) As for ZINTERCARD, we can easily add an optional parameter to it. New syntax: `ZINTERCARD numkeys key [key ...] [LIMIT limit]`	2021-09-16 14:07:08 +03:00
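The early-stop idea behind `LIMIT` in the SINTERCARD/ZINTERCARD commit above can be illustrated on two sorted arrays of distinct integers. This is a minimal sketch and not the server's set implementation; the function name `intersect_card` is hypothetical, and treating a limit of 0 as "no limit" is an assumption made for this example.

```c
#include <stddef.h>

/* Count the elements common to two sorted arrays of distinct ints,
 * stopping as soon as `limit` common elements have been found.
 * Assumption for this sketch: limit == 0 means "no limit". */
static size_t intersect_card(const int *a, size_t na,
                             const int *b, size_t nb, size_t limit) {
    size_t i = 0, j = 0, card = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {
            card++;
            i++;
            j++;
            if (limit != 0 && card >= limit) break; /* cardinality reached the limit */
        }
    }
    return card;
}
```

The point of the limit is that a caller who only needs to know whether the intersection has at least N members does not pay for scanning the rest of the inputs.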
Wen Hui	53ad5627b7	Sentinel: Fix failed daily tests, due to race condition (#9501 )	2021-09-15 13:39:50 +03:00
guybe7	03fcc211de	A better approach for COMMAND INFO for movablekeys commands (#8324 ) Fix #7297 The problem: Today, there is no way for a client library or app to know the key name indexes for commands such as ZUNIONSTORE/EVAL and others with "numkeys", since COMMAND INFO returns no useful info for them. For cluster-aware redis clients, this requires 'patching' the client library code specifically for each of these commands or resolving each execution of these commands with COMMAND GETKEYS. The solution: Introducing key specs other than the legacy "range" (first,last,step) The 8th element of the command info array, if it exists, holds an array of key specs. The array may be empty, which indicates the command doesn't take any key arguments, or may contain one or more key-specs, each of which may lead to the discovery of 0 or more key arguments. A client library that doesn't support this key-spec feature will keep using the first,last,step and movablekeys flag which will obviously remain unchanged. A client that supports this key-specs feature needs only to look at the key-specs array. If it finds an unrecognized spec, it must resort to using COMMAND GETKEYS if it wishes to get all key name arguments, but if all it needs is one key in order to know which cluster node to use, then maybe another spec (if the command has several) can supply that, and there's no need to use GETKEYS. Each spec is an array of arguments: the first one is the spec name, the second is an array of flags, and the third is an array containing details about the spec (specific meaning for each spec type) The initial flags we support are "read" and "write" indicating if the keys that this key-spec finds are used for read or for write. Clients should ignore any unfamiliar flags. In order to easily find the positions of keys in a given array of args we introduce key specs. There are two logical steps of key specs: 1. `start_search`: Given an array of args, indicate where we should start searching for keys 2. 
`find_keys`: Given the output of start_search and an array of args, indicate all possible indices of keys. ### start_search step specs - `index`: specify an argument index explicitly - `index`: 0 based index (1 means the first command argument) - `keyword`: specify a string to match in `argv`. We should start searching for keys just after the keyword appears. - `keyword`: the string to search for - `start_search`: an index from which to start the keyword search (can be negative, which means to search from the end) Examples: - `SET` has start_search of type `index` with value `1` - `XREAD` has start_search of type `keyword` with value `["STREAMS",1]` - `MIGRATE` has start_search of type `keyword` with value `["KEYS",-2]` ### find_keys step specs - `range`: specify `[lastkey, step, limit]`. - `lastkey`: index of the last key, relative to the index returned from begin_search. -1 indicating till the last argument, -2 one before the last - `step`: how many args should we skip after finding a key, in order to find the next one - `limit`: if lastkey is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on. - `keynum`: specify `[keynum_index, first_key_index, step]`. - `keynum_index`: is relative to the return of the `start_search` spec. - `first_key_index`: is relative to `keynum_index`. 
- `step`: how many args should we skip after finding a key, in order to find the next one Examples: - `SET` has `range` of `[0,1,0]` - `MSET` has `range` of `[-1,2,0]` - `XREAD` has `range` of `[-1,1,2]` - `ZUNION` has `start_search` of type `index` with value `1` and `find_keys` of type `keynum` with value `[0,1,1]` - `AI.DAGRUN` has `start_search` of type `keyword` with value `["LOAD",1]` and `find_keys` of type `keynum` with value `[0,1,1]` (see https://oss.redislabs.com/redisai/master/commands/#aidagrun) Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key args of the vast majority of commands. If one of the above specs can't describe the key positions, the module writer can always fall back to the `getkeys-api` option. Some keys cannot be found easily (`KEYS` in `MIGRATE`: Imagine the argument for `AUTH` is the string "KEYS" - we will start searching in the wrong index). The guarantee is that the specs may be incomplete (`incomplete` will be specified in the spec to denote that) but we never report false information (assuming the command syntax is correct). For `MIGRATE` we start searching from the end - `startfrom=-1` - and if one of the keys is actually called "keys" we will report only a subset of all keys - hence the `incomplete` flag. Some `incomplete` specs can be completely empty (i.e. UNKNOWN begin_search) which should tell the client that COMMAND GETKEYS (or any other way to get the keys) must be used (Example: For `SORT` there is no way to describe the STORE keyword spec, as the word "store" can appear anywhere in the command). We will expose these key specs in the `COMMAND` command so that clients can learn, on startup, where the keys are for all commands instead of holding hardcoded tables or use `COMMAND GETKEYS` in runtime. Comments: 1. Redis doesn't internally use the new specs, they are only used for COMMAND output. 2. 
In order to support the current COMMAND INFO format (reply array indices 4, 5, 6) we created a synthetic range, called legacy_range, that, if possible, is built according to the new specs. 3. Redis currently uses only getkeys_proc or the legacy_range to get the keys indices (in COMMAND GETKEYS for example). "incomplete" specs: the commands we have issues with are MIGRATE, STRALGO, and SORT. For MIGRATE, because the token KEYS, if it exists, must be the last token, we can search in reverse. If one of the keys is actually the string "keys", we will return just a subset of the keys (hence, it's "incomplete"). For SORT and STRALGO we can't use this heuristic (the keys can be anywhere in the command) and therefore we added a key spec that is both "incomplete" and of "unknown type". If a client encounters an "incomplete" spec it means that it must find a different way (either COMMAND GETKEYS or have its own parser) to retrieve the keys. Please note that all commands, apart from the three mentioned above, have "complete" key specs	2021-09-15 11:10:29 +03:00
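As a rough illustration of how a client might apply a `range` find_keys spec from the key-specs commit above, the following sketch computes key indices from the index returned by the first step plus the three range values described in the message (last key, step, limit). It is one reading of the commit text, not the server's code; the function name `find_keys_range` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Sketch of the `range` find_keys step. `first` is the argument index
 * produced by the start_search step. Writes up to out_cap key indices
 * into `out` and returns how many were written. */
static int find_keys_range(int argc, int first, int lastkey, int step,
                           int limit, int *out, int out_cap) {
    int last, n = 0;
    if (step < 1 || first < 0 || first >= argc) return 0;
    if (lastkey >= 0) {
        last = first + lastkey; /* relative to the start_search result */
    } else if (limit > 1) {
        /* only 1/limit of the remaining args can hold keys */
        last = first + (argc - first) / limit + lastkey;
    } else {
        last = argc + lastkey; /* -1: last argument, -2: one before it */
    }
    if (last >= argc) last = argc - 1;
    for (int i = first; i <= last && n < out_cap; i += step) out[n++] = i;
    return n;
}
```

For example, with the spec values quoted in the commit, `MSET k1 v1 k2 v2` (start index 1, range `[-1,2,0]`) yields indices 1 and 3, and `XREAD STREAMS k1 k2 id1 id2` (start index 2 after the keyword, range `[-1,1,2]`) yields indices 2 and 3.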
filipe oliveira	b5a879e1c2	Added URI support to redis-benchmark (cli and benchmark share the same uri-parsing methods) (#9314 ) - Add `-u <uri>` command line option to support `redis://` URI scheme. - included server connection information object (`struct cliConnInfo`), used to describe an ip:port pair, db num user input, and user:pass to avoid a large number of function arguments. - Using sds on connection info strings for redis-benchmark/redis-cli Co-authored-by: yoav-steinberg <yoav@monfort.co.il>	2021-09-14 19:45:06 +03:00
Viktor Söderqvist	ea36d4de17	Modules: Add remaining list API functions (#8439 ) List functions operating on elements by index: * RM_ListGet * RM_ListSet * RM_ListInsert * RM_ListDelete Iteration is done using a simple for loop over indices. The index based functions use an internal iterator as an optimization. This is explained in the docs: ``` * Many of the list functions access elements by index. Since a list is in * essence a doubly-linked list, accessing elements by index is generally an * O(N) operation. However, if elements are accessed sequentially or with * indices close together, the functions are optimized to seek the index from * the previous index, rather than seeking from the ends of the list. * * This enables iteration to be done efficiently using a simple for loop: * * long n = RM_ValueLength(key); * for (long i = 0; i < n; i++) { * RedisModuleString *elem = RedisModule_ListGet(key, i); // Do stuff... * } ```	2021-09-14 17:48:06 +03:00
sundb	1376d83363	Fix memory leak due to missing freeCallback in blockonbackground moduleapi test (#9499 ) Before #9497, before redis-server was shut down, we did not manually shut down all the clients, which would have prevented valgrind from detecting a memory leak in the client's argc.	2021-09-14 15:14:09 +03:00
yoav-steinberg	4c7827588d	Fixed leaked client for "start_server" when running in --loop (#9497 ) * On `kill_server` make sure we close the default `"client"` connection. * Don't reconnect when trying to execute the client's `close` command. * On `restart_server` make sure to remove the (closed) default `"client"` after killing the old server.	2021-09-13 18:16:47 +03:00
zhaozhao.zz	794442b130	PSYNC2: make partial sync possible after master reboot (#8015 ) The main idea is to allow a master to load replication info from the RDB file when rebooting. If the master can load the replication info, replicas may have the chance to psync with the master, which can save a lot of traffic. The key point is that we need to guarantee safety and consistency, so there are two differences between master and replica: 1. The master loads the replication info as the secondary ID and offset, in case other masters have the same replid. 2. When the master loads the RDB, it propagates expired keys as DEL commands to the replication backlog, so replicas can receive these commands and delete stale keys. p.s. the number of keys expired during RDB loading is useful for users, so we show it as `rdb_last_load_keys_expired` and `rdb_last_load_keys_loaded` in info persistence. Moreover, after loading the replication info, the master should update `no_replica_time` in case loading the RDB takes too long.	2021-09-13 15:39:11 +08:00
Huang Zhw	75dd230994	bitpos/bitcount add bit index (#9324 ) Make bitpos/bitcount support bit index: ``` BITPOS key bit [start [end [BIT|BYTE]]] BITCOUNT key [start end [BIT|BYTE]] ``` The default behavior is `BYTE`, so these commands remain compatible with the old syntax.	2021-09-12 11:31:22 +03:00
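The difference between the `BYTE` and `BIT` index units in the bitpos/bitcount commit above can be sketched with two small counting helpers. This is an illustration only: the function names are hypothetical, negative indices are not handled, and the sketch assumes bit 0 is the most significant bit of the first byte.

```c
#include <stddef.h>

/* Count set bits between two inclusive BIT indices.
 * Assumes bit 0 is the most significant bit of byte 0. */
static size_t bitcount_bits(const unsigned char *s, size_t start, size_t end) {
    size_t count = 0;
    for (size_t b = start; b <= end; b++) {
        unsigned char byte = s[b >> 3];          /* byte holding bit b */
        count += (byte >> (7 - (b & 7))) & 1u;   /* extract bit b */
    }
    return count;
}

/* Count set bits between two inclusive BYTE indices: the same operation
 * with each byte index widened to the bit range it covers. */
static size_t bitcount_bytes(const unsigned char *s, size_t start, size_t end) {
    return bitcount_bits(s, start * 8, end * 8 + 7);
}
```

On the string `foobar`, the byte range 1..1 covers the single byte `o` (0x6f, six set bits), while a bit range such as 5..30 can start and end in the middle of a byte.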
Meir Shpilraien (Spielrein)	05e6b97bed	Fix RedisModule_Call tests on 32bit (#9481 )	2021-09-09 23:03:02 +03:00
sundb	3ca6972ecd	Replace all usage of ziplist with listpack for t_zset (#9366 ) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to improve performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete an element of zset. ## Tests 1) Add some unit tests for the `lpDeleteRange` and `lpDeleteRangeWithEntry` functions. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplistCompare`. 4) Add empty listpack zset corrupt dump test.	2021-09-09 18:18:53 +03:00
Madelyn Olson	86b0de5c41	Remove redundant validation and prevent duplicate users during ACL load (#9330 ) Throw an error when a user is provided multiple times on the command line instead of silently throwing one of them away. Remove unneeded validation for validating users on ACL load.	2021-09-09 07:40:33 -07:00
Binbin	c50af0aeba	Add LMPOP/BLMPOP commands. (#9373 ) We want to add a COUNT option for BLPOP, but we can't do it without breaking compatibility due to the command arguments syntax. So this commit introduces two new commands. Syntax for the new LMPOP command: `LMPOP numkeys [<key> ...] LEFT|RIGHT [COUNT count]` Syntax for the new BLMPOP command: `BLMPOP timeout numkeys [<key> ...] LEFT|RIGHT [COUNT count]` Some background: - LPOP takes one key, and can return multiple elements. - BLPOP takes multiple keys, but returns one element from just one key. - LMPOP can take multiple keys and return multiple elements from just one key. Note that although LMPOP/BLMPOP can take multiple keys, it eventually operates on just one key. And it will propagate as LPOP or RPOP with the COUNT option. As a new command, it still returns NIL if we can't pop any elements. The normal response is a nested array in both RESP2 and RESP3, like: ``` LMPOP/BLMPOP 1) keyname 2) 1) element1 2) element2 ``` I.e. unlike BLPOP, which returns a key name and one element and so uses a flat array, and LPOP, which returns multiple elements with no key name and again uses a flat array, this one has to return a nested array, and it does so for both RESP2 and RESP3 (like SCAN does). For related discussion, see: #766 #8824	2021-09-09 12:02:33 +03:00
Wang Yuan	cee3d67f50	Delay to discard cached master when full synchronization (#9398 ) * Delay discarding the cached master during full synchronization * Don't disconnect replicas before loading the transferred RDB during full sync Previously, once a replica needed to start a full synchronization with its master, it discarded the cached master whether or not the full synchronization succeeded. Now we discard the cached master only when the RDB transfer has finished and we start to change the data space. This lets the replica start a partial resynchronization with another new master if the new master fails during full synchronization.	2021-09-09 11:32:29 +03:00
chenyang8094	bc0c22fabc	Fix callReplyParseCollection memleak when use AutoMemory (#9446 ) When parsing an array type reply, ctx will be lost when recursively parsing its elements, which will cause a memory leak in automemory mode. This is a result of the changes in #9202 Add test for callReplyParseCollection fix	2021-09-09 11:03:05 +03:00
zhaozhao.zz	1b83353dc3	Fix wrong offset when replica pause (#9448 ) When a replica is paused, it does not apply any commands, even if the command comes from the master. If we feed the non-applied command to the replication stream, the replication offset would be wrong, and data would be lost after failover (since the replica's `master_repl_offset` grows but the command is not applied). To fix it, here are the changes: * Don't update replica's replication offset or propagate commands to sub-replicas when it's paused in `commandProcessed`. * Show `slave_read_repl_offset` in info reply. * Add an assert to make sure the master client is never blocked except by pause or by a module (some modules may use blocking to do background (parallel) processing and forward the original blocking module command to the replica; it's not a good way but it can work, so the assert excludes modules for now, but someday in the future all modules should rewrite blocking commands to propagate like what `BLPOP` does).	2021-09-08 16:07:25 +08:00
Viktor Söderqvist	547c3405d4	Optimize quicklistIndex to seek from the nearest end (#9454 ) Until now, giving a negative index seeks from the end of a list and a positive seeks from the beginning. This change makes it seek from the nearest end, regardless of the sign of the given index. quicklistIndex is used by all list commands which operate by index. LINDEX key 999999 in a list of 1M elements is greatly optimized by this change. Latency is cut by 75%. LINDEX key -1000000 in a list of 1M elements, likewise. LRANGE key -1 -1 is affected by this, since LRANGE converts the indices to positive numbers before seeking. The tests for corrupt dumps are updated to make sure the corrupt data is sought in the same direction as before.	2021-09-06 09:12:38 +03:00
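The "seek from the nearest end" idea in the quicklistIndex commit above can be sketched at the element level as follows. This is a simplification for illustration (the real structure seeks over nodes that each hold several elements), and the type `seek_plan` and function `plan_seek` are hypothetical names.

```c
#include <stddef.h>

/* Direction and distance of a seek, counted in elements. */
typedef struct {
    int forward; /* 1: walk from the head, 0: walk from the tail */
    long steps;  /* number of elements to skip from that end */
} seek_plan;

/* Decide which end to seek from for index `idx` in a list of `n` elements.
 * Negative indices count from the tail (-1 is the last element).
 * Returns 0 if the index is out of range, 1 otherwise. */
static int plan_seek(long n, long idx, seek_plan *plan) {
    if (idx < 0) idx += n; /* normalize, regardless of the sign given */
    if (idx < 0 || idx >= n) return 0;
    if (idx < n / 2) {
        plan->forward = 1;
        plan->steps = idx;
    } else {
        plan->forward = 0;
        plan->steps = n - 1 - idx;
    }
    return 1;
}
```

Under this scheme an index of 999999 in a list of 1M elements is reached from the tail in zero steps, and an index of -1000000 is reached from the head in zero steps, which are the two cases the commit message calls out.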
Wen Hui	763fd09416	Speed up sentinel tests (#9408 ) Use sentinel debug to reduce default timeouts and allow tests to execute faster.	2021-09-05 13:26:29 +03:00
Madelyn Olson	8b8f05c86c	Add test verifying PUBSUB NUMPAT behavior (#9209 )	2021-09-03 15:52:39 -07:00
Oran Agra	1e7ad894d2	Tune timeout of active defrag test (#9426 ) Failed on Raspberry Pi 3b where that single test took about 170 seconds	2021-08-30 12:39:09 +03:00
Binbin	aefbc23451	Better error handling for updateClientOutputBufferLimit. (#9308 ) This one follow #9313 and goes deeper (validation of config file parsing) Move the check/update logic to a new updateClientOutputBufferLimit function. So that it can be used in CONFIG SET and config file parsing.	2021-08-29 15:03:05 +03:00
Viktor Söderqvist	97dcf95cc8	redis-benchmark: improved help and warnings (#9419 ) 1. The output of --help: * On the Usage line, just write [OPTIONS] [COMMAND ARGS...] instead listing only a few arbitrary options and no command. * For --cluster, describe that if the command is supplied on the command line, the key must contain "{tag}". Otherwise, the command will not be sent to the right cluster node. * For -r, add a note that if -r is omitted, all commands in a benchmark will use the same key. Also align the description. * For -t, describe that -t is ignored if a command is supplied on the command line. 2. Print a warning if -t is present when a specific command is supplied. 3. Print all warnings and errors to stderr. 4. Remove -e from calls in redis-benchmark test suite.	2021-08-29 14:31:08 +03:00
Binbin	0835f596b8	BITSET and BITFIELD SET only propagate command when the value changed. (#9403 ) Previously, we always increased server.dirty in BITSET and BITFIELD SET, even if the command didn't really change anything. This commit makes sure BITSET and BITFIELD SET only increase dirty when the value changed. As a result, when the value is not changed, there are some other implications: - Avoid adding useless AOF entries - Reduce replication traffic - Will not trigger keyspace notifications (setbit) - Will not invalidate WATCH - Will not send the invalidation message to the tracking client	2021-08-22 10:20:53 +03:00
Viktor Söderqvist	8f59c1ecae	Let CONFIG GET * show both replicaof and its alias (#9395 )	2021-08-21 19:43:18 -07:00
sundb	492d8d0961	Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399 )	2021-08-20 10:37:45 +03:00
Yossi Gottlieb	1d9c8d61d8	Skip OOM-related tests on incompatible platforms. (#9386 ) We only run OOM related tests on x86_64 and aarch64, as jemalloc on other platforms (notably s390x) may actually succeed very large allocations. As a result the test may hang for a very long time at the cleanup phase, iterating as many as 2^61 hash table slots.	2021-08-18 16:00:22 +03:00
yoav-steinberg	19bc83716c	support regex in "--only" in runtest (#9352 )	2021-08-10 14:28:24 +03:00
sundb	02fd76b97c	Replace all usage of ziplist with listpack for t_hash (#8887 ) Part one of implementing #8702 (taking hashes first before other types) ## Description of the feature 1. Change ziplist encoded hash objects to listpack encoding. 2. Convert existing ziplists on RDB loading time. an O(n) operation. ## Rdb format changes 1. Add RDB_TYPE_HASH_LISTPACK rdb type. 2. Bump RDB_VERSION to 10 ## Interface changes 1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`) 2. OBJECT ENCODING will return `listpack` instead of `ziplist` ## Listpack improvements: 1. Support direct insert, replace integer element (rather than convert back and forth from string) 3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such) 4. Optimize element length fetching, avoid multiple calculations 5. Use inline to avoid function call overhead. ## Tests 1. Add a new test to the RDB load time conversion 2. Adding the listpack unit tests. (based on the one in ziplist.c) 3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-10 09:18:49 +03:00
sundb	cbda492909	Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349 ) This commit mainly fixes empty keys due to RDB loading and restore command, which was omitted in #9297. 1) When loading quicklist, if all the ziplists in the quicklist are empty, NULL will be returned. If only some of the ziplists are empty, then we will skip the empty ziplists silently. 2) When loading hash zipmap, if zipmap is empty, sanitization check will fail. 3) When loading hash ziplist, if ziplist is empty, NULL will be returned. 4) Add RDB loading test with sanitize.	2021-08-09 17:13:46 +03:00
Eduardo Semprebon	d3356bf614	Add SORT_RO command (#9299 ) Add a readonly variant of the SORT command, so it can be used on read-only workloads (replica, ACL, etc)	2021-08-09 09:40:29 +03:00
Qu Chen	e8eeba7bee	Allow master to replicate command longer than replica's query buffer limit (#9340 ) Replication client no longer checks incoming command length against the client-query-buffer-limit. This makes the master able to replicate commands longer than replica's configured client-query-buffer-limit	2021-08-08 17:34:11 -07:00
DarrenJiang13	43eb0ce3bf	[BUGFIX] Add some missed error statistics (#9328 ) add error counting for some missed behaviors.	2021-08-06 19:27:24 -07:00
yoav-steinberg	0a9377535b	Ignore resize threshold on idle qbuf resizing (#9322 ) Also update qbuf tests to verify both idle and peak based resizing logic. And delete unused function: getClientsMaxBuffers	2021-08-06 20:50:34 +03:00
Oran Agra	3f3f678a47	corrupt-dump-fuzzer test, avoid creating junk keys (#9302 ) The execution of the RPOPLPUSH command by the fuzzer created junk keys, that were later being selected by RANDOMKEY and modified. This also meant that lists were statistically tested more than other files. Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check that detects that new keys are not added by the fuzzer to detect future similar issues.	2021-08-05 22:57:05 +03:00
Oran Agra	0c90370e6d	Improvements to corrupt payload sanitization (#9321 ) Recently we found two issues in the fuzzer tester: #9302 #9285 After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them. Here's a list of the fixes - Prevent an overflow when allocating a dict hashtable - Prevent OOM when attempting to allocate a huge string - Prevent a few invalid accesses in listpack - Improve sanitization of listpack first entry - Validate integrity of stream consumer groups PEL - Validate integrity of stream listpack entry IDs - Validate ziplist tail followed by extra data which start with 0xff Co-authored-by: sundb <sundbcn@gmail.com>	2021-08-05 22:56:14 +03:00
sundb	8ea777a6a0	Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297 ) When we load an RDB or run the restore command, if we encounter a length of 0, it will result in the creation of an empty key. This could either be a corrupt payload, or a result of a bug (see #8453 ) This PR mainly fixes the following: 1) The restore command will return a `Bad data format` error. 2) When loading RDB, we will silently discard the key. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-05 22:42:20 +03:00
Binbin	d0244bfc3d	Make sure execute SLAVEOF command in the right order in psync2 test. (#9316 ) The psync2 test has failed several times recently. In #9159 we only solved half of the problem. i.e. reordering of the replica that's already connected to the newly promoted master. Consider this scenario: 0 slaveof 2 1 slaveof 2 3 slaveof 2 4 slaveof 1 0 slaveof no one, became a new master got a new replid 2 slaveof 0, partial resync and got the new replid 3 reconnect 2, inherit the new replid 3 slaveof 4, use the new replid and got a full resync And another scenario: 1 slaveof 3 2 slaveof 4 3 slaveof 0 4 slaveof 0 4 slaveof no one, became a new master got a new replid 2 reconnect 4, inherit the new replid 2 slaveof 1, use the new replid and got a full resync So maybe we should reattach replicas in the right order. i.e. In the above example, if it would have reattached 1, 3 and 0 to the new chain formed by 4 before trying to attach 2 to 1, it would succeed. This commit break the SLAVEOF loop into two loops. (ideas from oran) First loop that uses random to decide who replicates from who. Second loop that does the actual SLAVEOF command. In the second loop, we make sure to execute it in the right order, and after each SLAVEOF, wait for it to be connected before we proceed. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-05 11:26:09 +03:00
Wen Hui	63e2a6d212	Add sentinel debug option command (#9291 ) This makes it possible to tune many parameters that were previously hard coded. We don't intend these to be user configurable, but only used by tests to accelerate certain conditions which would otherwise take a long time and slow down the test suite. Co-authored-by: Lucas Guang Yang <l84193800@china.huawei.com>	2021-08-05 11:12:55 +03:00
Viktor Söderqvist	1c59567a7f	redis-cli ASK redirect test: Add retry loop to fix timing issue (#9315 )	2021-08-05 08:20:30 +03:00
Wang Yuan	d4bca53cd9	Use madvise(MADV_DONTNEED) to release memory to reduce COW (#8974 ) ## Background As we know, after `fork`, one process will copy pages when writing data to these pages (CoW), while the other process still keeps the old pages, so together they cost more memory. For redis, we suffered from redis consuming a lot of memory while the fork child is serializing key/values, which may even cause OOM. But actually we find that, in the redis fork child process, the child process doesn't need to keep some memory that the parent process may write or update. For example, the child process will never access a key-value again once it is serialized, but users may update it in the parent process. So we think it may reduce COW if the child process releases memory that is no longer needed. ## Implementation For releasing key values in the child process, we may think of calling `decrRefCount` to free memory, but I find the fork child process still uses a lot of memory even when we don't write any data to redis, and it costs much more time, which slows down bgsave. This may be because the memory allocator doesn't really release memory to the OS, and it may modify some inner data for this free operation, especially when we free small objects. Moreover, CoW is based on pages, so an easy way is to only free memory chunks that are not smaller than the kernel page size. madvise(MADV_DONTNEED) can quickly release the specified region's pages to the OS, bypassing the memory allocator, and the allocator still considers this memory to be in use and doesn't change its inner data. There are some buffers we can release in the fork child process: - Serialized key-values the fork child process never accesses serialized key-values again, so we try to free them. Because we can only release big chunks of memory, and it is time consuming to iterate all items/members/fields/entries of a complex data type, we decide to iterate them and try to release them only when the average size of their items/members/fields/entries is more than the page size of the OS. 
- Replication backlog Because replication backlog is a cycle buffer, it will be changed quickly if redis has heavy write traffic, but in fork child process, we don't need to access that. - Client buffers If clients have requests during having the fork child process, clients' buffer also be changed frequently. The memory includes client query buffer, output buffer, and client struct used memory. To get child process peak private dirty memory, we need to count peak memory instead of last used memory, because the child process may continue to release memory (since COW used to only grow till now, the last was equivalent to the peak). Also we're adding a new `current_cow_peak` info variable (to complement the existing `current_cow_size`) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-04 23:01:46 +03:00
Meir Shpilraien (Spielrein)	56eb7f7de4	Fix tests failure on 32bit build (#9318 ) Fix test introduced in #9202 that failed on 32bit CI. The failure was due to a wrong double comparison. Change code to stringify the double first and then compare.	2021-08-04 21:33:38 +03:00
Meir Shpilraien (Spielrein)	2237131e15	Unified Lua and modules reply parsing and added RESP3 support to RM_Call (#9202 ) ## Current state 1. Lua has its own parser that handles parsing `redis.call` replies and translates them to Lua objects that can be used by the user Lua code. The parser partially handles resp3 (missing big number, verbatim, attribute, ...) 2. Modules have their own parser that handles parsing `RM_Call` replies and translates them to RedisModuleCallReply objects. The parser does not support resp3. In addition, in the future, we want to add Redis Function (#8693) that will probably support more languages. At some point maintaining so many parsers will stop scaling (bug fixes and protocol changes will need to be applied on all of them). We will probably end up with different parsers that support different parts of the resp protocol (like we already have today with Lua and modules) ## PR Changes This PR attempts to unify the reply parsing of Lua and modules (and in the future Redis Function) by introducing a new parser unit (`resp_parser.c`). The new parser handles parsing the reply and calls different callbacks to allow the users (another unit that uses the parser, i.e, Lua, modules, or Redis Function) to analyze the reply. ### Lua API Additions The code that handles reply parsing on `scripting.c` was removed. Instead, it uses the resp_parser to parse and create a Lua object out of the reply. As mentioned above the Lua parser did not handle parsing big numbers, verbatim, and attribute. The new parser can handle those and so Lua also gets it for free. Those are translated to Lua objects in the following way: 1. Big Number - Lua table `{'big_number':'<str representation for big number>'}` 2. Verbatim - Lua table `{'verbatim_string':{'format':'<verbatim format>', 'string':'<verbatim string value>'}}` 3. Attribute - currently ignored and not exposed to the Lua parser, another issue will be opened to decide how to expose it. 
Tests were added to check resp3 reply parsing on Lua ### Modules API Additions The reply parsing code on `module.c` was also removed and the new resp_parser is used instead. In addition, the RedisModuleCallReply was also extracted to a separate unit located on `call_reply.c` (in the future, this unit will also be used by Redis Function). A nice side effect of unified parsing is that modules now also support resp3. Resp3 can be enabled by giving `3` as a parameter to the fmt argument of `RM_Call`. It is also possible to give `0`, which will indicate an auto mode. i.e, Redis will automatically choose the reply protocol based on the current client set on the RedisModuleCtx (this mode will mostly be used when the module wants to pass the reply to the client as is). In addition, the following RedisModuleAPI were added to allow analyzing resp3 replies: * New RedisModuleCallReply types: * `REDISMODULE_REPLY_MAP` * `REDISMODULE_REPLY_SET` * `REDISMODULE_REPLY_BOOL` * `REDISMODULE_REPLY_DOUBLE` * `REDISMODULE_REPLY_BIG_NUMBER` * `REDISMODULE_REPLY_VERBATIM_STRING` * `REDISMODULE_REPLY_ATTRIBUTE` * New RedisModuleAPI: * `RedisModule_CallReplyDouble` - getting double value from resp3 double reply * `RedisModule_CallReplyBool` - getting boolean value from resp3 boolean reply * `RedisModule_CallReplyBigNumber` - getting big number value from resp3 big number reply * `RedisModule_CallReplyVerbatim` - getting format and value from resp3 verbatim reply * `RedisModule_CallReplySetElement` - getting element from resp3 set reply * `RedisModule_CallReplyMapElement` - getting key and value from resp3 map reply * `RedisModule_CallReplyAttribute` - getting a reply attribute * `RedisModule_CallReplyAttributeElement` - getting key and value from resp3 attribute reply * New context flags: * `REDISMODULE_CTX_FLAGS_RESP3` - indicate that the client is using resp3 Tests were added to check the new RedisModuleAPI ### Modules API Changes * RM_ReplyWithCallReply might return REDISMODULE_ERR if the 
given CallReply is in resp3 but the client expects resp2. This is not a breaking change because in order to get a resp3 CallReply one needs to specifically specify `3` as a parameter to the fmt argument of `RM_Call` (as mentioned above). Tests were added to check this change ### More small Additions * Added `debug set-disable-deny-scripts` that allows to turn on and off the commands no-script flag protection. This is used by the Lua resp3 tests so it will be possible to run `debug protocol` and check the resp3 parsing code. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2021-08-04 16:28:07 +03:00
Oran Agra	52df350fe5	Skip new redis-cli ASK test in TLS mode (#9312 )	2021-08-03 13:19:03 -07:00
Jonah H. Harris	432c92d8df	Add SINTERCARD/ZINTERCARD Commands (#8946 ) Add SINTERCARD and ZINTERCARD commands that are similar to ZINTER and SINTER but only return the cardinality with minimum processing and memory overheads. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-03 11:45:27 +03:00
Ariel Shtul	bdbf5eedae	Module api support for RESP3 (#8521 ) Add new Module APIs for RESP3 responses: - RM_ReplyWithMap - RM_ReplyWithSet - RM_ReplyWithAttribute - RM_ReplySetMapLength - RM_ReplySetSetLength - RM_ReplySetAttributeLength - RM_ReplyWithBool Deprecate REDISMODULE_POSTPONED_ARRAY_LEN in favor of a generic REDISMODULE_POSTPONED_LEN Improve documentation Add tests Co-authored-by: Guy Benoish <guy.benoish@redislabs.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-03 11:37:19 +03:00
Yossi Gottlieb	4bd7748362	Tests: fix commandfilter crash on alpine. (#9307 ) Loading and unloading the shared object does not initialize global vars on alpine.	2021-08-02 15:50:45 +03:00
Huang Zhw	cf61ad14cc	When redis-cli received ASK, it didn't handle it (#8930 ) When redis-cli received ASK, it used string matching incorrectly and didn't handle it. When we access a slot which is in migrating state, it may return ASK. After redirecting to the new node, we need to send an ASKING command before retrying the command. In this PR, after redis-cli receives ASK, we send an ASKING command before sending the original command after reconnecting. Other changes: * Make redis-cli -u and -c (unix socket and cluster mode) incompatible with one another. * When sending a command fails, we avoid the 2nd reconnect retry and just print the error info. Users will decide what to do next. See #9277. * Add a test faking two redis nodes in TCL to just send ASK and OK in redis protocol to test ASK behavior. Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-02 14:59:08 +03:00
Ning Sun	f74af0e61d	Add NX/XX/GT/LT options to EXPIRE command group (#2795 ) Add NX, XX, GT, and LT flags to EXPIRE, PEXPIRE, EXPIREAT, PEXPIREAT. - NX - only modify the TTL if no TTL is currently set - XX - only modify the TTL if there is a TTL currently set - GT - only increase the TTL (considering non-volatile keys as infinite expire time) - LT - only decrease the TTL (considering non-volatile keys as infinite expire time) The return value of the command is 0 when the operation was skipped due to one of these flags. Signed-off-by: Ning Sun <sunng@protonmail.com>	2021-08-02 08:57:49 +03:00
menwen	82c3158ad5	Fix if consumer is created as a side effect without notify and dirty++ (#9263 ) Fixes: - When a consumer is created as a side effect, redis didn't issue a keyspace notification, nor incremented the server.dirty (affects periodic snapshots). this was a bug in XREADGROUP, XCLAIM, and XAUTOCLAIM. - When attempting to delete a non-existent consumer, don't issue a keyspace notification and don't increment server.dirty this was a bug in XGROUP DELCONSUMER Other changes: - Changed streamLookupConsumer() to always only do lookup consumer (never do implicit creation), Its last seen time is updated unless the SLC_NO_REFRESH flag is specified. - Added streamCreateConsumer() to create a new consumer. When the creation is successful, it will notify and dirty++ unless the SCC_NO_NOTIFY or SCC_NO_DIRTIFY flags is specified. - Changed streamDelConsumer() to always only do delete consumer. - Added keyspace notifications tests about stream events.	2021-08-02 08:31:33 +03:00
Binbin	86555ae0f7	GEO* STORE with empty src key delete the dest key and return 0, not empty array (#9271 ) With an empty src key, we need to deal with two situations: 1. non-STORE: We should return emptyarray. 2. STORE: Try to delete the store key and return 0. This applies to both GEOSEARCHSTORE (new to v6.2), and also GEORADIUS STORE (which was broken since forever) This pr try to fix #9261. i.e. both STORE variants would have behaved like the non-STORE variants when the source key was missing, returning an empty array and not deleting the destination key, instead of returning 0, and deleting the destination key. Also add more tests for some commands. - GEORADIUS: wrong type src key, non existing src key, empty search, store with non existing src key, store with empty search - GEORADIUSBYMEMBER: wrong type src key, non existing src key, non existing member, store with non existing src key - GEOSEARCH: wrong type src key, non existing src key, empty search, frommember with non existing member - GEOSEARCHSTORE: wrong type key, non existing src key, fromlonlat with empty search, frommember with non existing member Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-01 19:32:24 +03:00
Yossi Gottlieb	68b8b45cd5	Tests: avoid short reads on redis-cli output. (#9301 ) In some cases large replies on slow systems may only be partially read by the test suite, resulting with parsing errors. This fix is still timing sensitive but should greatly reduce the chances of this happening.	2021-08-01 15:07:27 +03:00
Guy Korland	1483f5aa9b	Remove const from CommandFilterArgGet result (#9247 ) Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2021-08-01 11:29:32 +03:00
Long Dai	89fdcbec8c	tests: fix a typo (#9294 ) Signed-off-by: Long Dai <long0dai@foxmail.com>	2021-07-30 10:00:07 +03:00
Yossi Gottlieb	8bf433dc86	Clean unused var compiler warning in module test. (#9289 )	2021-07-29 19:45:29 +03:00
Wen Hui	db41536454	Remove duplicate zero-port sentinels (#9240 ) The issue is that when a sentinel with the same address and IP is turned on with a different runid, its port is set to 0 but it is still present in the dictionary master->sentinels which contains all the sentinels for a master. This causes a problem when we do INFO SENTINEL because it takes the size of the dictionary of sentinels. This might also cause a problem for failover if enough sentinels have their port set to 0 since the number of voters in failover is also determined by the size of the dictionary of sentinels. This commit removes the sentinels with the port set to zero from the dictionary of sentinels. Fixes #8786	2021-07-29 12:32:28 +03:00
sundb	3db0f1a284	Fix missing check for sanitize_dump in corrupt-dump-fuzzer test (#9285 ) this means the assertion that checks that when deep sanitization is enabled, there are no crashes, was missing.	2021-07-29 11:53:21 +03:00
ZhaolongLi	8d00493485	tests: fix exec fails when grep exists with status other than 0 (#9066 ) Co-authored-by: lizhaolong.lzl <lizhaolong.lzl@B-54MPMD6R-0221.local>	2021-07-25 09:58:21 +03:00
Huang Zhw	71d452876e	On 32 bit platform, the bit position of GETBIT/SETBIT/BITFIELD/BITCOUNT/BITPOS may overflow (see CVE-2021-32761) (#9191 ) GETBIT, SETBIT may access the wrong address because of wrap. BITCOUNT and BITPOS may return wrapped results. BITFIELD may access the wrong address but also allocate insufficient memory and segfault (see CVE-2021-32761). This commit uses `uint64_t` or `long long` instead of `size_t`. related https://github.com/redis/redis/pull/8096 On a 32bit platform: > setbit bit 4294967295 1 (integer) 0 > config set proto-max-bulk-len 536870913 OK > append bit "\xFF" (integer) 536870913 > getbit bit 4294967296 (integer) 0 When the bit index is larger than 4294967295, size_t can't hold the bit index. In the past, `proto-max-bulk-len` was limited to 536870912, so there was no problem. After this commit, the bit position is stored in `uint64_t` or `long long`. So when `proto-max-bulk-len > 536870912`, 32bit platforms can still be correct. For 64bit platforms, this problem still exists. The main reason is that the bit position is 8 times the byte position. When proto-max-bulk-len is very large, the bit position may overflow. But on a 64bit platform, we don't have strings that long. So this bug may never happen. Additionally this commit adds a test that costs `512MB` of memory and is tagged as `large-memory`. Make freebsd ci and valgrind ci ignore this test.	2021-07-21 16:25:19 +03:00
Binbin	11dc4e59b3	SMOVE only notify dstset when the addition is successful. (#9244 ) in case dest key already contains the member, the dest key isn't modified, so the command shouldn't invalidate watch.	2021-07-17 09:54:06 +03:00
Oran Agra	6a5bac309e	Test infra, handle RESP3 attributes and big-numbers and bools (#9235 ) - promote the code in DEBUG PROTOCOL to addReplyBigNum - DEBUG PROTOCOL ATTRIB skips the attribute when client is RESP2 - networking.c addReply for push and attributes generate assertion when called on a RESP2 client, anything else would produce a broken protocol that clients can't handle.	2021-07-14 19:14:31 +03:00
perryitay	ac8b1df885	Fail EXEC command in case a watched key is expired (#9194 ) There are two issues fixed in this commit: 1. we want to fail the EXEC command in case there is a watched key that's logically expired but not yet deleted by active expire or lazy expire. 2. we saw that currently the cached time is updated in every `call()` (including nested calls), and this time is also used for the isKeyExpired comparison; we want to update the cached time only in the first call (execCommand) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-07-11 13:17:23 +03:00
Yossi Gottlieb	92e8004705	Pre-test bind-source-addr before running test. (#9214 ) This attempts to catch any non-standard configuration where the test may fail and produce a false positive.	2021-07-11 09:54:07 +03:00
Mikhail Fesenko	1eb4baa5b8	Direct redis-cli repl prints to stderr, because --rdb can print to stdout. fflush stdout after responses (#9136 ) 1. redis-cli can output --rdb data to stdout but redis-cli also writes some messages to stdout which will mess up the rdb. 2. Make redis-cli flush stdout when printing a reply. This was needed in order to fix a hang in a redis-cli test that uses --replica. Note that printf does flush when there's a newline, but fwrite does not. 3. fix the redis-cli --replica test which used to pass previously because it didn't really care what it read, and because redis-cli used printf to print these other things to stdout. 4. improve redis-cli --replica test to run with both diskless and disk-based. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>	2021-07-07 08:26:26 +03:00
Binbin	a418a2d3fc	hrandfield and zrandmember with count should return emptyarray when key does not exist. (#9178 ) due to a copy-paste bug, it used to reply with null response rather than empty array. this commit includes new tests that are looking at the RESP response directly in order to be able to tell the difference between them. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-07-05 10:41:57 +03:00
Oran Agra	7103367ad4	Tests: add a way to read raw RESP protocol responses (#9193 ) This makes it possible to distinguish between null response and an empty array (currently the tests infra translates both to an empty string/list)	2021-07-04 19:43:58 +03:00
Oran Agra	a8518cce95	fix valgrind issues with recently added test in modules/blockonbackground (#9192 ) fixes test issue introduced in #9167 1. invalid reads due to accessing non-retained string (passed as unblock context). 2. leaking module blocked client context, see #6922 for info.	2021-07-04 14:21:53 +03:00
Yossi Gottlieb	aa139e2f02	Fix CLIENT UNBLOCK crashing modules. (#9167 ) Modules that use background threads with thread safe contexts are likely to use RM_BlockClient() without a timeout function, because they do not set up a timeout. Before this commit, `CLIENT UNBLOCK` would result in a crash as the `NULL` timeout callback is called. Beyond just crashing, this is also logically wrong as it may throw the module into an unexpected client state. This commit makes `CLIENT UNBLOCK` on such clients behave the same as any other client that is not in a blocked state and therefore cannot be unblocked.	2021-07-01 17:11:27 +03:00
Binbin	5dddf496ce	Add missing pause tcl test to test_helper.tcl (#9158 ) * Add keyname tags to avoid CROSSSLOT errors in external server CI * Use new wait_for_blocked_clients_count in pause.tcl	2021-06-30 13:32:51 +03:00
Binbin	1d5aa37d68	Fix timing issue in psync2 test. (#9159 ) *** [err]: PSYNC2: total sum of full synchronizations is exactly 4 in tests/integration/psync2.tcl Expected 5 == 4 (context: type eval line 8 cmd {assert {$sum == 4}} proc::test) Sometimes the test got an unexpected full sync: after a replica switched to master, and before the new master propagated the new replid to all replicas, a replica attempted to sync with it using a wrong replid and triggered a full resync. Consider this scenario: 1 slaveof 4 full resync 0 slaveof 4 full resync 2 slaveof 0 full resync 3 slaveof 1 full resync 1 slaveof no one, replid changed 3 reconnect 1, did a partial resync and got the new replid Before 2 inherits the new replid. 3 slaveof 2 3 try to do a partial resync with 2. But their replication ids are inconsistent, so a full resync happens. :) A special thank you to oran for helping me in this test case. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-06-30 09:18:10 +03:00
Yossi Gottlieb	5d8ea4b326	Add missing needs:repl tag. (#9169 )	2021-06-29 16:48:52 +03:00
Leibale Eidelman	95274f1f8a	fix ZRANGESTORE - should return 0 when src points to an empty key (#9089 ) mistakenly it used to return an empty array rather than 0. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-06-29 16:38:10 +03:00
Binbin	4bc5a8324d	ZRANDMEMBER WITHSCORES with negative COUNT may return bad score (#9162 ) Return a bad score when used with negative count (or count of 1), and non-ziplist encoded zset. Also add test to validate the return value and cover the issue.	2021-06-29 10:14:28 +03:00
Yossi Gottlieb	f233c4c59d	Add bind-source-addr configuration argument. (#9142 ) In the past, the first bind address that was explicitly specified was also used to bind outgoing connections. This could result in some problems. For example: on some systems using `bind 127.0.0.1` would result in outgoing connections also binding to `127.0.0.1` and failing to connect to remote addresses. With the recent change to the way `bind` is handled, this presented other issues: * The default first bind address is '*' which is not a valid address. * We make no distinction between user-supplied config that is identical to the default, and the default config. This commit addresses both these issues by introducing an explicit configuration parameter to control the bind address on outgoing connections.	2021-06-24 19:48:18 +03:00
Oran Agra	5ffdbae1f6	Fix failing basics moduleapi test on 32bit CI (#9140 )	2021-06-24 12:44:13 +03:00
Oran Agra	ae418eca24	Adjustments to recent RM_StringTruncate fix (#3718 ) (#9125 ) - Introduce a new sdssubstr api as a building block for sdsrange. The API of sdsrange is often hard to work with and also has corner cases that cause bugs. sdssubstr is easy to work with and also simplifies the implementation of sdsrange. - Revert the fix to RM_StringTruncate and just use sdssubstr instead of sdsrange. - Solve valgrind warnings from the new tests introduced by the previous PR.	2021-06-22 17:22:40 +03:00
Yossi Gottlieb	a49b766860	Remove leftover after CONFIG SET bind change. (#9129 )	2021-06-22 14:03:00 +03:00
Yossi Gottlieb	8284544adb	Fix typo in test. (#9128 )	2021-06-22 13:30:20 +03:00
Yossi Gottlieb	07b0d144ce	Improve bind and protected-mode config handling. (#9034 ) * Specifying an empty `bind ""` configuration prevents Redis from listening on any TCP port. Before this commit, such configuration was not accepted. * Using `CONFIG GET bind` will always return an explicit configuration value. Before this commit, if a bind address was not specified the returned value was empty (which was an anomaly). Another behavior change is that modifying the `bind` configuration to a non-default value will NO LONGER DISABLE protected-mode implicitly.	2021-06-22 12:50:17 +03:00
Evan	1ccf2ca2f4	modules: Add newlen == 0 handling to RM_StringTruncate (#3717 ) (#3718 ) Previously, passing 0 for newlen would not truncate the string at all. This adds handling of this case, freeing the old string and creating a new empty string. Other changes: - Move `src/modules/testmodule.c` to `tests/modules/basics.c` - Introduce that basic test into the test suite - Add tests to cover StringTruncate - Add `test-modules` build target for the main makefile - Extend `distclean` build target to clean modules too	2021-06-22 12:26:48 +03:00
Oran Agra	d0819d618e	solve test timing issues in replication tests (#9121 ) # replication-3.tcl had a test timeout failure with valgrind on daily CI: ``` * [err]: SLAVE can reload "lua" AUX RDB fields of duplicated scripts in tests/integration/replication-3.tcl Replication not started. ``` replication took more than 70 seconds. https://github.com/redis/redis/runs/2854037905?check_suite_focus=true on my machine it takes only about 30, but i can see how 50 seconds isn't enough. # replication.tcl loading was over too quickly in freebsd daily CI: ``` * [err]: slave fails full sync and diskless load swapdb recovers it in tests/integration/replication.tcl Expected '0' to be equal to '1' (context: type eval line 44 cmd {assert_equal [s -1 loading] 1} proc ::start_server) ``` # rdb.tcl loading was over too quickly. increase the time loading takes, and decrease the amount of work we try to achieve in that time.	2021-06-22 11:10:11 +03:00
Oran Agra	9b564b525d	Fix race in client side tracking (#9116 ) The `Tracking gets notification of expired keys` test in tracking.tcl used to hang in valgrind CI quite a lot. It turns out the reason is that with valgrind and a busy machine, the server cron active expire cycle could easily run in the same event loop as the command that created `mykey`, so that when the key expired, there were two change events to broadcast, one that set the key and one that expired it, but since we used raxTryInsert, the client that was associated with the "last" change was the one that created the key, so the NOLOOP filtered that event. This commit adds a test that reproduces the problem by using lazy expire in a multi-exec which makes sure the key expires in the same event loop as the one that added it.	2021-06-22 07:35:59 +03:00
sundb	eae0983d2d	Fix leak and double free issues in datatype2 module test (#9102 ) * Add missing call for RedisModule_DictDel in datatype2 test * Fix memory leak in datatype2 test	2021-06-17 21:45:21 +03:00
sundb	b586d5b567	Fix querybuf test failure (#9091 ) Fix test failure which introduced by #9003. The following case will occur when querybuf expansion will allocate memory equal to (16*1024)k. 1) make use ```CFLAGS=-DNO_MALLOC_USABLE_SIZE```. 2) ```malloc``` will not allocate more under ```alpine```.	2021-06-16 22:01:37 +03:00
chenyang8094	e0cd3ad0de	Enhance mem_usage/free_effort/unlink/copy callbacks and add GetDbFromIO api. (#8999 ) Create new module type enhanced callbacks: mem_usage2, free_effort2, unlink2, copy2. These will be given a context point from which the module can obtain the key name and database id. In addition the digest and defrag context can now be used to obtain the key name and database id.	2021-06-16 09:45:49 +03:00
Jason Elbaum	7f342020dc	Change return value type for ZPOPMAX/MIN in RESP3 (#8981 ) When using RESP3, ZPOPMAX/ZPOPMIN should return nested arrays for consistency with other commands (e.g. ZRANGE). We do that only when COUNT argument is present (similarly to how LPOP behaves). for reasoning see https://github.com/redis/redis/issues/8824#issuecomment-855427955 This is a breaking change only when RESP3 is used, and COUNT argument is present!	2021-06-16 09:29:57 +03:00
sundb	e5d8a5eb85	Fix the wrong resize of querybuf (#9003 ) The initial memory of `querybuf` is `PROTO_IOBUF_LEN(1024*16) * 2` (due to sdsMakeRoomFor being greedy), under `jemalloc`, the allocated memory will be 40k. This will most likely result in the `querybuf` being resized when calling `clientsCronResizeQueryBuffer` unless the client requests it fast enough. Note that this bug existed even before #7875, since the condition for resizing includes the sds headers (32k+6). ## Changes 1. Use non-greedy sdsMakeRoomFor when allocating the initial query buffer (of 16k). 1. Also use non-greedy allocation when working with BIG_ARG (we won't use that extra space anyway) 2. in case we did use a greedy allocation, read as much as we can into the buffer we got (including internal frag), to reduce system calls. 3. introduce a dedicated constant for the shrinking (same value as before) 3. Add test for querybuf. 4. improve a maxmemory test by ignoring the effect of replica query buffers (can accumulate many ACKs on slow env) 5. improve a maxmemory test by disabling slowlog (it will cause slight memory growth on slow env).	2021-06-15 14:46:19 +03:00
Binbin	b109977301	Fix XINFO help for unexpected options. (#9075 ) Small cleanup and consistency.	2021-06-15 10:01:11 +03:00
Binbin	7900b48bc7	slowlog get command supports passing in -1 to get all logs. (#9018 ) This was already the case before this commit, but it wasn't clear / intended in the code, now it is.	2021-06-14 16:46:45 +03:00
YaacovHazan	1677efb9da	cleanup around loadAppendOnlyFile (#9012 ) Today when we load the AOF on startup, the loadAppendOnlyFile checks if the file is opening for reading. This check is redundant (dead code) as we open the AOF file for writing at initServer, and the file will always exist for the loadAppendOnlyFile. In this commit: - remove all the exit(1) from loadAppendOnlyFile, as it is the caller's responsibility to decide what to do in case of failure. - move the opening of the AOF file for writing, to be after loading it. - avoid returning -ERR in DEBUG LOADAOF, when the AOF exists but is empty	2021-06-14 10:38:08 +03:00
Binbin	b8a5da80c4	Fix accidental deletion of sinterstore command when we meet wrong type error. (#9032 ) SINTERSTORE would have deleted the dest key right away, even when later on it is bound to fail on an (WRONGTYPE) error. With this change it first picks up all the input keys, and only later delete the dest key if one is empty. Also add more tests for some commands. Mainly focus on - `wrong type error`: expand test case (base on sinter bug) in non-store variant add tests for store variant (although it exists in non-store variant, i think it would be better to have same tests) - the dstkey result when we meet `non-exist key (empty set)` in *store sdiff: - improve test case about wrong type error (the one we found in sinter, although it is safe in sdiff) - add test about using non-exist key (treat it like an empty set) sdiffstore: - according to sdiff test case, also add some tests about `wrong type error` and `non-exist key` - the different is that in sdiffstore, we will consider the `dstkey` result sunion/sunionstore add more tests (same as above) sinter/sinterstore also same as above ...	2021-06-13 10:53:46 +03:00
ny0312	fb140a1bff	Fix flaky test case for absolute TTL replication (#9069 ) The root cause is that one test (`5 keys in, 5 keys out`) leaks a volatile key that can expire while a later test (`All TTL in commands are propagated as absolute timestamp in replication stream`) is running. The leaked expiration injects an unexpected `DEL` command into the replication stream during the later test, causing it to fail. The fix is twofold: 1. Plug the leak in the first test. 2. Add FLUSHALL to the later test, to avoid future interference from other tests.	2021-06-13 08:42:20 +03:00
Binbin	0bfccc55e2	Fixed some typos, add a spell check ci and others minor fix (#8890 ) This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes. The spell checker is based on a blacklist of common spelling mistakes, so it will not catch everything, but it is also unlikely to cause false positives. Besides that, the PR also fixes many spelling mistakes and typos, not all of which were found by the spell checker we use. Here's a summary of the other changes: 1. Scanned the entire source code and fixed all sorts of typos and spelling mistakes (including missing or extra spaces). 2. Fixed outdated function / variable / argument names in comments. 3. Fixed the outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString. 4. Trimmed the trailing white space in `module.c`. Check: https://github.com/redis/redis/pull/7751 5. Updated some outdated https link URLs. 6. Fixed some outdated comments, such as: - In README: about the rdb, we used to say it creates a `thread`; changed to `process` - dbRandomKey function comment (about dictGetRandomKey, changed to dictGetFairRandomKey) - notifyKeyspaceEvent function comment (add type arg) - Some other minor fixes in comments (most of them variable names that were quoted incorrectly) 7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`	2021-06-10 15:39:33 +03:00
Yossi Gottlieb	8a86bca5ed	Improve test suite to handle external servers better. (#9033 ) This commit revives and improves the ability to run the test suite against external servers, instead of launching and managing `redis-server` processes as part of the test fixture. This capability existed in the past, using the `--host` and `--port` options. However, it was quite limited and mostly useful when running specific tests. Attempting to run larger chunks of the test suite ran into many issues: * Many tests depend on being able to start and control `redis-server` themselves, and there's no clear distinction between external-server-compatible tests and other tests. * Cluster mode is not supported (resulting in `CROSSSLOT` errors). This PR cleans up many things and makes it possible to run the entire test suite against an external server. It also provides more fine-grained controls to handle cases where the external server supports a subset of the Redis commands, a limited number of databases, cluster mode, etc. The tests directory now contains a `README.md` file that describes how this works. This commit also includes additional cleanups and fixes: * Tests can now be tagged. * Tag-based selection is now unified across `start_server`, `tags` and `test`. * More information is provided about skipped or ignored tests. * Repeated patterns in tests have been extracted to common procedures, both at a global level and on a per-test file basis. * Cleaned up some cases where test setup was based on a previous test executing (a major anti-pattern that repeats itself in many places). * Cleaned up some cases where test teardown was not part of a test (in the future we should have dedicated teardown code that executes even when tests fail). * Fixed some tests that were flaky when running on external servers.	2021-06-09 15:13:24 +03:00
Fabian Eichinger	39b0f0dd73	Add support for combining NX and GET flags on SET command (#8906 ) Until now GET and NX were mutually exclusive. This change makes their combination mean a "Get or Set" command. If the key exists it returns the old value and avoids setting, and if it doesn't exist it returns nil and sets it to the new value (possibly with an expiry time).	2021-06-07 16:47:58 +03:00
Huang Zhw	eaa7a7bb93	Fix XTRIM or XADD with LIMIT may delete more entries than Count. (#9048 ) The decision to stop trimming due to LIMIT in XADD and XTRIM was made only after the limit had been reached, i.e. the code was deleting at least that count of records (from the LIMIT argument's perspective, not the MAXLEN), instead of up to that count of records. See #9046	2021-06-07 14:43:36 +03:00
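
The difference between "at least" and "up to" the LIMIT count described in the last entry above (eaa7a7bb93) can be illustrated with a small model. This is only a sketch under stated assumptions, not the Redis implementation: it assumes the stream is trimmed in whole nodes (each holding several entries), and the function `trim_nodes`, its parameters and the `check_before` switch are hypothetical names introduced for illustration.

```python
def trim_nodes(nodes, maxlen, limit, check_before=True):
    """Toy model of node-wise stream trimming with a LIMIT.

    nodes: per-node entry counts, oldest node first (hypothetical layout).
    maxlen: whole nodes are dropped only while the stream keeps >= maxlen entries.
    limit: maximum number of entries to delete; 0 means no limit.
    check_before: True models the "up to limit" behaviour, False models the
    "stop only after the limit was reached" behaviour.
    Returns (remaining_nodes, deleted_entries).
    """
    nodes = list(nodes)
    total = sum(nodes)
    deleted = 0
    while nodes and total - nodes[0] >= maxlen:
        entries = nodes[0]
        # "Up to" semantics: refuse to drop a node that would exceed the limit.
        if check_before and limit and deleted + entries > limit:
            break
        nodes.pop(0)
        total -= entries
        deleted += entries
        # "At least" semantics: notice the limit only after overshooting it.
        if not check_before and limit and deleted >= limit:
            break
    return nodes, deleted


if __name__ == "__main__":
    stream = [100, 100, 100]
    print(trim_nodes(stream, maxlen=0, limit=150, check_before=False))
    print(trim_nodes(stream, maxlen=0, limit=150, check_before=True))
```

With three nodes of 100 entries and a limit of 150, the late check removes 200 entries, while the early check stops after 100 and so stays within the limit.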

1552 Commits