redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00

Author	SHA1	Message	Date
Wen Hui	d412269ff8	Adding missing test cases for Addslot Command (#12288 ) Added missing test case coverage for below scenarios: 1. The command only works if all the specified slots are, from the point of view of the node receiving the command, currently not assigned. A node will refuse to take ownership for slots that already belong to some other node (including itself). 2. The command fails if the same slot is specified multiple times.	2023-06-11 08:36:26 +03:00
Oran Agra	2764dc3768	Optimize MSETNX to avoid double lookup (#11944 ) This is a redo of #11594, which got reverted in #11940. It improves performance by avoiding a double lookup of the key.	2023-05-28 10:58:29 +03:00
Oran Agra	6117f28822	Fix WAIT for clients being blocked in a module command (#12220 ) So far clients being blocked and unblocked by a module command would update the c->woff variable and so WAIT was ineffective and got released without waiting for the command actions to propagate. This seems to have existed since forever, but not for RM_BlockClientOnKeys. It is problematic though to know if the module did or didn't propagate anything in that command, so for now, instead of adding an API, we'll just update the woff to the latest offset when unblocking, this will cause the client to possibly wait excessively, but that's not that bad.	2023-05-28 10:10:52 +03:00
Wen Hui	1a188e4ed6	[BUG] Incorrect error msg for XREAD command (#12238 ) XREAD only supports a special ID of $ and XREADGROUP only supports >. Make sure not to suggest the wrong one when returning an error about unbalanced ID arguments. Co-authored-by: Oran Agra <oran@redislabs.com>	2023-05-28 08:37:32 +03:00
judeng	d71478a889	postpone the initialization of object's lru&lfu until it is added to the db as a value object (#11626 ) This PR brings two performance benefits: 1. Stop redundant initialization when most robj objects are created 2. LRU_CLOCK will no longer be called in io threads, so we can avoid the `atomicGet` Another code optimization: deleted the redundant check in dbSetValue. Whether in LFU or LRU mode, the lru field in the old robj is always the freshest (it is always updated in lookupkey), so we don't need to check for LFU	2023-05-24 09:40:11 +03:00
Wen Hui	d664889992	Adding test case for hvals, hkeys, hexists against wrong type (#12198 ) HVALS, HKEYS and HEXISTS commands wrong type test cases were not covered so added the test cases.	2023-05-24 09:34:13 +03:00
Ping Xie	4c74dd986f	Exclude aux fields from "cluster nodes" and "cluster replicas" output (#12166 ) This commit excludes aux fields from the output of the `cluster nodes` and `cluster replicas` command. We may decide to re-introduce them in some form or another in the future, but not in v7.2.	2023-05-23 18:32:37 +03:00
Binbin	48757934ff	Performance improvement to ZADD and ZRANGESTORE, convert to skiplist and expand dict in advance (#12185 ) For zsets that will eventually be stored as the skiplist encoding (has a dict), we can convert it to skiplist ahead of time. This change checks the number of arguments in the ZADD command, and converts the data-structure if the number of new entries exceeds the listpack-max-entries configuration. This can cause us to over-allocate memory if there are duplicate entries in the input, which is unexpected. For ZRANGESTORE, we know the size of the zset, so we can expand the dict in advance, to avoid the temporary dict from being rehashed while it grows. Simple benchmarks show it provides some 4% improvement in ZADD and 20% in ZRANGESTORE	2023-05-18 15:24:46 +03:00
Hanna Fadida	37cf1984b9	Add BITFIELD_RO basic tests for non-repl use cases (#12187 ) Current tests for BITFIELD_RO command are skipped in the external mode, and therefore reply-schemas-validator reports a coverage error. This PR adds basic tests to increase coverage.	2023-05-18 12:16:46 +03:00
Binbin	fd566f4050	Fix for set max entries edge case in setTypeCreate / setTypeMaybeConvert (#12183 ) In the judgment in setTypeCreate, we should judge size_hint <= max_entries. This results in the following inconsistencies: ``` 127.0.0.1:6379> config set set-max-intset-entries 5 set-max-listpack-entries 5 OK 127.0.0.1:6379> sadd intset_set1 1 2 3 4 5 (integer) 5 127.0.0.1:6379> object encoding intset_set1 "hashtable" 127.0.0.1:6379> sadd intset_set2 1 2 3 4 (integer) 4 127.0.0.1:6379> sadd intset_set2 5 (integer) 1 127.0.0.1:6379> object encoding intset_set2 "intset" 127.0.0.1:6379> sadd listpack_set1 a 1 2 3 4 (integer) 5 127.0.0.1:6379> object encoding listpack_set1 "hashtable" 127.0.0.1:6379> sadd listpack_set2 a 1 2 3 (integer) 4 127.0.0.1:6379> sadd listpack_set2 4 (integer) 1 127.0.0.1:6379> object encoding listpack_set2 "listpack" ``` This was introduced in #12019, added corresponding tests.	2023-05-16 11:32:21 -07:00
Wen Hui	e45272884e	Adding missing test case for smembers scard commands (#12148 ) Minor missing test case addition. SMEMBERS SCARD against non set SMEMBERS SCARD against non existing key	2023-05-16 09:26:49 -07:00
Oran Agra	2ffde15a1d	increase tolerance of new event loop test, fails on freebsd CI (#12169 ) new test added in #11963, fails on freebsd CI which is slow.	2023-05-14 17:40:29 +03:00
JJ Lu	11cf5cbdcc	Fix bug: LPOS RANK LONG_MIN causes overflow (#12167 ) Limit the range of RANK to -LONG_MAX ~ LONG_MAX. Without this limit, passing -9223372036854775808 would effectively be the same as passing -1.	2023-05-14 09:04:33 +03:00
Chen Tianjie	29ca87955e	Add basic eventloop latency measurement. (#11963 ) The measured latency(duration) includes the list below, which can be shown by `INFO STATS`. ``` eventloop_cycles // ever increasing counter eventloop_duration_sum // cumulative duration of eventloop in microseconds eventloop_duration_cmd_sum // cumulative duration of executing commands in microseconds instantaneous_eventloop_cycles_per_sec // average eventloop count per second in recent 1.6s instantaneous_eventloop_duration_usec // average single eventloop duration in recent 1.6s ``` Also added some experimental metrics, which are shown only when `INFO DEBUG` is called. This section isn't included in the default INFO, or even in `INFO ALL` and the fields in this section can change in the future without considering backwards compatibility. ``` eventloop_duration_aof_sum // cumulative duration of writing AOF eventloop_duration_cron_sum // cumulative duration cron jobs (serverCron, beforeSleep excluding IO and AOF) eventloop_cmd_per_cycle_max // max number of commands executed in one eventloop eventloop_duration_max // max duration of one eventloop ``` All of these are being reset by CONFIG RESETSTAT	2023-05-12 20:13:15 +03:00
Binbin	6ab2174d37	EXPIRE precision test more attempts and more tolerant (#12150 ) The test failed on MacOS: ``` *** [err]: EXPIRE precision is now the millisecond in tests/unit/expire.tcl Expected 'somevalue {}' to equal or match '{} {}' ``` `set a [r get x]`, even though we tried 10 times, sometimes we still get {}, this is a time-sensitive test. In this PR, we add the following changes: 1. More attempts, change it from 10 to 30. 2. More tolerant, change the `after 900` to `after 800`. In addition, we check $a in advance and change `after 1100` to `after 300`, which will save us some time.	2023-05-09 14:14:22 +03:00
cui fliter	03d50e0c30	Remove several instances of duplicate "the" in comments (#12144 ) Remove several instances of duplicate "the" in comments	2023-05-08 16:12:44 -07:00
Wen Hui	42dd98ec19	adding missing test cases GET and GETEX (#12125 ) adding test case of expired key or not exist for GET and GETEX. for better test coverage.	2023-05-07 11:46:11 +03:00
sundb	ce5f4ea3a9	Delete empty key if fails after moduleCreateEmptyKey() in module (#12129 ) When `RM_ZsetAdd()`/`RM_ZsetIncrby()`/`RM_StreamAdd()` fails, if a new key happens to be created using `moduleCreateEmptyKey()`, we should clean up the empty key. ## Test 1) Add new module commands(`zset.add` and `zset.incrby`) to cover `RM_ZsetAdd()`/`RM_ZsetIncrby()`. 2) Add a large-memory test to cover `RM_StreamAdd()`.	2023-05-07 10:13:19 +03:00
Binbin	e49c2a5292	Pause cron to prevent premature shrinking in querybuf test (#12126 ) Tests occasionally fail since #12000: ``` * [err]: query buffer resized correctly when not idle in tests/unit/querybuf.tcl Expected 0 > 32768 (context: type eval line 11 cmd {assert {$orig_test_client_qbuf > 32768}} proc ::test) * [err]: query buffer resized correctly with fat argv in tests/unit/querybuf.tcl query buffer should not be resized when client idle time smaller than 2s ``` The reason may be because we set hz to 100, querybuf shrinks before we count client_query_buffer. We avoid this problem by setting pause-cron to 1.	2023-05-04 13:02:08 +03:00
guybe7	857c09b04d	multi.tcl: reset readraw at the end of the test (#12123 ) 1. reset the readraw mode after a test that uses it. undetected since the only test after that on the same server didn't read any replies. 2. fix a cross slot issue that was undetected in cluster mode because readraw doesn't throw exceptions on errors.	2023-05-04 11:58:31 +03:00
Wen Hui	f32d1817e3	Updating missing test cases for Hash commands (#12116 ) Adding missing test case against wrong type for HRANDFIELD HGET HGETALL HDEL HINCRBY HINCRBYFLOAT HSTRLEN.	2023-05-01 21:00:07 +03:00
judeng	9b588f3820	minor optimization for slowlog get (#12103 ) We can always know the array length of the response, so there is no need to use addReplyDeferredLen which may introduce some additional overheads.	2023-04-25 10:17:21 +03:00
Wen Hui	091412cf62	add test cases for decr decrby on missing key (#12070 ) Minor test case addition for DECR and DECRBY. Currently DECR and DECRBY do not have test case coverage for the scenarios where they run on a non-existing key.	2023-04-19 09:55:56 +03:00
Binbin	20533cc1d7	Tests: Do not save an RDB by default and add a SIGTERM default AOFRW test (#12064 ) In order to speed up tests, avoid saving an RDB (mostly notable on shutdown), except for tests that explicitly test the RDB mechanism. In addition, use `shutdown-on-sigterm force` to prevent shutdown from failing in case the server is in the middle of the initial AOFRW. Also add a test that checks that the `shutdown-on-sigterm default` is to refuse shutdown if there's an initial AOFRW. Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>	2023-04-18 16:14:26 +03:00
Wen Hui	d2db4aa753	Added getrange missing testcase (#12061 ) Minor test case addition. Currently GETRANGE command does not have the test case coverage for the scenarios: An error is returned when key exists but of different type Added missing test cases for getrange command.	2023-04-18 08:32:10 +03:00
judeng	e7f18432b8	avoid incorrect shrinking of querybuf when client is reading a big argv (#12000 ) This PR fixes two bugs: 1. When client's querybuf is pre-allocated for a fat argv, we need to update the querybuf_peak of the client immediately to completely avoid the unexpected shrinking of querybuf in the next clientCron (before data arrives to set the peak). 2. The protocol's bulklen does not include `\r\n`, but the allocation and the data we read do, so in `clientsCronResizeQueryBuffer`, the `resize` or `querybuf_peak` should add these 2 bytes. The first bug is likely to hit us on large payloads over slow connections, in which case transferring the payload can take longer and a cron event will be triggered (specifically if there are not a lot of clients)	2023-04-16 15:49:26 +03:00
Wen Hui	4375b01cc7	Adding missing test cases for substring (#12039 ) There are some missing test cases for the SUBSTR command. These might already be covered by GETRANGE, but no harm in adding them since they're simple. Added 3 test cases. * start > stop * start and stop both greater than string length * when no key is present.	2023-04-13 21:48:26 +03:00
Wen Hui	bc82309ceb	Adding missing test cases for linsert command (#12040 ) Currently LINSERT command does not have the test case coverage for following scenarios. 1. When key does not exist, it is considered an empty list and no operation is performed. 2. An error is returned when key exists but does not hold a list value. Added above two missing test cases for linsert command.	2023-04-13 19:05:41 +03:00
Binbin	bfec2d700b	Add RM_ReplyWithErrorFormat that can support format (#11923 ) * Add RM_ReplyWithErrorFormat that can support format Reply with the error created from a printf format and arguments. If the error code is already passed in the string 'fmt', the error code provided is used, otherwise the string "-ERR " for the generic error code is automatically added. The usage is, for example: RedisModule_ReplyWithErrorFormat(ctx, "An error: %s", "foo"); RedisModule_ReplyWithErrorFormat(ctx, "-WRONGTYPE Wrong Type: %s", "foo"); The function always returns REDISMODULE_OK.	2023-04-12 10:11:29 +03:00
Oran Agra	997fa41e99	Attempt to solve MacOS CI issues in GH Actions (#12013 ) The MacOS CI in github actions often hangs without any logs. GH argues that it's due to resource utilization, either running out of disk space, memory, or CPU starvation, and thus the runner is terminated. This PR contains multiple attempts to resolve this: 1. introducing pause_process instead of SIGSTOP, which waits for the process to stop before resuming the test, possibly resolving race conditions in some tests, this was a suspect since there was one test that could result in an infinite loop in that case, in practice this didn't help, but still a good idea to keep. 2. disable the `save` config in many tests that don't need it, specifically ones that use heavy writes and could create large files. 3. change the `populate` proc to use short pipeline rather than an infinite one. 4. use `--clients 1` in the macos CI so that we don't risk running multiple resource demanding tests in parallel. 5. enable `--verbose` to be repeated to elevate verbosity and print more info to stdout when a test or a server starts.	2023-04-12 09:19:21 +03:00
Binbin	45b8eea19f	Add ZREMRANGEBYLEX basics tests to fix reply-schemas daily (#12021 ) We do have ZREMRANGEBYLEX tests, but it is a stress test marked with slow tag and then skipped in reply-schemas daily. In the past, we were able to succeed on a daily, i guess it was because there were some random command executions, such as corrupt-dump-fuzzy, which might call it. These test examples are taken from ZRANGEBYLEX basics test.	2023-04-11 11:14:16 +03:00
Ozan Tezcan	e55568edb5	Add RM_RdbLoad and RM_RdbSave module API functions (#11852 ) Add `RM_RdbLoad()` and `RM_RdbSave()` to load/save RDB files from the module API. In our use case, we have our clustering implementation as a module. As part of this implementation, the module needs to trigger RDB save operation at specific points. Also, this module delivers RDB files to other nodes (not using Redis' replication). When a node receives an RDB file, it should be able to load the RDB. Currently, there is no module API to save/load RDB files. This PR adds four new APIs: ```c RedisModuleRdbStream *RM_RdbStreamCreateFromFile(const char *filename); void RM_RdbStreamFree(RedisModuleRdbStream *stream); int RM_RdbLoad(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags); int RM_RdbSave(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags); ``` The first step is to create a `RedisModuleRdbStream` object. This PR provides a function to create RedisModuleRdbStream from the filename. (You can load/save RDB with the filename). In the future, this API can be extended if needed: e.g., `RM_RdbStreamCreateFromFd()`, `RM_RdbStreamCreateFromSocket()` to save/load RDB from an `fd` or a `socket`. Usage: ```c /* Save RDB */ RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb"); RedisModule_RdbSave(ctx, stream, 0); RedisModule_RdbStreamFree(stream); /* Load RDB */ RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb"); RedisModule_RdbLoad(ctx, stream, 0); RedisModule_RdbStreamFree(stream); ```	2023-04-09 12:07:32 +03:00
Slava Koyfman	f38aa6bfb7	Disconnect pub-sub subscribers when revoking `allchannels` permission (#11992 ) The existing logic for killing pub-sub clients did not handle the `allchannels` permission correctly. For example, if you: ACL SETUSER foo allchannels Have a client authenticate as the user `foo` and subscribe to a channel, and then: ACL SETUSER foo resetchannels The subscribed client would not be disconnected, though new clients under that user would be blocked from subscribing to any channels. This was caused by an incomplete optimization in `ACLKillPubsubClientsIfNeeded` checking whether the new channel permissions were a strict superset of the old ones.	2023-04-02 16:18:28 +03:00
Binbin	cb17178658	Fix fork done handler wrongly update fsync metrics and enhance AOF_FSYNC_ALWAYS (#11973 ) This PR fixes several unrelated bugs that were discovered by the same set of tests (WAITAOF tests in #11713) and could make the `WAITAOF` test hang. The change in `backgroundRewriteDoneHandler` is about MP-AOF. That leftover / old code assumes that we started a new AOF file just now (when we have a new base into which we're gonna incrementally write), but the fact is that with MP-AOF, the fork done handler doesn't really affect the incremental file being maintained by the parent process, there's no reason to re-issue `SELECT`, and no reason to update any of the fsync variables in that flow. This should have been deleted with MP-AOF (introduced in #9788, 7.0). The damage is that the update to `aof_fsync_offset` will cause us to miss an fsync in `flushAppendOnlyFile`, that happens if we stop write commands in `AOF_FSYNC_EVERYSEC` while an AOFRW is in progress. This caused a new `WAITAOF` test to sometimes hang forever. Also because of MP-AOF, we needed to change `aof_fsync_offset` to `aof_last_incr_fsync_offset` and match it to `aof_last_incr_size` in `flushAppendOnlyFile`. This is because in the past we compared `aof_fsync_offset` and `aof_current_size`, but with MP-AOF it could be that the total AOF file will be smaller after AOFRW, and the (already existing) incr file still has data that needs to be fsynced. The change in `flushAppendOnlyFile` regarding `AOF_FSYNC_ALWAYS` follows #6053 (the details are in #5985): we also check `AOF_FSYNC_ALWAYS` to handle a case where appendfsync is changed from everysec to always while there is data that's written but not yet fsynced.	2023-03-29 15:17:05 +03:00
Binbin	2cc99c692c	Add COMMAND COUNT test to cover reply-schemas-validator test (#11971 ) Since we removed the COMMAND COUNT call in the sentinel test in #11950, reply-schemas-validator started reporting this error: ``` WARNING! The following commands were not hit at all: command|count ERROR! at least one command was not hit by the tests ``` This PR adds a COMMAND COUNT test to cover it and also fixes some typos in req-res-log-validator.py	2023-03-26 08:39:04 +03:00
Ozan Tezcan	99e6855453	Add needs:reset for the test (#11959 ) Added missing needs:reset tag. Introduced by #11758	2023-03-23 10:48:45 +02:00
Oran Agra	d38df59a3f	fix CLIENT SETINFO to use error replies instead of status replies (#11952 )	2023-03-22 14:32:36 +02:00
Igor Malinovskiy	c3b9f2fbd9	Allow clients to report name and version (#11758 ) This PR allows clients to send information about the client library to redis to be displayed in CLIENT LIST and CLIENT INFO. Currently supports: `CLIENT [lib-name | lib-ver] <value>` Client libraries are expected to pipeline these right after AUTH, and ignore the failure in case they're talking to an older version of redis. These will be shown in CLIENT LIST and CLIENT INFO as: * `lib-name` - meant to hold the client library name. * `lib-ver` - meant to hold the client library version. The values cannot contain spaces, newlines and any wild ASCII characters, but all other normal chars are accepted, e.g `.`, `=` etc (same as CLIENT NAME). The RESET command does NOT clear these, but they can be cleared to the default by sending a command with a blank string. Co-authored-by: Oran Agra <oran@redislabs.com>	2023-03-22 08:17:20 +02:00
Roshan Khatri	6948dacaf6	Module commands to have ACL categories. (#11708 ) This allows modules to register commands to existing ACL categories and blocks the creation of [sub]commands, datatypes and registering the configs outside of the OnLoad function. To allow modules to register commands to existing ACL categories, this PR implements a new API int RM_SetCommandACLCategories() which takes a pointer to a RedisModuleCommand and a C string aclflags containing the set of space separated ACL categories. For example, 'write slow' marks the command as part of the write and slow ACL categories. The C string aclflags is tokenized by implementing a helper function categoryFlagsFromString(). These tokens are matched and the corresponding ACL categories flags are set by a helper function matchAclCategoriesFlags. The helper function categoryFlagsFromString() returns the corresponding categories_flags or returns -1 if some token is not processed correctly. If the module contains commands which are registered to existing ACL categories, the number of [sub]commands is tracked by num_commands_with_acl_categories in struct RedisModule. Further, the allowed command bit-map of the existing users is recomputed from the command_rules list, by implementing a function called ACLRecomputeCommandBitsFromCommandRulesAllUsers() for the existing users to have access to the module commands at runtime. ## Breaking change This change requires that registering commands and subcommands only occur during a module's "OnLoad" function, in order to allow efficient recompilation of ACL bits. We also chose to block registering configs and types, since we believe it's only valid for those to be created during onLoad. We check for this onload flag in struct RedisModule to check if the call is made from the OnLoad function. Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2023-03-21 10:07:11 -07:00
Binbin	78f15b7ef1	Fix race in temp rdb delete shutdown test (#11840 ) I saw this error once, in the FreeBSD Daily CI: ``` *** [err]: Temp rdb will be deleted if we use bg_unlink when shutdown in tests/unit/shutdown.tcl Expected [file exists /xxx/temp-10336.rdb] (context: type eval line 15 cmd {assert {[file exists $temp_rdb]}} proc ::test) ``` The log shows that bgsave was executed, and it was successfully executed in the end: ``` Starting test Temp rdb will be deleted if we use bg_unlink when shutdown in tests/unit/shutdown.tcl 10251:M 22 Feb 2023 11:37:25.441 * Background saving started by pid 10336 10336:C 22 Feb 2023 11:37:27.949 * DB saved on disk 10336:C 22 Feb 2023 11:37:27.949 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB 10251:M 22 Feb 2023 11:37:28.060 * Background saving terminated with success ``` There may be two reasons: 1. The child process has been created, but it has not created the temp rdb file yet, so [file exists $temp_rdb] check failed. 2. The child process bgsave has been executed successfully and the temp file has been deleted, so [file exists $temp_rdb] check failed. Judging from the logs, it should be case 2; case 1 is too extreme. Set rdb-key-save-delay to a higher value to ensure bgsave does not succeed early, which avoids this case.	2023-03-21 17:51:47 +02:00
Oran Agra	48e0d47884	Avoid assertion when MSETNX is used with the same key twice (CVE-2023-28425) (#11940 ) Using the same key twice in MSETNX command would trigger an assertion. This reverts #11594 (introduced in Redis 7.0.8)	2023-03-20 18:50:44 +02:00
Shaya Potter	6cf8fc08f5	Don't run command filter on blocked command reprocessing (#11895 ) Previously we would run the module command filters even upon blocked command reprocessing. This could modify the command and its args. This is irrelevant in the context of a command being reprocessed (it already went through the filters), and it also breaks the crashed command lookup that exists in the case of a reprocessed command. fixes #11894. Co-authored-by: Oran Agra <oran@redislabs.com>	2023-03-20 08:04:13 +02:00
Meir Shpilraien (Spielrein)	d0da0a6a3f	Support for RM_Call on blocking commands (#11568 ) Allow running blocking commands from within a module using `RM_Call`. Today, when `RM_Call` is used, the fake client that is used to run the command is marked with the `CLIENT_DENY_BLOCKING` flag. This flag tells the command that it is not allowed to block the client and, in case it needs to block, it must fall back to some alternative (either return an error or perform some default behavior). For example, `BLPOP` falls back to a simple `LPOP` if it is not allowed to block. All commands must respect the `CLIENT_DENY_BLOCKING` flag (including module commands). When the command invocation finishes, Redis asserts that the client was not blocked. This PR introduces the ability to call blocking commands using `RM_Call` by passing a callback that will be called when the client gets unblocked. In order to do that, the user must explicitly state that blocking commands are allowed by passing a new format specifier argument, `K`, to the `RM_Call` function. This new flag tells Redis that it is allowed to run blocking commands and block the client. In case the command got blocked, Redis will return a new type of call reply (`REDISMODULE_REPLY_PROMISE`). This call reply indicates that the command got blocked and the user can set the on_unblocked handler using `RM_CallReplyPromiseSetUnblockHandler`. When a client gets unblocked, it eventually reaches the `processUnblockedClients` function. This is where we check if the client is a fake module client and, if it is, we call the unblock callback instead of performing the usual unblock operations. Notice: `RM_CallReplyPromiseSetUnblockHandler` must be called atomically alongside the command invocation (without releasing the Redis lock in between). In addition, unlike other CallReply types, the promise call reply must be released by the module when the Redis GIL is acquired. 
The module can abort the execution of the blocking command (if it was not yet executed) using `RM_CallReplyPromiseAbort`. The API will return `REDISMODULE_OK` on success and `REDISMODULE_ERR` if the operation is already executed. Notice that in the case of a misbehaving module, Abort might finish successfully but the operation will not really be aborted. This can only happen if the module does not respect the disconnect callback of the blocked client. For pure Redis commands this cannot happen. ### Atomicity Guarantees The API promises that the unblock handler will run atomically as an execution unit. This means that all the operations performed in the unblock handler will be wrapped in a multi exec transaction when replicated to the replica and AOF. The API does not guarantee any other atomicity properties, such as when the unblock handler will be called. This gives us the flexibility to strengthen the guarantees (or not) in the future if we decide that we need better guarantees. That said, the implementation does provide better guarantees when performing pure Redis blocking commands like `BLPOP`. In this case the unblock handler will run atomically with the operation that got unblocked (for example, in the case of `BLPOP`, the unblock handler will run atomically with the `LPOP` operation that runs when the command gets unblocked). This is an implementation detail that might change in the future, and module writers should not count on it. ### Calling blocking commands while running on script mode (`S`) `RM_Call` script mode (`S`) was introduced in #0372. It is used for use cases where the command that was invoked on `RM_Call` comes from user input and we want to make sure the user will not run dangerous commands like `shutdown`. Some commands, such as `BLPOP`, are marked with the `NO_SCRIPT` flag, which means they will not be allowed in script mode. Those commands are marked with `NO_SCRIPT` just because they are blocking commands and not because they are dangerous. 
Now that we can run blocking commands on RM_Call, there is no real reason not to allow such commands in script mode. The underlying problem is that the `NO_SCRIPT` flag is abused to also mark some of the blocking commands (notice that those commands know not to block the client if it is not allowed to do so, and have fallback logic for such cases. So even if those commands were not marked with the `NO_SCRIPT` flag, it would not harm Redis, and today we can already run those commands within multi exec). In addition, not all blocking commands are marked with the `NO_SCRIPT` flag; for example, `blmpop` is not marked and can run from within a script. These facts show that there is some ambiguity about the meaning of the `NO_SCRIPT` flag, and it's not fully clear where it should be used. The PR suggests that blocking commands should not be marked with the `NO_SCRIPT` flag; those commands should handle the `CLIENT_DENY_BLOCKING` flag and only block when it's safe (like they already do today). To achieve that, the PR removes the `NO_SCRIPT` flag from the following commands: * `blmove` * `blpop` * `brpop` * `brpoplpush` * `bzpopmax` * `bzpopmin` * `wait` This might be considered a breaking change as now, in scripts, instead of getting a `command is not allowed from script` error, the user will get some fallback behavior based on the command implementation. That said, the change matches the behavior of scripts and multi exec with respect to those commands and allows running them on `RM_Call` even when script mode is used. ### Additional RedisModule API and changes * `RM_BlockClientSetPrivateData` - Set private data on the blocked client without the need to unblock the client. This allows us to set the promise CallReply as the private data of the blocked client and abort it if the client gets disconnected. * `RM_BlockClientGetPrivateData` - Return the current private data set on a blocked client. We need it so we will have access to this private data on the disconnect callback. 
* On RM_Call, the returned reply will be added to the auto memory context only if auto memory is enabled. This allows us to keep the call reply for longer than the context lifetime and does not force an unneeded borrow relationship between the CallReply and the RedisModuleContext.	2023-03-16 14:04:31 +02:00
Binbin	0b159b34ea	Bump codespell to 2.2.4, fix typos and outdated comments (#11911 ) Fix some seen typos and wrong comments.	2023-03-16 08:50:32 +02:00
KarthikSubbarao	f8a5a4f70c	Custom authentication for Modules (#11659 ) This change adds new module callbacks that can override the default password based authentication associated with ACLs. With this, Modules can register auth callbacks through which they can implement their own Authentication logic. When `AUTH` and `HELLO AUTH ...` commands are used, Module based authentication is attempted and then normal password based authentication is attempted if needed. The new Module APIs added in this PR are `RM_RegisterCustomAuthCallback`, `RM_BlockClientOnAuth` and `RedisModule_ACLAddLogEntryByUserName`. Module based authentication will be attempted for all Redis users (created through the ACL SETUSER cmd or through Module APIs) even if the Redis user does not exist at the time of the command. This gives a chance for the Module to create the RedisModule user and then authenticate via the RedisModule API - from the custom auth callback. For the AUTH command, we will support both variations - `AUTH <username> <password>` and `AUTH <password>`. In case of the `AUTH <password>` variation, the custom auth callbacks are triggered with “default” as the username and password as what is provided. ### RedisModule_RegisterCustomAuthCallback ``` void RM_RegisterCustomAuthCallback(RedisModuleCtx *ctx, RedisModuleCustomAuthCallback cb) { ``` This API registers a callback to execute prior to normal password based authentication. Multiple callbacks can be registered across different modules. These callbacks are responsible for either handling the authentication (authenticating the user or explicitly denying), or deferring it to other authentication mechanisms. Callbacks are triggered in the order they were registered. When a Module is unloaded, all the auth callbacks registered by it are unregistered. The callbacks are attempted, in the order of most recently registered callbacks, when the AUTH/HELLO (with the AUTH field provided) commands are called. 
The callbacks will be called with a module context along with a username and a password, and are expected to take one of the following actions: (1) Authenticate - Use the RM_Authenticate API successfully and return `REDISMODULE_AUTH_HANDLED`. This will immediately end the auth chain as successful and add the OK reply. (2) Block a client on authentication - Use the `RM_BlockClientOnAuth` API and return `REDISMODULE_AUTH_HANDLED`. Here, the client will be blocked until the `RM_UnblockClient` API is used, which will trigger the auth reply callback (provided earlier through `RM_BlockClientOnAuth`). In this reply callback, the Module should authenticate, deny or skip handling authentication. (3) Deny Authentication - Return `REDISMODULE_AUTH_HANDLED` without authenticating or blocking the client. Optionally, `err` can be set to a custom error message. This will immediately end the auth chain as unsuccessful and add the ERR reply. (4) Skip handling Authentication - Return `REDISMODULE_AUTH_NOT_HANDLED` without blocking the client. This will allow the engine to attempt the next custom auth callback. If none of the callbacks authenticate or deny auth, then password based auth is attempted and will authenticate or add failure logs and reply to the clients accordingly. ### RedisModule_BlockClientOnAuth ``` RedisModuleBlockedClient *RM_BlockClientOnAuth(RedisModuleCtx *ctx, RedisModuleCustomAuthCallback reply_callback, void (*free_privdata)(RedisModuleCtx*,void*)) ``` This API can only be used by a Module from within the custom auth callback. If a client is not in the middle of custom module based authentication, ERROR is returned. Otherwise, the client is blocked and the `RedisModuleBlockedClient` is returned, similar to the `RedisModule_BlockClient` API. 
### RedisModule_ACLAddLogEntryByUserName ``` int RM_ACLAddLogEntryByUserName(RedisModuleCtx *ctx, RedisModuleString *username, RedisModuleString *object, RedisModuleACLLogEntryReason reason) ``` Adds a new entry in the ACL log with the `username` RedisModuleString provided. This simplifies the Module usage because now, developers do not need to create a Module User just to add an error ACL Log entry. Aside from accepting username (RedisModuleString) instead of a RedisModuleUser, it is the same as the existing `RedisModule_ACLAddLogEntry` API. ### Breaking changes - HELLO command - Clients can now only set the client name and RESP protocol from the `HELLO` command if they are authenticated. Also, we now finish command arg validation first and return early with an ERR reply if any arg is invalid. This is to avoid mutating the client name / RESP from a command that would have failed on invalid arguments. ### Notable behaviors - Module unblocking - Now, we will not allow Modules to block the client from inside the context of a reply callback (triggered from the Module unblock flow `moduleHandleBlockedClients`). --------- Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>	2023-03-15 15:18:42 -07:00
Binbin	58285a6e92	Fix WAITAOF mix-use last_offset and last_numreplicas (#11922 ) There can be a situation that satisfies WAIT and then wrongly unblocks WAITAOF, because we mix-use last_offset and last_numreplicas. We update last_offset and last_numreplicas only when the condition matches, i.e. when the output of either replicationCountAOFAcksByOffset or replicationCountAcksByOffset is right. In this case, we need to have separate last_ variables for each of them. Added last_aof_offset and last_aof_numreplicas for WAITAOF. WAITAOF was added in #11713. Found while coding #11917. A test was added to validate that case.	2023-03-15 18:16:16 +02:00
Binbin	70b2c4f5fd	Fix WAITAOF reply when using last_offset and last_numreplicas (#11917 ) WAITAOF was added in #11713, and its reply is an array. But we forgot to handle WAITAOF in last_offset and last_numreplicas, causing WAITAOF to return a WAIT-like reply. Tests were added to validate that case (both WAIT and WAITAOF). This PR also refactored processClientsWaitingReplicas a bit for better maintainability and readability.	2023-03-15 11:07:04 +02:00
Slava Koyfman	9344f654c6	Implementing the WAITAOF command (issue #10505 ) (#11713 ) Implementing the WAITAOF functionality which would allow the user to block until a specified number of Redises have fsynced all previous write commands to the AOF. Syntax: `WAITAOF <num_local> <num_replicas> <timeout>` Response: Array containing two elements: num_local, num_replicas num_local is always either 0 or 1 representing the local AOF on the master. num_replicas is the number of replicas that acknowledged a replication offset of the last write being fsynced to the AOF. Returns an error when called on replicas, or when called with non-zero num_local on a master with AOF disabled, in all other cases the response just contains number of fsync copies. Main changes: * Added code to keep track of replication offsets that are confirmed to have been fsynced to disk. * Keep advancing master_repl_offset even when replication is disabled (and there's no replication backlog, only if there's an AOF enabled). This way we can use this command and its mechanisms even when replication is disabled. * Extend REPLCONF ACK to `REPLCONF ACK <ofs> FACK <ofs>`, the FACK will be appended only if there's an AOF on the replica, and already ignored on old masters (thus backwards compatible) * WAIT no longer waits for the replication offset after your last command, but rather the replication offset after your last write (or read command that caused propagation, e.g. lazy expiry). Unrelated changes: * WAIT command respects CLIENT_DENY_BLOCKING (not just CLIENT_MULTI) Implementation details: * Add an atomic var named `fsynced_reploff_pending` that's updated (usually by the bio thread) and later copied to the main `fsynced_reploff` variable (only if the AOF base file exists). I.e. during the initial AOF rewrite it will not be used as the fsynced offset since the AOF base is still missing. 
* Replace close+fsync bio job with new BIO_CLOSE_AOF (AOF specific) job that will also update the fsync offset field. * Handle all AOF jobs (BIO_CLOSE_AOF, BIO_AOF_FSYNC) in the same bio worker thread, to impose ordering on their execution. This solves a race condition where a job could set `fsynced_reploff_pending` to a higher value than another pending fsync job, resulting in indicating an offset for which parts of the data have not yet actually been fsynced. Imposing an ordering on the jobs guarantees that fsync jobs are executed in increasing order of replication offset. * Drain bio jobs when switching `appendfsync` to "always". This should prevent a write race between updates to `fsynced_reploff_pending` in the main thread (`flushAppendOnlyFile` when set to ALWAYS fsync), and those done in the bio thread. * Drain the pending fsync when starting over a new AOF to avoid race conditions with the previous AOF offsets overriding the new one (e.g. after switching to replicate from a new master). * Make sure to update the fsynced offset at the end of the initial AOF rewrite. This is a must in case there are no additional writes that trigger a periodic fsync, specifically for a replica that does a full sync. Limitations: It is possible to write a module and a Lua script that propagate to the AOF and don't propagate to the replication stream. See REDISMODULE_ARGV_NO_REPLICAS and luaRedisSetReplCommand. These features are incompatible with the WAITAOF command, and can result in two bad cases. The scenario is that the user executes a command that only propagates to AOF, and then immediately issues a WAITAOF, and there are no further writes on the replication stream after that. 1. if the last thing that happened on the replication stream is a PING (which increased the replication offset but won't trigger an fsync on the replica), then the client would hang forever (will wait for an fack that the replica will never send since it doesn't trigger any fsyncs). 2. 
if the last thing that happened is a write command that got propagated properly, then WAITAOF will be released immediately, without waiting for an fsync (since the offset didn't change) Refactoring: * Plumbing to allow bio worker to handle multiple job types This introduces infrastructure necessary to allow BIO workers to not have a 1-1 mapping of worker to job-type. This allows in the future to assign multiple job types to a single worker, either as a performance/resource optimization, or as a way of enforcing ordering between specific classes of jobs. Co-authored-by: Oran Agra <oran@redislabs.com>	2023-03-14 20:26:21 +02:00
Binbin	416842e6c0	Fix the bug that CLIENT REPLY OFF\|SKIP cannot receive push notifications (#11875 ) This bug seems to have been there forever: CLIENT REPLY OFF\|SKIP will mark the client with CLIENT_REPLY_OFF or CLIENT_REPLY_SKIP flags. With these flags, prepareClientToWrite called by addReply* will return C_ERR directly. So the client can't receive the Pub/Sub messages and any other push notifications, e.g. client side tracking. In this PR, we add a CLIENT_PUSHING flag that disables the reply silencing flags. When adding push replies, set the flag, and after the reply, clear the flag. Then add the flag check in prepareClientToWrite. Fixes #11874 Note, the SUBSCRIBE command response is a bit awkward, see https://github.com/redis/redis-doc/pull/2327 Co-authored-by: Oran Agra <oran@redislabs.com>	2023-03-12 17:50:44 +02:00
guybe7	4ba47d2d21	Add reply_schema to command json files (internal for now) (#10273 ) Work in progress towards implementing a reply schema as part of COMMAND DOCS, see #9845 Since ironing the details of the reply schema of each and every command can take a long time, we would like to merge this PR when the infrastructure is ready, and let this mature in the unstable branch. Meanwhile the changes of this PR are internal, they are part of the repo, but do not affect the produced build. ### Background In #9656 we add a lot of information about Redis commands, but we are missing information about the replies ### Motivation 1. Documentation. This is the primary goal. 2. It should be possible, based on the output of COMMAND, to be able to generate client code in typed languages. In order to do that, we need Redis to tell us, in detail, what each reply looks like. 3. We would like to build a fuzzer that verifies the reply structure (for now we use the existing testsuite, see the "Testing" section) ### Schema The idea is to supply some sort of schema for the various replies of each command. The schema will describe the conceptual structure of the reply (for generated clients), as defined in RESP3. Note that the reply structure itself may change, depending on the arguments (e.g. `XINFO STREAM`, with and without the `FULL` modifier) We decided to use the standard json-schema (see https://json-schema.org/) as the reply-schema. Example for `BZPOPMIN`: ``` "reply_schema": { "oneOf": [ { "description": "Timeout reached and no elements were popped.", "type": "null" }, { "description": "The keyname, popped member, and its score.", "type": "array", "minItems": 3, "maxItems": 3, "items": [ { "description": "Keyname", "type": "string" }, { "description": "Member", "type": "string" }, { "description": "Score", "type": "number" } ] } ] } ``` #### Notes 1. 
It is ok that some commands' reply structure depends on the arguments and it's the caller's responsibility to know which is the relevant one. this comes after looking at other request-reply systems like OpenAPI, where the reply schema can also be oneOf and the caller is responsible to know which schema is the relevant one. 2. The reply schemas will describe RESP3 replies only. even though RESP3 is structured, we want to use reply schema for documentation (and possibly to create a fuzzer that validates the replies) 3. For documentation, the description field will include an explanation of the scenario in which the reply is sent, including any relation to arguments. for example, for `ZRANGE`'s two schemas we will need to state that one is with `WITHSCORES` and the other is without. 4. For documentation, there will be another optional field "notes" in which we will add a short description of the representation in RESP2, in case it's not trivial (RESP3's `ZRANGE`'s nested array vs. RESP2's flat array, for example) Given the above: 1. We can generate the "return" section of all commands in [redis-doc](https://redis.io/commands/) (given that "description" and "notes" are comprehensive enough) 2. We can generate a client in a strongly typed language (but the return type could be a conceptual `union` and the caller needs to know which schema is relevant). see the section below for RESP2 support. 3. We can create a fuzzer for RESP3. ### Limitations (because we are using the standard json-schema) The problem is that Redis' replies are more diverse than what the json format allows. This means that, when we convert the reply to a json (in order to validate the schema against it), we lose information (see the "Testing" section below). The other option would have been to extend the standard json-schema (and json format) to include stuff like sets, bulk-strings, error-string, etc. 
but that would mean also extending the schema-validator - and that seemed like too much work, so we decided to compromise. Examples: 1. We cannot tell the difference between an "array" and a "set" 2. We cannot tell the difference between simple-string and bulk-string 3. We cannot verify true uniqueness of items in commands like ZRANGE: json-schema doesn't cover the case of two identical members with different scores (e.g. `[["m1",6],["m1",7]]`) because `uniqueItems` compares (member,score) tuples and not just the member name. ### Testing This commit includes some changes inside Redis in order to verify the schemas (existing and future ones) are indeed correct (i.e. describe the actual response of Redis). To do that, we added a debugging feature to Redis that causes it to produce a log of all the commands it executed and their replies. For that, Redis needs to be compiled with `-DLOG_REQ_RES` and run with `--req-res-logfile <file> --client-default-resp 3` (the testsuite already does that if you run it with `--log-req-res --force-resp3`) You should run the testsuite with the above args (and `--dont-clean`) in order to make Redis generate `.reqres` files (same dir as the `stdout` files) which contain request-response pairs. These files are later on processed by `./utils/req-res-log-validator.py` which does: 1. Goes over req-res files, generated by redis-servers, spawned by the testsuite (see logreqres.c) 2. For each request-response pair, it validates the response against the request's reply_schema (obtained from the extended COMMAND DOCS) 3. In order to get good coverage of the Redis commands, and all their different replies, we chose to use the existing redis test suite, rather than attempt to write a fuzzer. #### Notes about RESP2 1. We will not be able to use the testing tool to verify RESP2 replies (we are ok with that, it's time to accept RESP3 as the future RESP) 2. 
Since the majority of the test suite is using RESP2, and we want the server to reply with RESP3 so that we can validate it, we will need to know how to convert the actual reply to the one expected. - number and boolean are always strings in RESP2 so the conversion is easy - objects (maps) are always a flat array in RESP2 - others (nested array in RESP3's `ZRANGE` and others) will need some special per-command handling (so the client will not be totally auto-generated) Example for ZRANGE: ``` "reply_schema": { "anyOf": [ { "description": "A list of member elements", "type": "array", "uniqueItems": true, "items": { "type": "string" } }, { "description": "Members and their scores. Returned in case `WITHSCORES` was used.", "notes": "In RESP2 this is returned as a flat array", "type": "array", "uniqueItems": true, "items": { "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "description": "Member", "type": "string" }, { "description": "Score", "type": "number" } ] } } ] } ``` ### Other changes 1. Some tests that behave differently depending on the RESP are now being tested for both RESP, regardless of the special log-req-res mode ("Pub/Sub PING" for example) 2. Update the history field of CLIENT LIST 3. Added basic tests for commands that were not covered at all by the testsuite ### TODO - [x] (maybe a different PR) add a "condition" field to anyOf/oneOf schemas that refers to args. e.g. when `SET` return NULL, the condition is `arguments.get\|\|arguments.condition`, for `OK` the condition is `!arguments.get`, and for `string` the condition is `arguments.get` - https://github.com/redis/redis/issues/11896 - [x] (maybe a different PR) also run `runtest-cluster` in the req-res logging mode - [x] add the new tests to GH actions (i.e. 
compile with `-DLOG_REQ_RES`, run the tests, and run the validator) - [x] (maybe a different PR) figure out a way to warn about (sub)schemas that are uncovered by the output of the tests - https://github.com/redis/redis/issues/11897 - [x] (probably a separate PR) add all missing schemas - [x] check why "SDOWN is triggered by misconfigured instance replying with errors" fails with --log-req-res - [x] move the response transformers to their own file (run both regular, cluster, and sentinel tests - need to fight with the tcl including mechanism a bit) - [x] issue: module API - https://github.com/redis/redis/issues/11898 - [x] (probably a separate PR): improve schemas: add `required` to `object`s - https://github.com/redis/redis/issues/11899 Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: Hanna Fadida <hanna.fadida@redislabs.com> Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Shaya Potter <shaya@redislabs.com>	2023-03-11 10:14:16 +02:00
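
To illustrate the reply-schema idea described in the #10273 commit message above, here is a minimal sketch that checks candidate replies against the `BZPOPMIN` schema quoted there. It is a hand-rolled checker for a small json-schema subset (`oneOf`, `type`, `minItems`, `maxItems`, `items`) and is not the actual `./utils/req-res-log-validator.py`; the names `type_matches` and `matches` are hypothetical, and the sample replies are invented for illustration.

```python
# Hand-rolled sketch of a tiny json-schema subset checker. This is an
# assumption-laden illustration, not Redis' real validator and not a full
# json-schema implementation.

# Schema copied from the BZPOPMIN example in the commit message.
BZPOPMIN_REPLY_SCHEMA = {
    "oneOf": [
        {"description": "Timeout reached and no elements were popped.",
         "type": "null"},
        {
            "description": "The keyname, popped member, and its score.",
            "type": "array",
            "minItems": 3,
            "maxItems": 3,
            "items": [
                {"description": "Keyname", "type": "string"},
                {"description": "Member", "type": "string"},
                {"description": "Score", "type": "number"},
            ],
        },
    ]
}


def type_matches(value, type_name):
    """Return True if a decoded reply value has the given schema type."""
    if type_name == "null":
        return value is None
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "array":
        return isinstance(value, list)
    raise ValueError(f"type not supported by this sketch: {type_name}")


def matches(value, schema):
    """Return True if value satisfies the (subset of) schema."""
    if "oneOf" in schema:
        # oneOf: exactly one alternative must match.
        return sum(matches(value, sub) for sub in schema["oneOf"]) == 1
    if "type" in schema and not type_matches(value, schema["type"]):
        return False
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            return False
        if "maxItems" in schema and len(value) > schema["maxItems"]:
            return False
        items = schema.get("items")
        if isinstance(items, list):
            # Positional form: the i-th element is checked against items[i].
            return all(matches(v, s) for v, s in zip(value, items))
        if isinstance(items, dict):
            # Uniform form: every element is checked against one schema.
            return all(matches(v, items) for v in value)
    return True


if __name__ == "__main__":
    for reply in (None, ["zset", "m1", 1.5], ["zset", "m1"], "OK"):
        print(repr(reply), matches(reply, BZPOPMIN_REPLY_SCHEMA))
```

Because a `oneOf` schema does not say which alternative applies to a given call, the checker only reports whether exactly one alternative matches; as the commit message notes, it is the caller's responsibility to know which one is relevant.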


1291 Commits