A single SPOP with command with count argument resulted in many SPOP
commands being propagated to the replica.
This is inefficient because the key name is repeated many times, and is also
being looked-up many times.
also it results in high QPS metrics on the replica.
To solve that, we flush batches of 1024 fields per SPOP command.
Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
For zsets that will eventually be stored as the skiplist encoding (has a dict),
we can convert it to skiplist ahead of time. This change checks the number
of arguments in the ZADD command, and converts the data-structure
if the number of new entries exceeds the listpack-max-entries configuration.
This can cause us to over-allocate memory if there are duplicate entries in the
input, which is unexpected.
For ZRANGESTORE, we know the size of the zset, so we can expand
the dict in advance, to avoid the temporary dict from being rehashed
while it grows.
Simple benchmarks shows it provides some 4% improvement in ZADD and 20% in ZRANGESTORE
Current tests for BITFIELD_RO command are skipped in the external mode,
and therefore reply-schemas-validator reports a coverage error.
This PR adds basic tests to increase coverage.
Extend SENTINEL CONFIG SET and SENTINEL CONFIG GET to be
compatible with variadic CONFIG SET and CONFIG GET and allow multiple
parameters to be modified in a single call atomically.
Co-authored-by: Oran Agra <oran@redislabs.com>
The measured latency(duration) includes the list below, which can be shown by `INFO STATS`.
```
eventloop_cycles // ever increasing counter
eventloop_duration_sum // cumulative duration of eventloop in microseconds
eventloop_duration_cmd_sum // cumulative duration of executing commands in microseconds
instantaneous_eventloop_cycles_per_sec // average eventloop count per second in recent 1.6s
instantaneous_eventloop_duration_usec // average single eventloop duration in recent 1.6s
```
Also added some experimental metrics, which are shown only when `INFO DEBUG` is called.
This section isn't included in the default INFO, or even in `INFO ALL` and the fields in this section
can change in the future without considering backwards compatibility.
```
eventloop_duration_aof_sum // cumulative duration of writing AOF
eventloop_duration_cron_sum // cumulative duration cron jobs (serverCron, beforeSleep excluding IO and AOF)
eventloop_cmd_per_cycle_max // max number of commands executed in one eventloop
eventloop_duration_max // max duration of one eventloop
```
All of these are being reset by CONFIG RESETSTAT
The test failed on MacOS:
```
*** [err]: EXPIRE precision is now the millisecond in tests/unit/expire.tcl
Expected 'somevalue {}' to equal or match '{} {}'
```
`set a [r get x]`, even though we tried 10 times, sometimes we
still get {}, this is a time-sensitive test.
In this PR, we add the following changes:
1. More attempts, change it from 10 to 30.
2. More tolerant, change the `after 900` to `after 800`.
In addition, we judging $a in advance and changing `after 1100`
to `after 300`, this will save us some times.
When `RM_ZsetAdd()`/`RM_ZsetIncrby()`/`RM_StreamAdd()` fails, if a new key happens to
be created using `moduleCreateEmptyKey()`, we should clean up the empty key.
## Test
1) Add new module commands(`zset.add` and `zset.incrby`) to cover `RM_ZsetAdd()`/`RM_ZsetIncrby()`.
2) Add a large-memory test to cover `RM_StreamAdd()`.
When master reboot from RDB, if rsi in RDB is valid we should not free replication backlog, even if master_repl_offset or repl-offset is 0.
Since if master doesn't send any data to replicas master_repl_offset is 0, it's a valid number.
A clear example:
1. start a master and apply some write commands, the master's master_repl_offset is 0 since it has no replicas.
2. stop write commands on master, and start another instance and replicaof the master, trigger an FULLRESYNC
3. the master's master_repl_offset is still 0 (set a large number for repl-ping-replica-period), do BGSAVE and restart the master
4. master load master_repl_offset from RDB's rsi and it's still 0, and we should make sure replica can partially resync with master.
Tests occasionally fail since #12000:
```
*** [err]: query buffer resized correctly when not idle in tests/unit/querybuf.tcl
Expected 0 > 32768 (context: type eval line 11 cmd {assert {$orig_test_client_qbuf > 32768}} proc ::test)
*** [err]: query buffer resized correctly with fat argv in tests/unit/querybuf.tcl
query buffer should not be resized when client idle time smaller than 2s
```
The reason may be because we set hz to 100, querybuf shrinks before we count
client_query_buffer. We avoid this problem by setting pause-cron to 1.
1. reset the readraw mode after a test that uses it. undetected since the
only test after that on the same server didn't read any replies.
2. fix a cross slot issue that was undetected in cluster mode because
readraw doesn't throw exceptions on errors.
Technically declaring a prototype with an empty declaration has been deprecated since the early days of C, but we never got a warning for it. C2x will apparently be introducing a breaking change if you are using this type of declarator, so Clang 15 has started issuing a warning with -pedantic. Although not apparently a problem for any of the compiler we build on, if feels like the right thing is to properly adhere to the C standard and use (void).
The change in #12018 break the CI (fixed by #12083).
There are quite a few sentinel commands that are missing both test coverage and also schema.
PR added reply-schema to the following commands:
- sentinel debug
- sentinel info-cache
- sentinel pendding-scripts
- sentinel reset
- sentinel simulate-failure
Added some very basic tests for other sentinel commands, just so that they have some coverage.
- sentinel help
- sentinel masters
- sentinel myid
- sentinel sentinels
- sentinel slaves
These tests should be improved / replaced in a followup PR.
Minor test case addition for DECR and DECRBY.
Currently DECR and DECRBY do not have test case coverage for the
scenarios where they run on a non-existing key.
In order to speed up tests, avoid saving an RDB (mostly notable on shutdown),
except for tests that explicitly test the RDB mechanism
In addition, use `shutdown-on-sigterm force` to prevetn shutdown from failing
in case the server is in the middle of the initial AOFRW
Also a a test that checks that the `shutdown-on-sigterm default` is to refuse
shutdown if there's an initial AOFRW
Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
This PR is to fix the compilation warnings and errors generated by the latest
complier toolchain, and to add a new runner of the latest toolchain for daily CI.
## Fix various compilation warnings and errors
1) jemalloc.c
COMPILER: clang-14 with FORTIFY_SOURCE
WARNING:
```
src/jemalloc.c:1028:7: warning: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Wstring-concatenation]
"/etc/malloc.conf",
^
src/jemalloc.c:1027:3: note: place parentheses around the string literal to silence warning
"\"name\" of the file referenced by the symbolic link named "
^
```
REASON: the compiler to alert developers to potential issues with string concatenation
that may miss a comma,
just like #9534 which misses a comma.
SOLUTION: use `()` to tell the compiler that these two line strings are continuous.
2) config.h
COMPILER: clang-14 with FORTIFY_SOURCE
WARNING:
```
In file included from quicklist.c:36:
./config.h:319:76: warning: attribute declaration must precede definition [-Wignored-attributes]
char *strcat(char *restrict dest, const char *restrict src) __attribute__((deprecated("please avoid use of unsafe C functions. prefer use of redis_strlcat instead")));
```
REASON: Enabling _FORTIFY_SOURCE will cause the compiler to use `strcpy()` with check,
it results in a deprecated attribute declaration after including <features.h>.
SOLUTION: move the deprecated attribute declaration from config.h to fmacro.h before "#include <features.h>".
3) networking.c
COMPILER: GCC-12
WARNING:
```
networking.c: In function ‘addReplyDouble.part.0’:
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
876 | dbuf[start] = '$';
| ^
networking.c:868:14: note: at offset -5 into destination object ‘dbuf’ of size 5152
868 | char dbuf[MAX_LONG_DOUBLE_CHARS+32];
| ^
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
876 | dbuf[start] = '$';
| ^
networking.c:868:14: note: at offset -6 into destination object ‘dbuf’ of size 5152
868 | char dbuf[MAX_LONG_DOUBLE_CHARS+32];
```
REASON: GCC-12 predicts that digits10() may return 9 or 10 through `return 9 + (v >= 1000000000UL)`.
SOLUTION: add an assert to let the compiler know the possible length;
4) redis-cli.c & redis-benchmark.c
COMPILER: clang-14 with FORTIFY_SOURCE
WARNING:
```
redis-benchmark.c:1621:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
redis-cli.c:3015:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
```
REASON: when _FORTIFY_SOURCE is enabled, the compiler will use the print() with
check, which is a macro. this may result in the use of directives within the macro, which
is undefined behavior.
SOLUTION: move the directives-related code out of `print()`.
5) server.c
COMPILER: gcc-13 with FORTIFY_SOURCE
WARNING:
```
In function 'lookupCommandLogic',
inlined from 'lookupCommandBySdsLogic' at server.c:3139:32:
server.c:3102:66: error: '*(robj **)argv' may be used uninitialized [-Werror=maybe-uninitialized]
3102 | struct redisCommand *base_cmd = dictFetchValue(commands, argv[0]->ptr);
| ~~~~^~~
```
REASON: The compiler thinks that the `argc` returned by `sdssplitlen()` could be 0,
resulting in an empty array of size 0 being passed to lookupCommandLogic.
this should be a false positive, `argc` can't be 0 when strings are not NULL.
SOLUTION: add an assert to let the compiler know that `argc` is positive.
6) sha1.c
COMPILER: gcc-12
WARNING:
```
In function ‘SHA1Update’,
inlined from ‘SHA1Final’ at sha1.c:195:5:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
152 | SHA1Transform(context->state, &data[i]);
| ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
| ^
In function ‘SHA1Update’,
inlined from ‘SHA1Final’ at sha1.c:198:9:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
152 | SHA1Transform(context->state, &data[i]);
| ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
```
REASON: due to the bug[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80922], when
enable LTO, gcc-12 will not see `diagnostic ignored "-Wstringop-overread"`, resulting in a warning.
SOLUTION: temporarily set SHA1Update to noinline to avoid compiler warnings due
to LTO being enabled until the above gcc bug is fixed.
7) zmalloc.h
COMPILER: GCC-12
WARNING:
```
In function ‘memset’,
inlined from ‘moduleCreateContext’ at module.c:877:5,
inlined from ‘RM_GetDetachedThreadSafeContext’ at module.c:8410:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:59:10: warning: ‘__builtin_memset’ writing 104 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
59 | return __builtin___memset_chk (__dest, __ch, __len,
```
REASON: due to the GCC-12 bug [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503],
GCC-12 cannot see alloc_size, which causes GCC to think that the actual size of memory
is 0 when checking with __glibc_objsize0().
SOLUTION: temporarily set malloc-related interfaces to `noinline` to avoid compiler warnings
due to LTO being enabled until the above gcc bug is fixed.
## Other changes
1) Fixed `ps -p [pid]` doesn't output `<defunct>` when using procps 4.x causing `replication
child dies when parent is killed - diskless` test to fail.
2) Add a new fortify CI with GCC-13 and ubuntu-lunar docker image.
Minor test case addition.
Currently GETRANGE command does not have the test case coverage for the scenarios:
An error is returned when key exists but of different type
Added missing test cases for getrange command.
this pr fix two wrongs:
1. When client’s querybuf is pre-allocated for a fat argv, we need to update the
querybuf_peak of the client immediately to completely avoid the unexpected
shrinking of querybuf in the next clientCron (before data arrives to set the peak).
2. the protocol's bulklen does not include `\r\n`, but the allocation and the data we
read does. so in `clientsCronResizeQueryBuffer`, the `resize` or `querybuf_peak`
should add these 2 bytes.
the first bug is likely to hit us on large payloads over slow connections, in which case
transferring the payload can take longer and a cron event will be triggered (specifically
if there are not a lot of clients)
There is are some missing test cases for SUBSTR command.
These might already be covered by GETRANGE, but no harm in adding them since they're simple.
Added 3 test case.
* start > stop
* start and stop both greater than string length
* when no key is present.
Currently LINSERT command does not have the test case coverage for following scenarios.
1. When key does not exist, it is considered an empty list and no operation is performed.
2. An error is returned when key exists but does not hold a list value.
Added above two missing test cases for linsert command.
* Add RM_ReplyWithErrorFormat that can support format
Reply with the error create from a printf format and arguments.
If the error code is already passed in the string 'fmt', the error
code provided is used, otherwise the string "-ERR " for the generic
error code is automatically added.
The usage is, for example:
RedisModule_ReplyWithErrorFormat(ctx, "An error: %s", "foo");
RedisModule_ReplyWithErrorFormat(ctx, "-WRONGTYPE Wrong Type: %s", "foo");
The function always returns REDISMODULE_OK.
The MacOS CI in github actions often hangs without any logs. GH argues that
it's due to resource utilization, either running out of disk space, memory, or CPU
starvation, and thus the runner is terminated.
This PR contains multiple attempts to resolve this:
1. introducing pause_process instead of SIGSTOP, which waits for the process
to stop before resuming the test, possibly resolving race conditions in some tests,
this was a suspect since there was one test that could result in an infinite loop in that
case, in practice this didn't help, but still a good idea to keep.
2. disable the `save` config in many tests that don't need it, specifically ones that use
heavy writes and could create large files.
3. change the `populate` proc to use short pipeline rather than an infinite one.
4. use `--clients 1` in the macos CI so that we don't risk running multiple resource
demanding tests in parallel.
5. enable `--verbose` to be repeated to elevate verbosity and print more info to stdout
when a test or a server starts.
We do have ZREMRANGEBYLEX tests, but it is a stress test
marked with slow tag and then skipped in reply-schemas daily.
In the past, we were able to succeed on a daily, i guess
it was because there were some random command executions,
such as corrupt-dump-fuzzy, which might call it.
These test examples are taken from ZRANGEBYLEX basics test.
Add `RM_RdbLoad()` and `RM_RdbSave()` to load/save RDB files from the module API.
In our use case, we have our clustering implementation as a module. As part of this
implementation, the module needs to trigger RDB save operation at specific points.
Also, this module delivers RDB files to other nodes (not using Redis' replication).
When a node receives an RDB file, it should be able to load the RDB. Currently,
there is no module API to save/load RDB files.
This PR adds four new APIs:
```c
RedisModuleRdbStream *RM_RdbStreamCreateFromFile(const char *filename);
void RM_RdbStreamFree(RedisModuleRdbStream *stream);
int RM_RdbLoad(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags);
int RM_RdbSave(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags);
```
The first step is to create a `RedisModuleRdbStream` object. This PR provides a function to
create RedisModuleRdbStream from the filename. (You can load/save RDB with the filename).
In the future, this API can be extended if needed:
e.g., `RM_RdbStreamCreateFromFd()`, `RM_RdbStreamCreateFromSocket()` to save/load
RDB from an `fd` or a `socket`.
Usage:
```c
/* Save RDB */
RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb");
RedisModule_RdbSave(ctx, stream, 0);
RedisModule_RdbStreamFree(stream);
/* Load RDB */
RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb");
RedisModule_RdbLoad(ctx, stream, 0);
RedisModule_RdbStreamFree(stream);
```
The existing logic for killing pub-sub clients did not handle the `allchannels`
permission correctly. For example, if you:
ACL SETUSER foo allchannels
Have a client authenticate as the user `foo` and subscribe to a channel, and then:
ACL SETUSER foo resetchannels
The subscribed client would not be disconnected, though new clients under that user
would be blocked from subscribing to any channels.
This was caused by an incomplete optimization in `ACLKillPubsubClientsIfNeeded`
checking whether the new channel permissions were a strict superset of the old ones.
Now that the command argument specs are available at runtime (#9656), this PR addresses
#8084 by implementing a complete solution for command-line hinting in `redis-cli`.
It correctly handles nearly every case in Redis's complex command argument definitions, including
`BLOCK` and `ONEOF` arguments, reordering of optional arguments, and repeated arguments
(even when followed by mandatory arguments). It also validates numerically-typed arguments.
It may not correctly handle all possible combinations of those, but overall it is quite robust.
Arguments are only matched after the space bar is typed, so partial word matching is not
supported - that proved to be more confusing than helpful. When the user's current input
cannot be matched against the argument specs, hinting is disabled.
Partial support has been implemented for legacy (pre-7.0) servers that do not support
`COMMAND DOCS`, by falling back to a statically-compiled command argument table.
On startup, if the server does not support `COMMAND DOCS`, `redis-cli` will now issue
an `INFO SERVER` command to retrieve the server version (unless `HELLO` has already
been sent, in which case the server version will be extracted from the reply to `HELLO`).
The server version will be used to filter the commands and arguments in the command table,
removing those not supported by that version of the server. However, the static table only
includes core Redis commands, so with a legacy server hinting will not be supported for
module commands. The auto generated help.h and the scripts that generates it are gone.
Command and argument tables for the server and CLI use different structs, due primarily
to the need to support different runtime data. In order to generate code for both, macros
have been added to `commands.def` (previously `commands.c`) to make it possible to
configure the code generation differently for different use cases (one linked with redis-server,
and one with redis-cli).
Also adding a basic testing framework for the command hints based on new (undocumented)
command line options to `redis-cli`: `--test_hint 'INPUT'` prints out the command-line hint for
a given input string, and `--test_hint_file <filename>` runs a suite of test cases for the hinting
mechanism. The test suite is in `tests/assets/test_cli_hint_suite.txt`, and it is run from
`tests/integration/redis-cli.tcl`.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
This is an attempt to normalize/formalize command summaries.
Main actions performed:
* Starts with the continuation of the phrase "The XXXX command, when called, ..." for user commands.
* Starts with "An internal command...", "A container command...", etc... when applicable.
* Always uses periods.
* Refrains from referring to other commands. If this is needed, backquotes should be used for command names.
* Tries to be very clear about the data type when applicable.
* Tries to mention additional effects, e.g. "The key is created if it doesn't exist" and "The set is deleted if the last member is removed."
* Prefers being terse over verbose.
* Tries to be consistent.
This PR fix several unrelated bugs that were discovered by the same set of tests
(WAITAOF tests in #11713), could make the `WAITAOF` test hang.
The change in `backgroundRewriteDoneHandler` is about MP-AOF.
That leftover / old code assumes that we started a new AOF file just now
(when we have a new base into which we're gonna incrementally write), but
the fact is that with MP-AOF, the fork done handler doesn't really affect the
incremental file being maintained by the parent process, there's no reason to
re-issue `SELECT`, and no reason to update any of the fsync variables in that flow.
This should have been deleted with MP-AOF (introduced in #9788, 7.0).
The damage is that the update to `aof_fsync_offset` will cause us to miss an fsync
in `flushAppendOnlyFile`, that happens if we stop write commands in `AOF_FSYNC_EVERYSEC`
while an AOFRW is in progress. This caused a new `WAITAOF` test to sometime hang forever.
Also because of MP-AOF, we needed to change `aof_fsync_offset` to `aof_last_incr_fsync_offset`
and match it to `aof_last_incr_size` in `flushAppendOnlyFile`. This is because in the past we compared
`aof_fsync_offset` and `aof_current_size`, but with MP-AOF it could be the total AOF file will be
smaller after AOFRW, and the (already existing) incr file still has data that needs to be fsynced.
The change in `flushAppendOnlyFile`, about the `AOF_FSYNC_ALWAYS`, it is follow #6053
(the details is in #5985), we also check `AOF_FSYNC_ALWAYS` to handle a case where
appendfsync is changed from everysec to always while there is data that's written but not yet fsynced.
This test fails sporadically:
```
*** [err]: Migrate the last slot away from a node using redis-cli in tests/unit/cluster/cli.tcl
cluster size did not reach a consistent size 4
```
I guess the time (5s) of wait_for_cluster_size is not enough,
usually, the waiting time for our other tests for cluster
consistency is 50s, so also changing it to 50s.
Since we remove the COMMAND COUNT call in sentinel test in #11950,
reply-schemas-validator started reporting this error:
```
WARNING! The following commands were not hit at all:
command|count
ERROR! at least one command was not hit by the tests
```
This PR add a COMMAND COUNT test to cover it and also fix some
typos in req-res-log-validator.py
The sanity check test intention was to detect that when a command is
added to sentinel it is on purpose. This test is easily broken, like
CLIENT SETINFO introduced by #11758.
We replace it with a test that validates that a few specific commands
are either there or missing (to test the infrastructure works correctly).
This PR allows clients to send information about the client library to redis
to be displayed in CLIENT LIST and CLIENT INFO.
Currently supports:
`CLIENT [lib-name | lib-ver] <value>`
Client libraries are expected to pipeline these right after AUTH, and ignore
the failure in case they're talking to an older version of redis.
These will be shown in CLIENT LIST and CLIENT INFO as:
* `lib-name` - meant to hold the client library name.
* `lib-ver` - meant to hold the client library version.
The values cannot contain spaces, newlines and any wild ASCII characters,
but all other normal chars are accepted, e.g `.`, `=` etc (same as CLIENT NAME).
The RESET command does NOT clear these, but they can be cleared to the
default by sending a command with a blank string.
Co-authored-by: Oran Agra <oran@redislabs.com>
This allows modules to register commands to existing ACL categories and blocks the creation of [sub]commands, datatypes and registering the configs outside of the OnLoad function.
For allowing modules to register commands to existing ACL categories,
This PR implements a new API int RM_SetCommandACLCategories() which takes a pointer to a RedisModuleCommand and a C string aclflags containing the set of space separated ACL categories.
Example, 'write slow' marks the command as part of the write and slow ACL categories.
The C string aclflags is tokenized by implementing a helper function categoryFlagsFromString(). Theses tokens are matched and the corresponding ACL categories flags are set by a helper function matchAclCategoriesFlags. The helper function categoryFlagsFromString() returns the corresponding categories_flags or returns -1 if some token not processed correctly.
If the module contains commands which are registered to existing ACL categories, the number of [sub]commands are tracked by num_commands_with_acl_categories in struct RedisModule. Further, the allowed command bit-map of the existing users are recomputed from the command_rules list, by implementing a function called ACLRecomputeCommandBitsFromCommandRulesAllUsers() for the existing users to have access to the module commands on runtime.
## Breaking change
This change requires that registering commands and subcommands only occur during a modules "OnLoad" function, in order to allow efficient recompilation of ACL bits. We also chose to block registering configs and types, since we believe it's only valid for those to be created during onLoad. We check for this onload flag in struct RedisModule to check if the call is made from the OnLoad function.
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
I saw this error once, in the FreeBSD Daily CI:
```
*** [err]: Temp rdb will be deleted if we use bg_unlink when shutdown in tests/unit/shutdown.tcl
Expected [file exists /xxx/temp-10336.rdb] (context: type eval line 15 cmd {assert {[file exists $temp_rdb]}} proc ::test)
```
The log shows that bgsave was executed, and it was successfully executed in the end:
```
Starting test Temp rdb will be deleted if we use bg_unlink when shutdown in tests/unit/shutdown.tcl
10251:M 22 Feb 2023 11:37:25.441 * Background saving started by pid 10336
10336:C 22 Feb 2023 11:37:27.949 * DB saved on disk
10336:C 22 Feb 2023 11:37:27.949 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
10251:M 22 Feb 2023 11:37:28.060 * Background saving terminated with success
```
There may be two reasons:
1. The child process has been created, but it has not created
the temp rdb file yet, so [file exists $temp_rdb] check failed.
2. The child process bgsave has been executed successfully and the
temp file has been deleted, so [file exists $temp_rdb] check failed.
From the logs pint, it should be the case 2, case 1 is too extreme,
set rdb-key-save-delay to a higher value to ensure bgsave does not
succeed early to avoid this case.
The reason is in reply-schemas-validator, the resp of the
client we create will be client_default_resp (currently 3):
```
client *createClient(connection *conn) {
client *c = zmalloc(sizeof(client));
#ifdef LOG_REQ_RES
reqresReset(c, 0);
c->resp = server.client_default_resp;
#else
c->resp = 2;
#endif
}
```
But current_resp3 in redis-cli will be inconsistent with it,
the test adds a simple hello 3 to avoid this failure, test
was added in #11873.
Added help descriptions for dont-pre-clean option, it was
added in #10273
Previously we would run the module command filters even upon blocked
command reprocessing. This could modify the command, and it's args.
This is irrelevant in the context of a command being reprocessed (it already
went through the filters), as well as breaks the crashed command lookup
that exists in the case of a reprocessed command.
fixes#11894.
Co-authored-by: Oran Agra <oran@redislabs.com>
The message "Reading messages... (press Ctrl-C to quit)" is replaced by
"Reading messages... (press Ctrl-C to quit or any key to type command)".
This allows users to subscribe to more channels, to try out UNSUBSCRIBE and to
combine pubsub with other features such as push messages from client tracking.
The "Reading messages" info message is displayed in the bottom of the output in a
distinct style and moves downward as more messages appear. When any key is pressed,
the info message is replaced by the prompt with for entering commands.
After entering a command and the reply is displayed, the "Reading messages" info
messages appears again. This is added to the repl loop in redis-cli and in the
corresponding place for non-interactive mode.
An indication "(subscribed mode)" is included in the prompt when entering commands
in subscribed mode.
Also:
* Fixes a problem that UNSUBSCRIBE hanged when used with RESP3 and push callback,
without first entering subscribe mode. It hanged because UNSUBSCRIBE gets one or
more push replies but no in-band reply.
* Exit subscribed mode after RESET.
Allow running blocking commands from within a module using `RM_Call`.
Today, when `RM_Call` is used, the fake client that is used to run command
is marked with `CLIENT_DENY_BLOCKING` flag. This flag tells the command
that it is not allowed to block the client and in case it needs to block, it must
fallback to some alternative (either return error or perform some default behavior).
For example, `BLPOP` fallback to simple `LPOP` if it is not allowed to block.
All the commands must respect the `CLIENT_DENY_BLOCKING` flag (including
module commands). When the command invocation finished, Redis asserts that
the client was not blocked.
This PR introduces the ability to call blocking command using `RM_Call` by
passing a callback that will be called when the client will get unblocked.
In order to do that, the user must explicitly say that he allow to perform blocking
command by passing a new format specifier argument, `K`, to the `RM_Call`
function. This new flag will tell Redis that it is allow to run blocking command
and block the client. In case the command got blocked, Redis will return a new
type of call reply (`REDISMODULE_REPLY_PROMISE`). This call reply indicates
that the command got blocked and the user can set the on_unblocked handler using
`RM_CallReplyPromiseSetUnblockHandler`.
When clients gets unblocked, it eventually reaches `processUnblockedClients` function.
This is where we check if the client is a fake module client and if it is, we call the unblock
callback instead of performing the usual unblock operations.
**Notice**: `RM_CallReplyPromiseSetUnblockHandler` must be called atomically
along side the command invocation (without releasing the Redis lock in between).
In addition, unlike other CallReply types, the promise call reply must be released
by the module when the Redis GIL is acquired.
The module can abort the execution on the blocking command (if it was not yet
executed) using `RM_CallReplyPromiseAbort`. the API will return `REDISMODULE_OK`
on success and `REDISMODULE_ERR` if the operation is already executed.
**Notice** that in case of misbehave module, Abort might finished successfully but the
operation will not really be aborted. This can only happened if the module do not respect
the disconnect callback of the blocked client.
For pure Redis commands this can not happened.
### Atomicity Guarantees
The API promise that the unblock handler will run atomically as an execution unit.
This means that all the operation performed on the unblock handler will be wrapped
with a multi exec transaction when replicated to the replica and AOF.
The API **do not** grantee any other atomicity properties such as when the unblock
handler will be called. This gives us the flexibility to strengthen the grantees (or not)
in the future if we will decide that we need a better guarantees.
That said, the implementation **does** provide a better guarantees when performing
pure Redis blocking command like `BLPOP`. In this case the unblock handler will run
atomically with the operation that got unblocked (for example, in case of `BLPOP`, the
unblock handler will run atomically with the `LPOP` operation that run when the command
got unblocked). This is an implementation detail that might be change in the future and the
module writer should not count on that.
### Calling blocking commands while running on script mode (`S`)
`RM_Call` script mode (`S`) was introduced on #0372. It is used for usecases where the
command that was invoked on `RM_Call` comes from a user input and we want to make
sure the user will not run dangerous commands like `shutdown`. Some command, such
as `BLPOP`, are marked with `NO_SCRIPT` flag, which means they will not be allowed on
script mode. Those commands are marked with `NO_SCRIPT` just because they are
blocking commands and not because they are dangerous. Now that we can run blocking
commands on RM_Call, there is no real reason not to allow such commands on script mode.
The underline problem is that the `NO_SCRIPT` flag is abused to also mark some of the
blocking commands (notice that those commands know not to block the client if it is not
allowed to do so, and have a fallback logic to such cases. So even if those commands
were not marked with `NO_SCRIPT` flag, it would not harm Redis, and today we can
already run those commands within multi exec).
In addition, not all blocking commands are marked with `NO_SCRIPT` flag, for example
`blmpop` are not marked and can run from within a script.
Those facts shows that there are some ambiguity about the meaning of the `NO_SCRIPT`
flag, and its not fully clear where it should be use.
The PR suggest that blocking commands should not be marked with `NO_SCRIPT` flag,
those commands should handle `CLIENT_DENY_BLOCKING` flag and only block when
it's safe (like they already does today). To achieve that, the PR removes the `NO_SCRIPT`
flag from the following commands:
* `blmove`
* `blpop`
* `brpop`
* `brpoplpush`
* `bzpopmax`
* `bzpopmin`
* `wait`
This might be considered a breaking change as now, on scripts, instead of getting
`command is not allowed from script` error, the user will get some fallback behavior
base on the command implementation. That said, the change matches the behavior
of scripts and multi exec with respect to those commands and allow running them on
`RM_Call` even when script mode is used.
### Additional RedisModule API and changes
* `RM_BlockClientSetPrivateData` - Set private data on the blocked client without the
need to unblock the client. This allows up to set the promise CallReply as the private
data of the blocked client and abort it if the client gets disconnected.
* `RM_BlockClientGetPrivateData` - Return the current private data set on a blocked client.
We need it so we will have access to this private data on the disconnect callback.
* On RM_Call, the returned reply will be added to the auto memory context only if auto
memory is enabled, this allows us to keep the call reply for longer time then the context
lifetime and does not force an unneeded borrow relationship between the CallReply and
the RedisModuleContext.
There is a -Wimplicit-function-declaration warning in here:
```
auth.c: In function ‘AuthBlock_ThreadMain’:
auth.c:116:5: warning: implicit declaration of function ‘usleep’; did you mean ‘sleep’? [-Wimplicit-function-declaration]
116 | usleep(500000);
| ^~~~~~
| sleep
```