# Short description
The Redis extended latency stats track per command latencies and enables:
- exporting the per-command percentile distribution via the `INFO LATENCYSTATS` command.
**( percentile distribution is not mergeable between cluster nodes ).**
- exporting the per-command cumulative latency distributions via the `LATENCY HISTOGRAM` command.
Using the cumulative distribution of latencies we can merge several stats from different cluster nodes
to calculate aggregate metrics .
By default, the extended latency monitoring is enabled since the overhead of keeping track of the
command latency is very small.
If you don't want to track extended latency metrics, you can easily disable it at runtime using the command:
- `CONFIG SET latency-tracking no`
By default, the exported latency percentiles are the p50, p99, and p999.
You can alter them at runtime using the command:
- `CONFIG SET latency-tracking-info-percentiles "0.0 50.0 100.0"`
## Some details:
- The total size per histogram should sit around 40 KiB. We only allocate those 40KiB when a command
was called for the first time.
- With regards to the WRITE overhead As seen below, there is no measurable overhead on the achievable
ops/sec or full latency spectrum on the client. Including also the measured redis-benchmark for unstable
vs this branch.
- We track from 1 nanosecond to 1 second ( everything above 1 second is considered +Inf )
## `INFO LATENCYSTATS` exposition format
- Format: `latency_percentiles_usec_<CMDNAME>:p0=XX,p50....`
## `LATENCY HISTOGRAM [command ...]` exposition format
Return a cumulative distribution of latencies in the format of a histogram for the specified command names.
The histogram is composed of a map of time buckets:
- Each representing a latency range, between 1 nanosecond and roughly 1 second.
- Each bucket covers twice the previous bucket's range.
- Empty buckets are not printed.
- Everything above 1 sec is considered +Inf.
- At max there will be log2(1000000000)=30 buckets
We reply a map for each command in the format:
`<command name> : { `calls`: <total command calls> , `histogram` : { <bucket 1> : latency , < bucket 2> : latency, ... } }`
Co-authored-by: Oran Agra <oran@redislabs.com>
This would mean that the effects of `CONFIG SET maxmemory` may not be visible once the command returns.
That could anyway happen since incremental eviction was added in redis 6.2 (see #7653)
We do this to fix one of the propagation bugs about eviction see #9890 and #10014.
Preventing COFIG SET maxmemory from propagating is just the tip of the iceberg.
Module that performs a write operation in a notification can cause any
command to be propagated, based on server.dirty
We need to come up with a better solution.
It turns out that libc malloc can return an allocation of a different size on requests of the same size.
this means that matching MEMORY USAGE of one key to another copy of the same data can fail.
Solution:
Keep running the test that calls MEMORY USAGE, but ignore the response.
We do that by introducing a new utility function to get the memory usage, which always returns 1
when the allocator is not jemalloc.
Other changes:
Some formatting for datatype2.tcl
Co-authored-by: Oran Agra <oran@redislabs.com>
Following #9656, this script generates a "commands.json" file from the output
of the new COMMAND. The output of this script is used in redis/redis-doc#1714
and by redis/redis-io#259. This also converts a couple of rogue dashes (in
'key-specs' and 'multiple-token' flags) to underscores (continues #9959).
There's a race between testing DBSIZE and the thread starting.
If the thread hadn't started by the time we checked DBISZE, no
keys will have been evicted.
The correct way is to check the evicted_keys stat.
The mess:
Some parts use alsoPropagate for late propagation, others using an immediate one (propagate()),
causing edge cases, ugly/hacky code, and the tendency for bugs
The basic idea is that all commands are propagated via alsoPropagate (i.e. added to a list) and the
top-most call() is responsible for going over that list and actually propagating them (and wrapping
them in MULTI/EXEC if there's more than one command). This is done in the new function,
propagatePendingCommands.
Callers to propagatePendingCommands:
1. top-most call() (we want all nested call()s to add to the also_propagate array and just the top-most
one to propagate them) - via `afterCommand`
2. handleClientsBlockedOnKeys: it is out of call() context and it may propagate stuff - via `afterCommand`.
3. handleClientsBlockedOnKeys edge case: if the looked-up key is already expired, we will propagate the
expire but will not unblock any client so `afterCommand` isn't called. in that case, we have to propagate
the deletion explicitly.
4. cron stuff: active-expire and eviction may also propagate stuff
5. modules: the module API allows to propagate stuff from just about anywhere (timers, keyspace notifications,
threads). I could have tried to catch all the out-of-call-context places but it seemed easier to handle it in one
place: when we free the context. in the spirit of what was done in call(), only the top-most freeing of a module
context may cause propagation.
6. modules: when using a thread-safe ctx it's not clear when/if the ctx will be freed. we do know that the module
must lock the GIL before calling RM_Replicate/RM_Call so we propagate the pending commands when
releasing the GIL.
A "known limitation", which were actually a bug, was fixed because of this commit (see propagate.tcl):
When using a mix of RM_Call with `!` and RM_Replicate, the command would propagate out-of-order:
first all the commands from RM_Call, and then the ones from RM_Replicate
Another thing worth mentioning is that if, in the past, a client would issue a MULTI/EXEC with just one
write command the server would blindly propagate the MULTI/EXEC too, even though it's redundant.
not anymore.
This commit renames propagate() to propagateNow() in order to cause conflicts in pending PRs.
propagatePendingCommands is the only caller of propagateNow, which is now a static, internal helper function.
Optimizations:
1. alsoPropagate will not add stuff to also_propagate if there's no AOF and replicas
2. alsoPropagate reallocs also_propagagte exponentially, to save calls to memmove
Bugfixes:
1. CONFIG SET can create evictions, sending notifications which can cause to dirty++ with modules.
we need to prevent it from propagating to AOF/replicas
2. We need to set current_client in RM_Call. buggy scenario:
- CONFIG SET maxmemory, eviction notifications, module hook calls RM_Call
- assertion in lookupKey crashes, because current_client has CONFIG SET, which isn't CMD_WRITE
3. minor: in eviction, call propagateDeletion after notification, like active-expire and all commands
(we always send a notification before propagating the command)
- add needs:debug flag for some tests
- disable "save" in external tests (speedup?)
- use debug_digest proc instead of debug command directly so it can be skipped
- use OBJECT ENCODING instead of DEBUG OBJECT to get encoding
- add a proc for OBJECT REFCOUNT so it can be skipped
- move a bunch of tests in latency_monitor tests to happen later so that latency monitor has some values in it
- add missing close_replication_stream calls
- make sure to close the temp client if DEBUG LOG fails
Block sensitive configs and commands by default.
* `enable-protected-configs` - block modification of configs with the new `PROTECTED_CONFIG` flag.
Currently we add this flag to `dbfilename`, and `dir` configs,
all of which are non-mutable configs that can set a file redis will write to.
* `enable-debug-command` - block the `DEBUG` command
* `enable-module-command` - block the `MODULE` command
These have a default value set to `no`, so that these features are not
exposed by default to client connections, and can only be set by modifying the config file.
Users can change each of these to either `yes` (allow all access), or `local` (allow access from
local TCP connections and unix domain connections)
Note that this is a **breaking change** (specifically the part about MODULE command being disabled by default).
I.e. we don't consider DEBUG command being blocked as an issue (people shouldn't have been using it),
and the few configs we protected are unlikely to have been set at runtime anyway.
On the other hand, it's likely to assume some users who use modules, load them from the config file anyway.
Note that's the whole point of this PR, for redis to be more secure by default and reduce the attack surface on
innocent users, so secure defaults will necessarily mean a breaking change.
some languages can build a json-like object by parsing a textual json,
but it works poorly when attributes contain hyphens
example in JS:
```
let j = JSON.parse(json)
j['key-name'] <- works
j.key-name <= illegal syntax
```
Delete the hardcoded command table and replace it with an auto-generated table, based
on a JSON file that describes the commands (each command must have a JSON file).
These JSON files are the SSOT of everything there is to know about Redis commands,
and it is reflected fully in COMMAND INFO.
These JSON files are used to generate commands.c (using a python script), which is then
committed to the repo and compiled.
The purpose is:
* Clients and proxies will be able to get much more info from redis, instead of relying on hard coded logic.
* drop the dependency between Redis-user and the commands.json in redis-doc.
* delete help.h and have redis-cli learn everything it needs to know just by issuing COMMAND (will be
done in a separate PR)
* redis.io should stop using commands.json and learn everything from Redis (ultimately one of the release
artifacts should be a large JSON, containing all the information about all of the commands, which will be
generated from COMMAND's reply)
* the byproduct of this is:
* module commands will be able to provide that info and possibly be more of a first-class citizens
* in theory, one may be able to generate a redis client library for a strictly typed language, by using this info.
### Interface changes
#### COMMAND INFO's reply change (and arg-less COMMAND)
Before this commit the reply at index 7 contained the key-specs list
and reply at index 8 contained the sub-commands list (Both unreleased).
Now, reply at index 7 is a map of:
- summary - short command description
- since - debut version
- group - command group
- complexity - complexity string
- doc-flags - flags used for documentation (e.g. "deprecated")
- deprecated-since - if deprecated, from which version?
- replaced-by - if deprecated, which command replaced it?
- history - a list of (version, what-changed) tuples
- hints - a list of strings, meant to provide hints for clients/proxies. see https://github.com/redis/redis/issues/9876
- arguments - an array of arguments. each element is a map, with the possibility of nesting (sub-arguments)
- key-specs - an array of keys specs (already in unstable, just changed location)
- subcommands - a list of sub-commands (already in unstable, just changed location)
- reply-schema - will be added in the future (see https://github.com/redis/redis/issues/9845)
more details on these can be found in https://github.com/redis/redis-doc/pull/1697
only the first three fields are mandatory
#### API changes (unreleased API obviously)
now they take RedisModuleCommand opaque pointer instead of looking up the command by name
- RM_CreateSubcommand
- RM_AddCommandKeySpec
- RM_SetCommandKeySpecBeginSearchIndex
- RM_SetCommandKeySpecBeginSearchKeyword
- RM_SetCommandKeySpecFindKeysRange
- RM_SetCommandKeySpecFindKeysKeynum
Currently, we did not add module API to provide additional information about their commands because
we couldn't agree on how the API should look like, see https://github.com/redis/redis/issues/9944.
### Somehow related changes
1. Literals should be in uppercase while placeholder in lowercase. Now all the GEO* command
will be documented with M|KM|FT|MI and can take both lowercase and uppercase
### Unrelated changes
1. Bugfix: no_madaory_keys was absent in COMMAND's reply
2. expose CMD_MODULE as "module" via COMMAND
3. have a dedicated uint64 for ACL categories (instead of having them in the same uint64 as command flags)
Co-authored-by: Itamar Haber <itamar@garantiadata.com>
In #9323, when `repl-diskless-load` is enabled and set to `swapdb`,
if the master replication ID hasn't changed, we can load data-set
asynchronously, and serving read commands during the full resync.
In `diskless loading short read` test, after a loading successfully,
we will wait for the loading to stop and continue the for loop.
After the introduction of `async_loading`, we also need to check it.
Otherwise the next loop will start too soon, may trigger a timing issue.
Some people complain that QUIT is missing from help/command table.
Not appearing in COMMAND command, command stats, ACL, etc.
and instead, there's a hack in processCommand with a comment that looks outdated.
Note that it is [documented](https://redis.io/commands/quit)
At the same time, HOST: and POST are there in the command table although these are not real commands.
They would appear in the COMMAND command, and even in commandstats.
Other changes:
1. Initialize the static logged_time static var in securityWarningCommand
2. add `no-auth` flag to RESET so it can always be executed.
In both tests, "diskless loading short read" and "diskless loading short read with module",
the timeout of waiting for the replica to respond to a short read and log it, is too short.
Also, add --dump-logs in runtest-moduleapi for valgrind runs.
For diskless replication in swapdb mode, considering we already spend replica memory
having a backup of current db to restore in case of failure, we can have the following benefits
by instead swapping database only in case we succeeded in transferring db from master:
- Avoid `LOADING` response during failed and successful synchronization for cases where the
replica is already up and running with data.
- Faster total time of diskless replication, because now we're moving from Transfer + Flush + Load
time to Transfer + Load only. Flushing the tempDb is done asynchronously after swapping.
- This could be implemented also for disk replication with similar benefits if consumers are willing
to spend the extra memory usage.
General notes:
- The concept of `backupDb` becomes `tempDb` for clarity.
- Async loading mode will only kick in if the replica is syncing from a master that has the same
repl-id the one it had before. i.e. the data it's getting belongs to a different time of the same timeline.
- New property in INFO: `async_loading` to differentiate from the blocking loading
- Slot to Key mapping is now a field of `redisDb` as it's more natural to access it from both server.db
and the tempDb that is passed around.
- Because this is affecting replicas only, we assume that if they are not readonly and write commands
during replication, they are lost after SYNC same way as before, but we're still denying CONFIG SET
here anyways to avoid complications.
Considerations for review:
- We have many cases where server.loading flag is used and even though I tried my best, there may
be cases where async_loading should be checked as well and cases where it shouldn't (would require
very good understanding of whole code)
- Several places that had different behavior depending on the loading flag where actually meant to just
handle commands coming from the AOF client differently than ones coming from real clients, changed
to check CLIENT_ID_AOF instead.
**Additional for Release Notes**
- Bugfix - server.dirty was not incremented for any kind of diskless replication, as effect it wouldn't
contribute on triggering next database SAVE
- New flag for RM_GetContextFlags module API: REDISMODULE_CTX_FLAGS_ASYNC_LOADING
- Deprecated RedisModuleEvent_ReplBackup. Starting from Redis 7.0, we don't fire this event.
Instead, we have the new RedisModuleEvent_ReplAsyncLoad holding 3 sub-events: STARTED,
ABORTED and COMPLETED.
- New module flag REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD for RedisModule_SetModuleOptions
to allow modules to declare they support the diskless replication with async loading (when absent, we fall
back to disk-based loading).
Co-authored-by: Eduardo Semprebon <edus@saxobank.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
The module test in reply.tcl was introduced by #8521 but didn't run until recently (see #9639)
and then it started failing with valgrind.
This is because valgrind uses 64 bit long double (unlike most other platforms that have at least 80 bits)
But besides valgrind, the tests where also incompatible with ARM32, which also uses 64 bit long doubles.
We now use appropriate value to avoid issues with either valgrind or ARM32
In all the double tests, i use 3.141, which is safe since since addReplyDouble uses
`%.17Lg` which is able to represent this value without adding any digits due to precision loss.
In the long double, since we use `%.17Lf` in ld2string, it preserves 17 significant
digits, rather than 17 digit after the decimal point (like in `%.17Lg`).
So to make these similar, i use value lower than 1 (no digits left of
the period)
Lastly, we have the same issue with TCL (no long doubles) so we read
raw protocol in that test.
Note that the only error before this fix (in both valgrind and ARM32 is this:
```
*** [err]: RM_ReplyWithLongDouble: a float reply in tests/unit/moduleapi/reply.tcl
Expected '3.141' to be equal to '3.14100000000000001' (context: type eval line 2 cmd {assert_equal 3.141 [r rw.longdouble 3.141]} proc ::test)
```
so the changes to debug.c and scripting.tcl aren't really needed, but i consider them a cleanup
(i.e. scripting.c validated a different constant than the one that's sent to it from debug.c).
Another unrelated change is to add the RESP version to the repeated tests in reply.tcl
Let modules use additional type of RESP3 response (unused by redis so far)
Also fix tests that where introduced in #8521 but didn't actually run.
Co-authored-by: Oran Agra <oran@redislabs.com>
## Background
For redis master, one replica uses one copy of replication buffer, that is a big waste of memory,
more replicas more waste, and allocate/free memory for every reply list also cost much.
If we set client-output-buffer-limit small and write traffic is heavy, master may disconnect with
replicas and can't finish synchronization with replica. If we set client-output-buffer-limit big,
master may be OOM when there are many replicas that separately keep much memory.
Because replication buffers of different replica client are the same, one simple idea is that
all replicas only use one replication buffer, that will effectively save memory.
Since replication backlog content is the same as replicas' output buffer, now we
can discard replication backlog memory and use global shared replication buffer
to implement replication backlog mechanism.
## Implementation
I create one global "replication buffer" which contains content of replication stream.
The structure of "replication buffer" is similar to the reply list that exists in every client.
But the node of list is `replBufBlock`, which has `id, repl_offset, refcount` fields.
```c
/* Replication buffer blocks is the list of replBufBlock.
*
* +--------------+ +--------------+ +--------------+
* | refcount = 1 | ... | refcount = 0 | ... | refcount = 2 |
* +--------------+ +--------------+ +--------------+
* | / \
* | / \
* | / \
* Repl Backlog Replia_A Replia_B
*
* Each replica or replication backlog increments only the refcount of the
* 'ref_repl_buf_node' which it points to. So when replica walks to the next
* node, it should first increase the next node's refcount, and when we trim
* the replication buffer nodes, we remove node always from the head node which
* refcount is 0. If the refcount of the head node is not 0, we must stop
* trimming and never iterate the next node. */
/* Similar with 'clientReplyBlock', it is used for shared buffers between
* all replica clients and replication backlog. */
typedef struct replBufBlock {
int refcount; /* Number of replicas or repl backlog using. */
long long id; /* The unique incremental number. */
long long repl_offset; /* Start replication offset of the block. */
size_t size, used;
char buf[];
} replBufBlock;
```
So now when we feed replication stream into replication backlog and all replicas, we only need
to feed stream into replication buffer `feedReplicationBuffer`. In this function, we set some fields of
replication backlog and replicas to references of the global replication buffer blocks. And we also
need to check replicas' output buffer limit to free if exceeding `client-output-buffer-limit`, and trim
replication backlog if exceeding `repl-backlog-size`.
When sending reply to replicas, we also need to iterate replication buffer blocks and send its
content, when totally sending one block for replica, we decrease current node count and
increase the next current node count, and then free the block which reference is 0 from the
head of replication buffer blocks.
Since now we use linked list to manage replication backlog, it may cost much time for iterating
all linked list nodes to find corresponding replication buffer node. So we create a rax tree to
store some nodes for index, but to avoid rax tree occupying too much memory, i record
one per 64 nodes for index.
Currently, to make partial resynchronization as possible as much, we always let replication
backlog as the last reference of replication buffer blocks, backlog size may exceeds our setting
if slow replicas that reference vast replication buffer blocks, and this method doesn't increase
memory usage since they share replication buffer. To avoid freezing server for freeing unreferenced
replication buffer blocks when we need to trim backlog for exceeding backlog size setting,
we trim backlog incrementally (free 64 blocks per call now), and make it faster in
`beforeSleep` (free 640 blocks).
### Other changes
- `mem_total_replication_buffers`: we add this field in INFO command, it means the total
memory of replication buffers used.
- `mem_clients_slaves`: now even replica is slow to replicate, and its output buffer memory
is not 0, but it still may be 0, since replication backlog and replicas share one global replication
buffer, only if replication buffer memory is more than the repl backlog setting size, we consider
the excess as replicas' memory. Otherwise, we think replication buffer memory is the consumption
of repl backlog.
- Key eviction
Since all replicas and replication backlog share global replication buffer, we think only the
part of exceeding backlog size the extra separate consumption of replicas.
Because we trim backlog incrementally in the background, backlog size may exceeds our
setting if slow replicas that reference vast replication buffer blocks disconnect.
To avoid massive eviction loop, we don't count the delayed freed replication backlog into
used memory even if there are no replicas, i.e. we also regard this memory as replicas's memory.
- `client-output-buffer-limit` check for replica clients
It doesn't make sense to set the replica clients output buffer limit lower than the repl-backlog-size
config (partial sync will succeed and then replica will get disconnected). Such a configuration is
ignored (the size of repl-backlog-size will be used). This doesn't have memory consumption
implications since the replica client will share the backlog buffers memory.
- Drop replication backlog after loading data if needed
We always create replication backlog if server is a master, we need it because we put DELs in
it when loading expired keys in RDB, but if RDB doesn't have replication info or there is no rdb,
it is not possible to support partial resynchronization, to avoid extra memory of replication backlog,
we drop it.
- Multi IO threads
Since all replicas and replication backlog use global replication buffer, if I/O threads are enabled,
to guarantee data accessing thread safe, we must let main thread handle sending the output buffer
to all replicas. But before, other IO threads could handle sending output buffer of all replicas.
## Other optimizations
This solution resolve some other problem:
- When replicas disconnect with master since of out of output buffer limit, releasing the output
buffer of replicas may freeze server if we set big `client-output-buffer-limit` for replicas, but now,
it doesn't cause freezing.
- This implementation may mitigate reply list copy cost time(also freezes server) when one replication
has huge reply buffer and another replica can copy buffer for full synchronization. now, we just copy
reference info, it is very light.
- If we set replication backlog size big, it also may cost much time to copy replication backlog into
replica's output buffer. But this commit eliminates this problem.
- Resizing replication backlog size doesn't empty current replication backlog content.
Before this commit, module blocked clients did not carry through the original RESP version, resulting with RESP3 clients receiving unexpected RESP2 replies.
Following #9483 the daily CI exposed a few problems.
* The cluster creation code (uses redis-cli) is complicated to test with TLS enabled.
for now i'm just skipping them since the tests we run there don't really need that kind of coverage
* cluster port binding failures
note that `find_available_port` already looks for a free cluster port
but the code in `wait_server_started` couldn't detect the failure of binding
(the text it greps for wasn't found in the log)
## Intro
The purpose is to allow having different flags/ACL categories for
subcommands (Example: CONFIG GET is ok-loading but CONFIG SET isn't)
We create a small command table for every command that has subcommands
and each subcommand has its own flags, etc. (same as a "regular" command)
This commit also unites the Redis and the Sentinel command tables
## Affected commands
CONFIG
Used to have "admin ok-loading ok-stale no-script"
Changes:
1. Dropped "ok-loading" in all except GET (this doesn't change behavior since
there were checks in the code doing that)
XINFO
Used to have "read-only random"
Changes:
1. Dropped "random" in all except CONSUMERS
XGROUP
Used to have "write use-memory"
Changes:
1. Dropped "use-memory" in all except CREATE and CREATECONSUMER
COMMAND
No changes.
MEMORY
Used to have "random read-only"
Changes:
1. Dropped "random" in PURGE and USAGE
ACL
Used to have "admin no-script ok-loading ok-stale"
Changes:
1. Dropped "admin" in WHOAMI, GENPASS, and CAT
LATENCY
No changes.
MODULE
No changes.
SLOWLOG
Used to have "admin random ok-loading ok-stale"
Changes:
1. Dropped "random" in RESET
OBJECT
Used to have "read-only random"
Changes:
1. Dropped "random" in ENCODING and REFCOUNT
SCRIPT
Used to have "may-replicate no-script"
Changes:
1. Dropped "may-replicate" in all except FLUSH and LOAD
CLIENT
Used to have "admin no-script random ok-loading ok-stale"
Changes:
1. Dropped "random" in all except INFO and LIST
2. Dropped "admin" in ID, TRACKING, CACHING, GETREDIR, INFO, SETNAME, GETNAME, and REPLY
STRALGO
No changes.
PUBSUB
No changes.
CLUSTER
Changes:
1. Dropped "admin in countkeysinslots, getkeysinslot, info, nodes, keyslot, myid, and slots
SENTINEL
No changes.
(note that DEBUG also fits, but we decided not to convert it since it's for
debugging and anyway undocumented)
## New sub-command
This commit adds another element to the per-command output of COMMAND,
describing the list of subcommands, if any (in the same structure as "regular" commands)
Also, it adds a new subcommand:
```
COMMAND LIST [FILTERBY (MODULE <module-name>|ACLCAT <cat>|PATTERN <pattern>)]
```
which returns a set of all commands (unless filters), but excluding subcommands.
## Module API
A new module API, RM_CreateSubcommand, was added, in order to allow
module writer to define subcommands
## ACL changes:
1. Now, that each subcommand is actually a command, each has its own ACL id.
2. The old mechanism of allowed_subcommands is redundant
(blocking/allowing a subcommand is the same as blocking/allowing a regular command),
but we had to keep it, to support the widespread usage of allowed_subcommands
to block commands with certain args, that aren't subcommands (e.g. "-select +select|0").
3. I have renamed allowed_subcommands to allowed_firstargs to emphasize the difference.
4. Because subcommands are commands in ACL too, you can now use "-" to block subcommands
(e.g. "+client -client|kill"), which wasn't possible in the past.
5. It is also possible to use the allowed_firstargs mechanism with subcommand.
For example: `+config -config|set +config|set|loglevel` will block all CONFIG SET except
for setting the log level.
6. All of the ACL changes above required some amount of refactoring.
## Misc
1. There are two approaches: Either each subcommand has its own function or all
subcommands use the same function, determining what to do according to argv[0].
For now, I took the former approaches only with CONFIG and COMMAND,
while other commands use the latter approach (for smaller blamelog diff).
2. Deleted memoryGetKeys: It is no longer needed because MEMORY USAGE now uses the "range" key spec.
4. Bugfix: GETNAME was missing from CLIENT's help message.
5. Sentinel and Redis now use the same table, with the same function pointer.
Some commands have a different implementation in Sentinel, so we redirect
them (these are ROLE, PUBLISH, and INFO).
6. Command stats now show the stats per subcommand (e.g. instead of stats just
for "config" you will have stats for "config|set", "config|get", etc.)
7. It is now possible to use COMMAND directly on subcommands:
COMMAND INFO CONFIG|GET (The pipeline syntax was inspired from ACL, and
can be used in functions lookupCommandBySds and lookupCommandByCString)
8. STRALGO is now a container command (has "help")
## Breaking changes:
1. Command stats now show the stats per subcommand (see (5) above)
Prevent clients from being blocked forever in cluster when they block with their own module command
and the hash slot is migrated to another master at the same time.
These will get a redirection message when unblocked.
Also, release clients blocked on module commands when cluster is down (same as other blocked clients)
This commit adds basic tests for the main (non-cluster) redis test infra that test the cluster.
This was done because the cluster test infra can't handle some common test features,
but most importantly we only build the test modules with the non-cluster test suite.
note that rather than really supporting cluster operations by the test infra, it was added (as dup code)
in two files, one for module tests and one for non-modules tests, maybe in the future we'll refactor that.
Co-authored-by: Oran Agra <oran@redislabs.com>
This commit introduced a new flag to the RM_Call:
'C' - Check if the command can be executed according to the ACLs associated with it.
Also, three new API's added to check if a command, key, or channel can be executed or accessed
by a user, according to the ACLs associated with it.
- RM_ACLCheckCommandPerm
- RM_ACLCheckKeyPerm
- RM_ACLCheckChannelPerm
The user for these API's is a RedisModuleUser object, that for a Module user returned by the RM_CreateModuleUser API, or for a general ACL user can be retrieved by these two new API's:
- RM_GetCurrentUserName - Retrieve the user name of the client connection behind the current context.
- RM_GetModuleUserFromUserName - Get a RedisModuleUser from a user name
As a result of getting a RedisModuleUser from name, it can now also access the general ACL users (not just ones created by the module).
This mean the already existing API RM_SetModuleUserACL(), can be used to change the ACL rules for such users.
Fix#7297
The problem:
Today, there is no way for a client library or app to know the key name indexes for commands such as
ZUNIONSTORE/EVAL and others with "numkeys", since COMMAND INFO returns no useful info for them.
For cluster-aware redis clients, this requires to 'patch' the client library code specifically for each of these commands or to
resolve each execution of these commands with COMMAND GETKEYS.
The solution:
Introducing key specs other than the legacy "range" (first,last,step)
The 8th element of the command info array, if exists, holds an array of key specs. The array may be empty, which indicates
the command doesn't take any key arguments or may contain one or more key-specs, each one may leads to the discovery
of 0 or more key arguments.
A client library that doesn't support this key-spec feature will keep using the first,last,step and movablekeys flag which will
obviously remain unchanged.
A client that supports this key-specs feature needs only to look at the key-specs array. If it finds an unrecognized spec, it
must resort to using COMMAND GETKEYS if it wishes to get all key name arguments, but if all it needs is one key in order
to know which cluster node to use, then maybe another spec (if the command has several) can supply that, and there's no
need to use GETKEYS.
Each spec is an array of arguments, first one is the spec name, the second is an array of flags, and the third is an array
containing details about the spec (specific meaning for each spec type)
The initial flags we support are "read" and "write" indicating if the keys that this key-spec finds are used for read or for write.
clients should ignore any unfamiliar flags.
In order to easily find the positions of keys in a given array of args we introduce keys specs. There are two logical steps of
key specs:
1. `start_search`: Given an array of args, indicate where we should start searching for keys
2. `find_keys`: Given the output of start_search and an array of args, indicate all possible indices of keys.
### start_search step specs
- `index`: specify an argument index explicitly
- `index`: 0 based index (1 means the first command argument)
- `keyword`: specify a string to match in `argv`. We should start searching for keys just after the keyword appears.
- `keyword`: the string to search for
- `start_search`: an index from which to start the keyword search (can be negative, which means to search from the end)
Examples:
- `SET` has start_search of type `index` with value `1`
- `XREAD` has start_search of type `keyword` with value `[“STREAMS”,1]`
- `MIGRATE` has start_search of type `keyword` with value `[“KEYS”,-2]`
### find_keys step specs
- `range`: specify `[count, step, limit]`.
- `lastkey`: index of the last key. relative to the index returned from begin_search. -1 indicating till the last argument, -2 one before the last
- `step`: how many args should we skip after finding a key, in order to find the next one
- `limit`: if count is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on.
- “keynum”: specify `[keynum_index, first_key_index, step]`.
- `keynum_index`: is relative to the return of the `start_search` spec.
- `first_key_index`: is relative to `keynum_index`.
- `step`: how many args should we skip after finding a key, in order to find the next one
Examples:
- `SET` has `range` of `[0,1,0]`
- `MSET` has `range` of `[-1,2,0]`
- `XREAD` has `range` of `[-1,1,2]`
- `ZUNION` has `start_search` of type `index` with value `1` and `find_keys` of type `keynum` with value `[0,1,1]`
- `AI.DAGRUN` has `start_search` of type `keyword` with value `[“LOAD“,1]` and `find_keys` of type `keynum` with value
`[0,1,1]` (see https://oss.redislabs.com/redisai/master/commands/#aidagrun)
Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key
args of the vast majority of commands.
If one of the above specs can’t describe the key positions, the module writer can always fall back to the `getkeys-api` option.
Some keys cannot be found easily (`KEYS` in `MIGRATE`: Imagine the argument for `AUTH` is the string “KEYS” - we will
start searching in the wrong index).
The guarantee is that the specs may be incomplete (`incomplete` will be specified in the spec to denote that) but we never
report false information (assuming the command syntax is correct).
For `MIGRATE` we start searching from the end - `startfrom=-1` - and if one of the keys is actually called "keys" we will
report only a subset of all keys - hence the `incomplete` flag.
Some `incomplete` specs can be completely empty (i.e. UNKNOWN begin_search) which should tell the client that
COMMAND GETKEYS (or any other way to get the keys) must be used (Example: For `SORT` there is no way to describe
the STORE keyword spec, as the word "store" can appear anywhere in the command).
We will expose these key specs in the `COMMAND` command so that clients can learn, on startup, where the keys are for
all commands instead of holding hardcoded tables or use `COMMAND GETKEYS` in runtime.
Comments:
1. Redis doesn't internally use the new specs, they are only used for COMMAND output.
2. In order to support the current COMMAND INFO format (reply array indices 4, 5, 6) we created a synthetic range, called
legacy_range, that, if possible, is built according to the new specs.
3. Redis currently uses only getkeys_proc or the legacy_range to get the keys indices (in COMMAND GETKEYS for
example).
"incomplete" specs:
the command we have issues with are MIGRATE, STRALGO, and SORT
for MIGRATE, because the token KEYS, if exists, must be the last token, we can search in reverse. it one of the keys is
actually the string "keys" will return just a subset of the keys (hence, it's "incomplete")
for SORT and STRALGO we can use this heuristic (the keys can be anywhere in the command) and therefore we added a
key spec that is both "incomplete" and of "unknown type"
if a client encounters an "incomplete" spec it means that it must find a different way (either COMMAND GETKEYS or have
its own parser) to retrieve the keys.
please note that all commands, apart from the three mentioned above, have "complete" key specs
List functions operating on elements by index:
* RM_ListGet
* RM_ListSet
* RM_ListInsert
* RM_ListDelete
Iteration is done using a simple for loop over indices.
The index based functions use an internal iterator as an optimization.
This is explained in the docs:
```
* Many of the list functions access elements by index. Since a list is in
* essence a doubly-linked list, accessing elements by index is generally an
* O(N) operation. However, if elements are accessed sequentially or with
* indices close together, the functions are optimized to seek the index from
* the previous index, rather than seeking from the ends of the list.
*
* This enables iteration to be done efficiently using a simple for loop:
*
* long n = RM_ValueLength(key);
* for (long i = 0; i < n; i++) {
* RedisModuleString *elem = RedisModule_ListGet(key, i);
* // Do stuff...
* }
```
## Current state
1. Lua has its own parser that handles parsing `reds.call` replies and translates them
to Lua objects that can be used by the user Lua code. The parser partially handles
resp3 (missing big number, verbatim, attribute, ...)
2. Modules have their own parser that handles parsing `RM_Call` replies and translates
them to RedisModuleCallReply objects. The parser does not support resp3.
In addition, in the future, we want to add Redis Function (#8693) that will probably
support more languages. At some point maintaining so many parsers will stop
scaling (bug fixes and protocol changes will need to be applied on all of them).
We will probably end up with different parsers that support different parts of the
resp protocol (like we already have today with Lua and modules)
## PR Changes
This PR attempt to unified the reply parsing of Lua and modules (and in the future
Redis Function) by introducing a new parser unit (`resp_parser.c`). The new parser
handles parsing the reply and calls different callbacks to allow the users (another
unit that uses the parser, i.e, Lua, modules, or Redis Function) to analyze the reply.
### Lua API Additions
The code that handles reply parsing on `scripting.c` was removed. Instead, it uses
the resp_parser to parse and create a Lua object out of the reply. As mentioned
above the Lua parser did not handle parsing big numbers, verbatim, and attribute.
The new parser can handle those and so Lua also gets it for free.
Those are translated to Lua objects in the following way:
1. Big Number - Lua table `{'big_number':'<str representation for big number>'}`
2. Verbatim - Lua table `{'verbatim_string':{'format':'<verbatim format>', 'string':'<verbatim string value>'}}`
3. Attribute - currently ignored and not expose to the Lua parser, another issue will be open to decide how to expose it.
Tests were added to check resp3 reply parsing on Lua
### Modules API Additions
The reply parsing code on `module.c` was also removed and the new resp_parser is used instead.
In addition, the RedisModuleCallReply was also extracted to a separate unit located on `call_reply.c`
(in the future, this unit will also be used by Redis Function). A nice side effect of unified parsing is
that modules now also support resp3. Resp3 can be enabled by giving `3` as a parameter to the
fmt argument of `RM_Call`. It is also possible to give `0`, which will indicate an auto mode. i.e, Redis
will automatically chose the reply protocol base on the current client set on the RedisModuleCtx
(this mode will mostly be used when the module want to pass the reply to the client as is).
In addition, the following RedisModuleAPI were added to allow analyzing resp3 replies:
* New RedisModuleCallReply types:
* `REDISMODULE_REPLY_MAP`
* `REDISMODULE_REPLY_SET`
* `REDISMODULE_REPLY_BOOL`
* `REDISMODULE_REPLY_DOUBLE`
* `REDISMODULE_REPLY_BIG_NUMBER`
* `REDISMODULE_REPLY_VERBATIM_STRING`
* `REDISMODULE_REPLY_ATTRIBUTE`
* New RedisModuleAPI:
* `RedisModule_CallReplyDouble` - getting double value from resp3 double reply
* `RedisModule_CallReplyBool` - getting boolean value from resp3 boolean reply
* `RedisModule_CallReplyBigNumber` - getting big number value from resp3 big number reply
* `RedisModule_CallReplyVerbatim` - getting format and value from resp3 verbatim reply
* `RedisModule_CallReplySetElement` - getting element from resp3 set reply
* `RedisModule_CallReplyMapElement` - getting key and value from resp3 map reply
* `RedisModule_CallReplyAttribute` - getting a reply attribute
* `RedisModule_CallReplyAttributeElement` - getting key and value from resp3 attribute reply
* New context flags:
* `REDISMODULE_CTX_FLAGS_RESP3` - indicate that the client is using resp3
Tests were added to check the new RedisModuleAPI
### Modules API Changes
* RM_ReplyWithCallReply might return REDISMODULE_ERR if the given CallReply is in resp3
but the client expects resp2. This is not a breaking change because in order to get a resp3
CallReply one needs to specifically specify `3` as a parameter to the fmt argument of
`RM_Call` (as mentioned above).
Tests were added to check this change
### More small Additions
* Added `debug set-disable-deny-scripts` that allows to turn on and off the commands no-script
flag protection. This is used by the Lua resp3 tests so it will be possible to run `debug protocol`
and check the resp3 parsing code.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.
Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.
This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
Previously, passing 0 for newlen would not truncate the string at all.
This adds handling of this case, freeing the old string and creating a new empty string.
Other changes:
- Move `src/modules/testmodule.c` to `tests/modules/basics.c`
- Introduce that basic test into the test suite
- Add tests to cover StringTruncate
- Add `test-modules` build target for the main makefile
- Extend `distclean` build target to clean modules too
Create new module type enhanced callbacks: mem_usage2, free_effort2, unlink2, copy2.
These will be given a context point from which the module can obtain the key name and database id.
In addition the digest and defrag context can now be used to obtain the key name and database id.
This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes.
This spell checker is based on blacklist of common spelling mistakes, so it will not catch everything,
but at least it is also unlikely to cause false positives.
Besides that, the PR also fixes many spelling mistakes and types, not all are a result of the spell checker we use.
Here's a summary of other changes:
1. Scanned the entire source code and fixes all sorts of typos and spelling mistakes (including missing or extra spaces).
2. Outdated function / variable / argument names in comments
3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString.
4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751
5. Some outdated https link URLs.
6. Fix some outdated comment. Such as:
- In README: about the rdb, we used to said create a `thread`, change to `process`
- dbRandomKey function coment (about the dictGetRandomKey, change to dictGetFairRandomKey)
- notifyKeyspaceEvent fucntion comment (add type arg)
- Some others minor fix in comment (Most of them are incorrectly quoted by variable names)
7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`
Adding a new type mask for key space notification, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which is currently unsupported).
Modules can subscribe to a module key keyspace notification by RM_SubscribeToKeyspaceEvents,
and clients by notify-keyspace-events of redis.conf or via the CONFIG SET, with the characters 'd' or 'A'
(REDISMODULE_NOTIFY_MODULE type mask is part of the '**A**ll' notation for key space notifications).
Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.
Before this commit using RM_Call without "!" could cause the master
to lazy-expire a key (delete it) but without replicating to replicas.
This could cause the replica's memory usage to gradually grow and
could also cause consistency issues if the master and replica have
a clock diff.
This bug was introduced in #8617
Added a test which demonstrates that scenario.
1. moduleReplicateMultiIfNeeded should use server.in_eval like
moduleHandlePropagationAfterCommandCallback
2. server.in_eval could have been set to 1 and not reset back
to 0 (a lot of missed early-exits after in_eval is already 1)
Note: The new assertions in processCommand cover (2) and I added
two module tests to cover (1)
Implications:
If an EVAL that failed (and thus left server.in_eval=1) runs before a module
command that replicates, the replication stream will contain MULTI (because
moduleReplicateMultiIfNeeded used to check server.lua_caller which is NULL
at this point) but not EXEC (because server.in_eval==1)
This only affects modules as module.c the only user of server.in_eval.
Affects versions 6.2.0, 6.2.1
Bug 1:
When a module ctx is freed moduleHandlePropagationAfterCommandCallback
is called and handles propagation. We want to prevent it from propagating
commands that were not replicated by the same context. Example:
1. module1.foo does: RM_Replicate(cmd1); RM_Call(cmd2); RM_Replicate(cmd3)
2. RM_Replicate(cmd1) propagates MULTI and adds cmd1 to also_propagagte
3. RM_Call(cmd2) create a new ctx, calls call() and destroys the ctx.
4. moduleHandlePropagationAfterCommandCallback is called, calling
alsoPropagates EXEC (Note: EXEC is still not written to socket),
setting server.in_trnsaction = 0
5. RM_Replicate(cmd3) is called, propagagting yet another MULTI (now
we have nested MULTI calls, which is no good) and then cmd3
We must prevent RM_Call(cmd2) from resetting server.in_transaction.
REDISMODULE_CTX_MULTI_EMITTED was revived for that purpose.
Bug 2:
Fix issues with nested RM_Call where some have '!' and some don't.
Example:
1. module1.foo does RM_Call of module2.bar without replication (i.e. no '!')
2. module2.bar internally calls RM_Call of INCR with '!'
3. at the end of module1.foo we call RM_ReplicateVerbatim
We want the replica/AOF to see only module1.foo and not the INCR from module2.bar
Introduced a global replication_allowed flag inside RM_Call to determine
whether we need to replicate or not (even if '!' was specified)
Other changes:
Split beforePropagateMultiOrExec to beforePropagateMulti afterPropagateExec
just for better readability
Because when the RM_Call is invoked. It will create a faker client.
The point is client connection is NULL, so server will crash in connGetInfo
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
A single client pointer is added in the server struct. This is
initialized by the first RM_Call() and reused for every subsequent
RM_Call() except if it's already in use, which means that it's not
used for (recursive) module calls to modules. For these, a new
"fake" client is created each time.
Other changes:
* Avoid allocating a dict iterator in pubsubUnsubscribeAllChannels
when not needed
The added flag affects the return value of RM_HashSet() to include
the number of inserted fields, in addition to updated and deleted
fields.
errno is set on errors, tests are added and documentation updated.
- removes time sensitive checks from block on background tests during leak checks.
- fix uninitialized variable on RedisModuleBlockedClient() when calling
RM_BlockedClientMeasureTimeEnd() without RM_BlockedClientMeasureTimeStart()
The test failed from time to time on Github actions.
We think it's possible that on the module's blocking timeout
time tracking test, the timeout is happening prior we issue the
RedisModule_BlockedClientMeasureTimeStart(bc) on the
background thread. If that is the case one possible solution
is to increase the timeout.
Increasing to 200ms to 500ms to see if nightly stops failing.
Without this fix, RM_ZsetRem can leave empty sorted sets which are
not allowed to exist.
Removing from a sorted set while iterating seems to work (while
inserting causes failed assetions). RM_ZsetRangeEndReached is
modified to return 1 if the key doesn't exist, to terminate
iteration when the last element has been removed.
This commit enables tracking time of the background tasks and on replies,
opening the door for properly tracking commands that rely on blocking / background
work via the slowlog, latency history, and commandstats.
Some notes:
- The time spent blocked waiting for key changes, or blocked on synchronous
replication is not accounted for.
- **This commit does not affect latency tracking of commands that are non-blocking
or do not have background work.** ( meaning that it all stays the same with exception to
`BZPOPMIN`,`BZPOPMAX`,`BRPOP`,`BLPOP`, etc... and module's commands that rely
on background threads ).
- Specifically for latency history command we've added a new event class named
`command-unblocking` that will enable latency monitoring on commands that spawn
background threads to do the work.
- For blocking commands we're now considering the total time of a command as the
time spent on call() + the time spent on replying when unblocked.
- For Modules commands that rely on background threads we're now considering the
total time of a command as the time spent on call (main thread) + the time spent on
the background thread ( if marked within `RedisModule_MeasureTimeStart()` and
`RedisModule_MeasureTimeEnd()` ) + the time spent on replying (main thread)
To test for this feature we've added a `unit/moduleapi/blockonbackground` test that relies on
a module that blocks the client and sleeps on the background for a given time.
- check blocked command that uses RedisModule_MeasureTimeStart() is tracking background time
- check blocked command that uses RedisModule_MeasureTimeStart() is tracking background time even in timeout
- check blocked command with multiple calls RedisModule_MeasureTimeStart() is tracking the total background time
- check blocked command without calling RedisModule_MeasureTimeStart() is not reporting background time
APIs added for these stream operations: add, delete, iterate and
trim (by ID or maxlength). The functions are prefixed by RM_Stream.
* RM_StreamAdd
* RM_StreamDelete
* RM_StreamIteratorStart
* RM_StreamIteratorStop
* RM_StreamIteratorNextID
* RM_StreamIteratorNextField
* RM_StreamIteratorDelete
* RM_StreamTrimByLength
* RM_StreamTrimByID
The type RedisModuleStreamID is added and functions for converting
from and to RedisModuleString.
* RM_CreateStringFromStreamID
* RM_StringToStreamID
Whenever the stream functions return REDISMODULE_ERR, errno is set to
provide additional error information.
Refactoring: The zset iterator fields in the RedisModuleKey struct
are wrapped in a union, to allow the same space to be used for type-
specific info for streams and allow future use for other key types.
BLPOP and other blocking list commands can only block on empty keys
and LPUSH only wakes up clients when the list is created.
Using the module API, it's possible to block on a non-empty key.
Unblocking a client blocked on a non-empty list (or zset) can only
be done using RedisModule_SignalKeyAsReady(). This commit tests it.
the test was misleading because the module would actually woke up on a wrong type and
re-blocked, while the test name suggests the module doesn't not wake up at all on a wrong type..
i changed the name of the test + added verification that indeed the module wakes up and gets
re-blocked after it understand it's the wrong type
This was a regression from #7625 (only in 6.2 RC2).
This makes it possible again to implement blocking list and zset
commands using the modules API.
This commit also includes a test case for the reverse: A module
unblocks a client blocked on BLPOP by inserting elements using
RedisModule_ListPush(). This already works, but it was untested.
In the distant history there was only the read flag for commands, and whatever
command that didn't have the read flag was a write one.
Then we added the write flag, but some portions of the code still used !read
Also some commands that don't work on the keyspace at all, still have the read
flag.
Changes in this commit:
1. remove the read-only flag from TIME, ECHO, ROLE and LASTSAVE
2. EXEC command used to decides if it should propagate a MULTI by looking at
the command flags (!read & !admin).
When i was about to change it to look at the write flag instead, i realized
that this would cause it not to propagate a MULTI for PUBLISH, EVAL, and
SCRIPT, all 3 are not marked as either a read command or a write one (as
they should), but all 3 are calling forceCommandPropagation.
So instead of introducing a new flag to denote a command that "writes" but
not into the keyspace, and still needs propagation, i decided to rely on
the forceCommandPropagation, and just fix the code to propagate MULTI when
needed rather than depending on the command flags at all.
The implication of my change then is that now it won't decide to propagate
MULTI when it sees one of these: SELECT, PING, INFO, COMMAND, TIME and
other commands which are neither read nor write.
3. Changing getNodeByQuery and clusterRedirectBlockedClientIfNeeded in
cluster.c to look at !write rather than read flag.
This should have no implications, since these code paths are only reachable
for commands which access keys, and these are always marked as either read
or write.
This commit improve MULTI propagation tests, for modules and a bunch of
other special cases, all of which used to pass already before that commit.
the only one that test change that uncovered a change of behavior is the
one that DELs a non-existing key, it used to propagate an empty
multi-exec block, and no longer does.
Additionally the older defrag tests are using an obsolete way to check
if the defragger is suuported (the error no longer contains "DISABLED").
this doesn't usually makes a difference since these tests are completely
skipped if the allocator is not jemalloc, but that would fail if the
allocator is a jemalloc that doesn't support defrag.
Add a new set of defrag functions that take a defrag context and allow
defragmenting memory blocks and RedisModuleStrings.
Modules can register a defrag callback which will be invoked when the
defrag process handles globals.
Modules with custom data types can also register a datatype-specific
defrag callback which is invoked for keys that require defragmentation.
The callback and associated functions support both one-step and
multi-step options, depending on the complexity of the key as exposed by
the free_effort callback.
This adds a copy callback for module data types, in order to make
modules compatible with the new COPY command.
The callback is optional and COPY will fail for keys with data types
that do not implement it.
Module blocked clients cache the response in a temporary client,
the reply list in this client would be affected by the recent fix
in #7202, but when the reply is later copied into the real client,
it would have bypassed all the checks for output buffer limit, which
would have resulted in both: responding with a partial response to
the client, and also not disconnecting it at all.
One way this was happening is when a module issued an RM_Call which would inject MULTI.
If the module command that does that was itself issued by something else that already did
added MULTI (e.g. another module, or a Lua script), it would have caused nested MULTI.
In fact the MULTI state in the client or the MULTI_EMITTED flag in the context isn't
the right indication that we need to propagate MULTI or not, because on a nested calls
(possibly a module action called by a keyspace event of another module action), these
flags aren't retained / reflected.
instead there's now a global propagate_in_transaction flag for that.
in addition to that, we now have a global in_eval and in_exec flags, to serve the flags
of RM_GetContextFlags, since their dependence on the current client is wrong for the same
reasons mentioned above.
Fixes#7923.
This PR appropriates the special `&` symbol (because `@` and `*` are taken),
followed by a literal value or pattern for describing the Pub/Sub patterns that
an ACL user can interact with. It is similar to the existing key patterns
mechanism in function (additive) and implementation (copy-pasta). It also adds
the allchannels and resetchannels ACL keywords, naturally.
The default user is given allchannels permissions, whereas new users get
whatever is defined by the acl-pubsub-default configuration directive. For
backward compatibility in 6.2, the default of this directive is allchannels but
this is likely to be changed to resetchannels in the next major version for
stronger default security settings.
Unless allchannels is set for the user, channel access permissions are checked
as follows :
* Calls to both PUBLISH and SUBSCRIBE will fail unless a pattern matching the
argumentative channel name(s) exists for the user.
* Calls to PSUBSCRIBE will fail unless the pattern(s) provided as an argument
literally exist(s) in the user's list.
Such failures are logged to the ACL log.
Runtime changes to channel permissions for a user with existing subscribing
clients cause said clients to disconnect unless the new permissions permit the
connections to continue. Note, however, that PSUBSCRIBErs' patterns are matched
literally, so given the change bar:* -> b*, pattern subscribers to bar:* will be
disconnected.
Notes/questions:
* UNSUBSCRIBE, PUNSUBSCRIBE and PUBSUB remain unprotected due to lack of reasons
for touching them.
Blocking command should not be used with MULTI, LUA, and RM_Call. This is because,
the caller, who executes the command in this context, expects a reply.
Today, LUA and MULTI have a special (and different) treatment to blocking commands:
LUA - Most commands are marked with no-script flag which are checked when executing
and command from LUA, commands that are not marked (like XREAD) verify that their
blocking mode is not used inside LUA (by checking the CLIENT_LUA client flag).
MULTI - Command that is going to block, first verify that the client is not inside
multi (by checking the CLIENT_MULTI client flag). If the client is inside multi, they
return a result which is a match to the empty key with no timeout (for example blpop
inside MULTI will act as lpop)
For modules that perform RM_Call with blocking command, the returned results type is
REDISMODULE_REPLY_UNKNOWN and the caller can not really know what happened.
Disadvantages of the current state are:
No unified approach, LUA, MULTI, and RM_Call, each has a different treatment
Module can not safely execute blocking command (and get reply or error).
Though It is true that modules are not like LUA or MULTI and should be smarter not
to execute blocking commands on RM_Call, sometimes you want to execute a command base
on client input (for example if you create a module that provides a new scripting
language like javascript or python).
While modules (on modules command) can check for REDISMODULE_CTX_FLAGS_LUA or
REDISMODULE_CTX_FLAGS_MULTI to know not to block the client, there is no way to
check if the command came from another module using RM_Call. So there is no way
for a module to know not to block another module RM_Call execution.
This commit adds a way to unify the treatment for blocking clients by introducing
a new CLIENT_DENY_BLOCKING client flag. On LUA, MULTI, and RM_Call the new flag
turned on to signify that the client should not be blocked. A blocking command
verifies that the flag is turned off before blocking. If a blocking command sees
that the CLIENT_DENY_BLOCKING flag is on, it's not blocking and return results
which are matches to empty key with no timeout (as MULTI does today).
The new flag is checked on the following commands:
List blocking commands: BLPOP, BRPOP, BRPOPLPUSH, BLMOVE,
Zset blocking commands: BZPOPMIN, BZPOPMAX
Stream blocking commands: XREAD, XREADGROUP
SUBSCRIBE, PSUBSCRIBE, MONITOR
In addition, the new flag is turned on inside the AOF client, we do not want to
block the AOF client to prevent deadlocks and commands ordering issues (and there
is also an existing assert in the code that verifies it).
To keep backward compatibility on LUA, all the no-script flags on existing commands
were kept untouched. In addition, a LUA special treatment on XREAD and XREADGROUP was kept.
To keep backward compatibility on MULTI (which today allows SUBSCRIBE, and PSUBSCRIBE).
We added a special treatment on those commands to allow executing them on MULTI.
The only backward compatibility issue that this PR introduces is that now MONITOR
is not allowed inside MULTI.
Tests were added to verify blocking commands are not blocking the client on LUA, MULTI,
or RM_Call. Tests were added to verify the module can check for CLIENT_DENY_BLOCKING flag.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
Add two optional callbacks to the RedisModuleTypeMethods structure, which is `free_effort`
and `unlink`. the `free_effort` callback indicates the effort required to free a module memory.
Currently, if the effort exceeds LAZYFREE_THRESHOLD, the module memory may be released
asynchronously. the `unlink` callback indicates the key has been removed from the DB by redis, and
may soon be freed by a background thread.
Add `lazyfreed_objects` info field, which represents the number of objects that have been
lazyfreed since redis was started.
Add `RM_GetTypeMethodVersion` API, which return the current redis-server runtime value of
`REDISMODULE_TYPE_METHOD_VERSION`. You can use that when calling `RM_CreateDataType` to know
which fields of RedisModuleTypeMethods are gonna be supported and which will be ignored.
* Introduce a new API's: RM_GetContextFlagsAll, and
RM_GetKeyspaceNotificationFlagsAll that will return the
full flags mask of each feature. The module writer can
check base on this value if the Flags he needs are
supported or not.
* For each flag, introduce a new value on redismodule.h,
this value represents the LAST value and should be there
as a reminder to update it when a new value is added,
also it will be used in the code to calculate the full
flags mask (assuming flags are incrementally increasing).
In addition, stated that the module writer should not use
the LAST flag directly and he should use the GetFlagAll API's.
* Introduce a new API: RM_IsSubEventSupported, that returns for a given
event and subevent, whether or not the subevent supported.
* Introduce a new macro RMAPI_FUNC_SUPPORTED(func) that returns whether
or not a function API is supported by comparing it to NULL.
* Introduce a new API: int RM_GetServerVersion();, that will return the
current Redis version in the format 0x00MMmmpp; e.g. 0x00060008;
* Changed unstable version from 999.999.999 to 255.255.255
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
The main motivation here is to provide a way for modules to create a
single, global context that can be used for logging.
Currently, it is possible to obtain a thread-safe context that is not
attached to any blocked client by using `RM_GetThreadSafeContext`.
However, the attached context is not linked to the module identity so
log messages produced are not tagged with the module name.
Ideally we'd fix this in `RM_GetThreadSafeContext` itself but as it
doesn't accept the current context as an argument there's no way to do
that in a backwards compatible manner.
This is essentially the same as calling COMMAND GETKEYS but provides a
more efficient interface that can be used in every context (i.e. not a
Redis command).
Added RedisModule_HoldString that either returns a
shallow copy of the given String (by increasing
the String ref count) or a new deep copy of String
in case its not possible to get a shallow copy.
Co-authored-by: Itamar Haber <itamar@redislabs.com>
Diskless master has some inherent latencies.
1) fork starts with delay from cron rather than immediately
2) replica is put online only after an ACK. but the ACK
was sent only once a second.
3) but even if it would arrive immediately, it will not
register in case cron didn't yet detect that the fork is done.
Besides that, when a replica disconnects, it doesn't immediately
attempts to re-connect, it waits for replication cron (one per second).
in case it was already online, it may be important to try to re-connect
as soon as possible, so that the backlog at the master doesn't vanish.
In case it disconnected during rdb transfer, one can argue that it's
not very important to re-connect immediately, but this is needed for the
"diskless loading short read" test to be able to run 100 iterations in 5
seconds, rather than 3 (waiting for replication cron re-connection)
changes in this commit:
1) sync command starts a fork immediately if no sync_delay is configured
2) replica sends REPLCONF ACK when done reading the rdb (rather than on 1s cron)
3) when a replica unexpectedly disconnets, it immediately tries to
re-connect rather than waiting 1s
4) when when a child exits, if there is another replica waiting, we spawn a new
one right away, instead of waiting for 1s replicationCron.
5) added a call to connectWithMaster from replicationSetMaster. which is called
from the REPLICAOF command but also in 3 places in cluster.c, in all of
these the connection attempt will now be immediate instead of delayed by 1
second.
side note:
we can add a call to rdbPipeReadHandler in replconfCommand when getting
a REPLCONF ACK from the replica to solve a race where the replica got
the entire rdb and EOF marker before we detected that the pipe was
closed.
in the test i did see this race happens in one about of some 300 runs,
but i concluded that this race is unlikely in real life (where the
replica is on another host and we're more likely to first detect the
pipe was closed.
the test runs 100 iterations in 3 seconds, so in some cases it'll take 4
seconds instead (waiting for another REPLCONF ACK).
Removing unneeded startBgsaveForReplication from updateSlavesWaitingForBgsave
Now that CheckChildrenDone is calling the new replicationStartPendingFork
(extracted from serverCron) there's actually no need to call
startBgsaveForReplication from updateSlavesWaitingForBgsave anymore,
since as soon as updateSlavesWaitingForBgsave returns, CheckChildrenDone is
calling replicationStartPendingFork that handles that anyway.
The code in updateSlavesWaitingForBgsave had a bug in which it ignored
repl-diskless-sync-delay, but removing that code shows that this bug was
hiding another bug, which is that the max_idle should have used >= and
not >, this one second delay has a big impact on my new test.
- the test now waits for specific set of log messages rather than wait for
timeout looking for just one message.
- we don't wanna sample the current length of the log after an action, due
to a race, we need to start the search from the line number of the last
message we where waiting for.
- when attempting to trigger a full sync, use multi-exec to avoid a race
where the replica manages to re-connect before we completed the set of
actions that should force a full sync.
- fix verify_log_message which was broken and unused
tests were sensitive to additional log lines appearing in the log
causing the search to come empty handed.
instead of just looking for the n last log lines, capture the log lines
before performing the action, and then search from that offset.
The scan key module API provides the scan callback with the current
field name and value (if it exists). Those arguments are RedisModuleString*
which means it supposes to point to robj which is encoded as a string.
Using createStringObjectFromLongLong function might return robj that
points to an integer and so break a module that tries for example to
use RedisModule_StringPtrLen on the given field/value.
The PR introduces a fix that uses the createObject function and sdsfromlonglong function.
Using those function promise that the field and value pass to the to the
scan callback will be Strings.
The PR also changes the Scan test module to use RedisModule_StringPtrLen
to catch the issue. without this, the issue is hidden because
RedisModule_ReplyWithString knows to handle integer encoding of the
given robj (RedisModuleString).
The PR also introduces a new test to verify the issue is solved.
There is an inherent race between the deferring client and the
"main" client of the test: While the deferring client issues a blocking
command, we can't know for sure that by the time the "main" client
tries to issue another command (Usually one that unblocks the deferring
client) the deferring client is even blocked...
For lack of a better choice this commit uses TCL's 'after' in order
to give some time for the deferring client to issues its blocking
command before the "main" client does its thing.
This problem probably exists in many other tests but this commit
tries to fix blockonkeys.tcl
By using a "circular BRPOPLPUSH"-like scenario it was
possible the get the same client on db->blocking_keys
twice (See comment in moduleTryServeClientBlockedOnKey)
The fix was actually already implememnted in
moduleTryServeClientBlockedOnKey but it had a bug:
the funxction should return 0 or 1 (not OK or ERR)
Other changes:
1. Added two commands to blockonkeys.c test module (To
reproduce the case described above)
2. Simplify blockonkeys.c in order to make testing easier
3. cast raxSize() to avoid warning with format spec
Makse sure call() doesn't wrap replicated commands with
a redundant MULTI/EXEC
Other, unrelated changes:
1. Formatting compiler warning in INFO CLIENTS
2. Use CLIENT_ID_AOF instead of UINT64_MAX
If a blocked module client times-out (or disconnects, unblocked
by CLIENT command, etc.) we need to call moduleUnblockClient
in order to free memory allocated by the module sub-system
and blocked-client private data
Other changes:
Made blockedonkeys.tcl tests a bit more aggressive in order
to smoke-out potential memory leaks
With the previous API, a NULL return value was ambiguous and could
represent either an old value of NULL or an error condition. The new API
returns a status code and allows the old value to be returned
by-reference.
This commit also includes test coverage based on
tests/modules/datatype.c which did not exist at the time of the original
commit.
there were two lssues, one is taht BGREWRITEAOF failed since the initial one was still in progress
the solution for this one is to enable appendonly from the server startup so there's no initial aofrw.
the other problem was 0 loading progress events, theory is that on some
platforms a sleep of 1 will cause a much greater delay due to the context
switch, but on other platform it doesn't. in theory a sleep of 100 micro
for 1k keys whould take 100ms, and with hz of 500 we should be gettering
50 events (one every 2ms). in practise it doesn't work like that, so trying
to find a sleep that would be long enough but still not cause the test to take
too long.
- Adding RM_ScanKey
- Adding tests for RM_ScanKey
- Refactoring RM_Scan API
Changes in RM_Scan
- cleanup in docs and coding convention
- Moving out of experimantal Api
- Adding ctx to scan callback
- Dont use cursor of -1 as an indication of done (can be a valid cursor)
- Set errno when returning 0 for various reasons
- Rename Cursor to ScanCursor
- Test filters key that are not strings, and opens a key if NULL
The implementation expose the following new functions:
1. RedisModule_CursorCreate - allow to create a new cursor object for
keys scanning
2. RedisModule_CursorRestart - restart an existing cursor to restart the
scan
3. RedisModule_CursorDestroy - destroy an existing cursor
4. RedisModule_Scan - scan keys
The RedisModule_Scan function gets a cursor object, a callback and void*
(used as user private data).
The callback will be called for each key in the database proving the key
name and the value as RedisModuleKey.
- the API name was odd, separated to two apis one for LRU and one for LFU
- the LRU idle time was in 1 second resolution, which might be ok for RDB
and RESTORE, but i think modules may need higher resolution
- adding tests for LFU and for handling maxmemory policy mismatch
Add two new functions that leverage the RedisModuleDataType mechanism
for RDB serialization/deserialization and make it possible to use it
to/from arbitrary strings:
* RM_SaveDataTypeToString()
* RM_LoadDataTypeFromString()
rename RM_ServerInfoGetFieldNumerical RM_ServerInfoGetFieldSigned
move string2ull to util.c
fix leak in RM_GetServerInfo when duplicate info fields exist
- Add RM_GetServerInfo and friends
- Add auto memory for new opaque struct
- Add tests for new APIs
other minor fixes:
- add const in various char pointers
- requested_section in modulesCollectInfo was actually not sds but char*
- extract new string2d out of getDoubleFromObject for code reuse
Add module API for
* replication hooks: role change, master link status, replica online/offline
* persistence hooks: saving, loading, loading progress
* misc hooks: cron loop, shutdown, module loaded/unloaded
* change the way hooks test work, and add tests for all of the above
startLoading() now gets flag indicating what is loaded.
stopLoading() now gets an indication of success or failure.
adding startSaving() and stopSaving() with similar args and role.
Adding a test for coverage for RM_Call in a new "misc" unit
to be used for various short simple tests
also solves compilation warnings in redismodule.h and fork.c
* create module API for forking child processes.
* refactor duplicate code around creating and tracking forks by AOF and RDB.
* child processes listen to SIGUSR1 and dies exitFromChild in order to
eliminate a valgrind warning of unhandled signal.
* note that BGSAVE error reply has changed.
valgrind error is:
Process terminating with default action of signal 10 (SIGUSR1)