Commit Graph

156 Commits

Eduardo Semprebon
91d0c758e5
Replica keep serving data during repl-diskless-load=swapdb for better availability (#9323)
For diskless replication in swapdb mode, considering we already spend replica memory
holding a backup of the current db to restore in case of failure, we can get the following benefits
by instead swapping databases only once we have succeeded in transferring the db from the master:

- Avoid the `LOADING` response during both failed and successful synchronization when the
  replica is already up and running with data.
- Faster total time of diskless replication, because now we move from Transfer + Flush + Load
  to Transfer + Load only. Flushing the tempDb is done asynchronously after swapping.
- This could also be implemented for disk-based replication with similar benefits if consumers are willing
  to spend the extra memory usage.

General notes:
- The concept of `backupDb` becomes `tempDb` for clarity.
- Async loading mode will only kick in if the replica is syncing from a master that has the same
  repl-id as the one it had before, i.e. the data it's getting belongs to a different point in time on the same timeline.
- New property in INFO: `async_loading`, to differentiate it from the blocking loading.
- Slot to Key mapping is now a field of `redisDb` as it's more natural to access it from both server.db
  and the tempDb that is passed around.
- Because this affects replicas only, we assume that if they are not read-only and accept write commands
  during replication, those writes are lost after SYNC the same way as before, but we're still denying CONFIG SET
  here anyway to avoid complications.

Considerations for review:
- The server.loading flag is used in many cases, and despite best efforts there may
  be cases where async_loading should be checked as well and cases where it shouldn't (this would
  require a very good understanding of the whole code base).
- Several places that had different behavior depending on the loading flag were actually meant to just
  handle commands coming from the AOF client differently from ones coming from real clients; these were
  changed to check CLIENT_ID_AOF instead.

**Additional for Release Notes**
- Bugfix - server.dirty was not incremented for any kind of diskless replication; as a result it wouldn't
  contribute to triggering the next database SAVE
- New flag for RM_GetContextFlags module API: REDISMODULE_CTX_FLAGS_ASYNC_LOADING
- Deprecated RedisModuleEvent_ReplBackup. Starting from Redis 7.0, we don't fire this event.
  Instead, we have the new RedisModuleEvent_ReplAsyncLoad holding 3 sub-events: STARTED,
  ABORTED and COMPLETED.
- New module flag REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD for RedisModule_SetModuleOptions
  to allow modules to declare they support diskless replication with async loading (when absent, we fall
  back to disk-based loading); see the sketch below.
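
Below is a rough sketch of how a module might opt in to the new behavior. It is only an illustration: the event and option names follow the release notes above, but the exact sub-event constant spellings are assumptions to be checked against redismodule.h.

```c
#include "redismodule.h"
#include <stdint.h>

/* Sketch only: log the async-load sub-events described above. */
static void asyncLoadCallback(RedisModuleCtx *ctx, RedisModuleEvent e,
                              uint64_t sub, void *data) {
    REDISMODULE_NOT_USED(e);
    REDISMODULE_NOT_USED(data);
    if (sub == REDISMODULE_SUBEVENT_REPL_ASYNC_LOAD_STARTED)        /* assumed spelling */
        RedisModule_Log(ctx, "notice", "async load started");
    else if (sub == REDISMODULE_SUBEVENT_REPL_ASYNC_LOAD_ABORTED)   /* assumed spelling */
        RedisModule_Log(ctx, "notice", "async load aborted");
    else if (sub == REDISMODULE_SUBEVENT_REPL_ASYNC_LOAD_COMPLETED) /* assumed spelling */
        RedisModule_Log(ctx, "notice", "async load completed");
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "asyncload", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    /* Without this option the replica falls back to disk-based loading. */
    RedisModule_SetModuleOptions(ctx, REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD);
    RedisModule_SubscribeToServerEvent(ctx, RedisModuleEvent_ReplAsyncLoad,
                                       asyncLoadCallback);
    return REDISMODULE_OK;
}
```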

Co-authored-by: Eduardo Semprebon <edus@saxobank.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2021-11-04 10:46:50 +02:00
Oran Agra
f1f3cceb50
fix valgrind issues with long double module test (#9709)
The module test in reply.tcl was introduced by #8521 but didn't run until recently (see #9639)
and then it started failing with valgrind.
This is because valgrind uses 64 bit long doubles (unlike most other platforms that have at least 80 bits).
But besides valgrind, the tests were also incompatible with ARM32, which also uses 64 bit long doubles.

We now use appropriate values to avoid issues with either valgrind or ARM32.

In all the double tests, I use 3.141, which is safe since addReplyDouble uses
`%.17Lg`, which is able to represent this value without adding any digits due to precision loss.

For the long double tests, since ld2string uses `%.17Lf`, it preserves 17 digits
after the decimal point rather than 17 significant digits (as `%.17Lg` does).
So to make these similar, I use a value lower than 1 (no digits left of
the decimal point).
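
For illustration, a minimal standalone C program showing the difference between the two format strings; the exact digits printed depend on the platform's long double width, which is precisely what the fix works around:

```c
#include <stdio.h>

int main(void) {
    long double v = 3.141L;
    /* %.17Lg prints 17 significant digits: with 80-bit long doubles this
     * renders 3.141 exactly, while 64-bit long doubles (valgrind, ARM32)
     * may expose extra digits from precision loss. */
    printf("%%.17Lg -> %.17Lg\n", v);
    /* %.17Lf prints 17 digits after the decimal point, which is why the
     * long double test uses a value below 1 to keep the comparison fair. */
    printf("%%.17Lf -> %.17Lf\n", v);
    return 0;
}
```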

Lastly, we have the same issue with TCL (no long doubles) so we read
raw protocol in that test.

Note that the only error before this fix (in both valgrind and ARM32) is this:
```
*** [err]: RM_ReplyWithLongDouble: a float reply in tests/unit/moduleapi/reply.tcl
Expected '3.141' to be equal to '3.14100000000000001' (context: type eval line 2 cmd {assert_equal 3.141 [r rw.longdouble 3.141]} proc ::test)
```
so the changes to debug.c and scripting.tcl aren't really needed, but I consider them a cleanup
(i.e. scripting.c validated a different constant than the one that's sent to it from debug.c).

Another unrelated change is to add the RESP version to the repeated tests in reply.tcl.
2021-11-01 13:41:35 +02:00
Shaya Potter
12ce2c3925
Add RM_ReplyWithBigNumber module API (#9639)
Lets modules use an additional type of RESP3 response (unused by Redis so far).
Also fixes tests that were introduced in #8521 but didn't actually run.
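
A minimal sketch of how a module could use the new reply type; the command name and the literal value are illustrative, and the signature shown is assumed to match the added API:

```c
#include "redismodule.h"
#include <string.h>

/* Replies with a number too wide for a 64-bit integer, passed as a string. */
static int BigNumCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    const char *big = "123456789012345678901234567890";
    return RedisModule_ReplyWithBigNumber(ctx, big, strlen(big));
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "example", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    if (RedisModule_CreateCommand(ctx, "example.bignum", BigNumCommand,
                                  "readonly", 0, 0, 0) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    return REDISMODULE_OK;
}
```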

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-10-25 11:31:20 +03:00
Wang Yuan
c1718f9d86
Replication backlog and replicas use one global shared replication buffer (#9166)
## Background
On a Redis master, each replica uses its own copy of the replication buffer; that is a big waste of memory
(more replicas, more waste), and allocating/freeing memory for every reply list is also costly.
If we set client-output-buffer-limit small and write traffic is heavy, the master may disconnect
replicas and never finish synchronization with them. If we set client-output-buffer-limit big,
the master may go OOM when there are many replicas that each keep a lot of memory.
Because the replication buffers of different replica clients are the same, one simple idea is that
all replicas use only one replication buffer, which effectively saves memory.

Since replication backlog content is the same as replicas' output buffer, now we
can discard replication backlog memory and use global shared replication buffer
to implement replication backlog mechanism.

## Implementation
I create one global "replication buffer" which contains content of replication stream.
The structure of the "replication buffer" is similar to the reply list that exists in every client,
but the list node is `replBufBlock`, which has `id`, `repl_offset`, and `refcount` fields.
```c
/* The replication buffer is a list of replBufBlock nodes.
 *
 * +--------------+       +--------------+       +--------------+
 * | refcount = 1 |  ...  | refcount = 0 |  ...  | refcount = 2 |
 * +--------------+       +--------------+       +--------------+
 *      |                                            /       \
 *      |                                           /         \
 *      |                                          /           \
 *  Repl Backlog                               Replica_A     Replica_B
 *
 * Each replica and the replication backlog increment only the refcount of the
 * 'ref_repl_buf_node' they point to. So when a replica walks to the next
 * node, it first increases the next node's refcount; and when we trim the
 * replication buffer nodes, we always remove from the head a node whose
 * refcount is 0. If the refcount of the head node is not 0, we must stop
 * trimming and never iterate to the next node. */

/* Similar to 'clientReplyBlock', it is used as a shared buffer between
 * all replica clients and the replication backlog. */
typedef struct replBufBlock {
    int refcount;           /* Number of replicas or repl backlog using. */
    long long id;           /* The unique incremental number. */
    long long repl_offset;  /* Start replication offset of the block. */
    size_t size, used;
    char buf[];
} replBufBlock;
```
So now, when we feed the replication stream into the replication backlog and all replicas, we only need
to feed the stream into the replication buffer via `feedReplicationBuffer`. In this function, we set some fields of
the replication backlog and replicas to references of the global replication buffer blocks. We also
check the replicas' output buffer limits and free replicas that exceed `client-output-buffer-limit`, and trim
the replication backlog if it exceeds `repl-backlog-size`.

When sending replies to replicas, we also iterate the replication buffer blocks and send their
content. Once one block has been fully sent to a replica, we decrease that node's refcount,
increase the next node's refcount, and then free blocks whose refcount has dropped to 0 from the
head of the replication buffer blocks.

Since we now use a linked list to manage the replication backlog, it may cost a lot of time to iterate
all the list nodes to find the corresponding replication buffer node. So we create a rax tree to
index some of the nodes; to avoid the rax tree occupying too much memory, we only record
one node per 64 in the index.

Currently, to make partial resynchronization possible as much as we can, we always keep the replication
backlog as the last reference to the replication buffer blocks. The backlog size may exceed our setting
if slow replicas reference a large number of replication buffer blocks, but this method doesn't increase
memory usage since they share the replication buffer. To avoid freezing the server while freeing unreferenced
replication buffer blocks when we need to trim the backlog for exceeding the backlog size setting,
we trim the backlog incrementally (currently 64 blocks per call), and faster in
`beforeSleep` (640 blocks); see the sketch below.
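
A simplified sketch of the incremental trimming idea follows. This is not the actual Redis code; it assumes the `replBufBlock` struct quoted above and Redis's `adlist` API (with the list's free method set to release blocks):

```c
#include "server.h" /* list, listNode, replBufBlock (sketch assumption) */

/* Free at most 'max_blocks' unreferenced blocks from the head of the shared
 * replication buffer. A non-zero refcount at the head means the block is
 * still needed by the backlog or by a replica, so trimming stops there. */
static void incrementalTrimSketch(list *repl_buffer_blocks, int max_blocks) {
    int freed = 0;
    listNode *head;
    while (freed < max_blocks &&
           (head = listFirst(repl_buffer_blocks)) != NULL) {
        replBufBlock *block = listNodeValue(head);
        if (block->refcount != 0) break; /* still referenced: stop trimming */
        listDelNode(repl_buffer_blocks, head); /* block freed via list free method */
        freed++;
    }
}
```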

### Other changes
- `mem_total_replication_buffers`: we add this field to the INFO command; it is the total
  memory used by replication buffers.
- `mem_clients_slaves`: now, even if a replica is slow to replicate and its output buffer memory
  is not 0, this field may still be 0. Since the replication backlog and the replicas share one global replication
  buffer, only when the replication buffer memory exceeds the repl backlog setting size do we consider
  the excess as the replicas' memory; otherwise, we regard the replication buffer memory as the consumption
  of the repl backlog.
- Key eviction
  Since all replicas and the replication backlog share the global replication buffer, we consider only the
  part exceeding the backlog size as the extra, separate consumption of replicas.
  Because we trim the backlog incrementally in the background, the backlog size may exceed our
  setting if slow replicas that reference a large number of replication buffer blocks disconnect.
  To avoid a massive eviction loop, we don't count the delayed-freed replication backlog toward
  used memory even if there are no replicas, i.e. we still regard this memory as the replicas' memory.
- `client-output-buffer-limit` check for replica clients
  It doesn't make sense to set the replica clients output buffer limit lower than the repl-backlog-size
  config (partial sync will succeed and then replica will get disconnected). Such a configuration is
  ignored (the size of repl-backlog-size will be used). This doesn't have memory consumption
  implications since the replica client will share the backlog buffers memory.
- Drop replication backlog after loading data if needed
  We always create a replication backlog if the server is a master; we need it because we put DELs in
  it when loading expired keys from the RDB. But if the RDB doesn't have replication info, or there is no RDB,
  it is not possible to support partial resynchronization, so we drop the backlog to avoid its extra memory.
- Multi I/O threads
  Since all replicas and the replication backlog use the global replication buffer, if I/O threads are enabled,
  the main thread must handle sending the output buffer to all replicas to guarantee thread-safe data access.
  Previously, other I/O threads could handle sending the output buffers of all replicas.

## Other optimizations
This solution also resolves some other problems:
- When replicas are disconnected from the master for exceeding the output buffer limit, releasing their output
  buffers could freeze the server if a big `client-output-buffer-limit` was set for replicas; now
  it doesn't cause freezing.
- This implementation also mitigates the cost of copying the reply list (which could also freeze the server) when one
  replica has a huge reply buffer and another replica needs a copy of it for full synchronization. Now we just copy the
  reference info, which is very light.
- If the replication backlog size is set big, copying the replication backlog into a replica's output buffer
  could also take a long time. This commit eliminates that problem.
- Resizing replication backlog size doesn't empty current replication backlog content.
2021-10-25 09:24:31 +03:00
Shaya Potter
cf860df599
Fix module blocked clients RESP version (#9634)
Before this commit, module blocked clients did not carry through the original RESP version, resulting with RESP3 clients receiving unexpected RESP2 replies.
2021-10-21 14:01:10 +03:00
Oran Agra
7d6744c739
fix new cluster tests issues (#9657)
Following #9483 the daily CI exposed a few problems.

* The cluster creation code (uses redis-cli) is complicated to test with TLS enabled.
  For now I'm just skipping them since the tests we run there don't really need that kind of coverage.
* cluster port binding failures
  note that `find_available_port` already looks for a free cluster port
  but the code in `wait_server_started` couldn't detect the failure of binding
  (the text it greps for wasn't found in the log)
2021-10-20 15:40:28 +03:00
guybe7
43e736f79b
Treat subcommands as commands (#9504)
## Intro

The purpose is to allow having different flags/ACL categories for
subcommands (Example: CONFIG GET is ok-loading but CONFIG SET isn't)

We create a small command table for every command that has subcommands
and each subcommand has its own flags, etc. (same as a "regular" command)

This commit also unites the Redis and the Sentinel command tables

## Affected commands

CONFIG
Used to have "admin ok-loading ok-stale no-script"
Changes:
1. Dropped "ok-loading" in all except GET (this doesn't change behavior since
there were checks in the code doing that)

XINFO
Used to have "read-only random"
Changes:
1. Dropped "random" in all except CONSUMERS

XGROUP
Used to have "write use-memory"
Changes:
1. Dropped "use-memory" in all except CREATE and CREATECONSUMER

COMMAND
No changes.

MEMORY
Used to have "random read-only"
Changes:
1. Dropped "random" in PURGE and USAGE

ACL
Used to have "admin no-script ok-loading ok-stale"
Changes:
1. Dropped "admin" in WHOAMI, GENPASS, and CAT

LATENCY
No changes.

MODULE
No changes.

SLOWLOG
Used to have "admin random ok-loading ok-stale"
Changes:
1. Dropped "random" in RESET

OBJECT
Used to have "read-only random"
Changes:
1. Dropped "random" in ENCODING and REFCOUNT

SCRIPT
Used to have "may-replicate no-script"
Changes:
1. Dropped "may-replicate" in all except FLUSH and LOAD

CLIENT
Used to have "admin no-script random ok-loading ok-stale"
Changes:
1. Dropped "random" in all except INFO and LIST
2. Dropped "admin" in ID, TRACKING, CACHING, GETREDIR, INFO, SETNAME, GETNAME, and REPLY

STRALGO
No changes.

PUBSUB
No changes.

CLUSTER
Changes:
1. Dropped "admin in countkeysinslots, getkeysinslot, info, nodes, keyslot, myid, and slots

SENTINEL
No changes.

(note that DEBUG also fits, but we decided not to convert it since it's for
debugging and anyway undocumented)

## New sub-command
This commit adds another element to the per-command output of COMMAND,
describing the list of subcommands, if any (in the same structure as "regular" commands)
Also, it adds a new subcommand:
```
COMMAND LIST [FILTERBY (MODULE <module-name>|ACLCAT <cat>|PATTERN <pattern>)]
```
which returns the set of all commands (optionally filtered), but excluding subcommands.

## Module API
A new module API, RM_CreateSubcommand, was added in order to allow
module writers to define subcommands (a sketch follows below).
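
A hedged sketch of registering a container command with per-subcommand flags; the module name, command names, flag strings, and handler bodies are illustrative:

```c
#include "redismodule.h"

static int CfgGetCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    return RedisModule_ReplyWithSimpleString(ctx, "get-ok");
}

static int CfgSetCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    return RedisModule_ReplyWithSimpleString(ctx, "set-ok");
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "container", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    /* A NULL handler makes "container.cfg" a pure container command. */
    if (RedisModule_CreateCommand(ctx, "container.cfg", NULL, "", 0, 0, 0) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    RedisModuleCommand *parent = RedisModule_GetCommand(ctx, "container.cfg");
    if (parent == NULL) return REDISMODULE_ERR;
    /* Each subcommand carries its own flags, just like a regular command. */
    if (RedisModule_CreateSubcommand(parent, "get", CfgGetCommand, "readonly", 0, 0, 0) == REDISMODULE_ERR ||
        RedisModule_CreateSubcommand(parent, "set", CfgSetCommand, "write", 0, 0, 0) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    return REDISMODULE_OK;
}
```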

## ACL changes:
1. Now that each subcommand is actually a command, each has its own ACL id.
2. The old mechanism of allowed_subcommands is redundant
(blocking/allowing a subcommand is the same as blocking/allowing a regular command),
but we had to keep it, to support the widespread usage of allowed_subcommands
to block commands with certain args, that aren't subcommands (e.g. "-select +select|0").
3. I have renamed allowed_subcommands to allowed_firstargs to emphasize the difference.
4. Because subcommands are commands in ACL too, you can now use "-" to block subcommands
(e.g. "+client -client|kill"), which wasn't possible in the past.
5. It is also possible to use the allowed_firstargs mechanism with subcommands.
For example: `+config -config|set +config|set|loglevel` will block all CONFIG SET except
for setting the log level.
6. All of the ACL changes above required some amount of refactoring.

## Misc
1. There are two approaches: Either each subcommand has its own function or all
   subcommands use the same function, determining what to do according to argv[0].
   For now, I took the former approach only with CONFIG and COMMAND,
   while other commands use the latter approach (for a smaller blame-log diff).
2. Deleted memoryGetKeys: It is no longer needed because MEMORY USAGE now uses the "range" key spec.
3. Bugfix: GETNAME was missing from CLIENT's help message.
4. Sentinel and Redis now use the same table, with the same function pointer.
   Some commands have a different implementation in Sentinel, so we redirect
   them (these are ROLE, PUBLISH, and INFO).
5. Command stats now show the stats per subcommand (e.g. instead of stats just
   for "config" you will have stats for "config|set", "config|get", etc.)
6. It is now possible to use COMMAND directly on subcommands:
   COMMAND INFO CONFIG|GET (the pipe syntax was inspired by ACL, and
   can be used in the functions lookupCommandBySds and lookupCommandByCString)
7. STRALGO is now a container command (has "help")

## Breaking changes:
1. Command stats now show the stats per subcommand (see (5) above)
2021-10-20 11:52:57 +03:00
qetu3790
4962c5526d
Release clients blocked on module commands in cluster resharding and down state (#9483)
Prevent clients from being blocked forever in cluster when they block with their own module command
and the hash slot is migrated to another master at the same time.
These will get a redirection message when unblocked.
Also, release clients blocked on module commands when cluster is down (same as other blocked clients)

This commit adds basic cluster tests to the main (non-cluster) redis test infra.
This was done because the cluster test infra can't handle some common test features,
but most importantly we only build the test modules with the non-cluster test suite.

Note that rather than really supporting cluster operations in the test infra, this was added (as duplicated code)
in two files, one for module tests and one for non-module tests; maybe in the future we'll refactor that.

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-10-19 11:50:37 +03:00
Viktor Söderqvist
b7f2a1a217
Add RedisModule_KeyExists (#9600)
The LRU of the key is not touched. Logically expired keys logically do not
exist, so they're treated as such.
2021-10-18 22:21:19 +03:00
Hanna Fadida
ffafb434fb
Modules: add RM_LoadDataTypeFromStringEncver (#9537)
Adding an advanced API to enable loading data that was serialized with a specific encoding version.
2021-09-30 11:21:32 +03:00
YaacovHazan
a56d4533b7
Adding ACL support for modules (#9309)
This commit introduces a new flag for RM_Call:
'C' - Check if the command can be executed according to the ACLs associated with it.

Also, three new APIs were added to check whether a command, key, or channel can be executed or accessed
by a user, according to the ACLs associated with it.
- RM_ACLCheckCommandPerm
- RM_ACLCheckKeyPerm
- RM_ACLCheckChannelPerm

The user for these APIs is a RedisModuleUser object, which for a module user is returned by the RM_CreateModuleUser API; for a general ACL user it can be retrieved via these two new APIs:
- RM_GetCurrentUserName - Retrieve the user name of the client connection behind the current context.
- RM_GetModuleUserFromUserName - Get a RedisModuleUser from a user name

As a result of being able to get a RedisModuleUser by name, a module can now also access the general ACL users (not just the ones created by the module).
This means the already existing API RM_SetModuleUserACL() can be used to change the ACL rules for such users.
2021-09-23 08:52:56 +03:00
guybe7
03fcc211de
A better approach for COMMAND INFO for movablekeys commands (#8324)
Fix #7297

The problem:

Today, there is no way for a client library or app to know the key name indexes for commands such as
ZUNIONSTORE/EVAL and others with "numkeys", since COMMAND INFO returns no useful info for them.

For cluster-aware redis clients, this requires 'patching' the client library code specifically for each of these commands, or
resolving each execution of these commands with COMMAND GETKEYS.

The solution:

Introducing key specs other than the legacy "range" (first,last,step)

The 8th element of the command info array, if it exists, holds an array of key specs. The array may be empty, which indicates
the command doesn't take any key arguments, or it may contain one or more key specs, each of which may lead to the discovery
of 0 or more key arguments.

A client library that doesn't support this key-spec feature will keep using the first,last,step and movablekeys flag which will
obviously remain unchanged.

A client that supports this key-specs feature needs only to look at the key-specs array. If it finds an unrecognized spec, it
must resort to using COMMAND GETKEYS if it wishes to get all key name arguments, but if all it needs is one key in order
to know which cluster node to use, then maybe another spec (if the command has several) can supply that, and there's no
need to use GETKEYS.

Each spec is an array of arguments: the first one is the spec name, the second is an array of flags, and the third is an array
containing details about the spec (with a specific meaning for each spec type).
The initial flags we support are "read" and "write", indicating whether the keys that this key spec finds are used for reads or for writes;
clients should ignore any unfamiliar flags.

In order to easily find the positions of keys in a given array of args, we introduce key specs. A key spec has two logical
steps (a client-side sketch of interpreting one such spec appears after the find_keys examples below):
1. `start_search`: Given an array of args, indicate where we should start searching for keys
2. `find_keys`: Given the output of start_search and an array of args, indicate all possible indices of keys.

### start_search step specs
- `index`: specify an argument index explicitly
  - `index`: 0-based index into the command arguments (0 is the command name itself, so 1 means the first command argument)
- `keyword`: specify a string to match in `argv`. We should start searching for keys just after the keyword appears.
  - `keyword`: the string to search for
  - `start_search`: an index from which to start the keyword search (can be negative, which means to search from the end)

Examples:
- `SET` has start_search of type `index` with value `1`
- `XREAD` has start_search of type `keyword` with value `[“STREAMS”,1]`
- `MIGRATE` has start_search of type `keyword` with value `[“KEYS”,-2]`

### find_keys step specs
- `range`: specify `[lastkey, step, limit]`.
  - `lastkey`: index of the last key, relative to the index returned from `start_search`. -1 indicates up to the last argument, -2 one before the last
  - `step`: how many args we should skip after finding a key, in order to find the next one
  - `limit`: if lastkey is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on.
- `keynum`: specify `[keynum_index, first_key_index, step]`.
  - `keynum_index`: relative to the index returned from the `start_search` spec.
  - `first_key_index`: relative to `keynum_index`.
  - `step`: how many args we should skip after finding a key, in order to find the next one

Examples:
- `SET` has `range` of `[0,1,0]`
- `MSET` has `range` of `[-1,2,0]`
- `XREAD` has `range` of `[-1,1,2]`
- `ZUNION` has `start_search` of type `index` with value `1` and `find_keys` of type `keynum` with value `[0,1,1]`
- `AI.DAGRUN` has `start_search` of type `keyword` with value `[“LOAD“,1]` and `find_keys` of type `keynum` with value
  `[0,1,1]` (see https://oss.redislabs.com/redisai/master/commands/#aidagrun)
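
To make the two steps concrete, here is a client-side sketch (not Redis source; all names are illustrative) of interpreting one spec whose `start_search` is a `keyword` and whose `find_keys` is `keynum`, e.g. the `AI.DAGRUN` example above:

```c
#include <stdlib.h>
#include <strings.h>

/* Returns the number of key indices written to 'out'. */
static int findKeysKeywordKeynum(char **argv, int argc,
                                 const char *keyword, int start_from,
                                 int keynum_index, int first_key, int step,
                                 int *out, int out_cap) {
    /* start_search: locate the keyword; a negative start_from searches
     * backwards from the end of the arguments. */
    int begin = -1;
    if (start_from >= 0) {
        for (int i = start_from; i < argc; i++)
            if (!strcasecmp(argv[i], keyword)) { begin = i + 1; break; }
    } else {
        for (int i = argc + start_from; i > 0; i--)
            if (!strcasecmp(argv[i], keyword)) { begin = i + 1; break; }
    }
    if (begin < 0) return 0;
    /* find_keys (keynum): argv[begin + keynum_index] holds the number of keys,
     * and the first key sits first_key positions after it. */
    int keynum_pos = begin + keynum_index;
    if (keynum_pos >= argc) return 0;
    int numkeys = atoi(argv[keynum_pos]);
    int n = 0;
    for (int k = 0; k < numkeys && n < out_cap; k++) {
        int idx = keynum_pos + first_key + k * step;
        if (idx >= argc) break;
        out[n++] = idx;
    }
    return n;
}
```

For `AI.DAGRUN LOAD 2 k1 k2 ...` this returns the indices of `k1` and `k2` when called with `keyword="LOAD"`, `start_from=1`, and the keynum spec `[0,1,1]`.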

Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key
args of the vast majority of commands.
If one of the above specs can’t describe the key positions, the module writer can always fall back to the `getkeys-api` option.

Some keys cannot be found easily (`KEYS` in `MIGRATE`: Imagine the argument for `AUTH` is the string “KEYS” - we will
start searching in the wrong index). 
The guarantee is that the specs may be incomplete (`incomplete` will be specified in the spec to denote that) but we never
report false information (assuming the command syntax is correct).
For `MIGRATE` we start searching from the end - `startfrom=-1` - and if one of the keys is actually called "keys" we will
report only a subset of all keys - hence the `incomplete` flag.
Some `incomplete` specs can be completely empty (i.e. UNKNOWN begin_search) which should tell the client that
COMMAND GETKEYS (or any other way to get the keys) must be used (Example: For `SORT` there is no way to describe
the STORE keyword spec, as the word "store" can appear anywhere in the command).

We will expose these key specs in the `COMMAND` command so that clients can learn, on startup, where the keys are for
all commands instead of holding hardcoded tables or use `COMMAND GETKEYS` in runtime.

Comments:
1. Redis doesn't internally use the new specs, they are only used for COMMAND output.
2. In order to support the current COMMAND INFO format (reply array indices 4, 5, 6) we created a synthetic range, called
   legacy_range, that, if possible, is built according to the new specs.
3. Redis currently uses only getkeys_proc or the legacy_range to get the keys indices (in COMMAND GETKEYS for
   example).

"incomplete" specs:
the command we have issues with are MIGRATE, STRALGO, and SORT
for MIGRATE, because the token KEYS, if exists, must be the last token, we can search in reverse. it one of the keys is
actually the string "keys" will return just a subset of the keys (hence, it's "incomplete")
for SORT and STRALGO we can use this heuristic (the keys can be anywhere in the command) and therefore we added a
key spec that is both "incomplete" and of "unknown type"

If a client encounters an "incomplete" spec, it means it must find a different way (either COMMAND GETKEYS or
its own parser) to retrieve the keys.
Please note that all commands, apart from the three mentioned above, have "complete" key specs.
2021-09-15 11:10:29 +03:00
Viktor Söderqvist
ea36d4de17
Modules: Add remaining list API functions (#8439)
List functions operating on elements by index:

* RM_ListGet
* RM_ListSet
* RM_ListInsert
* RM_ListDelete

Iteration is done using a simple for loop over indices.
The index based functions use an internal iterator as an optimization.
This is explained in the docs:

```
 * Many of the list functions access elements by index. Since a list is in
 * essence a doubly-linked list, accessing elements by index is generally an
 * O(N) operation. However, if elements are accessed sequentially or with
 * indices close together, the functions are optimized to seek the index from
 * the previous index, rather than seeking from the ends of the list.
 *
 * This enables iteration to be done efficiently using a simple for loop:
 *
 *     long n = RM_ValueLength(key);
 *     for (long i = 0; i < n; i++) {
 *         RedisModuleString *elem = RedisModule_ListGet(key, i);
 *         // Do stuff...
 *     }
```
2021-09-14 17:48:06 +03:00
Meir Shpilraien (Spielrein)
2237131e15
Unified Lua and modules reply parsing and added RESP3 support to RM_Call (#9202)
## Current state
1. Lua has its own parser that handles parsing `redis.call` replies and translates them
  to Lua objects that can be used by the user Lua code. The parser partially handles
  resp3 (missing big number, verbatim, attribute, ...)
2. Modules have their own parser that handles parsing `RM_Call` replies and translates
  them to RedisModuleCallReply objects. The parser does not support resp3.

In addition, in the future, we want to add Redis Function (#8693) that will probably
support more languages. At some point maintaining so many parsers will stop
scaling (bug fixes and protocol changes will need to be applied on all of them).
We will probably end up with different parsers that support different parts of the
resp protocol (like we already have today with Lua and modules)

## PR Changes
This PR attempts to unify the reply parsing of Lua and modules (and, in the future,
Redis Functions) by introducing a new parser unit (`resp_parser.c`). The new parser
handles parsing the reply and calls different callbacks to allow its users (another
unit that uses the parser, i.e. Lua, modules, or Redis Functions) to analyze the reply.

### Lua API Additions
The code that handles reply parsing in `scripting.c` was removed. Instead, it uses
the resp_parser to parse the reply and create a Lua object out of it. As mentioned
above, the Lua parser did not handle parsing big numbers, verbatim strings, and attributes.
The new parser can handle those, so Lua also gets them for free.
Those are translated to Lua objects in the following way:
1. Big Number - Lua table `{'big_number':'<str representation for big number>'}`
2. Verbatim - Lua table `{'verbatim_string':{'format':'<verbatim format>', 'string':'<verbatim string value>'}}`
3. Attribute - currently ignored and not exposed to the Lua script; another issue will be opened to decide how to expose it.

Tests were added to check resp3 reply parsing on Lua

### Modules API Additions
The reply parsing code in `module.c` was also removed and the new resp_parser is used instead.
In addition, the RedisModuleCallReply was extracted to a separate unit located in `call_reply.c`
(in the future, this unit will also be used by Redis Functions). A nice side effect of unified parsing is
that modules now also support resp3. Resp3 can be enabled by giving `3` as a parameter to the
fmt argument of `RM_Call`. It is also possible to give `0`, which indicates auto mode, i.e. Redis
will automatically choose the reply protocol based on the current client set on the RedisModuleCtx
(this mode will mostly be used when the module wants to pass the reply to the client as is).
In addition, the following RedisModuleAPI were added to allow analyzing resp3 replies:

* New RedisModuleCallReply types:
   * `REDISMODULE_REPLY_MAP`
   * `REDISMODULE_REPLY_SET`
   * `REDISMODULE_REPLY_BOOL`
   * `REDISMODULE_REPLY_DOUBLE`
   * `REDISMODULE_REPLY_BIG_NUMBER`
   * `REDISMODULE_REPLY_VERBATIM_STRING`
   * `REDISMODULE_REPLY_ATTRIBUTE`

* New RedisModuleAPI:
   * `RedisModule_CallReplyDouble` - getting double value from resp3 double reply
   * `RedisModule_CallReplyBool` - getting boolean value from resp3 boolean reply
   * `RedisModule_CallReplyBigNumber` - getting big number value from resp3 big number reply
   * `RedisModule_CallReplyVerbatim` - getting format and value from resp3 verbatim reply
   * `RedisModule_CallReplySetElement` - getting element from resp3 set reply
   * `RedisModule_CallReplyMapElement` - getting key and value from resp3 map reply
   * `RedisModule_CallReplyAttribute` - getting a reply attribute
   * `RedisModule_CallReplyAttributeElement` - getting key and value from resp3 attribute reply
   
* New context flags:
   * `REDISMODULE_CTX_FLAGS_RESP3` - indicate that the client is using resp3

Tests were added to check the new RedisModuleAPI
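
A hedged sketch of the new mode (hypothetical command `example.hgetallmap`; the map-length semantics and the exact element-accessor signature are assumptions to verify against the API docs). Appending `3` to the fmt string requests a RESP3 reply, which for HGETALL is a map; note that forwarding a RESP3 reply to a RESP2 client may fail, as described under "Modules API Changes" below:

```c
static int HGetAllMapCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    RedisModuleCallReply *reply = RedisModule_Call(ctx, "HGETALL", "s3", argv[1]);
    if (reply == NULL || RedisModule_CallReplyType(reply) != REDISMODULE_REPLY_MAP)
        return RedisModule_ReplyWithError(ctx, "ERR unexpected reply type");
    size_t pairs = RedisModule_CallReplyLength(reply); /* assumed: pairs for maps */
    RedisModule_ReplyWithMap(ctx, pairs);
    for (size_t i = 0; i < pairs; i++) {
        RedisModuleCallReply *key, *val;
        RedisModule_CallReplyMapElement(reply, i, &key, &val);
        RedisModule_ReplyWithCallReply(ctx, key);
        RedisModule_ReplyWithCallReply(ctx, val);
    }
    RedisModule_FreeCallReply(reply);
    return REDISMODULE_OK;
}
```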

### Modules API Changes
* RM_ReplyWithCallReply might return REDISMODULE_ERR if the given CallReply is in resp3
  but the client expects resp2. This is not a breaking change because in order to get a resp3
  CallReply one needs to specifically specify `3` as a parameter to the fmt argument of
  `RM_Call` (as mentioned above).

Tests were added to check this change

### More small Additions
* Added `debug set-disable-deny-scripts`, which allows turning the commands' no-script
flag protection on and off. This is used by the Lua resp3 tests so that it is possible to run `debug protocol`
and check the resp3 parsing code.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2021-08-04 16:28:07 +03:00
Ariel Shtul
bdbf5eedae
Module api support for RESP3 (#8521)
Add new Module APIs for RESP3 responses:
- RM_ReplyWithMap
- RM_ReplyWithSet
- RM_ReplyWithAttribute
- RM_ReplySetMapLength
- RM_ReplySetSetLength
- RM_ReplySetAttributeLength
- RM_ReplyWithBool

Deprecate REDISMODULE_POSTPONED_ARRAY_LEN in favor of a generic REDISMODULE_POSTPONED_LEN

Improve documentation and add tests (a usage sketch follows below).
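
A short usage sketch of the generic postponed length (hypothetical command `example.evens` that replies with a set of even numbers whose size is only known at the end):

```c
static int EvensCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    long long n;
    if (RedisModule_StringToLongLong(argv[1], &n) != REDISMODULE_OK)
        return RedisModule_ReplyWithError(ctx, "ERR not an integer");
    /* Length not known yet: use the new generic postponed-length constant. */
    RedisModule_ReplyWithSet(ctx, REDISMODULE_POSTPONED_LEN);
    long long count = 0;
    for (long long i = 0; i <= n; i++) {
        if (i % 2 == 0) { RedisModule_ReplyWithLongLong(ctx, i); count++; }
    }
    RedisModule_ReplySetSetLength(ctx, count);
    return REDISMODULE_OK;
}
```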

Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-03 11:37:19 +03:00
Guy Korland
1483f5aa9b
Remove const from CommandFilterArgGet result (#9247)
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2021-08-01 11:29:32 +03:00
ZhaolongLi
8d00493485
tests: fix exec fails when grep exits with status other than 0 (#9066)
Co-authored-by: lizhaolong.lzl <lizhaolong.lzl@B-54MPMD6R-0221.local>
2021-07-25 09:58:21 +03:00
Yossi Gottlieb
aa139e2f02
Fix CLIENT UNBLOCK crashing modules. (#9167)
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.
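
A sketch of that pattern (registration boilerplate omitted; the reply callback and worker are illustrative): the client is blocked with a NULL timeout callback and a 0 timeout, and unblocked later from the module's own thread.

```c
#include "redismodule.h"
#include <pthread.h>

static int ReplyCallback(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    return RedisModule_ReplyWithSimpleString(ctx, "done");
}

static void *Worker(void *arg) {
    RedisModuleBlockedClient *bc = arg;
    /* ... background work ... */
    RedisModule_UnblockClient(bc, NULL);
    return NULL;
}

/* No timeout callback, no timeout: before the fix, CLIENT UNBLOCK on such a
 * client crashed by invoking the NULL timeout callback. */
static int BlockCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    RedisModuleBlockedClient *bc =
        RedisModule_BlockClient(ctx, ReplyCallback, NULL, NULL, 0);
    pthread_t tid;
    if (pthread_create(&tid, NULL, Worker, bc) != 0) {
        RedisModule_AbortBlock(bc);
        return RedisModule_ReplyWithError(ctx, "ERR can't start thread");
    }
    pthread_detach(tid);
    return REDISMODULE_OK;
}
```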

Before this commit, `CLIENT UNBLOCK` would result in a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commit makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
2021-07-01 17:11:27 +03:00
Oran Agra
5ffdbae1f6
Fix failing basics moduleapi test on 32bit CI (#9140) 2021-06-24 12:44:13 +03:00
Evan
1ccf2ca2f4
modules: Add newlen == 0 handling to RM_StringTruncate (#3717) (#3718)
Previously, passing 0 for newlen would not truncate the string at all.
This adds handling of this case, freeing the old string and creating a new empty string.
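
A minimal sketch of the now-handled case (hypothetical command `example.clear` that empties a string key):

```c
static int ClearCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    RedisModuleKey *key = RedisModule_OpenKey(ctx, argv[1], REDISMODULE_WRITE);
    /* Truncating to 0 now frees the old string and leaves an empty string. */
    if (RedisModule_StringTruncate(key, 0) != REDISMODULE_OK) {
        RedisModule_CloseKey(key);
        return RedisModule_ReplyWithError(ctx, "ERR not a string key");
    }
    RedisModule_CloseKey(key);
    return RedisModule_ReplyWithSimpleString(ctx, "OK");
}
```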

Other changes:
- Move `src/modules/testmodule.c` to `tests/modules/basics.c`
- Introduce that basic test into the test suite
- Add tests to cover StringTruncate
- Add `test-modules` build target for the main makefile
- Extend `distclean` build target to clean modules too
2021-06-22 12:26:48 +03:00
chenyang8094
e0cd3ad0de
Enhance mem_usage/free_effort/unlink/copy callbacks and add GetDbFromIO api. (#8999)
Create new module type enhanced callbacks: mem_usage2, free_effort2, unlink2, copy2.
These will be given a context pointer from which the module can obtain the key name and database id.
In addition, the digest and defrag contexts can now be used to obtain the key name and database id.
2021-06-16 09:45:49 +03:00
Binbin
0bfccc55e2
Fixed some typos, add a spell check ci and others minor fix (#8890)
This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes.
This spell checker is based on a blacklist of common spelling mistakes, so it will not catch everything,
but at least it is also unlikely to cause false positives.

Besides that, the PR also fixes many spelling mistakes and typos, not all of which are a result of the spell checker we use.

Here's a summary of other changes:
1. Scanned the entire source code and fixed all sorts of typos and spelling mistakes (including missing or extra spaces).
2. Outdated function / variable / argument names in comments
3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString.
4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751
5. Some outdated https link URLs.
6. Fix some outdated comments, such as:
    - In README: about the RDB, we used to say it creates a `thread`; changed to `process`
    - dbRandomKey function comment (about dictGetRandomKey, changed to dictGetFairRandomKey)
    - notifyKeyspaceEvent function comment (add type arg)
    - Some other minor fixes in comments (most of them incorrectly quoted variable names)
7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`
2021-06-10 15:39:33 +03:00
Oran Agra
7cb42c9c36 add test for modules load/unload and config rewrite 2021-06-01 13:43:48 +03:00
Hanna Fadida
53a4d6c3b1
Modules: adding a module type for key space notification (#8759)
Adding a new type mask for keyspace notifications, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which are currently unsupported).

Modules can subscribe to module-key keyspace notifications with RM_SubscribeToKeyspaceEvents,
and clients via notify-keyspace-events in redis.conf or via CONFIG SET, using the characters 'd' or 'A'
(the REDISMODULE_NOTIFY_MODULE type mask is part of the '**A**ll' notation for keyspace notifications).
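
A hedged sketch of a module subscribing with the new mask (module and callback names are illustrative):

```c
#include "redismodule.h"

/* Invoked for events fired on module-type keys once subscribed with the new mask. */
static int ModuleKeyNotify(RedisModuleCtx *ctx, int type, const char *event,
                           RedisModuleString *key) {
    REDISMODULE_NOT_USED(type);
    RedisModule_Log(ctx, "notice", "module-key event '%s' on %s",
                    event, RedisModule_StringPtrLen(key, NULL));
    return REDISMODULE_OK;
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv); REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "modnotify", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    RedisModule_SubscribeToKeyspaceEvents(ctx, REDISMODULE_NOTIFY_MODULE,
                                          ModuleKeyNotify);
    return REDISMODULE_OK;
}
```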

Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.
2021-04-19 21:33:26 +03:00
guybe7
f40ca9cb58
Modules: Replicate lazy-expire even if replication is not allowed (#8816)
Before this commit, using RM_Call without "!" could cause the master
to lazy-expire a key (delete it) without replicating the deletion to replicas.
This could cause the replica's memory usage to gradually grow and
could also cause consistency issues if the master and replica have
a clock diff.
This bug was introduced in #8617

Added a test which demonstrates that scenario.
2021-04-19 17:16:02 +03:00
guybe7
dba33a943d
Missing EXEC on modules propagation after failed EVAL execution (#8654)
1. moduleReplicateMultiIfNeeded should use server.in_eval like
   moduleHandlePropagationAfterCommandCallback
2. server.in_eval could have been set to 1 and not reset back
   to 0 (a lot of missed early-exits after in_eval is already 1)

Note: The new assertions in processCommand cover (2) and I added
two module tests to cover (1)

Implications:
If an EVAL that failed (and thus left server.in_eval=1) runs before a module
command that replicates, the replication stream will contain MULTI (because
moduleReplicateMultiIfNeeded used to check server.lua_caller which is NULL
at this point) but not EXEC (because server.in_eval==1)
This only affects modules, as module.c is the only user of server.in_eval.

Affects versions 6.2.0, 6.2.1
2021-03-15 21:19:57 +02:00
guybe7
3d0b427c30
Fix some issues with modules and MULTI/EXEC (#8617)
Bug 1:
When a module ctx is freed moduleHandlePropagationAfterCommandCallback
is called and handles propagation. We want to prevent it from propagating
commands that were not replicated by the same context. Example:
1. module1.foo does: RM_Replicate(cmd1); RM_Call(cmd2); RM_Replicate(cmd3)
2. RM_Replicate(cmd1) propagates MULTI and adds cmd1 to also_propagate
3. RM_Call(cmd2) create a new ctx, calls call() and destroys the ctx.
4. moduleHandlePropagationAfterCommandCallback is called, calling
   alsoPropagate() for EXEC (Note: EXEC is still not written to the socket),
   setting server.in_transaction = 0
5. RM_Replicate(cmd3) is called, propagating yet another MULTI (now
   we have nested MULTI calls, which is no good) and then cmd3

We must prevent RM_Call(cmd2) from resetting server.in_transaction.
REDISMODULE_CTX_MULTI_EMITTED was revived for that purpose.

Bug 2:
Fix issues with nested RM_Call where some have '!' and some don't.
Example:
1. module1.foo does RM_Call of module2.bar without replication (i.e. no '!')
2. module2.bar internally calls RM_Call of INCR with '!'
3. at the end of module1.foo we call RM_ReplicateVerbatim

We want the replica/AOF to see only module1.foo and not the INCR from module2.bar

Introduced a global replication_allowed flag inside RM_Call to determine
whether we need to replicate or not (even if '!' was specified)

Other changes:
Split beforePropagateMultiOrExec into beforePropagateMulti and afterPropagateExec,
just for better readability
2021-03-10 18:02:17 +02:00
Bonsai
81a55d026f
fix: call CLIENT INFO from redis module will crash the server (#8560)
When RM_Call is invoked, it creates a fake client.
The point is that the client's connection is NULL, so the server crashes in connGetInfo.

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2021-03-01 08:18:14 +02:00
Viktor Söderqvist
6122f1c450
Shared reusable client for RM_Call() (#8516)
A single client pointer is added in the server struct. This is
initialized by the first RM_Call() and reused for every subsequent
RM_Call() except if it's already in use, which means that it's not
used for (recursive) module calls to modules. For these, a new
"fake" client is created each time.

Other changes:
* Avoid allocating a dict iterator in pubsubUnsubscribeAllChannels
  when not needed
2021-02-28 14:11:18 +02:00
Yossi Gottlieb
141ac8df59
Escape unsafe field name characters in INFO. (#8492)
Fixes #8489
2021-02-15 17:08:53 +02:00
Viktor Söderqvist
0bc8c9c8f9
Modules: In RM_HashSet, add COUNT_ALL flag and set errno (#8446)
The added flag affects the return value of RM_HashSet() to include
the number of inserted fields, in addition to updated and deleted
fields.

errno is set on errors, tests are added and documentation updated.
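
A hedged sketch of reading the combined count back (hypothetical command; the exact flag name is assumed to be REDISMODULE_HASH_COUNT_ALL and should be checked against redismodule.h):

```c
static int HSetCountCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 4) return RedisModule_WrongArity(ctx);
    RedisModuleKey *key = RedisModule_OpenKey(ctx, argv[1], REDISMODULE_WRITE);
    /* With the COUNT_ALL flag the return value also counts newly inserted
     * fields, not only updated and deleted ones; errno is set on errors. */
    int count = RedisModule_HashSet(key, REDISMODULE_HASH_COUNT_ALL,
                                    argv[2], argv[3], NULL);
    RedisModule_CloseKey(key);
    return RedisModule_ReplyWithLongLong(ctx, count);
}
```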
2021-02-15 11:40:05 +02:00
filipe oliveira
b5ca1e9e53
Removed time sensitive checks from block on background tests. Fixed uninitialized variable (#8479)
- Removes time-sensitive checks from the block-on-background tests during leak checks.
- Fixes an uninitialized variable in RedisModuleBlockedClient() when calling
  RM_BlockedClientMeasureTimeEnd() without RM_BlockedClientMeasureTimeStart()
2021-02-10 08:59:07 +02:00
filipe oliveira
b2351ea0dc
[fix] Increasing block on background timeout time to avoid test failure (#8470)
The test failed from time to time on Github actions.
We think it's possible that in the module's blocking timeout
time tracking test, the timeout happens before we issue
RedisModule_BlockedClientMeasureTimeStart(bc) on the
background thread. If that is the case, one possible solution
is to increase the timeout.
Increasing it from 200ms to 500ms to see if the nightly stops failing.
2021-02-08 16:24:00 +02:00
Viktor Söderqvist
aea6e71ef8
RM_ZsetRem: Delete key if empty (#8453)
Without this fix, RM_ZsetRem can leave empty sorted sets which are
not allowed to exist.

Removing from a sorted set while iterating seems to work (while
inserting causes failed assetions). RM_ZsetRangeEndReached is
modified to return 1 if the key doesn't exist, to terminate
iteration when the last element has been removed.
2021-02-05 19:54:01 +02:00
filipe oliveira
f0c5052aa8
Enabled background and reply time tracking on blocked on keys/blocked on background work clients (#7491)
This commit enables tracking the time of background tasks and of replies,
opening the door for properly tracking commands that rely on blocking / background
work via the slowlog, latency history, and commandstats.

Some notes:
- The time spent blocked waiting for key changes, or blocked on synchronous
  replication is not accounted for. 

- **This commit does not affect latency tracking of commands that are non-blocking
  or do not have background work.** ( meaning that it all stays the same with exception to
  `BZPOPMIN`,`BZPOPMAX`,`BRPOP`,`BLPOP`, etc... and module's commands that rely
  on background threads ). 

-  Specifically for latency history command we've added a new event class named
  `command-unblocking` that will enable latency monitoring on commands that spawn
  background threads to do the work.

- For blocking commands we're now considering the total time of a command as the
  time spent on call() + the time spent on replying when unblocked.

- For module commands that rely on background threads we're now considering the
  total time of a command as the time spent in call() (main thread) + the time spent on
  the background thread (if marked within `RedisModule_BlockedClientMeasureTimeStart()` and
  `RedisModule_BlockedClientMeasureTimeEnd()`) + the time spent replying (main thread); see the sketch below.
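
A sketch of the background-thread pattern from the last bullet (the worker receives the blocked client handle, e.g. via pthread_create as in the blocking sketch earlier in this log; the sleep stands in for real work):

```c
#include "redismodule.h"
#include <unistd.h>

/* Only the span between MeasureTimeStart and MeasureTimeEnd is added to the
 * command's tracked background time when the client is finally unblocked. */
static void *SlowWorker(void *arg) {
    RedisModuleBlockedClient *bc = arg;
    RedisModule_BlockedClientMeasureTimeStart(bc);
    usleep(100 * 1000); /* stand-in for real background work */
    RedisModule_BlockedClientMeasureTimeEnd(bc);
    RedisModule_UnblockClient(bc, NULL);
    return NULL;
}
```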

To test this feature we've added a `unit/moduleapi/blockonbackground` test that relies on
a module that blocks the client and sleeps in the background for a given time.
- check that a blocked command using RedisModule_BlockedClientMeasureTimeStart() tracks background time
- check that a blocked command using RedisModule_BlockedClientMeasureTimeStart() tracks background time even on timeout
- check that a blocked command with multiple calls to RedisModule_BlockedClientMeasureTimeStart() tracks the total background time
- check that a blocked command that never calls RedisModule_BlockedClientMeasureTimeStart() does not report background time
2021-01-29 15:38:30 +02:00
Viktor Söderqvist
4355145a62
Add modules API for streams (#8288)
APIs added for these stream operations: add, delete, iterate and
trim (by ID or maxlength). The functions are prefixed by RM_Stream.

* RM_StreamAdd
* RM_StreamDelete
* RM_StreamIteratorStart
* RM_StreamIteratorStop
* RM_StreamIteratorNextID
* RM_StreamIteratorNextField
* RM_StreamIteratorDelete
* RM_StreamTrimByLength
* RM_StreamTrimByID

The type RedisModuleStreamID is added, along with functions for converting
to and from RedisModuleString.

* RM_CreateStringFromStreamID
* RM_StringToStreamID

Whenever the stream functions return REDISMODULE_ERR, errno is set to
provide additional error information.
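
A hedged sketch of adding one entry with an auto-generated ID (hypothetical command `example.xaddone KEY FIELD VALUE`; the exact argument types of RM_StreamAdd should be checked against the API docs):

```c
static int XAddOneCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 4) return RedisModule_WrongArity(ctx);
    RedisModuleKey *key = RedisModule_OpenKey(ctx, argv[1], REDISMODULE_WRITE);
    RedisModuleStreamID id;
    /* AUTOID lets Redis pick the ID and stores it in 'id'; argv[2..3] is one
     * field-value pair. On failure errno carries the error information. */
    if (RedisModule_StreamAdd(key, REDISMODULE_STREAM_ADD_AUTOID, &id,
                              &argv[2], 1) != REDISMODULE_OK) {
        RedisModule_CloseKey(key);
        return RedisModule_ReplyWithError(ctx, "ERR stream add failed");
    }
    RedisModule_CloseKey(key);
    return RedisModule_ReplyWithString(ctx,
        RedisModule_CreateStringFromStreamID(ctx, &id));
}
```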

Refactoring: The zset iterator fields in the RedisModuleKey struct
are wrapped in a union, to allow the same space to be used for type-
specific info for streams and allow future use for other key types.
2021-01-28 16:19:43 +02:00
Viktor Söderqvist
9c1483100a
Test that module can wake up module blocked on non-empty list key (#8382)
BLPOP and other blocking list commands can only block on empty keys
and LPUSH only wakes up clients when the list is created.

Using the module API, it's possible to block on a non-empty key.
Unblocking a client blocked on a non-empty list (or zset) can only
be done using RedisModule_SignalKeyAsReady(). This commit tests it.
2021-01-22 16:19:37 +02:00
guybe7
5a77d015be
Fix misleading module test (#8366)
The test was misleading because the module would actually wake up on a wrong type and
re-block, while the test name suggests the module doesn't wake up at all on a wrong type.

I changed the name of the test and added verification that the module indeed wakes up and gets
re-blocked after it understands it's the wrong type.
2021-01-20 14:03:38 +02:00
Viktor Söderqvist
4985c11bd6
Bugfix: Make modules blocked on keys unblock on commands like LPUSH (#8356)
This was a regression from #7625 (only in 6.2 RC2).

This makes it possible again to implement blocking list and zset
commands using the modules API.

This commit also includes a test case for the reverse: A module
unblocks a client blocked on BLPOP by inserting elements using
RedisModule_ListPush(). This already works, but it was untested.
2021-01-19 13:15:33 +02:00
Oran Agra
411c18bbce
Remove read-only flag from non-keyspace cmds, different approach for EXEC to propagate MULTI (#8216)
In the distant past there was only the read flag for commands, and whatever
command didn't have the read flag was a write one.
Then we added the write flag, but some portions of the code still used !read.
Also, some commands that don't work on the keyspace at all still have the read
flag.

Changes in this commit:
1. remove the read-only flag from TIME, ECHO, ROLE and LASTSAVE

2. The EXEC command used to decide if it should propagate a MULTI by looking at
   the command flags (!read & !admin).
   When I was about to change it to look at the write flag instead, I realized
   that this would cause it not to propagate a MULTI for PUBLISH, EVAL, and
   SCRIPT, all 3 are not marked as either a read command or a write one (as
   they should), but all 3 are calling forceCommandPropagation.

   So instead of introducing a new flag to denote a command that "writes" but
   not into the keyspace but still needs propagation, I decided to rely on
   the forceCommandPropagation, and just fix the code to propagate MULTI when
   needed rather than depending on the command flags at all.

   The implication of my change then is that now it won't decide to propagate
   MULTI when it sees one of these: SELECT, PING, INFO, COMMAND, TIME and
   other commands which are neither read nor write.

3. Changing getNodeByQuery and clusterRedirectBlockedClientIfNeeded in
   cluster.c to look at !write rather than read flag.
   This should have no implications, since these code paths are only reachable
   for commands which access keys, and these are always marked as either read
   or write.

This commit improves the MULTI propagation tests for modules and a bunch of
other special cases, all of which already passed before this commit.
The only test change that uncovered a change of behavior is the
one that DELs a non-existing key: it used to propagate an empty
MULTI-EXEC block, and no longer does.
2020-12-22 12:03:49 +02:00
Oran Agra
7d9b09adaa
Tests: fix new defrag test to be skipped when not supported (#8185)
Additionally, the older defrag tests were using an obsolete way to check
whether the defragger is supported (the error no longer contains "DISABLED").
This doesn't usually make a difference since these tests are completely
skipped if the allocator is not jemalloc, but it would fail if the
allocator is a jemalloc that doesn't support defrag.
2020-12-14 11:13:46 +02:00
Yossi Gottlieb
63c1303cfb
Modules: add defrag API support. (#8149)
Add a new set of defrag functions that take a defrag context and allow
defragmenting memory blocks and RedisModuleStrings.

Modules can register a defrag callback which will be invoked when the
defrag process handles globals.

Modules with custom data types can also register a datatype-specific
defrag callback which is invoked for keys that require defragmentation.
The callback and associated functions support both one-step and
multi-step options, depending on the complexity of the key as exposed by
the free_effort callback.
2020-12-13 09:56:01 +02:00
Yossi Gottlieb
4e064fbab4
Add module data-type support for COPY. (#8112)
This adds a copy callback for module data types, in order to make
modules compatible with the new COPY command.

The callback is optional and COPY will fail for keys with data types
that do not implement it.
2020-12-09 20:22:45 +02:00
Oran Agra
48efc25f74
Handle output buffer limits for Module blocked clients (#8141)
Module blocked clients cache the response in a temporary client,
the reply list in this client would be affected by the recent fix
in #7202, but when the reply is later copied into the real client,
it would have bypassed all the checks for output buffer limit, which
would have resulted in both: responding with a partial response to
the client, and also not disconnecting it at all.
2020-12-08 16:41:20 +02:00
guybe7
1df5bb5687
Make sure we do not propagate nested MULTI/EXEC (#8097)
One way this was happening is when a module issued an RM_Call which would inject MULTI.
If the module command that does that was itself issued by something else that already
added MULTI (e.g. another module, or a Lua script), it would have caused nested MULTI.

In fact the MULTI state in the client or the MULTI_EMITTED flag in the context isn't
the right indication of whether we need to propagate MULTI or not, because on nested calls
(possibly a module action called by a keyspace event of another module action), these
flags aren't retained / reflected.

Instead, there's now a global propagate_in_transaction flag for that.

In addition to that, we now have global in_eval and in_exec flags, to serve the flags
of RM_GetContextFlags, since their dependence on the current client is wrong for the same
reasons mentioned above.
2020-12-06 13:14:18 +02:00
Itamar Haber
c1b1e8c329
Adds pub/sub channel patterns to ACL (#7993)
Fixes #7923.

This PR appropriates the special `&` symbol (because `@` and `*` are taken),
followed by a literal value or pattern for describing the Pub/Sub patterns that
an ACL user can interact with. It is similar to the existing key patterns
mechanism in function (additive) and implementation (copy-pasta). It also adds
the allchannels and resetchannels ACL keywords, naturally.

The default user is given allchannels permissions, whereas new users get
whatever is defined by the acl-pubsub-default configuration directive. For
backward compatibility in 6.2, the default of this directive is allchannels but
this is likely to be changed to resetchannels in the next major version for
stronger default security settings.

Unless allchannels is set for the user, channel access permissions are checked
as follows :
* Calls to both PUBLISH and SUBSCRIBE will fail unless a pattern matching the
  argumentative channel name(s) exists for the user.
* Calls to PSUBSCRIBE will fail unless the pattern(s) provided as an argument
  literally exist(s) in the user's list.

Such failures are logged to the ACL log.

Runtime changes to channel permissions for a user with existing subscribing
clients cause said clients to disconnect unless the new permissions permit the
connections to continue. Note, however, that PSUBSCRIBErs' patterns are matched
literally, so given the change bar:* -> b*, pattern subscribers to bar:* will be
disconnected.

Notes/questions:
* UNSUBSCRIBE, PUNSUBSCRIBE and PUBSUB remain unprotected due to lack of reasons
  for touching them.
2020-12-01 14:21:39 +02:00
Meir Shpilraien (Spielrein)
d87a0d0286
Unified MULTI, LUA, and RM_Call with respect to blocking commands (#8025)
Blocking command should not be used with MULTI, LUA, and RM_Call. This is because,
the caller, who executes the command in this context, expects a reply.

Today, LUA and MULTI have a special (and different) treatment to blocking commands:

LUA   - Most commands are marked with the no-script flag, which is checked when executing
a command from LUA; commands that are not marked (like XREAD) verify that their
blocking mode is not used inside LUA (by checking the CLIENT_LUA client flag).
MULTI - A command that is going to block first verifies that the client is not inside
MULTI (by checking the CLIENT_MULTI client flag). If the client is inside MULTI, the command
returns a result that matches an empty key with no timeout (for example blpop
inside MULTI will act as lpop).
For modules that perform RM_Call with a blocking command, the returned result type is
REDISMODULE_REPLY_UNKNOWN and the caller cannot really know what happened.

Disadvantages of the current state are:

No unified approach: LUA, MULTI, and RM_Call each get a different treatment.
A module cannot safely execute a blocking command (and get a reply or error).
Though it is true that modules are not like LUA or MULTI and should be smart enough not
to execute blocking commands via RM_Call, sometimes you want to execute a command based
on client input (for example if you create a module that provides a new scripting
language like javascript or python).
While modules (on modules command) can check for REDISMODULE_CTX_FLAGS_LUA or
REDISMODULE_CTX_FLAGS_MULTI to know not to block the client, there is no way to
check if the command came from another module using RM_Call. So there is no way
for a module to know not to block another module RM_Call execution.

This commit adds a way to unify the treatment for blocking clients by introducing
a new CLIENT_DENY_BLOCKING client flag. In LUA, MULTI, and RM_Call the new flag is
turned on to signify that the client should not be blocked. A blocking command
verifies that the flag is turned off before blocking. If a blocking command sees
that the CLIENT_DENY_BLOCKING flag is on, it does not block and returns a result
that matches an empty key with no timeout (as MULTI does today).

The new flag is checked on the following commands:

List blocking commands: BLPOP, BRPOP, BRPOPLPUSH, BLMOVE,
Zset blocking commands: BZPOPMIN, BZPOPMAX
Stream blocking commands: XREAD, XREADGROUP
SUBSCRIBE, PSUBSCRIBE, MONITOR
In addition, the new flag is turned on for the AOF client; we do not want to
block the AOF client, to prevent deadlocks and command ordering issues (and there
is also an existing assert in the code that verifies it).

To keep backward compatibility on LUA, all the no-script flags on existing commands
were kept untouched. In addition, a LUA special treatment on XREAD and XREADGROUP was kept.

To keep backward compatibility on MULTI (which today allows SUBSCRIBE and PSUBSCRIBE),
we added special treatment for those commands to allow executing them in MULTI.

The only backward compatibility issue that this PR introduces is that now MONITOR
is not allowed inside MULTI.

Tests were added to verify blocking commands are not blocking the client on LUA, MULTI,
or RM_Call. Tests were added to verify the module can check for CLIENT_DENY_BLOCKING flag.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-11-17 18:58:55 +02:00
chenyangyang
c1aaad06d8
Modules callbacks for lazy free effort, and unlink (#7912)
Add two optional callbacks to the RedisModuleTypeMethods structure: `free_effort`
and `unlink`. The `free_effort` callback indicates the effort required to free a module
value. Currently, if the effort exceeds LAZYFREE_THRESHOLD, the module memory may be released
asynchronously. The `unlink` callback indicates that the key has been removed from the DB by
Redis and may soon be freed by a background thread.

Add `lazyfreed_objects` info field, which represents the number of objects that have been
lazyfreed since redis was started.

Add the `RM_GetTypeMethodVersion` API, which returns the current redis-server runtime value of
`REDISMODULE_TYPE_METHOD_VERSION`. You can use that when calling `RM_CreateDataType` to know
which fields of RedisModuleTypeMethods are going to be supported and which will be ignored.
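
A hedged sketch of how a module might wire the two callbacks; the `MyType` value and the
exact callback signatures are assumptions for illustration:

```c
#include "redismodule.h"

/* Hypothetical module value with many allocations. */
typedef struct MyType { size_t nitems; void **items; } MyType;

/* Rough cost of freeing this value; large values may then be lazy-freed
 * in a background thread. (Signature assumed.) */
static size_t MyTypeFreeEffort(RedisModuleString *key, const void *value) {
    REDISMODULE_NOT_USED(key);
    return ((const MyType *)value)->nitems;
}

/* Key already removed from the keyspace; the value may soon be freed
 * asynchronously. (Signature assumed.) */
static void MyTypeUnlink(RedisModuleString *key, const void *value) {
    REDISMODULE_NOT_USED(key);
    REDISMODULE_NOT_USED(value);
}

static RedisModuleTypeMethods tm = {
    .version = REDISMODULE_TYPE_METHOD_VERSION,
    /* rdb_load, rdb_save, free, ... omitted for brevity */
    .free_effort = MyTypeFreeEffort,
    .unlink = MyTypeUnlink,
};
```

At load time a module can call `RedisModule_GetTypeMethodVersion()` to learn whether the
running server will honor these newer fields or silently ignore them.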
2020-11-16 10:34:04 +02:00
Yossi Gottlieb
48a9d63af3
Add timer module API tests. (#8041) 2020-11-11 22:57:33 +02:00
Meir Shpilraien (Spielrein)
adc3183cd2
Add Module API for version and compatibility checks (#7865)
* Introduce new APIs: RM_GetContextFlagsAll and
RM_GetKeyspaceNotificationFlagsAll, which return the
full flags mask of each feature. The module writer can
check, based on this value, whether the flags they need
are supported.

* For each flag, introduce a new value in redismodule.h;
this value represents the LAST value and is there
as a reminder to update it when a new value is added.
It is also used in the code to calculate the full
flags mask (assuming flags are incrementally increasing).
In addition, it is stated that the module writer should not use
the LAST flag directly and should use the GetFlagsAll APIs.

* Introduce a new API: RM_IsSubEventSupported, which returns, for a given
event and subevent, whether or not the subevent is supported.

* Introduce a new macro RMAPI_FUNC_SUPPORTED(func) that returns whether
or not a function API is supported by comparing it to NULL.

* Introduce a new API, int RM_GetServerVersion(), which returns the
current Redis version in the format 0x00MMmmpp, e.g. 0x00060008
(see the sketch after this list).

* Changed unstable version from 999.999.999 to 255.255.255
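
A hedged usage sketch; the API names are from this change, while the flag constant checked
and the error handling are illustrative assumptions:

```c
#include "redismodule.h"

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "compatcheck", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;

    /* Function pointers of APIs the server doesn't export are NULL. */
    if (!RMAPI_FUNC_SUPPORTED(RedisModule_GetServerVersion))
        return REDISMODULE_ERR; /* server too old for the checks below */

    /* 0x00MMmmpp, e.g. 0x00060008 for 6.0.8. */
    int version = RedisModule_GetServerVersion();
    RedisModule_Log(ctx, "notice", "server version 0x%08x", version);

    /* Full mask of context flags this server knows about; check whether a
     * flag the module relies on is present. */
    int allflags = RedisModule_GetContextFlagsAll();
    if (!(allflags & REDISMODULE_CTX_FLAGS_MULTI))
        RedisModule_Log(ctx, "warning", "MULTI context flag not reported by this server");

    return REDISMODULE_OK;
}
```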

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2020-10-11 17:21:58 +03:00
Yossi Gottlieb
0aec98dce2
Module API: Add RM_GetClientCertificate(). (#7866)
This API function makes it possible to retrieve the X.509 certificate
used by clients to authenticate TLS connections.
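
A hedged sketch of a command handler using it (command registration omitted; the signature
shown here, context plus client id returning NULL when no certificate is available, is my
recollection rather than something stated in this message):

```c
#include "redismodule.h"

/* MYMOD.WHOCERT: reply with the PEM certificate of the calling client, if any.
 * Registration with RedisModule_CreateCommand is omitted for brevity. */
static int WhoCert_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    unsigned long long id = RedisModule_GetClientId(ctx);
    RedisModuleString *cert = RedisModule_GetClientCertificate(ctx, id);
    if (cert == NULL)
        return RedisModule_ReplyWithNull(ctx); /* not TLS, or no client cert */
    return RedisModule_ReplyWithString(ctx, cert);
}
```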
2020-10-11 17:11:42 +03:00
Yossi Gottlieb
907da0580b
Modules: Add RM_GetDetachedThreadSafeContext(). (#7886)
The main motivation here is to provide a way for modules to create a
single, global context that can be used for logging.

Currently, it is possible to obtain a thread-safe context that is not
attached to any blocked client by using `RM_GetThreadSafeContext`.
However, the attached context is not linked to the module identity so
log messages produced are not tagged with the module name.

Ideally we'd fix this in `RM_GetThreadSafeContext` itself but as it
doesn't accept the current context as an argument there's no way to do
that in a backwards compatible manner.
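
A hedged sketch of the intended use, a module-global context used only for logging;
everything except the API call itself is illustrative:

```c
#include "redismodule.h"

static RedisModuleCtx *detached_log_ctx = NULL; /* module-global, set once */

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "logger", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;

    /* Not attached to any blocked client, but it carries the module
     * identity, so log lines are tagged with the module name. */
    detached_log_ctx = RedisModule_GetDetachedThreadSafeContext(ctx);
    return REDISMODULE_OK;
}

/* Callable from background threads owned by the module. */
static void log_from_anywhere(const char *msg) {
    RedisModule_Log(detached_log_ctx, "notice", "%s", msg);
}
```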
2020-10-11 16:11:31 +03:00
Yossi Gottlieb
7d117d7591 Modules: add RM_GetCommandKeys().
This is essentially the same as calling COMMAND GETKEYS but provides a
more efficient interface that can be used in every context (i.e. not a
Redis command).
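
A hedged sketch; the exact argument order and the need to free the returned array with
`RedisModule_Free` are assumptions based on how similar module APIs behave:

```c
#include "redismodule.h"

/* Log the key positions of an arbitrary command vector, e.g. the argv of
 * a command the module is about to forward. */
static void log_command_keys(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    int numkeys = 0;
    int *positions = RedisModule_GetCommandKeys(ctx, argv, argc, &numkeys);
    if (positions == NULL) {
        RedisModule_Log(ctx, "warning", "could not resolve keys (unknown command?)");
        return;
    }
    for (int i = 0; i < numkeys; i++)
        RedisModule_Log(ctx, "notice", "key at argv[%d]", positions[i]);
    RedisModule_Free(positions);
}
```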
2020-10-11 16:04:14 +03:00
Wen Hui
dfe9714c86
Add Swapdb Module Event (#7804) 2020-09-20 13:36:20 +03:00
Roi Lipman
042189fd87
RM_ThreadSafeContextTryLock a non-blocking method for acquiring GIL (#7738)
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 16:01:16 +03:00
Oran Agra
e3e69c25fd test infra - reduce disk space usage
this is important when running a test with --loop
2020-09-06 09:59:19 +03:00
Yossi Gottlieb
64c360c515
Module API: fix missing RM_CLIENTINFO_FLAG_SSL. (#7666)
The `REDISMODULE_CLIENTINFO_FLAG_SSL` flag was already a part of the `RedisModuleClientInfo` structure but was not implemented.
2020-08-17 17:46:54 +03:00
Meir Shpilraien (Spielrein)
3f494cc49d
see #7544, added RedisModule_HoldString api. (#7577)
Added RedisModule_HoldString, which either returns a
shallow copy of the given String (by increasing
the String's ref count) or a new deep copy of the String
in case it's not possible to get a shallow copy.
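
A hedged sketch of the typical pattern, keeping a reference to an argument beyond the
command callback; passing a NULL context (so the string is not tied to auto memory) is my
understanding of the API, and the surrounding command is illustrative:

```c
#include "redismodule.h"

static RedisModuleString *saved = NULL; /* kept past the command callback */

/* MYMOD.REMEMBER <value>: keep a reference to the argument for later use. */
static int Remember_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    if (argc != 2) return RedisModule_WrongArity(ctx);
    if (saved) RedisModule_FreeString(NULL, saved);
    /* Shallow copy (refcount bump) when possible, deep copy otherwise;
     * with a NULL ctx the module must free it explicitly later. */
    saved = RedisModule_HoldString(NULL, argv[1]);
    return RedisModule_ReplyWithSimpleString(ctx, "OK");
}
```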

Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-08-09 06:11:47 +03:00
Oran Agra
c17e597d05
Accelerate diskless master connections, and general re-connections (#6271)
A diskless master has some inherent latencies:
1) the fork starts with a delay from cron rather than immediately
2) the replica is put online only after an ACK, but the ACK
   was sent only once per second
3) even if the ACK arrived immediately, it would not
   register if cron hadn't yet detected that the fork is done

Besides that, when a replica disconnects, it doesn't immediately
attempt to re-connect; it waits for the replication cron (once per second).
If it was already online, it may be important to re-connect
as soon as possible, so that the backlog at the master doesn't vanish.
If it disconnected during the rdb transfer, one can argue that it's
not very important to re-connect immediately, but this is needed for the
"diskless loading short read" test to be able to run 100 iterations in 5
seconds, rather than 3 (waiting for the replication cron re-connection).

Changes in this commit:
1) the SYNC command starts a fork immediately if no sync_delay is configured
2) the replica sends REPLCONF ACK when done reading the rdb (rather than on the 1s cron)
3) when a replica unexpectedly disconnects, it immediately tries to
   re-connect rather than waiting 1s
4) when a child exits, if there is another replica waiting, we spawn a new
   one right away, instead of waiting for the 1s replicationCron
5) added a call to connectWithMaster from replicationSetMaster, which is called
   from the REPLICAOF command but also in 3 places in cluster.c; in all of
   these the connection attempt will now be immediate instead of delayed by 1
   second.

Side note:
we can add a call to rdbPipeReadHandler in replconfCommand when getting
a REPLCONF ACK from the replica, to solve a race where the replica got
the entire rdb and EOF marker before we detected that the pipe was
closed. In the test I did see this race happen in about one of some 300 runs,
but I concluded that this race is unlikely in real life (where the
replica is on another host and we're more likely to first detect that the
pipe was closed). The test runs 100 iterations in 3 seconds, so in some cases
it'll take 4 seconds instead (waiting for another REPLCONF ACK).

Removing the unneeded startBgsaveForReplication from updateSlavesWaitingForBgsave:
now that CheckChildrenDone is calling the new replicationStartPendingFork
(extracted from serverCron) there's actually no need to call
startBgsaveForReplication from updateSlavesWaitingForBgsave anymore,
since as soon as updateSlavesWaitingForBgsave returns, CheckChildrenDone calls
replicationStartPendingFork, which handles that anyway.
The code in updateSlavesWaitingForBgsave had a bug in which it ignored
repl-diskless-sync-delay, but removing that code shows that this bug was
hiding another bug, which is that max_idle should have used >= and
not >; this one-second delay has a big impact on my new test.
2020-08-06 16:53:06 +03:00
Oran Agra
f7e7775990
module hook for master link up missing on successful psync (#7584)
Besides that, the hooks test was time-sensitive: when the replica managed to
reconnect quickly after the client kill, the test would fail.
2020-07-31 13:14:29 +03:00
Oran Agra
109b5ccdcd
Fix failing tests due to issues with wait_for_log_message (#7572)
- the test now waits for a specific set of log messages rather than waiting with a
  timeout for just one message.
- we don't want to sample the current length of the log after an action; due
  to a race, we need to start the search from the line number of the last
  message we were waiting for.
- when attempting to trigger a full sync, use multi-exec to avoid a race
  where the replica manages to re-connect before we completed the set of
  actions that should force a full sync.
- fix verify_log_message, which was broken and unused
2020-07-28 11:15:29 +03:00
Meir Shpilraien (Spielrein)
8d82639319
This PR introduces a new loaded keyspace event (#7536)
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-07-23 12:38:51 +03:00
Oran Agra
8e76e13472
stabilize tests that look for log lines (#7367)
Tests were sensitive to additional log lines appearing in the log,
causing the search to come up empty-handed.

Instead of just looking at the n last log lines, capture the number of log lines
before performing the action, and then search from that offset.
2020-07-10 08:28:22 +03:00
meir@redislabs.com
a89bf734a9 Fix RM_ScanKey module api not to return int encoded strings
The scan key module API provides the scan callback with the current
field name and value (if it exists). Those arguments are RedisModuleString*,
which means they are supposed to point to a robj that is encoded as a string.
Using the createStringObjectFromLongLong function might return a robj that
points to an integer, and so break a module that tries, for example, to
use RedisModule_StringPtrLen on the given field/value.

The PR introduces a fix that uses the createObject and sdsfromlonglong functions.
Using those functions guarantees that the field and value passed to the
scan callback will be strings.

The PR also changes the Scan test module to use RedisModule_StringPtrLen
to catch the issue. Without this, the issue is hidden because
RedisModule_ReplyWithString knows how to handle integer encoding of the
given robj (RedisModuleString).

The PR also introduces a new test to verify the issue is solved.
2020-06-14 11:20:15 +03:00
Guy Benoish
1b0d30aeb7 Try to fix time-sensitive tests in blockonkey.tcl
There is an inherent race between the deferring client and the
"main" client of the test: while the deferring client issues a blocking
command, we can't know for sure that by the time the "main" client
tries to issue another command (usually one that unblocks the deferring
client) the deferring client is even blocked...
For lack of a better choice this commit uses TCL's 'after' in order
to give the deferring client some time to issue its blocking
command before the "main" client does its thing.
This problem probably exists in many other tests but this commit
tries to fix blockonkeys.tcl.
2020-04-03 14:51:45 +03:00
Guy Benoish
c4dc5b80b2 Fix memory corruption in moduleHandleBlockedClients
By using a "circular BRPOPLPUSH"-like scenario it was
possible to get the same client on db->blocking_keys
twice (see the comment in moduleTryServeClientBlockedOnKey).

The fix was actually already implemented in
moduleTryServeClientBlockedOnKey but it had a bug:
the function should return 0 or 1 (not OK or ERR).

Other changes:
1. Added two commands to the blockonkeys.c test module (to
   reproduce the case described above)
2. Simplified blockonkeys.c in order to make testing easier
3. Cast raxSize() to avoid a warning with the format spec
2020-04-01 12:53:26 +03:00
Guy Benoish
d6eb3afd13 Modules: Test MULTI/EXEC replication of RM_Replicate
Make sure call() doesn't wrap replicated commands with
a redundant MULTI/EXEC.

Other, unrelated changes:
1. Fix a formatting compiler warning in INFO CLIENTS
2. Use CLIENT_ID_AOF instead of UINT64_MAX
2020-03-31 13:55:51 +03:00
antirez
4379b8b411 Fix the propagate Tcl test after module changes. 2020-03-31 12:09:38 +02:00
Oran Agra
0a643efa0c fix race in module api test for fork
in some cases we were trying to kill the fork before it got created
2020-02-23 16:48:37 +02:00
Guy Benoish
d7d13721d3 Modules: Fix blocked-client-related memory leak
If a blocked module client times out (or disconnects, is unblocked
by the CLIENT command, etc.) we need to call moduleUnblockClient
in order to free memory allocated by the module sub-system
and the blocked client's private data.

Other changes:
Made the blockonkeys.tcl tests a bit more aggressive in order
to smoke out potential memory leaks.
2019-12-30 10:10:59 +05:30
Madelyn Olson
034dcf185c Add module APIs for custom authentication 2019-12-17 06:59:59 +00:00
Yossi Gottlieb
0283db5883 Improve RM_ModuleTypeReplaceValue() API.
With the previous API, a NULL return value was ambiguous and could
represent either an old value of NULL or an error condition. The new API
returns a status code and allows the old value to be returned
by-reference.
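
A hedged sketch of the new calling convention; the type registration and error cases are
illustrative assumptions:

```c
#include "redismodule.h"

extern RedisModuleType *MyType;   /* registered elsewhere with RM_CreateDataType */

/* Swap the value stored at an open key, reclaiming the previous value. */
static int replace_value(RedisModuleCtx *ctx, RedisModuleKey *key, void *new_value) {
    void *old_value = NULL;
    if (RedisModule_ModuleTypeReplaceValue(key, MyType, new_value, &old_value)
            == REDISMODULE_ERR) {
        /* e.g. key not writable or holds a different type: unambiguous error. */
        return REDISMODULE_ERR;
    }
    if (old_value != NULL) {
        /* old value returned by reference; free it with the type's own logic */
        RedisModule_Log(ctx, "notice", "replaced an existing value");
    }
    return REDISMODULE_OK;
}
```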

This commit also includes test coverage based on
tests/modules/datatype.c which did not exist at the time of the original
commit.
2019-12-12 18:50:11 +02:00
Salvatore Sanfilippo
a1b654819c
Merge pull request #6598 from oranagra/module-hook-test
try to fix an unstable test (module hook for loading progress)
2019-11-25 17:54:21 +01:00
Salvatore Sanfilippo
64c2508ee3
Merge branch 'unstable' into rm_get_server_info 2019-11-21 10:06:15 +01:00
Oran Agra
ed2269762b try to fix an unstable test (module hook for loading progress)
There were two issues. One is that BGREWRITEAOF failed since the initial one was still in
progress; the solution for this one is to enable appendonly from server startup so there's
no initial AOFRW.

The other problem was 0 loading progress events. The theory is that on some
platforms a sleep of 1 will cause a much greater delay due to the context
switch, but on other platforms it doesn't. In theory a sleep of 100 micros
for 1k keys should take 100ms, and with an hz of 500 we should be getting
50 events (one every 2ms). In practice it doesn't work like that, so we're trying
to find a sleep that is long enough but still doesn't cause the test to take
too long.
2019-11-19 15:01:51 +02:00
Salvatore Sanfilippo
e7144fbed8
Merge branch 'unstable' into module-long-double 2019-11-19 12:15:45 +01:00
Salvatore Sanfilippo
e916058f0b
Merge pull request #6557 from oranagra/rm_lru_lfu_revized
rename RN_SetLRUOrLFU -> RM_SetLRU and RN_SetLFU
2019-11-19 11:58:07 +01:00
Salvatore Sanfilippo
656e40eed2
Merge branch 'unstable' into scan_module_impl 2019-11-19 11:08:02 +01:00
Oran Agra
0f8692b464 Add RM_ScanKey to scan hash, set, zset, changes to RM_Scan API
- Adding RM_ScanKey (a usage sketch of the refactored API follows below)
- Adding tests for RM_ScanKey
- Refactoring the RM_Scan API

Changes in RM_Scan:
- cleanup in docs and coding conventions
- moving out of the experimental API
- adding ctx to the scan callback
- don't use a cursor of -1 as an indication of done (it can be a valid cursor)
- set errno when returning 0 for various reasons
- rename Cursor to ScanCursor
- the test filters keys that are not strings, and opens a key if NULL
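
A hedged usage sketch of the refactored API; the callback signature and the create/destroy
helper names are written from memory and may differ in detail. RM_ScanKey follows the same
pattern with a per-field callback on an open key:

```c
#include "redismodule.h"

/* Called once per key in the keyspace. */
static void scan_cb(RedisModuleCtx *ctx, RedisModuleString *keyname,
                    RedisModuleKey *key, void *privdata) {
    REDISMODULE_NOT_USED(key);       /* may be NULL; open keyname if needed */
    size_t *count = privdata;
    (*count)++;
    size_t len;
    const char *name = RedisModule_StringPtrLen(keyname, &len);
    RedisModule_Log(ctx, "notice", "scanned key %.*s", (int)len, name);
}

static void scan_whole_keyspace(RedisModuleCtx *ctx) {
    size_t count = 0;
    RedisModuleScanCursor *cursor = RedisModule_ScanCursorCreate();
    while (RedisModule_Scan(ctx, cursor, scan_cb, &count)) { /* keep scanning */ }
    RedisModule_ScanCursorDestroy(cursor);
    RedisModule_Log(ctx, "notice", "scanned %zu keys", count);
}
```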
2019-11-11 16:05:55 +02:00
meir@redislabs.com
11c6ce812a Added scan implementation to module api.
The implementation exposes the following new functions:
1. RedisModule_CursorCreate - allows creating a new cursor object for
key scanning
2. RedisModule_CursorRestart - restarts an existing cursor to restart the
scan
3. RedisModule_CursorDestroy - destroys an existing cursor
4. RedisModule_Scan - scans keys

The RedisModule_Scan function gets a cursor object, a callback and a void*
(used as user private data).
The callback is called for each key in the database, providing the key
name and the value as a RedisModuleKey.
2019-11-11 16:01:41 +02:00
Oran Agra
28c20b4ef9 rename RN_SetLRUOrLFU -> RM_SetLRU and RN_SetLFU
- the API name was odd; it was separated into two APIs, one for LRU and one for LFU
  (a usage sketch follows below)
- the LRU idle time was in 1-second resolution, which might be OK for RDB
  and RESTORE, but I think modules may need higher resolution
- adding tests for LFU and for handling maxmemory policy mismatch
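
A hedged sketch of the split APIs; the return-value checks and the idea of copying access
metadata between keys are assumptions for illustration:

```c
#include "redismodule.h"

/* Copy the access metadata from one open key to another, whichever of
 * LRU or LFU the current maxmemory policy tracks. */
static void copy_access_info(RedisModuleKey *src, RedisModuleKey *dst) {
    mstime_t idle_ms = -1;
    long long freq = -1;
    /* Each getter reports a meaningful value only under the matching
     * maxmemory policy (my understanding); ignore the other one. */
    if (RedisModule_GetLRU(src, &idle_ms) == REDISMODULE_OK && idle_ms >= 0)
        RedisModule_SetLRU(dst, idle_ms);  /* idle time, millisecond resolution */
    if (RedisModule_GetLFU(src, &freq) == REDISMODULE_OK && freq >= 0)
        RedisModule_SetLFU(dst, freq);     /* logarithmic access frequency counter */
}
```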
2019-11-10 09:27:01 +02:00
Yossi Gottlieb
5350e7669e Add ModuleDataType to/from string serialization.
Add two new functions that leverage the RedisModuleDataType mechanism
for RDB serialization/deserialization and make it possible to serialize a
module value to, and load it from, an arbitrary string (a sketch follows
after this list):
* RM_SaveDataTypeToString()
* RM_LoadDataTypeFromString()
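
A hedged sketch of round-tripping a module value through a string; the surrounding type and
error handling are illustrative:

```c
#include "redismodule.h"

extern RedisModuleType *MyType;  /* registered elsewhere with RM_CreateDataType */

/* Serialize a value with the type's rdb_save callback, then rebuild a
 * fresh copy from the resulting string with rdb_load. */
static void *roundtrip_value(RedisModuleCtx *ctx, void *value) {
    RedisModuleString *blob = RedisModule_SaveDataTypeToString(ctx, value, MyType);
    if (blob == NULL) return NULL;               /* serialization failed */
    void *copy = RedisModule_LoadDataTypeFromString(blob, MyType);
    return copy;                                 /* NULL if the payload didn't parse */
}
```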
2019-11-04 21:53:59 +02:00
artix
060af1858d Fix RedisModule_ReplyWithLongDouble ptr definition, add tests 2019-11-04 18:49:50 +01:00
Oran Agra
3adf10b809 Test coverage for new module APIs: dbsize, flushall, randomkey, lru get/set 2019-11-04 19:30:31 +02:00
Salvatore Sanfilippo
23c681b51d
Merge pull request #6535 from guybe7/module_block_on_keys_tests
Modules: Test RedisModule_BlockClientOnKeys
2019-11-04 10:53:59 +01:00
Guy Benoish
b81f486c2f Modules: Test RedisModule_BlockClientOnKeys 2019-11-04 14:20:24 +05:30
Oran Agra
0423309768 Add RM_ServerInfoGetFieldUnsigned
rename RM_ServerInfoGetFieldNumerical to RM_ServerInfoGetFieldSigned
move string2ull to util.c
fix a leak in RM_GetServerInfo when duplicate info fields exist
2019-11-04 08:50:29 +02:00
Oran Agra
deebed23e1 Adding RM_ServerInfoGetFieldC 2019-11-04 07:57:52 +02:00
Oran Agra
4d580438b0 Add module api for looking into INFO fields
- Add RM_GetServerInfo and friends (see the usage sketch right after this list)
- Add auto memory for the new opaque struct
- Add tests for the new APIs
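
A hedged usage sketch of the INFO accessors; the section and field names queried here are
ordinary INFO fields chosen for illustration, and the signatures are from memory:

```c
#include "redismodule.h"

/* Read a couple of fields from the server's INFO output. */
static void log_server_stats(RedisModuleCtx *ctx) {
    RedisModuleServerInfoData *info = RedisModule_GetServerInfo(ctx, "server");
    if (info == NULL) return;

    int err = 0;
    long long uptime = RedisModule_ServerInfoGetFieldSigned(info, "uptime_in_seconds", &err);
    if (!err)
        RedisModule_Log(ctx, "notice", "uptime_in_seconds=%lld", uptime);

    const char *ver = RedisModule_ServerInfoGetFieldC(info, "redis_version");
    if (ver)
        RedisModule_Log(ctx, "notice", "redis_version=%s", ver);

    RedisModule_FreeServerInfo(ctx, info);
}
```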

other minor fixes:
- add const to various char pointers
- requested_section in modulesCollectInfo was actually not an sds but a char*
- extract the new string2d out of getDoubleFromObject for code reuse

Add module API for
2019-11-03 15:02:25 +02:00
Oran Agra
51c3ff8d75 Modules hooks: complete missing hooks for the initial set of hooks
* replication hooks: role change, master link status, replica online/offline
* persistence hooks: saving, loading, loading progress
* misc hooks: cron loop, shutdown, module loaded/unloaded
* change the way the hooks test works, and add tests for all of the above

startLoading() now gets a flag indicating what is being loaded.
stopLoading() now gets an indication of success or failure.
Adding startSaving() and stopSaving() with similar args and roles.
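
A hedged sketch of subscribing to one of these hooks from a module; the specific event and
subevent constants used here are assumptions about the final names:

```c
#include "redismodule.h"

/* Called by the server when persistence-loading events fire. */
static void loading_callback(RedisModuleCtx *ctx, RedisModuleEvent e,
                             uint64_t sub, void *data) {
    REDISMODULE_NOT_USED(e);
    REDISMODULE_NOT_USED(data);
    if (sub == REDISMODULE_SUBEVENT_LOADING_RDB_START)
        RedisModule_Log(ctx, "notice", "RDB loading started");
    else if (sub == REDISMODULE_SUBEVENT_LOADING_ENDED)
        RedisModule_Log(ctx, "notice", "loading finished");
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "hooks", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    return RedisModule_SubscribeToServerEvent(ctx, RedisModuleEvent_Loading,
                                              loading_callback);
}
```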
2019-10-29 17:59:09 +02:00
Oran Agra
0399b5a27e Module api tests for RM_Call
Adding a test for coverage of RM_Call in a new "misc" unit
to be used for various short, simple tests.

Also solves compilation warnings in redismodule.h and fork.c.
2019-10-28 12:39:57 +02:00
Salvatore Sanfilippo
c328c807e7
Merge pull request #6496 from oranagra/module_tests_valgrind
Make module tests pass with valgrind, and fix a leak in diskless load
2019-10-28 10:05:52 +01:00
antirez
6e98214f74 Modules hooks: test flush event. 2019-10-24 10:51:03 +02:00
antirez
39f2ab595c Modules hooks: initial Tcl test file. 2019-10-24 10:41:25 +02:00
Oran Agra
be352633a0 Make module tests pass with valgrind, and fix a leak in diskless load 2019-10-24 09:45:25 +03:00
antirez
1b8b8c029f Modules: add RM_Replicate() Tcl test file & fix the module. 2019-10-03 18:44:50 +02:00
Oran Agra
d5c14c70b7 Merge remote-tracking branch 'antirez/unstable' into modules_info 2019-09-30 20:47:35 +03:00
Salvatore Sanfilippo
6129758558
Merge branch 'unstable' into modules_fork 2019-09-27 11:24:06 +02:00
Oran Agra
61853ad8de Module INFO, support default section for simple modules 2019-08-18 10:01:57 +03:00
Oran Agra
1d6e5dc4dc Module INFO, add support for dict fields, rename API to have common prefix 2019-08-18 09:41:45 +03:00