When using RESP3, ZPOPMAX/ZPOPMIN should return nested arrays for consistency
with other commands (e.g. ZRANGE).
We do that only when COUNT argument is present (similarly to how LPOP behaves).
for reasoning see https://github.com/redis/redis/issues/8824#issuecomment-855427955
This is a breaking change only when RESP3 is used, and COUNT argument is present!
* Cleaning up the cluster interface by moving almost all related declarations into cluster.h
(no logic change -- just moving declarations/definitions around)
This initial effort leaves two items out of scope - the configuration parsing into the server
struct and the internals exposed by the clusterNode struct.
* Remove unneeded declarations of dictSds*
Ideally all the dictSds functionality would move from server.c into a dedicated module
so we can avoid the duplication in redis-benchmark/cli
* Move crc16 back into server.h, will be moved out once we create a seperate header file for
hashing functions
This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes.
This spell checker is based on blacklist of common spelling mistakes, so it will not catch everything,
but at least it is also unlikely to cause false positives.
Besides that, the PR also fixes many spelling mistakes and types, not all are a result of the spell checker we use.
Here's a summary of other changes:
1. Scanned the entire source code and fixes all sorts of typos and spelling mistakes (including missing or extra spaces).
2. Outdated function / variable / argument names in comments
3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString.
4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751
5. Some outdated https link URLs.
6. Fix some outdated comment. Such as:
- In README: about the rdb, we used to said create a `thread`, change to `process`
- dbRandomKey function coment (about the dictGetRandomKey, change to dictGetFairRandomKey)
- notifyKeyspaceEvent fucntion comment (add type arg)
- Some others minor fix in comment (Most of them are incorrectly quoted by variable names)
7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`
Also the bug that currently does not cause memory leaks.
Because op->type = OBJ_ZSET, in zuiClearIterator, we will do nothing.
Just a cleanup that zuiInitIterator and zuiClearIterator should appear in pairs.
If GT/LT fails the operation we need to reply with
nill (like failure due to NX).
Other changes:
Add the missing $encoding suffix to many zset tests
Note: there's a behavior change just in case of INCR + GT/LT that fails.
The old code was replying with the wrong (rejected) score, and now it'll reply with nil.
Note that that's anyway a corner case so this "behavior change" shouldn't have too much affect.
Using GT/LT with INCR has a predictable result even before we run the command
(INCR GT will only only / always fail if the increment is negative).
There are 2 common range comparators for skiplist: zslValueGteMin and
zslValueLteMax, but they're not being reused in zslDeleteRangeByScore
This is a small change to make code cleaner.
Have a clear separation between in and out flags
Other changes:
delete dead code in RM_ZsetIncrby: if zsetAdd returned error (happens only if
the result of the operation is NAN or if score is NAN) we return immediately so
there is no way that zsetAdd succeeded and returned NAN in the out-flags
SRANDMEMBER with negative count (non unique) can return the same member
multiple times, and the order of elements in the returned collection matters.
For these reasons returning a RESP3 Set type is not valid for the negative
count, but also not really valid for the positive (unique) variant either (the
command returns an array of random picks, not a set)
This PR also contains a minor optimization for SRANDMEMBER, HRANDFIELD,
and ZRANDMEMBER, to avoid the temporary dict from being rehashed while it grows.
Co-authored-by: Oran Agra <oran@redislabs.com>
It is inefficient to repeatedly pick a single random element from a
ziplist.
For CASE4, which is when the user requested a low number of unique
random picks from the collectoin, we used thta pattern.
Now we use a different algorithm that picks unique elements from a
ziplist, and guarentee no duplicate but doesn't provide random order
(which is only needed in the non-unique random picks case)
Unrelated changes:
* change ziplist count and indexes variables to unsigned
* solve compilation warnings about uninitialized vars in gcc 10.2
Co-authored-by: xinluton <xinluton@qq.com>
Changes to HRANDFIELD and ZRANDMEMBER:
* Fix risk of OOM panic when client query a very big negative count (avoid allocating huge temporary buffer).
* Fix uneven random distribution in HRANDFIELD with negative count (wasn't using dictGetFairRandomKey).
* Add tests to check an even random distribution (HRANDFIELD, SRANDMEMBER, ZRANDMEMBER).
Co-authored-by: Oran Agra <oran@redislabs.com>
New commands:
`HRANDFIELD [<count> [WITHVALUES]]`
`ZRANDMEMBER [<count> [WITHSCORES]]`
Algorithms are similar to the one in SRANDMEMBER.
Both return a simple bulk response when no arguments are given, and an array otherwise.
In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array.
note: in all 3 commands, the only option that also provides random order is the one with negative count.
Changes to SRANDMEMBER
* Optimization when count is 1, we can use the more efficient algorithm of non-unique random
* optimization: work with sds strings rather than robj
Other changes:
* zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in
case the buffer is too small)
* Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or
case 4) and by accident used a negative count
Co-authored-by: xinluton <xinluton@qq.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
It was confusing as to why these don't return a map type.
the reason is that order matters, so we need to make sure the client
library knows to respect it.
Added comments in the implementation and tests to cover it.
* Change zunionInterDiffGenericCommand to use lookupKeyRead if dstkey is null
* Change zrangeGenericCommand to use lookupKey Write if dstkey isn't null
ZRANGESTORE and UNION, ZINTER, ZDIFF are all new commands (6.2 RC1 and RC2).
In redis 6.0 the ZRANGE was using lookupKeyRead, and ZUNIONSTORE / ZINTERSTORE were using lookupKeyWrite.
So there bugs are introduced in 6.2 and will be resolved before it is released.
the implications of this bug are also not big:
The sole difference between LookupKeyRead and LookupKeyWrite is for command executed on a replica, which are not received from its master client. (for the master, and for the master client on the replica, these two functions behave the same)!
Add ZRANGESTORE command, and improve ZSTORE command to deprecated Z[REV]RANGE[BYSCORE|BYLEX].
Syntax for the new ZRANGESTORE command:
ZRANGESTORE [BYSCORE | BYLEX] [REV] [LIMIT offset count]
New syntax for ZRANGE:
ZRANGE [BYSCORE | BYLEX] [REV] [WITHSCORES] [LIMIT offset count]
Old syntax for ZRANGE:
ZRANGE [WITHSCORES]
Other ZRANGE commands remain unchanged.
The implementation uses common code for all of these, by utilizing a consumer interface that in one
command response to the client, and in the other command stores a zset key.
Co-authored-by: Oran Agra <oran@redislabs.com>
If RESTORE passes successfully with full sanitization, we can't affort
to crash later on assertion due to duplicate records in a hash when
converting it form ziplist to dict.
This means that when doing full sanitization, we must make sure there
are no duplicate records in any of the collections.
As we know, redis may reject user's requests or evict some keys if
used memory is over maxmemory. Dictionaries expanding may make
things worse, some big dictionaries, such as main db and expires dict,
may eat huge memory at once for allocating a new big hash table and be
far more than maxmemory after expanding.
There are related issues: #4213#4583
More details, when expand dict in redis, we will allocate a new big
ht[1] that generally is double of ht[0], The size of ht[1] will be
very big if ht[0] already is big. For db dict, if we have more than
64 million keys, we need to cost 1GB for ht[1] when dict expands.
If the sum of used memory and new hash table of dict needed exceeds
maxmemory, we shouldn't allow the dict to expand. Because, if we
enable keys eviction, we still couldn't add much more keys after
eviction and rehashing, what's worse, redis will keep less keys when
redis only remains a little memory for storing new hash table instead
of users' data. Moreover users can't write data in redis if disable
keys eviction.
What this commit changed ?
Add a new member function expandAllowed for dict type, it provide a way
for caller to allow expand or not. We expose two parameters for this
function: more memory needed for expanding and dict current load factor,
users can implement a function to make a decision by them.
For main db dict and expires dict type, these dictionaries may be very
big and cost huge memory for expanding, so we implement a judgement
function: we can stop dict to expand provisionally if used memory will
be over maxmemory after dict expands, but to guarantee the performance
of redis, we still allow dict to expand if dict load factor exceeds the
safe load factor.
Add test cases to verify we don't allow main db to expand when left
memory is not enough, so that avoid keys eviction.
Other changes:
For new hash table size when expand. Before this commit, the size is
that double used of dict and later _dictNextPower. Actually we aim to
control a dict load factor between 0.5 and 1.0. Now we replace *2 with
+1, since the first check is that used >= size, the outcome of before
will usually be the same as _dictNextPower(used+1). The only case where
it'll differ is when dict_can_resize is false during fork, so that later
the _dictNextPower(used*2) will cause the dict to jump to *4 (i.e.
_dictNextPower(1025*2) will return 4096).
Fix rehash test cases due to changing algorithm of new hash table size
when expand.
In the iterator for these functions, we'll traverse the sorted sets
in a reversed way so that largest elements come first. We prefer
this order because it's optimized for insertion in a skiplist, which
is the destination of the elements being iterated in there functions.
Blocking command should not be used with MULTI, LUA, and RM_Call. This is because,
the caller, who executes the command in this context, expects a reply.
Today, LUA and MULTI have a special (and different) treatment to blocking commands:
LUA - Most commands are marked with no-script flag which are checked when executing
and command from LUA, commands that are not marked (like XREAD) verify that their
blocking mode is not used inside LUA (by checking the CLIENT_LUA client flag).
MULTI - Command that is going to block, first verify that the client is not inside
multi (by checking the CLIENT_MULTI client flag). If the client is inside multi, they
return a result which is a match to the empty key with no timeout (for example blpop
inside MULTI will act as lpop)
For modules that perform RM_Call with blocking command, the returned results type is
REDISMODULE_REPLY_UNKNOWN and the caller can not really know what happened.
Disadvantages of the current state are:
No unified approach, LUA, MULTI, and RM_Call, each has a different treatment
Module can not safely execute blocking command (and get reply or error).
Though It is true that modules are not like LUA or MULTI and should be smarter not
to execute blocking commands on RM_Call, sometimes you want to execute a command base
on client input (for example if you create a module that provides a new scripting
language like javascript or python).
While modules (on modules command) can check for REDISMODULE_CTX_FLAGS_LUA or
REDISMODULE_CTX_FLAGS_MULTI to know not to block the client, there is no way to
check if the command came from another module using RM_Call. So there is no way
for a module to know not to block another module RM_Call execution.
This commit adds a way to unify the treatment for blocking clients by introducing
a new CLIENT_DENY_BLOCKING client flag. On LUA, MULTI, and RM_Call the new flag
turned on to signify that the client should not be blocked. A blocking command
verifies that the flag is turned off before blocking. If a blocking command sees
that the CLIENT_DENY_BLOCKING flag is on, it's not blocking and return results
which are matches to empty key with no timeout (as MULTI does today).
The new flag is checked on the following commands:
List blocking commands: BLPOP, BRPOP, BRPOPLPUSH, BLMOVE,
Zset blocking commands: BZPOPMIN, BZPOPMAX
Stream blocking commands: XREAD, XREADGROUP
SUBSCRIBE, PSUBSCRIBE, MONITOR
In addition, the new flag is turned on inside the AOF client, we do not want to
block the AOF client to prevent deadlocks and commands ordering issues (and there
is also an existing assert in the code that verifies it).
To keep backward compatibility on LUA, all the no-script flags on existing commands
were kept untouched. In addition, a LUA special treatment on XREAD and XREADGROUP was kept.
To keep backward compatibility on MULTI (which today allows SUBSCRIBE, and PSUBSCRIBE).
We added a special treatment on those commands to allow executing them on MULTI.
The only backward compatibility issue that this PR introduces is that now MONITOR
is not allowed inside MULTI.
Tests were added to verify blocking commands are not blocking the client on LUA, MULTI,
or RM_Call. Tests were added to verify the module can check for CLIENT_DENY_BLOCKING flag.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
ZREVRANGEBYSCORE key max min [WITHSCORES] [LIMIT offset count]
When the offset is too large, the query is very slow. Especially when the offset is greater than the length of zset it is easy to determine whether the offset is greater than the length of zset at first, and If it exceed the length of zset, then return directly.
Co-authored-by: Oran Agra <oran@redislabs.com>
Syntax:
COPY <key> <new-key> [DB <dest-db>] [REPLACE]
No support for module keys yet.
Co-authored-by: tmgauss
Co-authored-by: Itamar Haber <itamar@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
- Add ZDIFF and ZDIFFSTORE which work similarly to SDIFF and SDIFFSTORE
- Make sure the new WITHSCORES argument that was added for ZUNION isn't considered valid for ZUNIONSTORE
Co-authored-by: Oran Agra <oran@redislabs.com>
Adding [B]LMOVE <src> <dst> RIGHT|LEFT RIGHT|LEFT. deprecating [B]RPOPLPUSH.
Note that when receiving a BRPOPLPUSH we'll still propagate an RPOPLPUSH,
but on BLMOVE RIGHT LEFT we'll propagate an LMOVE
improvement to existing tests
- Replace "after 1000" with "wait_for_condition" when wait for
clients to block/unblock.
- Add a pre-existing element to target list on basic tests so
that we can check if the new element was added to the correct
side of the list.
- check command stats on the replica to make sure the right
command was replicated
Co-authored-by: Oran Agra <oran@redislabs.com>
Syntax: `ZMSCORE KEY MEMBER [MEMBER ...]`
This is an extension of #2359
amended by Tyson Andre to work with the changed unstable API,
add more tests, and consistently return an array.
- It seemed as if it would be more likely to get reviewed
after updating the implementation.
Currently, multi commands or lua scripting to call zscore multiple times
would almost definitely be less efficient than a native ZMSCORE
for the following reasons:
- Need to fetch the set from the string every time instead of reusing the C
pointer.
- Using pipelining or multi-commands would result in more bytes sent by
the client for the repeated `ZMSCORE KEY` sections.
- Need to specially encode the data and decode it from the client
for lua-based solutions.
- The fastest solution I've seen for large sets(thousands or millions)
involves lua and a variadic ZADD, then a ZINTERSECT, then a ZRANGE 0 -1,
then UNLINK of a temporary set (or lua). This is still inefficient.
Co-authored-by: Tyson Andre <tysonandre775@hotmail.com>