Commit Graph

11795 Commits

zhaozhao.zz
aefdc57f1e
add an assertion to make sure numkeys is 0 in getKeysUsingKeySpecs (#12389)
This is an addition to #12380, to prevent potential bugs when collecting
keys from multiple commands in the future.
Note that this function also resets numkeys in some cases.
2023-07-06 08:17:25 +03:00
Sankar
1190f25ca7
Process loss of slot ownership in cluster bus (#12344)
Process loss of slot ownership in cluster bus

When a node no longer owns a slot, it clears the bit corresponding
to the slot in the cluster bus messages. The receiving nodes
currently don't record the fact that the sender stopped claiming
a slot until some other node in the cluster starts claiming the slot.
This can cause a slot to go missing during slot migration when subjected
to an inopportune race with the addition of new shards or a failover.
This fix forces the receiving nodes to process the loss of ownership
to avoid spreading wrong information.
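
A minimal sketch of the receiving-side idea; the helper names are assumed here rather than taken from the actual cluster.c implementation:
```
/* Hedged sketch, not the actual Redis code: when a PING/PONG sender stops
 * claiming a slot we previously attributed to it, process that loss right
 * away instead of waiting for another node to claim the slot. */
void processSenderSlotClaims(clusterNode *sender, unsigned char *claimed) {
    for (int slot = 0; slot < CLUSTER_SLOTS; slot++) {
        int sender_claims = (claimed[slot / 8] >> (slot % 8)) & 1;
        if (!sender_claims && clusterNodeGetSlotBit(sender, slot)) {
            /* Sender dropped a slot we thought it owned: clear the mapping. */
            clusterDelSlot(slot);
        }
    }
}
```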
2023-07-05 17:46:23 -07:00
Job Henandez Lara
b35e20d1de
redis-benchmark: add documentation for the amount of clients needed (#12372)
In cluster mode, we need more clients than the number of nodes.
2023-07-05 10:27:14 +03:00
Binbin
d56a7d9b05
Fix and increase tolerance of event loop test, add verbose logs (#12385)
The test fails on freebsd CI:
```
*** [err]: stats: eventloop metrics in tests/unit/info.tcl
Expected '31777' to be less than '16183' (context: type eval line 17 cmd
{assert_lessthan $el_sum2 [expr $el_sum1+10000] } proc ::test)
```

The test added in #11963 fails on the freebsd CI, which is slow.
Increase the tolerance and also add some verbose logs; now we can
see these logs in verbose mode (for a better view):
```
eventloop metrics cycle1: 12, cycle2: 15
eventloop metrics el_sum1: 315, el_sum2: 411
eventloop metrics cmd_sum1: 126, cmd_sum2: 137
[ok]: stats: eventloop metrics (111 ms)
instantaneous metrics instantaneous_eventloop_cycles_per_sec: 8
instantaneous metrics instantaneous_eventloop_duration_usec: 55
[ok]: stats: instantaneous metrics (1603 ms)
[ok]: stats: debug metrics (112 ms)
```
2023-07-05 09:32:30 +03:00
Lior Lahav
b7559d9f3b
Fix possible crash in command getkeys (#12380)
When getKeysUsingKeySpecs processes a command with more than one key-spec,
and is called with a total of more than 256 keys, it'll call getKeysPrepareResult again,
but since numkeys isn't updated, getKeysPrepareResult will not bother to copy key
names from the old result (leaving these slots uninitialized). Furthermore, it did not
consider the keys it had already found when allocating more space.
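
A self-contained analog of the fix idea (not the real Redis structs or helpers), just to illustrate why the running key count must be recorded before growing the buffer:
```
#include <stdlib.h>
#include <string.h>

/* Illustrative analog only: a result buffer grown across several passes must
 * know how many entries it already holds, otherwise the grow step neither
 * sizes for them nor copies them into the new allocation. */
typedef struct { char **keys; int numkeys; int capacity; } keyResult;

static int growResult(keyResult *r, int needed) {
    if (needed <= r->capacity) return 0;
    char **nk = malloc(sizeof(char *) * needed);
    if (nk == NULL) return -1;
    /* This copy is only correct if numkeys was kept up to date -- the crux of the bug. */
    memcpy(nk, r->keys, sizeof(char *) * r->numkeys);
    free(r->keys);
    r->keys = nk;
    r->capacity = needed;
    return 0;
}
```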

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-07-03 12:45:18 +03:00
Binbin
6b57241fa8
Revert zrangeGenericCommand negative offset check (#12377)
The negative offset check was added in #9052; we realized
that this is a non-mandatory breaking change and we would
like to add it only in 8.0.

This reverts PR #9052; it will be re-introduced later in 8.0.
2023-07-02 20:38:30 +03:00
Binbin
26174123ee
Avoid DEBUG POPULATE crash at dictExpand OOM (#12363)
Change to use dictTryExpand and return an error on OOM.
2023-07-01 07:35:35 -07:00
DevineLiu
6bf9b144ef
redis-cli: Support URIs with IPv6 (#11834)
Co-authored-by: hrliu <hrliu@alauda.io>
2023-06-29 19:32:01 +03:00
judeng
07ed0eafa9
improve performance for scan command when matching pattern or data type (#12209)
Optimized the performance of the SCAN command in a few ways:
1. Move the key filtering (by MATCH pattern) into the scan callback,
  so as to avoid collecting keys only to filter them later (a sketch follows this list).
2. Reduce many memory allocations and copies (use a reference
  to the original sds instead of creating an robj, saving two excess mallocs
  and one string duplication).
3. Compare the TYPE filter directly (as integers), instead of an inefficient string
  compare per key.
4. Fixed a small bug: when scanning zset and hash types, maxiterations now uses
  a more accurate number to avoid wrongly doubling maxiterations.
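
A hedged sketch of change (1); stringmatchlen is the Redis glob matcher, but the callback and its private-data layout here are assumed, not the actual implementation:
```
/* Hedged sketch: filter by MATCH inside the scan callback on the raw sds key,
 * instead of collecting every key and filtering afterwards. */
static void scanCallback(void *privdata, const dictEntry *de) {
    scanData *data = privdata;               /* assumed container: keys list + pattern */
    sds key = dictGetKey(de);
    if (data->pattern &&
        !stringmatchlen(data->pattern, sdslen(data->pattern),
                        key, sdslen(key), 0))
        return;                               /* skip non-matching keys early */
    listAddNodeTail(data->keys, key);         /* reference the original sds, no robj */
}
```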

Changes **postponed** for a later version (8.0):
1. Prepare to move the TYPE filtering to the scan callback as well. This was
  put on hold since it has a side effect that can be considered a breaking
  change: we will no longer attempt to lazily expire (delete) a key
  that was filtered out by not matching the TYPE (changing this would mean the TYPE filter
  starts behaving the same way the MATCH filter already does in that respect).
2. When the specified TYPE filter is an unknown type, the server will reply with an error
  immediately instead of doing a full scan that comes back empty-handed.

Benchmark result:
For different scenarios, we obtained about 30% or more performance improvement.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-27 16:43:46 +03:00
Maria Markova
b2cdf6bcc3
Add -O3 at the linking stage for OPTIMIZATION=-O3 cases (#12339)
Added the missing -O3 flag to the linking stage for the default option "-O3 -flto".
The flag doesn't lead to significant changes in performance:
- +0.21% in geomean for all benchmarks on ICX bare-metal (256 cpus)
- +0.33% in geomean for all benchmarks on m6i.2xlarge (16 cpus) 
Checked on Redis from Mar 30 (commit 1f76bb17dd). The comparison file is attached.
2023-06-27 11:54:17 +03:00
michalbiesek
ef4bb4e374
Add RISC-V debug support (#12349)
- Add support for `getAndSetMcontextEip`
- Add support for `logRegisters`
2023-06-27 11:53:42 +03:00
Binbin
f58fd9e6d2
Set HIDDEN_CONFIG flag on aof-disable-auto-gc (#12355)
aof-disable-auto-gc was created for testing purposes,
to check if certain AOF files were actually generated
and if they were deleted correctly during testing.

So we hide it; see #12249 for more discussion.
2023-06-27 10:21:58 +03:00
dependabot[bot]
1634a0f271
Bump vmactions/freebsd-vm from 0.3.0 to 0.3.1 (#12352)
Bumps [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm) from 0.3.0 to 0.3.1.
- [Release notes](https://github.com/vmactions/freebsd-vm/releases)
- [Commits](https://github.com/vmactions/freebsd-vm/compare/v0.3.0...v0.3.1)

---
updated-dependencies:
- dependency-name: vmactions/freebsd-vm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 09:17:34 +03:00
Chen Tianjie
22a29935ff
Support TLS service when "tls-cluster" is not enabled and persist both plain and TLS port in nodes.conf (#12233)
Originally, when "tls-cluster" is enabled, `port` is set to the TLS port. In order to support non-TLS clients, `pport` is used to propagate the TCP port across cluster nodes. However, when "tls-cluster" is disabled, `port` is set to the TCP port and `pport` is not used, which means the cluster cannot provide TLS service unless "tls-cluster" is on.
```
typedef struct {
    // ...
    uint16_t port;  /* Latest known clients port (TLS or plain). */
    uint16_t pport; /* Latest known clients plaintext port. Only used if the main clients port is for TLS. */
    // ...
} clusterNode;
```
```
typedef struct {
    // ...
    uint16_t port;   /* TCP base port number. */
    uint16_t pport;  /* Sender TCP plaintext port, if base port is TLS */
    // ...
} clusterMsg;
```
This PR renames `port` and `pport` in `clusterNode` to `tcp_port` and `tls_port`, so both ports are recorded whether "tls-cluster" is enabled or disabled.

This allows providing TLS service to clients when "tls-cluster" is disabled: when displaying the cluster topology or giving a `MOVED` error, the server can provide the TLS or TCP port according to the client's connection type, regardless of which connection type the cluster bus is using.

For backwards compatibility, `port` and `pport` in `clusterMsg` are preserved. When "tls-cluster" is enabled, `port` is set to the TLS port and `pport` to the TCP port; when "tls-cluster" is disabled, `port` is set to the TCP port and `pport` to the TLS port (instead of 0).

Also, in the nodes.conf file, a new aux field displaying an extra port is added to complete the persisted info. We may have `tls_port=xxxxx` or `tcp_port=xxxxx` in the aux field, to complete the cluster topology, while the other port is stored in the normal `<ip>:<port>` field. The format is shown below.
```
<node-id> <ip>:<tcp_port>@<cport>,<hostname>,shard-id=...,tls-port=6379 myself,master - 0 0 0 connected 0-1000
```
Or we can switch the positions of the two ports; both can be correctly resolved.
```
<node-id> <ip>:<tls_port>@<cport>,<hostname>,shard-id=...,tcp-port=6379 myself,master - 0 0 0 connected 0-1000
```
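
A minimal sketch of how the two recorded ports can be used when replying, assuming a flag telling whether the requesting client is connected over TLS:
```
/* Hedged sketch: with both ports stored per node, a MOVED error or CLUSTER
 * SLOTS/SHARDS reply can advertise the port matching the client's own
 * connection type, independent of what the cluster bus uses. */
static int clusterNodeClientPort(clusterNode *n, int client_uses_tls) {
    return client_uses_tls ? n->tls_port : n->tcp_port;
}
```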
2023-06-26 07:43:38 -07:00
Binbin
9600553ef2
Move clusterBeforeSleep before blockedBeforeSleep (#12343)
The new blockedBeforeSleep was added in #12337; it breaks the ordering established in 2ecb5ed.

This may be related to #2288; quoting a comment from #2288:
```
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the
beforeSleep() call in redis.c. It should be at the top, before processing un blocking
clients. This is exactly the reason in the specific instance of the bug as reported,
why the state was not updated in time before clients served.
```
2023-06-25 14:21:03 +03:00
Meir Shpilraien (Spielrein)
153f8f082e
Fix use after free on blocking RM_Call. (#12342)
Blocking RM_Call was introduced in #11568. It allows a module to perform
blocking commands and get the reply asynchronously. If the command gets
blocked, a special promise CallReply is returned that allows setting the unblock
handler. The unblock handler is called when the command invocation
finishes and it gets, as input, the command's real reply.

The issue was that the real CallReply was created using a stack-allocated
RedisModuleCtx, which is no longer available after the unblock handler finishes.
So if the module keeps the CallReply after the unblock handler has finished, the
CallReply holds a pointer to invalid memory and will try to access it when the
CallReply is released.

The solution is to create the CallReply with a NULL context, making it totally
detached so it can be freed whenever the module wants.

A test was added to cover this case; running the test with valgrind before the
fix shows the use-after-free error. With the fix, there are no valgrind errors.

Unrelated: added a missing `$rd close` in many tests in that file.
2023-06-25 14:12:27 +03:00
guybe7
3230199920
Modules: Unblock from within a timer coverage (#12337)
Apart from adding the missing coverage, this PR also adds `blockedBeforeSleep`
that gathers all block-related functions from `beforeSleep`

The order inside `blockedBeforeSleep` is different: now `handleClientsBlockedOnKeys`
(which may unblock clients) is called before `processUnblockedClients` (which handles
unblocked clients).
It makes sense to have this order.

There are no visible effects of the wrong ordering, except that some cleanups of the now-unblocked
client would have happened in the next `beforeSleep` (they will now happen in the current one).

The reason we even got into this is that it triggers an assertion in logresreq.c (breaking
the assumption that `unblockClient` is called **before** actually flushing the reply to the socket):
`handleClientsBlockedOnKeys` is called, then it calls `moduleUnblockClientOnKey`, which calls
`moduleUnblockClient`, which adds the client to `moduleUnblockedClients`. Back in `beforeSleep`,
we call `handleClientsWithPendingWritesUsingThreads`, which writes the data of buf to the client, so
`client->bufpos` becomes 0.
On the next `beforeSleep`, we call `moduleHandleBlockedClients`, which calls `unblockClient`,
which calls `reqresAppendResponse`, triggering the assert (because `bufpos` is 0) - see https://github.com/redis/redis/pull/12301#discussion_r1226386716
2023-06-22 23:15:16 +03:00
Gabi Ganam
9e5f45f230
Fix typos in comments (#12338) 2023-06-22 08:10:42 -07:00
Binbin
47f32bc998
Print strerror when bio initialization fails (#12333)
Now we can see something like this:
```
Fatal: Can't initialize Background Jobs. Error message: Cannot allocate memory
```
2023-06-21 17:57:11 +03:00
guybe7
d46ef88671
Improve moduleBlockClient timeout overflow handling (#12174)
Continuation of https://github.com/redis/redis/pull/11338:
avoid overflow when adding the input timeout to "now" in moduleBlockClient.
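
A minimal, self-contained analog of the overflow guard (not the exact Redis helper):
```
#include <limits.h>

/* Hedged sketch: adding a user-supplied timeout (ms) to "now" (ms) must
 * saturate at LLONG_MAX instead of wrapping around to a negative value.
 * Assumes both operands are non-negative. */
static long long addTimeoutSaturating(long long now_ms, long long timeout_ms) {
    if (timeout_ms > LLONG_MAX - now_ms) return LLONG_MAX; /* would overflow */
    return now_ms + timeout_ms;
}
```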
2023-06-21 12:48:13 +03:00
Madelyn Olson
73cf0243df
Make nodename test more consistent (#12330)
To determine when everything was stable, we couldn't just query the nodename since it isn't API-visible by design. Instead, we were using a proxy piece of information: bumping the epoch and waiting for everyone to observe that. This works for making sure Node 0 and Node 1 had pinged, and Node 0 and Node 2 had pinged, but did not guarantee that Node 1 and Node 2 had pinged. Although unlikely, this can cause the observed failure message. To fix it, I hijacked hostnames and used their validation that the value has been propagated, since we know that it is stable.

I also noticed while stress testing that sometimes the test took almost 4.5 seconds to finish, which is really close to the current 5-second limit of the log check, so I bumped that up as well just to make it a bit more consistent.
2023-06-20 18:00:55 -07:00
guybe7
20fa156067
Align RM_ReplyWithErrorFormat with RM_ReplyWithError (#12321)
Introduced by https://github.com/redis/redis/pull/11923 (Redis 7.2 RC2)

It's very weird and counterintuitive that `RM_ReplyWithError` requires the error-code
**without** a hyphen while `RM_ReplyWithErrorFormat` requires either the error-code
**with** a hyphen or no error-code at all
```
RedisModule_ReplyWithError(ctx, "BLA bla bla");
```
vs.
```
RedisModule_ReplyWithErrorFormat(ctx, "-BLA %s", "bla bla");
```

This commit aligns RM_ReplyWithErrorFormat to behave like RM_ReplyWithError.
It's a breaking change, but it's done before 7.2 goes GA.
2023-06-20 20:44:43 +03:00
Oran Agra
8ad8f0f9d8
Fix broken protocol when PUBLISH emits local push inside MULTI (#12326)
When a connection that's subscribed to a channel emits PUBLISH inside MULTI-EXEC,
the push notification messes up the EXEC response.

e.g. MULTI, PING, PUBLISH foo bar, PING, EXEC:
the EXEC's response will contain PONG, {message foo bar}, 1, and the second PONG
will be delivered outside the EXEC's response.

Additionally, this PR changes the order of responses in the case of a plain PUBLISH (when
the current client is also subscribed to the channel), by delivering the push after the command's
response instead of before it.
This also affects modules calling RM_PublishMessage in a similar way, so that we don't
run the risk of getting that push mixed together with the module command's response.
2023-06-20 20:41:41 +03:00
judeng
93708c7f6a
use embedded string object and more efficient ll2string when converting long long values to strings (#12250)
A value of type long long is always shorter than 21 bytes when converted to a
string, so it always meets the conditions for using an embedded string object,
which yields memory reduction and a performance gain (fewer calls
to the heap allocator).
Additionally, for the conversion of the long long type to sds, we also use a faster
algorithm (the one in util.c instead of the one that used to be in sds.c).
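
A quick, self-contained check of the 21-byte claim (the program below is an illustration, not Redis code; the ~44-byte embstr threshold mentioned in the comment is an assumption about the encoding limit):
```
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* The longest decimal form of a long long is LLONG_MIN:
     * "-9223372036854775808" = 20 characters + NUL = 21 bytes,
     * well under the ~44-byte embedded-string (embstr) limit, so the
     * converted value can always use the embedded encoding. */
    char buf[21];
    int len = snprintf(buf, sizeof(buf), "%lld", LLONG_MIN);
    printf("%s -> %d chars\n", buf, len);   /* prints 20 */
    return 0;
}
```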

For the DECR command on 32-bit Redis, we get about a 5.7% performance
improvement. There will also be some performance gains for some commands
that heavily use sdscatfmt to convert numbers, such as INFO.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-20 15:14:44 +03:00
Binbin
d9c2ef8a72
zrangeGenericCommand add check for negative offset (#9052)
Now we will check the offset in zrangeGenericCommand.
With a negative offset, we will throw an error and return.

This also resolves the issue of the destination key being deleted
in the "store" variant when a negative offset is given.
```
127.0.0.1:6379> set key value
OK
127.0.0.1:6379> zrangestore key myzset 0 10 byscore limit -1 10
(integer) 0
127.0.0.1:6379> exists key
(integer) 0
```

This change affects the following commands:
- ZRANGE / ZRANGESTORE / ZRANGEBYLEX / ZRANGEBYSCORE
- ZREVRANGE / ZREVRANGEBYSCORE / ZREVRANGEBYLEX
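
A hedged sketch of the added check (the exact error text is assumed, not quoted from the source):
```
/* Hedged sketch inside zrangeGenericCommand: reject a negative LIMIT offset up
 * front instead of silently returning nothing and, in the STORE variant,
 * deleting the destination key. */
if (offset < 0) {
    addReplyError(c, "negative OFFSET is not allowed"); /* error text assumed */
    return;
}
```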
2023-06-20 14:33:17 +03:00
Wen Hui
66ea178cb6
adding geo command edge-case tests (#12274)
For geosearch and georadius we already have test coverage for the wrong-type case, but we don't for the geodist, geohash, and geopos commands. So this adds wrong-type test cases for geodist, geohash, and geopos.

In the existing code, we have the verify_geo_edge_response_bymember function for wrong-type test cases, which takes a member as an option. But that function is called from other test cases whose output is not in line with these commands (geodist, geohash, geopos). So I could not cover these commands with the existing function; hence I implemented a new function, verify_geo_edge_response_generic, and called it from the test case.
2023-06-20 12:50:03 +03:00
Binbin
d306d86146
Fix ZRANK/ZREVRANK reply_schema description (#12331)
The parameter name is WITHSCORE instead of WITHSCORES.
2023-06-20 11:15:40 +03:00
mstmdev
13e17e94d8
Doc indentation fix for the --functions-rdb option (#12328) 2023-06-20 11:15:11 +03:00
Wen Hui
813924b4c3
Fix sanitizer-reported memory leak when an '--invalid' option is passed or the port number is missing for redis-server. (#12322)
Observed that the sanitizer reported a memory leak because cleanup is not done
before process termination in the following negative cases:

**- when we pass '--invalid' as an option to redis-server.**

```
 -vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments

=================================================================
==865778==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f0985f65867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x558ec86686ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x558ec86686ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x558ec86686ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x558ec86686ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x558ec865dd7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x558ec87a1be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x558ec87a13b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x558ec85e6f15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7f09856e5d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

```

**- when we pass '--port' as an option to redis-server but omit the port number.**

```
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments

=================================================================
==865846==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1f867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x557e8b04f6ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x557e8b04f6ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x557e8b04f6ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x557e8b04f6ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x557e8b044d7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Indirect leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1fc18 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x557e8b04f9aa in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:287
    #2 0x557e8b04f9aa in ztryrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:317
    #3 0x557e8b04f9aa in zrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:342
    #4 0x557e8b033f90 in _sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:271
    #5 0x557e8b033f90 in sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:295
    #6 0x557e8b033f90 in sdscatlen /home/ubuntu/mem-leak-issue/redis/src/sds.c:486
    #7 0x557e8b044e1f in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1165
    #8 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #9 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #10 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #11 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 18 byte(s) leaked in 2 allocation(s).

```

As part of the analysis, it was found that sdsfreesplitres is not called when these condition checks are hit.

Output after the fix:


```
vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments
vm:~/mem-leak-issue/redis$

===========================================
vm:~/mem-leak-issue/redis$ ./src/redis-server --jdhg

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'jdhg'
Bad directive or wrong number of arguments

---------------------------------------------------------------------------
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments
```

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-20 10:07:29 +03:00
Shaya Potter
07316f16f0
Add ability for modules to know which client is being cmd filtered (#12219)
Adds API
- RedisModule_CommandFilterGetClientId()

Includes an addition to the commandfilter test module to validate that it works
by performing the same command from 2 different clients.
2023-06-20 09:03:52 +03:00
Binbin
cd4f3e20c1
Fix cluster human_nodename Getter data loss in nodes.conf (#12325)
auxHumanNodenameGetter was limited to %.40s; since we did not limit the
length of the config cluster-announce-human-nodename, %.40s will cause
nodename data loss (we persist it in nodes.conf).

Additionally, modified auxHumanNodenamePresent to use sdslen.
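
A standalone demonstration of why the "%.40s" precision specifier loses data (illustrative only, not the Redis code path):
```
#include <stdio.h>

int main(void) {
    const char *name = "a-human-readable-nodename-that-is-well-over-forty-characters-long";
    /* The ".40" precision silently truncates everything past 40 characters,
     * so a longer cluster-announce-human-nodename would be cut when persisted. */
    printf("%.40s\n", name);
    return 0;
}
```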
2023-06-19 17:13:18 -07:00
Binbin
b510624978
Optimize PSUBSCRIBE and PUNSUBSCRIBE from O(N*M) to O(N) (#12298)
In the original implementation, the time complexity of the commands
is actually O(N*M), where N is the number of patterns the client is
already subscribed to and M is the number of patterns to subscribe to.
The docs are all wrong about this.

Specifically, because the original client->pubsub_patterns is a list,
we need to do listSearchKey, which is O(N). In this PR, we change it
to a dict, so the search becomes O(1).

At the same time, both pubsub_channels and pubsubshard_channels are dicts.
Changing pubsub_patterns to a dictionary improves the readability and
maintainability of the code.
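
A hedged before/after sketch of the lookup that dominates the complexity (surrounding code assumed; listSearchKey and dictFind are the generic adlist/dict helpers):
```
/* Before: client->pubsub_patterns is a list, so every PSUBSCRIBE scans it -- O(N). */
if (listSearchKey(c->pubsub_patterns, pattern) == NULL) {
    listAddNodeTail(c->pubsub_patterns, pattern);
}

/* After: client->pubsub_patterns is a dict, so the duplicate check is a hash lookup -- O(1). */
if (dictFind(c->pubsub_patterns, pattern) == NULL) {
    dictAdd(c->pubsub_patterns, pattern, NULL);
}
```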
2023-06-19 16:31:18 +03:00
Oran Agra
07c14672bf
Set Jemalloc --disable-cache-oblivious to reduce memory overhead (#12315)
Apparently, for large size classes, Jemalloc allocates some extra
memory (it can be up to 25% overhead for allocations of 16kb).
see https://github.com/jemalloc/jemalloc/issues/1098#issuecomment-1589870476

P.S. From Redis's perspective that looks like external fragmentation
(i.e. allocated bytes will be low, and active-pages bytes will be large),
which can cause active-defrag to eat CPU cycles in vain.

Some details about this mechanism we disable:
---------------------------------------------------------------
Disabling this mechanism only affects large allocations (above 16kb).
Not only is it not expected to cause any performance regressions,
it's actually recommended, unless you have a specific workload pattern
and hardware that benefit from this feature -- by default it's enabled and
adds address randomization to all large buffers, by over-allocating 1 page
per large size class and offsetting into that page to make the starting
address of the user buffer randomized. Workloads such as scientific
computation often handle multiple big matrices at the same time, and the
randomization makes sure that the cacheline-level accesses don't suffer
bad conflicts (when they all start from page-aligned addresses).

However the downside is also quite noticeable: as observed, that extra
page per large size class can cause memory overhead, plus the extra TLB entry.
The other factor is that hardware in the last few years has started doing the
randomization at the hardware level, i.e. the address-to-cacheline mapping isn't
a direct mapping anymore. So there's debate about disabling the randomization by default,
but we are still hesitant because when it matters, it could matter a lot, and having
it enabled by default limits that worst-case behavior, even though it means the
majority of workloads suffer a regression.

So in short, it's safe and offers better performance in most cases.
2023-06-18 10:30:23 +03:00
Wen Hui
070453eef3
Cluster human readable nodename feature (#9564)
This PR adds a human readable name to a node in a cluster, which is visible as part of error logs. This is useful so that admins and operators of a Redis cluster have better visibility into failures without having to cross-reference the generated ID with some logical identifier (such as pod-ID or EC2 instance ID). This is mentioned in #8948. Specific nodenames can be set by using the variable cluster-announce-human-nodename. The nodename is gossiped using the cluster bus extension in #9530.

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2023-06-17 21:16:51 -07:00
sundb
b00a235186
Use Reservoir Sampling for random sampling of dict, and fix hang during fork (#12276)
## Issue:
When a dict has a long chain or the length of the chain is longer than
the number of samples, we will never be able to sample the elements
at the end of the chain using dictGetSomeKeys().
This could mean that SRANDMEMBER can hang in an endless loop.
The most severe case is the pathological one where someone uses SCAN+DEL
or SSCAN+SREM, creating an unevenly distributed dict.
This was amplified by the recent change in #11692, which prevented
down-sizing rehashing while there is a fork.

## Solution
1. Before, we would stop sampling when we reached the maximum number
  of samples, even if there was more data after the current chain.
  Now when we reach the maximum we use the Reservoir Sampling
  algorithm to fairly sample the tail of the chain that previously could not be
  sampled (see the sketch after this list).
2. Fix the rehashing code so that, just as it allows rehashing for up-sizing
  during a fork when the ratio is extreme, it allows it for down-sizing as well.
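
A self-contained sketch of the Reservoir Sampling idea used for point (1) (an illustrative analog, not the actual dictGetSomeKeys code):
```
#include <stdlib.h>

typedef struct entry { struct entry *next; void *key; } entry;

/* Once `count` slots are full, every further element of the chain replaces a
 * random slot with probability count/seen, so the tail of an arbitrarily long
 * chain still gets a fair chance of being sampled. */
static unsigned long sampleChain(entry *head, void **samples, unsigned long count) {
    unsigned long stored = 0, seen = 0;
    for (entry *e = head; e != NULL; e = e->next) {
        seen++;
        if (stored < count) {
            samples[stored++] = e->key;
        } else {
            unsigned long j = (unsigned long)rand() % seen;   /* uniform in 0..seen-1 */
            if (j < count) samples[j] = e->key;
        }
    }
    return stored;
}
```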

Issue was introduced (or became more severe) by #11692

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-16 19:13:07 +03:00
Binbin
439b0315c8
Fix SPOP/RESTORE propagation when doing lazy free (#12320)
In SPOP, when COUNT is greater than or equal to the set's size,
we will remove the set. In dbDelete, we will do DEL or UNLINK
according to the lazy flag. This is also required for propagation.

In RESTORE, we won't store expired keys into the db, see #7472.
When used together with REPLACE, it should emit a DEL or UNLINK
according to the lazy flag.

This PR also adds tests to cover the propagation. The RESTORE
test will also cover #7472.
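
A hedged sketch of the propagation rule (the config field and shared-object names are assumed to match Redis's lazy-free settings, not copied from the actual patch):
```
/* Hedged sketch: when the deletion was performed lazily, replicate UNLINK,
 * otherwise DEL, so replicas and the AOF mirror the actual effect. */
robj *aux = server.lazyfree_lazy_user_del ? shared.unlink : shared.del;
rewriteClientCommandVector(c, 2, aux, key);
```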
2023-06-16 08:14:11 -07:00
YaacovHazan
5da9eecdb8
Removing duplicated tests (#12318)
In 4ba47d2d2 the following tests were added in both tracking.tcl and introspection.tcl:

- Coverage: Basic CLIENT CACHING
- Coverage: Basic CLIENT REPLY
- Coverage: Basic CLIENT TRACKINGINFO
- Coverage: Basic CLIENT GETREDIR
2023-06-16 15:55:24 +03:00
Binbin
9dc6f93e9d
Add an execution-order test for a command being unblocked causing another command to get unblocked (#12324)
* Add an execution-order test for a command being unblocked causing another command to get unblocked

In #12301, we observed that if the
`while(listLength(server.ready_keys) != 0)`
in handleClientsBlockedOnKeys is changed to
`if(listLength(server.ready_keys) != 0)`,
the order of command execution will change.

It is wrong to change that. It means that if a command
being unblocked causes another command to get unblocked
(like a BLMOVE would do), then the newly unblocked command
will wait to be processed later rather than right away.

Changing it would not have any real implication, since
we call handleClientsBlockedOnKeys in beforeSleep again
and Redis would still behave correctly, but we don't change it.

An example:
1. $rd1 blmove src{t} dst{t} left right 0
2. $rd2 blmove dst{t} src{t} right left 0
3. $rd3 set key1{t}, $rd3 lpush src{t}, $rd3 set key2{t} in a pipeline

The correct order would be:
1. set key1{t}
2. lpush src{t}
3. lmove src{t} dst{t} left right
4. lmove dst{t} src{t} right left
5. set key2{t}

The wrong order would be:
1. set key1{t}
2. lpush src{t}
3. lmove src{t} dst{t} left right
4. set key2{t}
5. lmove dst{t} src{t} right left

This PR adds corresponding test to cover it.

* Add comment near while(listLength(server.ready_keys) != 0)
2023-06-16 15:39:00 +03:00
Meir Shpilraien (Spielrein)
cefe4566e9
Prevent PEJ on loading and on readonly replica. (#12304)
While Redis is loading data from disk (AOF or RDB), modules will get
key space notifications. At this stage the module should not register
any PEJ; the main reason this is forbidden is that a PEJ's purpose is to
perform a write operation as a reaction to the key space notification.
Write operations should not be performed while loading data, so
there is no reason to register a PEJ.

The same argument also applies to a readonly replica: a module should not
perform any writes as a reaction to key space notifications, so it
should not register a PEJ.

If a module needs to perform some other task that does not involve
writing, it can do it in the key space notification callback itself.
2023-06-15 10:10:23 +03:00
Binbin
254475537a
Optimize SET PXAT to reduce calls of rewriteClientCommandVector (#12316)
In the PXAT case, there is no need to do the rewriteClientCommandVector;
a simple benchmark shows we gain an improvement of 10%.
2023-06-15 10:07:47 +03:00
judeng
789c33bb3e
improve performance for keys with expiration time (#12177)
This change only affects keys with an expiry time.
For SETEX, the average improvement is 5%, and for GET on a key with
an expiration, we gain an improvement of 13%.

When keys have an expiration time, Redis has an assertion that looks
up the main dict every time it touches the expires dict.
This comes with a performance cost, and during rehash
the damage is doubled.

It looks like that assert was added some ten years ago, maybe out
of paranoia, and there's probably no reason to keep it at that cost.
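
A hedged sketch of the kind of check that was dropped (the shape is inferred from the commit text, not copied from the source):
```
/* Hedged sketch: every lookup in the expires dict also asserted that the key
 * exists in the main dict -- a second dict lookup whose cost grows during
 * rehashing, which is what this change removes. */
long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;
    if (dictSize(db->expires) == 0 ||
        (de = dictFind(db->expires, key->ptr)) == NULL) return -1;
    /* The costly paranoia check that was removed: */
    serverAssert(dictFind(db->dict, key->ptr) != NULL);
    return dictGetSignedIntegerVal(de);
}
```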
2023-06-14 11:17:39 +03:00
Wen Hui
1946083410
Removing the duplicate test case (#12310)
Looks like the Zadd test case was copied to create the Zincrby test case, but the command was not changed.
2023-06-14 10:03:33 +03:00
Harkrishn Patro
a9e32767f7
Allow cluster slots/shards api to respond during loading (#12269)
It would be helpful for clients to get cluster slots/shards information during a node failover and while it is loading data.
2023-06-13 18:16:32 +03:00
Binbin
e7129e43e0
Fix XREADGROUP BLOCK stuck in endless loop (#12301)
For the XREADGROUP BLOCK > scenario, there is an endless loop.
Due to #11012, it keeps going: reprocess command -> blockForKeys -> reprocess command.

The right fix is to avoid an endless loop in handleClientsBlockedOnKey and handleClientsBlockedOnKeys.
It looks like there was some attempt at this in handleClientsBlockedOnKeys, but it may not have been
sufficient, and using a similar trick in handleClientsBlockedOnKey is complicated:
stashing the list on the stack and iterating on it after creating a fresh one for future use
is problematic since the code keeps accessing the global list.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-13 13:27:05 +03:00
Oran Agra
f228ec1ea5
flushSlavesOutputBuffers should not write to replicas scheduled to drop (#12242)
This would increase the size of an already large COB (one that has already passed
the threshold for disconnection).

This could also mean that we'll attempt to write that data to the socket
and the replica will manage to read it, which will result in an
undesired partial sync (undesired for the test)
2023-06-12 14:05:34 +03:00
YaacovHazan
0bfb6d5582
Reset command duration for rejected command. (#12247)
In 7.2, after 971b177fa, we make sure (assert) that
the duration has been recorded when resetting the client.

This is not true for rejected commands.
The use case I found is a blocking command that an ACL rule changed before
it was unblocked; while reprocessing it, the command was rejected and triggered the assert.

The PR resets the command duration inside rejectCommand / rejectCommandSds.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-11 23:12:05 +03:00
Chen Tianjie
2d7d3911da
Allow bigger tolerance in eventloop duration test. (#12179)
In #11963, some new tests about eventloop duration were added, which includes time measurement in TCL scripts. This has caused some unexpected CI failures, such as #12169 and #12177, due to slow test servers or some performance jittering.
2023-06-11 09:02:41 +03:00
Wen Hui
d412269ff8
Adding missing test cases for Addslot Command (#12288)
Added missing test case coverage for the scenarios below:

1. The command only works if all the specified slots are, from
  the point of view of the node receiving the command, currently
  not assigned. A node will refuse to take ownership of slots that
  already belong to some other node (including itself).
2. The command fails if the same slot is specified multiple times.
2023-06-11 08:36:26 +03:00
Binbin
5fd9756d2e
Fix ACLAppendUserForLoading memory leak when merge error (#12296)
This leak will only happen in loadServerConfigFromString,
that is, when we are loading a redis.conf and the user definition is wrong.

Because it happens in loadServerConfigFromString, redis will
exit if there is an error, so this is actually just a cleanup.
2023-06-10 18:11:16 -07:00
Binbin
e4d183afd3
Add missing return on -UNKILLABLE sent by master case (#12277)
We no longer propagate scripts (starting from 7.0), so this is a
very rare issue in nearly-dead code.

This is an oversight in #9780.
2023-06-08 15:13:53 +03:00