Commit Graph

11945 Commits

Author SHA1 Message Date
Binbin
e216c83909
Change addReplyErrorFormat to addReplyError when there is no format (#12641)
This is just a cleanup, although they are both correct, the change
is normatively better, and addReplyError is also much faster.
Although not important, speed is not important for these error cases.
2023-11-30 12:36:17 +02:00
lyq2333
423565f784
Optimize CPU cache efficiency on dict while rehashing in dictTwoPhaseUnlinkFind (#12818)
In #5692, we optimize CPU cache efficiency on dict while rehashing but
missed the modification in dictTwoPhaseUnlinkFind.

This PR supplements it.
2023-11-30 13:50:09 +08:00
Binbin
22cc9b5122
Use CLZ in _dictNextExp to get the next power of two (#12815)
In the past, we did not call _dictNextExp frequently. It was only
called when the dictionary was expanded.

Later, dictTypeExpandAllowed was introduced in #7954, which is 6.2.
For the data dict and the expire dict, we can check maxmemory before
actually expanding the dict. This is a good optimization to avoid
maxmemory being exceeded due to the dict expansion.

And in #11692, we moved the dictTypeExpandAllowed check before the
threshold check, this caused a bit of performance degradation, every
time a key is added to the dict, dictTypeExpandAllowed is called to
check.

The main reason for degradation is that in a large dict, we need to
call _dictNextExp frequently, that is, every time we add a key, we
need to call _dictNextExp once. Then the threshold is checked to see
if the dict needs to be expanded. We can see that the order of checks
here can be optimized.

So we moved the dictTypeExpandAllowed check back to after the threshold
check in #12789. In this way, before the dict is actually expanded (that
is, before the threshold is reached), we will not do anything extra
compared to before, that is, we will not call _dictNextExp frequently.

But note we'll still hit the degradation when we over the thresholds.
When the threshold is reached, because #7954, we may delay the dict
expansion due to maxmemory limitations. In this case, we will call
_dictNextExp every time we add a key during this period.

This PR use CLZ in _dictNextExp to get the next power of two. CLZ (count
leading zeros) can easily give you the next power of two. It should be
noted that we have actually introduced the use of __builtin_clzl in
#8687,
which is 7.0. So i suppose all the platforms we use have it (even if the
CPU doesn't have an instruction).

We build 67108864 (2**26) keys through DEBUG POPULTE, which will use
approximately 5.49G memory (used_memory:5898522936). If expansion is
triggered, the additional hash table will consume approximately 1G
memory (2 ** 27 * 8). So we set maxmemory to 6871947673 (that is, 6.4G),
which will be less than 5.49G + 1G, so we will delay the dict rehash
while addint the keys.

After that, each time an element is added to the dict, an allow check
will be performed, that is, we can frequently call _dictNextExp to test
the comparison before and after the optimization. Using DEBUG HTSTATS 0
to
check and make sure that our dict expansion is dealyed.

Using `./src/redis-server redis.conf --save "" --maxmemory 6871947673`.
Using `./src/redis-benchmark -P 100 -r 1000000000 -t set -n 5000000`.
After ten rounds of testing:
```
unstable:           this PR:
769585.94           816860.00
771724.00           818196.69
775674.81           822368.44
781983.12           822503.69
783576.25           828088.75
784190.75           828637.75
791389.69           829875.50
794659.94           835660.69
798212.00           830013.25
801153.62           833934.56
```

We can see there is about 4-5% performance improvement in this case.
2023-11-29 14:42:22 +02:00
Binbin
bdceaf50e4
Fix the check for new RDB_OPCODE_SLOT_INFO in redis-check-rdb (#12768)
We did not read expires_slot_size, causing its check to fail.
An overlook in #11695.
2023-11-29 14:25:17 +02:00
zhaozhao.zz
3431b1f156
format cpu config as redis style (#7351)
The following four configurations are renamed to align with Redis style:

1. server_cpulist renamed to server-cpulist
2. bio_cpulist renamed to bio-cpulist
3. aof_rewrite_cpulist renamed to aof-rewrite-cpulist
4. bgsave_cpulist renamed to bgsave-cpulist

The original names are retained as aliases to ensure compatibility with
old configuration files. We recommend users to gradually transition to
using the new configuration names to maintain consistency in style.
2023-11-29 13:40:06 +08:00
zhaozhao.zz
a1c5171c1d
Fix resize hash tables stuck on the last non-empty slot (#12802)
Introduced in #11695 .

The tryResizeHashTables function gets stuck on the last non-empty slot
while iterating through dictionaries. It does not restart from the
beginning. The reason for this issue is a problem with the usage of
dbIteratorNextDict:

/* Returns next dictionary from the iterator, or NULL if iteration is complete. */
dict *dbIteratorNextDict(dbIterator *dbit) {
    if (dbit->next_slot == -1) return NULL;
    dbit->slot = dbit->next_slot;
    dbit->next_slot = dbGetNextNonEmptySlot(dbit->db, dbit->slot, dbit->keyType);
    return dbGetDictFromIterator(dbit);
}

When iterating to the last non-empty slot, next_slot is set to -1,
causing it to loop indefinitely on that slot. We need to modify the code
to ensure that after iterating to the last non-empty slot, it returns to
the first non-empty slot.

BTW, function tryResizeHashTables is actually iterating over slots
that have keys. However, in its implementation, it leverages the
dbIterator (which is a key iterator) to obtain slot and dictionary
information. While this approach works fine, but it is not very
intuitive. This PR also improves readability by changing the iteration
to directly iterate over slots, thereby enhancing clarity.
2023-11-28 18:50:16 +08:00
zhaozhao.zz
095d05786f
clarify the comment of findSlotByKeyIndex function (#12811)
The current comment for `findSlotByKeyIndex` is a bit ambiguous and can
be misleading, as it may be misunderstood as getting the next slot
corresponding to target.
2023-11-27 18:45:40 -08:00
Binbin
d6f19539d2
Un-register notification and server event when RedisModule_OnLoad fails (#12809)
When we register notification or server event in RedisModule_OnLoad, but
RedisModule_OnLoad eventually fails, triggering notification or server
event
will cause the server to crash.

If the loading fails on a later stage of moduleLoad, we do call
moduleUnload
which handles all un-registration, but when it fails on the
RedisModule_OnLoad
call, we only un-register several specific things and these were
missing:

- moduleUnsubscribeNotifications
- moduleUnregisterFilters
- moduleUnsubscribeAllServerEvents

Refactored the code to reuse the code from moduleUnload.

Fixes #12808.
2023-11-27 17:26:33 +02:00
zhaozhao.zz
1bd0b54957
Optimize the efficiency of active expiration when databases exceeds 16. (#12738)
Currently, when the number of databases exceeds 16,
the efficiency of cleaning expired keys is relatively low.

The reason is that by default only 16 databases are scanned when
attempting to clean expired keys (CRON_DBS_PER_CALL is 16). But users
may set databases higher than 16, such as 256, but it does not
necessarily mean that all 256 databases have expiration time set. If
only one database has expiration time set, this database needs 16
activeExpireCycle rounds in order to be scanned once, and 15 of those
rounds are meaningless.

To optimize the efficiency of expiration in such scenarios, we use dbs_per_call
to control the number of databases with expired keys being scanned.

Additionally, add a condition to limit the maximum number of rounds
to server.dbnum to prevent excessive spinning. This ensures that even if
only one database has expired keys, it can be triggered within a single cron job.
2023-11-27 12:12:12 +08:00
binfeng-xin
56ec1ff1ce
Call signalModifiedKey after the key modification is completed (#11144)
Fix `signalModifiedKey()` order, call it after key modification
completed, to ensure the state of key eventual consistency.

When a key is modified, Redis calls `signalModifiedKey` to notify other
systems, such as the watch system of transactions and the tracking
system of client side caching. However, in some commands, the
`signalModifiedKey` call happens during the key modification process
instead of after the key modification is completed. This can potentially
cause issues, as systems relying on `signalModifiedKey` may receive the
"write in flight" status of the key rather than its final state.

These commands include:
1. PFADD
2. LSET, LMOVE, LREM
3. ZPOPMIN, ZPOPMAX, BZPOPMIN, BZPOPMAX, ZMPOP, BZMPOP

Currently there is no problem with Redis, but it is better to adjust the
order of `signalModifiedKey()`, to avoid issues in future development on
Redis.

---------

Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2023-11-27 11:16:41 +08:00
meiravgri
2e854bccc6
Fix async safety in signal handlers (#12658)
see discussion from after https://github.com/redis/redis/pull/12453 was
merged
----
This PR replaces signals that are not considered async-signal-safe
(AS-safe) with safe calls.

#### **1. serverLog() and serverLogFromHandler()**
`serverLog` uses unsafe calls. It was decided that we will **avoid**
`serverLog` calls by the signal handlers when:
* The signal is not fatal, such as SIGALRM. In these cases, we prefer
using `serverLogFromHandler` which is the safe version of `serverLog`.
Note they have different prompts:
`serverLog`: `62220:M 26 Oct 2023 14:39:04.526 # <msg>`
`serverLogFromHandler`: `62220:signal-handler (1698331136) <msg>`
* The code was added recently. Calls to `serverLog` by the signal
handler have been there ever since Redis exists and it hasn't caused
problems so far. To avoid regression, from now we should use
`serverLogFromHandler`

#### **2. `snprintf` `fgets` and `strtoul`(base = 16) -------->
`_safe_snprintf`, `fgets_async_signal_safe`, `string_to_hex`**
The safe version of `snprintf` was taken from
[here](8cfc4ca5e7/src/mc_util.c (L754))

#### **3. fopen(), fgets(), fclose() --------> open(), read(), close()**

#### **4. opendir(), readdir(), closedir() --------> open(),
syscall(SYS_getdents64), close()**

#### **5. Threads_mngr sync mechanisms**
* waiting for the thread to generate stack trace: semaphore -------->
busy-wait
* `globals_rw_lock` was removed: as we are not using malloc and the
semaphore anymore we don't need to protect `ThreadsManager_cleanups`.

#### **6. Stacktraces buffer**
The initial problem was that we were not able to safely call malloc
within the signal handler.
To solve that we created a buffer on the stack of `writeStacktraces` and
saved it in a global pointer, assuming that under normal circumstances,
the function `writeStacktraces` would complete before any thread
attempted to write to it. However, **if threads lag behind, they might
access this global pointer after it no longer belongs to the
`writeStacktraces` stack, potentially corrupting memory.**
To address this, various solutions were discussed
[here](https://github.com/redis/redis/pull/12658#discussion_r1390442896)
Eventually, we decided to **create a pipe** at server startup that will
remain valid as long as the process is alive.
We chose this solution due to its minimal memory usage, and since
`write()` and `read()` are atomic operations. It ensures that stack
traces from different threads won't mix.

**The stacktraces collection process is now as  follows:**
* Cleaning the pipe to eliminate writes of late threads from previous
runs.
* Each thread writes to the pipe its stacktrace
* Waiting for all the threads to mark completion or until a timeout (2
sec) is reached
* Reading from the pipe to print the stacktraces.

#### **7. Changes that were considered and eventually were dropped**
* replace watchdog timer with a POSIX timer: 
according to [settimer man](https://linux.die.net/man/2/setitimer)

> POSIX.1-2008 marks getitimer() and setitimer() obsolete, recommending
the use of the POSIX timers API
([timer_gettime](https://linux.die.net/man/2/timer_gettime)(2),
[timer_settime](https://linux.die.net/man/2/timer_settime)(2), etc.)
instead.

However, although it is supposed to conform to POSIX std, POSIX timers
API is not supported on Mac.
You can take a look here at the Linux implementation:

[here](c7562ee135)
To avoid messing up the code, and uncertainty regarding compatibility,
it was decided to drop it for now.

* avoid using sds (uses malloc) in logConfigDebugInfo
It was considered to print config info instead of using sds, however
apparently, `logConfigDebugInfo` does more than just print the sds, so
it was decided this fix is out of this issue scope.

#### **8. fix Signal mask check**
The check `signum & sig_mask` intended to indicate whether the signal is
blocked by the thread was incorrect. Actually, the bit position in the
signal mask corresponds to the signal number. We fixed this by changing
the condition to: `sig_mask & (1L << (sig_num - 1))`

#### **9. Unrelated changes**
both `fork.tcl `and `util.tcl` implemented a function called
`count_log_message` expecting different parameters. This caused
confusion when trying to run daily tests with additional test parameters
to run a specific test.
The `count_log_message` in `fork.tcl` was removed and the calls were
replaced with calls to `count_log_message` located in `util.tcl`

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-11-23 13:22:20 +02:00
Binbin
5e403099bd
Fix misleading error message in redis.log when loglevel is invalid (#12636)
We don't have any debug level, change it to log level.
2023-11-23 10:23:30 +02:00
Moshe Kaplan
c9aa586b6b
rdb.c: Avoid potential file handle leak (Coverity 404720) (#12795)
`open()` can return any non-negative integer on success, including zero.
This change modifies the check from open()'s return value to also
include checking for a return value of zero (e.g., if stdin were closed
and then `open()` was called).

Fixes Coverity 404720

Can't happen in Redis. just a cleanup.
2023-11-23 10:14:17 +02:00
Moshe Kaplan
ae09d4d3ef
redis-check-aof.c: Avoid leaking file handle if file is zero bytes (#12797)
If fopen() is successful, but redis_fstat() determines the file is zero
bytes, the file handle stored in fp will leak. This change closes the
filehandle stored in fp if the file is zero bytes.

An FD leak on a tool like redis-check-aof isn't an issue (it'll exit soon anyway).
This is just a cleanup.
2023-11-23 10:08:49 +02:00
Moshe Kaplan
1c48d3dab2
config.c: Avoid leaking file handle if redis_fstat() fails (#12796)
If fopen() is successful, but redis_fstat() fails, the file handle
stored in fp will leak. This change closes the filehandle stored in fp
if redis_fstat() fails.

Fixes Coverity 390029
2023-11-23 10:06:13 +02:00
Moshe Kaplan
157e5d47b5
util.c: Don't leak directory handle if recursive call to dirRemove fails (#12800)
If a recursive call to dirRemove() returns -1, dirRemove() the directory
handle stored in dir will leak. This change closes the directory handle
stored in dir even if the recursive call to dirRemove() returns -1.

Fixed Coverity 371073
2023-11-23 10:04:02 +02:00
Binbin
463476933c
Optimize dict expand check, move allow check after the thresholds check (#12789)
dictExpandAllowed (for the main db dict and the expire dict) seems to
involve a few function calls and memory accesses, and we can do it only
after the thresholds checks and can get some performance improvements.

A simple benchmark test: there are 11032768 fixed keys in the database,
start a redis-server with `--maxmemory big_number --save ""`,
start a redis-benchmark with `-P 100 -r 1000000000 -t set -n 5000000`,
collect `throughput summary: n requests per second` result.

After five rounds of testing:
```
unstable     this PR
848032.56    897988.56
854408.69    913408.88
858663.94    914076.81
871839.56    916758.31
882612.56    920640.75
```

We can see a 5% performance improvement in general condition.
But note we'll still hit the degradation when we over the thresholds.
2023-11-22 11:33:26 +02:00
Yehoshua (Josh) Hershberg
ef1ca6c882
add some file level comments and copyright (#12793)
A followup PR for #12742
Add some brief comments explaining the purpose of the file to the head
of cluster_legacy.c and cluster.c.
Add copyright notice to cluster.c

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
Co-authored-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 11:32:23 +02:00
Binbin
dedbf99a80
Fix dbExpand not dividing by slots, resulting in consuming slots times the dictExpand (#12773)
We meant to divide it by the number of slots, otherwise it will do slots
times
dictExpand, bug was introduced in #11695.
2023-11-22 11:16:06 +02:00
Oran Agra
58cb302526
Cluster refactor, common API for different implementations (#12742)
This PR reworks the clustering code to support multiple clustering
implementations, specifically, the current "legacy" clustering
implementation or, although not part of this PR, flotilla (see #10875).
Which implementation is used could be a compile-time flag (will be added
later). Legacy clustering functionality remains unchanged.

The basic idea is as follows. The header cluster.h now contains function
declarations that define the "Cluster API." These are the contract and
interface between any clustering implementation and the rest of the
Redis source code. Some of the function definitions are shared between
all clustering implementations. These functions are in cluster.c. The
functions and data structures specific to legacy clustering are in
cluster-legacy.c/h. One consequence of this is that the structs
clusterNode and clusterState which were previously "public" to the rest
of Redis are now hidden behind the Cluster API.

The PR is divided up into commits, each with a commit message explaining
the changes. some are just mass rename or moving code between files (may
not require close inspection / review), others are manual changes.

One other, related change is:
- The "failover" command is now plugged into the Cluster API so that the
clustering implementation can (a) enable/disable the command to begin
with and if enabled (b) perform the actual failover. The "failover"
command remains disabled for legacy clustering.
2023-11-22 09:31:19 +02:00
Josh Hershberg
eebb025826 Cluster refactor: Some code convention fixes
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:48 +02:00
Josh Hershberg
290f376429 Cluster refactor: fn renames + small compilation issue on ubuntu
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
13b754853c Cluster refactor: cluster.h - reorder functions into logical groups
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
2e5181ef28 Cluster refactor: Add failover cmd support to cluster api
The failover command is up until now not supported
in cluster mode. This commit allows a cluster
implementation to support the command. The legacy
clustering implementation still does not support
this command.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
c6157b3510 Cluster refactor: Make clustering functions common
Move primary functions used to implement datapath
clustering into cluster.c, making them shared. This
required adding "accessor" and other functions to
abstract access to node details and cluster state.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
4afc54ad9b Cluster refactor: break up clusterCommand
Divide up clusterCommand into clusterCommand for shared
sub-commands and clusterCommandSpecial for implementation
specific sub-commands. So to, the cluster command help
sub-command has been divided into two implementations,
clusterCommandHelp and clusterCommandHelpSpecial. Some
common sub-subcommand implementations have been extracted
and their implemenations either made shared or else
implementation specific.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
33ef6a3003 Cluster refactor: s/clusterNodeGetSlotBit/clusterNodeCoversSlot/
Simple rename, "GetSlotBit" is implementation specific

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
ac1513221b Cluster refactor: Move items from cluster_legacy.c to cluster.c
Move (but do not change) some items from cluster_legacy.c
back info cluster.c. These items are shared code that all
clustering implementations will use.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
040cb6a4aa Cluster refactor: verifyClusterNodeId need not be 'public'
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
4944eda696 Cluster refactor: Move more stuff from cluster.h to cluster_legacy.h
More declerations can be moved into cluster_legacy.h
as they are not requied for the cluster api. The code
was simply moved, not changed in any way.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:03 +02:00
Josh Hershberg
d9a0478599 Cluster refactor: Make clusterNode private
Move clusterNode into cluster_legacy.h.
In order to achieve this some accessor methods
were added and also a refactor of how debugCommand
handles cluster related subcommands.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:50:46 +02:00
Josh Hershberg
98a6c44b75 Cluster refactor: Make clusterState private
Move clusterState into cluster_legacy.h. In order to achieve
this some "accessor" methods needed to be added to the
cluster API and some other minor refactors.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:44:10 +02:00
Binbin
2c41b13505
Block DEBUG POPULATE in loading and async-loading (#12790)
When we are loading data, it is not safe to generate data
through DEBUG POPULATE. POPULATE may generate keys, causing
panic when loading data with duplicate keys.
2023-11-21 17:00:27 +02:00
Josh Hershberg
5292adb985 Cluster refactor: Move trivial stuff into cluster_legacy.h
Move some declerations from cluster.h to cluster_legacy.h.
The items moved are specific to the legacy clustering
implementation and DO NOT require any other refactoring
other than moving them from one file to another.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-21 12:49:14 +02:00
Josh Hershberg
6a6ae6ffe8 Cluster refactor: Create new cluster.c and include of cluster_legacy.h
create new cluster.c

Signed-off-by: Josh Hershberg <yehoshua@redis.com>

forgot to #include cluster_legacy.h

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-21 12:49:14 +02:00
Josh Hershberg
86915775f1 Cluster refactor: rename cluster.c -> cluster_legacy.c
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-21 12:49:14 +02:00
iKun
4278ed8de5
redis-cli --bigkeys ,--hotkeys and --memkeys to replica in cluster mode (#12735)
Make redis-cli --bigkeys and --memkeys usable on a replicas in cluster
mode, by sending the READONLY command. This is only if -c is also given.

We don't detect if a node is a master or a replica so we send READONLY
in both cases. The READONLY has no effect on masters.

    Release notes:
    Make redis-cli --bigkeys and --memkeys usable on cluster replicas

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-11-20 09:33:28 +02:00
Hwang Si Yeon
a1f91ffa18
Add an explanation for URI with -u in redis-cli --help (#12751)
Add documentation of the URI format in the `--help` output of
`redis-cli` and `redis-benchmark`.

In particular, it's good for users to know that they need to specify
"default" as the username when authenticating without a username. Other
details of the URI format are described too, like scheme and dbnum.

It used to be possible to connect to Redis using an URL with an empty
username, like `redis-cli -u redis://:PASSWORD@localhost:6379/0`. This
was broken in 6.2 (#8048), and there was a discussion about it #9186.
Now, users need to specify "default" as the username and it's better to
document it.

Refer to #12746 for more details.
2023-11-19 15:09:14 +02:00
Wen Hui
5a1f4b9aec
Adding missing SWAPDB related test cases. (#12769)
We have some test cases of swapdb with watchkey but missing seperate
basic swapdb test cases, unhappy path and flushdb after swapdb. So added
the test cases in keyspace.tcl.
2023-11-19 12:44:48 +02:00
Binbin
3d9c427f8c
Fix timing issue in CLUSTER SLAVE / REPLICAS consistent test (#12774)
CI reports that this test failed, the reason is because during
the command processing, the node processed PING/PONG, resulting
in ping_sent or pong_received mismatch.

Change to use MULTI to avoid timing issue. The test was introduced
in #12224.
2023-11-19 11:09:33 +02:00
zhaozhao.zz
eb392c0a6f
replace calculateKeySlot with known slot in evictionPoolPopulate (#12777) 2023-11-17 19:01:06 +08:00
zhaozhao.zz
6013122c8e
check dbSize when rewrite AOF, in case unnecessary SELECT (#12775)
Introduced by #11695, should skip empty db in case unnecessary SELECT.
2023-11-17 19:00:26 +08:00
Binbin
9b6dded421
Fix empty rehashing list in swapdb mode (#12770)
In swapdb mode, the temp db does not init the rehashing list,
the change added in #12764 caused cluster ci to fail.
2023-11-16 11:18:25 +02:00
Binbin
4366bbaa61
Empty rehashing list in emptyDbStructure (#12764)
This is currently harmless, since we have already cleared the dict
before, it will reset the rehashidx to -1, and in incrementallyRehash
we will call dictIsRehashing to check.

It would be nice to empty the list to avoid meaningless attempts, and
the code is also unified to reduce misunderstandings.
2023-11-15 07:55:34 +02:00
Binbin
fe36306340
Fix DB iterator not resetting pauserehash causing dict being unable to rehash (#12757)
When using DB iterator, it will use dictInitSafeIterator to init a old safe
dict iterator. When dbIteratorNext is used, it will jump to the next slot db
dict when we are done a dict. During this process, we do not have any calls to
dictResumeRehashing, which causes the dict's pauserehash to always be > 0.

And at last, it will be returned directly in dictRehashMilliseconds, which causes
us to have slot dict in a state where rehash cannot be completed.

In the "expire scan should skip dictionaries with lot's of empty buckets" test,
adding a `keys *` can reproduce the problem stably. `keys *` will call dbIteratorNext
to trigger a traversal of all slot dicts.

Added dbReleaseIterator and dbIteratorInitNextSafeIterator methods to call dictResetIterator.
Issue was introduced in #11695.
2023-11-14 14:28:46 +02:00
Yossi Gottlieb
a9e73c00bc
Reduce FreeBSD daily scope. (#12758)
The full test is very flaky running on a VM inside GitHub worker, so we
have to settle for only building and running a small smoke test.
2023-11-13 17:22:09 +02:00
Roshan Khatri
88e83e517b
Add DEBUG_ASSERTIONS option to custom assert (#12667)
This PR introduces a new macro, serverAssertWithInfoDebug, to do complex assertions only for debugging. The main intention is to allow running complex operations during tests without impacting runtime performance. This assertion is enabled when setting DEBUG_ASSERTIONS.

The DEBUG_ASSERTIONS flag is set for the daily and CI variants of `test-sanitizer-address`.
2023-11-11 20:31:34 -08:00
Harkrishn Patro
9ca8490315
Increase timeout for expiry cluster tests (#12752)
Test recently added fails on timeout in valgrind in GH actions.

Locally with valgrind the test finishes within 1.5 sec(s). Couldn't find
any issue due to lack of reproducibility. Increasing the timeout and
adding an additional log to the test to understand how many keys
were left at the end.
2023-11-11 12:01:04 +02:00
zhaozhao.zz
6258edebf0
reset bucket_count when empty db (#12750)
Introduced in #12697 , should reset bucket_count when empty db, or the overhead memory usage of db can be miscalculated.
2023-11-10 15:52:57 +02:00
zhaozhao.zz
cf6ed3feeb
fix the wrong judgement for activerehashing in standalone mode (#12741)
Introduced by #11695, the judgement should be dictIsRehashing.
2023-11-09 11:30:50 +02:00