Fixes test issues introduced in #9167:
1. Invalid reads due to accessing a non-retained string (passed as the unblock context).
2. Leaking the module blocked client context, see #6922 for info.
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.
Before this commit, `CLIENT UNBLOCK` would result in a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.
This commit makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
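A minimal sketch of the module pattern in question (module, command, and function names here are hypothetical, not from the commit): the command blocks the client with no timeout callback and lets a background thread unblock it.

```c
#include "redismodule.h"
#include <pthread.h>

/* Reply callback: runs once the worker thread unblocks the client. */
int Slow_Reply(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    return RedisModule_ReplyWithSimpleString(ctx, "done");
}

void *Slow_Worker(void *arg) {
    RedisModuleBlockedClient *bc = arg;
    /* long-running work, possibly on a thread safe context */
    RedisModule_UnblockClient(bc, NULL);
    return NULL;
}

int Slow_Command(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    /* NULL timeout callback and timeout_ms == 0: the client blocks until
     * the worker unblocks it. Before this fix, CLIENT UNBLOCK on such a
     * client crashed by invoking the NULL callback. */
    RedisModuleBlockedClient *bc =
        RedisModule_BlockClient(ctx, Slow_Reply, NULL, NULL, 0);
    pthread_t tid;
    pthread_create(&tid, NULL, Slow_Worker, bc);
    return REDISMODULE_OK;
}

int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
    REDISMODULE_NOT_USED(argv);
    REDISMODULE_NOT_USED(argc);
    if (RedisModule_Init(ctx, "slowmod", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
        return REDISMODULE_ERR;
    return RedisModule_CreateCommand(ctx, "slowmod.slow", Slow_Command, "", 0, 0, 0);
}
```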
For the sdscatfmt function in sds.c, when the parameter fmt ends with '%',
the behavior is undefined. This commit fixes that bug.
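A minimal repro sketch of the trigger (our own example, not from the commit):

```c
#include <stdio.h>
#include "sds.h"

int main(void) {
    sds s = sdsempty();
    /* fmt ends with a lone '%': before this fix the format parser read
     * past the end of the string (undefined behavior). */
    s = sdscatfmt(s, "progress: %i%", 42);
    printf("%s\n", s);
    sdsfree(s);
    return 0;
}
```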
Co-authored-by: stafuc <stafuc@gmail.com>
Before this commit, redis-server started in sentinel mode if the first startup
argument contained the string redis-sentinel, so Redis would also start in sentinel
mode if the directory it was started from contained the string redis-sentinel.
Now we check the executable name instead of the directory (a sketch follows the examples below).
Some examples:
1. Execute ./redis-sentinel/redis/src/redis-sentinel, starts in sentinel mode.
2. Execute ./redis-sentinel/redis/src/redis-server, starts in server mode;
before this fix, Redis would start in sentinel mode.
3. Execute ./redis-sentinel/redis/src/redis-server --sentinel, starts in
sentinel mode, as before.
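A sketch of the new check (function and variable names are illustrative, simplified from the actual code):

```c
#include <string.h>

/* Look only at the basename of argv[0], so a "redis-sentinel" substring
 * in a parent directory no longer triggers sentinel mode. */
static int looksLikeSentinel(int argc, char **argv) {
    for (int j = 1; j < argc; j++)
        if (strcmp(argv[j], "--sentinel") == 0) return 1;
    char *exec_name = strrchr(argv[0], '/');   /* last path component */
    exec_name = exec_name ? exec_name + 1 : argv[0];
    return strstr(exec_name, "redis-sentinel") != NULL;
}
```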
This seems to be a minor bug that was introduced accidentally: if the user does not specify LIMIT in streamParseAddOrTrimArgsOrReply, the initial value of args->limit is 100 * server.stream_node_max_entries, which may overflow (go out of bounds); when it does, the implicit limit in XADD stops working (the failure occurs in streamTrim).
Additionally, provide a sane default for args->limit in case stream_node_max_entries is set to 0.
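A hedged sketch of the resulting defaulting logic (the `limit_given` flag and the fallback constant are assumptions, not the exact code):

```c
/* No LIMIT was given: derive one from stream-node-max-entries, guarding
 * both against overflow of the multiplication and a config value of 0. */
if (!limit_given) {
    long long limit = 100 * server.stream_node_max_entries;
    if (limit <= 0) limit = 10000; /* assumed sane fallback */
    args->limit = limit;
}
```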
Co-authored-by: lizhaolong.lzl <lizhaolong.lzl@B-54MPMD6R-0221.local>
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: guybe7 <guy.benoish@redislabs.com>
A change in Redis 6.2 caused `redis-cli --rdb` that's directed to stdout to fail, because fsync fails on stdout.
This commit avoids doing ftruncate (fails with a warning) and fsync (fails with an error) when the
output file is `-`, and adds the missing documentation that `-` means stdout.
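A simplified sketch of the guard (function and variable names are approximate):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

/* Skip ftruncate()/fsync() when the RDB payload goes to stdout ("-"). */
static void finishRdbOutput(int fd, const char *filename, off_t payload_size) {
    int write_to_stdout = (strcmp(filename, "-") == 0);
    if (write_to_stdout) return;
    if (ftruncate(fd, payload_size) == -1)
        fprintf(stderr, "ftruncate failed: %s\n", strerror(errno));
    if (fsync(fd) == -1) {
        fprintf(stderr, "Fail to fsync '%s': %s\n", filename, strerror(errno));
        exit(1);
    }
}
```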
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Wang Yuan <wangyuancode@163.com>
1. Add one key-value pair to myhash, where the lengths of both the key and the value are less than hash-max-ziplist-value, for example:
>hset myhash key value
2. Then execute the following command
>hsetnx myhash key value1 (where the length is greater than hash-max-ziplist-value)
3. This adds nothing, but the encoding of "myhash" changes from ziplist to dict, even though "myhash" still holds only one key-value pair whose key and value are both shorter than hash-max-ziplist-value (see the sketch below).
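A simplified sketch of the fix idea, based on the t_hash.c helpers (details approximate): perform the ziplist-to-dict conversion check only when the field will actually be inserted.

```c
/* Inside hsetnxCommand(), after looking up the hash object o: */
if (hashTypeExists(o, c->argv[2]->ptr)) {
    addReply(c, shared.czero);   /* field exists: HSETNX inserts nothing,
                                  * so don't touch the encoding at all */
} else {
    hashTypeTryConversion(o, c->argv, 2, 3); /* convert only on real insert */
    hashTypeSet(o, c->argv[2]->ptr, c->argv[3]->ptr, HASH_SET_COPY);
    addReply(c, shared.cone);
}
```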
```
*** [err]: PSYNC2: total sum of full synchronizations is exactly 4 in tests/integration/psync2.tcl
Expected 5 == 4 (context: type eval line 8 cmd {assert {$sum == 4}} proc ::test)
```
Sometimes the test got an unexpected full sync: when a replica switched to master,
before the new master propagated the new replid to all replicas,
a replica attempted to sync with it using the wrong replid and triggered a full resync.
Consider this scenario:
1 slaveof 4 full resync
0 slaveof 4 full resync
2 slaveof 0 full resync
3 slaveof 1 full resync
1 slaveof no one, replid changed
3 reconnect 1, did a partial resync and got the new replid.
Before 2 inherited the new replid:
3 slaveof 2
3 tried to do a partial resync with 2.
But their replication ids are inconsistent, so a full resync happens.
A special thank you to Oran for helping me with this test case. :)
Co-authored-by: Oran Agra <oran@redislabs.com>
In the original version, the operation of traversing the stack only seems to
reconstruct the part of the key that does not contain the current node.
But in fact we already have the matched length and splitpos of the key from
raxLowWalk, so I think we can simplify the logic of this part.
Co-authored-by: lizhaolong.lzl <lizhaolong.lzl@B-54MPMD6R-0221.local>
Returned a bad score when used with a negative count (or a count of 1) and a non-ziplist encoded zset.
Also adds a test to validate the return value and cover the issue.
In the past, the reply list was a list of sds objects, so this didn't add any overhead,
but now addReplySds just copies the data out of the sds and frees it, so there's no
need to make a copy of the buffer before copying it again.
This removes an excessive allocation, a free, and a memcpy.
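A conceptual before/after (the call site is our illustration; `addReplySds` and `addReplyProto` are the networking.c helpers named above):

```c
/* Before: copy buf into a freshly allocated sds, which addReplySds
 * then copies again into the reply buffers and frees. */
addReplySds(c, sdsnewlen(buf, len));

/* After: append straight from the source buffer, one copy in total,
 * saving an allocation, a memcpy and a free. */
addReplyProto(c, buf, len);
```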
In the past, the first bind address that was explicitly specified was
also used to bind outgoing connections. This could result in some
problems. For example: on some systems, using `bind 127.0.0.1` would
result in outgoing connections also binding to `127.0.0.1` and failing
to connect to remote addresses.
With the recent change to the way `bind` is handled, this presented
other issues:
* The default first bind address is '*' which is not a valid address.
* We make no distinction between user-supplied config that is identical
to the default, and the default config.
This commit addresses both these issues by introducing an explicit
configuration parameter to control the bind address on outgoing
connections.
The call to raxNext didn't really advance in the rax, since we were already on the last item.
Instead, all it does is check that it is indeed a valid item, so the new code is clearer.
The daily CI was broken by #9119; it seems that for cron-scheduled tasks, these `if`s aren't evaluated to false.
But it also turns out that workflow_dispatch is only able to run CI on branches in the main repo (not on PRs).
This is an attempt to overcome that by being able to checkout from any repo we want.
- Introduce a new sdssubstr API as a building block for sdsrange.
The API of sdsrange is often hard to work with and also has
corner cases that cause bugs. sdssubstr is easier to work with and
simplifies the implementation of sdsrange (see the sketch after this list).
- Revert the fix to RM_StringTruncate and just use sdssubstr instead of
sdsrange.
- Solve valgrind warnings from the new tests introduced by the previous
PR.
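A quick sketch of the two APIs side by side (hedged: sdssubstr takes an offset and a length, while sdsrange takes signed, possibly negative, start/end indexes; treat exact semantics as approximate):

```c
#include <stdio.h>
#include "sds.h"

int main(void) {
    /* sdsrange: signed start/end indexes, negative counts from the end. */
    sds a = sdsnew("Hello World");
    sdsrange(a, 6, -1);          /* a is now "World" */

    /* sdssubstr: plain offset + length, a simpler building block that
     * sdsrange can be implemented on top of. */
    sds b = sdsnew("Hello World");
    sdssubstr(b, 6, 5);          /* b is now "World" */

    printf("%s %s\n", a, b);
    sdsfree(a);
    sdsfree(b);
    return 0;
}
```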
* Specifying an empty `bind ""` configuration prevents Redis from listening on any TCP port. Before this commit, such configuration was not accepted.
* Using `CONFIG GET bind` will always return an explicit configuration value. Before this commit, if a bind address was not specified the returned value was empty (which was an anomaly).
Another behavior change is that modifying the `bind` configuration to a non-default value will NO LONGER implicitly disable protected-mode.
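For example, the first bullet now allows a configuration like this (grounded directly in the text above):

```
# redis.conf: do not listen on any TCP port at all
# (e.g. when only a unix socket is wanted).
bind ""
```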
Previously, passing 0 for newlen would not truncate the string at all.
This adds handling for that case, freeing the old string and creating a new empty one.
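A minimal sketch of the newly handled case (the key name and surrounding context are hypothetical):

```c
/* Truncating a string key to length 0 now yields an empty string;
 * previously a newlen of 0 left the value untouched. */
RedisModuleKey *key = RedisModule_OpenKey(ctx, keyname, REDISMODULE_WRITE);
RedisModule_StringTruncate(key, 0);
RedisModule_CloseKey(key);
```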
Other changes:
- Move `src/modules/testmodule.c` to `tests/modules/basics.c`
- Introduce that basic test into the test suite
- Add tests to cover StringTruncate
- Add `test-modules` build target for the main makefile
- Extend `distclean` build target to clean modules too
# replication-3.tcl
had a test timeout failure with valgrind on daily CI:
```
*** [err]: SLAVE can reload "lua" AUX RDB fields of duplicated scripts in tests/integration/replication-3.tcl
Replication not started.
```
replication took more than 70 seconds.
https://github.com/redis/redis/runs/2854037905?check_suite_focus=true
On my machine it takes only about 30, but I can see how 50 seconds isn't enough.
# replication.tcl
Loading was over too quickly in the FreeBSD daily CI:
```
*** [err]: slave fails full sync and diskless load swapdb recovers it in tests/integration/replication.tcl
Expected '0' to be equal to '1' (context: type eval line 44 cmd {assert_equal [s -1 loading] 1} proc ::start_server)
```
# rdb.tcl
Loading was over too quickly.
Increase the time loading takes, and decrease the amount of work we try to achieve in that time.
The `Tracking gets notification of expired keys` test in tracking.tcl
used to hang in valgrind CI quite a lot.
It turns out the reason is that with valgrind and a busy machine, the
server cron active expire cycle could easily run in the same event loop
as the command that created `mykey`, so that when the key got expired,
there were two change events to broadcast: one that set the key and one
that expired it. But since we used raxTryInsert, the client that was
associated with the "last" change was the one that created the key, so
the NOLOOP filtered that event.
This commit adds a test that reproduces the problem by using lazy expire
in a multi-exec which makes sure the key expires in the same event loop
as the one that added it.
Fixes #6792. Added support for REDIS_REPLY_SET in the raw and CSV output of `./redis-cli`.
To test, run:
./redis-cli -3 --csv COMMAND
./redis-cli -3 --raw COMMAND
Now they return results; before the change they failed with: "Unknown reply type: 10".
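A simplified sketch of the change (the formatter structure is approximate, not the exact redis-cli code): RESP3 set replies take the same path as arrays.

```c
switch (r->type) {
case REDIS_REPLY_ARRAY:
case REDIS_REPLY_SET:   /* previously unhandled: RESP3 sets (type 10)
                         * fell through to "Unknown reply type" */
    for (size_t i = 0; i < r->elements; i++) {
        sds item = cliFormatReplyRaw(r->element[i]);
        out = sdscatsds(out, item);
        sdsfree(item);
    }
    break;
}
```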
Open the log file only after parsing the entire config file, so that its
location isn't dependent on the order of configs (`dir` and `logfile`).
Also solves the problem of creating multiple log files if the `logfile`
directive appears many times in the config file.
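For example, before this change the following order misbehaved, because the log file was opened before `dir` changed the working directory:

```
# A relative logfile used to be resolved against the cwd at open time;
# with this fix the order of these two directives no longer matters.
logfile "redis.log"
dir /var/lib/redis
```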
Cleanups:
1: Re-introduce the debug leak subcommand in the help text.
It was mistakenly deleted in https://github.com/redis/redis/pull/5531.
2: Format the text.
Some text lacked commas, resulting in no line breaks.
3: Supplement the debug restart command description with its delay arg.
Due to the change in #9003, a long-standing bug was raised under `valgrind`.
This bug can cause the master-slave sync to take a very long time, causing the `pendingquerybuf.tcl` test to fail.
This problem does not only occur in master-slave sync; it is triggered whenever a big arg is greater than 32k.
Steps to reproduce:
```sh
dd if=/dev/zero of=bigfile bs=1M count=32
./src/redis-cli -x hset a a < bigfile
```
1) Make room for querybuf in processMultibulkBuffer, now the alloc of querybuf will be more than 32k.
2) If this happens to trigger the `clientsCronResizeQueryBuffer`, querybuf will be resized to 0.
3) Finally, in readQueryFromClient, we expand the querybuf non-greedily, from 0 to 32k.
The old code made room for querybuf greedily, so it needed only 11 expansions to reach 32M (16k*(2^11)),
but now we need 2048 (32*1024/16) expansions to reach it; the slow allocation under valgrind exposed the problem.
The fix for the excessive shrinking of the query buf to 0 will be handled in #5013 (that other change on its own can fix the failing test too), but the fix in this PR also fixes the failing test.
The fix in this PR makes the reading in `readQueryFromClient` more aggressive when working on a big arg, so that it is on par with the same code in `processMultibulkBuffer` (i.e. the two calls to `sdsMakeRoomForNonGreedy` should both use the bulk size).
In the code before this fix, the one in `readQueryFromClient` always used `readlen = PROTO_IOBUF_LEN`.
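A hedged sketch of the more aggressive read sizing in `readQueryFromClient` (simplified from the surrounding code):

```c
ssize_t readlen = PROTO_IOBUF_LEN;
/* When mid-way through a big multibulk argument, size the read (and the
 * non-greedy room we make) by the remaining bulk length instead of the
 * fixed 16k buffer, matching processMultibulkBuffer. */
if (c->reqtype == PROTO_REQ_MULTIBULK && c->multibulklen && c->bulklen != -1) {
    ssize_t remaining = (size_t)(c->bulklen+2) - sdslen(c->querybuf);
    if (remaining > 0) readlen = remaining;
}
c->querybuf = sdsMakeRoomForNonGreedy(c->querybuf, readlen);
```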
This commit improves the MEMORY USAGE command to include the internal fragmentation overheads of:
1. EMBSTR encoded strings
2. ziplist encoded zsets and hashes
3. List type nodes
Fix a test failure introduced by #9003.
The following case occurs when querybuf expansion allocates memory equal to (16*1024)k:
1) Build with ```CFLAGS=-DNO_MALLOC_USABLE_SIZE```.
2) ```malloc``` will not allocate more than requested under ```alpine```.
This will allow distros to use an "include conf.d/*.conf" statement in the default configuration file,
which will facilitate customization across upgrades/downgrades.
The change itself is trivial: instead of opening an individual file, the glob call creates a vector of files to open; each file is opened in turn, and its content is added to the configuration.
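For example, a distro's default configuration file can now end with a single glob include (grounded in the text above; the path is illustrative):

```
# Pull in any drop-in customizations, each matching file in turn.
include /etc/redis/conf.d/*.conf
```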