* SLOWLOG didn't record anything for blocked commands because the client
was reset and argv was already empty. there was a fix for this issue
specifically for modules, now it works for all blocked clients.
* The original command argv (before being re-written) was also reset
before adding the slowlog on behalf of the blocked command.
* Latency monitor is now updated regardless of the slowlog flags of the
command or its execution (their purpose is to hide sensitive info from
the slowlog, not hide the fact the latency happened).
* Latency monitor now uses real_cmd rather than c->cmd (which may be
different if the command got re-written, e.g. GEOADD)
Changes:
* Unify shared code between slowlog insertion in call() and
updateStatsOnUnblock(), hopefully prevent future bugs from happening
due to the later being overlooked.
* Reset CLIENT_PREVENT_LOGGING in resetClient rather than after command
processing.
* Add a test for SLOWLOG and BLPOP
Notes:
- real_cmd == c->lastcmd, except inside MULTI and Lua.
- blocked commands never happen in these cases (MULTI / Lua)
- real_cmd == c->cmd, except for when the command is rewritten (e.g.
GEOADD)
- blocked commands (currently) are never rewritten
- other than the command's CLIENT_PREVENT_LOGGING, and the
execution flag CLIENT_PREVENT_LOGGING, other cases that we want to
avoid slowlog are on AOF loading (specifically CMD_CALL_SLOWLOG will
be off when executed from execCommand that runs from an AOF)
the corrupt-dump-fuzzer test found a case where an access to a corrupt
stream would have caused accessing to uninitialized memory.
now it'll panic instead.
The issue was that there was a stream that says it has more than 0
records, but looking for the max ID came back empty handed.
p.s. when sanitize-dump-payload is used, this corruption is detected,
and the RESTORE command is gracefully rejected.
We sometimes see the crash report saying we were killed by a random
process even in cases where the crash was spontanius in redis.
for instance, crashes found by the corrupt-dump test.
It looks like this si_pid is sometimes left uninitialized, and a good
way to tell if the crash originated in redis or trigged by outside is to
look at si_code, real signal codes are always > 0, and ones generated by
kill are have si_code of 0 or below.
Reading CoW from /proc/<pid>/smaps can be slow with large processes on
some platforms.
This measures the time it takes to read CoW info and limits the duty
cycle of future updates to roughly 1/100.
As current_cow_size no longer represnets a current, fixed interval value
there is also a new current_cow_size_age field that provides information
about the age of the size value, in seconds.
pcall function runs another LUA function in protected mode, this means
that any error will be caught by this function and will not stop the LUA
execution. The script kill mechanism uses error to stop the running script.
Scripts that uses pcall can catch the error raise by the script kill mechanism,
this will cause a script like this to be unkillable:
local f = function()
while 1 do
redis.call('ping')
end
end
while 1 do
pcall(f)
end
The fix is, when we want to kill the script, we set the hook function to be invoked
after each line. This will promise that the execution will get another
error before it is able to enter the pcall function again.
Some operating systems (e.g., NetBSD and OpenBSD) have switched to
using a 64-bit integer for time_t on all platforms. This results in currently
harmless compiler warnings due to potential truncation.
These changes fix these minor portability concerns.
* Fix format string for systems with 64 bit time
* use llabs to avoid truncation with 64 bit time
In the long term we may want to move away from anet completely and have
everything implemented natively in connection.c, instead of having an
extra layer.
For now, just get rid of unused code.
In certain scenario start_server may think it failed to start a redis server
although it started successfully. in these cases, it'll not terminate it, and
it'll remain running when the test is over.
In start_server if config doesn't have bind (the minimal.conf in introspection.tcl),
it will try to bind ipv4 and ipv6. One may success while other fails. It will
output "Could not create server TCP listening socket".
wait_server_started uses this message to check whether instance started
successfully. So it will consider that it failed even though redis started successfully.
Additionally, in some cases it wasn't clear to users why the server exited,
since the warning message printed to the log, could in some cases be harmless,
and in some cases fatal.
This PR adds makes a clear distinction between a warning log message and
a fatal one, and changes the test suite to look for the fatal message.
lookupKeyReadOrReply and lookupKeyWriteOrReply might decide to reply to the user
with the given robj reply. This reply might be an error reply and if so addReply
function is used instead of addReplyErrorObject which will cause the error reply not
to be counted on stats. The fix checks the first char in the reply and if its '-' (error)
it uses addReplyErrorObject.
REDISMODULE_CTX_FLAGS_EVICT and REDISMODULE_CTX_FLAGS_MAXMEMORY
shouldn't be set when the module is run inside a replica that doesn't do eviction.
one may argue that the database is under eviction (the master does eviction and sends DELs to the replica).
but on the other hand, we don't really know the master's configuration.
all that matters is if the current instance does evictions or not.
listenToPort attempts to gracefully handle and ignore certain errors but does not store errno prior to logging, which in turn calls several libc functions that may overwrite errno.
This has been discovered due to libmusl strftime() always returning with errno set to EINVAL, which resulted with docker-library/redis#273.
1. moduleReplicateMultiIfNeeded should use server.in_eval like
moduleHandlePropagationAfterCommandCallback
2. server.in_eval could have been set to 1 and not reset back
to 0 (a lot of missed early-exits after in_eval is already 1)
Note: The new assertions in processCommand cover (2) and I added
two module tests to cover (1)
Implications:
If an EVAL that failed (and thus left server.in_eval=1) runs before a module
command that replicates, the replication stream will contain MULTI (because
moduleReplicateMultiIfNeeded used to check server.lua_caller which is NULL
at this point) but not EXEC (because server.in_eval==1)
This only affects modules as module.c the only user of server.in_eval.
Affects versions 6.2.0, 6.2.1
On a replica we do accept connections, even though commands accessing
the database will operate in read-only mode. But the server is still
already operational and processing commands.
Not sending the readiness notification means that on a HA setup where
the nodes all start as replicas (with replicaof in the config) with
a replica that cannot connect to the master server and which might not
come back in a predictable amount of time or at all, the service
supervisor will end up timing out the service and terminating it, with
no option to promote it to be the main instance. This seems counter to
what the readiness notification is supposed to be signaling.
Instead send the readiness notification when we start accepting
commands, and then send the various server status changes as that.
Fixes: commit 641c64ada1
Fixes: commit dfb598cf33
Bug 1:
When a module ctx is freed moduleHandlePropagationAfterCommandCallback
is called and handles propagation. We want to prevent it from propagating
commands that were not replicated by the same context. Example:
1. module1.foo does: RM_Replicate(cmd1); RM_Call(cmd2); RM_Replicate(cmd3)
2. RM_Replicate(cmd1) propagates MULTI and adds cmd1 to also_propagagte
3. RM_Call(cmd2) create a new ctx, calls call() and destroys the ctx.
4. moduleHandlePropagationAfterCommandCallback is called, calling
alsoPropagates EXEC (Note: EXEC is still not written to socket),
setting server.in_trnsaction = 0
5. RM_Replicate(cmd3) is called, propagagting yet another MULTI (now
we have nested MULTI calls, which is no good) and then cmd3
We must prevent RM_Call(cmd2) from resetting server.in_transaction.
REDISMODULE_CTX_MULTI_EMITTED was revived for that purpose.
Bug 2:
Fix issues with nested RM_Call where some have '!' and some don't.
Example:
1. module1.foo does RM_Call of module2.bar without replication (i.e. no '!')
2. module2.bar internally calls RM_Call of INCR with '!'
3. at the end of module1.foo we call RM_ReplicateVerbatim
We want the replica/AOF to see only module1.foo and not the INCR from module2.bar
Introduced a global replication_allowed flag inside RM_Call to determine
whether we need to replicate or not (even if '!' was specified)
Other changes:
Split beforePropagateMultiOrExec to beforePropagateMulti afterPropagateExec
just for better readability
Have a clear separation between in and out flags
Other changes:
delete dead code in RM_ZsetIncrby: if zsetAdd returned error (happens only if
the result of the operation is NAN or if score is NAN) we return immediately so
there is no way that zsetAdd succeeded and returned NAN in the out-flags
`master_sync_perc` and `loading_loaded_perc` don't have that sign,
and i think the info field should be a raw floating point number (the name suggests its units).
we already have `used_memory_peak_perc` and `used_memory_dataset_perc` which do add the `%` sign, but:
1) i think it was a mistake but maybe too late to fix now, and maybe not too late to fix for `current_fork_perc`
2) it is more important to be consistent with the two other "progress" "prec" metrics, and not with the "utilization" metric.
1. Add `redis-server test all` support to run all tests.
2. Add redis test to daily ci.
3. Add `--accurate` option to run slow tests for more iterations (so that
by default we run less cycles (shorter time, and less prints).
4. Move dict benchmark to REDIS_TEST.
5. fix some leaks in tests
6. make quicklist tests run on a specific fill set of options rather than huge ranges
7. move some prints in quicklist test outside their loops to reduce prints
8. removing sds.h from dict.c since it is now used in both redis-server and
redis-cli (uses hiredis sds)
The obtained process_rss was incorrect (the OS reports pages, not
bytes), resulting with many other fields getting corrupted.
This has been tested on FreeBSD but not other platforms.
When a quicklist has quicklist->compress * 2 nodes, then call
__quicklistCompress, all nodes will be decompressed and the middle
two nodes will be recompressed again. This violates the fact that
quicklist->compress * 2 nodes are uncompressed. It's harmless
because when visit a node, we always try to uncompress node first.
This only happened when a quicklist has quicklist->compress * 2 + 1
nodes, then delete a node. For other scenarios like insert node and
iterate this will not happen.
Scenario:
1. A module client is blocked on keys with a timeout
2. Shortly before the timeout expires, the key is being populated and signaled
as ready
3. Redis calls moduleTryServeClientBlockedOnKey (which replies to client) and
then moduleUnblockClient
4. moduleUnblockClient doesn't really unblock the client, it writes to
server.module_blocked_pipe and only marks the BC as unblocked.
5. beforeSleep kics in, by this time the client still exists and techincally
timed-out. beforeSleep replies to the timeout client (double reply) and
only then moduleHandleBlockedClients is called, reading from module_blocked_pipe
and calling unblockClient
The solution is similar to what was done in moduleTryServeClientBlockedOnKey: we
should avoid re-processing an already-unblocked client
* The `redis-cli --scan` output should honor output mode (set explicitly or implicitly), and quote key names when not in raw mode.
* Technically this is a breaking change, but it should be very minor since raw mode is by default on for non-tty output.
* It should only affect TTY output (human users) or non-tty output if `--no-raw` is specified.
* Added `--quoted-input` option to treat all arguments as potentially quoted strings.
* Added `--quoted-pattern` option to accept a potentially quoted pattern.
Unquoting is applied to potentially quoted input only if single or double quotes are used.
Fixes#8561, #8563
When the length of the quicklist is 1(only one zipmap), the rotate operation will cause
memory overlap when moving an entity from the tail of the zipmap to the head.
quicklistRotate is a dead code, so it has no impact on the existing code.
Since the API declared (as #define) in redismodule.h and uses
the CLIENT_ID_AOF that declared in the server.h, when
a module will want to make use of this API, it will get a compilation
error (module doesn't include the server.h).
The API was broken by d6eb3af (failed attempt for a cleanup).
Revert to the original version of RedisModule_IsAOFClient
that uses UINT64_MAX instead of CLIENT_ID_AOF
When no connected clients variables stopped being updated instead of being zeroed over time.
The other changes are cleanup and optimization (avoiding an unnecessary division per client)
When sanitizing the stream listpack, we need to count the deleted records too.
otherwise the last line that checks the next pointer fails.
Add test to cover that state in the stream tests.
Add ability to modify port, tls-port and bind configurations by CONFIG SET command.
To simplify the code and make it cleaner, a new structure
added, socketFds, which contains the file descriptors array and its counter,
and used for TCP, TLS and Cluster sockets file descriptors.
Because when the RM_Call is invoked. It will create a faker client.
The point is client connection is NULL, so server will crash in connGetInfo
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
The arm_thread_state64_get_pc used later in the file is defined
in mach kernel headers. Apparently they get included if you use
the system malloc but not if you use jemalloc.
DB ID used to be parsed as a long for SWAPDB command, now
make it into an int to be consistent with other commands that parses
the DB ID argument like SELECT, MOVE, COPY. See #8085
The implication is that the error message when the provided db index
is greater than 4M changes slightly.
This could happen on an invalid use, when trying to create a cluster with
a single node and provide it's address 3 time to satisfy redis-cli requirements.
A single client pointer is added in the server struct. This is
initialized by the first RM_Call() and reused for every subsequent
RM_Call() except if it's already in use, which means that it's not
used for (recursive) module calls to modules. For these, a new
"fake" client is created each time.
Other changes:
* Avoid allocating a dict iterator in pubsubUnsubscribeAllChannels
when not needed
* Add better control of malloc_usable_size() usage.
* Use malloc_usable_size on alpine libc daily job.
* Add no-malloc-usable-size daily jobs.
* Fix zmalloc(0) when HAVE_MALLOC_SIZE is undefined.
In order to align with the jemalloc behavior, this should never return
NULL or OOM panic.
This commit fixes a bug in what's currently dead code in redis.
In quicklistDelRange when delete entry from entry.offset to node tail,
extent only need gte node->count - entry.offset, not node->count
Co-authored-by: Yoav Steinberg <yoav@redislabs.com>
* Remove linux/version.h dependency.
This introduces unnecessary dependencies, and generally not a good idea
as the platform we build on may be different than the platform we run
on.
To determine if sync_file_range exists we can simply rely on header file
hints.
* Fix setproctitle() on libmusl.
The previous ifdef checks were a bit too strict for no apparent
reason.
* Fix tests failure on Linux with no backtrace.
* Add alpine daily CI job.
On 32-bit systems, setting the proto-max-bulk-len config parameter to a high value may result with integer overflow and a subsequent heap overflow when parsing an input bulk (CVE-2021-21309).
This fix has two parts:
Set a reasonable limit to the config parameter.
Add additional checks to prevent the problem in other potential but unknown code paths.
This validation was only done for sub-commands and not for commands.
These would have been valid (not produce any error)
ACL SETUSER bob +@all +client
ACL SETUSER bob +client +client
so no reason for this one to fail:
ACL SETUSER bob +client +client|id
One example why this is needed is that pfdebug wasn't part of the @hyperloglog
group and now it is. so something like:
acl setuser user1 +@hyperloglog +pfdebug|test
would have succeeded in early 6.0.x, and fail in 6.2 RC3
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
SRANDMEMBER with negative count (non unique) can return the same member
multiple times, and the order of elements in the returned collection matters.
For these reasons returning a RESP3 Set type is not valid for the negative
count, but also not really valid for the positive (unique) variant either (the
command returns an array of random picks, not a set)
This PR also contains a minor optimization for SRANDMEMBER, HRANDFIELD,
and ZRANDMEMBER, to avoid the temporary dict from being rehashed while it grows.
Co-authored-by: Oran Agra <oran@redislabs.com>
Originally this was limited to IPv6 address length, but effectively it
has been used for host names and now that Sentinel accepts that as well
we need to be able to store full hostnames.
Fixes#8507
When redis responds with tracking-redir-broken push message (RESP3),
it was responding with a broken protocol: an array of 3 elements, but only
pushes 2 elements.
Some bugs in the test make this pass. Read the push reply
will consume an extra reply, because the reply length is 3, but there
are only two elements, so the next reply will be treated as third
element. So the test is corrected too.
Other changes:
* checkPrefixCollisionsOrReply success should return 1 instead of -1,
this bug didn't have any implications.
* improve client tracking tests to validate more of the response it reads.
Respond with error if expire time overflows from positive to negative of vice versa.
* `SETEX`, `SET EX`, `GETEX` etc would have already error on negative value,
but now they would also error on overflows (i.e. when the input was positive but
after the manipulation it becomes negative, which would have passed before)
* `EXPIRE` and `EXPIREAT` was ok taking negative values (would implicitly delete
the key), we keep that, but we do error if the user provided a value that changes
sign when manipulated (except the case of changing sign when `basetime` is added)
Signed-off-by: Gnanesh <gnaneshkunal@outlook.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
The `dict` field `iterators` is misleading and incorrect.
This variable is used for 1 purpose - to pause rehashing.
The current `iterators` field doesn't actually count "iterators".
It counts "safe iterators". But - it doesn't actually count safe iterators
either. For one, it's only incremented once the safe iterator begins to
iterate, not when it's created. It's also incremented in `dictScan` to
prevent rehashing (and commented to make it clear why `iterators` is
being incremented during a scan).
This update renames the field as `pauserehash` and creates 2 helper
macros `dictPauseRehashing(d)` and `dictResumeRehashing(d)`
Avoid repeated reallocs growing the listpack while entries are being added.
This is done by pre-allocating the listpack to near maximum size, and using
malloc_size to check if it needs realloc or not.
When the listpack reaches the maximum number of entries, we shrink it to fit it's used size.
Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
Co-authored-by: Oran Agra <oran@redislabs.com>
* Adding current_save_keys_total and current_save_keys_processed info fields.
Present in replication, BGSAVE and AOFRW.
* Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress)
* Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW,
and module forks.
Avoids memmove and reallocs when replacing a ziplist element of the
same encoded size as the new value.
Affects HSET, HINRBY, HINCRBYFLOAT (via hashTypeSet) and LSET (via
quicklistReplaceAtIndex).