This way the behavior is very similar to the past one.
This is useful in order to remember the user she probably failed to
configure a password correctly.
This is needed in order to model the current behavior of authenticating
the connection directly when no password is set. Now with ACLs this will
be obtained by setting the default user as "nopass" user. Moreover this
flag can be used in order to create other users that do not require any
password but will work with "AUTH username <any-password>".
However we should remove this fake client ad-hoc creation, and replace
it with the proper call to createClient(-1), and then adjust the fake
client as we like.
This addresses two problems, one where infinite (negative) repeat count is broken for all types for Redis,
and another specific to cluster mode where redirection is needed.
Now allows and works correctly for negative (i.e. -1) repeat values passed with `-r` argument to redis-cli
as documented here https://redis.io/topics/rediscli#continuously-run-the-same-command which seems to have
regressed as a feature in 95b988 (though that commit removed bad integer wrap-around to `0` behaviour).
This broken behaviour exists currently (e50458), and redis-cli will just exit immediately with repeat `-r <= 0`
as opposed to send commands indefinitely as it should with `-r < 0`
Additionally prevents a repeat * interval seconds hang/time spent doing nothing at the start before issuing
commands in cluster mode (`-c`), where the command needed to redirect to a slot on another node, as commands
where failing and waiting to be reissued but this was fully repeated before being reissued. For example,
redis-cli -c -r 10 -i 0.5 INCR test_key_not_on_6379
Would hang and show nothing for 5 seconds (10 * 0.5) before showing
(integer) 1
(integer) 2
(integer) 3
(integer) 4
(integer) 5
(integer) 6
(integer) 7
(integer) 8
(integer) 9
(integer) 10
at half second intervals as intended.
If a key exists in the target node during a migration (BUSYKEY),
the value of the key on both nodes (source and target) will be compared.
If the key has the same value on both keys, the migration will be
automatically retried with the REPLACE argument in order to override
the target's key.
If the key has different values, the behaviour will depend on such
cases:
- In case of 'fix' command, the migration will stop and the user
will be warned to manually check the key(s).
- In other cases (ie. reshard), if the user launched the command
with the --cluster-replace option, the migration will be
retried with the REPLACE argument, elsewhere the migration will
stop and the user will be warned.
When loading data, we call processEventsWhileBlocked
to process events and execute commands.
But if we are loading AOF it's dangerous, because
processCommand would call freeMemoryIfNeeded to evict,
and that will break data consistency, see issue #5686.
Thanks to @soloestoy for discovering this issue in #5667.
This is an alternative fix in order to avoid both cycling the clients
and also disconnecting clients just having valid read-only transactions
pending.
these metrics become negative when RSS is smaller than the used_memory.
This can easily happen when the program allocated a lot of memory and haven't
written to it yet, in which case the kernel doesn't allocate any pages to the process
In MEMORY USAGE command, we count the key argv[2] into usage,
but the argument in command may contains free spaces because of
sdsMakeRoomFor. But the key in db never contains free spaces
because we use sdsdup when dbAdd, so using the real key to
count the usage is more accurate.
If we encounter an unsupported protocol in the "bind" list, don't
ipso-facto consider it a fatal error. We continue to abort startup if
there are no listening sockets at all.
This ensures that the lack of IPv6 support does not prevent Redis from
starting on Debian where we try to bind to the ::1 interface by default
(via "bind 127.0.0.1 ::1"). A machine with IPv6 disabled (such as some
container systems) would simply fail to start Redis after the initiall
call to apt(8).
This is similar to the case for where "bind" is not specified:
https://github.com/antirez/redis/issues/3894
... and was based on the corresponding PR:
https://github.com/antirez/redis/pull/4108
... but also adds EADDRNOTAVAIL to the list of errors to catch which I
believe is missing from there.
This issue was raised in Debian as both <https://bugs.debian.org/900284>
& <https://bugs.debian.org/914354>.
This really helps spot it in the logs, otherwise it does not look like a
warning/error. For example:
Creating Server TCP listening socket ::1:6379: bind: Cannot assign requested address
... is not nearly as clear as:
Could not create server TCP listening listening socket ::1:6379: bind: Cannot assign requested address
This commit fixes#5570. It is a similar bug to one fixed a few weeks
ago and is due to the range API to be called with NULL as "end ID"
parameter instead of repeating again the start ID, to be sure that we
selectively issue the entry with a given ID, or we get zero returned
(and we know we should emit a NULL reply).
This fixes the issue reported in #5570.
This was fixed the hard way, that is, propagating more information to
the lower level API about this being a request to read just the history,
so that the code is simpler and less likely to regress.
- clusterManagerFixOpenSlot: ensure that the
slot is unassigned before ADDSLOTS
- clusterManagerFixSlotsCoverage: after cold
migration, the slot configuration
is now updated on all the nodes.
This bug had a double effect:
1. Sometimes entries may not be emitted, producing broken protocol where
the array length was greater than the emitted entires, blocking the
client waiting for more data.
2. Some other time the right entry was claimed, but a wrong entry was
returned to the client.
This fix should correct both the instances.
server.hz was uninitialized between initServerConfig and initServer.
this can lead to someone (e.g. queued modules) doing createObject,
and accessing an uninitialized variable, that can potentially be 0,
and lead to a crash.
So far it was not possible to setup Sentinel with authentication
enabled. This commit introduces this feature: every Sentinel will try to
authenticate with other sentinels using the same password it is
configured to accept clients with.
So for instance if a Sentinel has a "requirepass" configuration
statemnet set to "foo", it will use the "foo" password to authenticate
with every other Sentinel it connects to. So basically to add the
"requirepass" to all the Sentinels configurations is enough in order to
make sure that:
1) Clients will require the password to access the Sentinels instances.
2) Each Sentinel will use the same password to connect and authenticate
with every other Sentinel in the group.
Related to #3279 and #3329.
Fake clients are used in special situations and are not linked to the
normal clients list, freeing them will always result in Redis crashing
in one way or the other.
It's not common to send replies to fake clients, but we have one usage
in the modules API. When a client is blocked, we associate to the
blocked client object (that is safe to manipulate in a thread), a fake
client that accumulates replies. So because of this bug there was
the problem described in issue #5443.
The fix was verified to work with the provided example module. To write
a regression is very hard and unlikely to be triggered in the future.
This adds support for passing a password through a REDISCLI_AUTH
environment variable (which is safer than the CLI), which might often be
safer than passing it through a CLI argument.
Passing a password this way does not trigger the warning about passing a
password through CLI arguments, and CLI arguments take precedence over
it.
The fix was removed by c8ca71d40 attempting to fix the stack generation
on ARM64, without testing if it would still work on ARM32.
Now it should work both sides.
They play better with Lua scripting, otherwise Lua will see status
replies as "ok" = "string" which is very odd, and actually as @oranagra
reasoned in issue #5456 in the rest of the Redis code base there was no
such concern as saving a few bytes when the protocol is emitted.
Related to #4840.
Note that when we re-enter the event loop with aeProcessEvents() we
don't process timers, nor before/after sleep callbacks, so we should
never end calling freeClientsInAsyncFreeQueue() when re-entering the
loop.
Related to #5201.
I removed the !!! Warning part since compared to the other errors, a
missing EXEC is in theory a normal happening in the AOF file, at least
in theory: may happen in a differnet number of situations, and it's
probably better to don't give the user the feeling that something really
bad happened.
This is useful in order to spot bugs where we fail
at updating the pointer returned by the insertion
function. Normally often the same pointer is returned,
making it harder than needed to spot bugs.
Related to #5210.
When HAVE_MALLOC_SIZE is false, each call to zrealloc causes used_memory
to increase by PREFIX_SIZE more than it should, due to mis-matched
accounting between the original zmalloc (which includes PREFIX size in
its increment) and zrealloc (which misses it from its decrement).
I've also supplied a command-line test to easily demonstrate the
problem. It's not wired into the test framework, because I don't know
TCL so I'm not sure how to automate it.
sdsZmallocSize assumes a dynamically allocated SDS. When given a string
object created by createEmbeddedStringObject, it calls zmalloc_size on a
pointer that isn't the one returned by zmalloc
There are two problems if we use lastcmd:
1. BRPOPLPUSH cannot be rewrited as RPOPLPUSH in multi/exec
In mulit/exec context, the lastcmd is exec.
2. Redis will crash when execute RPOPLPUSH loading from AOF
In fakeClient, the lastcmd is NULL.
Storing the context is useless, because we can't really reuse that
later. For instance in the API RM_DictNext() that returns a
RedisModuleString for the next key iterated, the user should pass the
new context, because we may run the keys of the dictionary in a
different context of the one where the dictionary was created. Also the
dictionary may be created without a context, but we may still demand
automatic memory management for the returned strings while iterating.
By using the "C" suffix for functions getting pointer/len, we can do the
same in the future for other modules APIs that need a variant with
pointer/len and that are now accepting a RedisModuleString.
The burden of having to always create RedisModuleString objects within a
module context was too much, especially now that we have threaded
operations and modules are doing more interesting things. The context in
the string API is currently only used for automatic memory managemnet,
so now the API was modified so that the user can opt-out this feature by
passing a NULL context.
During the full database resync we may still have unsaved changes
on the receiving side. This causes a race condition between
synced data rename/load and the rename of rdbSave tempfile.
SENTINEL REPLICAS was added as an alias, in the configuration rewriting
now it uses known-replica, however all the rest is basically at API
level of logged events and messages having to do with the protocol, so
there is very little to do here compared to the Redis core itself, to
preserve compatibility.
Aliases added for all the commands mentioning slave. Moreover CONFIG
REWRITE will use the new names, and will be able to reuse the old lines
mentioning the old options.
xadd with id * generates random stream id
xadd & xtrim with approximate maxlen count may
trim stream randomly
xinfo may get random radix-tree-keys/nodes
xpending may get random idletime
xclaim: master and slave may have different
idletime in stream
Here the idea is that we do not want freeMemoryIfNeeded() to propagate a
DEL command before the script and change what happens in the script
execution once it reaches the slave. For example see this potential
issue (in the words of @soloestoy):
On master, we run the following script:
if redis.call('get','key')
then
redis.call('set','xxx','yyy')
end
redis.call('set','c','d')
Then when redis attempts to execute redis.call('set','xxx','yyy'), we call freeMemoryIfNeeded(), and the key may get deleted, and because redis.call('set','xxx','yyy') has already been executed on master, this script will be replicated to slave.
But the slave received "DEL key" before the script, and will ignore maxmemory, so after that master has xxx and c, slave has only one key c.
Note that this patch (and other related work) was authored collaboratively in
issue #5250 with the help of @soloestoy and @oranagra.
Related to issue #5250.
The conclusion, that a xread request can be answered syncronously in
case that the stream's last_id is larger than the passed last-received-id
parameter, assumes, that there must be entries present, which could be
returned immediately.
This assumption fails for empty streams that actually contained some
entries which got removed by xdel, ... .
As result, the client is answered synchronously with an empty result,
instead of blocking for new entries to arrive.
An additional check for a non-empty stream is required.
If we are going to read a large object from network
try to make it likely that it will start at c->querybuf
boundary so that we can optimize object creation
avoiding a large copy of data.
But only when the data we have not parsed is less than
or equal to ll+2. If the data length is greater than
ll+2, trimming querybuf is just a waste of time, because
at this time the querybuf contains not only our bulk.
It's easy to reproduce the that:
Time1: call `client pause 10000` on slave.
Time2: redis-benchmark -t set -r 10000 -d 33000 -n 10000.
Then slave hung after 10 seconds.
Technically speaking we don't really need to put the master client in
the clients that need to be processed, since in practice the PING
commands from the master will take care, however it is conceptually more
sane to do so.
Processing command from the master while the slave is in busy state is
not correct, however we cannot, also, just reply -BUSY to the
replication stream commands from the master. The correct solution is to
stop processing data from the master, but just accumulate the stream
into the buffers and resume the processing later.
Related to #5297.
To avoid copying buffers to create a large Redis Object which
exceeding PROTO_IOBUF_LEN 32KB, we just read the remaining data
we need, which may less than PROTO_IOBUF_LEN. But the remaining
len may be zero, if the bulklen+2 equals sdslen(c->querybuf),
in client pause context.
For example:
Time1:
python
>>> import os, socket
>>> server="127.0.0.1"
>>> port=6379
>>> data1="*3\r\n$3\r\nset\r\n$1\r\na\r\n$33000\r\n"
>>> data2="".join("x" for _ in range(33000)) + "\r\n"
>>> data3="\n\n"
>>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> s.settimeout(10)
>>> s.connect((server, port))
>>> s.send(data1)
28
Time2:
redis-cli client pause 10000
Time3:
>>> s.send(data2)
33002
>>> s.send(data3)
2
>>> s.send(data3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
socket.error: [Errno 104] Connection reset by peer
To fix that, we should check if remaining is greater than zero.
Note: this breaks backward compatibility with Redis 4, since now slaves
by default are exact copies of masters and do not try to evict keys
independently.
Function setProtocolError just records proctocol error
details in server log, set client as CLIENT_CLOSE_AFTER_REPLY.
It doesn't care about querybuf sdsrange, because we
will do it after procotol parsing.
This is an optimization for processing pipeline, we discussed a
problem in issue #5229: clients may be paused if we apply `CLIENT
PAUSE` command, and then querybuf may grow too large, the cost of
memmove in sdsrange after parsing a completed command will be
horrible. The optimization is that parsing all commands in queyrbuf
, after that we can just call sdsrange only once.
When the element new score is the same of prev/next node, the
lexicographical order kicks in, so we can safely update the node in
place only when the new score is strictly between the adjacent nodes
but never equal to one of them.
Technically speaking we could do extra checks to make sure that even if the
score is the same as one of the adjacent nodes, we can still update on
place, but this rarely happens, so probably not a good deal to make it
more complex.
Related to #5179.
Slaves and rebooting redis may have different radix tree struct,
by different stream* config options. So propagating approximated
MAXLEN to AOF/slaves may lead to date inconsistency.
If we rewrite the MAXLEN argument as zero when no trimming
was performed, date between master and slave and aof will
be inconsistent, because `xtrim maxlen 0` means delete all
entries in stream.
User: "is there a reason why redis server logs are missing the year in
the "date time"?"
Me: "I guess I did not imagine it would be stable enough to run for
several years".
Related to #5157. The PR author correctly indentified that the check was
duplicated, but removing the second one introduces a bug that was fixed
in the past (hence the duplication). Instead we can remove the first
instance of the check without issues.
on slower machines, the active defrag test tended to fail.
although the fragmentation ratio was below the treshold, the defragger was
still in the middle of a scan cycle.
this commit changes:
- the defragger uses the current fragmentation state, rather than the cache one
that is updated by server cron every 100ms. this actually fixes a bug of
starting one excess scan cycle
- the test lets the defragger use more CPU cycles, in hope that the defrag
will be faster, but also give it more time before we give up.
The slave sends \n keepalive messages to the master while parsing the rdb,
and later sends REPLCONF ACK once a second. rarely, the master recives both
a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n...
and it tries to trim two chars (\r\n) from the query buffer,
trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n...
then the master tries to process a command starting with '3' and replies to
the slave a bunch of -ERR and one +OK.
although the slave silently ignores these (prints a log message), this corrupts
the replication offset at the slave since the slave increases the replication
offset, and the master did not.
other than the fix in processInlineBuffer, i did several other improvments
while hunting this very rare bug.
- when redis replies with "unknown command" it includes a portion of the
arguments, not just the command name. so it would be easier to understand
what was recived, in my case, on the slave side, it was -ERR, but
the "arguments" were the interesting part (containing info on the error).
- about a year ago i added code in addReplyErrorLength to print the error to
the log in case of a reply to master (since this string isn't actually
trasmitted to the master), now changed that block to print a similar log
message to indicate an error being sent from the master to the slave.
note that the slave is marked as CLIENT_SLAVE only after PSYNC was received,
so this will not cause any harm for REPLCONF, and will only indicate problems
that are gonna corrupt the replication stream anyway.
- two places were c->reply was emptied, and i wanted to reset sentlen
this is a precaution (i did not actually see such a problem), since a
non-zero sentlen will cause corruption to be transmitted on the socket.
Reading the PR gave me the opportunity to better specify what the code
was doing in places where I was not immediately sure about what was
going on. Moreover I documented the structure in server.h so that people
reading the header file will immediately understand what the structure
is useful for.
A) slave buffers didn't count internal fragmentation and sds unused space,
this caused them to induce eviction although we didn't mean for it.
B) slave buffers were consuming about twice the memory of what they actually needed.
- this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time
but networking.c not storing more than 16k (partially fixed recently in 237a38737).
- besides it wasn't able to store half of the new string into one buffer and the
other half into the next (so the above mentioned fix helped mainly for small items).
- lastly, the sds buffers had up to 30% internal fragmentation that was wasted,
consumed but not used.
C) inefficient performance due to starting from a small string and reallocing many times.
what i changed:
- creating dedicated buffers for reply list, counting their size with zmalloc_size
- when creating a new reply node from, preallocate it to at least 16k.
- when appending a new reply to the buffer, first fill all the unused space of the
previous node before starting a new one.
other changes:
- expose mem_not_counted_for_evict info field for the benefit of the test suite
- add a test to make sure slave buffers are counted correctly and that they don't cause eviction
We don't want to increment the deliveries here, because the sysadmin
reset the consumer group so the desire is likely to restart processing,
and having the PEL polluted with old information is not useful but
probably confusing.
Related to #5111.
To simplify the semantics of blocking for a group, this commit changes
the implementation to better match the description we provide of
conusmer groups: blocking for > will make the consumer waiting for new
elements in the group. However blocking for any other ID will always
serve the local history of the consumer.
However it must be noted that the > ID is actually an alias for the
special ID ms/seq of UINT64_MAX,UINT64_MAX.
To detect when the group (or the whole key) is destroyed to send an
error to the consumers blocked in such group is a problem, so we leave
the consumers listening, the sysadmin is free to create or destroy
groups assuming she/he knows what to do. However a client may be blocked
in a given consumer group, that is later destroyed. Then the stream
receives new elements. In that case there is no sane behavior to serve
the consumer... but to report an error about the group no longer
existing.
More about detecting this synchronously and why it is not done:
1. Normally we don't do that, we leave clients blocked for other data
types such as lists.
2. When we free a stream object there is no longer information about
what was the key it was associated with, so while destroying the
consumer groups we miss the info to unblock the clients in that moment.
3. Objects can be reclaimed in other threads where it is no longer safe
to do client operations.
When a client blocks for a consumer group, we don't know the actual ID
we want to be served: other clients blocked in the same consumer group
may be served first, so the consumer group latest delivered ID changes.
This was not handled correctly, all the clients in the consumer group
were unblocked without data but the first.
With such information will be able to use a private localtime()
implementation serverLog(), which does not use any locking and is both
thread and fork() safe.
PR #5081 fixes an "interesting" bug about Redis Cluster failover but in
general about the updating of repl_down_since, that is used in order to
count the time a slave was left disconnected from its master.
While the fix provided resolves the specific issue, in general the
validity of repl_down_since is limited to states that are different
than the state CONNECTED, and the disconnected time is set when the
state is DISCONNECTED. However from CONNECTED to other states, the state
machine must always go to DISCONNECTED first. So it makes sense to set
the field to zero (since it is meaningless in that context) when the
state is set to CONNECTED.
Instead of telling the user to set the renamed command to "" to remove
the renaming, to the obvious thing when a command is renamed to itself.
So if I want to remove the renaming of PING, I just rename it to PING
again.
Unlike the BZPOP variants, these functions take a single key. This fixes
an erroneous CROSSSLOT error when passing a count to a cluster enabled
server.
RESTORE now supports:
1. Setting LRU/LFU
2. Absolute-time TTL
Other related changes:
1. RDB loading will not override LRU bits when RDB file
does not contain the LRU opcode.
2. RDB loading will not set LRU/LFU bits if the server's
maxmemory-policy does not match.
this reduces the extra 8 bytes we save before each pointer.
but more importantly maybe, it makes the valgrind runs to be more similiar
to our normal runs.
note: the change in malloc_stats struct in server.h is to eliminate an name conflict.
structs that are not typedefed are resolved from a separate name space.
due to incorrect forward declaration, it didn't provide all arguments.
this lead to random value being read from the stack and return of incorrect time,
which in this case doesn't matter since no one uses it.
Basically we cannot be sure that if the key is expired while writing the
AOF, the main thread will surely find the key expired. There are
possible race conditions like the moment at which the "now" is sampled,
and the fact that time may jump backward.
Think about the following:
SET a 5
EXPIRE a 1
AOF rewrite starts after about 1 second. The child process finds the key
expired, while in the main thread instead an INCR command is called
against the key "a" immediately after a fork, and the scheduler was
faster to give execution time to the main thread, so "a" is yet not
expired.
The main thread will generate an INCR a command to the AOF log that will
be appended to the rewritten AOF file, but that INCR command will target
a non existin "a" key, so a new non volatile key "a" will be created.
Two observations:
A) In theory by computing "now" before the fork, we should be sure that
if a key is expired at that time, it will be expired later when the
main thread will try to access to such key. However this does not take
into account the fact that the computer time may jump backward.
B) Technically we may still make the process safe by using a monotonic
time source.
However there were other similar related bugs, and in general the new
"vision" is that Redis persistence files should represent the memory
state without trying to be too smart: this makes the design more
consistent, bugs less likely to arise from complex interactions, and in
the end what is to fix is the Redis expire process to have less expired
keys in RAM.
Thanks to Oran Agra and Guy Benoish for writing me an email outlining
this problem, after they conducted a Redis 5 code review.
The old version could not handle the fact that "STREAMS" is a valid key
name for streams. Now we really try to parse the command like the
command implementation would do.
Related to #5028 and 4857.
The loop allocated a buffer for the right number of keys positions, then
overflowed it going past the limit.
Related to #4857 and cause of the memory violation seen in #5028.
Now a MAXLEN of 0 really does what it means: it will create a zero
entries stream. This is useful in order to make sure that the behavior
is identical to XTRIM, that must be able to reduce the stream to zero
elements when MAXLEN is given.
Also now MAXLEN with a count < 0 will return an error.