Commit Graph

7026 Commits

Author SHA1 Message Date
Yossi Gottlieb
750acf3a45
Fix CONFIG REWRITE of oom-score-adj-values. (#7761) 2020-09-08 16:00:20 +03:00
Oran Agra
573246f73c
if diskless repl child is killed, make sure to reap the pid (#7742)
Starting redis 6.0 and the changes we made to the diskless master to be
suitable for TLS, I made the master avoid reaping (wait3) the pid of the
child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic in case the
fork child was being killed (e.g. by the kernel OOM killer), in that
case there's a chance that we currently disabled the read event on the
rdb pipe, since we're waiting for a replica to become writable again.
and in that scenario the master would have never realized the child
exited, and the replica will remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. this mean that when the child exits,
for whatever reason, it is safe to reap it.

Besides that, i'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (5a47794).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
2020-09-06 16:43:57 +03:00
Yossi Gottlieb
58e5feb3f4
redis-cli: fix writeConn() buffer handling. (#7749)
Fix issues with writeConn() which resulted with corruption of the stream by leaving an extra byte in the buffer. The trigger for this is partial writes or write errors which were not experienced on Linux but reported on macOS.
2020-09-03 18:15:48 +03:00
Oran Agra
9ef8d2f671
Run active defrag while blocked / loading (#7726)
During long running scripts or loading RDB/AOF, we may need to do some
defragging. Since processEventsWhileBlocked is called periodically at
unknown intervals, and many cron jobs either depend on run_with_period
(including active defrag), or rely on being called at server.hz rate
(i.e. active defrag knows ho much time to run by looking at server.hz),
the whileBlockedCron may have to run a loop triggering the cron jobs in it
(currently only active defrag) several times.

Other changes:
- Adding a test for defrag during aof loading.
- Changing key-load-delay config to take negative values for fractions
  of a microsecond sleep
2020-09-03 08:47:29 +03:00
Pierre Jambet
d52ce4ea1a
Fix error message for the DEBUG ZIPLIST command (#7745)
DEBUG ZIPLIST <key> currently returns the following error string if the
key is not a ziplist: "ERR Not an sds encoded string.". This looks like
an accidental copy/paste error from the error returned in the else if
branch above where this string is returned if the key is not an sds
string. The command was added in
ac61f90625 and looking at the commit,
nothing indicates that it is not an accidental typo.

The error string now returns a correct error: "Not a ziplist encoded
object", which accurately describes the error.
2020-09-02 23:27:48 +03:00
Oran Agra
8b0747d657
Print server startup messages after daemonization (#7743)
When redis isn't configured to have a log file, having these prints
before damonization puts them in the calling process stdout rather than
/dev/null
2020-09-02 17:18:09 +03:00
Thandayuthapani
f22f64f0db
Add masters/replicas options to redis-cli --cluster call command (#6491)
* Add master/slave option in --cluster call command

* Update src/redis-cli.c

* Update src/redis-cli.c

Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-09-02 16:23:49 +03:00
Yossi Gottlieb
b35d6e5cff
Fix double-make issue with make && make install. (#7734)
All user-supplied variables that affect the build should be explicitly
persisted.

Fixes #7254
2020-09-01 10:02:14 +03:00
maohuazhu
ee4a15aae0
Optimize __ziplistCascadeUpdate algorithm (#6886)
The previous algorithm is of O(n^2) time complexity.
It would have run through the ziplist entries one by one, each time doing a `realloc` and a
`memmove` (moving the entire tail of the ziplist).

The new algorithm is O(n), it runs over all the records once, computing the size of the `realloc`
needed, then does one `realloc`, and run thought the records again doing many smaller `memmove`s,
each time moving just one record.

So this change reduces many reallocs, and moves each record just once.

Co-authored-by: zhumaohua <zhumaohua@megvii.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-08-28 17:22:35 +03:00
Jim Brunner
c01e94a431
Use H/W Monotonic clock and updates to AE (#7644)
Update adds a general source for retrieving a monotonic time.
In addition, AE has been updated to utilize the new monotonic
clock for timer processing.

This performance improvement is **not** enabled in a default build due to various H/W compatibility
concerns, see README.md for details. It does however change the default use of gettimeofday with
clock_gettime and somewhat improves performance.

This update provides the following
1. An interface for retrieving a monotonic clock. getMonotonicUs returns a uint64_t (aka monotime)
   with the number of micro-seconds from an arbitrary point. No more messing with tv_sec/tv_usec.
   Simple routines are provided for measuring elapsed milli-seconds or elapsed micro-seconds (the
   most common use case for a monotonic timer). No worries about time moving backwards.
2. High-speed assembler implementation for x86 and ARM. The standard method for retrieving the
   monotonic clock is POSIX.1b (1993): clock_gettime(CLOCK_MONOTONIC, timespec*). However, most
   modern processors provide a constant speed instruction clock which can be retrieved in a fraction
   of the time that it takes to call clock_gettime. For x86, this is provided by the RDTSC
   instruction. For ARM, this is provided by the CNTVCT_EL0 instruction. As a compile-time option,
   these high-speed timers can be chosen. (Default is POSIX clock_gettime.)
3. Refactor of event loop timers. The timer processing in ae.c has been refactored to use the new
   monotonic clock interface. This results in simpler/cleaner logic and improved performance.
2020-08-28 11:54:10 +03:00
Oran Agra
9fcd9e191e
Fix rejectCommand trims newline in shared error objects, hung clients (#7714)
65a3307bc (released in 6.0.6) has a side effect, when processCommand
rejects a command with pre-made shared object error string, it trims the
newlines from the end of the string. if that string is later used with
addReply, the newline will be missing, breaking the protocol, and
leaving the client hung.

It seems that the only scenario which this happens is when replying with
-LOADING to some command, and later using that reply from the CONFIG
SET command (still during loading). this will result in hung client.

Refactoring the code in order to avoid trimming these newlines from
shared string objects, and do the newline trimming only in other cases
where it's needed.

Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
2020-08-27 12:54:01 +03:00
Oran Agra
8bdcbbb085
Update memory metrics for INFO during loading (#7690)
During a long AOF or RDB loading, the memory stats were not updated, and
INFO would return stale data, specifically about fragmentation and RSS.
In the past some of these were sampled directly inside the INFO command,
but were moved to cron as an optimization.

This commit introduces a concept of loadingCron which should take
some of the responsibilities of serverCron.
It attempts to limit it's rate to approximately the server Hz, but may
not be very accurate.

In order to avoid too many system call, we use the cached ustime, and
also make sure to update it in both AOF loading and RDB loading inside
processEventsWhileBlocked (it seems AOF loading was missing it).
2020-08-27 11:09:32 +03:00
valentinogeron
b7289e912c
EXEC with only read commands should not be rejected when OOM (#7696)
If the server gets MULTI command followed by only read
commands, and right before it gets the EXEC it reaches OOM,
the client will get OOM response.

So, from now on, it will get OOM response only if there was
at least one command that was tagged with `use-memory` flag
2020-08-27 09:19:24 +03:00
filipe oliveira
21784def70
Extended redis-benchmark instant metrics and overall latency report (#7600)
A first step to enable a consistent full percentile analysis on query latency so that we can fully understand the performance and stability characteristics of the redis-server system we are measuring. It also improves the instantaneous reported metrics, and the csv output format.
2020-08-25 21:21:29 +03:00
Itamar Haber
5b0a06af48
Expands lazyfree's effort estimate to include Streams (#5794)
Otherwise, it is treated as a single allocation and freed synchronously. The following logic is used for estimating the effort in constant-ish time complexity:

1. Check the number of nodes.
1. Add an allocation for each consumer group registered inside the stream.
1. Check the number of PELs in the first CG, and then add this count times the number of CGs.
1. Check the number of consumers in the first CG, and then add this count times the number of CGs.
2020-08-25 15:58:50 +03:00
Wang Yuan
43af28f5b4
Fix wrong format specifiers of 'sdscatfmt' for the INFO command (#7706)
unlike printf, sdscatfmt doesn't take %d
2020-08-24 22:59:56 +03:00
Wang Yuan
6b4ae919e8
Fix data race in bugReportStart (#7700)
The previous fix using _Atomic was insufficient, since we check and set it in
different places.

The implications of this bug are just that a portion of the bug report will be shown
twice, in the race case of two concurrent crashes.
2020-08-24 13:54:33 +03:00
Valentino Geron
8b428cf0f7 Assert that setDeferredAggregateLen isn't called with negative value
In case the redis is about to return broken reply we want to crash
with assert so that we are notified about the bug. see #7687.
2020-08-23 16:03:30 +03:00
Valentino Geron
9204a9b2c2 Fix LPOS command when RANK is greater than matches
When calling to LPOS command when RANK is higher than matches,
the return value is non valid response. For example:
```
LPUSH l a
:1
LPOS l b RANK 5 COUNT 10
*-4
```
It may break client-side parser.

Now, we count how many replies were replied in the array.
```
LPUSH l a
:1
LPOS l b RANK 5 COUNT 10
*0
```
2020-08-23 16:03:30 +03:00
Wen Hui
e61adc0d89
fix make warnings (#7692) 2020-08-21 23:37:49 +03:00
Wen Hui
89f2bfbb58
use dictSlots for getting total slots number in dict (#7691) 2020-08-21 00:14:09 +03:00
huangzhw
a3d4d7bf68
RedisModuleEvent_LoadingProgress always at 100% progress (#7685)
It was also using the wrong struct, but luckily RedisModuleFlushInfo and RedisModuleLoadingProgress
are identical.
2020-08-20 23:31:06 +03:00
guybe7
65c24bd3d4
Modules: Invalidate saved_oparray after use (#7688)
We wanna avoid a chance of someone using the pointer in it after it'll be freed / realloced.
2020-08-20 19:55:14 +03:00
杨博东
cbaf3c5bba
Fix flock cluster config may cause failure to restart after kill -9 (#7674)
After fork, the child process(redis-aof-rewrite) will get the fd opened
by the parent process(redis), when redis killed by kill -9, it will not
graceful exit(call prepareForShutdown()), so redis-aof-rewrite thread may still
alive, the fd(lock) will still be held by redis-aof-rewrite thread, and
redis restart will fail to get lock, means fail to start.

This issue was causing failures in the cluster tests in github actions.

Co-authored-by: Oran Agra <oran@redislabs.com>
2020-08-20 08:59:02 +03:00
Raghav Muddur
34c3be365a
Update clusterMsgDataPublish to clusterMsgModule (#7682)
Correcting the variable to clusterMsgModule.
2020-08-19 19:13:32 -07:00
Madelyn Olson
cbd9af8583
Fixed hset error since it's shared with hmset (#7678) 2020-08-19 19:07:43 -07:00
Wang Yuan
89d544d6f2
Add comments on 'slave.repldboff' when use diskless replication (#7679) 2020-08-19 10:52:53 +03:00
guybe7
b87c288016
PERSIST should signalModifiedKey (Like EXPIRE does) (#7671) 2020-08-18 19:07:59 +03:00
Oran Agra
0f741a9e2d
OOM Crash log include size of allocation attempt. (#7670)
Since users often post just the crash log in github issues, the log
print that's above it is missing.
No reason not to include the size in the panic message itself.
2020-08-18 09:53:59 +03:00
Wen Hui
88662c243d
edit auth failed message (#7648)
Edit auth failed message include user disabled case in hello command
2020-08-18 08:59:24 +03:00
Wen Hui
93d87d6d4c
[module] using predefined REDISMODULE_NO_EXPIRE in RM_GetExpire (#7669)
It was already defined in the API header and the documentation, but not used by the implementation.
2020-08-18 08:50:03 +03:00
Oran Agra
cdd925b289
Trim trailing spaces in error replies coming from rejectCommand (#7668)
65a3307bc9 added rejectCommand which takes an robj reply and passes it
through addReplyErrorSafe to addReplyErrorLength.
The robj contains newline at it's end, but addReplyErrorSafe converts it
to spaces, and passes it to addReplyErrorLength which adds the protocol
newlines.

The result was that most error replies (like OOM) had extra two trailing
spaces in them.
2020-08-18 08:28:43 +03:00
Yossi Gottlieb
64c360c515
Module API: fix missing RM_CLIENTINFO_FLAG_SSL. (#7666)
The `REDISMODULE_CLIENTINFO_FLAG_SSL` flag was already a part of the `RedisModuleClientInfo` structure but was not implemented.
2020-08-17 17:46:54 +03:00
Yossi Gottlieb
fb2a94af3f
TLS: relax verification on CONFIG SET. (#7665)
Avoid re-configuring (and validating) SSL/TLS configuration on `CONFIG
SET` when TLS is not actively enabled for incoming connections, cluster
bus or replication.

This fixes failures when tests run without `--tls` on binaries that were
built with TLS support.

An additional benefit is that it's now possible to perform a multi-step
configuration process while TLS is disabled. The new configuration will
be verified and applied only when TLS is effectively enabled.
2020-08-17 17:36:50 +03:00
michael-grunder
275068c86f Use Hiredis' sdscompat.h to map sds* calls to hi_sds* 2020-08-15 13:13:23 -07:00
Nathan Scott
11cd983d58
Annotate module API functions in redismodule.h for use with -fno-common (#6900)
In order to keep the redismodule.h self-contained but still usable with
gcc v10 and later, annotate each API function tentative definition with
the __common__ attribute.  This avoids the 'multiple definition' errors
modules will otherwise see for all API functions at link time.

Further details at gcc.gnu.org/gcc-10/porting_to.html

Turn the existing __attribute__ ((unused)), ((__common__)) and ((print))
annotations into conditional macros for any compilers not accepting this
syntax.  These macros only expand to API annotations under gcc.

Provide a pre- and post- macro for every API function, so that they can
be defined differently by the file that includes redismodule.h.

Removing REDISMODULE_API_FUNC in the interest of keeping the function
declarations readable.

Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-08-14 14:45:34 +03:00
caozb
d10b2f3173
wait command optimization (#7333)
Client that issued WAIT last will most likely have the highest replication offset, so imagine a probably common case where all clients are waiting for the same number of replicas. we prefer the loop to start from the last client (one waiting for the highest offset), so that the optimization in the function will call replicationCountAcksByOffset for each client until it found a good one, and stop calling it for the rest of the clients.
the way the loop was implemented would mean that in such case it is likely to call replicationCountAcksByOffset for all clients.

Note: the change from > to >= is not directly related to the above.

Co-authored-by: 曹正斌 <caozb@jiedaibao.com>
2020-08-13 14:30:51 +03:00
RemRain
47637bea6d
Set the initial seed for random() (#5679) 2020-08-12 13:15:58 -07:00
Yossi Gottlieb
2530dc0ebd
Add oom-score-adj configuration option to control Linux OOM killer. (#1690)
Add Linux kernel OOM killer control option.

This adds the ability to control the Linux OOM killer oom_score_adj
parameter for all Redis processes, depending on the process role (i.e.
master, replica, background child).

A oom-score-adj global boolean flag control this feature. In addition,
specific values can be configured using oom-score-adj-values if
additional tuning is required.
2020-08-12 17:58:56 +03:00
HarveyLiu
f3df3ec134
fix misleading typo hasActiveChildProcess doc comment (#7588)
and a misspell in rax.c
2020-08-12 10:23:55 +03:00
Madelyn Olson
79c506ebf0
Fixed timer warning (#5953) 2020-08-12 11:16:41 +08:00
Madelyn Olson
f2db379fa3
Replace usage of wrongtypeerr with helper (#7633)
* Replace usage of wrongtypeerr with helper
2020-08-11 20:04:54 -07:00
Mota
fbed632f3a
Adds redis-cli and redis-benchmark dependencies for make test target
Obsoletes the need to run `make` before `make test`.
2020-08-11 22:01:15 +03:00
Wagner Francisco Mezaroba
e2a71338eb
allow --pattern to be used along with --bigkeys (#3586)
Adds --pattern option to cli's --bigkeys, --hotkeys & --scan modes
2020-08-11 21:57:21 +03:00
zhaozhao.zz
ff1e4a7063
redis-benchmark: fix wrong random key for hset (#4895) 2020-08-11 20:51:27 +08:00
Itamar Haber
28a8465102
Merge pull request #7618 from ShooterIT/benchmark-zset
[Redis-benchmark] Support zset type
2020-08-11 14:31:11 +03:00
Muhammad Zahalqa
5dd499c6d4
Fix unidentical function declaration in bio.c. lazyfree.c: lazyfreeFreeSlotsMapFromBioThread (#7228) 2020-08-11 19:16:10 +08:00
zhaozhao.zz
589e610ebc CLIENT_MASTER should ignore server.proto_max_bulk_len 2020-08-11 18:59:29 +08:00
zhaozhao.zz
bd4b33d7a2 config: proto-max-bulk-len must be 1mb or greater 2020-08-11 18:59:29 +08:00
zhaozhao.zz
2e69bfe44d using proto-max-bulk-len in checkStringLength for SETRANGE and APPEND 2020-08-11 18:59:29 +08:00
Tyson Andre
6f11acbd67
Implement SMISMEMBER key member [member ...] (#7615)
This is a rebased version of #3078 originally by shaharmor
with the following patches by TysonAndre made after rebasing
to work with the updated C API:

1. Add 2 more unit tests
   (wrong argument count error message, integer over 64 bits)
2. Use addReplyArrayLen instead of addReplyMultiBulkLen.
3. Undo changes to src/help.h - for the ZMSCORE PR,
   I heard those should instead be automatically
   generated from the redis-doc repo if it gets updated

Motivations:

- Example use case: Client code to efficiently check if each element of a set
  of 1000 items is a member of a set of 10 million items.
  (Similar to reasons for working on #7593)
- HMGET and ZMSCORE already exist. This may lead to developers deciding
  to implement functionality that's best suited to a regular set with a
  data type of sorted set or hash map instead, for the multi-get support.

Currently, multi commands or lua scripting to call sismember multiple times
would almost definitely be less efficient than a native smismember
for the following reasons:

- Need to fetch the set from the string every time
  instead of reusing the C pointer.
- Using pipelining or multi-commands would result in more bytes sent
  and received by the client for the repeated SISMEMBER KEY sections.
- Need to specially encode the data and decode it from the client
  for lua-based solutions.
- Proposed solutions using Lua or SADD/SDIFF could trigger writes to
  memory, which is undesirable on a redis replica server
  or when commands get replicated to replicas.

Co-Authored-By: Shahar Mor <shahar@peer5.com>
Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
2020-08-11 11:55:06 +03:00
Rajat Pawar
59d437c727
Fix comment about ACLGetCommandPerm() 2020-08-10 23:11:26 -07:00
Jim Brunner
f39c9404a3
Prevent dictRehashMilliseconds from rehashing while a safe iterator is present (#5948) 2020-08-10 22:36:34 -07:00
杨博东
229327ad8b
Avoid redundant calls to signalKeyAsReady (#7625)
signalKeyAsReady has some overhead (namely dictFind) so we should
only call it when there are clients blocked on the relevant type (BLOCKED_*)
2020-08-11 08:18:09 +03:00
Itamar Haber
efe92ee546
Removes dead code (#7642)
Appears to be handled by server.stream_node_max_bytes in reality.
2020-08-11 08:11:47 +03:00
WuYunlong
d6220f12a9
see #7250, fix signature of RedisModule_DeauthenticateAndCloseClient (#7645)
In redismodule.h, RedisModule_DeauthenticateAndCloseClient returns void
`void REDISMODULE_API_FUNC(RedisModule_DeauthenticateAndCloseClient)(RedisModuleCtx *ctx, uint64_t client_id);`
But in module.c, RM_DeauthenticateAndCloseClient returns int
`int RM_DeauthenticateAndCloseClient(RedisModuleCtx *ctx, uint64_t client_id)`

It it safe to change return value from `void` to `int` from the user's perspective.
2020-08-10 19:18:21 -07:00
Yossi Gottlieb
3f073b1d9c
Merge pull request #7609 from michael-grunder/redis-cli-resp3-push
Add redis-cli RESP3 Push support
2020-08-09 11:19:04 +03:00
Meir Shpilraien (Spielrein)
3f494cc49d
see #7544, added RedisModule_HoldString api. (#7577)
Added RedisModule_HoldString that either returns a
shallow copy of the given String (by increasing
the String ref count) or a new deep copy of String
in case its not possible to get a shallow copy.

Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-08-09 06:11:47 +03:00
Wen Hui
ca95b71f67
add sentinel command help (#7579) 2020-08-08 12:28:44 -07:00
WuYunlong
6dcc6898be
Optimize calls to mstime in trackInstantaneousMetric() (#6472) 2020-08-08 22:11:14 +03:00
Wang Yuan
ea7eeb2fd2
Print error info if failed opening config file (#6943) 2020-08-08 22:03:56 +03:00
michael-grunder
81879bc171 Create PUSH handlers in redis-cli
Add logic to redis-cli to display RESP3 PUSH messages when we detect
STDOUT is a tty, with an optional command-line argument to override
the default behavior.

The new argument:  --show-pushes <yn>

Examples:

$ redis-cli -3 --show-pushes no
$ echo "client tracking on\nget k1\nset k1 v1"| redis-cli -3 --show-pushes y
2020-08-08 11:59:17 -07:00
ShooterIT
6a06a5a597 [Redis-benchmark] Remove zrem test, add zpopmin test 2020-08-08 23:08:27 +08:00
Wen Hui
3f67b03378
fix memory leak in ACLLoadFromFile error handling (#7623) 2020-08-08 14:42:32 +03:00
Wang Yuan
1ef014ee6b
Fix applying zero offset to null pointer when creating moduleFreeContextReusedClient (#7323)
Before this fix we where attempting to select a db before creating db the DB, see: #7323

This issue doesn't seem to have any implications, since the selected DB index is 0,
the db pointer remains NULL, and will later be correctly set before using this dummy
client for the first time.

As we know, we call 'moduleInitModulesSystem()' before 'initServer()'. We will allocate
memory for server.db in 'initServer', but we call 'createClient()' that will call 'selectDb()'
in 'moduleInitModulesSystem()', before the databases where created. Instead, we should call
'createClient()' for moduleFreeContextReusedClient after 'initServer()'.
2020-08-08 14:36:41 +03:00
xuannianz
b118502a05
remove superfluous else block (#7620)
The else block would be executed when newlen == 0 and in the case memmove won't be called, so there's no need to set start.
2020-08-08 00:19:18 +03:00
fayadexinqing
e966264188
fix migration's broadcast PONG message, after the slot modification (#7590) 2020-08-07 13:01:14 -07:00
Oran Agra
c17e597d05
Accelerate diskless master connections, and general re-connections (#6271)
Diskless master has some inherent latencies.
1) fork starts with delay from cron rather than immediately
2) replica is put online only after an ACK. but the ACK
   was sent only once a second.
3) but even if it would arrive immediately, it will not
   register in case cron didn't yet detect that the fork is done.

Besides that, when a replica disconnects, it doesn't immediately
attempts to re-connect, it waits for replication cron (one per second).
in case it was already online, it may be important to try to re-connect
as soon as possible, so that the backlog at the master doesn't vanish.

In case it disconnected during rdb transfer, one can argue that it's
not very important to re-connect immediately, but this is needed for the
"diskless loading short read" test to be able to run 100 iterations in 5
seconds, rather than 3 (waiting for replication cron re-connection)

changes in this commit:
1) sync command starts a fork immediately if no sync_delay is configured
2) replica sends REPLCONF ACK when done reading the rdb (rather than on 1s cron)
3) when a replica unexpectedly disconnets, it immediately tries to
   re-connect rather than waiting 1s
4) when when a child exits, if there is another replica waiting, we spawn a new
   one right away, instead of waiting for 1s replicationCron.
5) added a call to connectWithMaster from replicationSetMaster. which is called
   from the REPLICAOF command but also in 3 places in cluster.c, in all of
   these the connection attempt will now be immediate instead of delayed by 1
   second.

side note:
we can add a call to rdbPipeReadHandler in replconfCommand when getting
a REPLCONF ACK from the replica to solve a race where the replica got
the entire rdb and EOF marker before we detected that the pipe was
closed.
in the test i did see this race happens in one about of some 300 runs,
but i concluded that this race is unlikely in real life (where the
replica is on another host and we're more likely to first detect the
pipe was closed.
the test runs 100 iterations in 3 seconds, so in some cases it'll take 4
seconds instead (waiting for another REPLCONF ACK).

Removing unneeded startBgsaveForReplication from updateSlavesWaitingForBgsave
Now that CheckChildrenDone is calling the new replicationStartPendingFork
(extracted from serverCron) there's actually no need to call
startBgsaveForReplication from updateSlavesWaitingForBgsave anymore,
since as soon as updateSlavesWaitingForBgsave returns, CheckChildrenDone is
calling replicationStartPendingFork that handles that anyway.
The code in updateSlavesWaitingForBgsave had a bug in which it ignored
repl-diskless-sync-delay, but removing that code shows that this bug was
hiding another bug, which is that the max_idle should have used >= and
not >, this one second delay has a big impact on my new test.
2020-08-06 16:53:06 +03:00
Oran Agra
81f8524a12 Fix potential race in bugReportStart
this race would only happen when two threads paniced at the same time,
and even then the only consequence is some extra log lines.

race reported in #7391
2020-08-06 16:47:27 +03:00
Oran Agra
90b717e723 Assertion and panic, print crash log without generating SIGSEGV
This makes it possible to add tests that generate assertions, and run
them with valgrind, making sure that there are no memory violations
prior to the assertion.

New config options:
- crash-log-enabled - can be disabled for cleaner core dumps
- crash-memcheck-enabled - useful for faster termination after a crash
- use-exit-on-panic - to be used by the test suite so that valgrind can
  detect leaks and memory corruptions

Other changes:
- Crash log is printed even on system that dont HAVE_BACKTRACE, i.e. in
  both SIGSEGV and assert / panic
- Assertion and panic won't print registers and code around EIP (which
  was useless), but will do fast memory test (which may still indicate
  that the assertion was due to memory corrpution)

I had to reshuffle code in order to re-use it, so i extracted come code
into function without actually doing any changes to the code:
- logServerInfo
- logModulesInfo
- doFastMemoryTest (with the exception of it being conditional)
- dumpCodeAroundEIP

changes to the crash report on segfault:
- logRegisters is called right after the stack trace (before info) done
  just in order to have more re-usable code
- stack trace skips the first two items on the stack (the crash log and
  signal handler functions)
2020-08-06 16:47:27 +03:00
ShooterIT
e5a50ed3c4 [Redis-benchmark] Support zset type 2020-08-06 15:36:28 +08:00
Itamar Haber
24c539251f
Merge pull request #7092 from itamarhaber/fix-5629
Prevents default save configuration being reset...
2020-08-05 21:16:38 +03:00
Oran Agra
1aa31e4da9 redis-cli --cluster-yes - negate force flag for clarity
this internal flag is there so that some commands do not comply to `--cluster-yes`
2020-08-05 18:30:43 +03:00
Frank Meier
51077c8212 reintroduce REDISCLI_CLUSTER_YES env variable in redis-cli
the variable was introduced only in the 5.0 branch in #5879 bc6c1c40db
2020-08-05 18:30:43 +03:00
Tyson Andre
f11f26cc53
Add a ZMSCORE command returning an array of scores. (#7593)
Syntax: `ZMSCORE KEY MEMBER [MEMBER ...]`

This is an extension of #2359
amended by Tyson Andre to work with the changed unstable API,
add more tests, and consistently return an array.

- It seemed as if it would be more likely to get reviewed
  after updating the implementation.

Currently, multi commands or lua scripting to call zscore multiple times
would almost definitely be less efficient than a native ZMSCORE
for the following reasons:

- Need to fetch the set from the string every time instead of reusing the C
  pointer.
- Using pipelining or multi-commands would result in more bytes sent by
  the client for the repeated `ZMSCORE KEY` sections.
- Need to specially encode the data and decode it from the client
  for lua-based solutions.
- The fastest solution I've seen for large sets(thousands or millions)
  involves lua and a variadic ZADD, then a ZINTERSECT, then a ZRANGE 0 -1,
  then UNLINK of a temporary set (or lua). This is still inefficient.

Co-authored-by: Tyson Andre <tysonandre775@hotmail.com>
2020-08-04 17:49:33 +03:00
hujie
2f454e5ae8
remove duplicate semicolon (#7438) 2020-08-02 13:59:51 +03:00
Oran Agra
f7e7775990
module hook for master link up missing on successful psync (#7584)
besides, hooks test was time sensitive. when the replica managed to
reconnect quickly after the client kill, the test would fail
2020-07-31 13:14:29 +03:00
Oran Agra
50f5181488
Remove dead code from update_zmalloc_stat_alloc (#7589)
this seems like leftover from before 6eb51bf
2020-07-31 13:01:39 +03:00
fayadexinqing
4faad81f50
broadcast a PONG message when slot's migration is over, which may reduce the moved request of clients (#7571) 2020-07-29 18:05:27 -07:00
Kevin McGehee
b55c1602b5
Call propagate instead of writing directly to AOF/replicas (#6658)
Use higher-level API to funnel all generic propagation through
single function call.
2020-07-29 17:54:37 -07:00
Yossi Gottlieb
7af05f07ff
Clarify RM_BlockClient() error condition. (#6093) 2020-07-29 17:03:38 +03:00
Arun Ranganathan
f6cad30bb6
Show threading configuration in INFO output (#7446)
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-07-29 08:46:44 +03:00
namtsui
63dae52324
Avoid an out-of-bounds read in the redis-sentinel (#7443)
The Redis sentinel would crash with a segfault after a few minutes
because it tried to read from a page without read permissions. Check up
front whether the sds is long enough to contain redis:slave or
redis:master before memcmp() as is done everywhere else in
sentinelRefreshInstanceInfo().

Bug report and commit message from Theo Buehler. Fix from Nam Nguyen.

Co-authored-by: Nam Nguyen <namn@berkeley.edu>
2020-07-29 08:25:56 +03:00
Wen Hui
f33acb3f02
Add SignalModifiedKey hook in XGROUP CREATE with MKSTREAM option (#7562) 2020-07-29 08:22:54 +03:00
Wen Hui
c69a9b2f61
fix leak in error handling of debug populate command (#7062)
valsize was not modified during the for loop below instead of getting from c->argv[4], therefore there is no need to put inside the for loop.. Moreover, putting the check outside loop will also avoid memory leaking, decrRefCount(key) should be called in the original code if we put the check in for loop
2020-07-28 22:05:48 +03:00
Yossi Gottlieb
784ceeb90d
TLS: Propagate and handle SSL_new() failures. (#7576)
The connection API may create an accepted connection object in an error
state, and callers are expected to check it before attempting to use it.

Co-authored-by: mrpre <mrpre@163.com>
2020-07-28 11:32:47 +03:00
Jiayuan Chen
f31260b044
Add optional tls verification (#7502)
Adds an `optional` value to the previously boolean `tls-auth-clients` configuration keyword.

Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2020-07-28 10:45:21 +03:00
Yossi Gottlieb
c75512d89d TLS: support cluster/replication without tls-port.
Initialize and configure OpenSSL even when tls-port is not used, because
we may still have tls-cluster or tls-replication.

Also, make sure to reconfigure OpenSSL when these parameters are changed
as TLS could have been enabled for the first time.
2020-07-27 13:26:02 +03:00
grishaf
4126ca466f
Fix prepareForShutdown function declaration (#7566) 2020-07-26 08:27:30 +03:00
zhaozhao.zz
da840e9851
more strict check in rioConnRead (#7564) 2020-07-24 14:40:19 +08:00
Meir Shpilraien (Spielrein)
8d82639319
This PR introduces a new loaded keyspace event (#7536)
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-07-23 12:38:51 +03:00
Oran Agra
40d7fca368
Fix harmless bug in rioConnRead (#7557)
this code is in use only if the master is disk-based, and the replica is
diskless. In this case we use a buffered reader, but we must avoid reading
past the rdb file, into the command stream. which Luckly rdb.c doesn't
really attempt to do (it knows how much it should read).

When rioConnRead detects that the extra buffering attempt reaches beyond
the read limit it should read less, but if the caller actually requested
more, then it should return with an error rather than a short read. the
bug would have resulted in short read.

in order to fix it, the code must consider the real requested size, and
not the extra buffering size.
2020-07-23 12:37:43 +03:00
Madelyn Olson
818dc3a089
Properly reset errno for rdbLoad (#7542) 2020-07-21 17:00:13 -07:00
WuYunlong
f7f77a746a
Clarification on the bug that was fixed in PR #7539. (#7541)
Before that PR, processCommand() did not notice that cmd could be a module
command in which case getkeys_proc member has a different meaning.

The outcome was that a module command which doesn't take any key names in its
arguments (similar to SLOWLOG) would be handled as if it might have key name arguments
(similar to MEMORY), would consider cluster redirect but will end up with 0 keys
after an excessive call to getKeysFromCommand, and eventually do the right thing.
2020-07-21 09:41:44 +03:00
Wen Hui
4e8f2d6881
Add missing calls to raxStop (#7532)
Since the dynamic allocations in raxIterator are only used for deep walks, memory
leak due to missing call to raxStop can only happen for rax with key names longer
than 32 bytes.

Out of all the missing calls, the only ones that may lead to a leak are the rax
for consumer groups and consumers, and these were only in AOFRW and rdbSave, which
normally only happen in fork or at shutdown.
2020-07-21 08:13:05 +03:00
Wen Hui
2fbd0271f6
add missing caching command in client help (#7399) 2020-07-20 18:53:03 -07:00
zhaozhao.zz
13e50935a8
replication: need handle -NOPERM error after send ping (#7538) 2020-07-20 22:21:55 +08:00
WuYunlong
e4d7de608c
Fix cluster redirect for module command with no firstkey. (#7539)
Before this commit, processCommand() did not notice that cmd could be a module command
which declared `getkeys-api` and handled it for the purpose of cluster redirect it
as if it doesn't use any keys.

This commit fixed it by reusing the codes in addReplyCommand().
2020-07-20 15:33:06 +03:00
WuYunlong
86fed3fe09
Refactor streamAppendItem() by deleting redundancy condition. (#7487)
It will never happen that "lp != NULL && lp_bytes >= server.stream_node_max_bytes".
Assume that "lp != NULL && lp_bytes >= server.stream_node_max_bytes",
we got the following conditions:
a. lp != NULL
b. lp_bytes >= server.stream_node_max_bytes

If server.stream_node_max_bytes is 0, given condition a, condition b is always satisfied
If server.stream_node_max_bytes is not 0, given condition a and condition b, the codes just a
	few lines above set lp to NULL, a controdiction with condition a

So that condition b is recundant. We could delete it safely.
2020-07-20 13:14:27 +03:00
yoav-steinberg
d484b8a04e
Support passing stack allocated module strings to moduleCreateArgvFromUserFormat (#7528)
Specifically, the key passed to the module aof_rewrite callback is a stack allocated robj. When passing it to RedisModule_EmitAOF (with appropriate "s" fmt string) redis used to panic when trying to inc the ref count of the stack allocated robj. Now support such robjs by coying them to a new heap robj. This doesn't affect performance because using the alternative "c" or "b" format strings also copies the input to a new heap robj.
2020-07-16 20:59:38 +03:00
杨博东
8596d483bc
Stream avoid duplicate parse id (#7450) 2020-07-16 08:57:27 +03:00
Luke Palmer
5f716ea467
Send null for invalidate on flush (#7469) 2020-07-15 10:53:41 -07:00
dmurnane
9242ccf238
Notify systemd on sentinel startup (#7168)
Co-authored-by: Daniel Murnane <dmurnane@eitccorp.com>
2020-07-15 13:29:26 +03:00
Developer-Ecosystem-Engineering
c2b5f1c15b
Add registers dump support for Apple silicon (#7453)
Export following environment variables before building on macOS on Apple silicon

export ARCH_FLAGS="-arch arm64"
export SDK_NAME=macosx
export SDK_PATH=$(xcrun --show-sdk-path --sdk $SDK_NAME)
export CFLAGS="$ARCH_FLAGS -isysroot $SDK_PATH -I$SDK_PATH/usr/include"
export CXXFLAGS=$CFLAGS
export LDFLAGS="$ARCH_FLAGS"
export CC="$(xcrun -sdk $SDK_PATH --find clang) $CFLAGS"
export CXX="$(xcrun -sdk $SDK_PATH --find clang++) $CXXFLAGS"
export LD="$(xcrun -sdk $SDK_PATH --find ld) $LDFLAGS"

make
make test
..
All tests passed without errors!

Backtrack logging assumes x86 and required updating
2020-07-15 12:44:03 +03:00
Wen Hui
d85af4d6f5
correct error msg for num connections reaching maxclients in cluster mode (#7444) 2020-07-15 12:38:47 +03:00
WuYunlong
93bdbf5aa4
Fix command help for unexpected options (#7476) 2020-07-15 12:38:22 +03:00
WuYunlong
dc690161d5
Refactor RM_KeyType() by using macro. (#7486) 2020-07-15 12:37:44 +03:00
Oran Agra
a176cb56a3
diskless master disconnect replicas when rdb child failed (#7518)
in case the rdb child failed, crashed or terminated unexpectedly redis
would have marked the replica clients with repl_put_online_on_ack and
then kill them only after a minute when no ack was received.

it would not stream anything to these connections, so the only effect of
this bug is a delay of 1 minute in the replicas attempt to re-connect.
2020-07-14 20:21:59 +03:00
Qu Chen
938c35302f
Replica always reports master's config epoch in CLUSTER NODES output. (#7235) 2020-07-13 07:16:06 -07:00
Oran Agra
6a81450144
RESTORE ABSTTL skip expired keys - leak (#7511) 2020-07-13 16:40:19 +03:00
jimgreen2013
67660881ed
fix description about ziplist, the code is ok (#6318)
* fix description about ZIP_BIG_PREVLEN(the code is ok), it's similar to
antirez#4705

* fix description about ziplist entry encoding field (the code is ok),
the max length should be 2^32 - 1 when encoding is 5 bytes
2020-07-11 14:51:44 -05:00
杨博东
e9aba28932
STORE variants: SINTER,SUNION,SDIFF,ZUNION use setKey instead of dbDelete+dbAdd (#7489)
one of the differences (other than consistent code with SORT, GEORADIUS), is that the LFU of the old key is retained.
2020-07-11 15:52:41 +03:00
马永泽
279b4a1464
fix benchmark in cluster mode fails to authenticate (#7488)
Co-authored-by: Oran Agra <oran@redislabs.com> (styling)
2020-07-10 16:37:11 +03:00
Yossi Gottlieb
3e6f2b1a45
TLS: Session caching configuration support. (#7420)
* TLS: Session caching configuration support.
* TLS: Remove redundant config initialization.
2020-07-10 11:33:47 +03:00
Yossi Gottlieb
5266293a0f
TLS: Ignore client cert when tls-auth-clients off. (#7457) 2020-07-10 10:32:21 +03:00
James Hilliard
6a014af79a
Use pkg-config to properly detect libssl and libcrypto libraries (#7452) 2020-07-10 10:30:09 +03:00
Yossi Gottlieb
d9f970d8d3
TLS: Add missing redis-cli options. (#7456)
* Tests: fix and reintroduce redis-cli tests.

These tests have been broken and disabled for 10 years now!

* TLS: add remaining redis-cli support.

This adds support for the redis-cli --pipe, --rdb and --replica options
previously unsupported in --tls mode.

* Fix writeConn().
2020-07-10 10:25:55 +03:00
Oran Agra
b23e251036 redis-cli --hotkeys fixed to handle non-printable key names 2020-07-10 10:07:47 +03:00
Oran Agra
6f8a8647de redis-cli --bigkeys fixed to handle non-printable key names 2020-07-10 10:07:47 +03:00
Oran Agra
5977a94842
RESTORE ABSTTL won't store expired keys into the db (#7472)
Similarly to EXPIREAT with TTL in the past, which implicitly deletes the
key and return success, RESTORE should not store key that are already
expired into the db.
When used together with REPLACE it should emit a DEL to keyspace
notification and replication stream.
2020-07-10 10:02:37 +03:00
huangzhw
d6180c8c86
defrag.c activeDefragSdsListAndDict when defrag sdsele, We can't use (#7492)
it to calculate hash, we should use newsds.
2020-07-10 08:29:44 +03:00
Oran Agra
69ade87325
tests/valgrind: don't use debug restart (#7404)
* tests/valgrind: don't use debug restart

DEBUG REATART causes two issues:
1. it uses execve which replaces the original process and valgrind doesn't
   have a chance to check for errors, so leaks go unreported.
2. valgrind report invalid calls to close() which we're unable to resolve.

So now the tests use restart_server mechanism in the tests, that terminates
the old server and starts a new one, new PID, but same stdout, stderr.

since the stderr can contain two or more valgrind report, it is not enough
to just check for the absence of leaks, we also need to check for some known
errors, we do both, and fail if we either find an error, or can't find a
report saying there are no leaks.

other changes:
- when killing a server that was already terminated we check for leaks too.
- adding DEBUG LEAK which was used to test it.
- adding --trace-children to valgrind, although no longer needed.
- since the stdout contains two or more runs, we need slightly different way
  of checking if the new process is up (explicitly looking for the new PID)
- move the code that handles --wait-server to happen earlier (before
  watching the startup message in the log), and serve the restarted server too.

* squashme - CR fixes
2020-07-10 08:26:52 +03:00
Oran Agra
9bbf768d3c
change references to the github repo location (#7479) 2020-07-10 08:25:26 +03:00
zhaozhao.zz
1978f996d8
BITOP: propagate only when it really SET or DEL targetkey (#5783)
For example:
    BITOP not targetkey sourcekey

If targetkey and sourcekey doesn't exist, BITOP has no effect,
we do not propagate it, thus can save aof and replica flow.
2020-07-10 08:20:27 +03:00
antirez
ad0a9df77a Update comment to clarify change in #7398. 2020-06-25 12:58:21 +02:00
Salvatore Sanfilippo
760021e677
Merge pull request #7398 from caiyuxinggg/work
cluster.c remove "if (nodeIsMaster(myself))" judgement before clusterSendFail in markNodeAsFailingIfNeeded, avoiding slave failover requires twice vote requests
2020-06-25 12:56:26 +02:00
antirez
959cdb358b Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-06-24 09:09:59 +02:00
antirez
a5a3a7bbc6 LPOS: option FIRST renamed RANK. 2020-06-24 09:09:43 +02:00
Salvatore Sanfilippo
6bbbdd26f4
Merge pull request #7390 from oranagra/exec_fails_abort
EXEC always fails with EXECABORT and multi-state is cleared
2020-06-23 13:12:52 +02:00
Oran Agra
65a3307bc9 EXEC always fails with EXECABORT and multi-state is cleared
In order to support the use of multi-exec in pipeline, it is important that
MULTI and EXEC are never rejected and it is easy for the client to know if the
connection is still in multi state.

It was easy to make sure MULTI and DISCARD never fail (done by previous
commits) since these only change the client state and don't do any actual
change in the server, but EXEC is a different story.

Since in the past, it was possible for clients to handle some EXEC errors and
retry the EXEC, we now can't affort to return any error on EXEC other than
EXECABORT, which now carries with it the real reason for the abort too.

Other fixes in this commit:
- Some checks that where performed at the time of queuing need to be re-
  validated when EXEC runs, for instance if the transaction contains writes
  commands, it needs to be aborted. there was one check that was already done
  in execCommand (-READONLY), but other checks where missing: -OOM, -MISCONF,
  -NOREPLICAS, -MASTERDOWN
- When a command is rejected by processCommand it was rejected with addReply,
  which was not recognized as an error in case the bad command came from the
  master. this will enable to count or MONITOR these errors in the future.
- make it easier for tests to create additional (non deferred) clients.
- add tests for the fixes of this commit.
2020-06-23 12:01:33 +03:00
antirez
21f62c3346 Include cluster.h for getClusterConnectionsCount(). 2020-06-22 11:44:11 +02:00
antirez
746297314f Fix BITFIELD i64 type handling, see #7417. 2020-06-22 11:41:19 +02:00
antirez
59fd178014 Clarify maxclients and cluster in conf. Remove myself too. 2020-06-22 11:21:21 +02:00
hwware
1bfa2d27a6 fix memory leak in sentinel connection sharing 2020-06-21 23:04:28 -04:00
Salvatore Sanfilippo
5e46e0800b
Merge pull request #7402 from chenhui0212/unstable
Fix comments in listpack.c
2020-06-18 11:29:06 +02:00
chenhui0212
f800b52172 Fix comments in function raxLowWalk of listpack.c 2020-06-18 17:28:26 +08:00
Tomasz Poradowski
4ee011adb5 ensure SHUTDOWN_NOSAVE in Sentinel mode
- enforcing of SHUTDOWN_NOSAVE flag in one place to make it consitent
  when running in Sentinel mode
2020-06-17 22:22:49 +02:00
chenhui0212
71fafd761a fix comments in listpack.c 2020-06-16 17:50:38 +08:00
antirez
4b8d8826af Use cluster connections too, to limit maxclients.
See #7401.
2020-06-16 11:45:11 +02:00
antirez
62fc7f4f24 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-06-16 11:11:01 +02:00
antirez
784479939d Tracking: fix enableBcastTrackingForPrefix() invalid sdslen() call.
Related to #7387.
2020-06-16 11:09:48 +02:00
root
c92464db69 cluster.c remove if of clusterSendFail in markNodeAsFailingIfNeeded 2020-06-15 10:18:14 +08:00
meir@redislabs.com
a89bf734a9 Fix RM_ScanKey module api not to return int encoded strings
The scan key module API provides the scan callback with the current
field name and value (if it exists). Those arguments are RedisModuleString*
which means it supposes to point to robj which is encoded as a string.
Using createStringObjectFromLongLong function might return robj that
points to an integer and so break a module that tries for example to
use RedisModule_StringPtrLen on the given field/value.

The PR introduces a fix that uses the createObject function and sdsfromlonglong function.
Using those function promise that the field and value pass to the to the
scan callback will be Strings.

The PR also changes the Scan test module to use RedisModule_StringPtrLen
to catch the issue. without this, the issue is hidden because
RedisModule_ReplyWithString knows to handle integer encoding of the
given robj (RedisModuleString).

The PR also introduces a new test to verify the issue is solved.
2020-06-14 11:20:15 +03:00
antirez
1055398849 Fix LCS object type checking. Related to #7379. 2020-06-12 12:43:40 +02:00
Salvatore Sanfilippo
a66298e6f1
Merge pull request #7375 from hwware/lcs_crash_fix
Fix Server Crash in LCS Command
2020-06-12 12:31:15 +02:00
antirez
ca58198a76 help.h updated. 2020-06-12 12:16:19 +02:00
hwware
7008a0ba66 fix memory leak 2020-06-11 09:56:52 -04:00
antirez
0091125cae LPOS: tests + crash fix. 2020-06-11 12:39:06 +02:00
antirez
89ca2afd6b LPOS: update to latest proposal.
See https://gist.github.com/antirez/3591c5096bc79cad8b5a992e08304f48
2020-06-11 11:18:20 +02:00
antirez
e63a5ba122 LPOS: implement the final design. 2020-06-10 12:49:15 +02:00
Paul Spooren
a7936ef96d LRANK: Add command (the command will be renamed LPOS).
The `LRANK` command returns the index (position) of a given element
within a list. Using the `direction` argument it is possible to specify
going from head to tail (acending, 1) or from tail to head (decending,
-1). Only the first found index is returend. The complexity is O(N).

When using lists as a queue it can be of interest at what position a
given element is, for instance to monitor a job processing through a
work queue. This came up within the Python `rq` project which is based
on Redis[0].

[0]: https://github.com/rq/rq/issues/1197

Signed-off-by: Paul Spooren <mail@aparcar.org>
2020-06-10 12:07:40 +02:00
antirez
cdad0e6485 Temporary fix for #7353 issue about EVAL during -BUSY. 2020-06-09 11:52:33 +02:00
hwware
2a05fa0d48 fix server crash in STRALGO command 2020-06-08 23:36:01 -04:00
Salvatore Sanfilippo
48b2915c18
Merge pull request #7363 from xhebox/unstable
return the correct proto version
2020-06-08 12:54:15 +02:00
Salvatore Sanfilippo
254f06c131
Merge pull request #7370 from oranagra/no_queue_in_aborted_multi
Don't queue commands in an already aborted MULTI state
2020-06-08 11:08:08 +02:00
Salvatore Sanfilippo
8183a4ca7f
Merge pull request #7369 from oranagra/no_reject_watch
Avoid rejecting WATCH / UNWATCH, like MULTI/EXEC/DISCARD
2020-06-08 11:04:26 +02:00
Salvatore Sanfilippo
265a7e7ec7
Merge pull request #7357 from soloestoy/bugfix-aof-keepttl
AOF: append origin SET if no expire option
2020-06-08 11:02:00 +02:00
Salvatore Sanfilippo
cfffda83fb
Merge pull request #7371 from oranagra/fix_disconnectSlaves
fix disconnectSlaves, to try to free each slave.
2020-06-08 10:43:51 +02:00
Salvatore Sanfilippo
74a203a050
Merge pull request #7352 from soloestoy/donot-free-protected-client-when-blocking
donot free protected client in freeClientsInAsyncFreeQueue()
2020-06-08 10:43:17 +02:00
Oran Agra
12504105c4 fix disconnectSlaves, to try to free each slave.
the recent change in that loop (iteration rather than waiting for it to
be empty) was intended to avoid an endless loop in case some slave would
refuse to be freed.

but the lookup of the first client remained, which would have caused it
to try the first one again and again instead of moving on.
2020-06-08 09:50:06 +03:00
Oran Agra
2bb297b102 Don't queue commands in an already aborted MULTI state 2020-06-08 09:43:10 +03:00
Oran Agra
2fa077b0e9 Avoid rejecting WATCH / UNWATCH, like MULTI/EXEC/DISCARD
Much like MULTI/EXEC/DISCARD, the WATCH and UNWATCH are not actually
operating on the database or server state, but instead operate on the
client state. the client may send them all in one long pipeline and check
all the responses only at the end, so failing them may lead to a
mismatch between the client state on the server and the one on the
client end, and execute the wrong commands (ones that were meant to be
discarded)

the watched keys are not actually stored in the client struct, but they
are in fact part of the client state. for instance, they're not cleared
or moved in SWAPDB or FLUSHDB.
2020-06-08 09:16:32 +03:00
xhe
7eba5c308a return the correct proto version
HELLO should return the current proto version, while the code hardcoded
3
2020-06-07 13:34:55 +08:00
antirez
44b76a75d2 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-06-06 11:44:05 +02:00
antirez
61074b43a6 Revert "Implements sendfile for redis."
This reverts commit 9cf500a3f6.
2020-06-06 11:42:45 +02:00
antirez
d1e23e04aa Revert "avoid using sendfile if tls-replication is enabled"
This reverts commit b9abecfc4c.
2020-06-06 11:42:41 +02:00
zhaozhao.zz
5d0774d62f AOF: append origin SET if no expire option 2020-06-03 17:55:18 +08:00
zhaozhao.zz
ad6b71352d donot free protected client in freeClientsInAsyncFreeQueue
related #7234
2020-06-02 11:48:14 +08:00
Salvatore Sanfilippo
f644b112ad
Merge pull request #7334 from kevin-fwu/tls_load_chained_certs
Fix TLS certificate loading for chained certificates.
2020-05-31 14:32:56 +02:00
antirez
bdd5411747 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-05-29 11:09:15 +02:00
antirez
1f8ea99b4b Fix handling of special chars in ACL LOAD.
Now it is also possible for ACL SETUSER to accept empty strings
as valid operations (doing nothing), so for instance

    ACL SETUSER myuser ""

Will have just the effect of creating a user in the default state.

This should fix #7329.
2020-05-29 11:07:13 +02:00
Salvatore Sanfilippo
5ef157bae1
Merge pull request #7330 from liuzhen/unstable
fix clusters mixing accidentally by gossip
2020-05-29 10:15:26 +02:00
antirez
6a16a636bf Replication: showLatestBacklog() refactored out. 2020-05-28 10:08:16 +02:00
antirez
484af8ed53 Drop useless line from replicationCacheMaster(). 2020-05-27 17:08:51 +02:00
Kevin Fwu
151b12a80f Fix TLS certificate loading for chained certificates.
This impacts client verification for chained certificates (such as Lets
Encrypt certificates). Client Verify requires the full chain in order to
properly verify the certificate.
2020-05-27 08:53:29 -04:00
antirez
22472fe5a1 Remove the meaningful offset feature.
After a closer look, the Redis core devleopers all believe that this was
too fragile, caused many bugs that we didn't expect and that were very
hard to track. Better to find an alternative solution that is simpler.
2020-05-27 12:06:33 +02:00
antirez
325409a011 Set a protocol error if master use the inline protocol.
We want to react a bit more aggressively if we sense that the master is
sending us some corrupted stream. By setting the protocol error we both
ensure that the replica will disconnect, and avoid caching the master so
that a full SYNC will be required. This is protective against
replication bugs.
2020-05-27 11:45:49 +02:00
Liu Zhen
3984dc6539 fix clusters mixing accidentally by gossip
`clusterStartHandshake` will start hand handshake
and eventually send CLUSTER MEET message, which is strictly prohibited
in the REDIS CLUSTER SPEC.
Only system administrator can initiate CLUSTER MEET message.
Futher, according to the SPEC, rather than IP/PORT pairs, only nodeid
can be trusted.
2020-05-27 12:01:40 +08:00
antirez
94c026cd19 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-05-26 23:55:52 +02:00
antirez
7d35939206 Replication: log backlog creation event. 2020-05-26 23:55:18 +02:00
Salvatore Sanfilippo
caf7c50408
Merge pull request #7328 from oranagra/daily_tls_test
avoid using sendfile if tls-replication is enabled
2020-05-26 13:19:55 +02:00
Oran Agra
b9abecfc4c avoid using sendfile if tls-replication is enabled
this obviously broke the tests, but went unnoticed so far since tls
wasn't often tested.
2020-05-26 13:52:06 +03:00
antirez
92a3ff6168 Clarify what is happening in PR #7320. 2020-05-25 11:47:38 +02:00
zhaozhao.zz
eec769be59 PSYNC2: second_replid_offset should be real meaningful offset
After adjustMeaningfulReplOffset(), all the other related variable
should be updated, including server.second_replid_offset.

Or the old version redis like 5.0 may receive wrong data from
replication stream, cause redis 5.0 can sync with redis 6.0,
but doesn't know meaningful offset.
2020-05-25 11:17:54 +08:00
antirez
adc5df1bc3 Make disconnectSlaves() synchronous in the base case.
Otherwise we run into that:

Backtrace:
src/redis-server 127.0.0.1:21322(logStackTrace+0x45)[0x479035]
src/redis-server 127.0.0.1:21322(sigsegvHandler+0xb9)[0x4797f9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fd373c5e390]
src/redis-server 127.0.0.1:21322(_serverAssert+0x6a)[0x47660a]
src/redis-server 127.0.0.1:21322(freeReplicationBacklog+0x42)[0x451282]
src/redis-server 127.0.0.1:21322[0x4552d4]
src/redis-server 127.0.0.1:21322[0x4c5593]
src/redis-server 127.0.0.1:21322(aeProcessEvents+0x2e6)[0x42e786]
src/redis-server 127.0.0.1:21322(aeMain+0x1d)[0x42eb0d]
src/redis-server 127.0.0.1:21322(main+0x4c5)[0x42b145]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fd3738a3830]
src/redis-server 127.0.0.1:21322(_start+0x29)[0x42b409]

Since we disconnect all the replicas and free the replication backlog in
certain replication paths, and the code that will free the replication
backlog expects that no replica is connected.

However we still need to free the replicas asynchronously in certain
cases, as documented in the top comment of disconnectSlaves().
2020-05-22 19:29:09 +02:00
antirez
07c6bee78f Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-05-22 16:31:05 +02:00
antirez
b407590cee Fix #7306 less aggressively.
Citing from the issue:

btw I suggest we change this fix to something else:
* We revert the fix.
* We add a call that disconnects chained replicas in the place where we trim the replica (that is a master i this case) offset.
This way we can avoid disconnections when there is no trimming of the backlog.

Note that we now want to disconnect replicas asynchronously in
disconnectSlaves(), because it's in general safer now that we can call
it from freeClient(). Otherwise for instance the command:

    CLIENT KILL TYPE master

May crash: clientCommand() starts running the linked of of clients,
looking for clients to kill. However it finds the master, kills it
calling freeClient(), but this in turn calls replicationCacheMaster()
that may also call disconnectSlaves() now. So the linked list iterator
of the clientCommand() will no longer be valid.
2020-05-22 16:29:53 +02:00
Salvatore Sanfilippo
40161553e2
Merge pull request #7096 from ShooterIT/sendfile
Implements sendfile for redis.
2020-05-22 13:55:46 +02:00
Salvatore Sanfilippo
285817b28a
Merge pull request #7305 from madolson/unstable-connection
EAGAIN not handled for TLS during diskless load
2020-05-22 12:25:40 +02:00
Qu Chen
42f5da5d2d Disconnect chained replicas when the replica performs PSYNC with the master always to avoid replication offset mismatch between master and chained replicas. 2020-05-21 18:42:10 -07:00
Madelyn Olson
5109f16b77 EAGAIN for tls during diskless load 2020-05-21 15:20:59 -07:00
hwware
06a59fd44d using moreargs variable 2020-05-21 18:13:35 -04:00
hwware
5edb1beb63 fix server crash for STRALGO command 2020-05-21 17:30:36 -04:00
Salvatore Sanfilippo
fe640e5858
Merge pull request #7300 from ShooterIT/comment
Replace 'addDeferredMultiBulkLength' with 'addReplyDeferredLen' in comment
2020-05-21 15:53:01 +02:00
Salvatore Sanfilippo
9f932bcc82
Merge pull request #7299 from ShooterIT/reply-bytes
Fix reply bytes calculation error on 32bit platform
2020-05-21 15:47:15 +02:00
ShooterIT
86f0e873c7 Replace addDeferredMultiBulkLength with addReplyDeferredLen in comment 2020-05-21 21:45:35 +08:00
ShooterIT
9018ddc32a Fix reply bytes calculation error
Fix #7275.
2020-05-21 21:00:30 +08:00
zhaozhao.zz
4f3ff46a81 Tracking: flag CLIENT_TRACKING_BROKEN_REDIR when redir broken 2020-05-21 13:57:29 +08:00
Oran Agra
88d71f4793 fix a rare active defrag edge case bug leading to stagnation
There's a rare case which leads to stagnation in the defragger, causing
it to keep scanning the keyspace and do nothing (not moving any
allocation), this happens when all the allocator slabs of a certain bin
have the same % utilization, but the slab from which new allocations are
made have a lower utilization.

this commit fixes it by removing the current slab from the overall
average utilization of the bin, and also eliminate any precision loss in
the utilization calculation and move the decision about the defrag to
reside inside jemalloc.

and also add a test that consistently reproduce this issue.
2020-05-20 16:04:42 +03:00
Oran Agra
5d83e9e1de improve DEBUG MALLCTL to be able to write to write only fields.
also support:
  debug mallctl-str thread.tcache.flush VOID
2020-05-20 14:09:22 +03:00