Commit Graph

10506 Commits

Author SHA1 Message Date
Viktor Söderqvist
547c3405d4
Optimize quicklistIndex to seek from the nearest end (#9454)
Until now, giving a negative index seeks from the end of a list and a
positive seeks from the beginning. This change makes it seek from
the nearest end, regardless of the sign of the given index.

quicklistIndex is used by all list commands which operate by index.

LINDEX key 999999 in a list if 1M elements is greately optimized by
this change. Latency is cut by 75%.

LINDEX key -1000000 in a list of 1M elements, likewise.

LRANGE key -1 -1 is affected by this, since LRANGE converts the
indices to positive numbers before seeking.

The tests for corrupt dumps are updated to make sure the corrup
data is seeked in the same direction as before.
2021-09-06 09:12:38 +03:00
Wen Hui
763fd09416
Speed up sentinel tests (#9408)
Use sentinel debug to reduce default timeouts and allow tests to execute faster.
2021-09-05 13:26:29 +03:00
Madelyn Olson
8b8f05c86c
Add test verifying PUBSUB NUMPAT behavior (#9209) 2021-09-03 15:52:39 -07:00
guybe7
6aa2285e32
Fix two minor bugs (MIGRATE key args and getKeysUsingCommandTable) (#9455)
1. MIGRATE has a potnetial key arg in argv[3]. It should be reflected in the command table.
2. getKeysUsingCommandTable should never free getKeysResult, it is always freed by the caller)
   The reason we never encountered this double-free bug is that almost always getKeysResult
   uses the statis buffer and doesn't allocate a new one.
2021-09-02 17:19:27 +03:00
sundb
306a5ccd2d
Fix the timing of read and write events under kqueue (#9416)
Normally we execute the read event first and then the write event.
When the barrier is set, we will do it reverse.
However, under `kqueue`, if an `fd` has both read and write events,
reading the event using `kevent` will generate two events, which will
result in uncontrolled read and write timing.

This also means that the guarantees of AOF `appendfsync` = `always` are
not met on MacOS without this fix.

The main change to this pr is to cache the events already obtained when reading
them, so that if the same `fd` occurs again, only the mask in the cache is updated,
rather than a new event is generated.

This was exposed by the following test failure on MacOS:
```
*** [err]: AOF fsync always barrier issue in tests/integration/aof.tcl
Expected 544 != 544 (context: type eval line 26 cmd {assert {$size1 != $size2}} proc ::test)
```
2021-09-02 11:07:51 +03:00
Yossi Gottlieb
c9931ddba5
Use fchmod to update command history file. (#9447)
This is considered a safer approach as it prevents a race condition that
could lead to chmod executed on a different file.

Not a major risk, but CodeQL alerted this so it makes sense to fix.
2021-09-02 10:00:00 +03:00
Viktor Söderqvist
f24c63a292
Slot-to-keys using dict entry metadata (#9356)
* Enhance dict to support arbitrary metadata carried in dictEntry

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

* Rewrite slot-to-keys mapping to linked lists using dict entry metadata

This is a memory enhancement for Redis Cluster.

The radix tree slots_to_keys (which duplicates all key names prefixed with their
slot number) is replaced with a linked list for each slot. The dict entries of
the same cluster slot form a linked list and the pointers are stored as metadata
in each dict entry of the main DB dict.

This commit also moves the slot-to-key API from db.c to cluster.c.

Co-authored-by: Jim Brunner <brunnerj@amazon.com>
2021-08-30 23:25:36 -07:00
Oran Agra
1e7ad894d2
Tune timeout of active defrag test (#9426)
Failed on Raspberry Pi 3b where that single test took about 170 seconds
2021-08-30 12:39:09 +03:00
Wang Yuan
9a0c0617f1
Use sync_file_range to optimize fsync if possible (#9409)
We implement incremental data sync in rio.c by call fsync, on slow disk, that may cost a lot of time,
sync_file_range could provide async fsync, so we could serialize key/value and sync file data at the same time.

> one tip for sync_file_range usage: http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html

Additionally, this change avoids a single large write to be used, which can result in a mass of dirty
pages in the kernel (increasing the risk of someone else's write to block).

On HDD, current solution could reduce approximate half of dumping RDB time,
this PR costs 50s for dump 7.7G rdb but unstable branch costs 93s.
On NVME SSD, this PR can't reduce much time,  this PR costs 40s, unstable branch costs 48s.

Moreover, I find calling data sync every 4MB is better than 32MB.
2021-08-30 10:24:53 +03:00
Binbin
aefbc23451
Better error handling for updateClientOutputBufferLimit. (#9308)
This one follow #9313 and goes deeper (validation of config file parsing)

Move the check/update logic to a new updateClientOutputBufferLimit
function. So that it can be used in CONFIG SET and config file parsing.
2021-08-29 15:03:05 +03:00
Viktor Söderqvist
97dcf95cc8
redis-benchmark: improved help and warnings (#9419)
1. The output of --help:

  * On the Usage line, just write [OPTIONS] [COMMAND ARGS...] instead listing
    only a few arbitrary options and no command.
  * For --cluster, describe that if the command is supplied on the command line,
    the key must contain "{tag}". Otherwise, the command will not be sent to the
    right cluster node.
  * For -r, add a note that if -r is omitted, all commands in a benchmark will
    use the same key. Also align the description.
  * For -t, describe that -t is ignored if a command is supplied on the command
    line.

2. Print a warning if -t is present when a specific command is supplied.

3. Print all warnings and errors to stderr.

4. Remove -e from calls in redis-benchmark test suite.
2021-08-29 14:31:08 +03:00
Huang Zhw
b375f5919e
redis-benchmark: make show throughput in only one thread. (#9146)
In multipe threads mode, every thread output throughput info. This
may cause some problems:
- Bug in https://github.com/redis/redis/pull/8615;
- The show throughput is called too frequently;
- showThroughput which updates shared variable lacks synchronization
mechanism.

This commit also reverts changes in #8615 and changes time event
interval to macro.
2021-08-25 14:58:35 +03:00
Garen Chan
945a83d406
Fix boundary problem of adjusting open files limit. (#5722)
When `decr_step` is greater than `oldlimit`, the final `bestlimit` may be invalid.

    For example, oldlimit = 10, decr_step = 16.
    Current bestlimit = 15 and setrlimit() failed. Since bestlimit  is less than decr_step , then exit the loop.
    The final bestlimit is larger than oldlimit but is invalid.

Note that this only matters if the system fd limit is below 16, so unlikely to have any actual effect.
2021-08-24 22:54:21 +03:00
Wen Hui
641780a9c6
config memory limits: handle values larger than (signed) LLONG_MAX (#9313)
This aims to solve the issue in CONFIG SET maxmemory can only set maxmemory to up
to 9223372036854775807 (2^63) while the maxmemory should be ULLONG.
Added a memtoull function to convert a string representing an amount of memory
into the number of bytes (similar to memtoll but for ull). Also added ull2string to
convert a ULLong to string (Similar to ll2string).
2021-08-23 21:00:40 +03:00
Viktor Söderqvist
74590f8345
redis-cli: Assert > 0 before dividing, to silence warning by tool (#9396)
Also make sure function can't return NULL by another assert.
2021-08-22 13:57:18 +03:00
Binbin
0835f596b8
BITSET and BITFIELD SET only propagate command when the value changed. (#9403)
In old way, we always increase server.dirty in BITSET and BITFIELD SET.
Even the command doesn't really change anything. This commit make 
sure BITSET and BITFIELD SET only increase dirty when the value changed.

Because of that, if the value not changed, some others implications:
- Avoid adding useless AOF
- Reduce replication traffic
- Will not trigger keyspace notifications (setbit)
- Will not invalidate WATCH
- Will not sent the invalidation message to the tracking client
2021-08-22 10:20:53 +03:00
Viktor Söderqvist
8f59c1ecae
Let CONFIG GET * show both replicaof and its alias (#9395) 2021-08-21 19:43:18 -07:00
sundb
492d8d0961
Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399) 2021-08-20 10:37:45 +03:00
Yossi Gottlieb
1d9c8d61d8
Skip OOM-related tests on incompatible platforms. (#9386)
We only run OOM related tests on x86_64 and aarch64, as jemalloc on other
platforms (notably s390x) may actually succeed very large allocations. As
a result the test may hang for a very long time at the cleanup phase,
iterating as many as 2^61 hash table slots.
2021-08-18 16:00:22 +03:00
yoav-steinberg
0e8d469f82
More generic crash report for unsupported archs (#9385)
Following compilation warnings on s390x.
2021-08-18 15:46:11 +03:00
Yossi Gottlieb
fe359cbfc2
Fix: don't assume char is unsigned. (#9375)
On systems that have unsigned char by default (s390x, arm), redis-server
could crash as soon as it populates the command table.
2021-08-15 21:37:44 +03:00
Wang Yuan
8edc3cd62c
Fix the wrong detection of sync_file_range system call (#9371)
If we want to check `defined(SYNC_FILE_RANGE_WAIT_BEFORE)`, we should include fcntl.h.
otherwise, SYNC_FILE_RANGE_WAIT_BEFORE is not defined, and there is alway not `sync_file_range` system call.
Introduced by #8532
2021-08-14 23:52:44 +03:00
Madelyn Olson
0cf2df84d4
Added additional validation for cluster SETSLOT (#9360) 2021-08-12 14:59:17 -07:00
Madelyn Olson
2402f5a7a1
Update cluster debug log to include human readable packet type (#9361) 2021-08-12 14:50:09 -07:00
Yossi Gottlieb
1221f7cd5e
Improve setup operations order after fork. (#9365)
The order of setting things up follows some reasoning: Setup signal
handlers first because a signal could fire at any time. Adjust OOM score
before everything else to assist the OOM killer if memory resources are
low.

The trigger for this is a valgrind test failure which resulted with the
child catching a SIGUSR1 before initializing the handler.
2021-08-12 14:31:12 +03:00
Yossi Gottlieb
08c46f2b86
Add debian:oldoldstable build target for CI. (#9358)
Making sure Redis builds properly on older compiler is important given the wide range of systems it is built for. So far Ubuntu 16.04 has been used for this purpose, but as it's getting phased out we'll move to `oldoldstable` Debian as an "old system" precursor.
2021-08-11 16:19:54 +03:00
sundb
5705cec68e
Fix missing dismiss hash listpack memory due to ziplist->listpack migration (#9353) 2021-08-10 16:54:19 +03:00
Huang Zhw
82528d2678
Redis-cli monitor and pubsub can be aborted with Ctrl+C, keeping the cli alive (#9347)
Abort cli blocking modes with SIGINT without exiting the cli.

Co-authored-by: charsyam <charsyam@gmail.com>
2021-08-10 15:03:49 +03:00
yoav-steinberg
19bc83716c
support regex in "--only" in runtest (#9352) 2021-08-10 14:28:24 +03:00
DarrenJiang13
8ab33c18e4
fix a compilation error around madvise when make with jemalloc on MacOS (#9350)
We only use MADV_DONTNEED on Linux, that's were it was tested.
2021-08-10 11:32:27 +03:00
Meir Shpilraien (Spielrein)
8f8117f78e
Format fixes and naming. SentReplyOnKeyMiss -> addReplyOrErrorObject (#9346)
Following the comments on #8659, this PR fix some formatting
and naming issues.
2021-08-10 10:19:21 +03:00
sundb
02fd76b97c
Replace all usage of ziplist with listpack for t_hash (#8887)
Part one of implementing #8702 (taking hashes first before other types)

## Description of the feature
1. Change ziplist encoded hash objects to listpack encoding.
2. Convert existing ziplists on RDB loading time. an O(n) operation.

## Rdb format changes
1. Add RDB_TYPE_HASH_LISTPACK rdb type.
2. Bump RDB_VERSION to 10

## Interface changes
1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`)
2. OBJECT ENCODING will return `listpack` instead of `ziplist`

## Listpack improvements:
1. Support direct insert, replace integer element (rather than convert back and forth from string)
3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such)
4. Optimize element length fetching, avoid multiple calculations
5. Use inline to avoid function call overhead.

## Tests
1. Add a new test to the RDB load time conversion
2. Adding the listpack unit tests. (based on the one in ziplist.c)
3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones.

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-10 09:18:49 +03:00
sundb
cbda492909
Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349)
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.

1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
    If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.
2021-08-09 17:13:46 +03:00
Qu Chen
0b643e930d
Cleanup: createAOFClient uses createClient to avoid overlooked mismatches (#9338)
AOF fake client creation (createAOFClient) was doing similar work as createClient,
with some minor differences, most of which unintended, this was dangerous and
meant that many changes to createClient should have always been reflected to aof.c

This cleanup changes createAOFClient to call createClient with NULL, like we
do in module.c and elsewhere.
2021-08-09 11:03:59 +03:00
Eduardo Semprebon
d3356bf614
Add SORT_RO command (#9299)
Add a readonly variant of the STORE command, so it can be used on
read-only workloads (replica, ACL, etc)
2021-08-09 09:40:29 +03:00
Qu Chen
e8eeba7bee
Allow master to replicate command longer than replica's query buffer limit (#9340)
Replication client no longer checks incoming command length against the client-query-buffer-limit. This makes the master able to replicate commands longer than replica's configured client-query-buffer-limit
2021-08-08 17:34:11 -07:00
Yossi Gottlieb
3307958bd0
Propagate OPENSSL_PREFIX to hiredis. (#9345) 2021-08-08 18:30:17 +03:00
Binbin
e1dc979054
sds.c: Fix potential overflow in sdsll2str. (#8910)
Fixes an undefined behavior, same way as our `ll2string` does.
2021-08-08 14:30:47 +03:00
Binbin
563ba7a3f0
Fix the wrong method used in quicklistTest. (#8951)
The test try to test `insert before 1 element`, but it use quicklist
InsertAfter, a copy-paste typo.

The commit also add an assert to verify results in some tests
to make sure it is as expected.
2021-08-08 09:03:52 +03:00
DarrenJiang13
43eb0ce3bf
[BUGFIX] Add some missed error statistics (#9328)
add error counting for some missed behaviors.
2021-08-06 19:27:24 -07:00
yoav-steinberg
0a9377535b
Ignore resize threshold on idle qbuf resizing (#9322)
Also update qbuf tests to verify both idle and peak based resizing logic.
And delete unused function: getClientsMaxBuffers
2021-08-06 20:50:34 +03:00
Oran Agra
3f3f678a47
corrupt-dump-fuzzer test, avoid creating junk keys (#9302)
The execution of the RPOPLPUSH command by the fuzzer created junk keys,
that were later being selected by RANDOMKEY and modified.
This also meant that lists were statistically tested more than other
files.

Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check
that detects that new keys are not added by the fuzzer to detect future
similar issues.
2021-08-05 22:57:05 +03:00
Oran Agra
0c90370e6d
Improvements to corrupt payload sanitization (#9321)
Recently we found two issues in the fuzzer tester: #9302 #9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.

Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff

Co-authored-by: sundb <sundbcn@gmail.com>
2021-08-05 22:56:14 +03:00
sundb
8ea777a6a0
Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297)
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )

This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-05 22:42:20 +03:00
Madelyn Olson
39a4a44d7d
Add debug config flag to print certain config values on engine crash (#9304)
Add debug config flag to print certain config values on engine crash
2021-08-05 11:59:12 -07:00
Binbin
d0244bfc3d
Make sure execute SLAVEOF command in the right order in psync2 test. (#9316)
The psync2 test has failed several times recently.
In #9159 we only solved half of the problem.
i.e. reordering of the replica that's already connected to
the newly promoted master.

Consider this scenario:
0 slaveof 2
1 slaveof 2
3 slaveof 2
4 slaveof 1
0 slaveof no one, became a new master got a new replid
2 slaveof 0, partial resync and got the new replid
3 reconnect 2, inherit the new replid
3 slaveof 4, use the new replid and got a full resync

And another scenario:
1 slaveof 3
2 slaveof 4
3 slaveof 0
4 slaveof 0
4 slaveof no one, became a new master got a new replid
2 reconnect 4, inherit the new replid
2 slaveof 1, use the new replid and got a full resync

So maybe we should reattach replicas in the right order.
i.e. In the above example, if it would have reattached 1, 3 and 0 to
the new chain formed by 4 before trying to attach 2 to 1, it would succeed.

This commit break the SLAVEOF loop into two loops. (ideas from oran)

First loop that uses random to decide who replicates from who.
Second loop that does the actual SLAVEOF command.
In the second loop, we make sure to execute it in the right order,
and after each SLAVEOF, wait for it to be connected before we proceed.

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-05 11:26:09 +03:00
Wen Hui
63e2a6d212
Add sentinel debug option command (#9291)
This makes it possible to tune many parameters that were previously hard coded.
We don't intend these to be user configurable, but only used by tests to accelerate certain conditions which would otherwise take a long time and slow down the test suite.

Co-authored-by: Lucas Guang Yang <l84193800@china.huawei.com>
2021-08-05 11:12:55 +03:00
menwen
ca559819f7
Add latency monitor sample when key is deleted via lazy expire (#9317)
Fix that there is no sample latency after the key expires via expireIfNeeded().
Some refactoring for shared code.
2021-08-05 11:09:24 +03:00
yoav-steinberg
d32f8641ed
fix dict access broken by #9228 (#9319) 2021-08-05 09:02:30 +03:00
yoav-steinberg
5e908a290c
dict struct memory optimizations (#9228)
Reduce dict struct memory overhead
on 64bit dict size goes down from jemalloc's 96 byte bin to its 56 byte bin.

summary of changes:
- Remove `privdata` from callbacks and dict creation. (this affects many files, see "Interface change" below).
- Meld `dictht` struct into the `dict` struct to eliminate struct padding. (this affects just dict.c and defrag.c)
- Eliminate the `sizemask` field, can be calculated from size when needed.
- Convert the `size` field into `size_exp` (exponent), utilizes one byte instead of 8.

Interface change: pass dict pointer to dict type call back functions.
This is instead of passing the removed privdata field. In the future if
we'd like to have private data in the callbacks we can extract it from
the dict type. We can extend dictType to include a custom dict struct
allocator and use it to allocate more data at the end of the dict
struct. This data can then be used to store private data later acccessed
by the callbacks.
2021-08-05 08:25:58 +03:00