Work in progress towards implementing a reply schema as part of COMMAND DOCS, see #9845
Since ironing the details of the reply schema of each and every command can take a long time, we
would like to merge this PR when the infrastructure is ready, and let this mature in the unstable branch.
Meanwhile the changes of this PR are internal, they are part of the repo, but do not affect the produced build.
### Background
In #9656 we add a lot of information about Redis commands, but we are missing information about the replies
### Motivation
1. Documentation. This is the primary goal.
2. It should be possible, based on the output of COMMAND, to be able to generate client code in typed
languages. In order to do that, we need Redis to tell us, in detail, what each reply looks like.
3. We would like to build a fuzzer that verifies the reply structure (for now we use the existing
testsuite, see the "Testing" section)
### Schema
The idea is to supply some sort of schema for the various replies of each command.
The schema will describe the conceptual structure of the reply (for generated clients), as defined in RESP3.
Note that the reply structure itself may change, depending on the arguments (e.g. `XINFO STREAM`, with
and without the `FULL` modifier)
We decided to use the standard json-schema (see https://json-schema.org/) as the reply-schema.
Example for `BZPOPMIN`:
```
"reply_schema": {
"oneOf": [
{
"description": "Timeout reached and no elements were popped.",
"type": "null"
},
{
"description": "The keyname, popped member, and its score.",
"type": "array",
"minItems": 3,
"maxItems": 3,
"items": [
{
"description": "Keyname",
"type": "string"
},
{
"description": "Member",
"type": "string"
},
{
"description": "Score",
"type": "number"
}
]
}
]
}
```
#### Notes
1. It is ok that some commands' reply structure depends on the arguments and it's the caller's responsibility
to know which is the relevant one. this comes after looking at other request-reply systems like OpenAPI,
where the reply schema can also be oneOf and the caller is responsible to know which schema is the relevant one.
2. The reply schemas will describe RESP3 replies only. even though RESP3 is structured, we want to use reply
schema for documentation (and possibly to create a fuzzer that validates the replies)
3. For documentation, the description field will include an explanation of the scenario in which the reply is sent,
including any relation to arguments. for example, for `ZRANGE`'s two schemas we will need to state that one
is with `WITHSCORES` and the other is without.
4. For documentation, there will be another optional field "notes" in which we will add a short description of
the representation in RESP2, in case it's not trivial (RESP3's `ZRANGE`'s nested array vs. RESP2's flat
array, for example)
Given the above:
1. We can generate the "return" section of all commands in [redis-doc](https://redis.io/commands/)
(given that "description" and "notes" are comprehensive enough)
2. We can generate a client in a strongly typed language (but the return type could be a conceptual
`union` and the caller needs to know which schema is relevant). see the section below for RESP2 support.
3. We can create a fuzzer for RESP3.
### Limitations (because we are using the standard json-schema)
The problem is that Redis' replies are more diverse than what the json format allows. This means that,
when we convert the reply to a json (in order to validate the schema against it), we lose information (see
the "Testing" section below).
The other option would have been to extend the standard json-schema (and json format) to include stuff
like sets, bulk-strings, error-string, etc. but that would mean also extending the schema-validator - and that
seemed like too much work, so we decided to compromise.
Examples:
1. We cannot tell the difference between an "array" and a "set"
2. We cannot tell the difference between simple-string and bulk-string
3. we cannot verify true uniqueness of items in commands like ZRANGE: json-schema doesn't cover the
case of two identical members with different scores (e.g. `[["m1",6],["m1",7]]`) because `uniqueItems`
compares (member,score) tuples and not just the member name.
### Testing
This commit includes some changes inside Redis in order to verify the schemas (existing and future ones)
are indeed correct (i.e. describe the actual response of Redis).
To do that, we added a debugging feature to Redis that causes it to produce a log of all the commands
it executed and their replies.
For that, Redis needs to be compiled with `-DLOG_REQ_RES` and run with
`--reg-res-logfile <file> --client-default-resp 3` (the testsuite already does that if you run it with
`--log-req-res --force-resp3`)
You should run the testsuite with the above args (and `--dont-clean`) in order to make Redis generate
`.reqres` files (same dir as the `stdout` files) which contain request-response pairs.
These files are later on processed by `./utils/req-res-log-validator.py` which does:
1. Goes over req-res files, generated by redis-servers, spawned by the testsuite (see logreqres.c)
2. For each request-response pair, it validates the response against the request's reply_schema
(obtained from the extended COMMAND DOCS)
5. In order to get good coverage of the Redis commands, and all their different replies, we chose to use
the existing redis test suite, rather than attempt to write a fuzzer.
#### Notes about RESP2
1. We will not be able to use the testing tool to verify RESP2 replies (we are ok with that, it's time to
accept RESP3 as the future RESP)
2. Since the majority of the test suite is using RESP2, and we want the server to reply with RESP3
so that we can validate it, we will need to know how to convert the actual reply to the one expected.
- number and boolean are always strings in RESP2 so the conversion is easy
- objects (maps) are always a flat array in RESP2
- others (nested array in RESP3's `ZRANGE` and others) will need some special per-command
handling (so the client will not be totally auto-generated)
Example for ZRANGE:
```
"reply_schema": {
"anyOf": [
{
"description": "A list of member elements",
"type": "array",
"uniqueItems": true,
"items": {
"type": "string"
}
},
{
"description": "Members and their scores. Returned in case `WITHSCORES` was used.",
"notes": "In RESP2 this is returned as a flat array",
"type": "array",
"uniqueItems": true,
"items": {
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"description": "Member",
"type": "string"
},
{
"description": "Score",
"type": "number"
}
]
}
}
]
}
```
### Other changes
1. Some tests that behave differently depending on the RESP are now being tested for both RESP,
regardless of the special log-req-res mode ("Pub/Sub PING" for example)
2. Update the history field of CLIENT LIST
3. Added basic tests for commands that were not covered at all by the testsuite
### TODO
- [x] (maybe a different PR) add a "condition" field to anyOf/oneOf schemas that refers to args. e.g.
when `SET` return NULL, the condition is `arguments.get||arguments.condition`, for `OK` the condition
is `!arguments.get`, and for `string` the condition is `arguments.get` - https://github.com/redis/redis/issues/11896
- [x] (maybe a different PR) also run `runtest-cluster` in the req-res logging mode
- [x] add the new tests to GH actions (i.e. compile with `-DLOG_REQ_RES`, run the tests, and run the validator)
- [x] (maybe a different PR) figure out a way to warn about (sub)schemas that are uncovered by the output
of the tests - https://github.com/redis/redis/issues/11897
- [x] (probably a separate PR) add all missing schemas
- [x] check why "SDOWN is triggered by misconfigured instance replying with errors" fails with --log-req-res
- [x] move the response transformers to their own file (run both regular, cluster, and sentinel tests - need to
fight with the tcl including mechanism a bit)
- [x] issue: module API - https://github.com/redis/redis/issues/11898
- [x] (probably a separate PR): improve schemas: add `required` to `object`s - https://github.com/redis/redis/issues/11899
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Hanna Fadida <hanna.fadida@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Shaya Potter <shaya@redislabs.com>
This payload produces a set with duplicate elements (listpack encoding):
```
restore _key 0 "\x14\x25\x25\x00\x00\x00\x0A\x00\x06\x01\x82\x5F\x35\x03\x04\x01\x82\x5F\x31\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x39\x03\x82\x5F\x33\x03\x08\x01\x02\x01\xFF\x0B\x00\x31\xBE\x7D\x41\x01\x03\x5B\xEC"
smembers key
1) "6"
2) "_5"
3) "4"
4) "_1"
5) "_3" ---> dup
6) "0"
7) "_9"
8) "_3" ---> dup
9) "8"
10) "2"
```
This kind of sets will cause SDIFF to hang, SDIFF generated a broken
protocol and left the client hung. (Expected ten elements, but only
got nine elements due to the duplication.)
If we set `sanitize-dump-payload` to yes, we will be able to find
the duplicate elements and report "ERR Bad data format".
Discovered and discussed in #11290.
This PR also improve prints when corrupt-dump-fuzzer hangs, it will
print the cmds and the payload, an example like:
```
Testing integration/corrupt-dump-fuzzer
[TIMEOUT]: clients state report follows.
sock6 => (SPAWNED SERVER) pid:28884
Killing still running Redis server 28884
commands caused test to hang:
SDIFF __key
payload that caused test to hang: "\x14\balabala"
```
Co-authored-by: Oran Agra <oran@redislabs.com>
The following example will create an empty set (listpack encoding):
```
> RESTORE key 0
"\x14\x25\x25\x00\x00\x00\x00\x00\x02\x01\x82\x5F\x37\x03\x06\x01\x82\x5F\x35\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x31\x03\x82\x5F\x39\x03\x04\xA9\x08\x01\xFF\x0B\x00\xA3\x26\x49\xB4\x86\xB0\x0F\x41"
OK
> SCARD key
(integer) 0
> SRANDMEMBER key
Error: Server closed the connection
```
In the spirit of #9297, skip empty set when loading RDB_TYPE_SET_LISTPACK.
Introduced in #11290
The corrupt dump fuzzer uncovered a valgrind warning saying:
```
==76370== Argument 'size' of function malloc has a fishy (possibly negative) value: -3744781444216323815
```
This allocation would have failed (returning NULL) and being handled properly by redis (even before this change), but we also want to silence the valgrind warnings (which are checking that casting to ssize_t produces a non-negative value).
The solution i opted for is to explicitly fail these allocations (returning NULL), before even reaching `malloc` (which would have failed and return NULL too).
The implication is that we will not be able to support a single allocation of more than 2GB on a 32bit system (which i don't think is a realistic scenario).
i.e. i do think we could be facing cases were redis consumes more than 2gb on a 32bit system, but not in a single allocation.
The byproduct of this, is that i dropped the overflow assertions, since these will now lead to the same OOM panic we have for failed allocations.
Adds the ability to track the lag of a consumer group (CG), that is, the number
of entries yet-to-be-delivered from the stream.
The proposed constant-time solution is in the spirit of "best-effort."
Partially addresses #8737.
## Description of approach
We add a new "entries_added" property to the stream. This starts at 0 for a new
stream and is incremented by 1 with every `XADD`. It is essentially an all-time
counter of the entries added to the stream.
Given the stream's length and this counter value, we can trivially find the logical
"entries_added" counter of the first ID if and only if the stream is contiguous.
A fragmented stream contains one or more tombstones generated by `XDEL`s.
The new "xdel_max_id" stream property tracks the latest tombstone.
The CG also tracks its last delivered ID's as an "entries_read" counter and
increments it independently when delivering new messages, unless the this
read counter is invalid (-1 means invalid offset). When the CG's counter is
available, the reported lag is the difference between added and read counters.
Lastly, this also adds a "first_id" field to the stream structure in order to make
looking it up cheaper in most cases.
## Limitations
There are two cases in which the mechanism isn't able to track the lag.
In these cases, `XINFO` replies with `null` in the "lag" field.
The first case is when a CG is created with an arbitrary last delivered ID,
that isn't "0-0", nor the first or the last entries of the stream. In this case,
it is impossible to obtain a valid read counter (short of an O(N) operation).
The second case is when there are one or more tombstones fragmenting
the stream's entries range.
In both cases, given enough time and assuming that the consumers are
active (reading and lacking) and advancing, the CG should be able to
catch up with the tip of the stream and report zero lag.
Once that's achieved, lag tracking would resume as normal (until the
next tombstone is set).
## API changes
* `XGROUP CREATE` added with the optional named argument `[ENTRIESREAD entries-read]`
for explicitly specifying the new CG's counter.
* `XGROUP SETID` added with an optional positional argument `[ENTRIESREAD entries-read]`
for specifying the CG's counter.
* `XINFO` reports the maximal tombstone ID, the recorded first entry ID, and total
number of entries added to the stream.
* `XINFO` reports the current lag and logical read counter of CGs.
* `XSETID` is an internal command that's used in replication/aof. It has been added with
the optional positional arguments `[ENTRIESADDED entries-added] [MAXDELETEDID max-deleted-entry-id]`
for propagating the CG's offset and maximal tombstone ID of the stream.
## The generic unsolved problem
The current stream implementation doesn't provide an efficient way to obtain the
approximate/exact size of a range of entries. While it could've been nice to have
that ability (#5813) in general, let alone specifically in the context of CGs, the risk
and complexities involved in such implementation are in all likelihood prohibitive.
## A refactoring note
The `streamGetEdgeID` has been refactored to accommodate both the existing seek
of any entry as well as seeking non-deleted entries (the addition of the `skip_tombstones`
argument). Furthermore, this refactoring also migrated the seek logic to use the
`streamIterator` (rather than `raxIterator`) that was, in turn, extended with the
`skip_tombstones` Boolean struct field to control the emission of these.
Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
When rdb creates a consumer without determining whether it exists in advance,
it may return NULL and crash if it encounters corrupt data with duplicate consumers.
Part three of implementing #8702, following #8887 and #9366 .
## Description of the feature
1. Replace the ziplist container of quicklist with listpack.
2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation.
## Interface changes
1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`.
2. Replace `debug ziplist` command with `debug listpack`.
## Internal changes
1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`)
2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`)
3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following #9357 ).
It represent that a quicklistNode is a packed node, as opposed to a plain node.
4. Remove `createZiplistObject` method, which is never used.
5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`.
We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k.
## Improvements
1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at #9366.
2. Optimize `quicklistAppendPlainNode` to avoid memcpy data.
## Bugfix
1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from #9366.
## Test
1. Add unittest for `lpMerge`.
2. Modify the old quicklist ziplist corrupt dump test.
Co-authored-by: Oran Agra <oran@redislabs.com>
Issue found by corrupt-dump-fuzzer test with ASAN.
The problem was that lpSkip and lpGetWithSize could read the next listpack entry without validating that it's in range.
Similarly even the memcmp in lpFind could do that and possibly crash on segfault and now they'll crash on assert first.
The naive fix of using lpAssertValidEntry every time, resulted in 30% degradation in the lpFind benchmark of the unit test.
The final fix with the condition at the bottom has no performance implications.
TCL8.5 can't handle cases where part of the string is escaped and part of it isn't,
if there's a single char that needs escaping, we need to escape the whole string.
Leak found by the corrupt-dump-fuzzer when using GCC ASAN, which seems
to falsely report leaks on pointers kept only on the stack when calling exit.
Instead we now use _exit on panic / assert to skip these leak checks.
Additionally, check for sanitizer warnings in the corrupt-dump-fuzzer between iterations,
so that when something is found we know which test to relate it too (and it prints reproduction command list)
LCS can allocate immense amount of memory (sizes of two inputs multiplied by each other).
In the past this caused some possible security issues due to overflows, which we solved
and also added use of `trymalloc` to return "Insufficient memory" instead of OOM panic zmalloc.
But in case overcommit is enabled, it could be that we won't get the OOM panic, and zmalloc
will succeed, and then we can get OOM killed by the kernel.
The solution here is to prevent LCS from allocating transient memory that's bigger than
`proto-max-bulk-len` config.
This config is not directly related to transient memory, but using a hard coded value ad well as
introducing a specific config seems wrong.
This comes to solve an error in the corrupt-dump-fuzzer test that started in the daily CI see #9799
Redis lists are stored in quicklist, which is currently a linked list of ziplists.
Ziplists are limited to storing elements no larger than 4GB, so when bigger
items are added they're getting truncated.
This PR changes quicklists so that they're capable of storing large items
in quicklist nodes that are plain string buffers rather than ziplist.
As part of the PR there were few other changes in redis:
1. new DEBUG sub-commands:
- QUICKLIST-PACKED-THRESHOLD - set the threshold of for the node type to
be plan or ziplist. default (1GB)
- QUICKLIST <key> - Shows low level info about the quicklist encoding of <key>
2. rdb format change:
- A new type was added - RDB_TYPE_LIST_QUICKLIST_2 .
- container type (packed / plain) was added to the beginning of the rdb object
(before the actual node list).
3. testing:
- Tests that requires over 100MB will be by default skipped. a new flag was
added to 'runtest' to run the large memory tests (not used by default)
Co-authored-by: sundb <sundbcn@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Part two of implementing #8702 (zset), after #8887.
## Description of the feature
Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance.
## Rdb format changes
New `RDB_TYPE_ZSET_LISTPACK` rdb type.
## Rdb loading improvements:
1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist.
2) Simplifying the release of empty key objects when RDB loading.
3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c.
## Interface changes
1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`).
2) OBJECT ENCODING will return listpack instead of ziplist.
## Listpack improvements:
1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack.
2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string.
3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`.
## Zset improvements:
1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop.
2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset.
## Tests
1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function.
2) Add zset RDB loading test.
3) Add benchmark test for `lpCompare` and `ziplsitCompare`.
4) Add empty listpack zset corrupt dump test.
Until now, giving a negative index seeks from the end of a list and a
positive seeks from the beginning. This change makes it seek from
the nearest end, regardless of the sign of the given index.
quicklistIndex is used by all list commands which operate by index.
LINDEX key 999999 in a list if 1M elements is greately optimized by
this change. Latency is cut by 75%.
LINDEX key -1000000 in a list of 1M elements, likewise.
LRANGE key -1 -1 is affected by this, since LRANGE converts the
indices to positive numbers before seeking.
The tests for corrupt dumps are updated to make sure the corrup
data is seeked in the same direction as before.
We only run OOM related tests on x86_64 and aarch64, as jemalloc on other
platforms (notably s390x) may actually succeed very large allocations. As
a result the test may hang for a very long time at the cleanup phase,
iterating as many as 2^61 hash table slots.
Part one of implementing #8702 (taking hashes first before other types)
## Description of the feature
1. Change ziplist encoded hash objects to listpack encoding.
2. Convert existing ziplists on RDB loading time. an O(n) operation.
## Rdb format changes
1. Add RDB_TYPE_HASH_LISTPACK rdb type.
2. Bump RDB_VERSION to 10
## Interface changes
1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`)
2. OBJECT ENCODING will return `listpack` instead of `ziplist`
## Listpack improvements:
1. Support direct insert, replace integer element (rather than convert back and forth from string)
3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such)
4. Optimize element length fetching, avoid multiple calculations
5. Use inline to avoid function call overhead.
## Tests
1. Add a new test to the RDB load time conversion
2. Adding the listpack unit tests. (based on the one in ziplist.c)
3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones.
Co-authored-by: Oran Agra <oran@redislabs.com>
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.
1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.
Recently we found two issues in the fuzzer tester: #9302#9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.
Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff
Co-authored-by: sundb <sundbcn@gmail.com>
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )
This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.
Co-authored-by: Oran Agra <oran@redislabs.com>
This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes.
This spell checker is based on blacklist of common spelling mistakes, so it will not catch everything,
but at least it is also unlikely to cause false positives.
Besides that, the PR also fixes many spelling mistakes and types, not all are a result of the spell checker we use.
Here's a summary of other changes:
1. Scanned the entire source code and fixes all sorts of typos and spelling mistakes (including missing or extra spaces).
2. Outdated function / variable / argument names in comments
3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString.
4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751
5. Some outdated https link URLs.
6. Fix some outdated comment. Such as:
- In README: about the rdb, we used to said create a `thread`, change to `process`
- dbRandomKey function coment (about the dictGetRandomKey, change to dictGetFairRandomKey)
- notifyKeyspaceEvent fucntion comment (add type arg)
- Some others minor fix in comment (Most of them are incorrectly quoted by variable names)
7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`
This commit revives the improves the ability to run the test suite against
external servers, instead of launching and managing `redis-server` processes as
part of the test fixture.
This capability existed in the past, using the `--host` and `--port` options.
However, it was quite limited and mostly useful when running a specific tests.
Attempting to run larger chunks of the test suite experienced many issues:
* Many tests depend on being able to start and control `redis-server` themselves,
and there's no clear distinction between external server compatible and other
tests.
* Cluster mode is not supported (resulting with `CROSSSLOT` errors).
This PR cleans up many things and makes it possible to run the entire test suite
against an external server. It also provides more fine grained controls to
handle cases where the external server supports a subset of the Redis commands,
limited number of databases, cluster mode, etc.
The tests directory now contains a `README.md` file that describes how this
works.
This commit also includes additional cleanups and fixes:
* Tests can now be tagged.
* Tag-based selection is now unified across `start_server`, `tags` and `test`.
* More information is provided about skipped or ignored tests.
* Repeated patterns in tests have been extracted to common procedures, both at a
global level and on a per-test file basis.
* Cleaned up some cases where test setup was based on a previous test executing
(a major anti-pattern that repeats itself in many places).
* Cleaned up some cases where test teardown was not part of a test (in the
future we should have dedicated teardown code that executes even when tests
fail).
* Fixed some tests that were flaky running on external servers.
the corrupt-dump-fuzzer test found a case where an access to a corrupt
stream would have caused accessing to uninitialized memory.
now it'll panic instead.
The issue was that there was a stream that says it has more than 0
records, but looking for the max ID came back empty handed.
p.s. when sanitize-dump-payload is used, this corruption is detected,
and the RESTORE command is gracefully rejected.
* The corrupt dump fuzzer found a division by zero.
* in some cases the random fields from the HRANDFIELD tests produced
fields with newlines and other special chars (due to \ char), this caused
the TCL tests to see a bulk response that has a newline in it and add {}
around it, later it can think this is a nested list. in fact the `alpha` random
string generator isn't using spaces and newlines, so it should not use `\`
either.
If RESTORE passes successfully with full sanitization, we can't affort
to crash later on assertion due to duplicate records in a hash when
converting it form ziplist to dict.
This means that when doing full sanitization, we must make sure there
are no duplicate records in any of the collections.
The test creates keys with various encodings, DUMP them, corrupt the payload
and RESTORES it.
It utilizes the recently added use-exit-on-panic config to distinguish between
asserts and segfaults.
If the restore succeeds, it runs random commands on the key to attempt to
trigger a crash.
It runs in two modes, one with deep sanitation enabled and one without.
In the first one we don't expect any assertions or segfaults, in the second one
we expect assertions, but no segfaults.
We also check for leaks and invalid reads using valgrind, and if we find them
we print the commands that lead to that issue.
Changes in the code (other than the test):
- Replace a few NPD (null pointer deference) flows and division by zero with an
assertion, so that it doesn't fail the test. (since we set the server to use
`exit` rather than `abort` on assertion).
- Fix quite a lot of flows in rdb.c that could have lead to memory leaks in
RESTORE command (since it now responds with an error rather than panic)
- Add a DEBUG flag for SET-SKIP-CHECKSUM-VALIDATION so that the test don't need
to bother with faking a valid checksum
- Remove a pile of code in serverLogObjectDebugInfo which is actually unsafe to
run in the crash report (see comments in the code)
- fix a missing boundary check in lzf_decompress
test suite infra improvements:
- be able to run valgrind checks before the process terminates
- rotate log files when restarting servers
When loading an encoded payload we will at least do a shallow validation to
check that the size that's encoded in the payload matches the size of the
allocation.
This let's us later use this encoded size to make sure the various offsets
inside encoded payload don't reach outside the allocation, if they do, we'll
assert/panic, but at least we won't segfault or smear memory.
We can also do 'deep' validation which runs on all the records of the encoded
payload and validates that they don't contain invalid offsets. This lets us
detect corruptions early and reject a RESTORE command rather than accepting
it and asserting (crashing) later when accessing that payload via some command.
configuration:
- adding ACL flag skip-sanitize-payload
- adding config sanitize-dump-payload [yes/no/clients]
For now, we don't have a good way to ensure MIGRATE in cluster resharding isn't
being slowed down by these sanitation, so i'm setting the default value to `no`,
but later on it should be set to `clients` by default.
changes:
- changing rdbReportError not to `exit` in RESTORE command
- adding a new stat to be able to later check if cluster MIGRATE isn't being
slowed down by sanitation.