BITCOUNT and BITPOS with non-existing key will return 0 even the
arguments are error, before this commit:
```
> flushall
OK
> bitcount s 0
(integer) 0
> bitpos s 0 0 1 hello
(integer) 0
> set s 1
OK
> bitcount s 0
(error) ERR syntax error
> bitpos s 0 0 1 hello
(error) ERR syntax error
```
The reason is that we judged non-existing before parameter checking and
returned. This PR fixes it, and after this commit:
```
> flushall
OK
> bitcount s 0
(error) ERR syntax error
> bitpos s 0 0 1 hello
(error) ERR syntax error
```
Also BITPOS made the same fix as #12394, check for wrong argument, before
checking for key.
```
> lpush mylist a b c
(integer) 3
> bitpos mylist 1 a b
(error) WRONGTYPE Operation against a key holding the wrong kind of value
```
Generally, In any command we first check for the argument and then check if key exist.
Some of the examples are
```
127.0.0.1:6379> getrange no-key invalid1 invalid2
(error) ERR value is not an integer or out of range
127.0.0.1:6379> setbit no-key 1 invalid
(error) ERR bit is not an integer or out of range
127.0.0.1:6379> xrange no-key invalid1 invalid2
(error) ERR Invalid stream ID specified as stream command argument
```
**Before change**
```
bitcount no-key invalid1 invalid2
0
```
**After change**
```
bitcount no-key invalid1 invalid2
(error) ERR value is not an integer or out of range
```
By the convention of errors, there is supposed to be a space between the code and the name.
While looking at some lua stuff I noticed that interpreter errors were not adding the space,
so some clients will try to map the detailed error message into the error.
We have tests that hit this condition, but they were just checking that the string "starts" with ERR.
I updated some other tests with similar incorrect string checking. This isn't complete though, as
there are other ways we check for ERR I didn't fix.
Produces some fun output like:
```
# Errorstats
errorstat_ERR:count=1
errorstat_ERRuser_script_1_:count=1
```
Redis supports inserting data over 4GB into string (and recently for lists too, see #9357),
But LZF compression used in RDB files (see `rdbcompression` config), and in quicklist
(see `list-compress-depth` config) does not support compress/decompress data over
UINT32_MAX, which will result in corrupting the rdb after compression.
Internal changes:
1. Modify the `unsigned int` parameter of `lzf_compress/lzf_decompress` to `size_t`.
2. Modify the variable types in `lzf_compress` involving offsets and lengths to `size_t`.
3. Set LZF_USE_OFFSETS to 0.
When LZF_USE_OFFSETS is 1, lzf store offset into `LZF_HSLOT`(32bit).
Even in 64-bit, `LZF_USE_OFFSETS` defaults to 1, because lzf assumes that it only
compresses and decompresses data smaller than UINT32_MAX.
But now we need to make lzf support 64-bit, turning on `LZF_USE_OFFSETS` will make
it impossible to store 64-bit offsets or pointers.
BTW, disable LZF_USE_OFFSETS also brings a few performance improvements.
Tests:
1. Add test for compress/decompress string large than UINT32_MAX.
2. Add unittest for compress/decompress quicklistNode.
Make bitpos/bitcount support bit index:
```
BITPOS key bit [start [end [BIT|BYTE]]]
BITCOUNT key [start end [BIT|BYTE]]
```
The default behavior is `BYTE`, so these commands are still compatible with old.
In old way, we always increase server.dirty in BITSET and BITFIELD SET.
Even the command doesn't really change anything. This commit make
sure BITSET and BITFIELD SET only increase dirty when the value changed.
Because of that, if the value not changed, some others implications:
- Avoid adding useless AOF
- Reduce replication traffic
- Will not trigger keyspace notifications (setbit)
- Will not invalidate WATCH
- Will not sent the invalidation message to the tracking client
GETBIT, SETBIT may access wrong address because of wrap.
BITCOUNT and BITPOS may return wrapped results.
BITFIELD may access the wrong address but also allocate insufficient memory and segfault (see CVE-2021-32761).
This commit uses `uint64_t` or `long long` instead of `size_t`.
related https://github.com/redis/redis/pull/8096
At 32bit platform:
> setbit bit 4294967295 1
(integer) 0
> config set proto-max-bulk-len 536870913
OK
> append bit "\xFF"
(integer) 536870913
> getbit bit 4294967296
(integer) 0
When the bit index is larger than 4294967295, size_t can't hold bit index. In the past, `proto-max-bulk-len` is limit to 536870912, so there is no problem.
After this commit, bit position is stored in `uint64_t` or `long long`. So when `proto-max-bulk-len > 536870912`, 32bit platforms can still be correct.
For 64bit platform, this problem still exists. The major reason is bit pos 8 times of byte pos. When proto-max-bulk-len is very larger, bit pos may overflow.
But at 64bit platform, we don't have so long string. So this bug may never happen.
Additionally this commit add a test cost `512MB` memory which is tag as `large-memory`. Make freebsd ci and valgrind ci ignore this test.
This commit revives the improves the ability to run the test suite against
external servers, instead of launching and managing `redis-server` processes as
part of the test fixture.
This capability existed in the past, using the `--host` and `--port` options.
However, it was quite limited and mostly useful when running a specific tests.
Attempting to run larger chunks of the test suite experienced many issues:
* Many tests depend on being able to start and control `redis-server` themselves,
and there's no clear distinction between external server compatible and other
tests.
* Cluster mode is not supported (resulting with `CROSSSLOT` errors).
This PR cleans up many things and makes it possible to run the entire test suite
against an external server. It also provides more fine grained controls to
handle cases where the external server supports a subset of the Redis commands,
limited number of databases, cluster mode, etc.
The tests directory now contains a `README.md` file that describes how this
works.
This commit also includes additional cleanups and fixes:
* Tests can now be tagged.
* Tag-based selection is now unified across `start_server`, `tags` and `test`.
* More information is provided about skipped or ignored tests.
* Repeated patterns in tests have been extracted to common procedures, both at a
global level and on a per-test file basis.
* Cleaned up some cases where test setup was based on a previous test executing
(a major anti-pattern that repeats itself in many places).
* Cleaned up some cases where test teardown was not part of a test (in the
future we should have dedicated teardown code that executes even when tests
fail).
* Fixed some tests that were flaky running on external servers.
It was verified in practice that this test is able to stress much more
the implementation by introducing errors that were only trivially to
detect with different offsets but impossible to detect starting always
at zero and counting bits the full length of the string.
So instead to reply with a generic error like:
-ERR ... wrong kind of value ...
now it replies with:
-WRONGTYPE ... wrong kind of value ...
This makes this particular error easy to check without resorting to
(fragile) pattern matching of the error string (however the error string
used to be consistent already).
Client libraries should return a specific exeption type for this error.
Most of the commit is about fixing unit tests.
Bug #582 was not present in 32 bit builds of Redis as
getObjectFromLong() will return an error for overflow.
This commit makes sure that the test does not fail because of the error
returned when running against 32 bit builds.
remove unsafe and unnecessary cast.
until now, this cast may lead segmentation fault when end > UINT_MAX
setbit foo 0 1
bitcount 0 4294967295
=> ok
bitcount 0 4294967296
=> cause segmentation fault.
Note by @antirez: the commit was modified a bit to also change the
string length type to long, since it's guaranteed to be at max 512 MB in
size, so we can work with the same type across all the code path.
A regression test was also added.
In the issue #529 an user reported a bug that can be triggered with the
following code:
flushdb
set a
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
bitop or x a b
The bug was introduced with the speed optimization in commit 8bbc076
that specializes every BITOP operation loop up to the minimum length of
the input strings.
However the computation of the minimum length contained an error when a
non existing key was present in the input, after a key that was non zero
length.
This commit fixes the bug and adds a regression test for it.
This commit adds a fast-path to the BITOP that can be used for all the
bytes from 0 to the minimal length of the string, and if there are
at max 16 input keys.
Often the intersected bitmaps are roughly the same size, so this
optimization can provide a 10x speed boost to most real world usages
of the command.
Bytes are processed four full words at a time, in loops specialized
for the specific BITOP sub-command, without the need to check for
length issues with the inputs (since we run this algorithm only as far
as there is data from all the keys at the same time).
The remaining part of the string is intersected in the usual way using
the slow but generic algorith.
It is possible to do better than this with inputs that are not roughly
the same size, sorting the input keys by length, by initializing the
result string in a smarter way, and noticing that the final part of the
output string composed of only data from the longest string does not
need any proecessing since AND, OR and XOR against an empty string does
not alter the output (zero in the first case, and the original string in
the other two cases).
More implementations will be implemented later likely, but this should
be enough to release Redis 2.6-RC4 with bitops merged in.
Note: this commit also adds better testing for BITOP NOT command, that
is currently the faster and hard to optimize further since it just
flips the bits of a single input string.
A bug in the implementation caused BITOP to crash the server if at least
one one of the source objects was integer encoded.
The new implementation takes an additional array of Redis objects
pointers and calls getDecodedObject() to get a reference to a string
encoded object, and then uses decrRefCount() to release the object.
Tests modified to cover the regression and improve coverage.
Fuzzing tests of BITCOUNT / BITOP are iterated multiple times.
The new BITCOUNT fuzzing test uses random strings in a wider interval of
lengths including zero-len strings.
The Redis implementation is tested against Tcl implementations of the
same operation. Both fuzzing and testing of specific aspects of the
commands behavior are performed.