Previously two string encodings were used for string objects:
1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds
stirng.
2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer
is casted to a long.
This commit introduces a experimental new encoding called
REDIS_ENCODING_EMBSTR that implements an object represented by an sds
string that is not modifiable but allocated in the same memory chunk as
the robj structure itself.
The chunk looks like the following:
+--------------+-----------+------------+--------+----+
| robj data... | robj->ptr | sds header | string | \0 |
+--------------+-----+-----+------------+--------+----+
| ^
+-----------------------+
The robj->ptr points to the contiguous sds string data, so the object
can be manipulated with the same functions used to manipulate plan
string objects, however we need just on malloc and one free in order to
allocate or release this kind of objects. Moreover it has better cache
locality.
This new allocation strategy should benefit both the memory usage and
the performances. A performance gain between 60 and 70% was observed
during micro-benchmarks, however there is more work to do to evaluate
the performance impact and the memory usage behavior.
NetBSD-current's libc has a function named popcount.
hiding these extensions using feature macros is not possible because
redis uses other extensions covered by the same feature macro.
eg. inet_aton
This reverts commit 2c75f2cf1a.
After further analysis, it is very unlikely that we'll raise the
string size limit to > 512MB, and at the same time such big strings
will be used in 32 bit systems.
Better to revert to size_t so that 32 bit processors will not be
forced to use a 64 bit counter in normal operations, that is currently
completely useless.
When keyspace events are enabled, the overhead is not sever but
noticeable, so this commit introduces the ability to select subclasses
of events in order to avoid to generate events the user is not
interested in.
The events can be selected using redis.conf or CONFIG SET / GET.
remove unsafe and unnecessary cast.
until now, this cast may lead segmentation fault when end > UINT_MAX
setbit foo 0 1
bitcount 0 4294967295
=> ok
bitcount 0 4294967296
=> cause segmentation fault.
Note by @antirez: the commit was modified a bit to also change the
string length type to long, since it's guaranteed to be at max 512 MB in
size, so we can work with the same type across all the code path.
A regression test was also added.
For the C standard char can be either signed or unsigned, it's up to the
compiler, but Redis assumed that it was signed in a few places.
The practical effect of this patch is that now Redis 2.6 will run
correctly in every system where char is unsigned, notably the RaspBerry
PI and other ARM systems with GCC.
Thanks to Georgi Marinov (@eesn on twitter) that reported the problem
and allowed me to use his RaspBerry via SSH to trace and fix the issue!
In the issue #529 an user reported a bug that can be triggered with the
following code:
flushdb
set a
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
bitop or x a b
The bug was introduced with the speed optimization in commit 8bbc076
that specializes every BITOP operation loop up to the minimum length of
the input strings.
However the computation of the minimum length contained an error when a
non existing key was present in the input, after a key that was non zero
length.
This commit fixes the bug and adds a regression test for it.
This commit adds a fast-path to the BITOP that can be used for all the
bytes from 0 to the minimal length of the string, and if there are
at max 16 input keys.
Often the intersected bitmaps are roughly the same size, so this
optimization can provide a 10x speed boost to most real world usages
of the command.
Bytes are processed four full words at a time, in loops specialized
for the specific BITOP sub-command, without the need to check for
length issues with the inputs (since we run this algorithm only as far
as there is data from all the keys at the same time).
The remaining part of the string is intersected in the usual way using
the slow but generic algorith.
It is possible to do better than this with inputs that are not roughly
the same size, sorting the input keys by length, by initializing the
result string in a smarter way, and noticing that the final part of the
output string composed of only data from the longest string does not
need any proecessing since AND, OR and XOR against an empty string does
not alter the output (zero in the first case, and the original string in
the other two cases).
More implementations will be implemented later likely, but this should
be enough to release Redis 2.6-RC4 with bitops merged in.
Note: this commit also adds better testing for BITOP NOT command, that
is currently the faster and hard to optimize further since it just
flips the bits of a single input string.
A bug in the implementation caused BITOP to crash the server if at least
one one of the source objects was integer encoded.
The new implementation takes an additional array of Redis objects
pointers and calls getDecodedObject() to get a reference to a string
encoded object, and then uses decrRefCount() to release the object.
Tests modified to cover the regression and improve coverage.
At Redis's default optimization level the command is now much faster,
always using a constant-time bit manipualtion technique to count bits
instead of GCC builtin popcount, and unrolling the loop.
The current implementation performance is 1.5GB/s in a MBA 11" (1.8 Ghz
i7) compiled with both GCC and clang.
The algorithm used is described here:
http://graphics.stanford.edu/~seander/bithacks.html
bitop.c contains the "Bit related string operations" so it seems more
logical to call it bitops instead of bitop.
This also makes it matching the name of the test (unit/bitops.tcl).