This gives us a 24 bytes size class which is dict.c dictEntry size, thus
improving the memory efficiency of Redis significantly.
Moreover other non 16 bytes aligned tiny classes are added that further
reduce the fragmentation of the allocator.
Technically speaking LG_QUANTUM should be 4 on i386 / AMD64 because of
SSE types and other 16 bytes types, however we don't use those, and our
jemalloc only targets Redis.
New versions of Jemalloc will have an explicit configure switch in order
to specify the quantum value for a platform without requiring any change
to the Jemalloc source code: we'll switch to this system when available.
This change was originally proposed by Oran Agra (@oranagra) as a change
to the Jemalloc script to generate the size classes define. We ended
doing it differently by changing LG_QUANTUM since it is apparently the
supported Jemalloc method to obtain a 24 bytes size class, moreover it
also provides us other potentially useful size classes.
Related to issue #2510.
When empty strings are created, or when sdsMakeRoomFor() is called, we
are likely into an appending pattern. Use at least type 8 SDS strings
since TYPE 5 does not remember the free allocation size and requires to
call sdsMakeRoomFor() at every new piece appended.
The previos attempt to process each client at least once every ten
seconds was not a good idea, because:
1. Usually because of the past min iterations set to 50, you get much
better processing period most of the times.
2. However when there are many clients and a normal setting for
server.hz, the edge case is triggered, and waiting 10 seconds for a
BLPOP that asked for 1 second is not ok.
3. Moreover, because of the high min-itereations limit of 50, when HZ
was set to an high value, the actual behavior was to process a lot of
clients per second.
Also the function checking for timeouts called gettimeofday() at each
iteration which can be costly.
The new implementation will try to process each client once per second,
gets the current time as argument, and does not attempt to process more
than 5 clients per iteration if not needed.
So now:
1. The CPU usage of an idle Redis process is the same or better.
2. The CPU usage of a busy Redis process is the same or better.
3. However a non trivial amount of work may be performed per iteration
when there are many many clients. In this particular case the user may
want to raise the "HZ" value if needed.
Btw with 4000 clients it was still not possible to noticy any actual
latency created by processing 400 clients per second, since the work
performed for each client is pretty small.
This is an attempt to use the refcount feature of the sds.c fork
provided in the Pull Request #2509. A new type, SDS_TYPE_5 is introduced
having a one byte header with just the string length, without
information about the available additional length at the end of the
string (this means that sdsMakeRoomFor() will be required each time
we want to append something, since the string will always report to have
0 bytes available).
More work needed in order to avoid common SDS functions will pay the
cost of this type. For example both sdscatprintf() and sdscatfmt()
should try to upgrade to SDS_TYPE_8 ASAP when appending chars.
The command reports information about the hash table internal state
representing the specified database ID.
This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
The new return value is the number of keys existing, among the ones
specified in the command line, counting the same key multiple times if
given multiple times (and if it exists).
See PR #2667.
Rationale:
1. The commands look like internals exposed without a real strong use
case.
2. Whatever there is an use case, the client would implement the
commands client side instead of paying RTT just to use a simple to
reimplement library.
3. They add complexity to an otherwise quite straightforward API.
So for now KILLED ;-)
Instead of successive divisions in iteration the new code uses bitwise
magic to interleave / deinterleave two 32bit values into a 64bit one.
All tests still passing and is measurably faster, so worth it.
Stack traces produced by Redis on crash are the most useful tool we
have to fix non easily reproducible crashes, or even easily reproducible
ones where the user just posts a bug report and does not collaborate
furhter.
By declaring functions "static" they no longer show up in the stack
trace.