This change attempts to switch to a hash function which mitigates
the effects of the HashDoS attack (a denial of service attack that tries
to force data structures into worst case behavior) while at the same time
providing Redis with a hash function that does not expect the input
data to be word aligned, a condition no longer true now that sds.c
strings have a variable length header.
Note that even with a hash function for which collisions cannot be
generated without knowing the seed, special implementation details or
an indirect exposure of the seed (for example the ability to add
elements to a Set and check the order in which Redis returns them with
SMEMBERS) may sometimes make the attacker's life simpler in the process
of trying to guess the correct seed. However the next step would be to
switch to a log(N) data structure when too many items in a single bucket
are detected: this seems like overkill in the case of Redis.
SPEED REGRESSION TESTS:
In order to verify that switching from MurmurHash to SipHash had
no impact on speed, a set of benchmarks involving fast insertion
of 5 million keys was performed.
The results show Redis with SipHash in high pipelining conditions
to be about 4% slower compared to using the previous hash function.
However this could partially be related to the fact that the current
implementation does not attempt to hash whole words at a time but
reads single bytes, in order to have an output which is endian-neutral
and at the same time to work on systems where unaligned memory accesses
are a problem.
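The byte-at-a-time read described above can be sketched as follows. This is
a minimal illustration, not the actual Redis code: the helper name
load_u64_le is hypothetical, and it only shows how a 64 bit word can be
assembled from single bytes so that the result does not depend on host
endianness or on pointer alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from Redis): build a 64 bit value from
 * 8 bytes, least significant byte first. Since only single bytes are
 * read, the result is the same on little and big endian hosts, and no
 * unaligned word access is ever performed. */
static uint64_t load_u64_le(const unsigned char *p) {
    assert(p != NULL);
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}
```

The cost of this approach is eight byte loads and shifts per word instead
of a single word load, which is consistent with the small slowdown
reported above.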
Further x86 specific optimizations should be tested: the function
may easily reach the same level as MurmurHash2 if a few optimizations
are performed.
Recently we moved the "return ASAP" condition for the Delete() function
from checking .size to checking .used, which is smarter. However, while
testing the first table alone is always enough to ensure the dict is
totally empty when we test the .size field, testing .used requires
testing both T0 and T1, since a rehashing could be in progress.
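A minimal sketch of the check, assuming simplified stand-ins for the dict.c
structures (the type and function names below are hypothetical and only
keep the fields needed for the example):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the dict.c structures. */
typedef struct { unsigned long size; unsigned long used; } sketch_table;
typedef struct { sketch_table ht[2]; long rehashidx; } sketch_dict;

/* Returns 1 when the dict holds no entries at all. While a rehashing is
 * in progress entries may live in either table, so both .used counters
 * must be zero before returning early. */
static int sketch_dict_is_empty(const sketch_dict *d) {
    assert(d != NULL);
    return d->ht[0].used == 0 && d->ht[1].used == 0;
}
```

With only the first table checked, a dict whose entries have all been
moved to the second table during rehashing would wrongly look empty.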
Notes by @antirez:
This patch was picked from a larger commit by Oran and adapted to change
the API a bit. The basic idea is to avoid double lookups when there is
a need to use the value of the deleted entry.
BEFORE:
entry = dictFind( ... ); /* 1st lookup. */
/* Do something with the entry. */
dictDelete(...); /* 2nd lookup. */
AFTER:
entry = dictUnlink( ... ); /* 1st lookup. */
/* Do something with the entry. */
dictFreeUnlinkedEntry(entry); /* No lookups! */
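The unlink-then-free idea can be illustrated on a single collision chain.
This is only a sketch under simplifying assumptions: the sketch_entry type
and the chain_unlink / chain_free_unlinked names are hypothetical, and the
real dict.c functions also deal with hashing, two tables and key/value
destructors.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chain node, much simpler than a real dict entry. */
typedef struct sketch_entry {
    const char *key;
    int val;
    struct sketch_entry *next;
} sketch_entry;

/* Detach the entry matching key from the chain and return it WITHOUT
 * freeing it, so the caller can still use its value. Returns NULL when
 * the key is not found. This is the only lookup performed. */
static sketch_entry *chain_unlink(sketch_entry **head, const char *key) {
    assert(head != NULL && key != NULL);
    for (sketch_entry **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            sketch_entry *found = *pp;
            *pp = found->next;   /* Bypass the entry in the chain. */
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}

/* Release an entry previously returned by chain_unlink(). No lookup. */
static void chain_free_unlinked(sketch_entry *e) {
    free(e);
}
```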
The command reports information about the hash table internal state
representing the specified database ID.
This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
No semantic changes, since to make dict.c truly able to scale over the
32 bit table size limit, the hash function and other internals
related to hash function output should be 64 bit ready.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
The old version of SPOP with "count" argument used an API call of dict.c
which was actually designed for a different goal, and was not capable of
good distribution. We follow a different three-cases approach optimized
for different ratios between set size and requested number of elements.
The implementation is simpler and allowed the removal of a large amount
of code.
Some language in the comment was difficult
to understand, so this commit: clarifies wording, removes
unnecessary words, and relocates some dependent clauses
closer to what they actually describe.
I also tried to break up longer chains of thought
(if X, then Y, and Q, and also F, so obviously M)
into more manageable chunks for ease of understanding.
This new function is useful to get a number of random entries from a
hash table when we just need to do some sampling without particularly
good distribution.
It just jumps to a random place in the hash table and returns the first
N items encountered by scanning linearly.
The main usefulness of this function is to speed up Redis internal
sampling of the key space, for example for key eviction or expiry.
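The "jump to a random place, then scan linearly" strategy can be sketched
as below. This is an illustration under simplifying assumptions, not the
dict.c implementation: the table is a flat array with at most one item per
slot (no chains, no second table), and the sample_some name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flat table: slots[i] is NULL when slot i is empty.
 * Stores up to count items into out, starting from a random slot and
 * scanning linearly with wrap-around. Returns the number of items
 * stored, which may be less than count if the table holds fewer. */
static size_t sample_some(const char **slots, size_t nslots,
                          const char **out, size_t count) {
    assert(slots != NULL && out != NULL);
    if (nslots == 0) return 0;
    size_t start = (size_t)rand() % nslots;
    size_t stored = 0;
    for (size_t step = 0; step < nslots && stored < count; step++) {
        const char *item = slots[(start + step) % nslots];
        if (item != NULL) out[stored++] = item;
    }
    return stored;
}
```

Items that sit next to each other in the table tend to be returned
together, which is why the distribution is poor but the sampling is cheap.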
The Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.
This commit adds this support, as an optional callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
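A minimal sketch of such a fixed-interval callback, assuming the interval
is counted over the loop iterations of a hypothetical clearing function
(the names progress_cb, clear_buckets and count_calls are illustrative and
not the dict.c API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives caller-provided private data. */
typedef void (*progress_cb)(void *privdata);

/* Hypothetical clearing loop. When cb is not NULL it is invoked once
 * every 65536 iterations (at i = 0, 65536, 131072, ...), giving the
 * caller a chance to do some incremental work in the meantime. */
static void clear_buckets(unsigned long nbuckets, progress_cb cb,
                          void *privdata) {
    for (unsigned long i = 0; i < nbuckets; i++) {
        if (cb != NULL && (i & 65535) == 0) cb(privdata);
        /* ... the entries of bucket i would be freed here ... */
    }
}

/* Example callback: counts how many times it was invoked. */
static void count_calls(void *privdata) {
    assert(privdata != NULL);
    (*(int *)privdata)++;
}
```

Using a power of two interval lets the check be a single bitwise AND,
so it adds very little overhead to the deletion loop.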