redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
antirez	61a01793ed	Better distribution for set get-random-element operations.	2019-02-18 18:27:18 +01:00
zhaozhao.zz	7c6ddbc37d	dict: fix the int problem for defrag	2017-12-05 15:38:03 +01:00
antirez	adeed29a99	Use SipHash hash function to mitigate HashDoS attempts. This change attempts to switch to a hash function which mitigates the effects of the HashDoS attack (denial of service attack trying to force data structures to worst case behavior) while at the same time providing Redis with a hash function that does not expect the input data to be word aligned, a condition no longer true now that sds.c strings have a variable length header. Note that it is possible sometimes that even using a hash function for which collisions cannot be generated without knowing the seed, special implementation details or the exposure of the seed in an indirect way (for example the ability to add elements to a Set and check the return in which Redis returns them with SMEMBERS) may make the attacker's life simpler in the process of trying to guess the correct seed, however the next step would be to switch to a log(N) data structure when too many items in a single bucket are detected: this seems like an overkill in the case of Redis. SPEED REGRESSION TESTS: In order to verify that switching from MurmurHash to SipHash had no impact on speed, a set of benchmarks involving fast insertion of 5 million keys were performed. The result shows Redis with SipHash in high pipelining conditions to be about 4% slower compared to using the previous hash function. However this could partially be related to the fact that the current implementation does not attempt to hash whole words at a time but reads single bytes, in order to have an output which is endian-neutral and at the same time working on systems where unaligned memory accesses are a problem. Further X86 specific optimizations should be tested, the function may easily get at the same level of MurmurHash2 if a few optimizations are performed.	2017-02-20 17:29:17 +01:00
oranagra	5ab6a54cc6	active defrag improvements	2017-01-02 09:42:32 +02:00
oranagra	7aa9e6d2ae	active memory defragmentation	2016-12-30 03:37:52 +02:00
antirez	09a50d34a2	dict.c: dictReplaceRaw() -> dictAddOrFind(). What they say about "naming things" in programming?	2016-09-14 16:43:38 +02:00
oranagra	afcbcc0e58	dict.c: introduce dictUnlink(). Notes by @antirez: This patch was picked from a larger commit by Oran and adapted to change the API a bit. The basic idea is to avoid double lookups when the value of the deleted entry needs to be used. BEFORE: entry = dictFind( ... ); /* 1st lookup. */ /* Do something with the entry. */ dictDelete(...); /* 2nd lookup. */ AFTER: entry = dictUnlink( ... ); /* 1st lookup. */ /* Do something with the entry. */ dictFreeUnlinkedEntry(entry); /* No lookups!. */	2016-09-14 12:18:59 +02:00
oranagra	68bf45fa1e	Optimize repeated keyname hashing. (Change cherry-picked and modified by @antirez from a larger commit provided by @oranagra in PR #3223).	2016-09-12 13:19:05 +02:00
antirez	0c05436cef	Lazyfree: a first implementation of non blocking DEL.	2015-10-01 13:00:19 +02:00
antirez	0f64080dcb	DEBUG HTSTATS <dbid> added. The command reports information about the hash table internal state representing the specified database ID. This can be used in order to investigate rehashings, memory usage issues and for other debugging purposes.	2015-07-14 17:15:37 +02:00
antirez	9feee428f2	SPOP: reimplemented for speed and better distribution. The old version of SPOP with "count" argument used an API call of dict.c which was actually designed for a different goal, and was not capable of good distribution. We follow a different three-cases approach optimized for different ratios between sets and requested number of elements. The implementation is simpler and allowed the removal of a large amount of code.	2015-02-11 10:52:28 +01:00
antirez	5792a217f8	dict.c: add dictGetSomeKeys(), specialized for eviction.	2015-02-11 10:52:27 +01:00
antirez	064d5c96ac	Use long for rehash and iterator index in dict.h. This allows to support datasets with more than 2 billion of keys (possible in very large memory instances, this bug was actually reported). Closes issue #1814.	2014-08-26 10:18:56 +02:00
xiaoyu	d786fb6e94	Clarify argument to dict macro d is more clear because the type of argument is dict not dictht Closes #513	2014-08-18 10:59:01 +02:00
antirez	edca2b14d2	Remove warnings and improve integer sign correctness.	2014-08-13 11:44:38 +02:00
antirez	d1cb6a0fc4	Add double field in dict.c entry value union.	2014-07-22 17:38:22 +02:00
antirez	5317f5e99a	Added dictGetRandomKeys() to dict.c: mass get random entries. This new function is useful to get a number of random entries from a hash table when we just need to do some sampling without particularly good distribution. It just jumps to a random place of the hash table and returns the first N items encountered by scanning linearly. The main usefulness of this function is to speed up Redis internal sampling of the key space, for example for key eviction or expiry.	2014-03-20 15:50:46 +01:00
antirez	2eb781b35b	dict.c: added optional callback to dictEmpty(). Redis hash table implementation has many non-blocking features like incremental rehashing, however while deleting a large hash table there was no way to have a callback called to do some incremental work. This commit adds this support, as an optional callback argument to dictEmpty() that is currently called at a fixed interval (one time every 65k deletions).	2013-12-10 18:46:24 +01:00
Pieter Noordhuis	7f490b197f	Add SCAN command	2013-10-25 10:49:48 +02:00
antirez	48cde3fe47	dict.c iterator API misuse protection. dict.c allows the user to create unsafe iterators, that are iterators that will not touch the dictionary data structure in any way, preventing copy on write, but at the same time are limited in their usage. The limitation is that when iterating with an unsafe iterator, no call to other dictionary functions must be done inside the iteration loop, otherwise the dictionary may be incrementally rehashed resulting into missing elements in the set of the elements returned by the iterator. However after introducing this kind of iterators a number of bugs were found due to misuses of the API, and we are still finding bugs about this issue. The bugs are not trivial to track because the effect is just missing elements during the iteration. This commit introduces auto-detection of the API misuse. The idea is that an unsafe iterator has a contract: from initialization to the release of the iterator the dictionary should not change. So we take a fingerprint of the dictionary state, xoring a few important dict properties when the unsafe iterator is initialized. We later check when the iterator is released if the fingerprint is still the same. If it is not, we found a misuse of the iterator, as not allowed API calls changed the internal state of the dictionary. This code was checked against a real bug, issue #1240. This is what Redis prints (aborting) when a misuse is detected: Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)), function dictReleaseIterator, file dict.c, line 587.	2013-08-19 15:00:57 +02:00
Salvatore Sanfilippo	ecd82f59fe	Merge pull request #693 from ghurrell/dict-h-typos Fix (cosmetic) typos in dict.h	2012-10-22 02:55:23 -07:00
antirez	da920e75d4	Hash function switched to murmurhash2. The previously used hash function, djbhash, is not secure against collision attacks even when the seed is randomized as there are simple ways to find seed-independent collisions. The new hash function appears to be safe (or much harder to exploit at least) in this case, and has better distribution. Better distribution does not always mean it's better. For instance in a fast benchmark with "DEBUG POPULATE 1000000" I obtained the following results: 1.6 seconds with djbhash, 2.0 seconds with murmurhash2. This is due to the fact that djbhash will hash objects that follow the pattern `prefix:<id>` and where the id is numerically near, to near buckets. This improves the locality. However in other access patterns with keys that have no relation murmurhash2 has some (apparently minimal) speed advantage. On the other hand a better distribution should significantly improve the quality of the distribution of elements returned with dictGetRandomKey() that is used in SPOP, SRANDMEMBER, RANDOMKEY, and other commands. Everything considered, and under the suspicion that this commit fixes a security issue in Redis, we are switching to the new hash function. If some serious speed regression is found in the future we'll be able to step back easily. This commit fixes issue #663.	2012-10-05 11:20:13 +02:00
Greg Hurrell	4b1f6ad3e7	Fix (cosmetic) typos in dict.h	2012-10-02 22:01:26 -07:00
antirez	a48c8d873b	Fix for hash table collision attack. We simply randomize hash table initialization value at startup time.	2012-01-21 23:30:13 +01:00
antirez	14ed10d957	dict set/get macros for integers fixed.	2011-11-09 13:39:59 +01:00
antirez	6c578b764a	dict.c: added macros to get signed/unsigned integer values from hash entry. Field name of hash entry union modified for clarity.	2011-11-08 23:59:53 +01:00
antirez	aa9a61ccd7	dict.c: added macros in dict.h to set signed and unsigned 64 bit values directly inside the hash entry without using additional memory.	2011-11-08 19:41:29 +01:00
antirez	c0ba9ebe13	dict.c API names modified to be more concise and consistent.	2011-11-08 17:07:55 +01:00
antirez	71a50956b1	dict.c: added two lower level methods for directly manipulating hash entries. This is useful in order to set 64 bit integers as values directly inside the hash entry (in order to save memory), without casting, and even in 32 bit builds.	2011-11-08 16:57:20 +01:00
antirez	6a7841eb09	added an union in the dict.h structure to store 64 bit integers directly into hash table entries.	2011-11-02 15:28:45 +01:00
antirez	4b53e7365c	Introduced a safe iterator interface that can be used to iterate while accessing the dictionary at the same time. Now the default interface is considered unsafe and should be used only with dictNext()	2011-05-10 10:15:50 +02:00
antirez	1b1f47c915	command lookup process turned into a much more flexible and probably faster hash table	2010-11-03 11:23:59 +01:00
antirez	e2641e09cc	redis.c split into many different C files. networking related stuff moved into networking.c moved more code more work on layout of source code SDS instantaneous memory saving. By Pieter and Salvatore at VMware ;) cleanly compiling again after the first split, now splitting it in more C files moving more things around... work in progress split replication code splitting more Sets split Hash split replication split even more splitting more splitting minor change	2010-07-01 14:38:51 +02:00

33 Commits
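
Commit 48cde3fe47 in the log above describes guarding unsafe iterators with a fingerprint of the dictionary state, taken when the iterator is created and checked again when it is released. The following is only a minimal sketch of that idea, not the code from dict.c: the `toy_dict` and `toy_iter` types and all `toy_*` functions are hypothetical, and the mixing constants are an assumption made here so that unrelated fields do not cancel out under a plain XOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash table; NOT the real dict struct. */
typedef struct toy_dict {
    void **table;   /* bucket array */
    size_t size;    /* number of buckets */
    size_t used;    /* number of stored entries */
} toy_dict;

/* Hypothetical unsafe iterator: remembers the state it started from. */
typedef struct toy_iter {
    const toy_dict *d;
    uint64_t fingerprint;
} toy_iter;

/* Combine a few properties of the table into one 64-bit value.
 * The commit message only says the properties are XORed together;
 * multiplying by odd constants first is an assumption of this sketch. */
uint64_t toy_fingerprint(const toy_dict *d) {
    uint64_t fp = (uint64_t)(uintptr_t)d->table;
    fp ^= (uint64_t)d->size * 0x9E3779B97F4A7C15ULL;
    fp ^= (uint64_t)d->used * 0xC2B2AE3D27D4EB4FULL;
    return fp;
}

/* Record the fingerprint when the unsafe iterator is created. */
toy_iter toy_iter_init(const toy_dict *d) {
    toy_iter it;
    it.d = d;
    it.fingerprint = toy_fingerprint(d);
    return it;
}

/* Return 1 if the table still matches the recorded fingerprint, else 0. */
int toy_iter_is_valid(const toy_iter *it) {
    return it->fingerprint == toy_fingerprint(it->d);
}

/* Abort on release if the table changed while the iterator was live. */
void toy_iter_release(const toy_iter *it) {
    assert(toy_iter_is_valid(it));
}
```

A caller would create the iterator with `toy_iter_init()`, walk the table without calling anything that modifies it, and then call `toy_iter_release()`. If an insertion, deletion, or rehash changed `table`, `size`, or `used` in between, the assertion fails, which mirrors the "Assertion failed" abort quoted in the commit message.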