redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
antirez	8f7bf2841a	Cluster: decrease ping/pong traffic by trusting other nodes reports. Clusters of bigger sizes tend to have a lot of traffic in the cluster bus just for failure detection: a node will try to get a ping reply from another node no later than when half the node timeout has elapsed, in order to avoid a false positive. However this means that if we have N nodes and the node timeout is set to, for instance, M seconds, we'll have to ping N nodes every M/2 seconds. These N*M/2 pings will receive the same number of pongs, so a total of N*M packets per node. However given that we have a total of N nodes doing this, the total number of messages will be N*N*M. In a 100 nodes cluster with a timeout of 60 seconds, this translates to a total of 100*100*30 packets per second, summing all the packets exchanged by all the nodes. This is, as you can guess, a lot... So this patch changes the implementation in a very simple way in order to trust the reports of other nodes: if a node A reports a node B as alive at least up to a given time, we update our view accordingly. The problem with this approach is that it could result in a subset of nodes being able to reach a given node X, and preventing others from detecting that it is actually not reachable from the majority of nodes. So the above algorithm is refined by trusting other nodes only if we do not currently have a ping pending for the node X, and if there are no failure reports for that node. Since each node, anyway, pings 10 other nodes every second (one node every 100 milliseconds), eventually, even trusting the other nodes' reports, we will detect if a given node is down from our POV.
Now to understand the number of packets that the cluster would exchange for failure detection with the patch, we can start by considering the random PINGs that the cluster sends anyway as a baseline: each node sends 10 packets per second, so the total traffic if no additional packets were sent, including PONG packets, would be: Total messages per second = N*10*2. However, trusting other nodes' gossip sections will not ALWAYS prevent pinging nodes for the "half timeout reached" rule. The math involved in computing the actual rate as N and M change is quite complex and depends also on another parameter, which is the number of entries in the gossip section of PING and PONG packets. However it is possible to compare what happens in clusters of different sizes experimentally. After applying this patch a very important reduction in the number of packets exchanged is trivial to observe, without apparent impact on failure detection performance. Actual numbers with different cluster sizes should be published in the Redis Cluster documentation in the future. Related to #3929.	2017-04-14 10:43:53 +02:00
antirez	c5d6f577f0	Cluster: collect more specific bus messages stats. First step toward changing Cluster to use fewer messages. Related to issue #3929.	2017-04-13 19:22:35 +02:00
Itamar Haber	b8286d1fc9	Changes command stats iteration to being dict-based. With the addition of modules, looping over the redisCommandTable misses any added commands. By moving to dictionary iteration this is resolved.	2017-04-13 17:03:46 +03:00
antirez	104584b95e	Fix typo in feedReplicationBacklog() top comment.	2017-04-12 12:28:05 +02:00
antirez	1210af3804	Add a top comment in crucial functions inside networking.c.	2017-04-12 10:12:27 +02:00
antirez	4a850be4dc	Set lua-time-limit default value at safe place. Otherwise, as it was, it will overwrite whatever the user set. Close #3703.	2017-04-11 16:56:00 +02:00
antirez	f47607af02	Fix preprocessor if/else chain broken in order to fix #3927 .	2017-04-11 16:54:27 +02:00
antirez	74720ea993	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2017-04-11 16:45:49 +02:00
antirez	aa5b4be02e	Fix zmalloc_get_memory_size() ifdefs to actually use the else branch. Close #3927.	2017-04-11 16:45:11 +02:00
Salvatore Sanfilippo	69ce5c5d10	Merge pull request #3924 from lorneli/unstable Expire: Update comment of activeExpireCycle function	2017-04-11 16:31:55 +02:00
antirez	531647bb1b	Make more obvious why there was issue #3843 .	2017-04-10 13:17:05 +02:00
Salvatore Sanfilippo	01b6966afc	Merge pull request #3843 from dvirsky/fix_bc_free fixed free of blocked client before referring to it	2017-04-10 13:14:52 +02:00
antirez	ffefc9f92d	Fix modules blocking commands awake delay. If a thread unblocks a client blocked in a module command, by using the RedisModule_UnblockClient() API, the event loop may not be awakened until the next timeout of the multiplexing API or the next unrelated I/O operation on other clients. We actually want the client to be served ASAP, so a mechanism is needed in order for the unblocking API to inform Redis that there is a client to serve ASAP. This commit fixes the issue using the old trick of the pipe: when a client needs to be unblocked, a byte is written in a pipe. When we run the list of clients blocked in modules, we consume all the bytes written in the pipe. Writes and reads are performed inside the context of the mutex, so no race is possible in which we consume the bytes that are actually related to an awake request for a client that should still be put into the list of clients to unblock. It was verified that after the fix the server handles the blocked clients with the expected short delay. Thanks to @dvirsky for understanding there was such a problem and reporting it.	2017-04-10 09:33:21 +02:00
antirez	91999fce40	Rax library updated. Important bugs fixed.	2017-04-08 17:31:13 +02:00
lorneli	98db5739cc	Expire: Update comment of activeExpireCycle function. The macro REDIS_EXPIRELOOKUPS_TIME_PERC has been replaced by ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC in commit `6500fabfb8`.	2017-04-08 15:15:24 +08:00
antirez	3f9e2322ec	Rax library updated.	2017-04-07 08:46:39 +02:00
antirez	1409c545da	Cluster: hash slots tracking using a radix tree.	2017-03-27 16:37:22 +02:00
Salvatore Sanfilippo	94751543b0	Merge pull request #3875 from oranagra/lfu_tests add LFU policies to the test suite, just for coverage	2017-03-15 09:18:04 +01:00
Oran Agra	4acb4da1d1	add LFU policies to the test suite, just for coverage	2017-03-15 01:05:15 -07:00
antirez	a62f786344	Use sha256 instead of sha1 to generate tarball hashes.	2017-03-09 13:49:36 +01:00
vienna	59bdd08214	fix #3847 : add close socket before return ANET_ERR.	2017-03-07 16:14:05 +00:00
itamar	443f279a3a	Sets up fake client to select current db in RM_Call()	2017-03-06 14:37:10 +02:00
Dvir Volk	4b2229e4b8	fixed free of blocked client before referring to it	2017-03-01 16:51:01 +02:00
Salvatore Sanfilippo	9cc83d2ad9	Makefile: fix building with Solaris C compiler, 64 bit.	2017-02-23 16:53:39 +01:00
antirez	ed7e331051	Merge branch 'sparc' of ssh://209.141.57.197:12222//export/home/antirez/redis into sparc	2017-02-23 15:35:01 +01:00
Salvatore Sanfilippo	b3391fd853	Use ARM unaligned accesses ifdefs for SPARC as well.	2017-02-23 22:39:44 +08:00
Salvatore Sanfilippo	d7826823c0	Fix BITPOS unaligned memory access.	2017-02-23 22:38:44 +08:00
antirez	95883313b5	Solaris fixes about tail usage and atomic vars. Testing with Solaris C compiler (SunOS 5.11 11.2 sun4v sparc sun4v) there were issues compiling due to atomicvar.h and running the tests also failed because of "tail" usage not conforming to the Solaris tail implementation. This commit fixes both issues.	2017-02-22 13:08:21 +01:00
antirez	2b36706a48	Test: replication-psync, wait more to detect write load. Slow systems like the original Raspberry Pi need more time than 5 seconds to start the script and detect writes. After the fix, the Raspberry Pi can pass the unit without issues.	2017-02-22 12:27:01 +01:00
antirez	7c8ddab4f8	Test: fix conditional execution of HINCRBYFLOAT representation test.	2017-02-22 12:00:09 +01:00
antirez	06263485d4	Merge branch 'siphash' into unstable	2017-02-21 17:10:10 +01:00
antirez	e084b5a39f	Merge branch 'arm' into unstable	2017-02-21 17:10:06 +01:00
antirez	0285c2714b	SipHash 2-4 -> SipHash 1-2. For performance reasons we use a reduced rounds variant of SipHash. This should still provide enough protection and the effects on the hash table distribution are nonexistent. If some real world attack on SipHash 1-2 is found we can trivially switch to something more secure. Anyway it is a big step forward from MurmurHash, for which it is trivial to generate seed independent colliding keys... The speed penalty introduced by SipHash 2-4, around 4%, was too big a price to pay compared to the effectiveness of the HashDoS attack against SipHash 1-2, and considering that so far in the Redis history no such incident ever happened, even while using trivial to collide hash functions.	2017-02-21 17:07:28 +01:00
antirez	cd90389b30	freeMemoryIfNeeded(): improve code and lazyfree handling. 1. Refactor memory overhead computation into a function. 2. Every 10 keys evicted, check if memory usage already reached the target value directly, since we otherwise don't count all the memory reclaimed by the background thread right now.	2017-02-21 12:55:59 +01:00
antirez	84fa8230e5	Use locale agnostic tolower() in dict.c hash function.	2017-02-20 17:39:44 +01:00
antirez	05ea8c6122	SipHash x86 optimizations.	2017-02-20 17:32:46 +01:00
antirez	adeed29a99	Use SipHash hash function to mitigate HashDoS attempts. This change attempts to switch to a hash function which mitigates the effects of the HashDoS attack (denial of service attack trying to force data structures to worst case behavior) while at the same time providing Redis with a hash function that does not expect the input data to be word aligned, a condition no longer true now that sds.c strings have a variable length header. Note that it is possible sometimes that even using a hash function for which collisions cannot be generated without knowing the seed, special implementation details or the exposure of the seed in an indirect way (for example the ability to add elements to a Set and check the order in which Redis returns them with SMEMBERS) may make the attacker's life simpler in the process of trying to guess the correct seed, however the next step would be to switch to a log(N) data structure when too many items in a single bucket are detected: this seems like overkill in the case of Redis. SPEED REGRESSION TESTS: In order to verify that switching from MurmurHash to SipHash had no impact on speed, a set of benchmarks involving fast insertion of 5 million keys was performed. The result shows Redis with SipHash in high pipelining conditions to be about 4% slower compared to using the previous hash function. However this could partially be related to the fact that the current implementation does not attempt to hash whole words at a time but reads single bytes, in order to have an output which is endian-neutral and at the same time working on systems where unaligned memory accesses are a problem. Further x86 specific optimizations should be tested, the function may easily get to the same level as MurmurHash2 if a few optimizations are performed.	2017-02-20 17:29:17 +01:00
John.Koepi	9b05aafb50	fix #2883 , #2857 pipe fds leak when fork() failed on bg aof rw	2017-02-20 10:22:57 +01:00
antirez	76d87f47c7	Don't leak file descriptor on syncWithMaster(). Close #3804.	2017-02-20 10:18:41 +01:00
Salvatore Sanfilippo	7329cc3981	ARM: Avoid fast path for BITOP. GCC will produce certain unaligned multi load-store instructions that will be trapped by the Linux kernel since ARM v6 cannot handle them with unaligned addresses. Better to use the slower but safer implementation instead of generating the exception which should be anyway very slow.	2017-02-19 15:07:08 +00:00
Salvatore Sanfilippo	4e9cf4cc7e	ARM: Use libc malloc by default. I'm not sure how much testing Jemalloc gets on ARM; moreover, compiling Redis with Jemalloc support on not very powerful devices, like most ARMs people will build Redis on, is extremely slow. It is possible to enable Jemalloc build anyway if needed by using "make MALLOC=jemalloc".	2017-02-19 15:02:37 +00:00
Salvatore Sanfilippo	72d6d64771	ARM: Avoid memcpy() in MurmurHash64A() if we are using 64 bit ARM. However note that in architectures supporting 64 bit unaligned accesses memcpy(...,...,8) is likely translated to a simple word memory movement anyway.	2017-02-19 15:00:46 +00:00
Salvatore Sanfilippo	1e272a6b52	ARM: Fix 64 bit unaligned access in MurmurHash64A().	2017-02-19 14:01:58 +00:00
minghang.zmh	de07deb4d2	fix server.stat_net_output_bytes calc bug	2017-02-10 20:13:01 +08:00
flowly	1f72ec7dad	Merge pull request #1 from antirez/unstable update to upstream	2017-02-10 19:53:36 +08:00
antirez	f917e0da4c	Fix MIGRATE closing of cached socket on error. After investigating issue #3796, it was discovered that MIGRATE could call migrateCloseSocket() after the original MIGRATE c->argv was already rewritten as a DEL operation. As a result the host/port passed to migrateCloseSocket() could be anything, often a NULL pointer that gets dereferenced, crashing the server. Now the socket is closed at an earlier time when there is a socket error in a later stage where no retry will be performed, before we rewrite the argument vector. Moreover a check was added so that later, in the socket_err label, there is no further attempt at closing the socket if the argument was rewritten. This fix should resolve the bug reported in #3796.	2017-02-09 09:58:38 +01:00
antirez	0dbfb1d154	Fix ziplist fix...	2017-02-01 17:01:31 +01:00
antirez	c495d095ae	Ziplist: insertion bug under particular conditions fixed. Ziplists had a bug that was discovered while investigating a different issue, resulting in a corrupted ziplist representation, and a likely segmentation fault and/or data corruption of the last element of the ziplist, once the ziplist is accessed again. The bug happens when a specific set of insertions / deletions is performed so that an entry is encoded to have a "prevlen" field (the length of the previous entry) of 5 bytes but with a count that could be encoded in a "prevlen" field of a single byte. This could happen when the "cascading update" process called by ziplistInsert()/ziplistDelete() in certain conditions forces the prevlen to be bigger than necessary in order to avoid too much data moving around. Once such an entry is generated, inserting a very small entry immediately before it will result in a resizing of the ziplist for a count smaller than the current ziplist length (which is a violation, inserting code expects the ziplist to get bigger actually). So an FF byte is inserted in a misplaced position. Moreover a realloc() is performed with a count smaller than the ziplist current length so the final bytes could be trashed as well. SECURITY IMPLICATIONS: Currently it looks like an attacker can only crash a Redis server by providing specifically chosen commands. However an FF byte is written and there are other memory operations that depend on a wrong count, so even if it is not immediately apparent how to mount an attack in order to execute code remotely, it is not impossible at all that this could be done. Attacks always get better... and we did not spend enough time thinking about how to exploit this issue, but security researchers or malicious attackers could.	2017-02-01 15:01:59 +01:00
antirez	3a7410a8a6	ziplist: better comments, some refactoring.	2017-01-30 10:12:47 +01:00
antirez	27e29f4fe6	Jemalloc updated to 4.4.0. The original jemalloc source tree was modified to: 1. Remove the configure error that prevents nested builds. 2. Insert the Redis private Jemalloc API in order to allow the Redis fragmentation function to work.	2017-01-30 09:58:34 +01:00
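
The arithmetic in commit 8f7bf2841a can be checked with a short Python sketch. It only evaluates the two expressions exactly as the commit message writes them (the worked example 100*100*30 and the baseline N*10*2); it does not re-derive or validate them, and the function names are hypothetical, chosen here for illustration.

```python
def worked_example_total(nodes, half_timeout_seconds):
    # Expression from the commit's worked example: 100*100*30,
    # i.e. nodes * nodes * (node timeout / 2). Taken as written.
    return nodes * nodes * half_timeout_seconds


def baseline_total(nodes, pings_per_second=10):
    # Baseline from the commit message: each node sends 10 random PINGs
    # per second and each PING gets a PONG, so N * 10 * 2.
    return nodes * pings_per_second * 2


print(worked_example_total(100, 30))  # 300000
print(baseline_total(100))            # 2000
```

For the 100 node example the baseline is 150 times smaller than the pre-patch figure quoted in the commit. The commit message itself notes that the actual post-patch rate lies somewhere above the baseline and is hard to compute analytically.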


6314 Commits
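
The "old trick of the pipe" described in commit ffefc9f92d can be sketched in Python as follows. This is a minimal illustration of the general self-pipe pattern, not the Redis implementation: the class name `UnblockQueue` and its methods are hypothetical, and a `select()` call stands in for the event loop. It assumes a POSIX system where `os.set_blocking` works on pipe descriptors.

```python
import os
import select
import threading


class UnblockQueue:
    """Hypothetical helper: a list of clients to unblock plus a wake-up pipe."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []
        self._read_fd, self._write_fd = os.pipe()
        os.set_blocking(self._read_fd, False)

    def fileno(self):
        # Lets select() watch the read end of the pipe.
        return self._read_fd

    def unblock(self, client):
        # Called from any thread: queue the client and write one byte,
        # both under the mutex, so the event loop wakes up promptly.
        with self._lock:
            self._pending.append(client)
            os.write(self._write_fd, b"x")

    def drain(self):
        # Called from the event loop: consume all wake-up bytes and take
        # the whole list under the same mutex, so no wake-up is lost.
        with self._lock:
            try:
                while os.read(self._read_fd, 4096):
                    pass
            except BlockingIOError:
                pass  # pipe is empty
            ready, self._pending = self._pending, []
        return ready


queue = UnblockQueue()
worker = threading.Thread(target=queue.unblock, args=("client-1",))
worker.start()
# The "event loop": returns as soon as the byte arrives, not at the timeout.
readable, _, _ = select.select([queue], [], [], 5.0)
worker.join()
served = queue.drain()
print(served)  # ['client-1']
```

Because the append and the write happen under the same lock as the read and the list swap, a drain can never consume a byte whose client is not yet in the list, which is the race the commit message rules out.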