Commit Graph

4279 Commits

Author SHA1 Message Date
Salvatore Sanfilippo
33035cad04 Merge pull request #4035 from amallia/patch-2
Removed duplicate 'sys/socket.h'  include
2017-06-13 06:27:31 -07:00
antirez
5877c02c51 Fix PERSIST expired key resuscitation issue #4048. 2017-06-13 10:35:51 +02:00
Antonio Mallia
2d1d57eb47 Removed duplicate 'sys/socket.h' include 2017-06-04 15:26:53 +01:00
Antonio Mallia
591dba8055 Fixed comment in clusterMsg version field 2017-06-04 15:09:05 +01:00
Zachary Marquez
a3e53cf9bc Prevent expirations and evictions while paused
Proposed fix to https://github.com/antirez/redis/issues/4027
2017-06-01 16:28:40 -05:00
antirez
e91b81c612 More informative -MISCONF error message. 2017-05-19 12:03:30 +02:00
antirez
e498d9ee3e Collect fork() timing info only if fork succeeded. 2017-05-19 11:10:36 +02:00
antirez
78211aaaaf redis-cli --bigkeys: show error when TYPE fails.
Close #3993.
2017-05-15 11:22:28 +02:00
antirez
1f598fc2bb Modules TSC: use atomic var for server.unixtime.
This avoids Helgrind complaining, but we are actually not using
atomicGet() to get the unixtime value for now: too many places where it
is used and given tha time_t is word-sized it should be safe in all the
archs we support as it is.

On the other hand, Helgrind, when Redis is compiled with "make helgrind"
in order to force the __sync macros, will detect the write in
updateCachedTime() as a read (because atomic functions are used) and
will not complain about races.

This commit also includes minor refactoring of mutex initializations and
a "helgrind" target in the Makefile.
2017-05-10 10:04:16 +02:00
antirez
de786186a5 atomicvar.h: show used API in INFO. Add macro to force __sync builtin.
The __sync builtin can be correctly detected by Helgrind so to force it
is useful for testing. The API in the INFO output can be useful for
debugging after problems are reported.
2017-05-10 09:33:49 +02:00
antirez
6eb51bf1ec zmalloc.c: remove thread safe mode, it's the default way. 2017-05-09 16:59:51 +02:00
antirez
9390c384b8 Modules TSC: Add mutex for server.lruclock.
Only useful for when no atomic builtins are available.
2017-05-09 16:32:49 +02:00
antirez
ece658713b Modules TSC: Improve inter-thread synchronization.
More work to do with server.unixtime and similar. Need to write Helgrind
suppression file in order to suppress the valse positives.
2017-05-09 11:57:09 +02:00
antirez
2a51bac44e Simplify atomicvar.h usage by having the mutex name implicit. 2017-05-04 17:01:00 +02:00
antirez
52bc74f221 Lazyfree: fix lazyfreeGetPendingObjectsCount() race reading counter. 2017-05-04 10:35:40 +02:00
antirez
7d9326b1f3 Modules TSC: HELLO.KEYS reply format fixed. 2017-05-03 23:43:49 +02:00
antirez
9b01b64430 Modules TSC: put the client in the pending write list. 2017-05-03 14:54:48 +02:00
antirez
e67fb915eb adlist: fix final list count in listJoin(). 2017-05-03 14:54:14 +02:00
antirez
79226cb9fa adlist: fix listJoin() to handle empty lists. 2017-05-03 14:15:25 +02:00
antirez
6798736909 Modules: remove unused var in example module. 2017-05-03 14:10:21 +02:00
antirez
1ed2ff5570 Modules TSC: HELLO.KEYS example draft finished. 2017-05-03 14:08:12 +02:00
antirez
7127f15ebe Module: fix RedisModule_Call() "l" specifier to create a raw string. 2017-05-03 14:07:10 +02:00
antirez
3fcf959e60 Modules TSC: Release the GIL for all the time we are blocked.
Instead of giving the module background operations just a small time to
run in the beforeSleep() function, we can have the lock released for all
the time we are blocked in the multiplexing syscall.
2017-05-03 11:26:21 +02:00
antirez
ba4a5a3255 Modules TSC: Export symbols of the new API. 2017-05-02 15:19:28 +02:00
antirez
275905b328 Modules TSC: Handling of RM_Reply* functions. 2017-05-02 15:05:39 +02:00
antirez
9c500b89fb Modules TSC: Basic TS context creeation and handling. 2017-05-02 12:53:10 +02:00
antirez
59b06b14c9 Modules TSC: GIL and cooperative multi tasking setup. 2017-04-28 18:41:10 +02:00
antirez
469d6e2b37 PSYNC2: fix master cleanup when caching it.
The master client cleanup was incomplete: resetClient() was missing and
the output buffer of the client was not reset, so pending commands
related to the previous connection could be still sent.

The first problem caused the client argument vector to be, at times,
half populated, so that when the correct replication stream arrived the
protcol got mixed to the arugments creating invalid commands that nobody
called.

Thanks to @yangsiran for also investigating this problem, after
already providing important design / implementation hints for the
original PSYNC2 issues (see referenced Github issue).

Note that this commit adds a new function to the list library of Redis
in order to be able to reset a list without destroying it.

Related to issue #3899.
2017-04-27 17:08:37 +02:00
antirez
238cebdd5e Check event loop creation return value. Fix #3951.
Normally we never check for OOM conditions inside Redis since the
allocator will always return a pointer or abort the program on OOM
conditons. However we cannot have control on epool_create(), that may
fail for kernel OOM (according to the manual page) even if all the
parameters are correct, so the function aeCreateEventLoop() may indeed
return NULL and this condition must be checked.
2017-04-21 16:27:38 +02:00
Salvatore Sanfilippo
3773c06d28 Merge pull request #3950 from kensou97/unstable
update block->free after some diff data are written to the child process
2017-04-20 07:55:51 +02:00
antirez
7d9dd80db3 Fix getKeysUsingCommandTable() in cluster mode.
Close #3940.
2017-04-19 16:17:08 +02:00
antirez
189a12afb4 PSYNC2: discard pending transactions from cached master.
During the review of the fix for #3899, @yangsiran identified an
implementation bug: given that the offset is now relative to the applied
part of the replication log, when we cache a master, the successive
PSYNC2 request will be made in order to *include* the transaction that
was not completely processed. This means that we need to discard any
pending transaction from our replication buffer: it will be re-executed.
2017-04-19 14:02:52 +02:00
antirez
22be435efe Fix PSYNC2 incomplete command bug as described in #3899.
This bug was discovered by @kevinmcgehee and constituted a major hidden
bug in the PSYNC2 implementation, caused by the propagation from the
master of incomplete commands to slaves.

The bug had several results:

1. Borrowing from Kevin text in the issue: "Given that slaves blindly
copy over their master's input into their own replication backlog over
successive read syscalls, it's possible that with large commands or
small TCP buffers, partial commands are present in this buffer. If the
master were to fail before successfully propagating the entire command
to a slave, the slaves will never execute the partial command (since the
client is invalidated) but will copy it to replication backlog which may
relay those invalid bytes to its slaves on PSYNC2, corrupting the
backlog and possibly other valid commands that follow the failover.
Simple command boundaries aren't sufficient to capture this, either,
because in the case of a MULTI/EXEC block, if the master successfully
propagates a subset of the commands but not the EXEC, then the
transaction in the backlog becomes corrupt and could corrupt other
slaves that consume this data."

2. As identified by @yangsiran later, there is another effect of the
bug. For the same mechanism of the first problem, a slave having another
slave, could receive a full resynchronization request with an already
half-applied command in the backlog. Once the RDB is ready, it will be
sent to the slave, and the replication will continue sending to the
sub-slave the other half of the command, which is not valid.

The fix, designed by @yangsiran and @antirez, and implemented by
@antirez, uses a secondary buffer in order to feed the sub-masters and
update the replication backlog and offsets, only when a given part of
the query buffer is actually *applied* to the state of the instance,
that is, when the command gets processed and the command is not pending
in the Redis transaction buffer because of CLIENT_MULTI state.

Given that now the backlog and offsets representation are in agreement
with the actual processed commands, both issue 1 and 2 should no longer
be possible.

Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in
identifying and designing a fix for this problem.
2017-04-19 10:25:45 +02:00
Salvatore Sanfilippo
27fe8e9fb2 Merge pull request #3945 from badboy/dicthash-bench-compile
Reorder to make dict-benchmark compile on Linux
2017-04-18 16:31:18 +02:00
antirez
02d02a3754 Fix #3848 by closing the descriptor on error. 2017-04-18 16:24:06 +02:00
antirez
da2f9cd186 Fix descriptor leak. Close #3848. 2017-04-18 16:15:16 +02:00
张文康
5f88bd320e update block->free after some diff data are written to the child process 2017-04-18 20:10:08 +08:00
antirez
c33493277a Clarify why we save ziplist elements in revserse order.
Also get rid of variables that are now kinda redundant, since the
dictionary iterator was removed.

This is related to PR #3949.
2017-04-18 11:01:47 +02:00
Jan-Erik Rediger
c4ad4765b0 Reorder to make dict-benchmark compile on Linux
Fixes #3944
2017-04-17 13:37:59 +02:00
spinlock
23ec36909e rdb: saving skiplist in reversed order to accelerate the deserialisation process 2017-04-17 13:22:34 +08:00
antirez
271733f4f8 Cluster: discard pong times in the future.
However we allow for 500 milliseconds of tolerance, in order to
avoid often discarding semantically valid info (the node is up)
because of natural few milliseconds desync among servers even when
NTP is used.

Note that anyway we should ping the node from time to time regardless and
discover if it's actually down from our point of view, since no update
is accepted while we have an active ping on the node.

Related to #3929.
2017-04-15 10:12:08 +02:00
antirez
02777bb252 Cluster: always add PFAIL nodes at end of gossip section.
To rely on the fact that nodes in PFAIL state will be shared around by
randomly adding them in the gossip section is a weak assumption,
especially after changes related to sending less ping/pong packets.

We want to always include gossip entries for all the nodes that are in
PFAIL state, so that the PFAIL -> FAIL state promotion can happen much
faster and reliably.

Related to #3929.
2017-04-14 13:39:49 +02:00
antirez
8c829d9e43 Cluster: fix gossip section ping/pong times encoding.
The gossip section times are 32 bit, so cannot store the milliseconds
time but just the seconds approximation, which is good enough for our
uses. At the same time however, when comparing the gossip section times
of other nodes with our node's view, we need to convert back to
milliseconds.

Related to #3929. Without this change the patch to reduce the traffic in
the bus message does not work.
2017-04-14 11:01:22 +02:00
antirez
6878a3fedd Cluster: add clean-logs command to create-cluster script. 2017-04-14 10:52:00 +02:00
antirez
8f7bf2841a Cluster: decrease ping/pong traffic by trusting other nodes reports.
Cluster of bigger sizes tend to have a lot of traffic in the cluster bus
just for failure detection: a node will try to get a ping reply from
another node no longer than when the half the node timeout would elapsed,
in order to avoid a false positive.

However this means that if we have N nodes and the node timeout is set
to, for instance M seconds, we'll have to ping N nodes every M/2
seconds. This N*M/2 pings will receive the same number of pongs, so
a total of N*M packets per node. However given that we have a total of N
nodes doing this, the total number of messages will be N*N*M.

In a 100 nodes cluster with a timeout of 60 seconds, this translates
to a total of 100*100*30 packets per second, summing all the packets
exchanged by all the nodes.

This is, as you can guess, a lot... So this patch changes the
implementation in a very simple way in order to trust the reports of
other nodes: if a node A reports a node B as alive at least up to
a given time, we update our view accordingly.

The problem with this approach is that it could result into a subset of
nodes being able to reach a given node X, and preventing others from
detecting that is actually not reachable from the majority of nodes.
So the above algorithm is refined by trusting other nodes only if we do
not have currently a ping pending for the node X, and if there are no
failure reports for that node.

Since each node, anyway, pings 10 other nodes every second (one node
every 100 milliseconds), anyway eventually even trusting the other nodes
reports, we will detect if a given node is down from our POV.

Now to understand the number of packets that the cluster would exchange
for failure detection with the patch, we can start considering the
random PINGs that the cluster sent anyway as base line:
Each node sends 10 packets per second, so the total traffic if no
additioal packets would be sent, including PONG packets, would be:

    Total messages per second = N*10*2

However by trusting other nodes gossip sections will not AWALYS prevent
pinging nodes for the "half timeout reached" rule all the times. The
math involved in computing the actual rate as N and M change is quite
complex and depends also on another parameter, which is the number of
entries in the gossip section of PING and PONG packets. However it is
possible to compare what happens in cluster of different sizes
experimentally. After applying this patch a very important reduction in
the number of packets exchanged is trivial to observe, without apparent
impacts on the failure detection performances.

Actual numbers with different cluster sizes should be published in the
Reids Cluster documentation in the future.

Related to #3929.
2017-04-14 10:43:53 +02:00
antirez
c5d6f577f0 Cluster: collect more specific bus messages stats.
First step in order to change Cluster in order to use less messages.
Related to issue #3929.
2017-04-13 19:22:35 +02:00
Itamar Haber
b8286d1fc9 Changes command stats iteration to being dict-based
With the addition of modules, looping over the redisCommandTable
misses any added commands. By moving to dictionary iteration this
is resolved.
2017-04-13 17:03:46 +03:00
antirez
104584b95e Fix typo in feedReplicationBacklog() top comment. 2017-04-12 12:28:05 +02:00
antirez
1210af3804 Add a top comment in crucial functions inside networking.c. 2017-04-12 10:12:27 +02:00
antirez
4a850be4dc Set lua-time-limit default value at safe place.
Otherwise, as it was, it will overwrite whatever the user set.

Close #3703.
2017-04-11 16:56:00 +02:00
antirez
f47607af02 Fix preprocessor if/else chain broken in order to fix #3927. 2017-04-11 16:54:27 +02:00
antirez
74720ea993 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-04-11 16:45:49 +02:00
antirez
aa5b4be02e Fix zmalloc_get_memory_size() ifdefs to actually use the else branch.
Close #3927.
2017-04-11 16:45:11 +02:00
Salvatore Sanfilippo
69ce5c5d10 Merge pull request #3924 from lorneli/unstable
Expire: Update comment of activeExpireCycle function
2017-04-11 16:31:55 +02:00
antirez
531647bb1b Make more obvious why there was issue #3843. 2017-04-10 13:17:05 +02:00
Salvatore Sanfilippo
01b6966afc Merge pull request #3843 from dvirsky/fix_bc_free
fixed free of blocked client before refering to it
2017-04-10 13:14:52 +02:00
antirez
ffefc9f92d Fix modules blocking commands awake delay.
If a thread unblocks a client blocked in a module command, by using the
RedisMdoule_UnblockClient() API, the event loop may not be awaken until
the next timeout of the multiplexing API or the next unrelated I/O
operation on other clients. We actually want the client to be served
ASAP, so a mechanism is needed in order for the unblocking API to inform
Redis that there is a client to serve ASAP.

This commit fixes the issue using the old trick of the pipe: when a
client needs to be unblocked, a byte is written in a pipe. When we run
the list of clients blocked in modules, we consume all the bytes
written in the pipe. Writes and reads are performed inside the context
of the mutex, so no race is possible in which we consume the bytes that
are actually related to an awake request for a client that should still
be put into the list of clients to unblock.

It was verified that after the fix the server handles the blocked
clients with the expected short delay.

Thanks to @dvirsky for understanding there was such a problem and
reporting it.
2017-04-10 09:33:21 +02:00
antirez
91999fce40 Rax library updated.
Important bugs fixed.
2017-04-08 17:31:13 +02:00
lorneli
98db5739cc Expire: Update comment of activeExpireCycle function
The macro REDIS_EXPIRELOOKUPS_TIME_PERC has been replaced by
ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC in commit
6500fabfb8.
2017-04-08 15:15:24 +08:00
antirez
3f9e2322ec Rax library updated. 2017-04-07 08:46:39 +02:00
antirez
1409c545da Cluster: hash slots tracking using a radix tree. 2017-03-27 16:37:22 +02:00
itamar
443f279a3a Sets up fake client to select current db in RM_Call() 2017-03-06 14:37:10 +02:00
Dvir Volk
4b2229e4b8 fixed free of blocked client before refering to it 2017-03-01 16:51:01 +02:00
Salvatore Sanfilippo
9cc83d2ad9 Makefile: fix building with Solaris C compiler, 64 bit. 2017-02-23 16:53:39 +01:00
antirez
ed7e331051 Merge branch 'sparc' of ssh://209.141.57.197:12222//export/home/antirez/redis into sparc 2017-02-23 15:35:01 +01:00
Salvatore Sanfilippo
b3391fd853 Use ARM unaligned accesses ifdefs for SPARC as well. 2017-02-23 22:39:44 +08:00
Salvatore Sanfilippo
d7826823c0 Fix BITPOS unaligned memory access. 2017-02-23 22:38:44 +08:00
antirez
95883313b5 Solaris fixes about tail usage and atomic vars.
Testing with Solaris C compiler (SunOS 5.11 11.2 sun4v sparc sun4v)
there were issues compiling due to atomicvar.h and running the
tests also failed because of "tail" usage not conform with Solaris
tail implementation. This commit fixes both the issues.
2017-02-22 13:08:21 +01:00
antirez
06263485d4 Merge branch 'siphash' into unstable 2017-02-21 17:10:10 +01:00
antirez
e084b5a39f Merge branch 'arm' into unstable 2017-02-21 17:10:06 +01:00
antirez
0285c2714b SipHash 2-4 -> SipHash 1-2.
For performance reasons we use a reduced rounds variant of
SipHash. This should still provide enough protection and the
effects in the hash table distribution are non existing.
If some real world attack on SipHash 1-2 will be found we can
trivially switch to something more secure. Anyway it is a
big step forward from Murmurhash, for which it is trivial to
generate *seed independent* colliding keys... The speed
penatly introduced by SipHash 2-4, around 4%, was a too big
price to pay compared to the effectiveness of the HashDoS
attack against SipHash 1-2, and considering so far in the
Redis history, no such an incident ever happened even while
using trivially to collide hash functions.
2017-02-21 17:07:28 +01:00
antirez
cd90389b30 freeMemoryIfNeeded(): improve code and lazyfree handling.
1. Refactor memory overhead computation into a function.
2. Every 10 keys evicted, check if memory usage already reached
   the target value directly, since we otherwise don't count all
   the memory reclaimed by the background thread right now.
2017-02-21 12:55:59 +01:00
antirez
84fa8230e5 Use locale agnostic tolower() in dict.c hash function. 2017-02-20 17:39:44 +01:00
antirez
05ea8c6122 SipHash x86 optimizations. 2017-02-20 17:32:46 +01:00
antirez
adeed29a99 Use SipHash hash function to mitigate HashDos attempts.
This change attempts to switch to an hash function which mitigates
the effects of the HashDoS attack (denial of service attack trying
to force data structures to worst case behavior) while at the same time
providing Redis with an hash function that does not expect the input
data to be word aligned, a condition no longer true now that sds.c
strings have a varialbe length header.

Note that it is possible sometimes that even using an hash function
for which collisions cannot be generated without knowing the seed,
special implementation details or the exposure of the seed in an
indirect way (for example the ability to add elements to a Set and
check the return in which Redis returns them with SMEMBERS) may
make the attacker's life simpler in the process of trying to guess
the correct seed, however the next step would be to switch to a
log(N) data structure when too many items in a single bucket are
detected: this seems like an overkill in the case of Redis.

SPEED REGRESION TESTS:

In order to verify that switching from MurmurHash to SipHash had
no impact on speed, a set of benchmarks involving fast insertion
of 5 million of keys were performed.

The result shows Redis with SipHash in high pipelining conditions
to be about 4% slower compared to using the previous hash function.
However this could partially be related to the fact that the current
implementation does not attempt to hash whole words at a time but
reads single bytes, in order to have an output which is endian-netural
and at the same time working on systems where unaligned memory accesses
are a problem.

Further X86 specific optimizations should be tested, the function
may easily get at the same level of MurMurHash2 if a few optimizations
are performed.
2017-02-20 17:29:17 +01:00
John.Koepi
9b05aafb50 fix #2883, #2857 pipe fds leak when fork() failed on bg aof rw 2017-02-20 10:22:57 +01:00
antirez
76d87f47c7 Don't leak file descriptor on syncWithMaster().
Close #3804.
2017-02-20 10:18:41 +01:00
Salvatore Sanfilippo
7329cc3981 ARM: Avoid fast path for BITOP.
GCC will produce certain unaligned multi load-store instructions
that will be trapped by the Linux kernel since ARM v6 cannot
handle them with unaligned addresses. Better to use the slower
but safer implementation instead of generating the exception which
should be anyway very slow.
2017-02-19 15:07:08 +00:00
Salvatore Sanfilippo
4e9cf4cc7e ARM: Use libc malloc by default.
I'm not sure how much test Jemalloc gets on ARM, moreover
compiling Redis with Jemalloc support in not very powerful
devices, like most ARMs people will build Redis on, is extremely
slow. It is possible to enable Jemalloc build anyway if needed
by using "make MALLOC=jemalloc".
2017-02-19 15:02:37 +00:00
Salvatore Sanfilippo
72d6d64771 ARM: Avoid memcpy() in MurmurHash64A() if we are using 64 bit ARM.
However note that in architectures supporting 64 bit unaligned
accesses memcpy(...,...,8) is likely translated to a simple
word memory movement anyway.
2017-02-19 15:00:46 +00:00
Salvatore Sanfilippo
1e272a6b52 ARM: Fix 64 bit unaligned access in MurmurHash64A(). 2017-02-19 14:01:58 +00:00
minghang.zmh
de07deb4d2 fix server.stat_net_output_bytes calc bug 2017-02-10 20:13:01 +08:00
antirez
f917e0da4c Fix MIGRATE closing of cached socket on error.
After investigating issue #3796, it was discovered that MIGRATE
could call migrateCloseSocket() after the original MIGRATE c->argv
was already rewritten as a DEL operation. As a result the host/port
passed to migrateCloseSocket() could be anything, often a NULL pointer
that gets deferenced crashing the server.

Now the socket is closed at an earlier time when there is a socket
error in a later stage where no retry will be performed, before we
rewrite the argument vector. Moreover a check was added so that later,
in the socket_err label, there is no further attempt at closing the
socket if the argument was rewritten.

This fix should resolve the bug reported in #3796.
2017-02-09 09:58:38 +01:00
antirez
0dbfb1d154 Fix ziplist fix... 2017-02-01 17:01:31 +01:00
antirez
c495d095ae Ziplist: insertion bug under particular conditions fixed.
Ziplists had a bug that was discovered while investigating a different
issue, resulting in a corrupted ziplist representation, and a likely
segmentation foult and/or data corruption of the last element of the
ziplist, once the ziplist is accessed again.

The bug happens when a specific set of insertions / deletions is
performed so that an entry is encoded to have a "prevlen" field (the
length of the previous entry) of 5 bytes but with a count that could be
encoded in a "prevlen" field of a since byte. This could happen when the
"cascading update" process called by ziplistInsert()/ziplistDelete() in
certain contitious forces the prevlen to be bigger than necessary in
order to avoid too much data moving around.

Once such an entry is generated, inserting a very small entry
immediately before it will result in a resizing of the ziplist for a
count smaller than the current ziplist length (which is a violation,
inserting code expects the ziplist to get bigger actually). So an FF
byte is inserted in a misplaced position. Moreover a realloc() is
performed with a count smaller than the ziplist current length so the
final bytes could be trashed as well.

SECURITY IMPLICATIONS:

Currently it looks like an attacker can only crash a Redis server by
providing specifically choosen commands. However a FF byte is written
and there are other memory operations that depend on a wrong count, so
even if it is not immediately apparent how to mount an attack in order
to execute code remotely, it is not impossible at all that this could be
done. Attacks always get better... and we did not spent enough time in
order to think how to exploit this issue, but security researchers
or malicious attackers could.
2017-02-01 15:01:59 +01:00
antirez
3a7410a8a6 ziplist: better comments, some refactoring. 2017-01-30 10:12:47 +01:00
Jan-Erik Rediger
3c9b817217 Don't divide by zero
Previously Redis crashed on `MEMORY DOCTOR` when it has no slaves attached.

Fixes #3783
2017-01-27 16:24:14 +01:00
miter
3ec1a001fb Change switch statment to if statment 2017-01-26 21:36:26 +09:00
Salvatore Sanfilippo
41d16f7a4a Merge pull request #3657 from itamarhaber/patch-9
Verify pairs are provided after ZADD's subcommands
2017-01-25 09:31:47 +01:00
Salvatore Sanfilippo
432699845c Merge pull request #3712 from oranagra/fix_assert_debug_digest
fix rare assertion in DEBUG DIGEST
2017-01-20 11:01:43 +01:00
antirez
17ac46ea78 Add panic() into redisassert.h.
This header file is for libs, like ziplist.c, that we want to leave
almost separted from the core. The panic() calls will be easy to delete
in order to use such files outside, but the debugging info we gain are
very valuable compared to simple assertions where it is not possible to
print debugging info.
2017-01-18 17:12:07 +01:00
antirez
53b8bf2c89 serverPanic(): allow printf() alike formatting.
This is of great interest because allows us to print debugging
informations that could be of useful when debugging, like in the
following example:

    serverPanic("Unexpected encoding for object %d, %d",
        obj->type, obj->encoding);
2017-01-18 17:05:10 +01:00
antirez
2cd1ae736f Ziplist: remove static from functions, they prevent good crash reports. 2017-01-13 11:55:13 +01:00
Salvatore Sanfilippo
d21aabcedc Merge pull request #3734 from badboy/avoid-command
Initialize help only in repl mode
2017-01-13 11:32:22 +01:00
antirez
636c693f44 Use const in modules types mem_usage method.
As suggested by @itamarhaber.
2017-01-12 12:47:46 +01:00
antirez
3f79b2f883 Defrag: don't crash when a module value is encountered. 2017-01-12 09:50:40 +01:00
antirez
baa9898821 MEMORY USAGE: support for modules data types.
As a side effect of supporting it, we no longer crash when MEMORY USAGE
is called against a module data type.

Close #3637.
2017-01-12 09:47:57 +01:00
antirez
6ad34a4b78 Defrag: not enabled by default. Error on CONFIG SET if not available. 2017-01-11 15:43:08 +01:00
antirez
86192f3038 Defrag: fix function name typo defarg -> defrag. 2017-01-11 15:38:12 +01:00
antirez
4186879675 Defrag: do not crash on empty quicklist. 2017-01-11 15:38:09 +01:00
antirez
e91f0ea1b3 Defrag: fix comments & code to conform to the Redis code base.
Don't go over 80 cols. Start with captial letter, capital letter afer
point, end comment with a point and so forth. No actual code behavior
touched at all.
2017-01-10 11:33:50 +01:00
antirez
173d692bc2 Defrag: activate it only if running modified version of Jemalloc.
This commit also includes minor aesthetic changes like removal of
trailing spaces.
2017-01-10 11:25:39 +01:00
Jan-Erik Rediger
afaaa91885 Initialize help only in repl mode 2017-01-08 18:29:22 +01:00
oranagra
5ab6a54cc6 active defrag improvements 2017-01-02 09:42:32 +02:00
oranagra
7aa9e6d2ae active memory defragmentation 2016-12-30 03:37:52 +02:00
oranagra
b2da5ea773 fix rare assertion in DEBUG DIGEST
getExpire calls dictFind which can do rehashing.
found by calling computeDatasetDigest from serverCron and running the test suite.
2016-12-24 17:27:58 +02:00
Salvatore Sanfilippo
0b7691201e Merge pull request #3242 from whatacold/unstable
fix the wrong description of intsetGet().
2016-12-20 15:39:56 +01:00
Salvatore Sanfilippo
619317da6f Merge pull request #3696 from jstncarvalho/FixMissingBrackets_ZIP_DECODE_LENGTH
Fix missing brackets around encoding variable in ZIP_DECODE_LENGTH macro
2016-12-20 13:32:54 +01:00
antirez
0f72257049 Geo: fix GEOHASH return value for consistency.
The same thing observed in #3551 by gnethercutt also fixed for
GEOHASH as the original PR did.
2016-12-20 10:20:13 +01:00
antirez
913070a9e8 Geo: fix edge case return values for uniformity.
There were two cases outlined in issue #3512 and PR #3551 where
the Geo API returned unexpected results: empty strings where NULL
replies were expected, or a single null reply where an array was
expected. This violates the Redis principle that Redis replies for
existing keys or elements should be indistinguishable.

This is technically an API breakage so will be merged only into 4.0 and
specified in the changelog in the list of breaking compatibilities, even
if it is not very likely that actual code will be affected, hopefully,
since with the past behavior basically there was to acconut for *both*
the possibilities, and the new behavior is always one of the two, but
in a consistent way.
2016-12-20 10:12:38 +01:00
Justin Carvalho
7c64e88963 Fix missing brackets around encoding variable in ZIP_DECODE_LENGTH macro 2016-12-19 17:37:41 -05:00
antirez
074383f850 Remove first version of ASCII wave, later discarded. 2016-12-19 16:45:18 +01:00
antirez
06bfeb482d Only show Redis logo if logging to stdout / TTY.
You can still force the logo in the normal logs.
For motivations, check issue #3112. For me the reason is that actually
the logo is nice to have in interactive sessions, but inside the logs
kinda loses its usefulness, but for the ability of users to recognize
restarts easily: for this reason the new startup sequence shows a one
liner ASCII "wave" so that there is still a bit of visual clue.

Startup logging was modified in order to log events in more obvious
ways, and to log more events. Also certain important informations are
now more easy to parse/grep since they are printed in field=value style.

The option --always-show-logo in redis.conf was added, defaulting to no.
2016-12-19 16:41:47 +01:00
antirez
90a6f7fc98 adjustOpenFilesLimit() comment made hopefully more clear. 2016-12-19 08:53:29 +01:00
Salvatore Sanfilippo
2988889db1 Merge pull request #3603 from oranagra/adjustOpenFilesLimit_overflow
fix unsigned int overflow in adjustOpenFilesLimit
2016-12-19 08:48:44 +01:00
Salvatore Sanfilippo
ce9e36eb01 Merge pull request #3605 from hylepo/unstable
Fixing typo in the usage of redis-benchmark
2016-12-19 08:20:01 +01:00
Salvatore Sanfilippo
6cf1a325d6 Merge pull request #3643 from andyli028/unstable
Modify MIN->MAX
2016-12-19 08:19:10 +01:00
antirez
8e390a62ad Hopefully improve code comments for issue #3616.
This commit also contains other changes in order to conform the code to
the Redis core style, specifically 80 chars max per line, smart
conditionals in the same line:

    if (that) do_this();
2016-12-16 17:48:38 +01:00
Salvatore Sanfilippo
ca4ca5073e Merge pull request #3616 from oranagra/stop_aofrw_before_rdbload
CoW improvement, stop AOFRW before flushing and parsing slave RDB
2016-12-16 17:43:20 +01:00
Salvatore Sanfilippo
151af73118 Merge pull request #3661 from itamarhaber/module-doc2
Corrects a couple of omissions in the modules docs
2016-12-16 16:53:13 +01:00
antirez
87538cb7fe Switch PFCOUNT to LogLog-Beta algorithm.
The new algorithm provides the same speed with a smaller error for
cardinalities in the range 0-100k. Before switching, the new and old
algorithm behavior was studied in details in the context of
issue #3677. You can find a few graphs and motivations there.
2016-12-16 11:07:30 +01:00
antirez
0224be8811 Use llroundl() before converting loglog-beta output to integer.
Otherwise for small cardinalities the algorithm will output something
like, for example, 4.99 for a candinality of 5, that will be converted
to 4 producing a huge error.
2016-12-16 11:07:30 +01:00
Harish Murthy
c55e3fbae5 LogLog-Beta Algorithm support within HLL
Config option to use LogLog-Beta Algorithm for Cardinality
2016-12-16 11:07:30 +01:00
Salvatore Sanfilippo
5ad2a94a16 Merge pull request #3686 from dvirsky/fix_lowlevel_zrange
fixed stop condition in RM_ZsetRangeNext and RM_ZsetRangePrev
2016-12-16 09:20:47 +01:00
antirez
d634c36253 ziplist.c explanation of format improved a bit. 2016-12-16 09:04:57 +01:00
antirez
ac61f90625 DEBUG: new "ziplist" subcommand added. Dumps a ziplist on stdout.
The commit improves ziplistRepr() and adds a new debugging subcommand so
that we can trigger the dump directly from the Redis API.
This command capability was used while investigating issue #3684.
2016-12-16 09:02:50 +01:00
Dvir Volk
7f9b9512b8 fixed stop condition in RM_ZsetRangeNext and RM_ZsetRangePrev 2016-12-15 00:07:20 +02:00
antirez
b53e73e159 MIGRATE: Remove upfront ttl initialization.
After the fix for #3673 the ttl var is always initialized inside the
loop itself, so the early initialization is not needed.

Variables declaration also moved to a more local scope.
2016-12-14 12:43:55 +01:00
Salvatore Sanfilippo
c9f0456d81 Merge pull request #3673 from badboy/reset-ttl-on-migrating
Reset the ttl for additional keys
2016-12-14 12:41:00 +01:00
antirez
b6f871cf42 Writable slaves expires: fix leak in key tracking.
We need to use a dictionary type that frees the key, since we copy the
keys in the dictionary we use to track expires created in the slave
side.
2016-12-13 16:27:13 +01:00
antirez
d1adc85aa6 INFO: show num of slave-expires keys tracked. 2016-12-13 16:02:29 +01:00
antirez
5b9ba26403 Fix created->created typo in expire.c 2016-12-13 12:21:15 +01:00
antirez
04542cff92 Replication: fix the infamous key leakage of writable slaves + EXPIRE.
BACKGROUND AND USE CASEj

Redis slaves are normally write only, however the supprot a "writable"
mode which is very handy when scaling reads on slaves, that actually
need write operations in order to access data. For instance imagine
having slaves replicating certain Sets keys from the master. When
accessing the data on the slave, we want to peform intersections between
such Sets values. However we don't want to intersect each time: to cache
the intersection for some time often is a good idea.

To do so, it is possible to setup a slave as a writable slave, and
perform the intersection on the slave side, perhaps setting a TTL on the
resulting key so that it will expire after some time.

THE BUG

Problem: in order to have a consistent replication, expiring of keys in
Redis replication is up to the master, that synthesize DEL operations to
send in the replication stream. However slaves logically expire keys
by hiding them from read attempts from clients so that if the master did
not promptly sent a DEL, the client still see logically expired keys
as non existing.

Because slaves don't actively expire keys by actually evicting them but
just masking from the POV of read operations, if a key is created in a
writable slave, and an expire is set, the key will be leaked forever:

1. No DEL will be received from the master, which does not know about
such a key at all.

2. No eviction will be performed by the slave, since it needs to disable
eviction because it's up to masters, otherwise consistency of data is
lost.

THE FIX

In order to fix the problem, the slave should be able to tag keys that
were created in the slave side and have an expire set in some way.

My solution involved using an unique additional dictionary created by
the writable slave only if needed. The dictionary is obviously keyed by
the key name that we need to track: all the keys that are set with an
expire directly by a client writing to the slave are tracked.

The value in the dictionary is a bitmap of all the DBs where such a key
name need to be tracked, so that we can use a single dictionary to track
keys in all the DBs used by the slave (actually this limits the solution
to the first 64 DBs, but the default with Redis is to use 16 DBs).

This solution allows to pay both a small complexity and CPU penalty,
which is zero when the feature is not used, actually. The slave-side
eviction is encapsulated in code which is not coupled with the rest of
the Redis core, if not for the hook to track the keys.

TODO

I'm doing the first smoke tests to see if the feature works as expected:
so far so good. Unit tests should be added before merging into the
4.0 branch.
2016-12-13 10:59:54 +01:00
Yossi Gottlieb
b6ab4d04b6 Fix redis-cli rare crash.
This happens if the server (mysteriously) returns an unexpected response
to the COMMAND command.
2016-12-12 20:18:40 +02:00
Jan-Erik Rediger
2a32f0371e Reset the ttl for additional keys
Before, if a previous key had a TTL set but the current one didn't, the
TTL was reused and thus resulted in wrong expirations set.

This behaviour was experienced, when `MigrateDefaultPipeline` in
redis-trib was set to >1

Fixes #3655
2016-12-08 14:27:21 +01:00
wangshaonan
2d91fce970 Add '\n' to MEMORY DOCTOR command output message when num_reports
is 0 or empty is 1
2016-12-06 03:11:27 +00:00
itamar
94fe98666c Corrects a couple of omissions in the modules docs 2016-12-05 18:34:38 +02:00
antirez
16cce320c4 Modules: types doc updated to new API. 2016-12-05 14:40:51 +01:00
antirez
37b6e16ae1 Modules: API doc updated (auto generated). 2016-12-05 14:40:43 +01:00
antirez
3c85a88888 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-12-05 14:17:11 +01:00
antirez
001138aec3 Geo: fix computation of bounding box.
A bug was reported in the context in issue #3631. The root cause of the
bug was that certain neighbor boxes were zeroed after the "inside the
bounding box or not" check, simply because the bounding box computation
function was wrong.

A few debugging infos where enhanced and moved in other parts of the
code. A check to avoid steps=0 was added, but is unrelated to this
issue and I did not verified it was an actual bug in practice.
2016-12-05 14:02:32 +01:00
cbgbt
e5db99ad4a cli: Only print elapsed time on OUTPUT_STANDARD 2016-12-02 20:59:33 -08:00
Itamar Haber
5dc4fe1529 Verify pairs are provided after subcommands
Fixes https://github.com/antirez/redis/issues/3639
2016-12-02 18:19:36 +02:00
antirez
434e6b2da3 PSYNC2: Do not accept WAIT in slave instances.
No longer makes sense since writable slaves only do local writes now:
writes are no longer passed to sub-slaves in the stream.
2016-12-02 10:21:20 +01:00
Chris Lamb
6eb0c52d4c src/rdb.c: Correct "whenver" -> "whenever" typo. 2016-12-01 13:16:30 +01:00
Yossi Gottlieb
5f5b4f1508 Fix typo in RedisModuleTypeMethods declaration. 2016-11-30 22:05:59 +02:00
Salvatore Sanfilippo
3c4fe59e09 Merge pull request #3648 from dvirsky/fix_reply_crash
fix memory corruption on RM_FreeCallReply
2016-11-30 11:21:10 +01:00
antirez
71e8d15e49 Modules: change type registration API to use a struct of methods. 2016-11-30 11:14:01 +01:00
Dvir Volk
8521cde570 fix memory corruption on RM_FreeCallReply 2016-11-30 11:49:49 +02:00
antirez
6eb720ff2d PSYNC2: Minor memory leak reading -NOMASTERLINK master reply fixed. 2016-11-29 10:25:00 +01:00
andyli
8abf9729f0 Modify MIN->MAX 2016-11-29 16:34:41 +08:00
antirez
eab865a0a1 PSYNC2: stop sending newlines to sub-slaves when master is down.
This actually includes two changes:

1) No newlines to take the master-slave link up when the upstream master
is down. Doing this is dangerous because the sub-slave often is received
replication protocol for an half-command, so can't receive newlines
without desyncing the replication link, even with the code in order to
cancel out the bytes that PSYNC2 was using. Moreover this is probably
also not needed/sane, because anyway the slave can keep serving
requests, and because if it's configured to don't serve stale data, it's
a good idea, actually, to break the link.

2) When a +CONTINUE with a different ID is received, we now break
connection with the sub-slaves: they need to be notified as well. This
was part of the original specification but for some reason it was not
implemented in the code, and was alter found as a PSYNC2 bug in the
integration testing.
2016-11-28 17:54:04 +01:00
antirez
790310d894 Better protocol errors logging. 2016-11-25 10:55:16 +01:00
antirez
e09e31b12e PSYNC2: on transient error jump to error, not write_error. 2016-11-24 15:48:18 +01:00
antirez
1f55170b9c Modules: fix client blocking calls access to invalid struct field.
We already have reference to the client pointer, no need to access the
already freed structure.

Close #3634.
2016-11-24 11:05:19 +01:00
antirez
5b7d42fff3 PSYNC2: bugfixing pre release.
1. Master replication offset was cleared after switching configuration
to some other slave, since it was assumed you can't PSYNC after a
switch. Note the case anymore and when we successfully PSYNC we need to
have our offset untouched.

2. Secondary replication ID was not reset to "000..." pattern at
startup.

3. Master in error state replying -LOADING or other transient errors
forced the slave to discard the cached master and full resync. This is
now fixed.

4. Better logging of what's happening on failed PSYNCs.
2016-11-23 17:36:45 +01:00
Salvatore Sanfilippo
5b83fa482c Merge pull request #3612 from deep011/unstable
fix a possible bug for 'replconf getack'
2016-11-18 10:45:09 +01:00
antirez
8fb3ad2444 Merge branch 'psync2' into unstable 2016-11-17 09:37:03 +01:00
oranagra
e3a61950a2 when a slave loads an RDB, stop an AOFRW fork before flusing db and parsing rdb file, to avoid a CoW disaster. 2016-11-16 21:30:59 +02:00
antirez
cfdb3a2214 Cluster: handle zero bytes at the end of nodes.conf. 2016-11-16 14:13:18 +01:00
deep011
13a92a5bb1 fix a possible bug for 'replconf getack' 2016-11-16 11:04:33 +08:00
hylepo
dbb6cb442a Update redis-benchmark.c
Fixing typo in the usage of redis-benchmark
2016-11-11 10:33:48 +08:00
oranagra
a1a07225b3 fix unsigned int overflow in adjustOpenFilesLimit 2016-11-10 16:59:52 +02:00
antirez
28c96d73b2 PSYNC2: Save replication ID/offset on RDB file.
This means that stopping a slave and restarting it will still make it
able to PSYNC with the master. Moreover the master itself will retain
its ID/offset, in case it gets turned into a slave, or if a slave will
try to PSYNC with it with an exactly updated offset (otherwise there is
no backlog).

This change was possible thanks to PSYNC v2 that makes saving the current
replication state much simpler.
2016-11-10 12:35:29 +01:00
antirez
4e5e366ed2 PSYNC2: Wrap debugging code with if(0) 2016-11-09 15:37:15 +01:00
antirez
2669fb8364 PSYNC2: different improvements to Redis replication.
The gist of the changes is that now, partial resynchronizations between
slaves and masters (without the need of a full resync with RDB transfer
and so forth), work in a number of cases when it was impossible
in the past. For instance:

1. When a slave is promoted to mastrer, the slaves of the old master can
partially resynchronize with the new master.

2. Chained slalves (slaves of slaves) can be moved to replicate to other
slaves or the master itsef, without requiring a full resync.

3. The master itself, after being turned into a slave, is able to
partially resynchronize with the new master, when it joins replication
again.

In order to obtain this, the following main changes were operated:

* Slaves also take a replication backlog, not just masters.

* Same stream replication for all the slaves and sub slaves. The
replication stream is identical from the top level master to its slaves
and is also the same from the slaves to their sub-slaves and so forth.
This means that if a slave is later promoted to master, it has the
same replication backlong, and can partially resynchronize with its
slaves (that were previously slaves of the old master).

* A given replication history is no longer identified by the `runid` of
a Redis node. There is instead a `replication ID` which changes every
time the instance has a new history no longer coherent with the past
one. So, for example, slaves publish the same replication history of
their master, however when they are turned into masters, they publish
a new replication ID, but still remember the old ID, so that they are
able to partially resynchronize with slaves of the old master (up to a
given offset).

* The replication protocol was slightly modified so that a new extended
+CONTINUE reply from the master is able to inform the slave of a
replication ID change.

* REPLCONF CAPA is used in order to notify masters that a slave is able
to understand the new +CONTINUE reply.

* The RDB file was extended with an auxiliary field that is able to
select a given DB after loading in the slave, so that the slave can
continue receiving the replication stream from the point it was
disconnected without requiring the master to insert "SELECT" statements.
This is useful in order to guarantee the "same stream" property, because
the slave must be able to accumulate an identical backlog.

* Slave pings to sub-slaves are now sent in a special form, when the
top-level master is disconnected, in order to don't interfer with the
replication stream. We just use out of band "\n" bytes as in other parts
of the Redis protocol.

An old design document is available here:

https://gist.github.com/antirez/ae068f95c0d084891305

However the implementation is not identical to the description because
during the work to implement it, different changes were needed in order
to make things working well.
2016-11-09 15:37:15 +01:00
antirez
18d32c7e1c redis-cli typo fixed: perferences -> preferences.
Thanks to @qiaodaimadelaowang for signaling the issue.
Close #3585.
2016-11-02 15:15:49 +01:00
Salvatore Sanfilippo
fa2dc4b60c Merge pull request #3514 from charsyam/feature/simple-refactoring
Simple change just using slaves instead of server.slaves
2016-11-02 11:04:52 +01:00
Salvatore Sanfilippo
25811bc983 Merge pull request #3547 from yyoshiki41/refactor/redis-trib
Refactor redis-trib.rb
2016-11-02 11:02:32 +01:00
Salvatore Sanfilippo
b3e707339d Merge pull request #3575 from deep011/unstable
fix a bug for quicklistDup() function
2016-11-02 11:00:24 +01:00
Dvir Volk
ec8fd6e5e4 fixed sizeof in allocating io RedisModuleCtx* 2016-10-31 18:48:16 +02:00
Salvatore Sanfilippo
77b1abf185 Merge pull request #3565 from sunheehnus/bitfield-fix-highest_write_offset
bitops.c/bitfieldCommand: update higest_write_offset with check
2016-10-31 15:40:46 +01:00
Salvatore Sanfilippo
f48ca5581e Merge pull request #3573 from jybaek/module-io-context
Add missing fclose()
2016-10-31 15:36:38 +01:00
Guy Benoish
8b070b5d12 Fixed wrong sizeof(client) in object.c 2016-10-31 15:08:17 +02:00
deep
7f1bb22ef3 fix a bug for quicklistDup() function 2016-10-28 19:47:29 +08:00
jybaek
a06d59b583 Add missing fclose() 2016-10-28 10:42:54 +09:00
sunhe
949a274817 bitops.c/bitfieldCommand: update higest_write_offset with check 2016-10-22 01:54:46 +08:00
antirez
f39e7d4d7e Remove "Hey!" warning... 2016-10-19 10:43:40 +02:00
antirez
a9f50a389b Better target MacOS on __atomic macros conditional compilation. 2016-10-17 16:41:39 +02:00
Pedro Melo
2000abc86f Fixes compilation on MacOS 10.8.5, Clang tags/Apple/clang-421.0.57
Redis fails to compile on MacOS 10.8.5 with Clang 4, version 421.0.57
(based on LLVM 3.1svn).

When compiling zmalloc.c, we get these warnings:

        CC zmalloc.o
    zmalloc.c:109:5: warning: implicit declaration of function '__atomic_add_fetch' is invalid in C99 [-Wimplicit-function-declaration]
        update_zmalloc_stat_alloc(zmalloc_size(ptr));
        ^
    zmalloc.c:75:9: note: expanded from macro 'update_zmalloc_stat_alloc'
            atomicIncr(used_memory,__n,used_memory_mutex); \
            ^
    ./atomicvar.h:57:37: note: expanded from macro 'atomicIncr'
    #define atomicIncr(var,count,mutex) __atomic_add_fetch(&var,(count),__ATOMIC_RELAXED)
                                        ^
    zmalloc.c:145:5: warning: implicit declaration of function '__atomic_sub_fetch' is invalid in C99 [-Wimplicit-function-declaration]
        update_zmalloc_stat_free(oldsize);
        ^
    zmalloc.c:85:9: note: expanded from macro 'update_zmalloc_stat_free'
            atomicDecr(used_memory,__n,used_memory_mutex); \
            ^
    ./atomicvar.h:58:37: note: expanded from macro 'atomicDecr'
    #define atomicDecr(var,count,mutex) __atomic_sub_fetch(&var,(count),__ATOMIC_RELAXED)
                                        ^
    zmalloc.c:205:9: warning: implicit declaration of function '__atomic_load_n' is invalid in C99 [-Wimplicit-function-declaration]
            atomicGet(used_memory,um,used_memory_mutex);
            ^
    ./atomicvar.h:60:14: note: expanded from macro 'atomicGet'
        dstvar = __atomic_load_n(&var,__ATOMIC_RELAXED); \
                 ^
    3 warnings generated.

Also on lazyfree.c:

        CC lazyfree.o
    lazyfree.c:68:13: warning: implicit declaration of function '__atomic_add_fetch' is invalid in C99 [-Wimplicit-function-declaration]
                atomicIncr(lazyfree_objects,1,lazyfree_objects_mutex);
                ^
    ./atomicvar.h:57:37: note: expanded from macro 'atomicIncr'
    #define atomicIncr(var,count,mutex) __atomic_add_fetch(&var,(count),__ATOMIC_RELAXED)
                                        ^
    lazyfree.c:111:5: warning: implicit declaration of function '__atomic_sub_fetch' is invalid in C99 [-Wimplicit-function-declaration]
        atomicDecr(lazyfree_objects,1,lazyfree_objects_mutex);
        ^
    ./atomicvar.h:58:37: note: expanded from macro 'atomicDecr'
    #define atomicDecr(var,count,mutex) __atomic_sub_fetch(&var,(count),__ATOMIC_RELAXED)
                                        ^
    2 warnings generated.

Then in the linking stage:

        LINK redis-server
    Undefined symbols for architecture x86_64:
      "___atomic_add_fetch", referenced from:
          _zmalloc in zmalloc.o
          _zcalloc in zmalloc.o
          _zrealloc in zmalloc.o
          _dbAsyncDelete in lazyfree.o
          _emptyDbAsync in lazyfree.o
          _slotToKeyFlushAsync in lazyfree.o
      "___atomic_load_n", referenced from:
          _zmalloc_used_memory in zmalloc.o
          _zmalloc_get_fragmentation_ratio in zmalloc.o
      "___atomic_sub_fetch", referenced from:
          _zrealloc in zmalloc.o
          _zfree in zmalloc.o
          _lazyfreeFreeObjectFromBioThread in lazyfree.o
          _lazyfreeFreeDatabaseFromBioThread in lazyfree.o
          _lazyfreeFreeSlotsMapFromBioThread in lazyfree.o
    ld: symbol(s) not found for architecture x86_64
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    make[1]: *** [redis-server] Error 1
    make: *** [all] Error 2

With this patch, the compilation is sucessful, no warnings.

Running `make test` we get a almost clean bill of health. Test pass with
one exception:

    [err]: Check for memory leaks (pid 52793) in tests/unit/dump.tcl
    [err]: Check for memory leaks (pid 53103) in tests/unit/auth.tcl
    [err]: Check for memory leaks (pid 53117) in tests/unit/auth.tcl
    [err]: Check for memory leaks (pid 53131) in tests/unit/protocol.tcl
    [err]: Check for memory leaks (pid 53145) in tests/unit/protocol.tcl
    [ok]: Check for memory leaks (pid 53160)
    [err]: Check for memory leaks (pid 53175) in tests/unit/scan.tcl
    [ok]: Check for memory leaks (pid 53189)
    [err]: Check for memory leaks (pid 53221) in tests/unit/type/incr.tcl
    .
    .
    .

Full debug log (289MB, uncompressed) available at
https://dl.dropboxusercontent.com/u/75548/logs/redis-debug-log-macos-10.8.5.log.xz

Most if not all of the memory leak tests fail. Not sure if this is
related. They are the only ones that fail. I belive they are not related,
but just the memory leak detector is not working properly on 10.8.5.

Signed-off-by: Pedro Melo <melo@simplicidade.org>
2016-10-17 14:58:23 +01:00
antirez
c7a4e694ad SWAPDB command.
This new command swaps two Redis databases, so that immediately all the
clients connected to a given DB will see the data of the other DB, and
the other way around. Example:

    SWAPDB 0 1

This will swap DB 0 with DB 1. All the clients connected with DB 0 will
immediately see the new data, exactly like all the clients connected
with DB 1 will see the data that was formerly of DB 0.

MOTIVATION AND HISTORY
---

The command was recently demanded by Pedro Melo, but was suggested in
the past multiple times, and always refused by me.

The reason why it was asked: Imagine you have clients operating in DB 0.
At the same time, you create a new version of the dataset in DB 1.
When the new version of the dataset is available, you immediately want
to swap the two views, so that the clients will transparently use the
new version of the data. At the same time you'll likely destroy the
DB 1 dataset (that contains the old data) and start to build a new
version, to repeat the process.

This is an interesting pattern, but the reason why I always opposed to
implement this, was that FLUSHDB was a blocking command in Redis before
Redis 4.0 improvements. Now we have FLUSHDB ASYNC that releases the
old data in O(1) from the point of view of the client, to reclaim memory
incrementally in a different thread.

At this point, the pattern can really be supported without latency
spikes, so I'm providing this implementation for the users to comment.
In case a very compelling argument will be made against this new command
it may be removed.

BEHAVIOR WITH BLOCKING OPERATIONS
---

If a client is blocking for a list in a given DB, after the swap it will
still be blocked in the same DB ID, since this is the most logical thing
to do: if I was blocked for a list push to list "foo", even after the
swap I want still a LPUSH to reach the key "foo" in the same DB in order
to unblock.

However an interesting thing happens when a client is, for instance,
blocked waiting for new elements in list "foo" of DB 0. Then the DB
0 and 1 are swapped with SWAPDB. However the DB 1 happened to have
a list called "foo" containing elements. When this happens, this
implementation can correctly unblock the client.

It is possible that there are subtle corner cases that are not covered
in the implementation, but since the command is self-contained from the
POV of the implementation and the Redis core, it cannot cause anything
bad if not used.

Tests and documentation are yet to be provided.
2016-10-14 15:28:04 +02:00
antirez
a3b3ca7c21 Modules: use RedisModule_AbortBlock() in the example. 2016-10-13 17:00:45 +02:00
antirez
95c17c0cb2 Modules: AbortBlock() API implemented. 2016-10-13 16:57:40 +02:00
antirez
58601c8f7d Modules: blocking API documented. 2016-10-13 16:57:28 +02:00
antirez
553aa0e259 module.c: trim comment to 80 cols. 2016-10-13 12:48:36 +02:00
antirez
870274bea8 Example modules: remove warnings about types and not used args. 2016-10-13 12:43:18 +02:00
jybaek
c76b9b644c Remove Duplicate Processing 2016-10-13 15:17:07 +09:00
yyoshiki41
16f65068b0 Refactor redis-trib.rb 2016-10-10 01:13:20 +09:00
antirez
7dde8bf3ab Modules: blocking command example added. 2016-10-07 16:35:06 +02:00
antirez
34599691b3 Modules: fixes to the blocking commands API: examples now works. 2016-10-07 16:34:40 +02:00
antirez
f156038db8 Modules: RM_Milliseconds() API added. 2016-10-07 16:34:19 +02:00
antirez
ffb00fbcbe Modules: blocking commands WIP: API exported, a first example. 2016-10-07 13:48:14 +02:00
antirez
3aa816e61a Modules: introduce warning suppression macro for unused args. 2016-10-07 13:10:31 +02:00
antirez
3879923db8 Enable warning in example modules Makefile. 2016-10-07 13:07:13 +02:00
antirez
8fadfe52a2 Module: API to block clients with threading support.
Just a draft to align the main ideas, never executed code. Compiles.
2016-10-07 11:55:35 +02:00
antirez
a5998d1fda Fix typos in GetContextFromIO API declaration. 2016-10-06 18:26:04 +02:00
antirez
799208de85 Fix name of mispelled function. 2016-10-06 17:10:47 +02:00
antirez
152c1b6802 Module: Ability to get context from IO context.
It was noted by @dvirsky that it is not possible to use string functions
when writing the AOF file. This sometimes is critical since the command
rewriting may need to be built in the context of the AOF callback, and
without access to the context, and the limited types that the AOF
production functions will accept, this can be an issue.

Moreover there are other needs that we can't anticipate regarding the
ability to use Redis Modules APIs using the context in order to build
representations to emit AOF / RDB.

Because of this a new API was added that allows the user to get a
temporary context from the IO context. The context is auto released
if obtained when the RDB / AOF callback returns.

Calling multiple time the function to get the context, always returns
the same one, since it is invalid to have more than a single context.
2016-10-06 17:09:26 +02:00
antirez
72279e3ea4 Copyright notice added to module.c. 2016-10-06 08:48:21 +02:00
antirez
3dc84c5300 Modules: API to save/load single precision floating point numbers.
When double precision is not needed, to take 2x space in the
serialization is not good.
2016-10-03 00:08:35 +02:00
antirez
a1b1fd4f39 Modules: API to log from module I/O callbacks. 2016-10-02 16:51:37 +02:00
antirez
4674efdee2 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-10-02 16:50:37 +02:00
antirez
0d9febf6a0 Add compiler optimizations to example module makefile. 2016-10-02 11:01:36 +02:00
antirez
6782e774f1 debug.c: include dlfcn.h regardless of BACKTRACE support. 2016-09-27 00:29:47 +02:00
antirez
2564031a15 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-09-26 09:10:52 +02:00
antirez
6d9f8e2462 Security: CONFIG SET client-output-buffer-limit overflow fixed.
This commit fixes a vunlerability reported by Cory Duplantis
of Cisco Talos, see TALOS-2016-0206 for reference.

CONFIG SET client-output-buffer-limit accepts as client class "master"
which is actually only used to implement CLIENT KILL. The "master" class
has ID 3. What happens is that the global structure:

    server.client_obuf_limits[class]

Is accessed with class = 3. However it is a 3 elements array, so writing
the 4th element means to write up to 24 bytes of memory *after* the end
of the array, since the structure is defined as:

    typedef struct clientBufferLimitsConfig {
        unsigned long long hard_limit_bytes;
        unsigned long long soft_limit_bytes;
        time_t soft_limit_seconds;
    } clientBufferLimitsConfig;

EVALUATION OF IMPACT:

Checking what's past the boundaries of the array in the global
'server' structure, we find AOF state fields:

    clientBufferLimitsConfig client_obuf_limits[CLIENT_TYPE_OBUF_COUNT];
    /* AOF persistence */
    int aof_state;                  /* AOF_(ON|OFF|WAIT_REWRITE) */
    int aof_fsync;                  /* Kind of fsync() policy */
    char *aof_filename;             /* Name of the AOF file */
    int aof_no_fsync_on_rewrite;    /* Don't fsync if a rewrite is in prog. */
    int aof_rewrite_perc;           /* Rewrite AOF if % growth is > M and... */
    off_t aof_rewrite_min_size;     /* the AOF file is at least N bytes. */
    off_t aof_rewrite_base_size;    /* AOF size on latest startup or rewrite. */
    off_t aof_current_size;         /* AOF current size. */

Writing to most of these fields should be harmless and only cause problems in
Redis persistence that should not escalate to security problems.
However unfortunately writing to "aof_filename" could be potentially a
security issue depending on the access pattern.

Searching for "aof.filename" accesses in the source code returns many different
usages of the field, including using it as input for open(), logging to the
Redis log file or syslog, and calling the rename() syscall.

It looks possible that attacks could lead at least to informations
disclosure of the state and data inside Redis. However note that the
attacker must already have access to the server. But, worse than that,
it looks possible that being able to change the AOF filename can be used
to mount more powerful attacks: like overwriting random files with AOF
data (easily a potential security issue as demostrated here:
http://antirez.com/news/96), or even more subtle attacks where the
AOF filename is changed to a path were a malicious AOF file is loaded
in order to exploit other potential issues when the AOF parser is fed
with untrusted input (no known issue known currently).

The fix checks the places where the 'master' class is specifiedf in
order to access configuration data structures, and return an error in
this cases.

WHO IS AT RISK?

The "master" client class was introduced in Redis in Jul 28 2015.
Every Redis instance released past this date is not vulnerable
while all the releases after this date are. Notably:

    Redis 3.0.x is NOT vunlerable.
    Redis 3.2.x IS vulnerable.
    Redis unstable is vulnerable.

In order for the instance to be at risk, at least one of the following
conditions must be true:

    1. The attacker can access Redis remotely and is able to send
       the CONFIG SET command (often banned in managed Redis instances).

    2. The attacker is able to control the "redis.conf" file and
       can wait or trigger a server restart.

The problem was fixed 26th September 2016 in all the releases affected.
2016-09-26 08:47:52 +02:00
charsyam
ca6fc4f031 Simple change just using slaves instead of server.slaves 2016-09-24 15:53:57 +09:00
Dvir Volk
a91650fc57 added RM_CreateStringPrintf 2016-09-21 12:30:38 +03:00
antirez
670586715a dict.c: fix dictGenericDelete() return ASAP condition.
Recently we moved the "return ASAP" condition for the Delete() function
from checking .size to checking .used, which is smarter, however while
testing the first table alone always works to ensure the dict is totally
emtpy, when we test the .size field, testing .used requires testing both
T0 and T1, since a rehashing could be in progress.
2016-09-20 17:22:30 +02:00
antirez
e9d861ec69 Clear child data when opening the pipes.
This is important both to reset the magic to 0, so that it will not
match if the structure is not explicitly set, and to initialize other
things we may add like counters and such.
2016-09-19 14:11:17 +02:00
antirez
e565632e59 Child -> Parent pipe for COW info transferring. 2016-09-19 13:45:20 +02:00
antirez
e1eccf9a6b zmalloc: Make fp var non local to fix build. 2016-09-19 10:34:39 +02:00
antirez
945a2f948e zmalloc: zmalloc_get_smap_bytes_by_field() modified to work for any PID.
The goal is to get copy-on-write amount of the child from the parent.
2016-09-19 10:28:42 +02:00
antirez
b13759e90a redis-cli: "allocator-stats" -> "malloc-stats".
It was changed in Redis but not in redis-cli.
Thanks to @oranagra for signaling.
2016-09-19 09:47:35 +02:00
antirez
4263b12147 Typo fixed from MEMORY DOCTOR output. 2016-09-16 16:52:00 +02:00
antirez
8a00ffc0e6 Surround allocator name with quotes in MEMORY DOCTOR output. 2016-09-16 16:40:25 +02:00
antirez
44e714a59c MEMORY DOCTOR initial implementation. 2016-09-16 16:36:53 +02:00
antirez
d9325ac6c8 Provide percentage of memory peak used info. 2016-09-16 10:43:19 +02:00
oranagra
309c2bcd1b add zmalloc used mem to DEBUG SDSLEN 2016-09-16 10:29:27 +02:00
antirez
78f35f8d2c Memory related subcommands of DEBUG moved to MEMORY. 2016-09-16 10:26:23 +02:00
antirez
123891dbbf Group MEMORY command related APIs together in the source code. 2016-09-16 10:12:04 +02:00
antirez
adcfb77b5b objectComputeSize(): skiplist nodes have different sizes.
The size of the node depends on the node level, however it is not stored
into the node itself, is an implicit information, so we use
zmalloc_size() in order to compute the sorted set size.
2016-09-15 17:43:13 +02:00
antirez
e9629e148b MEMORY command: HELP + dataset percentage (like in INFO). 2016-09-15 17:33:16 +02:00
antirez
5443726d4d MEMORY USAGE: SAMPLES option added + fixes to size computation.
The new SAMPLES option is added, defaulting to 5, and with 0 being a
special value to scan the whole set of elements.

Fixes to the object size computation were made since the original PR
assumed data structures still contaning robj structures, while now after
the lazyfree changes, are all SDS strings.
2016-09-15 15:25:14 +02:00
antirez
7229af3898 INFO: new memory reporting fields added. 2016-09-15 10:33:23 +02:00
antirez
bf2624ea99 C struct memoh renamed redisMemOverhead. API prototypes added. 2016-09-15 09:44:07 +02:00
antirez
be5439bde3 MEMORY OVERHEAD refactored into a generic API. 2016-09-15 09:37:55 +02:00
antirez
09a50d34a2 dict.c: dictReplaceRaw() -> dictAddOrFind().
What they say about "naming things" in programming?
2016-09-14 16:43:38 +02:00
antirez
041ab04419 Trim comment to 80 cols. 2016-09-14 16:41:05 +02:00
antirez
a636aeac07 Apply the new dictUnlink() where possible.
Optimizations suggested and originally implemented by @oranagra.
Re-applied by @antirez using the modified API.
2016-09-14 16:37:53 +02:00
oranagra
afcbcc0e58 dict.c: introduce dictUnlink().
Notes by @antirez:

This patch was picked from a larger commit by Oran and adapted to change
the API a bit. The basic idea is to avoid double lookups when there is
to use the value of the deleted entry.

BEFORE:

    entry = dictFind( ... ); /* 1st lookup. */
    /* Do somethjing with the entry. */
    dictDelete(...);         /* 2nd lookup. */

AFTER:

    entry = dictUnlink( ... ); /* 1st lookup. */
    /* Do somethjing with the entry. */
    dictFreeUnlinkedEntry(entry); /* No lookups!. */
2016-09-14 12:18:59 +02:00
antirez
8c84c962cf MEMORY OVERHEAD implemented (using Oran Agra initial implementation).
This code was extracted from @oranagra PR #3223 and modified in order
to provide only certain amounts of information compared to the original
code. It was also moved from DEBUG to the newly introduced MEMORY
command. Thanks to Oran for the implementation and the PR.

It implements detailed memory usage stats that can be useful in both
provisioning and troubleshooting memory usage in Redis.
2016-09-13 17:39:25 +02:00
antirez
89dec6921d objectComputeSize(): estimate collections sampling N elements.
For most tasks, we need the memory estimation to be O(1) by default.
This commit also implements an initial MEMORY command.
Note that objectComputeSize() takes the number of samples to check as
argument, so MEMORY should be able to get the sample size as option
to make precision VS CPU tradeoff tunable.

Related to: PR #3223.
2016-09-13 10:28:23 +02:00
oranagra
8c24325f8f Adding objectComputeSize() function. 2016-09-12 16:36:59 +02:00
oranagra
68bf45fa1e Optimize repeated keyname hashing.
(Change cherry-picked and modified by @antirez from a larger commit
provided by @oranagra in PR #3223).
2016-09-12 13:19:05 +02:00
Salvatore Sanfilippo
d680eb6dbd Merge pull request #3492 from wyxustcsa09/fix-memory
fix memory error on module unload
2016-09-09 16:05:06 +02:00
antirez
c6dc8d5288 Merge branch 'unstable' of github.com:antirez/redis into unstable 2016-09-09 16:01:43 +02:00
antirez
56dba3adcc Example modules: Add C99 standard to cflags. 2016-09-09 16:01:29 +02:00
antirez
3793afa0ba Merge branch 'aofrdb' into unstable 2016-09-09 15:03:21 +02:00
antirez
f9624813af fix the fix for the TCP binding.
This commit attempts to fix a problem with PR #3467.
2016-09-09 14:59:48 +02:00
oranagra
92038286e8 fix tcp binding when IPv6 is unsupported 2016-09-09 14:59:21 +02:00
antirez
d35deb2327 debug.c: no need to define _GNU_SOURCE, is defined in fmacros.h. 2016-09-09 11:15:10 +02:00
antirez
6211e77ab6 crash log - improve code dump with more info and called symbols. 2016-09-09 11:00:19 +02:00
wyx
f9c9b4bf4c fix memory error on module unload 2016-09-09 10:22:57 +08:00
oranagra
24811fcb1b crash log - add hex dump of function code 2016-09-08 14:14:57 +02:00
antirez
0d179d17ba dict.c benchmark minor improvements. 2016-09-07 15:28:40 +02:00
antirez
bd6c4cade6 dict.c benchmark: mixed del/insert benchmark. 2016-09-07 12:34:53 +02:00
antirez
0f708ab2a9 dict.c benchmark: finish rehashing before testing lookups. 2016-09-07 11:06:03 +02:00
antirez
ed6a4517f5 dict.c benchmark improvements. 2016-09-07 10:53:47 +02:00
antirez
1074f73629 dict.c benchmark: take optional count argument. 2016-09-07 10:44:29 +02:00