4488 Commits

Author SHA1 Message Date
Itamar Haber
bd5af03dbd Adds help to CLUSTER command 2017-12-03 19:05:10 +02:00
Itamar Haber
ee3884e63c Improve slowlog help 2017-12-03 17:39:52 +02:00
Itamar Haber
d884ba4bc9 Helps CLIENT 2017-12-03 16:49:29 +02:00
antirez
65a9740fa8 Fix loading of RDB files lua AUX fields when the script is defined.
In the case of slaves loading the RDB from master, or in other similar
cases, the script is already defined, and the function registering the
script should not fail in the assert() call.
2017-12-01 16:01:10 +01:00
antirez
8ac76be5f2 Streams: DEBUG DIGEST support. 2017-12-01 15:04:05 +01:00
antirez
f42df6f43a Streams: add code to compute the stream memory usage.
It's a bit of black magic without actually tracking it inside rax.c,
however Redis usage of the radix tree for the stream data structure is
quite consistent, so a few magic constants apparently are producing
results that make sense.
2017-12-01 12:50:27 +01:00
antirez
115d076d65 Streams: fix lp-count field for non-same-fields entries. 2017-12-01 10:24:25 +01:00
antirez
9bb18e5438 Streams: XRANGE REV option -> XREVRANGE command. 2017-12-01 10:24:25 +01:00
antirez
9dc79c039a Streams: fix reverse iterator discarding of items out of range. 2017-12-01 10:24:25 +01:00
antirez
6919280cc5 Streams: fix reverse iteration next node jumping. 2017-12-01 10:24:25 +01:00
antirez
ee3490ec48 Streams: state machine for reverse iteration WIP 1. 2017-12-01 10:24:25 +01:00
antirez
3c5d773f82 Streams: augment stream entries to allow backward scanning. 2017-12-01 10:24:25 +01:00
antirez
0381931b4c Streams: Update listpack to fix 32bit strings encoding error.
Note that streams produced by XADD in previous broken versions having
elements with 4096 bytes or more will be permanently broken and must be
created again from scratch.

Fix #4428
Fix #4349
2017-12-01 10:24:24 +01:00
antirez
020fe26bd6 Streams: fix COUNT parsing, issue #4433. 2017-12-01 10:24:24 +01:00
antirez
abab0b7817 Streams: fix redis-cli to understand the stream type. 2017-12-01 10:24:24 +01:00
antirez
671b1f6a9d Streams: fix TYPE for stream type. 2017-12-01 10:24:24 +01:00
antirez
5082ec6419 Streams: move ID ms/seq separator from '.' to '-'
After checking with the community via Twitter (here:
https://twitter.com/antirez/status/915130876861788161) the verdict was to
use ":". However, I later realized, after users complained that IDs are
hard to copy with just a double click, that this was the reason
why I moved to "." in the first place. Fortunately "-", which was the
other option with the most votes, also gets selected with a double click
in most terminal applications on Linux and macOS.

So my reasoning was:

1) We can't retain "." because it's actually confusing to newcomers: it
looks like a floating point number, and people may be tricked into
thinking they can order IDs numerically as floats.

2) Moving to a double-click-to-select format is much better. People will
work with such IDs for a long time when coding / debugging. Why make
a choice now that will hurt this for years to come?

The only other viable option was "-", and that's what I did. Thanks.
2017-12-01 10:24:24 +01:00
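The float-ordering pitfall described in this commit message can be illustrated with a minimal Python sketch. `parse_stream_id` is a hypothetical helper, not part of Redis, and the sample IDs are made up; the sketch only assumes that an ID has the form `<ms>-<seq>`.

```python
def parse_stream_id(stream_id: str) -> tuple:
    """Split a '<ms>-<seq>' ID into an (ms, seq) pair of integers."""
    ms, seq = stream_id.split("-", 1)
    return int(ms), int(seq)


# Two made-up IDs generated in the same millisecond.
a = "1512123865000-9"
b = "1512123865000-10"

# Comparing the integer pairs gives the intended order: seq 9 comes before 10.
ordered = parse_stream_id(a) < parse_stream_id(b)

# Reading the old "." form as a float gets it wrong, because .9 > .10.
float_ordered = float(a.replace("-", ".")) < float(b.replace("-", "."))
```

Here `ordered` is true while `float_ordered` is false, which is the confusion the "." separator invited.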
antirez
50595a5889 Streams: fix XADD + MAXLEN propagation due to var shadowing.
Clang should emit warnings by default when a variable shadows another one
with the same name. GCC does this, which helps avoid bugs like this one.
2017-12-01 10:24:24 +01:00
antirez
a4e6aae6b8 Streams: fix memory leak in streamTrimByLength(). 2017-12-01 10:24:24 +01:00
antirez
0248a6b125 Streams: fix streamTrimByLength() standalone items skipping. 2017-12-01 10:24:24 +01:00
antirez
0540803288 Streams: XADD MAXLEN implementation.
The core of this change is the implementation of stream trimming, and
the resulting MAXLEN option of XADD as a trivial result of having
trimming functionalities. MAXLEN already works but in order to be more
efficient listpack GC should be implemented, currently marked as a TODO
item inside the comments.
2017-12-01 10:24:24 +01:00
antirez
0c00fd7834 Streams: reduce listpack max size to 2k to speedup range queries.
Listpack max size is a tradeoff between space and time. A 2k max entry
puts the memory usage approximately at a similar order of magnitude (5
million entries went from 96 to 120 MB), but the range queries speed
doubled (because there are half entries to scan in the average case).

Lower values could be considered, or maybe this parameter should be
made tunable.
2017-12-01 10:24:24 +01:00
antirez
f24d3a7de0 Streams: delta encode IDs based on key. Add count + deleted fields.
We used to have the master ID stored at the start of the listpack,
however using the key directly makes more sense in order to create a
space efficient representation: anyway the key at the radix tree is very
unlikely to change because of how the stream is implemented. Moreover on
nodes merging, to rewrite the merged listpacks is anyway the most
sensible operation, and we can use the iterator and the append-to-stream
function in order to avoid re-implementing the code needed for merging.

This commit also adds two items at the start of the listpack: the
number of valid items inside the listpack, and the number of items
marked as deleted. This means that there is no need to scan a listpack
in order to understand if it's a good candidate for garbage collection,
that is, if the ratio between valid/deleted items triggers the GC.
2017-12-01 10:24:24 +01:00
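With the two counters stored up front, a garbage collection check needs no scan of the listpack. The following Python sketch shows the idea only; `is_gc_candidate` and the `max_deleted_ratio` threshold are hypothetical, since the commit message does not state which ratio would trigger the GC.

```python
def is_gc_candidate(valid: int, deleted: int, max_deleted_ratio: float = 0.5) -> bool:
    """Decide from the two header counters alone whether to rewrite a listpack.

    max_deleted_ratio is an illustrative threshold, not a value from Redis.
    """
    total = valid + deleted
    if total == 0:
        return False  # nothing stored, nothing to collect
    return deleted / total > max_deleted_ratio
```

The check is O(1) because it reads only the two header fields and never walks the entries.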
antirez
cea421a021 Streams: specify better how the master entry works. 2017-12-01 10:24:24 +01:00
antirez
3f2d7e277e Streams: items compression implemented.
The approach used is to set a fixed header at the start of every
listpack blob (that contains many entries). The header contains a
"master" ID and fields, that are initially just obtained from the first
entry inserted in the listpack, so that the first entry is always well
compressed. Later every new entry is checked against these fields, and
if it matches, the SAMEFIELD flag is set in the entry so that we know to
just use the master entry flags. The IDs are always delta-encoded
against the first entry. This approach avoids cascading effects in which
entries are encoded depending on the previous entries, in order to avoid
complexity and rewritings of the data when data is removed in the middle
(which is a planned feature).
2017-12-01 10:24:24 +01:00
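The scheme described in this commit message (a master entry per listpack, a same-fields flag, and IDs delta-encoded against the first entry) can be sketched in Python as follows. This is an illustration with plain dicts and lists, not the binary listpack layout; `encode_entries`, `decode_entries` and the key names are hypothetical.

```python
def encode_entries(entries):
    """entries: list of ((ms, seq), {field: value}) pairs in insertion order."""
    (master_ms, master_seq), master_fields = entries[0]
    master_names = list(master_fields)
    header = {"master_id": (master_ms, master_seq), "master_fields": master_names}
    encoded = []
    for (ms, seq), fields in entries:
        same = list(fields) == master_names
        item = {
            "same_fields": same,  # plays the role of the SAMEFIELD flag
            # Always a delta against the master ID, never against the previous entry.
            "id_delta": (ms - master_ms, seq - master_seq),
            "values": list(fields.values()),
        }
        if not same:
            item["fields"] = list(fields)  # field names stored only when they differ
        encoded.append(item)
    return header, encoded


def decode_entries(header, encoded):
    """Rebuild the original entries from the header and the encoded items."""
    master_ms, master_seq = header["master_id"]
    out = []
    for item in encoded:
        names = header["master_fields"] if item["same_fields"] else item["fields"]
        d_ms, d_seq = item["id_delta"]
        out.append(((master_ms + d_ms, master_seq + d_seq),
                    dict(zip(names, item["values"]))))
    return out
```

Because every entry depends only on the header, removing an entry in the middle does not force any other entry to be rewritten, which is the property the commit message is after.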
antirez
8f00cf85a7 Streams: fixed memory leaks when blocking again for same stream.
blockForKeys() was not freeing the allocation holding the ID when the
key was already found busy. Fortunately the unit test checked explicitly
for blocking multiple times for the same key (copying a regression in
the blocking lists tests), so the bug was detected by the Redis test leak
checker.
2017-12-01 10:24:24 +01:00
antirez
26d4f8e3ec Streams: AOF rewriting + minor iterator improvements. 2017-12-01 10:24:24 +01:00
antirez
01ea018c40 Streams: export iteration API. 2017-12-01 10:24:24 +01:00
antirez
9ed40f0fc3 Streams: implement streamReplyWithRange() in terms of the iterator. 2017-12-01 10:24:24 +01:00
antirez
a58733cacf Streams: stream iteration refactoring, WIP 2. 2017-12-01 10:24:24 +01:00
antirez
b1ec333633 Streams: stream iteration refactoring, WIP 1. 2017-12-01 10:24:24 +01:00
antirez
1a603e1a87 Streams: fix bug in XREAD last received ID processing. 2017-12-01 10:24:24 +01:00
antirez
94af55c5ea Streams: fix memory leak in freeStream(). 2017-12-01 10:24:24 +01:00
antirez
3a0b78bc52 Streams: rewrite XADD ID argument for AOF/slaves. 2017-12-01 10:24:24 +01:00
antirez
19b06935d5 Streams: fix XADD API and keyspace notifications.
XADD was suboptimal in the first incarnation of the command, not being
able to accept an ID (very useful for replication), nor options for
having capped streams.

The keyspace notification for streams was not implemented.
2017-12-01 10:24:24 +01:00
antirez
db89f7474d Streams: When XREAD blocks without COUNT, set a default one.
A client may spend a long time between invocations of blocking XREAD,
for example because it is processing the messages or for any other
reason. When it comes back, it may provide a message ID low enough that
the server would block while sending an unreasonable number of messages
in a single call. For this reason we set a COUNT when the client is
blocked with XREAD calls, even if no COUNT is given. This is arbitrarily
set to 1000 because it's high enough to avoid slowing down the reception
of many messages, but low enough to avoid blocking.
2017-12-01 10:24:24 +01:00
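The rule in this commit message amounts to a small defaulting step. A minimal Python sketch follows; the function and constant names are hypothetical, and only the value 1000 comes from the message.

```python
DEFAULT_BLOCKING_COUNT = 1000  # value stated in the commit message


def effective_count(requested_count, blocking: bool):
    """Return the COUNT to apply, or None for 'no limit'.

    requested_count is None when the client gave no COUNT option.
    """
    if requested_count is not None:
        return requested_count  # an explicit COUNT always wins
    return DEFAULT_BLOCKING_COUNT if blocking else None
```

An explicit COUNT is never overridden; the default only applies to blocking calls that did not specify one.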
antirez
c128190026 Streams: fix handleClientsBlockedOnKeys() access to invalid ID. 2017-12-01 10:24:24 +01:00
antirez
6468cb2e82 Streams: fix XREAD ready-key signaling.
With lists we need to signal only on key creation, but streams can
provide data to clients listening at every new item added.
To make this slightly more efficient we now track different classes of
blocked clients to avoid signaling keys when there is nobody listening.
A typical case is when the stream is used as a time series DB and
accessed only by range with XRANGE.
2017-12-01 10:24:24 +01:00
antirez
b5be5093fe Streams: fix XREAD timeout handling, zero is valid. 2017-12-01 10:24:24 +01:00
antirez
2cacdcd6f8 Streams: XREAD related code to serve blocked clients. 2017-12-01 10:24:24 +01:00
antirez
0adb43b68f Streams: XREAD ability to block fixed. 2017-12-01 10:24:24 +01:00
antirez
6a1c92d52d Streams: synchronous xread fixes and improvements. 2017-12-01 10:24:24 +01:00
antirez
a7d898334a Streams: XREAD get-key method fixed. 2017-12-01 10:24:24 +01:00
antirez
110041825c Streams: XREAD get-keys method. 2017-12-01 10:24:24 +01:00
antirez
fa61720d30 Streams: XREAD, first draft. Handling of blocked clients still missing. 2017-12-01 10:24:24 +01:00
antirez
e65b4825f0 Streams: XREAD arguments parsing. 2017-12-01 10:24:24 +01:00
antirez
4086dff477 Streams: augment client.bpop with XREAD specific fields. 2017-12-01 10:24:24 +01:00
antirez
f80dfbf464 Streams: more internal preparation for blocking XREAD. 2017-12-01 10:24:24 +01:00
antirez
4a377cecd8 Streams: initial work to use blocking lists logic for streams XREAD. 2017-12-01 10:24:24 +01:00
antirez
439120c620 Streams: implement stream object release. 2017-12-01 10:24:24 +01:00
antirez
ec9bbe96bf Streams: XLEN command. 2017-12-01 10:24:24 +01:00
antirez
98d184db12 Streams: Save stream->length in RDB. 2017-12-01 10:24:24 +01:00
antirez
cd18f06e9c Streams: change listpack allocator to zmalloc. 2017-12-01 10:24:24 +01:00
antirez
edd70c1993 Streams: RDB loading. RDB saving modified.
After a few attempts it looked much saner to just add the last item ID
at the end of the serialized listpacks, instead of scanning the last
listpack loaded from head to tail just to fetch it. It's basically a
disk space vs. CPU-and-simplicity tradeoff.
2017-12-01 10:24:24 +01:00
antirez
485014cc74 Streams: RDB saving. 2017-12-01 10:24:24 +01:00
antirez
100d43c1ac Streams: assign value of 6 to OBJ_STREAM + some refactoring. 2017-12-01 10:24:24 +01:00
antirez
79866a6361 Streams: 12 commits squashed into the initial Streams implementation. 2017-12-01 10:24:24 +01:00
antirez
045d65c3af PSYNC2: Fix off by one buffer size in luaCreateFunction(). 2017-11-30 18:38:29 +01:00
antirez
452ad2e928 PSYNC2: just store script bodies into RDB.
Related to #4483. As suggested by @soloestoy, we can retrieve the SHA1
from the body. Given that the new implementation using AUX fields
ended up copying data around a lot to create new objects and strings,
take this idea to the extreme and trade CPU for space inside the RDB file.
2017-11-30 18:38:26 +01:00
antirez
28dfdca733 PSYNC2: luaCreateFunction() should handle NULL client parameter.
See #4483. This is needed because luaCreateFunction() is now called
from RDB loading code outside a client context.
2017-11-30 18:37:52 +01:00
antirez
f11a7585a8 PSYNC2: Save Lua scripts state into RDB file.
This is currently needed in order to fix #4483, but this can be
useful in other contexts, so maybe later we may want to remove the
conditionals and always save/load scripts.

Note that we are using the "lua" AUX field here, in order to guarantee
backward compatibility of the RDB file. The unknown AUX fields must be
discarded by past versions of Redis.
2017-11-30 18:37:52 +01:00
antirez
3b9be93fda Prevent corruption of server.executable after DEBUG RESTART.
Doing the following ended with a broken server.executable:

1. Start Redis with src/redis-server
2. Send CONFIG SET DIR /tmp/
3. Send DEBUG RESTART

At this point we called execve with an argv[0] that is no longer related
to the new path. So after the restart the absolute path of the
executable is recomputed in the wrong way. With this fix we pass the
absolute path already computed as argv[0].
2017-11-30 18:30:06 +01:00
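The failure mode can be reproduced with plain path arithmetic. In this Python sketch the directory `/home/user/redis` is a made-up starting directory and `resolve_executable` is a hypothetical stand-in for the path computation; only `src/redis-server` and `/tmp` come from the steps listed in the commit message.

```python
import posixpath


def resolve_executable(argv0: str, cwd: str) -> str:
    """Compute an absolute path for argv0 as seen from the directory cwd."""
    return posixpath.normpath(posixpath.join(cwd, argv0))


start_dir = "/home/user/redis"  # hypothetical directory Redis was started from
argv0 = "src/redis-server"

# Computed once at startup, while the working directory is still start_dir.
correct = resolve_executable(argv0, start_dir)

# Recomputed from the relative argv[0] after CONFIG SET DIR /tmp/: wrong path.
broken = resolve_executable(argv0, "/tmp")

# Passing the already absolute path as argv[0] survives the directory change.
fixed = resolve_executable(correct, "/tmp")
```

`broken` points under `/tmp`, while `fixed` equals `correct`, because joining with an absolute path ignores the current directory.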
antirez
d8f8701032 Be more verbose when DEBUG RESTART fails. 2017-11-30 18:08:21 +01:00
zhaozhao.zz
43be967690 networking: optimize unlinkClient() in freeClient() 2017-11-30 18:11:05 +08:00
zhaozhao.zz
1b5f56d042 aof: cast sdslen to ssize_t 2017-11-30 10:27:12 +08:00
zhaozhao.zz
2d73cf2367 aof: fix the short write 2017-11-30 10:22:12 +08:00
Itamar Haber
0752a834f9 Check arity in SLOWLOG before accessing arg 2017-11-30 00:30:30 +02:00
antirez
2785d6caa0 Merge branch 'lfu-fixes' into unstable 2017-11-29 17:16:13 +01:00
Itamar Haber
59d52f7fab Standardizes the 'help' subcommand
This adds a new `addReplyHelp` helper that's used by commands
when returning a help text. The following commands have been
touched: DEBUG, OBJECT, COMMAND, PUBSUB, SCRIPT and SLOWLOG.

WIP

Fix command table entry for OBJECT for HELP option.

After #4472 the command may have just 2 arguments.

Improve OBJECT HELP descriptions.

See #4472.

WIP 2

WIP 3
2017-11-28 21:15:45 +02:00
Salvatore Sanfilippo
565e139a56
Merge pull request #4200 from jeesyn/fix_typo
fix a typo
2017-11-28 18:44:11 +01:00
Salvatore Sanfilippo
923502a70b
Merge pull request #4166 from charpty/wip-redisclic-typo
redis-cli.c typo: helpe -> helper.
2017-11-28 18:41:51 +01:00
Salvatore Sanfilippo
26826329f5
Merge pull request #4167 from charpty/wip-redisclic-typo2
redis-cli.c typo: Requets -> Requests.
2017-11-28 18:41:28 +01:00
Salvatore Sanfilippo
3508b9c440
Merge pull request #4170 from TehWebby/patch-2
Fix typo
2017-11-28 18:40:43 +01:00
antirez
851e9fc48b t_hash.c: clarify calling two times the same function. 2017-11-28 18:39:00 +01:00
antirez
c44732ac58 adlist: fix listJoin() in the case the second list is empty.
See #4192. The original PR removed lines of code that are actually
needed, so thanks to @chunqiulfq for reporting the problem, but this
merges the solution from @jeesyn after checking, together with @artix75,
that the logic covers all the cases.
2017-11-28 18:25:14 +01:00
Salvatore Sanfilippo
a13106e001
Merge pull request #4374 from rouzier/patch-1
Fix file descriptor leak and error handling
2017-11-28 17:33:23 +01:00
Salvatore Sanfilippo
bf71b120f1
Merge pull request #4451 from devnexen/minor_build_fixes
Fix undefined behavior constant defined.
2017-11-28 17:23:48 +01:00
Itamar Haber
8c7f90e91e Standardizes arity handling of DEBUG 2017-11-28 18:18:45 +02:00
antirez
06ca9d6839 LFU: Fix LFUDecrAndReturn() to just decrement.
Splitting the popularity in half actually just needs decrementing the
counter because the counter is logarithmic.
2017-11-28 12:18:30 +01:00
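The reasoning is easiest to see with an idealized base-2 logarithmic counter, where a counter value `c` stands for roughly `2 ** c` hits. The Python sketch below uses that idealization only; Redis's actual counter uses its own probabilistic increment, which is not reproduced here, and both function names are hypothetical.

```python
def estimated_hits(counter: int) -> int:
    """Popularity represented by an idealized base-2 logarithmic counter."""
    return 2 ** counter


def decay(counter: int, periods: int) -> int:
    """Halve the represented popularity once per elapsed decay period.

    On a logarithmic scale, halving is a decrement by one, floored at zero.
    """
    return max(0, counter - periods)
```

For example, `estimated_hits(decay(10, 1))` is half of `estimated_hits(10)`, whereas halving the counter itself (10 to 5) would divide the represented popularity by 32.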
zhaozhao.zz
9f131c9a89 LFU: add hotkeys option to redis-cli 2017-11-27 18:39:29 +01:00
zhaozhao.zz
583c314725 LFU: do some changes about LFU to find hotkeys
First, use the access time to replace the decrement time of LFU.
The function LFUDecrAndReturn should only compute the decremented
counter, not update the LFU fields; we update them explicitly.
The counter is halved once for each server.lfu_decay_time period
that has elapsed.
Every time a key is accessed, we should update the LFU, including
updating the access time and incrementing the counter after
calling LFUDecrAndReturn.
If a key is overwritten, the LFU should also be updated.
Then we can use the `OBJECT freq` command to get a key's frequency.
LFUDecrAndReturn should also be called in the `OBJECT freq` command,
in case the key has not been accessed for a long time,
because we update the access time only when the key is read or
overwritten.
2017-11-27 18:39:22 +01:00
zhaozhao.zz
53cea97204 LFU: change lfu* parameters to int 2017-11-27 18:38:55 +01:00
zhaozhao.zz
dfc42ec447 LFU: fix the missing of config get and rewrite 2017-11-27 18:38:33 +01:00
antirez
75fa7879e6 Improve OBJECT HELP descriptions.
See #4472.
2017-11-27 18:09:08 +01:00
antirez
b412c544fd Fix command table entry for OBJECT for HELP option.
After #4472 the command may have just 2 arguments.
2017-11-27 13:16:07 +01:00
Salvatore Sanfilippo
29252391c4
Merge pull request #4472 from itamarhaber/object_patch
A minor fix and `help` subcommand for `OBJECT`
2017-11-27 12:41:02 +01:00
Itamar Haber
1c08220022 Adds -u <uri> option to redis-cli. 2017-11-27 11:34:11 +01:00
Itamar Haber
02d38f6b51 Adds OBJECT help 2017-11-24 19:59:05 +02:00
Itamar Haber
b28fb3d753 Prevents OBJECT freq with noeviction
When maxmemory is set to noeviction, idletime is implicitly kept. This renders access frequency nonsensical.
2017-11-24 19:58:37 +02:00
Salvatore Sanfilippo
c508cb6793
Merge pull request #4452 from soloestoy/expire-latency
expire & latency: fix the missing latency records generated by expire
2017-11-24 18:21:35 +01:00
antirez
7229fa8d6d Modules: fix memory leak in RM_IsModuleNameBusy(). 2017-11-24 13:29:54 +01:00
antirez
4d063bb6ba PSYNC2: reorganize comments related to recent fixes.
Related to PR #4412 and issue #4407.
2017-11-24 11:08:29 +01:00
Salvatore Sanfilippo
9d86ae4597
Merge pull request #4412 from soloestoy/bugfix-psync2
PSYNC2: safe free backlog when reach the time limit and others
2017-11-24 10:56:18 +01:00
Salvatore Sanfilippo
f739c27229
Merge pull request #4344 from soloestoy/fix-module-name-conflict
Fix module name conflict
2017-11-24 09:37:06 +01:00
Oran Agra
adf2701cc9 fix string to double conversion, stopped parsing on \0 even if the string has more data.
getLongLongFromObject calls string2ll which has this line:
/* Return if not all bytes were used. */
so if you pass an sds with 3 characters "1\01" it will fail.

but getLongDoubleFromObject calls strtold, and considers it ok if eptr[0]==`\0`
i.e. if the end of the string found by strtold ends with null terminator

127.0.0.1:6379> set a 1
OK
127.0.0.1:6379> setrange a 2 2
(integer) 3
127.0.0.1:6379> get a
"1\x002"
127.0.0.1:6379> incrbyfloat a 2
"3"
127.0.0.1:6379> get a
"3"
2017-11-23 17:15:27 +02:00
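The difference between the two behaviours can be sketched in Python by emulating a C function that stops at the first NUL byte. All three function names are hypothetical, and Python's `float()` is only a stand-in for `strtold`, whose accepted syntax differs in details.

```python
def c_string_prefix(buf: bytes) -> bytes:
    """What a C routine that stops at the first NUL byte would see."""
    return buf.split(b"\x00", 1)[0]


def parse_lenient(buf: bytes) -> float:
    """Old behaviour: accept the value as long as the prefix parses."""
    return float(c_string_prefix(buf).decode("ascii"))


def parse_strict(buf: bytes) -> float:
    """Reject the value unless every byte of the buffer was consumed."""
    prefix = c_string_prefix(buf)
    if len(prefix) != len(buf):
        raise ValueError("value is not a valid float")
    return float(prefix.decode("ascii"))
```

With the 3-byte value from the session above, `parse_lenient(b"1\x002")` returns `1.0` and silently drops the trailing bytes, while `parse_strict` raises an error for the same input.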
antirez
de914ede93 Modules: fix for scripting replication of modules commands.
See issue #4466 / #4467.
2017-11-23 15:14:17 +01:00
Yossi Gottlieb
2c70d28295 Nested MULTI/EXEC may replicate in different cases.
For example:
1. A module command called within a MULTI section.
2. A Lua script with replicate_commands() called within a MULTI section.
3. A module command called from a Lua script in the above context.
2017-11-22 22:02:51 +02:00
zhaozhao.zz
ea2e51c630 PSYNC2: persist cached_master's dbid inside the RDB 2017-11-22 12:11:26 +08:00
zhaozhao.zz
93037f7642 PSYNC2: make repl_stream_db never be -1
it means that after this change all the replication
info in RDB is valid, and it can distinguish us from
the older version.
2017-11-22 12:05:34 +08:00
zhaozhao.zz
7a808fd8a7 expire & latency: fix the missing latency records generated by expire 2017-11-21 23:35:30 +08:00
zhaozhao.zz
57bd8feb8d rehash: handle one db until finished 2017-11-21 09:49:42 +01:00
David Carlier
62689ef0cf Fix undefined behavior constant defined. 2017-11-19 16:23:42 +00:00
Salvatore Sanfilippo
cf9a3f7048
Merge pull request #2741 from kmiku7/unstable
fix boundary case for _dictNextPower
2017-11-08 17:06:09 +01:00
Itamar Haber
2564963dc8
Fixes an off-by-one in argument handling of MEMORY USAGE
Fixes #4430
2017-11-08 16:08:29 +02:00
antirez
a1944c3e4d Fix saving of zero-length lists.
Normally in modern Redis you can't create zero-len lists, however it's
possible to load them from old RDB files generated, for instance, using
Redis 2.8 (see issue #4409). The "Right Thing" would be to not load such
lists at all, but this requires hooking into random places in rdb.c in a
not great way, for a problem that is at this point, at best, minor.

Here in this commit instead I just fix the fact that zero length lists,
materialized as quicklists with the first node set to NULL, were
iterated in the wrong way while they are saved, leading to a crash.

The other parts of the list implementation are apparently able to deal
with empty lists correctly, even if they are no longer a thing.
2017-11-06 12:37:03 +01:00
antirez
34d5804d4c SDS: improve sdsRemoveFreeSpace() to avoid useless data copy.
Since SDS v2, we no longer have a single header, so the function to
rewrite the SDS in terms of the minimum space required, instead of just
using realloc() and let the underlying allocator decide what to do,
was doing an allocation + copy every time the minimum possible header
needed to represent the string was different than the current one.
This could be often a bit wasteful, because if we go, for instance, from
the 32 bit fields header to the 16 bit fields header, the overhead of
the header is normally very small. With this commit we call realloc
instead, unless the change in header size is very significant in relation
to the string length.
2017-11-03 10:19:27 +01:00
zhaozhao.zz
b8579c225c PSYNC2: clarify the scenario when repl_stream_db can be -1 2017-11-02 10:45:33 +08:00
zhaozhao.zz
885c4f856e PSYNC2 & RDB: fix the missing rdbSaveInfo for BGSAVE 2017-11-01 17:52:43 +08:00
zhaozhao.zz
6ddf0ea293 PSYNC2: safe free backlog when reach the time limit
When we free the backlog, we should use a new
replication ID and clear ID2. Without a backlog we
cannot increment master_repl_offset even when write
commands are executed, which may lead to inconsistency
when we try to connect to a "slave-before" master
(if this master was our slave before, our replid
equals the master's replid2). Since the master has our
history, we can match the master's replid2 and
second_replid_offset, which makes partial sync work
even though the data is inconsistent.
2017-11-01 17:32:27 +08:00
antirez
ffcf7d5ab1 Fix buffer overflows occurring reading redis.conf.
There was not enough sanity checking in the code loading the slots of
Redis Cluster from the nodes.conf file, this resulted in the
attacker's ability to write data at random addresses in the process
memory, by manipulating the index of the array. The bug seems
exploitable using the following technique: the config file may be altered so
that one of the nodes gets, as node ID (which is the first field inside the
structure) some data that is actually executable: then by writing this
address in selected places, this node ID part can be executed after a
jump. So it is mostly just a matter of effort in order to exploit the
bug. In practice however the issue is not very critical because the
bug requires an unprivileged user to be able to modify the Redis cluster
nodes configuration, and at the same time this should result in some
gain. However Redis normally is unprivileged as well. Yet much better to
have this fixed indeed.

Fix #4278.
2017-10-31 09:41:22 +01:00
antirez
de474186bd More robust object -> double conversion.
Certain checks were useless, at the same time certain malformed inputs
were accepted without problems (empty strings parsed as zero).
Cases where strtod() returns ERANGE but we still want to parse the input
were ok in getDoubleFromObject() but not in the long variant.

As a side effect of these fixes, this commit fixes #4391.
2017-10-30 13:39:58 +01:00
rouzier
6eb996540c Fix file descriptor leak and error handling 2017-10-13 13:20:45 -04:00
antirez
2bf8c2c130 Limit statement in RM_BlockClient() to 80 cols. 2017-09-28 23:15:34 +02:00
zhaozhao.zz
6dffc1b7a3 Modules: handle the busy module name 2017-09-28 17:38:40 +08:00
zhaozhao.zz
cb9dde3280 Modules: handle the conflict of registering commands 2017-09-28 16:21:21 +08:00
Dvir Volk
7393fd814e Added safety net preventing redis from crashing if a module decides to block in MULTI 2017-09-27 15:17:53 +03:00
Dvir Volk
b246635d6d Renamed GetCtxFlags to GetContextFlags 2017-09-27 11:58:16 +03:00
Dvir Volk
616c546b01 Added support for module context flags with RM_GetCtxFlags 2017-09-27 11:58:07 +03:00
antirez
474adba9fa Clarify comment in change fixing #4323. 2017-09-21 12:35:04 +02:00
zhaozhao.zz
269760edbb Lazyfree: avoid memory leak when free slowlog entry 2017-09-21 14:19:21 +08:00
antirez
bb3b5ddd19 PSYNC2: More refinements related to #4316. 2017-09-20 11:28:13 +02:00
zhaozhao.zz
b541ccef25 PSYNC2: make persisting replication info more solid
This commit is a reinforcement of commit c1c99e9.

1. Replication information can be stored when the RDB file is
generated by a master using server.slaveseldb when server.repl_backlog
is not NULL, otherwise repl_stream_db is set to -1. That's safe, because
a NULL server.repl_backlog will trigger a full synchronization,
then the master will send a SELECT command to the replication stream.
2. Only do rdbSave* when rsiptr is not NULL,
if we do rdbSave* without rdbSaveInfo, the slave will miss repl-stream-db.
3. Save the replication information also in the case of
the SAVE command, the FLUSHALL command and DEBUG reload.
2017-09-20 11:18:10 +02:00
antirez
c1c99e9f4e PSYNC2: Fix the way replication info is saved/loaded from RDB.
This commit attempts to fix a number of bugs reported in #4316.
They are related to the way replication info like replication ID,
offsets, and currently selected DB in the master client, are stored
and loaded by Redis. In order to avoid inconsistencies the changes in
this commit try to enforce that:

1. Replication information is only stored when the RDB file is
generated by a slave that has a valid 'master' client, so that we can
always extract the currently selected DB.
2. When replication information is persisted in the RDB file, either all
the info needed for a successful PSYNC is persisted, or nothing is.
3. The RDB replication information is only loaded if the instance is
configured as a slave, otherwise a master can start with IDs that relate
to a different history of the data set, and still retain such IDs in the
future while receiving unrelated writes.
2017-09-19 23:03:39 +02:00
antirez
a4152119c6 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-09-19 10:35:49 +02:00
antirez
b75ae0bbea PSYNC2: Create backlog on slave partial sync as well.
A slave may be started with an RDB file able to provide enough information
to perform a successful partial SYNC with its master. However in such a
case, as outlined in issue #4268, the slave backlog will not be
started, since it was only initialized on full sync attempts. This
creates different problems with successive PSYNC attempts that will
always result in full synchronizations.

Thanks to @fdingiit for discovering the issue.
2017-09-19 10:33:14 +02:00
Salvatore Sanfilippo
2d13bf4c59 Merge pull request #3785 from GitHubMota/unstable
redis-benchmark: default value size usage update.
2017-09-18 12:18:57 +02:00
Salvatore Sanfilippo
9b4cb4addc Merge pull request #3554 from jybaek/Delete_duplicate
Remove Duplicate Processing
2017-09-18 12:18:15 +02:00
Oran Agra
b122cadc66 Flush append only buffers before exiting.
When the SHUTDOWN command is received it is possible that some of the
recent commands were not yet flushed from the AOF buffer, and the server
experiences data loss at shutdown.
2017-09-17 07:22:16 +03:00
jianqingdu
498f65ffb7 fix not call va_end when syncWrite() failed
fix not call va_end when syncWrite() failed in sendSynchronousCommand()
2017-08-30 21:20:14 -05:00
jeesyn.liu
447b373fc9 fix a typo 2017-08-08 17:45:51 +08:00
jybaek
a8c08b9b76 Add missing fclose() 2017-08-03 17:28:04 +09:00
Salvatore Sanfilippo
34a79c353f Merge pull request #3935 from itamarhaber/module-cmdstats
Changes command stats iteration to being dict-based
2017-08-02 12:51:26 +02:00
antirez
bc64df9a66 Add MEMORY DOCTOR to MEMORY HELP. 2017-07-28 17:47:54 +02:00
Shaun Webb
2e6f285009 Fix typo 2017-07-27 09:37:37 +09:00
Bo Cai
00954f4d48 redis-cli.c typo: Requets -> Requests.
Signed-off-by: Bo Cai <charpty@gmail.com>
2017-07-26 21:33:29 +08:00
Bo Cai
005d9fa861 redis-cli.c typo: helpe -> helper.
Signed-off-by: Bo Cai <charpty@gmail.com>
2017-07-26 21:24:28 +08:00
Mota
81fe7a4733 redis-benchmark: default value size usage update.
default size of SET/GET value in usage should be 3 bytes as in main code.
2017-07-25 23:43:46 +08:00
Salvatore Sanfilippo
6b64cc47a0 Merge pull request #2259 from badboy/fix-2258
Check that the whole first argument is a number
2017-07-24 15:19:53 +02:00
Salvatore Sanfilippo
964224b77f Merge pull request #4124 from lamby/proceding-proceeding-typo
Correct proceding -> proceeding typo.
2017-07-24 15:19:21 +02:00
Salvatore Sanfilippo
ae40e5f362 Merge pull request #4125 from trevor211/fixAutoAofRewirteMinSize
fix rewrite config: auto-aof-rewrite-min-size
2017-07-24 15:18:56 +02:00
Salvatore Sanfilippo
25c231c4c1 Merge pull request #1998 from grobe0ba/unstable
Fix missing '-' in redis-benchmark help output (Issue #1996)
2017-07-24 15:18:08 +02:00
Salvatore Sanfilippo
d9565379da Merge pull request #4128 from leonchen83/unstable
fix mismatch argument and return wrong value of clusterDelNodeSlots
2017-07-24 14:18:28 +02:00
liangsijian
ffbbe5a720 Fix lua ldb command log 2017-07-24 19:24:06 +08:00
antirez
314043552b Modules: don't crash when Lua calls a module blocking command.
Lua scripting does not support calling blocking commands, however all
the native Redis commands are flagged as "s" (no scripting flag), so
this is not possible at all. With modules there is no such mechanism in
order to flag a command as non callable by the Lua scripting engine.
Moreover we cannot trust the users of the modules API to comply all the
time: it is likely that modules will be released with blocking commands
that are not flagged correctly, even if we provide a way to
signal this fact.

This commit attempts to address the problem in a short term way, by
detecting that a module is trying to block in the context of the Lua
scripting engine client, and preventing it from doing so. The module will
actually believe it is blocking as usual, but what happens is that the Lua
script receives an error immediately, and the background call is ignored
by the Redis engine (except for the cleanup callbacks, once it
unblocks).

Long term, the more likely solution, is to introduce a new call called
RedisModule_GetClientFlags(), so that a command can detect if the caller
is a Lua script, and return an error, or avoid blocking at all.

Since the blocking API is experimental right now, more work is needed in
this regard in order to reach a level where blocking module commands and
all the other Redis subsystems interact peacefully.

Now the effect is like the following:

    127.0.0.1:6379> eval "redis.call('hello.block',1,5000)" 0
    (error) ERR Error running script (call to
    f_b5ba35ff97bc1ef23debc4d6e9fd802da187ed53): @user_script:1: ERR
    Blocking module command called from Lua script

This commit fixes issue #4127 in the short term.
2017-07-23 12:55:37 +02:00
antirez
5bfdfbe174 Fix typo in unblockClientFromModule() top comment. 2017-07-23 12:41:26 +02:00
antirez
a3778f3b0f Make representClusterNodeFlags() more robust.
This function failed when an internal-only flag was set as an only flag
in a node: the string was trimmed expecting a final comma before
exiting the function, causing a crash. See issue #4142.
Moreover generation of flags representation only needed at DEBUG log
level was always performed: a waste of CPU time. This is fixed as well
by this commit.
2017-07-20 15:17:35 +02:00
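The trailing-comma failure comes from trimming a separator that was never written. A minimal Python sketch of a representation that has no such edge case follows; `represent_flags`, the `printable` set and the `"noflags"` placeholder are assumptions made for illustration, not a copy of the Redis function.

```python
def represent_flags(flags, printable):
    """Join the printable flag names with commas.

    flags: names currently set on the node.
    printable: names that have a textual representation; anything else is
    treated as internal-only and skipped.
    """
    names = [name for name in flags if name in printable]
    if not names:
        return "noflags"  # assumed placeholder; avoids trimming an empty string
    return ",".join(names)
```

Joining a filtered list never needs a final trim, so a node carrying only an internal flag is handled like any other case.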
antirez
b1c2e1a19c Fix two bugs in moduleTypeLookupModuleByID().
The function cache was not working at all, and the function returned
wrong values if there were two or more modules exporting native data
types.

See issue #4131 for more details.
2017-07-20 14:59:42 +02:00
Leon Chen
9e7a8c0207 fix return wrong value of clusterDelNodeSlots 2017-07-20 17:24:38 +08:00
Leon Chen
2cdf4cc656 fix mismatch argument 2017-07-18 02:28:24 -05:00
WuYunlong
c32c690de6 fix rewrite config: auto-aof-rewrite-min-size 2017-07-15 10:20:56 +08:00
Chris Lamb
7560d347da Correct proceding -> proceeding typo. 2017-07-14 22:53:14 +01:00
antirez
bd1782fa0a Modules: fix thread safe context DB selection.
Before this fix the DB currently selected by the blocked client was not
respected and operations were always performed on DB 0.
2017-07-14 13:02:15 +02:00
antirez
8eefc9323d Allow certain modules APIs only defining REDISMODULE_EXPERIMENTAL_API.
Those calls may be subject to changes in the future, so the user should
acknowledge it is using non stable API.
2017-07-14 12:07:52 +02:00
antirez
f03947a676 Modules documentation removed from source.
Moving to redis-doc repository to publish via Redis.io.
2017-07-14 11:33:59 +02:00
antirez
43aaf96163 Markdown generation of Redis Modules API reference improved. 2017-07-14 11:29:31 +02:00
antirez
e74f0aa6d1 Fix replication of SLAVEOF inside transaction.
In Redis 4.0 replication, with the introduction of PSYNC2, masters and
slaves replicate commands to cascading slaves and to the replication
backlog itself in a different way compared to the past.

Masters actually replicate the effects of client commands.
Slaves just propagate what they receive from masters.

This mechanism can cause problems when the configuration of an instance
is changed from master to slave inside a transaction. For instance
we could send to a master instance the following sequence:

    MULTI
    SLAVEOF 127.0.0.1 0
    EXEC
    SLAVEOF NO ONE

Before the fixes in this commit, the MULTI command used to be propagated
into the replication backlog, however after the SLAVEOF command the
instance is a slave, so the EXEC implementation failed to also propagate
the EXEC command. When the slaves of the above instance reconnected,
they were incrementally synchronized just sending a "MULTI". This put
the master client (in the slaves) into MULTI state, breaking the
replication.

Notably even Redis Sentinel uses the above approach in order to guarantee
that configuration changes are always performed together with rewrites
of the configuration and with clients disconnection. Sentinel does:

    MULTI
    SLAVEOF ...
    CONFIG REWRITE
    CLIENT KILL TYPE normal
    EXEC

So this was a really problematic issue. However even with the fix in
this commit, that will add the final EXEC to the replication stream in
case the instance was switched from master to slave during the
transaction, the result would be to increment the slave replication
offset, so a successive reconnection with the new master, will not
permit a successful partial resynchronization: no way the new master can
provide us with the backlog needed, we incremented our offset to a value
that the new master cannot have.

However the EXEC implementation waits to emit the MULTI, so that if the
commands inside the transaction actually do not need to be replicated,
no commands propagation happens at all. From multi.c:

    if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) {
        execCommandPropagateMulti(c);
        must_propagate = 1;
    }

The above code is already modified by this commit you are reading.
Now also ADMIN commands do not trigger the emission of MULTI. It is actually
not clear why we do not just check for CMD_WRITE... Probably I wrote it this
way in order to make the code more reliable: better to over-emit MULTI
than not emitting it in time.

So this commit should indeed fix issue #3836 (verified), however it looks
like some reconsideration of this code path is needed in the long term.

BONUS POINT: The reverse bug.

Even in a read only slave "B", in a replication setup like:

	A -> B -> C

There are commands without the READONLY nor the ADMIN flag, that are also
not flagged as WRITE commands. An example is just the PING command.

So if we send B the following sequence:

    MULTI
    PING
    SLAVEOF NO ONE
    EXEC

The result will be the reverse bug, where only EXEC is emitted, but not the
previous MULTI. However this apparently does not create problems in practice
but it is yet another acknowledgment of the fact that some work is needed here
in order to make this code path less surprising.

Note that there are many different approaches we could follow. For instance
MULTI/EXEC blocks containing administrative commands may be allowed ONLY
if all the commands are administrative ones, otherwise they could be
denied. When allowed, the commands could simply never be replicated at all.
2017-07-12 11:07:28 +02:00
antirez
e1b8b4b6da CLUSTER GETKEYSINSLOT: avoid overallocating.
Close #3911.
2017-07-11 15:49:09 +02:00
antirez
5bd46d33db Fix isHLLObjectOrReply() to handle integer encoded strings.
Close #3766.
2017-07-11 12:44:59 +02:00
antirez
e203a46cf3 Clients blocked in modules: free argv/argc later.
See issue #3844 for more information.
2017-07-11 12:33:01 +02:00
antirez
14c32c3569 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-07-11 09:46:58 +02:00
antirez
54e4bbeabd Event loop: call after sleep() only from top level.
In general we do not want before/after sleep() callbacks to be called
when we re-enter the event loop, since those calls are only designed in
order to perform operations every main iteration of the event loop, and
re-entering is often just a way to incrementally serve clients with
error messages or other auxiliary operations. However, if we call the
callbacks, we are then forced to think of before/after sleep callbacks
as re-entrant, which is much harder without any good need.

However here there was also a clear bug: beforeSleep() was actually
never called when re-entering the event loop. But the new afterSleep()
callback was. This is broken and in this instance re-entering
afterSleep() caused a modules GIL deadlock.
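
A minimal sketch of the idea, assuming a hypothetical flag and a
stripped-down event loop structure (the names below are invented for
the example and are not the ae.c definitions): the after-sleep callback
runs only when the caller asks for it, so nested calls that re-enter
the loop can simply omit the flag.

```c
#include <stddef.h>

#define CALL_AFTER_SLEEP (1 << 0) /* hypothetical flag for this sketch */

typedef struct event_loop {
    void (*aftersleep)(struct event_loop *el);
    int aftersleep_calls; /* bookkeeping for the example only */
} event_loop;

/* Process one batch of events. The after-sleep callback runs only when
 * the caller explicitly requests it through `flags`. */
void process_events(event_loop *el, int flags) {
    /* ... the blocking multiplexing call would happen here ... */
    if (el->aftersleep != NULL && (flags & CALL_AFTER_SLEEP))
        el->aftersleep(el);
    /* ... ready file and time events would be fired here ... */
}

/* Example callback: counts how many times it was invoked. */
void count_aftersleep(event_loop *el) { el->aftersleep_calls++; }
```

In this sketch the top-level loop would pass CALL_AFTER_SLEEP, while
code re-entering the loop would pass 0.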
2017-07-11 00:13:52 +02:00
Salvatore Sanfilippo
58104d8327 Merge pull request #4113 from guybe7/module_io_bytes
Modules: Fix io->bytes calculation in RDB save
2017-07-10 19:14:34 +02:00
antirez
11182a1a58 redis-check-aof: tell users there is a --fix option. 2017-07-10 16:41:25 +02:00
Guy Benoish
dfb68cd235 Modules: Fix io->bytes calculation in RDB save 2017-07-10 14:41:57 +03:00
antirez
fc7ecd8d35 AOF check utility: ability to check files with RDB preamble. 2017-07-10 13:38:23 +02:00
Salvatore Sanfilippo
6b0670daad Merge pull request #3853 from itamarhaber/issue-3851
Sets up fake client to select current db in RM_Call()
2017-07-06 15:02:11 +02:00
Salvatore Sanfilippo
38dd30af42 Merge pull request #4105 from spinlock/unstable-networking
Optimize addReplyBulkSds for better performance
2017-07-06 14:31:08 +02:00
Salvatore Sanfilippo
2d5aa00959 Merge pull request #4106 from petersunbag/unstable
minor fix in listJoin().
2017-07-06 14:29:37 +02:00
sunweinan
87f771bff1 minor fix in listJoin(). 2017-07-06 19:47:21 +08:00
antirez
2b36950e9b Free IO context if any in RDB loading code.
Thanks to @oranagra for spotting this bug.
2017-07-06 11:20:49 +02:00
antirez
51ffd062d3 Modules: DEBUG DIGEST interface. 2017-07-06 11:04:46 +02:00
spinlock
10db81af71 update Makefile for test-sds 2017-07-05 14:32:09 +00:00
spinlock
ea31a4eae3 Optimize addReplyBulkSds for better performance 2017-07-05 14:25:05 +00:00
antirez
f9fac7f777 Avoid closing invalid FDs to make Valgrind happier. 2017-07-05 15:40:25 +02:00
antirez
413c2bc180 Modules: no MULTI/EXEC for commands replicated from async contexts.
They are technically like commands executed from external clients one
after the other, and do not constitute a single atomic entity.
2017-07-05 10:10:20 +02:00
Salvatore Sanfilippo
09dd7b5ff0 Merge pull request #4101 from dvirsky/fix_modules_reply_len
Proposed fix to #4100
2017-07-04 12:01:51 +02:00
antirez
eddd8d34c4 Add symmetrical assertion to track c->reply_buffer infinite growth.
Redis clients need to have an instantaneous idea of the amount of memory
they are consuming (if the number is not exact should at least be
proportional to the actual memory usage). We do that adding and
subtracting the SDS length when pushing / popping from the client->reply
list. However it is quite simple to add bugs in such a setup, by not
keeping the objects in the list and the count in sync. For such reason,
Redis has an assertion to track counts near 2^64: those are always the
result of the counter wrapping around because we subtract more than we
add. This commit adds the symmetrical assertion: when the list is empty
since we sent everything, the reply_bytes count should be zero. Thanks
to the new assertion it should be simple to also detect the other
problem, where the count slowly increases because of over-counting.
The assertion adds a conditional in the code that sends the buffer to
the socket but should not create any measurable performance slowdown,
listLength() just accesses a structure field, and this code path is
totally dominated by write(2).
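
The two assertions can be sketched as follows. This is an illustration
under simplifying assumptions, not the networking.c code: the client
structure is reduced to two hypothetical counters and the list itself
is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified client: only the fields needed for the sketch. */
typedef struct {
    size_t reply_nodes;  /* number of queued reply buffers */
    size_t reply_bytes;  /* bytes accounted for those buffers */
} client_t;

/* Account for a buffer of `len` bytes appended to the reply list. */
void reply_push(client_t *c, size_t len) {
    c->reply_nodes++;
    c->reply_bytes += len;
}

/* Account for a buffer of `len` bytes removed after being sent. */
void reply_pop(client_t *c, size_t len) {
    /* First kind of check: subtracting more than was added would wrap. */
    assert(c->reply_nodes > 0 && c->reply_bytes >= len);
    c->reply_nodes--;
    c->reply_bytes -= len;
    /* Symmetrical check: an empty list must account for zero bytes. */
    if (c->reply_nodes == 0) assert(c->reply_bytes == 0);
}
```

Over-counting on push now trips the second assertion as soon as the
list drains, instead of growing silently.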

Related to #4100.
2017-07-04 11:55:05 +02:00
Dvir Volk
86e564e9ff fixed #4100 2017-07-04 00:02:19 +03:00
antirez
b2cd9fcab6 Fix GEORADIUS edge case with huge radius.
This commit closes issue #3698, at least for now, since the root cause
was not fixed: the bounding box function, for huge radiuses, does not
return a correct bounding box, there are points still within the radius
that are left outside.

So when using GEORADIUS queries with radiuses in the order of 5000 km or
more, it was possible to see, at the edge of the area, certain points
not correctly reported.

Because the bounding box for now was used just as an optimization, and
such huge radiuses are not common, for now the optimization is just
switched off when the radius is near such magnitude.

Three test cases found by the Continuous Integration test were added, so
that we can easily trigger the bug again, both for regression testing
and in order to properly fix it at some point in the future.
2017-07-03 19:38:31 +02:00
antirez
26e638a8e9 redis-cli --latency: ability to run non interactively.
This feature was proposed by @rosmo in PR #2643 and later redesigned
in order to fit better with the other options for non-interactive modes
of redis-cli. The idea is basically to allow to collect latency
information in scripts, cron jobs or whatever, just running for a
limited time and then producing a single output.
2017-06-30 15:41:58 +02:00
antirez
7bad78bd2f Fix abort typo in Lua debugger help screen. 2017-06-30 12:12:00 +02:00
antirez
f8547e53f0 Added GEORADIUS(BYMEMBER)_RO variants for read-only operations.
Issue #4084 shows how for a design error, GEORADIUS is a write command
because of the STORE option. Because of this it does not work
on readonly slaves, gets redirected to masters in Redis Cluster even
when the connection is in READONLY mode and so forth.

To break backward compatibility at this stage, with Redis 4.0 to be in
advanced RC state, is problematic for the user base. The API can be
fixed into the unstable branch soon if we'll decide to do so in order to
be more consistent, and release Redis 5.0 with this incompatibility in
the future. This is still unclear.

However, the ability to scale GEO queries in slaves easily is too
important so this commit adds two read-only variants to the GEORADIUS
and GEORADIUSBYMEMBER command: GEORADIUS_RO and GEORADIUSBYMEMBER_RO.
The commands are exactly as the original commands, but they do not
accept the STORE and STOREDIST options.
2017-06-30 10:03:37 +02:00
antirez
01a4b9892d HMSET and MSET implementations unified. HSET now variadic.
This is the first step towards getting rid of HMSET which is a command
that does not make much sense once HSET is variadic, and has a saner
return value.
2017-06-29 17:38:46 +02:00
Salvatore Sanfilippo
634c64dd18 Merge pull request #4075 from sgn1/brpop_keys
Fix Issues in blocking commands in cluster mode.
2017-06-27 17:51:19 +02:00
antirez
365dd037dc RDB modules values serialization format version 2.
The original RDB serialization format was not parsable without the
module loaded, because the structure was managed only by the module
itself. Moreover RDB is a streaming protocol in the sense that it is
both produced in an append-only fashion, and is also sometimes directly
sent to the socket (in the case of diskless replication).

The fact that modules values cannot be parsed without the relevant
module loaded is a problem in many ways: RDB checking tools must have
loaded modules even for doing things not involving the value at all,
like splitting an RDB into N RDBs by key or alike, or just checking the
RDB for sanity.

In theory module values could be just a blob of data with a prefixed
length in order for us to be able to skip it. However prefixing the values
with a length would mean one of the following:

1. To be able to write some data at a previous offset. This breaks
streaming.
2. To buffer values before outputting them. This breaks performance.
3. To have some chunked RDB output format. This breaks simplicity.

Moreover, the above solution, still makes module values a totally opaque
matter, with the following problems:

1. The RDB check tool can just skip the value without being able to at
least check the general structure. For datasets composed mostly of
modules values this means to just check the outer level of the RDB not
actually doing any check on most of the data itself.
2. It is not possible to do any recovering or processing of data for which a
module no longer exists in the future, or is unknown.

So this commit implements a different solution. The modules RDB
serialization API is composed of well defined calls to store integers,
floats, doubles or strings. After this commit, the parts generated by
the module API have a one-byte prefix for each of the above emitted
parts, and there is a final EOF byte as well. So even if we don't know
exactly how to interpret a module value, we can always parse it at a
high level, check the overall structure, understand the types used to
store the information, and easily skip the whole value.
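
As an illustration of why the prefixes help, here is a sketch of a
reader that skips a value it does not understand. The prefix names and
values and the payload sizes are assumptions chosen for the example;
the real RDB encoding of integers and strings is different. Only the
overall shape (one prefix byte per part, final EOF byte) follows the
description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative one-byte type prefixes; names and values are assumptions. */
enum { OP_EOF = 0, OP_INT = 1, OP_FLOAT = 2, OP_DOUBLE = 3, OP_STRING = 4 };

/* Walk a serialized module value without understanding its meaning.
 * Simplified payloads (assumptions): INT and DOUBLE are 8 bytes, FLOAT is
 * 4 bytes, STRING is a one-byte length followed by that many bytes.
 * Returns the number of bytes consumed including the EOF marker, or -1 if
 * the buffer is truncated or contains an unknown prefix. */
long skip_module_value(const uint8_t *buf, size_t len) {
    size_t pos = 0;
    while (pos < len) {
        uint8_t op = buf[pos++];
        size_t payload;
        switch (op) {
        case OP_EOF:    return (long)pos;
        case OP_INT:    payload = 8; break;
        case OP_FLOAT:  payload = 4; break;
        case OP_DOUBLE: payload = 8; break;
        case OP_STRING:
            if (pos >= len) return -1;   /* missing length byte */
            payload = buf[pos++];
            break;
        default:        return -1;       /* unknown prefix */
        }
        if (len - pos < payload) return -1; /* truncated payload */
        pos += payload;
    }
    return -1; /* ran out of data before the EOF marker */
}
```

A checking tool built this way can validate the structure and find the
end of the value without the module being loaded.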

The change is backward compatible: older RDB files can be still loaded
since the new encoding has a new RDB type: MODULE_2 (of value 7).
The commit also implements the ability to check RDB files for sanity
taking advantage of the new feature.
2017-06-27 13:19:16 +02:00
antirez
c3998728a2 ARM: Fix stack trace generation on crash. 2017-06-26 10:36:16 +02:00
antirez
c9097393bf Issue #4027: unify comment and modify return value in freeMemoryIfNeeded().
It looks safer to return C_OK from freeMemoryIfNeeded() when clients are
paused because returning C_ERR may prevent success of writes. It is
possible that there is no difference in practice since clients cannot
execute writes while clients are paused, but it looks more correct this
way, at least conceptually.

Related to PR #4028.
2017-06-23 11:42:25 +02:00
Salvatore Sanfilippo
936ade80b2 Merge pull request #4028 from zintrepid/prevent_expirations_while_paused
Prevent expirations and evictions while paused
2017-06-23 11:39:02 +02:00
Suraj Narkhede
f85f36f50d Fix following issues in blocking commands:
1. brpop last key index, thus checking all keys for slots.
2. Memory leak in clusterRedirectBlockedClientIfNeeded.
3. Remove while loop in clusterRedirectBlockedClientIfNeeded.
2017-06-23 00:30:21 -07:00
Suraj Narkhede
d303bca587 Fix brpop command table entry and redirect blocked clients. 2017-06-22 23:52:00 -07:00
antirez
8b768e8ea4 Aesthetic changes to #4068 PR to conform to Redis coding standard.
1. Inline if ... statement if short.
2. No lines over 80 columns.
2017-06-22 11:00:34 +02:00
Salvatore Sanfilippo
6476f1a979 Merge pull request #4068 from FreedomU007/unstable
Fix set with ex/px option when propagated to aof
2017-06-22 10:46:58 +02:00
xuzhou
86e9f48a0c Optimize set command with ex/px when updating aof. 2017-06-22 11:06:40 +08:00
Salvatore Sanfilippo
ef446bf16d Merge pull request #3802 from flowly/bugfix-calc-stat-net-output-bytes
Bugfix calc stat net output bytes
2017-06-20 17:01:16 +02:00
Salvatore Sanfilippo
1d857a99d5 Merge pull request #4056 from season89/unstable
Fixed comments of slowlog duration
2017-06-20 16:55:29 +02:00
Salvatore Sanfilippo
0a03187ac4 Merge pull request #3659 from cbgbt/cli-elapsed
cli: Only print elapsed time on OUTPUT_STANDARD.
2017-06-20 16:53:56 +02:00
antirez
2a84927f35 redis-benchmark: add -t hset target. 2017-06-19 09:41:11 +02:00
xuzhou
530fcf8687 Fix set with ex/px option when propagated to aof 2017-06-16 17:51:38 +08:00
antirez
53cb27b1d7 SLOWLOG: log offending client address and name. 2017-06-15 12:57:54 +02:00
antirez
ab9d398835 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-06-14 18:29:53 +02:00
Qu Chen
4740424049 Implement getKeys procedure for georadius and georadiusbymember
commands.
2017-06-14 18:15:48 +02:00
xuchengxuan
3fc4bf07cc Fixed comments of slowlog duration 2017-06-14 16:42:21 +08:00
Salvatore Sanfilippo
d3b32ca48d Merge pull request #4034 from amallia/patch-1
Fixed comment in clusterMsg version field
2017-06-13 06:28:23 -07:00
Salvatore Sanfilippo
33035cad04 Merge pull request #4035 from amallia/patch-2
Removed duplicate 'sys/socket.h'  include
2017-06-13 06:27:31 -07:00
antirez
5877c02c51 Fix PERSIST expired key resuscitation issue #4048. 2017-06-13 10:35:51 +02:00
Antonio Mallia
2d1d57eb47 Removed duplicate 'sys/socket.h' include 2017-06-04 15:26:53 +01:00
Antonio Mallia
591dba8055 Fixed comment in clusterMsg version field 2017-06-04 15:09:05 +01:00
Zachary Marquez
a3e53cf9bc Prevent expirations and evictions while paused
Proposed fix to https://github.com/antirez/redis/issues/4027
2017-06-01 16:28:40 -05:00
antirez
e91b81c612 More informative -MISCONF error message. 2017-05-19 12:03:30 +02:00
antirez
e498d9ee3e Collect fork() timing info only if fork succeeded. 2017-05-19 11:10:36 +02:00
antirez
78211aaaaf redis-cli --bigkeys: show error when TYPE fails.
Close #3993.
2017-05-15 11:22:28 +02:00
antirez
1f598fc2bb Modules TSC: use atomic var for server.unixtime.
This avoids Helgrind complaining, but we are actually not using
atomicGet() to get the unixtime value for now: too many places where it
is used and given that time_t is word-sized it should be safe in all the
archs we support as it is.

On the other hand, Helgrind, when Redis is compiled with "make helgrind"
in order to force the __sync macros, will detect the write in
updateCachedTime() as a read (because atomic functions are used) and
will not complain about races.

This commit also includes minor refactoring of mutex initializations and
a "helgrind" target in the Makefile.
2017-05-10 10:04:16 +02:00
antirez
de786186a5 atomicvar.h: show used API in INFO. Add macro to force __sync builtin.
The __sync builtin can be correctly detected by Helgrind so to force it
is useful for testing. The API in the INFO output can be useful for
debugging after problems are reported.
2017-05-10 09:33:49 +02:00
Guy Benoish
89a9e5a9a2 Merge branch 'unstable' of https://github.com/antirez/redis into unstable 2017-05-09 18:42:32 +03:00
antirez
6eb51bf1ec zmalloc.c: remove thread safe mode, it's the default way. 2017-05-09 16:59:51 +02:00
antirez
9390c384b8 Modules TSC: Add mutex for server.lruclock.
Only useful for when no atomic builtins are available.
2017-05-09 16:32:49 +02:00
antirez
ece658713b Modules TSC: Improve inter-thread synchronization.
More work to do with server.unixtime and similar. Need to write Helgrind
suppression file in order to suppress the false positives.
2017-05-09 11:57:09 +02:00
antirez
2a51bac44e Simplify atomicvar.h usage by having the mutex name implicit. 2017-05-04 17:01:00 +02:00
antirez
52bc74f221 Lazyfree: fix lazyfreeGetPendingObjectsCount() race reading counter. 2017-05-04 10:35:40 +02:00
antirez
7d9326b1f3 Modules TSC: HELLO.KEYS reply format fixed. 2017-05-03 23:43:49 +02:00
antirez
9b01b64430 Modules TSC: put the client in the pending write list. 2017-05-03 14:54:48 +02:00
antirez
e67fb915eb adlist: fix final list count in listJoin(). 2017-05-03 14:54:14 +02:00
antirez
79226cb9fa adlist: fix listJoin() to handle empty lists. 2017-05-03 14:15:25 +02:00
antirez
6798736909 Modules: remove unused var in example module. 2017-05-03 14:10:21 +02:00
antirez
1ed2ff5570 Modules TSC: HELLO.KEYS example draft finished. 2017-05-03 14:08:12 +02:00
antirez
7127f15ebe Module: fix RedisModule_Call() "l" specifier to create a raw string. 2017-05-03 14:07:10 +02:00
antirez
3fcf959e60 Modules TSC: Release the GIL for all the time we are blocked.
Instead of giving the module background operations just a small time to
run in the beforeSleep() function, we can have the lock released for all
the time we are blocked in the multiplexing syscall.
2017-05-03 11:26:21 +02:00
antirez
ba4a5a3255 Modules TSC: Export symbols of the new API. 2017-05-02 15:19:28 +02:00
antirez
275905b328 Modules TSC: Handling of RM_Reply* functions. 2017-05-02 15:05:39 +02:00
antirez
9c500b89fb Modules TSC: Basic TS context creation and handling. 2017-05-02 12:53:10 +02:00
antirez
59b06b14c9 Modules TSC: GIL and cooperative multi tasking setup. 2017-04-28 18:41:10 +02:00
antirez
469d6e2b37 PSYNC2: fix master cleanup when caching it.
The master client cleanup was incomplete: resetClient() was missing and
the output buffer of the client was not reset, so pending commands
related to the previous connection could be still sent.

The first problem caused the client argument vector to be, at times,
half populated, so that when the correct replication stream arrived the
protocol got mixed with the arguments creating invalid commands that nobody
called.

Thanks to @yangsiran for also investigating this problem, after
already providing important design / implementation hints for the
original PSYNC2 issues (see referenced Github issue).

Note that this commit adds a new function to the list library of Redis
in order to be able to reset a list without destroying it.

Related to issue #3899.
2017-04-27 17:08:37 +02:00
antirez
238cebdd5e Check event loop creation return value. Fix #3951.
Normally we never check for OOM conditions inside Redis since the
allocator will always return a pointer or abort the program on OOM
conditions. However we cannot have control on epoll_create(), that may
fail for kernel OOM (according to the manual page) even if all the
parameters are correct, so the function aeCreateEventLoop() may indeed
return NULL and this condition must be checked.
2017-04-21 16:27:38 +02:00
Salvatore Sanfilippo
3773c06d28 Merge pull request #3950 from kensou97/unstable
update block->free after some diff data are written to the child process
2017-04-20 07:55:51 +02:00
antirez
7d9dd80db3 Fix getKeysUsingCommandTable() in cluster mode.
Close #3940.
2017-04-19 16:17:08 +02:00
antirez
189a12afb4 PSYNC2: discard pending transactions from cached master.
During the review of the fix for #3899, @yangsiran identified an
implementation bug: given that the offset is now relative to the applied
part of the replication log, when we cache a master, the successive
PSYNC2 request will be made in order to *include* the transaction that
was not completely processed. This means that we need to discard any
pending transaction from our replication buffer: it will be re-executed.
2017-04-19 14:02:52 +02:00
antirez
22be435efe Fix PSYNC2 incomplete command bug as described in #3899.
This bug was discovered by @kevinmcgehee and constituted a major hidden
bug in the PSYNC2 implementation, caused by the propagation from the
master of incomplete commands to slaves.

The bug had several results:

1. Borrowing from Kevin's text in the issue: "Given that slaves blindly
copy over their master's input into their own replication backlog over
successive read syscalls, it's possible that with large commands or
small TCP buffers, partial commands are present in this buffer. If the
master were to fail before successfully propagating the entire command
to a slave, the slaves will never execute the partial command (since the
client is invalidated) but will copy it to replication backlog which may
relay those invalid bytes to its slaves on PSYNC2, corrupting the
backlog and possibly other valid commands that follow the failover.
Simple command boundaries aren't sufficient to capture this, either,
because in the case of a MULTI/EXEC block, if the master successfully
propagates a subset of the commands but not the EXEC, then the
transaction in the backlog becomes corrupt and could corrupt other
slaves that consume this data."

2. As identified by @yangsiran later, there is another effect of the
bug. For the same mechanism of the first problem, a slave having another
slave, could receive a full resynchronization request with an already
half-applied command in the backlog. Once the RDB is ready, it will be
sent to the slave, and the replication will continue sending to the
sub-slave the other half of the command, which is not valid.

The fix, designed by @yangsiran and @antirez, and implemented by
@antirez, uses a secondary buffer in order to feed the sub-masters and
update the replication backlog and offsets, only when a given part of
the query buffer is actually *applied* to the state of the instance,
that is, when the command gets processed and the command is not pending
in the Redis transaction buffer because of CLIENT_MULTI state.

Given that now the backlog and offsets representation are in agreement
with the actual processed commands, both issue 1 and 2 should no longer
be possible.
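
The bookkeeping behind the fix can be sketched as follows. This is a
simplified model with hypothetical names, not the replication code: it
tracks two offsets and only advances the "applied" one (the part that
may be fed to the backlog and to sub-slaves) when a command has been
processed and no transaction is pending. DISCARD and error handling are
left out.

```c
#include <string.h>

typedef struct {
    long long processed_off; /* end of the last fully parsed command */
    long long applied_off;   /* end of the last command applied to the state */
    int in_multi;            /* nonzero while a transaction is being queued */
} master_link;

/* Account for one complete command of `len` bytes named `name`.
 * Returns how many bytes may now be fed to the backlog and sub-slaves. */
long long command_processed(master_link *m, const char *name, long long len) {
    m->processed_off += len;
    if (strcmp(name, "MULTI") == 0) m->in_multi = 1;
    else if (strcmp(name, "EXEC") == 0) m->in_multi = 0;
    if (m->in_multi) return 0;   /* queued, not applied yet */
    long long ready = m->processed_off - m->applied_off;
    m->applied_off = m->processed_off;
    return ready;
}
```

In this model a whole MULTI/EXEC block becomes visible to the backlog
in one step, when EXEC is processed.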

Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in
identifying and designing a fix for this problem.
2017-04-19 10:25:45 +02:00
Salvatore Sanfilippo
27fe8e9fb2 Merge pull request #3945 from badboy/dicthash-bench-compile
Reorder to make dict-benchmark compile on Linux
2017-04-18 16:31:18 +02:00
antirez
02d02a3754 Fix #3848 by closing the descriptor on error. 2017-04-18 16:24:06 +02:00
antirez
da2f9cd186 Fix descriptor leak. Close #3848. 2017-04-18 16:15:16 +02:00
张文康
5f88bd320e update block->free after some diff data are written to the child process 2017-04-18 20:10:08 +08:00
antirez
c33493277a Clarify why we save ziplist elements in reverse order.
Also get rid of variables that are now kinda redundant, since the
dictionary iterator was removed.

This is related to PR #3949.
2017-04-18 11:01:47 +02:00
Jan-Erik Rediger
c4ad4765b0 Reorder to make dict-benchmark compile on Linux
Fixes #3944
2017-04-17 13:37:59 +02:00
spinlock
23ec36909e rdb: saving skiplist in reversed order to accelerate the deserialisation process 2017-04-17 13:22:34 +08:00
antirez
271733f4f8 Cluster: discard pong times in the future.
However we allow for 500 milliseconds of tolerance, in order to
avoid often discarding semantically valid info (the node is up)
because of natural few milliseconds desync among servers even when
NTP is used.
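
A minimal sketch of the check, assuming a hypothetical helper name and
millisecond Unix times on both sides (only the 500 milliseconds
tolerance comes from the text above):

```c
#include <stdint.h>

/* Tolerance taken from the commit message: 500 milliseconds. */
#define PONG_FUTURE_TOLERANCE_MS 500

/* Hypothetical helper: return 1 if a pong time reported by another node
 * may be used to update our view, 0 if it must be discarded.
 * Both arguments are Unix times in milliseconds. */
int pong_time_is_acceptable(int64_t reported_ms, int64_t now_ms) {
    return reported_ms <= now_ms + PONG_FUTURE_TOLERANCE_MS;
}
```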

Note that anyway we should ping the node from time to time regardless and
discover if it's actually down from our point of view, since no update
is accepted while we have an active ping on the node.

Related to #3929.
2017-04-15 10:12:08 +02:00
antirez
02777bb252 Cluster: always add PFAIL nodes at end of gossip section.
To rely on the fact that nodes in PFAIL state will be shared around by
randomly adding them in the gossip section is a weak assumption,
especially after changes related to sending less ping/pong packets.

We want to always include gossip entries for all the nodes that are in
PFAIL state, so that the PFAIL -> FAIL state promotion can happen much
faster and reliably.

Related to #3929.
2017-04-14 13:39:49 +02:00
antirez
8c829d9e43 Cluster: fix gossip section ping/pong times encoding.
The gossip section times are 32 bit, so cannot store the milliseconds
time but just the seconds approximation, which is good enough for our
uses. At the same time however, when comparing the gossip section times
of other nodes with our node's view, we need to convert back to
milliseconds.
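
The round trip can be sketched like this. The helper names are
hypothetical and network byte order handling is left out; the sketch
only shows the seconds/milliseconds conversion on the two sides.

```c
#include <stdint.h>

/* Encode a millisecond timestamp into a 32-bit seconds field
 * (the sub-second part is dropped). */
uint32_t gossip_encode_time(int64_t ms) {
    return (uint32_t)(ms / 1000);
}

/* Decode: convert back to milliseconds before comparing with local times. */
int64_t gossip_decode_time(uint32_t seconds) {
    return (int64_t)seconds * 1000;
}
```

Comparing the raw 32-bit field against a local millisecond time would
make every gossiped time look roughly a thousand times older than it is.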

Related to #3929. Without this change the patch to reduce the traffic in
the bus message does not work.
2017-04-14 11:01:22 +02:00
antirez
6878a3fedd Cluster: add clean-logs command to create-cluster script. 2017-04-14 10:52:00 +02:00
antirez
8f7bf2841a Cluster: decrease ping/pong traffic by trusting other nodes reports.
Clusters of bigger sizes tend to have a lot of traffic in the cluster bus
just for failure detection: a node will try to get a ping reply from
another node no later than when half the node timeout has elapsed,
in order to avoid a false positive.

However this means that if we have N nodes and the node timeout is set
to, for instance M seconds, we'll have to ping N nodes every M/2
seconds. This N*M/2 pings will receive the same number of pongs, so
a total of N*M packets per node. However given that we have a total of N
nodes doing this, the total number of messages will be N*N*M.

In a 100 nodes cluster with a timeout of 60 seconds, this translates
to a total of 100*100*30 packets per second, summing all the packets
exchanged by all the nodes.

This is, as you can guess, a lot... So this patch changes the
implementation in a very simple way in order to trust the reports of
other nodes: if a node A reports a node B as alive at least up to
a given time, we update our view accordingly.

The problem with this approach is that it could result in a subset of
nodes being able to reach a given node X, and preventing others from
detecting that it is actually not reachable from the majority of nodes.
So the above algorithm is refined by trusting other nodes only if we do
not have currently a ping pending for the node X, and if there are no
failure reports for that node.

Since each node pings 10 other nodes every second anyway (one node
every 100 milliseconds), eventually, even trusting the other nodes'
reports, we will detect if a given node is down from our POV.

Now to understand the number of packets that the cluster would exchange
for failure detection with the patch, we can start by considering the
random PINGs that the cluster sends anyway as a baseline:
Each node sends 10 packets per second, so the total traffic if no
additional packets were sent, including PONG packets, would be:

    Total messages per second = N*10*2

However trusting other nodes' gossip sections will not ALWAYS prevent
pinging nodes because of the "half timeout reached" rule. The
math involved in computing the actual rate as N and M change is quite
complex and depends also on another parameter, which is the number of
entries in the gossip section of PING and PONG packets. However it is
possible to compare what happens in cluster of different sizes
experimentally. After applying this patch a very important reduction in
the number of packets exchanged is trivial to observe, without apparent
impacts on the failure detection performances.

Actual numbers with different cluster sizes should be published in the
Redis Cluster documentation in the future.

Related to #3929.
2017-04-14 10:43:53 +02:00
antirez
c5d6f577f0 Cluster: collect more specific bus messages stats.
First step towards changing Cluster in order to use fewer messages.
Related to issue #3929.
2017-04-13 19:22:35 +02:00