Commit Graph

4488 Commits

Author SHA1 Message Date
antirez
f7d4c3acdf Streams: trap more errors in stream loading + RDB check type name. 2018-03-15 12:54:10 +01:00
antirez
8727b4845b CG: XCLAIM, use minidle and fix array len. 2018-03-15 12:54:10 +01:00
antirez
09e3b3b975 CG: remove unused argument from streamReplyWithRangeFromConsumerPEL(). 2018-03-15 12:54:10 +01:00
antirez
13ff7bc3ef CG: fix RDB saving when there are no consumer groups. 2018-03-15 12:54:10 +01:00
antirez
267f7f2c97 Streams: fix error description for XADD when specified ID is small. 2018-03-15 12:54:10 +01:00
antirez
0a6780e560 CG: XCLAIM initial draft. 2018-03-15 12:54:10 +01:00
antirez
00a29b1a81 Make addReplyError...() family functions able to get error codes.
Now you can use:

    addReplyError("-MYERRORCODE some message");

If the error code is omitted, the behavior is like in the past,
the generic -ERR will be used.
2018-03-15 12:54:10 +01:00
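A minimal sketch of the prefixing rule described in this commit message, assuming the reply formatter only needs to check whether the message already starts with '-'. The helper name format_error_reply() is hypothetical and is not the Redis function; the CRLF terminator follows the Redis protocol convention.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the actual Redis code: writes the wire form of
 * an error reply into 'out'. A message starting with '-' is assumed to
 * carry its own error code; anything else gets the generic "-ERR " prefix.
 * Returns the value of snprintf(). */
static int format_error_reply(char *out, size_t outlen, const char *msg) {
    assert(out != NULL && msg != NULL);
    const char *prefix = (msg[0] == '-') ? "" : "-ERR ";
    return snprintf(out, outlen, "%s%s\r\n", prefix, msg);
}
```

With this rule, existing callers that pass a plain message keep the old behavior, while new callers can opt into a custom code by writing it at the start of the string.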
antirez
c9d86c2b16 CG: More specific duplicated group error. 2018-03-15 12:54:10 +01:00
antirez
9f60a6bcee CG: RDB loading, fix inverted conditional. 2018-03-15 12:54:10 +01:00
antirez
f4e1a4de25 CG: RDB loading first implementation. 2018-03-15 12:54:10 +01:00
antirez
db7a5f23b4 CG: RDB saving part 2, consumers. 2018-03-15 12:54:10 +01:00
antirez
8fb6048ed0 CG: RDB saving part 1, metadata and PEL. 2018-03-15 12:54:10 +01:00
antirez
e76fb4ab25 CG: XPENDING should not create consumers and should obey count. 2018-03-15 12:54:10 +01:00
antirez
f3708af7f9 CG: XPENDING with start/stop/count variant implemented. 2018-03-15 12:54:10 +01:00
antirez
1bc31666da CG: XPENDING without start/stop variant implemented. 2018-03-15 12:54:10 +01:00
antirez
b65fe09bb8 CG: Now XREADGROUP + blocking operations work. 2018-03-15 12:54:10 +01:00
antirez
5ad29325fe CG: XACK should return zero when nothing is processed. 2018-03-15 12:54:10 +01:00
antirez
388c69fe4e CG: XACK implementation. 2018-03-15 12:54:10 +01:00
antirez
5bbd117c29 CG: XREADGROUP can fetch data from the consumer PEL. 2018-03-15 12:54:10 +01:00
antirez
aa808394f6 CG: first draft of streamReplyWithRangeFromConsumerPEL(). 2018-03-15 12:54:10 +01:00
antirez
bbec4569a5 CG: Fix order of calls in streamReplyWithRange().
We need to check if we are going to serve the request via the PEL before
inserting a deferred array len in the client output buffer.
2018-03-15 12:54:10 +01:00
antirez
41809fd969 CG: creation of NACK entries in PELs. 2018-03-15 12:54:10 +01:00
antirez
1ffb6723f5 CG: fix XREADGROUP ">" special ID parsing due to missing "continue". 2018-03-15 12:54:10 +01:00
antirez
6c0af37b6e CG: streamCompareID() + group last_id updating. 2018-03-15 12:54:10 +01:00
antirez
86fe8fde20 CG: consumer lookup + initial streamReplyWithRange() work to support CG. 2018-03-15 12:54:10 +01:00
antirez
ccdae09046 CG: add & populate group+consumer in the blocking state. 2018-03-15 12:54:10 +01:00
antirez
b8e5232161 CG: fix parsing in XREADGROUP and streamLookupCG() NULL check. 2018-03-15 12:54:10 +01:00
antirez
bd1c11dc35 CG: add XREADGROUP in the command table. 2018-03-15 12:54:10 +01:00
antirez
2bbb2bf427 CG: XREADGROUP group option parsing and groups lookup. 2018-03-15 12:54:10 +01:00
antirez
1fafe7def1 CG: fix raxFind() retval check in streamCreateCG(). 2018-03-15 12:54:10 +01:00
antirez
58f0c000a5 CG: data structures design + XGROUP CREATE implementation. 2018-03-15 12:54:10 +01:00
charsyam
c76f890209 fix listpack.c to listpack.o in Makefile 2018-03-15 20:32:08 +09:00
Otmar Ertl
15d7e61701 fixed compilation error when using clang as reported by michael-grunder 2018-03-14 21:00:06 +01:00
antirez
432bf4770e Cluster: ability to prevent slaves from failing over their masters.
This commit, in some parts derived from PR #3041 which is no longer
possible to merge (because the user deleted the original branch),
implements the ability of slaves to have a special configuration
preventing that they try to start a failover when the master is failing.

There are multiple reasons for wanting this, and the feature was
requested in issue #3021 some time ago.

The differences between this patch and the original PR are the
following:

1. The flag is saved/loaded on the nodes configuration.
2. The 'myself' node is now flag-aware, the flag is updated as needed
   when the configuration is changed via CONFIG SET.
3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent
   with existing NOADDR.
4. The redis.conf documentation was rewritten.

Thanks to @deep011 for the original patch.
2018-03-14 14:01:38 +01:00
Oran Agra
806736cdf9 Adding real allocator fragmentation to INFO and MEMORY command + active defrag test
other fixes / improvements:
- LUA script memory isn't taken from zmalloc (taken from libc malloc)
  so it can cause high fragmentation ratio to be displayed (which is false)
- there was a problem with "fragmentation" info being calculated from
  RSS and used_memory sampled at different times (now sampling them together)

other details:
- adding a few more allocator info fields to INFO and MEMORY commands
- improve defrag test to measure defrag latency of big keys
- increasing the accuracy of the defrag test (by looking at real frag info)
  this way we can use an even lower threshold and still avoid false positives
- keep the old (total) "fragmentation" field unchanged, but add new ones for specific things
- add these to the MEMORY DOCTOR command
- deduct LUA memory from the rss in case of non jemalloc allocator (one for which we don't have "allocator active/used")
- reduce sampling rate of the rss and allocator info
2018-03-12 15:08:52 +02:00
Oran Agra
be1b4aa9aa active defrag v2
- big keys are not defragged in one go from within the dict scan
  instead they are scanned in parts after the main dict hash bucket is done.
- add latency monitor sample for defrag
- change default active-defrag-cycle-min to induce lower latency
- make active defrag start a new scan right away if needed, so it's easier
  (for the test suite) to detect when it's done
- make active defrag quit the current cycle after each db / big key
- defrag some non key long term global allocations
- some refactoring for smaller functions and more reusable code
- during dict rehashing, one scan iteration of the dict, can end up scanning
  one bucket in the smaller dict and many many buckets in the larger dict.
  so waiting for 16 scan iterations before checking the time, may be much too long.
2018-03-12 15:07:43 +02:00
Otmar Ertl
97bde9f623 use all 64 bits of the hash value instead of 63 2018-03-11 09:18:00 +01:00
Otmar Ertl
44698f45e7 made constant static 2018-03-10 20:44:20 +01:00
Otmar Ertl
633983d479 improved definition of HLL_Q 2018-03-10 20:22:42 +01:00
Otmar Ertl
1e9a774871 improved HyperLogLog cardinality estimation
based on method described in https://arxiv.org/abs/1702.01284
that does not rely on any magic constants
2018-03-10 20:13:21 +01:00
Otmar Ertl
6470b21f59 replaced tab by spaces 2018-03-10 20:09:41 +01:00
antirez
84b281209a Stream: update the listpack pointer in streamTrimByLength(). 2018-03-01 17:26:02 +01:00
antirez
efcbc01fbd Remove warning from lpGet snprintf(). 2018-03-01 15:26:27 +01:00
antirez
d63caaa820 redis-cli: fix missed unit in array. Change define name. 2018-03-01 15:06:41 +01:00
charsyam
da7f5700cf refactoring-call-aeDeleteFileEvent-twice-in-freeClusterLink 2018-03-01 22:30:39 +09:00
charsyam
51a03f6356 fix dlopen leak 2018-03-01 21:22:42 +09:00
Salvatore Sanfilippo
83b5b5a476
Merge pull request #4714 from charsyam/feature/fix-out-of-index-range
[BugFix] Fix out of array index range for findBigKeys in redis-cli
2018-03-01 03:39:15 -08:00
antirez
3a5bf75ede Actually use ae_flags to add AE_BARRIER if needed.
Many thanks to @Plasma that spotted this problem reviewing the code.
2018-02-28 18:03:51 +01:00
Salvatore Sanfilippo
7a73db7512
Merge pull request #4715 from charsyam/feature/refactoring-make-condition-clear-for-rdb
[BugFix] fix calculation length in rdbSaveAuxField
2018-02-27 10:15:27 -08:00
antirez
92696e49d2 expireIfNeeded() needed a top comment documenting the behavior. 2018-02-27 16:44:43 +01:00
antirez
b00c4ffab5 expireIfNeeded() comment: claim -> pretend. 2018-02-27 16:37:37 +01:00
charsyam
76386c48b8 refactoring-make-condition-clear-for-rdb 2018-02-27 21:55:20 +09:00
charsyam
6168d5a1a6 fix-out-of-index-range-for-redis-cli-findbigkey 2018-02-27 21:46:19 +09:00
antirez
956350ef89 ae.c: instead of not firing, on AE_BARRIER invert the sequence.
AE_BARRIER was implemented like:

    - Fire the readable event.
    - Do not fire the writable event if the readable fired.

However this may lead to the writable event never being called if the
readable event is always fired. There is an alternative: we can just
invert the sequence of the calls in case AE_BARRIER is set. This commit
does that.
2018-02-27 13:06:42 +01:00
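The inverted call order can be sketched as follows. This is an illustration under assumptions, not the ae.c code: the file_event struct, dispatch(), the two demo handlers and the numeric flag values are all hypothetical. Only the flag names AE_WRITABLE and AE_BARRIER come from the commit messages.

```c
#include <assert.h>
#include <string.h>

/* Flag values are assumptions made for this sketch. */
#define AE_READABLE 1
#define AE_WRITABLE 2
#define AE_BARRIER  4

typedef void handler_fn(int fd, void *ctx);

/* Hypothetical per-descriptor registration record. */
typedef struct {
    int mask;            /* registered events, optionally with AE_BARRIER */
    handler_fn *rfile;   /* readable handler */
    handler_fn *wfile;   /* writable handler */
    void *ctx;
} file_event;

/* Calls the handlers for the events in 'fired'. Without AE_BARRIER the
 * order is read then write; with AE_BARRIER the order is inverted, so the
 * writable handler still runs even when the descriptor is always readable.
 * Returns the number of handlers called. */
static int dispatch(file_event *fe, int fd, int fired) {
    assert(fe != NULL && fe->rfile != NULL && fe->wfile != NULL);
    int invert = fe->mask & AE_BARRIER;
    int calls = 0;
    if (!invert && (fe->mask & fired & AE_READABLE)) { fe->rfile(fd, fe->ctx); calls++; }
    if (fe->mask & fired & AE_WRITABLE) { fe->wfile(fd, fe->ctx); calls++; }
    if (invert && (fe->mask & fired & AE_READABLE)) { fe->rfile(fd, fe->ctx); calls++; }
    return calls;
}

/* Demo handlers that record the call order into a caller-provided string. */
static void on_read(int fd, void *ctx)  { (void)fd; strcat((char *)ctx, "R"); }
static void on_write(int fd, void *ctx) { (void)fd; strcat((char *)ctx, "W"); }
```

Firing the write handler first means replies queued in a previous iteration go out before new queries are read, so a reply never shares an iteration with the query that produced it.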
antirez
75987431f0 AOF: fix a bug that may prevent proper fsyncing when fsync=always.
In case the write handler is already installed, it could happen that we
serve the reply of a query in the same event loop cycle we received it,
preventing beforeSleep() from guaranteeing that we do the AOF fsync
before sending the reply to the client.

The AE_BARRIER mechanism, introduced in a previous commit, prevents this
problem. This commit makes actual use of this new feature to fix the
bug.
2018-02-27 13:06:42 +01:00
antirez
533d0e0375 Cluster: improve crash-recovery safety after failover auth vote.
Add AE_BARRIER to the writable event loop so that slaves requesting
votes can't be served before we re-enter the event loop in the next
iteration, so clusterBeforeSleep() will fsync to disk in time.
Also add the call to explicitly fsync, given that we modified the last
vote epoch variable.
2018-02-27 13:06:42 +01:00
antirez
548e478e40 ae.c: introduce the concept of read->write barrier.
AOF fsync=always, and certain Redis Cluster bus operations, require
fsyncing data to disk before replying with an acknowledgment.
In such case, in order to implement Group Commits, we want to be sure
that queries that are read in a given cycle of the event loop, are never
served to clients in the same event loop iteration. This way, by using
the event loop "before sleep" callback, we can fsync the information
just one time before returning into the event loop for the next cycle.
This is much more efficient compared to calling fsync() multiple times.

Unfortunately because of a bug, this was not always guaranteed: the
way the events were installed was the only thing that controlled it.
Normally this problem is hard to trigger when AOF is enabled
with fsync=always, because we try to flush the output buffers to the
socket directly in the beforeSleep() function of Redis. However if the
output buffers are full, we actually install a write event, and in such
a case, this bug could happen.

This change to ae.c modifies the event loop implementation to make this
concept explicit. Write events that are registered with:

    AE_WRITABLE|AE_BARRIER

Are guaranteed to never fire after the readable event was fired for the
same file descriptor. In this way we are sure that data is persisted to
disk before the client performing the operation receives an
acknowledgment.

However note that this semantics does not provide all the guarantees
that one may believe are automatically provided. Take the example of the
blocking list operations in Redis.

With AOF and fsync=always we could have:

    Client A doing: BLPOP myqueue 0
    Client B doing: RPUSH myqueue a b c

In this scenario, Client A will get the "a" element immediately after
Client B's RPUSH is executed, even before the operation is persisted.
However when Client B gets the acknowledgment, it can be sure that
"b,c" are already safe on disk inside the list.

What to note here is that it cannot be assumed that Client A receiving
the element is a guarantee that the operation succeeded from the point
of view of Client B.

This is due to the fact that the barrier exists within the same socket,
and not between different sockets. However in the case above, the
element "a" was not going to be persisted regardless, so it is a pretty
synthetic argument.
2018-02-27 13:06:42 +01:00
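The group commit idea from this commit message can be illustrated with a small simulation. Everything here is a hypothetical model (loop_state and the three phase functions are not Redis names), and the flush is only counted rather than performed with a real fsync().

```c
#include <assert.h>
#include <stddef.h>

/* Simulated event loop state; all names here are hypothetical. */
typedef struct {
    int dirty;         /* set when a write query was applied but not flushed */
    int flush_count;   /* how many times the simulated fsync ran */
    int pending_acks;  /* replies held back until the next iteration */
    int sent_acks;     /* replies delivered to clients */
} loop_state;

/* Readable phase: apply the query, but only queue the acknowledgment. */
static void handle_write_query(loop_state *s) {
    assert(s != NULL);
    s->dirty = 1;
    s->pending_acks++;
}

/* "Before sleep" phase: one flush covers every query read in this cycle. */
static void before_sleep(loop_state *s) {
    assert(s != NULL);
    if (s->dirty) {
        s->flush_count++;  /* a real server would call fsync() here */
        s->dirty = 0;
    }
}

/* Writable phase of the next iteration: acknowledge only flushed data. */
static void send_replies(loop_state *s) {
    assert(s != NULL);
    if (s->dirty) return;
    s->sent_acks += s->pending_acks;
    s->pending_acks = 0;
}
```

If three queries are handled in one cycle, before_sleep() flushes once and the following send_replies() releases all three acknowledgments, which is the saving the commit message describes compared to one fsync() per query.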
Salvatore Sanfilippo
d8830200b4
Merge pull request #3828 from oranagra/sdsnewlen_pr
add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.
2018-02-27 04:04:32 -08:00
antirez
813960dbdd Fix ziplist prevlen encoding description. See #4705. 2018-02-23 12:19:35 +01:00
gechunlin
d4e6d1086f
Update object.c 2018-02-22 20:57:54 -06:00
antirez
ffde73c57d Track number of logically expired keys still in memory.
This commit adds two new fields in the INFO output, stats section:

expired_stale_perc:0.34
expired_time_cap_reached_count:58

The first field is an estimate of the number of keys that are still in
memory but are already logically expired. The reason why those keys are
not yet reclaimed is because the active expire cycle can't spend more
time on the process of reclaiming the keys, and at the same time nobody
is accessing such keys. However as the active expire cycle runs, while
it will eventually have to return to the caller, because of time limit
or because there are less than 25% of keys logically expired in each
given database, it collects the stats in order to populate this INFO
field.

Note that expired_stale_perc is a running average, where the current
sample accounts for 5% and the history for 95%, so you'll see it
changing smoothly over time.

The other field, expired_time_cap_reached_count, counts the number
of times the expire cycle had to stop, even if still it was finding a
sizeable number of keys yet to expire, because of the time limit.
This allows people handling operations to understand if the Redis
server, during mass-expiration events, is able to collect keys fast
enough usually. It is normal for this field to increment during mass
expires, but normally it should very rarely increment. When instead it
constantly increments, it means that the current workload is using
a very important percentage of CPU time to expire keys.

This feature was created thanks to the hints of Rashmi Ramesh and
Bart Robinson from Twitter. In private email exchanges, they noted how
it was important to improve the observability of this parameter in the
Redis server. Actually in big deployments, the amount of keys that are
yet to expire in each server, even if they are logically expired, may
account for a very big amount of wasted memory.
2018-02-19 11:12:49 +01:00
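The smoothing described for expired_stale_perc (current sample 5%, history 95%) is an exponentially weighted running average. A minimal sketch, assuming both values are fractions between 0 and 1; the function name update_stale_perc() is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: folds a new sample into the running average.
 * The current sample accounts for 5% and the history for 95%, as stated
 * in the commit message. */
static double update_stale_perc(double history, double sample) {
    assert(sample >= 0.0 && sample <= 1.0);
    return sample * 0.05 + history * 0.95;
}
```

Starting from a history of 0, a single sample of 1.0 moves the average to only 0.05, which is why the field changes smoothly rather than jumping with each expire cycle.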
antirez
aa57481d8c Remove non semantical spaces from module.c. 2018-02-15 21:41:03 +01:00
Salvatore Sanfilippo
7830f8492f
Merge pull request #4479 from dvirsky/notify
Keyspace notifications API for modules
2018-02-15 21:36:32 +01:00
antirez
f4dc736cca Fix typo in notifyKeyspaceEvent() comment. 2018-02-15 21:33:06 +01:00
Dvir Volk
0a36196ce4 Add doc comment about notification flags 2018-02-14 21:54:00 +02:00
Dvir Volk
10efdf307b Add REDISMODULE_NOTIFY_STREAM flag to support stream notifications 2018-02-14 21:50:42 +02:00
Dvir Volk
613831f820 Fix indentation and comment style in testmodule 2018-02-14 21:43:06 +02:00
Dvir Volk
f27a64232e Use one static client for all keyspace notification callbacks 2018-02-14 21:40:10 +02:00
Dvir Volk
3aab12414f Remove the NOTIFY_MODULE flag and simplify the module notification flow if there aren't subscribers 2018-02-14 21:40:10 +02:00
Dvir Volk
a8e2e99a88 Document flags for notifications 2018-02-14 21:38:58 +02:00
Dvir Volk
d4d753dae4 removed some trailing whitespaces 2018-02-14 21:38:58 +02:00
Dvir Volk
5b7b12e38f removed hellonotify.c 2018-02-14 21:38:58 +02:00
Dvir Volk
896db12b41 fixed test 2018-02-14 21:38:58 +02:00
Dvir Volk
2136035e47 finished implementation of notifications. Tests unfinished 2018-02-14 21:38:58 +02:00
charsyam
9d41436115 getting rid of duplicated code 2018-02-14 00:12:13 +09:00
antirez
ae29bcd8e2 More verbose logging when slave sends errors to master.
See #3832.
2018-02-13 16:01:31 +01:00
Salvatore Sanfilippo
756df19134
Merge pull request #3832 from oranagra/slave_reply_to_master_pr
when a slave responds with an error on commands that come from master, log it
2018-02-13 15:55:26 +01:00
Salvatore Sanfilippo
f9e6c2046f
Merge pull request #3745 from guybe7/unstable
enlarged buffer given to ld2string
2018-02-13 15:50:21 +01:00
antirez
c14ba46e3a Make it explicit with a comment why we kill the old AOF rewrite.
See #3858.
2018-02-13 15:43:34 +01:00
Guy Benoish
f782006782 rewriteAppendOnlyFileBackground() failure fix
It is possible to do BGREWRITEAOF even if appendonly=no. This is by design.
stopAppendonly() didn't turn off aof_rewrite_scheduled (it can be turned on
again by BGREWRITEAOF even while appendonly is off anyway).
After configuring `appendonly yes` it will see that the state is AOF_OFF,
there's no RDB fork, so it will do rewriteAppendOnlyFileBackground() which
will fail since the aof_child_pid is set (was scheduled and started by cron).

Solution:
stopAppendonly() will turn off the schedule flag (regardless of who asked for it).
startAppendonly() will terminate any existing fork and start a new one (so it is the most recent).
2018-02-13 15:41:06 +01:00
Oran Agra
8e8d957ff8 fix to latency monitor reporting wrong max latency
in some cases LATENCY HISTORY reported latency that was
higher than the max latency reported by LATENCY LATEST / DOCTOR
2018-02-13 15:58:40 +02:00
赵磊
aacecbc997 Remove updateLFU() in dbOverwrite(). 2018-02-11 21:02:07 +08:00
antirez
32ac4c64ba Rax updated to latest antirez/rax commit. 2018-02-02 11:10:18 +01:00
Salvatore Sanfilippo
4aa2ecd98b
Merge pull request #4269 from jianqingdu/unstable
fix not call va_end() when syncWrite() failed
2018-01-24 10:55:25 +01:00
Mark Nunberg
062bd733da
redismodule.h: Check ModuleNameBusy before calling it
Older versions might not have this function.
2018-01-23 10:49:18 -05:00
antirez
727dd43614 Fix migrateCommand() access of not initialized byte. 2018-01-18 12:41:05 +01:00
Guy Benoish
fd8efb7c36 Replication buffer fills up on high rate traffic.
When feeding the master with high rate traffic the slave's feed is much slower.
This causes the replication buffer to grow (indefinitely) which leads to slave disconnection.
The problem is that writeToClient() decides to stop writing after NET_MAX_WRITES_PER_EVENT
writes (In order to be fair to clients).
We should ignore this when the client is a slave.
It's better if clients wait longer, the alternative is that the slave has no chance to stay in
sync in this situation.
2018-01-18 12:10:48 +01:00
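The fairness exception described above can be sketched as a single predicate. This is an assumption-laden illustration: should_stop_writing() is a hypothetical name, the cap is treated as a byte budget, and the 64 KiB value is assumed for the example. Only the name NET_MAX_WRITES_PER_EVENT comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for this sketch. */
#define NET_MAX_WRITES_PER_EVENT (1024*64)

/* Hypothetical predicate: returns 1 when the write loop should yield to
 * other clients for this event. Slaves are exempt from the cap, so their
 * replication stream can drain as fast as the socket allows. */
static int should_stop_writing(size_t totwritten, int is_slave) {
    assert(is_slave == 0 || is_slave == 1);
    return totwritten > (size_t)NET_MAX_WRITES_PER_EVENT && !is_slave;
}
```

The trade-off is the one stated in the message: normal clients may wait longer while a slave is being fed, but the slave gets a chance to stay in sync.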
antirez
1673a3f32c Cluster: improve anti-affinity algo in redis-trib.rb.
See #3462 and related PRs.

We use a simple algorithm to calculate the level of affinity violation,
and then an optimizer that performs random swaps until things improve.
2018-01-18 11:44:19 +01:00
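The two parts named in this message, a violation score and a random-swap optimizer, can be sketched as below. This is not a port of redis-trib.rb (which is Ruby): the scoring rule, the data layout, the function names and the small linear congruential generator are all assumptions chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical score: counts pairs of nodes that serve the same shard
 * (group) while running on the same host. Zero means no violation. */
static int affinity_score(const int *group, const int *host, size_t n) {
    int score = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (group[i] == group[j] && host[i] == host[j]) score++;
    return score;
}

/* Hypothetical optimizer: tries up to 'iters' random swaps of the shard
 * assignment of two replicas, undoing any swap that makes the score worse.
 * Masters are never moved. Returns the final score. */
static int optimize(int *group, const int *host, const int *is_master,
                    size_t n, unsigned seed, int iters) {
    assert(n > 0);
    int best = affinity_score(group, host, n);
    for (int k = 0; k < iters && best > 0; k++) {
        /* Tiny LCG so the sketch is deterministic for a given seed. */
        seed = seed * 1103515245u + 12345u;
        size_t a = (seed >> 16) % n;
        seed = seed * 1103515245u + 12345u;
        size_t b = (seed >> 16) % n;
        if (a == b || is_master[a] || is_master[b]) continue;
        int tmp = group[a]; group[a] = group[b]; group[b] = tmp;
        int s = affinity_score(group, host, n);
        if (s <= best) {
            best = s;
        } else {
            tmp = group[a]; group[a] = group[b]; group[b] = tmp; /* undo */
        }
    }
    return best;
}
```

Because a swap is kept only when the score does not get worse, the result is never worse than the starting layout, though a random search of this kind is not guaranteed to reach zero.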
antirez
e1e0bbe04d Remove useless comment from serverCron().
The behavior is well specified by the code itself.
2018-01-17 11:23:41 +01:00
Salvatore Sanfilippo
a18e4c964e
Merge pull request #4546 from hqin6/unstable
fixbug for #4545 dead loop aof rewrite
2018-01-17 11:21:55 +01:00
heqin
3d3faa0a19 fixbug for #4545 dead loop aof rewrite 2018-01-17 18:08:30 +08:00
Salvatore Sanfilippo
81401878de
Merge pull request #4609 from Qinch/unstable
fix assert problem in ZIP_DECODE_PREVLENSIZE macro
2018-01-17 10:45:11 +01:00
antirez
b23927b240 Hopefully more clear comment to explain the change in #4607. 2018-01-16 15:52:13 +01:00
qinchao
1e0e168570 fix assert problem in ZIP_DECODE_PREVLENSIZE
, see issue: https://github.com/antirez/redis/issues/4587
2018-01-16 22:43:06 +08:00
Oran Agra
689b64c3ad PSYNC2 fix - promoted slave should hold on to its backlog
after a slave is promoted (assuming it has no slaves
and it booted over an hour ago), it will lose its replication
backlog at the next replication cron, rather than waiting for slaves
to connect to it.
so on a simple master/slave failover, if the new slave doesn't connect
immediately, it may be too late and PSYNC2 will fail.
2018-01-16 10:10:42 +02:00
zhaozhao.zz
1b8eec3e53 aof: format code and comment 2018-01-15 13:01:03 +01:00
antirez
c45366be0a Put more details in the comment introduced by #4601. 2018-01-15 12:50:08 +01:00
Salvatore Sanfilippo
1ed5ac7ce5
Merge pull request #4601 from soloestoy/fix-memoryleak-for-lazy-server-del
lazyfree: fix memory leak for lazyfree-lazy-server-del
2018-01-15 12:43:55 +01:00
zhaozhao.zz
0517ab8397 lazyfree: fix memory leak for lazyfree-lazy-server-del 2018-01-15 00:45:37 +08:00
Salvatore Sanfilippo
aeeb747796
Merge pull request #4575 from soloestoy/bugfix-benchmark
redis-benchmark: bugfix - handle zero liveclients in right way
2018-01-12 17:43:01 +01:00