71 Commits

Author SHA1 Message Date
Oran Agra
4447ddc8bb Keep track of meaningful replication offset in replicas too
Now both master and replicas keep track of the last replication offset
that contains meaningful data (ignoring the tailing pings), and both
trim that tail from the replication backlog, and the offset with which
they try to use for psync.

the implication is that if someone missed some pings, or even have
excessive pings that the promoted replica has, it'll still be able to
psync (avoid full sync).

the downside (which was already committed) is that replicas running old
code may fail to psync, since the promoted replica trims pings form it's
backlog.

This commit adds a test that reproduces several cases of promotions and
demotions with stale and non-stale pings

Background:
The mearningful offset on the master was added recently to solve a problem were
the master is left all alone, injecting PINGs into it's backlog when no one is
listening and then gets demoted and tries to replicate from a replica that didn't
have any of the PINGs (or at least not the last ones).

however, consider this case:
master A has two replicas (B and C) replicating directly from it.
there's no traffic at all, and also no network issues, just many pings in the
tail of the backlog. now B gets promoted, A becomes a replica of B, and C
remains a replica of A. when A gets demoted, it trims the pings from its
backlog, and successfully replicate from B. however, C is still aware of
these PINGs, when it'll disconnect and re-connect to A, it'll ask for something
that's not in the backlog anymore (since A trimmed the tail of it's backlog),
and be forced to do a full sync (something it didn't have to do before the
meaningful offset fix).

Besides that, the psync2 test was always failing randomly here and there, it
turns out the reason were PINGs. Investigating it shows the following scenario:

cycle 1: redis #1 is master, and all the rest are direct replicas of #1
cycle 2: redis #2 is promoted to master, #1 is a replica of #2 and #3 is replica of #1
now we see that when #1 is demoted it prints:
17339:S 21 Apr 2020 11:16:38.523 * Using the meaningful offset 3929963 instead of 3929977 to exclude the final PINGs (14 bytes difference)
17339:S 21 Apr 2020 11:16:39.391 * Trying a partial resynchronization (request e2b3f8817735fdfe5fa4626766daa938b61419e5:3929964).
17339:S 21 Apr 2020 11:16:39.392 * Successful partial resynchronization with master.
and when #3 connects to the demoted #2, #2 says:
17339:S 21 Apr 2020 11:16:40.084 * Partial resynchronization not accepted: Requested offset for secondary ID was 3929978, but I can reply up to 3929964

so the issue here is that the meaningful offset feature saved the day for the
demoted master (since it needs to sync from a replica that didn't get the last
ping), but it didn't help one of the other replicas which did get the last ping.
2020-04-27 15:52:23 +02:00
antirez
96a54866ab Speedup: unblock clients on keys in O(1).
See #7071.
2020-04-08 12:55:57 +02:00
antirez
dd7e61d77f timeout.c created: move client timeouts code there. 2020-03-27 16:35:03 +01:00
antirez
0e22cb2680 Precise timeouts: cleaup the table on unblock.
Now that this mechanism is the sole one used for blocked clients
timeouts, it is more wise to cleanup the table when the client unblocks
for any reason. We use a flag: CLIENT_IN_TO_TABLE, in order to avoid a
radix tree lookup when the client was already removed from the table
because we processed it by scanning the radix tree.
2020-03-27 16:35:03 +01:00
antirez
aa9d92d94a Precise timeouts: use only radix tree for timeouts. 2020-03-27 16:35:03 +01:00
antirez
8d11e0df7a Precise timeouts: fix bugs in initial implementation. 2020-03-27 16:35:03 +01:00
antirez
324a8c91d0 Precise timeouts: working initial implementation. 2020-03-27 16:35:03 +01:00
Guy Benoish
1f75ce30df Stream: Handle streamID-related edge cases
This commit solves several edge cases that are related to
exhausting the streamID limits: We should correctly calculate
the succeeding streamID instead of blindly incrementing 'seq'
This affects both XREAD and XADD.

Other (unrelated) changes:
Reply with a better error message when trying to add an entry
to a stream that has exhausted last_id
2019-12-26 15:31:37 +05:30
antirez
ce03d68332 Rename var to fixed_time_expire now that is more general. 2019-11-19 11:28:04 +01:00
antirez
b42466b925 Fix patch provided in #6554. 2019-11-19 11:23:43 +01:00
zhaozhao.zz
7059eceeb0 expires & blocking: handle ready keys as call() 2019-11-08 19:06:51 +08:00
antirez
dd5feec5e8 Modules: block on keys: fix stale comment. 2019-10-31 17:45:07 +01:00
antirez
66f55bc5c1 Modules: block on keys: fix bugs in processing order. 2019-10-31 12:23:55 +01:00
antirez
91f4bdc9f9 Modules: block on keys: use a better interface.
Using the is_key_ready() callback plus the reply callback later, creates
different issues AFAIK:

1. More complex API.
2. We need to call the reply callback() ASAP if the is_key_ready()
interface returned success, however the internals do not work in that
way, so when the reply callback is called the setup could be different.
To fix that, there is to break the current design that handles the
unblocked clients asyncrhonously, and run the list ASAP.
2019-10-31 11:35:07 +01:00
antirez
215b72c0ba Modules: block on keys: implement the internals. 2019-10-30 10:57:44 +01:00
antirez
a092f20d87 handleClientsBlockedOnKeys() refactoring. 2019-09-06 12:24:26 +02:00
antirez
89ad0ca566 Fix handleClientsBlockedOnKeys() names in comments. 2019-09-05 13:05:57 +02:00
Salvatore Sanfilippo
e5acc5ef4f
Merge pull request #2774 from rouzier/blocking-list-commands-support-milliseconds-floating
Added millisecond resolution for blpop command && friends
2019-03-12 18:10:28 +01:00
antirez
8a0391fbc9 RESP3: t_stream.c updated. 2019-01-09 17:00:29 +01:00
antirez
3fd78f41e8 RESP3: restore the concept of null array for RESP2 compat. 2019-01-09 17:00:29 +01:00
antirez
071da9844c RESP3: blocked.c updated. 2019-01-09 17:00:29 +01:00
antirez
05e8db24ed Slave removal: blocked.c logs fixed. 2018-09-11 15:32:28 +02:00
antirez
6c001bfc0d Unblocked clients API refactoring. See #4418. 2018-09-03 18:39:18 +02:00
antirez
3e7349fdaf Make pending buffer processing safe for CLIENT_MASTER client.
Related to #5305.
2018-09-03 18:17:31 +02:00
zhaozhao.zz
9a65f9cd3e block: format code 2018-08-14 20:59:32 +08:00
dejun.xdj
491682a668 Streams: using streamCompareID() instead of direct compare in block.c. 2018-07-14 15:03:05 +08:00
antirez
0420c3276f Merge branch 'unstable' of github.com:/antirez/redis into unstable 2018-07-10 12:06:44 +02:00
antirez
28e95c7c52 Streams: fix typo "consumer". 2018-07-10 12:04:31 +02:00
antirez
a71e814853 Streams: send an error to consumers blocked on non-existing group.
To detect when the group (or the whole key) is destroyed to send an
error to the consumers blocked in such group is a problem, so we leave
the consumers listening, the sysadmin is free to create or destroy
groups assuming she/he knows what to do. However a client may be blocked
in a given consumer group, that is later destroyed. Then the stream
receives new elements. In that case there is no sane behavior to serve
the consumer... but to report an error about the group no longer
existing.

More about detecting this synchronously and why it is not done:

1. Normally we don't do that, we leave clients blocked for other data
types such as lists.

2. When we free a stream object there is no longer information about
what was the key it was associated with, so while destroying the
consumer groups we miss the info to unblock the clients in that moment.

3. Objects can be reclaimed in other threads where it is no longer safe
to do client operations.
2018-07-10 11:19:06 +02:00
antirez
09327f11dd Streams: fix unblocking logic into a consumer group.
When a client blocks for a consumer group, we don't know the actual ID
we want to be served: other clients blocked in the same consumer group
may be served first, so the consumer group latest delivered ID changes.
This was not handled correctly, all the clients in the consumer group
were unblocked without data but the first.
2018-07-10 11:11:41 +02:00
dejun.xdj
61f12973f7 Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option.
Save NOACK option into client.blockingState structure.
2018-07-09 13:40:29 +02:00
antirez
34bd44187a Fix client unblocking for XREADGROUP, issue #4978.
We unblocked the client too early, when the group name object was no
longer valid in client->bpop, so propagating XCLAIM later in
streamPropagateXCLAIM() deferenced a field already set to NULL.
2018-06-11 16:51:06 +02:00
zhaozhao.zz
b9d19371e4 ZPOP: unblock multiple clients in right way 2018-05-31 23:35:47 +08:00
antirez
25f017e563 ZPOP: fix replication of blocking ZPOP. 2018-05-15 16:03:56 +02:00
antirez
56bbab238a ZPOP: change sync ZPOP to have a count argument instead of N keys.
Usually blocking operations make a lot of sense with multiple keys so
that we can listen to multiple queues (or whatever the app models) with
a single connection. However in the synchronous case it is more useful
to be able to ask for N elements. This is a change that I also wanted to
perform soon or later in the blocking list variant, but here it is more
natural since there is no reply type difference.
2018-05-11 18:00:32 +02:00
antirez
6efb6c1e06 ZPOP: renaming to have explicit MIN/MAX score idea.
This commit also adds a top comment about a subtle behavior of mixing
blocking operations of different types in the same key.
2018-05-11 17:31:53 +02:00
Itamar Haber
438125b47c Implements [B]Z[REV]POP and the respective unit tests
An implementation of the
[Ze POP Redis Module](https://github.com/itamarhaber/zpop) as core
Redis commands.

Fixes #1861.
2018-04-30 02:10:42 +03:00
Salvatore Sanfilippo
3163c9cb63
Merge pull request #4781 from guybe7/block_list_notify
Make blocking list commands send keyspace notifications
2018-03-22 16:21:19 +01:00
Guy Benoish
fa00e20b16 Make blocking list commands send keyspace notifications 2018-03-22 17:22:26 +07:00
antirez
0b58ad301e CG: Replication WIP 1: XREADGROUP and XCLAIM propagated as XCLAIM. 2018-03-19 18:02:19 +01:00
antirez
e76fb4ab25 CG: XPENDING should not create consumers and obey to count. 2018-03-15 12:54:10 +01:00
antirez
b65fe09bb8 CG: Now XREADGROUP + blocking operations work. 2018-03-15 12:54:10 +01:00
antirez
41809fd969 CG: creation of NACK entries in PELs. 2018-03-15 12:54:10 +01:00
antirez
86fe8fde20 CG: consumer lookup + initial streamReplyWithRange() work to supprot CG. 2018-03-15 12:54:10 +01:00
antirez
ccdae09046 CG: add & populate group+consumer in the blocking state. 2018-03-15 12:54:10 +01:00
antirez
ee3490ec48 Streams: state machine for reverse iteration WIP 1. 2017-12-01 10:24:25 +01:00
antirez
8f00cf85a7 Streams: fixed memory leaks when blocking again for same stream.
blockForKeys() was not freeing the allocation holding the ID when the
key was already found busy. Fortunately the unit test checked explicitly
for blocking multiple times for the same key (copying a regression in
the blocking lists tests), so the bug was detected by the Redis test leak
checker.
2017-12-01 10:24:24 +01:00
antirez
c128190026 Streams: fix handleClientsBlockedOnKeys() access to invalid ID. 2017-12-01 10:24:24 +01:00
antirez
6468cb2e82 Streams: fix XREAD ready-key signaling.
With lists we need to signal only on key creation, but streams can
provide data to clients listening at every new item added.
To make this slightly more efficient we now track different classes of
blocked clients to avoid signaling keys when there is nobody listening.
A typical case is when the stream is used as a time series DB and
accessed only by range with XRANGE.
2017-12-01 10:24:24 +01:00
antirez
2cacdcd6f8 Streams: XREAD related code to serve blocked clients. 2017-12-01 10:24:24 +01:00