464 Commits

Author SHA1 Message Date
Oran Agra
bf680b6f8c slave buffers were wasteful and incorrectly counted causing eviction
A) slave buffers didn't count internal fragmentation and sds unused space,
   this caused them to induce eviction although we didn't mean for it.

B) slave buffers were consuming about twice the memory of what they actually needed.
- this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time
  but networking.c not storing more than 16k (partially fixed recently in 237a38737).
- besides it wasn't able to store half of the new string into one buffer and the
  other half into the next (so the above mentioned fix helped mainly for small items).
- lastly, the sds buffers had up to 30% internal fragmentation that was wasted,
  consumed but not used.

C) inefficient performance due to starting from a small string and reallocing many times.

what i changed:
- creating dedicated buffers for reply list, counting their size with zmalloc_size
- when creating a new reply node from, preallocate it to at least 16k.
- when appending a new reply to the buffer, first fill all the unused space of the
  previous node before starting a new one.

other changes:
- expose mem_not_counted_for_evict info field for the benefit of the test suite
- add a test to make sure slave buffers are counted correctly and that they don't cause eviction
2018-07-16 16:43:42 +03:00
dejun.xdj
61f12973f7 Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option.
Save NOACK option into client.blockingState structure.
2018-07-09 13:40:29 +02:00
dejun.xdj
289d8d9c2c CLIENT UNBLOCK: fix client unblock help message. 2018-07-09 13:03:57 +02:00
WuYunlong
0a5805d7f1 fix compile warning in addReplySubcommandSyntaxError 2018-07-09 12:57:12 +02:00
antirez
2edcafb35d addReplySubSyntaxError() renamed to addReplySubcommandSyntaxError(). 2018-07-02 18:49:34 +02:00
Salvatore Sanfilippo
bc6a004588
Merge pull request #4998 from itamarhaber/module_command_help
Module command help
2018-07-02 18:46:56 +02:00
antirez
d751d98b50 Change CLIENT LIST TYPE help string.
Making it more similar to KILL.
2018-06-29 18:03:00 +02:00
zhaozhao.zz
b9cbd04b57 clients: add type option for client list 2018-06-28 17:43:05 +08:00
zhaozhao.zz
f5538642cc clients: show pubsub flag in client list 2018-06-28 17:28:38 +08:00
antirez
ab55f9da5e Make CLIENT HELP output nicer to the eyes. 2018-06-28 00:21:32 +02:00
antirez
4a70ff7451 Add unblock in CLIENT HELP. 2018-06-28 00:17:10 +02:00
antirez
2214043b5c CLIENT UNBLOCK: support unblocking by error. 2018-06-27 18:51:06 +02:00
antirez
71295ee305 CLIENT UNBLOCK implemented. 2018-06-27 14:08:42 +02:00
antirez
fb39bfd7af Take clients in a ID -> Client handle dictionary. 2018-06-27 14:08:42 +02:00
antirez
ed65d734e7 CLIENT ID implemented. 2018-06-27 14:08:42 +02:00
zhaozhao.zz
963002d71e optimize reply list memory usage 2018-06-13 20:35:40 +08:00
Itamar Haber
76ad23d012 Adds MODULE HELP and implements addReplySubSyntaxError 2018-06-07 18:34:58 +03:00
antirez
6c4cb1670a Add top comments in two addReply*() functions. 2018-03-22 11:45:04 +01:00
antirez
b86c26b2fd Massivily simplify addReply*() functions in networking.c 2018-03-22 11:42:50 +01:00
antirez
00a29b1a81 Make addReplyError...() family functions able to get error codes.
Now you can use:

    addReplyError("-MYERRORCODE some message");

If the error code is omitted, the behavior is like in the past,
the generic -ERR will be used.
2018-03-15 12:54:10 +01:00
antirez
ccdae09046 CG: add & populate group+consumer in the blocking state. 2018-03-15 12:54:10 +01:00
antirez
3a5bf75ede Actually use ae_flags to add AE_BARRIER if needed.
Many thanks to @Plasma that spotted this problem reviewing the code.
2018-02-28 18:03:51 +01:00
antirez
75987431f0 AOF: fix a bug that may prevent proper fsyncing when fsync=always.
In case the write handler is already installed, it could happen that we
serve the reply of a query in the same event loop cycle we received it,
preventing beforeSleep() from guaranteeing that we do the AOF fsync
before sending the reply to the client.

The AE_BARRIER mechanism, introduced in a previous commit, prevents this
problem. This commit makes actual use of this new feature to fix the
bug.
2018-02-27 13:06:42 +01:00
Salvatore Sanfilippo
d8830200b4
Merge pull request #3828 from oranagra/sdsnewlen_pr
add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.
2018-02-27 04:04:32 -08:00
antirez
ae29bcd8e2 More verbose logging when slave sends errors to master.
See #3832.
2018-02-13 16:01:31 +01:00
Salvatore Sanfilippo
756df19134
Merge pull request #3832 from oranagra/slave_reply_to_master_pr
when a slave responds with an error on commands that come from master, log it
2018-02-13 15:55:26 +01:00
Guy Benoish
fd8efb7c36 Replication buffer fills up on high rate traffic.
When feeding the master with a high rate traffic the the slave's feed is much slower.
This causes the replication buffer to grow (indefinitely) which leads to slave disconnection.
The problem is that writeToClient() decides to stop writing after NET_MAX_WRITES_PER_EVENT
writes (In order to be fair to clients).
We should ignore this when the client is a slave.
It's better if clients wait longer, the alternative is that the slave has no chance to stay in
sync in this situation.
2018-01-18 12:10:48 +01:00
antirez
8075572207 New config options about protocol prefixed with "proto".
Related to #4568.
2018-01-11 11:27:41 +01:00
Oran Agra
b509a14c3e Add config options for max-bulk-len and max-querybuf-len mainly to support RESTORE of large keys 2017-12-29 12:43:48 +02:00
Oran Agra
60a4f12f8b fix processing of large bulks (above 2GB)
- protocol parsing (processMultibulkBuffer) was limitted to 32big positions in the buffer
  readQueryFromClient potential overflow
- rioWriteBulkCount used int, although rioWriteBulkString gave it size_t
- several places in sds.c that used int for string length or index.
- bugfix in RM_SaveAuxField (return was 1 or -1 and not length)
- RM_SaveStringBuffer was limitted to 32bit length
2017-12-29 12:24:19 +02:00
antirez
522760fac7 Change indentation and other minor details of PR #4489.
The main change introduced by this commit is pretending that help
arrays are more text than code, thus indenting them at level 0. This
improves readability, and is an old practice when defining arrays of
C strings describing text.

Additionally a few useless return statements are removed, and the HELP
subcommand capitalized when printed to the user.
2017-12-06 12:05:14 +01:00
Itamar Haber
482d678e95 C style 2017-12-05 19:09:19 +02:00
Itamar Haber
b23c8babed Uses an offset in addReplyHelp 2017-12-05 18:17:14 +02:00
Itamar Haber
8b51121998 Merge remote-tracking branch 'upstream/unstable' into help_subcommands 2017-12-05 18:14:59 +02:00
antirez
62a4b817c6 add linkClient(): adds the client and caches the list node.
We have this operation in two places: when caching the master and
when linking a new client after the client creation. By having an API
for this we avoid incurring in errors when modifying one of the two
places forgetting the other. The function is also a good place where to
document why we cache the linked list node.

Related to #4497 and #4210.
2017-12-05 16:02:03 +01:00
Salvatore Sanfilippo
03cfc8bf3a
Merge pull request #4497 from soloestoy/optimize-unlink-client
networking: optimize unlinkClient() in freeClient()
2017-12-05 15:51:15 +01:00
Itamar Haber
d884ba4bc9 Helps CLIENT 2017-12-03 16:49:29 +02:00
antirez
4086dff477 Streams: augment client.bpop with XREAD specific fields. 2017-12-01 10:24:24 +01:00
antirez
4a377cecd8 Streams: initial work to use blocking lists logic for streams XREAD. 2017-12-01 10:24:24 +01:00
zhaozhao.zz
43be967690 networking: optimize unlinkClient() in freeClient() 2017-11-30 18:11:05 +08:00
Itamar Haber
59d52f7fab Standardizes the 'help' subcommand
This adds a new `addReplyHelp` helper that's used by commands
when returning a help text. The following commands have been
touched: DEBUG, OBJECT, COMMAND, PUBSUB, SCRIPT and SLOWLOG.

WIP

Fix entry command table entry for OBJECT for HELP option.

After #4472 the command may have just 2 arguments.

Improve OBJECT HELP descriptions.

See #4472.

WIP 2

WIP 3
2017-11-28 21:15:45 +02:00
antirez
e203a46cf3 Clients blocked in modules: free argv/argc later.
See issue #3844 for more information.
2017-07-11 12:33:01 +02:00
spinlock
ea31a4eae3 Optimize addReplyBulkSds for better performance 2017-07-05 14:25:05 +00:00
antirez
eddd8d34c4 Add symmetrical assertion to track c->reply_buffer infinite growth.
Redis clients need to have an instantaneous idea of the amount of memory
they are consuming (if the number is not exact should at least be
proportional to the actual memory usage). We do that adding and
subtracting the SDS length when pushing / popping from the client->reply
list. However it is quite simple to add bugs in such a setup, by not
taking the objects in the list and the count in sync. For such reason,
Redis has an assertion to track counts near 2^64: those are always the
result of the counter wrapping around because we subtract more than we
add. This commit adds the symmetrical assertion: when the list is empty
since we sent everything, the reply_bytes count should be zero. Thanks
to the new assertion it should be simple to also detect the other
problem, where the count slowly increases because of over-counting.
The assertion adds a conditional in the code that sends the buffer to
the socket but should not create any measurable performance slowdown,
listLength() just accesses a structure field, and this code path is
totally dominated by write(2).

Related to #4100.
2017-07-04 11:55:05 +02:00
Salvatore Sanfilippo
ef446bf16d Merge pull request #3802 from flowly/bugfix-calc-stat-net-output-bytes
Bugfix calc stat net output bytes
2017-06-20 17:01:16 +02:00
antirez
ece658713b Modules TSC: Improve inter-thread synchronization.
More work to do with server.unixtime and similar. Need to write Helgrind
suppression file in order to suppress the valse positives.
2017-05-09 11:57:09 +02:00
antirez
22be435efe Fix PSYNC2 incomplete command bug as described in #3899.
This bug was discovered by @kevinmcgehee and constituted a major hidden
bug in the PSYNC2 implementation, caused by the propagation from the
master of incomplete commands to slaves.

The bug had several results:

1. Borrowing from Kevin text in the issue: "Given that slaves blindly
copy over their master's input into their own replication backlog over
successive read syscalls, it's possible that with large commands or
small TCP buffers, partial commands are present in this buffer. If the
master were to fail before successfully propagating the entire command
to a slave, the slaves will never execute the partial command (since the
client is invalidated) but will copy it to replication backlog which may
relay those invalid bytes to its slaves on PSYNC2, corrupting the
backlog and possibly other valid commands that follow the failover.
Simple command boundaries aren't sufficient to capture this, either,
because in the case of a MULTI/EXEC block, if the master successfully
propagates a subset of the commands but not the EXEC, then the
transaction in the backlog becomes corrupt and could corrupt other
slaves that consume this data."

2. As identified by @yangsiran later, there is another effect of the
bug. For the same mechanism of the first problem, a slave having another
slave, could receive a full resynchronization request with an already
half-applied command in the backlog. Once the RDB is ready, it will be
sent to the slave, and the replication will continue sending to the
sub-slave the other half of the command, which is not valid.

The fix, designed by @yangsiran and @antirez, and implemented by
@antirez, uses a secondary buffer in order to feed the sub-masters and
update the replication backlog and offsets, only when a given part of
the query buffer is actually *applied* to the state of the instance,
that is, when the command gets processed and the command is not pending
in the Redis transaction buffer because of CLIENT_MULTI state.

Given that now the backlog and offsets representation are in agreement
with the actual processed commands, both issue 1 and 2 should no longer
be possible.

Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in
identifying and designing a fix for this problem.
2017-04-19 10:25:45 +02:00
antirez
1210af3804 Add a top comment in crucial functions inside networking.c. 2017-04-12 10:12:27 +02:00
oranagra
161a3a174b when a slave experiances an error on commands that come from master, print to the log
since slave isn't replying to it's master, these errors go unnoticed.
since we don't expect the master to send garbadge to the slave, this should be safe.
(as long as we don't log OOM errors there)
2017-02-23 03:44:42 -08:00
oranagra
f86df924b0 add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.
this commit also contains small bugfix in rdbLoadLzfStringObject
a bug that currently has no implications.
2017-02-23 03:04:08 -08:00