redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-24 09:08:26 -05:00

Author	SHA1	Message	Date
Oran Agra	d55598988b	fix rare replication stream corruption with disk-based replication The slave sends \n keepalive messages to the master while parsing the rdb, and later sends REPLCONF ACK once a second. rarely, the master receives both a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n... and it tries to trim two chars (\r\n) from the query buffer, trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n... then the master tries to process a command starting with '3' and replies to the slave a bunch of -ERR and one +OK. although the slave silently ignores these (prints a log message), this corrupts the replication offset at the slave since the slave increases the replication offset, and the master did not. other than the fix in processInlineBuffer, i did several other improvements while hunting this very rare bug. - when redis replies with "unknown command" it includes a portion of the arguments, not just the command name. so it would be easier to understand what was received, in my case, on the slave side, it was -ERR, but the "arguments" were the interesting part (containing info on the error). - about a year ago i added code in addReplyErrorLength to print the error to the log in case of a reply to master (since this string isn't actually transmitted to the master), now changed that block to print a similar log message to indicate an error being sent from the master to the slave. note that the slave is marked as CLIENT_SLAVE only after PSYNC was received, so this will not cause any harm for REPLCONF, and will only indicate problems that are gonna corrupt the replication stream anyway. - two places where c->reply was emptied, and i wanted to reset sentlen. this is a precaution (i did not actually see such a problem), since a non-zero sentlen will cause corruption to be transmitted on the socket.	2018-07-17 12:51:49 +03:00
antirez	f9c84d6d39	Hopefully improve commenting of #5126 . Reading the PR gave me the opportunity to better specify what the code was doing in places where I was not immediately sure about what was going on. Moreover I documented the structure in server.h so that people reading the header file will immediately understand what the structure is useful for.	2018-07-16 17:56:54 +02:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a38737). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
dejun.xdj	61f12973f7	Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option. Save NOACK option into client.blockingState structure.	2018-07-09 13:40:29 +02:00
dejun.xdj	289d8d9c2c	CLIENT UNBLOCK: fix client unblock help message.	2018-07-09 13:03:57 +02:00
WuYunlong	0a5805d7f1	fix compile warning in addReplySubcommandSyntaxError	2018-07-09 12:57:12 +02:00
antirez	2edcafb35d	addReplySubSyntaxError() renamed to addReplySubcommandSyntaxError().	2018-07-02 18:49:34 +02:00
Salvatore Sanfilippo	bc6a004588	Merge pull request #4998 from itamarhaber/module_command_help Module command help	2018-07-02 18:46:56 +02:00
antirez	d751d98b50	Change CLIENT LIST TYPE help string. Making it more similar to KILL.	2018-06-29 18:03:00 +02:00
zhaozhao.zz	b9cbd04b57	clients: add type option for client list	2018-06-28 17:43:05 +08:00
zhaozhao.zz	f5538642cc	clients: show pubsub flag in client list	2018-06-28 17:28:38 +08:00
antirez	ab55f9da5e	Make CLIENT HELP output nicer to the eyes.	2018-06-28 00:21:32 +02:00
antirez	4a70ff7451	Add unblock in CLIENT HELP.	2018-06-28 00:17:10 +02:00
antirez	2214043b5c	CLIENT UNBLOCK: support unblocking by error.	2018-06-27 18:51:06 +02:00
antirez	71295ee305	CLIENT UNBLOCK implemented.	2018-06-27 14:08:42 +02:00
antirez	fb39bfd7af	Take clients in a ID -> Client handle dictionary.	2018-06-27 14:08:42 +02:00
antirez	ed65d734e7	CLIENT ID implemented.	2018-06-27 14:08:42 +02:00
zhaozhao.zz	963002d71e	optimize reply list memory usage	2018-06-13 20:35:40 +08:00
Itamar Haber	76ad23d012	Adds MODULE HELP and implements addReplySubSyntaxError	2018-06-07 18:34:58 +03:00
antirez	6c4cb1670a	Add top comments in two addReply*() functions.	2018-03-22 11:45:04 +01:00
antirez	b86c26b2fd	Massively simplify addReply*() functions in networking.c	2018-03-22 11:42:50 +01:00
antirez	00a29b1a81	Make addReplyError...() family functions able to get error codes. Now you can use: addReplyError("-MYERRORCODE some message"); If the error code is omitted, the behavior is like in the past, the generic -ERR will be used.	2018-03-15 12:54:10 +01:00
antirez	ccdae09046	CG: add & populate group+consumer in the blocking state.	2018-03-15 12:54:10 +01:00
antirez	3a5bf75ede	Actually use ae_flags to add AE_BARRIER if needed. Many thanks to @Plasma that spotted this problem reviewing the code.	2018-02-28 18:03:51 +01:00
antirez	75987431f0	AOF: fix a bug that may prevent proper fsyncing when fsync=always. In case the write handler is already installed, it could happen that we serve the reply of a query in the same event loop cycle we received it, preventing beforeSleep() from guaranteeing that we do the AOF fsync before sending the reply to the client. The AE_BARRIER mechanism, introduced in a previous commit, prevents this problem. This commit makes actual use of this new feature to fix the bug.	2018-02-27 13:06:42 +01:00
Salvatore Sanfilippo	d8830200b4	Merge pull request #3828 from oranagra/sdsnewlen_pr add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.	2018-02-27 04:04:32 -08:00
antirez	ae29bcd8e2	More verbose logging when slave sends errors to master. See #3832.	2018-02-13 16:01:31 +01:00
Salvatore Sanfilippo	756df19134	Merge pull request #3832 from oranagra/slave_reply_to_master_pr when a slave responds with an error on commands that come from master, log it	2018-02-13 15:55:26 +01:00
Guy Benoish	fd8efb7c36	Replication buffer fills up on high rate traffic. When feeding the master with a high rate traffic the slave's feed is much slower. This causes the replication buffer to grow (indefinitely) which leads to slave disconnection. The problem is that writeToClient() decides to stop writing after NET_MAX_WRITES_PER_EVENT writes (In order to be fair to clients). We should ignore this when the client is a slave. It's better if clients wait longer, the alternative is that the slave has no chance to stay in sync in this situation.	2018-01-18 12:10:48 +01:00
antirez	8075572207	New config options about protocol prefixed with "proto". Related to #4568.	2018-01-11 11:27:41 +01:00
Oran Agra	b509a14c3e	Add config options for max-bulk-len and max-querybuf-len mainly to support RESTORE of large keys	2017-12-29 12:43:48 +02:00
Oran Agra	60a4f12f8b	fix processing of large bulks (above 2GB) - protocol parsing (processMultibulkBuffer) was limited to 32bit positions in the buffer readQueryFromClient potential overflow - rioWriteBulkCount used int, although rioWriteBulkString gave it size_t - several places in sds.c that used int for string length or index. - bugfix in RM_SaveAuxField (return was 1 or -1 and not length) - RM_SaveStringBuffer was limited to 32bit length	2017-12-29 12:24:19 +02:00
antirez	522760fac7	Change indentation and other minor details of PR #4489 . The main change introduced by this commit is pretending that help arrays are more text than code, thus indenting them at level 0. This improves readability, and is an old practice when defining arrays of C strings describing text. Additionally a few useless return statements are removed, and the HELP subcommand capitalized when printed to the user.	2017-12-06 12:05:14 +01:00
Itamar Haber	482d678e95	C style	2017-12-05 19:09:19 +02:00
Itamar Haber	b23c8babed	Uses an offset in addReplyHelp	2017-12-05 18:17:14 +02:00
Itamar Haber	8b51121998	Merge remote-tracking branch 'upstream/unstable' into help_subcommands	2017-12-05 18:14:59 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid incurring in errors when modifying one of the two places forgetting the other. The function is also a good place where to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
Salvatore Sanfilippo	03cfc8bf3a	Merge pull request #4497 from soloestoy/optimize-unlink-client networking: optimize unlinkClient() in freeClient()	2017-12-05 15:51:15 +01:00
Itamar Haber	d884ba4bc9	Helps CLIENT	2017-12-03 16:49:29 +02:00
antirez	4086dff477	Streams: augment client.bpop with XREAD specific fields.	2017-12-01 10:24:24 +01:00
antirez	4a377cecd8	Streams: initial work to use blocking lists logic for streams XREAD.	2017-12-01 10:24:24 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
Itamar Haber	59d52f7fab	Standardizes the 'help' subcommand This adds a new `addReplyHelp` helper that's used by commands when returning a help text. The following commands have been touched: DEBUG, OBJECT, COMMAND, PUBSUB, SCRIPT and SLOWLOG. WIP Fix entry command table entry for OBJECT for HELP option. After #4472 the command may have just 2 arguments. Improve OBJECT HELP descriptions. See #4472. WIP 2 WIP 3	2017-11-28 21:15:45 +02:00
antirez	e203a46cf3	Clients blocked in modules: free argv/argc later. See issue #3844 for more information.	2017-07-11 12:33:01 +02:00
spinlock	ea31a4eae3	Optimize addReplyBulkSds for better performance	2017-07-05 14:25:05 +00:00
antirez	eddd8d34c4	Add symmetrical assertion to track c->reply_buffer infinite growth. Redis clients need to have an instantaneous idea of the amount of memory they are consuming (if the number is not exact should at least be proportional to the actual memory usage). We do that adding and subtracting the SDS length when pushing / popping from the client->reply list. However it is quite simple to add bugs in such a setup, by not taking the objects in the list and the count in sync. For such reason, Redis has an assertion to track counts near 2^64: those are always the result of the counter wrapping around because we subtract more than we add. This commit adds the symmetrical assertion: when the list is empty since we sent everything, the reply_bytes count should be zero. Thanks to the new assertion it should be simple to also detect the other problem, where the count slowly increases because of over-counting. The assertion adds a conditional in the code that sends the buffer to the socket but should not create any measurable performance slowdown, listLength() just accesses a structure field, and this code path is totally dominated by write(2). Related to #4100.	2017-07-04 11:55:05 +02:00
Salvatore Sanfilippo	ef446bf16d	Merge pull request #3802 from flowly/bugfix-calc-stat-net-output-bytes Bugfix calc stat net output bytes	2017-06-20 17:01:16 +02:00
antirez	ece658713b	Modules TSC: Improve inter-thread synchronization. More work to do with server.unixtime and similar. Need to write Helgrind suppression file in order to suppress the false positives.	2017-05-09 11:57:09 +02:00
antirez	22be435efe	Fix PSYNC2 incomplete command bug as described in #3899 . This bug was discovered by @kevinmcgehee and constituted a major hidden bug in the PSYNC2 implementation, caused by the propagation from the master of incomplete commands to slaves. The bug had several results: 1. Borrowing from Kevin text in the issue: "Given that slaves blindly copy over their master's input into their own replication backlog over successive read syscalls, it's possible that with large commands or small TCP buffers, partial commands are present in this buffer. If the master were to fail before successfully propagating the entire command to a slave, the slaves will never execute the partial command (since the client is invalidated) but will copy it to replication backlog which may relay those invalid bytes to its slaves on PSYNC2, corrupting the backlog and possibly other valid commands that follow the failover. Simple command boundaries aren't sufficient to capture this, either, because in the case of a MULTI/EXEC block, if the master successfully propagates a subset of the commands but not the EXEC, then the transaction in the backlog becomes corrupt and could corrupt other slaves that consume this data." 2. As identified by @yangsiran later, there is another effect of the bug. For the same mechanism of the first problem, a slave having another slave, could receive a full resynchronization request with an already half-applied command in the backlog. Once the RDB is ready, it will be sent to the slave, and the replication will continue sending to the sub-slave the other half of the command, which is not valid. 
The fix, designed by @yangsiran and @antirez, and implemented by @antirez, uses a secondary buffer in order to feed the sub-masters and update the replication backlog and offsets, only when a given part of the query buffer is actually applied to the state of the instance, that is, when the command gets processed and the command is not pending in the Redis transaction buffer because of CLIENT_MULTI state. Given that now the backlog and offsets representation are in agreement with the actual processed commands, both issue 1 and 2 should no longer be possible. Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in identifying and designing a fix for this problem.	2017-04-19 10:25:45 +02:00
antirez	1210af3804	Add a top comment in crucial functions inside networking.c.	2017-04-12 10:12:27 +02:00
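
Note on d55598988b: the trimming problem described in that commit message can be modelled in a few lines. The following is a simplified Python sketch, not the actual processInlineBuffer code from networking.c. The function names split_inline_buggy and split_inline_fixed are hypothetical, and the sketch assumes the buggy path always dropped two terminator bytes while the fix drops one byte for a bare \n and two for \r\n.

```python
def split_inline_buggy(buf: bytes):
    """Model of the bug: always skip two bytes after the inline line."""
    nl = buf.find(b"\n")
    if nl == -1:
        return None, buf              # no complete line yet
    end = nl
    if end > 0 and buf[end - 1:end] == b"\r":
        end -= 1                      # line was terminated by \r\n
    # Assumes a two-byte terminator even when only \n was present.
    return buf[:end], buf[end + 2:]


def split_inline_fixed(buf: bytes):
    """Model of the fix: skip exactly as many bytes as the terminator has."""
    nl = buf.find(b"\n")
    if nl == -1:
        return None, buf
    end = nl
    terminator_len = 1                # bare \n
    if end > 0 and buf[end - 1:end] == b"\r":
        end -= 1
        terminator_len = 2            # \r\n
    return buf[:end], buf[end + terminator_len:]


# A keepalive \n followed by a REPLCONF multibulk header in the same read.
data = b"\n*3\r\n$8\r\nREPLCONF\r\n"
print(split_inline_buggy(data)[1])    # remaining buffer starts with b"3"
print(split_inline_fixed(data)[1])    # remaining buffer starts with b"*3"
```

With the buggy variant the leading '*' is consumed together with the keepalive, so the rest of the buffer no longer parses as a multibulk request. With the fixed variant the REPLCONF header is left intact.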

516 Commits