redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 00:28:26 -05:00

Author	SHA1	Message	Date
Harkrishn Patro	9f8885760b	Sharded pubsub implementation (#8621 ) This commit implements a sharded pubsub implementation based off of shard channels. Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2022-01-02 16:54:47 -08:00
Viktor Söderqvist	45a155bd0f	Wait for replicas when shutting down (#9872 ) To avoid data loss, this commit adds a grace period for lagging replicas to catch up the replication offset. Done: * Wait for replicas when shutdown is triggered by SIGTERM and SIGINT. * Wait for replicas when shutdown is triggered by the SHUTDOWN command. A new blocked client type BLOCKED_SHUTDOWN is introduced, allowing multiple clients to call SHUTDOWN in parallel. Note that they don't expect a response unless an error happens and shutdown is aborted. * Log warning for each replica lagging behind when finishing shutdown. * CLIENT_PAUSE_WRITE while waiting for replicas. * Configurable grace period 'shutdown-timeout' in seconds (default 10). * New flags for the SHUTDOWN command: - NOW disables the grace period for lagging replicas. - FORCE ignores errors writing the RDB or AOF files which would normally prevent a shutdown. - ABORT cancels ongoing shutdown. Can't be combined with other flags. * New field in the output of the INFO command: 'shutdown_in_milliseconds'. The value is the remaining maximum time to wait for lagging replicas before finishing the shutdown. This field is present in the Server section only during shutdown. Not directly related: * When shutting down, if there is an AOF saving child, it is killed even if AOF is disabled. This can happen if BGREWRITEAOF is used when AOF is off. * Client pause now has end time and type (WRITE or ALL) per purpose. The different pause purposes are CLIENT PAUSE command, failover and shutdown. If clients are unpaused for one purpose, it doesn't affect client pause for other purposes. For example, the CLIENT UNPAUSE command doesn't affect client pause initiated by the failover or shutdown procedures. A completed failover or a failed shutdown doesn't unpause clients paused by the CLIENT PAUSE command. Notes: * DEBUG RESTART doesn't wait for replicas. * We already have a warning logged when a replica disconnects. This means that if any replica connection is lost during the shutdown, it is either logged as disconnected or as lagging at the time of exit. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-01-02 09:50:15 +02:00
chenyang8094	af0b50f83a	Tests: don't rely on the response of MEMORY USAGE when mem_allocator is not jemalloc (#10010 ) It turns out that libc malloc can return an allocation of a different size on requests of the same size. this means that matching MEMORY USAGE of one key to another copy of the same data can fail. Solution: Keep running the test that calls MEMORY USAGE, but ignore the response. We do that by introducing a new utility function to get the memory usage, which always returns 1 when the allocator is not jemalloc. Other changes: Some formatting for datatype2.tcl Co-authored-by: Oran Agra <oran@redislabs.com>	2021-12-27 21:37:21 +02:00
Oran Agra	b7567394e1	resolve replication test timing sensitivity - 2nd attempt (#9988 ) issue started failing after #9878 was merged (made an exiting test more sensitive) looks like #9982 didn't help, tested this one and it seems to work better. this commit does two things: 1. reduce the extra delay i added earlier and instead add more keys, the effect no duration of replication is the same, but the intervals in which the server is responsive to the tcl client is higher. 2. improve the test infra to print context when assert_error fails.	2021-12-22 23:37:12 +02:00
Oran Agra	6add1b7217	Add external test that runs without debug command (#9964 ) - add needs:debug flag for some tests - disable "save" in external tests (speedup?) - use debug_digest proc instead of debug command directly so it can be skipped - use OBJECT ENCODING instead of DEBUG OBJECT to get encoding - add a proc for OBJECT REFCOUNT so it can be skipped - move a bunch of tests in latency_monitor tests to happen later so that latency monitor has some values in it - add missing close_replication_stream calls - make sure to close the temp client if DEBUG LOG fails	2021-12-19 17:41:51 +02:00
YaacovHazan	ae2f5b7b2e	Protected configs and sensitive commands (#9920 ) Block sensitive configs and commands by default. * `enable-protected-configs` - block modification of configs with the new `PROTECTED_CONFIG` flag. Currently we add this flag to `dbfilename`, and `dir` configs, all of which are non-mutable configs that can set a file redis will write to. * `enable-debug-command` - block the `DEBUG` command * `enable-module-command` - block the `MODULE` command These have a default value set to `no`, so that these features are not exposed by default to client connections, and can only be set by modifying the config file. Users can change each of these to either `yes` (allow all access), or `local` (allow access from local TCP connections and unix domain connections) Note that this is a breaking change (specifically the part about MODULE command being disabled by default). I.e. we don't consider DEBUG command being blocked as an issue (people shouldn't have been using it), and the few configs we protected are unlikely to have been set at runtime anyway. On the other hand, it's likely to assume some users who use modules, load them from the config file anyway. Note that's the whole point of this PR, for redis to be more secure by default and reduce the attack surface on innocent users, so secure defaults will necessarily mean a breaking change.	2021-12-19 10:46:16 +02:00
ny0312	792afb4432	Introduce memory management on cluster link buffers (#9774 ) Introduce memory management on cluster link buffers: * Introduce a new `cluster-link-sendbuf-limit` config that caps memory usage of cluster bus link send buffers. * Introduce a new `CLUSTER LINKS` command that displays current TCP links to/from peers. * Introduce a new `mem_cluster_links` field under `INFO` command output, which displays the overall memory usage by all current cluster links. * Introduce a new `total_cluster_links_buffer_limit_exceeded` field under `CLUSTER INFO` command output, which displays the accumulated count of cluster links freed due to `cluster-link-sendbuf-limit`.	2021-12-16 21:56:59 -08:00
Meir Shpilraien (Spielrein)	687210f155	Add FUNCTION FLUSH command to flush all functions (#9936 ) Added `FUNCTION FLUSH` command. The new sub-command allows delete all the functions. An optional `[SYNC\|ASYNC]` argument can be given to control whether or not to flush the functions synchronously or asynchronously. if not given the default flush mode is chosen by `lazyfree-lazy-user-flush` configuration values. Add the missing `functions.tcl` test to the list of tests that are executed in test_helper.tcl, and call FUNCTION FLUSH in between servers in external mode	2021-12-16 17:58:25 +02:00
yoav-steinberg	c7dc17fc0f	Fix possible int overflow when hashing an sds. (#9916 ) This caused a crash when adding elements larger than 2GB to a set (same goes for hash keys). See #8455. Details: * The fix makes the dict hash functions receive a `size_t` instead of an `int`. In practice the dict hash functions call siphash which receives a `size_t` and the callers to the hash function pass a `size_t` to it so the fix is trivial. * The issue was recreated by attempting to add a >2gb value to a set. Appropriate tests were added where I create a set with large elements and check basic functionality on it (SADD, SCARD, SPOP, etc...). * When I added the tests I also refactored a bit all the tests code which is run under the `--large-memory` flag. This removed code duplication for the test framework's `write_big_bulk` and `write_big_bulk` code and also takes care of not allocating the test frameworks helper huge string used by these tests when not run under `--large-memory`. * I also added the _violoations.tcl_ unit tests to be part of the entire test suite and leaned up non relevant list related tests that were in there. This was done in this PR because most of the _violations_ tests are "large memory" tests.	2021-12-13 21:16:25 +02:00
guybe7	af7489886d	Obliterate STRALGO! add LCS (which only works on keys) (#9799 ) Drop the STRALGO command, now LCS is a command of its own and it only works on keys (not input strings). The motivation is that STRALGO's syntax was really messed-up... - assumes all (future) string algorithms will take similar arguments - mixes command that takes keys and one that doesn't in the same command. - make it nearly impossible to expose the right key spec in COMMAND INFO (issues cluster clients) - hard for cluster clients to determine the key names (firstkey, lastkey, etc) - hard for ACL / flags (is it a read command?) This is a breaking change.	2021-11-18 10:47:49 +02:00
yoav-steinberg	e968d9ac58	Connection leak in external tests. (#9777 ) Two issues: 1. In many tests we simply forgot to close the connections we created, which doesn't matter for normal tests where the server is killed, but creates a leak on external server tests. 2. When calling `start_server` on external test we create a fresh connection instead of really starting a new server, but never clean it at the end.	2021-11-15 11:07:43 +02:00
Ozan Tezcan	b91d8b289b	Add sanitizer support and clean up sanitizer findings (#9601 ) - Added sanitizer support. `address`, `undefined` and `thread` sanitizers are available. - To build Redis with desired sanitizer : `make SANITIZER=undefined` - There were some sanitizer findings, cleaned up codebase - Added tests with address and undefined behavior sanitizers to daily CI. - Added tests with address sanitizer to the per-PR CI (smoke out mem leaks sooner). Basically, there are three types of issues : 1- Unaligned load/store : Most probably, this issue may cause a crash on a platform that does not support unaligned access. Redis does unaligned access only on supported platforms. 2- Signed integer overflow. Although, signed overflow issue can be problematic time to time and change how compiler generates code, current findings mostly about signed shift or simple addition overflow. For most platforms Redis can be compiled for, this wouldn't cause any issue as far as I can tell (checked generated code on godbolt.org). 3 -Minor leak (redis-cli), use-after-free(just before calling exit()); UB means nothing guaranteed and risky to reason about program behavior but I don't think any of the fixes here worth backporting. As sanitizers are now part of the CI, preventing new issues will be the real benefit.	2021-11-11 13:51:33 +02:00
yoav-steinberg	cd6b3d558b	Archive external redis log in external tests (#9765 ) On test failure store the external redis server logs as CI artifacts so we can review them. Write test name to server log for external server tests. This is attempted and silently failed in case external server doesn't support it. Note that in non-external server mode we use a more robust method of writing to the log which doesn't depend on the server actually running/working. This isn't possible for externl servers and required for some complex tests which are skipped in external mode anyway. Cleanup: remove dup code.	2021-11-11 13:04:02 +02:00
chendianqiang	a527c3e814	Test suite - user server socket to optimize port detection (#9663 ) Optimized port detection for tcl, use 'socket -server' instead of 'socket' to rule out port on TIME_WAIT Co-authored-by: chendianqiang <chendianqiang@meituan.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-07 13:53:57 +02:00
perryitay	f27083a4a8	Add support for list type to store elements larger than 4GB (#9357 ) Redis lists are stored in quicklist, which is currently a linked list of ziplists. Ziplists are limited to storing elements no larger than 4GB, so when bigger items are added they're getting truncated. This PR changes quicklists so that they're capable of storing large items in quicklist nodes that are plain string buffers rather than ziplist. As part of the PR there were few other changes in redis: 1. new DEBUG sub-commands: - QUICKLIST-PACKED-THRESHOLD - set the threshold of for the node type to be plan or ziplist. default (1GB) - QUICKLIST <key> - Shows low level info about the quicklist encoding of <key> 2. rdb format change: - A new type was added - RDB_TYPE_LIST_QUICKLIST_2 . - container type (packed / plain) was added to the beginning of the rdb object (before the actual node list). 3. testing: - Tests that requires over 100MB will be by default skipped. a new flag was added to 'runtest' to run the large memory tests (not used by default) Co-authored-by: sundb <sundbcn@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-11-03 20:47:18 +02:00
Oran Agra	87321deb3f	attempt to fix tracking test issue with external tests due to lazy free (#9722 ) The External tests started failing recently for unclear reason: ``` *** [err]: Tracking invalidation message of eviction keys should be before response in tests/unit/tracking.tcl Expected '0' to be equal to 'invalidate volatile-key' (context: type eval line 21 cmd {assert_equal $res {invalidate volatile-key}} proc ::test) ``` I suspect the issue is that the used_memory sample is taken while a lazy free is still being processed.	2021-11-02 16:42:53 +02:00
Oran Agra	665e428435	Testsuite: attempt to find / avoid valgrind warnings of killed processes (#9679 ) I recently started seeing a lot of empty valgrind reports in the daily CI. i.e. prints showing valgrind header but no leak report, which causes the tests to fail https://github.com/redis/redis/runs/3991335416?check_suite_focus=true This commit change 2 things: * first, considering valgrind is just slow, we used to give processes 60 seconds timeout on shutdown instead of 10 seconds we give normally. this commit changes that to 120. * secondly, when we reach the timeout, we first try to use SIGSEGV so that maybe we'll get a stack trace indicating where redis is hang, and we only resort to SIGKILL if double that time passed. note that if there are indeed hang processes, we will normally not see that in the non-valgrind runs, since the tests didn't use to detect any failure in that case, and now they will since `crashlog_from_file` is run after `kill_server`.	2021-10-26 08:34:30 +03:00
yoav-steinberg	6600253046	Client eviction ci issues (#9549 ) Fixing CI test issues introduced in #8687 - valgrind warnings in readQueryFromClient when client was freed by processInputBuffer - adding DEBUG pause-cron for tests not to be time dependent. - skipping a test that depends on socket buffers / events not compatible with TLS - making sure client got subscribed by not using deferring client	2021-09-26 17:45:02 +03:00
Yossi Gottlieb	0af7fe2cab	Add --skipfile and --skiptest regex support. (#9555 ) Empty patterns are not considered and skipped. Also, improve help text.	2021-09-26 15:12:37 +03:00
Huang Zhw	bdecbd30df	Fix test randstring, compare string and int is wrong. (#9544 ) This will cause the generated string containing "\". Fixes a broken change in #8687	2021-09-24 16:58:38 +03:00
yoav-steinberg	2753429c99	Client eviction (#8687 ) ### Description A mechanism for disconnecting clients when the sum of all connected clients is above a configured limit. This prevents eviction or OOM caused by accumulated used memory between all clients. It's a complimentary mechanism to the `client-output-buffer-limit` mechanism which takes into account not only a single client and not only output buffers but rather all memory used by all clients. #### Design The general design is as following: * We track memory usage of each client, taking into account all memory used by the client (query buffer, output buffer, parsed arguments, etc...). This is kept up to date after reading from the socket, after processing commands and after writing to the socket. * Based on the used memory we sort all clients into buckets. Each bucket contains all clients using up up to x2 memory of the clients in the bucket below it. For example up to 1m clients, up to 2m clients, up to 4m clients, ... * Before processing a command and before sleep we check if we're over the configured limit. If we are we start disconnecting clients from larger buckets downwards until we're under the limit. #### Config `maxmemory-clients` max memory all clients are allowed to consume, above this threshold we disconnect clients. This config can either be set to 0 (meaning no limit), a size in bytes (possibly with MB/GB suffix), or as a percentage of `maxmemory` by using the `%` suffix (e.g. setting it to `10%` would mean 10% of `maxmemory`). #### Important code changes * During the development I encountered yet more situations where our io-threads access global vars. And needed to fix them. I also had to handle keeps the clients sorted into the memory buckets (which are global) while their memory usage changes in the io-thread. To achieve this I decided to simplify how we check if we're in an io-thread and make it much more explicit. I removed the `CLIENT_PENDING_READ` flag used for checking if the client is in an io-thread (it wasn't used for anything else) and just used the global `io_threads_op` variable the same way to check during writes. * I optimized the cleanup of the client from the `clients_pending_read` list on client freeing. We now store a pointer in the `client` struct to this list so we don't need to search in it (`pending_read_list_node`). * Added `evicted_clients` stat to `INFO` command. * Added `CLIENT NO-EVICT ON\|OFF` sub command to exclude a specific client from the client eviction mechanism. Added corrosponding 'e' flag in the client info string. * Added `multi-mem` field in the client info string to show how much memory is used up by buffered multi commands. * Client `tot-mem` now accounts for buffered multi-commands, pubsub patterns and channels (partially), tracking prefixes (partially). * CLIENT_CLOSE_ASAP flag is now handled in a new `beforeNextClient()` function so clients will be disconnected between processing different clients and not only before sleep. This new function can be used in the future for work we want to do outside the command processing loop but don't want to wait for all clients to be processed before we get to it. Specifically I wanted to handle output-buffer-limit related closing before we process client eviction in case the two race with each other. * Added a `DEBUG CLIENT-EVICTION` command to print out info about the client eviction buckets. * Each client now holds a pointer to the client eviction memory usage bucket it belongs to and listNode to itself in that bucket for quick removal. * Global `io_threads_op` variable now can contain a `IO_THREADS_OP_IDLE` value indicating no io-threading is currently being executed. * In order to track memory used by each clients in real-time we can't rely on updating these stats in `clientsCron()` alone anymore. So now I call `updateClientMemUsage()` (used to be `clientsCronTrackClientsMemUsage()`) after command processing, after writing data to pubsub clients, after writing the output buffer and after reading from the socket (and maybe other places too). The function is written to be fast. * Clients are evicted if needed (with appropriate log line) in `beforeSleep()` and before processing a command (before performing oom-checks and key-eviction). * All clients memory usage buckets are grouped as follows: * All clients using less than 64k. * 64K..128K * 128K..256K * ... * 2G..4G * All clients using 4g and up. * Added client-eviction.tcl with a bunch of tests for the new mechanism. * Extended maxmemory.tcl to test the interaction between maxmemory and maxmemory-clients settings. * Added an option to flag a numeric configuration variable as a "percent", this means that if we encounter a '%' after the number in the config file (or config set command) we consider it as valid. Such a number is store internally as a negative value. This way an integer value can be interpreted as either a percent (negative) or absolute value (positive). This is useful for example if some numeric configuration can optionally be set to a percentage of something else. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-09-23 14:02:16 +03:00
filipe oliveira	b5a879e1c2	Added URI support to redis-benchmark (cli and benchmark share the same uri-parsing methods) (#9314 ) - Add `-u <uri>` command line option to support `redis://` URI scheme. - included server connection information object (`struct cliConnInfo`), used to describe an ip:port pair, db num user input, and user:pass to avoid a large number of function arguments. - Using sds on connection info strings for redis-benchmark/redis-cli Co-authored-by: yoav-steinberg <yoav@monfort.co.il>	2021-09-14 19:45:06 +03:00
yoav-steinberg	4c7827588d	Fixed leaked client for "start_server" when running in --loop (#9497 ) * On `kill_server` make sure we close the default `"client"` connection. * Don't reconnect when trying to execute the client's `close` command. * On `restart_server` make sure to remove the (closed) default `"client"` after killing the old server.	2021-09-13 18:16:47 +03:00
Huang Zhw	75dd230994	bitpos/bitcount add bit index (#9324 ) Make bitpos/bitcount support bit index: ``` BITPOS key bit [start [end [BIT\|BYTE]]] BITCOUNT key [start end [BIT\|BYTE]] ``` The default behavior is `BYTE`, so these commands are still compatible with old.	2021-09-12 11:31:22 +03:00
yoav-steinberg	19bc83716c	support regex in "--only" in runtest (#9352 )	2021-08-10 14:28:24 +03:00
Oran Agra	3f3f678a47	corrupt-dump-fuzzer test, avoid creating junk keys (#9302 ) The execution of the RPOPLPUSH command by the fuzzer created junk keys, that were later being selected by RANDOMKEY and modified. This also meant that lists were statistically tested more than other files. Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check that detects that new keys are not added by the fuzzer to detect future similar issues.	2021-08-05 22:57:05 +03:00
Meir Shpilraien (Spielrein)	2237131e15	Unified Lua and modules reply parsing and added RESP3 support to RM_Call (#9202 ) ## Current state 1. Lua has its own parser that handles parsing `reds.call` replies and translates them to Lua objects that can be used by the user Lua code. The parser partially handles resp3 (missing big number, verbatim, attribute, ...) 2. Modules have their own parser that handles parsing `RM_Call` replies and translates them to RedisModuleCallReply objects. The parser does not support resp3. In addition, in the future, we want to add Redis Function (#8693) that will probably support more languages. At some point maintaining so many parsers will stop scaling (bug fixes and protocol changes will need to be applied on all of them). We will probably end up with different parsers that support different parts of the resp protocol (like we already have today with Lua and modules) ## PR Changes This PR attempt to unified the reply parsing of Lua and modules (and in the future Redis Function) by introducing a new parser unit (`resp_parser.c`). The new parser handles parsing the reply and calls different callbacks to allow the users (another unit that uses the parser, i.e, Lua, modules, or Redis Function) to analyze the reply. ### Lua API Additions The code that handles reply parsing on `scripting.c` was removed. Instead, it uses the resp_parser to parse and create a Lua object out of the reply. As mentioned above the Lua parser did not handle parsing big numbers, verbatim, and attribute. The new parser can handle those and so Lua also gets it for free. Those are translated to Lua objects in the following way: 1. Big Number - Lua table `{'big_number':'<str representation for big number>'}` 2. Verbatim - Lua table `{'verbatim_string':{'format':'<verbatim format>', 'string':'<verbatim string value>'}}` 3. Attribute - currently ignored and not expose to the Lua parser, another issue will be open to decide how to expose it. Tests were added to check resp3 reply parsing on Lua ### Modules API Additions The reply parsing code on `module.c` was also removed and the new resp_parser is used instead. In addition, the RedisModuleCallReply was also extracted to a separate unit located on `call_reply.c` (in the future, this unit will also be used by Redis Function). A nice side effect of unified parsing is that modules now also support resp3. Resp3 can be enabled by giving `3` as a parameter to the fmt argument of `RM_Call`. It is also possible to give `0`, which will indicate an auto mode. i.e, Redis will automatically chose the reply protocol base on the current client set on the RedisModuleCtx (this mode will mostly be used when the module want to pass the reply to the client as is). In addition, the following RedisModuleAPI were added to allow analyzing resp3 replies: * New RedisModuleCallReply types: * `REDISMODULE_REPLY_MAP` * `REDISMODULE_REPLY_SET` * `REDISMODULE_REPLY_BOOL` * `REDISMODULE_REPLY_DOUBLE` * `REDISMODULE_REPLY_BIG_NUMBER` * `REDISMODULE_REPLY_VERBATIM_STRING` * `REDISMODULE_REPLY_ATTRIBUTE` * New RedisModuleAPI: * `RedisModule_CallReplyDouble` - getting double value from resp3 double reply * `RedisModule_CallReplyBool` - getting boolean value from resp3 boolean reply * `RedisModule_CallReplyBigNumber` - getting big number value from resp3 big number reply * `RedisModule_CallReplyVerbatim` - getting format and value from resp3 verbatim reply * `RedisModule_CallReplySetElement` - getting element from resp3 set reply * `RedisModule_CallReplyMapElement` - getting key and value from resp3 map reply * `RedisModule_CallReplyAttribute` - getting a reply attribute * `RedisModule_CallReplyAttributeElement` - getting key and value from resp3 attribute reply * New context flags: * `REDISMODULE_CTX_FLAGS_RESP3` - indicate that the client is using resp3 Tests were added to check the new RedisModuleAPI ### Modules API Changes * RM_ReplyWithCallReply might return REDISMODULE_ERR if the given CallReply is in resp3 but the client expects resp2. This is not a breaking change because in order to get a resp3 CallReply one needs to specifically specify `3` as a parameter to the fmt argument of `RM_Call` (as mentioned above). Tests were added to check this change ### More small Additions * Added `debug set-disable-deny-scripts` that allows to turn on and off the commands no-script flag protection. This is used by the Lua resp3 tests so it will be possible to run `debug protocol` and check the resp3 parsing code. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2021-08-04 16:28:07 +03:00
Ariel Shtul	bdbf5eedae	Module api support for RESP3 (#8521 ) Add new Module APS for RESP3 responses: - RM_ReplyWithMap - RM_ReplyWithSet - RM_ReplyWithAttribute - RM_ReplySetMapLength - RM_ReplySetSetLength - RM_ReplySetAttributeLength - RM_ReplyWithBool Deprecate REDISMODULE_POSTPONED_ARRAY_LEN in favor of a generic REDISMODULE_POSTPONED_LEN Improve documentation Add tests Co-authored-by: Guy Benoish <guy.benoish@redislabs.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-03 11:37:19 +03:00
Wen Hui	db41536454	Remove duplicate zero-port sentinels (#9240 ) The issue is that when a sentinel with the same address and IP is turned on with a different runid, its port is set to 0 but it is still present in the dictionary master->sentinels which contain all the sentinels for a master. This causes a problem when we do INFO SENTINEL because it takes the size of the dictionary of sentinels. This might also cause a problem for failover if enough sentinels have their port set to 0 since the number of voters in failover is also determined by the size of the dictionary of sentinels. This commits removes the sentinels with the port set to zero from the dictionary of sentinels. Fixes #8786	2021-07-29 12:32:28 +03:00
Oran Agra	6a5bac309e	Test infra, handle RESP3 attributes and big-numbers and bools (#9235 ) - promote the code in DEBUG PROTOCOL to addReplyBigNum - DEBUG PROTOCOL ATTRIB skips the attribute when client is RESP2 - networking.c addReply for push and attributes generate assertion when called on a RESP2 client, anything else would produce a broken protocol that clients can't handle.	2021-07-14 19:14:31 +03:00
Oran Agra	7103367ad4	Tests: add a way to read raw RESP protocol reponses (#9193 ) This makes it possible to distinguish between null response and an empty array (currently the tests infra translates both to an empty string/list)	2021-07-04 19:43:58 +03:00
Yossi Gottlieb	07b0d144ce	Improve bind and protected-mode config handling. (#9034 ) * Specifying an empty `bind ""` configuration prevents Redis from listening on any TCP port. Before this commit, such configuration was not accepted. * Using `CONFIG GET bind` will always return an explicit configuration value. Before this commit, if a bind address was not specified the returned value was empty (which was an anomaly). Another behavior change is that modifying the `bind` configuration to a non-default value will NO LONGER DISABLE protected-mode implicitly.	2021-06-22 12:50:17 +03:00
Binbin	0bfccc55e2	Fixed some typos, add a spell check ci and others minor fix (#8890 ) This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes. This spell checker is based on blacklist of common spelling mistakes, so it will not catch everything, but at least it is also unlikely to cause false positives. Besides that, the PR also fixes many spelling mistakes and types, not all are a result of the spell checker we use. Here's a summary of other changes: 1. Scanned the entire source code and fixes all sorts of typos and spelling mistakes (including missing or extra spaces). 2. Outdated function / variable / argument names in comments 3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString. 4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751 5. Some outdated https link URLs. 6. Fix some outdated comment. Such as: - In README: about the rdb, we used to said create a `thread`, change to `process` - dbRandomKey function coment (about the dictGetRandomKey, change to dictGetFairRandomKey) - notifyKeyspaceEvent fucntion comment (add type arg) - Some others minor fix in comment (Most of them are incorrectly quoted by variable names) 7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`	2021-06-10 15:39:33 +03:00
Yossi Gottlieb	8a86bca5ed	Improve test suite to handle external servers better. (#9033 ) This commit revives the improves the ability to run the test suite against external servers, instead of launching and managing `redis-server` processes as part of the test fixture. This capability existed in the past, using the `--host` and `--port` options. However, it was quite limited and mostly useful when running a specific tests. Attempting to run larger chunks of the test suite experienced many issues: * Many tests depend on being able to start and control `redis-server` themselves, and there's no clear distinction between external server compatible and other tests. * Cluster mode is not supported (resulting with `CROSSSLOT` errors). This PR cleans up many things and makes it possible to run the entire test suite against an external server. It also provides more fine grained controls to handle cases where the external server supports a subset of the Redis commands, limited number of databases, cluster mode, etc. The tests directory now contains a `README.md` file that describes how this works. This commit also includes additional cleanups and fixes: * Tests can now be tagged. * Tag-based selection is now unified across `start_server`, `tags` and `test`. * More information is provided about skipped or ignored tests. * Repeated patterns in tests have been extracted to common procedures, both at a global level and on a per-test file basis. * Cleaned up some cases where test setup was based on a previous test executing (a major anti-pattern that repeats itself in many places). * Cleaned up some cases where test teardown was not part of a test (in the future we should have dedicated teardown code that executes even when tests fail). * Fixed some tests that were flaky running on external servers.	2021-06-09 15:13:24 +03:00
ny0312	53d1acd598	Always replicate time-to-live(TTL) as absolute timestamps in milliseconds (#8474 ) Till now, on replica full-sync we used to transfer absolute time for TTL, however when a command arrived (EXPIRE or EXPIREAT), we used to propagate it as is to replicas (possibly with relative time), but always translate it to EXPIREAT (absolute time) to AOF. This commit changes that and will always use absolute time for propagation. see discussion in #8433 Furthermore, we Introduce new commands: `EXPIRETIME/PEXPIRETIME` that allow extracting the absolute TTL time from a key.	2021-05-30 09:20:32 +03:00
YaacovHazan	32a2584e07	stabilize tests that involved with load handlers (#8967 ) When test stop 'load handler' by killing the process that generating the load, some commands that already in the input buffer, still might be processed by the server. This may cause some instability in tests, that count on that no more commands processed after we stop the `load handler' In this commit, new proc 'wait_load_handlers_disconnected' added, to verify that no more cammands from any 'load handler' prossesed, by checking that the clients who genreate the load is disconnceted. Also, replacing check of dbsize with wait_for_ofs_sync before comparing debug digest, as it would fail in case the last key the workload wrote was an overridden key (not a new one). Affected tests Race fix: - failover command to specific replica works - Connect multiple replicas at the same time (issue #141), master diskless=$mdl, replica diskless=$sdl - AOF rewrite during write load: RDB preamble=$rdbpre Cleanup and speedup: - Test replication with blocking lists and sorted sets operations - Test replication with parallel clients writing in different DBs - Test replication partial resync: $descr (diskless: $mdl, $sdl, reconnect: $reconnect	2021-05-20 15:29:43 +03:00
Oran Agra	611959eee5	fuzz tester, try to print hung command (#8837 )	2021-04-25 13:08:46 +03:00
Hanna Fadida	53a4d6c3b1	Modules: adding a module type for key space notification (#8759 ) Adding a new type mask for key space notification, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which is currently unsupported). Modules can subscribe to a module key keyspace notification by RM_SubscribeToKeyspaceEvents, and clients by notify-keyspace-events of redis.conf or via the CONFIG SET, with the characters 'd' or 'A' (REDISMODULE_NOTIFY_MODULE type mask is part of the 'All' notation for key space notifications). Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.	2021-04-19 21:33:26 +03:00
Oran Agra	f4b5a4d869	Improve testsuite print of log file (#8805 ) 1. the `dump_logs` option would have printed only logs of servers that were spawn before the test proc started, and not ones that the test proc started inside it. 2. when a server proc catches an exception it should normally forward the exception upwards, specifically when it's an assertion that should be caught by a test proc above. however, in `durable` mode, we caught all exceptions printed them to stdout and let the code continue, this was wrong to do for assertions, which should have still been propagated to the test function. 3. don't bother to search for crash log to print if we printed the the entire log anyway 4. if no crash log was found, no need to print anything (i.e. the fact it wasn't found) 5. rename warnings_from_file to crashlog_from_file	2021-04-18 11:55:54 +03:00
sundb	569a3f4548	Use chi-square for random distributivity verification in test (#8709 ) Problem: Currently, when performing random distribution verification, we determine the probability of each element occurring in the sum, but the probability is only an estimate, these tests had rare sporadic failures, and we cannot verify what the probability of failure will be. Solution: Using the chi-square distribution instead of the original random distribution validation makes the test more reasonable and easier to find problems.	2021-04-01 08:20:15 +03:00
Viktor Söderqvist	5629dbe715	Add support for plaintext clients in TLS cluster (#8587 ) The cluster bus is established over TLS or non-TLS depending on the configuration tls-cluster. The client ports distributed in the cluster and sent to clients are assumed to be TLS or non-TLS also depending on tls-cluster. The cluster bus is now extended to also contain the non-TLS port of clients in a TLS cluster, when available. The non-TLS port of a cluster node, when available, is sent to clients connected without TLS in responses to CLUSTER SLOTS, CLUSTER NODES, CLUSTER SLAVES and MOVED and ASK redirects, instead of the TLS port. The user was able to override the client port by defining cluster-announce-port. Now cluster-announce-tls-port is added, so the user can define an alternative announce port for both TLS and non-TLS clients. Fixes #8134	2021-03-30 23:11:32 +03:00
Sokolov Yura	315df9ada0	Add cluster slot migration tests (#8649 ) Add tests for fixing migrating slot at all stages: 1. when migration is half inited on "migrating" node 2. when migration is half inited on "importing" node 3. migration inited, but not finished 4. migration is half finished on "migrating" node 5. migration is half finished on "importing" node Also add tests for many simultaneous slot migrations. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2021-03-29 13:52:02 +03:00
Huang Zhw	a19c4058be	When tests exit normally, some processes may still be alive (#8647 ) In certain scenario start_server may think it failed to start a redis server although it started successfully. in these cases, it'll not terminate it, and it'll remain running when the test is over. In start_server if config doesn't have bind (the minimal.conf in introspection.tcl), it will try to bind ipv4 and ipv6. One may success while other fails. It will output "Could not create server TCP listening socket". wait_server_started uses this message to check whether instance started successfully. So it will consider that it failed even though redis started successfully. Additionally, in some cases it wasn't clear to users why the server exited, since the warning message printed to the log, could in some cases be harmless, and in some cases fatal. This PR adds makes a clear distinction between a warning log message and a fatal one, and changes the test suite to look for the fatal message.	2021-03-16 17:25:30 +02:00
Madelyn Olson	c6f0ea2c81	Allow stopped redis processes to be killed in tests (#8552 )	2021-02-24 14:26:16 -08:00
Huang Zw	f687ac0c32	Client tracking tracking-redir-broken push len is 2 not 3 (#8456 ) When redis responds with tracking-redir-broken push message (RESP3), it was responding with a broken protocol: an array of 3 elements, but only pushes 2 elements. Some bugs in the test make this pass. Read the push reply will consume an extra reply, because the reply length is 3, but there are only two elements, so the next reply will be treated as third element. So the test is corrected too. Other changes: * checkPrefixCollisionsOrReply success should return 1 instead of -1, this bug didn't have any implications. * improve client tracking tests to validate more of the response it reads.	2021-02-21 09:34:46 +02:00
Yossi Gottlieb	5b8350aaaa	Add --dump-logs tests option. (#8459 ) Dump the entire server log if a test failed, to easy troubleshooting with no access to log files.	2021-02-07 12:37:24 +02:00
sundb	18ac41973b	RAND* commands: fix risk of OOM panic in hash and zset, use fair random in hash, and add tests for even distribution to all (#8429 ) Changes to HRANDFIELD and ZRANDMEMBER: * Fix risk of OOM panic when client query a very big negative count (avoid allocating huge temporary buffer). * Fix uneven random distribution in HRANDFIELD with negative count (wasn't using dictGetFairRandomKey). * Add tests to check an even random distribution (HRANDFIELD, SRANDMEMBER, ZRANDMEMBER). Co-authored-by: Oran Agra <oran@redislabs.com>	2021-02-05 15:56:20 +02:00
Oran Agra	5a7eb9c881	Fix test issues from introduction of HRANDFIELD (#8424 ) * The corrupt dump fuzzer found a division by zero. * in some cases the random fields from the HRANDFIELD tests produced fields with newlines and other special chars (due to \ char), this caused the TCL tests to see a bulk response that has a newline in it and add {} around it, later it can think this is a nested list. in fact the `alpha` random string generator isn't using spaces and newlines, so it should not use `\` either.	2021-01-31 12:13:45 +02:00
Yang Bodong	b9a0500f16	Add HRANDFIELD and ZRANDMEMBER. improvements to SRANDMEMBER (#8297 ) New commands: `HRANDFIELD [<count> [WITHVALUES]]` `ZRANDMEMBER [<count> [WITHSCORES]]` Algorithms are similar to the one in SRANDMEMBER. Both return a simple bulk response when no arguments are given, and an array otherwise. In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array. note: in all 3 commands, the only option that also provides random order is the one with negative count. Changes to SRANDMEMBER * Optimization when count is 1, we can use the more efficient algorithm of non-unique random * optimization: work with sds strings rather than robj Other changes: * zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small) * Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or case 4) and by accident used a negative count Co-authored-by: xinluton <xinluton@qq.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-29 10:47:28 +02:00
Allen Farris	0d18a1e85f	implement FAILOVER command (#8315 ) Implement FAILOVER command, which coordinates failover between the server and one of its replicas.	2021-01-28 13:18:05 -08:00

1 2 3 4 5

227 Commits