redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 00:28:26 -05:00

Author	SHA1	Message	Date
Binbin	5dddf496ce	Add missing pause tcl test to test_helper.tcl (#9158 ) * Add keyname tags to avoid CROSSSLOT errors in external server CI * Use new wait_for_blocked_clients_count in pause.tcl	2021-06-30 13:32:51 +03:00
Binbin	1d5aa37d68	Fix timing issue in psync2 test. (#9159 ) *** [err]: PSYNC2: total sum of full synchronizations is exactly 4 in tests/integration/psync2.tcl Expected 5 == 4 (context: type eval line 8 cmd {assert {$sum == 4}} proc ::test) Sometimes the test got an unexpected full sync: after a replica switched to master, but before the new master's replid change had propagated to all replicas, a replica attempted to sync with it using a wrong replid and triggered a full resync. Consider this scenario: 1 slaveof 4 (full resync); 0 slaveof 4 (full resync); 2 slaveof 0 (full resync); 3 slaveof 1 (full resync); 1 slaveof no one, replid changed; 3 reconnects to 1, does a partial resync and gets the new replid. Before 2 inherits the new replid, 3 slaveof 2: 3 tries to do a partial resync with 2, but their replication ids are inconsistent, so a full resync happens. :) A special thank you to Oran for helping me with this test case. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-06-30 09:18:10 +03:00
Yossi Gottlieb	5d8ea4b326	Add missing needs:repl tag. (#9169 )	2021-06-29 16:48:52 +03:00
Leibale Eidelman	95274f1f8a	fix ZRANGESTORE - should return 0 when src points to an empty key (#9089 ) mistakenly it used to return an empty array rather than 0. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-06-29 16:38:10 +03:00
Binbin	4bc5a8324d	ZRANDMEMBER WITHSCORES with negative COUNT may return bad score (#9162 ) Return a bad score when used with negative count (or count of 1), and non-ziplist encoded zset. Also add test to validate the return value and cover the issue.	2021-06-29 10:14:28 +03:00
Yossi Gottlieb	f233c4c59d	Add bind-source-addr configuration argument. (#9142 ) In the past, the first bind address that was explicitly specified was also used to bind outgoing connections. This could result with some problems. For example: on some systems using `bind 127.0.0.1` would result with outgoing connections also binding to `127.0.0.1` and failing to connect to remote addresses. With the recent change to the way `bind` is handled, this presented other issues: * The default first bind address is '' which is not a valid address. We make no distinction between user-supplied config that is identical to the default, and the default config. This commit addresses both these issues by introducing an explicit configuration parameter to control the bind address on outgoing connections.	2021-06-24 19:48:18 +03:00
Oran Agra	5ffdbae1f6	Fix failing basics moduleapi test on 32bit CI (#9140 )	2021-06-24 12:44:13 +03:00
Oran Agra	ae418eca24	Adjustments to recent RM_StringTruncate fix (#3718 ) (#9125 ) - Introduce a new sdssubstr api as a building block for sdsrange. The API of sdsrange is often hard to work with and also has corner cases that cause bugs. sdssubstr is easy to work with and also simplifies the implementation of sdsrange. - Revert the fix to RM_StringTruncate and just use sdssubstr instead of sdsrange. - Solve valgrind warnings from the new tests introduced by the previous PR.	2021-06-22 17:22:40 +03:00
Yossi Gottlieb	a49b766860	Remove leftover after CONFIG SET bind change. (#9129 )	2021-06-22 14:03:00 +03:00
Yossi Gottlieb	8284544adb	Fix typo in test. (#9128 )	2021-06-22 13:30:20 +03:00
Yossi Gottlieb	07b0d144ce	Improve bind and protected-mode config handling. (#9034 ) * Specifying an empty `bind ""` configuration prevents Redis from listening on any TCP port. Before this commit, such configuration was not accepted. * Using `CONFIG GET bind` will always return an explicit configuration value. Before this commit, if a bind address was not specified the returned value was empty (which was an anomaly). Another behavior change is that modifying the `bind` configuration to a non-default value will NO LONGER DISABLE protected-mode implicitly.	2021-06-22 12:50:17 +03:00
Evan	1ccf2ca2f4	modules: Add newlen == 0 handling to RM_StringTruncate (#3717 ) (#3718 ) Previously, passing 0 for newlen would not truncate the string at all. This adds handling of this case, freeing the old string and creating a new empty string. Other changes: - Move `src/modules/testmodule.c` to `tests/modules/basics.c` - Introduce that basic test into the test suite - Add tests to cover StringTruncate - Add `test-modules` build target for the main makefile - Extend `distclean` build target to clean modules too	2021-06-22 12:26:48 +03:00
Oran Agra	d0819d618e	solve test timing issues in replication tests (#9121 ) # replication-3.tcl had a test timeout failure with valgrind on daily CI: ``` * [err]: SLAVE can reload "lua" AUX RDB fields of duplicated scripts in tests/integration/replication-3.tcl Replication not started. ``` replication took more than 70 seconds. https://github.com/redis/redis/runs/2854037905?check_suite_focus=true on my machine it takes only about 30, but i can see how 50 seconds isn't enough. # replication.tcl loading was over too quickly in freebsd daily CI: ``` * [err]: slave fails full sync and diskless load swapdb recovers it in tests/integration/replication.tcl Expected '0' to be equal to '1' (context: type eval line 44 cmd {assert_equal [s -1 loading] 1} proc ::start_server) ``` # rdb.tcl loading was over too quickly. increase the time loading takes, and decrease the amount of work we try to achieve in that time.	2021-06-22 11:10:11 +03:00
Oran Agra	9b564b525d	Fix race in client side tracking (#9116 ) The `Tracking gets notification of expired keys` test in tracking.tcl used to hang in valgrind CI quite a lot. It turns out the reason is that with valgrind and a busy machine, the server cron active expire cycle could easily run in the same event loop as the command that created `mykey`, so that when the key expired, there were two change events to broadcast, one that set the key and one that expired it, but since we used raxTryInsert, the client that was associated with the "last" change was the one that created the key, so the NOLOOP filtered that event. This commit adds a test that reproduces the problem by using lazy expire in a multi-exec which makes sure the key expires in the same event loop as the one that added it.	2021-06-22 07:35:59 +03:00
sundb	eae0983d2d	Fix leak and double free issues in datatype2 module test (#9102 ) * Add missing call for RedisModule_DictDel in datatype2 test * Fix memory leak in datatype2 test	2021-06-17 21:45:21 +03:00
sundb	b586d5b567	Fix querybuf test failure (#9091 ) Fix test failure which was introduced by #9003. The following cases will occur when querybuf expansion allocates memory equal to (16*1024)k. 1) make using ```CFLAGS=-DNO_MALLOC_USABLE_SIZE```. 2) ```malloc``` will not allocate more under ```alpine```.	2021-06-16 22:01:37 +03:00
chenyang8094	e0cd3ad0de	Enhance mem_usage/free_effort/unlink/copy callbacks and add GetDbFromIO api. (#8999 ) Create new module type enhanced callbacks: mem_usage2, free_effort2, unlink2, copy2. These will be given a context point from which the module can obtain the key name and database id. In addition the digest and defrag context can now be used to obtain the key name and database id.	2021-06-16 09:45:49 +03:00
Jason Elbaum	7f342020dc	Change return value type for ZPOPMAX/MIN in RESP3 (#8981 ) When using RESP3, ZPOPMAX/ZPOPMIN should return nested arrays for consistency with other commands (e.g. ZRANGE). We do that only when COUNT argument is present (similarly to how LPOP behaves). for reasoning see https://github.com/redis/redis/issues/8824#issuecomment-855427955 This is a breaking change only when RESP3 is used, and COUNT argument is present!	2021-06-16 09:29:57 +03:00
sundb	e5d8a5eb85	Fix the wrong resize of querybuf (#9003 ) The initial memory of `querybuf` is `PROTO_IOBUF_LEN(1024*16) * 2` (due to sdsMakeRoomFor being greedy), under `jemalloc`, the allocated memory will be 40k. This will most likely result in the `querybuf` being resized when `clientsCronResizeQueryBuffer` is called unless the client requests it fast enough. Note that this bug existed even before #7875, since the condition for resizing includes the sds headers (32k+6). ## Changes 1. Use non-greedy sdsMakeRoomFor when allocating the initial query buffer (of 16k). 2. Also use non-greedy allocation when working with BIG_ARG (we won't use that extra space anyway) 3. in case we did use a greedy allocation, read as much as we can into the buffer we got (including internal frag), to reduce system calls. 4. introduce a dedicated constant for the shrinking (same value as before) 5. Add test for querybuf. 6. improve a maxmemory test by ignoring the effect of replica query buffers (can accumulate many ACKs on slow env) 7. improve a maxmemory test by disabling slowlog (it will cause slight memory growth on slow env).	2021-06-15 14:46:19 +03:00
Binbin	b109977301	Fix XINFO help for unexpected options. (#9075 ) Small cleanup and consistency.	2021-06-15 10:01:11 +03:00
Binbin	7900b48bc7	slowlog get command supports passing in -1 to get all logs. (#9018 ) This was already the case before this commit, but it wasn't clear / intended in the code, now it does.	2021-06-14 16:46:45 +03:00
YaacovHazan	1677efb9da	cleanup around loadAppendOnlyFile (#9012 ) Today when we load the AOF on startup, loadAppendOnlyFile checks whether the file opens for reading. This check is redundant (dead code) as we open the AOF file for writing at initServer, so the file will always exist by the time loadAppendOnlyFile runs. In this commit: - remove all the exit(1) from loadAppendOnlyFile, as it is the caller's responsibility to decide what to do in case of failure. - move the opening of the AOF file for writing, to be after we load it. - avoid returning -ERR in DEBUG LOADAOF when the AOF exists but is empty	2021-06-14 10:38:08 +03:00
Binbin	b8a5da80c4	Fix accidental deletion of sinterstore command when we meet wrong type error. (#9032 ) SINTERSTORE would have deleted the dest key right away, even when later on it is bound to fail on an (WRONGTYPE) error. With this change it first picks up all the input keys, and only later delete the dest key if one is empty. Also add more tests for some commands. Mainly focus on - `wrong type error`: expand test case (base on sinter bug) in non-store variant add tests for store variant (although it exists in non-store variant, i think it would be better to have same tests) - the dstkey result when we meet `non-exist key (empty set)` in *store sdiff: - improve test case about wrong type error (the one we found in sinter, although it is safe in sdiff) - add test about using non-exist key (treat it like an empty set) sdiffstore: - according to sdiff test case, also add some tests about `wrong type error` and `non-exist key` - the different is that in sdiffstore, we will consider the `dstkey` result sunion/sunionstore add more tests (same as above) sinter/sinterstore also same as above ...	2021-06-13 10:53:46 +03:00
ny0312	fb140a1bff	Fix flaky test case for absolute TTL replication (#9069 ) The root cause is that one test (`5 keys in, 5 keys out`) is leaking a volatile key that can expire while another later test(`All TTL in commands are propagated as absolute timestamp in replication stream`) is running. Such leaked expiration injects an unexpected `DEL` command into the replication command during the later test, causing it to fail. The fixes are two fold: 1. Plug the leak in the first test. 2. Add FLUSHALL to the later test, to avoid future interference from other tests.	2021-06-13 08:42:20 +03:00
Binbin	0bfccc55e2	Fixed some typos, add a spell check CI and other minor fixes (#8890 ) This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes. This spell checker is based on a blacklist of common spelling mistakes, so it will not catch everything, but at least it is also unlikely to cause false positives. Besides that, the PR also fixes many spelling mistakes and typos, not all are a result of the spell checker we use. Here's a summary of other changes: 1. Scanned the entire source code and fixed all sorts of typos and spelling mistakes (including missing or extra spaces). 2. Outdated function / variable / argument names in comments 3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString. 4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751 5. Some outdated https link URLs. 6. Fix some outdated comments. Such as: - In README: about the rdb, we used to say create a `thread`, change to `process` - dbRandomKey function comment (about the dictGetRandomKey, change to dictGetFairRandomKey) - notifyKeyspaceEvent function comment (add type arg) - Some other minor fixes in comments (most of them are incorrectly quoted variable names) 7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`	2021-06-10 15:39:33 +03:00
Yossi Gottlieb	8a86bca5ed	Improve test suite to handle external servers better. (#9033 ) This commit revives and improves the ability to run the test suite against external servers, instead of launching and managing `redis-server` processes as part of the test fixture. This capability existed in the past, using the `--host` and `--port` options. However, it was quite limited and mostly useful when running specific tests. Attempting to run larger chunks of the test suite experienced many issues: * Many tests depend on being able to start and control `redis-server` themselves, and there's no clear distinction between external server compatible and other tests. * Cluster mode is not supported (resulting with `CROSSSLOT` errors). This PR cleans up many things and makes it possible to run the entire test suite against an external server. It also provides more fine grained controls to handle cases where the external server supports a subset of the Redis commands, limited number of databases, cluster mode, etc. The tests directory now contains a `README.md` file that describes how this works. This commit also includes additional cleanups and fixes: * Tests can now be tagged. * Tag-based selection is now unified across `start_server`, `tags` and `test`. * More information is provided about skipped or ignored tests. * Repeated patterns in tests have been extracted to common procedures, both at a global level and on a per-test file basis. * Cleaned up some cases where test setup was based on a previous test executing (a major anti-pattern that repeats itself in many places). * Cleaned up some cases where test teardown was not part of a test (in the future we should have dedicated teardown code that executes even when tests fail). * Fixed some tests that were flaky running on external servers.	2021-06-09 15:13:24 +03:00
Fabian Eichinger	39b0f0dd73	Add support for combining NX and GET flags on SET command (#8906 ) Till now GET and NX were mutually exclusive. This change makes their combination mean a "Get or Set" command. If the key exists it returns the old value and avoids setting, and if it doesn't exist it returns nil and sets it to the new value (possibly with expiry time)	2021-06-07 16:47:58 +03:00
Huang Zhw	eaa7a7bb93	Fix XTRIM or XADD with LIMIT may delete more entries than Count. (#9048 ) The decision to stop trimming due to LIMIT in XADD and XTRIM was after the limit was reached. i.e. the code was deleting at least that count of records (from the LIMIT argument's perspective, not the MAXLEN), instead of up to that count of records. see #9046	2021-06-07 14:43:36 +03:00
Oran Agra	3e39ea0b83	tests: prevent name clash in variables leading to wrong test name (#8995 ) running the "geo" unit would have shown that it completed a unit named "north". this was because the variable `$name` was overwritten. This commit isn't perfect, but it slightly reduces the chance for variable name clash. ``` $ ./runtest --single unit/geo ....... Testing unit/geo ....... [1/1 done]: north (15 seconds) ```	2021-06-06 17:35:30 +03:00
Oran Agra	b512dfe794	tests: add details when test fails on malformed info (#9042 )	2021-06-03 20:34:54 +03:00
pgxiaolianzi	f63bb9583d	Fix typo on buckup to backup (#8919 )	2021-06-01 22:54:30 -07:00
Oran Agra	7cb42c9c36	add test for modules load/unload and config rewrite	2021-06-01 13:43:48 +03:00
Oran Agra	ae67539c8b	Improve new time sensitive pexpireat propagation test (#9010 ) The test that was merged yesterday fails with valgrind and freebsd CI that are too slow, and 10 seconds apparently passed between the time the command was sent to redis and the time it was actually executed. ``` *** [err]: All TTL in commands are propagated as absolute timestamp in replication stream in tests/unit/expire.tcl Expected 'del a' to match 'set foo1 bar PXAT *' (context: type source line 778 file /home/runner/work/redis/redis/tests/test_helper.tcl cmd {assert_match [lindex $patterns $j] [read_from_replication_stream $s]} proc ::assert_replication_stream level 1) ```	2021-06-01 08:01:10 +03:00
ny0312	53d1acd598	Always replicate time-to-live(TTL) as absolute timestamps in milliseconds (#8474 ) Till now, on replica full-sync we used to transfer absolute time for TTL, however when a command arrived (EXPIRE or EXPIREAT), we used to propagate it as is to replicas (possibly with relative time), but always translate it to EXPIREAT (absolute time) to AOF. This commit changes that and will always use absolute time for propagation. see discussion in #8433 Furthermore, we Introduce new commands: `EXPIRETIME/PEXPIRETIME` that allow extracting the absolute TTL time from a key.	2021-05-30 09:20:32 +03:00
YaacovHazan	501d775583	unregister AE_READABLE from the read pipe in backgroundSaveDoneHandlerSocket (#8991 ) In diskless replication, we create a read pipe for the RDB, between the child and the parent. When we close this pipe (fd), the read handler also needs to be removed from the event loop (if it is still registered). Otherwise, the next time we use the same fd, the registration will fail (panic), because we will use EPOLL_CTL_MOD (the fd is still registered in the event loop) on an fd that was already removed from epoll_ctl	2021-05-26 14:51:53 +03:00
YaacovHazan	32a2584e07	stabilize tests that involve load handlers (#8967 ) When a test stops a 'load handler' by killing the process that generates the load, some commands that are already in the input buffer might still be processed by the server. This may cause some instability in tests that count on no more commands being processed after we stop the 'load handler'. In this commit, a new proc 'wait_load_handlers_disconnected' is added, to verify that no more commands from any 'load handler' are processed, by checking that the clients who generate the load are disconnected. Also, replacing check of dbsize with wait_for_ofs_sync before comparing debug digest, as it would fail in case the last key the workload wrote was an overridden key (not a new one). Affected tests Race fix: - failover command to specific replica works - Connect multiple replicas at the same time (issue #141), master diskless=$mdl, replica diskless=$sdl - AOF rewrite during write load: RDB preamble=$rdbpre Cleanup and speedup: - Test replication with blocking lists and sorted sets operations - Test replication with parallel clients writing in different DBs - Test replication partial resync: $descr (diskless: $mdl, $sdl, reconnect: $reconnect	2021-05-20 15:29:43 +03:00
Madelyn Olson	a59e75a475	Hide migrate command from slowlog if they include auth (#8859 ) Redact commands that include sensitive data from slowlog and monitor	2021-05-19 08:23:54 -07:00
Oran Agra	d67e66de72	Fix race in new lazyfree test (#8965 ) I recently saw this failure: [err]: lazy free a stream with all types of metadata in tests/unit/lazyfree.tcl Expected '2' to be equal to '1' (context: type eval line 23 cmd {assert_equal [s lazyfreed_objects] 1} proc ::test) The only explanation for such a thing is that the async flushdb wasn't done before we did the resetstat	2021-05-19 16:06:43 +03:00
Oran Agra	8458baf6a9	longer timeout in replication test (#8963 ) the test normally passes. but we saw one failure in a valgrind run in github actions	2021-05-18 17:13:59 +03:00
Oran Agra	cf41c0b5ff	fix race in config rewrite test (#8960 )	2021-05-18 17:10:06 +03:00
Oran Agra	fbc0e2b834	Reset lazyfreed_objects info field with RESETSTAT, test for stream lazyfree (#8934 ) And also add tests to cover lazy free of streams with various types of metadata (see #8932)	2021-05-17 16:54:37 +03:00
Raghav Muddur	31edc22ecc	EVALSHA_RO and EVAL_RO Commands (#8820 ) * EVALSHA_RO and EVAL_RO Commands Added new readonly versions of EVAL and EVALSHA.	2021-05-12 21:07:34 -07:00
yoav-steinberg	152fce5e2c	Enforce client output buffer soft limit when no traffic. (#8833 ) When client breached the output buffer soft limit but then went idle, we didn't disconnect on soft limit timeout, now we do. Note this also resolves some sporadic test failures in due to Linux buffering data which caused tests to fail if during the test we went back under the soft COB limit. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: sundb <sundbcn@gmail.com>	2021-05-04 13:45:08 +03:00
Huang Zhw	2b22fffc78	Fix potential CONFIG SET bind test failure. (#8875 ) Use an invalid IP address to trigger CONFIG SET bind failure, instead of DNS which is not guaranteed to always fail.	2021-04-27 18:02:23 +03:00
Oran Agra	611959eee5	fuzz tester, try to print hung command (#8837 )	2021-04-25 13:08:46 +03:00
bugwz	761d7d2771	Print the number of abnormal line in AOF (#8823 ) When redis-check-aof finds an error, it prints the line number for faster troubleshooting.	2021-04-20 21:51:24 +03:00
Madelyn Olson	c73b4ddfd9	Fix memory leak when doing lazyfreeing client tracking table (#8822 ) Interior rax pointers were not being freed	2021-04-19 22:16:27 -07:00
Hanna Fadida	53a4d6c3b1	Modules: adding a module type for key space notification (#8759 ) Adding a new type mask for key space notification, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which is currently unsupported). Modules can subscribe to a module key keyspace notification by RM_SubscribeToKeyspaceEvents, and clients by notify-keyspace-events of redis.conf or via the CONFIG SET, with the characters 'd' or 'A' (REDISMODULE_NOTIFY_MODULE type mask is part of the 'All' notation for key space notifications). Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.	2021-04-19 21:33:26 +03:00
guybe7	f40ca9cb58	Modules: Replicate lazy-expire even if replication is not allowed (#8816 ) Before this commit using RM_Call without "!" could cause the master to lazy-expire a key (delete it) but without replicating to replicas. This could cause the replica's memory usage to gradually grow and could also cause consistency issues if the master and replica have a clock diff. This bug was introduced in #8617 Added a test which demonstrates that scenario.	2021-04-19 17:16:02 +03:00
Harkrishn Patro	7a3d1487e4	ACL channels permission handling for save/load scenario. (#8794 ) In the initial release of Redis 6.2 setting a user to only allow pubsub access to a specific channel, and doing ACL SAVE, resulted in an assertion when ACL LOAD was used. This was later changed by #8723 (not yet released), but still not properly resolved (now it errors instead of crash). The problem is that the server that generates an ACL file, doesn't know what would be the setting of the acl-pubsub-default config in the server that will load it. so ACL SAVE needs to always start with resetchannels directive. This should still be compatible with old acl files (from redis 6.0), and ones from earlier versions of 6.2 that didn't mess with channels. Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-04-19 13:27:44 +03:00
sundb	3a955d9ad4	Fix output buffer limit test (#8803 ) The tail size of c->reply is 16kb, but the test only publishes a few chars each time. Due to a change in #8699, the obuf limit is now checked when a new memory allocation is made, so this test would have sometimes failed to trigger a soft limit disconnection in time. The solution is to write bigger payloads to the output buffer, but still limit their rate (not more than 100k/s).	2021-04-19 10:08:07 +03:00
Yossi Gottlieb	c0f5c678c2	Revert cluster slot migration tests. (#8806 ) Disables #8649 and subsequent attempts to stabilize the test.	2021-04-18 20:51:08 +03:00
Oran Agra	a9897b0084	Fix timing of new replication test (#8807 ) In github actions CI with valgrind, i saw that even the fast replica (one that wasn't paused), didn't get to complete the replication fast enough, and ended up getting disconnected by timeout. Additionally, due to a typo in uname, we didn't get to actually run the CPU efficiency part of the test.	2021-04-18 15:12:34 +03:00
Oran Agra	f4b5a4d869	Improve testsuite print of log file (#8805 ) 1. the `dump_logs` option would have printed only logs of servers that were spawned before the test proc started, and not ones that the test proc started inside it. 2. when a server proc catches an exception it should normally forward the exception upwards, specifically when it's an assertion that should be caught by a test proc above. however, in `durable` mode, we caught all exceptions, printed them to stdout and let the code continue, this was wrong to do for assertions, which should have still been propagated to the test function. 3. don't bother to search for crash log to print if we printed the entire log anyway 4. if no crash log was found, no need to print anything (i.e. the fact it wasn't found) 5. rename warnings_from_file to crashlog_from_file	2021-04-18 11:55:54 +03:00
guybe7	d63d02601f	Add a timeout mechanism for replicas stuck in fullsync (#8762 ) Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork child so that the parent is the one sending data to the replicas. This mechanism has an issue in which a hung replica will cause the master to wait for it to read the data sent to it forever, thus preventing the fork child from terminating and preventing the creations of any other forks. This PR adds a timeout mechanism, much like the ACK-based timeout, we disconnect replicas that aren't reading the RDB file fast enough.	2021-04-15 17:18:51 +03:00
YaacovHazan	645c664fbb	stabilize and improve pendingquerybuf test suite (#8780 ) replace the hardcoded after 2000 with waiting for the sync and wait for condition	2021-04-14 11:49:00 +03:00
Oran Agra	b278e44376	Revert "Fix: server will crash if rdbload or rdbsave method is not provided in module (#8670 )" (#8771 ) This reverts commit `808f3004f0`.	2021-04-13 17:41:46 +03:00
Oran Agra	c07e16fadd	Add more attempts to a timing sensitive test (#8770 )	2021-04-13 17:35:10 +03:00
Yossi Gottlieb	5e3a15ae1b	Fix failing cluster tests. (#8763 ) Disable replica migration to avoid a race condition where the migrated-from node turns into a replica. Long term, this test should probably be improved to handle multiple slots and accept such auto migrations but this is a quick fix to stabilize the CI without completely dropping this test.	2021-04-13 00:00:57 +03:00
Yang Bodong	4c14e8668c	Fix out of range confusing error messages (XAUTOCLAIM, RPOP count) (#8746 ) Fix out of range error messages to be clearer (avoid mentioning 9223372036854775807) * Fix XAUTOCLAIM COUNT option confusing error msg * Fix other RPOP and alike error message to mention positive	2021-04-07 10:01:28 +03:00
Bonsai	808f3004f0	Fix: server will crash if rdbload or rdbsave method is not provided in module (#8670 ) With this fix, module data type registration will fail if the load or save callbacks are not defined, or the optional aux load and save callbacks are not either both defined or both missing.	2021-04-06 12:09:36 +03:00
Yossi Gottlieb	4724dd439e	Clean up and stabilize cluster migration tests. (#8745 ) This is work in progress, focusing on two main areas: * Avoiding race conditions with cluster configuration propagation. * Ignoring limitations with redis-cli --cluster fix which makes it hard to distinguish real errors (e.g. failure to fix) from expected conditions in this test (e.g. nodes not agreeing on configuration).	2021-04-06 11:57:57 +03:00
Huang Zhw	3b74b55084	Fix "default" and overwritten / reset users will not have pubsub channels permissions by default. (#8723 ) Background: Redis 6.2 added ACL control for pubsub channels (#7993), which were supposed to be permissive by default to retain compatibility with redis 6.0 ACL. But due to a bug, only newly created users got this `acl-pubsub-default` applied, while overwritten (updated) users got reset to `resetchannels` (denied). Since the "default" user exists before loading the config file, any ACL change to it results in an update / overwrite. So when a "default" user is loaded from config file or include ACL file with no channels related rules, the user will not have any permissions to any channels. But other users will have default permissions to any channels. When upgraded from 6.0 with config rewrite, this will lead to the "default" user's channels permissions being lost. When users are loaded from an include file and "acl load" is then called, users will also lose channels permissions. Similarly, the `reset` ACL rule would have reset the user to be denied access to any channels, ignoring `acl-pubsub-default` and breaking compatibility with redis 6.0. The implication of this fix is that it regains compatibility with redis 6.0, but breaks compatibility with redis 6.2.0 and 6.2.1. e.g. after the upgrade, the default user will regain access to pubsub channels. Other changes: Additionally this commit renames server.acl_pubusub_default to server.acl_pubsub_default and fixes a typo in acl tests.	2021-04-05 23:13:20 +03:00
Sokolov Yura	1cab962098	Add cluster-allow-replica-migration option. (#5285 ) Previously (and still by default after this commit), when a master loses its last slot (due to migration, for example), its replicas migrate to the new holder of that slot. There are cases where this is not desired: * Consolidation that results with removed nodes (including the replica, eventually). * Manually configured cluster topologies, which the admin wishes to preserve. Needlessly migrating a replica triggers a full synchronization and can have a negative impact, so we prefer to be able to avoid it where possible. This commit adds a 'cluster-allow-replica-migration' configuration option that is enabled by default to preserve existing behavior. When disabled, replicas will not be auto-migrated. Fixes #4896 Co-authored-by: Oran Agra <oran@redislabs.com>	2021-04-04 09:43:24 +03:00
Valentino Geron	44d8b039e8	Fix XAUTOCLAIM response to return the next available id as the cursor (#8725 ) This command used to return the last scanned entry id as the cursor, instead of the next one to be scanned. so in the next call, the user could / should have sent `(cursor` and not just `cursor` if he wanted to avoid scanning the same record twice. Scanning the record twice would look odd if someone is checking what exactly was scanned, but it also has a side effect of incrementing the delivery count twice.	2021-04-01 12:13:55 +03:00
Oran Agra	370ab4c4db	Solve sentinel test issue in TLS due to recent tests change. (#8728 ) `5629dbe71` added a change that configures the tcp (plaintext) port alongside the tls port, this causes the INFO command for tcp_port to return that instead of the tls port when running in tls, and that broke the sentinel tests that query it. the fix is to add a method that gets the right port from CONFIG instead of relying on the tcp_port info field.	2021-04-01 09:44:44 +03:00
guybe7	843f769b96	zsetAdd: Fix wrong reply in case of INCR and GT/LT (#8717 ) If GT/LT fails the operation we need to reply with nil (like failure due to NX). Other changes: Add the missing $encoding suffix to many zset tests Note: there's a behavior change just in case of INCR + GT/LT that fails. The old code was replying with the wrong (rejected) score, and now it'll reply with nil. Note that that's anyway a corner case so this "behavior change" shouldn't have too much effect. Using GT/LT with INCR has a predictable result even before we run the command (INCR GT will only / always fail if the increment is negative).	2021-04-01 09:33:53 +03:00
sundb	569a3f4548	Use chi-square for random distributivity verification in test (#8709 ) Problem: Currently, when performing random distribution verification, we determine the probability of each element occurring in the sum, but the probability is only an estimate, these tests had rare sporadic failures, and we cannot verify what the probability of failure will be. Solution: Using the chi-square distribution instead of the original random distribution validation makes the test more reasonable and easier to find problems.	2021-04-01 08:20:15 +03:00
Jérôme Loyet	91f4f41665	Add replica-announced config option (#8653 ) The 'sentinel replicas <master>' command will ignore replicas with `replica-announced` set to no. The goal of disabling the config setting replica-announced is to allow ghost replicas. The replica is in the cluster, synchronizes with its master, can be promoted to master and is not exposed to sentinel clients. This way, it is acting as a live backup or living ghost. In addition, to prevent the replica from being promoted to master, set replica-priority to 0.	2021-03-30 23:40:22 +03:00
Yossi Gottlieb	6a052af890	Cluster migration test cleanup. (#8726 ) * Dump more output on error (always, cluster tests currently have no verbose flag). * Slow down redis-cli check iteration.	2021-03-30 23:33:01 +03:00
Viktor Söderqvist	5629dbe715	Add support for plaintext clients in TLS cluster (#8587 ) The cluster bus is established over TLS or non-TLS depending on the configuration tls-cluster. The client ports distributed in the cluster and sent to clients are assumed to be TLS or non-TLS also depending on tls-cluster. The cluster bus is now extended to also contain the non-TLS port of clients in a TLS cluster, when available. The non-TLS port of a cluster node, when available, is sent to clients connected without TLS in responses to CLUSTER SLOTS, CLUSTER NODES, CLUSTER SLAVES and MOVED and ASK redirects, instead of the TLS port. The user was able to override the client port by defining cluster-announce-port. Now cluster-announce-tls-port is added, so the user can define an alternative announce port for both TLS and non-TLS clients. Fixes #8134	2021-03-30 23:11:32 +03:00
JunhuaY	28375ff63e	re-fix config rewrite for empty save directive (#8722 ) the bug was also discussed in #8716, and was solved in #8719, but incompletely: when the server is started, and the save option is default, if you issue the " config set save "" " to change the save option, and then issue the “config rewrite” command, the " save "" " won't be saved.	2021-03-30 22:49:06 +03:00
Oran Agra	cd81dcf18b	solve race conditions in psync2-pingoff test (#8720 ) Another test race condition in the macos tests. the test was waiting for PINGs to be generated and put on the replication stream, but waiting for 1 or 2 seconds doesn't really guarantee that. then the test that expected 6 full syncs, found only 4	2021-03-30 11:41:06 +03:00
Yossi Gottlieb	65311a3360	Fix config rewrite with an empty "save" parameter. (#8719 )	2021-03-29 18:53:20 +03:00
Sokolov Yura	315df9ada0	Add cluster slot migration tests (#8649 ) Add tests for fixing migrating slot at all stages: 1. when migration is half inited on "migrating" node 2. when migration is half inited on "importing" node 3. migration inited, but not finished 4. migration is half finished on "migrating" node 5. migration is half finished on "importing" node Also add tests for many simultaneous slot migrations. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2021-03-29 13:52:02 +03:00
Meir Shpilraien (Spielrein)	036963a7da	Restore old client 'processCommandAndResetClient' to fix false dead client indicator (#8715 ) 'processCommandAndResetClient' returns 1 if client is dead. It does it by checking if server.current_client is NULL. On script timeout, Redis will re-enter 'processCommandAndResetClient' and when it finishes we will set server.current_client to NULL. This will later cause it to falsely return 1 and think that the client that sent the timed-out script is dead (causing Redis to stop reading from the client buffer).	2021-03-29 13:34:16 +03:00
Huang Zhw	e138698e54	make processCommand check publish channel permissions. (#8534 ) Add publish channel permissions check in processCommand. processCommand didn't check publish channel permissions, so we can queue a publish command in a transaction. But when executing the transaction, it will fail with -NOPERM. We also union the keys/commands/channels permission checks together in ACLCheckAllPerm. Remove pubsubCheckACLPermissionsOrReply in publishCommand/subscribeCommand/psubscribeCommand. Always check permissions in processCommand/execCommand/ luaRedisGenericCommand.	2021-03-26 14:10:01 +03:00
Oran Agra	497351ad07	Fix SLOWLOG for blocked commands (#8632 ) * SLOWLOG didn't record anything for blocked commands because the client was reset and argv was already empty. there was a fix for this issue specifically for modules, now it works for all blocked clients. * The original command argv (before being re-written) was also reset before adding the slowlog on behalf of the blocked command. * Latency monitor is now updated regardless of the slowlog flags of the command or its execution (their purpose is to hide sensitive info from the slowlog, not hide the fact the latency happened). * Latency monitor now uses real_cmd rather than c->cmd (which may be different if the command got re-written, e.g. GEOADD) Changes: * Unify shared code between slowlog insertion in call() and updateStatsOnUnblock(), hopefully prevent future bugs from happening due to the latter being overlooked. * Reset CLIENT_PREVENT_LOGGING in resetClient rather than after command processing. * Add a test for SLOWLOG and BLPOP Notes: - real_cmd == c->lastcmd, except inside MULTI and Lua. - blocked commands never happen in these cases (MULTI / Lua) - real_cmd == c->cmd, except for when the command is rewritten (e.g. GEOADD) - blocked commands (currently) are never rewritten - other than the command's CLIENT_PREVENT_LOGGING, and the execution flag CLIENT_PREVENT_LOGGING, other cases that we want to avoid slowlog are on AOF loading (specifically CMD_CALL_SLOWLOG will be off when executed from execCommand that runs from an AOF)	2021-03-25 10:20:27 +02:00
Qu Chen	7de6451818	Properly initialize variable to make valgrind happy in checkChildrenDone(). Removed usage for the obsolete wait3() and wait4() in favor of waitpid(), and properly check for the exit status code. (#8666 )	2021-03-24 08:41:05 -07:00
yoav-steinberg	3060de88ce	Remove cron saving during BGSAVE test. (#8688 ) This fixes a race where a bgsave can start during the test after we verified no bgsave is running.	2021-03-24 15:14:47 +02:00
Oran Agra	f6e1a94e03	Corrupt stream key access to uninitialized memory (#8681 ) the corrupt-dump-fuzzer test found a case where an access to a corrupt stream would have caused accessing to uninitialized memory. now it'll panic instead. The issue was that there was a stream that says it has more than 0 records, but looking for the max ID came back empty handed. p.s. when sanitize-dump-payload is used, this corruption is detected, and the RESTORE command is gracefully rejected.	2021-03-24 11:33:49 +02:00
Yossi Gottlieb	c4ef1efdb7	Add support for reading encrypted keyfiles. (#8644 )	2021-03-22 13:27:46 +02:00
Oran Agra	2f717c156a	fix race in diskless load cluster tests (#8674 )	2021-03-22 10:51:13 +02:00
Oran Agra	a7c02b19bf	Fix race in replication test (#8679 ) Since redis 6.2, redis immediately tries to connect to the master, not waiting for replication cron. in the slow freebsd CI, this test failed and master_link_status was already "up" when INFO was called.	2021-03-22 10:50:39 +02:00
Meir Shpilraien (Spielrein)	9ae4f5c73d	Fix script kill to work also on scripts that use pcall (#8661 ) pcall function runs another LUA function in protected mode, this means that any error will be caught by this function and will not stop the LUA execution. The script kill mechanism uses error to stop the running script. Scripts that use pcall can catch the error raised by the script kill mechanism, this will cause a script like this to be unkillable: local f = function() while 1 do redis.call('ping') end end while 1 do pcall(f) end The fix is, when we want to kill the script, we set the hook function to be invoked after each line. This will promise that the execution will get another error before it is able to enter the pcall function again.	2021-03-17 18:52:11 +02:00
Huang Zhw	a19c4058be	When tests exit normally, some processes may still be alive (#8647 ) In certain scenarios start_server may think it failed to start a redis server although it started successfully. In these cases, it'll not terminate it, and it'll remain running when the test is over. In start_server if config doesn't have bind (the minimal.conf in introspection.tcl), it will try to bind ipv4 and ipv6. One may succeed while the other fails. It will output "Could not create server TCP listening socket". wait_server_started uses this message to check whether instance started successfully. So it will consider that it failed even though redis started successfully. Additionally, in some cases it wasn't clear to users why the server exited, since the warning message printed to the log could in some cases be harmless, and in some cases fatal. This PR makes a clear distinction between a warning log message and a fatal one, and changes the test suite to look for the fatal message.	2021-03-16 17:25:30 +02:00
Madelyn Olson	e1d98bca5a	Redact slowlog entries for config with sensitive data. (#8584 ) Redact config set requirepass/masterauth/masteruser from slowlog in addition to showing ACL commands without sensitive values.	2021-03-15 22:00:29 -07:00
guybe7	dba33a943d	Missing EXEC on modules propagation after failed EVAL execution (#8654 ) 1. moduleReplicateMultiIfNeeded should use server.in_eval like moduleHandlePropagationAfterCommandCallback 2. server.in_eval could have been set to 1 and not reset back to 0 (a lot of missed early-exits after in_eval is already 1) Note: The new assertions in processCommand cover (2) and I added two module tests to cover (1) Implications: If an EVAL that failed (and thus left server.in_eval=1) runs before a module command that replicates, the replication stream will contain MULTI (because moduleReplicateMultiIfNeeded used to check server.lua_caller which is NULL at this point) but not EXEC (because server.in_eval==1) This only affects modules as module.c the only user of server.in_eval. Affects versions 6.2.0, 6.2.1	2021-03-15 21:19:57 +02:00
KinWaiYuen	5b48d90049	Optimize CLUSTER SLOTS reply by reducing unneeded loops (#8541 ) This commit more efficiently computes the cluster bulk slots response by looping over the entire slot space once, instead of for each node.	2021-03-11 22:40:35 -08:00
guybe7	a4f03bd7eb	Fix some memory leaks in propagate.c (#8636 ) Introduced by `3d0b427c30`	2021-03-11 13:50:13 +02:00
Harkrishn Patro	b70d81f60b	Process hello command even if the default user has no permissions. (#8633 ) Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>	2021-03-10 21:19:35 -08:00
guybe7	3d0b427c30	Fix some issues with modules and MULTI/EXEC (#8617 ) Bug 1: When a module ctx is freed moduleHandlePropagationAfterCommandCallback is called and handles propagation. We want to prevent it from propagating commands that were not replicated by the same context. Example: 1. module1.foo does: RM_Replicate(cmd1); RM_Call(cmd2); RM_Replicate(cmd3) 2. RM_Replicate(cmd1) propagates MULTI and adds cmd1 to also_propagate 3. RM_Call(cmd2) creates a new ctx, calls call() and destroys the ctx. 4. moduleHandlePropagationAfterCommandCallback is called, calling alsoPropagates EXEC (Note: EXEC is still not written to socket), setting server.in_transaction = 0 5. RM_Replicate(cmd3) is called, propagating yet another MULTI (now we have nested MULTI calls, which is no good) and then cmd3 We must prevent RM_Call(cmd2) from resetting server.in_transaction. REDISMODULE_CTX_MULTI_EMITTED was revived for that purpose. Bug 2: Fix issues with nested RM_Call where some have '!' and some don't. Example: 1. module1.foo does RM_Call of module2.bar without replication (i.e. no '!') 2. module2.bar internally calls RM_Call of INCR with '!' 3. at the end of module1.foo we call RM_ReplicateVerbatim We want the replica/AOF to see only module1.foo and not the INCR from module2.bar Introduced a global replication_allowed flag inside RM_Call to determine whether we need to replicate or not (even if '!' was specified) Other changes: Split beforePropagateMultiOrExec to beforePropagateMulti afterPropagateExec just for better readability	2021-03-10 18:02:17 +02:00
Yossi Gottlieb	817894c012	Fix test false positive due to a race condition. (#8616 )	2021-03-08 21:22:08 +02:00
Yossi Gottlieb	7d81f39222	Fix flaky unit/maxmemory test on MacOS/BSD. (#8619 ) It seems like non-Linux sockets may be less greedy, resulting with more transient client output buffers. Haven't proven this but empirically when stressing this test on non-Linux tends to exhibit increased mem_clients_normal values.	2021-03-08 20:53:53 +02:00
Yossi Gottlieb	3c7d6a1853	Improve redis-cli non-binary safe string handling. (#8566 ) * The `redis-cli --scan` output should honor output mode (set explicitly or implicitly), and quote key names when not in raw mode. * Technically this is a breaking change, but it should be very minor since raw mode is by default on for non-tty output. * It should only affect TTY output (human users) or non-tty output if `--no-raw` is specified. * Added `--quoted-input` option to treat all arguments as potentially quoted strings. * Added `--quoted-pattern` option to accept a potentially quoted pattern. Unquoting is applied to potentially quoted input only if single or double quotes are used. Fixes #8561, #8563	2021-03-04 15:03:49 +02:00
Yossi Gottlieb	5d180d2834	Fix potential replication-4 test race condition. (#8583 ) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-03-02 18:12:11 +02:00
YaacovHazan	c19530bc71	fix new networking tests to work when the test suite is used in tls mode (#8582 ) the tests were unable to connect to the server since they attempted to use normal tcp	2021-03-01 20:53:02 +02:00
Oran Agra	349ef3f6a0	fix stream deep sanitization with deleted records (#8568 ) When sanitizing the stream listpack, we need to count the deleted records too. otherwise the last line that checks the next pointer fails. Add test to cover that state in the stream tests.	2021-03-01 17:23:29 +02:00
YaacovHazan	a031d268b1	Make port, tls-port and bind configurations modifiable (#8510 ) Add ability to modify port, tls-port and bind configurations by CONFIG SET command. To simplify the code and make it cleaner, a new structure added, socketFds, which contains the file descriptors array and its counter, and used for TCP, TLS and Cluster sockets file descriptors.	2021-03-01 16:04:44 +02:00
Bonsai	81a55d026f	fix: call CLIENT INFO from redis module will crash the server (#8560 ) Because when the RM_Call is invoked. It will create a faker client. The point is client connection is NULL, so server will crash in connGetInfo Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2021-03-01 08:18:14 +02:00
Viktor Söderqvist	6122f1c450	Shared reusable client for RM_Call() (#8516 ) A single client pointer is added in the server struct. This is initialized by the first RM_Call() and reused for every subsequent RM_Call() except if it's already in use, which means that it's not used for (recursive) module calls to modules. For these, a new "fake" client is created each time. Other changes: * Avoid allocating a dict iterator in pubsubUnsubscribeAllChannels when not needed	2021-02-28 14:11:18 +02:00
Madelyn Olson	c6f0ea2c81	Allow stopped redis processes to be killed in tests (#8552 )	2021-02-24 14:26:16 -08:00
sundb	60d5ef4d82	Use addReplyErrorObject with shared.noscripterr (#8544 )	2021-02-24 08:45:13 -08:00
guybe7	f745c0181a	Fix race in CONFIG REWRITE sanity (#8536 ) server may still be LOADING the RDB when receiving the ping	2021-02-23 20:28:03 +02:00
Yossi Gottlieb	95ea74549c	Fix failed tests on Linux Alpine and add a CI job. (#8532 ) * Remove linux/version.h dependency. This introduces unnecessary dependencies, and generally not a good idea as the platform we build on may be different than the platform we run on. To determine if sync_file_range exists we can simply rely on header file hints. * Fix setproctitle() on libmusl. The previous ifdef checks were a bit too strict for no apparent reason. * Fix tests failure on Linux with no backtrace. * Add alpine daily CI job.	2021-02-23 12:57:45 +02:00
Harkrishn Patro	4739131ca6	Remove acl subcommand validation if fully added command exists. (#8483 ) This validation was only done for sub-commands and not for commands. These would have been valid (not produce any error) ACL SETUSER bob +@all +client ACL SETUSER bob +client +client so no reason for this one to fail: ACL SETUSER bob +client +client\|id One example why this is needed is that pfdebug wasn't part of the @hyperloglog group and now it is. so something like: acl setuser user1 +@hyperloglog +pfdebug\|test would have succeeded in early 6.0.x, and fail in 6.2 RC3 Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-02-22 15:22:25 +02:00
Huang Zw	f687ac0c32	Client tracking tracking-redir-broken push len is 2 not 3 (#8456 ) When redis responds with tracking-redir-broken push message (RESP3), it was responding with a broken protocol: an array of 3 elements, but only pushes 2 elements. Some bugs in the test make this pass. Reading the push reply will consume an extra reply, because the reply length is 3, but there are only two elements, so the next reply will be treated as the third element. So the test is corrected too. Other changes: * checkPrefixCollisionsOrReply success should return 1 instead of -1, this bug didn't have any implications. * improve client tracking tests to validate more of the response it reads.	2021-02-21 09:34:46 +02:00
Gnanesh	0772098b1b	EXPIRE, EXPIREAT, SETEX, GETEX: Return error when expire time overflows (#8287 ) Respond with error if expire time overflows from positive to negative or vice versa. * `SETEX`, `SET EX`, `GETEX` etc would already have errored on a negative value, but now they would also error on overflows (i.e. when the input was positive but after the manipulation it becomes negative, which would have passed before) * `EXPIRE` and `EXPIREAT` were ok taking negative values (would implicitly delete the key), we keep that, but we do error if the user provided a value that changes sign when manipulated (except the case of changing sign when `basetime` is added) Signed-off-by: Gnanesh <gnaneshkunal@outlook.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-02-21 09:09:54 +02:00
sundb	46346e9e3a	Fix timing error oom-score-adj test (#8513 ) fixes timing issue, fork didn't always get to set the oom score before the test verified it.	2021-02-19 13:01:25 +02:00
uriyage	fd052d2a86	Adds INFO fields to track fork child progress (#8414 ) * Adding current_save_keys_total and current_save_keys_processed info fields. Present in replication, BGSAVE and AOFRW. * Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress) * Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW, and module forks.	2021-02-16 16:06:51 +02:00
Oran Agra	fb3457d157	minor test suite cleanup, revive old test (#8497 ) There are two tests in other.tcl that were dependent on the sha1 package import which meant that they didn't usually run. The reason it was like that was that prior to the creation of DEBUG DIGEST, the test suite used to have an equivalent function, but that's no longer the case and this dependency isn't needed. The other change is to revert config changes done by the test before the test suite continues. This can be useful if using `--host` to run multiple units against the same server	2021-02-15 17:20:03 +02:00
Yossi Gottlieb	141ac8df59	Escape unsafe field name characters in INFO. (#8492 ) Fixes #8489	2021-02-15 17:08:53 +02:00
Oran Agra	30775bc3e3	solve race in replication-2 test - again (#8491 ) this should make it timing independent and also faster in most cases	2021-02-15 12:50:23 +02:00
Viktor Söderqvist	0bc8c9c8f9	Modules: In RM_HashSet, add COUNT_ALL flag and set errno (#8446 ) The added flag affects the return value of RM_HashSet() to include the number of inserted fields, in addition to updated and deleted fields. errno is set on errors, tests are added and documentation updated.	2021-02-15 11:40:05 +02:00
Yossi Gottlieb	8c42d1257f	Fix errors with sentinel leaked fds test. (#8482 ) * Don't run test script on non-Linux. * Verify that reported fds do indeed exist also in parent, to avoid false negatives on some systems (namely CentOS). Co-authored-by: Andy Pan <panjf2000@gmail.com>	2021-02-11 15:25:01 +02:00
filipe oliveira	b5ca1e9e53	Removed time sensitive checks from block on background tests. Fixed uninitialized variable (#8479 ) - removes time sensitive checks from block on background tests during leak checks. - fix uninitialized variable on RedisModuleBlockedClient() when calling RM_BlockedClientMeasureTimeEnd() without RM_BlockedClientMeasureTimeStart()	2021-02-10 08:59:07 +02:00
WuYunlong	203f357c32	Cleanup in redis-cli and tests: release memory on exit, change dup test name (#8475 ) 1. Rename 18-cluster-nodes-slots.tcl to 19-cluster-nodes-slots.tcl. it was conflicting with another test prefixed by 18 2. Release memory on exit in redis-cli.c. 3. Fix freeConvertedSds indentation.	2021-02-09 12:36:09 +02:00
Yossi Gottlieb	dbcc0a85d0	Fix and cleanup Sentinel leaked fds test. (#8469 ) * For consistency, use tclsh for the script as well * Ignore leaked fds that originate from grandparent process, since we only care about fds redis-sentinel itself is responsible for * Check every test iteration to catch problems early * Some cleanups, e.g. parameterization of file name, etc.	2021-02-08 17:02:46 +02:00
filipe oliveira	b2351ea0dc	[fix] Increasing block on background timeout time to avoid test failure (#8470 ) The test failed from time to time on Github actions. We think it's possible that on the module's blocking timeout time tracking test, the timeout is happening before we issue the RedisModule_BlockedClientMeasureTimeStart(bc) on the background thread. If that is the case one possible solution is to increase the timeout. Increasing from 200ms to 500ms to see if nightly stops failing.	2021-02-08 16:24:00 +02:00
Oran Agra	02ab14cc2e	solve race in replication-2 test (#8461 ) use SIGSTOP instead of DEBUG SLEEP, reduces the test time by some 2 seconds and avoids failures on slow machines	2021-02-07 16:22:30 +02:00
Yossi Gottlieb	5b8350aaaa	Add --dump-logs tests option. (#8459 ) Dump the entire server log if a test failed, to ease troubleshooting with no access to log files.	2021-02-07 12:37:24 +02:00
Viktor Söderqvist	aea6e71ef8	RM_ZsetRem: Delete key if empty (#8453 ) Without this fix, RM_ZsetRem can leave empty sorted sets which are not allowed to exist. Removing from a sorted set while iterating seems to work (while inserting causes failed assertions). RM_ZsetRangeEndReached is modified to return 1 if the key doesn't exist, to terminate iteration when the last element has been removed.	2021-02-05 19:54:01 +02:00
filipe oliveira	b3bdcd2278	Fix compiler warning on implicit declaration of ‘nanosleep’ . Removed unused variable (#8454 )	2021-02-05 19:51:31 +02:00
sundb	18ac41973b	RAND* commands: fix risk of OOM panic in hash and zset, use fair random in hash, and add tests for even distribution to all (#8429 ) Changes to HRANDFIELD and ZRANDMEMBER: * Fix risk of OOM panic when client query a very big negative count (avoid allocating huge temporary buffer). * Fix uneven random distribution in HRANDFIELD with negative count (wasn't using dictGetFairRandomKey). * Add tests to check an even random distribution (HRANDFIELD, SRANDMEMBER, ZRANDMEMBER). Co-authored-by: Oran Agra <oran@redislabs.com>	2021-02-05 15:56:20 +02:00
Yang Bodong	b7b23a0ff5	Fix GEOSEARCH tcl test error (#8451 ) Issue with new test due to longitude wraparound.	2021-02-04 19:39:07 +02:00
Yang Bodong	ded1655d49	GEOSEARCH bybox bug fixes and new fuzzy tester (#8445 ) Fix errors of GEOSEARCH bybox search due to: 1. projection of the box to a trapezoid (when the meter box is converted to long / lat it's no longer a box). 2. width and height mismatch Changes: - New GEOSEARCH point in rectangle algorithm - Fix GEOSEARCH bybox width and height mismatch bug - Add GEOSEARCH bybox testing to the existing "GEOADD + GEORANGE randomized test" - Add new fuzzy test to stress test the bybox corners and edges - Add some tests for edge cases of the bybox algorithm Co-authored-by: Oran Agra <oran@redislabs.com>	2021-02-04 18:08:35 +02:00
Yossi Gottlieb	52fb306535	Fix 32-bit test modules build. (#8448 )	2021-02-04 11:37:28 +02:00
Yossi Gottlieb	de6f3ad017	Fix FreeBSD tests and CI Daily issues. (#8438 ) * Add bash temporarily to allow sentinel fd leaks test to run. * Use vmactions-freebsd rdist sync to work around bind permission denied and slow execution issues. * Upgrade to tcl8.6 to be aligned with latest Ubuntu envs. * Concat all command executions to avoid ignoring failures. * Skip intensive fuzzer on FreeBSD. For some yet unknown reason, generate_fuzzy_traffic_on_key causes TCL to significantly bloat on FreeBSD resulting with out of memory.	2021-02-03 17:35:28 +02:00
Oran Agra	8f27578de2	temporarily disable sentinel test FD leak print (#8425 ) These tests are not yet stable. on github actions they show some false leaks.	2021-01-31 12:14:36 +02:00
Oran Agra	5a7eb9c881	Fix test issues from introduction of HRANDFIELD (#8424 ) * The corrupt dump fuzzer found a division by zero. * in some cases the random fields from the HRANDFIELD tests produced fields with newlines and other special chars (due to \ char), this caused the TCL tests to see a bulk response that has a newline in it and add {} around it, later it can think this is a nested list. in fact the `alpha` random string generator isn't using spaces and newlines, so it should not use `\` either.	2021-01-31 12:13:45 +02:00
Wen Hui	eacccd2acb	fix sentinel tests error (#8422 ) This commit fixes sentinel announces hostnames test error in certain linux environment Before this commit, we only check localhost is resolved into 127.0.0.1, however in ubuntu or some other linux environments "localhost" will be resolved into ::1 ipv6 address first if the network stack is capable.	2021-01-30 11:18:58 +02:00
filipe oliveira	f0c5052aa8	Enabled background and reply time tracking on blocked on keys/blocked on background work clients (#7491 ) This commit enables tracking time of the background tasks and on replies, opening the door for properly tracking commands that rely on blocking / background work via the slowlog, latency history, and commandstats. Some notes: - The time spent blocked waiting for key changes, or blocked on synchronous replication is not accounted for. - This commit does not affect latency tracking of commands that are non-blocking or do not have background work. ( meaning that it all stays the same with exception to `BZPOPMIN`,`BZPOPMAX`,`BRPOP`,`BLPOP`, etc... and module's commands that rely on background threads ). - Specifically for latency history command we've added a new event class named `command-unblocking` that will enable latency monitoring on commands that spawn background threads to do the work. - For blocking commands we're now considering the total time of a command as the time spent on call() + the time spent on replying when unblocked. - For Modules commands that rely on background threads we're now considering the total time of a command as the time spent on call (main thread) + the time spent on the background thread ( if marked within `RedisModule_MeasureTimeStart()` and `RedisModule_MeasureTimeEnd()` ) + the time spent on replying (main thread) To test for this feature we've added a `unit/moduleapi/blockonbackground` test that relies on a module that blocks the client and sleeps on the background for a given time. - check blocked command that uses RedisModule_MeasureTimeStart() is tracking background time - check blocked command that uses RedisModule_MeasureTimeStart() is tracking background time even in timeout - check blocked command with multiple calls RedisModule_MeasureTimeStart() is tracking the total background time - check blocked command without calling RedisModule_MeasureTimeStart() is not reporting background time	2021-01-29 15:38:30 +02:00
Yang Bodong	b9a0500f16	Add HRANDFIELD and ZRANDMEMBER. improvements to SRANDMEMBER (#8297 ) New commands: `HRANDFIELD [<count> [WITHVALUES]]` `ZRANDMEMBER [<count> [WITHSCORES]]` Algorithms are similar to the one in SRANDMEMBER. Both return a simple bulk response when no arguments are given, and an array otherwise. In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array. note: in all 3 commands, the only option that also provides random order is the one with negative count. Changes to SRANDMEMBER * Optimization when count is 1, we can use the more efficient algorithm of non-unique random * optimization: work with sds strings rather than robj Other changes: * zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small) * Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or case 4) and by accident used a negative count Co-authored-by: xinluton <xinluton@qq.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-29 10:47:28 +02:00
Allen Farris	0d18a1e85f	implement FAILOVER command (#8315 ) Implement FAILOVER command, which coordinates failover between the server and one of its replicas.	2021-01-28 13:18:05 -08:00
Yossi Gottlieb	4bb5ccbefb	Add proc-title-template option. (#8397 ) Make it possible to customize the process title, i.e. include custom strings, immutable configuration like port, tls-port, unix socket name, etc.	2021-01-28 18:17:39 +02:00
Viktor Söderqvist	4355145a62	Add modules API for streams (#8288 ) APIs added for these stream operations: add, delete, iterate and trim (by ID or maxlength). The functions are prefixed by RM_Stream. * RM_StreamAdd * RM_StreamDelete * RM_StreamIteratorStart * RM_StreamIteratorStop * RM_StreamIteratorNextID * RM_StreamIteratorNextField * RM_StreamIteratorDelete * RM_StreamTrimByLength * RM_StreamTrimByID The type RedisModuleStreamID is added and functions for converting from and to RedisModuleString. * RM_CreateStringFromStreamID * RM_StringToStreamID Whenever the stream functions return REDISMODULE_ERR, errno is set to provide additional error information. Refactoring: The zset iterator fields in the RedisModuleKey struct are wrapped in a union, to allow the same space to be used for type- specific info for streams and allow future use for other key types.	2021-01-28 16:19:43 +02:00
Yossi Gottlieb	bb7cd97439	Add hostname support in Sentinel. (#8282 ) This is both a bugfix and an enhancement. Internally, Sentinel relies entirely on IP addresses to identify instances. When configured with a new master, it also requires users to specify an IP and not a hostname. However, replicas may use the replica-announce-ip configuration to announce a hostname. When that happens, Sentinel fails to match the announced hostname with the expected IP and considers that a different instance, triggering reconfiguration, etc. Another use case is where TLS is used and clients are expected to match the hostname to connect to with the certificate's SAN attribute. To properly implement this configuration, it is necessary for Sentinel to redirect clients to a hostname rather than an IP address. The new 'resolve-hostnames' configuration parameter determines if Sentinel is willing to accept hostnames. It is set by default to no, which maintains backwards compatibility and avoids unexpected DNS resolution delays on systems with DNS configuration issues. Internally, Sentinel continues to identify instances by their resolved IP address and will also report the IP by default. The new 'announce-hostnames' parameter determines if Sentinel should prefer to announce a hostname, when available, rather than an IP address. This applies to addresses returned to clients, as well as their representation in the configuration file, REPLICAOF configuration commands, etc. This commit also introduces SENTINEL CONFIG GET and SENTINEL CONFIG SET which can be used to introspect or configure global Sentinel configuration that was previously only possible by directly accessing the configuration file and possibly restarting the instance. Co-authored-by: myl1024 <myl92916@qq.com> Co-authored-by: sundb <sundbcn@gmail.com>	2021-01-28 12:09:11 +02:00
Z. Liu	17b34c7309	Add 'set-proc-title' config so that this mechanism can be disabled (#3623 ) if option `set-proc-title` is no, then do nothing for proc title. The reason has been explained long ago, see following: We update redis to 2.8.8, then found there are some side effects when redis always changes the process title. We run several slave instances on one computer, and all these slaves listen on unix socket only, then ps will show: 1 S redis 18036 1 0 80 0 - 56130 ep_pol 14:02 ? 00:00:31 /usr/sbin/redis-server :0 1 S redis 23949 1 0 80 0 - 11074 ep_pol 15:41 ? 00:00:00 /usr/sbin/redis-server :0 for redis 2.6 the output of ps is like following: 1 S redis 18036 1 0 80 0 - 56130 ep_pol 14:02 ? 00:00:31 /usr/sbin/redis-server /etc/redis/a.conf 1 S redis 23949 1 0 80 0 - 11074 ep_pol 15:41 ? 00:00:00 /usr/sbin/redis-server /etc/redis/b.conf The latter is more informative in our case. The situation is worse when we manage the config and process running state by salt. Salt checks the process by running "ps | grep SIG" (for Gentoo System) to check the running state, where SIG is the string to search for when looking for the service process with ps. Previously, we define sig as "/usr/sbin/redis-server /etc/redis/a.conf". Since the ps output is identical for our case, we have no way to check the state of a specified redis instance. So, for our case, we prefer the old behavior, i.e, do not change the process title for the main redis process. Or add an option such as "set-proc-title [yes|no]" to control this behavior. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-28 11:12:39 +02:00
Raghav Muddur	0367a80819	GETEX, GETDEL and SET PXAT/EXAT (#8327 ) This commit introduces two new commands and two options for an existing command GETEX <key> [PERSIST][EX seconds][PX milliseconds] [EXAT seconds-timestamp] [PXAT milliseconds-timestamp] The getexCommand() function implements extended options and variants of the GET command. Unlike GET command this command is not read-only. Only one of the options can be used at a given time. 1. PERSIST removes any TTL associated with the key. 2. EX Set expiry TTL in seconds. 3. PX Set expiry TTL in milliseconds. 4. EXAT Same as EX, but instead of specifying the number of seconds representing the TTL (time to live), it takes an absolute Unix timestamp 5. PXAT Same as PX, but instead of specifying the number of milliseconds representing the TTL (time to live), it takes an absolute Unix timestamp Command would return either the bulk string, error or nil. GETDEL <key> Would delete the key after getting. SET key value [NX] [XX] [KEEPTTL] [GET] [EX <seconds>] [PX <milliseconds>] [EXAT <seconds-timestamp>][PXAT <milliseconds-timestamp>] Two new options added here are EXAT and PXAT Key implementation notes - `SET` with `PX/EX/EXAT/PXAT` is always translated to `PXAT` in `AOF`. When relative time is specified (`PX/EX`), replication will always use `PX`. - `setexCommand` and `psetexCommand` would no longer need translation in `feedAppendOnlyFile` as they are modified to invoke `setGenericCommand` with appropriate flags which will take care of correct AOF translation. - `GETEX` without any optional argument behaves like `GET`. - `GETEX` command is never propagated as is; it is propagated as either `PEXPIRE[AT]` or `PERSIST`. - `GETDEL` command is propagated as `DEL` - Combined the validation for `SET` and `GETEX` arguments. - Test cases to validate AOF/Replication propagation	2021-01-27 19:47:26 +02:00
Oran Agra	9e56d3969a	Add tests for RESP3 response of ZINTER and ZRANGE (#8391 ) It was confusing as to why these don't return a map type. The reason is that order matters, so we need to make sure the client library knows to respect it. Added comments in the implementation and tests to cover it.	2021-01-26 17:55:32 +02:00
Wen Hui	1aad55b66f	Sentinel: Fix Config Dependency and Rewrite Sequence (#8271 ) This commit fixes a well known and an annoying issue in Sentinel mode. Cause of this issue: Currently, Redis rewrite process works well in server mode, however in sentinel mode, the sentinel config has variant semantics for different configurations, in example configuration https://github.com/redis/redis/blob/unstable/sentinel.conf, we put comments on these. However the rewrite process only treat the sentinel config as a single option. During rewrite process, it will mess up with the lines and comments. Approaches: In order to solve this issue, we need to differentiate different subconfig options in sentinel separately, for example, sentinel monitor <master-name> <ip> <redis-port> <quorum> we can treat it as sentinel monitor option, instead of the sentinel option. This commit also fixes the dependency issue when putting configurations in sentinel.conf. For example before this commit,we must put `sentinel monitor <master-name> <ip> <redis-port> <quorum>` before `sentinel auth-pass <master-name> <password>` for a single master, otherwise the server cannot start and will return error. This commit fixes this issue, as long as the monitoring master was configured, no matter the sequence is, the sentinel can start and run properly.	2021-01-26 09:31:54 +02:00
Oran Agra	437e258384	Fix rare test failures due to repl-ping-replica-period (#8393 ) some tests use attach_to_replication_stream to watch what's propagated to replicas, but in some cases the periodic ping may slip in and fail the test. we disable that ping by setting the period to once an hour (tests should not run for that long). other change is so that the next time this oom-score-adj test fails, we'll see the value (assert_equals prints it)	2021-01-25 11:05:25 +02:00
Oran Agra	f225891526	Fix recent test failures (#8386 ) 1. Valgrind leak in a recent change in a module api test 2. Increase threshold of a RESTORE TTL test 3. Change assertions to use assert_range which prints the values	2021-01-23 21:53:58 +02:00
Viktor Söderqvist	9c1483100a	Test that module can wake up module blocked on non-empty list key (#8382 ) BLPOP and other blocking list commands can only block on empty keys and LPUSH only wakes up clients when the list is created. Using the module API, it's possible to block on a non-empty key. Unblocking a client blocked on a non-empty list (or zset) can only be done using RedisModule_SignalKeyAsReady(). This commit tests it.	2021-01-22 16:19:37 +02:00
Andy Pan	8449a5df87	Sentinel tests, disable FD leak check, and print more details (#8376 ) * Print more details about fd leaks * temporarily prevent the leaks from failing the tests Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-22 12:11:58 +02:00
guybe7	5a77d015be	Fix misleading module test (#8366 ) the test was misleading because the module would actually wake up on a wrong type and re-block, while the test name suggests the module doesn't wake up at all on a wrong type. i changed the name of the test + added verification that indeed the module wakes up and gets re-blocked after it understands it's the wrong type	2021-01-20 14:03:38 +02:00
Andy Pan	6401920d70	Fix sentinel FD leak test, checking the wrong OS name (#8364 )	2021-01-20 10:17:20 +02:00
Andy Pan	1be29606c5	Fix sentinel FD leak test, not printing the list of leaks (#8363 )	2021-01-20 09:58:02 +02:00
Andy Pan	fb66e2e249	Use FD_CLOEXEC in Sentinel, so that FDs don't leak to the scripts it runs (#8242 ) Sentinel uses execve to run scripts, so it needs to use FD_CLOEXEC on all file descriptors, so that they're not accessible by the script it runs. This commit includes a change to the sentinel tests, which verifies no FDs are left opened when the script is executed.	2021-01-19 22:57:30 +02:00
Oran Agra	a29aec9abb	Add tests to make sure that relative EXPIRE is propagated to replicas (#8357 ) This commit adds tests to make sure that relative and absolute expire commands are propagated as is to replicas and stop any future attempt to change that without a proper discussion. see #8327 and #5171 Additionally it slightly improves the AOF test that tests the opposite (always propagating absolute times), by covering more commands, and shaving 2 seconds from the test time.	2021-01-19 18:49:26 +02:00
Viktor Söderqvist	4985c11bd6	Bugfix: Make modules blocked on keys unblock on commands like LPUSH (#8356 ) This was a regression from #7625 (only in 6.2 RC2). This makes it possible again to implement blocking list and zset commands using the modules API. This commit also includes a test case for the reverse: A module unblocks a client blocked on BLPOP by inserting elements using RedisModule_ListPush(). This already works, but it was untested.	2021-01-19 13:15:33 +02:00
Yossi Gottlieb	522d93607a	Add io-thread daily CI tests. (#8232 ) This adds basic coverage to IO threads by running the cluster and few selected Redis test suite tests with the IO threads enabled. Also provides some necessary additional improvements to the test suite: * Add --config to sentinel/cluster tests for arbitrary configuration. * Fix --tags whitelisting which was broken. * Add a `network` tag to some tests that are more network intensive. This is work in progress and more tests should be properly tagged in the future.	2021-01-17 15:48:48 +02:00
Yang Bodong	294f93af97	Add lazyfree-lazy-user-flush config to control default behavior of FLUSH[ALL|DB], SCRIPT FLUSH (#8258 ) * Adds ASYNC and SYNC arguments to SCRIPT FLUSH * Adds SYNC argument to FLUSHDB and FLUSHALL * Adds new config to control the default behavior of FLUSHDB, FLUSHALL and SCRIPT FLUSH. the new behavior is as follows: * FLUSH[ALL|DB],SCRIPT FLUSH: Determine sync or async according to the value of lazyfree-lazy-user-flush. * FLUSH[ALL|DB],SCRIPT FLUSH ASYNC: Always flushes the database in an async manner. * FLUSH[ALL|DB],SCRIPT FLUSH SYNC: Always flushes the database in a sync manner.	2021-01-15 15:32:58 +02:00
Wang Yuan	9cb9f98d2f	Optimize performance of clusterGenNodesDescription for large clusters (#8182 ) Optimize the performance of clusterGenNodesDescription by only checking slot ownership of each slot once, instead of checking each slot for each node.	2021-01-13 12:36:03 -08:00
Oran Agra	4f8458d8d6	fix race in cluster transactions test (#8312 ) we didn't wait for the commands executed on the master to reach the replica.	2021-01-12 10:03:45 +02:00
Madelyn Olson	b24b490393	Fix issues in wait test (#8310 ) This fixes three issues: 1. Using debug SLEEP was impacting the subsequent test, and causing it to pass reliably even though it should have failed. There was exactly 5 seconds of artificial pause (after 1000, wait 3000, wait 1000) between the debug sleep 5 and when we needed to unblock the client in the subsequent test. Now the test properly makes sure the client is unblocked, and the subsequent test is fixed. 2. Minor, the client pause types were using & comparisons instead of ==, since it was previously a flag. 3. Test is faster now that some of the hand wavy time is removed.	2021-01-12 09:46:24 +02:00
Oran Agra	264953871b	Fix cluster diskless load swapdb test (#8308 ) The test was trying to wait for the replica to start loading the rdb from the master before it kills the master, but it was actually waiting for ROLE to be in "sync" mode, which corresponds to REPL_STATE_TRANSFER that starts before the actual loading starts. now instead it waits for the loading flag to be set. Besides, the test was dependent on the previous configuration of the servers, relying on the fact the replica is configured to persist (either RDB or AOF), now it is set explicitly.	2021-01-12 09:41:57 +02:00
Oran Agra	8dd16caec8	Fix last COW INFO report, Skip test on non-linux platforms (#8301 ) - the last COW report wasn't always read from the pipe (receiveLastChildInfo wasn't used) - but in fact, there's no reason we won't always try to drain that pipe so i'm unifying receiveLastChildInfo with receiveChildInfo - adjust threshold of the COW test when run in accurate mode - add some prints in case this test fails again - fix indentation, page size, and PID! in MacOS proc info p.s. it seems that pri_pages_dirtied is always 0	2021-01-08 23:35:30 +02:00
Yang Bodong	ea5350c5ec	GEOSEARCH - ANY option, for limited search that returns ASAP (#8259 ) Support ANY option to return some results that match the criteria ASAP, without a complete search and implicit sorting.	2021-01-08 18:29:44 +02:00
guybe7	814aad65f1	XADD and XTRIM, Trim by MINID, and new LIMIT argument (#8169 ) This PR adds another trimming strategy to XADD and XTRIM named MINID (complements the existing MAXLEN). It also adds a new LIMIT argument that allows incremental trimming by repeated calls (rather than all at once). This provides the ability to trim all records older than a certain ID (which makes it possible for the user to trim by age too). Example: XTRIM mystream MINID ~ 1608540753 will trim entries with id < 1608540753, but might not trim all (because of the ~ modifier) The purpose is to ease the use of streams. many users use streams as logs and the common case is wanting a log of the last X seconds rather than a log that contains maximum X entries (new MINID vs existing MAXLEN) The new LIMIT modifier is only supported when the trim strategy uses ~. i.e. when the user asked for exact trimming, it all happens in one go (no possibility for incremental trimming). However, when ~ is provided, we trim full rax nodes, up to the limit number of records. The default limit is 100*stream_node_max_entries (used when LIMIT is not provided). I.e. this is a behavior change (even if the existing MAXLEN strategy is used). An explicit limit of 0 means unlimited (but note that it's not the default). Other changes: Refactor arg parsing code for XADD and XTRIM to use common code.	2021-01-08 18:13:25 +02:00
Oran Agra	5843a45d01	Skip defrag tests on systems with bigger page sizes (#8294 ) The defragger works well on these systems, but the tests and their thresholds are not adjusted for these big pages, so the defragger isn't able to get down the fragmentation to the levels the test expects and it fails on "defrag didn't stop". Randomly choosing 8k as the threshold for the skipping Fixes #8265 (which had 65k pages)	2021-01-08 10:03:21 +02:00
Madelyn Olson	999494cef8	Throw error for conflicting bcast tracking prefixes (#8176 ) Throw an error if there are conflicting bcast tracking prefixes.	2021-01-08 00:00:35 -08:00
Madelyn Olson	47579bdf5c	Add support for client pause WRITE (#8170 ) Implementation of client pause WRITE and client unpause	2021-01-07 23:36:54 -08:00
YaacovHazan	ea930a352c	Report child copy-on-write info continuously Add INFO field, rdb_active_cow_size, to report COW of a live fork child while it's active. - once in 1024 keys check the time, and if there's more than one second since the last report send a report to the parent via the pipe. - refactor the child_info_data struct, it's an implementation detail that shouldn't be in the server struct, and not used to communicate data between caller and callee - remove the magic value from that struct (not sure what it was good for), and instead add handling of short reads. - add another value to the structure, cow_type, to indicate if the report is for the new rdb_active_cow_size field, or it's the last report of a successful operation - add new Module API to report the active COW - add more asserts variants to test.tcl	2021-01-07 16:14:29 +02:00
Jonah H. Harris	b5029dfdad	Add ZRANGESTORE command, and improve ZSTORE command (#7844 ) Add ZRANGESTORE command, and improve ZSTORE command to deprecate Z[REV]RANGE[BYSCORE|BYLEX]. Syntax for the new ZRANGESTORE command: ZRANGESTORE [BYSCORE | BYLEX] [REV] [LIMIT offset count] New syntax for ZRANGE: ZRANGE [BYSCORE | BYLEX] [REV] [WITHSCORES] [LIMIT offset count] Old syntax for ZRANGE: ZRANGE [WITHSCORES] Other ZRANGE commands remain unchanged. The implementation uses common code for all of these, by utilizing a consumer interface that in one command responds to the client, and in the other command stores a zset key. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-07 10:58:53 +02:00
guybe7	714e103ac3	Add XAUTOCLAIM (#7973 ) New command: XAUTOCLAIM <key> <group> <consumer> <min-idle-time> <start> [COUNT <count>] [JUSTID] The purpose is to claim entries from a stale consumer without the usual XPENDING+XCLAIM combo which takes two round trips. The syntax for XAUTOCLAIM is similar to scan: A cursor is returned (streamID) by each call and should be used as start for the next call. 0-0 means the scan is complete. This PR extends the deferred reply mechanism for any bulk string (not just counts) This PR carries some unrelated test code changes: - Renames the term "client" into "consumer" in the stream-cgroups test - And also changes DEBUG SLEEP into "after" Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-06 10:34:27 +02:00
Oran Agra	2017407b4d	Fix wrong order of key/value in Lua map response (#8266 ) When a Lua script returns a map to redis (a feature which was added in redis 6 together with RESP3), it would have returned the value first and the key second. If the client was using RESP2, it was getting them out of order, and if the client was in RESP3, it was getting a map of value => key. This was happening regardless of the Lua script using redis.setresp(3) or not. This also affects a case where the script was returning a map which it got from redis by doing something like: redis.setresp(3); return redis.call() This fix is a breaking change for redis 6.0 users who happened to rely on the wrong order (either ones that used redis.setresp(3), or ones that returned a map explicitly). This commit also includes two other changes in the tests: 1. The test suite now handles RESP3 maps as dicts rather than nested lists 2. Remove some redundant (duplicate) tests from tracking.tcl	2021-01-05 08:29:20 +02:00
Yang Bodong	10f94b0ab1	Swapdb should make transaction fail if there is any client watching keys (#8239 ) This PR not only fixes the problem that swapdb does not make the transaction fail, but also optimizes the FLUSHALL and FLUSHDB command to set the CLIENT_DIRTY_CAS flag to avoid unnecessary traversal of clients. FLUSHDB was changed to first iterate on all watched keys, and then on the clients watching each key. Instead of iterating through all clients, and for each iterate on watched keys. Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-04 14:48:28 +02:00
kukey	33fb617053	GEOADD - add [CH] [NX|XX] options (#8227 ) New command flags similar to what SADD already has. Co-authored-by: huangwei03 <huangwei03@kuaishou.com> Co-authored-by: Itamar Haber <itamar@redislabs.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2021-01-03 17:13:37 +02:00
Oran Agra	71fbe6e800	Fix leak in new errorstats commit, and a flaky test (#8278 )	2021-01-02 08:37:19 +02:00
filipe oliveira	90b9f08e5d	Add errorstats info section, Add failed_calls and rejected_calls to commandstats (#8217 ) This Commit pushes forward the observability on overall error statistics and command statistics within redis-server: It extends INFO COMMANDSTATS to have - failed_calls in - so we can keep track of errors that happen from the command itself, broken by command. - rejected_calls - so we can keep track of errors that were triggered outside the command processing per se Adds a new section to INFO, named ERRORSTATS that enables keeping track of the different errors that occur within redis ( within processCommand and call ) based on the reply Error Prefix ( The first word after the "-", up to the first space ). This commit also fixes RM_ReplyWithError so that it can be correctly identified as an error reply.	2020-12-31 16:53:43 +02:00
Oran Agra	19d4705ffd	Make the protocol-version argument of HELLO optional (#7377 )	2020-12-27 16:37:27 +02:00
zhaozhao.zz	299f9ebffa	Tracking: add CLIENT TRACKINGINFO subcommand (#7309 ) Add CLIENT TRACKINGINFO subcommand Co-authored-by: Oran Agra <oran@redislabs.com>	2020-12-27 13:14:39 +02:00
Itamar Haber	f44186e575	Adds count to L/RPOP (#8179 ) Adds: `L/RPOP <key> [count]` Implements no. 2 of the following strategies: 1. Loop on listTypePop - this would result in multiple calls for memory freeing and allocating (see `769167a079`) 2. Iterate the range to build the reply, then call quickListDelRange - this requires two iterations and is the current choice 3. Refactor quicklist to have a pop variant of quickListDelRange - probably optimal but more complex Also: * There's a historical check for NULL after calling listTypePop that was converted to an assert. * This refactors common logic shared between LRANGE and the new form of LPOP/RPOP into addListRangeReply (adds test for b/w compat) * Consequently, it may have made sense to have `LRANGE l -1 -2` and `LRANGE l 9 0` be legit and return a reverse reply. Due to historical reasons that would be, however, a breaking change. * Added minimal comments to existing commands to adhere to the style, make core dev life easier and get commit karma, naturally.	2020-12-25 21:49:24 +02:00
Oran Agra	4617960863	resolve hung test.	2020-12-24 14:33:53 +02:00
xhe	ef14c18c8e	fix the test Signed-off-by: xhe <xw897002528@gmail.com>	2020-12-24 17:31:50 +08:00
xhe	60f13e7a86	try to fix the test Signed-off-by: xhe <xw897002528@gmail.com>	2020-12-24 16:50:08 +08:00
xhe	7a7c60459e	add a test Signed-off-by: xhe <xw897002528@gmail.com>	2020-12-24 15:26:24 +08:00
Yang Bodong	ee59dc1b5c	Tests: fix the problem that Darwin memory leak detection may fail (#8213 ) Apparently the "leaks" tool reports a different error string about process that's not found in each version of MacOS. This causes the test suite to fail on some OS versions, since some tests terminate the process before looking for leaks. Instead of looking at the error string, we now look at the (documented) exit code.	2020-12-23 16:28:17 +02:00
Oran Agra	411c18bbce	Remove read-only flag from non-keyspace cmds, different approach for EXEC to propagate MULTI (#8216 ) In the distant history there was only the read flag for commands, and whatever command that didn't have the read flag was a write one. Then we added the write flag, but some portions of the code still used !read Also some commands that don't work on the keyspace at all, still have the read flag. Changes in this commit: 1. remove the read-only flag from TIME, ECHO, ROLE and LASTSAVE 2. EXEC command used to decide if it should propagate a MULTI by looking at the command flags (!read & !admin). When i was about to change it to look at the write flag instead, i realized that this would cause it not to propagate a MULTI for PUBLISH, EVAL, and SCRIPT, all 3 are not marked as either a read command or a write one (as they should), but all 3 are calling forceCommandPropagation. So instead of introducing a new flag to denote a command that "writes" but not into the keyspace, and still needs propagation, i decided to rely on the forceCommandPropagation, and just fix the code to propagate MULTI when needed rather than depending on the command flags at all. The implication of my change then is that now it won't decide to propagate MULTI when it sees one of these: SELECT, PING, INFO, COMMAND, TIME and other commands which are neither read nor write. 3. Changing getNodeByQuery and clusterRedirectBlockedClientIfNeeded in cluster.c to look at !write rather than read flag. This should have no implications, since these code paths are only reachable for commands which access keys, and these are always marked as either read or write. This commit improves MULTI propagation tests, for modules and a bunch of other special cases, all of which used to pass already before that commit. the only test change that uncovered a change of behavior is the one that DELs a non-existing key, it used to propagate an empty multi-exec block, and no longer does.	2020-12-22 12:03:49 +02:00
Qu Chen	f48afb4710	Handle binary safe string for REQUIREPASS and MASTERAUTH directives (#8200 ) * Handle binary safe string for REQUIREPASS and MASTERAUTH directives.	2020-12-17 09:26:33 -08:00
Itamar Haber	9acd40d97b	GEOSEARCH: change 'FROMLOC' to 'FROMLONLAT' (#8190 ) And formats style a tiniee-winiee bit	2020-12-14 17:15:12 +02:00
Oran Agra	cfb449cc80	Sanitize dump payload: excessive free on dup zset fields (#8189 )	2020-12-14 17:10:31 +02:00
Oran Agra	7d9b09adaa	Tests: fix new defrag test to be skipped when not supported (#8185 ) Additionally the older defrag tests are using an obsolete way to check if the defragger is supported (the error no longer contains "DISABLED"). this doesn't usually make a difference since these tests are completely skipped if the allocator is not jemalloc, but that would fail if the allocator is a jemalloc that doesn't support defrag.	2020-12-14 11:13:46 +02:00
Yossi Gottlieb	86e3395c11	Several (mostly Solaris-related) cleanups (#8171 ) * Allow runtest-moduleapi use a different 'make', for systems where GNU Make is 'gmake'. * Fix issue with builds on Solaris re-building everything from scratch due to CFLAGS/LDFLAGS not stored. * Fix compile failure on Solaris due to atomicvar and a bunch of warnings. * Fix garbled log timestamps on Solaris.	2020-12-13 17:09:54 +02:00
Oran Agra	ab60dcf564	Add module event for repl-diskless-load swapdb (#8153 ) When a replica uses the diskless-load swapdb approach, it backs up the old database, then attempts to load a new one, and in case of failure, it restores the backup. this means that modules with global out of keyspace data, must have an option to subscribe to events and backup/restore/discard their global data too.	2020-12-13 14:36:06 +02:00
Yossi Gottlieb	63c1303cfb	Modules: add defrag API support. (#8149 ) Add a new set of defrag functions that take a defrag context and allow defragmenting memory blocks and RedisModuleStrings. Modules can register a defrag callback which will be invoked when the defrag process handles globals. Modules with custom data types can also register a datatype-specific defrag callback which is invoked for keys that require defragmentation. The callback and associated functions support both one-step and multi-step options, depending on the complexity of the key as exposed by the free_effort callback.	2020-12-13 09:56:01 +02:00
杨博东	4d06d99bf8	Add GEOSEARCH / GEOSEARCHSTORE commands (#8094 ) Add commands to query geospatial data with bounding box. Two new commands that replace the existing 4 GEORADIUS* commands. GEOSEARCH key [FROMMEMBER member] [FROMLOC long lat] [BYRADIUS radius unit] [BYBOX width height unit] [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] [ASC|DESC] GEOSEARCHSTORE dest_key src_key [FROMMEMBER member] [FROMLOC long lat] [BYRADIUS radius unit] [BYBOX width height unit] [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] [ASC|DESC] [STOREDIST] - Add two types of CIRCULAR_TYPE and RECTANGLE_TYPE to achieve different searches - Judge whether the point is within the rectangle, refer to: geohashGetDistanceIfInRectangle	2020-12-12 02:21:05 +02:00
Yossi Gottlieb	8c291b97b9	TLS: Add different client cert support. (#8076 ) This adds a new `tls-client-cert-file` and `tls-client-key-file` configuration directives which make it possible to use different certificates for the TLS-server and TLS-client functions of Redis. This is an optional directive. If it is not specified the `tls-cert-file` and `tls-key-file` directives are used for TLS client functions as well. Also, `utils/gen-test-certs.sh` now creates additional server-only and client-only certs and will skip intensive operations if target files already exist.	2020-12-11 18:31:40 +02:00
Yossi Gottlieb	4e064fbab4	Add module data-type support for COPY. (#8112 ) This adds a copy callback for module data types, in order to make modules compatible with the new COPY command. The callback is optional and COPY will fail for keys with data types that do not implement it.	2020-12-09 20:22:45 +02:00
Oran Agra	48efc25f74	Handle output buffer limits for Module blocked clients (#8141 ) Module blocked clients cache the response in a temporary client, the reply list in this client would be affected by the recent fix in #7202, but when the reply is later copied into the real client, it would have bypassed all the checks for output buffer limit, which would have resulted in both: responding with a partial response to the client, and also not disconnecting it at all.	2020-12-08 16:41:20 +02:00
Oran Agra	a102b21d17	Improve stability of new CSC eviction test (#8160 ) `c4fdf09c0` added a test that now fails with valgrind. It fails for two reasons: 1) the test samples the used memory and then limits the maxmemory to that value, but it turns out this is not atomic and on slow machines the background cron process that cleans out old query buffers reduces the memory so that the setting doesn't cause eviction. 2) the dbsize was tested late, after reading some invalidation messages; by that time more and more keys got evicted, partially draining the db. this is not the focus of this fix (still a known limitation)	2020-12-08 16:33:09 +02:00
Wang Yuan	1acc315cea	Minor improvements for list-2 test (#8156 ) had some unused variables. now some are used to assert that they match, others were useless.	2020-12-08 16:26:38 +02:00
Yossi Gottlieb	00db1b5579	Fix failing macOS tests due to wc differences. (#8161 )	2020-12-08 16:22:16 +02:00
Itamar Haber	37f45d9e56	Adds exclusive range query intervals to XPENDING (#8130 )	2020-12-08 11:43:00 +02:00
guybe7	6bb5503524	More efficient self-XCLAIM (#8098 ) when the same consumer re-claims an entry that it already has, there's no need to remove-and-insert if it's the same rax. we do need to update the idle time though. this commit only improves efficiency (doesn't change behavior).	2020-12-07 21:31:35 +02:00
Yossi Gottlieb	bccbc5509a	Add CLIENT INFO and CLIENT LIST [id]. (#8113 ) * Add CLIENT INFO subcommand. The output is identical to CLIENT LIST but provides a single line for the current client only. * Add CLIENT LIST ID [id...]. Co-authored-by: Itamar Haber <itamar@redislabs.com>	2020-12-07 14:24:05 +02:00
Oran Agra	7ca00d694d	Sanitize dump payload: fail RESTORE if memory allocation fails When RDB input attempts to make a huge memory allocation that fails, RESTORE should fail gracefully rather than die with panic	2020-12-06 14:54:34 +02:00
Oran Agra	3716950cfc	Sanitize dump payload: validate no duplicate records in hash/zset/intset If RESTORE passes successfully with full sanitization, we can't afford to crash later on assertion due to duplicate records in a hash when converting it from ziplist to dict. This means that when doing full sanitization, we must make sure there are no duplicate records in any of the collections.	2020-12-06 14:54:34 +02:00
Oran Agra	5b44631397	testsuite: fix fd leak, prevent port clashing when using --baseport when using --baseport to run two test suites in parallel (different folders), we need to also make sure the port used by the testsuite to communicate with its workers is unique. otherwise the attempt to find a free port connects to the other test suite and messes it. maybe one day we need to attempt to bind, instead of connect when trying to find a free port.	2020-12-06 14:54:34 +02:00

1580 Commits