redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 08:08:53 -05:00

Author	SHA1	Message	Date
meiravgri	2e854bccc6	Fix async safety in signal handlers (#12658 ) see discussion from after https://github.com/redis/redis/pull/12453 was merged ---- This PR replaces signals that are not considered async-signal-safe (AS-safe) with safe calls. #### 1. serverLog() and serverLogFromHandler() `serverLog` uses unsafe calls. It was decided that we will avoid `serverLog` calls by the signal handlers when: * The signal is not fatal, such as SIGALRM. In these cases, we prefer using `serverLogFromHandler` which is the safe version of `serverLog`. Note they have different prompts: `serverLog`: `62220:M 26 Oct 2023 14:39:04.526 # <msg>` `serverLogFromHandler`: `62220:signal-handler (1698331136) <msg>` * The code was added recently. Calls to `serverLog` by the signal handler have been there ever since Redis exists and it hasn't caused problems so far. To avoid regression, from now we should use `serverLogFromHandler` #### 2. `snprintf` `fgets` and `strtoul`(base = 16) --------> `_safe_snprintf`, `fgets_async_signal_safe`, `string_to_hex` The safe version of `snprintf` was taken from [here](`8cfc4ca5e7/src/mc_util.c (L754)`) #### 3. fopen(), fgets(), fclose() --------> open(), read(), close() #### 4. opendir(), readdir(), closedir() --------> open(), syscall(SYS_getdents64), close() #### 5. Threads_mngr sync mechanisms * waiting for the thread to generate stack trace: semaphore --------> busy-wait * `globals_rw_lock` was removed: as we are not using malloc and the semaphore anymore we don't need to protect `ThreadsManager_cleanups`. #### 6. Stacktraces buffer The initial problem was that we were not able to safely call malloc within the signal handler. To solve that we created a buffer on the stack of `writeStacktraces` and saved it in a global pointer, assuming that under normal circumstances, the function `writeStacktraces` would complete before any thread attempted to write to it. However, if threads lag behind, they might access this global pointer after it no longer belongs to the `writeStacktraces` stack, potentially corrupting memory. To address this, various solutions were discussed [here](https://github.com/redis/redis/pull/12658#discussion_r1390442896) Eventually, we decided to create a pipe at server startup that will remain valid as long as the process is alive. We chose this solution due to its minimal memory usage, and since `write()` and `read()` are atomic operations. It ensures that stack traces from different threads won't mix. The stacktraces collection process is now as follows: * Cleaning the pipe to eliminate writes of late threads from previous runs. * Each thread writes to the pipe its stacktrace * Waiting for all the threads to mark completion or until a timeout (2 sec) is reached * Reading from the pipe to print the stacktraces. #### 7. Changes that were considered and eventually were dropped * replace watchdog timer with a POSIX timer: according to [settimer man](https://linux.die.net/man/2/setitimer) > POSIX.1-2008 marks getitimer() and setitimer() obsolete, recommending the use of the POSIX timers API ([timer_gettime](https://linux.die.net/man/2/timer_gettime)(2), [timer_settime](https://linux.die.net/man/2/timer_settime)(2), etc.) instead. However, although it is supposed to conform to POSIX std, POSIX timers API is not supported on Mac. You can take a look here at the Linux implementation: [here](`c7562ee135`) To avoid messing up the code, and uncertainty regarding compatibility, it was decided to drop it for now. * avoid using sds (uses malloc) in logConfigDebugInfo It was considered to print config info instead of using sds, however apparently, `logConfigDebugInfo` does more than just print the sds, so it was decided this fix is out of this issue scope. #### 8. fix Signal mask check The check `signum & sig_mask` intended to indicate whether the signal is blocked by the thread was incorrect. Actually, the bit position in the signal mask corresponds to the signal number. We fixed this by changing the condition to: `sig_mask & (1L << (sig_num - 1))` #### 9. Unrelated changes both `fork.tcl `and `util.tcl` implemented a function called `count_log_message` expecting different parameters. This caused confusion when trying to run daily tests with additional test parameters to run a specific test. The `count_log_message` in `fork.tcl` was removed and the calls were replaced with calls to `count_log_message` located in `util.tcl` --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: Oran Agra <oran@redislabs.com>	2023-11-23 13:22:20 +02:00
Binbin	5e403099bd	Fix misleading error message in redis.log when loglevel is invalid (#12636 ) We don't have any debug level, change it to log level.	2023-11-23 10:23:30 +02:00
Moshe Kaplan	c9aa586b6b	rdb.c: Avoid potential file handle leak (Coverity 404720) (#12795 ) `open()` can return any non-negative integer on success, including zero. This change modifies the check from open()'s return value to also include checking for a return value of zero (e.g., if stdin were closed and then `open()` was called). Fixes Coverity 404720 Can't happen in Redis. just a cleanup.	2023-11-23 10:14:17 +02:00
Moshe Kaplan	ae09d4d3ef	redis-check-aof.c: Avoid leaking file handle if file is zero bytes (#12797 ) If fopen() is successful, but redis_fstat() determines the file is zero bytes, the file handle stored in fp will leak. This change closes the filehandle stored in fp if the file is zero bytes. An FD leak on a tool like redis-check-aof isn't an issue (it'll exit soon anyway). This is just a cleanup.	2023-11-23 10:08:49 +02:00
Moshe Kaplan	1c48d3dab2	config.c: Avoid leaking file handle if redis_fstat() fails (#12796 ) If fopen() is successful, but redis_fstat() fails, the file handle stored in fp will leak. This change closes the filehandle stored in fp if redis_fstat() fails. Fixes Coverity 390029	2023-11-23 10:06:13 +02:00
Moshe Kaplan	157e5d47b5	util.c: Don't leak directory handle if recursive call to dirRemove fails (#12800 ) If a recursive call to dirRemove() returns -1, dirRemove() the directory handle stored in dir will leak. This change closes the directory handle stored in dir even if the recursive call to dirRemove() returns -1. Fixed Coverity 371073	2023-11-23 10:04:02 +02:00
Binbin	463476933c	Optimize dict expand check, move allow check after the thresholds check (#12789 ) dictExpandAllowed (for the main db dict and the expire dict) seems to involve a few function calls and memory accesses, and we can do it only after the thresholds checks and can get some performance improvements. A simple benchmark test: there are 11032768 fixed keys in the database, start a redis-server with `--maxmemory big_number --save ""`, start a redis-benchmark with `-P 100 -r 1000000000 -t set -n 5000000`, collect `throughput summary: n requests per second` result. After five rounds of testing: ``` unstable this PR 848032.56 897988.56 854408.69 913408.88 858663.94 914076.81 871839.56 916758.31 882612.56 920640.75 ``` We can see a 5% performance improvement in general condition. But note we'll still hit the degradation when we over the thresholds.	2023-11-22 11:33:26 +02:00
Yehoshua (Josh) Hershberg	ef1ca6c882	add some file level comments and copyright (#12793 ) A followup PR for #12742 Add some brief comments explaining the purpose of the file to the head of cluster_legacy.c and cluster.c. Add copyright notice to cluster.c Signed-off-by: Josh Hershberg <yehoshua@redis.com> Co-authored-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 11:32:23 +02:00
Binbin	dedbf99a80	Fix dbExpand not dividing by slots, resulting in consuming slots times the dictExpand (#12773 ) We meant to divide it by the number of slots, otherwise it will do slots times dictExpand, bug was introduced in #11695.	2023-11-22 11:16:06 +02:00
Oran Agra	58cb302526	Cluster refactor, common API for different implementations (#12742 ) This PR reworks the clustering code to support multiple clustering implementations, specifically, the current "legacy" clustering implementation or, although not part of this PR, flotilla (see #10875). Which implementation is used could be a compile-time flag (will be added later). Legacy clustering functionality remains unchanged. The basic idea is as follows. The header cluster.h now contains function declarations that define the "Cluster API." These are the contract and interface between any clustering implementation and the rest of the Redis source code. Some of the function definitions are shared between all clustering implementations. These functions are in cluster.c. The functions and data structures specific to legacy clustering are in cluster-legacy.c/h. One consequence of this is that the structs clusterNode and clusterState which were previously "public" to the rest of Redis are now hidden behind the Cluster API. The PR is divided up into commits, each with a commit message explaining the changes. some are just mass rename or moving code between files (may not require close inspection / review), others are manual changes. One other, related change is: - The "failover" command is now plugged into the Cluster API so that the clustering implementation can (a) enable/disable the command to begin with and if enabled (b) perform the actual failover. The "failover" command remains disabled for legacy clustering.	2023-11-22 09:31:19 +02:00
Josh Hershberg	eebb025826	Cluster refactor: Some code convention fixes Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:48 +02:00
Josh Hershberg	290f376429	Cluster refactor: fn renames + small compilation issue on ubuntu Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	13b754853c	Cluster refactor: cluster.h - reorder functions into logical groups Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	2e5181ef28	Cluster refactor: Add failover cmd support to cluster api The failover command is up until now not supported in cluster mode. This commit allows a cluster implementation to support the command. The legacy clustering implementation still does not support this command. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	c6157b3510	Cluster refactor: Make clustering functions common Move primary functions used to implement datapath clustering into cluster.c, making them shared. This required adding "accessor" and other functions to abstract access to node details and cluster state. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	4afc54ad9b	Cluster refactor: break up clusterCommand Divide up clusterCommand into clusterCommand for shared sub-commands and clusterCommandSpecial for implementation specific sub-commands. So to, the cluster command help sub-command has been divided into two implementations, clusterCommandHelp and clusterCommandHelpSpecial. Some common sub-subcommand implementations have been extracted and their implemenations either made shared or else implementation specific. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	33ef6a3003	Cluster refactor: s/clusterNodeGetSlotBit/clusterNodeCoversSlot/ Simple rename, "GetSlotBit" is implementation specific Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	ac1513221b	Cluster refactor: Move items from cluster_legacy.c to cluster.c Move (but do not change) some items from cluster_legacy.c back info cluster.c. These items are shared code that all clustering implementations will use. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	040cb6a4aa	Cluster refactor: verifyClusterNodeId need not be 'public' Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:06 +02:00
Josh Hershberg	4944eda696	Cluster refactor: Move more stuff from cluster.h to cluster_legacy.h More declerations can be moved into cluster_legacy.h as they are not requied for the cluster api. The code was simply moved, not changed in any way. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:54:03 +02:00
Josh Hershberg	d9a0478599	Cluster refactor: Make clusterNode private Move clusterNode into cluster_legacy.h. In order to achieve this some accessor methods were added and also a refactor of how debugCommand handles cluster related subcommands. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:50:46 +02:00
Josh Hershberg	98a6c44b75	Cluster refactor: Make clusterState private Move clusterState into cluster_legacy.h. In order to achieve this some "accessor" methods needed to be added to the cluster API and some other minor refactors. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-22 05:44:10 +02:00
Binbin	2c41b13505	Block DEBUG POPULATE in loading and async-loading (#12790 ) When we are loading data, it is not safe to generate data through DEBUG POPULATE. POPULATE may generate keys, causing panic when loading data with duplicate keys.	2023-11-21 17:00:27 +02:00
Josh Hershberg	5292adb985	Cluster refactor: Move trivial stuff into cluster_legacy.h Move some declerations from cluster.h to cluster_legacy.h. The items moved are specific to the legacy clustering implementation and DO NOT require any other refactoring other than moving them from one file to another. Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-21 12:49:14 +02:00
Josh Hershberg	6a6ae6ffe8	Cluster refactor: Create new cluster.c and include of cluster_legacy.h create new cluster.c Signed-off-by: Josh Hershberg <yehoshua@redis.com> forgot to #include cluster_legacy.h Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-21 12:49:14 +02:00
Josh Hershberg	86915775f1	Cluster refactor: rename cluster.c -> cluster_legacy.c Signed-off-by: Josh Hershberg <yehoshua@redis.com>	2023-11-21 12:49:14 +02:00
iKun	4278ed8de5	redis-cli --bigkeys ,--hotkeys and --memkeys to replica in cluster mode (#12735 ) Make redis-cli --bigkeys and --memkeys usable on a replicas in cluster mode, by sending the READONLY command. This is only if -c is also given. We don't detect if a node is a master or a replica so we send READONLY in both cases. The READONLY has no effect on masters. Release notes: Make redis-cli --bigkeys and --memkeys usable on cluster replicas --------- Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Oran Agra <oran@redislabs.com>	2023-11-20 09:33:28 +02:00
Hwang Si Yeon	a1f91ffa18	Add an explanation for URI with -u in redis-cli --help (#12751 ) Add documentation of the URI format in the `--help` output of `redis-cli` and `redis-benchmark`. In particular, it's good for users to know that they need to specify "default" as the username when authenticating without a username. Other details of the URI format are described too, like scheme and dbnum. It used to be possible to connect to Redis using an URL with an empty username, like `redis-cli -u redis://:PASSWORD@localhost:6379/0`. This was broken in 6.2 (#8048), and there was a discussion about it #9186. Now, users need to specify "default" as the username and it's better to document it. Refer to #12746 for more details.	2023-11-19 15:09:14 +02:00
Wen Hui	5a1f4b9aec	Adding missing SWAPDB related test cases. (#12769 ) We have some test cases of swapdb with watchkey but missing seperate basic swapdb test cases, unhappy path and flushdb after swapdb. So added the test cases in keyspace.tcl.	2023-11-19 12:44:48 +02:00
Binbin	3d9c427f8c	Fix timing issue in CLUSTER SLAVE / REPLICAS consistent test (#12774 ) CI reports that this test failed, the reason is because during the command processing, the node processed PING/PONG, resulting in ping_sent or pong_received mismatch. Change to use MULTI to avoid timing issue. The test was introduced in #12224.	2023-11-19 11:09:33 +02:00
zhaozhao.zz	eb392c0a6f	replace calculateKeySlot with known slot in evictionPoolPopulate (#12777 )	2023-11-17 19:01:06 +08:00
zhaozhao.zz	6013122c8e	check dbSize when rewrite AOF, in case unnecessary SELECT (#12775 ) Introduced by #11695, should skip empty db in case unnecessary SELECT.	2023-11-17 19:00:26 +08:00
Binbin	9b6dded421	Fix empty rehashing list in swapdb mode (#12770 ) In swapdb mode, the temp db does not init the rehashing list, the change added in #12764 caused cluster ci to fail.	2023-11-16 11:18:25 +02:00
Binbin	4366bbaa61	Empty rehashing list in emptyDbStructure (#12764 ) This is currently harmless, since we have already cleared the dict before, it will reset the rehashidx to -1, and in incrementallyRehash we will call dictIsRehashing to check. It would be nice to empty the list to avoid meaningless attempts, and the code is also unified to reduce misunderstandings.	2023-11-15 07:55:34 +02:00
Binbin	fe36306340	Fix DB iterator not resetting pauserehash causing dict being unable to rehash (#12757 ) When using DB iterator, it will use dictInitSafeIterator to init a old safe dict iterator. When dbIteratorNext is used, it will jump to the next slot db dict when we are done a dict. During this process, we do not have any calls to dictResumeRehashing, which causes the dict's pauserehash to always be > 0. And at last, it will be returned directly in dictRehashMilliseconds, which causes us to have slot dict in a state where rehash cannot be completed. In the "expire scan should skip dictionaries with lot's of empty buckets" test, adding a `keys ` can reproduce the problem stably. `keys ` will call dbIteratorNext to trigger a traversal of all slot dicts. Added dbReleaseIterator and dbIteratorInitNextSafeIterator methods to call dictResetIterator. Issue was introduced in #11695.	2023-11-14 14:28:46 +02:00
Yossi Gottlieb	a9e73c00bc	Reduce FreeBSD daily scope. (#12758 ) The full test is very flaky running on a VM inside GitHub worker, so we have to settle for only building and running a small smoke test.	2023-11-13 17:22:09 +02:00
Roshan Khatri	88e83e517b	Add DEBUG_ASSERTIONS option to custom assert (#12667 ) This PR introduces a new macro, serverAssertWithInfoDebug, to do complex assertions only for debugging. The main intention is to allow running complex operations during tests without impacting runtime performance. This assertion is enabled when setting DEBUG_ASSERTIONS. The DEBUG_ASSERTIONS flag is set for the daily and CI variants of `test-sanitizer-address`.	2023-11-11 20:31:34 -08:00
Harkrishn Patro	9ca8490315	Increase timeout for expiry cluster tests (#12752 ) Test recently added fails on timeout in valgrind in GH actions. Locally with valgrind the test finishes within 1.5 sec(s). Couldn't find any issue due to lack of reproducibility. Increasing the timeout and adding an additional log to the test to understand how many keys were left at the end.	2023-11-11 12:01:04 +02:00
zhaozhao.zz	6258edebf0	reset bucket_count when empty db (#12750 ) Introduced in #12697 , should reset bucket_count when empty db, or the overhead memory usage of db can be miscalculated.	2023-11-10 15:52:57 +02:00
zhaozhao.zz	cf6ed3feeb	fix the wrong judgement for activerehashing in standalone mode (#12741 ) Introduced by #11695, the judgement should be dictIsRehashing.	2023-11-09 11:30:50 +02:00
Binbin	53294e537c	Fix genClusterDebugString minor sds leaks (#12739 ) This function now will only be called in printCrashReport, so this is just a cleanup.	2023-11-08 19:14:36 +02:00
Meir Shpilraien (Spielrein)	0ffb9d2ea9	Before evicted and before expired server events are not executed inside an execution unit. (#12733 ) Redis 7.2 (#9406) introduced a new modules event, `RedisModuleEvent_Key`. This new event allows the module to read the key data just before it is removed from the database (either deleted, expired, evicted, or overwritten). When the key is removed from the database, either by active expire or eviction. The new event was not called as part of an execution unit. This can cause an issue if the module registers a post notification job inside the event. This job will not be executed atomically with the expiration/eviction operation and will not replicated inside a Multi/Exec. Moreover, the post notification job will be executed right after the event where it is still not safe to perform any write operation, this will violate the promise that post notification job will be called atomically with the operation that triggered it and only when it is safe to write. This PR fixes the issue by wrapping each expiration/eviction of a key with an execution unit. This makes sure the entire operation will run atomically and all the post notification jobs will be executed at the end where it is safe to write. Tests were modified to verify the fix.	2023-11-08 09:28:22 +02:00
Yossi Gottlieb	6223355cf3	Use cross-platform-actions for FreeBSD support. (#12732 ) This change overcomes many stability issues experienced with the vmactions action. We need to limit VMs to 8GB for better stability, as the 13GB default seems to hang them occasionally. Shell code has been simplified since this action seem to use `bash -e` which will abort on non-zero exit codes anyway.	2023-11-06 18:07:14 +02:00
dingrui	a888503b4f	Remove unnecessary argument(tp) in gettimeofday() call for retrieving timezone (#12722 ) changes the `gettimeofday` caller, by removing an unused optional output argument. It would take 2 benefits: - simplify code, discard unnecessary arg. - possibly faster due to the implementation in kernel.	2023-11-06 15:10:09 +02:00
Chen Tianjie	282b82e9d2	Handle all CLUSTER_REDIR_ error code when verifying script. (#12707 ) Clarify the errors related to the cluster mode in the script, return the command that encountered an execution error along with the specific error message. --------- Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2023-11-06 17:48:58 +08:00
Wen Hui	28b6155ba5	Fix the bug that write redis sensitive command information to redis_cli historyfile (#11489 ) Currently, we do not write the following sensitive commands into the ~/.rediscli_history file: ACL SETUSER username [rule [rule ...]] AUTH [username] password HELLO [AUTH username password] MIGRATE host port <key \| ""> destination-db timeout [[AUTH password \| AUTH2 username password]] CONFIG SET masterauth master-password CONFIG SET masteruser username CONFIG SET requirepass foobared However, we still write the following sensitive commands into the ~/.rediscli_history file: ACL GETUSER username Sentinel CONFIG set sentinel-pass password Sentinel CONFIG set sentinel-user username Sentinel set mastername auth-pass password Sentinel set mastername auth-user username This change adds the commands of the second list to be skipped from being written to the history file.	2023-11-05 14:20:15 +02:00
Roshan Khatri	15a048d4f0	re-enable defrag tests in cluster mode. (#12710 ) Reverts the skipping defrag tests in cluster mode (done in #12672. instead it skips only some defrag tests that are relevant for cluster modes. The test now run well after investigating and making the changes in #12674 and #12694. Co-authored-by: Oran Agra <oran@redislabs.com>	2023-11-02 13:55:48 +02:00
dependabot[bot]	0ce74872c4	Bump actions/setup-node from 3 to 4 (#12708 ) Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3 to 4. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](https://github.com/actions/setup-node/compare/v3...v4) --- updated-dependencies: - dependency-name: actions/setup-node dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-11-02 12:22:50 +02:00
Viktor Söderqvist	8878817d89	Optimize SCAN with MATCH when pattern implies cluster slot (#12536 ) Optimize the performance of SCAN commands when a match pattern can only contain keys from a single slot in cluster mode. This can happen when the pattern contains a hash tag before any wildcard matchers or when the key contains no matchers.	2023-11-01 00:06:49 -07:00
Chen Tianjie	e9f312e087	Change stat_client_qbuf_limit_disconnections to atomic. (#12711 ) In #12476 server.stat_client_qbuf_limit_disconnections was added. It is written in readQueryFromClient, which may be called by multiple threads when io-threads and io-threads-do-reads are turned on. Somehow we missed to make it an atomic variable.	2023-11-01 10:57:24 +08:00

... 3 4 5 6 7 ...

12135 Commits