redict/tests/unit
Viktor Söderqvist 45a155bd0f
Wait for replicas when shutting down (#9872)
To avoid data loss, this commit adds a grace period for lagging replicas to
catch up the replication offset.

Done:

* Wait for replicas when shutdown is triggered by SIGTERM and SIGINT.

* Wait for replicas when shutdown is triggered by the SHUTDOWN command. A new
  blocked client type BLOCKED_SHUTDOWN is introduced, allowing multiple clients
  to call SHUTDOWN in parallel.
  Note that they don't expect a response unless an error happens and shutdown is aborted.

* Log warning for each replica lagging behind when finishing shutdown.

* CLIENT_PAUSE_WRITE while waiting for replicas.

* Configurable grace period 'shutdown-timeout' in seconds (default 10).

* New flags for the SHUTDOWN command:

    - NOW disables the grace period for lagging replicas.

    - FORCE ignores errors writing the RDB or AOF files which would normally
      prevent a shutdown.

    - ABORT cancels ongoing shutdown. Can't be combined with other flags.

* New field in the output of the INFO command: 'shutdown_in_milliseconds'. The
  value is the remaining maximum time to wait for lagging replicas before
  finishing the shutdown. This field is present in the Server section **only**
  during shutdown.

Not directly related:

* When shutting down, if there is an AOF saving child, it is killed **even** if AOF
  is disabled. This can happen if BGREWRITEAOF is used when AOF is off.

* Client pause now has end time and type (WRITE or ALL) per purpose. The
  different pause purposes are *CLIENT PAUSE command*, *failover* and
  *shutdown*. If clients are unpaused for one purpose, it doesn't affect client
  pause for other purposes. For example, the CLIENT UNPAUSE command doesn't
  affect client pause initiated by the failover or shutdown procedures. A completed
  failover or a failed shutdown doesn't unpause clients paused by the CLIENT
  PAUSE command.

Notes:

* DEBUG RESTART doesn't wait for replicas.

* We already have a warning logged when a replica disconnects. This means that
  if any replica connection is lost during the shutdown, it is either logged as
  disconnected or as lagging at the time of exit.

Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-02 09:50:15 +02:00
..
moduleapi Remove incomplete fix of a broader problem (#10013) 2021-12-28 10:19:58 +02:00
type Tests: don't rely on the response of MEMORY USAGE when mem_allocator is not jemalloc (#10010) 2021-12-27 21:37:21 +02:00
acl.tcl Treat subcommands as commands (#9504) 2021-10-20 11:52:57 +03:00
aofrw.tcl Fix failing test due to recent change in transaction propagation (#10006) 2021-12-27 15:18:17 +02:00
auth.tcl Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675) (#9588) 2021-10-04 12:10:31 +03:00
bitfield.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
bitops.tcl Change lzf to handle values larger than UINT32_MAX (#9776) 2021-11-16 13:12:25 +02:00
client-eviction.tcl Client eviction ci issues (#9549) 2021-09-26 17:45:02 +03:00
cluster.tcl Add FUNCTION DUMP and RESTORE. (#9938) 2021-12-26 09:03:37 +02:00
dump.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
expire.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
functions.tcl Add FUNCTION DUMP and RESTORE. (#9938) 2021-12-26 09:03:37 +02:00
geo.tcl GEO* STORE with empty src key delete the dest key and return 0, not empty array (#9271) 2021-08-01 19:32:24 +03:00
hyperloglog.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
info.tcl QUIT is a command, HOST: and POST are not (#9798) 2021-11-23 10:38:25 +02:00
introspection-2.tcl Fix COMMAND GETKEYS on LCS (#9852) 2021-11-28 09:02:38 +02:00
introspection.tcl Allow most CONFIG SET during loading, block some commands in async-loading (#9878) 2021-12-22 14:11:16 +02:00
keyspace.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
latency-monitor.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
lazyfree.tcl attempt to fix tracking test issue with external tests due to lazy free (#9722) 2021-11-02 16:42:53 +02:00
limits.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
maxmemory.tcl Sort out mess around propagation and MULTI/EXEC (#9890) 2021-12-23 00:03:48 +02:00
memefficiency.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
multi.tcl Fixed typo in test tag (for needs:debug) (#10021) 2021-12-28 16:23:02 +02:00
networking.tcl Protected configs and sensitive commands (#9920) 2021-12-19 10:46:16 +02:00
obuf-limits.tcl Better error handling for updateClientOutputBufferLimit. (#9308) 2021-08-29 15:03:05 +03:00
oom-score-adj.tcl Don't write oom score adj to proc unless we're managing it. (#9904) 2021-12-07 16:05:51 +02:00
other.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
pause.tcl Shorten timeouts of CLIENT PAUSE to avoid hanging when tests fail. (#9975) 2021-12-22 12:06:29 +02:00
pendingquerybuf.tcl Introduce memory management on cluster link buffers (#9774) 2021-12-16 21:56:59 -08:00
printver.tcl Print version info before running the test 2011-05-20 11:44:54 +02:00
protocol.tcl Tests: add a few missing needs:debug tags. (#9806) 2021-11-18 23:01:56 +02:00
pubsub.tcl Connection leak in external tests. (#9777) 2021-11-15 11:07:43 +02:00
querybuf.tcl Ignore resize threshold on idle qbuf resizing (#9322) 2021-08-06 20:50:34 +03:00
quit.tcl Add tests for OK on QUIT 2010-10-15 12:54:53 +02:00
scan.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
scripting.tcl Fixed typo in test tag (for needs:debug) (#10021) 2021-12-28 16:23:02 +02:00
shutdown.tcl Wait for replicas when shutting down (#9872) 2022-01-02 09:50:15 +02:00
slowlog.tcl Redact ACL SETUSER arguments if the user has spaces (#9935) 2021-12-13 08:39:04 -08:00
sort.tcl Add SORT_RO command (#9299) 2021-08-09 09:40:29 +03:00
tls.tcl Add support for reading encrypted keyfiles. (#8644) 2021-03-22 13:27:46 +02:00
tracking.tcl Solve issues with tracking test in external mode (#9726) 2021-11-02 16:07:51 -07:00
violations.tcl Fix possible int overflow when hashing an sds. (#9916) 2021-12-13 21:16:25 +02:00
wait.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00