redict/tests/integration
Oran Agra 573246f73c
if diskless repl child is killed, make sure to reap the pid (#7742)
Starting with Redis 6.0 and the changes we made to the diskless master to
make it suitable for TLS, I made the master avoid reaping (wait3) the pid
of the child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic when the
fork child is killed (e.g. by the kernel OOM killer). In that case
there's a chance that the read event on the rdb pipe is currently
disabled, since we're waiting for a replica to become writable again.
In that scenario the master would never realize that the child exited,
and the replica would remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. This means that when the child exits,
for whatever reason, it is safe to reap it.
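
The "safe to exit" pipe idea can be sketched in isolation as below. This is a
minimal illustration, not the code from this commit: the names exit_pipe and
exit_pipe_demo are hypothetical, and it assumes the parent signals the child
simply by closing its write end so that the child's read() returns EOF.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo, not Redis code. Returns 0 if the child stayed alive
 * until the parent released it and then exited cleanly, -1 otherwise. */
int exit_pipe_demo(void) {
    int exit_pipe[2]; /* [0] = read end (child), [1] = write end (parent) */
    if (pipe(exit_pipe) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(exit_pipe[0]);
        close(exit_pipe[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop the inherited write end, otherwise read() below
         * could never see EOF. */
        close(exit_pipe[1]);
        /* ... a real child would produce its payload here ... */
        char c;
        ssize_t n;
        /* Hold the exit until the parent closes its write end. */
        do {
            n = read(exit_pipe[0], &c, 1);
        } while (n == -1 && errno == EINTR);
        _exit(n == 0 ? 0 : 1);
    }

    /* Parent: only the write end is needed. */
    close(exit_pipe[0]);

    /* ... a real parent would still be serving data to replicas here.
     * The child is blocked in read(), so it must not have exited yet. */
    int status = 0;
    if (waitpid(pid, &status, WNOHANG) != 0) {
        close(exit_pipe[1]);
        return -1;
    }

    /* All consumers are done: tell the child it is safe to exit. */
    close(exit_pipe[1]);

    /* From here on, reaping cannot race with unfinished work. */
    if (waitpid(pid, &status, 0) != pid) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because the child cannot exit normally before the parent releases it, any exit
the parent observes is either a release it requested or an abnormal
termination, and in both cases the pid can be reaped right away.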

Besides that, I'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (5a47794).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has a chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
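
The general pattern of checking for a finished child from an event handler,
instead of waiting for a periodic timer, can be sketched with a non-blocking
waitpid(). This is only an assumption-laden illustration: try_reap_child and
reap_on_event_demo are hypothetical names, and the polling loop merely stands
in for "an event such as an ACK arrives".

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Non-blocking check: 1 = child reaped (status filled in),
 * 0 = child still running, -1 = error. */
int try_reap_child(pid_t pid, int *status) {
    pid_t r = waitpid(pid, status, WNOHANG);
    if (r == pid) return 1;
    return r == 0 ? 0 : -1;
}

/* Hypothetical driver. Returns the child's exit code, or -1 on failure. */
int reap_on_event_demo(void) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) _exit(7); /* child exits immediately with a known code */

    int status = 0, r;
    struct timespec pause = {0, 1000000}; /* 1 ms */
    /* Each iteration stands in for one incoming event that triggers
     * the check, rather than a fixed-rate cron tick. */
    while ((r = try_reap_child(pid, &status)) == 0)
        nanosleep(&pause, NULL);

    if (r == -1 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Calling the check from the event path means the exit is noticed as soon as the
next event is handled, not at the next timer interval.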
2020-09-06 16:43:57 +03:00
..
aof-race.tcl TLS: Connections refactoring and TLS support. 2019-10-07 21:06:13 +03:00
aof.tcl test infra - wait_done_loading 2020-09-06 09:59:19 +03:00
block-repl.tcl TLS: Connections refactoring and TLS support. 2019-10-07 21:06:13 +03:00
convert-zipmap-hash-on-load.tcl convert-zipmap-hash-on-load false positive fixed. 2012-03-25 11:02:16 +02:00
logging.tcl Added regression test for issue #2371. 2015-02-10 14:40:27 +01:00
psync2-pingoff.tcl fix pingoff test race 2020-05-31 15:51:52 +03:00
psync2-reg.tcl fix loading race in psync2 tests 2020-04-28 09:18:01 +03:00
psync2.tcl tests/valgrind: don't use debug restart (#7404) 2020-07-10 08:26:52 +03:00
rdb.tcl test infra - reduce disk space usage 2020-09-06 09:59:19 +03:00
redis-cli.tcl Tests: fix redis-cli with remote hosts. (#7693) 2020-08-23 10:17:43 +03:00
replication-2.tcl Slave removal: remove slave from integration tests descriptions. 2018-09-11 15:32:28 +02:00
replication-3.tcl add daily github actions with libc malloc and valgrind 2020-05-04 09:52:20 +03:00
replication-4.tcl diskless replication on slave side (don't store rdb to file), plus some other related fixes 2019-07-08 15:37:48 +03:00
replication-psync.tcl diskless replication on slave side (don't store rdb to file), plus some other related fixes 2019-07-08 15:37:48 +03:00
replication.tcl if diskless repl child is killed, make sure to reap the pid (#7742) 2020-09-06 16:43:57 +03:00