mirror of
https://codeberg.org/redict/redict.git
synced 2025-01-23 00:28:26 -05:00
4e7eb16ae7
In #9408, we added some SENTINEL DEBUG to reduce default timeouts and allow tests to execute faster. The change in 05-manual.tcl may cause a race that SENTINEL FAILOVER response with a NOGOODSLAVE: ``` Manual failover works: FAILED: Expected NOGOODSLAVE No suitable replica to promote eq "OK" (context: type eval line 6 cmd {assert {$reply eq "OK"}} proc ::test) (Jumping to next unit after error) FAILED: caught an error in the test assertion:Expected NOGOODSLAVE No suitable replica to promote eq "OK" (context: type eval line 6 cmd {assert {$reply eq "OK"}} proc ::test) ``` The reason is that the info-period value was reduced in #9408 (the default value is 10000), and then manual failover was performed immediately, but the INFO may not exchanged between the sentinel and replicas, causing the sentinel to skip all the replicas in sentinelSelectSlave (Because replica's info_refresh is not updated, see the code snippet below), then return a NOGOODSLAVE, break the test. Code snippet from sentinelSelectSlave: ``` while((de = dictNext(di)) != NULL) { sentinelRedisInstance *slave = dictGetVal(de); mstime_t info_validity_time; if (master->flags & SRI_S_DOWN) info_validity_time = sentinel_ping_period*5; else info_validity_time = sentinel_info_period*3; if (mstime() - slave->info_refresh > info_validity_time) continue; } ``` By adding a wait_for_condition, we have the opportunity to let sentinel update the info_period of the replicas. |
||
---|---|---|
.. | ||
tests | ||
tmp | ||
run.tcl |