Test failed on freebsd:
```
*** [err]: Make the old master a replica of the new one and check conditions in tests/integration/psync2-pingoff.tcl
Expected '162' to be equal to '176' (context: type eval line 18 cmd {assert_equal [status $R(0) master_repl_offset] [status $R(1) master_repl_offset]} proc ::test)
```
There are two possible race conditions in the test.
1. The code waits for sync_full to increment, and assumes that means the
master did the fork. But in fact there are cases the master will increment
that sync_full counter (after replica asks for sync), but will see that
there's already a fork running and will delay the fork creation.
In this case the INCR will be executed before the fork happens, so it'll
not be in the command stream. Solve that by waiting for `master_link_status: up`
on the replica before the INCR.
2. The repl-ping-replica-period is still high (1 second), so there's a chance the
master will send an additional PING between the two calls to INFO (the line that
fails is the one that samples INFO from both servers). So there's a chance one of
them will have an incremented offset due to PING and the other won't have it yet.
In theory we can wait for the repl_offset to match, but then we risk facing a
situation where that race will hide an offset mis-match. so instead, i think we
should just change repl-ping-replica-period to prevent further pings from being pushed.
Co-authored-by: Oran Agra <oran@redislabs.com>
This commit revives the improves the ability to run the test suite against
external servers, instead of launching and managing `redis-server` processes as
part of the test fixture.
This capability existed in the past, using the `--host` and `--port` options.
However, it was quite limited and mostly useful when running a specific tests.
Attempting to run larger chunks of the test suite experienced many issues:
* Many tests depend on being able to start and control `redis-server` themselves,
and there's no clear distinction between external server compatible and other
tests.
* Cluster mode is not supported (resulting with `CROSSSLOT` errors).
This PR cleans up many things and makes it possible to run the entire test suite
against an external server. It also provides more fine grained controls to
handle cases where the external server supports a subset of the Redis commands,
limited number of databases, cluster mode, etc.
The tests directory now contains a `README.md` file that describes how this
works.
This commit also includes additional cleanups and fixes:
* Tests can now be tagged.
* Tag-based selection is now unified across `start_server`, `tags` and `test`.
* More information is provided about skipped or ignored tests.
* Repeated patterns in tests have been extracted to common procedures, both at a
global level and on a per-test file basis.
* Cleaned up some cases where test setup was based on a previous test executing
(a major anti-pattern that repeats itself in many places).
* Cleaned up some cases where test teardown was not part of a test (in the
future we should have dedicated teardown code that executes even when tests
fail).
* Fixed some tests that were flaky running on external servers.
Another test race condition in the macos tests.
the test was waiting for PINGs to be generated and put on the replication stream,
but waiting for 1 or 2 seconds doesn't really guarantee that.
then the test that expected 6 full syncs, found only 4
these tests create several edge cases that are otherwise uncovered (at
least not consistently) by the test suite, so although they're no longer
testing what they were meant to test, it's still a good idea to keep
them in hope that they'll expose some issue in the future.