redict/tests/integration/replication-2.tcl

start_server {tags {"repl external:skip"}} {
    start_server {} {
        test {First server should have role slave after SLAVEOF} {
            r -1 slaveof [srv 0 host] [srv 0 port]
            wait_replica_online r
            wait_for_condition 50 100 {
                [s -1 master_link_status] eq {up}
            } else {
                fail "Replication not started."
            }
        }

        test {If min-slaves-to-write is honored, write is accepted} {
            r config set min-slaves-to-write 1
            r config set min-slaves-max-lag 10
            r set foo 12345
            wait_for_condition 50 100 {
                [r -1 get foo] eq {12345}
            } else {
                fail "Write did not reach replica"
            }
        }

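        # Only one replica is attached, so requiring two makes the master
        # refuse writes with a NOREPLICAS error.
        test {No write if min-slaves-to-write is < attached slaves} {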
            r config set min-slaves-to-write 2
            r config set min-slaves-max-lag 10
            catch {r set foo 12345} err
            set err
        } {NOREPLICAS*}

        test {If min-slaves-to-write is honored, write is accepted (again)} {
            r config set min-slaves-to-write 1
            r config set min-slaves-max-lag 10
            r set foo 12345
            wait_for_condition 50 100 {
                [r -1 get foo] eq {12345}
            } else {
                fail "Write did not reach replica"
            }
        }

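        # Pausing the replica stops its acknowledgements, so its lag grows past
        # min-slaves-max-lag (2 seconds) and the master starts refusing writes.
        test {No write if min-slaves-max-lag is > of the slave lag} {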
            r config set min-slaves-to-write 1
            r config set min-slaves-max-lag 2
            pause_process [srv -1 pid]
            assert {[r set foo 12345] eq {OK}}
            wait_for_condition 100 100 {
                [catch {r set foo 12345}] != 0
            } else {
                fail "Master didn't become readonly"
            }
            catch {r set foo 12345} err
            assert_match {NOREPLICAS*} $err
        }
        resume_process [srv -1 pid]

        test {min-slaves-to-write is ignored by slaves} {
            r config set min-slaves-to-write 1
            r config set min-slaves-max-lag 10
            r -1 config set min-slaves-to-write 1
            r -1 config set min-slaves-max-lag 10
            r set foo aaabbb
            wait_for_condition 50 100 {
                [r -1 get foo] eq {aaabbb}
            } else {
                fail "Write did not reach replica"
            }
        }

        # Fix parameters for the next test to work
        r config set min-slaves-to-write 0
        r -1 config set min-slaves-to-write 0
        r flushall

        test {MASTER and SLAVE dataset should be identical after complex ops} {
            createComplexDataset r 10000
            after 500
            if {[r debug digest] ne [r -1 debug digest]} {
                set csv1 [csvdump r]
                set csv2 [csvdump {r -1}]
                set fd [open /tmp/repldump1.txt w]
                puts -nonewline $fd $csv1
                close $fd
                set fd [open /tmp/repldump2.txt w]
                puts -nonewline $fd $csv2
                close $fd
                puts "Master - Replica inconsistency"
                puts "Run diff -u against /tmp/repldump*.txt for more info"
            }
            assert_equal [r debug digest] [r -1 debug digest]
        }
    }
}