This makes simpler to give people help when posting such kind of errors
in the mailing list or other help forums, because sometimes the
directive looks well spelled, but the version of Redis they are using is
not able to support it.
the warning condition was if usage > limit (saying it'll cause eviction
or oom), but in fact the eviction and oom depends on used minus slave
buffers.
other than fixing the condition, i add info about the current usage and
limit, which may be useful when looking at the log.
We noticed that the error replies for the generic mechanism for enums
are very verbose for config file parsing, but not for config set
command.
instead of replicating this code, i did a small refactoring to share
code between CONFIG SET and config file parsing.
and also renamed the enum group functions to be consistent with the
naming of other types.
Funcion adjustOpenFilesLimit() has an implicit parameter, which is server.maxclients.
This function aims to ajust maximum file descriptor number according to server.maxclients
by best effort, which is "bestlimit" could be lower than "maxfiles" but greater than "oldlimit".
When we try to increase "maxclients" using CONFIG SET command, we could increase maximum
file descriptor number to a bigger value without calling aeResizeSetSize the same time.
When later more and more clients connect to server, the allocated fd could be bigger and bigger,
and eventually exceeds events size of aeEventLoop.events. When new nodes joins the cluster,
new link is created, together with new fd, but when calling aeCreateFileEvent, we did not
check the return value. In this case, we have a non-null "link" but the associated fd is not
registered.
So when we dynamically set "maxclients" we could reach an inconsistency between maximum file
descriptor number of the process and server.maxclients. And later could cause cluster link and link
fd inconsistency.
While setting "maxclients" dynamically, we consider it as failed when resulting "maxclients" is not
the same as expected. We try to restore back the maximum file descriptor number when we failed to set
"maxclients" to the specified value, so that server.maxclients could act as a guard as before.
- make lua-replicate-commands mutable (it never was, but i don't see why)
- make tcp-backlog immutable (fix a recent refactory mistake)
- increase the max limit of a few configs to match what they were before
the recent refactory
Changes in behavior:
- Change server.stream_node_max_entries from int64_t to long long, so that it can be used by the generic infra
- standard error reply instead of "repl-backlog-size must be 1 or greater" and such
- tls-port and a few TLS booleans were readable (config get) even when USE_OPENSSL was off (now they aren't)
- syslog-enabled, syslog-ident, cluster-enabled, appendfilename, and supervised didn't have a get (now they do)
- pidfile was initialized to NULL in InitServerConfig but had CONFIG_DEFAULT_PID_FILE in rewriteConfig (so the real default was "", but rewrite would cause it to be set), fixed the rewrite.
- TLS config in server.h was uninitialized (if no tls config args were provided)
Adding test for sanity and coverage
- Adding is_valid_fn and update_fn, both return 1 for success and 0 for failure with an optional error message.
- Bugfix in handling boundary check of unsigned numeric types (was boundaries as signed)
- Adding more numeric types to generic mechanism: uint, ulonglong, long, time_t, off_t
- More verbose error replies ("argument must be between" in out of range CONFIG SET (like config file parsing)
- add capability for each config to have a callback to check if value is valid and return error string
will enable converting many of the remaining custom configs into generic ones (reducing the x4 repetition for set,get,config,rewrite)
- add capability for each config to to run some update code after config is changed (only for CONFIG SET)
will also enable converting many of the remaining custom configs into generic ones
- add capability to move default values from server.h and server.c to config.c
will reduce many excess lines in server.h and server.c (plus, no need to rebuild the entire code base when a default change 8-))
other behavior changes:
- fix bug in bool config get (always returning 'yes')
- fix a bug in modifying jemalloc-bg-thread at runtime (didn't call set_jemalloc_bg_thread, due to bad merge conflict resolution (my fault))
- side effect when a failed attempt to enable activedefrag at runtime, we now respond with -ERR and not with -DISABLED
This adds support for explicit configuration of a CA certs directory (in
addition to the previously supported bundle file). For redis-cli, if no
explicit CA configuration is supplied the system-wide default
configuration will be adopted.
misc:
- handle SSL_has_pending by iterating though these in beforeSleep, and setting timeout of 0 to aeProcessEvents
- fix issue with epoll signaling EPOLLHUP and EPOLLERR only to the write handlers. (needed to detect the rdb pipe was closed)
- add key-load-delay config for testing
- trim connShutdown which is no longer needed
- rioFdsetWrite -> rioFdWrite - simplified since there's no longer need to write to multiple FDs
- don't detect rdb child exited (don't call wait3) until we detect the pipe is closed
- Cleanup bad optimization from rio.c, add another one
* Introduce a connection abstraction layer for all socket operations and
integrate it across the code base.
* Provide an optional TLS connections implementation based on OpenSSL.
* Pull a newer version of hiredis with TLS support.
* Tests, redis-cli updates for TLS support.
The implementation of the diskless replication was currently diskless only on the master side.
The slave side was still storing the received rdb file to the disk before loading it back in and parsing it.
This commit adds two modes to load rdb directly from socket:
1) when-empty
2) using "swapdb"
the third mode of using diskless slave by flushdb is risky and currently not included.
other changes:
--------------
distinguish between aof configuration and state so that we can re-enable aof only when sync eventually
succeeds (and not when exiting from readSyncBulkPayload after a failed attempt)
also a CONFIG GET and INFO during rdb loading would have lied
When loading rdb from the network, don't kill the server on short read (that can be a network error)
Fix rdb check when performed on preamble AOF
tests:
run replication tests for diskless slave too
make replication test a bit more aggressive
Add test for diskless load swapdb