Funcion adjustOpenFilesLimit() has an implicit parameter, which is server.maxclients.
This function aims to ajust maximum file descriptor number according to server.maxclients
by best effort, which is "bestlimit" could be lower than "maxfiles" but greater than "oldlimit".
When we try to increase "maxclients" using CONFIG SET command, we could increase maximum
file descriptor number to a bigger value without calling aeResizeSetSize the same time.
When later more and more clients connect to server, the allocated fd could be bigger and bigger,
and eventually exceeds events size of aeEventLoop.events. When new nodes joins the cluster,
new link is created, together with new fd, but when calling aeCreateFileEvent, we did not
check the return value. In this case, we have a non-null "link" but the associated fd is not
registered.
So when we dynamically set "maxclients" we could reach an inconsistency between maximum file
descriptor number of the process and server.maxclients. And later could cause cluster link and link
fd inconsistency.
While setting "maxclients" dynamically, we consider it as failed when resulting "maxclients" is not
the same as expected. We try to restore back the maximum file descriptor number when we failed to set
"maxclients" to the specified value, so that server.maxclients could act as a guard as before.
- make lua-replicate-commands mutable (it never was, but i don't see why)
- make tcp-backlog immutable (fix a recent refactory mistake)
- increase the max limit of a few configs to match what they were before
the recent refactory
Changes in behavior:
- Change server.stream_node_max_entries from int64_t to long long, so that it can be used by the generic infra
- standard error reply instead of "repl-backlog-size must be 1 or greater" and such
- tls-port and a few TLS booleans were readable (config get) even when USE_OPENSSL was off (now they aren't)
- syslog-enabled, syslog-ident, cluster-enabled, appendfilename, and supervised didn't have a get (now they do)
- pidfile was initialized to NULL in InitServerConfig but had CONFIG_DEFAULT_PID_FILE in rewriteConfig (so the real default was "", but rewrite would cause it to be set), fixed the rewrite.
- TLS config in server.h was uninitialized (if no tls config args were provided)
Adding test for sanity and coverage
- Adding is_valid_fn and update_fn, both return 1 for success and 0 for failure with an optional error message.
- Bugfix in handling boundary check of unsigned numeric types (was boundaries as signed)
- Adding more numeric types to generic mechanism: uint, ulonglong, long, time_t, off_t
- More verbose error replies ("argument must be between" in out of range CONFIG SET (like config file parsing)
- add capability for each config to have a callback to check if value is valid and return error string
will enable converting many of the remaining custom configs into generic ones (reducing the x4 repetition for set,get,config,rewrite)
- add capability for each config to to run some update code after config is changed (only for CONFIG SET)
will also enable converting many of the remaining custom configs into generic ones
- add capability to move default values from server.h and server.c to config.c
will reduce many excess lines in server.h and server.c (plus, no need to rebuild the entire code base when a default change 8-))
other behavior changes:
- fix bug in bool config get (always returning 'yes')
- fix a bug in modifying jemalloc-bg-thread at runtime (didn't call set_jemalloc_bg_thread, due to bad merge conflict resolution (my fault))
- side effect when a failed attempt to enable activedefrag at runtime, we now respond with -ERR and not with -DISABLED
This adds support for explicit configuration of a CA certs directory (in
addition to the previously supported bundle file). For redis-cli, if no
explicit CA configuration is supplied the system-wide default
configuration will be adopted.
misc:
- handle SSL_has_pending by iterating though these in beforeSleep, and setting timeout of 0 to aeProcessEvents
- fix issue with epoll signaling EPOLLHUP and EPOLLERR only to the write handlers. (needed to detect the rdb pipe was closed)
- add key-load-delay config for testing
- trim connShutdown which is no longer needed
- rioFdsetWrite -> rioFdWrite - simplified since there's no longer need to write to multiple FDs
- don't detect rdb child exited (don't call wait3) until we detect the pipe is closed
- Cleanup bad optimization from rio.c, add another one
* Introduce a connection abstraction layer for all socket operations and
integrate it across the code base.
* Provide an optional TLS connections implementation based on OpenSSL.
* Pull a newer version of hiredis with TLS support.
* Tests, redis-cli updates for TLS support.
The implementation of the diskless replication was currently diskless only on the master side.
The slave side was still storing the received rdb file to the disk before loading it back in and parsing it.
This commit adds two modes to load rdb directly from socket:
1) when-empty
2) using "swapdb"
the third mode of using diskless slave by flushdb is risky and currently not included.
other changes:
--------------
distinguish between aof configuration and state so that we can re-enable aof only when sync eventually
succeeds (and not when exiting from readSyncBulkPayload after a failed attempt)
also a CONFIG GET and INFO during rdb loading would have lied
When loading rdb from the network, don't kill the server on short read (that can be a network error)
Fix rdb check when performed on preamble AOF
tests:
run replication tests for diskless slave too
make replication test a bit more aggressive
Add test for diskless load swapdb
jemalloc 5 doesn't immediately release memory back to the OS, instead there's a decaying
mechanism, which doesn't work when there's no traffic (no allocations).
this is most evident if there's no traffic after flushdb, the RSS will remain high.
1) enable jemalloc background purging
2) explicitly purge in flushdb
In mostly production environment, normal user's behavior should be
limited.
Now in redis ACL mechanism we can do it like that:
user default on +@all ~* -@dangerous nopass
user admin on +@all ~* >someSeriousPassword
Then the default normal user can not execute dangerous commands like
FLUSHALL/KEYS.
But some admin commands are in dangerous category too like PSYNC,
and the configurations above will forbid replica from sync with master.
Finally I think we could add a new configuration for replication,
it is masteruser option, like this:
masteruser admin
masterauth someSeriousPassword
Then replica will try AUTH admin someSeriousPassword and get privilege
to execute PSYNC. If masteruser is NULL, replica would AUTH with only
masterauth like before.