redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-24 00:59:02 -05:00

Author	SHA1	Message	Date
antirez	e7affd266c	New features for CLIENT KILL.	2014-06-16 14:24:28 +02:00
antirez	f26f79ea37	Assign a unique non-repeating ID to each new client. This will be used by CLIENT KILL and is also a good way to ensure a given client is still the same across CLIENT LIST calls. The output of CLIENT LIST was modified to include the new ID, but this change is considered to be backward compatible as the API does not imply you can do positional parsing, since each field has a different name.	2014-06-16 14:22:55 +02:00
antirez	56d26c2380	Client types generalized. Because of output buffer limits Redis internals had this idea of type of clients: normal, pubsub, slave. It is possible to set different output buffer limits for the three kinds of clients. However all the macros and API were named after output buffer limit classes, while the idea of a client type is a generic one that can be reused. This commit does two things: 1) Rename the API and defines with more general names. 2) Change the class of clients executing the MONITOR command from "slave" to "normal". "2" is a good idea because you want to have very special settings for slaves, that are not a good idea for MONITOR clients that are instead normal clients even if they are conceptually slave-alike (since it is a push protocol). The backward-compatibility breakage resulting from "2" is considered too minor to worry about, since MONITOR is a debugging command, and because anyway this change is not going to break the format or the behavior, but only changes when a connection is closed on big output buffer issues.	2014-06-16 10:43:05 +02:00
antirez	96e0fe6232	Fix semantics of Lua calls to SELECT. Lua scripts are executed in the context of the currently selected database (as selected by the caller of the script). However Lua scripts are also free to use the SELECT command in order to affect other DBs. When SELECT is called from Lua, the old behavior, before this commit, was to automatically set the Lua caller selected DB to the last DB selected by Lua. See for example the following sequence of commands: SELECT 0; SET x 10; EVAL "redis.call('select','1')" 0; SET x 20. Before this commit, after the execution of this sequence of commands, we'll have x=10 in DB 0, and x=20 in DB 1. Because of the problem above, there was a bug affecting replication of Lua scripts, because of the actual implementation of replication. It was possible to fix the implementation of Lua scripts in order to fix the issue, but looking closely, the bug is the consequence of the behavior of Lua ability to set the caller's DB. Under the old semantics, a script selecting a different DB has no simple way to restore the state and select back the previously selected DB. Moreover the script author must remember that the restore is needed, otherwise the new commands executed by the caller will be executed in the context of a different DB. So this commit fixes both the replication issue, and this hard-to-use semantics, by removing the ability of Lua, after the script execution, to force the caller to switch to the DB selected by the Lua script. The new behavior of the previous sequence of commands is to just set x=20 in DB 0. However Lua scripts are still capable of writing / reading from different DBs if needed. WARNING: This is a semantical change that will break programs that are conceived to select the client selected DB via Lua scripts. This fixes issue #1811.	2014-06-12 16:05:52 +02:00
antirez	73fefd0bc0	Scripting: Fix for a #1118 regression simplified. It is more straightforward to just test for a numerical type avoiding Lua's automatic conversion. The code is technically more correct now, however Lua should automatically convert to number only if the original type is a string that "looks like a number", and not from other types, so practically speaking the fix is identical AFAIK.	2014-06-11 10:10:58 +02:00
Matt Stancliff	76efe1225f	Scripting: Fix regression from #1118 The new check-for-number behavior of Lua arguments broke users who use large strings of just integers. The Lua number check would convert the string to a number, but that breaks user data because Lua numbers have limited precision compared to an arbitrarily precise number wrapped in a string. Regression fixed and new test added. Fixes #1118 again.	2014-06-10 14:26:13 -04:00
antirez	8ef79e72ac	Cluster: fix an error message when logging failover auth denied.	2014-06-10 17:39:42 +02:00
antirez	58799718be	Cluster: better comment for clusterSendFailoverAuthIfNeeded() epoch test.	2014-06-10 17:20:21 +02:00
antirez	61eb0eae83	Cluster: log granted failover authorizations.	2014-06-10 16:56:08 +02:00
antirez	d5d92deb6c	Cluster: log configEpoch updates to myself.	2014-06-10 16:38:36 +02:00
antirez	8204ab0098	Cluster: log when a master denies a failover auth.	2014-06-10 16:07:26 +02:00
antirez	9b3bc82c1a	Cluster: cluster_my_epoch added to CLUSTER INFO output.	2014-06-10 11:35:40 +02:00
Salvatore Sanfilippo	08c7363647	Merge pull request #1743 from mattsta/cygwin-compile-fix Cygwin compile fix	2014-06-09 11:42:14 +02:00
Salvatore Sanfilippo	c7f93143f6	Merge pull request #1669 from mattsta/blpop-internally-added-keys Fix blocking operations from missing new lists	2014-06-09 11:37:28 +02:00
antirez	6a13193d8f	ROLE output improved for slaves. Info about the replication state with the master added.	2014-06-07 17:38:20 +02:00
antirez	d34c2fa3bb	ROLE command added. The new ROLE command is designed in order to provide a client with information about the replication in a fast and easy to use way compared to the INFO command where the same information is also available.	2014-06-07 17:27:49 +02:00
antirez	32d0a79f78	Cluster: check that configEpoch never goes back. Since there are ways to alter the configEpoch outside of the failover procedure (for example CLUSTER SET-CONFIG-EPOCH and via the configEpoch collision resolution algorithm), always make sure, before replacing our configEpoch with a new one, that it is greater than the current one.	2014-06-07 14:37:09 +02:00
antirez	a2c2ef7de5	Cluster: SET-CONFIG-EPOCH should update currentEpoch. SET-CONFIG-EPOCH, used by redis-trib at cluster creation time, failed to update the currentEpoch, making it possible after a failover for a server to set its configEpoch to a value smaller than the current one (since configEpochs are obtained using currentEpoch). The bug totally breaks the Redis Cluster algorithms and protocols allowing for permanent split brain conditions about the slots configuration as shown in issue #1799.	2014-06-07 14:25:47 +02:00
Salvatore Sanfilippo	a2403227c7	Merge pull request #1772 from andygrunwald/typo-avarege-average Fixed typo in word avarege in result message of --intrinsic-latency analyzer	2014-06-06 11:19:21 +02:00
Salvatore Sanfilippo	113be48221	Merge pull request #1780 from badboy/patch-8 Small typo fixed	2014-06-06 10:45:00 +02:00
Salvatore Sanfilippo	1e221d101c	Merge pull request #1788 from zionwu/unstable fix issue 1787	2014-06-06 10:33:11 +02:00
antirez	14fb0ac649	Don't process min-slaves-to-write for slaves. Replication is totally broken when a slave has this option, since it stops accepting updates from masters. This fixes issue #1434.	2014-06-05 10:48:05 +02:00
antirez	3758f27bc1	Fixed dbuf variable scope in luaRedisGenericCommand(). I'm not sure if while the visibility is the inner block, the fact we point to 'dbuf' is a problem or not, probably the stack var is guaranteed to live until the function returns. However obvious code is better anyway.	2014-06-04 18:57:12 +02:00
antirez	072982d83c	Scripting: better Lua number -> string conversion in luaRedisGenericCommand(). The lua_to*string() family of functions use a non optimal format specifier when converting integers to strings. This has both the problem of the number being converted in exponential notation, which we don't use as a Redis return value when floating point numbers are involved, and, moreover, there is a loss of precision since the default format specifier is not able to represent numbers that must be represented exactly in the IEEE 754 number mantissa. The new code handles it as a special case using a saner conversion. This fixes issue #1118.	2014-06-04 18:33:24 +02:00
zionwu	dc8584696a	fix issue 1787	2014-06-01 02:23:24 +08:00
antirez	8a588ac14d	More trailing spaces in sentinel.c removed.	2014-05-28 15:46:05 +02:00
Jan-Erik Rediger	b187c591e3	Small typo fixed	2014-05-28 09:46:01 +02:00
Matt Stancliff	7a0c5fdf12	Disable recursive watchdog signal handler If we are in the signal handler, we don't want to handle the signal again. In extreme cases, this can cause a stack overflow and segfault Redis. Fixes #1771	2014-05-26 17:53:33 +02:00
antirez	88c2307535	Cluster: always allow ok -> fail switch in clusterUpdateState(). There is a time defined by REDIS_CLUSTER_WRITABLE_DELAY where fail -> ok switch is not possible after startup as a master for some time, however the contrary (ok -> fail) should always be possible.	2014-05-26 16:24:12 +02:00
Andy Grunwald	94e3bb568a	Fixed typo in word avarege in result message of --intrinsic-latency analyzer	2014-05-22 20:01:12 +02:00
antirez	b239a32aae	redisLogFromHandler() format changed to match new logs format.	2014-05-22 19:24:35 +02:00
antirez	d98fa718e0	Tag every log line with role. Every log contains, just after the pid, a single character that provides information about the role of an instance: S - Slave, M - Master, C - Writing child, X - Sentinel.	2014-05-22 18:48:37 +02:00
antirez	39603a7e31	Cluster: slave validity factor is now user configurable. Check the commit changes in the example redis.conf for more information.	2014-05-22 16:57:54 +02:00
antirez	762b1ae2be	Fix an error in redis-trib where we always talk with same node. While iterating the list of nodes we want to set the slot as stable in the current node, not always in the first node of the list.	2014-05-21 18:17:02 +02:00
antirez	c68c78719f	redis-trib fix improved: move keys from N nodes to owner.	2014-05-21 16:40:46 +02:00
Matt Stancliff	33f943b4cd	Fix blocking operations from missing new lists Behrad Zari discovered [1] and Josiah reported [2]: if you block and wait for a list to exist, but the list is created by a non-push command, the blocked client never gets notified. This commit adds notification of blocked clients into the DB layer and away from individual commands. Lists can be created by [LR]PUSH, SORT..STORE, RENAME, MOVE, and RESTORE. Previously, blocked client notifications were only triggered by [LR]PUSH. Your client would never get notified if a list were created by SORT..STORE or RENAME or a RESTORE, etc. Blocked client notification now happens in one unified place: - dbAdd() triggers notification when adding a list to the DB Two new tests are added that fail prior to this commit. All tests pass. Fixes #1668 [1]: https://groups.google.com/forum/#!topic/redis-db/k4oWfMkN1NU [2]: #1668	2014-05-21 09:52:52 -04:00
antirez	56161ca0a4	redis-trib fix: use MIGRATE REPLACE when fixing slots. This fixes issue #1765.	2014-05-21 12:15:06 +02:00
antirez	ce2b2f22d9	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-05-20 16:15:13 +02:00
Salvatore Sanfilippo	ce7c47265b	Merge pull request #1764 from michael-grunder/lua_cache_segfault Fix LUA_OBJCACHE segfault.	2014-05-20 16:14:34 +02:00
antirez	4ddc77041f	Remove trailing spaces from scripting.c	2014-05-20 16:11:22 +02:00
antirez	01e3f9ba1d	Remove trailing spaces from sentinel.c.	2014-05-20 14:22:42 +02:00
michael-grunder	ea0e2524aa	Fix LUA_OBJCACHE segfault. When scanning the argument list inside of a redis.call() invocation for pre-cached values, there was no check being done that the argument we were on was in fact within the bounds of the cache size. So if a redis.call() command was ever executed with more than 32 arguments (current cache size #define setting) redis-server could segfault.	2014-05-19 13:18:13 -07:00
Mike Trinkala	ba52cd06c8	Correct the HyperLogLog stale cache flag to prevent unnecessary computations. Set the MSB as documented.	2014-05-18 07:26:26 -07:00
antirez	67133d2f48	Cluster: use clusterSetNodeAsMaster() during slave failover. clusterHandleSlaveFailover() was reimplementing what clusterSetNodeAsMaster() does without any good reason.	2014-05-15 17:03:28 +02:00
antirez	8c6e92c3bc	Cluster: clear todo_before_sleep flags when executing actions. Thanks to this change, when there is some code like: clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|...); ... and later before returning to the event loop ... clusterUpdateState(); The clusterUpdateState() function will clear the flag and will not be repeated in the clusterBeforeSleep() function. This is especially important for config save/fsync flags which are slow to execute and not a good idea to repeat without a good reason. This is implemented for all the CLUSTER_TODO flags.	2014-05-15 16:33:13 +02:00
antirez	7b87cda70e	Fixed typo in CLUSTER RESET implementation.	2014-05-15 12:33:57 +02:00
antirez	796f4ae9f7	CLUSTER RESET implemented. The new command is able to reset a cluster node so that it starts again as a fresh node. By default the command performs a soft reset (the same as calling it as CLUSTER RESET SOFT), and the following steps are performed: 1) All slots are set as unassigned. 2) The list of known nodes is flushed. 3) Node is set as master if it is a slave. When a hard reset is performed with CLUSTER RESET HARD the following additional operations are performed: 4) A new Node ID is created at random. 5) Epochs are set to 0. CLUSTER RESET is useful both when the sysadmin wants to reconfigure a node with a different role (for example turning a slave into a master) and for testing purposes. It also may play a role in automatically provisioned Redis Clusters, since it allows resetting a node back to the initial state in order to be reconfigured.	2014-05-15 11:43:06 +02:00
antirez	8b9d5ecbd1	Remove trailing spaces from cluster.c file.	2014-05-15 10:18:36 +02:00
antirez	60e5d1724c	Cluster: don't accept cluster bus connections during startup.	2014-05-14 12:05:00 +02:00
antirez	6baac558d8	Cluster: better handling of stolen slots. The previous code handling a lost slot (by another master with a higher configuration for the slot) was defensive, considering it an error and putting the cluster in an odd state requiring redis-cli fix. This was changed, because actually this only happens either in a legitimate way, with failovers, or when the admin messed with the config in order to reconfigure the cluster. So the new code instead will try to make sure that the keys stored match the new slots map, by removing all the keys in the slots we lost ownership from. The function that deletes the keys from the lost slots is called only if the node does not lose all its slots (resulting in a reconfiguration as a slave of the node that got ownership). This is an optimization since the replication code will anyway flush all the instance data in a faster way.	2014-05-14 10:46:37 +02:00
antirez	832a298005	Cluster: fixed data_age computation / check integer overflow.	2014-05-12 17:46:15 +02:00
Matt Stancliff	7c4decb101	Fix lack of strtold under Cygwin Renaming strtold to strtod then casting the result is the standard way of dealing with no strtold in Cygwin.	2014-05-12 11:11:09 -04:00
Matt Stancliff	3e0e51dd9f	Fix lack of SA_ONSTACK under Cygwin Fixes #232	2014-05-12 11:10:24 -04:00
antirez	2692339138	Cluster: forced failover implemented. Using CLUSTER FAILOVER FORCE it is now possible to failover a master in a forced way, which means: 1) No check to understand if the master is up is performed. 2) No data age of the slave is checked. Even a slave with very old data can manually failover a master in this way. 3) No chat with the master is attempted to reach its replication offset: the master can just be down.	2014-05-12 16:34:20 +02:00
antirez	005f564eb3	Cluster: bypass data_age check for manual failovers. Automatic failovers only happen in Redis Cluster if the slave trying to be elected was disconnected from its master for no more than 10 times the node-timeout value. However there should be no such check for manual failovers, since these are initiated by the sysadmin that, in theory, knows what she is doing when a slave is selected to be promoted.	2014-05-12 16:12:12 +02:00
Akos Vandra	b252fab06c	Fixed possible buffer overflow bug if RDB file is corrupted. (Note: commit message modified by @antirez for clarity).	2014-05-12 11:48:14 +02:00
Akos Vandra	433e835d3e	fixed possible buffer overflow error	2014-05-12 11:19:07 +02:00
antirez	658ad301cc	redis-trib create: use CLUSTER SET-CONFIG-EPOCH before joining the cluster. This way there is no need for the conflict resolution algo to be used in order to start with a cluster where each node has a different configEpoch.	2014-05-12 11:06:37 +02:00
antirez	715a6d3a78	redis-trib import: trap MIGRATE errors.	2014-05-12 10:36:33 +02:00
antirez	939c586ef7	redis-trib.rb: MIGRATE hardcoded timeout set to 15 sec. Will be configurable / adaptive at some point but let's start with a saner value compared to 1 sec which is not a good idea for big data structures stored into a single key.	2014-05-12 10:22:24 +02:00
antirez	5c78f87666	RESTORE: reply with -BUSYKEY special error code. The error when the target key is busy was a generic one, while it makes sense to be able to distinguish between the target key busy error and the others easily.	2014-05-12 10:01:59 +02:00
antirez	2a48bd4a37	Cluster: initial ability to import data from standalone instance.	2014-05-10 17:59:31 +02:00
antirez	71d0e7e0ea	CLUSTER MEET: better error messages when address is invalid. Fixes issue #1734.	2014-05-09 16:36:59 +02:00
antirez	74435aba47	redis-trib: allow support for mandatory options.	2014-05-09 16:11:11 +02:00
antirez	72ff03346f	DEBUG POPULATE: call dictExpand() to avoid useless rehashing.	2014-05-09 15:02:29 +02:00
antirez	8a170c817d	Cluster: bulk-accept new nodes connections. The same change was operated for normal client connections. This is important for Cluster as well, since when a node rejoins the cluster, when a partition heals or after a restart, it gets flooded with new connection attempts by all the other nodes trying to form a full mesh again.	2014-05-09 11:52:59 +02:00
antirez	3625b52791	Cluster: clusterAcceptHandler() comments updated to match the code.	2014-05-09 11:44:46 +02:00
antirez	2102778606	Sentinel: log when a failover will be attempted again. When a Sentinel performs a failover (successful or not), or when a Sentinel votes for a different Sentinel trying to start a failover, it sets a min delay before it will try to get elected for a failover. While not strictly needed, because if multiple Sentinels will try to failover the same master at the same time, only one configuration will eventually win, this serialization is practically very useful. Normal failovers are cleaner: one Sentinel starts to failover, the others update their config when the Sentinel performing the failover is able to get the selected slave to move from the role of slave to the one of master. However currently this timeout was implicit, so users could see Sentinels not reacting, after a failed failover, for some time, without giving any feedback in the logs to the poor sysadmin waiting for clues. This commit makes Sentinels more verbose about the delay: when a master is down and a failover attempt is not performed because the delay has still not elapsed, something like this will be logged: Next failover delay: I will not start a failover before Thu May 8 16:48:59 2014	2014-05-08 16:38:53 +02:00
antirez	931beae9b0	Sentinel: generate +config-update-from event when a new config is received. This event makes clear, before the switch-master event is generated, that a Sentinel received a configuration update from another Sentinel.	2014-05-08 15:59:34 +02:00
antirez	0b0f872f3f	REDIS_ENCODING_EMBSTR_SIZE_LIMIT set to 39. The new value is the limit for the robj + SDS header + string + null-term to stay inside the 64 bytes Jemalloc arena in 64 bits systems.	2014-05-07 17:05:09 +02:00
antirez	4f686555ce	Scripting: objects caching for Lua c->argv creation. Reusing small objects when possible is a major speedup under certain conditions, since it is able to avoid the malloc/free pattern that otherwise is performed for every argument in the client command vector.	2014-05-07 16:12:32 +02:00
antirez	1e4ba6e7e6	Scripting: Use faster API for Lua client c->argv creation. Replace the three calls to Lua API lua_tostring, lua_strlen, and lua_isstring, with a single call to lua_tolstring. ~ 5% consistent speed gain measured.	2014-05-07 16:12:32 +02:00
antirez	76fda9f8e1	Scripting: don't call lua_gc() after Lua script run. Calling lua_gc() after every script execution is too expensive, and apparently does not make the execution smoother: the same peak latency was measured before and after the commit. This change accounts for scripts execution speedup in the order of 10%.	2014-05-07 16:12:32 +02:00
antirez	48c49c4851	Scripting: cache argv in luaRedisGenericCommand(). ~ 4% consistently measured speed improvement.	2014-05-07 16:12:32 +02:00
antirez	3318b74705	Fixed missing c->bufpos reset in luaRedisGenericCommand(). Bug introduced when adding a fast path to avoid copying the reply buffer for small replies that fit into the client static buffer.	2014-05-07 16:12:32 +02:00
antirez	c49955fd77	Scripting: replace tolower() with faster code in evalGenericCommand(). The function showed up consuming a non trivial amount of time in the profiler output. After this change benchmarking gives a 6% speed improvement that can be consistently measured.	2014-05-07 16:12:32 +02:00
antirez	0ef4f44c5a	Scripting: luaRedisGenericCommand() fast path for buffer-only replies. When the reply is only contained in the client static output buffer, use a fast path avoiding the dynamic allocation of an SDS string to concatenate the client reply objects.	2014-05-07 16:12:32 +02:00
antirez	8226be61ec	Define HAVE_ATOMIC for clang.	2014-05-07 16:12:32 +02:00
antirez	40abeb1f40	Scripting: simpler reply buffer creation in luaRedisGenericCommand(). It is faster to just create the string with a single sdsnewlen() call. If c->bufpos is zero, the call will simply be like sdsempty().	2014-05-07 16:12:32 +02:00
antirez	11d9ecb71d	CLUSTER SET-CONFIG-EPOCH implemented. Initially Redis Cluster accepted that after cluster creation all the nodes were at configEpoch 0, evolving from zero as failovers happen. However later the semantic was made more strict in order to make sure a cluster has always all the master nodes with a different configEpoch, which is more robust in some corner cases (especially resulting from errors by the system administrator). To assign different configEpochs to different nodes at startup was a task performed naturally by the config conflicts resolution algorithm (see the Cluster specification). However this works well only for small clusters or when there are actually just a few collisions, since it is designed for exceptional cases. When a large cluster is created hundreds of nodes can be at epoch 0, so the conflict resolution code is slow to provide a unique config to each node. For this reason this new command was introduced. It can be called only when a node is totally fresh: no other nodes known, and configEpoch set to zero, so it is safe even against misuses. redis-trib will use the new command in order to start the cluster already setting an incremental unique config to every node.	2014-04-29 19:15:16 +02:00
antirez	0bcc7cb4bf	CLIENT LIST speedup via peerid caching + smart allocation. This commit adds peer ID caching in the client structure plus an API change and the use of sdsMakeRoomFor() in order to improve the reallocation pattern to generate the CLIENT LIST output. Both the changes account for a very significant speedup.	2014-04-28 17:36:57 +02:00
antirez	f9a4a80f49	Use sdscatfmt() in getClientInfoString() to make it faster.	2014-04-28 16:55:43 +02:00
antirez	2d76736a2e	Added new sdscatfmt() %u and %U format specifiers. This commit also fixes a bug in the implementation of sdscatfmt() resulting from stale references to the SDS string header after sdsMakeRoomFor() calls.	2014-04-28 16:38:17 +02:00
antirez	53575c4708	sdscatfmt() added to SDS library. sdscatprintf() relies on printf() family libc functions and is sometimes too slow in critical code paths. sdscatfmt() is an alternative which is: 1) Far less capable. 2) Format specifier incompatible. 3) Faster. It is suitable to be used in those speed critical code paths such as CLIENT LIST output generation.	2014-04-28 16:23:17 +02:00
antirez	e29d330724	Process events with processEventsWhileBlocked() when blocked. When we are blocked and a few events are processed from time to time, it is smarter to call the event handler a few times in order to handle the accept, read, write, close cycle of a client in a single pass, otherwise there is too much latency added for clients to receive a reply while the server is busy in some way (for example during the DB loading).	2014-04-24 21:44:32 +02:00
antirez	3a3458ee7b	Accept multiple clients per iteration. When the listening sockets readable event is fired, we have the chance to accept multiple clients instead of accepting a single one. This makes Redis more responsive when there is a mass-connect event (for example after the server startup), and in workloads where a connect-disconnect pattern is used often, so that multiple clients are waiting to be accepted continuously. As a side effect, this commit makes the LOADING, BUSY, and similar errors much faster to deliver to the client, making Redis more responsive when it has to return errors to inform the clients that the server is blocked in a non-interruptible operation.	2014-04-24 21:44:32 +02:00
antirez	cac4bae11a	AE_ERR -> ANET_ERR in acceptUnixHandler(). No actual changes since the value is the same.	2014-04-24 21:43:22 +02:00
antirez	7d9b45b4a1	While ANET_ERR is -1, check syscall retval for -1 itself.	2014-04-24 17:03:07 +02:00
antirez	e3cf812c9e	clusterLoadConfig() REDIS_ERR retval semantics refined. We should return REDIS_ERR to signal we can't read the configuration because there is no config file only after checking errno, otherwise we risk rewriting an existing file that was not accessible for some other reason.	2014-04-24 16:23:03 +02:00
antirez	db06108bc1	Lock nodes.conf to avoid multiple processes using the same file. This was a common source of problems among users. The solution adopted is not bullet-proof as if the user deletes the nodes.conf file manually, and starts a new instance with the same nodes.conf file path, two instances will use the same file. However following this reasoning the user may drop a nuclear bomb into the datacenter as well.	2014-04-24 16:04:10 +02:00
Salvatore Sanfilippo	32c917964e	Merge pull request #1677 from mattsta/expire-before-delete Check key expiration before deleting	2014-04-23 16:13:49 +02:00
Glauber Costa	7dd4432798	fix null pointer access with no file pointer I happen to be working on a system that lacks urandom. While the code does try to handle this case and artificially create some bytes if the file pointer is empty, it does try to close it unconditionally, leading to a segfault.	2014-04-23 12:07:25 +02:00
Salvatore Sanfilippo	e0918a332d	Merge pull request #1701 from kingsumos/node_description fix cluster node description showing wrong slot allocation	2014-04-23 11:37:47 +02:00
antirez	cb4e2ee9e7	Missing return REDIS_ERR added to processMultibulkBuffer(). When we set a protocol error we should return with REDIS_ERR to let the caller know it should stop processing the client. Bug found in a code auditing related to issue #1699.	2014-04-23 10:19:43 +02:00
kingsumos	a69178fdd2	fix cluster node description showing wrong slot allocation	2014-04-22 11:44:53 -04:00
antirez	20c040d364	redis-cli help.h updated.	2014-04-22 16:14:38 +02:00
antirez	ab3afe2f4d	ZREMRANGEBYLEX memory leak removed calling zslFreeLexRange().	2014-04-18 13:01:04 +02:00
antirez	5eb7ac0c92	Speedup hllRawSum() processing 8 bytes per iteration. The internal HLL raw encoding used by PFCOUNT when merging multiple keys is aligned to 8 bits (1 byte per register) so we can exploit this to improve performance by processing multiple bytes per iteration. In benchmarks the new code was several times faster with HLLs with many registers set to zero, while no slowdown was observed with populated HLLs.	2014-04-17 18:05:27 +02:00
antirez	192a213274	Speedup SUM(2^-reg[m]) in HyperLogLog computation. When the register is set to zero, we need to add 2^-0 to E, which is 1, but it is faster to just add 'ez' at the end, which is the number of registers set to zero, a value we need to compute anyway.	2014-04-17 17:53:20 +02:00
antirez	0feb2aabca	PFCOUNT support for multi-key union.	2014-04-17 17:32:59 +02:00
antirez	fcd2155b6f	HyperLogLog low level merge extracted from PFMERGE.	2014-04-17 17:08:43 +02:00
antirez	78954ca3a2	ZREMRANGEBYLEX implemented.	2014-04-17 14:49:25 +02:00
antirez	8827dc4eec	Always pass sorted set range objects by reference.	2014-04-17 14:30:12 +02:00
antirez	95098b7230	ZREMRANGE* commands refactored into a single generic function.	2014-04-17 14:19:14 +02:00
antirez	bcab07f7fc	Pass by pointer and release of lex ranges. Given that the code was written with a 2 years pause... something strange happened in the middle. So there was no function to free a lex range min/max objects, and in some places the range was passed by value.	2014-04-16 23:55:58 +02:00
antirez	8b5e0b213e	ZLEXCOUNT implemented. Like ZCOUNT for lexicographical ranges.	2014-04-16 12:17:00 +02:00
antirez	8e8f8189eb	HyperLogLog invalid representation error code set to INVALIDOBJ.	2014-04-16 09:10:30 +02:00
antirez	0bbdaca6a0	PFDEBUG TODENSE added. Converts HyperLogLogs from sparse to dense. Used for testing.	2014-04-16 09:05:42 +02:00
antirez	402110f9fd	User-defined switch point between sparse-dense HLL encodings.	2014-04-15 17:46:51 +02:00
antirez	d541f65d66	PFSELFTEST improved with sparse encoding checks.	2014-04-15 10:10:38 +02:00
antirez	dde8dff73f	PFDEBUG ENCODING added.	2014-04-14 19:35:00 +02:00
antirez	54f0156e8c	Set HLL_SPARSE_MAX to 3000. After running a few benchmarks, 3000 looks like a reasonable value to keep HLLs with a few thousand elements small while the CPU cost is still not huge. This covers all the cases where the dense representation would use N orders of magnitude more space, like in the case of many HLLs with cardinality of a few tens or hundreds. It is not impossible that in the future this gets user configurable, however it is easy to pick an unreasonable value just looking at savings in the space dimension without checking what happens in the time dimension.	2014-04-14 16:15:55 +02:00
antirez	848d0461f9	Error message for invalid HLL objects unified.	2014-04-14 16:11:54 +02:00
antirez	81ceef7d22	PFMERGE fixed to work with sparse encoding.	2014-04-14 16:09:32 +02:00
antirez	9df77fc0c4	Mark PFDEBUG as write command in the commands table. It is safer since it is able to have side effects.	2014-04-14 15:57:50 +02:00
antirez	3bc35f9ce9	Correctly replicate PFDEBUG GETREG. Even if it is a debugging command, make sure that when it forces a change in encoding, the command is propagated.	2014-04-14 15:57:19 +02:00
antirez	ba0afb4566	Added assertion in hllSparseAdd() when promotion to dense occurs. If we converted to dense, a register must be updated in the dense representation.	2014-04-14 15:55:21 +02:00
antirez	e9cd51c7eb	hllSparseAdd(): speed optimization. Mostly by reordering opcodes check conditional by frequency of opcodes in larger sparse-encoded HLLs.	2014-04-14 15:42:05 +02:00
antirez	681bf7468b	Detect corrupted sparse HLLs in hllSparseSum().	2014-04-14 15:20:26 +02:00
antirez	db40da0a47	hllSparseAdd(): faster code removing conditional. Bottleneck found profiling. Big run time improvement found when testing after the change.	2014-04-14 12:58:46 +02:00
antirez	4e0a99ba51	Comment typo in hllSparseAdd(). first -> fits.	2014-04-14 12:12:53 +02:00
antirez	5532b5308a	Merge adjacent VAL opcodes in hllSparseAdd(). As more values are added splitting ZERO or XZERO opcodes, try to merge adjacent VAL opcodes if they have the same value.	2014-04-14 12:11:39 +02:00
antirez	837ca39081	More robust HLL_SPARSE macros protecting 'p' with parens. Now the macros will work with arguments such as "ptr+1".	2014-04-14 11:49:53 +02:00
antirez	142d133c8a	hllSparseAdd() opcode seek stop condition fixed.	2014-04-14 11:04:11 +02:00
antirez	1ee18db922	Fixed error message generation in PFDEBUG GETREG. Bulk length for registers was emitted too early, so if there was a bug the reply looked like a long array with just one element, blocking the client as result.	2014-04-14 10:25:19 +02:00
antirez	82c31f750d	Fixed memmove() count in hllSparseAdd().	2014-04-14 09:40:07 +02:00
antirez	3b20003503	hllSparseAdd(): more correct dense conversion conditional. We want to promote if the total string size exceeds the resulting size after the upgrade.	2014-04-14 09:36:32 +02:00
antirez	b7571b7453	hllSparseToDense(): sanity check added. The function checks if all the HLL_REGISTERS were processed during the conversion from sparse to dense encoding, returning REDIS_OK or REDIS_ERR to signal a corruption problem. A bug in PFDEBUG GETREG was fixed: when the object is converted to the dense representation we need to reassign the new pointer to the header structure pointer.	2014-04-14 09:27:01 +02:00
antirez	f9dc3cb04d	PFDEBUG DECODE added. Provides a human readable description of the opcodes composing a run-length encoded HLL (sparse encoding). The command is only useful for debugging / development tasks.	2014-04-14 09:00:53 +02:00
antirez	261da523e8	PFDEBUG added, PFGETREG removed. PFDEBUG will be the interface to do debugging tasks with a key containing an HLL object.	2014-04-13 23:01:21 +02:00
antirez	e8e717e145	hllSparseToDense API changed to take ref to object. The new API takes directly the object doing everything needed to turn it into a dense representation, including setting the new representation as object->ptr.	2014-04-13 22:59:27 +02:00
antirez	2067644a8c	hllSparseAdd() sanity check for span != 0 added.	2014-04-13 10:19:12 +02:00
antirez	80140fa006	Fix hllSparseAdd() new sequence replacement when next is NULL. sdsIncrLen() must be called anyway even if we are replacing the last opcode of the sparse representation.	2014-04-12 23:55:44 +02:00
antirez	3c3c16561a	Fix seqlen computation in hllSparseAdd().	2014-04-12 23:52:36 +02:00
antirez	a9e057e095	Abstract hllSparseAdd() / hllDenseAdd() via hllAdd().	2014-04-12 23:42:56 +02:00
antirez	0b7d08efb9	hllSparseSum(): multiply 1 * runlen for zero entries.	2014-04-12 16:47:50 +02:00
antirez	d9314079ca	Macro HLL_SPARSE_XZERO_LEN fixed.	2014-04-12 16:46:08 +02:00
antirez	f5c03044a6	Fix HLL sparse object creation #2 . Two vars initialized to wrong values in createHLLObject().	2014-04-12 16:37:50 +02:00
antirez	b5659cb0a6	Increment pointer while iterating sparse HLL object.	2014-04-12 11:02:14 +02:00
antirez	1ccb661569	Fix HLL sparse object creation. The function didn't consider the fact that each XZERO opcode is two bytes.	2014-04-12 10:59:12 +02:00
antirez	a79386b1af	Create HyperLogLog objects with sparse encoding.	2014-04-12 10:56:18 +02:00
antirez	1fc04a6221	HyperLogLog sparse to dense conversion function.	2014-04-12 10:55:42 +02:00
antirez	c756936b1d	HyperLogLog sparse representation initial implementation. Code never tested, but the basic layout is shaped in this commit. Also missing: 1) Sparse -> Dense conversion function. 2) New HLL object creation using the sparse representation. 3) Implementation of PFMERGE for the sparse representation.	2014-04-11 17:34:32 +02:00
antirez	8ea5b46d30	hllCount() refactored to support multiple representations.	2014-04-11 10:25:07 +02:00
antirez	1efc1e052d	hllAdd() refactored into two functions. Also dense representation access macro renamed accordingly.	2014-04-11 09:47:52 +02:00
antirez	d55474e558	HyperLogLog refactoring to support different encodings. Metadata are now placed at the start of the representation as an header. There is a proper structure to access the representation. Still work to do in order to truly abstract the implementation from the representation, commands still work assuming dense representation.	2014-04-11 09:26:45 +02:00
Matt Stancliff	83d2830372	Check key expiration before deleting. Deleting an expired key should return 0, not success. Fixes #1648	2014-04-10 17:08:02 -04:00
antirez	9c037ba85f	HyperLogLog sparse representation slightly modified. After running a few simulations with different alternative encodings, it was found that the VAL opcode performs better using 5 bits for the value and 2 bits for the run length, at least for cardinalities in the range of interest.	2014-04-10 16:36:31 +02:00
antirez	da2fbcf93d	HyperLogLog sparse representation description and macros.	2014-04-09 18:56:00 +02:00
antirez	67bb2c46b2	Add casting to match printf format. adjustOpenFilesLimit() and clusterUpdateSlotsWithConfig() that were assuming uint64_t is the same as unsigned long long, which is true probably for all the systems out there that we target, but still GCC emitted a warning since technically they are two different types.	2014-04-07 08:58:06 +02:00
antirez	3a6a1e42f1	ZRANGEBYLEX and ZREVRANGEBYLEX implementation.	2014-04-05 11:41:43 +02:00
antirez	d5be696db8	PFCOUNT: always unshare/decode the object. This will be a non-op most of the times since the object will be unshared / decoded, however it is more technically correct to start this way since the object may be decoded even in the read-only code path.	2014-04-04 17:25:55 +02:00
antirez	1c12bcbcfb	tryObjectEncoding() refactoring. We also avoid to re-create an object that is already in EMBSTR encoding.	2014-04-04 17:25:35 +02:00
antirez	433ce7f85c	Changed HyperLogLog hash seed to a non-zero value. Using a seed of zero has the side effect of having the empty string hashing to what is a very special case in the context of HyperLogLog: a very long run of zeroes. This did not influence the correctness of the result with 16k registers because of the harmonic mean, but still it is inconvenient that such an obvious value maps to such a special hash. The seed 0xadc83b19 is used instead, which is the first 32 bits of the SHA1 of the empty string. Reference: issue #1657.	2014-04-04 09:36:32 +02:00
antirez	d2ca4bb62d	Return "WRONGTYPE" error on PF* type mismatch.	2014-04-03 22:10:20 +02:00
antirez	349c978189	Fix PFADD infinite loop. We need to guarantee that the last bit is 1, otherwise an element may hash to just zeroes with probability 1/(2^64) and trigger an infinite loop. See issue #1657.	2014-04-03 19:31:26 +02:00
antirez	ce637b2fef	Remove HyperLogLog type checking duplicated code.	2014-04-03 13:20:34 +02:00
antirez	aaaed66c56	PFGETREG added for testing purposes. The new command allows to get a dump of the registers stored into an HyperLogLog data structure for testing / debugging purposes.	2014-04-03 10:45:30 +02:00
antirez	9682295f68	PFCOUNT: unshare the object when cached cardinality is modified.	2014-04-03 10:37:32 +02:00
antirez	be9860d0e9	PFSELFTEST improved to test the approximation error.	2014-04-03 10:18:31 +02:00
antirez	096b5e921e	HyperLogLog: added magic / version. This will allow future changes like compressed representations. Currently the magic is not checked for performance reasons but this may change in the future, for example if we add new types encoded in strings that may have the same size of HyperLogLogs.	2014-04-02 09:58:47 +02:00
Raymond Myers	bf066c875f	Fixed pfadd/pfcount commands emitting hll* events instead of pf* events	2014-04-01 14:59:13 -07:00
Raymond Myers	f0868e080d	Change HLL* to PF* in error messages	2014-04-01 14:54:31 -07:00
antirez	4ab162a559	Include redis.h before other stuff in hyperloglog.c. Otherwise fmacros.h is included later and this may break compilation on different systems.	2014-04-01 15:52:15 +02:00
antirez	5afcca34ce	HyperLogLog API prefix modified from "P" to "PF". Using both the initials of Philippe Flajolet instead of just "P".	2014-03-31 22:48:01 +02:00
antirez	ba4e20835a	Makefile.dep updated with hyperloglog.o deps.	2014-03-31 19:51:34 +02:00
antirez	e887c62e45	HyperLogLog: make API use the P prefix in honor of Philippe Flajolet.	2014-03-31 19:29:40 +02:00
antirez	f1b7608128	HLLMERGE fixed by adding a... missing loop!	2014-03-31 16:03:05 +02:00
antirez	ec1ee66256	HyperLogLog apply bias correction using a polynomial. Better results can be achieved by compensating for the bias of the raw approximation just after 2.5m (when LINEARCOUNTING is no longer used) by using a polynomial that approximates the bias at a given cardinality. The curve used was found using this web page: http://www.xuru.org/rt/PR.asp That performs polynomial regression given a set of values.	2014-03-31 15:41:38 +02:00
antirez	f2277475b2	HLLMERGE implemented. Merge N HLL data structures by selecting the max value for every M[i] register among the set of HLLs.	2014-03-31 14:39:44 +02:00
antirez	4ab45183fc	HLLCOUNT is technically a write command When we update the cached value, we need to propagate the command and signal the key as modified for WATCH.	2014-03-31 12:29:24 +02:00
antirez	8aeb0c196a	HLLADD: propagate write when only variable name is given. The following form is given: HLLADD myhll No element is provided in the above case so if 'myhll' var does not exist the result is to just create an empty HLL structure, and no update will be performed on the registers. In this case, the DB should still be set dirty and the command propagated.	2014-03-31 12:21:08 +02:00
antirez	60e60f4ee0	HyperLogLog: use LINEARCOUNTING up to 3m. The HyperLogLog original paper suggests using LINEARCOUNTING for cardinalities < 2.5m, however for P=14 the median / max error curves show that a value of '3' is the best pick for m = 16384.	2014-03-31 10:09:55 +02:00
antirez	307a189900	HyperLogLog approximated cardinality caching. The more we add elements to an HyperLogLog counter, the smaller is the probability that we actually update some register. From this observation it is easy to see how it is possible to use caching of a previously computed cardinality and reuse it to serve HLLCOUNT queries as long as no register was updated in the data structure. This commit does exactly this by using just 8 additional bytes in the data structure to store the cached cardinality as a 64 bit unsigned integer. When the most significant bit of the 64 bit integer is set, it means that the value computed is no longer usable since at least a single register was modified and we need to recompute it at the next call of HLLCOUNT. The value is always stored in little endian format regardless of the actual CPU endianness.	2014-03-30 19:26:16 +02:00
antirez	543ede03f2	String value unsharing refactored into proper function. All the Redis functions that need to modify the string value of a key in a destructive way (APPEND, SETBIT, SETRANGE, ...) require to make the object unshared (if refcount > 1) and encoded in raw format (if encoding is not already REDIS_ENCODING_RAW). This was cut & pasted many times in multiple places of the code. This commit puts the small logic needed into a function called dbUnshareStringValue().	2014-03-30 18:32:17 +02:00
antirez	aaf6db459b	Use endian neutral hash function for HyperLogLog. We need to be sure that you can save a dataset in a Redis instance, reload it in a different architecture, and continue to count in the same HyperLogLog structure. So 32 and 64 bit, little or big endian, must all guarantee to output the same hash for the same element.	2014-03-30 00:55:49 +01:00
antirez	4628ac0065	HyperLogLog internal representation modified. The new representation is more obvious, starting from the LSB of the first byte and using bits going to MSB, and passing to next byte as needed. There was also a subtle error: first two bits were unused, everything was carried over on the right of two bits, even if it worked because of the code requirement of always having a byte more at the end. During the rewrite the code was made safer trying to avoid undefined behavior due to shifting an uint8_t for more than 8 bits.	2014-03-29 16:04:27 +01:00
antirez	5317a582cf	Remove a few useless operations from hllCount() fast path.	2014-03-29 12:17:56 +01:00
antirez	3ed947fb30	HLLCOUNT 3x faster taking fast path for default params.	2014-03-29 12:12:44 +01:00
antirez	28dce36f76	Use processor base types in HLL_(GET\|SET)_REGISTER. This speedups the macros by a noticeable factor.	2014-03-29 08:37:01 +01:00
antirez	ac8fbe8829	HyperLogLog: use precomputed table for 2^(-M[i]).	2014-03-28 22:49:24 +01:00
antirez	f90a4af3d7	HyperLogLog algorithm fixed in two ways. There was an error in the computation of 2^register, and the sequence of zeroes computed after the hashing did not include the "1".	2014-03-28 18:24:05 +01:00
antirez	ded86076b3	HLLCOUNT implemented.	2014-03-28 17:37:18 +01:00
antirez	156929ee97	HLLADD implemented.	2014-03-28 16:24:35 +01:00
antirez	5660ff1cc1	hllAdd() low level HyperLogLog "add" implemented.	2014-03-28 14:42:30 +01:00
antirez	e3234116ad	HyperLogLog: redefine constants using "P".	2014-03-28 14:09:28 +01:00
antirez	e73839e7d5	HLL_SET_REGISTER fixed. There was an error in the first version of the macro. Now the HLLSELFTEST test reports success.	2014-03-28 13:56:07 +01:00
antirez	f22397dd7f	Use REDIS_HLL_REGISTER_MAX when possible.	2014-03-28 12:16:39 +01:00
antirez	1c88c5941b	HLL_(SET\|GET)_REGISTER types fixed.	2014-03-28 12:15:46 +01:00
antirez	552eb5407a	HLLSELFTEST command implemented. To test the bitfield array of counters set/get macros from the Redis Tcl suite is hard, so a specialized command that is able to test the internals was developed.	2014-03-28 12:11:55 +01:00
antirez	0609380603	HyperLogLog: initial sketch of registers access.	2014-03-28 11:18:48 +01:00
antirez	8f52173b2c	Cluster: last_vote_epoch -> lastVoteEpoch. Use camel case for epochs that are persisted on disk.	2014-03-27 15:01:24 +01:00
antirez	7fb14b73ba	Cluster: save/restore vars that must persist after recovery. This fixes issue #1479.	2014-03-27 14:56:29 +01:00
antirez	6dd2dbbd36	Cluster: handshake "already known" error logged to VERBOSE. This is not really an error but something that always happens for example when creating a new cluster, or if the sysadmin rejoins manually a node that is already known. Since useless logs don't help, moved to VERBOSE level.	2014-03-26 16:35:38 +01:00
antirez	3cf6f1f54f	Cluster: clusterHandleConfigEpochCollision() fixed. New config epochs must always be obtained incrementing the currentEpoch, that is itself guaranteed to be >= the max configEpoch currently known to the node.	2014-03-26 12:31:28 +01:00
antirez	80d4c52cdf	Cluster: better logging for clusterUpdateSlotsConfigWith().	2014-03-26 12:09:38 +01:00
antirez	eb746ec408	Cluster: CLUSTER SETSLOT implementation comment updated. Update the comment since the implementation details changed.	2014-03-25 17:50:46 +01:00
antirez	0064b1a583	Cluster: redis-trib cluster allocation more even across nodes. redis-trib used to allocate slots not considering fractions of nodes when computing the slots_per_node amount. So the fractional part was carried over till the end of the allocation, where the last node received a few more slots than any other (or a lot more if the cluster was composed of many nodes). The computation was changed to allocate slots more evenly when they are not exactly divisible for the number of masters we have.	2014-03-25 17:44:39 +01:00
antirez	6c527a89a0	Cluster: configEpoch collisions resolution. The slave election in Redis Cluster guarantees that slaves promoted to masters always end with unique config epochs, however failures during manual reshardings, software bugs and operational errors may in theory cause two nodes to have the same configEpoch. This commit introduces a mechanism to eventually always end with different configEpochs if a collision ever happens. As a (wanted) side effect, this also ensures that after a new cluster is created, all nodes will end with a different configEpoch automatically.	2014-03-25 17:19:58 +01:00
antirez	c1041c570f	Cluster: stay within 80 cols.	2014-03-25 16:07:14 +01:00
antirez	6540e9eeaa	Fix off by one bug in freeMemoryIfNeeded() eviction pool. Bug found by the continuous integration test running the Redis with valgrind: ==6245== Invalid read of size 8 ==6245== at 0x4C2DEEF: memcpy@GLIBC_2.2.5 (mc_replace_strmem.c:876) ==6245== by 0x41F9E6: freeMemoryIfNeeded (redis.c:3010) ==6245== by 0x41D2CC: processCommand (redis.c:2069) memmove() size argument was accounting for an extra element, going outside the bounds of the array.	2014-03-25 10:32:15 +01:00
antirez	6e33c908dd	adjustOpenFilesLimit() refactoring. In this commit: * Decrement steps are semantically differentiated from the reserved FDs. Previously both values were 32 but the meaning was different. * Make it clear that we save setrlimit errno. * Don't explicitly handle wrapping of 'f', but prevent it from happening. * Add comments to make the function flow more readable. This integrates PR #1630	2014-03-25 09:05:28 +01:00
Salvatore Sanfilippo	72c5ebcba4	Merge pull request #1630 from mattsta/fix-infinite-loop-ulimit Fix infinite loop ulimit	2014-03-25 08:42:39 +01:00
antirez	35667d75c3	Fixed undefined variable value with certain code paths. In sentinelFlushConfig() fd could be undefined when the following if statement was true: if (rewrite_status == -1) goto werr; This could cause random file descriptors to get closed.	2014-03-24 21:07:44 +01:00
Matt Stancliff	78782ed59f	Use LRU_CLOCK() instead of function getLRUClock(). lookupKey() uses LRU_CLOCK(), so it seems object creation should use LRU_CLOCK() too.	2014-03-24 14:39:26 -04:00
Matt Stancliff	4290455145	Sentinel: Notify user when config can't be saved	2014-03-24 13:54:14 -04:00
Matt Stancliff	b47b343fab	Fix data loss when saving AOF/RDB with no free space. Previously, the (!fp) would only catch lack of free space under OS X. Linux waits to discover it can't write until it actually writes contents to disk. (fwrite() returns success even if the underlying file has no free space to write into. All the errors only show up at flush/sync/close time.) Fixes antirez/redis#1604	2014-03-24 13:54:14 -04:00
Salvatore Sanfilippo	906c4d77c0	Merge pull request #1617 from mattsta/remove-unused-warning Cluster: remove variable causing warning	2014-03-24 18:33:22 +01:00
Salvatore Sanfilippo	8e6625e6ae	Merge pull request #1629 from mattsta/fix-trib-master-assignment Cluster: Restore proper trib master iteration	2014-03-24 18:31:55 +01:00
Salvatore Sanfilippo	a006fcb8a7	Merge pull request #1628 from mattsta/fix-trib-create Cluster: Fix trib create when masters==replicas	2014-03-24 18:26:17 +01:00
Matt Stancliff	386a46946b	Fix potentially incorrect errno usage. errno may be reset by the previous call to redisLog, so capture the original value for proper error reporting.	2014-03-24 13:21:15 -04:00
Matt Stancliff	3b54ee6ea4	Add REDIS_MIN_RESERVED_FDS define for open fds Also update the original REDIS_EVENTLOOP_FDSET_INCR to include REDIS_MIN_RESERVED_FDS. REDIS_EVENTLOOP_FDSET_INCR exists to make sure more than (maxclients+RESERVED) entries are allocated, but we can only guarantee that if we include the current value of REDIS_MIN_RESERVED_FDS as a minimum for the INCR size.	2014-03-24 13:15:35 -04:00
Salvatore Sanfilippo	896e15f3e3	Merge pull request #1627 from badboy/lru-fix Fixed a few typos.	2014-03-24 18:13:39 +01:00
Matt Stancliff	e942f3ce0f	Cluster: Restore proper trib master iteration. This got removed in `2e5c394` during a new feature addition. The prior commit had "break if masters.length == masters_count" but we are guaranteed to already have that condition met since otherwise we wouldn't have gotten this far. Without this break statement, it's possible some masters may be forgotten and have zero replicas while other masters have more than their requested number of replicas. Thanks to carlos for pointing out this regression at: https://groups.google.com/forum/#!topic/redis-db/_WVVqDw5B7c	2014-03-24 10:17:44 -04:00
Matt Stancliff	df4bdbf688	Cluster: Fix trib create when masters==replicas This bug was introduced in `2e5c394f` during a refactor. It took me a while to understand what was going on with the code, so I've refactored it further by: - Replacing boolean values with meaningful symbols - Replacing 'i' with a meaningful variable name - Adding the proper abort check - Factoring out now duplicated conditionals - Adding optional verbose logging (we're inside four different looping constructs, so it takes a while to figure out where all the moving parts are) - Updating comment for the section This fixes a problem when the number of master instances equaled the number of replica instances. Before, when there were equal numbers of both, nodes_count would go to zero, but the while loop would spin in i < @replicas because i would never be updated (because the nodes_list of each ip was length == 0, which triggered an endless loop of next -> i = 0 -> 0 < 1? -> true -> next -> i = 0 ...) Thanks to carlo who found this problem at: https://groups.google.com/forum/#!topic/redis-db/_WVVqDw5B7c	2014-03-24 10:17:38 -04:00
Matt Stancliff	90b844212d	Fix infinite loop on startup if ulimit too low Fun fact: rlim_t is an unsigned long long on all platforms. Continually subtracting from a rlim_t makes it get smaller and smaller until it wraps, then you're up to 2^64-1. This was causing an infinite loop on Redis startup if your ulimit was extremely (almost comically) low. The case of (f > oldlimit) would never be met in a case like: f = 150 while (f > 20) f -= 128 Since f is unsigned, it can't go negative and would take on values of: Iteration 1: 150 - 128 => 22 Iteration 2: 22 - 128 => 18446744073709551510 Iterations 3-∞: ... To catch the wraparound, we use the previous value of f stored in limit.rlimit_cur. If we subtract from f and get a larger number than the value it had previously, we print an error and exit since we don't have enough file descriptors to help the user at this point. Thanks to @bs3g for the inspiration to fix this problem. Patches existed from @bs3g at antirez#1227, but I needed to repair a few other parts of Redis simultaneously, so I didn't get a chance to use them.	2014-03-24 10:17:33 -04:00
Matt Stancliff	4a25983f8f	Improve error handling around setting ulimits The log messages about open file limits have always been slightly opaque and confusing. Here's an attempt to fix their wording, detail, and meaning. Users will have a better understanding of how to fix very common problems with these reworded messages. Also, we handle a new error case when maxclients becomes less than one, essentially rendering the server unusable. We now exit on startup instead of leaving the user with a server unable to handle any connections. This fixes antirez#356 as well.	2014-03-24 10:17:33 -04:00
Matt Stancliff	491532a713	Replace magic 32 with REDIS_EVENTLOOP_FDSET_INCR. 32 was the additional number of file descriptors Redis would reserve when managing a too-low ulimit. The number 32 was in too many places statically, so now we use a macro instead that looks more appropriate. When Redis sets up the server event loop, it uses: server.maxclients+REDIS_EVENTLOOP_FDSET_INCR So, when reserving file descriptors, it makes sense to reserve at least REDIS_EVENTLOOP_FDSET_INCR FDs instead of only 32. Currently, REDIS_EVENTLOOP_FDSET_INCR is set to 128 in redis.h. Also, I replaced the static 128 in the while f < old loop with REDIS_EVENTLOOP_FDSET_INCR as well, which results in no change since it was already 128. Impact: Users now need at least maxclients+128 as their open file limit instead of maxclients+32 to obtain actual "maxclients" number of clients. Redis will carve the extra REDIS_EVENTLOOP_FDSET_INCR file descriptors it needs out of the "maxclients" range instead of failing to start (unless the local ulimit -n is too low to accommodate the request).	2014-03-24 10:17:33 -04:00
Matt Stancliff	c138631cd1	Fix maxclients error handling Everywhere in the Redis code base, maxclients is treated as an int with (int)maxclients or `maxclients = atoi(source)`, so let's make maxclients an int. This fixes a bug where someone could specify a negative maxclients on startup and it would work (as well as set maxclients very high) because: unsigned int maxclients; char *update = "-300"; maxclients = atoi(update); if (maxclients < 1) goto fail; But, (maxclients < 1) can only catch the case when maxclients is exactly 0. maxclients happily sets itself to -300, which isn't -300, but rather 4294966996, which isn't < 1, so... everything "worked." maxclients config parsing checks for the case of < 1, but maxclients CONFIG SET parsing was checking for case of < 0 (allowing maxclients to be set to 0). CONFIG SET parsing is now updated to match config parsing of < 1. It's tempting to add a MINIMUM_CLIENTS define, but... I didn't. These changes were inspired by antirez#356, but this doesn't fix that issue.	2014-03-24 10:17:33 -04:00
antirez	93253c2762	Sample and cache RSS in serverCron(). Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX. This slowed down the generation of the INFO 'memory' section. Since the RSS does not require to be a real-time measurement, we now sample it with server.hz frequency (10 times per second by default) and use this value both to show the INFO rss field and to compute the fragmentation ratio. Practically this does not make any difference for memory profiling of Redis but speeds up the INFO call significantly.	2014-03-24 12:00:20 +01:00
antirez	30639c8ca9	sdscatvprintf(): Try to use a static buffer. For small content the function now tries to use a static buffer to avoid a malloc/free cycle that is too costly when the function is used in the context of performance critical code path such as INFO output generation. This change was verified to have positive effects in the execution speed of the INFO command.	2014-03-24 10:20:33 +01:00
antirez	d3efe04c47	Cache uname() output across INFO calls. Uname was profiled to be a slow syscall. It produces always the same output in the context of a single execution of Redis, so calling it at every INFO output generation does not make too much sense. The uname utsname structure was modified as a static variable. At the same time a static integer was added to check if we need to call uname the first time.	2014-03-24 10:00:08 +01:00
antirez	a9caca0424	sdscatvprintf(): guess buflen using format length. sdscatvprintf() uses a loop where it tries to output the formatted string in a buffer of the initial length, if there was not enough room, a buffer of doubled size is tried and so forth. The initial guess for the buffer length was very poor, a hardcoded "16". This caused the printf to be processed multiple times without a good reason. Given that printf functions are already not fast, the overhead was significant. The new heuristic is to use a buffer 4 times the length of the format buffer, and 32 as minimal size. This appears to be a good balance for typical uses of the function inside the Redis code base. This change made the INFO command 3 times faster.	2014-03-24 09:44:11 +01:00
antirez	4d2e8fa189	Use getLRUClock() instead of server.lruclock to create objects. Thanks to Matt Stancliff for noticing this error. It was in the original code but somehow I managed to remove the change from the commit...	2014-03-21 09:08:20 +01:00
antirez	5fa3248bad	The default maxmemory policy is now noeviction. This is safer as by default maxmemory should just set a memory limit without any key to be deleted, unless the policy is set to something more relaxed.	2014-03-21 08:03:34 +01:00
Jan-Erik Rediger	4fdd7a0546	Fixed a few typos.	2014-03-20 23:16:38 +01:00
antirez	a98369929e	Use 24 bits for the lru object field and improve resolution. There were 2 spare bits inside the Redis object structure that are now used in order to enlarge 4x the range of the LRU field. At the same time the resolution was improved from 10 to 1 second: this still provides 194 days before the LRU counter overflows (restarting from zero). This is not a problem since it only causes lack of eviction precision for objects not touched for a very long time, and the lack of precision is only temporary.	2014-03-20 17:56:27 +01:00
antirez	f4da796c53	Default LRU samples is now 5.	2014-03-20 17:05:42 +01:00
antirez	c641b670c3	Use new dictGetRandomKeys() API to get samples for eviction. The eviction quality degrades a bit in my tests, but since the API is faster, it allows to raise the number of samples, and overall is a win.	2014-03-20 16:52:12 +01:00
antirez	82b53c650c	struct dictEntry -> dictEntry.	2014-03-20 16:20:37 +01:00
antirez	5317f5e99a	Added dictGetRandomKeys() to dict.c: mass get random entries. This new function is useful to get a number of random entries from an hash table when we just need to do some sampling without particularly good distribution. It just jumps at a random place of the hash table and returns the first N items encountered by scanning linearly. The main usefulness of this function is to speedup Redis internal sampling of the key space, for example for key eviction or expiry.	2014-03-20 15:50:46 +01:00
antirez	22c9cfaf57	LRU eviction pool implementation. This is an improvement over the previous eviction algorithm where we use an eviction pool that is persistent across evictions of keys, and gets populated with the best candidates for evictions found so far. It allows to approximate LRU eviction at a given number of samples better than the previous algorithm used.	2014-03-20 11:57:29 +01:00
antirez	6d5790d682	Fix OBJECT IDLETIME return value converting to seconds. estimateObjectIdleTime() returns a value in milliseconds now, so we need to scale the output of OBJECT IDLETIME to seconds.	2014-03-20 11:55:18 +01:00
antirez	ad6b0f70b2	Obtain LRU clock in a resolution dependent way. For testing purposes it is handy to have a very high resolution of the LRU clock, so that it is possible to experiment with scripts running in just a few seconds how the eviction algorithms works. This commit allows Redis to use the cached LRU clock, or a value computed on demand, depending on the resolution. So normally we have the good performance of a precomputed value, and a clock that wraps in many days using the normal resolution, but if needed, changing a define will switch behavior to an high resolution LRU clock.	2014-03-20 11:47:12 +01:00
antirez	1faf82663f	Specify lruclock in redisServer structure via REDIS_LRU_BITS. The padding field was totally useless: removed.	2014-03-20 11:37:27 +01:00
antirez	d77e231682	Specify LRU resolution in milliseconds.	2014-03-20 11:33:25 +01:00
antirez	fe30847016	Set LRU parameters via REDIS_LRU_BITS define.	2014-03-20 11:22:47 +01:00
antirez	e150ec7d0c	Unify stats reset for CONFIG RESETSTAT / initServer(). Now CONFIG RESETSTAT makes sure to reset all the fields, and in the future it will be simpler to avoid missing new fields.	2014-03-19 12:55:49 +01:00
Matt Stancliff	67ed5f00aa	Cluster: remove variable causing warning. GCC-4.9 warned about this, but clang didn't. This commit fixes warning: sentinel.c: In function 'sentinelReceiveHelloMessages': sentinel.c:2156:43: warning: variable 'master' set but not used [-Wunused-but-set-variable] sentinelRedisInstance ri = c->data, master;	2014-03-18 15:35:09 -04:00
antirez	b9e90a70fa	Sentinel: sentinelRefreshInstanceInfo() minor refactoring. Test sentinel.tilt condition on top and return if it is true. This allows to remove the check for the tilt condition in the remaining code paths of the function.	2014-03-18 15:35:47 +01:00
antirez	218cc5fc39	Sentinel: propagate down-after-ms changes to slaves and sentinels.	2014-03-18 14:37:44 +01:00
antirez	bb6d850160	Sentinel: down-after-milliseconds is not master-specific. addReplySentinelRedisInstance() modified so that this field is displayed for all the kind of instances: Sentinels, Masters, Slaves.	2014-03-18 11:21:17 +01:00
antirez	ae0b7680b3	Sentinel failure detection implementation improved. Failure detection in Sentinel is ping-pong based. It used to work by remembering the last time a valid PONG reply was received, and checking if the reception time was too old compared to the current time. PINGs were sent at a fixed interval of 1 second. This works in a decent way, but does not scale well when we want to set very small values of "down-after-milliseconds" (this is the node timeout basically). This commit reimplements the failure detection making a number of changes. Some changes are inspired by Redis Cluster failure detection code: * A new last_ping_time field is added in the representation of instances. If non zero, we have an active ping that was sent at the specified time. When a valid reply to ping is received, the field is zeroed again. * last_ping_time is not reset when we reconnect the link or send a new ping, so from our point of view it represents the time we started waiting for the instance to reply to our pings without receiving a reply. * last_ping_time is now used in order to check if the instance is timed out. This means that we can have a node timeout of 100 milliseconds and yet the system will work well since the new check is not bound to the period used to send pings. * Pings are now sent every second, or more often if the value of down-after-milliseconds is less than one second. With a lower limit of 10 HZ ping frequency. * Link reconnection code was improved. This is used in order to try to reconnect the link when we are at 50% of the node timeout without a valid reply received yet. However the old code triggered unnecessary reconnections when the node timeout was very small. Now that should be ok. The new code passes the tests but more testing is needed and more unit tests stressing the failure detector, so currently this is merged only in the unstable branch.	2014-03-17 18:33:45 +01:00
antirez	3a2ff55617	Sentinel: use CLIENT SETNAME when connecting to Redis. This makes debugging / monitoring of Sentinels simpler since you can identify sentinels in CLIENT LIST output of Redis instances.	2014-03-15 14:59:23 +01:00
Matt Stancliff	584052ee6b	Fix segfault from accessing array out of bounds argc == 2; argv[2] == crash	2014-03-14 17:38:05 -04:00
antirez	ed813863f0	Sentinel: be safe under crash-recovery assumptions. Sentinel's main safety argument is that there are no two configurations for the same master with the same version (configuration epoch). For this to be true Sentinels require to be authorized by a majority. Additionally Sentinels require to do two important things: * Never vote again for the same epoch. * Never exchange an old vote for a fresh one. The first prerequisite, in a crash-recovery system model, requires to persist the master->leader_epoch on durable storage before to reply to messages. This was not the case. We also make sure to persist the current epoch in order to never reply to stale votes requests from other Sentinels, after a recovery. The configuration is persisted by making use of fsync(), this is considered in the context of this code a good enough guarantee that after a restart our durable state is restored, however this may not always be the case depending on the kind of hardware and operating system used.	2014-03-14 14:58:44 +01:00
antirez	365094028b	Sentinel: fake PUBLISH command to receive HELLO messages. Now the way HELLO messages are received is unified. It is no longer necessary for Sentinels to be able to chat via some Redis instance in order to converge to the higher configuration for a master: they are able to directly exchange configurations. Note that this commit does not include the (trivial) change needed to send HELLO messages to Sentinel instances as well, since by mistake I committed that change in the previous commit that refactored hello messages processing into a separated function.	2014-03-14 11:07:42 +01:00
antirez	9dfe426fc8	Sentinel: HELLO processing refactored into sentinelProcessHelloMessage().	2014-03-14 11:07:42 +01:00
antirez	133fccb03f	Cluster: flag the transaction as dirty for the new redirections.	2014-03-13 15:11:53 +01:00
antirez	429aff4ef4	Linenoise updated, multiline mode enabled in redis-cli.	2014-03-13 15:11:08 +01:00
antirez	cc11d103c0	redis-trib: call MIGRATE via r.client.call as fix for redis-rb API changes. See issue #1593. Thanks to @badboy for suggesting the direct client.call fix.	2014-03-11 16:10:13 +01:00
antirez	df32eb6827	redis-trib: new subcommand 'call'. Exec command in all nodes. Example: ./redis-trib.rb call 192.168.1.11:7000 config get cluster-node-timeout	2014-03-11 14:58:55 +01:00
antirez	2e5c394fa8	redis-trib: create subcommand is now able to assign spare slaves. Example: if the user will try to configure a cluster with 9 nodes, asking for 1 slave for master, redis-trib will configure a 4 masters cluster with 1 slave each as usually, but this time will assign the spare node as a slave of one of the masters.	2014-03-11 14:17:28 +01:00
antirez	e26f4486b0	Cluster: update node configEpoch on UPDATE messages. The UPDATE message contains the configEpoch of the node configuration advertised in the packet. Update it if needed.	2014-03-11 11:53:09 +01:00
antirez	a2ff90919f	Cluster: set slot error if we receive an update for a busy slot. By manually modifying nodes configurations in random ways, it is possible to create the following scenario: A is serving keys for slot 10 B is manually configured to serve keys for slot 10 A receives an update from B (or another node) where it is informed that the slot 10 is now claimed by B with a greater configuration epoch, however A still has keys from slot 10. With this commit A will put the slot in error setting it in IMPORTING state, so that redis-trib can detect the issue.	2014-03-11 11:49:47 +01:00
antirez	1ed0ad77f0	Cluster: clarified a comment in clusterUpdateSlotsConfigWith().	2014-03-11 11:32:40 +01:00
antirez	8287945ff8	Cluster: flush importing/migrating state when master is turned into slave.	2014-03-11 11:22:06 +01:00
antirez	2e8e0ad44e	Cluster: clusterCloseAllSlots() added.	2014-03-11 11:16:18 +01:00
antirez	8eae54aa1e	DEBUG ERROR implemented. The new "error" subcommand of the DEBUG command can reply with an user selected error, specified as its sole argument: DEBUG ERROR "LOADING please wait..." The error is generated just prefixing the command argument with a "-" character, and replacing newlines with spaces (since error replies can't include newlines). The goal of the command is to help in Client libraries unit tests by making simple to simulate a command call triggering a given error.	2014-03-10 23:01:55 +01:00
antirez	2705306ba1	DEBUG CMDKEYS: provide some guarantee to getKeysFromCommand(). getKeysFromCommand() is designed to be called with the command arguments passing the basic arity checks described in the command table. DEBUG CMDKEYS must provide the same guarantees for calling getKeysFromCommand() to be safe.	2014-03-10 16:43:38 +01:00
antirez	5b864617bc	Cluster: make sortGetKeys() able to handle multiple STORE options. It does not make sense to pass multiple store options, so, better to handle it ;-)	2014-03-10 16:39:07 +01:00
antirez	c4ef1d6494	DEBUG CMDKEYS added for getKeysFromCommand() testing. Examples: redis 127.0.0.1:6379> debug cmdkeys set foo bar 1) "foo" redis 127.0.0.1:6379> debug cmdkeys mget a b c 1) "a" 2) "b" 3) "c" redis 127.0.0.1:6379> debug cmdkeys zunionstore foo 2 a b 1) "a" 2) "b" 3) "foo" redis 127.0.0.1:6379> debug cmdkeys ping (empty list or set)	2014-03-10 16:36:08 +01:00
antirez	3e1d772677	Cluster: don't allow BY option of SORT as well. There is the exception of a "constant" BY pattern that is used in order to signal to don't sort at all. In this case no lookup is needed so it is possible to support this case in Cluster mode.	2014-03-10 16:28:18 +01:00
antirez	04cf02e8dc	Cluster: SORT get keys helper implemented.	2014-03-10 16:26:08 +01:00
antirez	21765c8588	Cluster: evalGetKeys() fixed: was not setting keys count.	2014-03-10 16:23:42 +01:00
antirez	03344196f3	Cluster: don't allow GET option in cluster mode. The commit also refactors a bit the error handling during SORT option parsing.	2014-03-10 16:10:50 +01:00
antirez	8caecc9ab4	Fixed memory leak in SORT LIMIT option argument parsing on error.	2014-03-10 15:44:41 +01:00
antirez	ef5e7fbaa2	Cluster: getKeysFromCommand() top comment improved.	2014-03-10 15:31:01 +01:00
antirez	c0e818ab08	Cluster: evalGetKey() added for EVAL/EVALSHA. Previously we used zunionInterGetKeys(), however after this function was fixed to account for the destination key (not needed when the API was designed for "diskstore") the two set of commands can no longer be served by an unique keys-extraction function.	2014-03-10 15:26:13 +01:00
antirez	caf7b9b425	Cluster: getKeysFromCommand() and related: top-comments added.	2014-03-10 15:24:38 +01:00
antirez	787b297046	Cluster: getKeysFromCommand() API cleaned up. This API originated from the "diskstore" experiment, not for Redis Cluster itself, so there were legacy/useless things trying to differentiate between keys that are going to be overwritten and keys that need to be fetched from disk (preloaded). All useless with Cluster, so removed with the result of code simplification.	2014-03-10 13:18:41 +01:00
antirez	55b88e0044	Cluster: some zunionInterGetKeys() comment trimmed. Everything was pretty clear again from the initial statements.	2014-03-10 11:43:56 +01:00
Salvatore Sanfilippo	aca6cb529b	Merge pull request #1586 from mattsta/fix-zunioninterstorekeys Fix key extraction for z{union,inter}store	2014-03-10 11:39:45 +01:00
antirez	c1a7d3e61f	Cluster: abort on port too high error. It also fixes multi-line comment style to be consistent with the rest of the code base. Related to #1555.	2014-03-10 10:41:27 +01:00
Salvatore Sanfilippo	442b06db54	Merge pull request #1555 from mattsta/cluster-port-error-out Cluster port error out	2014-03-10 10:37:50 +01:00
antirez	ed8c55237b	Cluster: be explicit about passing NULL as bind addr for connect. The code was already correct but it relied on bindaddr[0] being set to NULL as a side effect of the current implementation if no bind address is configured. This is not guaranteed to hold true in the future.	2014-03-10 10:33:53 +01:00
antirez	3e8a92ef8d	Cluster: log error when anetTcpNonBlockBindConnect() fails.	2014-03-10 10:32:28 +01:00
Salvatore Sanfilippo	3b0edb80ec	Merge pull request #1567 from mattsta/fix-cluster-join Bind source address for cluster communication	2014-03-10 10:28:32 +01:00
antirez	0f1f25784f	Cluster: better timeout and retry time for failover. When node-timeout is too small, in the order of a few milliseconds, there is no way the voting process can terminate during that time, so we set a lower limit for the failover timeout of two seconds. The retry time is set to two times the failover timeout time, so it is at least 4 seconds.	2014-03-10 09:57:52 +01:00
Matt Stancliff	f0782a6e86	Fix key extraction for z{union,inter}store The previous implementation wasn't taking into account the storage key in position 1 being a requirement (it was only counting the source keys in positions 3 to N). Fixes antirez/redis#1581	2014-03-07 16:33:20 -05:00
antirez	6984692060	Cluster: fix conditional generating TRYAGAIN error.	2014-03-07 16:18:00 +01:00
antirez	36676c2318	Redis Cluster: support for multi-key operations.	2014-03-07 13:19:09 +01:00
Salvatore Sanfilippo	bbf39b7a3a	Merge pull request #1576 from Hailei/fix-lruidletime-comment Fix REDIS_LRU_CLOCK_MAX's value	2014-03-06 18:14:36 +01:00
antirez	b74c899da3	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-03-06 18:06:30 +01:00
Matt Stancliff	e8bae92e54	Reset op_sec_last_sample_ops when reset requested This value needs to be set to zero (in addition to stat_numcommands) or else people may see a negative operations per second count after they run CONFIG RESETSTAT. Fixes antirez/redis#1577	2014-03-06 18:00:08 +01:00
Matt Stancliff	385c25f70f	Remove redundant IP length definition REDIS_CLUSTER_IPLEN had the same value as REDIS_IP_STR_LEN. They were both #define'd to the same INET6_ADDRSTRLEN.	2014-03-06 17:55:43 +01:00
Matt Stancliff	d2040ab9b1	Remove some redundant code Function nodeIp2String in cluster.c is exactly anetPeerToString with a pre-extracted fd.	2014-03-06 17:55:39 +01:00
Matt Stancliff	59cf0b1902	Fix return value check for anetTcpAccept anetTcpAccept returns ANET_ERR, not AE_ERR. This isn't a physical error since both ANET_ERR and AE_ERR are -1, but better to be consistent.	2014-03-06 17:55:31 +01:00
Salvatore Sanfilippo	54e99fb226	Merge pull request #1578 from badboy/patch-5 Small typo fixed	2014-03-06 17:40:04 +01:00
antirez	9b401819c0	Cast saveparams[].seconds to long for %ld format specifier.	2014-03-05 11:26:18 +01:00
Jan-Erik Rediger	5f5118bdad	Small typo fixed	2014-03-05 00:41:02 +01:00
Matt Stancliff	e5b1e7be64	Bind source address for cluster communication The first address specified as a bind parameter (server.bindaddr[0]) gets used as the source IP for cluster communication. If no bind address is specified by the user, the behavior is unchanged. This patch allows multiple Redis Cluster instances to communicate when running on the same interface of the same host.	2014-03-04 17:36:45 -05:00
antirez	47750998a6	Sentinel: more aggressive failover start desynchronization. Sentinel needs to avoid split brain conditions due to multiple sentinels trying to get voted at the exact same time. So far some desynchronization was provided by fluctuating server.hz, that is the frequency of the timer function call. However the desynchronization provided in this way was not enough when using many Sentinel instances, especially when a large quorum value is used in order to force a greater degree of agreement (more than N/2+1). It was verified that it was likely to trigger a split brain condition, forcing the system to try again after a timeout. Usually the system will succeed after a few retries, but this is not optimal. This commit desynchronizes instances in a more effective way to make it likely that the first attempt will be successful.	2014-03-04 17:09:36 +01:00
antirez	08da025f56	CONFIG REWRITE should be logged at WARNING level.	2014-03-04 16:39:47 +01:00
zhanghailei	138695d990	refer to updateLRUClock's comment REDIS_LRU_CLOCK_MAX is 22 bits,but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) only 21 bits	2014-03-04 12:20:31 +08:00
zhanghailei	c0f8665414	FIXED a typo more thank should be more than	2014-03-04 11:21:34 +08:00
zhanghailei	4b9ac6edd0	According to context,the size should be 16 rather than 64	2014-03-04 11:21:34 +08:00
antirez	c5edd91716	Cluster: invalidate current transaction on redirections.	2014-03-03 17:11:51 +01:00
antirez	e41a3edfab	Merge branch 'cli_improved_bigkeys' of git://github.com/michael-grunder/redis into unstable	2014-03-03 11:20:54 +01:00
antirez	12a88d575d	Document why we update peak memory in INFO.	2014-03-03 11:19:54 +01:00
antirez	0c1bb1313c	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-03-03 11:17:37 +01:00
antirez	8dea2029a4	Fix configEpoch assignment when a cluster slot gets "closed". This is still code to rework in order to use agreement to obtain a new configEpoch when a slot is migrated, however this commit handles the special case that happens when the nodes are just started and everybody has a configEpoch of 0. In this special condition having the maximum configEpoch is not enough as the special epoch 0 is not unique (all the others are). This does not fix the intrinsic race condition of a failover happening while we are resharding, that will be addressed later.	2014-03-03 11:12:11 +01:00
Matt Stancliff	f1c9a203b2	Force INFO used_memory_peak to match peak memory used_memory_peak only updates in serverCron every server.hz, but Redis can use more memory and a user can request memory INFO before used_memory_peak gets updated in the next cron run. This patch updates used_memory_peak to the current memory usage if the current memory usage is higher than the recorded used_memory_peak value. (And it only calls zmalloc_used_memory() once instead of twice as it was doing before.)	2014-02-28 17:47:41 -05:00
antirez	a89c8bb87c	Sentinel test: Makefile target added.	2014-02-28 16:00:00 +01:00
michael-grunder	806788d009	Improved bigkeys with progress, pipelining and summary This commit reworks the redis-cli --bigkeys command to provide more information about our progress as well as output summary information when we're done. - We now show an approximate percentage completion as we go - Hiredis pipelining is used for TYPE and SIZE retrieval - A summary of keyspace distribution and overall breakout at the end	2014-02-27 12:01:57 -08:00
antirez	76a6e82d89	warnigns -> warnings in redisBitpos().	2014-02-27 13:17:23 +01:00
antirez	0e31eaa27f	More consistent BITPOS behavior with bit=0 and ranges. With the new behavior it is possible to specify just the start in the range (the end will be assumed to be the last byte), or it is possible to specify both start and end. This is useful to change the behavior of the command when looking for zeros inside a string. 1) If the user specifies both start and end, and no 0 is found inside the range, the command returns -1. 2) If instead no range is specified, or just the start is given, even if in the actual string no 0 bit is found, the command returns the first bit on the right after the end of the string. So for example if the string stored at key foo is "\xff\xff": BITPOS foo (returns 16) BITPOS foo 0 -1 (returns -1) BITPOS foo 0 (returns 16) The idea is that when no end is given the user is just looking for the first bit that is zero and can be set to 1 with SETBIT, as it is "available". Instead when a specific range is given, we just look for a zero within the boundaries of the range.	2014-02-27 12:53:03 +01:00
antirez	38c620b3b5	Initial implementation of BITPOS. It appears to work but more stress testing, and both unit tests and fuzzy testing, is needed in order to ensure the implementation is sane.	2014-02-27 12:44:27 +01:00
antirez	addd4de9c1	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-02-27 10:14:03 +01:00
antirez	746ce35f5f	Fix misaligned word access in redisPopcount().	2014-02-27 09:46:20 +01:00
Matt Stancliff	d769cad4bf	Fix IP representation in clusterMsgDataGossip	2014-02-25 16:02:28 -05:00
antirez	55e36e1132	Merge branch 'bigkeys_scan' of git://github.com/michael-grunder/redis into unstable	2014-02-25 14:59:57 +01:00
michael-grunder	013a4ce242	Update --bigkeys to use SCAN This commit changes the findBigKeys() function in redis-cli.c to use the new SCAN command for iterating the keyspace, rather than RANDOMKEY. Because we can know when we're done using SCAN, it will exit after exhausting the keyspace.	2014-02-25 05:41:30 -08:00
antirez	a2c76ffb1c	redis-cli: also remove useless uint8_t.	2014-02-25 13:47:37 +01:00
antirez	ba993cc685	redis-cli: don't use uint64_t where actually not needed. The computation is just something to keep the CPU busy, no need to use a specific type. Since stdint.h was not included this prevented compilation on certain systems.	2014-02-25 13:44:31 +01:00
antirez	5580350a7b	redis-cli: check argument existence for --pattern.	2014-02-25 12:38:29 +01:00
antirez	c1d67ea9b4	redis-cli: --intrinsic-latency run mode added.	2014-02-25 12:37:52 +01:00
antirez	dcac007b81	redis-cli: added comments to split program in parts.	2014-02-25 12:24:45 +01:00
antirez	b15411df98	Sentinel: log quorum with +monitor event.	2014-02-24 17:10:20 +01:00
antirez	6b373edb77	Sentinel: generate +monitor events at startup.	2014-02-24 16:33:55 +01:00
antirez	3b7a757468	Sentinel: log +monitor and +set events. Now that we have a runtime configuration system, it is very important to be able to log how the Sentinel configuration changes over time because of API calls.	2014-02-24 16:33:43 +01:00
antirez	25cebf7285	Sentinel: added missing exit(1) after checking for config file.	2014-02-24 16:22:52 +01:00
Salvatore Sanfilippo	e163332858	Merge pull request #1545 from mattsta/fix-redis-cli-sync Deny SYNC and PSYNC in redis-cli	2014-02-23 17:47:28 +01:00
antirez	b1c1386374	Sentinel: IDONTKNOW error removed. This error was conceived for the older version of Sentinel that worked via master redirection and that was not able to get configuration updates from other Sentinels via the Pub/Sub channel of masters or slaves. This reply does not make sense today, every Sentinel should reply with the best information it has currently. The error will make even more sense in the future since the plan is to allow Sentinels to update the configuration of other Sentinels via gossip with a direct chat without the prerequisite that they have at least a monitored instance in common.	2014-02-22 17:34:46 +01:00
Matt Stancliff	2c273e3591	Add cluster or sentinel to proc title If you launch redis with `redis-server --sentinel` then in a ps, your output only says "redis-server IP:Port" — this patch changes the proc title to include [sentinel] or [cluster] depending on the current server mode: e.g. "redis-server IP:Port [sentinel]" "redis-server IP:Port [cluster]"	2014-02-20 23:58:54 -05:00
antirez	7d7b3810e7	Sentinel: report instances role switch events. This is useful mostly for debugging of issues.	2014-02-20 12:13:52 +01:00
Matt Stancliff	ce68caea37	Cluster: error out quicker if port is unusable The default cluster control port is 10,000 ports higher than the base Redis port. If Redis is started on a too-high port, Cluster can't start and everything will exit later anyway.	2014-02-19 17:30:07 -05:00
Matt Stancliff	b20ae393f1	Fix "can't bind to address" error reporting. Report the actual port used for the listening attempt instead of server.port. Originally, Redis would just listen on server.port. But, with clustering, Redis uses a Cluster Port too, so we can't say server.port is always where we are listening. If you tried to launch Redis with a too-high port number (any port where Port+10000 > 65535), Redis would refuse to start, but only print an error saying it can't connect to the Redis port. This patch removes much confusion.	2014-02-19 17:26:33 -05:00
antirez	7cec9e48ce	Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT Rename define to match the new meaning.	2014-02-18 10:27:38 +01:00
antirez	18b8bad53c	Sentinel: fix slave promotion timeout. If we can't reconfigure a slave in time during failover, go forward anyway, as the slave will be fixed by Sentinels in the future, once they detect it is misconfigured. Otherwise a failover in progress may never terminate if for some reason the slave is unable to sync with the master while at the same time it is not disconnected.	2014-02-18 08:50:57 +01:00
antirez	ede33fb912	Get absolute config file path before processing 'dir'. The code tried to obtain the configuration file absolute path after processing the configuration file. However if config file was a relative path and a "dir" statement was processed reading the config, the absolute path obtained was wrong. With this fix the absolute path is obtained before processing the configuration while the server is still in the original directory where it was executed.	2014-02-17 16:44:53 +01:00
antirez	e1b77b61f3	Sentinel: better specify startup errors due to config file. Now it logs the file name if it is not accessible. Also there is a different error for the missing config file case, and for the non writable file case.	2014-02-17 16:44:49 +01:00
antirez	51bd9da1fd	Update cached time in rdbLoad() callback. server.unixtime and server.mstime are cached less precise timestamps that we use every time we don't need an accurate time representation and a syscall would be too slow for the number of calls we require. Such an example is the initialization and update process of the last interaction time with the client, that is used for timeouts. However rdbLoad() can take some time to load the DB, but at the same time it did not update the time during DB loading. This resulted in the bug described in issue #1535, where in the replication process the slave loads the DB, creates the redisClient representation of its master, but the timestamp is so old that the master, under certain conditions, is sensed as already "timed out". Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and analysis.	2014-02-13 15:13:26 +01:00
antirez	7e8abcf693	Log when CONFIG REWRITE goes bad.	2014-02-13 14:32:44 +01:00
antirez	21e6b0fbe9	Fix script cache bug in the scripting engine. This commit fixes a serious Lua scripting replication issue, described by Github issue #1549. The root cause of the problem is that scripts were put inside the script cache, assuming that slaves and AOF already contained it, even if the scripts sometimes produced no changes in the data set, and were not actually propagated to AOF/slaves. Example: eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0 Then: evalsha <sha1 step 1 script> 1 0 At this step sha1 of the script is added to the replication script cache (the script is marked as known to the slaves) and EVALSHA command is transformed to EVAL. However it is not dirty (there are no changes to db), so it is not propagated to the slaves. Then the script is called again: evalsha <sha1 step 1 script> 1 1 At this step master checks that the script already exists in the replication script cache and doesn't transform it to EVAL command. It is dirty and propagated to the slaves, but they fail to evaluate the script as they don't have it in the script cache. The fix is trivial and just uses the new API to force the propagation of the executed command regardless of the dirty state of the data set. Thank you to @minus-infinity on Github for finding the issue, understanding the root cause, and fixing it.	2014-02-13 12:10:43 +01:00
antirez	fc08c8599f	AOF write error: retry with a frequency of 1 hz.	2014-02-12 16:27:59 +01:00
antirez	fe8352540f	AOF: don't abort on write errors unless fsync is 'always'. A system similar to the RDB write error handling is used, in which when we can't write to the AOF file, writes are no longer accepted until we are able to write again. For fsync == always we still abort on errors since there is currently no easy way to avoid replying with success to the user otherwise, and this would violate the contract with the user of only acknowledging data already secured on disk.	2014-02-12 16:11:36 +01:00
antirez	db6d628c3e	Cluster: clusterDelNode(): remove node from master's slaves.	2014-02-11 10:34:25 +01:00
antirez	5e0e03be41	Cluster: UPDATE messages are the norm and verbose. Logging them at WARNING level was of little utility and of sure disturb.	2014-02-11 10:18:24 +01:00
antirez	8251d2d150	Cluster: redis-trib fix: handling of another trivial case.	2014-02-11 10:13:18 +01:00
antirez	4a64286c36	Cluster: configEpoch assignment in SETNODE improved. Avoid to trash a configEpoch for every slot migrated if this node has already the max configEpoch across the cluster. Still work to do in this area but this avoids both ending with a very high configEpoch without any reason and to flood the system with fsyncs.	2014-02-11 10:09:17 +01:00
antirez	72f7abf6a2	Cluster: clusterSetStartupEpoch() made more generally useful. The actual goal of the function was to get the max configEpoch found in the cluster, so make it general by removing the assignment of the max epoch to currentEpoch that is useful only at startup.	2014-02-11 10:00:14 +01:00
antirez	44f7afe28a	Cluster: always increment the configEpoch in SETNODE after import. Removed a stale conditional preventing the configEpoch from incrementing after the import in certain conditions. Since the master got a new slot it should always claim a new configuration.	2014-02-11 09:50:37 +01:00
antirez	a1349728ea	Cluster: on resharding upgrade version of receiving node. The node receiving the hash slot needs to have a version that wins over the other versions in order to force the ownership of the slot. However the current code is far from perfect since a failover can happen during the manual resharding. The fix is a work in progress but the bottom line is that the new version must either be voted as usually, set by redis-trib manually after it makes sure can't be used by other nodes, or reserved configEpochs could be used for manual operations (for example odd versions could be never used by slaves and are always used by CLUSTER SETSLOT NODE).	2014-02-11 00:36:05 +01:00
antirez	6dc26795aa	Cluster: fsync at every SETSLOT command puts too much pressure on disks. During slots migration redis-trib can send a number of SETSLOT commands. Fsyncing every time is a bit too much in production as verified empirically. To make sure configs are fsynced on all nodes after a resharding redis-trib may send something like CLUSTER CONFSYNC. In this case fsyncs were not providing too much value since anyway processes can crash in the middle of the resharding of a hash slot, and redis-trib should be able to recover from this condition anyway.	2014-02-10 23:54:08 +01:00
antirez	218358bbbd	Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed. If the slot is manually assigned to another node, clear the migrating status regardless of the fact it was previously assigned to us or not, as long as we no longer have keys for this slot. This avoids a race during slots migration that may leave the slot in migrating status in the source node, since it received an update message from the destination node that is already claiming the slot. This way we are sure that redis-trib at the end of the slot migration is always able to close the slot correctly.	2014-02-10 23:51:47 +01:00
antirez	3107e7ca60	Cluster: remove debugging xputs from redis-trib.	2014-02-10 19:14:05 +01:00
antirez	1ae50a9b1d	Cluster: redis-trib fix: cover new case of open slot. The case is the trivial one of a single node claiming the slot as migrating, without nodes claiming it as importing.	2014-02-10 19:10:23 +01:00
antirez	59e03a8f35	redis-trib: log event after we have reference to 'master'.	2014-02-10 18:48:40 +01:00
antirez	bf670e0745	Cluster: don't update slave's master if we don't know it. There is no way we can update the slave's node->slaveof pointer if we don't know the master (no node with such an ID in our tables).	2014-02-10 18:33:34 +01:00
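
As an illustration of the BITPOS semantics described in commit 0e31eaa27f above, here is a minimal Python sketch (not the Redis C implementation). The function name `bitpos` and its signature are hypothetical; unlike the examples quoted in the commit message, it takes the bit to search for as an explicit argument. It assumes bits are numbered from the most significant bit of the first byte, as with SETBIT, and that negative indexes count from the end of the string. Returning -1 for an empty range is an assumption of this sketch, not something the commit message states.

```python
from typing import Optional


def bitpos(value: bytes, bit: int,
           start: Optional[int] = None, end: Optional[int] = None) -> int:
    """Hypothetical model of BITPOS as described in commit 0e31eaa27f."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")

    n = len(value)
    # With no range, or only a start, the search runs to the last byte.
    s = 0 if start is None else start
    e = n - 1 if end is None else end

    # Negative indexes count from the end of the string (assumption).
    if start is not None and s < 0:
        s = max(n + s, 0)
    if end is not None and e < 0:
        e = max(n + e, 0)
    e = min(e, n - 1)

    # Empty range: nothing to search (assumption of this sketch).
    if s > e:
        return -1

    for byte_index in range(s, e + 1):
        byte = value[byte_index]
        for offset in range(8):
            # Bit 0 of a byte is its most significant bit.
            if (byte >> (7 - offset)) & 1 == bit:
                return byte_index * 8 + offset

    # No match. When looking for a 0 without an explicit end, report the
    # first bit after the end of the string, since it is "available".
    if bit == 0 and end is None:
        return n * 8
    return -1
```

With the value "\xff\xff" from the commit message, `bitpos(b"\xff\xff", 0)` and `bitpos(b"\xff\xff", 0, 0)` return 16, while `bitpos(b"\xff\xff", 0, 0, -1)` returns -1 because an explicit end restricts the search to the given range.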

2922 Commits