redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00

Author	SHA1	Message	Date
antirez	929b6a4480	Cluster: cluster stuff moved from redis.h to cluster.h.	2013-10-09 15:38:05 +02:00
antirez	7afc0dd59a	Cluster: new clusterDoBeforeSleep() API. The new API is able to remember operations to perform before returning to the event loop, such as checking if there is the failover quorum for a slave, save and fsync the configuraiton file, and so forth. Because this operations are performed before returning on the event loop we are sure that messages that are sent in the same event loop run will be delivered after the configuration is already saved, that is a requirement sometimes. For instance we want to publish a new epoch only when it is already stored in nodes.conf in order to avoid returning back in the logical clock when a node is restarted. This new API provides a big performance advantage compared to saving and possibly fsyncing the configuration file multiple times in the same event loop run, especially in the case of big clusters with tens or hundreds of nodes.	2013-10-03 09:58:06 +02:00
antirez	6c4d904baf	Cluster: bus messages stats in CLUSTER info.	2013-10-02 10:10:08 +02:00
antirez	1dedf9aa36	Cluster: time field removed from cluster messages header. The new algorithm does not check replies time as checking for the currentEpoch in the reply ensures that the reply is about the current election process.	2013-09-30 16:19:44 +02:00
antirez	7c4b8f29e7	Cluster: react faster when a slave wins an election.	2013-09-26 16:54:43 +02:00
antirez	a445aa30a0	Cluster: master node now uses new protocol to vote.	2013-09-26 13:00:41 +02:00
antirez	fb9b76fe14	Cluster: slave node now uses the new protocol to get elected.	2013-09-26 11:13:17 +02:00
antirez	12483b0061	Cluster: configEpoch added in cluster nodes description.	2013-09-25 11:47:13 +02:00
antirez	925ea9f858	Cluster: added time field in cluster bus messages. The time is sent in requests, and copied back in reply packets. This way the receiver can compare the time field in a reply with its local clock and check the age of the request associated with this reply. This is an easy way to discard delayed replies. Note that only a clock is used here, that is the one of the node sending the packet. The receiver only copies the field back into the reply, so no synchronization is needed between clocks of different hosts.	2013-09-20 09:22:21 +02:00
antirez	72587e6cc5	Cluster: free HANDSHAKE nodes after node_timeout. Handshake nodes should turn into normal nodes or be freed in a reasonable amount of time, otherwise they'll keep accumulating if the address they are associated with is not reachable for some reason.	2013-09-04 12:41:21 +02:00
antirez	81a6a9639a	Use listenToPort() in cluster.c as well.	2013-08-22 14:05:07 +02:00
antirez	89ffba9133	Replication: better way to send a preamble before RDB payload. During the replication full resynchronization process, the RDB file is transfered from the master to the slave. However there is a short preamble to send, that is currently just the bulk payload length of the file in the usual Redis form $..length..<CR><LF>. This preamble used to be sent with a direct write call, assuming that there was alway room in the socket output buffer to hold the few bytes needed, however this does not scale in case we'll need to send more stuff, and is not very robust code in general. This commit introduces a more general mechanism to send a preamble up to 2GB in size (the max length of an sds string) in a non blocking way.	2013-08-12 10:29:14 +02:00
antirez	112fa47978	Add per-db average TTL information in INFO output. Example: db0:keys=221913,expires=221913,avg_ttl=655 The algorithm uses a running average with only two samples (current and previous). Keys found to be expired are considered at TTL zero even if the actual TTL can be negative. The TTL is reported in milliseconds.	2013-08-06 15:00:43 +02:00
antirez	6500fabfb8	Some activeExpireCycle() refactoring.	2013-08-06 12:55:49 +02:00
antirez	b09ea1bd90	Draft #1 of a new expired keys collection algorithm. The main idea here is that when we are no longer to expire keys at the rate the are created, we can't block more in the normal expire cycle as this would result in too big latency spikes. For this reason the commit introduces a "fast" expire cycle that does not run for more than 1 millisecond but is called in the beforeSleep() hook of the event loop, so much more often, and with a frequency bound to the frequency of executed commnads. The fast expire cycle is only called when the standard expiration algorithm runs out of time, that is, consumed more than REDIS_EXPIRELOOKUPS_TIME_PERC of CPU in a given cycle without being able to take the number of already expired keys that are yet not collected to a number smaller than 25% of the number of keys. You can test this commit with different loads, but a simple way is to use the following: Extreme load with pipelining: redis-benchmark -r 100000000 -n 100000000 \ -P 32 set ele:rand:000000000000 foo ex 2 Remove the -P32 in order to avoid the pipelining for a more real-world load. In another terminal tab you can monitor the Redis behavior with: redis-cli -i 0.1 -r -1 info keyspace and redis-cli --latency-history Note: this commit will make Redis printing a lot of debug messages, it is not a good idea to use it in production.	2013-08-05 12:05:22 +02:00
antirez	894eba07c8	Introduction of a new string encoding: EMBSTR Previously two string encodings were used for string objects: 1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds stirng. 2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer is casted to a long. This commit introduces a experimental new encoding called REDIS_ENCODING_EMBSTR that implements an object represented by an sds string that is not modifiable but allocated in the same memory chunk as the robj structure itself. The chunk looks like the following: +--------------+-----------+------------+--------+----+ \| robj data... \| robj->ptr \| sds header \| string \| \0 \| +--------------+-----+-----+------------+--------+----+ \| ^ +-----------------------+ The robj->ptr points to the contiguous sds string data, so the object can be manipulated with the same functions used to manipulate plan string objects, however we need just on malloc and one free in order to allocate or release this kind of objects. Moreover it has better cache locality. This new allocation strategy should benefit both the memory usage and the performances. A performance gain between 60 and 70% was observed during micro-benchmarks, however there is more work to do to evaluate the performance impact and the memory usage behavior.	2013-07-22 10:31:38 +02:00
yoav	63d15dfc87	Chunked loading of RDB to prevent redis from stalling reading very large keys.	2013-07-16 15:41:24 +02:00
antirez	cf1579a798	SORT ALPHA: use collation instead of binary comparison. Note that we only do it when STORE is not used, otherwise we want an absolutely locale independent and binary safe sorting in order to ensure AOF / replication consistency. This is probably an unexpected behavior violating the least surprise rule, but there is currently no other simple / good alternative.	2013-07-12 12:02:36 +02:00
antirez	81e55ec0f3	Fixed compareStringObject() and introduced collateStringObject(). compareStringObject was not always giving the same result when comparing two exact strings, but encoded as integers or as sds strings, since it switched to strcmp() when at least one of the strings were not sds encoded. For instance the two strings "123" and "123\x00456", where the first string was integer encoded, would result into the old implementation of compareStringObject() to return 0 as if the strings were equal, while instead the second string is "greater" than the first in a binary comparison. The same compasion, but with "123" encoded as sds string, would instead return a value < 0, as it is correct. It is not impossible that the above caused some obscure bug, since the comparison was not always deterministic, and compareStringObject() is used in the implementation of skiplists, hash tables, and so forth. At the same time, collateStringObject() was introduced by this commit, so that can be used by SORT command to return sorted strings usign collation instead of binary comparison. See next commit.	2013-07-12 11:56:52 +02:00
antirez	d0001fe810	getClientPeerId() refactored into two functions.	2013-07-09 15:46:34 +02:00
antirez	e4c019e7a8	getClientPeerId() now reports errors. We now also use it in CLIENT KILL implementation.	2013-07-09 15:28:30 +02:00
antirez	5cdc5da990	getClientPeerID introduced. The function returns an unique identifier for the client, as ip:port for IPv4 and IPv6 clients, or as path:0 for Unix socket clients. See the top comment in the function for more info.	2013-07-09 12:49:20 +02:00
Geoff Garside	6181455ac6	Update REDIS_CLUSTER_IPLEN to INET6_ADDRSTRLEN. Change REDIS_CLUSTER_IPLEN to INET6_ADDRSTRLEN so that the clusterNode ip character buffer is big enough to hold an IPv6 address.	2013-07-08 15:57:23 +02:00
Geoff Garside	9cfa02fe73	Add macro to define clusterNode.ip buffer size. Add REDIS_CLUSTER_IPLEN macro to define the size of the clusterNode ip character array. Additionally use this macro in inet_ntop(3) calls where the size of the array was being defined manually. The REDIS_CLUSTER_IPLEN is defined as INET_ADDRSTRLEN which defines the correct size of a buffer to store an IPv4 address in. The INET_ADDRSTRLEN macro itself is defined in the <netinet/in.h> header file and should be portable across the majority of systems.	2013-07-08 15:55:39 +02:00
antirez	98eecb70eb	Binding multiple IPs done properly with multiple sockets.	2013-07-05 11:47:20 +02:00
antirez	90b0d66cce	Ability to bind multiple addresses.	2013-07-04 18:50:15 +02:00
antirez	de9a221749	CONFIG SET maxclients.	2013-06-28 17:08:03 +02:00
antirez	13585dd677	function renamed: popcount_binary -> redisPopcount.	2013-06-26 15:19:06 +02:00
Salvatore Sanfilippo	bae60ede1d	Merge pull request #1111 from yamt/netbsd3 netbsd support	2013-06-26 06:17:02 -07:00
antirez	8ca265cdb7	Don't disconnect pre PSYNC replication clients for timeout. Clients using SYNC to replicate are older implementations, such as redis-cli --slave, and are not designed to acknowledge the master with REPLCONF ACK commands, so we don't have any feedback and should not disconnect them on timeout.	2013-06-26 10:11:20 +02:00
antirez	f0bf5fd8c7	Use the RSC to replicate EVALSHA unmodified. This commit uses the Replication Script Cache in order to avoid translating EVALSHA into EVAL whenever possible for both the AOF and slaves.	2013-06-24 18:57:31 +02:00
antirez	94ec7db470	Replication of scripts as EVALSHA: sha1 caching implemented. This code is only responsible to take an LRU-evicted fixed length cache of SHA1 that we are sure all the slaves received. In this commit only the implementation is provided, but the Redis core does not use it to actually send EVALSHA to slaves when possible.	2013-06-24 10:26:04 +02:00
antirez	515a26bbc1	New API to force propagation. The old REDIS_CMD_FORCE_REPLICATION flag was removed from the implementation of Redis, now there is a new API to force specific executions of a command to be propagated to AOF / Replication link: void forceCommandPropagation(int flags); The new API is also compatible with Lua scripting, so a script that will execute commands that are forced to be propagated, will also be propagated itself accordingly even if no change to data is operated. As a side effect, this new design fixes the issue with scripts not able to propagate PUBLISH to slaves (issue #873).	2013-06-21 12:07:53 +02:00
antirez	455563faec	PUBSUB command implemented. Currently it implements three subcommands: PUBSUB CHANNELS [<pattern>] List channels with non-zero subscribers. PUBSUB NUMSUB [channel_1 ...] List number of subscribers for channels. PUBSUB NUMPAT Return number of subscribed patterns.	2013-06-20 15:32:00 +02:00
antirez	dfc98dccf4	Cluster: detect nodes address change.	2013-06-12 10:50:07 -07:00
antirez	ed599d3aca	min-slaves-to-write: don't accept writes with less than N replicas. This feature allows the user to specify the minimum number of connected replicas having a lag less or equal than the specified amount of seconds for writes to be accepted.	2013-05-30 11:30:04 +02:00
antirez	a864cae2a5	A comment about BLPOP timeout did not reflected actual behavior.	2013-05-27 19:34:14 +02:00
antirez	0292c5f7ae	Replication: send REPLCONF ACK to master.	2013-05-27 11:42:25 +02:00
antirez	6b4635f4f5	REPLCONF ACK command. This special command is used by the slave to inform the master the amount of replication stream it currently consumed. it does not return anything so that we not need to consume additional bandwidth needed by the master to reply something. The master can do a number of things knowing the amount of stream processed, such as understanding the "lag" in bytes of the slave, verify if a given command was already processed by the slave, and so forth.	2013-05-27 11:42:17 +02:00
YAMAMOTO Takashi	b2dd0849ce	rename popcount to popcount_binary to avoid a conflict with NetBSD libc NetBSD-current's libc has a function named popcount. hiding these extensions using feature macros is not possible because redis uses other extensions covered by the same feature macro. eg. inet_aton	2013-05-17 17:21:28 +09:00
antirez	310dbba01c	Added a define for most configuration defaults. Also the logfile option was modified to always have an explicit value and to log to stdout when an empty string is used as log file. Previously there was special handling of the string "stdout" that set the logfile to NULL, this always required some special handling.	2013-05-15 10:12:29 +02:00
antirez	c184f36d21	CONFIG REWRITE: support for client-output-buffer-limit.	2013-05-13 18:34:18 +02:00
antirez	7e049fafd3	CONFIG REWRITE: Initial support code and design.	2013-05-13 11:11:12 +02:00
antirez	5947f170f9	Obtain absoute path of configuration file, expose it in INFO.	2013-05-09 16:57:59 +02:00
antirez	0ae1b5b0a1	Revert "use long long instead of size_t make it more safe" This reverts commit `2c75f2cf1a`. After further analysis, it is very unlikely that we'll raise the string size limit to > 512MB, and at the same time such big strings will be used in 32 bit systems. Better to revert to size_t so that 32 bit processors will not be forced to use a 64 bit counter in normal operations, that is currently completely useless.	2013-05-08 10:01:27 +02:00
Jiahao Huang	2c75f2cf1a	use long long instead of size_t make it more safe	2013-05-07 23:37:22 +08:00
Jiahao Huang	e3ed78d43b	in 32bit machine, popcount don't work with a input string length up to 512 MB, bitcount commant may return negtive integer with string length more than 256 MB	2013-05-07 19:55:57 +08:00
antirez	5c9f6d4f55	Cluster: link reconnection on delayed PONG reply. When the PONG delay is half the cluster node timeout, the link gets disconnected (and later automatically reconnected) in order to ensure that it's not just a dead connection issue. However this operation is only performed if the link is old enough, in order to avoid to disconnect the same link again and again (and among the other problems, never receive the PONG because of that). Note: when the link is reconnected, the 'ping_sent' field is not updated even if a new ping is sent using the new connection, so we can still reliably detect a node ping timeout.	2013-05-03 15:43:03 +02:00
antirez	d264122f6a	Config option to turn AOF rewrite incremental fsync on/off.	2013-04-24 10:57:07 +02:00
antirez	336d722fba	AOF: sync data on disk every 32MB when rewriting. This prevents the kernel from putting too much stuff in the output buffers, doing too heavy I/O all at once. So the goal of this commit is to split the disk pressure due to the AOF rewrite process into smaller spikes. Please see issue #1019 for more information.	2013-04-24 10:26:31 +02:00
antirez	68cf249f81	Cluster: use server.cluster_node_timeout directly. We used to copy this value into the server.cluster structure, however this was not necessary. The reason why we don't directly use server.cluster->node_timeout is that things that can be configured via redis.conf need to be directly available in the server structure as server.cluster is allocated later only if needed in order to reduce the memory footprint of non-cluster instances.	2013-04-09 11:24:18 +02:00
antirez	ef4f25ff6e	Cluster: configdigest field no longer used. Removed.	2013-04-09 11:07:25 +02:00
antirez	d5b383477e	Cluster: move REDIS_CLUSTER_FAILOVER_DELAY near other timing defines.	2013-04-04 14:23:34 +02:00
antirez	05fa4f4034	Cluster: node timeout is now configurable.	2013-04-04 12:29:10 +02:00
antirez	00bab23c41	Cluster: turn hardcoded node timeout multiplicators into defines. Most Redis Cluster time limits are expressed in terms of the configured node timeout. Turn them into defines.	2013-04-04 12:04:11 +02:00
antirez	b237de33d1	Throttle BGSAVE attempt on saving error. When a BGSAVE fails, Redis used to flood itself trying to BGSAVE at every next cron call, that is either 10 or 100 times per second depending on configuration and server version. This commit does not allow a new automatic BGSAVE attempt to be performed before a few seconds delay (currently 5). This avoids both the auto-flood problem and filling the disk with logs at a serious rate. The five seconds limit, considering a log entry of 200 bytes, will use less than 4 MB of disk space per day that is reasonable, the sysadmin should notice before of catastrofic events especially since by default Redis will stop serving write queries after the first failed BGSAVE. This fixes issue #849	2013-04-02 14:05:50 +02:00
antirez	32a83c8206	DEBUG set-active-expire added. We need the ability to disable the activeExpireCycle() (active expired key collection) call for testing purposes.	2013-03-27 17:55:02 +01:00
antirez	506f9a42b0	Cluster: new flag PROMOTED introduced. A slave node set this flag for itself when, after receiving authorization from the majority of nodes, it turns itself into a master. At the same time now this flag is tested by nodes receiving a PING message before reconfiguring after a failover event. This makes the system more robust: even if currently there is no way to manually turn a slave into a master it is possible that we'll have such a feature in the future, or that simply because of misconfiguration a node joins the cluster as master while others believe it's a slave. This alone is now no longer enough to trigger reconfiguration as other nodes will check for the PROMOTED flag. The PROMOTED flag is cleared every time the node is turned back into a replica of some other node.	2013-03-20 10:48:42 +01:00
antirez	026b9483db	Cluster: add sender flags in cluster bus messages header. Sender flags were not propagated for the sender, but only for nodes in the gossip section. This is odd and in the next commits we'll need to get updated flags for the sender node, so this commit adds a new field in the cluster messages header. The message header is the same size as we reused some free space that was marked as 'unused' because of alignment concerns.	2013-03-20 10:32:00 +01:00
antirez	1375b0611b	Cluster: slaves start failover with a small delay. Redis Cluster can cope with a minority of nodes not informed about the failure of a master in time for some reason (netsplit or node not functioning properly, blocked, ...) however to wait a few seconds before to start the failover will make most "normal" failovers simpler as the FAIL message will propagate before the slave election happens.	2013-03-15 16:39:49 +01:00
antirez	35f05c66b6	Cluster: handle FAILOVER_AUTH_ACK messages. That's trivial as we just need to increment the count of masters that received with an ACK.	2013-03-14 16:43:13 +01:00
antirez	db7c17e969	Cluster: FAILOVER_AUTH_REQUEST message type introduced. This message is sent by a slave that is ready to failover its master to other nodes to get the authorization from the majority of masters.	2013-03-13 17:21:20 +01:00
antirez	575cbc9990	Cluster: clusterHandleSlaveFailover() stub.	2013-03-13 13:10:49 +01:00
antirez	5f5aa487f9	REDIS_DBCRON_DBS_PER_SEC -> REDIS_DBCRON_DBS_PER_CALL	2013-03-09 11:44:20 +01:00
antirez	7ac3b3a486	Only resize/rehash a few databases per cron iteration. This is the first step to lower the CPU usage when many databases are configured. The other is to also process a limited number of DBs per call in the active expire cycle.	2013-03-08 14:01:12 +01:00
antirez	3dad8196b7	Cluster: clusterUpdateState() function simplified. Also the NEEDHELP Cluster state was removed as it will no longer be used by Redis Cluster.	2013-03-06 18:25:40 +01:00
antirez	7b190a08cf	API to lookup commands with their original name. A new server.orig_commands table was added to the server structure, this contains a copy of the commant table unaffected by rename-command statements in redis.conf. A new API lookupCommandOrOriginal() was added that checks both tables, new first, old later, so that rewriteClientCommandVector() and friends can lookup commands with their new or original name in order to fix the client->cmd pointer when the argument vector is renamed. This fixes the segfault of issue #986, but does not fix a wider range of problems resulting from renaming commands that actually operate on data and are registered into the AOF file or propagated to slaves... That is command renaming should be handled with care.	2013-03-06 16:28:26 +01:00
antirez	1a02b7440a	Cluster: new node field fail_time. This is the unix time at which we set the FAIL flag for the node. It is only valid if FAIL is set. The idea is to use it in order to make the cluster more robust, for instance in order to revert a FAIL state if it is long-standing but still slots are assigned to this node, that is, no one is going to fix these slots apparently.	2013-03-05 13:15:05 +01:00
antirez	7bead003e2	SLAVEOF command refactored into a proper API. We now have replicationSetMaster() and replicationUnsetMaster() that can be called in other contexts (for instance Redis Cluster).	2013-03-04 13:22:21 +01:00
antirez	4521115b17	Cluster: new field in cluster node structure, "numslots". Before a relatively slow popcount() operation was needed every time we needed to get the number of slots served by a given cluster node. Now we just need to check an integer that is taken in sync with the bitmap.	2013-02-28 15:11:05 +01:00
antirez	f9b5ca29fd	Use GCC printf format attribute for redisLog(). This commit also fixes redisLog() statements producing warnings.	2013-02-27 12:27:15 +01:00
antirez	6356cf6808	Set process name in ps output to make operations safer. This commit allows Redis to set a process name that includes the binding address and the port number in order to make operations simpler. Redis children processes doing AOF rewrites or RDB saving change the name into redis-aof-rewrite and redis-rdb-bgsave respectively. This in general makes harder to kill the wrong process because of an error and makes simpler to identify saving children. This feature was suggested by Arnaud GRANAL in the Redis Google Group, Arnaud also pointed me to the setproctitle.c implementation includeed in this commit. This feature should work on all the Linux, OSX, and all the three major BSD systems.	2013-02-26 11:52:12 +01:00
antirez	996a643752	Cluster: use O(log(N)) algo for countKeysInSlot().	2013-02-25 12:37:50 +01:00
antirez	d2154254be	Cluster: fix case for getKeysInSlot() and countKeysInSlot(). Redis functions start in low case. A few functions about cluster were capitalized the wrong way.	2013-02-25 11:25:40 +01:00
antirez	c2eb4a606f	Cluster: use CountKeysInSlot() when we just need the count.	2013-02-25 11:23:04 +01:00
antirez	ad3bca1fdf	Cluster: added stub for verifyClusterConfigWithData(). See the top-comment for the function in this commit for details about what the function is supposed to do.	2013-02-25 11:20:17 +01:00
antirez	d218a4e244	Cluster: new state information, cluster size. The definition of cluster size is: the number of known nodes in the cluster that are masters and serving at least an hash slot.	2013-02-22 19:18:30 +01:00
antirez	974929770b	Cluster: introduced a failure reports system. A §Redis Cluster node used to mark a node as failing when itself detected a failure for that node, and a single acknowledge was received about the possible failure state. The new API will be used in order to possible to require that N other nodes have a PFAIL or FAIL state for a given node for a node to set it as failing.	2013-02-22 17:43:35 +01:00
antirez	ea7fc82a4a	Cluster: new command flag forcing implicit ASKING. Also using this new flag the RESTORE-ASKING command was implemented that will be used by MIGRATE.	2013-02-20 17:28:35 +01:00
antirez	02796ba7a7	Cluster: sanity checks on the cluster bus message length.	2013-02-15 16:44:39 +01:00
antirez	1a32d99b28	Cluster: move cluster config file out of config state. This makes us able to avoid allocating the cluster state structure if cluster is not enabled, but still we can handle the configuration directive that sets the cluster config filename.	2013-02-14 15:20:02 +01:00
antirez	1649e509c3	Cluster: the cluster state structure is now heap allocated.	2013-02-14 13:20:56 +01:00
antirez	ebd666db47	Cluster: from 4096 to 16384 hash slots.	2013-02-14 12:49:16 +01:00
antirez	dc24a6b132	Return a specific NOAUTH error if authentication is required.	2013-02-12 16:25:41 +01:00
antirez	24f258360b	Replication: added new stats counting full and partial resynchronizations.	2013-02-12 15:33:54 +01:00
antirez	078882025e	PSYNC: work in progress, preview #2 , rebased to unstable.	2013-02-12 12:52:21 +01:00
antirez	4b83ad4e1f	Use replicationFeedSlaves() to send PING to slaves. A Redis master sends PING commands to slaves from time to time: doing this ensures that even if absence of writes, the master->slave channel remains active and the slave can feel the master presence, instead of closing the connection for timeout. This commit changes the way PINGs are sent to slaves in order to use the standard interface used to replicate all the other commands, that is, the function replicationFeedSlaves(). With this change the stream of commands sent to every slave is exactly the same regardless of their exact state (Transferring RDB for first synchronization or slave already online). With the previous implementation the PING was only sent to online slaves, with the result that the output stream from master to slaves was not identical for all the slaves: this is a problem if we want to implement partial resyncs in the future using a global replication stream offset. TL;DR: this commit should not change the behaviour in practical terms, but is just something in preparation for partial resynchronization support.	2013-02-12 12:50:28 +01:00
antirez	7465ac7ab1	Emit SELECT to slaves in a centralized way. Before this commit every Redis slave had its own selected database ID state. This was not actually useful as the emitted stream of commands is identical for all the slaves. Now the the currently selected database is a global state that is set to -1 when a new slave is attached, in order to force the SELECT command to be re-emitted for all the slaves. This change is useful in order to implement replication partial resynchronization in the future, as makes sure that the stream of commands received by slaves, including SELECT commands, are exactly the same for every slave connected, at any time. In this way we could have a global offset that can identify a specific piece of the master -> slaves stream of commands.	2013-02-12 12:50:28 +01:00
antirez	124a635bc5	Set SO_KEEPALIVE on client sockets if configured to do so.	2013-02-08 16:40:59 +01:00
antirez	b70b459b0e	TCP_NODELAY after SYNC: changes to the implementation.	2013-02-05 12:04:30 +01:00
charsyam	c85647f354	Turn off TCP_NODELAY on the slave socket after SYNC. Further details from @antirez: It was reported by @StopForumSpam on Twitter that the Redis replication link was strangely using multiple TCP packets for multiple commands. This wastes a lot of bandwidth and is due to the TCP_NODELAY option we enable on the socket after accepting a new connection. However the master -> slave channel is a one-way channel since Redis replication is asynchronous, so there is no point in trying to reduce the latency, we should aim to reduce the bandwidth. For this reason this commit introduces the ability to disable the nagle algorithm on the socket after a successful SYNC. This feature is off by default because the delay can be up to 40 milliseconds with normally configured Linux kernels.	2013-02-05 12:04:25 +01:00
antirez	fce016d31b	Keyspace events: it is now possible to select subclasses of events. When keyspace events are enabled, the overhead is not sever but noticeable, so this commit introduces the ability to select subclasses of events in order to avoid to generate events the user is not interested in. The events can be selected using redis.conf or CONFIG SET / GET.	2013-01-28 13:15:12 +01:00
antirez	8766e81079	Fix decrRefCount() prototype from void to robj pointer. decrRefCount used to get its argument as a void* pointer in order to be used as destructor where a 'void free_object(void)' prototype is expected. However this made simpler to introduce bugs by freeing the wrong pointer. This commit fixes the argument type and introduces a new wrapper called decrRefCountVoid() that can be used when the void argument is needed.	2013-01-28 13:14:53 +01:00
antirez	4cdbce341e	Keyspace events notification API.	2013-01-28 13:14:36 +01:00
guiquanz	9d09ce3981	Fixed many typos.	2013-01-19 10:59:44 +01:00
antirez	1971740f0c	CLIENT GETNAME and CLIENT SETNAME introduced. Sometimes it is much simpler to debug complex Redis installations if it is possible to assign clients a name that is displayed in the CLIENT LIST output. This is the case, for example, for "leaked" connections. The ability to provide a name to the client makes it quite trivial to understand what is the part of the code implementing the client not releasing the resources appropriately. Behavior: CLIENT SETNAME: set a name for the client, or remove the current name if an empty name is set. CLIENT GETNAME: get the current name, or a nil. CLIENT LIST: now displays the client name if any. Thanks to Mark Gravell for pushing this idea forward.	2013-01-15 13:34:10 +01:00
antirez	f1481d4a03	serverCron() frequency is now a runtime parameter (was REDIS_HZ). REDIS_HZ is the frequency our serverCron() function is called with. A more frequent call to this function results into less latency when the server is trying to handle very expansive background operations like mass expires of a lot of keys at the same time. Redis 2.4 used to have an HZ of 10. This was good enough with almost every setup, but the incremental key expiration algorithm was working a bit better under extreme pressure when HZ was set to 100 for Redis 2.6. However for most users a latency spike of 30 milliseconds when million of keys are expiring at the same time is acceptable, on the other hand a default HZ of 100 in Redis 2.6 was causing idle instances to use some CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3% for an idle instance, however this is a shame as more energy is consumed by the server, if not important resources. This commit introduces HZ as a runtime parameter, that can be queried by INFO or CONFIG GET, and can be modified with CONFIG SET. At the same time the default frequency is set back to 10. In this way we default to a sane value of 10, but allows users to easily switch to values up to 500 for near real-time applications if needed and if they are willing to pay this small CPU usage penalty.	2012-12-14 17:10:40 +01:00
antirez	2f87cf8b01	Blocking POP: use a dictionary to store keys clinet side. To store the keys we block for during a blocking pop operation, in the case the client is blocked for more data to arrive, we used a simple linear array of redis objects, in the blockingState structure: robj *keys; int count; However in order to fix issue #801 we also use a dictionary in order to avoid to end in the blocked clients queue for the same key multiple times with the same client. The dictionary was only temporary, just to avoid duplicates, but since we create / destroy it there is no point in doing this duplicated work, so this commit simply use a dictionary as the main structure to store the keys we are blocked for. So instead of the previous fields we now just have: dict keys; This simplifies the code and reduces the work done by the server during a blocking POP operation.	2012-12-02 20:43:15 +01:00
antirez	2f62c9663c	Introduced the Build ID in INFO and --version output. The idea is to be able to identify a build in a unique way, so for instance after a bug report we can recognize that the build is the one of a popular Linux distribution and perform the debugging in the same environment.	2012-11-29 14:20:08 +01:00
antirez	95f68f7b0f	EVALSHA is now case insensitive. EVALSHA used to crash if the SHA1 was not lowercase (Issue #783). Fixed using a case insensitive dictionary type for the sha -> script map used for replication of scripts.	2012-11-22 15:50:00 +01:00
antirez	3d1391272a	Safer handling of MULTI/EXEC on errors. After the transcation starts with a MULIT, the previous behavior was to return an error on problems such as maxmemory limit reached. But still to execute the transaction with the subset of queued commands on EXEC. While it is true that the client was able to check for errors distinguish QUEUED by an error reply, MULTI/EXEC in most client implementations uses pipelining for speed, so all the commands and EXEC are sent without caring about replies. With this change: 1) EXEC fails if at least one command was not queued because of an error. The EXECABORT error is used. 2) A generic error is always reported on EXEC. 3) The client DISCARDs the MULTI state after a failed EXEC, otherwise pipelining multiple transactions would be basically impossible: After a failed EXEC the next transaction would be simply queued as the tail of the previous transaction.	2012-11-22 10:32:07 +01:00
antirez	e23d281e48	MIGRATE TCP connections caching. By caching TCP connections used by MIGRATE to chat with other Redis instances a 5x performance improvement was measured with redis-benchmark against small keys. This can dramatically speedup cluster resharding and other processes where an high load of MIGRATE commands are used.	2012-11-12 00:47:24 +01:00
antirez	4365e5b2d3	BSD license added to every C source and header file.	2012-11-08 18:31:32 +01:00
Salvatore Sanfilippo	06851a93de	Merge pull request #741 from Run/typo fix a typo in redis.h line 595 comment	2012-11-02 04:10:47 -07:00
antirez	2ea41242f6	Unix socket clients properly displayed in MONITOR and CLIENT LIST. This also fixes issue #745.	2012-11-01 22:10:45 +01:00
Runzhen Wang	c23c657cdd	fix a typo in redis.h line 595 comment	2012-11-01 02:14:22 +08:00
antirez	a1b1c1ea3a	Fix MULTI / EXEC rendering in MONITOR output. Before of this commit it used to be like this: MULTI EXEC ... actual commands of the transaction ... Because after all that is the natural order of things. Transaction commands are queued and executed only after EXEC is called. However this makes debugging with MONITOR a mess, so the code was modified to provide a coherent output. What happens is that MULTI is rendered in the MONITOR output as far as possible, instead EXEC is propagated only after the transaction is executed, or even in the case it fails because of WATCH, so in this case you'll simply see: MULTI EXEC An empty transaction.	2012-10-16 17:35:50 +02:00
antirez	7eb850ef0e	A reimplementation of blocking operation internals. Redis provides support for blocking operations such as BLPOP or BRPOP. This operations are identical to normal LPOP and RPOP operations as long as there are elements in the target list, but if the list is empty they block waiting for new data to arrive to the list. All the clients blocked waiting for th same list are served in a FIFO way, so the first that blocked is the first to be served when there is more data pushed by another client into the list. The previous implementation of blocking operations was conceived to serve clients in the context of push operations. For for instance: 1) There is a client "A" blocked on list "foo". 2) The client "B" performs `LPUSH foo somevalue`. 3) The client "A" is served in the context of the "B" LPUSH, synchronously. Processing things in a synchronous way was useful as if "A" pushes a value that is served by "B", from the point of view of the database is a NOP (no operation) thing, that is, nothing is replicated, nothing is written in the AOF file, and so forth. However later we implemented two things: 1) Variadic LPUSH that could add multiple values to a list in the context of a single call. 2) BRPOPLPUSH that was a version of BRPOP that also provided a "PUSH" side effect when receiving data. This forced us to make the synchronous implementation more complex. If client "B" is waiting for data, and "A" pushes three elemnents in a single call, we needed to propagate an LPUSH with a missing argument in the AOF and replication link. We also needed to make sure to replicate the LPUSH side of BRPOPLPUSH, but only if in turn did not happened to serve another blocking client into another list ;) This were complex but with a few of mutually recursive functions everything worked as expected... until one day we introduced scripting in Redis. Scripting + synchronous blocking operations = Issue #614. Basically you can't "rewrite" a script to have just a partial effect on the replicas and AOF file if the script happened to serve a few blocked clients. The solution to all this problems, implemented by this commit, is to change the way we serve blocked clients. Instead of serving the blocked clients synchronously, in the context of the command performing the PUSH operation, it is now an asynchronous and iterative process: 1) If a key that has clients blocked waiting for data is the subject of a list push operation, We simply mark keys as "ready" and put it into a queue. 2) Every command pushing stuff on lists, as a variadic LPUSH, a script, or whatever it is, is replicated verbatim without any rewriting. 3) Every time a Redis command, a MULTI/EXEC block, or a script, completed its execution, we run the list of keys ready to serve blocked clients (as more data arrived), and process this list serving the blocked clients. 4) As a result of "3" maybe more keys are ready again for other clients (as a result of BRPOPLPUSH we may have push operations), so we iterate back to step "3" if it's needed. The new code has a much simpler semantics, and a simpler to understand implementation, with the disadvantage of not being able to "optmize out" a PUSH+BPOP as a No OP. This commit will be tested with care before the final merge, more tests will be added likely.	2012-09-17 10:26:46 +02:00
Salvatore Sanfilippo	24bc807b5c	Merge pull request #576 from saj/fix-slave-ping-period Bug fix: slaves being pinged every second	2012-09-05 06:59:37 -07:00
antirez	36741b2c81	Scripting: Force SORT BY constant determinism inside SORT itself. SORT is able to return (faster than when ordering) unordered output if the "BY" clause is used with a constant value. However we try to play well with scripting requirements of determinism providing always sorted outputs when SORT (and other similar commands) are called by Lua scripts. However we used the general mechanism in place in scripting in order to reorder SORT output, that is, if the command has the "S" flag set, the Lua scripting engine will take an additional step when converting a multi bulk reply to Lua value, calling a Lua sorting function. This is suboptimal as we can do it faster inside SORT itself. This is also broken as issue #545 shows us: basically when SORT is used with a constant BY, and additionally also GET is used, the Lua scripting engine was trying to order the output as a flat array, while it was actually a list of key-value pairs. What we do know is to recognized if the caller of SORT is the Lua client (since we can check this using the REDIS_LUA_CLIENT flag). If so, and if a "don't sort" condition is triggered by the BY option with a constant string, we force the lexicographical sorting. This commit fixes this bug and improves the performance, and at the same time simplifies the implementation. This does not mean I'm smart today, it means I was stupid when I committed the original implementation ;)	2012-09-05 01:17:49 +02:00
antirez	bb66fc3120	Send an async PING before starting replication with master. During the first synchronization step of the replication process, a Redis slave connects with the master in a non blocking way. However once the connection is established the replication continues sending the REPLCONF command, and sometimes the AUTH command if needed. Those commands are send in a partially blocking way (blocking with timeout in the order of seconds). Because it is common for a blocked master to accept connections even if it is actually not able to reply to the slave requests, it was easy for a slave to block if the master had serious issues, but was still able to accept connections in the listening socket. For this reason we now send an asynchronous PING request just after the non blocking connection ended in a successful way, and wait for the reply before to continue with the replication process. It is very unlikely that a master replying to PING can't reply to the other commands. This solution was proposed by Didier Spezia (Thanks!) so that we don't need to turn all the replication process into a non blocking affair, but still the probability of a slave blocked is minimal even in the event of a failing master. Also we now use getsockopt(SO_ERROR) in order to check errors ASAP in the event handler, instead of waiting for actual I/O to return an error. This commit fixes issue #632.	2012-09-02 12:24:38 +02:00
antirez	169a44cbd6	Sentinel: Redis-side support for slave priority. A Redis slave can now be configured with a priority, that is an integer number that is shown in INFO output and can be get and set using the redis.conf file or the CONFIG GET/SET command. This field is used by Sentinel during slave election. A slave with lower priority is preferred. A slave with priority zero is never elected (and is considered to be impossible to elect even if it is the only slave available). A next commit will add support in the Sentinel side as well.	2012-08-28 17:20:26 +02:00
antirez	784b93087c	Incrementally flush RDB on disk while loading it from a master. This fixes issue #539. Basically if there is enough free memory the OS may buffer the RDB file that the slave transfers on disk from the master. The file may actually be flused on disk at once by the operating system when it gets closed by Redis, causing the close system call to block for a long time. This patch is a modified version of one provided by yoav-steinberg of @garantiadata (the original version was posted in the issue #539 comments), and tries to flush the OS buffers incrementally (every 8 MB of loaded data).	2012-08-28 12:47:33 +02:00
Salvatore Sanfilippo	04950a9e4d	Merge pull request #586 from saj/aof_last_bgrewrite_status New in INFO: aof_last_bgrewrite_status	2012-07-27 03:55:20 -07:00
antirez	6b5daa2df2	First implementation of Redis Sentinel. This commit implements the first, beta quality implementation of Redis Sentinel, a distributed monitoring system for Redis with notification and automatic failover capabilities. More info at http://redis.io/topics/sentinel	2012-07-23 13:14:44 +02:00
antirez	5d73073f6e	Allow Pub/Sub in contexts where other commands are blocked. Redis loading data from disk, and a Redis slave disconnected from its master with serve-stale-data disabled, are two conditions where commands are normally refused by Redis, returning an error. However there is no reason to disable Pub/Sub commands as well, given that this layer does not interact with the dataset. To allow Pub/Sub in as many contexts as possible is especially interesting now that Redis Sentinel uses Pub/Sub of a Redis master as a communication channel between Sentinels. This commit allows Pub/Sub to be used in the above two contexts where it was previously denied.	2012-07-22 17:18:16 +02:00
Saj Goonatilleke	48553a29e8	New in INFO: aof_last_bgrewrite_status Behaves like rdb_last_bgsave_status -- even down to reporting 'ok' when no rewrite has been done yet. (You might want to check that aof_last_rewrite_time_sec is not -1.)	2012-07-18 09:54:55 +10:00
Saj Goonatilleke	9edfe63553	Bug fix: slaves being pinged every second REDIS_REPL_PING_SLAVE_PERIOD controls how often the master should transmit a heartbeat (PING) to its slaves. This period, which defaults to 10, is measured in seconds. Redis 2.4 masters used to ping their slaves every ten seconds, just like it says on the tin. The Redis 2.6 masters I have been experimenting with, on the other hand, ping their slaves every second. (master_last_io_seconds_ago never approaches 10.) I think the ping period was inadvertently slashed to one-tenth of its nominal value around the time REDIS_HZ was introduced. This commit reintroduces correct ping schedule behaviour.	2012-07-05 14:29:27 +10:00
antirez	3a32897856	REPLCONF internal command introduced. The REPLCONF command is an internal command (not designed to be directly used by normal clients) that allows a slave to set some replication related state in the master before issuing SYNC to start the replication. The initial motivation for this command, and the only reason currently it is used by the implementation, is to let the slave instance communicate its listening port to the slave, so that the master can show all the slaves with their listening ports in the "replication" section of the INFO output. This allows clients to auto discover and query all the slaves attached into a master. Currently only a single option of the REPLCONF command is supported, and it is called "listening-port", so the slave now starts the replication process with something like the following chat: REPLCONF listening-prot 6380 SYNC Note that this works even if the master is an older version of Redis and does not understand REPLCONF, because the slave ignores the REPLCONF error. In the future REPLCONF can be used for partial replication and other replication related features where there is the need to exchange information between master and slave. NOTE: This commit also fixes a bug: the INFO outout already carried information about slaves, but the port was broken, and was obtained with getpeername(2), so it was actually just the ephemeral port used by the slave to connect to the master as a client.	2012-06-27 09:43:57 +02:00
antirez	31a1439bfd	Fixed a timing attack on AUTH (Issue #560 ). The way we compared the authentication password using strcmp() allowed an attacker to gain information about the password using a well known class of attacks called "timing attacks". The bug appears to be practically not exploitable in most modern systems running Redis since even using multiple bytes of differences in the input at a time instead of one the difference in running time in in the order of 10 nanoseconds, making it hard to exploit even on LAN. However attacks always get better so we are providing a fix ASAP. The new implementation uses two fixed length buffers and a constant time comparison function, with the goal of: 1) Completely avoid leaking information about the content of the password, since the comparison is always performed between 512 characters and without conditionals. 2) Partially avoid leaking information about the length of the password. About "2" we still have a stage in the code where the real password and the user provided password are copied in the static buffers, we also run two strlen() operations against the two inputs, so the running time of the comparison is a fixed amount plus a time proportional to LENGTH(A)+LENGTH(B). This means that the absolute time of the operation performed is still related to the length of the password in some way, but there is no way to change the input in order to get a difference in the execution time in the comparison that is not just proportional to the string provided by the user (because the password length is fixed). Thus in practical terms the user should try to discover LENGTH(PASSWORD) looking at the whole execution time of the AUTH command and trying to guess a proportionality between the whole execution time and the password length: this appears to be mostly unfeasible in the real world. Also protecting from this attack is not very useful in the case of Redis as a brute force attack is anyway feasible if the password is too short, while with a long password makes it not an issue that the attacker knows the length.	2012-06-21 11:50:01 +02:00
antirez	ee789e157c	Dump ziplist hex value on failed assertion. The ziplist -> hashtable conversion code is triggered every time an hash value must be promoted to a full hash table because the number or size of elements reached the threshold. If a problem in the ziplist causes the same field to be present multiple times, the assertion of successful addition of the element inside the hash table will fail, crashing server with a failed assertion, but providing little information about the problem. This code adds a new logging function to perform the hex dump of binary data, and makes sure that the ziplist -> hashtable conversion code uses this new logging facility to dump the content of the ziplist when the assertion fails. This change was originally made in order to investigate issue #547.	2012-06-12 00:41:48 +02:00
antirez	33e1db36fa	Four new persistence fields in INFO. A few renamed. The 'persistence' section of INFO output now contains additional four fields related to RDB and AOF persistence: rdb_last_bgsave_time_sec Duration of latest BGSAVE in sec. rdb_current_bgsave_time_sec Duration of current BGSAVE in sec. aof_last_rewrite_time_sec Duration of latest AOF rewrite in sec. aof_current_rewrite_time_sec Duration of current AOF rewrite in sec. The 'current' fields are set to -1 if a BGSAVE / AOF rewrite is not in progress. The 'last' fileds are set to -1 if no previous BGSAVE / AOF rewrites were performed. Additionally a few fields in the persistence section were renamed for consistency: changes_since_last_save -> rdb_changes_since_last_save bgsave_in_progress -> rdb_bgsave_in_progress last_save_time -> rdb_last_save_time last_bgsave_status -> rdb_last_bgsave_status bgrewriteaof_in_progress -> aof_rewrite_in_progress bgrewriteaof_scheduled -> aof_rewrite_scheduled After the renaming, fields in the persistence section start with rdb_ or aof_ prefix depending on the persistence method they describe. The field 'loading' and related fields are not prefixed because they are unique for both the persistence methods.	2012-05-25 12:11:30 +02:00
antirez	0bd6d68e34	New commands: BITOP and BITCOUNT. The motivation for this new commands is to be search in the usage of Redis for real time statistics. See the article "Fast real time metrics using Redis". http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/ In general Redis strings when used as bitmaps using the SETBIT/GETBIT command provide a very space-efficient and fast way to store statistics. For instance in a web application with users, every user can be associated with a key that shows every day in which the user visited the web service. This information can be really valuable to extract user behaviour information. With Redis bitmaps doing this is very simple just saying that a given day is 0 (the data the service was put online) and all the next days are 1, 2, 3, and so forth. So with SETBIT it is possible to set the bit corresponding to the current day every time the user visits the site. It is possible to take the count of the bit sets on the run, this is extremely easy using a Lua script. However a fast bit count native operation can be useful, especially if it can operate on ranges, or when the string is small like in the case of days (even if you consider many years it is still extremely little data). For this reason BITOP was introduced. The command counts the number of bits set to 1 in a string, with optional range: BITCOUNT key [start end] The start/end parameters are similar to GETRANGE. If omitted the whole string is tested. Population counting is more useful when bit-level operations like AND, OR and XOR are avaialble. For instance I can test multiple users to see the number of days three users visited the site at the same time. To do this we can take the AND of all the bitmaps, and then count the set bits. For this reason the BITOP command was introduced: BITOP [AND\|OR\|XOR\|NOT] dest_key src_key1 src_key2 src_key3 ... src_keyN In the special case of NOT (that inverts the bits) only one source key can be passed. The judicious use of BITCOUNT and BITOP combined can lead to interesting use cases with very space efficient representation of data. The implementation provided is still not tested and optimized for speed, next commits will introduce unit tests. Later the implementation will be profiled to see if it is possible to gain an important amount of speed without making the code much more complex.	2012-05-24 15:19:43 +02:00
antirez	47ca4b6e28	Allow an AOF rewrite buffer > 2GB (Fix for issue #504 ). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.	2012-05-24 15:19:15 +02:00
antirez	61daf8914d	Impovements for: Redis timer, hashes rehashing, keys collection. A previous commit introduced REDIS_HZ define that changes the frequency of calls to the serverCron() Redis function. This commit improves different related things: 1) Software watchdog: now the minimal period can be set according to REDIS_HZ. The minimal period is two times the timer period, that is: (1000/REDIS_HZ)*2 milliseconds 2) The incremental rehashing is now performed in the expires dictionary as well. 3) The activeExpireCycle() function was improved in different ways: - Now it checks if it already used too much time using microseconds instead of milliseconds for better precision. - The time limit is now calculated correctly, in the previous version the division was performed before of the multiplication resulting in a timelimit of 0 if HZ was big enough. - Databases with less than 1% of buckets fill in the hash table are skipped, because getting random keys is too expensive in this condition. 4) tryResizeHashTables() is now called at every timer call, we need to match the number of calls we do to the expired keys colleciton cycle. 5) REDIS_HZ was raised to 100.	2012-05-13 21:52:35 +02:00
antirez	9434349236	Redis timer interrupt frequency configurable as REDIS_HZ. Redis uses a function called serverCron() that is very similar to the timer interrupt of an operating system. This function is used to handle a number of asynchronous things, like active expired keys collection, clients timeouts, update of statistics, things related to the cluster and replication, triggering of BGSAVE and AOF rewrite process, and so forth. In the past the timer was called 1 time per second. At some point it was raised to 10 times per second, but it still was fixed and could not be changed even at compile time, because different functions called from serverCron() assumed a given fixed frequency. This commmit makes the frequency configurable, so that it is simpler to pick a good tradeoff between overhead of this function (that is usually very small) and the responsiveness of Redis during a few critical circumstances where a lot of work is done inside the timer. An example of such a critical condition is mass-expire of a lot of keys in the same second. Up to a given percentage of CPU time is used to perform expired keys collection per expire cylce. Now changing the REDIS_HZ macro it is possible to do less work but more times per second in order to block the server for less time. If this patch will work well in our tests it will enter Redis 2.6-final.	2012-05-13 16:40:29 +02:00
antirez	1dcc95d081	More incremental active expired keys collection process. If a large amonut of keys are all expiring about at the same time, the "active" expired keys collection cycle used to block as far as the percentage of already expired keys was >= 25% of the total population of keys with an expire set. This could block the server even for many seconds in order to reclaim memory ASAP. The new algorithm uses at max a small amount of milliseconds per cycle, even if this means reclaiming the memory less promptly it also means a more responsive server.	2012-05-11 19:17:31 +02:00
antirez	ae62d29d1d	Use specific error if master is down and slave-serve-stale-data is set to no. We used to reply -ERR ... message ..., now the reply is instead -MASTERDOWN ... message ... so that it can be distinguished easily by the other error conditions.	2012-05-02 20:57:55 +02:00
antirez	d3701d2714	Limit memory used by big SLOWLOG entries. Two limits are added: 1) Up to SLOWLOG_ENTRY_MAX_ARGV arguments are logged. 2) Up to SLOWLOG_ENTRY_MAX_STRING bytes per argument are logged. 3) slowlog-max-len is set to 128 by default (was 1024). The number of remaining arguments / bytes is logged in the entry so that the user can understand better the nature of the logged command.	2012-04-21 20:34:45 +02:00
antirez	84bcd3aa24	It is now possible to enable/disable RDB checksum computation from redis.conf or via CONFIG SET/GET. Also CONFIG SET support added for rdbcompression as well.	2012-04-10 15:47:10 +02:00
antirez	88c1d9550d	crc64.c modified for incremental computation.	2012-04-09 12:20:47 +02:00
antirez	2cbdab903f	For coverage testing use exit() instead of _exit() when termiating saving children.	2012-04-07 12:11:23 +02:00
Salvatore Sanfilippo	d84f776e87	Merge pull request #426 from anydot/fix-rm-vm-comments remove mentions of VM in comments	2012-04-05 01:54:09 -07:00
antirez	9510d65dc8	CRC64 implementation added to Redis code base.	2012-04-02 12:31:44 +02:00
Premysl Hruby	8918de9202	remove mentions of VM in comments	2012-04-02 11:56:03 +02:00
antirez	04d360fdcd	Better syncio.c with millisecond resolution.	2012-03-31 11:21:45 +02:00
Joseph Jang	f892797e1b	Fixed a memory leak with replication occurs when two or more dbs are replicated and at least one of them is >db10	2012-03-30 10:34:29 +02:00
antirez	179e54d2a9	Fix for slaves chains. Force resync of slaves (simply disconnecting them) when SLAVEOF turns a master into a slave.	2012-03-29 09:24:02 +02:00
antirez	a7d12cbaf1	Log from signal handlers is now safer.	2012-03-28 13:45:39 +02:00
antirez	1043c8064b	Merge branch 'watchdog' into unstable	2012-03-28 13:16:19 +02:00
Premysl Hruby	8af9fe841c	declare hashDictType as external too	2012-03-27 18:18:57 +02:00
antirez	39bd025c29	Redis software watchdog.	2012-03-27 11:47:51 +02:00
antirez	c1d01b3c57	New INFO field aof_delayed_fsync introduced. This new field counts all the times Redis is configured with AOF enabled and fsync policy 'everysec', but the previous fsync performed by the background thread was not able to complete within two seconds, forcing Redis to perform a write against the AOF file while the fsync is still in progress (likely a blocking operation).	2012-03-25 11:27:35 +02:00
antirez	f3fd419fc9	Support for read-only slaves. Semantical fixes. This commit introduces support for read only slaves via redis.conf and CONFIG GET/SET commands. Also various semantical fixes are implemented here: 1) MULTI/EXEC with only read commands now work where the server is into a state where writes (or commands increasing memory usage) are not allowed. Before this patch everything inside a transaction would fail in this conditions. 2) Scripts just calling read-only commands will work against read only slaves, when the server is out of memory, or when persistence is into an error condition. Before the patch EVAL always failed in this condition.	2012-03-20 17:32:48 +01:00
antirez	ae22bf1ef6	Reclaim space from the client querybuf if needed.	2012-03-14 15:32:30 +01:00
antirez	e74dca73d9	Client creation time in redisClient structure. New age field in CLIENT LIST output.	2012-03-13 13:05:08 +01:00
antirez	8562798308	Merge conflicts resolved.	2012-03-09 22:07:45 +01:00
antirez	250e7f6908	Instantaneous ops/sec figure in INFO output.	2012-03-08 16:15:37 +01:00
antirez	91d664d6ce	run_id added to INFO output. The Run ID is a field that identifies a single execution of the Redis server. It can be useful for many purposes as it makes easy to detect if the instance we are talking about is the same, or if it is a different one or was rebooted. An application of run_id will be in the partial synchronization of replication, where a slave may request a partial sync from a given offset only if it is talking with the same master. Another application is in failover and monitoring scripts.	2012-03-08 10:13:36 +01:00
antirez	44f508f1a8	clusterGetRandomName() generalized into getRandomHexChars() so that we can use it for the run_id field as well.	2012-03-08 10:08:44 +01:00

1 2 3 4 5 ...

494 Commits