Commit Graph

3691 Commits

antirez
8811d49009 Remove useless check from tryObjectEncoding().
We are sure the string is large, since when the sds optimization branch
is entered it means that it was not possible to encode it as EMBSTR for
size concerns.
2013-08-27 12:36:52 +02:00
antirez
38b2e65e15 tryObjectEncoding(): optimize sds strings if possible.
When no encoding is possible, at least try to reallocate the sds string
so that it does not waste memory with free space at the end of the
buffer, when the string is large enough.
2013-08-27 12:15:13 +02:00
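
A minimal sketch of that path, assuming the sdsRemoveFreeSpace() helper from
sds.c; the wasted-space threshold and the local variable names are assumptions:

    if (o->encoding == REDIS_ENCODING_RAW && sdsavail(s) > len/10) {
        /* Too much free space at the end of the buffer: reallocate the
         * string so that it uses just the memory it needs. */
        o->ptr = sdsRemoveFreeSpace(o->ptr);
    }
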
antirez
6672bc8b3b tryObjectEncoding(): don't call string2ll() for too big strings.
We are sure that a string that is longer than 21 chars cannot be
represented by a 64-bit signed integer, as -(2^64) is 21 chars:

strlen(-18446744073709551616) => 21
2013-08-27 11:56:47 +02:00
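
A hedged sketch of the resulting guard, assuming the string2ll() helper from
util.c and that s/len describe the candidate string:

    /* 21 bytes are enough for any 64-bit signed integer in decimal form,
     * so longer strings cannot possibly be encoded as an integer. */
    if (len <= 21 && string2ll(s, len, &value)) {
        /* ... encode the object with REDIS_ENCODING_INT ... */
    }
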
antirez
ff9d66c4a9 Don't over-allocate the sds string for large bulk requests.
The call to sdsMakeRoomFor() did not account for the amount of data
already present in the query buffer, resulting in over-allocation.
2013-08-27 11:54:38 +02:00
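
A sketch of the corrected allocation, assuming ll holds the announced bulk
length and c->querybuf is the client query buffer (names hedged):

    /* Ask only for the bytes still missing: sdsMakeRoomFor() adds room on
     * top of what the buffer already contains. */
    c->querybuf = sdsMakeRoomFor(c->querybuf, ll - sdslen(c->querybuf));
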
antirez
e21803348a DEBUG SDSLEN added.
This command is only useful for low-level debugging of memory issues due
to sds wasting memory as an empty buffer at the end of the string.
2013-08-27 11:53:49 +02:00
antirez
b34126e378 Update server.lastbgsave_status when fork() fails. 2013-08-27 10:16:29 +02:00
antirez
003cc8a4f5 Only run the fast active expire cycle if master & enabled. 2013-08-27 09:31:55 +02:00
antirez
303dde3757 Don't update node pong time via gossip.
This feature was implemented in the initial days of the Redis Cluster
implementation but is not a good idea at all.

1) It depends on clocks being synchronized, which is already very bad.
2) Moreover it adds a bug: since the pong time is updated via gossip, no
new PING is ever sent by the current node, with the effect of no PONG
received, no update of tables, and no clearing of the PFAIL flag.

In general, trusting other nodes about the reachability of other nodes is
a broken distributed programming model.
2013-08-26 16:16:25 +02:00
antirez
6ae37b0e1d Cluster: set event handler in cluster bus listening socket.
The commit using listenToPort() introduced this bug by no longer
creating the event handler to handle incoming messages from the cluster
bus.
2013-08-22 14:53:53 +02:00
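
A sketch of the missing registration, assuming the cluster bus listening
sockets live in server.cfd[] and are accepted by clusterAcceptHandler()
(both names are assumptions):

    for (int j = 0; j < server.cfd_count; j++) {
        /* Fire clusterAcceptHandler() whenever another node connects to
         * the cluster bus port. */
        if (aeCreateFileEvent(server.el, server.cfd[j], AE_READABLE,
            clusterAcceptHandler, NULL) == AE_ERR)
                redisPanic("Unrecoverable error creating Cluster bus file event.");
    }
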
antirez
81a6a9639a Use listenToPort() in cluster.c as well. 2013-08-22 14:05:07 +02:00
antirez
4f310e05c0 Opening TCP listening ports refactored into a function. 2013-08-22 14:01:16 +02:00
antirez
0f0cc88589 Print error message when can't bind * on any address. 2013-08-22 13:02:59 +02:00
antirez
042776aff7 Cluster: fix CLUSTER MEET ip address validation.
This was broken by the IPv6 support patches.
2013-08-22 11:54:28 +02:00
antirez
9cf30132cc Cluster: process MEET packets as PING packets.
Somehow a previous commit broke this, so CLUSTER MEET was no longer
working.
2013-08-22 11:53:28 +02:00
antirez
b804afcf01 Use a safe dict.c iterator in clusterCron(). 2013-08-21 15:51:15 +02:00
antirez
35a977c499 Fix for issue #1214 simplified. 2013-08-21 11:36:09 +02:00
Salvatore Sanfilippo
038e356dbc Merge pull request #1214 from kaoshijuan/unstable
fixed initServer fail problem
2013-08-21 02:18:41 -07:00
antirez
7e9929e12e Use printf %zu specifier to print private_dirty. 2013-08-20 12:04:57 +02:00
antirez
1c08302edc dictFingerprint(): cast pointers to integer of same size. 2013-08-20 11:49:55 +02:00
antirez
00ddc3500c Revert "Fixed typo in dict.c comment: 265 -> 256."
This reverts commit 6253180abd.
2013-08-19 17:25:48 +02:00
antirez
6253180abd Fixed typo in dict.c comment: 265 -> 256. 2013-08-19 15:10:37 +02:00
antirez
1c75408457 assert.h replaced with redisassert.h when appropriate.
Also a warning was suppressed by including unistd.h in redisassert.h
(needed for _exit()).
2013-08-19 15:01:21 +02:00
antirez
ca294c6b1e Added redisassert.h as drop in replacement for assert.h.
By using the redisassert.h version of assert() you get stack traces in the
log instead of a process disappearing on assertions.
2013-08-19 15:01:15 +02:00
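
A minimal sketch of such a drop-in header; the _redisAssert() reporting
function and the include guard name are assumptions:

    #ifndef __REDIS_ASSERT_H
    #define __REDIS_ASSERT_H

    #include <unistd.h> /* needed for _exit() */

    /* Report the failed expression (and a stack trace) in the log instead
     * of silently aborting like the libc assert() would. */
    #define assert(_e) ((_e) ? (void)0 : \
        (_redisAssert(#_e, __FILE__, __LINE__), _exit(1)))

    void _redisAssert(char *estr, char *file, int line);

    #endif
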
antirez
905d4822da dictFingerprint() fingerprinting made more robust.
The previous hashing used the trivial algorithm of xoring the integers
together. This is not optimal as it is very likely that different
hash table setups will hash to the same value; for instance a hash table
at the start of the rehashing process and one at the end will have the
same fingerprint.

Now we hash the N integers in a smarter way, by summing every integer to
the previous hash and hashing the result again (see the code for
further details). This way it is a lot less likely that we get a
collision. Moreover this way of hashing explicitly protects against the
same set of integers in a different order hashing to the same number.

This commit is related to issue #1240.
2013-08-19 15:01:12 +02:00
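
A sketch of the mixing loop described above; integers[] stands for the sampled
dict properties and int64hash() for whatever 64-bit integer hash dict.c
actually uses (both names are assumptions):

    long long hash = 0;
    for (int j = 0; j < 6; j++) {
        hash += integers[j];    /* fold the next property into the running hash */
        hash = int64hash(hash); /* re-hash so both values and their order matter */
    }
    return hash;
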
antirez
3039e806d8 Fix comments for correctness in zunionInterGenericCommand().
Related to issue #1240.
2013-08-19 15:01:05 +02:00
antirez
cfb9d025c6 Properly init/release iterators in zunionInterGenericCommand().
This commit does mainly two things:

1) It fixes zunionInterGenericCommand() by removing mass-initialization
of all the iterators used, so that we don't violate the unsafe iterator
API of dictionaries. This fixes issue #1240.

2) Since the zui* APIs required the iterator to be initialized in the
zsetopsrc structure in order to use even non-iterator related APIs, this
commit removes this strict requirement by accessing objects directly via
the op->subject->ptr pointer we have to the object.
2013-08-19 15:01:01 +02:00
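
A sketch of the per-source iteration pattern after the fix, using the
zuiInitIterator()/zuiNext()/zuiClearIterator() helpers from t_zset.c; the
exact call sites are hedged:

    /* Initialize the iterator only right before walking one source and
     * release it right after, instead of mass-initializing all of them. */
    zuiInitIterator(&src[i]);
    while (zuiNext(&src[i], &zval)) {
        /* ... accumulate the member's score into the destination zset ... */
    }
    zuiClearIterator(&src[i]);
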
antirez
48cde3fe47 dict.c iterator API misuse protection.
dict.c allows the user to create unsafe iterators, that is, iterators
that will not touch the dictionary data structure in any way, preventing
copy on write, but at the same time are limited in their usage.

The limitation is that when iterating with an unsafe iterator, no call
to other dictionary functions must be done inside the iteration loop,
otherwise the dictionary may be incrementally rehashed, resulting in
missing elements in the set of the elements returned by the iterator.

However after introducing this kind of iterator a number of bugs were
found due to misuses of the API, and we are still finding
bugs about this issue. The bugs are not trivial to track because the
effect is just missing elements during the iteration.

This commit introduces auto-detection of the API misuse. The idea is
that an unsafe iterator has a contract: from initialization to the
release of the iterator the dictionary should not change.

So we take a fingerprint of the dictionary state, xoring a few important
dict properties when the unsafe iterator is initialized. We later check,
when the iterator is released, whether the fingerprint is still the same.
If it is not, we found a misuse of the iterator, as disallowed API calls
changed the internal state of the dictionary.

This code was checked against a real bug, issue #1240.

This is what Redis prints (aborting) when a misuse is detected:

Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)),
function dictReleaseIterator, file dict.c, line 587.
2013-08-19 15:00:57 +02:00
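
A condensed sketch of the contract check; the field names follow the
description above but are assumptions:

    dictIterator *dictGetIterator(dict *d) {
        dictIterator *iter = zmalloc(sizeof(*iter));
        iter->d = d;
        iter->safe = 0;
        iter->fingerprint = dictFingerprint(d); /* snapshot of the dict state */
        /* ... */
        return iter;
    }

    void dictReleaseIterator(dictIterator *iter) {
        /* If the dict changed while an unsafe iterator was live, the
         * fingerprint no longer matches: abort loudly. */
        if (!iter->safe)
            assert(iter->fingerprint == dictFingerprint(iter->d));
        zfree(iter);
    }
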
antirez
7398930756 Use precomputed objects for bulk and mbulk prefixes. 2013-08-12 12:50:49 +02:00
antirez
c06de115af replicationFeedSlaves() func name typo: feedReplicationBacklogWithObject -> feedReplicationBacklog. 2013-08-12 12:50:45 +02:00
antirez
dcc48a8143 replicationFeedSlave() reworked for correctness and speed.
The previous code using a static buffer as an optimization was lame:

1) Premature optimization: actually it was *slower* than naive code
   because it resulted in the creation / destruction of the object
   encapsulating the output buffer.
2) The code was very hard to test, since it needed specific
   tests for command lines exceeding the size of the static buffer.
3) As a result of "2" the code was buggy, as the current tests were not
   able to stress specific corner cases.

It was replaced with easy to understand code that is safer and faster.
2013-08-12 12:50:29 +02:00
antirez
aa05128f51 Fix a PSYNC bug caused by a variable name typo. 2013-08-12 11:51:35 +02:00
antirez
610224d0d0 Fix sdsempty() prototype in sds.h. 2013-08-12 11:38:21 +02:00
antirez
89ffba9133 Replication: better way to send a preamble before RDB payload.
During the replication full resynchronization process, the RDB file is
transferred from the master to the slave. However there is a short
preamble to send, which is currently just the bulk payload length of the
file in the usual Redis form $..length..<CR><LF>.

This preamble used to be sent with a direct write call, assuming that
there was always room in the socket output buffer to hold the few bytes
needed. However this does not scale in case we'll need to send more
data, and is not very robust code in general.

This commit introduces a more general mechanism to send a preamble up to
2GB in size (the max length of an sds string) in a non-blocking way.
2013-08-12 10:29:14 +02:00
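
A sketch of the idea: build the preamble as an sds attached to the slave
client and let the existing non-blocking write handler drain it before the
RDB bytes; the replpreamble field name is an assumption:

    /* When the transfer starts, queue the preamble instead of write()ing it. */
    slave->replpreamble = sdscatprintf(sdsempty(), "$%lld\r\n",
        (long long) slave->repldbsize);

    /* Later, in the socket write handler, the preamble is sent first and the
     * RDB payload is streamed only once the preamble sds is fully consumed. */
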
antirez
db862e8ef0 redis-benchmark: changes to random arguments substitution.
Before this commit redis-benchmark supported random arguments in the
form of :rand:000000000000. In every string of that form, the zeros were
replaced with a random number of 12 digits at every command invocation.

However this was far from perfect as it did not allow generating plain
random numbers as arguments; there was always the :rand: prefix.

Now instead every argument in the form __rand_int__ is replaced with a
12-digit number. Note that "__rand_int__" is 12 characters itself.

In order to implement the new semantics, it was necessary to change a few
things in the internals of redis-benchmark, as new clients are created by
cloning old clients; without a stable prefix such as ":rand:" the old
way of cloning a client was no longer able to understand, from the old
command line, the position of the random strings to substitute.

Now instead a client structure is passed as a reference for cloning, so
that we can directly clone the offsets inside the command line.
2013-08-08 16:42:08 +02:00
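
For example, with the new form a benchmark invocation might look like this
(key and value names chosen purely for illustration):

    redis-benchmark -r 100000 -n 1000000 set key:__rand_int__ value
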
antirez
92ab77f8d5 redis-benchmark: replace snprintf()+memcpy with faster code.
This change was profiler-driven, but the actual effect is hard to
measure in real-world redis benchmark runs.
2013-08-08 14:33:14 +02:00
Salvatore Sanfilippo
de698798bc Merge pull request #1234 from badboy/patch-2
Little typo
2013-08-07 07:08:04 -07:00
Jan-Erik Rediger
74c0af21f5 Little typo 2013-08-07 16:05:09 +02:00
antirez
36a0947185 redis-benchmark: fix memory leak introduced by 346256f 2013-08-07 16:00:18 +02:00
antirez
346256f933 redis-benchmark: max pipeline length hardcoded limit removed. 2013-08-07 15:58:58 +02:00
antirez
6cbfdd9520 redis-benchmark: fix db selection when :rand: feature is used. 2013-08-06 19:01:54 +02:00
antirez
d52c9b6cdb redis-benchmark: ability to SELECT a specified db number. 2013-08-06 18:50:54 +02:00
antirez
112fa47978 Add per-db average TTL information in INFO output.
Example:

db0:keys=221913,expires=221913,avg_ttl=655

The algorithm uses a running average with only two samples (current and
previous). Keys found to be expired are counted as TTL zero even if
the actual TTL may be negative.

The TTL is reported in milliseconds.
2013-08-06 15:00:43 +02:00
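
A sketch of the two-sample running average, assuming db->avg_ttl keeps the
previous estimate and ttl is the TTL (in milliseconds) of the key just
sampled:

    if (ttl < 0) ttl = 0;                  /* expired keys count as TTL zero */
    if (db->avg_ttl == 0) db->avg_ttl = ttl;
    db->avg_ttl = (db->avg_ttl + ttl) / 2; /* previous estimate + new sample */
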
antirez
4befe73b60 activeExpireCycle(): fix about fast cycle early start.
We don't want to repeat a fast cycle too soon: the previous code was
broken, as we need to wait two times the period *since* the start of the
previous cycle, so that there is an even spacing between cycles:

.-> start                   .-> second start
|                           |
+-------------+-------------+--------------+
| first cycle |    pause    | second cycle |
+-------------+-------------+--------------+

The second and the first start must be PERIOD*2 microseconds apart, hence
the *2 in the new code.
2013-08-06 12:59:04 +02:00
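
A sketch of the guard with the *2, using assumed names for the fast-cycle
duration constant and the static timestamp of the previous fast cycle:

    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the last full cycle did not hit its
         * time limit, or if PERIOD*2 microseconds have not yet elapsed since
         * the *start* of the previous fast cycle. */
        if (!timelimit_exit) return;
        if (start < last_fast_cycle + ACTIVE_EXPIRE_CYCLE_FAST_DURATION*2) return;
        last_fast_cycle = start;
    }
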
antirez
6500fabfb8 Some activeExpireCycle() refactoring. 2013-08-06 12:55:49 +02:00
antirez
d398f38879 Remove dead code and fix comments for new expire code. 2013-08-06 12:36:13 +02:00
antirez
66a26471dc Draft #2 for key collection algo: more improvements.
This commit makes the fast collection cycle time configurable; at
the same time it does not allow a new fast collection cycle to run again
for the same amount of time as the max duration of the fast
collection cycle.
2013-08-05 16:14:28 +02:00
antirez
b09ea1bd90 Draft #1 of a new expired keys collection algorithm.
The main idea here is that when we are no longer able to expire keys at
the rate they are created, we can't just block longer in the normal expire
cycle, as this would result in too big latency spikes.

For this reason the commit introduces a "fast" expire cycle that does
not run for more than 1 millisecond but is called in the beforeSleep()
hook of the event loop, so much more often, and with a frequency bound
to the frequency of executed commands.

The fast expire cycle is only called when the standard expiration
algorithm runs out of time, that is, when it consumed more than
REDIS_EXPIRELOOKUPS_TIME_PERC of CPU in a given cycle without being able
to bring the number of already expired but not yet collected keys
below 25% of the total number of keys.

You can test this commit with different loads, but a simple way is to
use the following:

Extreme load with pipelining:

redis-benchmark -r 100000000 -n 100000000  \
        -P 32 set ele:rand:000000000000 foo ex 2

Remove the -P 32 in order to avoid pipelining, for a more real-world
load.

In another terminal tab you can monitor the Redis behavior with:

redis-cli -i 0.1 -r -1 info keyspace

and

redis-cli --latency-history

Note: this commit will make Redis print a lot of debug messages; it
is not a good idea to use it in production.
2013-08-05 12:05:22 +02:00
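
A sketch of how the fast cycle can be hooked into the event loop; the flag
and function names follow the surrounding commits but are assumptions:

    void beforeSleep(struct aeEventLoop *eventLoop) {
        REDIS_NOTUSED(eventLoop);

        /* Run the time-bounded (max ~1 ms) fast expire cycle before the event
         * loop goes to sleep, but only on masters with active expire enabled. */
        if (server.active_expire_enabled && server.masterhost == NULL)
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_FAST);

        /* ... rest of beforeSleep() ... */
    }
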
antirez
cd0ea1f202 Test: regression test for issue #1221. 2013-07-29 17:39:28 +02:00
antirez
c151eb6d92 Fix replicationFeedSlaves() off-by-one bug.
This fixes issue #1221.
2013-07-28 12:49:34 +02:00
antirez
bf56948fd0 Remove dead variable bothsds from object.c.
Thanks to @run and @badboy for spotting this.
Triva: clang was not able to provide me a warning about that when
compiling.

This closes #1024 and #1207, committing the change myself as the pull
requests no longer apply cleanly after other changes to the same
function.
2013-07-28 11:00:09 +02:00