46 Commits

Author SHA1 Message Date
antirez
a37d0f8b48 SPOP with count: fix replication for code path #3. 2015-02-11 10:52:28 +01:00
antirez
9feee428f2 SPOP: reimplemented for speed and better distribution.
The old version of SPOP with "count" argument used an API call of dict.c
which was actually designed for a different goal, and was not capable of
good distribution. We follow a different three-cases approach optimized
for different ratios between the set size and the requested number of elements.

The implementation is simpler and allowed the removal of a large amount
of code.
2015-02-11 10:52:28 +01:00
antirez
cc7f0434b5 Change alsoPropagate() behavior to make it more usable.
Now the API automatically creates its argv copy and increments the ref count
of passed objects.
2015-02-11 10:52:27 +01:00
antirez
6b5922dcbb SPOP with count: initial fixes to the implementation.
Several problems are addressed, but a few are still missing.
Replicating this command is more complex than replicating others, because
it needs to replicate multiple SREM commands, so an old API able to do this
was reused (it was kept inside the implementation since it was pretty
obvious that sooner or later it would be useful). The API was improved a bit
so that a command may now opt out of the standard command replication
when the server.dirty counter is incremented, in order to "manually"
replicate what it wants.
2015-02-11 10:52:27 +01:00
Alon Diamant
d74a5a0880 Following @mattsta's friendly review:
1. memory leak in t_set.c has been fixed
2. end-of-line spaces have been removed (from all over the place)
3. for loops have been ordered up to match existing Redis style (less weird)
4. comment format has been fixed (added * at the beginning of every comment line)
2014-12-21 16:13:45 +02:00
Alon Diamant
3c8a75583d Fix: when SPOP is called with count>MAXINT, setTypeRandomElements() gets a negative count argument due to a signed/unsigned mismatch.
setTypeRandomElements() now returns unsigned long, and also uses unsigned long for anything related to the count of members.
spopWithCountCommand() now uses unsigned long elements_returned instead of int, for values returned from setTypeRandomElements().
2014-12-18 14:38:20 +02:00
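The signed/unsigned mismatch described in the entry above can be illustrated in isolation. This is a minimal sketch, not code from the commit: `count_fits_in_int()` and `clamp_count()` are hypothetical helper names, and the point is only that a count larger than INT_MAX cannot travel through an `int` parameter unchanged, while staying in `unsigned long` avoids the problem.

```c
#include <limits.h>

/* Hypothetical helper (not from the Redis source): returns 1 when an
 * unsigned count can be stored in an int without changing its value. */
int count_fits_in_int(unsigned long count) {
    return count <= (unsigned long)INT_MAX;
}

/* Hypothetical helper: caps a requested count at the number of members
 * actually available. All arithmetic stays unsigned, so no value is
 * ever reinterpreted as a negative number. */
unsigned long clamp_count(unsigned long requested, unsigned long available) {
    return requested < available ? requested : available;
}
```

With helpers like these, a request for more than INT_MAX elements from a five-member set is simply capped at five instead of turning into a negative count.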
Alon Diamant
288028876f Added <count> parameter to SPOP:
spopCommand() now runs spopWithCountCommand() in case the <count> param is found.
Added intsetRandomMembers() to Intset: Copies N random members from the set into the caller-supplied 'values' array. Uses either the Knuth or Floyd sampling algorithm depending on the count/size ratio.
Added setTypeRandomElements() to SET type: Returns a number of random elements from a non-empty set. This is a version of setTypeRandomElement() modified to return multiple entries, using dictGetRandomKeys() and intsetRandomMembers().
Added tests for SPOP with <count>: unit/type/set, unit/scripting, integration/aof
--
Cleaned up code a bit to match with required Redis coding style
2014-12-14 12:25:42 +02:00
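The entry above mentions Floyd's sampling algorithm for picking random members. The following is a rough sketch of that technique on plain index ranges, not the intsetRandomMembers() code itself: the name `floyd_sample()`, its signature, and the flag array used for membership checks are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Floyd's sampling: picks k distinct indexes from the range [0, n).
 * 'out' must have room for k entries. Returns the number of indexes
 * written, which is 0 when k > n or when memory cannot be allocated.
 * Illustrative only: rand() % (j + 1) has a small modulo bias, and the
 * O(n) flag array would be replaced by a hash set for large ranges. */
size_t floyd_sample(size_t n, size_t k, size_t *out) {
    if (k > n) return 0;
    unsigned char *chosen = calloc(n ? n : 1, 1); /* membership flags */
    if (chosen == NULL) return 0;
    size_t written = 0;
    for (size_t j = n - k; j < n; j++) {
        size_t t = (size_t)rand() % (j + 1);  /* t is in [0, j] */
        if (chosen[t]) t = j;  /* t already picked: j cannot be, so use it */
        chosen[t] = 1;
        out[written++] = t;
    }
    free(chosen);
    return written;
}
```

The loop runs exactly k times and never retries, which is why this approach suits the case where the requested count is large relative to the set size.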
antirez
95b1979c32 No more trailing spaces in Redis source code. 2014-06-26 18:48:40 +02:00
antirez
c00453da1d SDIFF iterator misuse fixed in diff algorithm #1.
The bug could be easily triggered by:

    SADD foo a b c 1 2 3 4 5 6
    SDIFF foo foo

When the key was the same in two sets, an unsafe iterator was used to
check existence of elements in the same set we were iterating.
Usually this would just result in wrong output; however, with the
dict.c API misuse protection we have in place, the result was actually
a failed assertion that was triggered by the CI test while creating
random datasets for the "MASTER and SLAVE consistency" test.
2013-12-13 11:34:21 +01:00
antirez
ebcb6251e6 SCAN code refactored to parse cursor first.
The previous implementation of SCAN parsed the cursor in the generic
function implementing SCAN, SSCAN, HSCAN and ZSCAN.

The actual higher-level command implementation only checked for empty
keys and returned ASAP in that case. The result was that inverting the
arguments of, for instance, SSCAN and writing:

    SSCAN 0 key

Instead of

    SSCAN key 0

Resulted in no error, since 0 is very likely a non-existing key name.
The iterator just returned no elements at all.

In order to fix this issue the code was refactored to extract the
function to parse the cursor and return the error. Every higher level
command implementation now parses the cursor and later checks if the key
exists or not.
2013-11-05 15:47:50 +01:00
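The ordering described in the entry above (parse the cursor first, check the key afterwards) can be sketched as follows. This is an assumption-laden illustration rather than the refactored Redis code: `parse_cursor()`, `scan_entry()`, and the `scan_status` values are hypothetical names, and key lookup is reduced to a boolean flag.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses a SCAN-style cursor from 's'.
 * Returns 1 and stores the value in '*cursor' on success, 0 when the
 * text is not a plain non-negative decimal integer. */
int parse_cursor(const char *s, unsigned long *cursor) {
    char *end;
    if (s == NULL || !isdigit((unsigned char)s[0])) return 0;
    errno = 0;
    unsigned long value = strtoul(s, &end, 10);
    if (errno == ERANGE || *end != '\0') return 0;
    *cursor = value;
    return 1;
}

typedef enum { SCAN_ERR_CURSOR, SCAN_EMPTY, SCAN_OK } scan_status;

/* Validates the cursor before looking at the key, so swapped
 * arguments produce an error instead of a silent empty reply. */
scan_status scan_entry(const char *cursor_arg, int key_exists,
                       unsigned long *cursor) {
    if (!parse_cursor(cursor_arg, cursor)) return SCAN_ERR_CURSOR;
    if (!key_exists) return SCAN_EMPTY;
    return SCAN_OK;
}
```

With this ordering, the swapped call `SSCAN 0 key` is rejected because "key" is not a valid cursor, even though the key named "0" does not exist.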
antirez
4a1f1cc0d7 SSCAN implemented. 2013-10-28 11:17:32 +01:00
antirez
894eba07c8 Introduction of a new string encoding: EMBSTR
Previously two string encodings were used for string objects:

1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds
string.

2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer
is cast to a long.

This commit introduces an experimental new encoding called
REDIS_ENCODING_EMBSTR that implements an object represented by an sds
string that is not modifiable but allocated in the same memory chunk as
the robj structure itself.

The chunk looks like the following:

+--------------+-----------+------------+--------+----+
| robj data... | robj->ptr | sds header | string | \0 |
+--------------+-----+-----+------------+--------+----+
                     |                       ^
                     +-----------------------+

The robj->ptr points to the contiguous sds string data, so the object
can be manipulated with the same functions used to manipulate plain
string objects, however we need just one malloc and one free in order to
allocate or release this kind of object. Moreover it has better cache
locality.

This new allocation strategy should benefit both memory usage and
performance. A performance gain between 60 and 70% was observed
during micro-benchmarks, however there is more work to do to evaluate
the performance impact and the memory usage behavior.
2013-07-22 10:31:38 +02:00
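The single-allocation idea behind the entry above can be sketched with a simplified structure. This is not the robj or sds layout: `demo_obj` and `demo_create_embedded()` are hypothetical names, and the sds header that sits between the object and the string bytes in the real encoding is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for robj; the real structure has more fields and
 * an sds header precedes the string bytes. */
typedef struct demo_obj {
    int encoding;
    char *ptr;  /* points into the same allocation, right after the struct */
} demo_obj;

/* Allocates the object and its string in one chunk. Returns NULL on
 * allocation failure. A single free() releases both parts. */
demo_obj *demo_create_embedded(const char *s, size_t len) {
    demo_obj *o = malloc(sizeof(*o) + len + 1);
    if (o == NULL) return NULL;
    o->encoding = 1;           /* arbitrary tag for the demo */
    o->ptr = (char *)(o + 1);  /* first byte after the struct */
    memcpy(o->ptr, s, len);
    o->ptr[len] = '\0';
    return o;
}
```

Because the string lives in the same chunk as the object, code that only reads through `ptr` works unchanged, while allocation and release each cost one call instead of two.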
Rock Li
8063155cd0 retval isn't initialized
If all the if conditions fail, the variable retval will be left uninitialized
2013-02-05 15:56:04 +08:00
Gengliang Wang
002747336a Fix a bug in srandmemberWithCountCommand()
In CASE 2, the call to sunionDiffGenericCommand will involve the string "srandmember":
> sadd foo one
(integer 1)
> sadd srandmember two
(integer 2)
> srandmember foo 3
1)"one"
2)"two"
2013-02-04 14:01:08 +08:00
antirez
e41d1d77e3 Generate del events when S*STORE commands delete the destination key. 2013-01-29 13:43:13 +01:00
antirez
fce016d31b Keyspace events: it is now possible to select subclasses of events.
When keyspace events are enabled, the overhead is not severe but
noticeable, so this commit introduces the ability to select subclasses
of events in order to avoid generating events the user is not
interested in.

The events can be selected using redis.conf or CONFIG SET / GET.
2013-01-28 13:15:12 +01:00
antirez
da04e6ed44 Keyspace events added for more commands. 2013-01-28 13:14:56 +01:00
guiquanz
9d09ce3981 Fixed many typos. 2013-01-19 10:59:44 +01:00
antirez
f50e658455 SDIFF is now able to select between two algorithms for speed.
SDIFF used an algorithm that was O(N) where N is the total number
of elements of all the sets involved in the operation.

The algorithm worked like this:

ALGORITHM 1:

1) For the first set, add all the members to an auxiliary set.
2) For all the other sets, remove all the members of the set from the
auxiliary set.

So it is an O(N) algorithm where N is the total number of elements in
all the sets involved in the diff operation.

Cristobal Viedma suggested to modify the algorithm to the following:

ALGORITHM 2:

1) Iterate all the elements of the first set.
2) For every element, check if the element also exists in all the other
remaining sets.
3) Add the element to the auxiliary set only if it does not exist in any
of the other sets.

The complexity of this algorithm in the worst case is O(N*M) where N is
the size of the first set and M the total number of sets involved in the
operation.

However when there are elements in common, with this algorithm we stop
the computation for a given element as soon as we find a duplicated
element in another set.

I (antirez) added an additional step to algorithm 2 to make it faster,
that is to sort the sets to subtract from the biggest to the
smallest, so that it is more likely to find a duplicate in the larger
sets, which are checked before the smaller ones.

WHAT IS BETTER?

Neither, of course: for instance, if the first set is much larger than the
other sets the second algorithm does a lot more work compared to the
first algorithm.

Similarly, if the first set is much smaller than the other sets, the
original algorithm will do more work.

So this commit makes Redis able to guess the number of operations
required by each algorithm, and select the best at runtime according
to the input received.

However, since the second algorithm has better constant times and can do
less work if there are duplicated elements, an advantage is given to the
second algorithm.
2012-11-30 16:36:42 +01:00
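The runtime choice described in the entry above can be sketched as a comparison of two work estimates. This is an illustration under stated assumptions, not the committed code: `sdiff_pick_algorithm()` is a hypothetical name, the algorithms are numbered as in the commit message, and the "advantage" given to algorithm 2 is modelled by halving its estimate because the message does not state the actual factor.

```c
#include <stddef.h>

/* Returns 1 or 2: the algorithm (numbered as in the commit message)
 * with the lower estimated work. sizes[0] is the first set. */
int sdiff_pick_algorithm(const size_t *sizes, size_t numsets) {
    if (numsets == 0) return 1;
    /* Algorithm 1 touches every element of every set once. */
    size_t work_one = 0;
    for (size_t i = 0; i < numsets; i++) work_one += sizes[i];
    /* Algorithm 2 does, in the worst case, one lookup per set for
     * every element of the first set: N*M. */
    size_t work_two = sizes[0] * numsets;
    /* Assumption: the advantage for algorithm 2 is a factor of two. */
    work_two /= 2;
    return work_two <= work_one ? 2 : 1;
}
```

For sets of sizes 1000, 10 and 10 the estimates are 1020 against 1500, so algorithm 1 is chosen; for sizes 10, 1000 and 1000 they are 2010 against 15, so algorithm 2 is chosen.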
antirez
4365e5b2d3 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez
578c94597f SRANDMEMBER <count> leak fixed.
For "CASE 4" (see code) we need to free the element if it's already in
the result dictionary and adding it failed.
2012-09-21 11:55:32 +02:00
antirez
be90c803e3 Added the SRANDMEMBER key <count> variant.
SRANDMEMBER called with just the key argument can only return a single
random element from a Redis Set. However, many users need to return
multiple unique elements from a Set. This is not a trivial problem to
handle on the client side, and for truly good performance a C
implementation was required.

After many requests for this feature it was finally implemented.

The problem in implementing this command is the strategy to follow when
the number of elements the user asks for is close to the number of
elements that are already inside the set. In this case asking the
dictionary API for random elements, and trying to add them to a temporary
set, may result in extremely poor performance, as most add operations
will be wasted on duplicated elements.

For this reason this implementation uses a different strategy in this
case: the Set is copied, and random elements are removed from the copy
until it reaches the specified count.

The code actually uses 4 different algorithms optimized for the
different cases.

If the count is negative, the command changes behavior and allows for
duplicated elements in the returned subset.
2012-09-21 11:55:28 +02:00
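The case selection described in the entry above can be sketched as a small dispatcher. This is a hedged illustration: the commit message names only the negative-count and near-size cases explicitly, so the four-way split, the names `sample_strategy` and `pick_strategy()`, and the `NEAR_FACTOR` threshold are all assumptions made for the example.

```c
#include <stddef.h>

typedef enum {
    STRATEGY_WITH_DUPLICATES,  /* negative count: repeated elements allowed */
    STRATEGY_WHOLE_SET,        /* count covers the whole set */
    STRATEGY_COPY_AND_REMOVE,  /* count close to the set size */
    STRATEGY_ADD_UNTIL_FULL    /* count small relative to the set size */
} sample_strategy;

/* Assumption: "near" means the count exceeds set_size / NEAR_FACTOR.
 * The commit message does not give a threshold. */
#define NEAR_FACTOR 3

sample_strategy pick_strategy(long count, size_t set_size) {
    if (count < 0) return STRATEGY_WITH_DUPLICATES;
    size_t wanted = (size_t)count;
    if (wanted >= set_size) return STRATEGY_WHOLE_SET;
    /* Division avoids overflow that wanted * NEAR_FACTOR could cause. */
    if (wanted > set_size / NEAR_FACTOR) return STRATEGY_COPY_AND_REMOVE;
    return STRATEGY_ADD_UNTIL_FULL;
}
```

Under these assumptions, asking for 4 elements from a 10-element set selects the copy-and-remove path, while asking for 2 elements from a 100-element set selects the add-until-full path.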
Erik Dubbelboer
65fd32ab0a Fixed some spelling errors in the comments 2012-04-07 14:40:29 +02:00
antirez
c0ba9ebe13 dict.c API names modified to be more concise and consistent. 2011-11-08 17:07:55 +01:00
antirez
eab0e26e03 replaced redisAssert() with redisAssertWithInfo() in a shitload of places. 2011-10-04 18:43:03 +02:00
antirez
c1c9d551da Fix for bug 561 and other related problems 2011-06-20 17:19:36 +02:00
antirez
3738ff5f32 Fix for the variadic version of SREM. Regression test added. 2011-05-31 20:14:29 +02:00
antirez
dd1eefa4f3 Fixed SINTER[STORE] problem related to the new copy on write safe iterator 2011-05-15 12:18:00 +02:00
antirez
b3a96d454e Variadic SREM 2011-04-19 17:37:03 +02:00
antirez
22f294d24a variadic SADD 2011-04-15 18:08:32 +02:00
antirez
30318c1ddd SPOP replication/AOF patch ported to unstable branch 2011-02-16 12:41:40 +01:00
antirez
cea8c5cd75 touched key for WATCH refactored into a more general thing that can be used also for the cache system. Some more changes towards diskstore working. 2010-12-29 19:39:42 +01:00
antirez
1b508da7ca SINTER/MEMBERS are now COW friendly, also some refactoring around was needed to get this result. 2010-12-09 23:01:09 +01:00
antirez
a5be65f71c COW friendly versions of SPOP and SRANDMEMBER commands, with some change to the set encoding-agnostic API. 2010-12-09 10:21:02 +01:00
Pieter Noordhuis
75b41de8ca Convert objects in the command procs instead of the protocol code 2010-10-17 17:21:41 +02:00
Pieter Noordhuis
b70d355521 Use existing reply functions where possible 2010-09-02 19:52:04 +02:00
Pieter Noordhuis
0537e7bf80 Use specialized function to add multi bulk reply length 2010-09-02 12:51:14 +02:00
Pieter Noordhuis
b301c1fc2b Wrapper for adding unknown multi bulk length to reply list 2010-08-30 16:39:14 +02:00
antirez
ec7e138926 test for intset integer encodability test and some small refactoring 2010-08-26 18:47:03 +02:00
antirez
23c64fe50d translated a few long long into int64_t for correctness and to avoid compilation warnings as well 2010-08-26 18:11:26 +02:00
Pieter Noordhuis
740eee1cc6 Fix type that was not renamed and compiler warning 2010-08-26 12:13:51 +02:00
Pieter Noordhuis
cb72d0f155 Rename iterator to setTypeIterator for consistency 2010-08-21 11:38:24 +02:00
Pieter Noordhuis
aaada3f962 Merge branch 'master' into intset-split
Conflicts:
	src/Makefile
	src/t_set.c
2010-08-20 12:40:55 +02:00
antirez
5b4bff9c17 WATCH is now affected only when write commands actually modify the key content 2010-07-12 12:01:15 +02:00
Pieter Noordhuis
96ffb2fe97 merged intset code into the split files 2010-07-02 19:57:12 +02:00
antirez
e2641e09cc redis.c split into many different C files.
networking related stuff moved into networking.c

moved more code

more work on layout of source code

SDS instantaneous memory saving. By Pieter and Salvatore at VMware ;)

cleanly compiling again after the first split, now splitting it in more C files

moving more things around... work in progress

split replication code

splitting more

Sets split

Hash split

replication split

even more splitting

more splitting

minor change
2010-07-01 14:38:51 +02:00