Commit Graph

121 Commits

Author SHA1 Message Date
Okada Haruki
c1cc5ca767
Fix typo 2019-09-17 06:18:01 +09:00
antirez
b2079cb508 HyperLogLog: fix the fix of a corruption bug. 2019-07-31 10:36:23 +02:00
John Sully
d659654f53 Fix HLL corruption bug 2019-07-29 18:11:52 -04:00
antirez
8ea906a3e8 HyperLogLog: fix comment in hllCount(). 2019-03-16 09:15:57 +01:00
antirez
e216ceaf0e HyperLogLog: handle wrong offset in the base case. 2019-03-15 17:16:06 +01:00
antirez
a4b90be9fc HyperLogLog: enlarge reghisto variable for safety. 2019-03-15 17:10:16 +01:00
John Sully
9f13b2bd49 Fix hyperloglog corruption 2019-03-15 13:13:01 +01:00
antirez
ddb98ad56f RESP3: hyperloglog.c updated. 2019-01-09 17:00:29 +01:00
Jack Drogon
93238575f7 Fix typo 2018-07-03 18:19:46 +02:00
Yossi Gottlieb
e1222d8b10 Clean gcc 7.x warnings, redis-cli cluster fix. 2018-06-03 15:54:30 +03:00
antirez
36b78e8dfe Aesthetic changes to PR #4749. 2018-03-16 16:57:53 +01:00
Otmar Ertl
15d7e61701 fixed compilation error when using clang as reported by michael-grunder 2018-03-14 21:00:06 +01:00
Otmar Ertl
97bde9f623 use all 64 bits of the hash value instead of 63 2018-03-11 09:18:00 +01:00
Otmar Ertl
44698f45e7 made constant static 2018-03-10 20:44:20 +01:00
Otmar Ertl
633983d479 improved definition of HLL_Q 2018-03-10 20:22:42 +01:00
Otmar Ertl
1e9a774871 improved HyperLogLog cardinality estimation
based on method described in https://arxiv.org/abs/1702.01284
that does not rely on any magic constants
2018-03-10 20:13:21 +01:00
Otmar Ertl
6470b21f59 replaced tab by spaces 2018-03-10 20:09:41 +01:00
antirez
0b561883b4 Hyperloglog: refresh hdr variable correctly.
This is a fix for the #3819 improvements. The o->ptr may change because
of hllSparseSet() calls, so 'hdr' must be correctly re-fetched.
2017-12-22 11:26:31 +01:00
antirez
27b81f3fbc Hyperloglog: Support for PFMERGE sparse encoding as target.
This is a fix for #3819.
2017-12-22 11:01:27 +01:00
antirez
cbe9d725a7 Hyperloglog: refactoring of sparse/dense add function.
The commit splits the add functions into a set() and add() set of
functions, so that it's possible to set registers in an independent way
just having the index and count.

Related to #3819, otherwise a fix is not possible.
2017-12-22 11:00:22 +01:00
antirez
5bd46d33db Fix isHLLObjectOrReply() to handle integer encoded strings.
Close #3766.
2017-07-11 12:44:59 +02:00
Salvatore Sanfilippo
b3391fd853 Use ARM unaligned accesses ifdefs for SPARC as well. 2017-02-23 22:39:44 +08:00
Salvatore Sanfilippo
72d6d64771 ARM: Avoid memcpy() in MurmurHash64A() if we are using 64 bit ARM.
However note that in architectures supporting 64 bit unaligned
accesses memcpy(...,...,8) is likely translated to a simple
word memory movement anyway.
2017-02-19 15:00:46 +00:00
Salvatore Sanfilippo
1e272a6b52 ARM: Fix 64 bit unaligned access in MurmurHash64A(). 2017-02-19 14:01:58 +00:00
antirez
87538cb7fe Switch PFCOUNT to LogLog-Beta algorithm.
The new algorithm provides the same speed with a smaller error for
cardinalities in the range 0-100k. Before switching, the new and old
algorithm behavior was studied in details in the context of
issue #3677. You can find a few graphs and motivations there.
2016-12-16 11:07:30 +01:00
antirez
0224be8811 Use llroundl() before converting loglog-beta output to integer.
Otherwise for small cardinalities the algorithm will output something
like, for example, 4.99 for a candinality of 5, that will be converted
to 4 producing a huge error.
2016-12-16 11:07:30 +01:00
Harish Murthy
c55e3fbae5 LogLog-Beta Algorithm support within HLL
Config option to use LogLog-Beta Algorithm for Cardinality
2016-12-16 11:07:30 +01:00
antirez
32f80e2f1b RDMF: More consistent define names. 2015-07-27 14:37:58 +02:00
antirez
40eb548a80 RDMF: REDIS_OK REDIS_ERR -> C_OK C_ERR. 2015-07-26 23:17:55 +02:00
antirez
2d9e3eb107 RDMF: redisAssert -> serverAssert. 2015-07-26 15:29:53 +02:00
antirez
14ff572482 RDMF: OBJ_ macros for object related stuff. 2015-07-26 15:28:00 +02:00
antirez
554bd0e7bd RDMF: use client instead of redisClient, like Disque. 2015-07-26 15:20:52 +02:00
antirez
cef054e868 RDMF (Redis/Disque merge friendlyness) refactoring WIP 1. 2015-07-26 15:17:18 +02:00
antirez
06e76bc3e2 Better read-only behavior for expired keys in slaves.
Slaves key expire is orchestrated by the master. Sometimes the master
will send the synthesized DEL to expire keys on the slave with a non
trivial delay (when the key is not accessed, only the incremental expiry
algorithm will expire it in background).

During that time, a key is logically expired, but slaves still return
the key if you GET (or whatever) it. This is a bad behavior.

However we can't simply trust the slave view of the key, since we need
the master to be able to send write commands to update the slave data
set, and DELs should only happen when the key is expired in the master
in order to ensure consistency.

However 99.99% of the issues with this behavior is when a client which
is not a master sends a read only command. In this case we are safe and
can consider the key as non existing.

This commit does a few changes in order to make this sane:

1. lookupKeyRead() is modified in order to return NULL if the above
conditions are met.
2. Calls to lookupKeyRead() in commands actually writing to the data set
are repliaced with calls to lookupKeyWrite().

There are redundand checks, so for example, if in "2" something was
overlooked, we should be still safe, since anyway, when the master
writes the behavior is to don't care about what expireIfneeded()
returns.

This commit is related to  #1768, #1770, #2131.
2014-12-10 16:10:21 +01:00
antirez
5bd3b9d93f Over 80 chars comment trimmed in pfcountCommand(). 2014-12-02 17:03:22 +01:00
antirez
edca2b14d2 Remove warnings and improve integer sign correctness. 2014-08-13 11:44:38 +02:00
antirez
0adf4482f0 PFSELFTEST: less false positives.
This is just a quickfix, for the nature of the test the right way to fix
it is to average the error of N runs, since otherwise it is always
possible to get a false positive with a bad run, or to minimize too much
this possibility we may end testing with too much "large" error ranges.
2014-07-23 11:43:57 +02:00
Mike Trinkala
ba52cd06c8 Correct the HyperLogLog stale cache flag to prevent unnecessary computations.
Set the MSB as documented.
2014-05-18 07:26:26 -07:00
antirez
5eb7ac0c92 Speedup hllRawSum() processing 8 bytes per iteration.
The internal HLL raw encoding used by PFCOUNT when merging multiple keys
is aligned to 8 bits (1 byte per register) so we can exploit this to
improve performances by processing multiple bytes per iteration.

In benchmarks the new code was several times faster with HLLs with many
registers set to zero, while no slowdown was observed with populated
HLLs.
2014-04-17 18:05:27 +02:00
antirez
192a213274 Speedup SUM(2^-reg[m]) in HyperLogLog computation.
When the register is set to zero, we need to add 2^-0 to E, which is 1,
but it is faster to just add 'ez' at the end, which is the number of
registers set to zero, a value we need to compute anyway.
2014-04-17 17:53:20 +02:00
antirez
0feb2aabca PFCOUNT support for multi-key union. 2014-04-17 17:32:59 +02:00
antirez
fcd2155b6f HyperLogLog low level merge extracted from PFMERGE. 2014-04-17 17:08:43 +02:00
antirez
8e8f8189eb HyperLogLog invalid representation error code set to INVALIDOBJ. 2014-04-16 09:10:30 +02:00
antirez
0bbdaca6a0 PFDEBUG TODENSE added.
Converts HyperLogLogs from sparse to dense. Used for testing.
2014-04-16 09:05:42 +02:00
antirez
402110f9fd User-defined switch point between sparse-dense HLL encodings. 2014-04-15 17:46:51 +02:00
antirez
d541f65d66 PFSELFTEST improved with sparse encoding checks. 2014-04-15 10:10:38 +02:00
antirez
dde8dff73f PFDEBUG ENCODING added. 2014-04-14 19:35:00 +02:00
antirez
54f0156e8c Set HLL_SPARSE_MAX to 3000.
After running a few benchmarks, 3000 looks like a reasonable value to
keep HLLs with a few thousand elements small while the CPU cost is
still not huge.

This covers all the cases where the dense representation would use N
orders of magnitude more space, like in the case of many HLLs with
carinality of a few tens or hundreds.

It is not impossible that in the future this gets user configurable,
however it is easy to pick an unreasoable value just looking at savings
in the space dimension without checking what happens in the time
dimension.
2014-04-14 16:15:55 +02:00
antirez
848d0461f9 Error message for invalid HLL objects unified. 2014-04-14 16:11:54 +02:00
antirez
81ceef7d22 PFMERGE fixed to work with sparse encoding. 2014-04-14 16:09:32 +02:00