Commit Graph

137 Commits

Author SHA1 Message Date
oranagra
7aa9e6d2ae active memory defragmentation 2016-12-30 03:37:52 +02:00
antirez
ac61f90625 DEBUG: new "ziplist" subcommand added. Dumps a ziplist on stdout.
The commit improves ziplistRepr() and adds a new debugging subcommand so
that we can trigger the dump directly from the Redis API.
This command capability was used while investigating issue #3684.
2016-12-16 09:02:50 +01:00
antirez
2669fb8364 PSYNC2: different improvements to Redis replication.
The gist of the changes is that partial resynchronizations between
slaves and masters (without the need for a full resync with RDB transfer
and so forth) now work in a number of cases where they were impossible
in the past. For instance:

1. When a slave is promoted to master, the slaves of the old master can
partially resynchronize with the new master.

2. Chained slaves (slaves of slaves) can be moved to replicate from other
slaves or the master itself, without requiring a full resync.

3. The master itself, after being turned into a slave, is able to
partially resynchronize with the new master, when it joins replication
again.

To achieve this, the following main changes were made:

* Slaves also keep a replication backlog, not just masters.

* Same stream replication for all the slaves and sub slaves. The
replication stream is identical from the top level master to its slaves
and is also the same from the slaves to their sub-slaves and so forth.
This means that if a slave is later promoted to master, it has the
same replication backlog, and can partially resynchronize with its
slaves (that were previously slaves of the old master).

* A given replication history is no longer identified by the `runid` of
a Redis node. There is instead a `replication ID` which changes every
time the instance has a new history no longer coherent with the past
one. So, for example, slaves publish the same replication history of
their master, however when they are turned into masters, they publish
a new replication ID, but still remember the old ID, so that they are
able to partially resynchronize with slaves of the old master (up to a
given offset).

* The replication protocol was slightly modified so that a new extended
+CONTINUE reply from the master is able to inform the slave of a
replication ID change.

* REPLCONF CAPA is used in order to notify masters that a slave is able
to understand the new +CONTINUE reply.

* The RDB file was extended with an auxiliary field that is able to
select a given DB after loading in the slave, so that the slave can
continue receiving the replication stream from the point it was
disconnected without requiring the master to insert "SELECT" statements.
This is useful in order to guarantee the "same stream" property, because
the slave must be able to accumulate an identical backlog.

* Slave pings to sub-slaves are now sent in a special form when the
top-level master is disconnected, so as not to interfere with the
replication stream. We just use out-of-band "\n" bytes as in other parts
of the Redis protocol.
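
The replication ID rule described above can be sketched as follows. This is a minimal illustration, not the Redis implementation: the class `ReplState`, its field names, and the function `can_partial_resync` are hypothetical, and the backlog range check is an assumption about how "up to a given offset" combines with the backlog.

```python
from dataclasses import dataclass


@dataclass
class ReplState:
    """Hypothetical view of what an instance remembers about its history."""
    replid: str                # current replication ID
    old_replid: str            # ID used before this instance was promoted
    old_replid_max_offset: int # last offset still valid under old_replid
    backlog_start: int         # first offset still held in the backlog
    end_offset: int            # current end of the replication stream


def can_partial_resync(state: ReplState, asked_id: str, asked_offset: int) -> bool:
    """Return True if a slave asking for (asked_id, asked_offset) can continue."""
    # The slave must share our current history, or the history we had
    # before promotion, and in that case only up to the switch offset.
    same_history = asked_id == state.replid or (
        asked_id == state.old_replid
        and asked_offset <= state.old_replid_max_offset
    )
    # Assumption: the requested offset must still be inside the backlog.
    in_backlog = state.backlog_start <= asked_offset <= state.end_offset
    return same_history and in_backlog


promoted = ReplState("new-id", "old-id", 1000, 500, 1200)
print(can_partial_resync(promoted, "old-id", 900))   # True
print(can_partial_resync(promoted, "old-id", 1100))  # False
```

A slave of the old master that stopped before the promotion offset can continue, while one claiming the old ID past that offset needs a full resync.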

An old design document is available here:

https://gist.github.com/antirez/ae068f95c0d084891305

However, the implementation is not identical to the description because,
during the work to implement it, various changes were needed in order
to make things work well.
2016-11-09 15:37:15 +01:00
antirez
6782e774f1 debug.c: include dlfcn.h regardless of BACKTRACE support. 2016-09-27 00:29:47 +02:00
oranagra
309c2bcd1b add zmalloc used mem to DEBUG SDSLEN 2016-09-16 10:29:27 +02:00
antirez
78f35f8d2c Memory related subcommands of DEBUG moved to MEMORY. 2016-09-16 10:26:23 +02:00
antirez
d35deb2327 debug.c: no need to define _GNU_SOURCE, is defined in fmacros.h. 2016-09-09 11:15:10 +02:00
antirez
6211e77ab6 crash log - improve code dump with more info and called symbols. 2016-09-09 11:00:19 +02:00
oranagra
24811fcb1b crash log - add hex dump of function code 2016-09-08 14:14:57 +02:00
Yossi Gottlieb
8f3a4df775 Use const in Redis Module API where possible. 2016-06-20 23:08:06 +03:00
antirez
840ac20855 DEBUG command self documentation. 2016-05-04 12:45:55 +02:00
Oran Agra
f8909a2579 add DEBUG JEMALLOC PURGE and JEMALLOC INFO cleanup 2016-04-25 16:47:42 +03:00
antirez
a1c9c05e17 Hopefully better memory test on crash.
The old test, designed to apply an invertible transformation to the bits
in order to avoid altering the original memory content, was not as
effective as redis-server --test-memory. The former often
reported OK while the latter was able to spot the error.

So the test was replaced with one that may perform better. However,
the new one must back up the memory tested, so it tests memory in small
pieces. This limits the effectiveness because of the CPU caches. Some
attempt is made to trash the CPU cache between the fill
and the check stages, but unfortunately not for the addressing test.

We'll see if this test will be able to find errors where the old one failed.
2015-12-16 17:41:22 +01:00
antirez
b9aeb98156 Suppress harmless warnings. 2015-12-16 12:36:32 +01:00
antirez
30f057d88f Crash report format improvements. 2015-12-16 12:14:55 +01:00
antirez
6db8e8569d Log address causing SIGSEGV. 2015-12-15 18:00:29 +01:00
antirez
96628cc40d fix sprintf and snprintf format string
There are some cases of printing an unsigned integer with the %d conversion
specifier and vice versa (a signed integer with the %u specifier).

Patch by Sergey Polovko. Backported to Redis from Disque.
2015-11-28 09:05:41 +01:00
antirez
f26072eb66 More reliable DEBUG loadaof.
Make sure to flush the AOF output buffers before reloading.
Result: fewer timing-related false positives in AOF tests.
2015-10-30 12:06:09 +01:00
antirez
ff6d296000 Scripting: ability to turn on Lua commands style replication globally.
Currently this feature is only accessible via DEBUG for testing, since
otherwise depending on the instance configuration a given script works
or is broken, which is against the Redis philosophy.
2015-10-30 12:06:09 +01:00
antirez
35a0c772b5 DEBUG RESTART/CRASH-AND-RECOVER [delay] implemented. 2015-10-13 11:12:25 +02:00
antirez
c69c6c80fb Lazyfree: ability to free whole DBs in background. 2015-10-01 13:02:26 +02:00
antirez
974514b936 Lazyfree: Hash converted to use plain SDS WIP 4. 2015-10-01 13:02:25 +02:00
antirez
afc4b9241c DEBUG DIGEST Set type memory leak fixed. 2015-10-01 13:02:24 +02:00
antirez
a7c5be18a8 Lazyfree: Sorted sets converted to plain SDS. (several commits squashed) 2015-10-01 13:02:24 +02:00
antirez
86d48efbfd Lazyfree: Convert Sets to use plain SDS (several commits squashed). 2015-10-01 13:02:24 +02:00
antirez
32f80e2f1b RDMF: More consistent define names. 2015-07-27 14:37:58 +02:00
antirez
40eb548a80 RDMF: REDIS_OK REDIS_ERR -> C_OK C_ERR. 2015-07-26 23:17:55 +02:00
antirez
2d9e3eb107 RDMF: redisAssert -> serverAssert. 2015-07-26 15:29:53 +02:00
antirez
14ff572482 RDMF: OBJ_ macros for object related stuff. 2015-07-26 15:28:00 +02:00
antirez
554bd0e7bd RDMF: use client instead of redisClient, like Disque. 2015-07-26 15:20:52 +02:00
antirez
424fe9afd9 RDMF: redisLog -> serverLog. 2015-07-26 15:17:43 +02:00
antirez
cef054e868 RDMF (Redis/Disque merge friendliness) refactoring WIP 1. 2015-07-26 15:17:18 +02:00
antirez
3da97ea67f Add sdshdr5 to DEBUG structsize. 2015-07-16 09:14:39 +02:00
antirez
a76b380e06 Fix DEBUG structsize output. 2015-07-14 17:17:06 +02:00
Oran Agra
f15df8ba5d sds size classes - memory optimization 2015-07-14 17:17:06 +02:00
antirez
0f64080dcb DEBUG HTSTATS <dbid> added.
The command reports information about the hash table internal state
representing the specified database ID.

This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
2015-07-14 17:15:37 +02:00
Salvatore Sanfilippo
d83c810265 Merge pull request #2301 from mattsta/fix/lengths
Improve type correctness
2015-02-24 17:22:53 +01:00
antirez
7885e1264e DEBUG structsize
Show sizes of a few important data structures in Redis. More missing.
2015-01-23 18:10:14 +01:00
Matt Stancliff
f704360462 Improve RDB type correctness
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.

This also fixes rdbSaveObject serialized bytes calculation.
Since entire serializations of data structures can be large,
we don't want to limit their calculated size to a 32 bit signed max.

This commit increases object size calculation and
cascades the change back up to serializedlength printing.

Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...

After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...
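
The negative value in the "Before" output is what the "After" value looks like when squeezed into a 32 bit signed integer. A small sketch of that wraparound, using a hypothetical helper `as_int32` that mimics two's complement truncation:

```python
def as_int32(n: int) -> int:
    """Reinterpret the low 32 bits of n as a signed two's complement value."""
    return (n + 2**31) % 2**32 - 2**31


true_length = 2147483737      # serializedlength reported after the fix
print(as_int32(true_length))  # -2147483559, the value reported before the fix
```

The real length exceeds the 32 bit signed maximum (2147483647) by 90, so the truncated counter lands 89 above the minimum value.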
2015-01-19 14:10:12 -05:00
antirez
f08586347d Revert "Use REDIS_SUPERVISED_NONE instead of 0."
This reverts commit 2c925b0c30.

Nevermind.
2015-01-12 15:58:23 +01:00
antirez
2c925b0c30 Use REDIS_SUPERVISED_NONE instead of 0. 2015-01-12 15:57:50 +01:00
Matt Stancliff
9e11d07909 Add more quicklist info to DEBUG OBJECT
Adds: ql_compressed (boolean, 1 if compression enabled for list, 0
otherwise)
Adds: ql_uncompressed_size (actual uncompressed size of all quicklistNodes)
Adds: ql_ziplist_max (quicklist max ziplist fill factor)

Compression ratio of the list is then ql_uncompressed_size / serializedlength

We report ql_uncompressed_size for all quicklists because serializedlength
is a _compressed_ representation anyway.

Sample output from a large list:
127.0.0.1:6379> llen abc
(integer) 38370061
127.0.0.1:6379> debug object abc
Value at:0x7ff97b51d140 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718164 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:0 ql_uncompressed_size:1643187761
(1.36s)

The 1.36s result time is because rdbSavedObjectLen() is serializing the
object, not because of any new stats reporting.

If we run DEBUG OBJECT on a compressed list, DEBUG OBJECT takes almost *zero*
time because rdbSavedObjectLen() reuses already-compressed ziplists:
127.0.0.1:6379> debug object abc
Value at:0x7fe5c5800040 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718109 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:1 ql_uncompressed_size:1643187761
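
As a worked example of the ratio formula above, here it is applied to the figures in the sample output. The function name `compression_ratio` is only illustrative.

```python
def compression_ratio(ql_uncompressed_size: int, serializedlength: int) -> float:
    """Compression ratio as defined above: uncompressed size / serialized size."""
    return ql_uncompressed_size / serializedlength


ratio = compression_ratio(1643187761, 19878335)
print(f"{ratio:.2f}")  # 82.66
```

For this list the serialized form is roughly 83 times smaller than the uncompressed quicklist nodes.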
2015-01-02 11:16:10 -05:00
Matt Stancliff
abdd1414a8 Allow compression of interior quicklist nodes
Let user set how many nodes to *not* compress.

We can specify a compression "depth" of how many nodes
to leave uncompressed on each end of the quicklist.

Depth 0 = disable compression.
Depth 1 = only leave head/tail uncompressed.
  - (read as: "skip 1 node on each end of the list before compressing")
Depth 2 = leave head, head->next, tail->prev, tail uncompressed.
  - ("skip 2 nodes on each end of the list before compressing")
Depth 3 = Depth 2 + head->next->next + tail->prev->prev
  - ("skip 3 nodes...")
etc.
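
The depth rule can be sketched by listing which node positions stay uncompressed. This is an illustration of the rule as worded above, not the quicklist code; `uncompressed_nodes` is a hypothetical helper and nodes are identified by index, with 0 as the head.

```python
def uncompressed_nodes(num_nodes: int, depth: int) -> list:
    """Return the indices of nodes left uncompressed for a given depth."""
    if depth == 0:
        # Depth 0 disables compression, so every node stays uncompressed.
        return list(range(num_nodes))
    head = range(min(depth, num_nodes))                 # first `depth` nodes
    tail = range(max(num_nodes - depth, 0), num_nodes)  # last `depth` nodes
    return sorted(set(head) | set(tail))


print(uncompressed_nodes(10, 1))  # [0, 9]
print(uncompressed_nodes(10, 2))  # [0, 1, 8, 9]
```

Lists with at most 2 * depth nodes end up with nothing compressed, since the head and tail windows cover every node.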

This also:
  - updates RDB storage to use native quicklist compression (if node is
    already compressed) instead of uncompressing, generating the RDB string,
    then re-compressing the quicklist node.
  - internalizes the "fill" parameter for the quicklist so we don't
    need to pass it to _every_ function.  Now it's just a property of
    the list.
  - allows a runtime-configurable compression option, so we can
    expose a compression parameter in the configuration file if people
    want to trade slight request-per-second performance for up to 90%+
    memory savings in some situations.
  - updates the quicklist tests to do multiple passes: 200k+ tests now.
2015-01-02 11:16:09 -05:00
Matt Stancliff
5127e39980 Add quicklist info to DEBUG OBJECT
Added field 'ql_nodes' and 'ql_avg_per_node'.

ql_nodes is the number of quicklist nodes in the quicklist.
ql_avg_per_node is the average fill level in each quicklist node. (LLEN / QL_NODES)

Sample output:
127.0.0.1:6379> DEBUG object b
Value at:0x7fa42bf2fed0 refcount:1 encoding:quicklist serializedlength:18489 lru:8983768 lru_seconds_idle:3 ql_nodes:430 ql_avg_per_node:511.73
127.0.0.1:6379> llen b
(integer) 220044
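
The average in the sample output can be checked against the LLEN / QL_NODES formula. A short sketch, with `avg_per_node` as an illustrative name:

```python
def avg_per_node(llen: int, ql_nodes: int) -> float:
    """Average number of list entries per quicklist node."""
    return llen / ql_nodes


print(f"{avg_per_node(220044, 430):.2f}")  # 511.73
```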
2015-01-02 11:16:09 -05:00
Matt Stancliff
27937c2821 Add DEBUG JEMALLOC INFO
Uses jemalloc function malloc_stats_print() to return
stats about what jemalloc has allocated internally.
2014-12-23 09:31:03 -05:00
Salvatore Sanfilippo
9c385ada22 Merge pull request #2134 from pyr/feature/supervised-init
Support daemon supervision by upstart or systemd
2014-12-11 14:39:09 +01:00
antirez
06e76bc3e2 Better read-only behavior for expired keys in slaves.
Key expiry on slaves is orchestrated by the master. Sometimes the master
will send the synthesized DEL to expire keys on the slave with a
non-trivial delay (when the key is not accessed, only the incremental expiry
algorithm will expire it in the background).

During that time, a key is logically expired, but slaves still return
the key if you GET (or whatever) it. This is bad behavior.

However we can't simply trust the slave view of the key, since we need
the master to be able to send write commands to update the slave data
set, and DELs should only happen when the key is expired in the master
in order to ensure consistency.

However, 99.99% of the issues with this behavior arise when a client that
is not a master sends a read-only command. In this case we are safe and
can consider the key as nonexistent.

This commit makes a few changes in order to make this sane:

1. lookupKeyRead() is modified in order to return NULL if the above
conditions are met.
2. Calls to lookupKeyRead() in commands actually writing to the data set
are replaced with calls to lookupKeyWrite().

There are redundant checks, so for example, if in "2" something was
overlooked, we should still be safe, since anyway, when the master
writes, the behavior is to ignore what expireIfNeeded()
returns.
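
The read path described above can be sketched with a toy key space. This is an assumption-laden model, not the Redis code: the database is a dict mapping keys to `(value, expire_at_ms)` pairs, `lookup_key_read` is assumed to be called only for read-only commands, and the parameter names are hypothetical.

```python
def lookup_key_read(db: dict, key: str, now_ms: int,
                    is_slave: bool, client_is_master: bool):
    """Return the value for a read, hiding logically expired keys on slaves."""
    entry = db.get(key)
    if entry is None:
        return None
    value, expire_at_ms = entry
    expired = expire_at_ms is not None and now_ms >= expire_at_ms
    if not expired:
        return value
    if not is_slave:
        # A master expires the key itself (and would propagate a DEL).
        del db[key]
        return None
    if client_is_master:
        # Commands from the master still see the key until its DEL arrives.
        return value
    # Ordinary client on a slave: treat the key as nonexistent, but do not
    # delete it, since only the master decides when the key goes away.
    return None


db = {"session": ("abc", 1000)}
print(lookup_key_read(db, "session", 2000, is_slave=True, client_is_master=False))
print("session" in db)
```

The slave answers as if the key were gone while keeping its data set untouched, so consistency with the master is preserved.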

This commit is related to #1768, #1770, #2131.
2014-12-10 16:10:21 +01:00
antirez
acf73a0592 Fix DEBUG OBJECT lru field to report seconds.
Because of (not so) recent Redis changes, the unit the LRU reports
internally is now milliseconds, not seconds, but the DEBUG OBJECT output
was still claiming seconds while providing milliseconds.
However OBJECT IDLETIME was working as expected, which is the correct
API to use.
2014-11-26 16:38:33 +01:00
Pierre-Yves Ritschard
bc1a3b96e6 Support daemon supervision by upstart or systemd
Both upstart and systemd provide a way for daemons to
be supervised, as well as a mechanism for them to
signal their readiness status.

This patch provides compatibility with this functionality while
not interfering with other methods.

With this, it will be possible to use `expect stop` with upstart
and `Type=notify` with systemd.
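
For the systemd side, readiness under `Type=notify` is signalled by sending a `READY=1` datagram to the Unix socket named in the `NOTIFY_SOCKET` environment variable. The sketch below shows that protocol in isolation; `notify_ready` is a hypothetical helper, not the function added by this patch. (With upstart's `expect stop`, the daemon instead raises SIGSTOP on itself, which is not shown here.)

```python
import os
import socket


def notify_ready() -> bool:
    """Tell a systemd-style supervisor we are ready; return False if unsupervised."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started under a notify-capable supervisor
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        address = b"\0" + path[1:].encode()
    else:
        address = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", address)
    return True
```

Returning `False` when the variable is unset keeps the call harmless when the server is started by hand or by another init system.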

A more detailed explanation of the mechanism can be found here:
http://spootnik.org/entries/2014/11/09_pid-tracking-in-modern-init-systems.html
2014-11-11 11:05:10 +01:00
antirez
591b69c745 Fix DEBUG POPULATE warning for lack of casting. 2014-10-09 11:17:27 +02:00