Commit Graph

97 Commits

Author SHA1 Message Date
antirez
2c925b0c30 Use REDIS_SUPERVISED_NONE instead of 0. 2015-01-12 15:57:50 +01:00
Matt Stancliff
9e11d07909 Add more quicklist info to DEBUG OBJECT
Adds: ql_compressed (boolean, 1 if compression enabled for list, 0
otherwise)
Adds: ql_uncompressed_size (actual uncompressed size of all quicklistNodes)
Adds: ql_ziplist_max (quicklist max ziplist fill factor)

Compression ratio of the list is then ql_uncompressed_size / serializedlength

We report ql_uncompressed_size for all quicklists because serializedlength
is a _compressed_ representation anyway.

Sample output from a large list:
127.0.0.1:6379> llen abc
(integer) 38370061
127.0.0.1:6379> debug object abc
Value at:0x7ff97b51d140 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718164 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:0 ql_uncompressed_size:1643187761
(1.36s)

The 1.36s result time is because rdbSavedObjectLen() is serializing the
object, not because of any new stats reporting.

If we run DEBUG OBJECT on a compressed list, DEBUG OBJECT takes almost *zero*
time because rdbSavedObjectLen() reuses already-compressed ziplists:
127.0.0.1:6379> debug object abc
Value at:0x7fe5c5800040 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718109 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:1 ql_uncompressed_size:1643187761
2015-01-02 11:16:10 -05:00
Matt Stancliff
abdd1414a8 Allow compression of interior quicklist nodes
Let user set how many nodes to *not* compress.

We can specify a compression "depth" of how many nodes
to leave uncompressed on each end of the quicklist.

Depth 0 = disable compression.
Depth 1 = only leave head/tail uncompressed.
  - (read as: "skip 1 node on each end of the list before compressing")
Depth 2 = leave head, head->next, tail->prev, tail uncompressed.
  - ("skip 2 nodes on each end of the list before compressing")
Depth 3 = Depth 2 + head->next->next + tail->prev->prev
  - ("skip 3 nodes...")
etc.

This also:
  - updates RDB storage to use native quicklist compression (if node is
    already compressed) instead of uncompressing, generating the RDB string,
    then re-compressing the quicklist node.
  - internalizes the "fill" parameter for the quicklist so we don't
    need to pass it to _every_ function.  Now it's just a property of
    the list.
  - allows a runtime-configurable compression option, so we can
    expose a compresion parameter in the configuration file if people
    want to trade slight request-per-second performance for up to 90%+
    memory savings in some situations.
  - updates the quicklist tests to do multiple passes: 200k+ tests now.
2015-01-02 11:16:09 -05:00
Matt Stancliff
5127e39980 Add quicklist info to DEBUG OBJECT
Added field 'ql_nodes' and 'ql_avg_per_node'.

ql_nodes is the number of quicklist nodes in the quicklist.
ql_avg_node is the average fill level in each quicklist node. (LLEN / QL_NODES)

Sample output:
127.0.0.1:6379> DEBUG object b
Value at:0x7fa42bf2fed0 refcount:1 encoding:quicklist serializedlength:18489 lru:8983768 lru_seconds_idle:3 ql_nodes:430 ql_avg_per_node:511.73
127.0.0.1:6379> llen b
(integer) 220044
2015-01-02 11:16:09 -05:00
Matt Stancliff
27937c2821 Add DEBUG JEMALLOC INFO
Uses jemalloc function malloc_stats_print() to return
stats about what jemalloc has allocated internally.
2014-12-23 09:31:03 -05:00
Salvatore Sanfilippo
9c385ada22 Merge pull request #2134 from pyr/feature/supervised-init
Support daemon supervision by upstart or systemd
2014-12-11 14:39:09 +01:00
antirez
06e76bc3e2 Better read-only behavior for expired keys in slaves.
Slaves key expire is orchestrated by the master. Sometimes the master
will send the synthesized DEL to expire keys on the slave with a non
trivial delay (when the key is not accessed, only the incremental expiry
algorithm will expire it in background).

During that time, a key is logically expired, but slaves still return
the key if you GET (or whatever) it. This is a bad behavior.

However we can't simply trust the slave view of the key, since we need
the master to be able to send write commands to update the slave data
set, and DELs should only happen when the key is expired in the master
in order to ensure consistency.

However 99.99% of the issues with this behavior is when a client which
is not a master sends a read only command. In this case we are safe and
can consider the key as non existing.

This commit does a few changes in order to make this sane:

1. lookupKeyRead() is modified in order to return NULL if the above
conditions are met.
2. Calls to lookupKeyRead() in commands actually writing to the data set
are repliaced with calls to lookupKeyWrite().

There are redundand checks, so for example, if in "2" something was
overlooked, we should be still safe, since anyway, when the master
writes the behavior is to don't care about what expireIfneeded()
returns.

This commit is related to  #1768, #1770, #2131.
2014-12-10 16:10:21 +01:00
antirez
acf73a0592 Fix DEBUG OBJECT lru field to report seconds.
Because of (not so) recent Redis changes, now the LRU internally
reported unit is milliseconds, not seconds, but the DEBUG OBJECT output
was still claiming seconds while providing milliseconds.
However OBJECT IDLETIME was working as expected, which is the correct
API to use.
2014-11-26 16:38:33 +01:00
Pierre-Yves Ritschard
bc1a3b96e6 Support daemon supervision by upstart or systemd
Both upstart and systemd provide a way for daemons to
be supervised, as well as a mechanism for them to
signal their readyness status.

This patch provides compatibility with this functionality while
not interfering with other methods.

With this, it will be possible to use `expect stop` with upstart
and `Type=notify` with systemd.

A more detailed explanation of the mechanism can be found here:
http://spootnik.org/entries/2014/11/09_pid-tracking-in-modern-init-systems.html
2014-11-11 11:05:10 +01:00
antirez
591b69c745 Fix DEBUG POPULATE warning for lack of casting. 2014-10-09 11:17:27 +02:00
Matt Stancliff
678df67501 Add missing 'by' 2014-09-29 06:49:09 -04:00
antirez
d4222e6bee DEBUG POPULATE two args form implemented.
The old DEBUG POPULATE form for automatic creation of test keys is:

    DEBUG POPULATE <count>

Now an additional form is available:

    DEBUG POPULATE <count> <prefix>

When prefix is not specified, it defaults to "key", so the keys are
named incrementally from key:0 to key:<count-1>. Otherwise the specified
prefix is used instead of "key".

The command is useful in order to populate different Redis instances
with key names guaranteed to don't collide. There are other debugging
uses, for example it is possible to add additional N keys using a count
of N and a random prefix at every call.
2014-09-25 17:01:56 +02:00
antirez
683f41adf2 DEBUG CMDKEYS moved to COMMAND GETKEYS. 2014-06-27 12:22:15 +02:00
Salvatore Sanfilippo
08c7363647 Merge pull request #1743 from mattsta/cygwin-compile-fix
Cygwin compile fix
2014-06-09 11:42:14 +02:00
Jan-Erik Rediger
b187c591e3 Small typo fixed 2014-05-28 09:46:01 +02:00
Matt Stancliff
7a0c5fdf12 Disable recursive watchdog signal handler
If we are in the signal handler, we don't want to handle
the signal again.  In extreme cases, this can cause a stack overflow
and segfault Redis.

Fixes #1771
2014-05-26 17:53:33 +02:00
Matt Stancliff
3e0e51dd9f Fix lack of SA_ONSTACK under Cygwin
Fixes #232
2014-05-12 11:10:24 -04:00
antirez
72ff03346f DEBUG POPULATE: call dictExpand() to avoid useless rehashing. 2014-05-09 15:02:29 +02:00
antirez
0bcc7cb4bf CLIENT LIST speedup via peerid caching + smart allocation.
This commit adds peer ID caching in the client structure plus an API
change and the use of sdsMakeRoomFor() in order to improve the
reallocation pattern to generate the CLIENT LIST output.

Both the changes account for a very significant speedup.
2014-04-28 17:36:57 +02:00
antirez
d77e231682 Specify LRU resolution in milliseconds. 2014-03-20 11:33:25 +01:00
antirez
8eae54aa1e DEBUG ERROR implemented.
The new "error" subcommand of the DEBUG command can reply with an user
selected error, specified as its sole argument:

    DEBUG ERROR "LOADING please wait..."

The error is generated just prefixing the command argument with a "-"
character, and replacing newlines with spaces (since error replies can't
include newlines).

The goal of the command is to help in Client libraries unit tests by
making simple to simulate a command call triggering a given error.
2014-03-10 23:01:55 +01:00
antirez
2705306ba1 DEBUG CMDKEYS: provide some guarantee to getKeysFromCommand().
getKeysFromCommand() is designed to be called with the command arguments
passing the basic arity checks described in the command table.

DEBUG CMDKEYS must provide the same guarantees for calling
getKeysFromCommand() to be safe.
2014-03-10 16:43:38 +01:00
antirez
c4ef1d6494 DEBUG CMDKEYS added for getKeysFromCommand() testing.
Examples:

    redis 127.0.0.1:6379> debug cmdkeys set foo bar
    1) "foo"
    redis 127.0.0.1:6379> debug cmdkeys mget a b c
    1) "a"
    2) "b"
    3) "c"
    redis 127.0.0.1:6379> debug cmdkeys zunionstore foo 2 a b
    1) "a"
    2) "b"
    3) "foo"
    redis 127.0.0.1:6379> debug cmdkeys ping
    (empty list or set)
2014-03-10 16:36:08 +01:00
antirez
2eb781b35b dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
antirez
e21803348a DEBUG SDSLEN added.
This command is only useful for low-level debugging of memory issues due
to sds wasting memory as empty buffer at the end of the string.
2013-08-27 11:53:49 +02:00
antirez
894eba07c8 Introduction of a new string encoding: EMBSTR
Previously two string encodings were used for string objects:

1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds
stirng.

2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer
is casted to a long.

This commit introduces a experimental new encoding called
REDIS_ENCODING_EMBSTR that implements an object represented by an sds
string that is not modifiable but allocated in the same memory chunk as
the robj structure itself.

The chunk looks like the following:

+--------------+-----------+------------+--------+----+
| robj data... | robj->ptr | sds header | string | \0 |
+--------------+-----+-----+------------+--------+----+
                     |                       ^
                     +-----------------------+

The robj->ptr points to the contiguous sds string data, so the object
can be manipulated with the same functions used to manipulate plan
string objects, however we need just on malloc and one free in order to
allocate or release this kind of objects. Moreover it has better cache
locality.

This new allocation strategy should benefit both the memory usage and
the performances. A performance gain between 60 and 70% was observed
during micro-benchmarks, however there is more work to do to evaluate
the performance impact and the memory usage behavior.
2013-07-22 10:31:38 +02:00
Salvatore Sanfilippo
bae60ede1d Merge pull request #1111 from yamt/netbsd3
netbsd support
2013-06-26 06:17:02 -07:00
antirez
338cd4835d Fix logStackTrace() when logging to stdout.
When the semantics changed from logfile = NULL to logfile = "" to log
into standard output, no proper change was made to logStackTrace() to
make it able to work with the new setup.

This commit fixes the issue.
2013-06-19 14:44:40 +02:00
antirez
cf71b82111 Binary safe dump of object content in redisLogObjectDebugInfo(). 2013-06-04 15:53:53 +02:00
YAMAMOTO Takashi
ae82867664 use nanosleep instead of usleep
SUSv3 says that:
	The useconds argument shall be less than one million. If the value of
	useconds is 0, then the call has no effect.
and actually NetBSD's implementation rejects such a value with EINVAL.
use nanosleep which has no such a limitation instead.
2013-05-17 17:21:28 +09:00
antirez
3816e7bda9 Better DEBUG error message when num of arguments is wrong. 2013-03-28 11:39:29 +01:00
antirez
32a83c8206 DEBUG set-active-expire added.
We need the ability to disable the activeExpireCycle() (active
expired key collection) call for testing purposes.
2013-03-27 17:55:02 +01:00
antirez
f9b5ca29fd Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:27:15 +01:00
antirez
7404b95833 Avoid compiler warning by casting to match printf() specifier. 2013-02-13 13:38:20 +01:00
Salvatore Sanfilippo
442235fe40 Merge pull request #869 from bilalhusain/patch-2
s/adiacent/adjacent/
2013-01-21 03:19:02 -08:00
guiquanz
9d09ce3981 Fixed many typos. 2013-01-19 10:59:44 +01:00
Bilal Husain
717e5ffb45 s/adiacent/adjacent/
fixed typo in a comment (step 2 memcheck)
2013-01-09 21:46:58 +05:30
antirez
f1481d4a03 serverCron() frequency is now a runtime parameter (was REDIS_HZ).
REDIS_HZ is the frequency our serverCron() function is called with.
A more frequent call to this function results into less latency when the
server is trying to handle very expansive background operations like
mass expires of a lot of keys at the same time.

Redis 2.4 used to have an HZ of 10. This was good enough with almost
every setup, but the incremental key expiration algorithm was working a
bit better under *extreme* pressure when HZ was set to 100 for Redis
2.6.

However for most users a latency spike of 30 milliseconds when million
of keys are expiring at the same time is acceptable, on the other hand a
default HZ of 100 in Redis 2.6 was causing idle instances to use some
CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3%
for an idle instance, however this is a shame as more energy is consumed
by the server, if not important resources.

This commit introduces HZ as a runtime parameter, that can be queried by
INFO or CONFIG GET, and can be modified with CONFIG SET. At the same
time the default frequency is set back to 10.

In this way we default to a sane value of 10, but allows users to
easily switch to values up to 500 for near real-time applications if
needed and if they are willing to pay this small CPU usage penalty.
2012-12-14 17:10:40 +01:00
antirez
2f62c9663c Introduced the Build ID in INFO and --version output.
The idea is to be able to identify a build in a unique way, so for
instance after a bug report we can recognize that the build is the one
of a popular Linux distribution and perform the debugging in the same
environment.
2012-11-29 14:20:08 +01:00
antirez
b1b602a928 On crash memory test rewrote so that it actaully works.
1) We no longer test location by location, otherwise the CPU write cache
completely makes our business useless.
2) We still need a memory test that operates in steps from the first to
the last location in order to never hit the cache, but that is still
able to retain the memory content.

This was tested using a Linux box containing a bad memory module with a
zingle bit error (always zero).

So the final solution does has an error propagation step that is:

1) Invert bits at every location.
2) Swap adiacent locations.
3) Swap adiacent locations again.
4) Invert bits at every location.
5) Swap adiacent locations.
6) Swap adiacent locations again.

Before and after these steps, and after step 4, a CRC64 checksum is computed.
If the three CRC64 checksums don't match, a memory error was detected.
2012-11-29 10:24:35 +01:00
antirez
c87a40897c Merge remote-tracking branch 'origin/unstable' into unstable 2012-11-28 11:41:27 +01:00
Matt Arsenault
504e5072eb It's a watchdog, not a watchdong. 2012-11-28 11:35:19 +01:00
charsyam
d7c7ac4a57 remove compile warning bioKillThreads 2012-11-23 05:52:39 +08:00
antirez
7536991726 Make bio.c threads killable ASAP if needed.
We use this new bio.c feature in order to stop our I/O threads if there
is a memory test to do on crash. In this case we don't want anything
else than the main thread to run, otherwise the other threads may mess
with the heap and the memory test will report a false positive.
2012-11-22 10:12:11 +01:00
antirez
5a9e3f5842 Fast memory test on Redis crash. 2012-11-21 13:24:44 +01:00
antirez
4365e5b2d3 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez
6fdc635447 Better Out of Memory handling.
The previous implementation of zmalloc.c was not able to handle out of
memory in an application-specific way. It just logged an error on
standard error, and aborted.

The result was that in the case of an actual out of memory in Redis
where malloc returned NULL (In Linux this actually happens under
specific overcommit policy settings and/or with no or little swap
configured) the error was not properly logged in the Redis log.

This commit fixes this problem, fixing issue #509.
Now the out of memory is properly reported in the Redis log and a stack
trace is generated.

The approach used is to provide a configurable out of memory handler
to zmalloc (otherwise the default one logging the event on the
standard output is used).
2012-08-24 12:55:37 +02:00
antirez
ee789e157c Dump ziplist hex value on failed assertion.
The ziplist -> hashtable conversion code is triggered every time an hash
value must be promoted to a full hash table because the number or size of
elements reached the threshold.

If a problem in the ziplist causes the same field to be present
multiple times, the assertion of successful addition of the element
inside the hash table will fail, crashing server with a failed
assertion, but providing little information about the problem.

This code adds a new logging function to perform the hex dump of binary
data, and makes sure that the ziplist -> hashtable conversion code uses
this new logging facility to dump the content of the ziplist when the
assertion fails.

This change was originally made in order to investigate issue #547.
2012-06-12 00:41:48 +02:00
antirez
61daf8914d Impovements for: Redis timer, hashes rehashing, keys collection.
A previous commit introduced REDIS_HZ define that changes the frequency
of calls to the serverCron() Redis function. This commit improves
different related things:

1) Software watchdog: now the minimal period can be set according to
REDIS_HZ. The minimal period is two times the timer period, that is:

    (1000/REDIS_HZ)*2 milliseconds

2) The incremental rehashing is now performed in the expires dictionary
as well.

3) The activeExpireCycle() function was improved in different ways:

- Now it checks if it already used too much time using microseconds
  instead of milliseconds for better precision.
- The time limit is now calculated correctly, in the previous version
  the division was performed before of the multiplication resulting in
  a timelimit of 0 if HZ was big enough.
- Databases with less than 1% of buckets fill in the hash table are
  skipped, because getting random keys is too expensive in this
  condition.

4) tryResizeHashTables() is now called at every timer call, we need to
   match the number of calls we do to the expired keys colleciton cycle.

5) REDIS_HZ was raised to 100.
2012-05-13 21:52:35 +02:00
antirez
11bd247d2b Produce the stack trace in an async safe way. 2012-04-26 16:28:54 +02:00