Commit Graph

4899 Commits

Author SHA1 Message Date
Matt Stancliff
f704360462 Improve RDB type correctness
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.

This also fixes rdbSaveObject serialized bytes calculation.
Since entire serializations of data structures can be large,
so we don't want to limit their calculated size to a 32 bit signed max.

This commit increases object size calculation and
cascades the change back up to serializedlength printing.

Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...

After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...
2015-01-19 14:10:12 -05:00
antirez
cf76af6b9f Cluster: fetch my IP even if msg is not MEET for the first time.
In order to avoid that misconfigured cluster nodes at some time may
force an IP update on other nodes, it is required that nodes update
their own address only on MEET messages. However it does not make sense
to do this the first time a node is contacted and yet does not have an
IP, we just risk that myself->ip remains not assigned if there are
messages lost or cluster creation procedures that don't make sure
everybody is targeted by at least one incoming MEET message.

Also fix the logging of the IP switch avoiding the :-1 tail.
2015-01-13 10:50:34 +01:00
antirez
5b0f4a83ac Cluster: clusterMsgDataGossip structure, explict padding + minor stuff.
Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.

Note that the structure layout is the same after the change, we are just
making the padding explicit with an additional not used 16 bits field.
So this commit is still able to talk with the previous versions of
cluster nodes.
2015-01-13 10:40:09 +01:00
antirez
237ab727b9 Suppress valgrind error about write sending uninitialized data.
Valgrind checks that the buffers we transfer via syscalls are all
composed of bytes actually initialized. This is useful, it makes we able
to avoid leaking informations in non initialized parts fo messages
transferred to other hosts. This commit fixes one of such issues.
2015-01-13 09:31:37 +01:00
antirez
f08586347d Revert "Use REDIS_SUPERVISED_NONE instead of 0."
This reverts commit 2c925b0c30.

Nevermind.
2015-01-12 15:58:23 +01:00
antirez
2c925b0c30 Use REDIS_SUPERVISED_NONE instead of 0. 2015-01-12 15:57:50 +01:00
antirez
10007cb78c Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-01-12 15:56:46 +01:00
Salvatore Sanfilippo
e6416ca71c Merge pull request #2266 from mattsta/improve/supervised/startup
Three fixes: explicit supervise, pidfile create, remove memory leaks.
2015-01-12 15:56:36 +01:00
antirez
6274a6789d Cluster: initialize mf_end.
Can't be initialized by resetManualFailover() since it's actual state
the function uses, so we need to initialize it at startup time. Not
really a bug in practical terms, but showed up into valgrind and is not
technically correct anyway.
2015-01-12 15:55:00 +01:00
Matt Stancliff
eb7d67a3ab Remove RDB AUX memory leaks 2015-01-09 15:19:18 -05:00
Matt Stancliff
36a3b75355 Supervise redis processes only if configured
Adds configuration option 'supervised [no | upstart | systemd | auto]'

Also removed 'bzero' from the previous implementation because it's 2015.
(We could actually statically initialize those structs, but clang
throws an invalid warning when we try, so it looks bad even though it
isn't bad.)

Fixes #2264
2015-01-09 15:19:18 -05:00
Matt Stancliff
5e8b7e4f35 Define default pidfile when creating pid
We want pidfile to be NULL on startup so we can detect if the user
set an explicit value versus only using the default value.

Closes #1967
Fixes #2076
2015-01-09 15:19:18 -05:00
rebx
e6d499a446 Create PID file even if in foreground
Previously, Redis only wrote the pid file if
it was daemonizing, but many times it's useful to have
the pid written out even if you're in the foreground.

Some background for this is:
I usually run redis via daemontools. That entails running
redis-server on the foreground. Given that, I'd also want
redis-server to create a pidfile so other processes (e.g. nagios)
can run checks for that.

Closes #463
2015-01-09 15:19:18 -05:00
antirez
0d22121c27 Add "-lrt" in Makefile for Solaris.
This fix is from @NanXiao, however I was not able to retain authorship
because the Pull Request original repository was removed.
2015-01-09 11:53:51 +01:00
antirez
792a94153a Check for __sun macro in solarisfixes.h, not in includers. 2015-01-09 11:23:22 +01:00
antirez
f3fd58eb4a Cluster test: also write from Lua script in resharding test. 2015-01-09 11:23:22 +01:00
antirez
da95d22ad2 Prevent Lua scripts from violating Redis Cluster keyspace access rules.
Before this commit scripts were able to access / create keys outside the
set of hash slots served by the local node.
2015-01-09 11:23:22 +01:00
Salvatore Sanfilippo
1019c72930 Merge pull request #2265 from mattsta/fix/trib/create
Fix redis-trib creation failure
2015-01-08 19:43:14 +01:00
Matt Stancliff
bf58f8b513 Remove end of line whitespace from redis-trib 2015-01-08 13:31:03 -05:00
Matt Stancliff
1c477f62bc Fix redis-trib cluster create
Under certain conditions the node list wasn't being fully populated
and 'create' would fail trying to call methods on nil objects.
2015-01-08 13:28:35 -05:00
antirez
622c69e93d README section about make distclean reworded / extended. 2015-01-08 16:35:05 +01:00
antirez
b45e16e7ee Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-01-08 16:30:04 +01:00
Salvatore Sanfilippo
161ff77572 Merge pull request #2262 from HeartSaVioR/explain-make-distclean
Explain make distclean which seems not well known
2015-01-08 16:29:29 +01:00
Jungtaek Lim
219ab66cc8 Explain make distclean which seems not well known 2015-01-09 00:07:25 +09:00
antirez
42357668e8 Advertise Redis Cluster as experimental in redis.conf. 2015-01-08 14:41:34 +01:00
antirez
a7722dc31b Typo fixed: fiels -> fields in rdbSaveInfoAuxFields().
Thx to @badboy.
2015-01-08 12:06:22 +01:00
antirez
4c0e8923a6 A few more AUX info fields added to RDB. 2015-01-08 09:52:59 +01:00
antirez
206cd219b6 RDB AUX fields support.
This commit introduces a new RDB data type called 'aux'. It is used in
order to insert inside an RDB file key-value pairs that may serve
different needs, without breaking backward compatibility when new
informations are embedded inside an RDB file. The contract between Redis
versions is to ignore unknown aux fields when encountered.

Aux fields can be used in order to:

1. Augment the RDB file with info like version of Redis that created the
RDB file, creation time, used memory while the RDB was created, and so
forth.
2. Add state about Redis inside the RDB file that we need to reload
later: replication offset, previos master run ID, in order to improve
failovers safety and allow partial resynchronization after a slave
restart.
3. Anything that we may want to add to RDB files without breaking the
ability of past versions of Redis to load the file.
2015-01-08 09:52:55 +01:00
antirez
1a30e7ded1 rdbLoad() refactoring to make it simpler to follow. 2015-01-08 09:52:51 +01:00
antirez
e8614a1a77 New RDB v7 opcode: RESIZEDB.
The new opcode is an hint about the size of the dataset (keys and number
of expires) we are going to load for a given Redis database inside the
RDB file. Since hash tables are resized accordingly ASAP, useless
rehashing is avoided, speeding up load times significantly, in the order
of ~ 20% or more for larger data sets.

Related issue: #1719
2015-01-08 09:52:47 +01:00
antirez
32b10004e2 sdsnative() removed: New rdb.c API can load native strings. 2015-01-08 09:52:44 +01:00
antirez
f699b5e801 Use RDB_LOAD_PLAIN to load quicklists and encoded types.
Before we needed to create a string object with an embedded SDS, adn
basically duplicate the SDS part into a plain zmalloc() allocation.
2015-01-08 09:52:40 +01:00
antirez
68bc02c36c RDB refactored to load plain strings from RDB. 2015-01-08 09:52:36 +01:00
Salvatore Sanfilippo
05ba119fbb Merge pull request #2143 from mattsta/quicklist
Quicklist (linked list + ziplist)
2015-01-08 09:51:55 +01:00
Matt Stancliff
5870e22423 Upgrade LZF to 3.6 (2011) from 3.5 (2009)
This is lzf_c and lzf_d from
http://dist.schmorp.de/liblzf/liblzf-3.6.tar.gz
2015-01-02 11:16:10 -05:00
Matt Stancliff
25e12d10be Set optional 'static' for Quicklist+Redis
This also defines REDIS_STATIC='' for building everything
inside src/ and everything inside deps/lua/.
2015-01-02 11:16:10 -05:00
Matt Stancliff
9e11d07909 Add more quicklist info to DEBUG OBJECT
Adds: ql_compressed (boolean, 1 if compression enabled for list, 0
otherwise)
Adds: ql_uncompressed_size (actual uncompressed size of all quicklistNodes)
Adds: ql_ziplist_max (quicklist max ziplist fill factor)

Compression ratio of the list is then ql_uncompressed_size / serializedlength

We report ql_uncompressed_size for all quicklists because serializedlength
is a _compressed_ representation anyway.

Sample output from a large list:
127.0.0.1:6379> llen abc
(integer) 38370061
127.0.0.1:6379> debug object abc
Value at:0x7ff97b51d140 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718164 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:0 ql_uncompressed_size:1643187761
(1.36s)

The 1.36s result time is because rdbSavedObjectLen() is serializing the
object, not because of any new stats reporting.

If we run DEBUG OBJECT on a compressed list, DEBUG OBJECT takes almost *zero*
time because rdbSavedObjectLen() reuses already-compressed ziplists:
127.0.0.1:6379> debug object abc
Value at:0x7fe5c5800040 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718109 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:1 ql_uncompressed_size:1643187761
2015-01-02 11:16:10 -05:00
Matt Stancliff
02bb515a09 Config: Add quicklist, remove old list options
This removes:
  - list-max-ziplist-entries
  - list-max-ziplist-value

This adds:
  - list-max-ziplist-size
  - list-compress-depth

Also updates config file with new sections and updates
tests to use quicklist settings instead of old list settings.
2015-01-02 11:16:10 -05:00
Matt Stancliff
bbbbfb1442 Add branch prediction hints to quicklist
Actually makes a noticeable difference.

Branch hints were selected based on profiler hotspots.
2015-01-02 11:16:10 -05:00
Matt Stancliff
5f506b6d2b Cleanup quicklist style
Small fixes due to a new version of clang-format (it's less
crazy than the older version).
2015-01-02 11:16:09 -05:00
Matt Stancliff
abdd1414a8 Allow compression of interior quicklist nodes
Let user set how many nodes to *not* compress.

We can specify a compression "depth" of how many nodes
to leave uncompressed on each end of the quicklist.

Depth 0 = disable compression.
Depth 1 = only leave head/tail uncompressed.
  - (read as: "skip 1 node on each end of the list before compressing")
Depth 2 = leave head, head->next, tail->prev, tail uncompressed.
  - ("skip 2 nodes on each end of the list before compressing")
Depth 3 = Depth 2 + head->next->next + tail->prev->prev
  - ("skip 3 nodes...")
etc.

This also:
  - updates RDB storage to use native quicklist compression (if node is
    already compressed) instead of uncompressing, generating the RDB string,
    then re-compressing the quicklist node.
  - internalizes the "fill" parameter for the quicklist so we don't
    need to pass it to _every_ function.  Now it's just a property of
    the list.
  - allows a runtime-configurable compression option, so we can
    expose a compresion parameter in the configuration file if people
    want to trade slight request-per-second performance for up to 90%+
    memory savings in some situations.
  - updates the quicklist tests to do multiple passes: 200k+ tests now.
2015-01-02 11:16:09 -05:00
Matt Stancliff
5127e39980 Add quicklist info to DEBUG OBJECT
Added field 'ql_nodes' and 'ql_avg_per_node'.

ql_nodes is the number of quicklist nodes in the quicklist.
ql_avg_node is the average fill level in each quicklist node. (LLEN / QL_NODES)

Sample output:
127.0.0.1:6379> DEBUG object b
Value at:0x7fa42bf2fed0 refcount:1 encoding:quicklist serializedlength:18489 lru:8983768 lru_seconds_idle:3 ql_nodes:430 ql_avg_per_node:511.73
127.0.0.1:6379> llen b
(integer) 220044
2015-01-02 11:16:09 -05:00
Matt Stancliff
8d7021892e Remove malloc failure checks
We trust zmalloc to kill the whole process on memory failure
2015-01-02 11:16:09 -05:00
Matt Stancliff
e0d94a7b01 Increase test size for migrating large values
Previously, the old test ran 5,000 loops and used about 500k.

With quicklist, storing those same 5,000 loops takes up 24k, so the
"large value check" failed!

This increases the test to 20,000 loops which makes the object dump 96k.
2015-01-02 11:16:09 -05:00
Matt Stancliff
101b3a6e42 Convert quicklist RDB to store ziplist nodes
Turns out it's a huge improvement during save/reload/migrate/restore
because, with compression enabled, we're compressing 4k or 8k
chunks of data consisting of multiple elements in one ziplist
instead of compressing series of smaller individual elements.
2015-01-02 11:16:09 -05:00
Matt Stancliff
127c15e2b2 Convert RDB ziplist loading to sdsnative()
This saves us an unnecessary zmalloc, memcpy, and two frees.
2015-01-02 11:16:09 -05:00
Matt Stancliff
e1619772db Add sdsnative()
Use the existing memory space for an SDS to convert it to a regular
character buffer so we don't need to allocate duplicate space just
to extract a usable buffer for native operations.
2015-01-02 11:16:08 -05:00
Matt Stancliff
c6bf20c2a7 Add adaptive quicklist fill factor
Fill factor now has two options:
  - negative (1-5) for size-based ziplist filling
  - positive for length-based ziplist filling with implicit size cap.

Negative offsets define ziplist size limits of:
  -1: 4k
  -2: 8k
  -3: 16k
  -4: 32k
  -5: 64k

Positive offsets now automatically limit their max size to 8k.  Any
elements larger than 8k will be in individual nodes.

Positive ziplist fill factors will keep adding elements
to a ziplist until one of:
  - ziplist has FILL number of elements
    - or -
  - ziplist grows above our ziplist max size (currently 8k)

When using positive fill factors, if you insert a large
element (over 8k), that element will automatically allocate
an individual quicklist node with one element and no other elements will be
in the same ziplist inside that quicklist node.

When using negative fill factors, elements up to the size
limit can be added to one quicklist node.  If an element
is added larger than the max ziplist size, that element
will be allocated an individual ziplist in a new quicklist node.

Tests also updated to start testing at fill factor -5.
2015-01-02 11:16:08 -05:00
Matt Stancliff
60a9418ed9 redis-benchmark: Add RPUSH and RPOP tests 2015-01-02 11:16:08 -05:00
Matt Stancliff
0f15eb183b Free ziplist test lists during tests
Freeing our test lists helps keep valgrind output clean
2015-01-02 11:16:08 -05:00