Fix quicklistReplaceAtIndex() by updating the quicklist ziplist size.

The quicklist takes a cached version of the ziplist representation size
in bytes. The implementation must update this length every time the
underlying ziplist changes. However quicklistReplaceAtIndex() failed to
fix the length.

During LSET calls, the size of the ziplist blob and the cached size
inside the quicklist diverged. Later, when this size is used in an
authoritative way, for example during nodes splitting in order to copy
the nodes, we end with a duplicated node that may contain random
garbage.

This commit should fix issue #3343, however several problems were found
reviewing the quicklist.c code in search of this bug that should be
addressed soon or later.

For example:

1. To take a cached ziplist length is fragile since failing to update it
leads to this kind of issues.

2. The node splitting code needs auditing. For example it works just for
a side effect of ziplistDeleteRange() to be able to cope with a wrong
count of elements to remove. The code inside quicklist.c assumes that
-1 means "delete till the end" while actually it's just a count of how
many elements to delete, and is an unsigned count. So -1 gets converted
into the maximum integer, and just by chance the ziplist code stops
deleting elements after there are no more to delete.

3. Node splitting is extremely inefficient, it copies the node and
removes elements from both nodes even when actually there is to move a
single entry from one node to the other, or when the new resulting node
is empty at all so there is nothing to copy but just to create a new
node.

However at least for Redis 3.2 to introduce fresh code inside
quicklist.c may be even more risky, so instead I'm writing a better
fuzzy tester to stress the internals a bit more in order to anticipate
other possible bugs.

This bug was found using a fuzzy tester written after having some clue
about where the bug could be. The tester eventually created a ~2000
commands sequence able to always crash Redis. I wrote a better version
of the tester that searched for the smallest sequence that could crash
Redis automatically. Later this smaller sequence was minimized by
removing random commands till it still crashed the server. This resulted
into a sequence of 7 commands. With this small sequence it was just a
matter of filling the code with enough printf() to understand enough
state to fix the bug.
This commit is contained in:
antirez 2016-06-27 18:02:33 +02:00
parent 18983113c5
commit 5e176e1af5

View File

@ -671,6 +671,7 @@ int quicklistReplaceAtIndex(quicklist *quicklist, long index, void *data,
/* quicklistIndex provides an uncompressed node */ /* quicklistIndex provides an uncompressed node */
entry.node->zl = ziplistDelete(entry.node->zl, &entry.zi); entry.node->zl = ziplistDelete(entry.node->zl, &entry.zi);
entry.node->zl = ziplistInsert(entry.node->zl, entry.zi, data, sz); entry.node->zl = ziplistInsert(entry.node->zl, entry.zi, data, sz);
quicklistNodeUpdateSz(entry.node);
quicklistCompress(quicklist, entry.node); quicklistCompress(quicklist, entry.node);
return 1; return 1;
} else { } else {