Reading the PR gave me the opportunity to better specify what the code
was doing in places where I was not immediately sure about what was
going on. Moreover I documented the structure in server.h so that people
reading the header file will immediately understand what the structure
is useful for.
A) slave buffers didn't count internal fragmentation and sds unused space,
this caused them to induce eviction although we didn't mean for it.
B) slave buffers were consuming about twice the memory of what they actually needed.
- this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time
but networking.c not storing more than 16k (partially fixed recently in 237a38737).
- besides it wasn't able to store half of the new string into one buffer and the
other half into the next (so the above mentioned fix helped mainly for small items).
- lastly, the sds buffers had up to 30% internal fragmentation that was wasted,
consumed but not used.
C) inefficient performance due to starting from a small string and reallocing many times.
what i changed:
- creating dedicated buffers for reply list, counting their size with zmalloc_size
- when creating a new reply node from, preallocate it to at least 16k.
- when appending a new reply to the buffer, first fill all the unused space of the
previous node before starting a new one.
other changes:
- expose mem_not_counted_for_evict info field for the benefit of the test suite
- add a test to make sure slave buffers are counted correctly and that they don't cause eviction
We don't want to increment the deliveries here, because the sysadmin
reset the consumer group so the desire is likely to restart processing,
and having the PEL polluted with old information is not useful but
probably confusing.
Related to #5111.
To simplify the semantics of blocking for a group, this commit changes
the implementation to better match the description we provide of
conusmer groups: blocking for > will make the consumer waiting for new
elements in the group. However blocking for any other ID will always
serve the local history of the consumer.
However it must be noted that the > ID is actually an alias for the
special ID ms/seq of UINT64_MAX,UINT64_MAX.
To detect when the group (or the whole key) is destroyed to send an
error to the consumers blocked in such group is a problem, so we leave
the consumers listening, the sysadmin is free to create or destroy
groups assuming she/he knows what to do. However a client may be blocked
in a given consumer group, that is later destroyed. Then the stream
receives new elements. In that case there is no sane behavior to serve
the consumer... but to report an error about the group no longer
existing.
More about detecting this synchronously and why it is not done:
1. Normally we don't do that, we leave clients blocked for other data
types such as lists.
2. When we free a stream object there is no longer information about
what was the key it was associated with, so while destroying the
consumer groups we miss the info to unblock the clients in that moment.
3. Objects can be reclaimed in other threads where it is no longer safe
to do client operations.
When a client blocks for a consumer group, we don't know the actual ID
we want to be served: other clients blocked in the same consumer group
may be served first, so the consumer group latest delivered ID changes.
This was not handled correctly, all the clients in the consumer group
were unblocked without data but the first.