There's a rare case which leads to stagnation in the defragger, causing
it to keep scanning the keyspace and do nothing (not moving any
allocation), this happens when all the allocator slabs of a certain bin
have the same % utilization, but the slab from which new allocations are
made have a lower utilization.
this commit fixes it by removing the current slab from the overall
average utilization of the bin, and also eliminate any precision loss in
the utilization calculation and move the decision about the defrag to
reside inside jemalloc.
and also add a test that consistently reproduce this issue.
Change `val` to `unsigned char` before being tested.
The fix is identical to the one that's been made in upstream jemalloc.
warning is:
src/malloc_io.c: In function ‘malloc_vsnprintf’:
src/malloc_io.c:369:2: warning: case label value exceeds maximum value for type
369 | case '?' | 0x80: \
| ^~~~
src/malloc_io.c:581:5: note: in expansion of macro ‘GET_ARG_NUMERIC’
581 | GET_ARG_NUMERIC(val, 'p');
| ^~~~~~~~~~~~~~~
There's a rare case which leads to stagnation in the defragger, causing
it to keep scanning the keyspace and do nothing (not moving any
allocation), this happens when all the allocator slabs of a certain bin
have the same % utilization, but the slab from which new allocations are
made have a lower utilization.
this commit fixes it by removing the current slab from the overall
average utilization of the bin, and also eliminate any precision loss in
the utilization calculation and move the decision about the defrag to
reside inside jemalloc.
and also add a test that consistently reproduce this issue.
Background threads may run for a long time, especially when the # of dirty pages
is high. Avoid blocking stats calls because of this (which may cause latency
spikes).
see https://github.com/jemalloc/jemalloc/issues/1502
cherry picked from commit 1a71533511027dbe3f9d989659efeec446915d6b
problems fixed:
* failing to read fragmentation information from jemalloc
* overflow in jemalloc fragmentation hint to the defragger
* test suite not triggering eviction after population
The original jemalloc source tree was modified to:
1. Remove the configure error that prevents nested builds.
2. Insert the Redis private Jemalloc API in order to allow the
Redis fragmentation function to work.