jemalloc — general purpose memory allocation functions
This manual describes jemalloc 4.4.0-0-gf1f76357313e7dcad7262f17a48ff0a2e005fcdc. More information can be found at the jemalloc website.
#include <jemalloc/jemalloc.h>

void *malloc(size_t size);
void *calloc(size_t number, size_t size);
int posix_memalign(void **ptr, size_t alignment, size_t size);
void *aligned_alloc(size_t alignment, size_t size);
void *realloc(void *ptr, size_t size);
void free(void *ptr);

void *mallocx(size_t size, int flags);
void *rallocx(void *ptr, size_t size, int flags);
size_t xallocx(void *ptr, size_t size, size_t extra, int flags);
size_t sallocx(void *ptr, int flags);
void dallocx(void *ptr, int flags);
void sdallocx(void *ptr, size_t size, int flags);
size_t nallocx(size_t size, int flags);
int mallctl(const char *name, void *oldp, size_t *oldlenp, void *newp, size_t newlen);
int mallctlnametomib(const char *name, size_t *mibp, size_t *miblenp);
int mallctlbymib(const size_t *mib, size_t miblen, void *oldp, size_t *oldlenp, void *newp, size_t newlen);
void malloc_stats_print(void (*write_cb)(void *, const char *), void *cbopaque, const char *opts);
size_t malloc_usable_size(const void *ptr);
void (*malloc_message)(void *cbopaque, const char *s);
const char *malloc_conf;
The malloc() function allocates size bytes of uninitialized memory. The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object.
The calloc() function allocates space for number objects, each size bytes in length. The result is identical to calling malloc() with an argument of number * size, with the exception that the allocated memory is explicitly initialized to zero bytes.
The posix_memalign() function allocates size bytes of memory such that the allocation's base address is a multiple of alignment, and returns the allocation in the value pointed to by ptr. The requested alignment must be a power of 2 at least as large as sizeof(void *).
The aligned_alloc() function allocates size bytes of memory such that the allocation's base address is a multiple of alignment. The requested alignment must be a power of 2. Behavior is undefined if size is not an integral multiple of alignment.
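Because aligned_alloc() requires size to be an integral multiple of alignment, callers often round the request up before the call. The following is a minimal sketch using only standard C11; the helper names is_pow2, round_up, and alloc_aligned are hypothetical and are not part of jemalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Nonzero if x is a power of two. */
static int is_pow2(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round size up to the next multiple of alignment (a power of two).
 * Returns 0 if size is 0 or if the rounding would overflow size_t.
 */
static size_t round_up(size_t size, size_t alignment)
{
    size_t mask;

    assert(is_pow2(alignment));
    mask = alignment - 1;
    if (size > SIZE_MAX - mask)
        return 0;
    return (size + mask) & ~mask;
}

/*
 * Hypothetical wrapper: pads the request so that it satisfies the
 * "size is a multiple of alignment" requirement, then allocates.
 * Returns NULL on failure; release the result with free().
 */
static void *alloc_aligned(size_t alignment, size_t size)
{
    size_t padded = round_up(size, alignment);

    if (padded == 0)
        return NULL;
    return aligned_alloc(alignment, padded);
}
```

For example, alloc_aligned(64, 100) requests 128 bytes at a 64-byte boundary. The padding is wasted space from the caller's point of view, which is the price of staying within defined behavior.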
The realloc() function changes the size of the previously allocated memory referenced by ptr to size bytes. The contents of the memory are unchanged up to the lesser of the new and old sizes. If the new size is larger, the contents of the newly allocated portion of the memory are undefined. Upon success, the memory referenced by ptr is freed and a pointer to the newly allocated memory is returned. Note that realloc() may move the memory allocation, resulting in a different return value than ptr. If ptr is NULL, the realloc() function behaves identically to malloc() for the specified size.
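Since realloc() may move the allocation and only frees the old memory on success, the usual pattern is to assign the result to a temporary and update the caller's pointer only when the call succeeds. The sketch below assumes a simple doubling growth policy; grow_buffer and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Ensure *buf can hold at least `needed` bytes.
 * Returns 0 on success and -1 on failure. On failure *buf and *cap are
 * left untouched, so the old allocation is still valid and not leaked.
 * Starting with *buf == NULL and *cap == 0 is fine, because realloc()
 * with a NULL pointer behaves like malloc().
 */
static int grow_buffer(char **buf, size_t *cap, size_t needed)
{
    size_t new_cap;
    char *p;

    assert(buf != NULL && cap != NULL);
    if (needed <= *cap)
        return 0;

    /* Double the capacity until it is large enough. */
    new_cap = (*cap != 0) ? *cap : 16;
    while (new_cap < needed) {
        if (new_cap > SIZE_MAX / 2) {
            new_cap = needed;
            break;
        }
        new_cap *= 2;
    }

    p = realloc(*buf, new_cap);
    if (p == NULL)
        return -1;
    *buf = p;
    *cap = new_cap;
    return 0;
}
```

Existing contents are preserved up to the old capacity; the bytes beyond it are undefined until the caller writes them.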
The free() function causes the allocated memory referenced by ptr to be made available for future allocations. If ptr is NULL, no action occurs.
The mallocx(), rallocx(), xallocx(), sallocx(), dallocx(), sdallocx(), and nallocx() functions all have a flags argument that can be used to specify options. The functions only check the options that are contextually relevant. Use bitwise or (|) operations to specify one or more of the following:
MALLOCX_LG_ALIGN(la)
Align the memory allocation to start at an address that is a multiple of (1 << la). This macro does not validate that la is within the valid range.
MALLOCX_ALIGN(a)
Align the memory allocation to start at an address that is a multiple of a, where a is a power of 2. This macro does not validate that a is a power of 2.
MALLOCX_ZERO
Initialize newly allocated memory to contain zero bytes. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes. If this macro is absent, newly allocated memory is uninitialized.
MALLOCX_TCACHE(tc)
Use the thread-specific cache (tcache) specified by the identifier tc, which must have been acquired via the tcache.create mallctl. This macro does not validate that tc specifies a valid identifier.
MALLOCX_TCACHE_NONE
Do not use a thread-specific cache (tcache). Unless MALLOCX_TCACHE(tc) or MALLOCX_TCACHE_NONE is specified, an automatically managed tcache will be used under many circumstances. This macro cannot be used in the same flags argument as MALLOCX_TCACHE(tc).
MALLOCX_ARENA(a)
Use the arena specified by the index a. This macro has no effect for regions that were allocated via an arena other than the one specified. This macro does not validate that a specifies an arena index in the valid range.
The mallocx() function allocates at least size bytes of memory, and returns a pointer to the base address of the allocation. Behavior is undefined if size is 0.
The rallocx() function resizes the allocation at ptr to be at least size bytes, and returns a pointer to the base address of the resulting allocation, which may or may not have moved from its original location. Behavior is undefined if size is 0.
The xallocx() function resizes the allocation at ptr in place to be at least size bytes, and returns the real size of the allocation. If extra is non-zero, an attempt is made to resize the allocation to be at least (size + extra) bytes, though inability to allocate the extra byte(s) will not by itself result in failure to resize. Behavior is undefined if size is 0, or if (size + extra > SIZE_T_MAX).
The sallocx() function returns the real size of the allocation at ptr.
The dallocx() function causes the memory referenced by ptr to be made available for future allocations.
The sdallocx() function is an extension of dallocx() with a size parameter to allow the caller to pass in the allocation size as an optimization. The minimum valid input size is the original requested size of the allocation, and the maximum valid input size is the corresponding value returned by nallocx() or sallocx().
The nallocx() function allocates no memory, but it performs the same size computation as the mallocx() function, and returns the real size of the allocation that would result from the equivalent mallocx() function call, or 0 if the inputs exceed the maximum supported size class and/or alignment. Behavior is undefined if size is 0.
The mallctl() function provides a general interface for introspecting the memory allocator, as well as setting modifiable parameters and triggering actions. The period-separated name argument specifies a location in a tree-structured namespace; see the MALLCTL NAMESPACE section for documentation on the tree contents. To read a value, pass a pointer via oldp to adequate space to contain the value, and a pointer to its length via oldlenp; otherwise pass NULL and NULL. Similarly, to write a value, pass a pointer to the value via newp, and its length via newlen; otherwise pass NULL and 0.
The mallctlnametomib() function provides a way to avoid repeated name lookups for applications that repeatedly query the same portion of the namespace, by translating a name to a “Management Information Base” (MIB) that can be passed repeatedly to mallctlbymib(). Upon successful return from mallctlnametomib(), mibp contains an array of *miblenp integers, where *miblenp is the lesser of the number of components in name and the input value of *miblenp. Thus it is possible to pass a *miblenp that is smaller than the number of period-separated name components, which results in a partial MIB that can be used as the basis for constructing a complete MIB. For name components that are integers (e.g. the 2 in arenas.bin.2.size), the corresponding MIB component will always be that integer. Therefore, it is legitimate to construct code like the following:

    unsigned nbins, i;
    size_t mib[4];
    size_t len, miblen;

    len = sizeof(nbins);
    mallctl("arenas.nbins", &nbins, &len, NULL, 0);

    miblen = 4;
    mallctlnametomib("arenas.bin.0.size", mib, &miblen);
    for (i = 0; i < nbins; i++) {
        size_t bin_size;

        mib[2] = i;
        len = sizeof(bin_size);
        mallctlbymib(mib, miblen, (void *)&bin_size, &len, NULL, 0);
        /* Do something with bin_size... */
    }
The malloc_stats_print() function writes summary statistics via the write_cb callback function pointer and cbopaque data passed to write_cb, or malloc_message() if write_cb is NULL. The statistics are presented in human-readable form unless “J” is specified as a character within the opts string, in which case the statistics are presented in JSON format. This function can be called repeatedly. General information that never changes during execution can be omitted by specifying “g” as a character within the opts string. Note that malloc_stats_print() uses the mallctl*() functions internally, so inconsistent statistics can be reported if multiple threads use these functions simultaneously. If --enable-stats is specified during configuration, “m” and “a” can be specified to omit merged arena and per arena statistics, respectively; “b”, “l”, and “h” can be specified to omit per size class statistics for bins, large objects, and huge objects, respectively. Unrecognized characters are silently ignored. Note that thread caching may prevent some statistics from being completely up to date, since extra locking would be required to merge counters that track thread cache operations.
The malloc_usable_size() function returns the usable size of the allocation pointed to by ptr. The return value may be larger than the size that was requested during allocation. The malloc_usable_size() function is not a mechanism for in-place realloc(); rather it is provided solely as a tool for introspection purposes. Any discrepancy between the requested allocation size and the size reported by malloc_usable_size() should not be depended on, since such behavior is entirely implementation-dependent.
Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile- or run-time.
The string specified via --with-malloc-conf, the string pointed to by the global variable malloc_conf, the “name” of the file referenced by the symbolic link named /etc/malloc.conf, and the value of the environment variable MALLOC_CONF, will be interpreted, in that order, from left to right as options. Note that malloc_conf may be read before main() is entered, so the declaration of malloc_conf should specify an initializer that contains the final value to be read by jemalloc. --with-malloc-conf and malloc_conf are compile-time mechanisms, whereas /etc/malloc.conf and MALLOC_CONF can be safely set any time prior to program invocation.
An options string is a comma-separated list of option:value pairs. There is one key corresponding to each opt.* mallctl (see the MALLCTL NAMESPACE section for options documentation). For example, abort:true,narenas:1 sets the opt.abort and opt.narenas options. Some options have boolean values (true/false), others have integer values (base 8, 10, or 16, depending on prefix), and yet others have raw string values.
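To make the option:value syntax concrete, the sketch below splits an options string into pairs and interprets an integer value with the base chosen by its prefix. It only illustrates the format described above and is not jemalloc's own parser; struct conf_pair, parse_conf, and conf_uint are hypothetical names, and the 32-byte field limits are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One option:value pair (fixed-size fields keep the sketch simple). */
struct conf_pair {
    char key[32];
    char value[32];
};

/*
 * Split a comma-separated list of option:value pairs.
 * Returns the number of pairs stored in out, or -1 if an entry has no
 * colon, has an empty key, does not fit, or more than max pairs exist.
 */
static int parse_conf(const char *opts, struct conf_pair *out, int max)
{
    const char *p = opts;
    int n = 0;

    assert(opts != NULL && out != NULL);
    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);
        const char *colon = memchr(p, ':', len);
        size_t klen, vlen;

        if (colon == NULL || n == max)
            return -1;
        klen = (size_t)(colon - p);
        vlen = len - klen - 1;
        if (klen == 0 || klen >= sizeof(out[n].key) ||
            vlen >= sizeof(out[n].value))
            return -1;
        memcpy(out[n].key, p, klen);
        out[n].key[klen] = '\0';
        memcpy(out[n].value, colon + 1, vlen);
        out[n].value[vlen] = '\0';
        n++;

        p += len;
        if (*p == ',')
            p++;
    }
    return n;
}

/*
 * Interpret a value as an unsigned integer. Base 0 lets strtoull() pick
 * base 8 ("010"), 10 ("10"), or 16 ("0x10") from the prefix.
 * Returns 0 on success and -1 if the value is not a complete number.
 */
static int conf_uint(const char *value, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(value, &end, 0);
    if (errno != 0 || end == value || *end != '\0')
        return -1;
    *out = v;
    return 0;
}
```

With this sketch, "abort:true,narenas:1" yields the pairs (abort, true) and (narenas, 1), and the value "0x10" is read as 16.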
Traditionally, allocators have used sbrk(2) to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If sbrk(2) is supported by the operating system, this allocator uses both mmap(2) and sbrk(2), in that order of preference; otherwise only mmap(2) is used.
This allocator uses multiple arenas in order to reduce lock contention for threaded programs on multi-processor systems. This works well with regard to threading scalability, but incurs some costs. There is a small fixed per-arena overhead, and additionally, arenas manage memory completely independently of each other, which means a small fixed increase in overall memory fragmentation. These overheads are not generally an issue, given the number of arenas normally used. Note that using substantially more arenas than the default is not likely to improve performance, mainly due to reduced cache performance. However, it may make sense to reduce the number of arenas if an application does not make much use of the allocation functions.
In addition to multiple arenas, unless --disable-tcache is specified during configuration, this allocator supports thread-specific caching for small and large objects, in order to make it possible to completely avoid synchronization for most allocation requests. Such caching allows very fast allocation in the common case, but it increases memory usage and fragmentation, since a bounded number of objects can remain allocated in each thread cache.
Memory is conceptually broken into equal-sized chunks, where the chunk size is a power of two that is greater than the page size. Chunks are always aligned to multiples of the chunk size. This alignment makes it possible to find metadata for user objects very quickly. User objects are broken into three categories according to size: small, large, and huge. Multiple small and large objects can reside within a single chunk, whereas huge objects each have one or more chunks backing them. Each chunk that contains small and/or large objects tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object). The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time.
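The reason chunk alignment makes metadata lookup fast is that the owning chunk of any address can be found by masking off the low bits, with no search. The following sketch shows only that arithmetic, assuming the chunk and page sizes are given as base-2 logarithms; chunk_base and page_index_in_chunk are hypothetical helpers, not jemalloc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Base address of the chunk containing addr, for chunks of size
 * 2^lg_chunk that are aligned to multiples of their own size.
 */
static uintptr_t chunk_base(uintptr_t addr, unsigned lg_chunk)
{
    uintptr_t mask;

    assert(lg_chunk < sizeof(uintptr_t) * 8);
    mask = ((uintptr_t)1 << lg_chunk) - 1;
    return addr & ~mask;
}

/*
 * Index of the page within its chunk that contains addr. An index like
 * this is what a per-chunk page map would be keyed by.
 */
static size_t page_index_in_chunk(uintptr_t addr, unsigned lg_chunk,
    unsigned lg_page)
{
    assert(lg_page <= lg_chunk);
    return (size_t)((addr - chunk_base(addr, lg_chunk)) >> lg_page);
}
```

With 2 MiB chunks (lg_chunk of 21) and 4 KiB pages (lg_page of 12), the address 0x745678 belongs to the chunk at 0x600000 and falls in page 325 of that chunk. Both steps are constant-time bit operations.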
Small objects are managed in groups by page runs. Each run maintains a bitmap to track which regions are in use. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least sizeof(double). All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes. Small size classes are smaller than four times the page size, large size classes are smaller than the chunk size (see the opt.lg_chunk option), and huge size classes extend from the chunk size up to the largest size class that does not exceed PTRDIFF_MAX.
Allocations are packed tightly together, which can be an issue for multi-threaded applications. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating.
The realloc(), rallocx(), and xallocx() functions may resize allocations without moving them under limited circumstances. Unlike the *allocx() API, the standard API does not officially round up the usable size of an allocation to the nearest size class, so technically it is necessary to call realloc() to grow e.g. a 9-byte allocation to 16 bytes, or shrink a 16-byte allocation to 9 bytes. Growth and shrinkage trivially succeed in place as long as the pre-size and post-size both round up to the same size class. No other API guarantees are made regarding in-place resizing, but the current implementation also tries to resize large and huge allocations in place, as long as the pre-size and post-size are both large or both huge. In such cases shrinkage always succeeds for large size classes, but for huge size classes the chunk allocator must support splitting (see arena.<i>.chunk_hooks). Growth only succeeds if the trailing memory is currently available, and additionally for huge size classes the chunk allocator must support merging.
Assuming 2 MiB chunks, 4 KiB pages, and a 16-byte quantum on a 64-bit system, the size classes in each category are as shown in Table 1.
Table 1. Size classes
| Category | Spacing | Size |
|---|---|---|
| Small | lg | [8] |
| | 16 | [16, 32, 48, 64, 80, 96, 112, 128] |
| | 32 | [160, 192, 224, 256] |
| | 64 | [320, 384, 448, 512] |
| | 128 | [640, 768, 896, 1024] |
| | 256 | [1280, 1536, 1792, 2048] |
| | 512 | [2560, 3072, 3584, 4096] |
| | 1 KiB | [5 KiB, 6 KiB, 7 KiB, 8 KiB] |
| | 2 KiB | [10 KiB, 12 KiB, 14 KiB] |
| Large | 2 KiB | [16 KiB] |
| | 4 KiB | [20 KiB, 24 KiB, 28 KiB, 32 KiB] |
| | 8 KiB | [40 KiB, 48 KiB, 56 KiB, 64 KiB] |
| | 16 KiB | [80 KiB, 96 KiB, 112 KiB, 128 KiB] |
| | 32 KiB | [160 KiB, 192 KiB, 224 KiB, 256 KiB] |
| | 64 KiB | [320 KiB, 384 KiB, 448 KiB, 512 KiB] |
| | 128 KiB | [640 KiB, 768 KiB, 896 KiB, 1 MiB] |
| | 256 KiB | [1280 KiB, 1536 KiB, 1792 KiB] |
| Huge | 256 KiB | [2 MiB] |
| | 512 KiB | [2560 KiB, 3 MiB, 3584 KiB, 4 MiB] |
| | 1 MiB | [5 MiB, 6 MiB, 7 MiB, 8 MiB] |
| | 2 MiB | [10 MiB, 12 MiB, 14 MiB, 16 MiB] |
| | 4 MiB | [20 MiB, 24 MiB, 28 MiB, 32 MiB] |
| | 8 MiB | [40 MiB, 48 MiB, 56 MiB, 64 MiB] |
| | ... | ... |
| | 512 PiB | [2560 PiB, 3 EiB, 3584 PiB, 4 EiB] |
| | 1 EiB | [5 EiB, 6 EiB, 7 EiB] |
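The spacing rule behind Table 1 (four size classes per doubling) can be written as a short rounding function. The sketch below assumes the same 16-byte quantum as the table and reproduces its values; size_class is a hypothetical name, and this is an illustration of the rule rather than jemalloc's internal computation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Round a request up to its size class, assuming a 16-byte quantum.
 * Requests of 0 to 8 bytes map to the smallest class in this sketch.
 * Returns 0 if no class at or below PTRDIFF_MAX can hold the request.
 */
static size_t size_class(size_t size)
{
    unsigned lg = 0;
    size_t delta, mask, rounded;

    if (size <= 8)
        return 8;
    if (size <= 128)
        return (size + 15) & ~(size_t)15;  /* multiples of the quantum */
    if (size > (size_t)PTRDIFF_MAX)
        return 0;

    /* Smallest lg such that 2^lg >= size (lg >= 8 because size > 128). */
    while (((size_t)1 << lg) < size)
        lg++;
    assert(lg >= 8);

    /* Four classes per doubling: spacing is 1/4 of the lower bound 2^(lg-1). */
    delta = (size_t)1 << (lg - 3);
    mask = delta - 1;
    rounded = (size + mask) & ~mask;
    return (rounded > (size_t)PTRDIFF_MAX) ? 0 : rounded;
}
```

A request of 129 bytes lands in the 160-byte class and a request of 5000 bytes in the 5 KiB class. The worst case for internal fragmentation is a request one byte above a class boundary, for example 1025 bytes served from the 1280-byte class, which wastes just under 20%.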
The following names are defined in the namespace accessible via the mallctl*() functions. Value types are specified in parentheses, their readable/writable statuses are encoded as rw, r-, -w, or --, and required build configuration flags follow, if any. A name element encoded as <i> or <j> indicates an integer component, where the integer varies from 0 to some upper value that must be determined via introspection. In the case of stats.arenas.<i>.*, <i> equal to arenas.narenas can be used to access the summation of statistics from all arenas. Take special note of the epoch mallctl, which controls refreshing of cached dynamic statistics.
version
(const char *)
r-
Return the jemalloc version string.
epoch
(uint64_t)
rw
If a value is passed in, refresh the data from which
the mallctl*()
functions report values,
and increment the epoch. Return the current epoch. This is useful for
detecting whether another thread caused a refresh.
config.cache_oblivious
(bool)
r-
--enable-cache-oblivious
was specified
during build configuration.
config.debug
(bool)
r-
--enable-debug
was specified during
build configuration.
config.fill
(bool)
r-
--enable-fill
was specified during
build configuration.
config.lazy_lock
(bool)
r-
--enable-lazy-lock
was specified
during build configuration.
config.malloc_conf
(const char *)
r-
Embedded configure-time-specified run-time options
string, empty unless --with-malloc-conf
was specified
during build configuration.
config.munmap
(bool)
r-
--enable-munmap
was specified during
build configuration.
config.prof
(bool)
r-
--enable-prof
was specified during
build configuration.
config.prof_libgcc
(bool)
r-
--disable-prof-libgcc
was not
specified during build configuration.
config.prof_libunwind
(bool)
r-
--enable-prof-libunwind
was specified
during build configuration.
config.stats
(bool)
r-
--enable-stats
was specified during
build configuration.
config.tcache
(bool)
r-
--disable-tcache
was not specified
during build configuration.
config.tls
(bool)
r-
--disable-tls
was not specified during
build configuration.
config.utrace
(bool)
r-
--enable-utrace
was specified during
build configuration.
config.valgrind
(bool)
r-
--enable-valgrind
was specified during
build configuration.
config.xmalloc
(bool)
r-
--enable-xmalloc
was specified during
build configuration.
opt.abort
(bool)
r-
Abort-on-warning enabled/disabled. If true, most
warnings are fatal. The process will call
abort(3) in these cases. This option is
disabled by default unless --enable-debug
is
specified during configuration, in which case it is enabled by default.
opt.dss
(const char *)
r-
dss (sbrk(2)) allocation precedence as related to mmap(2) allocation. The following settings are supported if sbrk(2) is supported by the operating system: “disabled”, “primary”, and “secondary”; otherwise only “disabled” is supported. The default is “secondary” if sbrk(2) is supported by the operating system; “disabled” otherwise.
opt.lg_chunk
(size_t)
r-
Virtual memory chunk size (log base 2). If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size. The default chunk size is 2 MiB (2^21).
opt.narenas
(unsigned)
r-
Maximum number of arenas to use for automatic multiplexing of threads and arenas. The default is four times the number of CPUs, or one if there is a single CPU.
opt.purge
(const char *)
r-
Purge mode is “ratio” (default) or “decay”. See opt.lg_dirty_mult for details of the ratio mode. See opt.decay_time for details of the decay mode.
opt.lg_dirty_mult
(ssize_t)
r-
Per-arena minimum ratio (log base 2) of active to dirty pages. Some dirty unused pages may be allowed to accumulate, within the limit set by the ratio (or one chunk worth of dirty pages, whichever is greater), before informing the kernel about some of those pages via madvise(2) or a similar system call. This provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused. The default minimum ratio is 8:1 (2^3:1); an option value of -1 will disable dirty page purging. See arenas.lg_dirty_mult and arena.<i>.lg_dirty_mult for related dynamic control options.
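As a worked reading of the ratio rule, the sketch below computes how many dirty pages an arena could accumulate before purging would begin. It is derived only from the description above and is not taken from jemalloc's purging code; dirty_page_limit and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Upper bound on unused dirty pages tolerated for an arena.
 *   active_pages   pages currently in active use
 *   lg_dirty_mult  log2 of the active:dirty ratio, or -1 to disable purging
 *   chunk_pages    number of pages in one chunk
 * The bound is the greater of active_pages / 2^lg_dirty_mult and one
 * chunk's worth of pages; SIZE_MAX stands for "no limit".
 */
static size_t dirty_page_limit(size_t active_pages, int lg_dirty_mult,
    size_t chunk_pages)
{
    size_t by_ratio;

    assert(lg_dirty_mult < (int)(sizeof(size_t) * 8));
    if (lg_dirty_mult < 0)
        return SIZE_MAX;
    by_ratio = active_pages >> lg_dirty_mult;
    return (by_ratio > chunk_pages) ? by_ratio : chunk_pages;
}
```

With the default ratio of 8:1 (lg_dirty_mult of 3) and 512 pages per chunk, an arena with 8192 active pages would tolerate 1024 dirty pages, while an arena with 1000 active pages would fall back to the one-chunk floor of 512.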
opt.decay_time
(ssize_t)
r-
Approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused. The pages are incrementally purged according to a sigmoidal decay curve that starts and ends with zero purge rate. A decay time of 0 causes all unused dirty pages to be purged immediately upon creation. A decay time of -1 disables purging. The default decay time is 10 seconds. See arenas.decay_time and arena.<i>.decay_time for related dynamic control options.
opt.stats_print
(bool)
r-
Enable/disable statistics printing at exit. If
enabled, the malloc_stats_print()
function is called at program exit via an
atexit(3) function. If
--enable-stats
is specified during configuration, this
has the potential to cause deadlock for a multi-threaded process that
exits while one or more threads are executing in the memory allocation
functions. Furthermore, atexit()
may
allocate memory during application initialization and then deadlock
internally when jemalloc in turn calls
atexit()
, so this option is not
universally usable (though the application can register its own
atexit()
function with equivalent
functionality). Therefore, this option should only be used with care;
it is primarily intended as a performance tuning aid during application
development. This option is disabled by default.
opt.junk
(const char *)
r-
[--enable-fill]
Junk filling. If set to “alloc”, each byte of uninitialized allocated memory will be initialized to 0xa5. If set to “free”, all deallocated memory will be initialized to 0x5a. If set to “true”, both allocated and deallocated memory will be initialized, and if set to “false”, junk filling will be disabled entirely. This is intended for debugging and will impact performance negatively. This option is “false” by default unless --enable-debug is specified during configuration, in which case it is “true” by default unless running inside Valgrind.
opt.quarantine
(size_t)
r-
[--enable-fill]
Per thread quarantine size in bytes. If non-zero, each thread maintains a FIFO object quarantine that stores up to the specified number of bytes of memory. The quarantined memory is not freed until it is released from quarantine, though it is immediately junk-filled if the opt.junk option is enabled. This feature is of particular use in combination with Valgrind, which can detect attempts to access quarantined objects. This is intended for debugging and will impact performance negatively. The default quarantine size is 0 unless running inside Valgrind, in which case the default is 16 MiB.
opt.redzone
(bool)
r-
[--enable-fill]
Redzones enabled/disabled. If enabled, small allocations have redzones before and after them. Furthermore, if the opt.junk option is enabled, the redzones are checked for corruption during deallocation. However, the primary intended purpose of this feature is to be used in combination with Valgrind, which needs redzones in order to do effective buffer overflow/underflow detection. This option is intended for debugging and will impact performance negatively. This option is disabled by default unless running inside Valgrind.
opt.zero
(bool)
r-
[--enable-fill
]
Zero filling enabled/disabled. If enabled, each byte
of uninitialized allocated memory will be initialized to 0. Note that
this initialization only happens once for each byte, so
realloc()
and
rallocx()
calls do not zero memory that
was previously allocated. This is intended for debugging and will
impact performance negatively. This option is disabled by default.
opt.utrace
(bool)
r-
[--enable-utrace
]
Allocation tracing based on utrace(2) enabled/disabled. This option is disabled by default.
opt.xmalloc
(bool)
r-
[--enable-xmalloc
]
Abort-on-out-of-memory enabled/disabled. If enabled,
rather than returning failure for any allocation function, display a
diagnostic message on STDERR_FILENO
and cause the
program to drop core (using
abort(3)). If an application is
designed to depend on this behavior, set the option at compile time by
including the following in the source code:
malloc_conf = "xmalloc:true";
This option is disabled by default.
opt.tcache
(bool)
r-
[--enable-tcache]
Thread-specific caching (tcache) enabled/disabled. When there are multiple threads, each thread uses a tcache for objects up to a certain size. Thread-specific caching allows many allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use. See the opt.lg_tcache_max option for related tuning information. This option is enabled by default unless running inside Valgrind, in which case it is forcefully disabled.
opt.lg_tcache_max
(size_t)
r-
[--enable-tcache
]
Maximum size class (log base 2) to cache in the thread-specific cache (tcache). At a minimum, all small size classes are cached, and at a maximum all large size classes are cached. The default maximum is 32 KiB (2^15).
opt.prof
(bool)
r-
[--enable-prof]
Memory profiling enabled/disabled. If enabled, profile memory allocation activity. See the opt.prof_active option for on-the-fly activation/deactivation. See the opt.lg_prof_sample option for probabilistic sampling control. See the opt.prof_accum option for control of cumulative sample reporting. See the opt.lg_prof_interval option for information on interval-triggered profile dumping, the opt.prof_gdump option for information on high-water-triggered profile dumping, and the opt.prof_final option for final profile dumping. Profile output is compatible with the jeprof command, which is based on the pprof that is developed as part of the gperftools package. See HEAP PROFILE FORMAT for heap profile format documentation.
opt.prof_prefix
(const char *)
r-
[--enable-prof
]
Filename prefix for profile dumps. If the prefix is
set to the empty string, no automatic dumps will occur; this is
primarily useful for disabling the automatic final heap dump (which
also disables leak reporting, if enabled). The default prefix is
jeprof
.
opt.prof_active
(bool)
r-
[--enable-prof]
Profiling activated/deactivated. This is a secondary control mechanism that makes it possible to start the application with profiling enabled (see the opt.prof option) but inactive, then toggle profiling at any time during program execution with the prof.active mallctl. This option is enabled by default.
opt.prof_thread_active_init
(bool)
r-
[--enable-prof]
Initial setting for thread.prof.active in newly created threads. The initial setting for newly created threads can also be changed during execution via the prof.thread_active_init mallctl. This option is enabled by default.
opt.lg_prof_sample
(size_t)
r-
[--enable-prof
]
Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity. Increasing the sampling interval decreases profile fidelity, but also decreases the computational overhead. The default sample interval is 512 KiB (2^19 B).
opt.prof_accum
(bool)
r-
[--enable-prof
]
Reporting of cumulative object/byte counts in profile dumps enabled/disabled. If this option is enabled, every unique backtrace must be stored for the duration of execution. Depending on the application, this can impose a large memory overhead, and the cumulative counts are not always of interest. This option is disabled by default.
opt.lg_prof_interval
(ssize_t)
r-
[--enable-prof]
Average interval (log base 2) between memory profile dumps, as measured in bytes of allocation activity. The actual interval between dumps may be sporadic because decentralized allocation counters are used to avoid synchronization bottlenecks. Profiles are dumped to files named according to the pattern <prefix>.<pid>.<seq>.i<iseq>.heap, where <prefix> is controlled by the opt.prof_prefix option. By default, interval-triggered profile dumping is disabled (encoded as -1).
opt.prof_gdump
(bool)
r-
[--enable-prof]
Set the initial state of prof.gdump, which when enabled triggers a memory profile dump every time the total virtual memory exceeds the previous maximum. This option is disabled by default.
opt.prof_final
(bool)
r-
[--enable-prof]
Use an atexit(3) function to dump final memory usage to a file named according to the pattern <prefix>.<pid>.<seq>.f.heap, where <prefix> is controlled by the opt.prof_prefix option. Note that atexit() may allocate memory during application initialization and then deadlock internally when jemalloc in turn calls atexit(), so this option is not universally usable (though the application can register its own atexit() function with equivalent functionality). This option is disabled by default.
opt.prof_leak
(bool)
r-
[--enable-prof]
Leak reporting enabled/disabled. If enabled, use an atexit(3) function to report memory leaks detected by allocation sampling. See the opt.prof option for information on analyzing heap profile output. This option is disabled by default.
thread.arena
(unsigned)
rw
Get or set the arena associated with the calling thread. If the specified arena was not initialized beforehand (see the arenas.initialized mallctl), it will be automatically initialized as a side effect of calling this interface.
thread.allocated
(uint64_t)
r-
[--enable-stats
]
Get the total number of bytes ever allocated by the calling thread. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases.
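One way to interpret the counter correctly across a wraparound is to work with differences between two samples rather than absolute values, since unsigned 64-bit subtraction is defined modulo 2^64. The helper below is a hypothetical sketch in plain C; it assumes the counter wrapped at most once between the two samples.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes counted between two samples of a wrapping 64-bit counter.
 * Unsigned arithmetic wraps modulo 2^64, so the result is correct even
 * when `later` is numerically smaller than `earlier`, provided the
 * counter wrapped no more than once in between.
 */
static uint64_t counter_delta(uint64_t earlier, uint64_t later)
{
    return later - earlier;
}
```

For instance, if the first sample is UINT64_MAX - 5 and the second is 10, the delta is 16 bytes.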
thread.allocatedp
(uint64_t *)
r-
[--enable-stats]
Get a pointer to the value that is returned by the thread.allocated mallctl. This is useful for avoiding the overhead of repeated mallctl*() calls.
thread.deallocated
(uint64_t)
r-
[--enable-stats
]
Get the total number of bytes ever deallocated by the calling thread. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases.
thread.deallocatedp
(uint64_t *)
r-
[--enable-stats]
Get a pointer to the value that is returned by the thread.deallocated mallctl. This is useful for avoiding the overhead of repeated mallctl*() calls.
thread.tcache.enabled
(bool)
rw
[--enable-tcache]
Enable/disable calling thread's tcache. The tcache is implicitly flushed as a side effect of becoming disabled (see thread.tcache.flush).
thread.tcache.flush
(void)
--
[--enable-tcache
]
Flush calling thread's thread-specific cache (tcache). This interface releases all cached objects and internal data structures associated with the calling thread's tcache. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits. However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case the developer may find manual flushing useful.
thread.prof.name
(const char *)
r- or -w
[--enable-prof
]
Get/set the descriptive name associated with the calling thread in memory profile dumps. An internal copy of the name string is created, so the input string need not be maintained after this interface completes execution. The output string of this interface should be copied for non-ephemeral uses, because multiple implementation details can cause asynchronous string deallocation. Furthermore, each invocation of this interface can only read or write; simultaneous read/write is not supported due to string lifetime limitations. The name string must be NUL-terminated and composed only of characters in the sets recognized by isgraph(3) and isblank(3).
thread.prof.active
(bool)
rw
[--enable-prof
]
Control whether sampling is currently active for the
calling thread. This is an activation mechanism in addition to
prof.active; both must be active for the calling thread to sample. This
flag is enabled by default.
tcache.create
(unsigned)
r-
[--enable-tcache
]
Create an explicit thread-specific cache (tcache) and
return an identifier that can be passed to the MALLOCX_TCACHE(tc)
macro to explicitly use the specified cache rather than the
automatically managed one that is used by default. Each explicit cache
can be used by only one thread at a time; the application must ensure
that this constraint holds.
tcache.flush
(unsigned)
-w
[--enable-tcache
]
Flush the specified thread-specific cache (tcache). The
same considerations apply to this interface as to thread.tcache.flush,
except that the tcache will never be automatically discarded.
tcache.destroy
(unsigned)
-w
[--enable-tcache
]
Flush the specified thread-specific cache (tcache) and make the identifier available for use during a future tcache creation.
arena.<i>.purge
(void)
--
Purge all unused dirty pages for arena <i>, or for
all arenas if <i> equals arenas.narenas.
arena.<i>.decay
(void)
--
Trigger decay-based purging of unused dirty pages for
arena <i>, or for all arenas if <i> equals arenas.narenas.
The proportion of unused dirty pages to be purged depends on the current
time; see opt.decay_time for details.
arena.<i>.reset
(void)
--
Discard all of the arena's extant allocations. This
interface can only be used with arenas created via arenas.extend. None
of the arena's discarded/cached allocations may be accessed afterward.
As part of this requirement, all thread caches which were used to
allocate/deallocate in conjunction with the arena must be flushed
beforehand. This interface cannot be used if running inside Valgrind,
nor if the quarantine size is non-zero.
arena.<i>.dss
(const char *)
rw
Set the precedence of dss allocation as related to mmap
allocation for arena <i>, or for all arenas if <i> equals
arenas.narenas. See opt.dss for supported settings.
arena.<i>.lg_dirty_mult
(ssize_t)
rw
Current per-arena minimum ratio (log base 2) of active
to dirty pages for arena <i>. Each time this interface is set and
the ratio is increased, pages are synchronously purged as necessary to
impose the new ratio. See opt.lg_dirty_mult for additional information.
arena.<i>.decay_time
(ssize_t)
rw
Current per-arena approximate time in seconds from the
creation of a set of unused dirty pages until an equivalent set of
unused dirty pages is purged and/or reused. Each time this interface is
set, all currently unused dirty pages are considered to have fully
decayed, which causes immediate purging of all unused dirty pages unless
the decay time is set to -1 (i.e. purging disabled). See opt.decay_time
for additional information.
arena.<i>.chunk_hooks
(chunk_hooks_t)
rw
Get or set the chunk management hook functions for arena
<i>. The functions must be capable of operating on all extant
chunks associated with arena <i>, usually by passing unknown
chunks to the replaced functions. In practice, it is feasible to
control allocation for arenas created via arenas.extend such
that all chunks originate from an application-supplied chunk allocator
(by setting custom chunk hook functions just after arena creation), but
the automatically created arenas may have already created chunks prior
to the application having an opportunity to take over chunk
allocation.
typedef struct {
    chunk_alloc_t     *alloc;
    chunk_dalloc_t    *dalloc;
    chunk_commit_t    *commit;
    chunk_decommit_t  *decommit;
    chunk_purge_t     *purge;
    chunk_split_t     *split;
    chunk_merge_t     *merge;
} chunk_hooks_t;
The chunk_hooks_t structure comprises function pointers which are described individually below. jemalloc uses these functions to manage chunk lifetime, which starts off with allocation of mapped committed memory, in the simplest case followed by deallocation. However, there are performance and platform reasons to retain chunks for later reuse. Cleanup attempts cascade from deallocation to decommit to purging, which gives the chunk management functions opportunities to reject the most permanent cleanup operations in favor of less permanent (and often less costly) operations. The chunk splitting and merging operations can also be opted out of, but this is mainly intended to support platforms on which virtual memory mappings provided by the operating system kernel do not automatically coalesce and split, e.g. Windows.
typedef void *(chunk_alloc_t)(void *chunk, size_t size, size_t alignment,
    bool *zero, bool *commit, unsigned arena_ind);
A chunk allocation function conforms to the chunk_alloc_t type and upon
success returns a pointer to size bytes of mapped memory on behalf of
arena arena_ind such that the chunk's base address is a multiple of
alignment, as well as setting *zero to indicate whether the chunk is
zeroed and *commit to indicate whether the chunk is committed. Upon
error the function returns NULL and leaves *zero and *commit unmodified.
The size parameter is always a multiple of the chunk size. The alignment
parameter is always a power of two at least as large as the chunk size.
Zeroing is mandatory if *zero is true upon function entry. Committing is
mandatory if *commit is true upon function entry. If chunk is not NULL,
the returned pointer must be chunk on success or NULL on error.
Committed memory may be committed in absolute terms as on a system that
does not overcommit, or in implicit terms as on a system that
overcommits and satisfies physical memory needs on demand via soft page
faults. Note that replacing the default chunk allocation function makes
the arena's arena.<i>.dss setting irrelevant.
typedef bool (chunk_dalloc_t)(void *chunk, size_t size, bool committed,
    unsigned arena_ind);
A chunk deallocation function conforms to the chunk_dalloc_t type and
deallocates a chunk of given size with committed/decommitted memory as
indicated, on behalf of arena arena_ind, returning false upon success.
If the function returns true, this indicates opt-out from deallocation;
the virtual memory mapping associated with the chunk remains mapped, in
the same commit state, and available for future use, in which case it
will be automatically retained for later reuse.
typedef bool (chunk_commit_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);
A chunk commit function conforms to the chunk_commit_t type and commits
zeroed physical memory to back pages within a chunk of given size at
offset bytes, extending for length on behalf of arena arena_ind,
returning false upon success. Committed memory may be committed in
absolute terms as on a system that does not overcommit, or in implicit
terms as on a system that overcommits and satisfies physical memory
needs on demand via soft page faults. If the function returns true, this
indicates insufficient physical memory to satisfy the request.
typedef bool (chunk_decommit_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);
A chunk decommit function conforms to the chunk_decommit_t type and
decommits any physical memory that is backing pages within a chunk of
given size at offset bytes, extending for length on behalf of arena
arena_ind, returning false upon success, in which case the pages will be
committed via the chunk commit function before being reused. If the
function returns true, this indicates opt-out from decommit; the memory
remains committed and available for future use, in which case it will be
automatically retained for later reuse.
typedef bool (chunk_purge_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);
A chunk purge function conforms to the chunk_purge_t type and optionally
discards physical pages within the virtual memory mapping associated
with chunk of given size at offset bytes, extending for length on behalf
of arena arena_ind, returning false if pages within the purged virtual
memory range will be zero-filled the next time they are accessed.
typedef bool (chunk_split_t)(void *chunk, size_t size, size_t size_a,
    size_t size_b, bool committed, unsigned arena_ind);
A chunk split function conforms to the chunk_split_t type and optionally
splits chunk of given size into two adjacent chunks, the first of size_a
bytes, and the second of size_b bytes, operating on
committed/decommitted memory as indicated, on behalf of arena arena_ind,
returning false upon success. If the function returns true, this
indicates that the chunk remains unsplit and therefore should continue
to be operated on as a whole.
typedef bool (chunk_merge_t)(void *chunk_a, size_t size_a, void *chunk_b,
    size_t size_b, bool committed, unsigned arena_ind);
A chunk merge function conforms to the chunk_merge_t type and optionally
merges adjacent chunks, chunk_a of given size_a and chunk_b of given
size_b into one contiguous chunk, operating on committed/decommitted
memory as indicated, on behalf of arena arena_ind, returning false upon
success. If the function returns true, this indicates that the chunks
remain distinct mappings and therefore should continue to be operated on
independently.
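As a sketch of the opt-out convention described for the split and merge hooks, the following functions decline every request by returning true, which may suit platforms whose mappings do not split and coalesce automatically. The typedefs are repeated locally only so that the fragment compiles without the jemalloc headers, and the names no_split, no_merge, split_hook, and merge_hook are hypothetical. Installing the hooks through the arena.<i>.chunk_hooks mallctl is not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Local copies of the hook types documented above (normally provided by
 * <jemalloc/jemalloc.h>), repeated so the sketch is self-contained.
 */
typedef bool (chunk_split_t)(void *chunk, size_t size, size_t size_a,
    size_t size_b, bool committed, unsigned arena_ind);
typedef bool (chunk_merge_t)(void *chunk_a, size_t size_a, void *chunk_b,
    size_t size_b, bool committed, unsigned arena_ind);

/* Opt out of splitting: the chunk keeps being treated as a whole. */
static bool no_split(void *chunk, size_t size, size_t size_a,
    size_t size_b, bool committed, unsigned arena_ind)
{
    (void)chunk; (void)size; (void)size_a; (void)size_b;
    (void)committed; (void)arena_ind;
    return true;
}

/* Opt out of merging: the two chunks remain distinct mappings. */
static bool no_merge(void *chunk_a, size_t size_a, void *chunk_b,
    size_t size_b, bool committed, unsigned arena_ind)
{
    (void)chunk_a; (void)size_a; (void)chunk_b; (void)size_b;
    (void)committed; (void)arena_ind;
    return true;
}

/* Assigning through the typedefs checks that the signatures match. */
static chunk_split_t *const split_hook = no_split;
static chunk_merge_t *const merge_hook = no_merge;
```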
arenas.narenas
(unsigned)
r-
Current limit on number of arenas.
arenas.initialized
(bool *)
r-
An array of arenas.narenas
booleans. Each boolean indicates whether the corresponding arena is
initialized.
arenas.lg_dirty_mult
(ssize_t)
rw
Current default per-arena minimum ratio (log base 2) of
active to dirty pages, used to initialize arena.<i>.lg_dirty_mult
during arena creation. See opt.lg_dirty_mult for additional information.
arenas.decay_time
(ssize_t)
rw
Current default per-arena approximate time in seconds
from the creation of a set of unused dirty pages until an equivalent set
of unused dirty pages is purged and/or reused, used to initialize
arena.<i>.decay_time during arena creation. See opt.decay_time for
additional information.
arenas.quantum
(size_t)
r-
Quantum size.
arenas.page
(size_t)
r-
Page size.
arenas.tcache_max
(size_t)
r-
[--enable-tcache
]
Maximum thread-cached size class.
arenas.nbins
(unsigned)
r-
Number of bin size classes.
arenas.nhbins
(unsigned)
r-
[--enable-tcache
]
Total number of thread cache bin size classes.
arenas.bin.<i>.size
(size_t)
r-
Maximum size supported by size class.
arenas.bin.<i>.nregs
(uint32_t)
r-
Number of regions per page run.
arenas.bin.<i>.run_size
(size_t)
r-
Number of bytes per page run.
arenas.nlruns
(unsigned)
r-
Total number of large size classes.
arenas.lrun.<i>.size
(size_t)
r-
Maximum size supported by this large size class.
arenas.nhchunks
(unsigned)
r-
Total number of huge size classes.
arenas.hchunk.<i>.size
(size_t)
r-
Maximum size supported by this huge size class.
arenas.extend
(unsigned)
r-
Extend the array of arenas by appending a new arena, and returning the new arena index.
prof.thread_active_init
(bool)
rw
[--enable-prof
]
Control the initial setting for thread.prof.active
in newly created threads. See the opt.prof_thread_active_init
option for additional information.
prof.active
(bool)
rw
[--enable-prof
]
Control whether sampling is currently active. See the
opt.prof_active option for additional information, as well as the
interrelated thread.prof.active mallctl.
prof.dump
(const char *)
-w
[--enable-prof
]
Dump a memory profile to the specified file, or if NULL
is specified, to a file according to the pattern
<prefix>.<pid>.<seq>.m<mseq>.heap, where <prefix> is controlled by the
opt.prof_prefix option.
prof.gdump
(bool)
rw
[--enable-prof
]
When enabled, trigger a memory profile dump every time
the total virtual memory exceeds the previous maximum. Profiles are
dumped to files named according to the pattern
<prefix>.<pid>.<seq>.u<useq>.heap, where <prefix> is controlled by the
opt.prof_prefix option.
prof.reset
(size_t)
-w
[--enable-prof
]
Reset all memory profile statistics, and optionally
update the sample rate (see opt.lg_prof_sample and prof.lg_sample).
prof.lg_sample
(size_t)
r-
[--enable-prof
]
Get the current sample rate (see opt.lg_prof_sample).
prof.interval
(uint64_t)
r-
[--enable-prof
]
Average number of bytes allocated between
interval-based profile dumps. See the opt.lg_prof_interval
option for additional information.
stats.cactive
(size_t *)
r-
[--enable-stats
]
Pointer to a counter that contains an approximate count
of the current number of bytes in active pages. The estimate may be
high, but never low, because each arena rounds up when computing its
contribution to the counter. Note that the epoch mallctl has no bearing
on this counter. Furthermore, counter consistency is maintained via
atomic operations, so it is necessary to use an atomic operation in
order to guarantee a consistent read when dereferencing the pointer.
stats.allocated
(size_t)
r-
[--enable-stats
]
Total number of bytes allocated by the application.
stats.active
(size_t)
r-
[--enable-stats
]
Total number of bytes in active pages allocated by the
application. This is a multiple of the page size, and greater than or
equal to stats.allocated. This does not include
stats.arenas.<i>.pdirty, nor pages entirely devoted to allocator
metadata.
stats.metadata
(size_t)
r-
[--enable-stats
]
Total number of bytes dedicated to metadata, which
comprise base allocations used for bootstrap-sensitive internal
allocator data structures, arena chunk headers (see
stats.arenas.<i>.metadata.mapped), and internal allocations (see
stats.arenas.<i>.metadata.allocated).
stats.resident
(size_t)
r-
[--enable-stats
]
Maximum number of bytes in physically resident data
pages mapped by the allocator, comprising all pages dedicated to
allocator metadata, pages backing active allocations, and unused dirty
pages. This is a maximum rather than precise because pages may not
actually be physically resident if they correspond to demand-zeroed
virtual memory that has not yet been touched. This is a multiple of the
page size, and is larger than stats.active.
stats.mapped
(size_t)
r-
[--enable-stats
]
Total number of bytes in active chunks mapped by the
allocator. This is a multiple of the chunk size, and is larger than
stats.active. This does not include inactive chunks, even those that
contain unused dirty pages, which means that there is no strict ordering
between this and stats.resident.
stats.retained
(size_t)
r-
[--enable-stats
]
Total number of bytes in virtual memory mappings that
were retained rather than being returned to the operating system via
e.g. munmap(2). Retained virtual memory is
typically untouched, decommitted, or purged, so it has no strongly
associated physical memory (see chunk hooks for details). Retained
memory is excluded from mapped memory statistics, e.g. stats.mapped.
stats.arenas.<i>.dss
(const char *)
r-
dss (sbrk(2)) allocation precedence as
related to mmap(2) allocation. See opt.dss for details.
stats.arenas.<i>.lg_dirty_mult
(ssize_t)
r-
Minimum ratio (log base 2) of active to dirty pages.
See opt.lg_dirty_mult for details.
stats.arenas.<i>.decay_time
(ssize_t)
r-
Approximate time in seconds from the creation of a set
of unused dirty pages until an equivalent set of unused dirty pages is
purged and/or reused. See opt.decay_time for details.
stats.arenas.<i>.nthreads
(unsigned)
r-
Number of threads currently assigned to arena.
stats.arenas.<i>.pactive
(size_t)
r-
Number of pages in active runs.
stats.arenas.<i>.pdirty
(size_t)
r-
Number of pages within unused runs that are potentially
dirty, and for which madvise(... MADV_DONTNEED) or
similar has not been called.
stats.arenas.<i>.mapped
(size_t)
r-
[--enable-stats
]
Number of mapped bytes.
stats.arenas.<i>.retained
(size_t)
r-
[--enable-stats
]
Number of retained bytes. See stats.retained for
details.
stats.arenas.<i>.metadata.mapped
(size_t)
r-
[--enable-stats
]
Number of mapped bytes in arena chunk headers, which track the states of the non-metadata pages.
stats.arenas.<i>.metadata.allocated
(size_t)
r-
[--enable-stats
]
Number of bytes dedicated to internal allocations.
Internal allocations differ from application-originated allocations in
that they are for internal use, and that they are omitted from heap
profiles. This statistic is reported separately from stats.metadata and
stats.arenas.<i>.metadata.mapped because it overlaps with e.g. the
stats.allocated and stats.active statistics, whereas the other metadata
statistics do not.
stats.arenas.<i>.npurge
(uint64_t)
r-
[--enable-stats
]
Number of dirty page purge sweeps performed.
stats.arenas.<i>.nmadvise
(uint64_t)
r-
[--enable-stats
]
Number of madvise(... MADV_DONTNEED) or
similar calls made to purge dirty pages.
stats.arenas.<i>.purged
(uint64_t)
r-
[--enable-stats
]
Number of pages purged.
stats.arenas.<i>.small.allocated
(size_t)
r-
[--enable-stats
]
Number of bytes currently allocated by small objects.
stats.arenas.<i>.small.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests served by small bins.
stats.arenas.<i>.small.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of small objects returned to bins.
stats.arenas.<i>.small.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of small allocation requests.
stats.arenas.<i>.large.allocated
(size_t)
r-
[--enable-stats
]
Number of bytes currently allocated by large objects.
stats.arenas.<i>.large.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of large allocation requests served directly by the arena.
stats.arenas.<i>.large.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of large deallocation requests served directly by the arena.
stats.arenas.<i>.large.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of large allocation requests.
stats.arenas.<i>.huge.allocated
(size_t)
r-
[--enable-stats
]
Number of bytes currently allocated by huge objects.
stats.arenas.<i>.huge.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of huge allocation requests served directly by the arena.
stats.arenas.<i>.huge.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of huge deallocation requests served directly by the arena.
stats.arenas.<i>.huge.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of huge allocation requests.
stats.arenas.<i>.bins.<j>.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocations served by bin.
stats.arenas.<i>.bins.<j>.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocations returned to bin.
stats.arenas.<i>.bins.<j>.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests.
stats.arenas.<i>.bins.<j>.curregs
(size_t)
r-
[--enable-stats
]
Current number of regions for this size class.
stats.arenas.<i>.bins.<j>.nfills
(uint64_t)
r-
[--enable-stats
--enable-tcache
]
Cumulative number of tcache fills.
stats.arenas.<i>.bins.<j>.nflushes
(uint64_t)
r-
[--enable-stats
--enable-tcache
]
Cumulative number of tcache flushes.
stats.arenas.<i>.bins.<j>.nruns
(uint64_t)
r-
[--enable-stats
]
Cumulative number of runs created.
stats.arenas.<i>.bins.<j>.nreruns
(uint64_t)
r-
[--enable-stats
]
Cumulative number of times the current run from which to allocate changed.
stats.arenas.<i>.bins.<j>.curruns
(size_t)
r-
[--enable-stats
]
Current number of runs.
stats.arenas.<i>.lruns.<j>.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests for this size class served directly by the arena.
stats.arenas.<i>.lruns.<j>.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of deallocation requests for this size class served directly by the arena.
stats.arenas.<i>.lruns.<j>.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests for this size class.
stats.arenas.<i>.lruns.<j>.curruns
(size_t)
r-
[--enable-stats
]
Current number of runs for this size class.
stats.arenas.<i>.hchunks.<j>.nmalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests for this size class served directly by the arena.
stats.arenas.<i>.hchunks.<j>.ndalloc
(uint64_t)
r-
[--enable-stats
]
Cumulative number of deallocation requests for this size class served directly by the arena.
stats.arenas.<i>.hchunks.<j>.nrequests
(uint64_t)
r-
[--enable-stats
]
Cumulative number of allocation requests for this size class.
stats.arenas.<i>.hchunks.<j>.curhchunks
(size_t)
r-
[--enable-stats
]
Current number of huge allocations for this size class.
Although the heap profiling functionality was originally designed to be compatible with the pprof command that is developed as part of the gperftools package, the addition of per thread heap profiling functionality required a different heap profile format. The jeprof command is derived from pprof, with enhancements to support the heap profile format described here.
In the following hypothetical heap profile, [...] indicates elision for
the sake of compactness.
heap_v2/524288
  t*: 28106: 56637512 [0: 0]
  [...]
  t3: 352: 16777344 [0: 0]
  [...]
  t99: 17754: 29341640 [0: 0]
  [...]
@ 0x5f86da8 0x5f5a1dc [...] 0x29e4d4e 0xa200316 0xabb2988 [...]
  t*: 13: 6688 [0: 0]
  t3: 12: 6496 [0: 0]
  t99: 1: 192 [0: 0]
  [...]

MAPPED_LIBRARIES:
[...]
The following matches the above heap profile, but most
tokens are replaced with <description> to indicate
descriptions of the corresponding fields.
<heap_profile_format_version>/<mean_sample_interval>
  <aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  [...]
  <thread_3_aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  [...]
  <thread_99_aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  [...]
@ <top_frame> <frame> [...] <frame> <frame> <frame> [...]
  <backtrace_aggregate>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  <backtrace_thread_3>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  <backtrace_thread_99>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]
  [...]

MAPPED_LIBRARIES:
</proc/<pid>/maps>
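To make the field layout concrete, the following sketch parses the two record shapes shown above with sscanf(3): the version/interval header and a "<label>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]" line. It assumes each record sits on its own line; prof_record_t, parse_header, and parse_record are hypothetical names, not part of jemalloc or jeprof.

```c
#include <stdio.h>

/* Counters carried by one "<label>: ..." record (hypothetical type). */
typedef struct {
    char label[16];
    unsigned long long curobjs, curbytes, cumobjs, cumbytes;
} prof_record_t;

/* Parse "heap_v2/<mean_sample_interval>". Returns 0 on success. */
static int parse_header(const char *line, unsigned long long *interval)
{
    return sscanf(line, "heap_v2/%llu", interval) == 1 ? 0 : -1;
}

/*
 * Parse "<label>: <curobjs>: <curbytes> [<cumobjs>: <cumbytes>]".
 * Leading whitespace is skipped. Returns 0 on success, or -1 for lines
 * that do not carry all five fields (for example "[...]").
 */
static int parse_record(const char *line, prof_record_t *r)
{
    int n = sscanf(line, " %15[^:]: %llu: %llu [%llu: %llu",
        r->label, &r->curobjs, &r->curbytes, &r->cumobjs, &r->cumbytes);
    return n == 5 ? 0 : -1;
}
```

Labels longer than 15 characters are rejected by this sketch; a real consumer would size the buffer according to the labels it expects.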
When debugging, it is a good idea to configure/build jemalloc with
the --enable-debug and --enable-fill
options, and recompile the program with suitable options and symbols for
debugger support. When so configured, jemalloc incorporates a wide variety
of run-time assertions that catch application errors such as double-free,
write-after-free, etc.
Programs often accidentally depend on “uninitialized”
memory actually being filled with zero bytes. Junk filling
(see the opt.junk option) tends to expose such bugs in the form of
obviously incorrect results and/or coredumps. Conversely, zero
filling (see the opt.zero option) eliminates
the symptoms of such bugs. Between these two options, it is usually
possible to quickly detect, diagnose, and eliminate such bugs.
This implementation does not provide much detail about the problems
it detects, because the performance impact of storing such information
would be prohibitive. However, jemalloc does integrate with the most
excellent Valgrind tool if the --enable-valgrind
configuration option is enabled.
If any of the memory allocation/deallocation functions detect an
error or warning condition, a message will be printed to file descriptor
STDERR_FILENO. Errors will result in the process
dumping core. If the opt.abort option is set, most
warnings are treated as errors.
The malloc_message variable allows the programmer
to override the function which emits the text strings forming the errors
and warnings if for some reason the STDERR_FILENO file
descriptor is not suitable for this. malloc_message() takes the
cbopaque pointer argument that is NULL unless overridden by the
arguments in a call to malloc_stats_print(), followed by a string
pointer. Please note that doing anything which tries to allocate memory in
this function is likely to result in a crash or deadlock.
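A minimal sketch of a replacement message function that avoids memory allocation is shown below. It relies only on strlen(3) and write(2). The name message_to_fd and the convention that cbopaque points to an int holding the destination file descriptor are assumptions made for this example, not jemalloc conventions.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical replacement for malloc_message. cbopaque is assumed to
 * point to an int holding the destination file descriptor; NULL falls
 * back to STDERR_FILENO. Only strlen() and write() are called, so
 * nothing here allocates memory.
 */
static void message_to_fd(void *cbopaque, const char *s)
{
    int fd = (cbopaque != NULL) ? *(int *)cbopaque : STDERR_FILENO;
    size_t left = strlen(s);

    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n <= 0)
            return; /* give up on error rather than retry forever */
        s += n;
        left -= (size_t)n;
    }
}
```

With jemalloc linked in, such a function would be installed by assignment, e.g. malloc_message = message_to_fd;.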
All messages are prefixed by “<jemalloc>: ”.
The malloc() and calloc() functions return a pointer to the
allocated memory if successful; otherwise a NULL
pointer is returned and errno is set to ENOMEM.
The posix_memalign() function
returns the value 0 if successful; otherwise it returns an error value.
The posix_memalign() function will fail if:
EINVAL
The alignment parameter is not a power of 2 at least as large as
sizeof(void *).
ENOMEM
Memory allocation error.
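The alignment requirement can be checked before calling posix_memalign(), as in the following sketch. alignment_is_valid and alloc_aligned are hypothetical helper names; the wrapper collapses both failure conditions into a NULL return, which is a simplification chosen for this example.

```c
#define _POSIX_C_SOURCE 200112L
#include <stddef.h>
#include <stdlib.h>

/* True if alignment is a power of 2 and at least sizeof(void *). */
static int alignment_is_valid(size_t alignment)
{
    return alignment >= sizeof(void *) &&
        (alignment & (alignment - 1)) == 0;
}

/* Hypothetical wrapper: returns aligned memory, or NULL on any failure. */
static void *alloc_aligned(size_t alignment, size_t size)
{
    void *p = NULL;

    if (!alignment_is_valid(alignment))
        return NULL; /* posix_memalign() would reject this alignment */
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL; /* memory allocation error */
    return p;
}
```

Memory obtained this way is released with free().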
The aligned_alloc() function returns
a pointer to the allocated memory if successful; otherwise a
NULL pointer is returned and errno is set. The
aligned_alloc() function will fail if:
EINVAL
The alignment parameter is not a power of 2.
ENOMEM
Memory allocation error.
The realloc() function returns a
pointer, possibly identical to ptr, to the
allocated memory if successful; otherwise a NULL
pointer is returned, and errno is set to
ENOMEM if the error was the result of an
allocation failure. The realloc()
function always leaves the original buffer intact when an error occurs.
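Because the original buffer survives a failed realloc(), callers usually assign the result to a temporary first so that the original pointer is not lost. The sketch below shows that pattern; grow_buffer is a hypothetical helper, and it rejects a zero size only to keep the example away from the zero-size corner case.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: resize *bufp to new_size bytes. On success *bufp
 * is updated and 0 is returned. On failure -1 is returned and *bufp
 * still points to the original, intact buffer, which the caller remains
 * responsible for freeing.
 */
static int grow_buffer(void **bufp, size_t new_size)
{
    void *tmp;

    if (new_size == 0)
        return -1; /* zero-size requests are not handled by this sketch */
    tmp = realloc(*bufp, new_size);
    if (tmp == NULL)
        return -1; /* original buffer is still valid */
    *bufp = tmp;
    return 0;
}
```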
The free() function returns no value.
The mallocx() and rallocx() functions return a pointer to
the allocated memory if successful; otherwise a NULL
pointer is returned to indicate insufficient contiguous memory was
available to service the allocation request.
The xallocx() function returns the
real size of the resulting resized allocation pointed to by
ptr, which is a value less than size
if the allocation could not be adequately grown in place.
The sallocx() function returns the
real size of the allocation pointed to by ptr.
The nallocx() function returns the real size
that would result from a successful equivalent
mallocx() function call, or zero if
insufficient memory is available to perform the size computation.
The mallctl(), mallctlnametomib(), and mallctlbymib()
functions return 0 on
success; otherwise they return an error value. The functions will fail
if:
EINVAL
newp is not NULL, and newlen is too large or too small. Alternatively,
*oldlenp is too large or too small; in this case as much data as
possible are read despite the error.
ENOENT
name or mib specifies an unknown/invalid value.
EPERM
Attempt to read or write void value, or attempt to write read-only value.
EAGAIN
A memory allocation failure occurred.
EFAULT
An interface with side effects failed in some way
not directly related to mallctl*() read/write processing.
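When logging mallctl*() failures it can help to map the returned value to a short description. The sketch below assumes the errno-style codes EINVAL, ENOENT, EPERM, EAGAIN, and EFAULT, matched in that order to the five conditions listed above; mallctl_strerror is a hypothetical helper, not a jemalloc function.

```c
#include <errno.h>

/* Hypothetical helper: short description of a mallctl*() return value. */
static const char *mallctl_strerror(int err)
{
    switch (err) {
    case 0:
        return "success";
    case EINVAL:
        return "newlen or *oldlenp is too large or too small";
    case ENOENT:
        return "unknown or invalid name/mib";
    case EPERM:
        return "read/write of void value, or write of read-only value";
    case EAGAIN:
        return "memory allocation failure";
    case EFAULT:
        return "side-effect failure unrelated to read/write processing";
    default:
        return "unexpected error value";
    }
}
```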
The malloc_usable_size() function
returns the usable size of the allocation pointed to by ptr.
The following environment variable affects the execution of the allocation functions:
MALLOC_CONF
If the environment variable MALLOC_CONF is set, the characters it
contains will be interpreted as options.