function changes the size of the previously allocated memory referenced by
\fIptr\fR
to
\fIsize\fR
bytes\&. The contents of the memory are unchanged up to the lesser of the new and old sizes\&. If the new size is larger, the contents of the newly allocated portion of the memory are undefined\&. Upon success, the memory referenced by
\fIptr\fR
is freed and a pointer to the newly allocated memory is returned\&. Note that
argument that can be used to specify options\&. The functions only check the options that are contextually relevant\&. Use bitwise or (|) operations to specify one or more of the following:
.PP
\fBMALLOCX_LG_ALIGN(\fR\fB\fIla\fR\fR\fB) \fR
.RS4
Align the memory allocation to start at an address that is a multiple of
(1 << \fIla\fR)\&. This macro does not validate that
\fIla\fR
is within the valid range\&.
.RE
.PP
\fBMALLOCX_ALIGN(\fR\fB\fIa\fR\fR\fB) \fR
.RS4
Align the memory allocation to start at an address that is a multiple of
\fIa\fR, where
\fIa\fR
is a power of two\&. This macro does not validate that
\fIa\fR
is a power of 2\&.
.RE
.PP
\fBMALLOCX_ZERO\fR
.RS4
Initialize newly allocated memory to contain zero bytes\&. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes\&. If this macro is absent, newly allocated memory is uninitialized\&.
bytes, and returns a pointer to the base address of the resulting allocation, which may or may not have moved from its original location\&. Behavior is undefined if
parameter to allow the caller to pass in the allocation size as an optimization\&. The minimum valid input size is the original requested size of the allocation, and the maximum valid input size is the corresponding value returned by
function provides a general interface for introspecting the memory allocator, as well as setting modifiable parameters and triggering actions\&. The period\-separated
\fIname\fR
argument specifies a location in a tree\-structured namespace; see the
MALLCTL NAMESPACE
section for documentation on the tree contents\&. To read a value, pass a pointer via
\fIoldp\fR
to adequate space to contain the value, and a pointer to its length via
\fIoldlenp\fR; otherwise pass
\fBNULL\fR
and
\fBNULL\fR\&. Similarly, to write a value, pass a pointer to the value via
function provides a way to avoid repeated name lookups for applications that repeatedly query the same portion of the namespace, by translating a name to a
that is smaller than the number of period\-separated name components, which results in a partial MIB that can be used as the basis for constructing a complete MIB\&. For name components that are integers (e\&.g\&. the 2 in
arenas\&.bin\&.2\&.size), the corresponding MIB component will always be that integer\&. Therefore, it is legitimate to construct code like the following:
\fBNULL\fR\&. The statistics are presented in human\-readable form unless
\(lqJ\(rq
is specified as a character within the
\fIopts\fR
string, in which case the statistics are presented in
\m[blue]\fBJSON format\fR\m[]\&\s-2\u[2]\d\s+2\&. This function can be called repeatedly\&. General information that never changes during execution can be omitted by specifying
can be specified to omit merged arena and per arena statistics, respectively;
\(lqb\(rq,
\(lql\(rq, and
\(lqh\(rq
can be specified to omit per size class statistics for bins, large objects, and huge objects, respectively\&. Unrecognized characters are silently ignored\&. Note that thread caching may prevent some statistics from being completely up to date, since extra locking would be required to merge counters that track thread cache operations\&.
realloc(); rather it is provided solely as a tool for introspection purposes\&. Any discrepancy between the requested allocation size and the size reported by
Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile\- or run\-time\&.
options\&. Some options have boolean values (true/false), others have integer values (base 8, 10, or 16, depending on prefix), and yet others have raw string values\&.
.SH"IMPLEMENTATION NOTES"
.PP
Traditionally, allocators have used
\fBsbrk\fR(2)
to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory\&. If
This allocator uses multiple arenas in order to reduce lock contention for threaded programs on multi\-processor systems\&. This works well with regard to threading scalability, but incurs some costs\&. There is a small fixed per\-arena overhead, and additionally, arenas manage memory completely independently of each other, which means a small fixed increase in overall memory fragmentation\&. These overheads are not generally an issue, given the number of arenas normally used\&. Note that using substantially more arenas than the default is not likely to improve performance, mainly due to reduced cache performance\&. However, it may make sense to reduce the number of arenas if an application does not make much use of the allocation functions\&.
.PP
In addition to multiple arenas, unless
\fB\-\-disable\-tcache\fR
is specified during configuration, this allocator supports thread\-specific caching for small and large objects, in order to make it possible to completely avoid synchronization for most allocation requests\&. Such caching allows very fast allocation in the common case, but it increases memory usage and fragmentation, since a bounded number of objects can remain allocated in each thread cache\&.
Memory is conceptually broken into equal\-sized chunks, where the chunk size is a power of two that is greater than the page size\&. Chunks are always aligned to multiples of the chunk size\&. This alignment makes it possible to find metadata for user objects very quickly\&. User objects are broken into three categories according to size: small, large, and huge\&. Multiple small and large objects can reside within a single chunk, whereas huge objects each have one or more chunks backing them\&. Each chunk that contains small and/or large objects tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object)\&. The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time\&.
Small objects are managed in groups by page runs\&. Each run maintains a bitmap to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least
sizeof(\fBdouble\fR)\&. All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes\&. Small size classes are smaller than four times the page size, large size classes are smaller than the chunk size (see the
Allocations are packed tightly together, which can be an issue for multi\-threaded applications\&. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating\&.
to grow e\&.g\&. a 9\-byte allocation to 16 bytes, or shrink a 16\-byte allocation to 9 bytes\&. Growth and shrinkage trivially succeeds in place as long as the pre\-size and post\-size both round up to the same size class\&. No other API guarantees are made regarding in\-place resizing, but the current implementation also tries to resize large and huge allocations in place, as long as the pre\-size and post\-size are both large or both huge\&. In such cases shrinkage always succeeds for large size classes, but for huge size classes the chunk allocator must support splitting (see
arena\&.<i>\&.chunk_hooks)\&. Growth only succeeds if the trailing memory is currently available, and additionally for huge size classes the chunk allocator must support merging\&.
functions report values, and increment the epoch\&. Return the current epoch\&. This is useful for detecting whether another thread caused a refresh\&.
Virtual memory chunk size (log base 2)\&. If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size\&. The default chunk size is 2 MiB (2^21)\&.
Maximum number of arenas to use for automatic multiplexing of threads and arenas\&. The default is four times the number of CPUs, or one if there is a single CPU\&.
Per\-arena minimum ratio (log base 2) of active to dirty pages\&. Some dirty unused pages may be allowed to accumulate, within the limit set by the ratio (or one chunk worth of dirty pages, whichever is greater), before informing the kernel about some of those pages via
or a similar system call\&. This provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused\&. The default minimum ratio is 8:1 (2^3:1); an option value of \-1 will disable dirty page purging\&. See
Approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. The pages are incrementally purged according to a sigmoidal decay curve that starts and ends with zero purge rate\&. A decay time of 0 causes all unused dirty pages to be purged immediately upon creation\&. A decay time of \-1 disables purging\&. The default decay time is 10 seconds\&. See
is specified during configuration, this has the potential to cause deadlock for a multi\-threaded process that exits while one or more threads are executing in the memory allocation functions\&. Furthermore,
function with equivalent functionality)\&. Therefore, this option should only be used with care; it is primarily intended as a performance tuning aid during application development\&. This option is disabled by default\&.
Per thread quarantine size in bytes\&. If non\-zero, each thread maintains a FIFO object quarantine that stores up to the specified number of bytes of memory\&. The quarantined memory is not freed until it is released from quarantine, though it is immediately junk\-filled if the
\m[blue]\fBValgrind\fR\m[]\&\s-2\u[3]\d\s+2, which can detect attempts to access quarantined objects\&. This is intended for debugging and will impact performance negatively\&. The default quarantine size is 0 unless running inside Valgrind, in which case the default is 16 MiB\&.
option is enabled, the redzones are checked for corruption during deallocation\&. However, the primary intended purpose of this feature is to be used in combination with
\m[blue]\fBValgrind\fR\m[]\&\s-2\u[3]\d\s+2, which needs redzones in order to do effective buffer overflow/underflow detection\&. This option is intended for debugging and will impact performance negatively\&. This option is disabled by default unless running inside Valgrind\&.
Zero filling enabled/disabled\&. If enabled, each byte of uninitialized allocated memory will be initialized to 0\&. Note that this initialization only happens once for each byte, so
calls do not zero memory that was previously allocated\&. This is intended for debugging and will impact performance negatively\&. This option is disabled by default\&.
Abort\-on\-out\-of\-memory enabled/disabled\&. If enabled, rather than returning failure for any allocation function, display a diagnostic message on
\fBSTDERR_FILENO\fR
and cause the program to drop core (using
\fBabort\fR(3))\&. If an application is designed to depend on this behavior, set the option at compile time by including the following in the source code:
Thread\-specific caching (tcache) enabled/disabled\&. When there are multiple threads, each thread uses a tcache for objects up to a certain size\&. Thread\-specific caching allows many allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use\&. See the
Maximum size class (log base 2) to cache in the thread\-specific cache (tcache)\&. At a minimum, all small size classes are cached, and at a maximum all large size classes are cached\&. The default maximum is 32 KiB (2^15)\&.
Filename prefix for profile dumps\&. If the prefix is set to the empty string, no automatic dumps will occur; this is primarily useful for disabling the automatic final heap dump (which also disables leak reporting, if enabled)\&. The default prefix is
Profiling activated/deactivated\&. This is a secondary control mechanism that makes it possible to start the application with profiling enabled (see the
Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity\&. Increasing the sampling interval decreases profile fidelity, but also decreases the computational overhead\&. The default sample interval is 512 KiB (2^19 B)\&.
Reporting of cumulative object/byte counts in profile dumps enabled/disabled\&. If this option is enabled, every unique backtrace must be stored for the duration of execution\&. Depending on the application, this can impose a large memory overhead, and the cumulative counts are not always of interest\&. This option is disabled by default\&.
Average interval (log base 2) between memory profile dumps, as measured in bytes of allocation activity\&. The actual interval between dumps may be sporadic because decentralized allocation counters are used to avoid synchronization bottlenecks\&. Profiles are dumped to files named according to the pattern
prof\&.gdump, which when enabled triggers a memory profile dump every time the total virtual memory exceeds the previous maximum\&. This option is disabled by default\&.
Get the total number of bytes ever allocated by the calling thread\&. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases\&.
Get the total number of bytes ever deallocated by the calling thread\&. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases\&.
Flush calling thread\*(Aqs thread\-specific cache (tcache)\&. This interface releases all cached objects and internal data structures associated with the calling thread\*(Aqs tcache\&. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits\&. However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case the developer may find manual flushing useful\&.
Get/set the descriptive name associated with the calling thread in memory profile dumps\&. An internal copy of the name string is created, so the input string need not be maintained after this interface completes execution\&. The output string of this interface should be copied for non\-ephemeral uses, because multiple implementation details can cause asynchronous string deallocation\&. Furthermore, each invocation of this interface can only read or write; simultaneous read/write is not supported due to string lifetime limitations\&. The name string must be nil\-terminated and comprised only of characters in the sets recognized by
Create an explicit thread\-specific cache (tcache) and return an identifier that can be passed to the
\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB)\fR
macro to explicitly use the specified cache rather than the automatically managed one that is used by default\&. Each explicit cache can be used by only one thread at a time; the application must assure that this constraint holds\&.
Discard all of the arena\*(Aqs extant allocations\&. This interface can only be used with arenas created via
arenas\&.extend\&. None of the arena\*(Aqs discarded/cached allocations may accessed afterward\&. As part of this requirement, all thread caches which were used to allocate/deallocate in conjunction with the arena must be flushed beforehand\&. This interface cannot be used if running inside Valgrind, nor if the
Current per\-arena minimum ratio (log base 2) of active to dirty pages for arena <i>\&. Each time this interface is set and the ratio is increased, pages are synchronously purged as necessary to impose the new ratio\&. See
Current per\-arena approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. Each time this interface is set, all currently unused dirty pages are considered to have fully decayed, which causes immediate purging of all unused dirty pages unless the decay time is set to \-1 (i\&.e\&. purging disabled)\&. See
Get or set the chunk management hook functions for arena <i>\&. The functions must be capable of operating on all extant chunks associated with arena <i>, usually by passing unknown chunks to the replaced functions\&. In practice, it is feasible to control allocation for arenas created via
such that all chunks originate from an application\-supplied chunk allocator (by setting custom chunk hook functions just after arena creation), but the automatically created arenas may have already created chunks prior to the application having an opportunity to take over chunk allocation\&.
.sp
.ifn\{\
.RS4
.\}
.nf
typedef struct {
chunk_alloc_t *alloc;
chunk_dalloc_t *dalloc;
chunk_commit_t *commit;
chunk_decommit_t *decommit;
chunk_purge_t *purge;
chunk_split_t *split;
chunk_merge_t *merge;
} chunk_hooks_t;
.fi
.ifn\{\
.RE
.\}
.sp
The
\fBchunk_hooks_t\fR
structure comprises function pointers which are described individually below\&. jemalloc uses these functions to manage chunk lifetime, which starts off with allocation of mapped committed memory, in the simplest case followed by deallocation\&. However, there are performance and platform reasons to retain chunks for later reuse\&. Cleanup attempts cascade from deallocation to decommit to purging, which gives the chunk management functions opportunities to reject the most permanent cleanup operations in favor of less permanent (and often less costly) operations\&. The chunk splitting and merging operations can also be opted out of, but this is mainly intended to support platforms on which virtual memory mappings provided by the operating system kernel do not automatically coalesce and split, e\&.g\&. Windows\&.
such that the chunk\*(Aqs base address is a multiple of
\fIalignment\fR, as well as setting
\fI*zero\fR
to indicate whether the chunk is zeroed and
\fI*commit\fR
to indicate whether the chunk is committed\&. Upon error the function returns
\fBNULL\fR
and leaves
\fI*zero\fR
and
\fI*commit\fR
unmodified\&. The
\fIsize\fR
parameter is always a multiple of the chunk size\&. The
\fIalignment\fR
parameter is always a power of two at least as large as the chunk size\&. Zeroing is mandatory if
\fI*zero\fR
is true upon function entry\&. Committing is mandatory if
\fI*commit\fR
is true upon function entry\&. If
\fIchunk\fR
is not
\fBNULL\fR, the returned pointer must be
\fIchunk\fR
on success or
\fBNULL\fR
on error\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. Note that replacing the default chunk allocation function makes the arena\*(Aqs
\fIcommitted\fR/decommited memory as indicated, on behalf of arena
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates opt\-out from deallocation; the virtual memory mapping associated with the chunk remains mapped, in the same commit state, and available for future use, in which case it will be automatically retained for later reuse\&.
type and commits zeroed physical memory to back pages within a
\fIchunk\fR
of given
\fIsize\fR
at
\fIoffset\fR
bytes, extending for
\fIlength\fR
on behalf of arena
\fIarena_ind\fR, returning false upon success\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. If the function returns true, this indicates insufficient physical memory to satisfy the request\&.
type and decommits any physical memory that is backing pages within a
\fIchunk\fR
of given
\fIsize\fR
at
\fIoffset\fR
bytes, extending for
\fIlength\fR
on behalf of arena
\fIarena_ind\fR, returning false upon success, in which case the pages will be committed via the chunk commit function before being reused\&. If the function returns true, this indicates opt\-out from decommit; the memory remains committed and available for future use, in which case it will be automatically retained for later reuse\&.
\fIcommitted\fR/decommitted memory as indicated, on behalf of arena
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the chunk remains unsplit and therefore should continue to be operated on as a whole\&.
\fIcommitted\fR/decommitted memory as indicated, on behalf of arena
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the chunks remain distinct mappings and therefore should continue to be operated on independently\&.
Current default per\-arena approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused, used to initialize
When enabled, trigger a memory profile dump every time the total virtual memory exceeds the previous maximum\&. Profiles are dumped to files named according to the pattern
Pointer to a counter that contains an approximate count of the current number of bytes in active pages\&. The estimate may be high, but never low, because each arena rounds up when computing its contribution to the counter\&. Note that the
mallctl has no bearing on this counter\&. Furthermore, counter consistency is maintained via atomic operations, so it is necessary to use an atomic operation in order to guarantee a consistent read when dereferencing the pointer\&.
Total number of bytes dedicated to metadata, which comprise base allocations used for bootstrap\-sensitive internal allocator data structures, arena chunk headers (see
Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages\&. This is a maximum rather than precise because pages may not actually be physically resident if they correspond to demand\-zeroed virtual memory that has not yet been touched\&. This is a multiple of the page size, and is larger than
stats\&.active\&. This does not include inactive chunks, even those that contain unused dirty pages, which means that there is no strict ordering between this and
Approximate time in seconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. See
Number of bytes dedicated to internal allocations\&. Internal allocations differ from application\-originated allocations in that they are for internal use, and that they are omitted from heap profiles\&. This statistic is reported separately from
Although the heap profiling functionality was originally designed to be compatible with the
\fBpprof\fR
command that is developed as part of the
\m[blue]\fBgperftools package\fR\m[]\&\s-2\u[4]\d\s+2, the addition of per thread heap profiling functionality required a different heap profile format\&. The
\fBjeprof\fR
command is derived from
\fBpprof\fR, with enhancements to support the heap profile format described here\&.
When debugging, it is a good idea to configure/build jemalloc with the
\fB\-\-enable\-debug\fR
and
\fB\-\-enable\-fill\fR
options, and recompile the program with suitable options and symbols for debugger support\&. When so configured, jemalloc incorporates a wide variety of run\-time assertions that catch application errors such as double\-free, write\-after\-free, etc\&.
option) eliminates the symptoms of such bugs\&. Between these two options, it is usually possible to quickly detect, diagnose, and eliminate such bugs\&.
This implementation does not provide much detail about the problems it detects, because the performance impact for storing such information would be prohibitive\&. However, jemalloc does integrate with the most excellent
malloc_stats_print(), followed by a string pointer\&. Please note that doing anything which tries to allocate memory in this function is likely to result in a crash or deadlock\&.