This is a version (aka dlmalloc) of malloc/free/realloc written by
Doug Lea and released to the public domain, as explained at
http://creativecommons.org/publicdomain/zero/1.0/ Send questions,
comments, complaints, performance data, etc to dl@cs.oswego.edu

Version 2.8.6 Wed Aug 29 06:57:58 2012 Doug Lea
Note: There may be an updated version of this malloc obtainable at
  ftp://gee.cs.oswego.edu/pub/misc/malloc.c
Check before installing!

* Quickstart

This library is all in one file to simplify the most common usage:
ftp it, compile it (-O3), and link it into another program. All of
the compile-time options default to reasonable values for use on
most platforms. You might later want to step through various
compile-time and dynamic tuning options.

For convenience, an include file for code using this malloc is at:
  ftp://gee.cs.oswego.edu/pub/misc/malloc-2.8.6.h
You don't really need this .h file unless you call functions not
defined in your system include files. The .h file contains only the
excerpts from this file needed for using this malloc on ANSI C/C++
systems, so long as you haven't changed compile-time options about
naming and tuning parameters. If you do, then you can create your
own malloc.h that does include all settings by cutting at the point
indicated below. Note that you may already by default be using a C
library containing a malloc that is based on some version of this
malloc (for example in Linux). You might still want to use the one
in this file to customize settings or to avoid overheads associated
with library versions.

* Vital statistics:

Supported pointer/size_t representation: 4 or 8 bytes
  size_t MUST be an unsigned type of the same width as
  pointers. (If you are using an ancient system that declares
  size_t as a signed type, or need it to be a different width
  than pointers, you can use a previous release of this malloc
  (e.g. 2.7.2) supporting these.)

Alignment: 8 bytes (minimum)
  Is set to 16 for NexGen32e.

Minimum overhead per allocated chunk: 4 or 8 bytes (if 4-byte sizes)
                                      8 or 16 bytes (if 8-byte sizes)
  Each malloced chunk has a hidden word of overhead holding size
  and status information, and an additional cross-check word
  if FOOTERS is defined.

Minimum allocated size: 4-byte ptrs: 16 bytes (including overhead)
                        8-byte ptrs: 32 bytes (including overhead)

  Even a request for zero bytes (i.e., malloc(0)) returns a
  pointer to something of the minimum allocatable size.
  The maximum overhead wastage (i.e., the number of bytes
  allocated beyond those requested in malloc) is less than or equal
  to the minimum size, except for requests >= mmap_threshold that
  are serviced via mmap(), where the worst case wastage is about
  32 bytes plus the remainder from a system page (the minimal
  mmap unit); typically 4096 or 8192 bytes.
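
The size arithmetic above can be sketched as follows. This is a minimal
illustration, not code taken from this malloc: the helper name
chunk_size_for and its parameters are hypothetical, and it assumes only
what is stated above (one overhead word, two with FOOTERS, a minimum
chunk of four words, and rounding up to the alignment).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of this malloc's API: estimates the chunk
   size used for a request, given the word size (4 or 8 bytes), the
   alignment (a power of two, at least 8), and whether FOOTERS adds a
   second overhead word. */
static size_t chunk_size_for(size_t request, size_t word, size_t align, int footers)
{
    assert(word == 4 || word == 8);
    assert(align >= 8 && (align & (align - 1)) == 0);

    size_t overhead  = footers ? 2 * word : word;  /* hidden size/status word(s) */
    size_t min_chunk = 4 * word;                   /* 16 or 32 bytes, as stated above */
    size_t padded    = (request + overhead + align - 1) & ~(align - 1);

    return padded < min_chunk ? min_chunk : padded;
}
```

Under these assumptions a zero-byte request on a system with 8-byte
pointers still occupies a 32-byte chunk, and a 100-byte request with
16-byte alignment occupies 112 bytes.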

Security: static-safe; optionally more or less
  The "security" of malloc refers to the ability of malicious
  code to accentuate the effects of errors (for example, freeing
  space that is not currently malloc'ed or overwriting past the
  ends of chunks) in code that calls malloc. This malloc
  guarantees not to modify any memory locations below the base of
  heap, i.e., static variables, even in the presence of usage
  errors. The routines additionally detect most improper frees
  and reallocs. All this holds as long as the static bookkeeping
  for malloc itself is not corrupted by some other means. This
  is only one aspect of security -- these checks do not, and
  cannot, detect all possible programming errors.

  If FOOTERS is defined nonzero, then each allocated chunk
  carries an additional check word to verify that it was malloced
  from its space. These check words are the same within each
  execution of a program using malloc, but differ across
  executions, so externally crafted fake chunks cannot be
  freed. This improves security by rejecting frees/reallocs that
  could corrupt heap memory, in addition to the checks preventing
  writes to statics that are always on. This may further improve
  security at the expense of time and space overhead. (Note that
  FOOTERS may also be worth using with MSPACES.)
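
One plausible way to build a check word with these properties is to mix
the address of the owning space with a value chosen once at startup.
The sketch below works under that assumption; the names make_footer and
footer_matches are hypothetical and the code is an illustration of the
idea rather than this malloc's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: builds a check word from the owning space's address and a
   per-execution magic value. A zero magic would leave the address unmasked. */
static uintptr_t make_footer(const void *owner, uintptr_t magic)
{
    assert(magic != 0);
    return (uintptr_t)owner ^ magic;
}

/* Hypothetical: a footer is accepted only if it decodes to the expected
   owner under the magic value of the current execution. */
static int footer_matches(uintptr_t footer, const void *owner, uintptr_t magic)
{
    return (footer ^ magic) == (uintptr_t)owner;
}
```

A footer forged without knowledge of the current magic value fails the
comparison, which is why a chunk crafted outside the running program
would be rejected on free.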

  By default detected errors cause the program to abort (calling
  "abort()"). You can override this to instead proceed past
  errors by defining PROCEED_ON_ERROR. In this case, a bad free
  has no effect, and a malloc that encounters a bad address
  caused by user overwrites will ignore the bad address by
  dropping pointers and indices to all known memory. This may
  be appropriate for programs that should continue if at all
  possible in the face of programming errors, although they may
  run out of memory because dropped memory is never reclaimed.

  If you don't like either of these options, you can define
  CORRUPTION_ERROR_ACTION and USAGE_ERROR_ACTION to do anything
  else. And if you are sure that your program using malloc has
  no errors or vulnerabilities, you can define TRUSTWORTHY to 1,
  which might (or might not) provide a small performance improvement.

  It is also possible to limit the maximum total allocatable
  space, using malloc_set_footprint_limit. This is not
  designed as a security feature in itself (calls to set limits
  are not screened or privileged), but may be useful as one
  aspect of a secure implementation.

Thread-safety: NOT thread-safe unless USE_LOCKS is defined non-zero
  When USE_LOCKS is defined, each public call to malloc, free,
  etc is surrounded with a lock. By default, this uses a plain
  pthread mutex, win32 critical section, or a spin-lock if
  available for the platform and not disabled by setting
  USE_SPIN_LOCKS=0. However, if USE_RECURSIVE_LOCKS is defined,
  recursive versions are used instead (which are not required for
  base functionality but may be needed in layered extensions).
  Using a global lock is not especially fast, and can be a major
  bottleneck. It is designed only to provide minimal protection
  in concurrent environments, and to provide a basis for
  extensions. If you are using malloc in a concurrent program,
  consider instead using nedmalloc
  (http://www.nedprod.com/programs/portable/nedmalloc/) or
  ptmalloc (See http://www.malloc.de), which are derived from
  versions of this malloc.
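
The pattern of surrounding each public call with one global lock can be
sketched with a C11 atomic_flag spin lock. Everything here is
illustrative: heap_lock, locked_malloc and locked_free are hypothetical
names, and the standard malloc/free stand in for the unlocked allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_flag heap_lock = ATOMIC_FLAG_INIT;  /* hypothetical global lock */

static void heap_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&heap_lock, memory_order_acquire)) {
        /* spin until the current holder releases the lock */
    }
}

static void heap_lock_release(void)
{
    atomic_flag_clear_explicit(&heap_lock, memory_order_release);
}

/* Hypothetical wrapper: the call into the allocator is bracketed by the lock. */
static void *locked_malloc(size_t bytes)
{
    heap_lock_acquire();
    void *mem = malloc(bytes);   /* stands in for the unlocked allocator */
    heap_lock_release();
    return mem;
}

static void locked_free(void *mem)
{
    heap_lock_acquire();
    free(mem);
    heap_lock_release();
}
```

Because every thread contends for the same flag, this shape makes the
bottleneck described above easy to see: allocation throughput cannot
exceed what one lock holder at a time can deliver.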

System requirements: Any combination of MORECORE and/or MMAP/MUNMAP
  This malloc can use unix sbrk or any emulation (invoked using
  the CALL_MORECORE macro) and/or mmap/munmap or any emulation
  (invoked using CALL_MMAP/CALL_MUNMAP) to get and release system
  memory. On most unix systems, it tends to work best if both
  MORECORE and MMAP are enabled. On Win32, it uses emulations
  based on VirtualAlloc. It also uses common C library functions
  like memset.

Compliance: I believe it is compliant with the Single Unix Specification
  (See http://www.unix.org). Also SVID/XPG, ANSI C, and probably
  others as well.

* Overview of algorithms

This is not the fastest, most space-conserving, most portable, or
most tunable malloc ever written. However it is among the fastest
while also being among the most space-conserving, portable and
tunable. Consistent balance across these factors results in a good
general-purpose allocator for malloc-intensive programs.

In most ways, this malloc is a best-fit allocator. Generally, it
chooses the best-fitting existing chunk for a request, with ties
broken in approximately least-recently-used order. (This strategy
normally maintains low fragmentation.) However, for requests less
than 256 bytes, it deviates from best-fit when there is not an
exactly fitting available chunk by preferring to use space adjacent
to that used for the previous small request, as well as by breaking
ties in approximately most-recently-used order. (These enhance
locality of series of small allocations.) And for very large requests
(>= 256KB by default), it relies on system memory mapping
facilities, if supported. (This helps avoid carrying around and
possibly fragmenting memory used only for large chunks.)
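
The best-fit rule itself can be sketched in a few lines. This is a
deliberately simplified illustration using a linear scan over an array
of free chunk sizes; the name best_fit_index and the NO_FIT sentinel are
hypothetical, and the real allocator uses bins and trees rather than a
scan (see the History section).

```c
#include <assert.h>
#include <stddef.h>

#define NO_FIT ((size_t)-1)   /* hypothetical "nothing large enough" result */

/* Returns the index of the smallest free chunk that can hold `request`,
   or NO_FIT if none is large enough. Ties go to the earliest entry. */
static size_t best_fit_index(const size_t *free_sizes, size_t count, size_t request)
{
    assert(free_sizes != NULL || count == 0);

    size_t best = NO_FIT;
    for (size_t i = 0; i < count; ++i) {
        if (free_sizes[i] < request)
            continue;                               /* too small to use */
        if (best == NO_FIT || free_sizes[i] < free_sizes[best])
            best = i;                               /* tighter fit found */
    }
    return best;
}
```

Choosing the tightest chunk leaves the larger chunks intact for later
large requests, which is the intuition behind the low fragmentation
mentioned above.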

All operations (except malloc_stats and mallinfo) have execution
times that are bounded by a constant factor of the number of bits in
a size_t, not counting any clearing in calloc or copying in realloc,
or actions surrounding MORECORE and MMAP that have times
proportional to the number of non-contiguous regions returned by
system allocation routines, which is often just 1. In real-time
applications, you can optionally suppress segment traversals using
NO_SEGMENT_TRAVERSAL, which assures bounded execution even when
system allocators return non-contiguous spaces, at the typical
expense of carrying around more memory and increased fragmentation.

The implementation is not very modular and seriously overuses
macros. Perhaps someday all C compilers will do as good a job
inlining modular code as can now be done by brute-force expansion,
but now, enough of them seem not to.

Some compilers issue a lot of warnings about code that is
dead/unreachable only on some platforms, and also about intentional
uses of negation on unsigned types. All known cases of each can be
ignored.

For a longer but out of date high-level description, see
  http://gee.cs.oswego.edu/dl/html/malloc.html

* MSPACES

If MSPACES is defined, then in addition to malloc, free, etc.,
this file also defines mspace_malloc, mspace_free, etc. These
are versions of malloc routines that take an "mspace" argument
obtained using create_mspace, to control all internal bookkeeping.
If ONLY_MSPACES is defined, only these versions are compiled.
So if you would like to use this allocator for only some allocations,
and your system malloc for others, you can compile with
ONLY_MSPACES and then do something like...
    static mspace mymspace = create_mspace(0,0); // for example
    #define mymalloc(bytes) mspace_malloc(mymspace, bytes)

(Note: If you only need one instance of an mspace, you can instead
use "USE_DL_PREFIX" to relabel the global malloc.)

You can similarly create thread-local allocators by storing
mspaces as thread-locals. For example:
    static __thread mspace tlms = 0;
    void* tlmalloc(size_t bytes) {
      if (tlms == 0) tlms = create_mspace(0, 0);
      return mspace_malloc(tlms, bytes);
    }
    void tlfree(void* mem) { mspace_free(tlms, mem); }

Unless FOOTERS is defined, each mspace is completely independent.
You cannot allocate from one and free to another (although
conformance is only weakly checked, so usage errors are not always
caught). If FOOTERS is defined, then each chunk carries around a tag
indicating its originating mspace, and frees are directed to their
originating spaces. Normally, this requires use of locks.

───────────────────────── Compile-time options ───────────────────────────

Be careful in setting #define values for numerical constants of type
size_t. On some systems, literal values are not automatically extended
to size_t precision unless they are explicitly cast. You can also
use the symbolic values SIZE_MAX, SIZE_T_ONE, etc below.
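
For instance, settings placed ahead of this file might force every
literal to size_t before multiplying. The values below simply repeat
defaults documented in this section and are shown for illustration;
the compile-time check is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example settings with every literal forced to size_t precision.
   Without the casts, a product such as 4096 * 1024 * 1024 would be
   evaluated in int and could overflow before being widened. */
#define DEFAULT_MMAP_THRESHOLD ((size_t)256U * (size_t)1024U)
#define DEFAULT_TRIM_THRESHOLD ((size_t)2U * (size_t)1024U * (size_t)1024U)
#define MAX_RELEASE_CHECK_RATE SIZE_MAX   /* effectively disables the periodic check */

/* Compile-time check that the expression really has size_t width. */
_Static_assert(sizeof(DEFAULT_MMAP_THRESHOLD) == sizeof(size_t),
               "threshold must be computed in size_t");
```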

WIN32                    default: defined if _WIN32 defined
  Defining WIN32 sets up defaults for MS environment and compilers.
  Otherwise defaults are for unix. Beware that there seem to be some
  cases where this malloc might not be a pure drop-in replacement for
  Win32 malloc: Random-looking failures from Win32 GDI APIs (e.g.,
  SetDIBits()) may be due to bugs in some video driver implementations
  when pixel buffers are malloc()ed, and the region spans more than
  one VirtualAlloc()ed region. Because dlmalloc uses a small (64KB)
  default granularity, pixel buffers may straddle virtual allocation
  regions more often than when using the Microsoft allocator. You can
  avoid this by using VirtualAlloc() and VirtualFree() for all pixel
  buffers rather than using malloc(). If this is not possible,
  recompile this malloc with a larger DEFAULT_GRANULARITY. Note:
  in cases where MSC and gcc (cygwin) are known to differ on WIN32,
  conditions use _MSC_VER to distinguish them.

DLMALLOC_EXPORT          default: extern
  Defines how public APIs are declared. If you want to export via a
  Windows DLL, you might define this as
    #define DLMALLOC_EXPORT extern __declspec(dllexport)
  If you want a POSIX ELF shared object, you might use
    #define DLMALLOC_EXPORT extern __attribute__((visibility("default")))

MALLOC_ALIGNMENT         default: (size_t)(2 * sizeof(void *))
  Controls the minimum alignment for malloc'ed chunks. It must be a
  power of two and at least 8, even on machines for which smaller
  alignments would suffice. It may be defined as larger than this
  though. Note however that code and data structures are optimized for
  the case of 8-byte alignment.

MSPACES                  default: 0 (false)
  If true, compile in support for independent allocation spaces.
  This is only supported if HAVE_MMAP is true.

ONLY_MSPACES             default: 0 (false)
  If true, only compile in mspace versions, not regular versions.

USE_LOCKS                default: 0 (false)
  Causes each call to each public routine to be surrounded with
  pthread or WIN32 mutex lock/unlock. (If set true, this can be
  overridden on a per-mspace basis for mspace versions.) If set to a
  non-zero value other than 1, locks are used, but their
  implementation is left out, so lock functions must be supplied manually,
  as described below.

USE_SPIN_LOCKS           default: 1 iff USE_LOCKS and spin locks available
  If true, uses custom spin locks for locking. This is currently
  supported only for gcc >= 4.1, older gccs on x86 platforms, and recent
  MS compilers. Otherwise, posix locks or win32 critical sections are
  used.

USE_RECURSIVE_LOCKS      default: not defined
  If defined nonzero, uses recursive (aka reentrant) locks, otherwise
  uses plain mutexes. This is not required for malloc proper, but may
  be needed for layered allocators such as nedmalloc.

LOCK_AT_FORK             default: not defined
  If defined nonzero, performs pthread_atfork upon initialization
  to initialize child lock while holding parent lock. The implementation
  assumes that pthread locks (not custom locks) are being used. In other
  cases, you may need to customize the implementation.

FOOTERS                  default: 0
  If true, provide extra checking and dispatching by placing
  information in the footers of allocated chunks. This adds
  space and time overhead.

TRUSTWORTHY              default: 0
  If true, omit checks for usage errors and heap space overwrites.

USE_DL_PREFIX            default: NOT defined
  Causes compiler to prefix all public routines with the string 'dl'.
  This can be useful when you only want to use this malloc in one part
  of a program, using your regular system malloc elsewhere.

MALLOC_INSPECT_ALL       default: NOT defined
  If defined, compiles malloc_inspect_all and mspace_inspect_all, that
  perform traversal of all heap space. Unless access to these
  functions is otherwise restricted, you probably do not want to
  include them in secure implementations.

MALLOC_ABORT             default: defined as abort()
  Defines how to abort on failed checks. On most systems, a failed
  check cannot die with an "assert" or even print an informative
  message, because the underlying print routines in turn call malloc,
  which will fail again. Generally, the best policy is to simply call
  abort(). It's not very useful to do more than this because many
  errors due to overwriting will show up as address faults (null, odd
  addresses etc) rather than malloc-triggered checks, so will also
  abort. Also, most compilers know that abort() does not return, so
  can better optimize code conditionally calling it.

PROCEED_ON_ERROR         default: defined as 0 (false)
  Controls whether detected bad addresses cause them to be bypassed
  rather than aborting. If set, detected bad arguments to free and
  realloc are ignored. And all bookkeeping information is zeroed out
  upon a detected overwrite of freed heap space, thus losing the
  ability to ever return it from malloc again, but enabling the
  application to proceed. If PROCEED_ON_ERROR is defined, the
  static variable malloc_corruption_error_count is compiled in
  and can be examined to see if errors have occurred. This option
  generates slower code than the default abort policy.

DEBUG                    default: NOT defined
  The DEBUG setting is mainly intended for people trying to modify
  this code or diagnose problems when porting to new platforms.
  However, it may also be able to better isolate user errors than just
  using runtime checks. The assertions in the check routines spell
  out in more detail the assumptions and invariants underlying the
  algorithms. The checking is fairly extensive, and will slow down
  execution noticeably. Calling malloc_stats or mallinfo with DEBUG
  set will attempt to check every non-mmapped allocated and free chunk
  in the course of computing the summaries.

ABORT_ON_ASSERT_FAILURE  default: defined as 1 (true)
  Debugging assertion failures can be nearly impossible if your
  version of the assert macro causes malloc to be called, which will
  lead to a cascade of further failures, blowing the runtime stack.
  ABORT_ON_ASSERT_FAILURE causes assertion failures to call abort(),
  which will usually make debugging easier.

MALLOC_FAILURE_ACTION    default: sets errno to ENOMEM, or no-op on win32
  The action to take before "return 0" when malloc fails to be able to
  return memory because there is none available.

HAVE_MORECORE            default: 1 (true) unless win32 or ONLY_MSPACES
  True if this system supports sbrk or an emulation of it.

MORECORE                 default: sbrk
  The name of the sbrk-style system routine to call to obtain more
  memory. See below for guidance on writing custom MORECORE
  functions. The type of the argument to sbrk/MORECORE varies across
  systems. It cannot be size_t, because it supports negative
  arguments, so it is normally the signed type of the same width as
  size_t (sometimes declared as "intptr_t"). It doesn't much matter
  though. Internally, we only call it with arguments less than half
  the max value of a size_t, which should work across all reasonable
  possibilities, although sometimes generating compiler warnings.
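
A custom routine along the lines described in the MORECORE entry could
be sketched as below. It hands out memory from a fixed static pool and
follows the sbrk convention of returning the previous break on success
and (void *)-1 on failure. The names arena_morecore, ARENA_BYTES and
MORECORE_FAILURE are hypothetical, and the pool size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_BYTES ((size_t)1 << 16)     /* hypothetical fixed pool size */
#define MORECORE_FAILURE ((void *)-1)     /* sbrk-style failure value */

static _Alignas(16) unsigned char arena[ARENA_BYTES];
static size_t arena_used = 0;

/* Hypothetical sbrk-style routine: a positive increment extends the break,
   a negative one gives space back, and zero just reports the break. */
static void *arena_morecore(intptr_t increment)
{
    unsigned char *old_break = arena + arena_used;

    if (increment >= 0) {
        if ((size_t)increment > ARENA_BYTES - arena_used)
            return MORECORE_FAILURE;      /* pool exhausted */
        arena_used += (size_t)increment;
    } else {
        /* Magnitude computed without overflowing on the most negative value. */
        size_t give_back = (size_t)(-(increment + 1)) + 1;
        if (give_back > arena_used)
            return MORECORE_FAILURE;      /* cannot shrink below the base */
        arena_used -= give_back;
    }
    assert(arena_used <= ARENA_BYTES);
    return old_break;
}
```

Such a routine would be selected by defining MORECORE as its name, for
example with #define MORECORE arena_morecore. Since successive calls
return increasing adjacent addresses, it also fits the contiguity
assumption that this malloc can exploit.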

MORECORE_CONTIGUOUS      default: 1 (true) if HAVE_MORECORE
  If true, take advantage of fact that consecutive calls to MORECORE
  with positive arguments always return contiguous increasing
  addresses. This is true of unix sbrk. It does not hurt too much to
  set it true anyway, since malloc copes with non-contiguities.
  Setting it false when definitely non-contiguous saves time
  and possibly wasted space it would take to discover this though.

MORECORE_CANNOT_TRIM     default: NOT defined
  True if MORECORE cannot release space back to the system when given
  negative arguments. This is generally necessary only if you are
  using a hand-crafted MORECORE function that cannot handle negative
  arguments.

NO_SEGMENT_TRAVERSAL     default: 0
  If non-zero, suppresses traversals of memory segments
  returned by either MORECORE or CALL_MMAP. This disables
  merging of segments that are contiguous, and selectively
  releasing them to the OS if unused, but bounds execution times.

HAVE_MMAP                default: 1 (true)
  True if this system supports mmap or an emulation of it. If so, and
  HAVE_MORECORE is not true, MMAP is used for all system
  allocation. If set and HAVE_MORECORE is true as well, MMAP is
  primarily used to directly allocate very large blocks. It is also
  used as a backup strategy in cases where MORECORE fails to provide
  space from system. Note: A single call to MUNMAP is assumed to be
  able to unmap memory that may have been allocated using multiple calls
  to MMAP, so long as they are adjacent.

HAVE_MREMAP              default: 1 on linux, else 0
  If true, realloc() uses mremap() to re-allocate large blocks and
  extend or shrink allocation spaces.

MMAP_CLEARS              default: 1 except on WINCE.
  True if mmap clears memory so calloc doesn't need to. This is true
  for standard unix mmap using /dev/zero and on WIN32 except for WINCE.

USE_BUILTIN_FFS          default: 0 (i.e., not used)
  Causes malloc to use the builtin ffs() function to compute indices.
  Some compilers may recognize and intrinsify ffs to be faster than the
  supplied C version. Also, the case of x86 using gcc is special-cased
  to an asm instruction, so is already as fast as it can be, and so
  this setting has no effect. Similarly for Win32 under recent MS compilers.
  (On most x86s, the asm version is only slightly faster than the C version.)
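
The kind of computation involved can be sketched in portable C: isolate
the lowest set bit, then count its position. The helper name
lowest_set_bit_index is hypothetical and this is not the C version
supplied with this malloc. Note that POSIX ffs() reports a 1-based
position, so the result here corresponds to ffs(x) - 1.

```c
#include <assert.h>

/* Hypothetical helper: 0-based index of the least significant set bit. */
static unsigned lowest_set_bit_index(unsigned x)
{
    assert(x != 0);

    /* x & -x keeps only the lowest set bit; it is written as 0U - x so that
       compilers do not warn about negating an unsigned value. */
    unsigned isolated = x & (0U - x);

    unsigned index = 0;
    while ((isolated >>= 1) != 0)
        ++index;
    return index;
}
```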

malloc_getpagesize       default: derive from system includes, or 4096.
  The system page size. To the extent possible, this malloc manages
  memory from the system in page-size units. This may be (and
  usually is) a function rather than a constant. This is ignored
  if WIN32, where page size is determined using GetSystemInfo during
  initialization.

NO_MALLINFO              default: 0
  If defined, don't compile "mallinfo". This can be a simple way
  of dealing with mismatches between system declarations and
  those in this file.

MALLINFO_FIELD_TYPE      default: size_t
  The type of the fields in the mallinfo struct. This was originally
  defined as "int" in SVID etc, but is more usefully defined as
  size_t. The value is used only if HAVE_USR_INCLUDE_MALLOC_H is not set.

NO_MALLOC_STATS          default: 0
  If defined, don't compile "malloc_stats". This avoids calls to
  fprintf and bringing in stdio dependencies you might not want.

REALLOC_ZERO_BYTES_FREES default: not defined
  This should be set if a call to realloc with zero bytes should
  be the same as a call to free. Some people think it should. Otherwise,
  since this malloc returns a unique pointer for malloc(0), so does
  realloc(p, 0).

LACKS_UNISTD_H, LACKS_FCNTL_H, LACKS_SYS_PARAM_H, LACKS_SYS_MMAN_H
LACKS_STRINGS_H, LACKS_STRING_H, LACKS_SYS_TYPES_H, LACKS_ERRNO_H
LACKS_STDLIB_H LACKS_SCHED_H LACKS_TIME_H default: NOT defined unless on WIN32
  Define these if your system does not have these header files.
  You might need to manually insert some of the declarations they provide.

DEFAULT_GRANULARITY      default: page size if MORECORE_CONTIGUOUS,
                                  system_info.dwAllocationGranularity in WIN32,
                                  otherwise 64K.
  Also settable using mallopt(M_GRANULARITY, x)
  The unit for allocating and deallocating memory from the system. On
  most systems with contiguous MORECORE, there is no reason to
  make this more than a page. However, systems with MMAP tend to
  either require or encourage larger granularities. You can increase
  this value to prevent system allocation functions from being called so
  often, especially if they are slow. The value must be at least one
  page and must be a power of two. Setting to 0 causes initialization
  to either page size or win32 region size. (Note: In previous
  versions of malloc, the equivalent of this option was called
  "TOP_PAD")
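
The power-of-two requirement on the granularity is what allows a system
request to be rounded up with a mask instead of a division. A minimal
sketch follows; the name round_up_to_granularity is hypothetical, and
the helper does not guard against overflow for sizes near SIZE_MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rounds `bytes` up to the next multiple of
   `granularity`, which must be a nonzero power of two. */
static size_t round_up_to_granularity(size_t bytes, size_t granularity)
{
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
    return (bytes + granularity - 1) & ~(granularity - 1);
}
```

With a 64K granularity, a request for a single byte from the system is
rounded to 65536 bytes, and a request for 65537 bytes is rounded to
131072 bytes.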

DEFAULT_TRIM_THRESHOLD   default: 2MB
  Also settable using mallopt(M_TRIM_THRESHOLD, x)
  The maximum amount of unused top-most memory to keep before
  releasing via malloc_trim in free(). Automatic trimming is mainly
  useful in long-lived programs using contiguous MORECORE. Because
  trimming via sbrk can be slow on some systems, and can sometimes be
  wasteful (in cases where programs immediately afterward allocate
  more large chunks) the value should be high enough so that your
  overall system performance would improve by releasing this much
  memory. As a rough guide, you might set to a value close to the
  average size of a process (program) running on your system.
  Releasing this much memory would allow such a process to run in
  memory. Generally, it is worth tuning trim thresholds when a
  program undergoes phases where several large chunks are allocated
  and released in ways that can reuse each other's storage, perhaps
  mixed with phases where there are no such chunks at all. The trim
  value must be greater than page size to have any useful effect. To
  disable trimming completely, you can set to SIZE_MAX. Note that the trick
  some people use of mallocing a huge space and then freeing it at
  program startup, in an attempt to reserve system memory, doesn't
  have the intended effect under automatic trimming, since that memory
  will immediately be returned to the system.

DEFAULT_MMAP_THRESHOLD   default: 256K
  Also settable using mallopt(M_MMAP_THRESHOLD, x)
  The request size threshold for using MMAP to directly service a
  request. Requests of at least this size that cannot be allocated
  using already-existing space will be serviced via mmap. (If enough
  normal freed space already exists it is used instead.) Using mmap
  segregates relatively large chunks of memory so that they can be
  individually obtained and released from the host system. A request
  serviced through mmap is never reused by any other request (at least
  not directly; the system may just so happen to remap successive
  requests to the same locations). Segregating space in this way has
  the benefits that: Mmapped space can always be individually released
  back to the system, which helps keep the system level memory demands
  of a long-lived program low. Also, mapped memory doesn't become
  `locked' between other chunks, as can happen with normally allocated
  chunks, which means that even trimming via malloc_trim would not
  release them. However, it has the disadvantage that the space
  cannot be reclaimed, consolidated, and then used to service later
  requests, as happens with normal chunks. The advantages of mmap
  nearly always outweigh disadvantages for "large" chunks, but the
  value of "large" may vary across systems. The default is an
  empirically derived value that works well in most systems. You can
  disable mmap by setting to SIZE_MAX.

MAX_RELEASE_CHECK_RATE   default: 4095 unless not HAVE_MMAP
  The number of consolidated frees between checks to release
  unused segments when freeing. When using non-contiguous segments,
  especially with multiple mspaces, checking only for topmost space
  doesn't always suffice to trigger trimming. To compensate for this,
  free() will, with a period of MAX_RELEASE_CHECK_RATE (or the
  current number of segments, if greater) try to release unused
  segments to the OS when freeing chunks that result in
  consolidation. The best value for this parameter is a compromise
  between slowing down frees with relatively costly checks that
  rarely trigger versus holding on to unused memory. To effectively
  disable, set to SIZE_MAX. This may lead to a very slight speed
  improvement at the expense of carrying around more memory.

────────────────────────────────────────────────────────────────────────────────

History:

v2.8.6 Wed Aug 29 06:57:58 2012 Doug Lea
  * fix bad comparison in dlposix_memalign
  * don't reuse adjusted asize in sys_alloc
  * add LOCK_AT_FORK -- thanks to Kirill Artamonov for the suggestion
  * reduce compiler warnings -- thanks to all who reported/suggested these

v2.8.5 Sun May 22 10:26:02 2011 Doug Lea (dl at gee)
  * Always perform unlink checks unless TRUSTWORTHY
  * Add posix_memalign.
  * Improve realloc to expand in more cases; expose realloc_in_place.
    Thanks to Peter Buhr for the suggestion.
  * Add footprint_limit, inspect_all, bulk_free. Thanks
    to Barry Hayes and others for the suggestions.
  * Internal refactorings to avoid calls while holding locks
  * Use non-reentrant locks by default. Thanks to Roland McGrath
    for the suggestion.
  * Small fixes to mspace_destroy, reset_on_error.
  * Various configuration extensions/changes. Thanks
    to all who contributed these.

V2.8.4a Thu Apr 28 14:39:43 2011 (dl at gee.cs.oswego.edu)
  * Update Creative Commons URL

V2.8.4 Wed May 27 09:56:23 2009 Doug Lea (dl at gee)
  * Use zeros instead of prev foot for is_mmapped
  * Add mspace_track_large_chunks; thanks to Jean Brouwers
  * Fix set_inuse in internal_realloc; thanks to Jean Brouwers
  * Fix insufficient sys_alloc padding when using 16byte alignment
  * Fix bad error check in mspace_footprint
  * Adaptations for ptmalloc; thanks to Wolfram Gloger.
  * Reentrant spin locks; thanks to Earl Chew and others
  * Win32 improvements; thanks to Niall Douglas and Earl Chew
  * Add NO_SEGMENT_TRAVERSAL and MAX_RELEASE_CHECK_RATE options
  * Extension hook in malloc_state
  * Various small adjustments to reduce warnings on some compilers
  * Various configuration extensions/changes for more platforms. Thanks
    to all who contributed these.

V2.8.3 Thu Sep 22 11:16:32 2005 Doug Lea (dl at gee)
  * Add max_footprint functions
  * Ensure all appropriate literals are size_t
  * Fix conditional compilation problem for some #define settings
  * Avoid concatenating segments with the one provided
    in create_mspace_with_base
  * Rename some variables to avoid compiler shadowing warnings
  * Use explicit lock initialization.
  * Better handling of sbrk interference.
  * Simplify and fix segment insertion, trimming and mspace_destroy
  * Reinstate REALLOC_ZERO_BYTES_FREES option from 2.7.x
  * Thanks especially to Dennis Flanagan for help on these.

V2.8.2 Sun Jun 12 16:01:10 2005 Doug Lea (dl at gee)
  * Fix memalign brace error.

V2.8.1 Wed Jun 8 16:11:46 2005 Doug Lea (dl at gee)
  * Fix improper #endif nesting in C++
  * Add explicit casts needed for C++

V2.8.0 Mon May 30 14:09:02 2005 Doug Lea (dl at gee)
  * Use trees for large bins
  * Support mspaces
  * Use segments to unify sbrk-based and mmap-based system allocation,
    removing need for emulation on most platforms without sbrk.
  * Default safety checks
  * Optional footer checks. Thanks to William Robertson for the idea.
  * Internal code refactoring
  * Incorporate suggestions and platform-specific changes.
    Thanks to Dennis Flanagan, Colin Plumb, Niall Douglas,
    Aaron Bachmann, Emery Berger, and others.
  * Speed up non-fastbin processing enough to remove fastbins.
  * Remove useless cfree() to avoid conflicts with other apps.
  * Remove internal memcpy, memset. Compilers handle builtins better.
  * Remove some options that no one ever used and rename others.
|
|
|
|
V2.7.2 Sat Aug 17 09:07:30 2002 Doug Lea (dl at gee)
* Fix malloc_state bitmap array misdeclaration

V2.7.1 Thu Jul 25 10:58:03 2002 Doug Lea (dl at gee)
* Allow tuning of FIRST_SORTED_BIN_SIZE
* Use PTR_UINT as type for all ptr->int casts. Thanks to John Belmonte.
* Better detection and support for non-contiguousness of MORECORE.
  Thanks to Andreas Mueller, Conal Walsh, and Wolfram Gloger
* Bypass most of malloc if no frees. Thanks to Emery Berger.
* Fix freeing of old top non-contiguous chunk in sysmalloc.
* Raised default trim and mmap thresholds to 256K.
* Fix mmap-related #defines. Thanks to Lubos Lunak.
* Fix copy macros; added LACKS_FCNTL_H. Thanks to Neal Walfield.
* Branch-free bin calculation
* Default trim and mmap thresholds now 256K.

V2.7.0 Sun Mar 11 14:14:06 2001 Doug Lea (dl at gee)
* Introduce independent_comalloc and independent_calloc.
  Thanks to Michael Pachos for motivation and help.
* Make optional .h file available
* Allow > 2GB requests on 32bit systems.
* new WIN32 sbrk, mmap, munmap, lock code from <Walter@GeNeSys-e.de>.
  Thanks also to Andreas Mueller <a.mueller at paradatec.de>,
  and Anonymous.
* Allow override of MALLOC_ALIGNMENT (Thanks to Ruud Waij for
  helping test this.)
* memalign: check alignment arg
* realloc: don't try to shift chunks backwards, since this
  leads to more fragmentation in some programs and doesn't
  seem to help in any others.
* Collect all cases in malloc requiring system memory into sysmalloc
* Use mmap as backup to sbrk
* Place all internal state in malloc_state
* Introduce fastbins (although similar to 2.5.1)
* Many minor tunings and cosmetic improvements
* Introduce USE_PUBLIC_MALLOC_WRAPPERS, USE_MALLOC_LOCK
* Introduce MALLOC_FAILURE_ACTION, MORECORE_CONTIGUOUS
  Thanks to Tony E. Bennett <tbennett@nvidia.com> and others.
* Include errno.h to support default failure action.

V2.6.6 Sun Dec 5 07:42:19 1999 Doug Lea (dl at gee)
* return null for negative arguments
* Added several WIN32 cleanups from Martin C. Fong <mcfong at yahoo.com>
* Add 'LACKS_SYS_PARAM_H' for those systems without 'sys/param.h'
  (e.g. WIN32 platforms)
* Clean up header file inclusion for WIN32 platforms
* Clean up code to avoid Microsoft Visual C++ compiler complaints
* Add 'USE_DL_PREFIX' to quickly allow co-existence with existing
  memory allocation routines
* Set 'malloc_getpagesize' for WIN32 platforms (needs more work)
* Use 'assert' rather than 'ASSERT' in WIN32 code to conform to
  usage of 'assert' in non-WIN32 code
* Improve WIN32 'sbrk()' emulation's 'findRegion()' routine to
  avoid infinite loop
* Always call 'fREe()' rather than 'free()'

V2.6.5 Wed Jun 17 15:57:31 1998 Doug Lea (dl at gee)
* Fixed ordering problem with boundary-stamping

V2.6.3 Sun May 19 08:17:58 1996 Doug Lea (dl at gee)
* Added pvalloc, as recommended by H.J. Lu
* Added 64bit pointer support mainly from Wolfram Gloger
* Added anonymously donated WIN32 sbrk emulation
* Malloc, calloc, getpagesize: add optimizations from Raymond Nijssen
* malloc_extend_top: fix mask error that caused wastage after
  foreign sbrks
* Add linux mremap support code from H.J. Lu

V2.6.2 Tue Dec 5 06:52:55 1995 Doug Lea (dl at gee)
* Integrated most documentation with the code.
* Add support for mmap, with help from
  Wolfram Gloger (Gloger@lrz.uni-muenchen.de).
* Use last_remainder in more cases.
* Pack bins using idea from colin@nyx10.cs.du.edu
* Use ordered bins instead of best-fit threshold
* Eliminate block-local decls to simplify tracing and debugging.
* Support another case of realloc via move into top
* Fix error occurring when initial sbrk_base not word-aligned.
* Rely on page size for units instead of SBRK_UNIT to
  avoid surprises about sbrk alignment conventions.
* Add mallinfo, mallopt. Thanks to Raymond Nijssen
  (raymond@es.ele.tue.nl) for the suggestion.
* Add `pad' argument to malloc_trim and top_pad mallopt parameter.
* More precautions for cases where other routines call sbrk,
  courtesy of Wolfram Gloger (Gloger@lrz.uni-muenchen.de).
* Added macros etc., allowing use in linux libc from
  H.J. Lu (hjl@gnu.ai.mit.edu)
* Inverted this history list

V2.6.1 Sat Dec 2 14:10:57 1995 Doug Lea (dl at gee)
* Re-tuned and fixed to behave more nicely with V2.6.0 changes.
* Removed all preallocation code since under current scheme
  the work required to undo bad preallocations exceeds
  the work saved in good cases for most test programs.
* No longer use return list or unconsolidated bins since
  no scheme using them consistently outperforms those that don't
  given above changes.
* Use best fit for very large chunks to prevent some worst-cases.
* Added some support for debugging

V2.6.0 Sat Nov 4 07:05:23 1995 Doug Lea (dl at gee)
* Removed footers when chunks are in use. Thanks to
  Paul Wilson (wilson@cs.texas.edu) for the suggestion.

V2.5.4 Wed Nov 1 07:54:51 1995 Doug Lea (dl at gee)
* Added malloc_trim, with help from Wolfram Gloger
  (wmglo@Dent.MED.Uni-Muenchen.DE).

V2.5.3 Tue Apr 26 10:16:01 1994 Doug Lea (dl at g)

V2.5.2 Tue Apr 5 16:20:40 1994 Doug Lea (dl at g)
* realloc: try to expand in both directions
* malloc: swap order of clean-bin strategy
* realloc: only conditionally expand backwards
* Try not to scavenge used bins
* Use bin counts as a guide to preallocation
* Occasionally bin return list chunks in first scan
* Add a few optimizations from colin@nyx10.cs.du.edu

V2.5.1 Sat Aug 14 15:40:43 1993 Doug Lea (dl at g)
* faster bin computation & slightly different binning
* merged all consolidations to one part of malloc proper
  (eliminating old malloc_find_space & malloc_clean_bin)
* Scan 2 returns chunks (not just 1)
* Propagate failure in realloc if malloc returns 0
* Add stuff to allow compilation on non-ANSI compilers
  from kpv@research.att.com

V2.5 Sat Aug 7 07:41:59 1993 Doug Lea (dl at g.oswego.edu)
* removed potential for odd address access in prev_chunk
* removed dependency on getpagesize.h
* misc cosmetics and a bit more internal documentation
* anticosmetics: mangled names in macros to evade debugger strangeness
* tested on sparc, hp-700, dec-mips, rs6000
  with gcc & native cc (hp, dec only) allowing
  Detlefs & Zorn comparison study (in SIGPLAN Notices.)

Trial version Fri Aug 28 13:14:29 1992 Doug Lea (dl at g.oswego.edu)
* Based loosely on libg++-1.2X malloc. (It retains some of the overall
  structure of old version, but most details differ.)

/* ──────────────────── Alternative MORECORE functions ─────────────────── */

/*
Guidelines for creating a custom version of MORECORE:

* For best performance, MORECORE should allocate in multiples of pagesize.
* MORECORE may allocate more memory than requested. (Or even less,
  but this will usually result in a malloc failure.)
* MORECORE must not allocate memory when given argument zero, but
  instead return one past the end address of memory from the previous
  nonzero call.
* For best performance, consecutive calls to MORECORE with positive
  arguments should return increasing addresses, indicating that
  space has been contiguously extended.
* Even though consecutive calls to MORECORE need not return contiguous
  addresses, it must be OK for malloc'ed chunks to span multiple
  regions in those cases where they do happen to be contiguous.
* MORECORE need not handle negative arguments -- it may instead
  just return MFAIL when given negative arguments.
  Negative arguments are always multiples of pagesize. MORECORE
  must not misinterpret negative args as large positive unsigned
  args. You can suppress all such calls from even occurring by defining
  MORECORE_CANNOT_TRIM.

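To make the guidelines above concrete, here is a minimal sketch of a
MORECORE replacement that hands out pieces of one fixed static buffer.
It is an illustration only and not part of this malloc: the names
arenaMoreCore, arena, arena_used, ARENA_SIZE and ARENA_PAGE_SIZE are
hypothetical, the 4096-byte page size and the ptrdiff_t argument type
are assumptions, and MFAIL is redefined locally (assuming the
all-bits-set pointer value) only so that the fragment compiles on its
own. Comments use // so that the fragment can sit inside this comment.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical settings for this sketch only.
#define ARENA_PAGE_SIZE ((size_t) 4096)          // assumed page size
#define ARENA_SIZE      (64 * ARENA_PAGE_SIZE)   // assumed pool size
#define MFAIL           ((void *) (~(size_t) 0)) // local stand-in

static char arena[ARENA_SIZE];
static size_t arena_used = 0;  // bytes handed out so far

void *arenaMoreCore(ptrdiff_t size)
{
  assert(arena_used <= ARENA_SIZE);

  if (size == 0)
  {
    // allocate nothing; report one past the end of what was handed out
    return arena + arena_used;
  }
  if (size < 0)
  {
    // trimming is not supported; never treat this as a huge request
    return MFAIL;
  }

  // round the request up to a multiple of the page size
  size_t want = ((size_t) size + ARENA_PAGE_SIZE - 1)
                & ~(ARENA_PAGE_SIZE - 1);
  if (want > ARENA_SIZE - arena_used)
    return MFAIL;

  // successive positive calls return increasing, contiguous addresses
  void *start = arena + arena_used;
  arena_used += want;
  return start;
}
```

Under these assumptions it could be selected with
#define MORECORE arenaMoreCore, ideally together with
MORECORE_CANNOT_TRIM so that negative arguments are never passed.
Note that a static char array is not guaranteed to start on a page
boundary, and the local MFAIL definition would have to be dropped
if the fragment were placed in this file.
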
As an example alternative MORECORE, here is a custom allocator
kindly contributed for classic (pre-OS X) Mac OS. It uses virtually but
not necessarily physically contiguous non-paged memory (locked in,
present and won't get swapped out). You can use it by uncommenting
this section, adding some #includes, and setting up the appropriate
defines above:

#define MORECORE osMoreCore

There is also a shutdown routine (osCleanupMem below) that should be
called for cleanup upon program exit.

#define MAX_POOL_ENTRIES 100
#define MINIMUM_MORECORE_SIZE (64 * 1024U)
static int next_os_pool;
void *our_os_pools[MAX_POOL_ENTRIES];

void *osMoreCore(int size)
{
  void *ptr = 0;
  static void *sbrk_top = 0;

  if (size > 0)
  {
    if (size < MINIMUM_MORECORE_SIZE)
      size = MINIMUM_MORECORE_SIZE;
    // fail if there is no slot left to record the pool for cleanup
    if (next_os_pool >= MAX_POOL_ENTRIES)
      return (void *) MFAIL;
    // request one extra page so the result can be page-aligned below
    if (CurrentExecutionLevel() == kTaskLevel)
      ptr = PoolAllocateResident(size + RM_PAGE_SIZE, 0);
    if (ptr == 0)
    {
      return (void *) MFAIL;
    }
    // save ptrs so they can be freed during cleanup
    our_os_pools[next_os_pool] = ptr;
    next_os_pool++;
    // round up to a page boundary and remember the new end address
    ptr = (void *) ((((size_t) ptr) + RM_PAGE_MASK) & ~RM_PAGE_MASK);
    sbrk_top = (char *) ptr + size;
    return ptr;
  }
  else if (size < 0)
  {
    // we don't currently support shrink behavior
    return (void *) MFAIL;
  }
  else
  {
    // size == 0: report the end of the most recent allocation
    return sbrk_top;
  }
}

// cleanup any allocated memory pools
// called as last thing before shutting down driver

void osCleanupMem(void)
{
  void **ptr;

  for (ptr = our_os_pools; ptr < &our_os_pools[MAX_POOL_ENTRIES]; ptr++)
    if (*ptr)
    {
      PoolDeallocate(*ptr);
      *ptr = 0;
    }
}

*/