A simple caching allocator for pinned host memory allocations. More...
#include <CachingHostAllocator.h>
Classes | |
struct | BlockDescriptor |
class | TotalBytes |
Public Types | |
typedef std::multiset< BlockDescriptor, Compare > | BusyBlocks |
Set type for live blocks (ordered by ptr) More... | |
typedef std::multiset< BlockDescriptor, Compare > | CachedBlocks |
Set type for cached blocks (ordered by size) More... | |
typedef bool(* | Compare) (const BlockDescriptor &, const BlockDescriptor &) |
BlockDescriptor comparator function interface. More... | |
Public Member Functions | |
CachingHostAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) | |
Constructor. More... | |
CachingHostAllocator (bool skip_cleanup=false, bool debug=false) | |
Default constructor. More... | |
cudaError_t | FreeAllCached () |
Frees all cached pinned host allocations. More... | |
cudaError_t | HostAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=nullptr) |
Provides a suitable allocation of pinned host memory for the given size. More... | |
cudaError_t | HostFree (void *d_ptr) |
Frees a live allocation of pinned host memory, returning it to the allocator. More... | |
void | NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) |
void | SetMaxCachedBytes (size_t max_cached_bytes) |
Sets the limit on the number of bytes this allocator is allowed to cache. More... | |
~CachingHostAllocator () | |
Destructor. More... | |
Static Public Member Functions | |
static unsigned int | IntPow (unsigned int base, unsigned int exp) |
Public Attributes | |
unsigned int | bin_growth |
Geometric growth factor for bin-sizes. More... | |
CachedBlocks | cached_blocks |
Set of cached pinned host allocations available for reuse. More... | |
TotalBytes | cached_bytes |
Aggregate cached bytes. More... | |
bool | debug |
Whether or not to print (de)allocation events to stdout. More... | |
BusyBlocks | live_blocks |
Set of live pinned host allocations currently in use. More... | |
unsigned int | max_bin |
Maximum bin enumeration. More... | |
size_t | max_bin_bytes |
Maximum bin size. More... | |
size_t | max_cached_bytes |
Maximum aggregate cached bytes. More... | |
unsigned int | min_bin |
Minimum bin enumeration. More... | |
size_t | min_bin_bytes |
Minimum bin size. More... | |
std::mutex | mutex |
Mutex for thread-safety. More... | |
const bool | skip_cleanup |
Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators) More... | |
Static Public Attributes | |
static const unsigned int | INVALID_BIN = (unsigned int)-1 |
Out-of-bounds bin. More... | |
static const int | INVALID_DEVICE_ORDINAL = -1 |
Invalid device ordinal. More... | |
static const size_t | INVALID_SIZE = (size_t)-1 |
Invalid size. More... | |
A simple caching allocator for pinned host memory allocations.
I presume CUDA stream-safety is not useful here, because reading from or writing to pinned host memory requires synchronization anyway. The difference with respect to device memory is that the CPU schedules all operations on device memory via a CUDA stream, whereas operations on host memory can be performed directly.
Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused host allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
If the total storage of cached allocations would exceed max_cached_bytes, allocations are simply freed when they are deallocated instead of being returned to their bin-cache.
For example, the default-constructed CachingHostAllocator is configured with:
bin_growth = 8
min_bin = 3
max_bin = 7
max_cached_bytes = 6MB - 1B
Definition at line 100 of file CachingHostAllocator.h.
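The binning policy described above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `BinChoice` and `classifyRequest` are hypothetical names that do not exist in CachingHostAllocator.h, and the loop assumes that bin_growth ^ (max_bin + 1) fits in a `size_t`.

```cpp
#include <cstddef>

// Hypothetical result type for this sketch; not part of CachingHostAllocator.
struct BinChoice {
  bool cacheable;     // false when the request exceeds the largest bin
  unsigned int bin;   // exponent of the chosen bin (meaningful only if cacheable)
  std::size_t bytes;  // number of bytes that would actually be allocated
};

// Classify a request according to the policy in the overview:
// round up to a power of bin_growth, clamp small requests to min_bin,
// and treat anything above max_bin as an uncached, exact-size allocation.
BinChoice classifyRequest(std::size_t request, unsigned int bin_growth,
                          unsigned int min_bin, unsigned int max_bin) {
  unsigned int bin = 0;
  std::size_t rounded = 1;
  // Stop as soon as we pass max_bin so that very large requests cannot overflow.
  while ((rounded < request || bin < min_bin) && bin <= max_bin) {
    rounded *= bin_growth;
    ++bin;
  }
  if (bin > max_bin) {
    // Too large for any bin: allocate the exact size and never cache it.
    return {false, 0, request};
  }
  return {true, bin, rounded};
}
```

With the default configuration (bin_growth = 8, min_bin = 3, max_bin = 7), a 1-byte request would land in the 512B bin, a 513-byte request in the 4KB bin, and a request of 2MB + 1B would bypass the bins entirely.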
typedef std::multiset<BlockDescriptor, Compare> notcub::CachingHostAllocator::BusyBlocks |
Set type for live blocks (ordered by ptr)
Definition at line 170 of file CachingHostAllocator.h.
typedef std::multiset<BlockDescriptor, Compare> notcub::CachingHostAllocator::CachedBlocks |
Set type for cached blocks (ordered by size)
Definition at line 167 of file CachingHostAllocator.h.
typedef bool(* notcub::CachingHostAllocator::Compare) (const BlockDescriptor &, const BlockDescriptor &) |
BlockDescriptor comparator function interface.
Definition at line 157 of file CachingHostAllocator.h.
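notcub::CachingHostAllocator::CachingHostAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) |
inline |
Constructor.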
bin_growth | Geometric growth factor for bin-sizes |
min_bin | Minimum bin (default is bin_growth ^ 1) |
max_bin | Maximum bin (default is no max bin) |
max_cached_bytes | Maximum aggregate cached bytes (default is no limit) |
skip_cleanup | Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate) |
debug | Whether or not to print (de)allocation events to stdout (default is no output) |
Definition at line 242 of file CachingHostAllocator.h.
notcub::CachingHostAllocator::CachingHostAllocator (bool skip_cleanup=false, bool debug=false) |
inline |
Default constructor.
Configured with:
bin_growth = 8
min_bin = 3
max_bin = 7
max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes
which delineates five bin-sizes: 512B, 4KB, 32KB, 256KB, and 2MB, and sets a maximum of 6,291,455 cached bytes.
Definition at line 274 of file CachingHostAllocator.h.
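The default values above can be checked with a few lines of arithmetic. The helpers below are a sketch written for this page (`binSizes` and `defaultMaxCachedBytes` are hypothetical names, not members of the class); they only reproduce the stated formula, (bin_growth ^ max_bin) * 3 - 1.

```cpp
#include <cstddef>
#include <vector>

// Bin sizes in bytes for every bin from min_bin to max_bin (inclusive).
std::vector<std::size_t> binSizes(unsigned int bin_growth, unsigned int min_bin,
                                  unsigned int max_bin) {
  std::vector<std::size_t> sizes;
  std::size_t size = 1;  // bin_growth ^ 0
  for (unsigned int bin = 0; bin <= max_bin; ++bin) {
    if (bin >= min_bin) sizes.push_back(size);
    size *= bin_growth;
  }
  return sizes;
}

// Cache ceiling as stated for the default constructor:
// three blocks of the largest bin, minus one byte.
std::size_t defaultMaxCachedBytes(unsigned int bin_growth, unsigned int max_bin) {
  return binSizes(bin_growth, max_bin, max_bin).front() * 3 - 1;
}
```

For bin_growth = 8, min_bin = 3, and max_bin = 7, `binSizes` yields 512, 4096, 32768, 262144, and 2097152 bytes, and `defaultMaxCachedBytes` yields 6291455.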
notcub::CachingHostAllocator::~CachingHostAllocator () |
inline |
Destructor.
Definition at line 638 of file CachingHostAllocator.h.
References FreeAllCached(), and skip_cleanup.
cudaError_t notcub::CachingHostAllocator::FreeAllCached () |
inline |
Frees all cached pinned host allocations.
Definition at line 579 of file CachingHostAllocator.h.
References cached_blocks, cached_bytes, cudaCheck, debug, relativeConstraints::error, notcub::CachingHostAllocator::TotalBytes::free, INVALID_DEVICE_ORDINAL, notcub::CachingHostAllocator::TotalBytes::live, live_blocks, and mutex.
Referenced by cms::cuda::allocator::cachingAllocatorsFreeCached(), and ~CachingHostAllocator().
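cudaError_t notcub::CachingHostAllocator::HostAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=nullptr) |
inline |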
Provides a suitable allocation of pinned host memory for the given size.
Once freed, the allocation becomes available immediately for reuse.
[out] | d_ptr | Reference to pointer to the allocation |
[in] | bytes | Minimum number of bytes for the allocation |
[in] | active_stream | The stream to be associated with this allocation |
Definition at line 312 of file CachingHostAllocator.h.
References notcub::CachingHostAllocator::BlockDescriptor::associated_stream, notcub::CachingHostAllocator::BlockDescriptor::bin, bin_growth, notcub::CachingHostAllocator::BlockDescriptor::bytes, cached_blocks, cached_bytes, cudaCheck, notcub::CachingHostAllocator::BlockDescriptor::d_ptr, debug, notcub::CachingHostAllocator::BlockDescriptor::device, relativeConstraints::error, newFWLiteAna::found, notcub::CachingHostAllocator::TotalBytes::free, INVALID_BIN, INVALID_DEVICE_ORDINAL, notcub::CachingHostAllocator::TotalBytes::live, live_blocks, max_bin, min_bin, min_bin_bytes, mutex, NearestPowerOf(), and notcub::CachingHostAllocator::BlockDescriptor::ready_event.
cudaError_t notcub::CachingHostAllocator::HostFree (void *d_ptr) |
inline |
Frees a live allocation of pinned host memory, returning it to the allocator.
Once freed, the allocation becomes available immediately for reuse.
Definition at line 497 of file CachingHostAllocator.h.
References notcub::CachingHostAllocator::BlockDescriptor::associated_stream, notcub::CachingHostAllocator::BlockDescriptor::bin, notcub::CachingHostAllocator::BlockDescriptor::bytes, cached_blocks, cached_bytes, cudaCheck, debug, notcub::CachingHostAllocator::BlockDescriptor::device, relativeConstraints::error, notcub::CachingHostAllocator::TotalBytes::free, INVALID_BIN, INVALID_DEVICE_ORDINAL, notcub::CachingHostAllocator::TotalBytes::live, live_blocks, max_cached_bytes, mutex, and notcub::CachingHostAllocator::BlockDescriptor::ready_event.
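To make the allocate/free/reuse cycle concrete, here is a toy model of the caching behaviour. It is not the implementation in CachingHostAllocator.h: `ToyCachingAllocator` is a hypothetical class, `std::malloc`/`std::free` stand in for the pinned-memory calls, sizes are assumed to be already rounded to a bin size, and the mutex, streams, and events used by the real class are omitted.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>

// Toy model only: std::malloc stands in for pinned host allocation.
class ToyCachingAllocator {
public:
  explicit ToyCachingAllocator(std::size_t max_cached_bytes)
      : max_cached_bytes_(max_cached_bytes) {}

  ~ToyCachingAllocator() {
    for (auto& entry : cached_) std::free(entry.second);
    for (auto& entry : live_) std::free(entry.first);
  }
  ToyCachingAllocator(const ToyCachingAllocator&) = delete;
  ToyCachingAllocator& operator=(const ToyCachingAllocator&) = delete;

  // Hand out a cached block of exactly this size if one exists,
  // otherwise fall back to a fresh allocation.
  void* allocate(std::size_t bytes) {
    void* ptr = nullptr;
    auto hit = cached_.find(bytes);
    if (hit != cached_.end()) {
      ptr = hit->second;
      cached_.erase(hit);
      cached_bytes_ -= bytes;
    } else {
      ptr = std::malloc(bytes);
      if (ptr == nullptr) return nullptr;
    }
    live_[ptr] = bytes;
    return ptr;
  }

  // Return a live block: keep it for reuse if the ceiling allows,
  // otherwise release it immediately.
  void deallocate(void* ptr) {
    auto it = live_.find(ptr);
    if (it == live_.end()) return;  // not a block handed out by this allocator
    std::size_t bytes = it->second;
    live_.erase(it);
    if (cached_bytes_ + bytes <= max_cached_bytes_) {
      cached_.emplace(bytes, ptr);
      cached_bytes_ += bytes;
    } else {
      std::free(ptr);
    }
  }

  std::size_t cachedBytes() const { return cached_bytes_; }

private:
  std::size_t max_cached_bytes_;
  std::size_t cached_bytes_ = 0;
  std::multimap<std::size_t, void*> cached_;  // size -> block available for reuse
  std::map<void*, std::size_t> live_;         // pointer -> size of block in use
};
```

In this model, freeing a 512-byte block under a 1024-byte ceiling keeps it cached, and the next 512-byte request receives the same pointer back; a 4096-byte block exceeds the ceiling and is released on free instead of being cached.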
static unsigned int notcub::CachingHostAllocator::IntPow (unsigned int base, unsigned int exp) |
inline static |
Integer pow function for unsigned base and exponent.
Definition at line 179 of file CachingHostAllocator.h.
References newFWLiteAna::base, and JetChargeProducer_cfi::exp.
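An integer power for unsigned operands is commonly written with square-and-multiply. The function below is a sketch of that technique under the name `intPowSketch` (a hypothetical name); whether IntPow() uses exactly this loop is an assumption, not something this page states.

```cpp
// Square-and-multiply integer power.
// Overflow wraps around, as is defined for unsigned arithmetic.
unsigned int intPowSketch(unsigned int base, unsigned int exp) {
  unsigned int result = 1;
  while (exp > 0) {
    if (exp & 1u) result *= base;  // fold in the current power of base
    base *= base;                  // advance to the next squared power
    exp >>= 1;                     // move to the next bit of the exponent
  }
  return result;
}
```

For the default configuration, `intPowSketch(8, 3)` gives the smallest bin size and `intPowSketch(8, 7)` the largest.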
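void notcub::CachingHostAllocator::NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) |
inline |
Round value up to the nearest power of base, returning the exponent in power and the rounded size in rounded_bytes.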
Definition at line 194 of file CachingHostAllocator.h.
References newFWLiteAna::base, and cms::alpakatools::detail::power().
Referenced by HostAllocate().
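Rounding up to a power of a given base can be sketched as a simple multiply-until-large-enough loop. `nearestPowerOfSketch` is a hypothetical stand-in that mirrors the parameter list shown above; it assumes base >= 2 and a value small enough that the running product does not overflow `size_t`, and it makes no claim about how NearestPowerOf() handles those edge cases.

```cpp
#include <cstddef>

// Find the smallest power of `base` that is >= `value`.
// Outputs the exponent in `power` and the power itself in `rounded_bytes`.
void nearestPowerOfSketch(unsigned int& power, std::size_t& rounded_bytes,
                          unsigned int base, std::size_t value) {
  power = 0;
  rounded_bytes = 1;  // base ^ 0
  while (rounded_bytes < value) {
    rounded_bytes *= base;
    ++power;
  }
}
```

With base 8, a value of 1000 rounds up to 4096 (power 4), while a value of exactly 512 stays at 512 (power 3).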
void notcub::CachingHostAllocator::SetMaxCachedBytes (size_t max_cached_bytes) |
inline |
Sets the limit on the number of bytes this allocator is allowed to cache.
Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().
Definition at line 292 of file CachingHostAllocator.h.
References debug, max_cached_bytes, and mutex.
unsigned int notcub::CachingHostAllocator::bin_growth |
Geometric growth factor for bin-sizes.
Definition at line 217 of file CachingHostAllocator.h.
Referenced by HostAllocate().
CachedBlocks notcub::CachingHostAllocator::cached_blocks |
Set of cached pinned host allocations available for reuse.
Definition at line 230 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), and HostFree().
TotalBytes notcub::CachingHostAllocator::cached_bytes |
Aggregate cached bytes.
Definition at line 229 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), and HostFree().
bool notcub::CachingHostAllocator::debug |
Whether or not to print (de)allocation events to stdout.
Definition at line 227 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), HostFree(), and SetMaxCachedBytes().
const unsigned int notcub::CachingHostAllocator::INVALID_BIN = (unsigned int)-1 |
static |
Out-of-bounds bin.
Definition at line 106 of file CachingHostAllocator.h.
Referenced by HostAllocate(), and HostFree().
const int notcub::CachingHostAllocator::INVALID_DEVICE_ORDINAL = -1 |
static |
Invalid device ordinal.
Definition at line 114 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), and HostFree().
const size_t notcub::CachingHostAllocator::INVALID_SIZE = (size_t)-1 |
static |
Invalid size.
Definition at line 109 of file CachingHostAllocator.h.
BusyBlocks notcub::CachingHostAllocator::live_blocks |
Set of live pinned host allocations currently in use.
Definition at line 231 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), and HostFree().
unsigned int notcub::CachingHostAllocator::max_bin |
Maximum bin enumeration.
Definition at line 219 of file CachingHostAllocator.h.
Referenced by HostAllocate().
size_t notcub::CachingHostAllocator::max_bin_bytes |
Maximum bin size.
Definition at line 222 of file CachingHostAllocator.h.
size_t notcub::CachingHostAllocator::max_cached_bytes |
Maximum aggregate cached bytes.
Definition at line 223 of file CachingHostAllocator.h.
Referenced by HostFree(), and SetMaxCachedBytes().
unsigned int notcub::CachingHostAllocator::min_bin |
Minimum bin enumeration.
Definition at line 218 of file CachingHostAllocator.h.
Referenced by HostAllocate().
size_t notcub::CachingHostAllocator::min_bin_bytes |
Minimum bin size.
Definition at line 221 of file CachingHostAllocator.h.
Referenced by HostAllocate().
std::mutex notcub::CachingHostAllocator::mutex |
Mutex for thread-safety.
Definition at line 215 of file CachingHostAllocator.h.
Referenced by FreeAllCached(), HostAllocate(), HostFree(), and SetMaxCachedBytes().
const bool notcub::CachingHostAllocator::skip_cleanup |
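Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)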
Definition at line 226 of file CachingHostAllocator.h.
Referenced by ~CachingHostAllocator().