ReadRepacker Class Reference

#include <ReadRepacker.h>

Public Types

using IOPosBuffer = edm::storage::IOPosBuffer
 
using IOSize = edm::storage::IOSize
 

Public Member Functions

IOSize bufferUsed () const
 
IOSize extraBytes () const
 
std::vector< IOPosBuffer > & iov ()
 
int pack (long long int *pos, int *len, int nbuf, char *buf, IOSize buffer_size)
 
IOSize realBytesProcessed () const
 
void unpack (char *buf)
 

Static Public Attributes

static constexpr IOSize BIG_READ_SIZE = 256 * 1024
 
static constexpr IOSize READ_COALESCE_SIZE = 32 * 1024
 
static constexpr IOSize TEMPORARY_BUFFER_SIZE = 256 * 1024
 

Private Member Functions

int packInternal (long long int *pos, int *len, int nbuf, char *buf, IOSize buffer_size)
 
void reset (unsigned int nbuf)
 

Private Attributes

IOSize m_buffer_used
 
IOSize m_extra_bytes
 
std::vector< int > m_idx_to_iopb
 
std::vector< int > m_idx_to_iopb_offset
 
std::vector< IOPosBuffer > m_iov
 
edm::propagate_const< int * > m_len
 
std::vector< char > m_spare_buffer
 

Detailed Description

Repack a set of read requests from the ROOT layer to be optimized for the storage layer.

The basic technique is to coalesce nearby, but not adjacent, reads into one larger read in the request to the storage system. We purposely over-read from storage.

The read-coalescing is done because the vector reads are typically unrolled server-side in a "dumb" fashion, with OS read-ahead disabled. The coalescing actually decreases the number of requests sent to disk; important, as ROOT I/O is typically latency bound.

The complexity here lies in the fact that we must have buffer space to hold the extra bytes from the storage system, even though they're going to be discarded.

The approach is to reuse the ROOT buffer as temporary holding space, plus a small, fixed-size "spare buffer". So, in the worst case, we will use about 256 KB of extra buffer space. The read-coalesce algorithm is greedy, so we can't provide an a priori estimate of how many extra I/O transactions will be sent to the storage (compared to vector reads with no coalescing). Tests currently indicate that this approach usually causes zero to one additional I/O transaction.
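The greedy coalescing idea described above can be illustrated with a minimal, standalone sketch. It is not the class's implementation: the `Read` and `CoalesceResult` types and the `coalesce` function are hypothetical names introduced here, and the sketch assumes the input is sorted by offset and non-overlapping. It deliberately ignores the buffer-size limits and the big-read rule that the real code applies.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not part of ReadRepacker.
struct Read {
  std::int64_t offset;  // file offset of the request
  std::int64_t size;    // number of bytes requested
};

struct CoalesceResult {
  std::vector<Read> merged;      // requests that would be sent to storage
  std::int64_t extra_bytes = 0;  // gap bytes read only to be discarded
};

// Greedily merge each read into the previous one when the gap between
// them is smaller than gap_limit; otherwise start a new storage request.
CoalesceResult coalesce(const std::vector<Read>& reads, std::int64_t gap_limit) {
  CoalesceResult result;
  for (const Read& r : reads) {
    if (!result.merged.empty()) {
      Read& last = result.merged.back();
      const std::int64_t gap = r.offset - (last.offset + last.size);
      assert(gap >= 0);  // assumption: sorted, non-overlapping input
      if (gap < gap_limit) {
        last.size = r.offset + r.size - last.offset;  // extend over the gap
        result.extra_bytes += gap;
        continue;
      }
    }
    result.merged.push_back(r);
  }
  return result;
}
```

With a gap limit of 32 * 1024, two 100-byte reads at offsets 0 and 150 would merge into one 250-byte request carrying 50 extra bytes, while a read at offset 100000 would remain a separate request.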

Definition at line 33 of file ReadRepacker.h.

Member Typedef Documentation

◆ IOPosBuffer

Definition at line 36 of file ReadRepacker.h.

◆ IOSize

Definition at line 35 of file ReadRepacker.h.

Member Function Documentation

◆ bufferUsed()

IOSize ReadRepacker::bufferUsed ( ) const
inline

Definition at line 49 of file ReadRepacker.h.

References m_buffer_used.

Referenced by TStorageFactoryFile::ReadBuffersSync().

49 { return m_buffer_used; } // Returns the total amount of space in the temp buffer used.

◆ extraBytes()

IOSize ReadRepacker::extraBytes ( ) const
inline

Definition at line 50 of file ReadRepacker.h.

References m_extra_bytes.

50  {
51  return m_extra_bytes;
52  } // Returns the number of extra bytes to be issued to the I/O system

◆ iov()

std::vector<IOPosBuffer>& ReadRepacker::iov ( )
inline

Definition at line 47 of file ReadRepacker.h.

References m_iov.

Referenced by TStorageFactoryFile::ReadBuffersSync().

47 { return m_iov; } // Returns the IO vector, optimized for storage.

◆ pack()

int ReadRepacker::pack ( long long int *  pos,
int *  len,
int  nbuf,
char *  buf,
IOSize  buffer_size 
)

Given a list of offsets and lengths, pack them into a vector of IOPosBuffer (an "IO vector"). This function coalesces reads that are within READ_COALESCE_SIZE of each other into a single IOPosBuffer. It will not create an IO vector whose summed buffer size is larger than TEMPORARY_BUFFER_SIZE. The IOPosBuffer entries in iov all point to a location inside buf.

Parameters
pos: An array of file offsets, nbuf long.
len: An array of read lengths (one per offset), nbuf long.
nbuf: Number of buffers to pack.
buf: Location of temporary buffer for the results of the storage request.
buffer_size: Size of the temporary buffer.

Returns the number of entries of the original array packed into iov.

Definition at line 21 of file ReadRepacker.cc.

References visDQMUpload::buf, m_len, m_spare_buffer, packInternal(), reset(), and TEMPORARY_BUFFER_SIZE.

Referenced by TStorageFactoryFile::ReadBuffersSync().

21  {
22    reset(nbuf);
23    m_len = len;  // Record the len array so we can later unpack.
24
25    // Determine the buffer to use for the initial packing.
26    char *tmp_buf;
27    IOSize tmp_size;
28    if (buffer_size < TEMPORARY_BUFFER_SIZE) {
30      tmp_buf = m_spare_buffer.data();
31      tmp_size = TEMPORARY_BUFFER_SIZE;
32    } else {
33      tmp_buf = buf;
34      tmp_size = buffer_size;
35    }
36
37    int pack_count = packInternal(pos, len, nbuf, tmp_buf, tmp_size);
38
39    if ((nbuf - pack_count > 0) &&             // If there is remaining work..
40        (tmp_buf != m_spare_buffer.data()) &&  // and the spare buffer isn't already used
41        ((IOSize)len[pack_count] <
42         TEMPORARY_BUFFER_SIZE)) {  // And the spare buffer is big enough to hold at least one read.
43
44      // Verify the spare is allocated.
45      // If tmp_buf != &m_spare_buffer[0] before, it certainly won't after.
47
48      // If there are remaining chunks and we aren't already using the spare
49      // buffer, try using that too.
50      // This clutters up the code badly, but could save a network round-trip.
51      pack_count += packInternal(
52          &pos[pack_count], &len[pack_count], nbuf - pack_count, m_spare_buffer.data(), TEMPORARY_BUFFER_SIZE);
53    }
54
55    return pack_count;
56  }

◆ packInternal()

int ReadRepacker::packInternal ( long long int *  pos,
int *  len,
int  nbuf,
char *  buf,
IOSize  buffer_size 
)
private

Definition at line 58 of file ReadRepacker.cc.

References cms::cuda::assert(), BIG_READ_SIZE, visDQMUpload::buf, heavyIonCSV_trainingSettings::idx, m_buffer_used, m_extra_bytes, m_idx_to_iopb, m_idx_to_iopb_offset, m_iov, edm::storage::IOPosBuffer::offset(), READ_COALESCE_SIZE, edm::storage::IOPosBuffer::set_data(), edm::storage::IOPosBuffer::set_offset(), edm::storage::IOPosBuffer::set_size(), and edm::storage::IOPosBuffer::size().

Referenced by pack().

58  {
59    if (nbuf == 0) {
60      return 0;
61    }
62
63    // Handle case 1 separately to make the for-loop cleaner.
64    int iopb_offset = m_iov.size();
65    // Because we re-use the buffer from ROOT, we are guaranteed this iopb will
66    // fit.
67    assert(static_cast<IOSize>(len[0]) <= buffer_size);
68    IOPosBuffer iopb(pos[0], buf, len[0]);
69    m_idx_to_iopb.push_back(iopb_offset);
70    m_idx_to_iopb_offset.push_back(0);
71
72    IOSize buffer_used = len[0];
73    int idx;
74    for (idx = 1; idx < nbuf; idx++) {
75      if (buffer_used + len[idx] > buffer_size) {
76        // No way we can include this chunk in the read buffer
77        break;
78      }
79
80      edm::storage::IOOffset extra_bytes_signed = (idx == 0) ? 0 : ((pos[idx] - iopb.offset()) - iopb.size());
81      assert(extra_bytes_signed >= 0);
82      IOSize extra_bytes = static_cast<IOSize>(extra_bytes_signed);
83
84      if (((static_cast<IOSize>(len[idx]) < BIG_READ_SIZE) || (iopb.size() < BIG_READ_SIZE)) &&
85          (extra_bytes < READ_COALESCE_SIZE) && (buffer_used + len[idx] + extra_bytes <= buffer_size)) {
86        // The space between the two reads is small enough we can coalesce.
87
88        // We enforce that the current read or the current iopb must be small.
89        // This is so we can "perfectly pack" buffers consisting of only big
90        // reads - in such a case, read coalescing doesn't help much.
91        m_idx_to_iopb.push_back(iopb_offset);
92        m_idx_to_iopb_offset.push_back(pos[idx] - iopb.offset());
93        iopb.set_size(pos[idx] + len[idx] - iopb.offset());
94        buffer_used += (len[idx] + extra_bytes);
95        m_extra_bytes += extra_bytes;
96        continue;
97      }
98      // There is a big jump, but still space left in the temporary buffer.
99      // Record our current iopb:
100      m_iov.push_back(iopb);
101
102      // Reset iopb
103      iopb.set_offset(pos[idx]);
104      iopb.set_data(buf + buffer_used);
105      iopb.set_size(len[idx]);
106
107      // Record location of this chunk.
108      iopb_offset++;
109
110      m_idx_to_iopb.push_back(iopb_offset);
111      m_idx_to_iopb_offset.push_back(0);
112
113      buffer_used += len[idx];
114    }
115    m_iov.push_back(iopb);
116
117    m_buffer_used += buffer_used;
118    return idx;
119  }
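The three-part coalescing test at lines 84-85 of the listing can be isolated into a small predicate for experimentation. This is only a sketch: `shouldCoalesce` and the `k...` constants are hypothetical names, the constant values are copied from the documented `READ_COALESCE_SIZE` and `BIG_READ_SIZE`, and `std::uint64_t` is assumed as a stand-in for `IOSize`.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the documented static attributes of ReadRepacker.
constexpr std::uint64_t kReadCoalesceSize = 32 * 1024;
constexpr std::uint64_t kBigReadSize = 256 * 1024;

// Hypothetical helper mirroring the condition in the listing:
//  1. at least one of the current iopb and the next read is "small",
//  2. the gap between them is below the coalesce threshold,
//  3. the next read plus the gap still fits in the buffer.
bool shouldCoalesce(std::uint64_t iopb_size, std::uint64_t next_len, std::uint64_t gap,
                    std::uint64_t buffer_used, std::uint64_t buffer_size) {
  const bool one_side_small = (next_len < kBigReadSize) || (iopb_size < kBigReadSize);
  const bool gap_small = gap < kReadCoalesceSize;
  const bool fits = buffer_used + next_len + gap <= buffer_size;
  return one_side_small && gap_small && fits;
}
```

Under these assumptions, two 4 KB reads separated by 1 KB would coalesce, whereas two reads of 256 KB each would not, even with no gap between them, because neither side is small.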

◆ realBytesProcessed()

IOSize ReadRepacker::realBytesProcessed ( ) const
inline

Definition at line 54 of file ReadRepacker.h.

References m_buffer_used, and m_extra_bytes.

Referenced by TStorageFactoryFile::ReadBuffersSync().

54  {
56  } // Return the number of bytes of the input request that would be processed by the IO vector

◆ reset()

void ReadRepacker::reset ( unsigned int  nbuf)
private

Definition at line 141 of file ReadRepacker.cc.

References m_buffer_used, m_extra_bytes, m_idx_to_iopb, m_idx_to_iopb_offset, and m_iov.

Referenced by pack().

141  {
142    m_extra_bytes = 0;
143    m_buffer_used = 0;
144
145    // Number of buffers to storage typically decreases, but nbuf/2 is just a
146    // somewhat-informed guess.
147    m_iov.reserve(nbuf / 2);
148    m_iov.clear();
149    m_idx_to_iopb.reserve(nbuf);
150    m_idx_to_iopb.clear();
151    m_idx_to_iopb_offset.reserve(nbuf);
152    m_idx_to_iopb_offset.clear();
153  }

◆ unpack()

void ReadRepacker::unpack ( char *  buf)

Unpack the optimized set of reads from the storage system and copy the results in the order ROOT requested.

Definition at line 125 of file ReadRepacker.cc.

References visDQMUpload::buf, edm::storage::IOPosBuffer::data(), heavyIonCSV_trainingSettings::idx, m_idx_to_iopb, m_idx_to_iopb_offset, m_iov, and m_len.

Referenced by TStorageFactoryFile::ReadBuffersSync().

125  {
126    char *root_result_ptr = buf;
127    int nbuf = m_idx_to_iopb.size();
128    for (int idx = 0; idx < nbuf; idx++) {
129      int iov_idx = m_idx_to_iopb[idx];
130      IOPosBuffer &iopb = m_iov[iov_idx];
131      int iopb_offset = m_idx_to_iopb_offset[idx];
132      char *io_result_ptr = static_cast<char *>(iopb.data()) + iopb_offset;
133      // Note that we use the input buffer as a temporary where possible.
134      // Hence, the source and destination can overlap; use memmove instead of memcpy.
135      memmove(root_result_ptr, io_result_ptr, m_len[idx]);
136
137      root_result_ptr += m_len[idx];
138    }
139  }
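The comment about overlapping source and destination can be made concrete with a simplified, standalone sketch. It assumes a single buffer that holds both the storage results and the final output, whereas the real class may also read from its spare buffer. The `Chunk` type and `compactInPlace` function are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical record: where a requested chunk sits in the storage-filled buffer.
struct Chunk {
  std::size_t src_offset;  // position within the buffer as filled by storage
  std::size_t len;         // number of bytes originally requested
};

// Slide each requested chunk toward the front of buf, dropping gap bytes.
// Returns the number of useful bytes now at the start of buf.
std::size_t compactInPlace(char* buf, const std::vector<Chunk>& chunks) {
  std::size_t dst = 0;
  for (const Chunk& c : chunks) {
    assert(c.src_offset >= dst);  // assumption: chunks only move forward
    // Source and destination may overlap, so memcpy would be undefined here.
    std::memmove(buf + dst, buf + c.src_offset, c.len);
    dst += c.len;
  }
  return dst;
}
```

For a buffer containing `AAAA..BBBBBB` with chunks at offsets 0 (length 4) and 6 (length 6), the second move copies from bytes 6-11 to bytes 4-9, an overlapping range that `std::memmove` handles correctly.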

Member Data Documentation

◆ BIG_READ_SIZE

constexpr IOSize ReadRepacker::BIG_READ_SIZE = 256 * 1024
static

Definition at line 66 of file ReadRepacker.h.

Referenced by packInternal().

◆ m_buffer_used

IOSize ReadRepacker::m_buffer_used
private

Definition at line 83 of file ReadRepacker.h.

Referenced by bufferUsed(), packInternal(), realBytesProcessed(), and reset().

◆ m_extra_bytes

IOSize ReadRepacker::m_extra_bytes
private

Definition at line 84 of file ReadRepacker.h.

Referenced by extraBytes(), packInternal(), realBytesProcessed(), and reset().

◆ m_idx_to_iopb

std::vector<int> ReadRepacker::m_idx_to_iopb
private

Definition at line 78 of file ReadRepacker.h.

Referenced by packInternal(), reset(), and unpack().

◆ m_idx_to_iopb_offset

std::vector<int> ReadRepacker::m_idx_to_iopb_offset
private

Definition at line 80 of file ReadRepacker.h.

Referenced by packInternal(), reset(), and unpack().

◆ m_iov

std::vector<IOPosBuffer> ReadRepacker::m_iov
private

Definition at line 81 of file ReadRepacker.h.

Referenced by iov(), packInternal(), reset(), and unpack().

◆ m_len

edm::propagate_const<int *> ReadRepacker::m_len
private

Definition at line 82 of file ReadRepacker.h.

Referenced by pack(), and unpack().

◆ m_spare_buffer

std::vector<char> ReadRepacker::m_spare_buffer
private

Definition at line 85 of file ReadRepacker.h.

Referenced by pack().

◆ READ_COALESCE_SIZE

constexpr IOSize ReadRepacker::READ_COALESCE_SIZE = 32 * 1024
static

Definition at line 63 of file ReadRepacker.h.

Referenced by packInternal().

◆ TEMPORARY_BUFFER_SIZE

constexpr IOSize ReadRepacker::TEMPORARY_BUFFER_SIZE = 256 * 1024
static

Definition at line 60 of file ReadRepacker.h.

Referenced by pack().