ReadRepacker Class Reference

#include <ReadRepacker.h>

Public Member Functions

IOSize bufferUsed () const
 
IOSize extraBytes () const
 
std::vector< IOPosBuffer > & iov ()
 
int pack (long long int *pos, int *len, int nbuf, char *buf, IOSize buffer_size)
 
IOSize realBytesProcessed () const
 
void unpack (char *buf)
 

Static Public Attributes

static const IOSize BIG_READ_SIZE = 256 * 1024
 
static const IOSize READ_COALESCE_SIZE = 32 * 1024
 
static const IOSize TEMPORARY_BUFFER_SIZE = 256 * 1024
 

Private Member Functions

int packInternal (long long int *pos, int *len, int nbuf, char *buf, IOSize buffer_size)
 
void reset (unsigned int nbuf)
 

Private Attributes

IOSize m_buffer_used
 
IOSize m_extra_bytes
 
std::vector< int > m_idx_to_iopb
 
std::vector< int > m_idx_to_iopb_offset
 
std::vector< IOPosBuffer > m_iov
 
edm::propagate_const< int * > m_len
 
std::vector< char > m_spare_buffer
 

Detailed Description

Repack a set of read requests from the ROOT layer to be optimized for the storage layer.

The basic technique is to coalesce nearby, but not adjacent, reads into one larger read in the request to the storage system. This deliberately over-reads from storage.

Reads are coalesced because vector reads are typically unrolled server-side in a "dumb" fashion, with OS read-ahead disabled. Coalescing decreases the number of requests sent to disk, which matters because ROOT I/O is typically latency-bound.

The complexity lies in the fact that we must have buffer space to hold the extra bytes from the storage system, even though they are going to be discarded.

The approach is to reuse the ROOT buffer as temporary holding space, plus a small, fixed-size "spare buffer". So, in the worst case, we will use about 256 KB of extra buffer space. The read-coalesce algorithm is greedy, so we can't provide an a priori estimate of how many extra I/O transactions will be sent to the storage (compared to vector reads with no coalescing). Tests currently indicate that this approach usually causes zero to one additional I/O transaction.
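
As a reading aid, the greedy coalescing idea described above can be sketched in isolation. This is a minimal illustration, not the ReadRepacker implementation: the Request struct, the coalesce() function and its max_gap parameter are hypothetical names, and the sketch ignores the buffer-size limits and the big-read rule that the real class applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one storage request: a file offset and a length.
struct Request {
  std::int64_t offset;
  std::size_t size;
};

// Greedily merge sorted, non-overlapping reads whose gap is below max_gap.
// extra_bytes accumulates the gap bytes that are read and then discarded.
std::vector<Request> coalesce(const std::vector<Request>& reads,
                              std::size_t max_gap,
                              std::size_t& extra_bytes) {
  std::vector<Request> out;
  extra_bytes = 0;
  for (const Request& r : reads) {
    if (!out.empty()) {
      Request& last = out.back();
      std::int64_t gap_signed =
          r.offset - (last.offset + static_cast<std::int64_t>(last.size));
      assert(gap_signed >= 0);  // input must be sorted and non-overlapping
      std::size_t gap = static_cast<std::size_t>(gap_signed);
      if (gap < max_gap) {
        last.size += gap + r.size;  // over-read the gap on purpose
        extra_bytes += gap;
        continue;
      }
    }
    out.push_back(r);  // gap too large: start a new request
  }
  return out;
}
```

With three 4096-byte reads at offsets 0, 5120 and 1048576 and a 32 KB threshold, the first two merge into one 9216-byte request (1024 extra bytes) and the third stays separate, so two requests go to storage instead of three.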

Definition at line 33 of file ReadRepacker.h.

Member Function Documentation

IOSize ReadRepacker::bufferUsed ( ) const
inline

Definition at line 50 of file ReadRepacker.h.

References m_buffer_used.

Referenced by TStorageFactoryFile::ReadBuffersSync().

{ return m_buffer_used; }  // Returns the total amount of space in the temp buffer used.
IOSize ReadRepacker::extraBytes ( ) const
inline

Definition at line 51 of file ReadRepacker.h.

References m_extra_bytes.

{ return m_extra_bytes; }  // Returns the number of extra bytes to be issued to the I/O system
std::vector<IOPosBuffer>& ReadRepacker::iov ( )
inline

Definition at line 48 of file ReadRepacker.h.

References m_iov.

Referenced by TStorageFactoryFile::ReadBuffersSync().

{ return m_iov; }  // Returns the IO vector, optimized for storage.
int ReadRepacker::pack ( long long int *  pos,
int *  len,
int  nbuf,
char *  buf,
IOSize  buffer_size 
)

Given a list of file offsets and lengths, pack them into a vector of IOPosBuffer (an "IO vector"). This function will coalesce reads that are within READ_COALESCE_SIZE of each other into a single IOPosBuffer. It will not create an IO vector whose summed buffer size is larger than TEMPORARY_BUFFER_SIZE. Every IOPosBuffer in iov points to a location inside buf.

Parameters
    pos          An array of file offsets, nbuf long.
    len          An array of read lengths, one per offset, nbuf long.
    nbuf         Number of buffers to pack.
    buf          Location of temporary buffer for the results of the storage request.
    buffer_size  Size of the temporary buffer.

Returns the number of entries of the original array packed into iov.
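
Because the return value can be smaller than nbuf, a caller has to loop until every request has been handed to storage. The sketch below illustrates only that control flow. It does not use ReadRepacker: pack_some() is a hypothetical stand-in that packs reads until a byte budget is exhausted, and count_passes() is a hypothetical driver.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for pack(): accepts reads in order until their
// summed length would exceed `budget`, and returns how many were accepted.
// It always accepts at least one read so that the caller makes progress.
int pack_some(const int* len, int nbuf, std::size_t budget) {
  std::size_t used = 0;
  int count = 0;
  while (count < nbuf) {
    std::size_t next = static_cast<std::size_t>(len[count]);
    if (count > 0 && used + next > budget) break;
    used += next;
    ++count;
  }
  return count;
}

// Drive the packer until every request has been consumed.
// Returns the number of passes, that is, storage round-trips.
int count_passes(const std::vector<int>& len, std::size_t budget) {
  int passes = 0;
  int done = 0;
  const int total = static_cast<int>(len.size());
  while (done < total) {
    int packed = pack_some(len.data() + done, total - done, budget);
    // A real caller would issue the vector read and unpack the result here.
    done += packed;
    ++passes;
  }
  return passes;
}
```

For five 100-byte reads and a 250-byte budget, the stand-in packs two, two, and then one read, so the loop runs three times.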

Definition at line 22 of file ReadRepacker.cc.

References m_len, m_spare_buffer, packInternal(), reset(), and TEMPORARY_BUFFER_SIZE.

Referenced by pyrootRender.interactiveRender::draw(), and TStorageFactoryFile::ReadBuffersSync().

{
  reset(nbuf);
  m_len = len;  // Record the len array so we can later unpack.

  // Determine the buffer to use for the initial packing.
  char *tmp_buf;
  IOSize tmp_size;
  if (buffer_size < TEMPORARY_BUFFER_SIZE) {
    // ReadRepacker.cc line 31 is absent from this listing; presumably it sizes the spare buffer:
    m_spare_buffer.resize(TEMPORARY_BUFFER_SIZE);
    tmp_buf = &m_spare_buffer[0];
    tmp_size = TEMPORARY_BUFFER_SIZE;
  } else {
    tmp_buf = buf;
    tmp_size = buffer_size;
  }

  int pack_count = packInternal(pos, len, nbuf, tmp_buf, tmp_size);

  if ((nbuf - pack_count > 0) &&                          // If there is remaining work..
      (tmp_buf != &m_spare_buffer[0]) &&                  // and the spare buffer isn't already used
      ((IOSize)len[pack_count] < TEMPORARY_BUFFER_SIZE)) {  // and the spare buffer is big enough to hold at least one read.

    // Verify the spare is allocated.
    // If tmp_buf != &m_spare_buffer[0] before, it certainly won't after.
    // (Line 47 is absent from this listing; presumably the same resize as above.)
    m_spare_buffer.resize(TEMPORARY_BUFFER_SIZE);

    // If there are remaining chunks and we aren't already using the spare
    // buffer, try using that too.
    // This clutters up the code badly, but could save a network round-trip.
    // (The last two arguments are cut off in this listing; the spare buffer
    // and its size are assumed here.)
    pack_count += packInternal(&pos[pack_count], &len[pack_count], nbuf - pack_count,
                               &m_spare_buffer[0], TEMPORARY_BUFFER_SIZE);
  }

  return pack_count;
}
size_t IOSize
Definition: IOTypes.h:14
int ReadRepacker::packInternal ( long long int *  pos,
int *  len,
int  nbuf,
char *  buf,
IOSize  buffer_size 
)
private

Definition at line 61 of file ReadRepacker.cc.

References BIG_READ_SIZE, training_settings::idx, m_buffer_used, m_extra_bytes, m_idx_to_iopb, m_idx_to_iopb_offset, m_iov, IOPosBuffer::offset(), READ_COALESCE_SIZE, IOPosBuffer::set_data(), IOPosBuffer::set_offset(), IOPosBuffer::set_size(), and IOPosBuffer::size().

Referenced by pack().
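
packInternal() has no prose description on this page, so the test it applies before merging a read into the current chunk is restated here as a standalone predicate. This is a sketch derived from the listing that follows: can_coalesce() and the k-prefixed constants are hypothetical names, and the constant values are copied from the static attributes documented above.

```cpp
#include <cstddef>

// Assumed values, copied from the static attributes documented on this page.
constexpr std::size_t kBigReadSize = 256 * 1024;
constexpr std::size_t kReadCoalesceSize = 32 * 1024;

// True when a read of `len` bytes, starting `gap` bytes past the end of the
// current chunk of `chunk_size` bytes, may be merged into that chunk.
bool can_coalesce(std::size_t len, std::size_t chunk_size, std::size_t gap,
                  std::size_t buffer_used, std::size_t buffer_size) {
  // At least one side must be small; two big reads are packed back to back.
  bool one_side_small = (len < kBigReadSize) || (chunk_size < kBigReadSize);
  // The discarded gap must stay below the coalescing threshold.
  bool gap_small = gap < kReadCoalesceSize;
  // The read plus the gap must still fit in the temporary buffer.
  bool fits = buffer_used + len + gap <= buffer_size;
  return one_side_small && gap_small && fits;
}
```

Two 4 KB reads separated by 1 KB pass the test, while the same reads separated by exactly 32 KB do not, because the gap comparison is strict.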

{
  if (nbuf == 0) {
    return 0;
  }

  // Handle the first read separately to make the for-loop cleaner.
  int iopb_offset = m_iov.size();
  // Because we re-use the buffer from ROOT, we are guaranteed this iopb will
  // fit.
  assert(static_cast<IOSize>(len[0]) <= buffer_size);
  IOPosBuffer iopb(pos[0], buf, len[0]);
  m_idx_to_iopb.push_back(iopb_offset);
  m_idx_to_iopb_offset.push_back(0);

  IOSize buffer_used = len[0];
  int idx;
  for (idx = 1; idx < nbuf; idx++) {
    if (buffer_used + len[idx] > buffer_size) {
      // No way we can include this chunk in the read buffer
      break;
    }

    IOOffset extra_bytes_signed = (idx == 0) ? 0 : ((pos[idx] - iopb.offset()) - iopb.size());
    assert(extra_bytes_signed >= 0);
    IOSize extra_bytes = static_cast<IOSize>(extra_bytes_signed);

    if (((static_cast<IOSize>(len[idx]) < BIG_READ_SIZE) || (iopb.size() < BIG_READ_SIZE)) &&
        (extra_bytes < READ_COALESCE_SIZE) && (buffer_used + len[idx] + extra_bytes <= buffer_size)) {
      // The space between the two reads is small enough that we can coalesce.

      // We enforce that the current read or the current iopb must be small.
      // This is so we can "perfectly pack" buffers consisting of only big
      // reads - in such a case, read coalescing doesn't help much.
      m_idx_to_iopb.push_back(iopb_offset);
      m_idx_to_iopb_offset.push_back(pos[idx] - iopb.offset());
      iopb.set_size(pos[idx] + len[idx] - iopb.offset());
      buffer_used += (len[idx] + extra_bytes);
      m_extra_bytes += extra_bytes;
      continue;
    }
    // There is a big jump, but still space left in the temporary buffer.
    // Record our current iopb:
    m_iov.push_back(iopb);

    // Reset iopb
    iopb.set_offset(pos[idx]);
    iopb.set_data(buf + buffer_used);
    iopb.set_size(len[idx]);

    // Record location of this chunk.
    iopb_offset++;

    m_idx_to_iopb.push_back(iopb_offset);
    m_idx_to_iopb_offset.push_back(0);

    buffer_used += len[idx];
  }
  m_iov.push_back(iopb);

  m_buffer_used += buffer_used;
  return idx;
}
int64_t IOOffset
Definition: IOTypes.h:19
IOSize ReadRepacker::realBytesProcessed ( ) const
inline

Definition at line 53 of file ReadRepacker.h.

References m_buffer_used, and m_extra_bytes.

Referenced by TStorageFactoryFile::ReadBuffersSync().

{ return m_buffer_used - m_extra_bytes; }  // Returns the number of bytes of the input request that would be processed by the IO vector
void ReadRepacker::reset ( unsigned int  nbuf)
private

Definition at line 149 of file ReadRepacker.cc.

References m_buffer_used, m_extra_bytes, m_idx_to_iopb, m_idx_to_iopb_offset, and m_iov.

Referenced by pack().

{
  m_extra_bytes = 0;
  m_buffer_used = 0;

  // Number of buffers to storage typically decreases, but nbuf/2 is just a
  // somewhat-informed guess.
  m_iov.reserve(nbuf / 2);
  m_iov.clear();
  m_idx_to_iopb.reserve(nbuf);
  m_idx_to_iopb.clear();
  m_idx_to_iopb_offset.reserve(nbuf);
  m_idx_to_iopb_offset.clear();
}
void ReadRepacker::unpack ( char *  buf)

Unpack the optimized set of reads from the storage system and copy the results in the order ROOT requested.

Definition at line 129 of file ReadRepacker.cc.

References IOPosBuffer::data(), training_settings::idx, m_idx_to_iopb, m_idx_to_iopb_offset, m_iov, and m_len.

Referenced by TStorageFactoryFile::ReadBuffersSync().
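
The in-place copy performed by unpack() can be illustrated on a small buffer. The sketch below is a simplification under stated assumptions: Piece and compact() are hypothetical names, all data lives in a single buffer (the real class may also read from its spare buffer), and pieces are listed in increasing offset order without overlap.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical record of where one requested read landed after coalescing:
// its byte offset inside the shared buffer and its length.
struct Piece {
  std::size_t offset;
  std::size_t len;
};

// Compact the requested pieces to the front of `buf`, in request order,
// dropping the gap bytes between them. Source and destination may overlap,
// so std::memmove is used rather than std::memcpy.
std::size_t compact(char* buf, const std::vector<Piece>& pieces) {
  char* out = buf;
  for (const Piece& p : pieces) {
    std::memmove(out, buf + p.offset, p.len);
    out += p.len;
  }
  return static_cast<std::size_t>(out - buf);  // bytes of useful data
}
```

Applied to the buffer "AAAA..BBBB.CC" with pieces at offsets 0, 6 and 11 (lengths 4, 4 and 2), the first ten bytes become "AAAABBBBCC". Because each destination never runs ahead of its source, no piece is overwritten before it has been moved.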

{
  char *root_result_ptr = buf;
  int nbuf = m_idx_to_iopb.size();
  for (int idx = 0; idx < nbuf; idx++) {
    int iov_idx = m_idx_to_iopb[idx];
    IOPosBuffer &iopb = m_iov[iov_idx];
    int iopb_offset = m_idx_to_iopb_offset[idx];
    char *io_result_ptr = static_cast<char *>(iopb.data()) + iopb_offset;
    // Note that we use the input buffer as a temporary where possible.
    // Hence, the source and destination can overlap; use memmove instead of memcpy.
    memmove(root_result_ptr, io_result_ptr, m_len[idx]);

    root_result_ptr += m_len[idx];
  }
}
void * data(void) const
Definition: IOPosBuffer.h:59

Member Data Documentation

const IOSize ReadRepacker::BIG_READ_SIZE = 256 * 1024
static

Definition at line 63 of file ReadRepacker.h.

Referenced by packInternal().

IOSize ReadRepacker::m_buffer_used
private

Definition at line 76 of file ReadRepacker.h.

Referenced by bufferUsed(), packInternal(), realBytesProcessed(), and reset().

IOSize ReadRepacker::m_extra_bytes
private

Definition at line 77 of file ReadRepacker.h.

Referenced by extraBytes(), packInternal(), realBytesProcessed(), and reset().

std::vector<int> ReadRepacker::m_idx_to_iopb
private

Definition at line 72 of file ReadRepacker.h.

Referenced by packInternal(), reset(), and unpack().

std::vector<int> ReadRepacker::m_idx_to_iopb_offset
private

Definition at line 73 of file ReadRepacker.h.

Referenced by packInternal(), reset(), and unpack().

std::vector<IOPosBuffer> ReadRepacker::m_iov
private

Definition at line 74 of file ReadRepacker.h.

Referenced by iov(), packInternal(), reset(), and unpack().

edm::propagate_const<int*> ReadRepacker::m_len
private

Definition at line 75 of file ReadRepacker.h.

Referenced by pack(), and unpack().

std::vector<char> ReadRepacker::m_spare_buffer
private

Definition at line 78 of file ReadRepacker.h.

Referenced by pack().

const IOSize ReadRepacker::READ_COALESCE_SIZE = 32 * 1024
static

Definition at line 60 of file ReadRepacker.h.

Referenced by packInternal().

const IOSize ReadRepacker::TEMPORARY_BUFFER_SIZE = 256 * 1024
static

Definition at line 57 of file ReadRepacker.h.

Referenced by pack().