mergeLHE.DefaultLHEMerger Class Reference
Inherits mergeLHE.BaseLHEMerger.

Public Member Functions

def __init__
 
def check_header_compatibility
 
def file_iterator
 
def merge
 
def merge_headers
 
def merge_init_blocks
 
- Public Member Functions inherited from mergeLHE.BaseLHEMerger
def __init__
 
def merge
 

Public Attributes

 bypass_check
 
- Public Attributes inherited from mergeLHE.BaseLHEMerger
 input_files
 
 output_file
 

Private Attributes

 _f
 
 _header_lines
 
 _header_str
 
 _init_str
 
 _is_mglo
 
 _merged_init_str
 
 _nevent
 
 _uwgt
 
 _xsec_combined
 

Detailed Description

Default LHE merge scheme that copies the header of the first LHE file,
merges and outputs the init block, then concatenates all event blocks.

Definition at line 23 of file mergeLHE.py.
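
The three stages named above (copy the first header, emit an init block, concatenate event blocks) can be illustrated with a small standalone sketch. It is not the class itself: `split_lhe`, `naive_merge`, and `make_sample` are hypothetical helpers that work on in-memory strings, and the sketch simply reuses the first file's init block instead of merging cross sections.

```python
import re


def split_lhe(text):
    """Split LHE text into (header, init body, event blocks)."""
    it = iter(text.splitlines(keepends=True))

    def take_until(pattern):
        # Consume lines up to and including the first match; return the lines before it.
        taken = []
        for line in it:
            if re.search(pattern, line):
                break
            taken.append(line)
        return ''.join(taken)

    header = take_until(r'<init(>|\s)')
    init = take_until(r'</init>')
    events = take_until(r'</LesHouchesEvents>')
    return header, init, events


def naive_merge(texts):
    """Keep the first header and init block, then concatenate all events."""
    parts = [split_lhe(t) for t in texts]
    header, init, _ = parts[0]
    events = ''.join(p[2] for p in parts)
    return header + '<init>\n' + init + '</init>\n' + events + '</LesHouchesEvents>\n'


def make_sample(weight):
    """Build a tiny, made-up LHE document with a single event."""
    return ('<LesHouchesEvents version="3.0">\n'
            '<header>\n</header>\n'
            '<init>\n'
            '2212 2212 6500.0 6500.0 0 0 247000 247000 -4 1\n'
            '1.0 0.1 1.0 1\n'
            '</init>\n'
            '<event>\n 2 1 %+.7E 1.0 0.1 0.1\n</event>\n'
            '</LesHouchesEvents>\n') % weight


merged = naive_merge([make_sample(1.0), make_sample(2.0)])
print(merged)
```

The real class streams files line by line and stages events in a temporary file, which avoids holding whole LHE files in memory as this sketch does.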

Constructor & Destructor Documentation

def mergeLHE.DefaultLHEMerger.__init__ (   self,
  input_files,
  output_file,
  kwargs 
)

Definition at line 27 of file mergeLHE.py.

    def __init__(self, input_files, output_file, **kwargs):
        super(DefaultLHEMerger, self).__init__(input_files, output_file)

        self.bypass_check = kwargs.get('bypass_check', False)
        # line-by-line iterator for each input file
        self._f = [self.file_iterator(name) for name in self.input_files]
        self._header_str = []
        self._is_mglo = False
        self._xsec_combined = 0.
        self._uwgt = 0.
        self._init_str = []  # <init> block for each input file
        self._nevent = []  # number of events for each input file

Member Function Documentation

def mergeLHE.DefaultLHEMerger.check_header_compatibility (   self)
Check if all headers for input files are consistent.

Definition at line 46 of file mergeLHE.py.

References mergeLHE.DefaultLHEMerger.bypass_check.

Referenced by mergeLHE.DefaultLHEMerger.merge().

    def check_header_compatibility(self):
        """Check if all headers for input files are consistent."""

        if self.bypass_check:
            return

        inconsistent_error_info = ("Incompatibility found in LHE headers: %s. "
                                   "Use -b/--bypass-check to bypass the check.")
        allow_diff_keys = [
            'nevent', 'numevts', 'iseed', 'Seed', 'Random', '.log', '.dat', '.lhe',
            'Number of Events', 'Integrated weight'
        ]
        self._header_lines = [header.split('\n') for header in self._header_str]

        # Iterate over header lines for all input files and check consistency
        logging.debug('header line number: %s' \
            % ', '.join([str(len(lines)) for lines in self._header_lines]))
        assert all([
            len(self._header_lines[0]) == len(lines) for lines in self._header_lines]
        ), inconsistent_error_info % "line counts do not match"
        inconsistent_lines_set = [set() for _ in self._header_lines]
        for line_zip in zip(*self._header_lines):
            if any([k in line_zip[0] for k in allow_diff_keys]):
                logging.debug('Captured \'%s\', we allow difference in this line' % line_zip[0])
                continue
            if not all([line_zip[0] == line for line in line_zip]):
                # The lines differ between files: store them per file for the
                # order-insensitive comparison below
                for i, line in enumerate(line_zip):
                    inconsistent_lines_set[i].add(line)
        # If the sets of differing lines are equal across files, the headers
        # differ only in line order, which is accepted
        assert all([inconsistent_lines_set[0] == lset for lset in inconsistent_lines_set]), \
            inconsistent_error_info % ('{' + ', '.join(inconsistent_lines_set[0]) + '}')
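
The check above can be tried in isolation with a small sketch. `headers_compatible` is a hypothetical standalone function that returns a boolean instead of asserting, and the header strings are made up for illustration. As in the listing, the allow-list is tested only against the first file's line.

```python
ALLOW_DIFF_KEYS = [
    'nevent', 'numevts', 'iseed', 'Seed', 'Random', '.log', '.dat', '.lhe',
    'Number of Events', 'Integrated weight',
]


def headers_compatible(headers):
    """Return True if headers differ only in allow-listed lines or in line order."""
    split = [h.split('\n') for h in headers]
    if any(len(lines) != len(split[0]) for lines in split):
        return False
    differing = [set() for _ in split]
    for row in zip(*split):
        # Lines expected to vary between runs (seeds, event counts, file names).
        if any(key in row[0] for key in ALLOW_DIFF_KEYS):
            continue
        if any(line != row[0] for line in row):
            for i, line in enumerate(row):
                differing[i].add(line)
    # Equal sets mean the same lines appear in every file, just reordered.
    return all(s == differing[0] for s in differing)


first = "<header>\nset A = 1\nset B = 2\n 1234 = iseed\n</header>"
reordered = "<header>\nset B = 2\nset A = 1\n 9999 = iseed\n</header>"
changed = "<header>\nset A = 1\nset B = 3\n 1234 = iseed\n</header>"

print(headers_compatible([first, reordered]))
print(headers_compatible([first, changed]))
```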
def mergeLHE.DefaultLHEMerger.file_iterator (   self,
  path 
)
Line-by-line iterator over a text file.

Definition at line 40 of file mergeLHE.py.

    def file_iterator(self, path):
        """Line-by-line iterator over a text file."""
        with open(path, 'r') as f:
            for line in f:
                yield line
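
Because the iterator is a generator, a caller can stop reading at one marker and later resume from the following line, which is how the header, init, and event sections are read in sequence. The sketch below shows this with a temporary file. `read_until` is a hypothetical helper, not part of mergeLHE.py.

```python
import os
import tempfile


def file_iterator(path):
    """Yield the lines of a text file one at a time."""
    with open(path, 'r') as f:
        for line in f:
            yield line


def read_until(lines, marker):
    """Consume lines up to and including the first one containing marker.

    Returns the lines seen before the marker line.
    """
    collected = []
    for line in lines:
        if marker in line:
            break
        collected.append(line)
    return collected


with tempfile.NamedTemporaryFile('w', suffix='.lhe', delete=False) as tmp:
    tmp.write('<header>\n</header>\n<init>\ninit line\n</init>\n')
    path = tmp.name

it = file_iterator(path)
header = read_until(it, '<init>')   # stops after consuming the <init> line
init = read_until(it, '</init>')    # resumes where the previous call stopped
it.close()                          # closes the underlying file
os.remove(path)

print(header, init)
```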
def mergeLHE.DefaultLHEMerger.merge (   self)

Definition at line 193 of file mergeLHE.py.

References mergeLHE.DefaultLHEMerger._f, edmStreamStallGrapher.StallMonitorParser._f, mergeLHE.DefaultLHEMerger._header_str, mergeLHE.DefaultLHEMerger._is_mglo, mergeLHE.DefaultLHEMerger._nevent, mergeLHE.DefaultLHEMerger._uwgt, mergeLHE.DefaultLHEMerger._xsec_combined, mergeLHE.DefaultLHEMerger.bypass_check, mergeLHE.DefaultLHEMerger.check_header_compatibility(), watchdog.group, join(), mergeLHE.DefaultLHEMerger.merge_headers(), mergeLHE.DefaultLHEMerger.merge_init_blocks(), GetRecoTauVFromDQM_MC_cff.next, mergeLHE.BaseLHEMerger.output_file, DTT0WireWorkflow.DTT0WireWorkflow.output_file, DTVdriftWorkflow.DTvdriftWorkflow.output_file, DTTtrigWorkflow.DTttrigWorkflow.output_file, sistrip::SpyUtilities.range(), and jetcorrextractor.sign().

    def merge(self):
        with open(self.output_file, 'w') as fw:
            # Read the header of every input file
            for i in range(len(self._f)):
                header = []
                line = next(self._f[i])
                while not re.search(r'\s*<init(>|\s)', line):
                    header.append(line)
                    line = next(self._f[i])
                # 'header' holds everything before the <init> tag
                self._header_str.append(''.join(header))
            self.check_header_compatibility()

            # Read <init> blocks for all input_files
            for i in range(len(self._f)):
                init = []
                line = next(self._f[i])
                while not re.search(r'\s*</init>', line):
                    init.append(line)
                    line = next(self._f[i])
                # 'init_str' includes all contents inside <init>...</init>
                self._init_str.append(''.join(init))

            # Iterate over all events file-by-file and write events temporarily
            # to .tmp.lhe
            with open('.tmp.lhe', 'w') as _fwtmp:
                for i in range(len(self._f)):
                    nevent = 0
                    while True:
                        line = next(self._f[i])
                        if re.search(r'\s*</event>', line):
                            nevent += 1
                        if re.search(r'\s*</LesHouchesEvents>', line):
                            break
                        _fwtmp.write(line)
                    self._nevent.append(nevent)
                    self._f[i].close()

            # Merge the header and init blocks and write to the output
            fw.write(self.merge_headers())
            fw.write('<init>\n' + self.merge_init_blocks() + '</init>\n')

            # Write event blocks in .tmp.lhe back to the output.
            # For MG5 LO LHEs, recalculate the weights from the combined xsec
            # and nevent read from <MGGenerationInfo>, and the 'event_norm' mode
            if self._is_mglo and not self.bypass_check:
                event_norm = re.search(
                    r'\s(\w+)\s*=\s*event_norm\s',
                    self._header_str[0]).group(1)
                if event_norm == 'sum':
                    self._uwgt = self._xsec_combined / sum(self._nevent)
                elif event_norm == 'average':
                    self._uwgt = self._xsec_combined
                logging.info(("MG5 LO LHE with event_norm = %s detected. Will "
                              "recalculate weights in each event block.\n"
                              "Unit weight: %+.7E") % (event_norm, self._uwgt))

                # Modify event wgt when transferring .tmp.lhe to the output file
                event_line = -999
                with open('.tmp.lhe', 'r') as ftmp:
                    sign = lambda x: -1 if x < 0 else 1
                    for line in ftmp:
                        event_line += 1
                        if re.search(r'\s*<event.*>', line):
                            event_line = 0
                        if event_line == 1:
                            # modify the XWGTUP that appears in the first line
                            # of the <event> block
                            orig_wgt = float(line.split()[2])
                            fw.write(re.sub(r'(^\s*\S+\s+\S+\s+)\S+(.+)', r'\g<1>%+.7E\g<2>' \
                                % (sign(orig_wgt) * self._uwgt), line))
                        elif re.search(r'\s*<wgt.*>.*</wgt>', line):
                            addi_wgt_str = re.search(r'<wgt.*>\s*(\S+)\s*<\/wgt>', line).group(1)
                            fw.write(line.replace(
                                addi_wgt_str, '%+.7E' % (float(addi_wgt_str) / orig_wgt * self._uwgt)))
                        else:
                            fw.write(line)
            else:
                # Simply transfer all lines
                with open('.tmp.lhe', 'r') as ftmp:
                    for line in ftmp:
                        fw.write(line)
            fw.write('</LesHouchesEvents>\n')
            os.remove('.tmp.lhe')
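
The weight recalculation for MG5 LO files can be followed on a single event with the sketch below. The function names are hypothetical and the event lines are made up. Two choices differ from the listing and are assumptions of this sketch: an unsupported `event_norm` raises an error rather than leaving the unit weight at 0, and the substitution uses a callable instead of a formatted replacement string.

```python
import re


def unit_weight(xsec_combined, total_events, event_norm):
    """Unit weight per event for the two event_norm modes handled by merge()."""
    if event_norm == 'sum':
        return xsec_combined / total_events
    if event_norm == 'average':
        return xsec_combined
    # Assumption of this sketch: fail loudly on any other mode.
    raise ValueError('unsupported event_norm: %s' % event_norm)


def rescale_event_line(line, uwgt):
    """Replace XWGTUP (third field of the first event line), keeping its sign."""
    orig_wgt = float(line.split()[2])
    sign = -1 if orig_wgt < 0 else 1
    new_wgt = '%+.7E' % (sign * uwgt)
    return re.sub(r'(^\s*\S+\s+\S+\s+)\S+(.+)',
                  lambda m: m.group(1) + new_wgt + m.group(2), line)


def rescale_wgt_line(line, orig_wgt, uwgt):
    """Scale an additional <wgt> entry by the same factor as XWGTUP."""
    wgt_str = re.search(r'<wgt.*>\s*(\S+)\s*</wgt>', line).group(1)
    return line.replace(wgt_str, '%+.7E' % (float(wgt_str) / orig_wgt * uwgt))


uwgt = unit_weight(10.0, 4000, 'sum')
event_line = ' 5 1 -1.2345000E-02 9.1E+01 7.5E-03 1.1E-01\n'
wgt_line = "<wgt id='1'> +2.4690000E-02 </wgt>\n"

new_event_line = rescale_event_line(event_line, uwgt)
new_wgt_line = rescale_wgt_line(wgt_line, -1.2345e-02, uwgt)
print(new_event_line, new_wgt_line)
```

Additional weights keep their ratio to the original XWGTUP, so a reweighting factor of 2 relative to a negative nominal weight stays a factor of 2 after the merge.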
def mergeLHE.DefaultLHEMerger.merge_headers (   self)
Merge the headers of input LHEs. Special handling is needed for the MG5 LO case.

Definition at line 79 of file mergeLHE.py.

References mergeLHE.DefaultLHEMerger._header_str, mergeLHE.DefaultLHEMerger._is_mglo, mergeLHE.DefaultLHEMerger._nevent, mergeLHE.DefaultLHEMerger._xsec_combined, python.cmstools.all(), mergeLHE.DefaultLHEMerger.bypass_check, watchdog.group, python.rootplot.root2matplotlib.replace(), and ComparisonHelper.zip().

Referenced by mergeLHE.DefaultLHEMerger.merge().

    def merge_headers(self):
        """Merge the headers of input LHEs. Special handling is needed for the MG5 LO case."""

        self._is_mglo = all(['MGGenerationInfo' in header for header in self._header_str])
        if self._is_mglo and not self.bypass_check:
            # Special handling of MadGraph5 LO LHEs
            match_geninfo = [
                re.search(
                    (r"<MGGenerationInfo>\s+#\s*Number of Events\s*\:\s*(\S+)\s+"
                     r"#\s*Integrated weight \(pb\)\s*\:\s*(\S+)\s+<\/MGGenerationInfo>"),
                    header
                ) for header in self._header_str
            ]
            self._xsec_combined = sum(
                [float(info.group(2)) * nevt for info, nevt in zip(match_geninfo, self._nevent)]
            ) / sum(self._nevent)
            geninfo_combined = ("<MGGenerationInfo>\n"
                                "# Number of Events : %d\n"
                                "# Integrated weight (pb) : %.10f\n</MGGenerationInfo>") \
                % (sum(self._nevent), self._xsec_combined)
            logging.info('Detected: MG5 LO LHEs. Input <MGGenerationInfo>:\n\tnevt\txsec')
            for info, nevt in zip(match_geninfo, self._nevent):
                logging.info('\t%d\t%.10f' % (nevt, float(info.group(2))))
            logging.info('Combined <MGGenerationInfo>:\n\t%d\t%.10f' \
                % (sum(self._nevent), self._xsec_combined))

            header_combined = self._header_str[0].replace(match_geninfo[0].group(), geninfo_combined)
            return header_combined

        else:
            # No need to merge the headers
            return self._header_str[0]
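
The combined cross section is an event-count-weighted average of the per-file integrated weights. A minimal sketch, assuming made-up header text and a hypothetical `combined_xsec` helper: as in the listing, the weights come from the event counts found while reading the files, not from the "Number of Events" field in the header.

```python
import re

GENINFO_RE = re.compile(
    r"<MGGenerationInfo>\s+#\s*Number of Events\s*:\s*(\S+)\s+"
    r"#\s*Integrated weight \(pb\)\s*:\s*(\S+)\s+</MGGenerationInfo>")


def combined_xsec(headers, nevents):
    """Average the per-file cross sections, weighted by counted events."""
    xsecs = [float(GENINFO_RE.search(h).group(2)) for h in headers]
    return sum(x * n for x, n in zip(xsecs, nevents)) / sum(nevents)


def make_header(nevt, xsec):
    """Build a made-up header fragment containing only <MGGenerationInfo>."""
    return ("<MGGenerationInfo>\n"
            "#  Number of Events        :       %d\n"
            "#  Integrated weight (pb)  :       %s\n"
            "</MGGenerationInfo>\n") % (nevt, xsec)


headers = [make_header(1000, '10.0'), make_header(3000, '20.0')]
xsec = combined_xsec(headers, [1000, 3000])
print(xsec)
```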
def mergeLHE.DefaultLHEMerger.merge_init_blocks (   self)
If all <init> blocks are identical, return the same <init> block
(as is the case for Powheg LHEs); otherwise, calculate the output <init>
block by merging the input blocks with the formulas below (the same as in
the MG5LOLHEMerger scheme):
    XSECUP = sum(xsecup * no.events) / tot.events
    XERRUP = sqrt( sum(xerrup^2 * no.events^2) ) / tot.events
    XMAXUP = max(xmaxup)

Definition at line 112 of file mergeLHE.py.

References mergeLHE.DefaultLHEMerger._f, edmStreamStallGrapher.StallMonitorParser._f, mergeLHE.DefaultLHEMerger._init_str, mergeLHE.DefaultLHEMerger._nevent, python.cmstools.all(), mergeLHE.DefaultLHEMerger.bypass_check, mergeLHE.BaseLHEMerger.input_files, DTWorkflow.DTWorkflow.input_files, relativeConstraints.keys, SiStripPI.max, sistrip::SpyUtilities.range(), submitPVValidationJobs.split(), and digitizers_cfi.strip.

Referenced by mergeLHE.DefaultLHEMerger.merge().

    def merge_init_blocks(self):
        """If all <init> blocks are identical, return the same <init> block
        (as is the case for Powheg LHEs); otherwise, calculate the output <init>
        block by merging the input blocks with the formulas below (the same as
        in the MG5LOLHEMerger scheme):
            XSECUP = sum(xsecup * no.events) / tot.events
            XERRUP = sqrt( sum(xerrup^2 * no.events^2) ) / tot.events
            XMAXUP = max(xmaxup)
        """

        if self.bypass_check:
            # If bypass the consistency check, simply use the first LHE <init>
            # block as the output
            return self._init_str[0]

        # Initiate collected init block info. Will be in format of
        # {iprocess: [xsecup, xerrup, xmaxup]}
        new_init_block = {}
        old_init_block = [{} for _ in self._init_str]

        # Read the xsecup, xerrup, and xmaxup info from the <init> block for
        # all input LHEs
        for i, bl in enumerate(self._init_str):  # loop over files
            nline = int(bl.split('\n')[0].strip().split()[-1])

            # loop over lines in <init> block
            for bl_line in bl.split('\n')[1:nline + 1]:
                bl_line_sp = bl_line.split()
                old_init_block[i][int(bl_line_sp[3])] = [
                    float(bl_line_sp[0]), float(bl_line_sp[1]), float(bl_line_sp[2])]

            # After reading all subprocesses info, store the rest content in
            # <init> block for the first file
            if i == 0:
                info_after_subprocess = bl.strip().split('\n')[nline + 1:]

            logging.info('Input file: %s' % self.input_files[i])
            for ipr in sorted(list(old_init_block[i].keys()), reverse=True):
                # reverse order: follow the MG5 custom
                logging.info(' xsecup, xerrup, xmaxup, lprup: %.6E, %.6E, %.6E, %d' \
                    % tuple(old_init_block[i][ipr] + [ipr]))

        # Adopt smarter <init> block merging method
        # If all <init> blocks from input files are identical, return the same block;
        # otherwise combine them based on MG5LOLHEMerger scheme
        if all([old_init_block[i] == old_init_block[0] for i in range(len(self._f))]):
            # All <init> blocks are identical
            logging.info(
                'All input <init> blocks are identical. Output the same <init> block.')
            return self._init_str[0]

        # Otherwise, calculate merged init block
        for i in range(len(self._f)):
            for ipr in old_init_block[i]:
                # Initiate the subprocess for the new block if it is found for the
                # first time in one input file
                if ipr not in new_init_block:
                    new_init_block[ipr] = [0., 0., 0.]
                new_init_block[ipr][0] += old_init_block[i][ipr][0] * self._nevent[i]  # xsecup
                new_init_block[ipr][1] += old_init_block[i][ipr][1]**2 * self._nevent[i]**2  # xerrup
                new_init_block[ipr][2] = max(new_init_block[ipr][2], old_init_block[i][ipr][2])  # xmaxup
        tot_nevent = sum([self._nevent[i] for i in range(len(self._f))])

        # Write first line of the <init> block (modify the nprocess at the last)
        self._merged_init_str = self._init_str[0].split('\n')[0].strip().rsplit(' ', 1)[0] \
            + ' ' + str(len(new_init_block)) + '\n'
        # Form the merged init block
        logging.info('Output file: %s' % self.output_file)
        for ipr in sorted(list(new_init_block.keys()), reverse=True):
            # reverse order: follow the MG5 custom
            new_init_block[ipr][0] /= tot_nevent
            new_init_block[ipr][1] = math.sqrt(new_init_block[ipr][1]) / tot_nevent
            logging.info(' xsecup, xerrup, xmaxup, lprup: %.6E, %.6E, %.6E, %d' \
                % tuple(new_init_block[ipr] + [ipr]))
            self._merged_init_str += '%.6E %.6E %.6E %d\n' % tuple(new_init_block[ipr] + [ipr])
        self._merged_init_str += '\n'.join(info_after_subprocess)
        if len(info_after_subprocess):
            self._merged_init_str += '\n'

        return self._merged_init_str
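
The three merging formulas can be checked numerically with a short sketch. `merge_init_entries` is a hypothetical function that takes already-parsed per-file dictionaries of the form `{lprup: [xsecup, xerrup, xmaxup]}`, and the input numbers are invented for illustration.

```python
import math


def merge_init_entries(blocks, nevents):
    """Merge per-file {lprup: [xsecup, xerrup, xmaxup]} dictionaries."""
    merged = {}
    for block, n in zip(blocks, nevents):
        for lprup, (xsec, xerr, xmax) in block.items():
            acc = merged.setdefault(lprup, [0.0, 0.0, 0.0])
            acc[0] += xsec * n              # numerator of XSECUP
            acc[1] += xerr ** 2 * n ** 2    # sum under the square root of XERRUP
            acc[2] = max(acc[2], xmax)      # XMAXUP
    total = sum(nevents)
    for acc in merged.values():
        acc[0] /= total
        acc[1] = math.sqrt(acc[1]) / total
    return merged


blocks = [{1: [10.0, 0.3, 0.5]}, {1: [20.0, 0.4, 0.7]}]
merged = merge_init_entries(blocks, [1000, 1000])
print(merged)
```

With equal event counts the cross section is the plain mean, 15.0, and the uncertainty is sqrt(0.3^2 + 0.4^2) / 2 = 0.25. As in the listing, a subprocess absent from one file still has its sums divided by the total event count of all files.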

Member Data Documentation

mergeLHE.DefaultLHEMerger._f
private

Definition at line 32 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge(), and mergeLHE.DefaultLHEMerger.merge_init_blocks().

mergeLHE.DefaultLHEMerger._header_lines
private

Definition at line 58 of file mergeLHE.py.

mergeLHE.DefaultLHEMerger._header_str
private

Definition at line 33 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge(), and mergeLHE.DefaultLHEMerger.merge_headers().

mergeLHE.DefaultLHEMerger._init_str
private

Definition at line 37 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge_init_blocks().

mergeLHE.DefaultLHEMerger._is_mglo
private

Definition at line 34 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge(), and mergeLHE.DefaultLHEMerger.merge_headers().

mergeLHE.DefaultLHEMerger._merged_init_str
private

Definition at line 176 of file mergeLHE.py.

mergeLHE.DefaultLHEMerger._nevent
private

Definition at line 38 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge(), mergeLHE.DefaultLHEMerger.merge_headers(), and mergeLHE.DefaultLHEMerger.merge_init_blocks().

mergeLHE.DefaultLHEMerger._uwgt
private

Definition at line 36 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge().

mergeLHE.DefaultLHEMerger._xsec_combined
private

Definition at line 35 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.merge(), and mergeLHE.DefaultLHEMerger.merge_headers().

mergeLHE.DefaultLHEMerger.bypass_check

Definition at line 30 of file mergeLHE.py.

Referenced by mergeLHE.DefaultLHEMerger.check_header_compatibility(), mergeLHE.DefaultLHEMerger.merge(), mergeLHE.DefaultLHEMerger.merge_headers(), and mergeLHE.DefaultLHEMerger.merge_init_blocks().