mps_list_evts Namespace Reference

Functions

def get_mille_lines ()
 
def get_num_evts_per_dataset (mille_lines)
 
def get_num_evts_per_merged_dataset (merged_datasets, num_evts_per_dataset)
 
def merge_datasets (num_evts_per_dataset)
 
def print_merging_scheme (merged_datasets)
 
def print_num_evts_per_dataset (num_evts_per_dataset)
 

Variables

 merged_datasets = merge_datasets(num_evts_per_dataset)
 
 mille_lines = get_mille_lines()
 
string mps_db = "mps.db"
 
 num_evts_per_dataset = get_num_evts_per_dataset(mille_lines)
 
 num_evts_per_merged_dataset = get_num_evts_per_merged_dataset(merged_datasets,num_evts_per_dataset)
 

Detailed Description

Print the total number of events processed by the mille jobs per dataset.

The information is taken from the `mps.db' file. The script groups entries of
the same dataset and also datasets that it *thinks* belong to the same
data type, e.g. 0T cosmics. This grouping is implemented very simply and
should always be checked by the user.

Usage:

 `python mps_list_evts.py <mps.db file name>' or, after `scram b'
 `mps_list_evts.py <mps.db file name>'

M. Schroeder, DESY Hamburg      26-May-2014

Function Documentation

def mps_list_evts.get_mille_lines ( )
Return list of mps.db lines that correspond to a mille job 

Definition at line 26 of file mps_list_evts.py.

def get_mille_lines():
    """ Return list of mps.db lines that correspond to a mille job """
    mille_lines = []
    with open(mps_db,"r") as db:
        for line in db:
            line = line.rstrip('\n')
            # mille and pede job lines have 13 `:' separated fields
            parts = line.split(":")
            if len(parts) == 13:
                # mille lines start with `<123>:job<123>'
                if parts[1] == "job"+parts[0]:
                    mille_lines.append(parts)

    return mille_lines

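The selection rule above can be tried without an actual `mps.db' file. The following is a minimal sketch, not the script's own code: the helper name `select_mille_lines' and the sample lines are hypothetical, and only the fields the listing inspects (fields 0, 1, 6 and 12) carry meaningful values.

```python
def select_mille_lines(lines):
    """Keep lines with 13 ':'-separated fields whose second field is 'job' + first field."""
    selected = []
    for line in lines:
        parts = line.rstrip("\n").split(":")
        if len(parts) == 13 and parts[1] == "job" + parts[0]:
            selected.append(parts)
    return selected


# Hypothetical file content (assumption): placeholder letters fill the unused fields.
sample = [
    "header line without thirteen fields\n",
    "001:job001:a:b:c:d:1500:e:f:g:h:i:isolated_mu_runa_v1\n",
    "002:job002:a:b:c:d:2500:e:f:g:h:i:isolated_mu_runb_v1\n",
    "003:jobm:a:b:c:d:0:e:f:g:h:i:pede\n",  # 13 fields, but not 'job003'
]

mille = select_mille_lines(sample)
for parts in mille:
    print(parts[0], parts[6], parts[12])
```

Taking the lines as an argument instead of opening the file inside the function makes the rule easy to check against a handful of hand-written entries.
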
def mps_list_evts.get_num_evts_per_dataset (   mille_lines)
Return number of events per dataset

Returns a dict `<dataset>:<num_evts>', where <dataset> is the label
in the last field of the mille line.

Definition at line 43 of file mps_list_evts.py.

References createfilelist.int.

def get_num_evts_per_dataset(mille_lines):
    """ Return number of events per dataset

    Returns a dict `<dataset>:<num_evts>', where <dataset> is the label
    in the last field of the mille line.
    """
    num_evts_per_dataset = {}
    for line in mille_lines:
        dataset = line[12]
        num_evts = int(line[6])
        if dataset in num_evts_per_dataset:
            num_evts_per_dataset[dataset] = num_evts_per_dataset[dataset] + num_evts
        else:
            num_evts_per_dataset[dataset] = num_evts

    return num_evts_per_dataset

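The same per-dataset accumulation can be written with `collections.defaultdict', which removes the explicit membership test. This is a sketch of an equivalent formulation, not the code in mps_list_evts.py; `count_events', `make_row' and the sample values are hypothetical.

```python
from collections import defaultdict


def make_row(job_id, num_evts, dataset):
    """Build a 13-field row; only fields 6 (events) and 12 (dataset) are meaningful here."""
    return [job_id, "job" + job_id] + ["x"] * 4 + [str(num_evts)] + ["x"] * 5 + [dataset]


def count_events(mille_lines):
    """Sum field 6 (number of events) for each dataset label found in field 12."""
    totals = defaultdict(int)
    for fields in mille_lines:
        totals[fields[12]] += int(fields[6])
    return dict(totals)


rows = [
    make_row("001", 1500, "isolated_mu_runa_v1"),
    make_row("002", 2500, "isolated_mu_runa_v1"),
    make_row("003", 300, "cosmics_runa_v1"),
]
totals = count_events(rows)
print(totals)
```
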
def mps_list_evts.get_num_evts_per_merged_dataset (   merged_datasets,
  num_evts_per_dataset 
)
Return number of events per merged dataset

Returns a dict `<merged_dataset>:<num_evts>'; see comments to function
`merge_datasets' for an explanation of <merged_dataset>.

Definition at line 62 of file mps_list_evts.py.

def get_num_evts_per_merged_dataset(merged_datasets,num_evts_per_dataset):
    """ Return number of events per merged dataset

    Returns a dict `<merged_dataset>:<num_evts>'; see comments to function
    `merge_datasets' for an explanation of <merged_dataset>.
    """
    num_evts_per_merged_dataset = {}
    # six.iteritems needs `import six' at module level
    for merged_dataset,datasets in six.iteritems(merged_datasets):
        num_evts = 0
        for dataset in datasets:
            num_evts = num_evts + num_evts_per_dataset[dataset]
        num_evts_per_merged_dataset[merged_dataset] = num_evts

    return num_evts_per_merged_dataset

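On Python 3, `six.iteritems(d)' behaves like `d.items()', so the summation can be sketched without the `six' dependency. The function name `sum_merged' and the sample dictionaries below are hypothetical and serve only to illustrate the relation between the two input dicts.

```python
def sum_merged(merged_datasets, num_evts_per_dataset):
    """Return {merged_dataset: total events over all datasets merged into it}."""
    return {
        merged: sum(num_evts_per_dataset[dataset] for dataset in datasets)
        for merged, datasets in merged_datasets.items()
    }


# Hypothetical inputs (assumption): shaped like the outputs of
# get_num_evts_per_dataset and merge_datasets.
counts = {
    "isolated_mu_runa_v1": 1500,
    "isolated_mu_runb_v1": 2500,
    "cosmics_runa_v1": 300,
}
merged = {
    "isolated_mu": ["isolated_mu_runa_v1", "isolated_mu_runb_v1"],
    "cosmics": ["cosmics_runa_v1"],
}
merged_totals = sum_merged(merged, counts)
print(merged_totals)
```

As in the listing above, a dataset named in `merged_datasets' but missing from `num_evts_per_dataset' raises a KeyError.
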
def mps_list_evts.merge_datasets (   num_evts_per_dataset)
Return dict `<merged_dataset> : list of <dataset>'

Associates all datasets in `num_evts_per_dataset' that belong by their
name to the same PD but to a different run era. For example:

isolated_mu_runa_v1, isolated_mu_runb_v1, isolated_mu_runc_v2 --> isolated_mu

Each value of the returned dict is the list of datasets merged under that key.

Definition at line 79 of file mps_list_evts.py.

References mps_setup.append.

def merge_datasets(num_evts_per_dataset):
    """ Return dict `<merged_dataset> : list of <dataset>'

    Associates all datasets in `num_evts_per_dataset' that belong by their
    name to the same PD but to a different run era. For example:

    isolated_mu_runa_v1, isolated_mu_runb_v1, isolated_mu_runc_v2 --> isolated_mu

    The returned dict has as value a list of the merged datasets.
    """
    datasets = num_evts_per_dataset.keys()
    merged_datasets = {}
    for dataset in datasets:
        bare_name = dataset[0:dataset.find("run")].rstrip("_")
        if bare_name in merged_datasets:
            merged_datasets[bare_name].append(dataset)
        else:
            merged_datasets[bare_name] = [dataset]

    return merged_datasets

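One reason the merging should be checked by the user: `str.find' returns -1 when the name contains no "run", and the slice `dataset[0:-1]' then drops the last character of the name. The sketch below reproduces the expression from the listing next to a guarded variant; `bare_name_guarded' is a hypothetical alternative, not part of mps_list_evts.py, and the dataset names are made up for illustration.

```python
def bare_name_original(dataset):
    """Bare-name expression as used in merge_datasets."""
    return dataset[0:dataset.find("run")].rstrip("_")


def bare_name_guarded(dataset):
    """Hypothetical variant: leave names without 'run' unchanged."""
    idx = dataset.find("run")
    if idx < 0:
        return dataset
    return dataset[:idx].rstrip("_")


for name in ["isolated_mu_runa_v1", "cosmics_0T"]:
    print(name, "->", bare_name_original(name), "|", bare_name_guarded(name))
```

For names that follow the `<PD>_run<era>_...' pattern both functions agree; they differ only for names without "run", where the original expression silently shortens the name.
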
def mps_list_evts.print_merging_scheme (   merged_datasets)
Print the merging scheme, i.e. which datasets make up each merged dataset

See comments to function `merge_datasets' for an explanation
of what is meant by merged dataset.

Definition at line 102 of file mps_list_evts.py.

References edm.print().

def print_merging_scheme(merged_datasets):
    """ Print the merging scheme, i.e. which datasets make up each merged dataset

    See comments to function `merge_datasets' for an explanation
    of what is meant by merged dataset.
    """
    print("Defining the following merged datasets:")
    # six.iteritems needs `import six' at module level
    for merged_dataset,datasets in six.iteritems(merged_datasets):
        print("\n `"+merged_dataset+"' from:")
        for dataset in datasets:
            print(" `"+dataset+"'")

def mps_list_evts.print_num_evts_per_dataset (   num_evts_per_dataset)
Print number of events per dataset

See comments to function `get_num_evts_per_dataset' for an
explanation of what is meant by dataset.

Definition at line 116 of file mps_list_evts.py.

References edm.print(), and str.

def print_num_evts_per_dataset(num_evts_per_dataset):
    """ Print number of events per dataset

    See comments to function `get_num_evts_per_dataset' for an
    explanation of what is meant by dataset.
    """
    print("The following number of events per dataset have been processed:")
    datasets = sorted(num_evts_per_dataset.keys())
    max_name = 0
    max_num = 0
    for dataset in datasets:
        if len(dataset) > max_name:
            max_name = len(dataset)
        if len(str(num_evts_per_dataset[dataset])) > max_num:
            max_num = len(str(num_evts_per_dataset[dataset]))
    expr_name = " {0: <"+str(max_name)+"}"
    expr_num = " {0: >"+str(max_num)+"}"
    for dataset in datasets:
        print(expr_name.format(dataset)+" : "+expr_num.format(str(num_evts_per_dataset[dataset])))

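The column alignment above relies on format specifications of the form `{0: <N}' (left-aligned, padded with spaces to width N) and `{0: >N}' (right-aligned). A compact sketch of the same layout, using nested replacement fields for the widths, might look as follows; `format_table' and the sample counts are hypothetical.

```python
def format_table(num_evts_per_dataset):
    """Return one aligned ' <name> :  <count>' line per dataset, sorted by name."""
    names = sorted(num_evts_per_dataset)
    name_width = max((len(n) for n in names), default=0)
    num_width = max((len(str(num_evts_per_dataset[n])) for n in names), default=0)
    return [
        " {0: <{1}} :  {2: >{3}}".format(
            n, name_width, str(num_evts_per_dataset[n]), num_width
        )
        for n in names
    ]


# Hypothetical counts (assumption), chosen to show both kinds of padding.
table = format_table({"isolated_mu_runa_v1": 1500, "cosmics": 25})
for line in table:
    print(line)
```

Returning the lines instead of printing them directly keeps the formatting testable; the caller decides where the output goes.
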

Variable Documentation

mps_list_evts.merged_datasets = merge_datasets(num_evts_per_dataset)

Definition at line 152 of file mps_list_evts.py.

mps_list_evts.mille_lines = get_mille_lines()

Definition at line 150 of file mps_list_evts.py.

mps_list_evts.mps_db = "mps.db"

Definition at line 23 of file mps_list_evts.py.

mps_list_evts.num_evts_per_dataset = get_num_evts_per_dataset(mille_lines)

Definition at line 151 of file mps_list_evts.py.

mps_list_evts.num_evts_per_merged_dataset = get_num_evts_per_merged_dataset(merged_datasets,num_evts_per_dataset)

Definition at line 153 of file mps_list_evts.py.