Public Member Functions | |
def | __init__ |
def | findFirstIndex_ofStartsWith |
def | findLineAfter |
def | findLineBefore |
def | firstTimeStampAfter |
def | firstTimeStampBefore |
def | get_tarball_fromlog |
def | getMachineInfo |
def | handleParsingError |
def | isTimeStamp |
def | parseAll |
def | parseAllOtherTests |
def | parseGeneralInfo |
def | parseTheCompletion |
def | parseTimeSize |
def | readCmsScimark |
def | readCmsScimarkTest |
def | readInput |
def | validateSteps |
Public Attributes | |
lines_general | |
lines_other | |
lines_timesize | |
missing_fields | |
reCmsScimarkTest | |
Private Member Functions | |
def | _applyParsingRules |
Private Attributes | |
_DEBUG | |
_MAX_STEPS | |
_otherStart | |
_path | |
_timeSizeEnd | |
_timeSizeStart | |
Static Private Attributes | |
string | _LINE_SEPARATOR = "|" |
The whole parsing works as follows. We split the file into 3 parts (we keep 3 variables of line lists: self.lines_general, self.lines_timesize, self.lines_other):

* General info: as most of the items are simple one-line strings, we define regular expressions matching each of those lines, each associated with the data we can extract from it. E.g. ^Suite started at (.+) on (.+) by user (.+)$ matches only the line stating when the suite started and on which machine. It is associated with a tuple of field names for the general info which will be filled in; this way we get info = {'start_time': start-taken-from-regexp, 'host': host, 'user': user}. This is done by calling the simple function _applyParsingRules, which checks each line against each rule and, if one matches, fills the result dictionary accordingly. Additionally we get the CPU and memory info from /proc/cpuinfo and /proc/meminfo.

* TimeSize test: we use much the same technique, but first we divide the TimeSize lines by job (an individual run of cmssw - per candle, and pileup or not). Then for each job we apply our parsing rules; we also find the starting and ending times (i.e. we know that the start timestamp is somewhere after a certain line containing "Written out cmsRelvalreport.py input file at:").

* All other tests: we find the statement that the test is being launched (containing the test name, core and number of events). Above it we have the thread number, and below it the starting time. The ending time can ONLY be connected to the starting time via the Thread-ID. The problem is that the file names the same test instance differently, e.g. <Launching "PILE UP Memcheck"> and <"Memcheck" stopped>.
Definition at line 8 of file parserPerfsuiteMetadata.py.
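The rule-driven parsing described above can be sketched in isolation. This is a minimal, hypothetical Python 3 sketch (the documented module is Python 2); the rule tuple and the sample log line are illustrative, and the real class delegates this work to _applyParsingRules:

```python
import re

# Illustrative rule: field names paired with a regexp whose groups fill them.
rules = [
    (("start_time", "host", "user"),
     r"^Suite started at (.+) on (.+) by user (.+)$"),
]

def apply_rules(rules, lines):
    info = {}
    for fields, pattern in rules:
        regexp = re.compile(pattern)
        for line in lines:
            m = regexp.match(line)
            if m:
                # pair each captured group with its field name; "" means "ignore"
                for field, value in zip(fields, m.groups()):
                    if field:
                        info[field] = value
                break
    return info

sample = ["Suite started at Fri Aug 14 01:16:03 2009 on lxbuild1 by user relval"]
print(apply_rules(rules, sample))
```

Each rule is tried against every line, so the order of lines in the log does not matter for the general-info part.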
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::__init__ | ( | self, | |
path | |||
) |
Definition at line 28 of file parserPerfsuiteMetadata.py.
def __init__(self, path):

    self._MAX_STEPS = 5  # MAXIMUM NUMBER OF STEPS PER RUN (taskset relvalreport.py...)
    self._DEBUG = False

    self._path = path

    """ some initialisation to speed up the other functions """
    #for cmsscimark
    self.reCmsScimarkTest = re.compile(r"""^Composite Score:(\s*)([^\s]+)$""")

    #TimeSize
    """ the separator for beginning of timeSize / end of general statistics """
    self._timeSizeStart = re.compile(r"""^Launching the TimeSize tests \(TimingReport, TimeReport, SimpleMemoryCheck, EdmSize\) with (\d+) events each$""")
    """ (the first timestamp is the start of TimeSize) """

    """ the separator for end of timeSize / beginning of IgProf_Perf, IgProf_Mem, Memcheck, Callgrind tests """
    self._timeSizeEnd = re.compile(r"""^Stopping all cmsScimark jobs now$""")

    #Other tests:
    self._otherStart = re.compile(r"^Preparing")

    """
    ----- READ THE DATA -----
    """
    lines = self.readInput(path)
    """ split the whole file into parts """
    #Let's not assume there are ALWAYS TimeSize tests in the runs of the Performance Suite!:
    #Check first:
    #FIXME: Vidmantas did not think of this case... will need to implement protection against it for all the IB tests...
    #To do as soon as possible...
    #Maybe revisit the strategy if it can be done quickly.
    timesize_end = [lines.index(line) for line in lines if self._timeSizeEnd.match(line)]
    if timesize_end:
        timesize_end_index = timesize_end[0]
    else:
        timesize_end_index = 0
    timesize_start = [lines.index(line) for line in lines if self._timeSizeStart.match(line)]
    general_stop = [lines.index(line) for line in lines if self._otherStart.match(line)]
    if timesize_start:
        timesize_start_index = timesize_start[0]
        general_stop_index = timesize_start_index
    elif general_stop:
        timesize_start_index = timesize_end_index + 1
        general_stop_index = general_stop[0]
    else:
        timesize_start_index = 0
        general_stop_index = -1

    """ we split the structure:
        * general
        * timesize
        * all others [igprof etc]
    """

    """ we get the indexes of splitting """
    #Not OK to use timesize_start_index for the general lines... want to be general, also in cases of no TimeSize tests...
    #self.lines_general = lines[:timesize_start_index]
    self.lines_general = lines[:general_stop_index]
    self.lines_timesize = lines[timesize_start_index:timesize_end_index + 1]
    self.lines_other = lines[timesize_end_index:]

    """ a list of missing fields """
    self.missing_fields = []
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::_applyParsingRules | ( | self, | |
parsing_rules, | |||
lines | |||
) | [private] |
Applies the (provided) regular expression rules (=rule[1] for rule in parsing_rules) to each line and, if a rule matches the line, puts the matched information into the dictionary under the specified keys (=rule[0]), which is later returned. Rule[3] contains whether the field is required to be found; if so and it isn't found, an exception is raised. rules = [ ( (field_name_1_to_match, field_name_2), regular expression, /optionally: is the field required? if so "req"/ ) ]
we call a shared parsing helper
Definition at line 235 of file parserPerfsuiteMetadata.py.
def _applyParsingRules(self, parsing_rules, lines):
    """
    Applies the (provided) regular expression rules (=rule[1] for rule in parsing_rules)
    to each line and, if a rule matches the line,
    puts the matched information into the dictionary under the specified keys (=rule[0]), which is later returned.
    Rule[3] contains whether the field is required to be found; if so and it isn't found, an exception is raised.
    rules = [
        ( (field_name_1_to_match, field_name_2), regular expression, /optionally: is the field required? if so "req"/ )
    ]
    """
    """ we call a shared parsing helper """
    #parsing_rules = map(parsingRulesHelper.rulesRegexpCompileFunction, parsing_rules)
    #print parsing_rules
    (info, missing_fields) = parsingRulesHelper.rulesParser(parsing_rules, lines, compileRules = True)

    self.missing_fields.extend(missing_fields)

    return info
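The source of parsingRulesHelper.rulesParser is not shown on this page, so the following Python 3 sketch only mimics its contract as described above: it returns (info, missing_fields), where missing_fields collects the names of "req" rules that matched no line. Names and sample lines are illustrative:

```python
import re

def rules_parser(parsing_rules, lines):
    # info: matched fields; missing: names of required fields never matched
    info, missing = {}, []
    for rule in parsing_rules:
        fields, pattern = rule[0], rule[1]
        required = len(rule) > 2 and rule[2] == "req"
        matched = False
        for line in lines:
            m = re.match(pattern, line)
            if m:
                # "" entries in the field tuple mean "discard this group"
                info.update({f: v for f, v in zip(fields, m.groups()) if f})
                matched = True
                break
        if required and not matched:
            missing.extend(f for f in fields if f)
    return info, missing

rules = (
    (("architecture",), r"^Current Architecture is (.+)$"),
    (("test_release_based_on",), r"^Test Release based on: (.+)$", "req"),
)
info, missing = rules_parser(rules, ["Current Architecture is slc5_amd64_gcc434"])
print(info, missing)
```

The real helper also pre-compiles the regexps (compileRules = True); that detail is omitted here.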
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::findFirstIndex_ofStartsWith | ( | job_lines, | |
start_of_line | |||
) |
Definition at line 113 of file parserPerfsuiteMetadata.py.
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::findLineAfter | ( | self, | |
line_index, | |||
lines, | |||
test_condition, | |||
return_index = False |
|||
) |
finds a line satisfying the `test_condition` coming after the `line_index`
Definition at line 129 of file parserPerfsuiteMetadata.py.
def findLineAfter(self, line_index, lines, test_condition, return_index = False):
    """ finds a line satisfying the `test_condition` coming after the `line_index` """
    # we're going forward through the lines list
    for line_index in xrange(line_index + 1, len(lines)):
        line = lines[line_index]

        if test_condition(line):
            if return_index:
                return line_index
            return line
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::findLineBefore | ( | self, | |
line_index, | |||
lines, | |||
test_condition | |||
) |
finds a line satisfying the `test_condition` coming before the `line_index`
Definition at line 118 of file parserPerfsuiteMetadata.py.
def findLineBefore(self, line_index, lines, test_condition):
    """ finds a line satisfying the `test_condition` coming before the `line_index` """
    # we're going backwards through the lines list
    for line_index in xrange(line_index - 1, -1, -1):
        line = lines[line_index]

        if test_condition(line):
            return line
    raise ValueError
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::firstTimeStampAfter | ( | self, | |
line_index, | |||
lines | |||
) |
returns the first timestamp AFTER the line with given index
Definition at line 145 of file parserPerfsuiteMetadata.py.
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::firstTimeStampBefore | ( | self, | |
line_index, | |||
lines | |||
) |
returns the first timestamp BEFORE the line with given index
Definition at line 140 of file parserPerfsuiteMetadata.py.
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::get_tarball_fromlog | ( | self | ) |
Return the tarball castor location by parsing the cmsPerfSuite.log file
Definition at line 707 of file parserPerfsuiteMetadata.py.
def get_tarball_fromlog(self):
    '''Return the tarball castor location by parsing the cmsPerfSuite.log file'''
    print "Getting the url from the cmsPerfSuite.log"
    log = open("cmsPerfSuite.log", "r")
    castor_dir = "UNKNOWN_CASTOR_DIR"
    tarball = "UNKNOWN_TARBALL"
    for line in log.readlines():
        if 'castordir' in line:
            castor_dir = line.split()[1]
        if 'tgz' in line and tarball == "UNKNOWN_TARBALL": #Pick the first line that contains the tar command...
            if 'tar' in line:
                tarball = os.path.basename(line.split()[2])
    castor_tarball = os.path.join(castor_dir, tarball)
    return castor_tarball
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::getMachineInfo | ( | self | ) |
Returns the cpu and memory info
cpu info
we assume that: * num_cores = max(core id + 1) [it's counted from 0] * 'model name' is the processor type [we return only the first one - we assume the others to be the same!] * cpu MHz - is the speed of the CPU
for
model name : Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz
cpu MHz : 800.000
cache size : 6144 KB
Definition at line 175 of file parserPerfsuiteMetadata.py.
def getMachineInfo(self):
    """ Returns the cpu and memory info """

    """ cpu info """

    """
    we assume that:
        * num_cores = max(core id + 1) [it's counted from 0]
        * 'model name' is the processor type [we return only the first one - we assume the others to be the same!]
        * cpu MHz - is the speed of the CPU
    """
    #TODO: BUT cpu MHz shows not the maximum speed but the current one,
    """
    for
        model name : Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz
        cpu MHz    : 800.000
        cache size : 6144 KB
    """
    cpu_result = {}
    try:
        f = open(os.path.join(self._path, "cpuinfo"), "r")

        #we split data into a list of tuples = [(attr_name, attr_value), ...]
        cpu_attributes = [l.strip().split(":") for l in f.readlines()]
        #print cpu_attributes
        f.close()
        cpu_result = {
            "num_cores": max([int(attr[1].strip()) + 1 for attr in cpu_attributes if attr[0].strip() == "processor"]), #Bug... Vidmantas used "core id"
            "cpu_speed_MHZ": max([attr[1].strip() for attr in cpu_attributes if attr[0].strip() == "cpu MHz"]),
            "cpu_cache_size": [attr[1].strip() for attr in cpu_attributes if attr[0].strip() == "cache size"][0],
            "cpu_model_name": [attr[1].strip() for attr in cpu_attributes if attr[0].strip() == "model name"][0]
        }
    except IOError, e:
        print e

    """ memory info """
    mem_result = {}

    try:
        f = open(os.path.join(self._path, "meminfo"), "r")

        #we split data into a list of tuples = [(attr_name, attr_value), ...]
        mem_attributes = [l.strip().split(":") for l in f.readlines()]

        mem_result = {
            "memory_total_ram": [attr[1].strip() for attr in mem_attributes if attr[0].strip() == "MemTotal"][0]
        }

    except IOError, e:
        print e

    cpu_result.update(mem_result)
    return cpu_result
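The /proc/cpuinfo handling above can be tried standalone. This Python 3 sketch runs the same split-on-colon parsing over an inline two-core sample instead of a file; the sample text is illustrative:

```python
# Illustrative /proc/cpuinfo excerpt for a two-core machine.
sample = """\
processor\t: 0
model name\t: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz
cpu MHz\t\t: 800.000
cache size\t: 6144 KB
processor\t: 1
model name\t: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz
cpu MHz\t\t: 800.000
cache size\t: 6144 KB
"""

# split data into (attr_name, attr_value) pairs, as the method does
attrs = [l.strip().split(":") for l in sample.splitlines() if ":" in l]
cpu = {
    # "processor" ids count from 0, hence max(id) + 1 cores
    "num_cores": max(int(v.strip()) + 1 for k, v in attrs if k.strip() == "processor"),
    "cpu_model_name": [v.strip() for k, v in attrs if k.strip() == "model name"][0],
}
print(cpu)
```

Note that splitting on ":" only works because cpuinfo values on these fields contain no colon themselves.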
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::handleParsingError | ( | self, | |
message | |||
) |
Definition at line 150 of file parserPerfsuiteMetadata.py.
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::isTimeStamp | ( | line | ) |
Returns whether the string is a timestamp (if not returns None)

>>> parserPerfsuiteMetadata.isTimeStamp("Fri Aug 14 01:16:03 2009")
True
>>> parserPerfsuiteMetadata.isTimeStamp("Fri Augx 14 01:16:03 2009")
Definition at line 96 of file parserPerfsuiteMetadata.py.
def isTimeStamp(line):
    """
    Returns whether the string is a timestamp (if not returns None)

    >>> parserPerfsuiteMetadata.isTimeStamp("Fri Aug 14 01:16:03 2009")
    True
    >>> parserPerfsuiteMetadata.isTimeStamp("Fri Augx 14 01:16:03 2009")

    """
    datetime_format = "%a %b %d %H:%M:%S %Y" # we use the default date format
    try:
        time.strptime(line, datetime_format)
        return True
    except ValueError:
        return None
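The check above boils down to time.strptime with the ctime-style format; a self-contained Python 3 sketch of the same logic:

```python
import time

def is_timestamp(line, fmt="%a %b %d %H:%M:%S %Y"):
    # strptime raises ValueError on anything that doesn't fit the format
    try:
        time.strptime(line, fmt)
        return True
    except ValueError:
        return None

print(is_timestamp("Fri Aug 14 01:16:03 2009"))   # True
print(is_timestamp("Fri Augx 14 01:16:03 2009"))  # None ("Augx" is no month)
```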
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::parseAll | ( | self | ) |
Definition at line 722 of file parserPerfsuiteMetadata.py.
def parseAll(self):
    result = {"General": {}, "TestResults": {}, "cmsSciMark": {}, 'unrecognized_jobs': []}

    """ all the general info - start, arguments, host etc """
    result["General"].update(self.parseGeneralInfo())

    """ machine info - cpu, memory """
    result["General"].update(self.getMachineInfo())

    """ we add info about how successful the run was, when it finished and the final castor url of the file! """
    result["General"].update(self.parseTheCompletion())

    print "Parsing TimeSize runs..."
    if len(self.lines_timesize) > 0:
        try:
            result["TestResults"].update(self.parseTimeSize())
        except Exception, e:
            print "BAD BAD BAD UNHANDLED ERROR in parseTimeSize: " + str(e)

    print "Parsing Other(IgProf, Memcheck, ...) runs..."
    try:
        result["TestResults"].update(self.parseAllOtherTests())
    except Exception, e:
        print "BAD BAD BAD UNHANDLED ERROR in parseAllOtherTests: " + str(e)

    #print result["TestResults"]

    main_cores = [result["General"]["run_on_cpus"]]
    num_cores = result["General"].get("num_cores", 0)
    #DEBUG
    #print "Number of cores was: %s" % num_cores
    #TODO: temporarily - search for cores, use regexp
    main_cores = [1]

    # THE MACHINE SCIMARKS
    result["cmsSciMark"] = self.readCmsScimark(main_cores = main_cores)

    if self.missing_fields:
        self.handleParsingError("========== SOME REQUIRED FIELDS WERE NOT FOUND DURING PARSING ======= " + str(self.missing_fields))

    return result
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::parseAllOtherTests | ( | self | ) |
Definition at line 360 of file parserPerfsuiteMetadata.py.
def parseAllOtherTests(self):
    #make it general, for whatever test comes...
    test = {}

    parsing_rules = (
        (("", "candle", ), r"""^(Candle|ONLY) (.+) will be PROCESSED$""", "req"),
        #e.g.: --conditions FrontierConditions_GlobalTag,MC_31X_V4::All --eventcontent RECOSIM
        (("cms_driver_options", ), r"""^Using user-specified cmsDriver.py options: (.+)$"""),
        (("", "conditions", ""), r"""^Using user-specified cmsDriver.py options: (.*)--conditions ([^\s]+)(.*)$""", "req"),
        # for this we cannot guarantee that it has been found, TODO: we might count the number of pileup candles and compare with arguments
        (("", "pileup_type", ""), r"""^Using user-specified cmsDriver.py options:(.*)--pileup=([^\s]+)(.*)$"""),
        #not sure if event content is required
        (("", "event_content", ""), r"""^Using user-specified cmsDriver.py options:(.*)--eventcontent ([^\s]+)(.*)$""", "req"),
        #TODO: after changing the splitter to "taskset -c ..." this is no longer included in the part of the correct job
        #(("input_user_root_file", ), r"""^For these tests will use user input file (.+)$"""),
    )

    lines = self.lines_other
    """
    for each of the IgProf_Perf, IgProf_Mem, Memcheck, Callgrind tests we have this structure of input file:
    * beginning ->> and start timestamp - the first one:
        Launching the PILE UP IgProf_Mem tests on cpu 4 with 201 events each
        Adding thread <simpleGenReportThread(Thread-1, started -176235632)> to the list of active threads
        Mon Jun 14 20:06:54 2010

        <... whatever might be here, might overlap with other test start/end messages ...>

        Mon Jun 14 21:59:33 2010
        IgProf_Mem test, in thread <simpleGenReportThread(Thread-1, stopped -176235632)> is done running on core 4

    * ending - the last timestamp "before is done running ..."
    """
    # we take the first TimeStamp after the starting message and the first before the finishing message, in 2 rounds..

    #TODO: if the threads were changed it would stop working!!!

    # i.e. Memcheck, cpu, events
    reSubmit = re.compile(r"""^Let's submit (.+) test on core (\d+)$""")

    reStart = re.compile(r"""^Launching the (PILE UP |)(.*) tests on cpu (\d+) with (\d+) events each$""")

    # i.e. Memcheck, thread name, id, core number
    reEnd = re.compile(r"""^(.*) test, in thread <simpleGenReportThread\((.+), stopped -(\d+)\)> is done running on core (\d+)$""")

    reAddThread = re.compile(r"""^Adding thread <simpleGenReportThread\((.+), started -(\d+)\)> to the list of active threads$""")

    reWaiting = re.compile(r"""^Waiting for tests to be done...$""")

    reExitCode = re.compile(r"""Individual cmsRelvalreport.py ExitCode (\d+)""")
    """ we search for lines being either: (it's a little pascal'ish but we need the index!) """

    jobs = []

    #can split it into jobs! just have to reparse it for the exit codes later....
    for line_index in xrange(0, len(lines)):
        line = lines[line_index]
        if reSubmit.match(line):
            end_index = self.findLineAfter(line_index, lines, test_condition=lambda l: reWaiting.match(l), return_index = True)
            jobs.append(lines[line_index:end_index])

    for job_lines in jobs:
        #print job_lines
        info = self._applyParsingRules(parsing_rules, job_lines)
        #Fixing here the compatibility with the new cmsDriver.py --conditions option
        #(for which we now have autoconditions and FrontierConditions_GlobalTag is optional):
        if 'auto:' in info['conditions']:
            from Configuration.AlCa.autoCond import autoCond
            info['conditions'] = autoCond[ info['conditions'].split(':')[1] ].split("::")[0]
        else:
            if 'FrontierConditions_GlobalTag' in info['conditions']:
                info['conditions'] = info['conditions'].split(",")[1]

        steps_start = self.findFirstIndex_ofStartsWith(job_lines, "You defined your own steps to run:")
        steps_end = self.findFirstIndex_ofStartsWith(job_lines, "*Candle ")
        #probably it includes steps until we find *Candle... ?
        steps = job_lines[steps_start + 1:steps_end]
        if not self.validateSteps(steps):
            self.handleParsingError("Steps were not found correctly: %s for current job: %s" % (str(steps), str(job_lines)))

            """ quite nasty - just a workaround """
            print "Trying to recover from this error in case of old cmssw"

            """ we assume that steps are between the following sentence and a TimeStamp """
            steps_start = self.findFirstIndex_ofStartsWith(job_lines, "Steps passed to writeCommands")
            steps_end = self.findLineAfter(steps_start, job_lines, test_condition = self.isTimeStamp, return_index = True)

            steps = job_lines[steps_start + 1:steps_end]
            if not self.validateSteps(steps):
                self.handleParsingError("EVEN AFTER RECOVERY Steps were not found correctly!: %s for current job: %s" % (str(steps), str(job_lines)))
            else:
                print "RECOVERY SEEMS to be successful: %s" % str(steps)

        info["steps"] = self._LINE_SEPARATOR.join(steps) #!!!! STEPS MIGHT CONTAIN COMMA: ","

        start_id_index = self.findLineAfter(0, job_lines, test_condition = reStart.match, return_index = True)
        pileUp, testName, testCore, testEventsNum = reStart.match(job_lines[start_id_index]).groups()
        info["testname"] = testName

        thread_id_index = self.findLineAfter(0, job_lines, test_condition = reAddThread.match, return_index = True)
        info["start"] = self.firstTimeStampAfter(thread_id_index, job_lines)

        thread_id, thread_number = reAddThread.match(job_lines[thread_id_index]).groups()
        info["thread_id"] = thread_id

        if not test.has_key(testName):
            test[testName] = []
        test[testName].append(info)

    for line_index in xrange(0, len(lines)):
        line = lines[line_index]

        if reEnd.match(line):
            testName, thread_id, thread_num, testCore = reEnd.match(line).groups()
            time = self.firstTimeStampBefore(line_index, lines)
            try:
                exit_code = ""
                #we search for the exit code
                line_exitcode = self.findLineBefore(line_index, lines, test_condition=lambda l: reExitCode.match(l))
                exit_code, = reExitCode.match(line_exitcode).groups()
            except Exception, e:
                print "Error while getting exit code (Other test): %s" % str(e)

            for key, thread in test.items():
                for i in range(0, len(thread)):
                    if thread[i]["thread_id"] == thread_id:
                        thread[i].update({"end": time, "exit_code": exit_code})
                        break

    return test
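The start/end pairing via the Thread-ID described above can be shown in isolation. This Python 3 sketch uses simplified patterns and illustrative log lines; the real method tracks far more fields (core, event count, timestamps, exit codes):

```python
import re

# Simplified versions of the reAddThread / reEnd patterns used above.
re_add = re.compile(r"^Adding thread <simpleGenReportThread\((.+), started -(\d+)\)>")
re_end = re.compile(r"^(.*) test, in thread <simpleGenReportThread\((.+), stopped -(\d+)\)>")

log = [
    "Adding thread <simpleGenReportThread(Thread-1, started -176235632)> to the list of active threads",
    "Memcheck test, in thread <simpleGenReportThread(Thread-1, stopped -176235632)> is done running on core 4",
]

jobs = {}
for line in log:
    m = re_add.match(line)
    if m:
        # a start message tells us only the thread id, not the test name
        jobs[m.group(1)] = {"status": "running"}
        continue
    m = re_end.match(line)
    if m:
        # the end message carries the test name, so join the two by thread id
        test_name, thread_id = m.group(1), m.group(2)
        jobs[thread_id].update({"status": "done", "testname": test_name})

print(jobs)
```

This is exactly why the Thread-ID is the only reliable join key: the start and end lines spell the test name differently.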
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::parseGeneralInfo | ( | self | ) |
Definition at line 255 of file parserPerfsuiteMetadata.py.
def parseGeneralInfo(self):
    lines = self.lines_general
    """ we define a simple list (tuple) of rules for parsing: the first part of each tuple defines the parameters to be fetched from the
        regexp, while the second one is the regexp itself """
    #TIP: don't forget that a tuple of one ends with ,
    parsing_rules = (
        (("", "num_cores", "run_on_cpus"), r"""^This machine \((.+)\) is assumed to have (\d+) cores, and the suite will be run on cpu \[(.+)\]$"""),
        (("start_time", "host", "local_workdir", "user"), r"""^Performance Suite started running at (.+) on (.+) in directory (.+), run by user (.+)$""", "req"),
        (("architecture",), r"""^Current Architecture is (.+)$"""),
        (("test_release_based_on",), r"""^Test Release based on: (.+)$""", "req"),
        (("base_release_path",), r"""^Base Release in: (.+)$"""),
        (("test_release_local_path",), r"""^Your Test release in: (.+)$"""),

        (("castor_dir",), r"""^The performance suite results tarball will be stored in CASTOR at (.+)$"""),

        (("TimeSize_events",), r"""^(\d+) TimeSize events$"""),
        (("IgProf_events",), r"""^(\d+) IgProf events$"""),
        (("CallGrind_events",), r"""^(\d+) Callgrind events$"""),
        (("Memcheck_events",), r"""^(\d+) Memcheck events$"""),

        (("candles_TimeSize",), r"""^TimeSizeCandles \[(.*)\]$"""),
        (("candles_TimeSizePU",), r"""^TimeSizePUCandles \[(.*)\]$"""),

        (("candles_Memcheck",), r"""^MemcheckCandles \[(.*)\]$"""),
        (("candles_MemcheckPU",), r"""^MemcheckPUCandles \[(.*)\]$"""),

        (("candles_Callgrind",), r"""^CallgrindCandles \[(.*)\]$"""),
        (("candles_CallgrindPU",), r"""^CallgrindPUCandles \[(.*)\]$"""),

        (("candles_IgProfPU",), r"""^IgProfPUCandles \[(.*)\]$"""),
        (("candles_IgProf",), r"""^IgProfCandles \[(.*)\]$"""),

        (("cmsScimark_before",), r"""^(\d+) cmsScimark benchmarks before starting the tests$"""),
        (("cmsScimark_after",), r"""^(\d+) cmsScimarkLarge benchmarks before starting the tests$"""),
        (("cmsDriverOptions",), r"""^Running cmsDriver.py with user defined options: --cmsdriver="(.+)"$"""),

        (("HEPSPEC06_SCORE",), r"""^This machine's HEPSPEC06 score is: (.+)$"""),
    )
    """ we apply the defined parsing rules to extract the required fields of information into the dictionary (as defined in the parsing rules) """
    info = self._applyParsingRules(parsing_rules, lines)

    """ postprocess the candles list """
    candles = {}
    for field, value in info.items():
        if field.startswith("candles_"):
            test = field.replace("candles_", "")
            value = [v.strip(" '") for v in value.split(",")]
            #if value:
            candles[test] = value
            del info[field]
    #print candles
    info["candles"] = self._LINE_SEPARATOR.join([k + ":" + ",".join(v) for (k, v) in candles.items()])

    """ TAGS """
    """
    --- Tag ---    --- RelTag ---   -------- Package --------
    HEAD           V05-03-06        IgTools/IgProf
    V01-06-05      V01-06-04        Validation/Performance
    ---------------------------------------
    total packages: 2 (2 displayed)
    """
    tags_start_index = -1 # set some default
    try:
        tags_start_index = [i for i in xrange(0, len(lines)) if lines[i].startswith("--- Tag ---")][0]
    except:
        pass
    if tags_start_index > -1:
        tags_end_index = [i for i in xrange(tags_start_index + 1, len(lines)) if lines[i].startswith("---------------------------------------")][0]
        # print "tags start index: %s, end index: %s" % (tags_start_index, tags_end_index)
        tags = lines[tags_start_index:tags_end_index + 2]
        # print [tag.split(" ") for tag in tags]
        # print "\n".join(tags)
    else: # no tags found, make an empty list ...
        tags = []
    """ we join the tags with a separator to store them as a simple string """
    info["tags"] = self._LINE_SEPARATOR.join(tags)
    #FILES/PATHS

    """ get the command line """
    try:
        cmd_index = self.findFirstIndex_ofStartsWith(lines, "Performance suite invoked with command line:") + 1 #that's the next line
        info["command_line"] = lines[cmd_index]
    except IndexError, e:
        if self._DEBUG:
            print e
        info["command_line"] = ""

    try:
        cmd_parsed_start = self.findFirstIndex_ofStartsWith(lines, "Initial PerfSuite Arguments:") + 1
        cmd_parsed_end = self.findFirstIndex_ofStartsWith(lines, "Running cmsDriver.py")
        info["command_line_parsed"] = self._LINE_SEPARATOR.join(lines[cmd_parsed_start:cmd_parsed_end])
    except IndexError, e:
        if self._DEBUG:
            print e
        info["command_line_parsed"] = ""

    return info
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::parseTheCompletion | ( | self | ) |
checks if the suite has successfully finished and if the tarball was successfully archived and uploaded to castor
Definition at line 654 of file parserPerfsuiteMetadata.py.
def parseTheCompletion(self):
    """
    checks if the suite has successfully finished
    and if the tarball was successfully archived and uploaded to castor
    """
    parsing_rules = (
        (("finishing_time", "", ""), r"""^Performance Suite finished running at (.+) on (.+) in directory (.+)$"""),
        (("castor_md5",), r"""^The md5 checksum of the tarball: (.+)$"""),
        (("successfully_archived_tarball", ), r"""^Successfully archived the tarball (.+) in CASTOR!$"""),
        #TODO: WE MUST HAVE THE CASTOR URL, but for some of the files it's not included [probably crashed]
        (("castor_file_url",), r"""^The tarball can be found: (.+)$"""),
        (("castor_logfile_url",), r"""^The logfile can be found: (.+)$"""),
    )

    """ we apply the defined parsing rules to extract the required fields of information into the dictionary (as defined in the parsing rules) """
    info = self._applyParsingRules(parsing_rules, self.lines_other)

    """ did we detect any errors in the log files? """
    info["no_errors_detected"] = [line for line in self.lines_other if line == "There were no errors detected in any of the log files!"] and "1" or "0"
    if not info["successfully_archived_tarball"]:
        info["castor_file_url"] = ""

    if not info["castor_file_url"]:
        #TODO: get the castor file url or abort
        self.handleParsingError("Castor tarball URL not found. Trying to get from environment")
        lmdb_castor_url_is_valid = lambda url: url.startswith("/castor/")

        url = ""
        try:
            #print "HERE!"
            url = self.get_tarball_fromlog()
            print "Extracted castor tarball full path by re-parsing cmsPerfSuite.log: %s" % url
        except:
            if os.environ.has_key("PERFDB_CASTOR_FILE_URL"):
                url = os.environ["PERFDB_CASTOR_FILE_URL"]
            else: #FIXME: add the possibility to get it directly from the cmsPerfSuite.log file (make sure it is dumped there before doing the tarball itself...)
                print "Failed to get the tarball location from environment variable PERFDB_CASTOR_FILE_URL"
                self.handleParsingError("Castor tarball URL not found. Provide interactively")

        while True:
            if lmdb_castor_url_is_valid(url):
                info["castor_file_url"] = url
                break
            print "Please enter a valid CASTOR url: has to start with /castor/ and should point to the tarball"
            if os.isatty(0): url = sys.stdin.readline()
            else: raise IOError("stdin is closed.")

    return info
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::parseTimeSize | ( | self | ) |
parses the timeSize
Definition at line 493 of file parserPerfsuiteMetadata.py.
    """ parses the timeSize """
    timesize_result = []

    # TODO: we will use the first timestamp after the "For these tests will use user input file..." line
    # TODO: do we have to save the name of the input file somewhere?
    """
    the structure of the input file:
    * beginning ->> and start timestamp - the first one:
    >>> [optional: For these tests will use user input file /build/RAWReference/MinBias_RAW_320_IDEAL.root]
    <...>
    Using user-specified cmsDriver.py options: --conditions FrontierConditions_GlobalTag,MC_31X_V4::All --eventcontent RECOSIM
    Candle MinBias will be PROCESSED
    You defined your own steps to run:
    RAW2DIGI-RECO
    *Candle MinBias
    Written out cmsRelvalreport.py input file at:
    /build/relval/CMSSW_3_2_4/workStep2/MinBias_TimeSize/SimulationCandles_CMSSW_3_2_4.txt
    Thu Aug 13 14:53:37 2009 [start]
    <....>
    Thu Aug 13 16:04:48 2009 [end]
    Individual cmsRelvalreport.py ExitCode 0
    * ending - the last timestamp "... ExitCode ...."
    """
    # TODO: do we need the cmsDriver --conditions? It should be global per work directory = 1 perfsuite run (so the same for all candles in one work dir)
    # TODO: which candle definition to use?

    """ divide into separate jobs """
    lines = self.lines_timesize
    jobs = []
    start = False
    timesize_start_indicator = re.compile(r"""^taskset -c (\d+) cmsRelvalreportInput.py""")
    for line_index in xrange(0, len(lines)):
        line = lines[line_index]
        # search for the start of each TimeSize job (with a certain candle and step)
        if timesize_start_indicator.match(line):
            if start:
                jobs.append(lines[start:line_index])
            start = line_index
    # add the last one
    jobs.append(lines[start:len(lines)])
    #print "\n".join(str(i) for i in jobs)

    parsing_rules = (
        (("", "candle", ), r"""^(Candle|ONLY) (.+) will be PROCESSED$""", "req"),
        # e.g.: --conditions FrontierConditions_GlobalTag,MC_31X_V4::All --eventcontent RECOSIM
        (("cms_driver_options", ), r"""^Using user-specified cmsDriver.py options: (.+)$"""),
        (("", "conditions", ""), r"""^Using user-specified cmsDriver.py options: (.*)--conditions ([^\s]+)(.*)$""", "req"),
        # for this we cannot guarantee that it has been found; TODO: we might count the number of pileup candles and compare with arguments
        (("", "pileup_type", ""), r"""^Using user-specified cmsDriver.py options:(.*)--pileup=([^\s]+)(.*)$"""),
        # not sure if event content is required
        (("", "event_content", ""), r"""^Using user-specified cmsDriver.py options:(.*)--eventcontent ([^\s]+)(.*)$""", "req"),
        # TODO: after changing the splitter to "taskset -c ..." this is no longer included in the part of the correct job
        #(("input_user_root_file", ), r"""^For these tests will use user input file (.+)$"""),
    )

    # parse each of the TimeSize jobs: find candles, etc. and start/end times

    reExit_code = re.compile(r"""Individual ([^\s]+) ExitCode (\d+)""")

    if self._DEBUG:
        print "TimeSize (%d) jobs: %s" % (len(jobs), str(jobs))

    for job_lines in jobs:
        """ we apply the defined parsing rules to extract the required fields of information into the dictionary (as defined in the parsing rules) """
        info = self._applyParsingRules(parsing_rules, job_lines)
        # Fixing here the compatibility with the new cmsDriver.py --conditions option (for which we now have autoconditions, and FrontierConditions_GlobalTag is optional):
        if 'auto:' in info['conditions']:
            from Configuration.AlCa.autoCond import autoCond
            info['conditions'] = autoCond[ info['conditions'].split(':')[1] ].split("::")[0]
        else:
            if 'FrontierConditions_GlobalTag' in info['conditions']:
                info['conditions'] = info['conditions'].split(",")[1]

        #DEBUG:
        #print "CONDITIONS are: %s" % info['conditions']

        # start time - the index after which the time stamp comes
        """ the following is not available in one of the releases; instead,
        use the first timestamp available in our job - that's the starting time :) """

        #start_time_after = self.findFirstIndex_ofStartsWith(job_lines, "Written out cmsRelvalreport.py input file at:")
        #print start_time_after
        info["start"] = self.firstTimeStampAfter(0, job_lines)

        # TODO: improve in the future (in case of changes): we could use findBefore instead, which takes the regexp as a search parameter
        # end time - the index before which the time stamp comes

        # In older files we have "Individual Relvalreport.py ExitCode 0" instead of "Individual cmsRelvalreport.py ExitCode"
        end_time_before = self.findLineAfter(0, job_lines, test_condition = reExit_code.match, return_index = True)

        # on the same line we have the exit code - so let's get it
        nothing, exit_code = reExit_code.match(job_lines[end_time_before]).groups()

        info["end"] = self.firstTimeStampBefore(end_time_before, job_lines)
        info["exit_code"] = exit_code

        steps_start = self.findFirstIndex_ofStartsWith(job_lines, "You defined your own steps to run:")
        steps_end = self.findFirstIndex_ofStartsWith(job_lines, "*Candle ")
        # probably it includes steps until we find *Candle... ?
        steps = job_lines[steps_start + 1:steps_end]
        if not self.validateSteps(steps):
            self.handleParsingError("Steps were not found correctly: %s for current job: %s" % (str(steps), str(job_lines)))

            """ quite nasty - just a workaround """
            print "Trying to recover from this error in case of old cmssw"

            """ we assume that the steps are between the following sentence and a timestamp """
            steps_start = self.findFirstIndex_ofStartsWith(job_lines, "Steps passed to writeCommands")
            steps_end = self.findLineAfter(steps_start, job_lines, test_condition = self.isTimeStamp, return_index = True)

            steps = job_lines[steps_start + 1:steps_end]
            if not self.validateSteps(steps):
                self.handleParsingError("EVEN AFTER RECOVERY steps were not found correctly! %s for current job: %s" % (str(steps), str(job_lines)))
            else:
                print "RECOVERY SEEMS to be successful: %s" % str(steps)

        info["steps"] = self._LINE_SEPARATOR.join(steps)  #!!!! STEPS MIGHT CONTAIN A COMMA: ","

        timesize_result.append(info)

    return {"TimeSize": timesize_result}
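Each rule tuple above pairs a list of field names (one per capture group, empty names marking throw-away groups) with a regular expression and an optional "req" flag. The listing does not include `_applyParsingRules` itself, so the following is only a hedged sketch of how such a rule table can be applied to log lines; the function name, the missing-fields bookkeeping, and the exact "req" semantics are assumptions.

```python
import re

def apply_parsing_rules(parsing_rules, lines):
    """Try each rule's regexp on every line; on the first match, store the
    captured groups under the rule's non-empty field names."""
    info = {}
    missing = []  # required rules that matched no line
    for rule in parsing_rules:
        fields, pattern = rule[0], re.compile(rule[1])
        required = len(rule) > 2 and rule[2] == "req"
        for line in lines:
            m = pattern.match(line)
            if m:
                for name, value in zip(fields, m.groups()):
                    if name:  # empty names mark groups we discard
                        info[name] = value
                break
        else:  # no line matched this rule
            if required:
                missing.append(fields)
    return info, missing

rules = ((("", "candle"), r"^(Candle|ONLY) (.+) will be PROCESSED$", "req"),)
info, missing = apply_parsing_rules(rules, ["Candle MinBias will be PROCESSED"])
# info == {"candle": "MinBias"}, missing == []
```

The empty first field name discards the `(Candle|ONLY)` group while keeping the candle name, which matches how the rule tuples in the listing are shaped.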
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::readCmsScimark ( self, main_cores = [1] )
Definition at line 629 of file parserPerfsuiteMetadata.py.
    main_core = main_cores[0]
    # TODO: WE DO NOT ALWAYS REALLY KNOW THE MAIN CORE NUMBER! but we don't care too much
    # we parse each of the SciMark files and the Composite scores
    csimark = []
    csimark.extend(self.readCmsScimarkTest(testName = "cmsScimark2", testType = "mainCore", core = main_core))
    csimark.extend(self.readCmsScimarkTest(testName = "cmsScimark2_large", testType = "mainCore_Large", core = main_core))

    # we do not always know the number of cores available, so we just search the directory to find the core numbers
    reIsCsiMark_notusedcore = re.compile("^cmsScimark_(\d+).log$")
    scimark_files = [reIsCsiMark_notusedcore.match(f).groups()[0]
                     for f in os.listdir(self._path)
                     if reIsCsiMark_notusedcore.match(f)
                        and os.path.isfile(os.path.join(self._path, f))]

    for core_number in scimark_files:
        try:
            csimark.extend(self.readCmsScimarkTest(testName = "cmsScimark_%s" % str(core_number), testType = "NotUsedCore_%s" % str(core_number), core = core_number))
        except IOError, e:
            if self._DEBUG:
                print e
    return csimark
    #print csimark
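The directory scan above can be demonstrated in isolation. One detail worth noting: the original pattern `"^cmsScimark_(\d+).log$"` uses an unescaped dot, so it would also match a hypothetical name like `cmsScimark_2Xlog`; escaping the dot, as in this sketch, is a small hardening and an assumption on my part, not what the listed code does.

```python
import os
import re
import tempfile

# Escaped dot: only real ".log" files match (the listing leaves it unescaped).
re_notusedcore = re.compile(r"^cmsScimark_(\d+)\.log$")

# Build a throw-away results directory with a mix of file names.
path = tempfile.mkdtemp()
for name in ("cmsScimark_2.log", "cmsScimark_3.log", "cmsPerfSuite.log"):
    open(os.path.join(path, name), "w").close()

# Same shape as the list comprehension in readCmsScimark: keep only the
# core numbers captured from matching file names.
core_numbers = sorted(
    re_notusedcore.match(f).group(1)
    for f in os.listdir(path)
    if re_notusedcore.match(f) and os.path.isfile(os.path.join(path, f))
)
# core_numbers == ['2', '3']
```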
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::readCmsScimarkTest ( self, testName, testType, core )
Definition at line 617 of file parserPerfsuiteMetadata.py.
    lines = self.readInput(self._path, fileName = testName + ".log")
    scores = [{"score": self.reCmsScimarkTest.match(line).groups()[1], "type": testType, "core": core}
              for line in lines
              if self.reCmsScimarkTest.match(line)]
    # add the number of the measurement
    i = 0
    for score in scores:
        i += 1
        score.update({"messurement_number": i})
    return scores
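The score extraction can be sketched with a stand-in pattern. The real regexp, `self.reCmsScimarkTest`, is built in `__init__` and is not shown in this listing, so the pattern below is purely illustrative; the dictionary shape and the numbering loop mirror `readCmsScimarkTest`, including the original `"messurement_number"` field name, which is kept verbatim because downstream consumers may rely on that spelling.

```python
import re

# Stand-in for self.reCmsScimarkTest (the real pattern is defined in __init__
# and not shown here): group 2 captures the numeric score.
re_score = re.compile(r"^(Composite) Score: ([\d.]+)$")

lines = ["Composite Score: 123.45", "some other output", "Composite Score: 130.00"]

# Same shape as readCmsScimarkTest: one dict per matching line.
scores = [{"score": re_score.match(line).groups()[1], "type": "mainCore", "core": 1}
          for line in lines
          if re_score.match(line)]

# Number each measurement, mirroring the counting loop in the listing
# (the field name keeps the original spelling used by the parser).
for i, score in enumerate(scores, start=1):
    score["messurement_number"] = i
# scores[0] == {"score": "123.45", "type": "mainCore", "core": 1, "messurement_number": 1}
```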
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::readInput ( self, path, fileName = "cmsPerfSuite.log" )
Definition at line 161 of file parserPerfsuiteMetadata.py.
def parserPerfsuiteMetadata::parserPerfsuiteMetadata::validateSteps ( self, steps )
Simple function for error detection. TODO: we could also check against a list of possible steps.
Definition at line 24 of file parserPerfsuiteMetadata.py.
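The body of validateSteps is not included in this listing. Given how it is called (on the list of step lines captured between two markers) and the presence of the private `_MAX_STEPS` attribute, a plausible sketch is a simple sanity check on the list's size; the bound and the exact checks here are assumptions, not the documented implementation.

```python
# Assumed bound for illustration only; the real value lives in the
# private class attribute _MAX_STEPS, which is not shown in this listing.
_MAX_STEPS = 5

def validateSteps(steps):
    """Simple error detection: reject an empty or implausibly long step list
    (a sketch of what validateSteps might check, not the documented code)."""
    return bool(steps) and len(steps) <= _MAX_STEPS

# validateSteps(["RAW2DIGI-RECO"]) -> True
# validateSteps([]) -> False
```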
Definition at line 28 of file parserPerfsuiteMetadata.py.
string parserPerfsuiteMetadata::parserPerfsuiteMetadata::_LINE_SEPARATOR = "|" [static, private]
Definition at line 23 of file parserPerfsuiteMetadata.py.
Definition at line 34 of file parserPerfsuiteMetadata.py.