crabFunctions.CrabTask Class Reference

Class for a single CrabRequest. This class represents one crab3 task/request.

Public Member Functions

def __init__
 The object constructor.

def crab_folder

def crabConfig
 Function to access the crab config object, or read it if it is uninitialized.

def crabFolder

def datasetpath

def handleNoState
 Function to handle a task which received NOSTATE status.

def isData
 Property function to find out if the task runs on data.

def readLogArch
 Function to read log info from log.tar.gz.

def resubmit_failed
 Function to resubmit failed jobs in tasks.

def test_print

def update
 Function to update the task and its associated jobs.

def updateJobStats
 Function to update JobStatistics.
 

Public Attributes

 debug
 
 failureReason
 
 finalFiles
 
 isUpdating
 
 jobs
 
 lastUpdate
 
 localDir
 
 log
 
 maxjobnumber
 
 name
 
 nComplete
 
 nCooloff
 
 nFailed
 
 nFinished
 
 nIdle
 
 nJobs
 
 nRunning
 
 nTransferring
 
 nUnsubmitted
 
 outlfn
 
 resubmitCount
 
 state
 
 taskId
 
 totalEvents
 
 uuid
 

Private Attributes

 _crabConfig
 
 _crabFolder
 
 _datasetpath_default
 
 _isData
 

Detailed Description

Class for a single CrabRequest. This class represents one crab3 task/request.

Definition at line 372 of file crabFunctions.py.

Constructor & Destructor Documentation

def crabFunctions.CrabTask.__init__ (   self,
  taskname = "",
  crab_config = "",
  crabController = None,
  initUpdate = True,
  debuglevel = "ERROR",
  datasetpath = "",
  localDir = "",
  outlfn = "" 
)

The object constructor.

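Parameters
self: The object pointer.
taskname: The name of the crab task. If it is empty, crab_config must be set.
crab_config: Path to a crab config file; the task name is read from it when taskname is empty.
initUpdate: Flag if crab status should be called when an instance is created.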

Definition at line 387 of file crabFunctions.py.

    def __init__(self,
                 taskname="",
                 crab_config="",
                 crabController=None,
                 initUpdate=True,
                 debuglevel="ERROR",
                 datasetpath="",
                 localDir="",
                 outlfn=""):

        # crab config as a python object should only be used via .crabConfig
        self._crabConfig = None
        self._crabFolder = None

        if taskname:
            self.name = taskname
        else:
            if not crab_config:
                raise ValueError("Either taskname or crab_config needs to be set")
            if not os.path.exists( crab_config):
                raise IOError("File %s not found" % crab_config )
            # crabConfig reads the config through self.name, so point it at
            # the file first and then switch to the request name
            self.name = crab_config
            self.name = self.crabConfig.General.requestName
        self.uuid = uuid.uuid4()
        #~ self.lock = multiprocessing.Lock()
        # setup logging
        self.log = logging.getLogger( 'crabTask' )
        # setLevel accepts level names such as "ERROR" directly;
        # logging._levelNames exists only in Python 2
        self.log.setLevel( debuglevel )
        self.jobs = {}
        self.localDir = localDir
        self.outlfn = outlfn
        self.isUpdating = False
        self.taskId = -1
        # variables for statistics
        self.nJobs = 0
        self.state = "NOSTATE"
        self.maxjobnumber = 0
        self.nUnsubmitted = 0
        self.nIdle = 0
        self.nRunning = 0
        self.nTransferring = 0
        self.nCooloff = 0
        self.nFailed = 0
        self.nFinished = 0
        self.nComplete = 0
        self.failureReason = None
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )
        self._isData = None
        self.resubmitCount = 0
        self.debug = False
        self.finalFiles = []
        self.totalEvents = 0

        self._datasetpath_default = datasetpath

        # start with first updates
        if initUpdate:
            self.update()
            self.updateJobStats()
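The constructor stores lastUpdate as a string in the format "%Y-%m-%d_%H.%M.%S". The following is a minimal sketch, not part of crabFunctions.py, of how such a string can be produced and parsed back, for example to find out how old the last update is. The helper name age_seconds and the constant STAMP_FORMAT are hypothetical.

```python
import datetime

# Same format string as used for CrabTask.lastUpdate
STAMP_FORMAT = "%Y-%m-%d_%H.%M.%S"

def age_seconds(stamp, now):
    """Seconds elapsed between a lastUpdate-style stamp and `now`."""
    then = datetime.datetime.strptime(stamp, STAMP_FORMAT)
    return (now - then).total_seconds()

# Fixed times are used here so that the result is reproducible.
now = datetime.datetime(2024, 1, 1, 12, 0, 30)
stamp = datetime.datetime(2024, 1, 1, 12, 0, 0).strftime(STAMP_FORMAT)
print(stamp)                    # 2024-01-01_12.00.00
print(age_seconds(stamp, now))  # 30.0
```
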

Member Function Documentation

def crabFunctions.CrabTask.crab_folder (   self)

Definition at line 507 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

    def crab_folder(self):
        return os.path.join( self.crabConfig.General.workArea,
                             "crab_" + self.crabConfig.General.requestName)

def crabFunctions.CrabTask.crabConfig (   self)

Function to access the crab config object, or read it if it is uninitialized.

Parameters
self: The CrabTask object pointer.

Definition at line 465 of file crabFunctions.py.

References crabFunctions.CrabTask._crabConfig, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, and hTMaxCell.name.

    def crabConfig( self ):
        if self._crabConfig is None:
            crab = CrabController()
            self._crabConfig = crab.readCrabConfig( self.name )
        return self._crabConfig

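The read-on-first-access behaviour of crabConfig can be illustrated in isolation. The sketch below is not CMSSW code: LazyTask and fake_read_config are hypothetical stand-ins for CrabTask and CrabController.readCrabConfig, and the use of @property is an assumption suggested by the attribute-style access self.crabConfig.General elsewhere in this class.

```python
from types import SimpleNamespace

# Hypothetical stand-in for CrabController.readCrabConfig; it records each
# call so that the caching can be observed.
read_calls = []

def fake_read_config(name):
    read_calls.append(name)
    return SimpleNamespace(General=SimpleNamespace(requestName=name))

class LazyTask:
    """Minimal illustration of a cached, lazily read config attribute."""

    def __init__(self, name):
        self.name = name
        self._crabConfig = None  # nothing is read until the first access

    @property
    def crabConfig(self):
        if self._crabConfig is None:
            self._crabConfig = fake_read_config(self.name)
        return self._crabConfig

task = LazyTask("my_request")
first = task.crabConfig
second = task.crabConfig
print(first is second, len(read_calls))  # True 1
```

The config is read exactly once; every later access returns the cached object.
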
def crabFunctions.CrabTask.crabFolder (   self)

Definition at line 480 of file crabFunctions.py.

References crabFunctions.CrabTask._crabFolder, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, and hTMaxCell.name.

    def crabFolder( self ):
        if self._crabFolder is not None: return self._crabFolder
        crab = CrabController()
        # first look below the work area given in the crab config
        if os.path.exists( os.path.join( self.crabConfig.General.workArea, crab._prepareFoldername( self.name ) ) ):
            self._crabFolder = os.path.join( self.crabConfig.General.workArea, crab._prepareFoldername( self.name ) )
            return self._crabFolder
        # then fall back to the current working directory
        # (os.getcwd(); os.path has no cwd() function)
        alternative_path = os.path.join(os.getcwd(), crab._prepareFoldername( self.name ) )
        if os.path.exists( alternative_path ):
            self._crabFolder = alternative_path
            return self._crabFolder
        self.log.error( "Unable to find folder for Task")
        return ""

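The lookup order used by crabFolder (work area first, then the current working directory, then an empty string) amounts to returning the first existing candidate path. A minimal standalone sketch of that idea follows; find_task_folder and the directory names are hypothetical and do not exist in crabFunctions.py.

```python
import os
import tempfile

def find_task_folder(folder_name, search_dirs):
    """Return the first existing search_dir/folder_name, or "" if none exists."""
    for base in search_dirs:
        candidate = os.path.join(base, folder_name)
        if os.path.exists(candidate):
            return candidate
    return ""

# Two throwaway directories play the roles of work area and working directory.
work_area = tempfile.mkdtemp()
fallback_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(fallback_dir, "crab_my_request"))

found = find_task_folder("crab_my_request", [work_area, fallback_dir])
missing = find_task_folder("crab_other", [work_area, fallback_dir])
print(found == os.path.join(fallback_dir, "crab_my_request"), repr(missing))
```
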
def crabFunctions.CrabTask.datasetpath (   self)

Definition at line 472 of file crabFunctions.py.

References crabFunctions.CrabTask._datasetpath_default.

    def datasetpath( self ):
        try:
            return self.crabConfig.Data.inputDataset
        except:
            pass
        return self._datasetpath_default

def crabFunctions.CrabTask.handleNoState (   self)

Function to handle Task which received NOSTATE status.

Parameters
self: The CrabTask object pointer.

Definition at line 542 of file crabFunctions.py.

References AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, crabFunctions.CrabTask.resubmitCount, CastorLedAnalysis.state, HcalLedAnalysis.state, CastorPedestalAnalysis.state, HcalPedestalAnalysis.state, and crabFunctions.CrabTask.state.

    def handleNoState( self ):
        crab = CrabController()
        # failureReason starts out as None, so guard the substring test
        if self.failureReason is not None and "The CRAB3 server backend could not resubmit your task because the Grid scheduler answered with an error." in self.failureReason:
            # move folder and try it again
            cmd = 'mv %s bak_%s' %(crab._prepareFoldername( self.name ),crab._prepareFoldername( self.name ))
            p = subprocess.Popen(cmd,stdout=subprocess.PIPE, shell=True)#,shell=True,universal_newlines=True)
            (out,err) = p.communicate()
            self.state = "SHEDERR"
            configName = '%s_cfg.py' %(crab._prepareFoldername( self.name ))
            crab.submit( configName )

        elif self.failureReason is not None:
            self.state = "ERRHANDLE"
            crab.resubmit( self.name )
        self.resubmitCount += 1

def crabFunctions.CrabTask.isData (   self)

Property function to find out if task runs on data.

Parameters
self: The CrabTask object pointer.

Definition at line 448 of file crabFunctions.py.

References crabFunctions.CrabTask._isData.

    def isData( self ):
        if self._isData is None:
            try:
                test = self.crabConfig.Data.lumiMask
                self._isData = True
            except:
                if self.name.startswith( "Data_" ):
                    self._isData = True
                else:
                    self._isData = False
        return self._isData

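The decision made by isData has two steps: a task whose config defines Data.lumiMask counts as data, otherwise the task name decides via the "Data_" prefix. The sketch below reproduces that decision as a free function. It is an illustration only: looks_like_data is a hypothetical name, the configs are SimpleNamespace stand-ins for real crab config objects, and hasattr is used here where the listing uses a bare try/except.

```python
from types import SimpleNamespace

def looks_like_data(config, task_name):
    """Lumi mask first, then the task-name prefix."""
    # A configured lumi mask is taken as the sign of a data task.
    if hasattr(config.Data, "lumiMask"):
        return True
    # Otherwise fall back to the naming convention.
    return task_name.startswith("Data_")

# Hypothetical configs with and without a lumi mask
with_mask = SimpleNamespace(Data=SimpleNamespace(lumiMask="mask.json"))
without_mask = SimpleNamespace(Data=SimpleNamespace())

print(looks_like_data(with_mask, "TT_13TeV"))          # True
print(looks_like_data(without_mask, "Data_SingleMu"))  # True
print(looks_like_data(without_mask, "TT_13TeV"))       # False
```
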
def crabFunctions.CrabTask.readLogArch (   self,
  logArchName 
)

Function to read log info from log.tar.gz.

Parameters
self: The object pointer.
logArchName: Path to the compressed log file.
Returns
a dictionary with parsed info

Definition at line 599 of file crabFunctions.py.

References print(), and submitPVValidationJobs.split().

    def readLogArch(self, logArchName):
        # the job number is taken from the file name: <prefix>_<number>.<suffix>
        JobNumber = logArchName.split("/")[-1].split("_")[1].split(".")[0]
        log = {'readEvents' : 0}
        with tarfile.open( logArchName, "r") as tar:
            try:
                JobXmlFile = tar.extractfile('FrameworkJobReport-%s.xml' % JobNumber)
                root = ET.fromstring( JobXmlFile.read() )
                # only the first InputFile element is evaluated
                for child in root:
                    if child.tag == 'InputFile':
                        for subchild in child:
                            if subchild.tag == 'EventsRead':
                                nEvents = int(subchild.text)
                                log.update({'readEvents' : nEvents})
                                break
                        break
            except:
                print("Can not parse / read %s" % logArchName)
        return log

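The parsing done by readLogArch can be tried without a real crab log. The sketch below builds a small archive and extracts the EventsRead value of the first InputFile element from it. The archive name cmsRun_7.log.tar.gz, the reduced XML content and the function name read_events are assumptions made for illustration; only the member name pattern FrameworkJobReport-<n>.xml and the tags InputFile and EventsRead come from the listing.

```python
import io
import os
import tarfile
import tempfile
import xml.etree.ElementTree as ET

def read_events(log_arch_name):
    """Return EventsRead of the first InputFile in the job report, or 0."""
    # "cmsRun_7.log.tar.gz" -> "7"
    job_number = os.path.basename(log_arch_name).split("_")[1].split(".")[0]
    with tarfile.open(log_arch_name, "r") as tar:
        report = tar.extractfile("FrameworkJobReport-%s.xml" % job_number)
        root = ET.fromstring(report.read())
    input_file = root.find("InputFile")  # first direct child only
    if input_file is not None:
        events = input_file.find("EventsRead")
        if events is not None:
            return int(events.text)
    return 0

# Build a throwaway archive containing a reduced, made-up job report.
xml = (b"<FrameworkJobReport><InputFile>"
       b"<EventsRead>1250</EventsRead>"
       b"</InputFile></FrameworkJobReport>")
archive = os.path.join(tempfile.mkdtemp(), "cmsRun_7.log.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    info = tarfile.TarInfo("FrameworkJobReport-7.xml")
    info.size = len(xml)
    tar.addfile(info, io.BytesIO(xml))

n_events = read_events(archive)
print(n_events)  # 1250
```

Unlike the listing, this sketch lets exceptions propagate instead of printing a message, which makes a missing or malformed report easier to notice during testing.
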
def crabFunctions.CrabTask.resubmit_failed (   self)

Function to resubmit failed jobs in tasks.

Parameters
self: The CrabTask object pointer.

Definition at line 496 of file crabFunctions.py.

References crabFunctions.CrabTask.jobs, crabFunctions.CrabTask.lastUpdate, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, and hTMaxCell.name.

    def resubmit_failed( self ):
        failedJobIds = []
        controller = CrabController()
        for jobkey in self.jobs.keys():
            job = self.jobs[jobkey]
            if job['State'] == 'failed':
                # use the most recent id of the failed job
                failedJobIds.append( job['JobIds'][-1] )
        controller.resubmit( self.name, joblist = failedJobIds )
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )

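The selection step of resubmit_failed, picking the last entry of 'JobIds' for every job in state 'failed', can be sketched on a plain dictionary. The helper name collect_failed_job_ids and the sample job entries are made up; the keys 'State' and 'JobIds' follow the listing.

```python
def collect_failed_job_ids(jobs):
    """Return the latest job id of every job whose state is exactly 'failed'."""
    return [job["JobIds"][-1] for job in jobs.values() if job["State"] == "failed"]

# Made-up job table in the layout used by the listing
jobs = {
    "1": {"State": "finished", "JobIds": ["101"]},
    "2": {"State": "failed", "JobIds": ["102", "202"]},
    "3": {"State": "failed", "JobIds": ["103"]},
}
failed_ids = collect_failed_job_ids(jobs)
print(failed_ids)  # ['202', '103']
```
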
def crabFunctions.CrabTask.test_print (   self)

Definition at line 558 of file crabFunctions.py.

References crabFunctions.CrabTask.uuid.

    def test_print(self):
        return self.uuid

def crabFunctions.CrabTask.update (   self)

Function to update the task and its associated jobs.

Parameters
self: The CrabTask object pointer.

Definition at line 513 of file crabFunctions.py.

References crabFunctions.CrabTask.crab_folder(), crabFunctions.CrabTask.failureReason, crabFunctions.CrabTask.isUpdating, crabFunctions.CrabTask.jobs, crabFunctions.CrabTask.lastUpdate, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, Mpslibclass.jobdatabase.nJobs, crabFunctions.CrabTask.nJobs, crabFunctions.CrabTask.resubmitCount, CastorLedAnalysis.state, HcalLedAnalysis.state, CastorPedestalAnalysis.state, HcalPedestalAnalysis.state, crabFunctions.CrabTask.state, and crabFunctions.CrabTask.updateJobStats().

Referenced by progressbar.ProgressBar.__next__(), MatrixUtil.Matrix.__setitem__(), MatrixUtil.Steps.__setitem__(), dqm-mbProfile.Profile.finish(), progressbar.ProgressBar.finish(), and MatrixUtil.Steps.overwrite().

    def update(self):
        #~ self.lock.acquire()
        self.log.debug( "Start update for task %s" % self.name )
        self.isUpdating = True
        controller = CrabController()
        self.state = "UPDATING"
        # check if we should drop this sample due to missing info

        self.log.debug( "Try to get status for task" )
        self.state , self.jobs,self.failureReason = controller.status(self.crab_folder)
        self.log.debug( "Found state: %s" % self.state )
        if self.state=="FAILED":
            # try it once more
            time.sleep(2)
            self.state , self.jobs,self.failureReason = controller.status(self.crab_folder)
        self.nJobs = len(self.jobs)
        self.updateJobStats()
        if self.state == "NOSTATE":
            self.log.debug( "Trying to resubmit because of NOSTATE" )
            # at most three recovery attempts
            if self.resubmitCount < 3: self.handleNoState()
        # add to db if not
        # Final solution if state not yet found
        self.isUpdating = False
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )
        #~ self.lock.release()

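The status query in update is repeated once, after a short pause, if the first answer is "FAILED". A minimal sketch of this retry-once pattern follows. status_with_retry and the canned answers are hypothetical; the real method calls CrabController.status and sleeps for two seconds.

```python
import time

def status_with_retry(query, delay=2.0):
    """Call query() and repeat it once after `delay` seconds if it reports FAILED."""
    state, jobs, reason = query()
    if state == "FAILED":
        time.sleep(delay)
        state, jobs, reason = query()
    return state, jobs, reason

# Hypothetical status source: fails on the first call, succeeds on the second.
answers = iter([
    ("FAILED", {}, "timeout"),
    ("SUBMITTED", {"1": {"State": "running"}}, None),
])
state, jobs, reason = status_with_retry(lambda: next(answers), delay=0)
print(state, len(jobs))  # SUBMITTED 1
```
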
def crabFunctions.CrabTask.updateJobStats (   self,
  dCacheFileList = None 
)

Function to update JobStatistics.

Parameters
self: The object pointer.
dCacheFileList: A list of files on the dCache.

Definition at line 564 of file crabFunctions.py.

References any(), crabFunctions.CrabTask.jobs, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, MuonGeometrySanityCheckPoint.name, classes.MonitorData.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, crabFunctions.CrabTask.nComplete, and print().

Referenced by crabFunctions.CrabTask.update().

    def updateJobStats(self,dCacheFileList = None):
        jobKeys = sorted(self.jobs.keys())
        try:
            intJobkeys = [int(x) for x in jobKeys]
        except:
            print("error parsing job numbers to int")

        #maxjobnumber = max(intJobkeys)

        stateDict = {'unsubmitted':0,'idle':0,'running':0,'transferring':0,'cooloff':0,'failed':0,'finished':0}
        nComplete = 0

        # loop through jobs
        for key in jobKeys:
            job = self.jobs[key]
            # count the job in every state that matches its state string
            for statekey in stateDict.keys():
                if statekey in job['State']:
                    stateDict[statekey]+=1
                    # check if finished files are found on dCache if dCacheFileList is given
                    if dCacheFileList is not None:
                        outputFilename = "%s_%s"%( self.name, key)
                        if 'finished' in statekey and any(outputFilename in s for s in dCacheFileList):
                            nComplete +=1

        # store the counters as nUnsubmitted, nIdle, nRunning, ...
        for state in stateDict:
            attrname = "n" + state.capitalize()
            setattr(self, attrname, stateDict[state])
        self.nComplete = nComplete

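The counting in updateJobStats maps each job state to an attribute named "n" plus the capitalized state, for example 'finished' to nFinished. The sketch below shows that mapping on a bare object. Stats, count_states and the sample jobs are hypothetical; the state names and the substring test follow the listing, and the dCache check is left out.

```python
class Stats:
    """Bare holder for the n<State> counters."""

STATES = ["unsubmitted", "idle", "running", "transferring",
          "cooloff", "failed", "finished"]

def count_states(jobs, target):
    """Count jobs per state and store the counts on `target` as n<State>."""
    counts = dict.fromkeys(STATES, 0)
    for job in jobs.values():
        for state in STATES:
            # substring test, as in the listing
            if state in job["State"]:
                counts[state] += 1
    for state, n in counts.items():
        setattr(target, "n" + state.capitalize(), n)
    return counts

# Made-up job table
jobs = {
    "1": {"State": "finished"},
    "2": {"State": "finished"},
    "3": {"State": "running"},
    "4": {"State": "failed"},
}
stats = Stats()
count_states(jobs, stats)
print(stats.nFinished, stats.nRunning, stats.nFailed, stats.nIdle)  # 2 1 1 0
```
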

Member Data Documentation

crabFunctions.CrabTask._crabConfig
private

Definition at line 390 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.crabConfig().

crabFunctions.CrabTask._crabFolder
private

Definition at line 392 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.crabFolder().

crabFunctions.CrabTask._datasetpath_default
private

Definition at line 437 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.datasetpath().

crabFunctions.CrabTask._isData
private

Definition at line 428 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.isData().

crabFunctions.CrabTask.debug

Definition at line 431 of file crabFunctions.py.

Referenced by util.rrapi.RRApi.dprint(), rrapi.RRApi.dprint(), pkg.AbstractPkg.generate(), util.rrapi.RRApi.get(), rrapi.RRApi.get(), pkg.AbstractPkg.get_kwds(), runTauIdMVA.TauIDEmbedder.loadMVA_WPs_run2_2017(), runTauIdMVA.TauIDEmbedder.runTauID(), and pkg.AbstractPkg.write().

crabFunctions.CrabTask.failureReason

Definition at line 425 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.finalFiles

Definition at line 433 of file crabFunctions.py.

crabFunctions.CrabTask.isUpdating

Definition at line 411 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.jobs

Definition at line 408 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.resubmit_failed(), crabFunctions.CrabTask.update(), and crabFunctions.CrabTask.updateJobStats().

crabFunctions.CrabTask.lastUpdate

Definition at line 426 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.resubmit_failed(), and crabFunctions.CrabTask.update().

crabFunctions.CrabTask.localDir

Definition at line 409 of file crabFunctions.py.

crabFunctions.CrabTask.log

Definition at line 406 of file crabFunctions.py.

Referenced by conddbCopyTest.CopyTest.execute(), and conditionUploadTest.UploadTest.upload().

crabFunctions.CrabTask.maxjobnumber

Definition at line 416 of file crabFunctions.py.

crabFunctions.CrabTask.name

Definition at line 395 of file crabFunctions.py.

Referenced by ElectronMVAID.ElectronMVAID.__call__(), FWLite.ElectronMVAID.__call__(), dirstructure.Directory.__create_pie_image(), DisplayManager.DisplayManager.__del__(), dqm_interfaces.DirID.__eq__(), BeautifulSoup.Tag.__eq__(), dirstructure.Directory.__get_full_path(), dirstructure.Comparison.__get_img_name(), dataset.Dataset.__getDataType(), dataset.Dataset.__getFileInfoList(), dirstructure.Comparison.__make_image(), core.autovars.NTupleVariable.__repr__(), core.autovars.NTupleObjectType.__repr__(), core.autovars.NTupleObject.__repr__(), core.autovars.NTupleCollection.__repr__(), dirstructure.Directory.__repr__(), dqm_interfaces.DirID.__repr__(), dirstructure.Comparison.__repr__(), config.Service.__setattr__(), config.CFG.__str__(), counter.Counter.__str__(), average.Average.__str__(), BeautifulSoup.Tag.__str__(), BeautifulSoup.SoupStrainer.__str__(), FWLite.WorkingPoints._reformat_cut_definitions(), core.autovars.NTupleObjectType.addSubObjects(), core.autovars.NTupleObjectType.addVariables(), core.autovars.NTupleObjectType.allVars(), dirstructure.Directory.calcStats(), crabFunctions.CrabTask.crabConfig(), crabFunctions.CrabTask.crabFolder(), geometryComparison.GeometryComparison.createScript(), validation.Sample.digest(), python.rootplot.utilities.Hist.divide(), python.rootplot.utilities.Hist.divide_wilson(), DisplayManager.DisplayManager.Draw(), TreeCrawler.Package.dump(), core.autovars.NTupleVariable.fillBranch(), core.autovars.NTupleObject.fillBranches(), core.autovars.NTupleCollection.fillBranchesScalar(), core.autovars.NTupleCollection.fillBranchesVector(), core.autovars.NTupleCollection.get_cpp_declaration(), core.autovars.NTupleCollection.get_cpp_wrapper_class(), core.autovars.NTupleCollection.get_py_wrapper_class(), utils.StatisticalTest.get_status(), production_tasks.Task.getname(), dataset.CMSDataset.getPrimaryDatasetEntries(), dataset.PrivateDataset.getPrimaryDatasetEntries(), primaryVertexResolution.PrimaryVertexResolution.getRepMap(), 
primaryVertexValidation.PrimaryVertexValidation.getRepMap(), zMuMuValidation.ZMuMuValidation.getRepMap(), crabFunctions.CrabTask.handleNoState(), VIDSelectorBase.VIDSelectorBase.initialize(), personalPlayback.Applet.log(), core.autovars.NTupleVariable.makeBranch(), core.autovars.NTupleObject.makeBranches(), core.autovars.NTupleCollection.makeBranchesScalar(), core.autovars.NTupleCollection.makeBranchesVector(), dirstructure.Directory.print_report(), dataset.BaseDataset.printInfo(), dataset.Dataset.printInfo(), crabFunctions.CrabTask.resubmit_failed(), production_tasks.MonitorJobs.run(), BeautifulSoup.SoupStrainer.searchTag(), python.rootplot.utilities.Hist.TGraph(), python.rootplot.utilities.Hist.TH1F(), crabFunctions.CrabTask.update(), crabFunctions.CrabTask.updateJobStats(), counter.Counter.write(), and average.Average.write().

crabFunctions.CrabTask.nComplete

Definition at line 424 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.updateJobStats().

crabFunctions.CrabTask.nCooloff

Definition at line 421 of file crabFunctions.py.

crabFunctions.CrabTask.nFailed

Definition at line 422 of file crabFunctions.py.

crabFunctions.CrabTask.nFinished

Definition at line 423 of file crabFunctions.py.

crabFunctions.CrabTask.nIdle

Definition at line 418 of file crabFunctions.py.

crabFunctions.CrabTask.nJobs

Definition at line 414 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.nRunning

Definition at line 419 of file crabFunctions.py.

crabFunctions.CrabTask.nTransferring

Definition at line 420 of file crabFunctions.py.

crabFunctions.CrabTask.nUnsubmitted

Definition at line 417 of file crabFunctions.py.

crabFunctions.CrabTask.outlfn

Definition at line 410 of file crabFunctions.py.

crabFunctions.CrabTask.resubmitCount

Definition at line 429 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.handleNoState(), and crabFunctions.CrabTask.update().

crabFunctions.CrabTask.state

Definition at line 415 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.handleNoState(), and crabFunctions.CrabTask.update().

crabFunctions.CrabTask.taskId

Definition at line 412 of file crabFunctions.py.

crabFunctions.CrabTask.totalEvents

Definition at line 434 of file crabFunctions.py.

crabFunctions.CrabTask.uuid

Definition at line 403 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.test_print().