
crabFunctions.CrabTask Class Reference

Class for a single CrabRequest. This class represents one crab3 task/request.

Public Member Functions

def __init__ (self, taskname="", crab_config="", crabController=None, initUpdate=True, debuglevel="ERROR", datasetpath="", localDir="", outlfn="")
 The object constructor.
 
def crab_folder (self)
 
def crabConfig (self)
 Function to access the crab config object, or to read it if uninitialized.
 
def crabFolder (self)
 
def datasetpath (self)
 
def handleNoState (self)
 Function to handle a task that received NOSTATE status.
 
def isData (self)
 Property function to find out if task runs on data.
 
def readLogArch (self, logArchName)
 Function to read log info from log.tar.gz.
 
def resubmit_failed (self)
 Function to resubmit failed jobs in tasks.
 
def test_print (self)
 
def update (self)
 Function to update the task and its associated jobs.
 
def updateJobStats (self, dCacheFileList=None)
 Function to update JobStatistics.
 

Public Attributes

 debug
 
 failureReason
 
 finalFiles
 
 isUpdating
 
 jobs
 
 lastUpdate
 
 localDir
 
 log
 
 maxjobnumber
 
 name
 
 nComplete
 
 nCooloff
 
 nFailed
 
 nFinished
 
 nIdle
 
 nJobs
 
 nRunning
 
 nTransferring
 
 nUnsubmitted
 
 outlfn
 
 resubmitCount
 
 state
 
 taskId
 
 totalEvents
 
 uuid
 

Private Attributes

 _crabConfig
 
 _crabFolder
 
 _datasetpath_default
 
 _isData
 

Detailed Description

Class for a single CrabRequest. This class represents one crab3 task/request.

Definition at line 371 of file crabFunctions.py.

Constructor & Destructor Documentation

def crabFunctions.CrabTask.__init__ (   self,
  taskname = "",
  crab_config = "",
  crabController = None,
  initUpdate = True,
  debuglevel = "ERROR",
  datasetpath = "",
  localDir = "",
  outlfn = "" 
)

The object constructor.

Parameters
    self        The object pointer.
    taskname    Name of the crab task. If empty, crab_config must be given and the name is read from it.
    initUpdate  Flag indicating whether crab status should be called when an instance is created.

Definition at line 386 of file crabFunctions.py.

                 outlfn = "" ,):

        # the crab config as a python object should only be used via .crabConfig
        self._crabConfig = None

        self._crabFolder = None

        if taskname:
            self.name = taskname
        else:
            if not crab_config:
                raise ValueError("Either taskname or crab_config needs to be set")
            if not os.path.exists( crab_config):
                raise IOError("File %s not found" % crab_config )
            # self.name is referenced by the crabConfig property,
            # so it holds the config path until the request name is read
            self.name = crab_config
            self.name = self.crabConfig.General.requestName
        self.uuid = uuid.uuid4()
        #~ self.lock = multiprocessing.Lock()
        # set up logging
        self.log = logging.getLogger( 'crabTask' )
        # setLevel accepts level names such as "ERROR" directly,
        # which avoids the private logging._levelNames table
        self.log.setLevel( debuglevel )
        self.jobs = {}
        self.localDir = localDir
        self.outlfn = outlfn
        self.isUpdating = False
        self.taskId = -1
        # variables for statistics
        self.nJobs = 0
        self.state = "NOSTATE"
        self.maxjobnumber = 0
        self.nUnsubmitted = 0
        self.nIdle = 0
        self.nRunning = 0
        self.nTransferring = 0
        self.nCooloff = 0
        self.nFailed = 0
        self.nFinished = 0
        self.nComplete = 0
        self.failureReason = None
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )

        self._isData = None
        self.resubmitCount = 0

        self.debug = False

        self.finalFiles = []
        self.totalEvents = 0


        self._datasetpath_default = datasetpath

        # start with first updates
        if initUpdate:
            self.update()
            self.updateJobStats()

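The constructor stores lastUpdate as a formatted string, not as a datetime object. A minimal sketch of writing and reading that format, assuming a caller wants to compare timestamps; the helper names format_timestamp and parse_timestamp are hypothetical and not part of crabFunctions.py:

```python
import datetime

# Format string copied from the constructor listing.
TIMESTAMP_FORMAT = "%Y-%m-%d_%H.%M.%S"


def format_timestamp(moment):
    """Render a datetime the way lastUpdate is stored."""
    return moment.strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text):
    """Turn a stored lastUpdate string back into a datetime (hypothetical helper)."""
    return datetime.datetime.strptime(text, TIMESTAMP_FORMAT)


# Fixed example date so the round trip is reproducible.
stamp = format_timestamp(datetime.datetime(2016, 3, 1, 14, 5, 9))
restored = parse_timestamp(stamp)
```

Because the fields run from year down to second with fixed widths, the stored strings also sort chronologically as plain text.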

Member Function Documentation

def crabFunctions.CrabTask.crab_folder (   self)

Definition at line 506 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

    def crab_folder(self):
        return os.path.join( self.crabConfig.General.workArea,
                             "crab_" + self.crabConfig.General.requestName)

def crabFunctions.CrabTask.crabConfig (   self)
def crabFunctions.CrabTask.crabFolder (   self)

Definition at line 479 of file crabFunctions.py.

References crabFunctions.CrabTask._crabFolder, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, validateAlignments.ParallelMergeJob.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, classes.MonitorData.name, MuonGeometrySanityCheckPoint.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, and hTMaxCell.name.

    def crabFolder( self ):
        # return the cached folder if it was already located
        if self._crabFolder is not None: return self._crabFolder
        crab = CrabController()
        # first choice: the work area named in the crab config
        if os.path.exists( os.path.join( self.crabConfig.General.workArea, crab._prepareFoldername( self.name ) ) ):
            self._crabFolder = os.path.join( self.crabConfig.General.workArea, crab._prepareFoldername( self.name ) )
            return self._crabFolder
        # second choice: the current working directory (os.getcwd; os.path has no cwd function)
        alternative_path = os.path.join(os.getcwd(), crab._prepareFoldername( self.name ) )
        if os.path.exists( alternative_path ):
            self._crabFolder = alternative_path
            return self._crabFolder
        self.log.error( "Unable to find folder for Task")
        return ""

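The lookup order in crabFolder (work area first, then the current directory, else an empty string) can be sketched on its own. This is an illustration under assumptions, not the class's code: find_task_folder is a hypothetical helper, and "crab_myTask" is a made-up folder name standing in for whatever CrabController._prepareFoldername returns.

```python
import os
import tempfile


def find_task_folder(folder_name, work_area, fallback_dir=None):
    """Return the first existing candidate folder, or "" if none exists."""
    if fallback_dir is None:
        fallback_dir = os.getcwd()
    for base in (work_area, fallback_dir):
        candidate = os.path.join(base, folder_name)
        if os.path.exists(candidate):
            return candidate
    return ""


# Demonstration in a throwaway directory: the work area does not exist,
# so the lookup falls through to the fallback directory.
with tempfile.TemporaryDirectory() as tmp:
    work_area = os.path.join(tmp, "workArea")
    os.makedirs(os.path.join(tmp, "crab_myTask"))
    found = find_task_folder("crab_myTask", work_area, fallback_dir=tmp)
    missing = find_task_folder("crab_other", work_area, fallback_dir=tmp)
```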
def crabFunctions.CrabTask.datasetpath (   self)

Definition at line 471 of file crabFunctions.py.

References crabFunctions.CrabTask._datasetpath_default.

    def datasetpath( self ):
        try:
            return self.crabConfig.Data.inputDataset
        except:
            pass
        return self._datasetpath_default

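The fallback in datasetpath can also be written without a bare except. A minimal sketch, assuming the config object simply lacks the attribute when no input dataset is set; dataset_path is a hypothetical helper, and SimpleNamespace stands in for the real crab config object:

```python
from types import SimpleNamespace


def dataset_path(config, default=""):
    """Return config.Data.inputDataset if present, else the default."""
    data_section = getattr(config, "Data", None)
    return getattr(data_section, "inputDataset", default)


# Stand-in config objects; the dataset string is a placeholder.
with_dataset = SimpleNamespace(Data=SimpleNamespace(inputDataset="/Primary/Processed/TIER"))
without_dataset = SimpleNamespace(Data=SimpleNamespace())

chosen = dataset_path(with_dataset, "/fallback")
fallback = dataset_path(without_dataset, "/fallback")
```

Unlike the bare except in the listing, this variant does not hide errors raised while the config itself is being read.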
def crabFunctions.CrabTask.handleNoState (   self)

Function to handle a task that received NOSTATE status.

Parameters
    self  (CrabTask) The object pointer.

Definition at line 541 of file crabFunctions.py.

References AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, validateAlignments.ParallelMergeJob.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, classes.MonitorData.name, MuonGeometrySanityCheckPoint.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, crabFunctions.CrabTask.resubmitCount, CastorLedAnalysis.state, HcalLedAnalysis.state, CastorPedestalAnalysis.state, HcalPedestalAnalysis.state, and crabFunctions.CrabTask.state.

    def handleNoState( self ):
        crab = CrabController()
        # failureReason may still be None, so guard before the substring test
        if self.failureReason is not None and "The CRAB3 server backend could not resubmit your task because the Grid scheduler answered with an error." in self.failureReason:
            # move folder and try it again
            cmd = 'mv %s bak_%s' %(crab._prepareFoldername( self.name ),crab._prepareFoldername( self.name ))
            p = subprocess.Popen(cmd,stdout=subprocess.PIPE, shell=True)#,shell=True,universal_newlines=True)
            (out,err) = p.communicate()
            self.state = "SHEDERR"
            configName = '%s_cfg.py' %(crab._prepareFoldername( self.name ))
            crab.submit( configName )

        elif self.failureReason is not None:
            self.state = "ERRHANDLE"
            crab.resubmit( self.name )
            self.resubmitCount += 1

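handleNoState renames the task folder by running mv in a shell. The same rename can be done from Python, which avoids quoting problems in folder names. A minimal sketch, not what the class does; back_up_folder is a hypothetical helper and "crab_myTask" a made-up folder name:

```python
import os
import shutil
import tempfile


def back_up_folder(folder):
    """Rename <dir>/<name> to <dir>/bak_<name> and return the new path."""
    parent, name = os.path.split(os.path.abspath(folder))
    target = os.path.join(parent, "bak_" + name)
    shutil.move(folder, target)
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "crab_myTask")
    os.makedirs(source)
    moved = back_up_folder(source)
    moved_exists = os.path.isdir(moved)
    source_gone = not os.path.exists(source)
```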
def crabFunctions.CrabTask.isData (   self)

Property function to find out if task runs on data.

Parameters
    self  (CrabTask) The object pointer.

Definition at line 447 of file crabFunctions.py.

References crabFunctions.CrabTask._isData.

    def isData( self ):
        if self._isData is None:
            try:
                # a lumi mask is only set for tasks that run on data
                test = self.crabConfig.Data.lumiMask
                self._isData = True
            except:
                if self.name.startswith( "Data_" ):
                    self._isData = True
                else:
                    self._isData = False
        return self._isData

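The decision in isData combines two signals: a lumi mask in the config, or a task name starting with "Data_". A minimal standalone sketch of that rule, assuming a missing lumi mask shows up as a missing attribute; runs_on_data is a hypothetical helper, and SimpleNamespace and the task names stand in for real objects:

```python
from types import SimpleNamespace


def runs_on_data(config, task_name):
    """True if the config has a lumi mask or the name starts with 'Data_'."""
    data_section = getattr(config, "Data", None)
    if hasattr(data_section, "lumiMask"):
        return True
    return task_name.startswith("Data_")


# Placeholder configs and names for illustration only.
masked = SimpleNamespace(Data=SimpleNamespace(lumiMask="lumis.json"))
unmasked = SimpleNamespace(Data=SimpleNamespace())

by_mask = runs_on_data(masked, "SomeTask")
by_name = runs_on_data(unmasked, "Data_SomeTask")
neither = runs_on_data(unmasked, "MC_SomeTask")
```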
def crabFunctions.CrabTask.readLogArch (   self,
  logArchName 
)

Function to read log info from log.tar.gz.

Parameters
    self         The object pointer.
    logArchName  Path to the compressed log file.

Returns
    A dictionary with parsed info.

Definition at line 598 of file crabFunctions.py.

References createfilelist.int, and split.

    def readLogArch(self, logArchName):
        # the job number sits between "_" and the first "." of the file name
        JobNumber = logArchName.split("/")[-1].split("_")[1].split(".")[0]
        log = {'readEvents' : 0}
        with tarfile.open( logArchName, "r") as tar:
            try:
                JobXmlFile = tar.extractfile('FrameworkJobReport-%s.xml' % JobNumber)
                root = ET.fromstring( JobXmlFile.read() )
                # only the first InputFile element is inspected
                for child in root:
                    if child.tag == 'InputFile':
                        for subchild in child:
                            if subchild.tag == 'EventsRead':
                                nEvents = int(subchild.text)
                                log.update({'readEvents' : nEvents})
                                break
                        break
            except:
                print("Can not parse / read %s" % logArchName)
        return log

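The two steps of readLogArch, deriving the job number from the archive name and reading EventsRead from the job report, can be tried without a real log archive. A minimal sketch under assumptions: the file name "cmsRun_12.log.tar.gz", the root tag, and the report content are made up for illustration, and build_archive exists only to create a test fixture in memory.

```python
import io
import tarfile
import xml.etree.ElementTree as ET


def job_number_from_name(archive_name):
    """Extract N from a name shaped like <prefix>_<N>.<extension>."""
    return archive_name.split("/")[-1].split("_")[1].split(".")[0]


def events_read(report_xml):
    """Return EventsRead of the first InputFile element, or 0 if absent."""
    root = ET.fromstring(report_xml)
    for child in root:
        if child.tag == "InputFile":
            for subchild in child:
                if subchild.tag == "EventsRead":
                    return int(subchild.text)
            break
    return 0


def build_archive(job_number, report_xml):
    """Build an in-memory tar.gz holding one job report (fixture only)."""
    payload = report_xml.encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo("FrameworkJobReport-%s.xml" % job_number)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


REPORT = ("<FrameworkJobReport><InputFile>"
          "<EventsRead>250</EventsRead>"
          "</InputFile></FrameworkJobReport>")

number = job_number_from_name("/some/dir/cmsRun_12.log.tar.gz")
with tarfile.open(fileobj=build_archive(number, REPORT), mode="r") as tar:
    member = tar.extractfile("FrameworkJobReport-%s.xml" % number)
    n_events = events_read(member.read())
```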
def crabFunctions.CrabTask.resubmit_failed (   self)

Function to resubmit failed jobs in tasks.

Parameters
    self  (CrabTask) The object pointer.

Definition at line 495 of file crabFunctions.py.

References crabFunctions.CrabTask.jobs, crabFunctions.CrabTask.lastUpdate, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, validateAlignments.ParallelMergeJob.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, classes.MonitorData.name, MuonGeometrySanityCheckPoint.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, and hTMaxCell.name.

    def resubmit_failed( self ):
        failedJobIds = []
        controller = CrabController()
        for jobkey in self.jobs.keys():
            job = self.jobs[jobkey]
            if job['State'] == 'failed':
                # resubmit the most recent id recorded for this job
                failedJobIds.append( job['JobIds'][-1] )
        controller.resubmit( self.name, joblist = failedJobIds )
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )

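The selection step in resubmit_failed depends only on the shape of the jobs dictionary: each entry carries a 'State' string and a 'JobIds' list. A minimal sketch of that step alone; failed_job_ids is a hypothetical helper and the job entries below are invented sample data:

```python
def failed_job_ids(jobs):
    """Return the latest job id of every job whose state is 'failed'."""
    return [job["JobIds"][-1] for job in jobs.values() if job["State"] == "failed"]


# Invented sample data in the shape the listing expects.
jobs = {
    "1": {"State": "finished", "JobIds": ["101"]},
    "2": {"State": "failed", "JobIds": ["102", "205"]},
    "3": {"State": "failed", "JobIds": ["103"]},
}

ids = sorted(failed_job_ids(jobs))
```

A caller could check whether the list is empty before asking the controller to resubmit; the listing itself does not make that check.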
def crabFunctions.CrabTask.test_print (   self)

Definition at line 557 of file crabFunctions.py.

References crabFunctions.CrabTask.uuid.

    def test_print(self):
        return self.uuid

def crabFunctions.CrabTask.update (   self)

Function to update the task and its associated jobs.

Parameters
    self  (CrabTask) The object pointer.

Definition at line 512 of file crabFunctions.py.

References crabFunctions.CrabTask.crab_folder(), crabFunctions.CrabTask.failureReason, crabFunctions.CrabTask.isUpdating, crabFunctions.CrabTask.jobs, crabFunctions.CrabTask.lastUpdate, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, validateAlignments.ParallelMergeJob.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, classes.MonitorData.name, MuonGeometrySanityCheckPoint.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, Mpslibclass.jobdatabase.nJobs, crabFunctions.CrabTask.nJobs, crabFunctions.CrabTask.resubmitCount, CastorLedAnalysis.state, HcalLedAnalysis.state, CastorPedestalAnalysis.state, HcalPedestalAnalysis.state, crabFunctions.CrabTask.state, and crabFunctions.CrabTask.updateJobStats().

Referenced by progressbar.ProgressBar.__next__(), MatrixUtil.Matrix.__setitem__(), MatrixUtil.Steps.__setitem__(), Vispa.Gui.VispaWidget.VispaWidget.autosize(), Vispa.Views.LineDecayView.LineDecayContainer.createObject(), Vispa.Views.LineDecayView.LineDecayContainer.deselectAllObjects(), Vispa.Gui.VispaWidgetOwner.VispaWidgetOwner.deselectAllWidgets(), Vispa.Gui.VispaWidget.VispaWidget.enableAutosizing(), dqm-mbProfile.Profile.finish(), progressbar.ProgressBar.finish(), Vispa.Gui.MenuWidget.MenuWidget.leaveEvent(), Vispa.Gui.VispaWidgetOwner.VispaWidgetOwner.mouseMoveEvent(), Vispa.Gui.MenuWidget.MenuWidget.mouseMoveEvent(), Vispa.Views.LineDecayView.LineDecayContainer.mouseMoveEvent(), Vispa.Gui.VispaWidgetOwner.VispaWidgetOwner.mouseReleaseEvent(), Vispa.Views.LineDecayView.LineDecayContainer.objectMoved(), MatrixUtil.Steps.overwrite(), Vispa.Views.LineDecayView.LineDecayContainer.removeObject(), Vispa.Gui.ConnectableWidget.ConnectableWidget.removePorts(), Vispa.Gui.FindDialog.FindDialog.reset(), Vispa.Gui.PortConnection.PointToPointConnection.select(), Vispa.Gui.VispaWidget.VispaWidget.select(), Vispa.Views.LineDecayView.LineDecayContainer.select(), Vispa.Gui.VispaWidget.VispaWidget.setText(), Vispa.Gui.VispaWidget.VispaWidget.setTitle(), Vispa.Gui.ZoomableWidget.ZoomableWidget.setZoom(), Vispa.Views.LineDecayView.LineDecayContainer.setZoom(), and Vispa.Gui.PortConnection.PointToPointConnection.updateConnection().

    def update(self):
        #~ self.lock.acquire()
        self.log.debug( "Start update for task %s" % self.name )
        self.isUpdating = True
        controller = CrabController()
        self.state = "UPDATING"
        # check if we should drop this sample due to missing info

        self.log.debug( "Try to get status for task" )
        self.state , self.jobs,self.failureReason = controller.status(self.crab_folder)
        self.log.debug( "Found state: %s" % self.state )
        if self.state=="FAILED":
            # try it once more
            time.sleep(2)
            self.state , self.jobs,self.failureReason = controller.status(self.crab_folder)
        self.nJobs = len(self.jobs)
        self.updateJobStats()
        if self.state == "NOSTATE":
            self.log.debug( "Trying to resubmit because of NOSTATE" )
            # give up after three resubmission attempts
            if self.resubmitCount < 3: self.handleNoState()
        # add to db if not
        # Final solution if state not yet found
        self.isUpdating = False
        self.lastUpdate = datetime.datetime.now().strftime( "%Y-%m-%d_%H.%M.%S" )
        #~ self.lock.release()

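update queries the status a second time, after a two-second pause, when the first answer is "FAILED". A minimal sketch of that retry-once pattern with the status call and the sleep passed in, so it can be exercised without a CRAB server; status_with_retry is a hypothetical helper and the canned answers (including the "SUBMITTED" state) are placeholders:

```python
import time


def status_with_retry(get_status, retry_state="FAILED", delay=2.0, sleep=time.sleep):
    """Call get_status(); if the state equals retry_state, wait and call once more."""
    result = get_status()
    if result[0] == retry_state:
        sleep(delay)
        result = get_status()
    return result


# Fake status source: first answer is FAILED, second one is not.
answers = iter([
    ("FAILED", {}, None),
    ("SUBMITTED", {"1": {"State": "running"}}, None),
])
naps = []  # records requested delays instead of really sleeping

state, jobs, reason = status_with_retry(lambda: next(answers), sleep=naps.append)
```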
def crabFunctions.CrabTask.updateJobStats (   self,
  dCacheFileList = None 
)

Function to update JobStatistics.

Parameters
    self            The object pointer.
    dCacheFileList  A list of files on the dCache.

Definition at line 563 of file crabFunctions.py.

References any(), createfilelist.int, crabFunctions.CrabTask.jobs, AlignableObjectId::entry.name, preexistingValidation.PreexistingValidation.name, alignment.Alignment.name, validateAlignments.ParallelMergeJob.name, XMLProcessor::_loaderBaseConfig.name, genericValidation.GenericValidation.name, h4DSegm.name, TrackerSectorStruct.name, classes.MonitorData.name, MuonGeometrySanityCheckPoint.name, classes.OutputData.name, h2DSegm.name, geometry.Structure.name, plotscripts.SawTeethFunction.name, crabFunctions.CrabTask.name, hTMaxCell.name, and crabFunctions.CrabTask.nComplete.

Referenced by crabFunctions.CrabTask.update().

    def updateJobStats(self,dCacheFileList = None):
        jobKeys = sorted(self.jobs.keys())
        try:
            intJobkeys = [int(x) for x in jobKeys]
        except:
            print("error parsing job numbers to int")

        #maxjobnumber = max(intJobkeys)

        stateDict = {'unsubmitted':0,'idle':0,'running':0,'transferring':0,'cooloff':0,'failed':0,'finished':0}
        nComplete = 0

        # loop through jobs
        for key in jobKeys:
            job = self.jobs[key]
            # check if all completed files are on dCache
            for statekey in stateDict.keys():
                if statekey in job['State']:
                    stateDict[statekey]+=1
                    # check if finished files are found on dCache if dCacheFileList is given
                    if dCacheFileList is not None:
                        outputFilename = "%s_%s"%( self.name, key)
                        if 'finished' in statekey and any(outputFilename in s for s in dCacheFileList):
                            nComplete +=1

        # store each counter as nUnsubmitted, nIdle, nRunning, ...
        for state in stateDict:
            attrname = "n" + state.capitalize()
            setattr(self, attrname, stateDict[state])
        self.nComplete = nComplete

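updateJobStats counts jobs per state and then stores each count under an attribute named "n" plus the capitalized state, which is how nRunning, nFailed and the other counters get their values. A minimal sketch of that counting and naming scheme on a plain object; Stats, count_states and store_counts are hypothetical names and the jobs are invented sample data:

```python
STATES = ("unsubmitted", "idle", "running", "transferring",
          "cooloff", "failed", "finished")


class Stats(object):
    """Empty holder for the counters (stands in for a CrabTask)."""


def count_states(jobs):
    """Count jobs per state using the same substring test as the listing."""
    counts = dict.fromkeys(STATES, 0)
    for job in jobs.values():
        for state in STATES:
            if state in job["State"]:
                counts[state] += 1
    return counts


def store_counts(target, counts):
    """Store each count as n<State>, for example nRunning."""
    for state, count in counts.items():
        setattr(target, "n" + state.capitalize(), count)


# Invented sample data.
jobs = {
    "1": {"State": "running"},
    "2": {"State": "running"},
    "3": {"State": "failed"},
}

stats = Stats()
store_counts(stats, count_states(jobs))
```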

Member Data Documentation

crabFunctions.CrabTask._crabConfig
private

Definition at line 389 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.crabConfig().

crabFunctions.CrabTask._crabFolder
private

Definition at line 391 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.crabFolder().

crabFunctions.CrabTask._datasetpath_default
private

Definition at line 436 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.datasetpath().

crabFunctions.CrabTask._isData
private

Definition at line 427 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.isData().

crabFunctions.CrabTask.debug
crabFunctions.CrabTask.failureReason

Definition at line 424 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.finalFiles

Definition at line 432 of file crabFunctions.py.

crabFunctions.CrabTask.isUpdating

Definition at line 410 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.jobs
crabFunctions.CrabTask.lastUpdate
crabFunctions.CrabTask.localDir

Definition at line 408 of file crabFunctions.py.

crabFunctions.CrabTask.log
crabFunctions.CrabTask.maxjobnumber

Definition at line 415 of file crabFunctions.py.

crabFunctions.CrabTask.name

Definition at line 394 of file crabFunctions.py.

Referenced by ElectronMVAID.ElectronMVAID.__call__(), dirstructure.Directory.__create_pie_image(), DisplayManager.DisplayManager.__del__(), dqm_interfaces.DirID.__eq__(), BeautifulSoup.Tag.__eq__(), dirstructure.Directory.__get_full_path(), dirstructure.Comparison.__get_img_name(), dataset.Dataset.__getDataType(), dataset.Dataset.__getFileInfoList(), dirstructure.Comparison.__make_image(), core.autovars.NTupleVariable.__repr__(), core.autovars.NTupleObjectType.__repr__(), core.autovars.NTupleObject.__repr__(), core.autovars.NTupleCollection.__repr__(), dirstructure.Directory.__repr__(), dqm_interfaces.DirID.__repr__(), dirstructure.Comparison.__repr__(), config.Service.__setattr__(), config.CFG.__str__(), counter.Counter.__str__(), average.Average.__str__(), BeautifulSoup.Tag.__str__(), BeautifulSoup.SoupStrainer.__str__(), core.autovars.NTupleObjectType.addSubObjects(), core.autovars.NTupleObjectType.addVariables(), core.autovars.NTupleObjectType.allVars(), dirstructure.Directory.calcStats(), crabFunctions.CrabTask.crabConfig(), crabFunctions.CrabTask.crabFolder(), validation.Sample.digest(), python.rootplot.utilities.Hist.divide(), python.rootplot.utilities.Hist.divide_wilson(), DisplayManager.DisplayManager.Draw(), TreeCrawler.Package.dump(), core.autovars.NTupleVariable.fillBranch(), core.autovars.NTupleObject.fillBranches(), core.autovars.NTupleCollection.fillBranchesScalar(), core.autovars.NTupleCollection.fillBranchesVector(), core.autovars.NTupleCollection.get_cpp_declaration(), core.autovars.NTupleCollection.get_cpp_wrapper_class(), core.autovars.NTupleCollection.get_py_wrapper_class(), utils.StatisticalTest.get_status(), production_tasks.Task.getname(), dataset.CMSDataset.getPrimaryDatasetEntries(), dataset.PrivateDataset.getPrimaryDatasetEntries(), crabFunctions.CrabTask.handleNoState(), VIDSelectorBase.VIDSelectorBase.initialize(), personalPlayback.Applet.log(), core.autovars.NTupleVariable.makeBranch(), core.autovars.NTupleObject.makeBranches(), 
core.autovars.NTupleCollection.makeBranchesScalar(), core.autovars.NTupleCollection.makeBranchesVector(), dirstructure.Directory.print_report(), dataset.BaseDataset.printInfo(), dataset.Dataset.printInfo(), crabFunctions.CrabTask.resubmit_failed(), production_tasks.MonitorJobs.run(), BeautifulSoup.SoupStrainer.searchTag(), python.rootplot.utilities.Hist.TGraph(), python.rootplot.utilities.Hist.TH1F(), crabFunctions.CrabTask.update(), crabFunctions.CrabTask.updateJobStats(), Vispa.Views.PropertyView.Property.valueChanged(), counter.Counter.write(), and average.Average.write().

crabFunctions.CrabTask.nComplete

Definition at line 423 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.updateJobStats().

crabFunctions.CrabTask.nCooloff

Definition at line 420 of file crabFunctions.py.

crabFunctions.CrabTask.nFailed

Definition at line 421 of file crabFunctions.py.

crabFunctions.CrabTask.nFinished

Definition at line 422 of file crabFunctions.py.

crabFunctions.CrabTask.nIdle

Definition at line 417 of file crabFunctions.py.

crabFunctions.CrabTask.nJobs

Definition at line 413 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.update().

crabFunctions.CrabTask.nRunning

Definition at line 418 of file crabFunctions.py.

crabFunctions.CrabTask.nTransferring

Definition at line 419 of file crabFunctions.py.

crabFunctions.CrabTask.nUnsubmitted

Definition at line 416 of file crabFunctions.py.

crabFunctions.CrabTask.outlfn

Definition at line 409 of file crabFunctions.py.

crabFunctions.CrabTask.resubmitCount
crabFunctions.CrabTask.state
crabFunctions.CrabTask.taskId

Definition at line 411 of file crabFunctions.py.

crabFunctions.CrabTask.totalEvents

Definition at line 433 of file crabFunctions.py.

crabFunctions.CrabTask.uuid

Definition at line 402 of file crabFunctions.py.

Referenced by crabFunctions.CrabTask.test_print().