edm::DuplicateChecker Class Reference

#include <DuplicateChecker.h>

Public Member Functions

bool checkDisabled () const
bool checkingAllFiles () const
void disable ()
 DuplicateChecker (ParameterSet const &pset)
void inputFileClosed ()
void inputFileOpened (bool realData, IndexIntoFile const &indexIntoFile, std::vector< boost::shared_ptr< IndexIntoFile > > const &indexesIntoFiles, std::vector< boost::shared_ptr< IndexIntoFile > >::size_type currentIndexIntoFile)
bool isDuplicateAndCheckActive (int index, RunNumber_t run, LuminosityBlockNumber_t lumi, EventNumber_t event, std::string const &fileName)
bool noDuplicatesInFile () const

Static Public Member Functions

static void fillDescription (ParameterSetDescription &desc)

Private Types

enum  DataType { isRealData, isSimulation, unknown }
enum  DuplicateCheckMode { noDuplicateCheck, checkEachFile, checkEachRealDataFile, checkAllFilesOpened }

Private Attributes

DataType dataType_
bool disabled_
DuplicateCheckMode duplicateCheckMode_
bool itIsKnownTheFileHasNoDuplicates_
std::set< IndexIntoFile::IndexRunLumiEventKey > relevantPreviousEvents_

Detailed Description

Definition at line 33 of file DuplicateChecker.h.


Member Enumeration Documentation

enum edm::DuplicateChecker::DataType [private]

Enumerator:
isRealData 
isSimulation 
unknown 

Definition at line 72 of file DuplicateChecker.h.

enum edm::DuplicateChecker::DuplicateCheckMode [private]

Enumerator:
noDuplicateCheck 
checkEachFile 
checkEachRealDataFile 
checkAllFilesOpened 

Definition at line 68 of file DuplicateChecker.h.


Constructor & Destructor Documentation

edm::DuplicateChecker::DuplicateChecker ( ParameterSet const &  pset)

Definition at line 13 of file DuplicateChecker.cc.

References checkAllFilesOpened, checkEachFile, checkEachRealDataFile, duplicateCheckMode_, cms::Exception, edm::ParameterSet::getUntrackedParameter(), and noDuplicateCheck.

                                                             :
    dataType_(unknown),
    itIsKnownTheFileHasNoDuplicates_(false),
    disabled_(false)
  {
    // The default value provided as the second argument to the getUntrackedParameter function call
    // is not used when the ParameterSet has been validated and the parameters are not optional
    // in the description.  This is currently true when PoolSource is the primary input source.
    // The modules that use PoolSource as a SecSource have not defined their fillDescriptions function
    // yet, so the ParameterSet does not get validated yet.  As soon as all the modules with a SecSource
    // have defined descriptions, the defaults in the getUntrackedParameter function calls can
    // and should be deleted from the code.
    std::string duplicateCheckMode =
      pset.getUntrackedParameter<std::string>("duplicateCheckMode", std::string("checkAllFilesOpened"));

    if (duplicateCheckMode == std::string("noDuplicateCheck")) duplicateCheckMode_ = noDuplicateCheck;
    else if (duplicateCheckMode == std::string("checkEachFile")) duplicateCheckMode_ = checkEachFile;
    else if (duplicateCheckMode == std::string("checkEachRealDataFile")) duplicateCheckMode_ = checkEachRealDataFile;
    else if (duplicateCheckMode == std::string("checkAllFilesOpened")) duplicateCheckMode_ = checkAllFilesOpened;
    else {
      throw cms::Exception("Configuration")
        << "Illegal configuration parameter value passed to PoolSource for\n"
        << "the \"duplicateCheckMode\" parameter, legal values are:\n"
        << "\"noDuplicateCheck\", \"checkEachFile\", \"checkEachRealDataFile\", \"checkAllFilesOpened\"\n";
    }
  }
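
The string-to-mode mapping in the constructor can be tried outside the framework. The following is a minimal sketch, not the CMSSW implementation: CheckMode and parseCheckMode are hypothetical names, and std::invalid_argument stands in for cms::Exception so that the fragment compiles with only the standard library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the private enum DuplicateCheckMode.
enum class CheckMode {
  noDuplicateCheck,
  checkEachFile,
  checkEachRealDataFile,
  checkAllFilesOpened
};

// Maps a configuration string to a mode. Any other value is rejected,
// as in the constructor above (std::invalid_argument replaces cms::Exception here).
inline CheckMode parseCheckMode(std::string const& name) {
  if (name == "noDuplicateCheck") return CheckMode::noDuplicateCheck;
  if (name == "checkEachFile") return CheckMode::checkEachFile;
  if (name == "checkEachRealDataFile") return CheckMode::checkEachRealDataFile;
  if (name == "checkAllFilesOpened") return CheckMode::checkAllFilesOpened;
  throw std::invalid_argument("Illegal duplicateCheckMode value: " + name);
}

// Usage sketch: one legal value and one illegal value.
inline void parseCheckModeExample() {
  assert(parseCheckMode("checkEachFile") == CheckMode::checkEachFile);
  bool threw = false;
  try {
    parseCheckMode("checkSomeFiles");  // not one of the four legal strings
  } catch (std::invalid_argument const&) {
    threw = true;
  }
  assert(threw);
}
```

The comparison is case sensitive, as it is in the constructor, so a value such as "CheckEachFile" would also be rejected.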

Member Function Documentation

bool edm::DuplicateChecker::checkDisabled ( ) const

bool edm::DuplicateChecker::checkingAllFiles ( ) const [inline]

Definition at line 62 of file DuplicateChecker.h.

References checkAllFilesOpened, and duplicateCheckMode_.

void edm::DuplicateChecker::disable ( )

void edm::DuplicateChecker::fillDescription ( ParameterSetDescription &  desc) [static]

Definition at line 119 of file DuplicateChecker.cc.

References edm::ParameterSetDescription::addUntracked().

                                                                  {
    std::string defaultString("checkAllFilesOpened");
    desc.addUntracked<std::string>("duplicateCheckMode", defaultString)->setComment(
        "'checkAllFilesOpened':   check across all input files\n"
        "'checkEachFile':         check each input file independently\n"
        "'checkEachRealDataFile': check each real data input file independently\n"
        "'noDuplicateCheck':      no duplicate checking\n"
    );
  }

void edm::DuplicateChecker::inputFileClosed ( )

void edm::DuplicateChecker::inputFileOpened ( bool  realData,
IndexIntoFile const &  indexIntoFile,
std::vector< boost::shared_ptr< IndexIntoFile > > const &  indexesIntoFiles,
std::vector< boost::shared_ptr< IndexIntoFile > >::size_type  currentIndexIntoFile 
)

Definition at line 47 of file DuplicateChecker.cc.

References checkAllFilesOpened, checkDisabled(), edm::IndexIntoFile::containsDuplicateEvents(), dataType_, duplicateCheckMode_, isRealData, isSimulation, itIsKnownTheFileHasNoDuplicates_, relevantPreviousEvents_, and edm::IndexIntoFile::set_intersection().

                                                                                {

    dataType_ = realData ? isRealData : isSimulation;
    if (checkDisabled()) return;

    relevantPreviousEvents_.clear();
    itIsKnownTheFileHasNoDuplicates_ = false;

    if (duplicateCheckMode_ == checkAllFilesOpened) {

      // Compares the current IndexIntoFile to all the previous ones and saves any duplicates.
      // One unintended thing, it also saves the duplicate runs and lumis.
      for(std::vector<boost::shared_ptr<IndexIntoFile> >::size_type i = 0; i < currentIndexIntoFile; ++i) {
        if (indexesIntoFiles[i].get() != 0) {

          indexIntoFile.set_intersection(*indexesIntoFiles[i], relevantPreviousEvents_);
        }
      }
    }
    if (relevantPreviousEvents_.empty()) {
      if(!indexIntoFile.containsDuplicateEvents()) {
        itIsKnownTheFileHasNoDuplicates_ = true;
      }
    }
  }
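
The overlap computation in inputFileOpened() can be illustrated with standard containers. This is only a sketch under stated assumptions: EventKey, addOverlap, hasRepeatedKey, and fileKnownDuplicateFree are hypothetical names, each file is modeled as a sorted vector of (run, lumi, event) keys, and std::set_intersection stands in for edm::IndexIntoFile::set_intersection(), whose internals are not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical key: (run, lumi, event). The real IndexRunLumiEventKey also carries an index.
using EventKey = std::tuple<unsigned, unsigned, unsigned long long>;

// Adds to `overlap` every key present in both sorted key lists.
inline void addOverlap(std::vector<EventKey> const& current,
                       std::vector<EventKey> const& previous,
                       std::set<EventKey>& overlap) {
  std::set_intersection(current.begin(), current.end(),
                        previous.begin(), previous.end(),
                        std::inserter(overlap, overlap.end()));
}

// True when a sorted key list contains the same key more than once.
inline bool hasRepeatedKey(std::vector<EventKey> const& sortedKeys) {
  return std::adjacent_find(sortedKeys.begin(), sortedKeys.end()) != sortedKeys.end();
}

// Mirrors the final decision above: the file is known to be duplicate free only if
// it shares no key with earlier files and contains no repeated key itself.
inline bool fileKnownDuplicateFree(std::vector<EventKey> const& current,
                                   std::vector<std::vector<EventKey>> const& previousFiles) {
  std::set<EventKey> overlap;
  for (auto const& previous : previousFiles) {
    addOverlap(current, previous, overlap);
  }
  return overlap.empty() && !hasRepeatedKey(current);
}

// Usage sketch with three small files.
inline void fileKnownDuplicateFreeExample() {
  std::vector<EventKey> a{EventKey(1u, 1u, 1ull), EventKey(1u, 1u, 2ull)};
  std::vector<EventKey> b{EventKey(1u, 1u, 2ull), EventKey(1u, 1u, 3ull)};
  std::vector<EventKey> c{EventKey(2u, 1u, 1ull)};
  std::vector<std::vector<EventKey>> sharesKey{b};
  std::vector<std::vector<EventKey>> disjoint{c};
  assert(!fileKnownDuplicateFree(a, sharesKey));  // (1, 1, 2) is in both a and b
  assert(fileKnownDuplicateFree(a, disjoint));    // no shared key, no repeat in a
}
```

When the result is true, the per-event check can be skipped for the whole file, which is the purpose of itIsKnownTheFileHasNoDuplicates_ in the code above.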
bool edm::DuplicateChecker::isDuplicateAndCheckActive ( int  index,
RunNumber_t  run,
LuminosityBlockNumber_t  lumi,
EventNumber_t  event,
std::string const &  fileName 
)

Definition at line 84 of file DuplicateChecker.cc.

References checkAllFilesOpened, checkDisabled(), duplicateCheckMode_, itIsKnownTheFileHasNoDuplicates_, and relevantPreviousEvents_.

                                                                              {
    if (itIsKnownTheFileHasNoDuplicates_) return false;
    if (checkDisabled()) return false;

    IndexIntoFile::IndexRunLumiEventKey newEvent(index, run, lumi, event);
    bool duplicate = !relevantPreviousEvents_.insert(newEvent).second;

    if (duplicate) {
      if (duplicateCheckMode_ == checkAllFilesOpened) {
        LogWarning("DuplicateEvent")
          << "Duplicate Events found in entire set of input files.\n"
          << "Both events were from run " << run 
          << " and luminosity block " << lumi
          << " with event number " << event << ".\n"
          << "The duplicate was from file " << fileName << ".\n"
          << "The duplicate will be skipped.\n";
      }
      else {
        LogWarning("DuplicateEvent")
          << "Duplicate Events found in file " << fileName << ".\n"
          << "Both events were from run " << run
          << " and luminosity block " << lumi
          << " with event number " << event << ".\n"
          << "The duplicate will be skipped.\n";
      }
      return true;
    }
    return false;
  }
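
The duplicate test in isDuplicateAndCheckActive() relies on std::set::insert returning a pair whose second member is false when the key is already present. A minimal sketch of that idiom follows; SeenKey and isDuplicate are hypothetical names, and the key is simplified to (run, lumi, event).

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Hypothetical key: (run, lumi, event).
using SeenKey = std::tuple<unsigned, unsigned, unsigned long long>;

// Returns true if `key` was already in `seen`; otherwise records it and returns false.
inline bool isDuplicate(std::set<SeenKey>& seen, SeenKey const& key) {
  return !seen.insert(key).second;
}

// Usage sketch: the second occurrence of the same key is reported as a duplicate.
inline void isDuplicateExample() {
  std::set<SeenKey> seen;
  assert(!isDuplicate(seen, SeenKey(1u, 5u, 100ull)));  // first occurrence
  assert(isDuplicate(seen, SeenKey(1u, 5u, 100ull)));   // same run, lumi, event
  assert(!isDuplicate(seen, SeenKey(1u, 5u, 101ull)));  // different event number
}
```

In checkAllFilesOpened mode, inputFileOpened() has already seeded relevantPreviousEvents_ with the keys shared with earlier files, so even the first occurrence of such an event in the current file fails the insert and is reported as a duplicate.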
bool edm::DuplicateChecker::noDuplicatesInFile ( ) const [inline]

Definition at line 48 of file DuplicateChecker.h.

References itIsKnownTheFileHasNoDuplicates_.


Member Data Documentation

DataType edm::DuplicateChecker::dataType_ [private]

Definition at line 74 of file DuplicateChecker.h.

Referenced by checkDisabled(), disable(), inputFileClosed(), and inputFileOpened().

bool edm::DuplicateChecker::disabled_ [private]

Definition at line 85 of file DuplicateChecker.h.

Referenced by checkDisabled(), and disable().