SimpleSAXParser Class Reference

#include <SimpleSAXParser.h>

Inheritance diagram for SimpleSAXParser:
FWXMLConfigParser

Classes

struct  Attribute
 
class  ParserError
 

Public Types

typedef std::vector< Attribute > Attributes
 
enum  PARSER_STATES {
  IN_DOCUMENT, IN_BEGIN_TAG, IN_DONE, IN_BEGIN_ELEMENT,
  IN_ELEMENT_WHITESPACE, IN_END_ELEMENT, IN_ATTRIBUTE_KEY, IN_END_TAG,
  IN_DATA, IN_BEGIN_ATTRIBUTE_VALUE, IN_STRING, IN_END_ATTRIBUTE_VALUE,
  IN_STRING_ENTITY, IN_DATA_ENTITY
}
 

Public Member Functions

virtual void data (const std::string &)
 
virtual void endElement (const std::string &)
 
void parse (void)
 
 SimpleSAXParser (std::istream &f)
 
virtual void startElement (const std::string &, Attributes &)
 
virtual ~SimpleSAXParser ()
 

Private Member Functions

std::string getToken (const char *delim)
 
std::string getToken (const char delim)
 
int nextChar (void)
 
const SimpleSAXParser & operator= (const SimpleSAXParser &)=delete
 
std::string parseEntity (const std::string &entity)
 
 SimpleSAXParser (const SimpleSAXParser &)=delete
 
bool skipChar (int c)
 

Private Attributes

Attributes m_attributes
 
char * m_buffer
 
size_t m_bufferSize
 
std::vector< std::string > m_elementTags
 
std::istream & m_in
 
int m_nextChar
 

Detailed Description

A simple SAX parser which is able to parse the configuration.

The parser's state machine can be drawn by pasting the following into Graphviz:

digraph {
  IN_DOCUMENT->IN_BEGIN_TAG [label="nextChar == '<'"];
  IN_DOCUMENT->IN_DATA [label="nextChar != '<'"];

  IN_BEGIN_TAG->IN_BEGIN_ELEMENT [label="nextChar >= 'A' && nextChar <= 'z'"];
  IN_BEGIN_TAG->IN_END_ELEMENT [label="nextChar == '/'"];

  IN_BEGIN_ELEMENT->IN_END_ELEMENT [label="nextChar == '/'"];
  IN_BEGIN_ELEMENT->IN_ELEMENT_WHITESPACE [label="nextChar == ' '"];
  IN_BEGIN_ELEMENT->IN_END_TAG [label="nextChar == '>'"];

  IN_ELEMENT_WHITESPACE->IN_ELEMENT_WHITESPACE [label="nextChar == ' ' || nextChar == '\\n' || nextChar == '\\t'"];
  IN_ELEMENT_WHITESPACE->IN_ATTRIBUTE_KEY [label="nextChar >= 'A' && nextChar <= 'z'"];
  IN_ELEMENT_WHITESPACE->IN_END_ELEMENT [label="nextChar == '/'"];

  IN_END_ELEMENT->IN_END_TAG [label="nextChar == '>'"];

  IN_END_TAG->IN_BEGIN_TAG [label="nextChar == '<'"];
  IN_END_TAG->IN_DATA [label="nextChar != '<'"];

  IN_DATA->IN_BEGIN_TAG [label="nextChar == '<'"];
  IN_DATA->IN_DATA_ENTITY [label="nextChar == '&'"];
  IN_DATA->IN_DONE [label="nextChar == EOF"];

  IN_DATA_ENTITY->IN_DATA [label="nextChar == ';'"];

  IN_ATTRIBUTE_KEY->IN_BEGIN_ATTRIBUTE_VALUE [label="nextChar == '='"];

  IN_BEGIN_ATTRIBUTE_VALUE->IN_STRING [label="nextChar == '\"' || nextChar == '\''"];

  IN_STRING->IN_END_ATTRIBUTE_VALUE [label="nextChar == quote"];
  IN_STRING->IN_STRING_ENTITY [label="nextChar == '&'"];

  IN_END_ATTRIBUTE_VALUE->IN_ELEMENT_WHITESPACE [label="nextChar == ' '"];
  IN_END_ATTRIBUTE_VALUE->IN_END_ELEMENT [label="nextChar == '/'"];
  IN_END_ATTRIBUTE_VALUE->IN_END_TAG [label="nextChar == '>'"];

  IN_STRING_ENTITY->IN_STRING [label="nextChar == ';'"];
}

Definition at line 69 of file SimpleSAXParser.h.

Member Typedef Documentation

◆ Attributes

typedef std::vector<Attribute> SimpleSAXParser::Attributes

Definition at line 82 of file SimpleSAXParser.h.

Member Enumeration Documentation

◆ PARSER_STATES

Enumerator
IN_DOCUMENT 
IN_BEGIN_TAG 
IN_DONE 
IN_BEGIN_ELEMENT 
IN_ELEMENT_WHITESPACE 
IN_END_ELEMENT 
IN_ATTRIBUTE_KEY 
IN_END_TAG 
IN_DATA 
IN_BEGIN_ATTRIBUTE_VALUE 
IN_STRING 
IN_END_ATTRIBUTE_VALUE 
IN_STRING_ENTITY 
IN_DATA_ENTITY 

Definition at line 93 of file SimpleSAXParser.h.

Constructor & Destructor Documentation

◆ SimpleSAXParser() [1/2]

SimpleSAXParser::SimpleSAXParser ( std::istream &  f)
inline

Definition at line 110 of file SimpleSAXParser.h.

  : m_in(f), m_bufferSize(1024), m_buffer(new char[m_bufferSize]), m_nextChar(m_in.get()) {}

◆ ~SimpleSAXParser()

SimpleSAXParser::~SimpleSAXParser ( )
virtual

Definition at line 197 of file SimpleSAXParser.cc.

{ delete[] m_buffer; }

References m_buffer.

◆ SimpleSAXParser() [2/2]

SimpleSAXParser::SimpleSAXParser ( const SimpleSAXParser & )
private delete

Member Function Documentation

◆ data()

virtual void SimpleSAXParser::data ( const std::string &  )
inline virtual

Reimplemented in FWXMLConfigParser.

Definition at line 119 of file SimpleSAXParser.h.

{}

Referenced by parse().

◆ endElement()

virtual void SimpleSAXParser::endElement ( const std::string &  )
inline virtual

Reimplemented in FWXMLConfigParser.

Definition at line 118 of file SimpleSAXParser.h.

{}

Referenced by parse().

◆ getToken() [1/2]

std::string SimpleSAXParser::getToken ( const char *  delim)
inline private

Definition at line 126 of file SimpleSAXParser.h.

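std::string getToken(const char *delim) {
  // Read up to (but not including) the first character found in delim.
  fgettoken(m_in, &m_buffer, &m_bufferSize, delim, &m_nextChar);
  return m_buffer;
}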

References fgettoken(), m_buffer, m_bufferSize, m_in, and m_nextChar.

Referenced by parse().
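
The listing does not show fgettoken() itself, so the following is only a sketch of the behaviour that parse() appears to rely on: a token starts at the current lookahead character and ends just before the first delimiter, which stays in the lookahead. The names readToken and readTokenAndSkip are hypothetical, and the use of std::string instead of a growable char buffer is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: collects characters until one of `delims` (or the end
// of input) is reached. `next` holds the one-character lookahead on entry and
// the character that stopped the scan on exit.
inline std::string readToken(std::istream &in, const char *delims, int &next) {
  std::string token;
  while (next != std::char_traits<char>::eof() && std::strchr(delims, next) == nullptr) {
    token.push_back(static_cast<char>(next));
    next = in.get();
  }
  return token;
}

// Hypothetical single-delimiter variant: same scan, but the delimiter itself
// is consumed afterwards, as the getToken(const char) overload does.
inline std::string readTokenAndSkip(std::istream &in, char delim, int &next) {
  const char delims[2] = {delim, 0};
  std::string token = readToken(in, delims, next);
  next = in.get();
  return token;
}
```

With the input `name attr='1'/>`, readToken(in, " />", next) would yield `name` and leave the space in the lookahead, which is what lets parse() branch on nextChar() right after reading an element name.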

◆ getToken() [2/2]

std::string SimpleSAXParser::getToken ( const char  delim)
inlineprivate

Definition at line 131 of file SimpleSAXParser.h.

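std::string getToken(const char delim) {
  char buf[2] = {delim, 0};
  fgettoken(m_in, &m_buffer, &m_bufferSize, buf, &m_nextChar);
  // Consume the delimiter itself.
  m_nextChar = m_in.get();
  return m_buffer;
}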

References fgettoken(), m_buffer, m_bufferSize, m_in, and m_nextChar.

◆ nextChar()

int SimpleSAXParser::nextChar ( void  )
inline private

Definition at line 145 of file SimpleSAXParser.h.

{ return m_nextChar; }

References m_nextChar.

Referenced by parse().

◆ operator=()

const SimpleSAXParser& SimpleSAXParser::operator= ( const SimpleSAXParser & )
private delete

◆ parse()

void SimpleSAXParser::parse ( void  )

Runs the parser's state machine, invoking the startElement(), endElement(), and data() virtual methods as appropriate; the attributes of an element are handed to startElement(). For the parser to do anything useful, derive from it and override these virtual methods.

Default implementation is in any case useful to check syntax.
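
A minimal sketch of the derive-and-override pattern described above. To stay self-contained it does not include SimpleSAXParser.h: SaxCallbacks is a hypothetical stand-in that only mirrors the three virtual callbacks, and Attr is an assumed stand-in for SimpleSAXParser::Attribute (the `value` member name is a guess). In real code the base class would be SimpleSAXParser and parse() would drive the calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins so the sketch compiles on its own.
struct Attr {
  std::string key;
  std::string value;
};
typedef std::vector<Attr> Attrs;

struct SaxCallbacks {
  virtual ~SaxCallbacks() {}
  virtual void startElement(const std::string &, Attrs &) {}
  virtual void endElement(const std::string &) {}
  virtual void data(const std::string &) {}
};

// Tracks the nesting depth and records the text seen inside <name> elements.
class NameCollector : public SaxCallbacks {
public:
  void startElement(const std::string &tag, Attrs &) override {
    ++m_depth;
    m_inName = (tag == "name");
  }
  void endElement(const std::string &) override {
    --m_depth;
    m_inName = false;
  }
  void data(const std::string &text) override {
    if (m_inName)
      m_names.push_back(text);
  }
  int depth() const { return m_depth; }
  const std::vector<std::string> &names() const { return m_names; }

private:
  int m_depth = 0;
  bool m_inName = false;
  std::vector<std::string> m_names;
};
```

The handler keeps only the state it needs between callbacks, which is the usual trade-off of a SAX-style interface: no document tree is built, so anything that must survive past one callback has to be stored by the derived class.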

Definition at line 46 of file SimpleSAXParser.cc.

void SimpleSAXParser::parse(void) {
  enum PARSER_STATES state = IN_DOCUMENT;
  // Current delimiters for strings in attributes.
  char stringDelims[] = "\"&";
  std::string attributeName;
  std::string attributeValue;
  std::string tmp;
  std::string currentData;

  while (state != IN_DONE) {
    debug_state_machine(state);

    switch (state) {
      // FIXME: IN_DOCUMENT should check the dtd...
      case IN_DOCUMENT:
        state = IN_DATA;
        if (skipChar('<'))
          state = IN_BEGIN_TAG;
        break;

      case IN_BEGIN_TAG:
        if (nextChar() >= 'A' && nextChar() <= 'z')
          state = IN_BEGIN_ELEMENT;
        else if (skipChar('/'))
          state = IN_END_ELEMENT;
        else
          throw ParserError("Bad tag");
        break;

      case IN_BEGIN_ELEMENT:
        m_attributes.clear();
        m_elementTags.push_back(getToken(" />"));
        if (nextChar() == ' ')
          state = IN_ELEMENT_WHITESPACE;
        else if (skipChar('/'))
          state = IN_END_ELEMENT;
        else if (skipChar('>')) {
          startElement(m_elementTags.back(), m_attributes);
          state = IN_END_TAG;
        } else
          throw ParserError("Bad element.");
        break;

      case IN_ELEMENT_WHITESPACE:
        while (skipChar(' ') || skipChar('\n') || skipChar('\t')) {
        }

        if (nextChar() >= 'A' && nextChar() <= 'z')
          state = IN_ATTRIBUTE_KEY;
        else if (nextChar() == '/')
          state = IN_END_ELEMENT;
        else
          throw ParserError("Syntax error in element" + m_elementTags.back());
        break;

      case IN_ATTRIBUTE_KEY:
        attributeName = getToken('=');
        state = IN_BEGIN_ATTRIBUTE_VALUE;
        break;

      case IN_BEGIN_ATTRIBUTE_VALUE:
        // Remember which quote character opened the value.
        if (skipChar('"')) {
          state = IN_STRING;
          attributeValue.clear();
          stringDelims[0] = '\"';
        } else if (skipChar('\'')) {
          state = IN_STRING;
          attributeValue.clear();
          stringDelims[0] = '\'';
        } else
          throw ParserError("Expecting quotes.");
        break;

      case IN_STRING:
        attributeValue += getToken(stringDelims);
        if (skipChar(stringDelims[0])) {
          // Keep the attributes sorted by key; a key specified more
          // than once is an error.
          Attribute attr(attributeName, attributeValue);
          Attributes::iterator i = std::lower_bound(m_attributes.begin(), m_attributes.end(), attr);
          if (i != m_attributes.end() && i->key == attr.key)
            throw ParserError("Attribute " + i->key + " defined more than once");
          m_attributes.insert(i, attr);
          state = IN_END_ATTRIBUTE_VALUE;
        } else if (skipChar(stringDelims[1]))
          state = IN_STRING_ENTITY;
        else
          throw ParserError("Unexpected end of input at " + attributeValue);
        break;

      case IN_END_ATTRIBUTE_VALUE:
        getToken(" />");
        if (nextChar() == ' ')
          state = IN_ELEMENT_WHITESPACE;
        else if (skipChar('/'))
          state = IN_END_ELEMENT;
        else if (skipChar('>')) {
          startElement(m_elementTags.back(), m_attributes);
          state = IN_END_TAG;
        }
        break;

      case IN_END_ELEMENT:
        tmp = getToken('>');
        if (!tmp.empty() && tmp != m_elementTags.back())
          throw ParserError("Non-matching closing element " + tmp + " for " + attributeValue);
        endElement(tmp);
        m_elementTags.pop_back();
        state = IN_END_TAG;
        break;

      case IN_END_TAG:
        if (nextChar() == EOF)
          return;
        else if (skipChar('<'))
          state = IN_BEGIN_TAG;
        else
          state = IN_DATA;
        break;

      case IN_DATA:
        currentData += getToken("<&");
        if (skipChar('&'))
          state = IN_DATA_ENTITY;
        else if (skipChar('<')) {
          data(currentData);
          currentData.clear();
          state = IN_BEGIN_TAG;
        } else if (nextChar() == EOF) {
          data(currentData);
          return;
        } else
          throw ParserError("Unexpected end of input in element " + m_elementTags.back() + currentData);
        break;

      case IN_DATA_ENTITY:
        currentData += parseEntity(getToken(';'));
        state = IN_DATA;
        break;

      case IN_STRING_ENTITY:
        attributeValue += parseEntity(getToken(';'));
        state = IN_STRING;
        break;

      case IN_DONE:
        return;
    }
  }
}

References data(), debug_state_machine(), endElement(), getToken(), IN_ATTRIBUTE_KEY, IN_BEGIN_ATTRIBUTE_VALUE, IN_BEGIN_ELEMENT, IN_BEGIN_TAG, IN_DATA, IN_DATA_ENTITY, IN_DOCUMENT, IN_DONE, IN_ELEMENT_WHITESPACE, IN_END_ATTRIBUTE_VALUE, IN_END_ELEMENT, IN_END_TAG, IN_STRING, IN_STRING_ENTITY, SimpleSAXParser::Attribute::key, m_attributes, m_elementTags, nextChar(), parseEntity(), skipChar(), and startElement().

Referenced by FWConfigurationManager::readFromFile().
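
The IN_STRING branch of parse() keeps m_attributes ordered by key and rejects a repeated key. The sketch below isolates that step. KeyValue is a hypothetical stand-in for SimpleSAXParser::Attribute (its `value` member and its operator< comparing keys are assumptions), and std::runtime_error stands in for ParserError.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for SimpleSAXParser::Attribute, ordered by key.
struct KeyValue {
  std::string key;
  std::string value;
  bool operator<(const KeyValue &other) const { return key < other.key; }
};

// Inserts attr at its sorted position; throws if the key is already present.
inline void insertSorted(std::vector<KeyValue> &attrs, const KeyValue &attr) {
  // lower_bound returns the first element whose key is not less than attr.key,
  // which is both the insertion point and the only place a duplicate can sit.
  std::vector<KeyValue>::iterator i = std::lower_bound(attrs.begin(), attrs.end(), attr);
  if (i != attrs.end() && i->key == attr.key)
    throw std::runtime_error("Attribute " + attr.key + " defined more than once");
  attrs.insert(i, attr);
}
```

Because the vector stays sorted, each duplicate check is a binary search, and a handler receiving the attributes could look one up by key the same way.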

◆ parseEntity()

std::string SimpleSAXParser::parseEntity ( const std::string &  entity)
private

Helper function to handle entities, i.e. characters specified with the "&label;" syntax.

Definition at line 6 of file SimpleSAXParser.cc.

std::string SimpleSAXParser::parseEntity(const std::string &entity) {
  if (entity == "quot")
    return "\"";
  else if (entity == "amp")
    return "&";
  else if (entity == "lt")
    return "<";
  else if (entity == "gt")
    return ">";
  throw ParserError("Unknown entity " + entity);
}

Referenced by parse().
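
To illustrate the "&label;" handling outside the state machine, here is a small standalone sketch that applies the same four-entity table to a whole string. decodeEntity and decodeText are hypothetical names, and std::runtime_error stands in for ParserError. The real parser decodes entities incrementally through the IN_DATA_ENTITY and IN_STRING_ENTITY states rather than in one pass.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Same table as parseEntity(); the label arrives without '&' and ';'.
inline std::string decodeEntity(const std::string &entity) {
  if (entity == "quot")
    return "\"";
  if (entity == "amp")
    return "&";
  if (entity == "lt")
    return "<";
  if (entity == "gt")
    return ">";
  throw std::runtime_error("Unknown entity " + entity);
}

// Replaces every "&label;" occurrence in text with the character it names.
inline std::string decodeText(const std::string &text) {
  std::string out;
  std::string::size_type pos = 0;
  while (pos < text.size()) {
    const std::string::size_type amp = text.find('&', pos);
    if (amp == std::string::npos) {
      out.append(text, pos, std::string::npos);  // no more entities
      break;
    }
    out.append(text, pos, amp - pos);  // literal text before the entity
    const std::string::size_type semi = text.find(';', amp);
    if (semi == std::string::npos)
      throw std::runtime_error("Unterminated entity in " + text);
    out += decodeEntity(text.substr(amp + 1, semi - amp - 1));
    pos = semi + 1;
  }
  return out;
}
```

Any label outside the four listed ones, for example `apos`, raises an error here just as it does in parseEntity().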

◆ skipChar()

bool SimpleSAXParser::skipChar ( int  c)
inline private

Definition at line 138 of file SimpleSAXParser.h.

bool skipChar(int c) {
  if (m_nextChar != c)
    return false;
  m_nextChar = m_in.get();
  return true;
}

References m_in and m_nextChar.

Referenced by parse().
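
skipChar() and nextChar() together form a one-character lookahead over the input stream: the next character is always already read, and it is consumed only when it matches. A self-contained sketch of that idea follows; the Lookahead class and its method names are hypothetical, not part of SimpleSAXParser.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical wrapper: a stream plus one character of lookahead.
class Lookahead {
public:
  // Primes the lookahead with the first character (or EOF on empty input).
  explicit Lookahead(std::istream &in) : m_in(in), m_next(m_in.get()) {}

  // Returns the pending character without consuming it.
  int peek() const { return m_next; }

  // Consumes the pending character only if it equals c.
  bool skip(int c) {
    if (m_next != c)
      return false;
    m_next = m_in.get();
    return true;
  }

private:
  std::istream &m_in;
  int m_next;
};
```

Storing the lookahead as an int rather than a char keeps the end-of-input marker distinct from every valid character, which is why the comparisons against EOF in parse() work.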

◆ startElement()

virtual void SimpleSAXParser::startElement ( const std::string & ,
Attributes & 
)
inline virtual

Reimplemented in FWXMLConfigParser.

Definition at line 117 of file SimpleSAXParser.h.

{}

Referenced by parse().

Member Data Documentation

◆ m_attributes

Attributes SimpleSAXParser::m_attributes
private

Definition at line 152 of file SimpleSAXParser.h.

Referenced by parse().

◆ m_buffer

char* SimpleSAXParser::m_buffer
private

Definition at line 149 of file SimpleSAXParser.h.

Referenced by getToken(), and ~SimpleSAXParser().

◆ m_bufferSize

size_t SimpleSAXParser::m_bufferSize
private

Definition at line 148 of file SimpleSAXParser.h.

Referenced by getToken().

◆ m_elementTags

std::vector<std::string> SimpleSAXParser::m_elementTags
private

Definition at line 151 of file SimpleSAXParser.h.

Referenced by parse().

◆ m_in

std::istream& SimpleSAXParser::m_in
private

Definition at line 147 of file SimpleSAXParser.h.

Referenced by getToken(), and skipChar().

◆ m_nextChar

int SimpleSAXParser::m_nextChar
private

Definition at line 150 of file SimpleSAXParser.h.

Referenced by getToken(), nextChar(), and skipChar().

fgettoken
bool fgettoken(std::istream &in, char **buffer, size_t *maxSize, const char *separators, int *firstChar)
Definition: SimpleSAXParser.cc:224
debug_state_machine
void debug_state_machine(enum SimpleSAXParser::PARSER_STATES state)
Definition: SimpleSAXParser.cc:18