Static Public Attributes | |
dictionary | NESTABLE_TAGS = {} |
tuple | RESET_NESTING_TAGS = buildTagMap('noscript') |
Static Public Attributes inherited from BeautifulSoup.BeautifulSoup | |
tuple | CHARSET_RE = re.compile("((^|;)\s*charset=)([^;]*)", re.M) |
tuple | NESTABLE_BLOCK_TAGS = ('blockquote', 'div', 'fieldset', 'ins', 'del') |
tuple | NESTABLE_INLINE_TAGS |
dictionary | NESTABLE_LIST_TAGS |
dictionary | NESTABLE_TABLE_TAGS |
tuple | NESTABLE_TAGS |
tuple | NON_NESTABLE_BLOCK_TAGS = ('address', 'form', 'p', 'pre') |
tuple | PRESERVE_WHITESPACE_TAGS = set(['pre', 'textarea']) |
dictionary | QUOTE_TAGS = {'script' : None, 'textarea' : None} |
tuple | RESET_NESTING_TAGS |
tuple | SELF_CLOSING_TAGS |
Static Public Attributes inherited from BeautifulSoup.BeautifulStoneSoup | |
ALL_ENTITIES = XHTML_ENTITIES | |
string | HTML_ENTITIES = "html" |
list | MARKUP_MASSAGE |
dictionary | NESTABLE_TAGS = {} |
list | PRESERVE_WHITESPACE_TAGS = [] |
dictionary | QUOTE_TAGS = {} |
dictionary | RESET_NESTING_TAGS = {} |
string | ROOT_TAG_NAME = u'[document]' |
dictionary | SELF_CLOSING_TAGS = {} |
dictionary | STRIP_ASCII_SPACES = { 9: None, 10: None, 12: None, 13: None, 32: None, } |
string | XHTML_ENTITIES = "xhtml" |
string | XML_ENTITIES = "xml" |
Static Public Attributes inherited from BeautifulSoup.Tag | |
fetch = findAll | |
findChild = find | |
findChildren = findAll | |
first = find | |
The MinimalSoup class is for parsing HTML that contains pathologically bad markup. It makes no assumptions about tag nesting, but it does know which tags are self-closing, that <script> tags contain Javascript and should not be parsed, that META tags may contain encoding information, and so on. This also makes it better for subclassing than BeautifulStoneSoup or BeautifulSoup.
Definition at line 1662 of file BeautifulSoup.py.
|
static |
Definition at line 1673 of file BeautifulSoup.py.
|
static |
Definition at line 1672 of file BeautifulSoup.py.