|
dictionary | NESTABLE_TAGS = {} |
|
tuple | RESET_NESTING_TAGS = buildTagMap('noscript') |
|
tuple | CHARSET_RE = re.compile("((^|;)\s*charset=)([^;]*)", re.M) |
|
list | NESTABLE_BLOCK_TAGS = ['blockquote', 'div', 'fieldset', 'ins', 'del'] |
|
list | NESTABLE_INLINE_TAGS |
|
dictionary | NESTABLE_LIST_TAGS |
|
dictionary | NESTABLE_TABLE_TAGS |
|
tuple | NESTABLE_TAGS |
|
list | NON_NESTABLE_BLOCK_TAGS = ['address', 'form', 'p', 'pre'] |
|
tuple | PRESERVE_WHITESPACE_TAGS = set(['pre', 'textarea']) |
|
dictionary | QUOTE_TAGS = {'script' : None, 'textarea' : None} |
|
tuple | RESET_NESTING_TAGS |
|
tuple | SELF_CLOSING_TAGS |
|
| ALL_ENTITIES = XHTML_ENTITIES |
|
string | HTML_ENTITIES = "html" |
|
list | MARKUP_MASSAGE |
|
dictionary | NESTABLE_TAGS = {} |
|
list | PRESERVE_WHITESPACE_TAGS = [] |
|
dictionary | QUOTE_TAGS = {} |
|
dictionary | RESET_NESTING_TAGS = {} |
|
string | ROOT_TAG_NAME = u'[document]' |
|
dictionary | SELF_CLOSING_TAGS = {} |
|
dictionary | STRIP_ASCII_SPACES = { 9: None, 10: None, 12: None, 13: None, 32: None, } |
|
string | XHTML_ENTITIES = "xhtml" |
|
string | XML_ENTITIES = "xml" |
|
| fetchNextSiblings = findNextSiblings |
|
| fetchParents = findParents |
|
| fetchPrevious = findAllPrevious |
|
| fetchPreviousSiblings = findPreviousSiblings |
|
The MinimalSoup class is for parsing HTML that contains
pathologically bad markup. It makes no assumptions about tag
nesting, but it does know which tags are self-closing, that
<script> tags contain Javascript and should not be parsed, that
META tags may contain encoding information, and so on.
This also makes it better for subclassing than BeautifulStoneSoup
or BeautifulSoup.
Definition at line 1640 of file BeautifulSoup.py.