BeautifulSoup.ICantBelieveItsBeautifulSoup Class Reference
Inheritance diagram for BeautifulSoup.ICantBelieveItsBeautifulSoup:
BeautifulSoup.BeautifulSoup BeautifulSoup.BeautifulStoneSoup BeautifulSoup.Tag BeautifulSoup.PageElement BeautifulSoup.RobustWackAssHTMLParser

Static Public Attributes

tuple I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ('noscript',)
 
 I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = \
 
tuple NESTABLE_TAGS
 
- Static Public Attributes inherited from BeautifulSoup.BeautifulSoup
tuple CHARSET_RE = re.compile("((^|;)\s*charset=)([^;]*)", re.M)
 
tuple NESTABLE_BLOCK_TAGS = ('blockquote', 'div', 'fieldset', 'ins', 'del')
 
tuple NESTABLE_INLINE_TAGS
 
dictionary NESTABLE_LIST_TAGS
 
dictionary NESTABLE_TABLE_TAGS
 
tuple NESTABLE_TAGS
 
tuple NON_NESTABLE_BLOCK_TAGS = ('address', 'form', 'p', 'pre')
 
tuple PRESERVE_WHITESPACE_TAGS = set(['pre', 'textarea'])
 
dictionary QUOTE_TAGS = {'script' : None, 'textarea' : None}
 
tuple RESET_NESTING_TAGS
 
tuple SELF_CLOSING_TAGS
 
- Static Public Attributes inherited from BeautifulSoup.BeautifulStoneSoup
 ALL_ENTITIES = XHTML_ENTITIES
 
string HTML_ENTITIES = "html"
 
list MARKUP_MASSAGE
 
dictionary NESTABLE_TAGS = {}
 
list PRESERVE_WHITESPACE_TAGS = []
 
dictionary QUOTE_TAGS = {}
 
dictionary RESET_NESTING_TAGS = {}
 
string ROOT_TAG_NAME = u'[document]'
 
dictionary SELF_CLOSING_TAGS = {}
 
dictionary STRIP_ASCII_SPACES = { 9: None, 10: None, 12: None, 13: None, 32: None, }
 
string XHTML_ENTITIES = "xhtml"
 
string XML_ENTITIES = "xml"
 
- Static Public Attributes inherited from BeautifulSoup.Tag
 fetch = findAll
 
 findChild = find
 
 findChildren = findAll
 
 first = find
 

Additional Inherited Members

- Public Member Functions inherited from BeautifulSoup.BeautifulSoup
def __init__
 
def start_meta
 
- Public Member Functions inherited from BeautifulSoup.BeautifulStoneSoup
def __getattr__
 
def __init__
 
def convert_charref
 
def endData
 
def handle_charref
 
def handle_comment
 
def handle_data
 
def handle_decl
 
def handle_entityref
 
def handle_pi
 
def isSelfClosingTag
 
def parse_declaration
 
def popTag
 
def pushTag
 
def reset
 
def unknown_endtag
 
def unknown_starttag
 
- Public Member Functions inherited from BeautifulSoup.Tag
def __call__
 
def __contains__
 
def __delitem__
 
def __eq__
 
def __getattr__
 
def __getitem__
 
def __init__
 
def __iter__
 
def __len__
 
def __ne__
 
def __nonzero__
 
def __repr__
 
def __setitem__
 
def __str__
 
def __unicode__
 
def childGenerator
 
def clear
 
def decompose
 
def fetchText
 
def find
 
def findAll
 
def firstText
 
def get
 
def getString
 
def getText
 
def has_key
 
def index
 
def prettify
 
def recursiveChildGenerator
 
def renderContents
 
def setString
 
- Public Attributes inherited from BeautifulSoup.BeautifulSoup
 declaredHTMLEncoding
 
 originalEncoding
 
- Public Attributes inherited from BeautifulSoup.BeautifulStoneSoup
 convertEntities
 
 convertHTMLEntities
 
 convertXMLEntities
 
 currentData
 
 currentTag
 
 declaredHTMLEncoding
 
 escapeUnrecognizedEntities
 
 fromEncoding
 
 hidden
 
 instanceSelfClosingTags
 
 literal
 
 markup
 
 markupMassage
 
 originalEncoding
 
 parseOnlyThese
 
 previous
 
 quoteStack
 
 smartQuotesTo
 
 tagStack
 
- Public Attributes inherited from BeautifulSoup.Tag
 attrMap
 
 attrs
 
 containsSubstitutions
 
 contents
 
 convertHTMLEntities
 
 convertXMLEntities
 
 escapeUnrecognizedEntities
 
 hidden
 
 isSelfClosing
 
 name
 
 parserClass
 
- Properties inherited from BeautifulSoup.Tag
 string = property(getString, setString)
 
 text = property(getText)
 

Detailed Description

The BeautifulSoup class is oriented towards skipping over
common HTML errors like unclosed tags. However, sometimes it makes
errors of its own. For instance, consider this fragment:

 <b>Foo<b>Bar</b></b>

This is perfectly valid (if bizarre) HTML. However, the
BeautifulSoup class will implicitly close the first b tag when it
encounters the second 'b'. It will think the author wrote
"<b>Foo<b>Bar", and didn't close the first 'b' tag, because
there's no real-world reason to bold something that's already
bold. When it encounters '</b></b>' it will close two more 'b'
tags, for a grand total of three tags closed instead of two. This
can throw off the rest of your document structure. The same is
true of a number of other tags, listed below.

It's much more common for someone to forget to close a 'b' tag
than to actually use nested 'b' tags, and the BeautifulSoup class
handles the common case. This class handles the not-so-common
case: where you can't believe someone wrote what they did, but
it's valid HTML and BeautifulSoup screwed up by assuming it
wouldn't be.

Definition at line 1626 of file BeautifulSoup.py.
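
The difference between the two parsing policies described above can be illustrated with a small toy model. This is only a sketch: it does not import or reproduce the BeautifulSoup implementation, and the names `parse`, `render`, `_pop_to` and the `nestable` parameter are hypothetical, introduced here for illustration. It assumes that a stray end tag with no matching open tag is simply ignored.

```python
import re

# Toy tokenizer: matches attribute-free start/end tags, or a run of text.
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")


def _pop_to(stack, name):
    """Close the innermost open tag called `name` and everything inside it."""
    for i in range(len(stack) - 1, 0, -1):  # index 0 is the document root
        if stack[i][0] == name:
            del stack[i:]
            return


def parse(markup, nestable=()):
    """Build a (name, children) tree from markup.

    Tags not listed in `nestable` implicitly close an already-open tag of
    the same name (the strict policy). Tags listed in `nestable` are
    allowed to nest inside themselves (the lenient policy).
    """
    root = ("[document]", [])
    stack = [root]
    for match in TOKEN_RE.finditer(markup):
        closing, name, text = match.groups()
        if text is not None:
            stack[-1][1].append(text)
        elif closing:
            _pop_to(stack, name)  # assumption: unmatched end tags are ignored
        else:
            if name not in nestable:
                _pop_to(stack, name)  # implicit close of the earlier tag
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    return root


def render(node):
    """Serialize the toy tree back to markup."""
    name, children = node
    inner = "".join(c if isinstance(c, str) else render(c) for c in children)
    return inner if name == "[document]" else "<%s>%s</%s>" % (name, inner, name)


fragment = "<b>Foo<b>Bar</b></b>"
strict = render(parse(fragment))                    # 'b' may not nest
lenient = render(parse(fragment, nestable=("b",)))  # 'b' may nest
print(strict)   # <b>Foo</b><b>Bar</b>
print(lenient)  # <b>Foo<b>Bar</b></b>
```

In the strict run the second 'b' closes the first one, so the two tags end up as siblings and the document structure changes. In the lenient run the original nesting survives, which is the behaviour this class aims for.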

Member Data Documentation

tuple BeautifulSoup.ICantBelieveItsBeautifulSoup.I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ('noscript',)
static

Definition at line 1656 of file BeautifulSoup.py.

BeautifulSoup.ICantBelieveItsBeautifulSoup.I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = \
static

Definition at line 1651 of file BeautifulSoup.py.

tuple BeautifulSoup.ICantBelieveItsBeautifulSoup.NESTABLE_TAGS
static
Initial value:
= buildTagMap([], BeautifulSoup.NESTABLE_TAGS,
              I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS,
              I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS)

Definition at line 1658 of file BeautifulSoup.py.
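
The initial value above merges an inherited map with two extra tuples of tag names. A rough sketch of how a helper of that shape could work follows. The function `build_tag_map_sketch` is a hypothetical stand-in, not the module's actual `buildTagMap`. The base map and the inline tuple below are made-up sample values, since this page does not show their real contents; only `('noscript',)` is taken from the listing above.

```python
def build_tag_map_sketch(default, *portions):
    """Merge dicts, iterables of names, and single names into one dict.

    Hypothetical illustration: dict portions are merged as they are,
    every other name is mapped to `default`.
    """
    built = {}
    for portion in portions:
        if isinstance(portion, dict):
            built.update(portion)        # a map: merge it
        elif isinstance(portion, str):
            built[portion] = default     # a single tag name
        else:
            for key in portion:          # a list or tuple of tag names
                built[key] = default
    return built


# Sample inputs. Only the block tuple comes from this page.
base_nestable = {"div": [], "td": ["tr"]}   # assumed stand-in for the inherited map
extra_block = ("noscript",)                 # from the listing above
extra_inline = ("b", "em")                  # assumed sample, real tuple not shown

merged = build_tag_map_sketch([], base_nestable, extra_block, extra_inline)
print(sorted(merged))  # ['b', 'div', 'em', 'noscript', 'td']
```

Under these assumptions, every tag named in the extra tuples ends up in the merged map with the default value, while entries from the inherited map keep the values they already had.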