Public Member Functions
def __init__
def find_codec
Public Attributes
declaredHTMLEncoding
markup
originalEncoding
smartQuotesTo
triedEncodings
unicode
Static Public Attributes
dictionary CHARSET_ALIASES
EBCDIC_TO_ASCII_MAP = None
dictionary MS_CHARS
Private Member Functions
def _codec
def _convertFrom
def _detectEncoding
def _ebcdic_to_ascii
def _subMSChar
def _toUnicode
A class for detecting the encoding of a *ML document and converting it to a Unicode string. If the source encoding is windows-1252, the class can also replace MS smart quotes with their HTML or XML equivalents.
Definition at line 1756 of file BeautifulSoup.py.
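The general idea in the class description (try candidate encodings in turn and keep the first that decodes cleanly) can be sketched in Python 3 as follows. This is a minimal illustration, not the code in BeautifulSoup.py: `to_unicode_sketch` and `candidate_encodings` are hypothetical names, and the default candidate list is an assumption.

```python
def to_unicode_sketch(markup, candidate_encodings=("utf-8", "windows-1252")):
    """Return (text, encoding) for the first candidate that decodes markup.

    Hypothetical helper: returns (None, None) when no candidate works.
    """
    tried = []  # encodings attempted so far, so none is tried twice
    for encoding in candidate_encodings:
        if encoding in tried:
            continue
        tried.append(encoding)
        try:
            return markup.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: move on to the next one.
            continue
    return None, None
```

For example, `to_unicode_sketch(b"caf\xe9")` fails as UTF-8 and falls through to windows-1252.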
def BeautifulSoup.UnicodeDammit.__init__(self, markup, overrideEncodings = [], smartQuotesTo = 'xml', isHTML = False)
Definition at line 1770 of file BeautifulSoup.py.
def BeautifulSoup.UnicodeDammit._codec [private]
Definition at line 1941 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit.find_codec().
def BeautifulSoup.UnicodeDammit._convertFrom [private]
Definition at line 1814 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._subMSChar(), BeautifulSoup.UnicodeDammit._toUnicode(), BeautifulSoup.UnicodeDammit.find_codec(), recoMuon.in, BeautifulSoup.BeautifulStoneSoup.markup, BeautifulSoup.UnicodeDammit.markup, BeautifulSoup.BeautifulStoneSoup.smartQuotesTo, BeautifulSoup.UnicodeDammit.smartQuotesTo, and BeautifulSoup.UnicodeDammit.triedEncodings.
def BeautifulSoup.UnicodeDammit._detectEncoding [private]
Given a document, tries to detect its XML encoding.
Definition at line 1867 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._ebcdic_to_ascii(), BeautifulSoup.BeautifulStoneSoup.declaredHTMLEncoding, BeautifulSoup.BeautifulSoup.declaredHTMLEncoding, BeautifulSoup.UnicodeDammit.declaredHTMLEncoding, alcaDQMUpload.encode(), match(), and BeautifulSoup.UnicodeDammit.unicode.
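One common way to detect an XML document's encoding is to look for a byte order mark and then for the `encoding` pseudo-attribute of the XML declaration. The sketch below illustrates that approach under stated assumptions; `sniff_xml_encoding` is a hypothetical name, and the actual method may check more cases (the reference to `_ebcdic_to_ascii()` suggests it also handles EBCDIC input, which is not covered here).

```python
import codecs
import re

# BOM signatures, with UTF-32 first so it is not mistaken for UTF-16
# (the UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark).
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32le"),
    (codecs.BOM_UTF32_BE, "utf-32be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
)

# Matches <?xml ... encoding="name" ...?> at the start of an ASCII-compatible document.
_XML_DECL = re.compile(
    rb"^\s*<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z0-9._-]+)[\"']"
)


def sniff_xml_encoding(data):
    """Return a lowercase encoding name guessed from bytes, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return None
```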
def BeautifulSoup.UnicodeDammit._ebcdic_to_ascii [private]
Definition at line 1952 of file BeautifulSoup.py.
References __class__< T >.__class__(), pat::__class__.__class__(), and join().
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding().
def BeautifulSoup.UnicodeDammit._subMSChar [private]
Changes an MS smart quote character to an XML or HTML entity.
Definition at line 1803 of file BeautifulSoup.py.
References BeautifulSoup.BeautifulStoneSoup.smartQuotesTo, and BeautifulSoup.UnicodeDammit.smartQuotesTo.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
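As an illustration of the substitution described above, the following Python 3 sketch maps a few windows-1252 quote bytes to numeric XML character references or named HTML entities. `SMART_CHARS` and `replace_smart_quotes` are hypothetical; the table is a small assumed subset and does not reproduce the contents of `MS_CHARS`.

```python
# Hypothetical subset of a windows-1252 smart-punctuation table:
# byte value -> (HTML entity name, Unicode code point in hex).
SMART_CHARS = {
    0x91: ("lsquo", "2018"),
    0x92: ("rsquo", "2019"),
    0x93: ("ldquo", "201C"),
    0x94: ("rdquo", "201D"),
}


def replace_smart_quotes(data, smart_quotes_to="xml"):
    """Replace smart-quote bytes in windows-1252 data with entities."""
    out = []
    for byte in data:  # iterating over bytes yields integers
        entry = SMART_CHARS.get(byte)
        if entry is None:
            out.append(bytes([byte]))  # not a smart quote: keep as-is
        elif smart_quotes_to == "xml":
            out.append(("&#x%s;" % entry[1]).encode("ascii"))
        else:
            out.append(("&%s;" % entry[0]).encode("ascii"))
    return b"".join(out)
```

Here `smart_quotes_to` mirrors the role of the `smartQuotesTo` constructor argument: `'xml'` selects numeric references, and any other value selects named HTML entities in this sketch.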
def BeautifulSoup.UnicodeDammit._toUnicode [private]
Given a string and its encoding, decodes the string into Unicode. The encoding argument is a string recognized by encodings.aliases.
Definition at line 1842 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit.unicode.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
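A minimal Python 3 sketch of decoding with an encoding name checked against the standard `encodings.aliases` table follows. `decode_sketch` is a hypothetical name, and the membership check is only an approximation: the table does not list every codec Python can load.

```python
from encodings.aliases import aliases


def decode_sketch(data, encoding):
    """Decode bytes with an encoding name known to encodings.aliases."""
    # The table uses lowercase names with underscores instead of hyphens.
    key = encoding.lower().replace("-", "_")
    if key not in aliases and key not in aliases.values():
        raise LookupError("unknown encoding: %r" % encoding)
    return data.decode(encoding)
```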
def BeautifulSoup.UnicodeDammit.find_codec(self, charset)
Definition at line 1935 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._codec().
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
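Resolving a charset label to a codec that Python can actually load might look like the sketch below, which consults an alias table first and then tries spelling variants with `codecs.lookup`. The names `ALIASES_SKETCH` and `find_codec_sketch` and the table entries are assumptions for illustration; the contents of `CHARSET_ALIASES` are not shown on this page.

```python
import codecs

# Hypothetical alias table mapping charset labels to Python codec names.
ALIASES_SKETCH = {"macintosh": "mac-roman", "x-sjis": "shift-jis"}


def find_codec_sketch(charset):
    """Return a codec name Python recognizes for charset, or None."""
    if not charset:
        return None
    candidates = (
        ALIASES_SKETCH.get(charset.lower(), charset),  # alias, else the label itself
        charset.replace("-", ""),                      # e.g. "utf-8" -> "utf8"
        charset.replace("-", "_"),                     # e.g. "mac-roman" -> "mac_roman"
    )
    for name in candidates:
        try:
            codecs.lookup(name)  # raises LookupError for unknown names
            return name
        except LookupError:
            continue
    return None
```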
dictionary BeautifulSoup.UnicodeDammit.CHARSET_ALIASES [static]
Definition at line 1766 of file BeautifulSoup.py.
BeautifulSoup.UnicodeDammit.declaredHTMLEncoding
Definition at line 1771 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding().
BeautifulSoup.UnicodeDammit.EBCDIC_TO_ASCII_MAP = None [static]
Definition at line 1951 of file BeautifulSoup.py.
BeautifulSoup.UnicodeDammit.markup
Definition at line 1833 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
dictionary BeautifulSoup.UnicodeDammit.MS_CHARS [static]
Definition at line 1977 of file BeautifulSoup.py.
BeautifulSoup.UnicodeDammit.originalEncoding
Definition at line 1777 of file BeautifulSoup.py.
BeautifulSoup.UnicodeDammit.smartQuotesTo
Definition at line 1774 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom(), and BeautifulSoup.UnicodeDammit._subMSChar().
BeautifulSoup.UnicodeDammit.triedEncodings
Definition at line 1775 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
BeautifulSoup.UnicodeDammit.unicode
Definition at line 1778 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding(), and BeautifulSoup.UnicodeDammit._toUnicode().