Public Member Functions
def __init__ (self, markup, overrideEncodings=[], smartQuotesTo='xml', isHTML=False)
def find_codec (self, charset)
Public Attributes
declaredHTMLEncoding
markup
originalEncoding
smartQuotesTo
triedEncodings
unicode
Private Member Functions
def _codec (self, charset)
def _convertFrom (self, proposed)
def _detectEncoding (self, xml_data, isHTML=False)
def _ebcdic_to_ascii (self, s)
def _subMSChar (self, orig)
def _toUnicode (self, data, encoding)
A class for detecting the encoding of a *ML document and converting it to a Unicode string. If the source encoding is windows-1252, it can also replace MS smart quotes with their HTML or XML equivalents.
Definition at line 1756 of file BeautifulSoup.py.
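The general idea of trying candidate encodings in turn and recording which ones were attempted can be sketched as follows. This is a minimal Python 3 illustration, not the code in BeautifulSoup.py; the function name decode_with_fallbacks and the default candidate list are assumptions made for this example.

```python
def decode_with_fallbacks(data, candidates=("utf-8", "windows-1252")):
    """Return (text, encoding, tried) for the first candidate that decodes data.

    Hypothetical helper: the candidate order is an assumption, not the
    order used by UnicodeDammit.
    """
    tried = []
    for encoding in candidates:
        tried.append(encoding)
        try:
            return data.decode(encoding), encoding, tried
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: move on to the next candidate.
            continue
    raise ValueError("could not decode data; tried: " + ", ".join(tried))


# b"caf\xe9" is not valid UTF-8, so the sketch falls back to windows-1252.
text, encoding, tried = decode_with_fallbacks(b"caf\xe9")
```

The returned list plays a role similar to the triedEncodings attribute, and the winning encoding is comparable to originalEncoding.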
def BeautifulSoup.UnicodeDammit.__init__ (self, markup, overrideEncodings=[], smartQuotesTo='xml', isHTML=False)
Definition at line 1770 of file BeautifulSoup.py.
def BeautifulSoup.UnicodeDammit._codec (self, charset)
private
Definition at line 1941 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit.find_codec().
def BeautifulSoup.UnicodeDammit._convertFrom (self, proposed)
private
Definition at line 1814 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._subMSChar(), BeautifulSoup.UnicodeDammit._toUnicode(), BeautifulSoup.UnicodeDammit.find_codec(), recoMuon.in, BeautifulSoup.BeautifulStoneSoup.markup, BeautifulSoup.UnicodeDammit.markup, BeautifulSoup.BeautifulStoneSoup.smartQuotesTo, BeautifulSoup.UnicodeDammit.smartQuotesTo, and BeautifulSoup.UnicodeDammit.triedEncodings.
def BeautifulSoup.UnicodeDammit._detectEncoding (self, xml_data, isHTML=False)
private
Given a document, tries to detect its XML encoding.
Definition at line 1867 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._ebcdic_to_ascii(), BeautifulSoup.BeautifulStoneSoup.declaredHTMLEncoding, BeautifulSoup.BeautifulSoup.declaredHTMLEncoding, BeautifulSoup.UnicodeDammit.declaredHTMLEncoding, alcaDQMUpload.encode(), match(), and BeautifulSoup.UnicodeDammit.unicode.
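One common way to detect a declared XML encoding is to read the encoding pseudo-attribute from the XML declaration at the start of the byte stream. The sketch below shows only that step in Python 3. It assumes an ASCII-compatible document and does not attempt byte order mark or EBCDIC handling. The names XML_DECL_ENCODING and sniff_xml_encoding are hypothetical.

```python
import re

# Matches the encoding pseudo-attribute of a leading XML declaration.
XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def sniff_xml_encoding(data):
    """Return the lower-cased declared encoding, or None if none is declared."""
    match = XML_DECL_ENCODING.match(data)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


declared = sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>')
```

A caller would typically treat the result as one more candidate to try, not as a guarantee, since documents can declare an encoding they do not actually use.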
def BeautifulSoup.UnicodeDammit._ebcdic_to_ascii (self, s)
private
Definition at line 1952 of file BeautifulSoup.py.
References __class__< T >.__class__(), pat::__class__.__class__(), join(), and genParticles_cff.map.
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding().
def BeautifulSoup.UnicodeDammit._subMSChar (self, orig)
private
Changes an MS smart quote character to an XML or HTML entity.
Definition at line 1803 of file BeautifulSoup.py.
References BeautifulSoup.BeautifulStoneSoup.smartQuotesTo, and BeautifulSoup.UnicodeDammit.smartQuotesTo.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
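As an illustration of the substitution described for _subMSChar, the following Python 3 sketch replaces four windows-1252 quote bytes with either numeric XML character references or named HTML entities. The table SMART_QUOTES is a partial, hypothetical stand-in for whatever table the library uses, and replace_smart_quotes is not a BeautifulSoup function.

```python
import re

# Hypothetical, partial table: windows-1252 byte -> (HTML entity name, hex code point).
SMART_QUOTES = {
    b"\x91": ("lsquo", "2018"),
    b"\x92": ("rsquo", "2019"),
    b"\x93": ("ldquo", "201C"),
    b"\x94": ("rdquo", "201D"),
}


def replace_smart_quotes(data, smart_quotes_to="xml"):
    """Replace smart-quote bytes with XML numeric or HTML named entities."""

    def substitute(match):
        name, code_point = SMART_QUOTES[match.group(0)]
        if smart_quotes_to == "xml":
            return ("&#x%s;" % code_point).encode("ascii")
        return ("&%s;" % name).encode("ascii")

    # Only the four bytes listed in SMART_QUOTES are rewritten.
    return re.sub(rb"[\x91-\x94]", substitute, data)


as_xml = replace_smart_quotes(b"\x93hi\x94")
as_html = replace_smart_quotes(b"\x93hi\x94", smart_quotes_to="html")
```

The substitution runs on the raw bytes before decoding, so the result stays plain ASCII wherever a smart quote appeared.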
def BeautifulSoup.UnicodeDammit._toUnicode (self, data, encoding)
private
Given a string and its encoding, decodes the string into Unicode. encoding is a string recognized by encodings.aliases.
Definition at line 1842 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit.unicode.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
def BeautifulSoup.UnicodeDammit.find_codec (self, charset)
Definition at line 1935 of file BeautifulSoup.py.
References BeautifulSoup.UnicodeDammit._codec().
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
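A charset label can be checked against the codecs Python knows about with the standard codecs.lookup call. The sketch below is an assumption about how such a lookup might be written in Python 3, not a copy of find_codec or _codec. The function name resolve_charset and the spelling variants it tries are illustrative only.

```python
import codecs


def resolve_charset(charset):
    """Return Python's codec name for charset, or None if it is unknown.

    Hypothetical helper: tries the label as given, then with hyphens
    removed, then with hyphens replaced by underscores.
    """
    if not charset:
        return None
    variants = (charset, charset.replace("-", ""), charset.replace("-", "_"))
    for candidate in variants:
        try:
            return codecs.lookup(candidate).name
        except LookupError:
            # Unknown spelling: try the next variant.
            continue
    return None


known = resolve_charset("UTF8")
unknown = resolve_charset("no-such-charset")
```

Returning None for an unknown label lets the caller skip that candidate and continue with the next encoding instead of raising.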
BeautifulSoup.UnicodeDammit.declaredHTMLEncoding
Definition at line 1771 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding().
BeautifulSoup.UnicodeDammit.markup
Definition at line 1833 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
BeautifulSoup.UnicodeDammit.originalEncoding
Definition at line 1777 of file BeautifulSoup.py.
BeautifulSoup.UnicodeDammit.smartQuotesTo
Definition at line 1774 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom(), and BeautifulSoup.UnicodeDammit._subMSChar().
BeautifulSoup.UnicodeDammit.triedEncodings
Definition at line 1775 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._convertFrom().
BeautifulSoup.UnicodeDammit.unicode
Definition at line 1778 of file BeautifulSoup.py.
Referenced by BeautifulSoup.UnicodeDammit._detectEncoding(), and BeautifulSoup.UnicodeDammit._toUnicode().