Public Member Functions
def popTag
This class will push a tag with only a single string child into the tag's parent as an attribute. The attribute's name is the tag name, and the value is the string child. An example should give the flavor of the change:

    <foo><bar>baz</bar></foo>
    =>
    <foo bar="baz"><bar>baz</bar></foo>

You can then access fooTag['bar'] instead of fooTag.barTag.string.

This is, of course, useful for scraping structures that tend to use subelements instead of attributes, such as SOAP messages. Note that it modifies its input, so don't print the modified version out.

I'm not sure how many people really want to use this class; let me know if you do. Mainly I like the name.
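The transformation described above can be tried out without BeautifulSOAP itself. The following is only a sketch of the same idea built on the standard library's xml.etree.ElementTree; the helper name promote_text_children is hypothetical and is not part of BeautifulSoup.py.

```python
import xml.etree.ElementTree as ET


def promote_text_children(element):
    """Copy each text-only child into its parent as an attribute, in place.

    Hypothetical helper that mimics the effect of BeautifulSOAP on an
    ElementTree element; it is not the BeautifulSoup implementation.
    """
    for child in element:
        promote_text_children(child)
        # A child qualifies when it has no subelements and does hold text.
        has_only_text = len(child) == 0 and child.text is not None
        # An attribute the parent already carries is never overwritten.
        if has_only_text and child.tag not in element.attrib:
            element.set(child.tag, child.text)
    return element


root = ET.fromstring("<foo><bar>baz</bar></foo>")
promote_text_children(root)
print(root.get("bar"))                        # baz
print(ET.tostring(root, encoding="unicode"))  # <foo bar="baz"><bar>baz</bar></foo>
```

As in the class description, the subelement stays where it was, so the serialized tree carries the value twice.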
Definition at line 1653 of file BeautifulSoup.py.
def BeautifulSoup::BeautifulSOAP::popTag(self)
Reimplemented from BeautifulSoup::BeautifulStoneSoup.
Definition at line 1673 of file BeautifulSoup.py.
    def popTag(self):
        if len(self.tagStack) > 1:
            tag = self.tagStack[-1]
            parent = self.tagStack[-2]
            parent._getAttrMap()
            if (isinstance(tag, Tag) and len(tag.contents) == 1 and
                isinstance(tag.contents[0], NavigableString) and
                not parent.attrMap.has_key(tag.name)):
                parent[tag.name] = tag.contents[0]
        BeautifulStoneSoup.popTag(self)

    #Enterprise class names! It has come to our attention that some people
    #think the names of the Beautiful Soup parser classes are too silly
    #and "unprofessional" for use in enterprise screen-scraping. We feel
    #your pain! For such-minded folk, the Beautiful Soup Consortium And
    #All-Night Kosher Bakery recommends renaming this file to
    #"RobustParser.py" (or, in cases of extreme enterprisiness,
    #"RobustParserBeanInterface.class") and using the following
    #enterprise-friendly class aliases:
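The listing relies on dict.has_key, which exists only in Python 2. As a rough illustration of the same stack logic in Python 3, here is a minimal sketch; the Node class and the pop_tag function are hypothetical stand-ins for Tag and the parser's tagStack handling, not code from BeautifulSoup.py.

```python
class Node:
    """Hypothetical stand-in for a parsed tag: a name, attributes, children."""

    def __init__(self, name):
        self.name = name
        self.attrs = {}
        self.contents = []  # child Node objects or plain strings


def pop_tag(tag_stack):
    """Pop the innermost open tag, first copying a lone string child upward."""
    if len(tag_stack) > 1:
        tag, parent = tag_stack[-1], tag_stack[-2]
        lone_string = (len(tag.contents) == 1
                       and isinstance(tag.contents[0], str))
        # `name not in mapping` is the Python 3 spelling of
        # `not mapping.has_key(name)`.
        if lone_string and tag.name not in parent.attrs:
            parent.attrs[tag.name] = tag.contents[0]
    return tag_stack.pop()


# Simulate the parser state just before </bar> closes in <foo><bar>baz</bar></foo>.
foo, bar = Node("foo"), Node("bar")
foo.contents.append(bar)
bar.contents.append("baz")
stack = [Node("[document]"), foo, bar]

pop_tag(stack)
print(foo.attrs)  # {'bar': 'baz'}
```

The check runs at the moment a tag is closed, which is why the real method does its work before delegating to BeautifulStoneSoup.popTag.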