diff --git a/infra/api/package/beautifulsoup4-4.9.1.dist-info/AUTHORS b/infra/api/package/beautifulsoup4-4.9.1.dist-info/AUTHORS deleted file mode 100644 index 1f14fe0..0000000 --- a/infra/api/package/beautifulsoup4-4.9.1.dist-info/AUTHORS +++ /dev/null @@ -1,49 +0,0 @@ -Behold, mortal, the origins of Beautiful Soup... -================================================ - -Leonard Richardson is the primary maintainer. - -Aaron DeVore and Isaac Muse have made significant contributions to the -code base. - -Mark Pilgrim provided the encoding detection code that forms the base -of UnicodeDammit. - -Thomas Kluyver and Ezio Melotti finished the work of getting Beautiful -Soup 4 working under Python 3. - -Simon Willison wrote soupselect, which was used to make Beautiful Soup -support CSS selectors. Isaac Muse wrote SoupSieve, which made it -possible to _remove_ the CSS selector code from Beautiful Soup. - -Sam Ruby helped with a lot of edge cases. - -Jonathan Ellis was awarded the prestigious Beau Potage D'Or for his -work in solving the nestable tags conundrum. 
- -An incomplete list of people have contributed patches to Beautiful -Soup: - - Istvan Albert, Andrew Lin, Anthony Baxter, Oliver Beattie, Andrew -Boyko, Tony Chang, Francisco Canas, "Delong", Zephyr Fang, Fuzzy, -Roman Gaufman, Yoni Gilad, Richie Hindle, Toshihiro Kamiya, Peteris -Krumins, Kent Johnson, Marek Kapolka, Andreas Kostyrka, Roel Kramer, -Ben Last, Robert Leftwich, Stefaan Lippens, "liquider", Staffan -Malmgren, Ksenia Marasanova, JP Moins, Adam Monsen, John Nagle, "Jon", -Ed Oskiewicz, Martijn Peters, Greg Phillips, Giles Radford, Stefano -Revera, Arthur Rudolph, Marko Samastur, James Salter, Jouni Seppänen, -Alexander Schmolck, Tim Shirley, Geoffrey Sneddon, Ville Skyttä, -"Vikas", Jens Svalgaard, Andy Theyers, Eric Weiser, Glyn Webster, John -Wiseman, Paul Wright, Danny Yoo - -An incomplete list of people who made suggestions or found bugs or -found ways to break Beautiful Soup: - - Hanno Böck, Matteo Bertini, Chris Curvey, Simon Cusack, Bruce Eckel, - Matt Ernst, Michael Foord, Tom Harris, Bill de hOra, Donald Howes, - Matt Patterson, Scott Roberts, Steve Strassmann, Mike Williams, - warchild at redho dot com, Sami Kuisma, Carlos Rocha, Bob Hutchison, - Joren Mc, Michal Migurski, John Kleven, Tim Heaney, Tripp Lilley, Ed - Summers, Dennis Sutch, Chris Smith, Aaron Swartz, Stuart - Turner, Greg Edwards, Kevin J Kalupson, Nikos Kouremenos, Artur de - Sousa Rocha, Yichun Wei, Per Vognsen diff --git a/infra/api/package/beautifulsoup4-4.9.1.dist-info/COPYING.txt b/infra/api/package/beautifulsoup4-4.9.1.dist-info/COPYING.txt deleted file mode 100644 index fb6ae69..0000000 --- a/infra/api/package/beautifulsoup4-4.9.1.dist-info/COPYING.txt +++ /dev/null @@ -1,27 +0,0 @@ -Beautiful Soup is made available under the MIT license: - - Copyright (c) 2004-2017 Leonard Richardson - - Permission is hereby granted, free of charge, to any person obtaining - a copy of this software and associated documentation files (the - "Software"), to deal in the Software without 
restriction, including - without limitation the rights to use, copy, modify, merge, publish, - distribute, sublicense, and/or sell copies of the Software, and to - permit persons to whom the Software is furnished to do so, subject to - the following conditions: - - The above copyright notice and this permission notice shall be - included in all copies or substantial portions of the Software. - - THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - SOFTWARE. - -Beautiful Soup incorporates code from the html5lib library, which is -also made available under the MIT license. Copyright (c) 2006-2013 -James Graham and other contributors diff --git a/infra/api/package/beautifulsoup4-4.9.1.dist-info/INSTALLER b/infra/api/package/beautifulsoup4-4.9.1.dist-info/INSTALLER deleted file mode 100644 index a1b589e..0000000 --- a/infra/api/package/beautifulsoup4-4.9.1.dist-info/INSTALLER +++ /dev/null @@ -1 +0,0 @@ -pip diff --git a/infra/api/package/beautifulsoup4-4.9.1.dist-info/LICENSE b/infra/api/package/beautifulsoup4-4.9.1.dist-info/LICENSE deleted file mode 100644 index 4c068ba..0000000 --- a/infra/api/package/beautifulsoup4-4.9.1.dist-info/LICENSE +++ /dev/null @@ -1,30 +0,0 @@ -Beautiful Soup is made available under the MIT license: - - Copyright (c) 2004-2019 Leonard Richardson - - Permission is hereby granted, free of charge, to any person obtaining - a copy of this software and associated documentation files (the - "Software"), to deal in the Software without restriction, including - without limitation the rights to use, copy, modify, merge, publish, - distribute, 
sublicense, and/or sell copies of the Software, and to - permit persons to whom the Software is furnished to do so, subject to - the following conditions: - - The above copyright notice and this permission notice shall be - included in all copies or substantial portions of the Software. - - THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - SOFTWARE. - -Beautiful Soup incorporates code from the html5lib library, which is -also made available under the MIT license. Copyright (c) 2006-2013 -James Graham and other contributors - -Beautiful Soup depends on the soupsieve library, which is also made -available under the MIT license. 
 Copyright (c) 2018 Isaac Muse diff --git a/infra/api/package/beautifulsoup4-4.9.1.dist-info/METADATA b/infra/api/package/beautifulsoup4-4.9.1.dist-info/METADATA deleted file mode 100644 index 1b4a564..0000000 --- a/infra/api/package/beautifulsoup4-4.9.1.dist-info/METADATA +++ /dev/null @@ -1,131 +0,0 @@ -Metadata-Version: 2.1 -Name: beautifulsoup4 -Version: 4.9.1 -Summary: Screen-scraping library -Home-page: http://www.crummy.com/software/BeautifulSoup/bs4/ -Author: Leonard Richardson -Author-email: leonardr@segfault.org -License: MIT -Download-URL: http://www.crummy.com/software/BeautifulSoup/bs4/download/ -Platform: UNKNOWN -Classifier: Development Status :: 5 - Production/Stable -Classifier: Intended Audience :: Developers -Classifier: License :: OSI Approved :: MIT License -Classifier: Programming Language :: Python -Classifier: Programming Language :: Python :: 2.7 -Classifier: Programming Language :: Python :: 3 -Classifier: Topic :: Text Processing :: Markup :: HTML -Classifier: Topic :: Text Processing :: Markup :: XML -Classifier: Topic :: Text Processing :: Markup :: SGML -Classifier: Topic :: Software Development :: Libraries :: Python Modules -Description-Content-Type: text/markdown -Requires-Dist: soupsieve (>1.2) -Provides-Extra: html5lib -Requires-Dist: html5lib ; extra == 'html5lib' -Provides-Extra: lxml -Requires-Dist: lxml ; extra == 'lxml' - -Beautiful Soup is a library that makes it easy to scrape information -from web pages. It sits atop an HTML or XML parser, providing Pythonic -idioms for iterating, searching, and modifying the parse tree. - -# Quick start - -``` ->>> from bs4 import BeautifulSoup ->>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML") ->>> print soup.prettify() -<html> - <body> - <p> - Some - <b> - bad - <i> - HTML - </i> - </b> - </p>
- - ->>> soup.find(text="bad") -u'bad' ->>> soup.i -HTML -# ->>> soup = BeautifulSoup("tags are treated in HTML. Tags in this list - are immune from pretty-printing; their contents will always be - output as-is. - - :param string_containers: A dictionary mapping tag names to - the classes that should be instantiated to contain the textual - contents of those tags. The default is to use NavigableString - for every tag, no matter what the name. You can override the - default by changing DEFAULT_STRING_CONTAINERS. - - :param store_line_numbers: If the parser keeps track of the - line numbers and positions of the original markup, that - information will, by default, be stored in each corresponding - `Tag` object. You can turn this off by passing - store_line_numbers=False. If the parser you're using doesn't - keep track of this information, then setting store_line_numbers=True - will do nothing. - """ - self.soup = None - if multi_valued_attributes is self.USE_DEFAULT: - multi_valued_attributes = self.DEFAULT_CDATA_LIST_ATTRIBUTES - self.cdata_list_attributes = multi_valued_attributes - if preserve_whitespace_tags is self.USE_DEFAULT: - preserve_whitespace_tags = self.DEFAULT_PRESERVE_WHITESPACE_TAGS - self.preserve_whitespace_tags = preserve_whitespace_tags - if store_line_numbers == self.USE_DEFAULT: - store_line_numbers = self.TRACKS_LINE_NUMBERS - self.store_line_numbers = store_line_numbers - if string_containers == self.USE_DEFAULT: - string_containers = self.DEFAULT_STRING_CONTAINERS - self.string_containers = string_containers - - def initialize_soup(self, soup): - """The BeautifulSoup object has been initialized and is now - being associated with the TreeBuilder. - - :param soup: A BeautifulSoup object. - """ - self.soup = soup - - def reset(self): - """Do any work necessary to reset the underlying parser - for a new document. - - By default, this does nothing. 
- """ - pass - - def can_be_empty_element(self, tag_name): - """Might a tag with this name be an empty-element tag? - - The final markup may or may not actually present this tag as - self-closing. - - For instance: an HTMLBuilder does not consider a <p> tag to be - an empty-element tag (it's not in - HTMLBuilder.empty_element_tags). This means an empty <p> tag - will be presented as "<p></p>", not "<p/>" or "<p>". - - The default implementation has no opinion about which tags are - empty-element tags, so a tag will be presented as an - empty-element tag if and only if it has no children. - "
tag, and html5lib - doesn't. Abstracting this away lets us write simple tests - which run HTML fragments through the parser and compare the - results against other HTML fragments. - - This method should not be used outside of tests. - - :param fragment: A string -- fragment of HTML. - :return: A string -- a full HTML document. - """ - return fragment - - def set_up_substitutions(self, tag): - """Set up any substitutions that will need to be performed on - a `Tag` when it's output as a string. - - By default, this does nothing. See `HTMLTreeBuilder` for a - case where this is used. - - :param tag: A `Tag` - :return: Whether or not a substitution was performed. - """ - return False - - def _replace_cdata_list_attribute_values(self, tag_name, attrs): - """When an attribute value is associated with a tag that can - have multiple values for that attribute, convert the string - value to a list of strings. - - Basically, replaces class="foo bar" with class=["foo", "bar"] - - NOTE: This method modifies its input in place. - - :param tag_name: The name of a tag. - :param attrs: A dictionary containing the tag's attributes. - Any appropriate attribute values will be modified in place. - """ - if not attrs: - return attrs - if self.cdata_list_attributes: - universal = self.cdata_list_attributes.get('*', []) - tag_specific = self.cdata_list_attributes.get( - tag_name.lower(), None) - for attr in list(attrs.keys()): - if attr in universal or (tag_specific and attr in tag_specific): - # We have a "class"-type attribute whose string - # value is a whitespace-separated list of - # values. Split it into a list. - value = attrs[attr] - if isinstance(value, str): - values = nonwhitespace_re.findall(value) - else: - # html5lib sometimes calls setAttributes twice - # for the same tag when rearranging the parse - # tree. On the second call the attribute value - # here is already a list. If this happens, - # leave the value alone rather than trying to - # split it again. 
- values = value - attrs[attr] = values - return attrs - -class SAXTreeBuilder(TreeBuilder): - """A Beautiful Soup treebuilder that listens for SAX events. - - This is not currently used for anything, but it demonstrates - how a simple TreeBuilder would work. - """ - - def feed(self, markup): - raise NotImplementedError() - - def close(self): - pass - - def startElement(self, name, attrs): - attrs = dict((key[1], value) for key, value in list(attrs.items())) - #print("Start %s, %r" % (name, attrs)) - self.soup.handle_starttag(name, attrs) - - def endElement(self, name): - #print("End %s" % name) - self.soup.handle_endtag(name) - - def startElementNS(self, nsTuple, nodeName, attrs): - # Throw away (ns, nodeName) for now. - self.startElement(nodeName, attrs) - - def endElementNS(self, nsTuple, nodeName): - # Throw away (ns, nodeName) for now. - self.endElement(nodeName) - #handler.endElementNS((ns, node.nodeName), node.nodeName) - - def startPrefixMapping(self, prefix, nodeValue): - # Ignore the prefix for now. - pass - - def endPrefixMapping(self, prefix): - # Ignore the prefix for now. - # handler.endPrefixMapping(prefix) - pass - - def characters(self, content): - self.soup.handle_data(content) - - def startDocument(self): - pass - - def endDocument(self): - pass - - -class HTMLTreeBuilder(TreeBuilder): - """This TreeBuilder knows facts about HTML. - - Such as which tags are empty-element tags. - """ - - empty_element_tags = set([ - # These are from HTML5. - 'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'keygen', 'link', 'menuitem', 'meta', 'param', 'source', 'track', 'wbr', - - # These are from earlier versions of HTML and are removed in HTML5. - 'basefont', 'bgsound', 'command', 'frame', 'image', 'isindex', 'nextid', 'spacer' - ]) - - # The HTML standard defines these as block-level elements. 
Beautiful - # Soup does not treat these elements differently from other elements, - # but it may do so eventually, and this information is available if - # you need to use it. - block_elements = set(["address", "article", "aside", "blockquote", "canvas", "dd", "div", "dl", "dt", "fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "header", "hr", "li", "main", "nav", "noscript", "ol", "output", "p", "pre", "section", "table", "tfoot", "ul", "video"]) - - # The HTML standard defines an unusual content model for these tags. - # We represent this by using a string class other than NavigableString - # inside these tags. - # - # I made this list by going through the HTML spec - # (https://html.spec.whatwg.org/#metadata-content) and looking for - # "metadata content" elements that can contain strings. - # - # TODO: Arguably" will become " ", and " bar " will - be left alone. - - :param tag_name: The name of a markup tag. - """ - if self.empty_element_tags is None: - return True - return tag_name in self.empty_element_tags - - def feed(self, markup): - """Run some incoming markup through some parsing process, - populating the `BeautifulSoup` object in self.soup. - - This method is not implemented in TreeBuilder; it must be - implemented in subclasses. - - :return: None. - """ - raise NotImplementedError() - - def prepare_markup(self, markup, user_specified_encoding=None, - document_declared_encoding=None, exclude_encodings=None): - """Run any preliminary steps necessary to make incoming markup - acceptable to the parser. - - :param markup: Some markup -- probably a bytestring. - :param user_specified_encoding: The user asked to try this encoding. - :param document_declared_encoding: The markup itself claims to be - in this encoding. - :param exclude_encodings: The user asked _not_ to try any of - these encodings. 
- - :yield: A series of 4-tuples: - (markup, encoding, declared encoding, - has undergone character replacement) - - Each 4-tuple represents a strategy for converting the - document to Unicode and parsing it. Each strategy will be tried - in turn. - - By default, the only strategy is to parse the markup - as-is. See `LXMLTreeBuilderForXML` and - `HTMLParserTreeBuilder` for implementations that take into - account the quirks of particular parsers. - """ - yield markup, None, None, False - - def test_fragment_to_document(self, fragment): - """Wrap an HTML fragment to make it look like a document. - - Different parsers do this differently. For instance, lxml - introduces an empty