read_archived_html
- read_archived_html(archive_path: str | CloudPath | Path, regex: str) → HtmlElement
Read archived HTML from zip or tar archives. You can use this site to build your regex.
- Parameters:
archive_path (AnyPathStrType) – Archive path
regex (str) – Regex (as used by re) matching the HTML file's path as it appears in the archive's getmembers() list
- Returns:
HTML file
- Return type:
html._Element
Example
>>> arch_path = 'D:/path/to/zip.zip'
>>> file_regex = '.*dir.*file_name'  # Use .* for any character
>>> read_archived_html(arch_path, file_regex)
<Element html at 0x1c90007f8c8>
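
To illustrate how the regex is matched against archive member names, here is a minimal standard-library sketch. It is not the library's implementation: `read_archived_text` is a hypothetical helper, it returns the raw HTML text instead of a parsed element, and it assumes UTF-8 content and that the regex is applied with `re.match` (anchored at the start of the member name).

```python
import re
import tarfile
import zipfile


def read_archived_text(archive_path: str, regex: str) -> str:
    """Hypothetical helper: return the text of the first member matching `regex`.

    Assumes UTF-8 content and start-anchored matching (re.match).
    """
    pattern = re.compile(regex)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            # namelist() gives the member paths the regex is tested against
            for name in zf.namelist():
                if pattern.match(name):
                    return zf.read(name).decode("utf-8")
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as tf:
            # getmembers() gives TarInfo objects; the path is in .name
            for member in tf.getmembers():
                if member.isfile() and pattern.match(member.name):
                    return tf.extractfile(member).read().decode("utf-8")
    else:
        raise TypeError(f"Not a zip or tar archive: {archive_path}")

    raise FileNotFoundError(f"No member matching {regex!r} in {archive_path}")
```

With this sketch, a pattern such as `'.*dir.*file_name'` selects the first member whose path contains `dir` followed later by `file_name`. The returned text could then be handed to an HTML parser of your choice.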