You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Warctools uses the Content-Length field to determine the length of the body for validating and reading WARC files. Since the g-zipped bodies are no longer g-zipped in common-crawl WARC files, not the whole of g-zipped messages is being parsed. #14 fixes this and allows proper parsing common-crawl WARC files.
The text was updated successfully, but these errors were encountered:
Warctools uses the Content-Length field to determine the length of the body for validating and reading WARC files. Since the g-zipped bodies are no longer g-zipped in common-crawl WARC files, not the whole of g-zipped messages is being parsed.
#14 fixes this and allows proper parsing common-crawl WARC files.
The text was updated successfully, but these errors were encountered: