- Add warc_protocol, warc_version, warc_headers to wex response
- Fix errors in PhantomJS responses
- Handle non-utf-8 URLs
- Ensure utf-8 is tried first even if not declared
- Support onInitialized in PhantomJS required modules
- Add --label argument for easy process-wide labelling
- Fix shutdown error caused by daemon thread for timeout with PhantomJS
- Fix handling of directories in tarfiles read from stdin (-)
- Small fix to avoid a non-integer status code when an error occurs with PhantomJS
- Support 'params' keyword argument on URL.get
- Fix bug in handling HTML comments when fixing numeric character references
- Fix bug when using nested Cache objects
- Add support for reading WARC response format
- Fix bug in handling of invalid numeric character references
- Allow utf-8 in HTTP headers (only applies to PY2)
- Fix bug in HTTP decode caused by magic bytes handling.
- Add magic_bytes to Response for more reliable wex.http:decode behaviour.
- Re-worked encoding for HTML to pre-parse
- Better proxy support
- Now we flatten labels and values.
- href and src become href_url and src_url.
- Some API changes + switch to "tab-separated JSON".
- Uploaded sdist to PyPI for "pip install wextracto" simplicity.
- Initial release as open source