Pubu PoC

中文版說明文件連結

A proof-of-concept implementation of the vulnerability ZD-2023-00144, which allows an attacker to download books from pubu.com.tw without buying them. For more information, please refer to this blog post.

Please note⚠️: This is only a demonstration and for educational purposes only. Downloading copyrighted materials may violate the law or the website's Terms of Service. Use at your own risk.

Update (2023/08/27): This vulnerability is now fixed.

Demo

Usage

This repo requires python3 (version >= 3.8) and some additional packages.

Clone this repo: git clone https://github.com/joeywang4/Pubu-PoC
Change to the cloned directory: cd Pubu-PoC
Install required packages: pip install -r requirements.txt
Download a list of books: python main.py -v [book_id ...]
Find the book at the output/ folder.

book_id is a number that represents a book's ID, which can be found in the URL of the book. For example, the book URL https://www.pubu.com.tw/ebook/999 indicates that this book's book_id is 999. Therefore, to download this book, execute python main.py -v 999.

Command line arguments

main.py accepts several arguments to execute on different modes or output to different folders. The followings are some useful arguments, and please execute python main.py -h for the complete list of arguments.

-v or --verbose: generate verbose output
-t or --threads: adjust the number of threads for the crawler
-o or --output: configure the path to the output directory (default to output/)
-c or --change-decode: adjust the decode method when generating the PDF file. This option is required if the downloaded book is decoded incorrectly.

Known issues

Incorrect pages: sometimes books are decoded incorrectly, and the pages will look like tiles that are composed in a weird order. In this case, execute main.py with the -c option.
Searching never ends: some book's pages are not continuous and can be difficult to search. Please be patient or pre-fetch the pages (also takes a lot of time!) before downloading the book.

Local database

When downloading a book, this program will try to search the pages online, and sometimes searching can be very time-consuming. This program can pre-fetch the pages into a local database, and so the searching step can be skipped when downloading a book.

To pre-fetch the pages, execute python main.py -u pages, or use the -u all option to also pre-fetch the books. For more information about the local database, please refer to this file.

Please note that pre-fetching the up-to-date data can take a very long time, and the local database will consume around 1G of storage.

Testing

The test files are located at test/. Run pytest to execute them.

Contribution

PRs are welcome! Please run black and pylint after making changes to the code and add test cases for them if possible.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Pubu PoC

Demo

Usage

Command line arguments

Known issues

Local database

Testing

Contribution

Files

README.md

Latest commit

History

README.md

File metadata and controls

Pubu PoC

Demo

Usage

Command line arguments

Known issues

Local database

Testing

Contribution