[Feature]: Memory-Efficient Stream Handling in pdf.js #19357

Open
rbsdc opened this issue Jan 20, 2025 · 0 comments

Is the feature relevant to the Firefox PDF Viewer?

No

Feature description

Currently, a Stream object in pdf.js appears to require the entire stream to be held in memory (if I am not mistaken, the constructor takes an ArrayBuffer). This is inefficient for large streams, such as flate-encoded attachments in PDFs.
I would like to ask:

  1. Is there an existing memory-efficient way in pdf.js to read large (encoded) streams, either without loading the entire stream into memory or by reading it block by block?
  2. If not, would it be possible to implement a more memory-efficient mechanism?

One possible solution would be to introduce a StreamReader (or similar) abstraction that decouples Stream from its backing storage. The constructor of Stream could then look like, for example, constructor(streamReader, start, length, dict). Possible implementations include:

  • ArrayBuffer-based implementation: behaves like the current implementation.
  • File-backed implementation: a stream reader that keeps a reference to the file and reads only the required portions on demand, without loading the entire stream.
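
To make the proposal more concrete, here is a minimal sketch of what the two readers and a Stream-like consumer could look like. Everything in it is hypothetical: ArrayBufferStreamReader, FileStreamReader, ReaderBackedStream, and the read(start, end) / length contract are names made up for illustration and are not part of the pdf.js API. The file-backed variant assumes Node.js and uses synchronous fs calls for brevity; a browser version would need a different backing source, for example Blob.slice.

```javascript
const fs = require("node:fs");

// Hypothetical reader contract: `length` in bytes and
// `read(start, end)` returning a Uint8Array for [start, end).

// ArrayBuffer-backed reader: mirrors the current all-in-memory behaviour.
class ArrayBufferStreamReader {
  constructor(arrayBuffer) {
    this.bytes = new Uint8Array(arrayBuffer);
  }
  get length() {
    return this.bytes.length;
  }
  read(start, end) {
    // subarray() returns a view, so no copy is made.
    return this.bytes.subarray(start, Math.min(end, this.bytes.length));
  }
}

// File-backed reader: keeps only a file descriptor open and
// reads the requested range on demand.
class FileStreamReader {
  constructor(path) {
    this.fd = fs.openSync(path, "r");
    this.length = fs.fstatSync(this.fd).size;
  }
  read(start, end) {
    const size = Math.max(0, Math.min(end, this.length) - start);
    if (size === 0) {
      return new Uint8Array(0);
    }
    const buf = new Uint8Array(size);
    const bytesRead = fs.readSync(this.fd, buf, 0, size, start);
    return buf.subarray(0, bytesRead);
  }
  close() {
    fs.closeSync(this.fd);
  }
}

// Stream-like consumer using the constructor shape proposed above.
// It only ever asks the reader for the bytes it is about to return.
class ReaderBackedStream {
  constructor(streamReader, start = 0, length = streamReader.length - start, dict = null) {
    this.reader = streamReader;
    this.start = start;
    this.end = start + length;
    this.pos = start;
    this.dict = dict;
  }
  // Returns up to `count` bytes and advances the position.
  getBytes(count) {
    const stop = Math.min(this.pos + count, this.end);
    const chunk = this.reader.read(this.pos, stop);
    this.pos += chunk.length;
    return chunk;
  }
}
```

With this shape, the consumer code stays the same regardless of the backing storage, and a large attachment could be processed block by block with repeated getBytes(blockSize) calls until an empty array is returned. The sketch covers raw byte access only; decoding (for example flate) would still have to consume those blocks incrementally to keep memory usage bounded.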

Other PDF viewers

No response
