[Feature]: Memory-Efficient Stream Handling in pdf.js #19357

rbsdc · 2025-01-20T20:31:29Z

Is the feature relevant to the Firefox PDF Viewer?

No

Feature description

Currently, a Stream object in pdf.js requires the entire stream to be loaded into memory (if I am not mistaken, the constructor requires an ArrayBuffer). This can be inefficient for large streams, such as flate-encoded attachments in PDFs.
I would like to ask:

Is there an existing memory-efficient way to read large (encoded) streams in pdf.js, either without loading the entire stream into memory or by reading it block by block?
If not, would it be possible to implement a more memory-efficient mechanism?

One possible solution could be to introduce a StreamReader (or similar) abstraction that allows for more efficient handling of streams (the constructor of Stream would then, for example, look like constructor(streamReader, start, length, dict)). Possible implementations include:

ArrayBuffer-based Implementation: like the current implementation.
File-backed Implementation: A stream reader that maintains a reference to the file and reads only the required portions on demand, without loading the entire stream.
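
To make the proposal more concrete, here is a minimal sketch of how the two reader variants and a Stream-like consumer could fit together. All names in it (ArrayBufferReader, ChunkedReader, ReaderStream, readRange, fetchRange) are hypothetical and chosen only for illustration; none of this is existing pdf.js API, and the fetchRange callback stands in for whatever would actually read from the file.

```javascript
// Sketch only: these classes are hypothetical and not part of pdf.js.

// ArrayBuffer-based reader: all bytes are already in memory.
class ArrayBufferReader {
  constructor(arrayBuffer) {
    this.bytes = new Uint8Array(arrayBuffer);
    this.length = this.bytes.length;
  }

  readRange(begin, end) {
    return this.bytes.subarray(begin, end);
  }
}

// On-demand reader: keeps at most one chunk in memory at a time.
// fetchRange(begin, end) is an assumed synchronous callback that returns
// a Uint8Array, for example a thin wrapper around a file read.
class ChunkedReader {
  constructor(fetchRange, length, chunkSize = 65536) {
    this.fetchRange = fetchRange;
    this.length = length;
    this.chunkSize = chunkSize;
    this.chunkIndex = -1;
    this.chunk = null;
  }

  readRange(begin, end) {
    const out = new Uint8Array(end - begin);
    let pos = begin;
    while (pos < end) {
      const index = Math.floor(pos / this.chunkSize);
      const chunkStart = index * this.chunkSize;
      if (index !== this.chunkIndex) {
        // Replace the cached chunk, so only one chunk is ever retained.
        const chunkEnd = Math.min(chunkStart + this.chunkSize, this.length);
        this.chunk = this.fetchRange(chunkStart, chunkEnd);
        this.chunkIndex = index;
      }
      const offset = pos - chunkStart;
      const count = Math.min(this.chunk.length - offset, end - pos);
      if (count <= 0) {
        throw new Error("fetchRange returned too few bytes");
      }
      out.set(this.chunk.subarray(offset, offset + count), pos - begin);
      pos += count;
    }
    return out;
  }
}

// Stream-like consumer with the constructor shape suggested above.
class ReaderStream {
  constructor(streamReader, start = 0, length = streamReader.length - start, dict = null) {
    this.reader = streamReader;
    this.pos = start;
    this.end = start + length;
    this.dict = dict;
  }

  // Returns the next byte, or -1 once the end of the stream is reached.
  getByte() {
    if (this.pos >= this.end) {
      return -1;
    }
    const byte = this.reader.readRange(this.pos, this.pos + 1)[0];
    this.pos += 1;
    return byte;
  }

  // Returns up to `length` bytes, clamped to the end of the stream.
  getBytes(length) {
    const end = Math.min(this.pos + length, this.end);
    const bytes = this.reader.readRange(this.pos, end);
    this.pos = end;
    return bytes;
  }
}
```

The sketch keeps reads synchronous for simplicity. A real file-backed reader may need asynchronous reads, which this sketch does not address.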

Other PDF viewers

No response
