-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incomplete attached file problem #1261
base: master
Are you sure you want to change the base?
Fix incomplete attached file problem #1261
Conversation
Currently, we only have `ttstub_input_read` function, which corresponds to Rust's `read_exact`. This seems to work for most of xetex, but it's not semantically a replacement for buffered reading with `fread`, which is used in a dvipdfmx. One of such places is reading a file to be included in a PDF file as attachment. This is done through dvipdfmx special like this: \special{pdf:fstream @sourcefile (example.txt)} These reads are done in 1024 byte chunks, and if we read these chunks with read_exact, we are going to fail unless the file size is not exactly divisible by 1024. Tectonic would report this as: warning: 1024-byte read failed caused by: failed to fill whole buffer To fix this issue in a non-intrusive way, we introduce a new variant of reads - `ttstub_input_read_partial`, which follows Rust's `read` semantics, and not `read_exact`'s, and we use it for the dvipdfmx fstream reading. Likely any place that calls `ttstub_input_read` in a loop is also a candidate for using `ttstub_input_read_partial`, and perhaps it would work better for others. But there definitely are places in xetex that depend on the `read_exact` semantics. For example, when I naively just tried to make all reads into `ttstub_input_read_partial`, xetex failed pretty quickly on "undumping the format file", which has a few initial 4 byte reads, but then it tries to undump big string pool which was many kilobytes in size. Rust `read` however would only return 8196 minus couple of the the 4 bytes, which would fail a check on the TeX side which expects full read. AFAICT xetex uses `fread`, so this probably just comes down to buffering differences between `fread` and Rust's `Reader` / `BufReader`. Relates to tectonic-typesetting#935 Fixes tectonic-typesetting#1260
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1261 +/- ##
==========================================
- Coverage 46.35% 46.34% -0.02%
==========================================
Files 184 184
Lines 66222 66236 +14
==========================================
- Hits 30695 30694 -1
- Misses 35527 35542 +15 ☔ View full report in Codecov by Sentry. |
I provided more details in #1260 (comment). |
Thank you for the fix! At some point I'd like to get a fulldeps test suite set up for testing stuff with dependencies, but for now, do you think there's an easy way to write a test that covers this behavior without it? |
Sorry, I am not familiar with Tectonic's tests at all. What does fulldeps mean?
I assume you mean to test that
It's based on dvipdfmx manual, and embeds the TeX source file as an attachment. Other options make the PDF "readable", so we could check that:
|
Sorry, jargon borrowed from the Rust compiler. There, the 'fulldeps' are tests that need access to the 'full' compiler, including as an external dependency. Here, I'm using it to imply a project with access to the default bundle or
Thanks, that's exactly what I meant! For now using external dependencies in tests is hard, so something that only relies on |
I see.
Please do if you have a minute. It would take me some time to figure out how to test this in Tectonic test suite, and to be honest, don't have much chances now. I promise this well serve as an example for me to write such test myself in the future. |
Currently, we only have
ttstub_input_read
function, which corresponds to Rust'sread_exact
. This seems to work for most of xetex, but it's not semantically a replacement for buffered reading withfread
, which is used in a dvipdfmx.One of such places is reading a file to be included in a PDF file as attachment. This is done through dvipdfmx special like this:
These reads are done in 1024 byte chunks, and if we read these chunks with read_exact, we are going to fail unless the file size is not exactly divisible by 1024.
Tectonic would report this as:
To fix this issue in a non-intrusive way, we introduce a new variant of reads -
ttstub_input_read_partial
, which follows Rust'sread
semantics, and notread_exact
's, and we use it for the dvipdfmx fstream reading.Likely any place that calls
ttstub_input_read
in a loop is also a candidate for usingttstub_input_read_partial
, and perhaps it would work better for others.But there definitely are places in xetex that depend on the
read_exact
semantics. For example, when I naively just tried to make all reads intottstub_input_read_partial
, xetex failed pretty quickly on "undumping the format file", which has a few initial 4 byte reads, but then it tries to undump big string pool which was many kilobytes in size. Rustread
however would only return 8196 minus couple of the the 4 bytes, which would fail a check on the TeX side which expects full read. AFAICT xetex usesfread
, so this probably just comes down to buffering differences betweenfread
and Rust'sReader
/BufReader
.Relates to #935
Fixes #1260