Fix incomplete attached file problem #1261

vlasakm · 2025-02-22T11:13:20Z

Currently, we only have ttstub_input_read function, which corresponds to Rust's read_exact. This seems to work for most of xetex, but it's not semantically a replacement for buffered reading with fread, which is used in a dvipdfmx.

One of such places is reading a file to be included in a PDF file as attachment. This is done through dvipdfmx special like this:

\special{pdf:fstream @SourceFile (example.txt)}

These reads are done in 1024 byte chunks, and if we read these chunks with read_exact, we are going to fail unless the file size is not exactly divisible by 1024.

Tectonic would report this as:

warning: 1024-byte read failed
caused by: failed to fill whole buffer

To fix this issue in a non-intrusive way, we introduce a new variant of reads - ttstub_input_read_partial, which follows Rust's read semantics, and not read_exact's, and we use it for the dvipdfmx fstream reading.

Likely any place that calls ttstub_input_read in a loop is also a candidate for using ttstub_input_read_partial, and perhaps it would work better for others.

But there definitely are places in xetex that depend on the read_exact semantics. For example, when I naively just tried to make all reads into ttstub_input_read_partial, xetex failed pretty quickly on "undumping the format file", which has a few initial 4 byte reads, but then it tries to undump big string pool which was many kilobytes in size. Rust read however would only return 8196 minus couple of the the 4 bytes, which would fail a check on the TeX side which expects full read. AFAICT xetex uses fread, so this probably just comes down to buffering differences between fread and Rust's Reader / BufReader.

Relates to #935
Fixes #1260

@sourcefile

Currently, we only have `ttstub_input_read` function, which corresponds to Rust's `read_exact`. This seems to work for most of xetex, but it's not semantically a replacement for buffered reading with `fread`, which is used in a dvipdfmx. One of such places is reading a file to be included in a PDF file as attachment. This is done through dvipdfmx special like this: \special{pdf:fstream @sourcefile (example.txt)} These reads are done in 1024 byte chunks, and if we read these chunks with read_exact, we are going to fail unless the file size is not exactly divisible by 1024. Tectonic would report this as: warning: 1024-byte read failed caused by: failed to fill whole buffer To fix this issue in a non-intrusive way, we introduce a new variant of reads - `ttstub_input_read_partial`, which follows Rust's `read` semantics, and not `read_exact`'s, and we use it for the dvipdfmx fstream reading. Likely any place that calls `ttstub_input_read` in a loop is also a candidate for using `ttstub_input_read_partial`, and perhaps it would work better for others. But there definitely are places in xetex that depend on the `read_exact` semantics. For example, when I naively just tried to make all reads into `ttstub_input_read_partial`, xetex failed pretty quickly on "undumping the format file", which has a few initial 4 byte reads, but then it tries to undump big string pool which was many kilobytes in size. Rust `read` however would only return 8196 minus couple of the the 4 bytes, which would fail a check on the TeX side which expects full read. AFAICT xetex uses `fread`, so this probably just comes down to buffering differences between `fread` and Rust's `Reader` / `BufReader`. Relates to tectonic-typesetting#935 Fixes tectonic-typesetting#1260

codecov · 2025-02-22T11:51:56Z

Codecov Report

Attention: Patch coverage is 0% with 15 lines in your changes missing coverage. Please review.

Project coverage is 46.34%. Comparing base (c2ae25f) to head (e335870).

Files with missing lines	Patch %	Lines
crates/bridge_core/src/lib.rs	0.00%	13 Missing ⚠️
crates/bridge_core/support/support.c	0.00%	1 Missing ⚠️
crates/pdf_io/pdf_io/dpx-spc_pdfm.c	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1261      +/-   ##
==========================================
- Coverage   46.35%   46.34%   -0.02%     
==========================================
  Files         184      184              
  Lines       66222    66236      +14     
==========================================
- Hits        30695    30694       -1     
- Misses      35527    35542      +15

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

vlasakm · 2025-02-22T14:36:57Z

I provided more details in #1260 (comment).

CraftSpider · 2025-02-23T18:30:41Z

Thank you for the fix! At some point I'd like to get a fulldeps test suite set up for testing stuff with dependencies, but for now, do you think there's an easy way to write a test that covers this behavior without it?

vlasakm · 2025-02-23T20:57:06Z

At some point I'd like to get a fulldeps test suite set up for testing stuff with dependencies

Sorry, I am not familiar with Tectonic's tests at all. What does fulldeps mean?

do you think there's an easy way to write a test that covers this behavior without it?

I assume you mean to test that pdf:fstream special gets emitted correctly. I posted LaTeX test case here: #1260 (comment), I don't know exactly what our options are with regards to testing, but here is a minimal plain XeTeX test with no external dependencies:

\special{dvipdfmx:config C 0x40} % don't use object streams
\special{dvipdfmx:config z 0}    % don't use compression

\special{pdf:fstream @SourceFile (\jobname.tex)}
\special{pdf:ann width 10bp height 20bp
  <<
    /Type /Annot
    /Subtype /FileAttachment
    /FS <<
      /Type /Filespec
      /F (\jobname.tex)
      /EF << /F @SourceFile >>
    >>
    /Name /PushPin
    /C [0.8 0.2 0.2]
    /T (Source code of this document)
    /Subj (Source code of this document)
    /Contents (Source code of this document)
  >>
}

\bye

It's based on dvipdfmx manual, and embeds the TeX source file as an attachment. Other options make the PDF "readable", so we could check that:

There are no warnings emitted.
That the embedded PDF stream for the source file has the expected length.

CraftSpider · 2025-02-23T21:49:59Z

Sorry, I am not familiar with Tectonic's tests at all. What does fulldeps mean?

Sorry, jargon borrowed from the Rust compiler. There, the 'fulldeps' are tests that need access to the 'full' compiler, including as an external dependency. Here, I'm using it to imply a project with access to the default bundle or Tectonic.toml usage, which would make writing some test cases much easier.

<snip> but here is a minimal plain XeTeX test with no external dependencies: <snip>

Thanks, that's exactly what I meant! For now using external dependencies in tests is hard, so something that only relies on
XeTeX is perfect. If you want, I can push such a test directly to this branch when I have a minute.

vlasakm · 2025-02-23T23:45:27Z

Sorry, I am not familiar with Tectonic's tests at all. What does fulldeps mean?

Sorry, jargon borrowed from the Rust compiler. There, the 'fulldeps' are tests that need access to the 'full' compiler, including as an external dependency. Here, I'm using it to imply a project with access to the default bundle or Tectonic.toml usage, which would make writing some test cases much easier.

I see.

If you want, I can push such a test directly to this branch when I have a minute.

Please do if you have a minute.

It would take me some time to figure out how to test this in Tectonic test suite, and to be honest, don't have much chances now. I promise this well serve as an example for me to write such test myself in the future.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix incomplete attached file problem #1261

Fix incomplete attached file problem #1261

vlasakm commented Feb 22, 2025

codecov bot commented Feb 22, 2025

vlasakm commented Feb 22, 2025

CraftSpider commented Feb 23, 2025

vlasakm commented Feb 23, 2025

CraftSpider commented Feb 23, 2025

vlasakm commented Feb 23, 2025

Fix incomplete attached file problem #1261

Are you sure you want to change the base?

Fix incomplete attached file problem #1261

Conversation

vlasakm commented Feb 22, 2025

codecov bot commented Feb 22, 2025

Codecov Report

vlasakm commented Feb 22, 2025

CraftSpider commented Feb 23, 2025

vlasakm commented Feb 23, 2025

CraftSpider commented Feb 23, 2025

vlasakm commented Feb 23, 2025