Skip to content

Commit

Permalink
Explicitly mention the generated file in rust/gen/README.md (#473)
Browse files Browse the repository at this point in the history
  • Loading branch information
ia0 authored Jun 3, 2024
1 parent 6444089 commit 4bd6918
Show file tree
Hide file tree
Showing 3 changed files with 29 additions and 7 deletions.
31 changes: 24 additions & 7 deletions rust/gen/README.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,27 @@
This crate is for maintenance purposes only. It is used to update the Rust library to a new model.
It relies on 2 symbolic links to be coherent:
- `rust/lib/src/model.onnx` should point to the model.
- `rust/gen/data/config.json` should point to the definition of the target labels for that model.
There are 2 files in the Rust library that depend on the model:

Ideally if those 2 symbolic links are incoherent some test should fail, such that a PR updating the
model without updating the config would not pass the continuous integration.
- The model itself, `rust/lib/src/model.onnx`, which can either be a symbolic link to the source of
truth in this repository (publishing the crate will dereference the symbolic link), or a copy of
that source of truth (in which case there should be a test to make sure the file is an up-to-date
copy).
- The labels describing the model output, `rust/lib/src/label.rs`, which is generated from the model
configuration, `rust/gen/data/config.json`, which can either be a symbolic link or a copy to the
source of truth in this repository, and should match the model above.

There is another test to make sure the library code is in sync with the config. If this test fails,
one simply needs to run `./sync.sh` from the `rust` directory to update the library.
The purpose of this crate is to generate this second file. There is a test to make sure that this
file is up-to-date. If the test fails, one simply needs to run `./sync.sh` from the `rust` directory
to regenerate the file.

An alternative design to generating the file before publishing the crate, would be to publish the
model configuration and use a build script to generate the file during compilation. This has a few
disadvantages:

- We need to publish the model configuration which contains more information than needed to use the
library.
- We need to use a build script, which is frown upon for security reasons, as the entity compiling
the library or CLI now needs to trust the build script, which can run arbitrary code. This only
matters when the entity compiling the library or CLI is not the same as the one running the
library or CLI (e.g. Debian maintainers), since the library and CLI too can run arbitrary code.
- Using a build script also increases compilation time (and compilation complexity) instead of
having it factored before publishing.
2 changes: 2 additions & 0 deletions rust/gen/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,8 @@ fn main() -> Result<()> {
let header = header.split("\n\n").next().context("main.rs does not contain an empty line")?;
writeln!(output, "{header}\n")?;
writeln!(output, "use crate::output::Metadata;\n")?;
writeln!(output, "// DO NOT EDIT, see link below for more information:")?;
writeln!(output, "// https://github.com/google/magika/tree/main/rust/gen\n")?;
writeln!(output, "/// Content type of a file.")?;
writeln!(output, "#[derive(Debug, Copy, Clone, PartialEq, Eq)]\n#[repr(u32)]")?;
writeln!(output, "pub enum Label {{")?;
Expand Down
3 changes: 3 additions & 0 deletions rust/lib/src/label.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,9 @@

use crate::output::Metadata;

// DO NOT EDIT, see link below for more information:
// https://github.com/google/magika/tree/main/rust/gen

/// Content type of a file.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
#[repr(u32)]
Expand Down

0 comments on commit 4bd6918

Please sign in to comment.