Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(wasm): update NPM build script #3429

Closed
wants to merge 4 commits into from
Closed

chore(wasm): update NPM build script #3429

wants to merge 4 commits into from

Conversation

TalDerei
Copy link
Collaborator

@TalDerei TalDerei commented Nov 23, 2023

References #3428

Additionally, looking into other compile-time techniques to further reduce wasm binary size from 8.5 MB:

[profile.release]
opt-level = 's'
lto = true

I published penumbra-zone-test in the NPM registry for testing purposes (updated version is 0.63.3-preview-8 per the changes in #3432).

@TalDerei TalDerei self-assigned this Nov 23, 2023
@TalDerei TalDerei marked this pull request as ready for review November 25, 2023 04:43
@TalDerei TalDerei marked this pull request as draft November 26, 2023 16:54
@TalDerei TalDerei marked this pull request as ready for review November 26, 2023 16:56
Comment on lines 49 to 51
if (!packageJson.files.includes('bin')) {
packageJson.files.push('bin');
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this needs a code comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

@@ -13,9 +13,42 @@ TARGETS.forEach(target => {
},
);

if (target === 'bundler') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the bundler name is a bit confusing. Those three targets correspond to different ways the javascript can be packaged together. In this case, bundler is a npm package that is meant to be used for folks using webpack. We use this in our case in prod. For tests, we use the nodejs package.

That means, we actually need to have six packages published. The three targets x2 (with & without .bin files).

In this case, I'd put the old code in a function (compileWasm). Create a new main() function and have it call compileWasm + see if a --bundle-bin flag was passed. If so, run a second function to do the copying like you did.

const targetPackageDir = path.join(process.cwd(), `${target}`);

// Ensure the target binary directory exists
if (existsSync(binaryDir)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't we should throw an error

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added error handling

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is out of scope, but I wonder if we should re-write this little script in Rust. Initially, the thinking was that this was low weight and the sync methods weren't that big of a deal. Now that we likely need to do this six times (3 targets x 2 bin bundle options), I think we should introduce parallelism here. And if so, perhaps Rust (multithreading w/ Rayon) makes this easier and fits more within the wider repo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep it in Typescript, think we should do something like:

import util from 'util';
import { exec } from 'child_process';

const asyncExec = util.promisify(exec);

const TARGETS = ['bundler', 'web', 'nodejs'];

(async function () {
  const promises = TARGETS.map(target =>
    asyncExec(
      `wasm-pack build ../ --release --target ${target} --out-name index --out-dir publish/${target}-${binBundle}`,
    ),
  );
  await Promise.all(promises);
})();

or perhaps this library is easier to use and more visually discernable (the above does not have stdout): https://www.npmjs.com/package/concurrently

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rewriting the script in Rust might be a bit out of scope for now. I can experiment with these suggestions!

crates/wasm/Cargo.toml Show resolved Hide resolved
@grod220
Copy link
Contributor

grod220 commented Nov 28, 2023

I'm realizing that this package is not really enough for a consumer to load the proving keys easily.

  • Javascript bundlers cannot easily load bin files. The consumers have to roll a special webpack config to be able to include these files in the bundle.
  • Loading is not super straightforward. The consumers need to be able to pull the bin file into memory somehow. Probably fetch. However, in order to do that, the above needs to be setup to import the paths to those files.

Some ideas:

  • We could expose a new function that handles the challenges of loading+fetching
  • We could stringify the bin file and exported it. Something like:
 const convertBinToBase64 = (filePath: string): string => {
        const data = readFileSync(filePath);
        return data.toString('base64');
      };

      binaryFiles.forEach(file => {
        const sourcePath = path.join(binaryDir, file);
        const targetPath = path.join(targetBinaryDir, file);
        copyFileSync(sourcePath, targetPath);

        const base64Data = convertBinToBase64(targetPath);
        const exportName = path.basename(file, '.bin');
        const tsFilePath = path.join(targetPackageDir, `${exportName}.ts`);
        const tsContent = `export const ${exportName} = "${base64Data}";\n`;
        writeFileSync(tsFilePath, tsContent, 'utf-8');
      });

In any case, exposing new files/methods is not trivial. Wasm-pack outputs a fully-formed npm package that we'd have to manually manipulate via scripts. Or, perhaps best, we create a new npm package that imports the wasm-pack package. That way we have control to expose new functions as well as re-export the wasm functions. In this case, we'd need to use a bundler like webpack/parcel.

What do you think?

@hdevalence
Copy link
Member

Javascript bundlers cannot easily load bin files. The consumers have to roll a special webpack config to be able to include these files in the bundle.

Naive questions, but:

  • Are there binary assets that bundlers do support?
  • Why does a consumer need to put the proving keys in a bundler, rather than loading them from some other means (e.g., downloading them, loading them from a file, etc)?

@TalDerei
Copy link
Collaborator Author

TalDerei commented Nov 28, 2023

We can mimic Aztec's strategy in bb.js and download the CRS files from a centralized server rather than including them in the NPM packages. Maybe also separately including them in the bundlers for maximal flexibility is probably a good idea?

Reverting this PR to a draft until we decide.

@TalDerei TalDerei marked this pull request as draft November 28, 2023 04:12
@TalDerei
Copy link
Collaborator Author

TalDerei commented Nov 28, 2023

@hdevalence What cloud-based object storage services does Penumbra use? Where is the sqlite3 database for the ceremony hosted, Amazon S3?

@TalDerei
Copy link
Collaborator Author

TalDerei commented Nov 28, 2023

@grod220 I experienced similar challenges trying to modify the web-pack config in the webgpu repository to get the javascript bundlers to load the bin files. I suggest we skip all this unnecessary complexity and indirection and instead fetch the proving keys from a cloud provider.

I modified the TS script to conditionally bundle the proving keys in the three NPM packages if the --bundle-bin is passed (npm run publish-wasm --bundle-bin).

@grod220
Copy link
Contributor

grod220 commented Nov 28, 2023

Are there binary assets that bundlers do support?

We need to eventually pass it to the wasm function as a Unit8Array (right?). We could just store it as that and allow folks to import it from the npm package and simply pass it to the function. This still would require us making the npm package that wraps/re-exports the output from wasm pack.

Why does a consumer need to put the proving keys in a bundler, rather than loading them from some other means

In theory, this package doesn't have to provide the keys at all and the consumer is responsible for going to get it. Perhaps there are some benefits to this (?). The downside is that there would be runtime fetching costs. If they wanted to avoid this, they'd have to go through the same exercise we are talking about now (fetching from source & storing in bundler-compatible way).

@hdevalence
Copy link
Member

@hdevalence What cloud-based object storage services does Penumbra use? Where is the sqlite3 database for the ceremony hosted, Amazon S3?

Currently we use git-lfs hosted by Github (the download-proving-keys feature has a hand-rolled LFS client that's just capable enough to fetch the resources we need) but there's not a long-term plan yet.

Currently our storage infra is built around a model where there are few users (us) and frequent changes to the keys. But once we finish the ceremony, we'll have fixed proving keys and we'll be in a different problem context. At that point, I think it would make sense to host the proving keys on a CDN.

One option worth looking into is whether we can use Cloudflare's IPFS gateway, but Cloudflare's normal CDN is also probably a good bet.

The full ceremony transcripts are a different class of problem -- they'll probably be close to 1TB in size, and clients shouldn't need to download them unless they want to manually re-verify the transcripts.

@TalDerei TalDerei marked this pull request as ready for review November 28, 2023 19:30
@TalDerei
Copy link
Collaborator Author

TalDerei commented Nov 28, 2023

@grod220 and I agreed to load the keys over the network and revisit the "wrapper npm package" proposal later. Adding the bin files to the NPM packages would not in the current form be useful for consumers and potentially cause confusion. Closing PR.

@TalDerei TalDerei requested a review from grod220 November 28, 2023 19:30
@TalDerei TalDerei closed this Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants