From 8f4ff8a6cfb50d367cc66c39349178e392d00f64 Mon Sep 17 00:00:00 2001
From: Greg Tatum
Date: Wed, 16 Aug 2023 10:05:35 -0500
Subject: [PATCH 1/2] Update the README.md to refer to build-wasm.sh and point to the developer docs

When I was onboarding, it was hard to find the docs, and the wasm building
documentation was out of date.
---
 README.md | 54 +++++++++++------------------------------------------
 1 file changed, 11 insertions(+), 43 deletions(-)

diff --git a/README.md b/README.md
index 05c3c3d25..e775e25c7 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,12 @@
 
 [![CircleCI badge](https://img.shields.io/circleci/project/github/browsermt/bergamot-translator/main.svg?label=CircleCI)](https://circleci.com/gh/browsermt/bergamot-translator/)
 
-Bergamot translator provides a unified API for ([Marian NMT](https://marian-nmt.github.io/) framework based) neural machine translation functionality in accordance with the [Bergamot](https://browser.mt/) project that focuses on improving client-side machine translation in a web browser.
+Bergamot translator provides a unified API for ([Marian NMT](https://marian-nmt.github.io/) framework based) neural machine translation functionality in accordance with the [Bergamot](https://browser.mt/) project that focuses on improving client-side machine translation in a web browser. Read more about this project in the [developer documentation](https://browser.mt/docs/main/index.html).
 
 ## Build Instructions
 
 ### Build Natively
+
 Create a folder where you want to build all the artifacts (`build-native` in this case) and compile
 
 ```bash
@@ -16,52 +17,20 @@ cmake ../
 make -j2
 ```
 
-### Build WASM
-#### Prerequisite
-
-Building on wasm requires Emscripten toolchain.
It can be downloaded and installed using following instructions:
-
-* Get the latest sdk: `git clone https://github.com/emscripten-core/emsdk.git`
-* Enter the cloned directory: `cd emsdk`
-* Install the sdk: `./emsdk install 3.1.8`
-* Activate the sdk: `./emsdk activate 3.1.8`
-* Activate path variables: `source ./emsdk_env.sh`
-
-#### Compile
-
-To build a version that translates with higher speeds on Firefox Nightly browser, follow these instructions:
-
-  1. Create a folder where you want to build all the artifacts (`build-wasm` in this case) and compile
-     ```bash
-     mkdir build-wasm
-     cd build-wasm
-     emcmake cmake -DCOMPILE_WASM=on ../
-     emmake make -j2
-     ```
+For more detailed build instructions read the [Bergamot C++ Library](https://browser.mt/docs/main/marian-integration.html) docs.
 
-     The wasm artifacts (.js and .wasm files) will be available in the build directory ("build-wasm" in this case).
+### Build Wasm
 
-  2. Patch generated artifacts to import GEMM library from a separate wasm module
-     ```bash
-     bash ../wasm/patch-artifacts-import-gemm-module.sh
-     ```
+The process for building Wasm is controlled by the `build-wasm.sh` script. This script downloads the emscripten toolchain and generates the build artifacts in the `build-wasm` folder.
 
-To build a version that runs on all browsers (including Firefox Nightly) but translates slowly, follow these instructions:
-
-  1. Create a folder where you want to build all the artifacts (`build-wasm` in this case) and compile
-     ```bash
-     mkdir build-wasm
-     cd build-wasm
-     emcmake cmake -DCOMPILE_WASM=on ../
-     emmake make -j2
-     ```
+```bash
+./build-wasm.sh
+```
 
-  2. Patch generated artifacts to import GEMM library from a separate wasm module
-     ```bash
-     bash ../wasm/patch-artifacts-import-gemm-module.sh
-     ```
+For more information on running the Wasm see [Using Bergamot Translator in JavaScript](https://browser.mt/docs/main/wasm-example.html).
 #### Recompiling
+
 As long as you don't update any submodule, just follow [Compile](#Compile) steps.\
 If you update a submodule, execute following command in repository root folder before executing [Compile](#Compile) steps.
@@ -69,7 +38,6 @@ If you update a submodule, execute following command in repository root folder b
 git submodule update --init --recursive
 ```
 
-
 ## How to use
 
 ### Using Native version
@@ -77,6 +45,6 @@ git submodule update --init --recursive
 The builds generate library that can be integrated to any project. All the public header files are specified in `src` folder.\
 A short example of how to use the APIs is provided in `app/bergamot.cpp` file.
 
-### Using WASM version
+### Using Wasm version
 
 Please follow the `README` inside the `wasm` folder of this repository that demonstrates how to use the translator in JavaScript.

From 2b4a48f5c37ecc3805e0e4c3c0f50e3f3a613813 Mon Sep 17 00:00:00 2001
From: Greg Tatum
Date: Thu, 17 Aug 2023 11:46:00 -0500
Subject: [PATCH 2/2] Update the capitalization for Wasm across the project

---
 .circleci/config.yml | 4 +---
 .github/workflows/build.yml | 2 +-
 CMakeLists.txt | 21 ++++++++++-----------
 build-wasm.sh | 2 +-
 doc/marian-integration.rst | 2 +-
 src/tests/blocking.cpp | 2 +-
 src/tests/wasm.cpp | 6 +++---
 src/translator/CMakeLists.txt | 2 +-
 wasm/CMakeLists.txt | 2 +-
 wasm/README.md | 4 ++--
 wasm/module/README.md | 4 ++--
 wasm/module/translator.js | 8 ++++----
 wasm/module/worker/translator-worker.js | 16 ++++++++--------
 wasm/node-test.js | 6 +++---
 wasm/patch-artifacts-import-gemm-module.sh | 6 +++---
 wasm/test_page/js/index.js | 2 +-
 wasm/test_page/start_server.sh | 6 +++---
 17 files changed, 46 insertions(+), 49 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 52d58fc09..8892a5845 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -11,7 +11,7 @@ jobs:
       - checkout
 
       - run:
-          name: Build WASM
+          name: Build Wasm
           command: |
             bash build-wasm.sh
 
@@ -77,5 +77,3 @@ workflows:
               ignore: /.*/
           requires:
             - build
-
-
diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index 830924c2c..dbaee0e4f 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -248,7 +248,7 @@ jobs:
         run: |
           ccache -s # Print current cache stats
 
-      - name: Import GEMM library from a separate wasm module
+      - name: Import GEMM library from a separate Wasm module
         working-directory: build-wasm
         run: bash ../wasm/patch-artifacts-import-gemm-module.sh
 
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 82940de82..f57505fec 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -23,7 +23,7 @@ endif()
 
 if(NOT COMPILE_WASM)
   # Setting BUILD_ARCH to native invokes CPU intrinsic detection logic below.
-  # Prevent invoking that logic for WASM builds.
+  # Prevent invoking that logic for Wasm builds.
   set(BUILD_ARCH native CACHE STRING "Compile for this CPU architecture.")
 
   # Unfortunately MSVC supports a limited subset of BUILD_ARCH flags. Instead try to guess
@@ -68,10 +68,10 @@ endif(MSVC)
 include(CMakeDependentOption)
 
 # Project specific cmake options
-option(COMPILE_WASM "Compile for WASM" OFF)
-cmake_dependent_option(USE_WASM_COMPATIBLE_SOURCE "Use wasm compatible sources" OFF "NOT COMPILE_WASM" ON)
+option(COMPILE_WASM "Compile for Wasm" OFF)
+cmake_dependent_option(USE_WASM_COMPATIBLE_SOURCE "Use Wasm compatible sources" OFF "NOT COMPILE_WASM" ON)
 
-# WASM disables a million libraries, which also includes the unit test-library.
+# Wasm disables a million libraries, which also includes the unit test-library.
 cmake_dependent_option(COMPILE_UNIT_TESTS "Compile unit tests" OFF "USE_WASM_COMPATIBLE_SOURCE" ON)
 option(COMPILE_TESTS "Compile bergamot-tests" OFF)
 cmake_dependent_option(ENABLE_CACHE_STATS "Enable stats on cache" ON "COMPILE_TESTS" OFF)
@@ -85,7 +85,7 @@ SET(SSPLIT_COMPILE_LIBRARY_ONLY ON CACHE BOOL "Do not compile ssplit tests")
 if (USE_WASM_COMPATIBLE_SOURCE)
   SET(COMPILE_LIBRARY_ONLY ON CACHE BOOL "Build only the Marian library and exclude all executables.")
   SET(USE_MKL OFF CACHE BOOL "Compile with MKL support")
-  # # Setting the ssplit-cpp submodule specific cmake options for wasm
+  # # Setting the ssplit-cpp submodule specific cmake options for Wasm
   SET(SSPLIT_USE_INTERNAL_PCRE2 ON CACHE BOOL "Use internal PCRE2 instead of system PCRE2")
 endif()
 
@@ -115,9 +115,9 @@ if(COMPILE_WASM)
   # See https://github.com/emscripten-core/emscripten/blob/main/src/settings.js
   list(APPEND WASM_COMPILE_FLAGS
     -O3
-    # Preserve whitespaces in JS even for release builds; this doesn't increase wasm binary size
+    # Preserve whitespaces in JS even for release builds; this doesn't increase Wasm binary size
     $<$<CONFIG:Release>:-g1>
-    # Relevant Debug info only for release with debug builds as this increases wasm binary size
+    # Relevant Debug info only for release with debug builds as this increases Wasm binary size
     $<$<CONFIG:RelWithDebInfo>:-g2>
     -fPIC
     -mssse3
@@ -128,9 +128,9 @@ if(COMPILE_WASM)
   )
   list(APPEND WASM_LINK_FLAGS
     -O3
-    # Preserve whitespaces in JS even for release builds; this doesn't increase wasm binary size
+    # Preserve whitespaces in JS even for release builds; this doesn't increase Wasm binary size
     $<$<CONFIG:Release>:-g1>
-    # Relevant Debug info only for release with debug builds as this increases wasm binary size
+    # Relevant Debug info only for release with debug builds as this increases Wasm binary size
     $<$<CONFIG:RelWithDebInfo>:-g2>
     -lembind
     # Save some code, and some speed
@@ -154,7 +154,7 @@ if(COMPILE_WASM)
     # Export all of the intgemm functions in case we need to fall back to using the embedded intgemm
     -sEXPORTED_FUNCTIONS=[_int8PrepareAFallback,_int8PrepareBFallback,_int8PrepareBFromTransposedFallback,_int8PrepareBFromQuantizedTransposedFallback,_int8PrepareBiasFallback,_int8MultiplyAndAddBiasFallback,_int8SelectColumnsOfBFallback]
     # Necessary for mozintgemm linking. This prepares the `wasmMemory` variable ahead of time as
-    # opposed to delegating that task to the wasm binary itself. This way we can link MozIntGEMM
+    # opposed to delegating that task to the Wasm binary itself. This way we can link MozIntGEMM
     # module to the same memory as the main bergamot-translator module.
     -sIMPORTED_MEMORY=1
     # Dynamic execution is either frowned upon or blocked inside browser extensions
@@ -180,4 +180,3 @@ option(COMPILE_PYTHON "Compile python bindings. Intended to be activated with se
 if(COMPILE_PYTHON)
   add_subdirectory(bindings/python)
 endif(COMPILE_PYTHON)
-
diff --git a/build-wasm.sh b/build-wasm.sh
index b6d70efb6..43a3c84ea 100755
--- a/build-wasm.sh
+++ b/build-wasm.sh
@@ -41,7 +41,7 @@ cd ${BUILD_DIRECTORY}
 emcmake cmake -DCOMPILE_WASM=on ../
 emmake make -j2
 
-# 2. Import GEMM library from a separate wasm module
+# 2. Import GEMM library from a separate Wasm module
 bash ../wasm/patch-artifacts-import-gemm-module.sh
 
 # The artifacts (.js and .wasm files) will be available in the build directory
diff --git a/doc/marian-integration.rst b/doc/marian-integration.rst
index 756e0a810..d2895faaa 100644
--- a/doc/marian-integration.rst
+++ b/doc/marian-integration.rst
@@ -38,7 +38,7 @@ MKL/OpenBLAS.
 Building bergamot-translator
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Web Assembly (WASM) reduces building to only using a subset of
+Web Assembly (Wasm) reduces building to only using a subset of
 functionalities of marian, the translation library powering
 bergamot-translator. When developing bergamot-translator it is important
 that the sources added be compatible with marian.
Therefore, it is
diff --git a/src/tests/blocking.cpp b/src/tests/blocking.cpp
index 3bbb45634..e4279da59 100644
--- a/src/tests/blocking.cpp
+++ b/src/tests/blocking.cpp
@@ -17,7 +17,7 @@ int main(int argc, char *argv[]) {
     models.push_back(model);
   }
 
-  /// WASM is one special case where WASM path is being checked, involving translateMultiple and a multi-line feed.
+  /// Wasm is one special case where Wasm path is being checked, involving translateMultiple and a multi-line feed.
   /// Hence we do not bind it at a single input-blob single Response constraint imposed by the TestSuite.
   testSuite.run(config.opMode, models);
 
diff --git a/src/tests/wasm.cpp b/src/tests/wasm.cpp
index 97f0fc801..7bbe12f1a 100644
--- a/src/tests/wasm.cpp
+++ b/src/tests/wasm.cpp
@@ -5,7 +5,7 @@ void wasm(BlockingService &service, std::shared_ptr &model) {
   std::vector responseOptions;
   std::vector texts;
 
-  // WASM always requires HTML and alignment.
+  // Wasm always requires HTML and alignment.
   // TODO(jerinphilip): Fix this, bring in actual tests.
   // responseOptions.HTML = true;
   // responseOptions.alignment = true; // Necessary for HTML
@@ -35,14 +35,14 @@ int main(int argc, char *argv[]) {
 
   for (auto &modelConfigPath : config.modelConfigPaths) {
     TranslationModel::Config modelConfig = parseOptionsFromFilePath(modelConfigPath);
-    // Anything WASM is expected to use the byte-array-loads. So we hard-code grabbing MemoryBundle from FS and use the
+    // Anything Wasm is expected to use the byte-array-loads. So we hard-code grabbing MemoryBundle from FS and use the
     // MemoryBundle capable constructor.
     MemoryBundle memoryBundle = getMemoryBundleFromConfig(modelConfig);
     std::shared_ptr model = std::make_shared(modelConfig, std::move(memoryBundle));
     models.push_back(model);
   }
 
-  /// WASM is one special case where WASM path is being checked, involving translateMultiple and a multi-line feed.
+  /// Wasm is one special case where Wasm path is being checked, involving translateMultiple and a multi-line feed.
   /// Hence we do not bind it at a single input-blob single Response constraint imposed by the TestSuite.
   if (config.opMode == "wasm") {
     wasm(service, models.front());
diff --git a/src/translator/CMakeLists.txt b/src/translator/CMakeLists.txt
index 1d773b46b..0597d4eb9 100644
--- a/src/translator/CMakeLists.txt
+++ b/src/translator/CMakeLists.txt
@@ -20,7 +20,7 @@ add_library(bergamot-translator STATIC
     xh_scanner.cpp
 )
 if (USE_WASM_COMPATIBLE_SOURCE)
-  # Using wasm compatible sources should include this compile definition;
+  # Using Wasm compatible sources should include this compile definition;
   # Has to be done here because we are including marian headers + some sources
   # in local repository use these definitions
   target_compile_definitions(bergamot-translator PUBLIC USE_SSE2 WASM_COMPATIBLE_SOURCE)
diff --git a/wasm/CMakeLists.txt b/wasm/CMakeLists.txt
index ef8fd988a..ce3c18580 100644
--- a/wasm/CMakeLists.txt
+++ b/wasm/CMakeLists.txt
@@ -4,7 +4,7 @@ add_executable(bergamot-translator-worker
     bindings/response_bindings.cpp
 )
 
-# Generate version file that can be included in the wasm artifacts
+# Generate version file that can be included in the Wasm artifacts
 configure_file(${CMAKE_CURRENT_SOURCE_DIR}/project_version.js.in
                ${CMAKE_CURRENT_BINARY_DIR}/project_version.js @ONLY)
 
diff --git a/wasm/README.md b/wasm/README.md
index 0f3f77426..57b0b05ff 100644
--- a/wasm/README.md
+++ b/wasm/README.md
@@ -4,7 +4,7 @@ All the instructions below are meant to run from the current directory.
 
 ## Using JS APIs
 
-See [node-test.js](./node-test.js) for an annotated example of how to use the WASM module. Most of the code from it can also be used in a browser context.
+See [node-test.js](./node-test.js) for an annotated example of how to use the Wasm module. Most of the code from it can also be used in a browser context.
 
 Alternatively refer to the file `test_page/js/worker.js` that demonstrates how to use the bergamot translator in JavaScript via a `