forked from pytorch/audio
Migrate CTC decoder code (pytorch#2580)
Summary: This commit removes our copy of the CTC decoder code and replaces it with the upstream Flashlight-Text repository.

Pull Request resolved: pytorch#2580
Reviewed By: carolineechen
Differential Revision: D38244906
Pulled By: mthrok
fbshipit-source-id: d274240fc67675552d19ff35e9a363b9b9048721
1 parent 919fd0c, commit 39b6343
Showing 33 changed files with 125 additions and 3,224 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# Custom CMakeLists for building flashlight-text decoder | ||
# | ||
# The main differences from the upstream native CMakeLists of flashlight-text: | ||
# | ||
# 1. Build compression libraries statically and make KenLM self-contained | ||
# 2. Build KenLM without Boost by compiling only what is used by flashlight-text | ||
# 3. Build KenLM and flashlight-text in one go (not required, but nice-to-have feature) | ||
# 4. Tweak the location of bindings so that it's easier for the TorchAudio build process to pick them up. | ||
# (The upstream CMakeLists.txt does not install them in the same location as libflashlight-text.) | ||
# 5. Tweak the name of bindings. (Remove suffixes such as cpython-37m-darwin.) | ||
 | ||
set(CMAKE_CXX_VISIBILITY_PRESET default) | ||
 | ||
set( | ||
libflashlight_src | ||
submodule/flashlight/lib/text/decoder/Utils.cpp | ||
submodule/flashlight/lib/text/decoder/lm/KenLM.cpp | ||
submodule/flashlight/lib/text/decoder/lm/ZeroLM.cpp | ||
submodule/flashlight/lib/text/decoder/lm/ConvLM.cpp | ||
submodule/flashlight/lib/text/decoder/LexiconDecoder.cpp | ||
submodule/flashlight/lib/text/decoder/LexiconFreeDecoder.cpp | ||
submodule/flashlight/lib/text/decoder/LexiconFreeSeq2SeqDecoder.cpp | ||
submodule/flashlight/lib/text/decoder/LexiconSeq2SeqDecoder.cpp | ||
submodule/flashlight/lib/text/decoder/Trie.cpp | ||
submodule/flashlight/lib/text/String.cpp | ||
submodule/flashlight/lib/text/dictionary/Utils.cpp | ||
submodule/flashlight/lib/text/dictionary/Dictionary.cpp | ||
) | ||
 | ||
torchaudio_library( | ||
libflashlight-text | ||
"${libflashlight_src}" | ||
submodule | ||
"" | ||
FL_TEXT_USE_KENLM | ||
) | ||
 | ||
# TODO: update torchaudio_library to handle private links | ||
target_link_libraries( | ||
libflashlight-text | ||
PRIVATE | ||
kenlm) | ||
 | ||
if (BUILD_TORCHAUDIO_PYTHON_EXTENSION) | ||
torchaudio_extension( | ||
flashlight_lib_text_dictionary | ||
submodule/bindings/python/flashlight/lib/text/_dictionary.cpp | ||
submodule | ||
libflashlight-text | ||
"" | ||
) | ||
torchaudio_extension( | ||
flashlight_lib_text_decoder | ||
submodule/bindings/python/flashlight/lib/text/_decoder.cpp | ||
submodule | ||
libflashlight-text | ||
FL_TEXT_USE_KENLM | ||
) | ||
endif() |
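
The `torchaudio_library(...)` call in the file above passes five positional arguments: a target name, a source list, an include directory, an empty link-library slot, and a compile definition. The helper itself is not part of this diff, so the following is only a minimal sketch of what a function with that argument order might look like. The name `sketch_library` and its body are assumptions for illustration, not TorchAudio's actual definition.

```cmake
# Hypothetical helper mirroring the argument order used in the file above:
#   <name> <sources> <include_dirs> <link_libs> <compile_defs>
# This is an illustrative assumption, not the real torchaudio_library.
function(sketch_library name sources include_dirs link_libs compile_defs)
  # The caller passes the source list quoted ("${libflashlight_src}"),
  # so it arrives as a single ;-separated argument and is expanded here.
  add_library(${name} SHARED ${sources})

  # Drop the default "lib" prefix, since names such as
  # libflashlight-text already carry it.
  set_target_properties(${name} PROPERTIES PREFIX "")

  # Empty-string arguments (the "" slots in the calls above) are skipped.
  if(NOT "${include_dirs}" STREQUAL "")
    # Relative paths are resolved against the current source directory.
    target_include_directories(${name} PRIVATE ${include_dirs})
  endif()
  if(NOT "${link_libs}" STREQUAL "")
    target_link_libraries(${name} PUBLIC ${link_libs})
  endif()
  if(NOT "${compile_defs}" STREQUAL "")
    target_compile_definitions(${name} PRIVATE ${compile_defs})
  endif()
endfunction()
```

Under this sketch, a helper that only links its dependencies publicly cannot express a private dependency, which would explain why the file above adds `kenlm` through a separate `target_link_libraries(... PRIVATE kenlm)` call and leaves a TODO about handling private links.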
Submodule kenlm updated from 000000 to 5cea45