TLG for k2 #1025
-
Hi, I would like to build a TLG for k2. I found `ctc_token_fst.py` in WeNet, which uses OpenFST, shown below. Since this is not for k2, I think I can refer to `lexicon_to_fst`, but I don't know how to deal with the last line, which prints only a single zero. Any suggestions?

```python
import sys
print('0 1 ')
with open(sys.argv[1], 'r', encoding='utf8') as fin:
```
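For reference on the lone `0`: WeNet's script emits the FST in OpenFST's text format, where an arc line is `src dst ilabel olabel` and a line containing only a state number marks that state as final — so the final `print('0')` declares state 0 final. A minimal sketch of the same idea (the token handling and state layout are illustrative, not WeNet's exact script):

```python
def ctc_token_fst(tokens):
    """Emit a CTC token FST (T) as OpenFST text-format lines.

    Sketch only. Arc lines are 'src dst ilabel olabel'; a line holding
    a single state number marks that state as final.
    """
    lines = ['0 0 <blank> <eps>']  # stay in state 0 while eating blanks
    node = 1
    for tok in tokens:
        if tok in ('<eps>', '<blank>'):
            continue
        lines.append(f'0 {node} {tok} {tok}')       # first frame emits the token
        lines.append(f'{node} {node} {tok} <eps>')  # repeated frames are merged
        lines.append(f'{node} 0 <eps> <eps>')       # return for the next token
        node += 1
    lines.append('0')  # final-state line: just the state id
    return lines

print('\n'.join(ctc_token_fst(['<blank>', 'a', 'b'])))
```

When this text is compiled (e.g. with `fstcompile`), every path through the machine collapses a CTC frame sequence into its token sequence.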
Replies: 3 comments 13 replies
-
Please refer to the …
-
Is the model fine-tuned with CTC loss?

If not, then you cannot use TLG or T to decode its output.

If yes, then congratulations: you can use either T or TLG to decode its output.

The following is an example of decoding a Wav2Vec 2.0 model fine-tuned with CTC loss from torchaudio using a T graph: k2-fsa/k2#1096 (comment)

We also have a C++ runtime in sherpa to support it. Please see https://k2-fsa.github.io/sherpa/cpp/pretrained_models/offline_ctc/torchaudio.html

By the way, if you only want to use T for decoding, you don't need #0, #1, …

I suggest that you first make it work to decode with a T graph, and then you can build a TLG graph for decoding.
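For context on the #0, #1, … symbols that a plain T graph does not need: they are disambiguation symbols, appended to lexicon pronunciations that are repeated or are a prefix of another pronunciation so that L becomes determinizable before composing TLG. A minimal sketch of that idea (the function name and `(word, tokens)` lexicon format are illustrative, not icefall's or Kaldi's exact code):

```python
from collections import defaultdict

def add_disambig_symbols(lexicon):
    """Append #1, #2, ... to pronunciations that are duplicated or are a
    prefix of another pronunciation, so that L can be determinized.

    Sketch of the standard Kaldi-style idea, not the exact implementation.
    `lexicon` is a list of (word, token_list) pairs.
    """
    counts = defaultdict(int)
    prefixes = set()
    for _, tokens in lexicon:
        counts[tuple(tokens)] += 1
        for i in range(1, len(tokens)):
            prefixes.add(tuple(tokens[:i]))

    last_used = defaultdict(int)  # per-pronunciation disambig counter
    max_disambig = 0
    out = []
    for word, tokens in lexicon:
        key = tuple(tokens)
        if counts[key] > 1 or key in prefixes:
            last_used[key] += 1
            max_disambig = max(max_disambig, last_used[key])
            tokens = tokens + [f'#{last_used[key]}']
        out.append((word, tokens))
    return out, max_disambig
```

For example, two words sharing the pronunciation `a b` would come out as `a b #1` and `a b #2`, while a unique, non-prefix pronunciation is left untouched.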
-
Wow, this is PERFECT! Thank you so much! As you suggested, I tried to replace the log-softmax probability in your example with mine, using a T graph, but I encountered the following error. I used a fine-tuned wav2vec2 model from HuggingFace, which may differ from the one from Facebook. Could you please have a look at it? My log-softmax probability is here just in case.

```
[F] /usr/share/miniconda/envs/k2/conda-bld/k2_1669428702383/work/k2/csrc/intersect_dense_pruned.cu:155:void k2::MultiGraphDenseIntersectPruned::Intersect(std::shared_ptr<k2::DenseFsaVec>&)
Check failed: c_->IsCompatible(*b_fsas->Context())

[ Stack-Trace: ]
Traceback (most recent call last):
```