>>> jackhuang
[May 6, 2018, 12:50pm]
This code is from 'deepspeech.cc' (DeepSpeech 0.1.1).
I want to set 'top_paths' to 5, since I want to get 5 candidate
results. However, 'decoder_outputs' does not end up with 5 results, so it
does not work. How can I do this?
char*
Model::decode(int aNFrames, float*** aLogits)
{
  const int batch_size = BATCH_SIZE;
  const int top_paths = 1; // I change this to 5
  const int timesteps = aNFrames;
  const size_t num_classes = mPriv->alphabet->GetSize() + 1; // +1 for blank

  // Raw data containers (arrays of floats, ints, etc.).
  int sequence_lengths[batch_size] = {timesteps};

  // Convert data containers to the format accepted by the decoder, simply
  // mapping the memory from the container to an Eigen::ArrayXi,::MatrixXf,
  // using Eigen::Map.
  Eigen::Map<const Eigen::ArrayXi> seq_len(&sequence_lengths[0], batch_size);
  std::vector<Eigen::Map<const Eigen::MatrixXf>> inputs;
  inputs.reserve(timesteps);
  for (int t = 0; t < timesteps; ++t) {
    inputs.emplace_back(&aLogits[t][0][0], batch_size, num_classes);
  }

  // Prepare containers for output and scores.
  // CTCDecoder::Output is std::vector<std::vector<int>>
  std::vector<CTCDecoder::Output> decoder_outputs(top_paths);
  for (CTCDecoder::Output& output : decoder_outputs) {
    output.resize(batch_size);
  }
  float score[batch_size][top_paths] = {{0.0}};
  Eigen::Map<Eigen::MatrixXf> scores(&score[0][0], batch_size, top_paths);

  if (mPriv->scorer == NULL) {
    CTCBeamSearchDecoder<>::DefaultBeamScorer scorer;
    CTCBeamSearchDecoder<> decoder(num_classes,
                                   mPriv->beam_width,
                                   &scorer,
                                   batch_size);
    decoder.Decode(seq_len, inputs, &decoder_outputs, &scores).ok();
  } else {
    CTCBeamSearchDecoder<KenLMBeamState> decoder(num_classes,
                                                 mPriv->beam_width,
                                                 mPriv->scorer,
                                                 batch_size);
    decoder.Decode(seq_len, inputs, &decoder_outputs, &scores).ok();
  }

  // Output is an array of shape (1, n_results, result_length).
  // In this case, n_results is also equal to 1.
  size_t output_length = decoder_outputs[0][0].size() + 1;
  size_t decoded_length = 1; // add 1 for the \0
  for (int i = 0; i < output_length - 1; i++) {
    int64 character = decoder_outputs[0][0][i];
    const std::string& str = mPriv->alphabet->StringFromLabel(character);
    decoded_length += str.size();
  }

  char* output = (char*)malloc(sizeof(char) * decoded_length);
  char* pen = output;
  for (int i = 0; i < output_length - 1; i++) {
    int64 character = decoder_outputs[0][0][i];
    const std::string& str = mPriv->alphabet->StringFromLabel(character);
    strncpy(pen, str.c_str(), str.size());
    pen += str.size();
  }
  *pen = '\0';

  for (int i = 0; i < timesteps; ++i) {
    for (int j = 0; j < batch_size; ++j) {
      free(aLogits[i][j]);
    }
    free(aLogits[i]);
  }
  free(aLogits);

  return output;
}
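
For illustration, here is a minimal sketch of what reading every path (rather than only decoder_outputs[0][0]) could look like. It assumes the same [path][batch][label] layout as 'decoder_outputs' above. CollectCandidates and label_to_string are hypothetical names, not part of DeepSpeech; label_to_string stands in for mPriv->alphabet->StringFromLabel. It uses plain standard-library containers so it can be compiled without TensorFlow.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Same layout as CTCDecoder::Output in the post: [batch][label index].
using DecoderOutput = std::vector<std::vector<int>>;

// Hypothetical helper (not part of DeepSpeech): builds one string per
// returned path for a single batch entry. label_to_string is an assumed
// stand-in for mPriv->alphabet->StringFromLabel.
std::vector<std::string> CollectCandidates(
    const std::vector<DecoderOutput>& decoder_outputs,
    std::size_t batch_index,
    const std::function<std::string(int)>& label_to_string) {
  std::vector<std::string> candidates;
  candidates.reserve(decoder_outputs.size());
  // The outer vector has one entry per requested path (top_paths entries).
  for (const DecoderOutput& path : decoder_outputs) {
    std::string text;
    if (batch_index < path.size()) {
      for (int label : path[batch_index]) {
        text += label_to_string(label);
      }
    }
    candidates.push_back(text);
  }
  return candidates;
}
```

In the function above, only decoder_outputs[0][0] is ever converted to text, so the other paths are never returned even when 'top_paths' is 5. A loop like this one visits all of them, although the char* return type of Model::decode would then also have to change to carry more than one string.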
[This is an archived TTS discussion thread from discourse.mozilla.org/t/how-to-modify-this-code-to-output-multi-candidate-results-by-beam-search]