Transfer learning / finetuning in k2? #414

jwvl · 2022-06-09T08:21:56Z


Hi,

I am coming to k2/icefall with some experience training Kaldi (chain) models, and am wondering how the two compare in terms of finetuning.
For Kaldi models, the advice from the authors is usually that finetuning doesn't work that well and that it is better to train a model from scratch with your new data mixed in, e.g.
https://groups.google.com/g/kaldi-help/c/O2s3YkCXtmo/m/4dmPSvqODwAJ
Does the same hold for k2 models such as pruned_transducer_stateless?

I am in a situation where I have a "generic" dataset of a few thousand hours and a growing domain-specific dataset of a few dozen hours. It would be great if I could train a model once on the generic dataset (which costs a lot of GPU hours) and then periodically finetune it on the latest batch of in-domain data. Would such an approach be more feasible in k2/icefall?


danpovey (Maintainer) · 2022-06-09T08:24:41Z

For our latest (RNN-T) models we are being careful not to use batchnorm, so that is one fewer problem that you would have.
Also, it should be possible with Lhotse to mix your data in with other data.
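
To make "mixing your data in with other data" concrete, here is a minimal pure-Python sketch of weighted interleaving of two data streams. It is an illustration only and not Lhotse's API: the `mux` function, the utterance IDs, and the 0.7/0.3 weights are all hypothetical. Lhotse offers a similar weighted-mixing operation on cut sets (`CutSet.mux`); check its documentation for the exact signature before relying on it.

```python
import random
from typing import Iterable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def mux(
    sources: Sequence[Iterable[T]],
    weights: Sequence[float],
    seed: int = 0,
) -> Iterator[T]:
    """Lazily interleave several streams.

    At each step a source is chosen at random in proportion to its weight.
    A source that runs out is dropped, and mixing continues until every
    source is exhausted. Order within each source is preserved.
    """
    rng = random.Random(seed)
    iters = [iter(s) for s in sources]
    active = list(range(len(iters)))  # indices of sources not yet exhausted
    while active:
        idx = rng.choices(active, weights=[weights[i] for i in active], k=1)[0]
        try:
            yield next(iters[idx])
        except StopIteration:
            active.remove(idx)


# Hypothetical stand-ins for utterance IDs from the two corpora.
generic = [f"generic-{i}" for i in range(1000)]
in_domain = [f"domain-{i}" for i in range(50)]

# Oversample the small in-domain set relative to its share of the data.
mixed = list(mux([generic, in_domain], weights=[0.7, 0.3], seed=0))
```

With these weights the in-domain items show up early and often rather than being diluted by the much larger generic set. Once the small set is exhausted, the remaining items all come from the generic set, so a real training setup would likely repeat the in-domain data or stop early instead.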
