A tutorial on dumping a HugeCTR model to TensorFlow.
HugeCTR dumps two kinds of binary model files: sparse model files (embeddings) and a dense model file. After training, HugeCTR will:
- save each sparse model (embedding) to <prefix><sparse-index>_sparse_<iteration>.model, for example, dcnmodel0_sparse_10000.model.
- save the dense model to <prefix>_dense_<iteration>.model, for example, dcnmodel_dense_10000.model.
- Sparse model file
  HugeCTR supports multiple embeddings in one network. Each embedding corresponds to one sparse model file.
  - Distributed embedding
    For a distributed embedding, HugeCTR stores keys and embedding features in the following order: key0, embedding_feature0, key1, embedding_feature1, key2, embedding_feature2, ...
    Each <key, embedding_feature> pair occupies sizeof(TypeHashKey) + sizeof(float) * embedding_vec_size bytes.
  - Localized embedding
    For a localized embedding, HugeCTR stores keys, slot ids and embedding features in the following order: key0, slot_id0, embedding_feature0, key1, slot_id1, embedding_feature1, key2, slot_id2, embedding_feature2, ...
    Each <key, slot_id, embedding_feature> triple occupies sizeof(TypeHashKey) + sizeof(TypeHashValueIndex) + sizeof(float) * embedding_vec_size bytes.
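The record layouts above can be read back with Python's struct module. The following is a minimal sketch, not the tutorial's dump.py: the function name read_sparse_model is hypothetical, and it assumes little-endian data, an 8-byte signed TypeHashKey (struct code "q"), and a caller-supplied struct code for TypeHashValueIndex. Adjust these codes to match the key types your model was trained with.

```python
import struct


def read_sparse_model(path, embedding_vec_size, key_fmt="q", slot_fmt=None):
    """Parse a headerless sparse model file into (key, slot_id, vector) rows.

    key_fmt  : struct code for TypeHashKey (ASSUMPTION: "q", 8-byte signed).
    slot_fmt : struct code for TypeHashValueIndex (ASSUMPTION, caller-chosen).
               None means a distributed embedding, which stores no slot ids.
    """
    # "<" selects little-endian with no padding (ASSUMPTION), so the record
    # size equals the sizeof(...) sum given in the text.
    fmt = "<" + key_fmt + (slot_fmt or "") + f"{embedding_vec_size}f"
    record = struct.Struct(fmt)

    with open(path, "rb") as f:
        data = f.read()
    if len(data) % record.size != 0:
        raise ValueError("file size is not a multiple of the record size")

    rows = []
    for fields in record.iter_unpack(data):
        if slot_fmt is None:
            key, slot_id, vec = fields[0], None, fields[1:]
        else:
            key, slot_id, vec = fields[0], fields[1], fields[2:]
        rows.append((key, slot_id, list(vec)))
    return rows
```

For example, read_sparse_model("0_sparse_20000.model", 16, slot_fmt="q") would treat the file as a localized embedding, where 16 is only a placeholder for the embedding_vec_size in your model JSON file. The size check is a cheap way to notice a wrong key type or vector size, because the file has no header to validate against.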
- Dense model file
  The dense model's weights are stored in the order in which the layers appear in the configuration file. All values are of type float.
  Weights in Layer0, Weights in Layer1, Weights in Layer2, ...
  Non-trainable parameters, such as moving-mean and moving-var in the BatchNorm layer, are saved to a JSON file.
  So far, the following layers have parameters that need to be saved. Within each layer, the parameters are stored in the order in which the variables appear below:
  - BatchNorm: gamma, beta
  - InnerProduct: weight, bias
  - MultiCross: for i in num_layers: weight, bias
  - Multiply: weight
- NOTE
  These binary model files store only the values described above, with no identifiers or headers. You can therefore parse the weights from a model file, or write initial values into a model file, as long as you follow the same order.
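Because the dense file is a plain run of float values with no header, splitting it into parameters only requires knowing each parameter's element count and the order of the layers. The sketch below illustrates that idea under stated assumptions: the function name read_dense_model and the layout argument are hypothetical, the data is assumed to be little-endian 4-byte floats, and the element counts must be derived by you from the model JSON file.

```python
import struct


def read_dense_model(path, layout):
    """Split a headerless float file into named parameters.

    layout: list of (name, element_count) pairs in file order, i.e. layer by
            layer, and within a layer in the order listed above
            (for example weight, then bias, for InnerProduct).
    """
    with open(path, "rb") as f:
        data = f.read()

    # Each value is a 4-byte float; a size mismatch means the layout is wrong.
    expected = 4 * sum(count for _, count in layout)
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, found {len(data)}")

    params, offset = {}, 0
    for name, count in layout:
        # "<" = little-endian (ASSUMPTION about the machine that wrote the file).
        params[name] = list(struct.unpack_from(f"<{count}f", data, offset))
        offset += 4 * count
    return params
```

As a hypothetical usage, an InnerProduct layer with 3 inputs and 2 outputs would be described as [("fc1/weight", 6), ("fc1/bias", 2)]. The values come back flat; reshaping the weight into a matrix depends on the layer's storage layout, which this sketch does not assume.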
Dumping a HugeCTR model to TensorFlow takes the following steps:
- Train with HugeCTR to get model files.
- According to the model configuration JSON file, manually build the same computation graph using TensorFlow.
- Load weights from the model files to initialize the corresponding layers in TensorFlow.
- Save the result as a TensorFlow checkpoint, fine-tune the network, or use it however you like.
The Criteo dataset and the DCN model are used as an example to demonstrate these steps.
- Python >= 3.6
- TensorFlow 1.x or TensorFlow 2.x
- numpy
- struct (Python standard library)
- json (Python standard library)
Suppose you have trained with samples/dcn/dcn.json and have the data file (sparse_embedding0.data) and the model files (_dense_20000.model and 0_sparse_20000.model) in the corresponding folders.
If you do not, modify your dcn.json and the commands below accordingly.
You can use these commands to run this demo:
$ cd hugectr/tutorial/dump_to_tf
$ python3 main.py \
../../samples/dcn/criteo/sparse_embedding0.data \
../../samples/dcn/_dense_20000.model \
../../samples/dcn/0_sparse_20000.model
Expected output (the number may differ at runtime):
[INFO] output = -0.008215
[INFO] save done.
Usage
python3 main.py dataset dense_model sparse_model0 sparse_model1 ...
Arguments:
- dataset: data file used in HugeCTR training
- dense_model: HugeCTR's dense model file
- sparse_model: HugeCTR's sparse model file(s). Specify the sparse model(s) in the order of the embedding(s) in the model JSON file.
- Train with HugeCTR to get model files.
  Follow the instructions to get the binary model files.
- According to the model configuration JSON file, manually build the same computation graph using TensorFlow.
  As shown in main.py, use TensorFlow to build each layer according to the model JSON file. TensorFlow layers equivalent to those used in HugeCTR can be found in hugectr_layers.py.
  For simplicity, the input keys are used directly as the row index of the embedding table to look up embedding features. Therefore, the input keys have shape [batchsize, slot_num, max_nnz_per_slot].
  For example, input_keys with shape [batchsize = 2, slot_num = 3, max_nnz_per_slot = 4]:
  [[[5 17 24 26] [3 0 -1 -1] [1 18 29 -1]]
   [[4 16 23 -1] [2 0 3 -1] [1 -1 -1 -1]]]
  -1 represents an invalid key, and its embedding feature is 0.
- Load weights from the model files to initialize the corresponding layers in TensorFlow.
  dump.py parses parameters from the binary model files. Each parameter is parsed in the order described above. The parsed values can then be used to initialize the parameters defined in the TensorFlow layers.
- Save it as a TensorFlow checkpoint.
  After completing the above steps, the computation graph is saved as a TensorFlow checkpoint. You can then convert this checkpoint to other formats you need, such as .pb or .onnx.
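The lookup convention described above, where input keys index the embedding table directly and -1 marks an invalid key, can be illustrated without TensorFlow. This is a plain-Python sketch, not the code in main.py or hugectr_layers.py: the lookup function, the 30-row table, and its values are all hypothetical, and only the input_keys array is taken from the example above.

```python
def lookup(table, input_keys):
    """Look up embedding features by using each key as a row index.

    Returns a nested list with shape
    [batchsize, slot_num, max_nnz_per_slot, embedding_vec_size].
    A key of -1 is invalid and maps to an all-zero feature vector.
    """
    zero = [0.0] * len(table[0])
    return [
        [[list(table[k]) if k >= 0 else list(zero) for k in slot]
         for slot in sample]
        for sample in input_keys
    ]


# Hypothetical embedding table: 30 rows, embedding_vec_size = 2.
table = [[float(r), float(r) + 0.5] for r in range(30)]

# input_keys from the example: batchsize = 2, slot_num = 3, max_nnz_per_slot = 4.
input_keys = [
    [[5, 17, 24, 26], [3, 0, -1, -1], [1, 18, 29, -1]],
    [[4, 16, 23, -1], [2, 0, 3, -1], [1, -1, -1, -1]],
]

features = lookup(table, input_keys)
print(features[0][1])  # slot 1 of sample 0: rows 3 and 0, then two zero vectors
```

The sketch stops at the lookup itself. How the per-slot features are then combined is determined by the embedding settings in the model JSON file and is not shown here.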