forked from CGCL-codes/naturalcc
python.log
nohup: ignoring input
[2021-03-13 12:57:53] INFO >> Load arguments in /home/yanghe/Documents/naturalcc-dev/run/summarization/tree2seq/config/python_wan/python.yml (train.py:291, cli_main())
[2021-03-13 12:57:53] INFO >> {'criterion': 'cross_entropy', 'optimizer': 'torch_adam', 'lr_scheduler': 'fixed', 'tokenizer': None, 'bpe': None, 'common': {'no_progress_bar': 0, 'log_interval': 500, 'log_format': 'simple', 'tensorboard_logdir': '', 'memory_efficient_fp16': 1, 'fp16_no_flatten_grads': 1, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'empty_cache_freq': 0, 'task': 'graph_summarization', 'seed': 1, 'cpu': 0, 'fp16': 0, 'fp16_opt_level': '01', 'server_ip': '', 'server_port': ''}, 'dataset': {'num_workers': 0, 'skip_invalid_size_inputs_valid_test': 1, 'max_tokens': None, 'max_sentences': 64, 'required_batch_size_multiple': 8, 'dataset_impl': 'mmap', 'train_subset': 'train', 'valid_subset': 'valid', 'validate_interval': 1, 'fixed_validation_seed': None, 'disable_validation': 0, 'max_tokens_valid': None, 'max_sentences_valid': 512, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0}, 'distributed_training': {'distributed_world_size': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': 0, 'ddp_backend': 'c10d', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': None, 'find_unused_parameters': 0, 'fast_stat_sync': 0, 'broadcast_buffers': 0, 'global_sync_iter': 50, 'warmup_iterations': 500, 'local_rank': -1, 'block_momentum': 0.875, 'block_lr': 1, 'use_nbm': 0, 'average_sync': 0}, 'task': {'data': '/data/yanghe/.ncc/python_wan/summarization/data-mmap', 'source_lang': 'bin_ast', 'target_lang': 'docstring_tokens', 'load_alignments': 0, 'left_pad_source': 0, 'left_pad_target': 0, 'max_source_positions': 9999, 'max_target_positions': 30, 'upsample_primary': 1, 'truncate_source': 1, 'truncate_target': 1, 'append_eos_to_target': 1, 'eval_bleu': 1, 'eval_bleu_detok': 'space', 'eval_bleu_detok_args': None, 'eval_tokenized_bleu': 0, 'eval_bleu_remove_bpe': None, 
'eval_bleu_args': None, 'eval_bleu_print_samples': 0}, 'model': {'arch': 'nary_tree2seq', 'encoder_embed_dim': 512, 'encoder_embed': None, 'encoder_freeze_embed': 0, 'encoder_hidden_size': 512, 'encoder_layers': 1, 'encoder_bidirectional': 0, 'decoder_embed_dim': 512, 'decoder_embed': None, 'decoder_freeze_embed': None, 'decoder_hidden_size': 512, 'decoder_layers': 1, 'decoder_out_embed_dim': 512, 'decoder_attention': 1, 'adaptive_softmax_cutoff': None, 'share_decoder_input_output_embed': 0, 'share_all_embeddings': 0, 'encoder_dropout_in': 0.1, 'encoder_dropout_out': 0.2, 'decoder_dropout_in': 0.1, 'decoder_dropout_out': 0.2, 'max_source_positions': 9999, 'max_target_positions': 30}, 'optimization': {'max_epoch': 200, 'max_update': 0, 'clip_norm': 25, 'update_freq': [1], 'lrs': [0.0004], 'min_lr': -1, 'use_bmuf': 0, 'force_anneal': 0, 'warmup_updates': 0, 'lr_shrink': 0.98, 'sentence_avg': 1, 'adam': {'adam_betas': '(0.9, 0.999)', 'adam_eps': 1e-08, 'weight_decay': 0.0}, 'weight_decay': 0.0, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 5, 'max_steps': -1, 'warmup_steps': 0, 'gradient_accumulation_steps': 1}, 'checkpoint': {'restore_file': 'checkpoint_last.pt', 'reset_dataloader': None, 'reset_lr_scheduler': None, 'reset_meters': None, 'reset_optimizer': None, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': 0, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': 0, 'no_epoch_checkpoints': 1, 'no_last_checkpoints': 0, 'no_save_optimizer_state': None, 'best_checkpoint_metric': 'bleu', 'maximize_best_checkpoint_metric': 1, 'patience': 10, 'save_dir': '/data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints', 'should_continue': 0, 'model_name_or_path': None, 'cache_dir': None, 'logging_steps': 500, 'save_steps': 2000, 'save_total_limit': 2, 'overwrite_output_dir': 0, 'overwrite_cache': 0}, 'eval': {'path': 
'/data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt', 'result_path': None, 'remove_bpe': None, 'quiet': 0, 'model_overrides': '{}', 'max_sentences': 2048, 'beam': 1, 'nbest': 1, 'max_len_a': 0, 'max_len_b': 30, 'min_len': 1, 'match_source_len': 0, 'no_early_stop': 0, 'unnormalized': 0, 'no_beamable_mm': 0, 'lenpen': 1, 'unkpen': 0, 'replace_unk': None, 'sacrebleu': 0, 'score_reference': 0, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': 0, 'sampling_topk': -1, 'sampling_topp': -1, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': 0, 'print_step': 0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': 0, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': 0, 'retain_iter_history': 0, 'decoding_format': None, 'nltk_bleu': 1, 'rouge': 1}} (train.py:293, cli_main())
[2021-03-13 12:57:53] INFO >> single GPU training... (train.py:322, cli_main())
[2021-03-13 12:57:53] INFO >> [bin_ast] dictionary: 30976 types (graph_summarization.py:169, setup_task())
[2021-03-13 12:57:53] INFO >> [docstring_tokens] dictionary: 30000 types (graph_summarization.py:170, setup_task())
[2021-03-13 12:57:55] INFO >> truncate valid.docstring_tokens to 30 (graph_summarization.py:96, load_langpair_dataset())
[2021-03-13 12:57:55] INFO >> loaded 18505 examples from: /data/yanghe/.ncc/python_wan/summarization/data-mmap/valid.bin_ast (graph_summarization.py:117, load_langpair_dataset())
[2021-03-13 12:57:55] INFO >> loaded 18505 examples from: /data/yanghe/.ncc/python_wan/summarization/data-mmap/valid.docstring_tokens (graph_summarization.py:118, load_langpair_dataset())
[2021-03-13 12:57:56] INFO >> NaryTree2SeqModel(
  (encoder): NaryTreeLSTMEncoder(
    (embed_tokens): Embedding(30976, 512, padding_idx=0)
    (lstm): NaryTreeLSTMCell(
      (W_iou): Linear(in_features=512, out_features=1536, bias=False)
      (U_iou): Linear(in_features=1024, out_features=1536, bias=False)
      (U_f): Linear(in_features=1024, out_features=1024, bias=True)
    )
  )
  (decoder): LSTMDecoder(
    (embed_tokens): Embedding(30000, 512, padding_idx=0)
    (layers): ModuleList(
      (0): LSTMCell(1024, 512)
    )
    (attention): AttentionLayer(
      (input_proj): Linear(in_features=512, out_features=512, bias=False)
      (output_proj): Linear(in_features=1024, out_features=512, bias=False)
    )
    (fc_out): Linear(in_features=512, out_features=30000, bias=True)
  )
) (train.py:211, single_main())
[2021-03-13 12:57:56] INFO >> model nary_tree2seq, criterion CrossEntropyCriterion (train.py:212, single_main())
[2021-03-13 12:57:56] INFO >> num. model params: 53956400 (num. trained: 53956400) (train.py:215, single_main())
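The reported parameter count can be cross-checked against the layer shapes in the module printout. The sketch below sums weights and biases per printed layer in plain Python. The printed layers account for 53,954,864 parameters; the remaining 1,536 (= 3 x 512) are assumed here to be a standalone `iou` bias vector inside the tree-LSTM cell, which a module printout would not list. That attribution is inferred from the arithmetic only and is not confirmed by the log. The helper names are hypothetical.

```python
def embedding(num_embeddings, dim):
    """Parameters of an embedding table."""
    return num_embeddings * dim


def linear(in_features, out_features, bias):
    """Parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def lstm_cell(input_size, hidden_size):
    """Parameters of torch.nn.LSTMCell: weight_ih, weight_hh, bias_ih, bias_hh (4 gates each)."""
    return 4 * hidden_size * (input_size + hidden_size) + 2 * 4 * hidden_size


# Shapes copied from the module printout in the log.
printed_layers = {
    "encoder.embed_tokens": embedding(30976, 512),
    "encoder.lstm.W_iou": linear(512, 1536, bias=False),
    "encoder.lstm.U_iou": linear(1024, 1536, bias=False),
    "encoder.lstm.U_f": linear(1024, 1024, bias=True),
    "decoder.embed_tokens": embedding(30000, 512),
    "decoder.layers.0": lstm_cell(1024, 512),
    "decoder.attention.input_proj": linear(512, 512, bias=False),
    "decoder.attention.output_proj": linear(1024, 512, bias=False),
    "decoder.fc_out": linear(512, 30000, bias=True),
}

printed_total = sum(printed_layers.values())
LOGGED_TOTAL = 53956400          # "num. model params" from the log
ASSUMED_IOU_BIAS = 3 * 512       # assumption: unlisted bias parameter in the tree-LSTM cell

print("printed layers:", printed_total)
print("unaccounted   :", LOGGED_TOTAL - printed_total)
```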
[2021-03-13 12:57:57] INFO >> training on 1 GPUs (train.py:220, single_main())
[2021-03-13 12:57:57] INFO >> max tokens per GPU = None and max sentences per GPU = 64 (train.py:223, single_main())
[2021-03-13 12:57:57] INFO >> no existing checkpoint found /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (ncc_trainer.py:269, load_checkpoint())
[2021-03-13 12:57:57] INFO >> loading train data for epoch 1 (ncc_trainer.py:283, get_train_iterator())
[2021-03-13 12:58:04] INFO >> truncate train.docstring_tokens to 30 (graph_summarization.py:96, load_langpair_dataset())
[2021-03-13 12:58:04] INFO >> loaded 55538 examples from: /data/yanghe/.ncc/python_wan/summarization/data-mmap/train.bin_ast (graph_summarization.py:117, load_langpair_dataset())
[2021-03-13 12:58:04] INFO >> loaded 55538 examples from: /data/yanghe/.ncc/python_wan/summarization/data-mmap/train.docstring_tokens (graph_summarization.py:118, load_langpair_dataset())
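The dataset sizes and the batch settings in the config (`max_sentences: 64`, `max_sentences_valid: 512`) are consistent with the `/ 868` update counter and the `bsz 500.1` reported on validation lines. A minimal check, assuming batches are simply filled up to the sentence limit (the real batch sampler may group by length, so this is only an arithmetic sanity check):

```python
import math

TRAIN_EXAMPLES = 55538      # "loaded 55538 examples" (train)
VALID_EXAMPLES = 18505      # "loaded 18505 examples" (valid)
MAX_SENTENCES = 64          # dataset.max_sentences
MAX_SENTENCES_VALID = 512   # dataset.max_sentences_valid

# Number of optimizer updates per training epoch.
updates_per_epoch = math.ceil(TRAIN_EXAMPLES / MAX_SENTENCES)

# Number of validation batches and their average size.
valid_batches = math.ceil(VALID_EXAMPLES / MAX_SENTENCES_VALID)
avg_valid_bsz = VALID_EXAMPLES / valid_batches

print(updates_per_epoch, valid_batches, round(avg_valid_bsz, 1))
```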
[2021-03-13 12:58:04] INFO >> NOTE: your device may support faster training with fp16 (ncc_trainer.py:154, _setup_optimizer())
/home/yanghe/Documents/naturalcc-dev/ncc/utils/utils.py:575: UserWarning: amp_C fused kernels unavailable, disabling multi_tensor_l2norm; you may get better performance by installing NVIDIA's apex library
"amp_C fused kernels unavailable, disabling multi_tensor_l2norm; "
[2021-03-13 13:01:03] INFO >> epoch 001: 500 / 868 loss=81.867, nll_loss=7.873, bleu=0, ppl=234.38, wps=1858.6, ups=2.79, wpb=665.2, bsz=64, num_updates=500, lr=0.0004, gnorm=8.574, clip=0, train_wall=172, wall=186 (progress_bar.py:262, log())
[2021-03-13 13:03:11] INFO >> epoch 001 | loss 78.163 | nll_loss 7.503 | bleu 0 | ppl 181.33 | wps 1887.4 | ups 2.83 | wpb 666.6 | bsz 64 | num_updates 868 | lr 0.0004 | gnorm 8.465 | clip 0 | train_wall 294 | wall 314 (progress_bar.py:269, print())
[2021-03-13 13:05:06] INFO >> epoch 001 | valid on 'valid' subset | loss 69.96 | nll_loss 6.703 | bleu 6.8171 | ppl 104.17 | wps 1661 | wpb 5220.2 | bsz 500.1 | num_updates 868 (progress_bar.py:269, print())
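The metrics on these lines appear to be related in two ways: `ppl` matches `2 ** nll_loss` (a base-2 perplexity), and `loss` matches `nll_loss` times the average number of target tokens per sentence (`wpb / bsz`), which is what one would expect with `sentence_avg: 1`. Both relations are read off the numbers rather than taken from the criterion's source, and the logged values are rounded, so the sketch below only checks them within a small tolerance.

```python
import math

# (loss, nll_loss, ppl, wpb, bsz) copied from the three epoch-001 lines above.
rows = {
    "train @500 updates": (81.867, 7.873, 234.38, 665.2, 64),
    "train epoch 001": (78.163, 7.503, 181.33, 666.6, 64),
    "valid epoch 001": (69.96, 6.703, 104.17, 5220.2, 500.1),
}

checks = {}
for name, (loss, nll_loss, ppl, wpb, bsz) in rows.items():
    ppl_from_nll = 2 ** nll_loss              # base-2 perplexity
    loss_from_nll = nll_loss * wpb / bsz      # per-sentence loss from per-token loss
    checks[name] = (
        math.isclose(ppl_from_nll, ppl, rel_tol=1e-3),
        math.isclose(loss_from_nll, loss, rel_tol=2e-3),
    )
    print(f"{name}: ppl {ppl_from_nll:.2f} vs {ppl}, loss {loss_from_nll:.3f} vs {loss}")
```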
[2021-03-13 13:05:10] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 1 @ 868 updates, score 6.817095387554636) (writing took 3.619679 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:05:57] INFO >> epoch 002: 132 / 868 loss=71.458, nll_loss=6.853, bleu=0, ppl=115.56, wps=1134.6, ups=1.7, wpb=667.4, bsz=64, num_updates=1000, lr=0.0004, gnorm=8.453, clip=0, train_wall=168, wall=480 (progress_bar.py:262, log())
[2021-03-13 13:08:55] INFO >> epoch 002: 632 / 868 loss=65.091, nll_loss=6.241, bleu=0, ppl=75.65, wps=1878.6, ups=2.81, wpb=667.5, bsz=64, num_updates=1500, lr=0.0004, gnorm=9.704, clip=0, train_wall=170, wall=658 (progress_bar.py:262, log())
[2021-03-13 13:10:20] INFO >> epoch 002 | loss 64.769 | nll_loss 6.217 | bleu 0 | ppl 74.38 | wps 1348.7 | ups 2.02 | wpb 666.6 | bsz 64 | num_updates 1736 | lr 0.0004 | gnorm 9.684 | clip 0 | train_wall 297 | wall 743 (progress_bar.py:269, print())
[2021-03-13 13:12:13] INFO >> epoch 002 | valid on 'valid' subset | loss 64.079 | nll_loss 6.139 | bleu 7.37349 | ppl 70.49 | wps 1692.9 | wpb 5220.2 | bsz 500.1 | num_updates 1736 | best_bleu 7.37349 (progress_bar.py:269, print())
[2021-03-13 13:12:17] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 2 @ 1736 updates, score 7.37349451443138) (writing took 3.800371 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:13:52] INFO >> epoch 003: 264 / 868 loss=59.746, nll_loss=5.743, bleu=0, ppl=53.54, wps=1119, ups=1.68, wpb=665.6, bsz=64, num_updates=2000, lr=0.000392, gnorm=10.501, clip=0, train_wall=173, wall=955 (progress_bar.py:262, log())
[2021-03-13 13:16:50] INFO >> epoch 003: 764 / 868 loss=56.505, nll_loss=5.415, bleu=0, ppl=42.65, wps=1877.2, ups=2.81, wpb=667.6, bsz=64, num_updates=2500, lr=0.000392, gnorm=11.715, clip=0, train_wall=170, wall=1133 (progress_bar.py:262, log())
[2021-03-13 13:17:27] INFO >> epoch 003 | loss 56.553 | nll_loss 5.428 | bleu 0 | ppl 43.06 | wps 1354.9 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 2604 | lr 0.000392 | gnorm 11.48 | clip 0 | train_wall 297 | wall 1170 (progress_bar.py:269, print())
[2021-03-13 13:19:20] INFO >> epoch 003 | valid on 'valid' subset | loss 61.136 | nll_loss 5.857 | bleu 7.90923 | ppl 57.98 | wps 1694.6 | wpb 5220.2 | bsz 500.1 | num_updates 2604 | best_bleu 7.90923 (progress_bar.py:269, print())
[2021-03-13 13:19:23] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 3 @ 2604 updates, score 7.909232649067772) (writing took 3.597923 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:21:47] INFO >> epoch 004: 396 / 868 loss=51.111, nll_loss=4.905, bleu=0, ppl=29.96, wps=1122.5, ups=1.68, wpb=666.9, bsz=64, num_updates=3000, lr=0.000376, gnorm=12.656, clip=0, train_wall=173, wall=1430 (progress_bar.py:262, log())
[2021-03-13 13:24:34] INFO >> epoch 004 | loss 49.751 | nll_loss 4.775 | bleu 0 | ppl 27.39 | wps 1353.3 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 3472 | lr 0.000376 | gnorm 13.264 | clip 0 | train_wall 298 | wall 1597 (progress_bar.py:269, print())
[2021-03-13 13:26:28] INFO >> epoch 004 | valid on 'valid' subset | loss 59.353 | nll_loss 5.687 | bleu 8.36698 | ppl 51.5 | wps 1685.8 | wpb 5220.2 | bsz 500.1 | num_updates 3472 | best_bleu 8.36698 (progress_bar.py:269, print())
[2021-03-13 13:26:32] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 4 @ 3472 updates, score 8.366982188238381) (writing took 3.637929 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:26:41] INFO >> epoch 005: 28 / 868 loss=49.357, nll_loss=4.736, bleu=0, ppl=26.66, wps=1135.2, ups=1.7, wpb=666.6, bsz=64, num_updates=3500, lr=0.000354, gnorm=13.601, clip=0, train_wall=169, wall=1724 (progress_bar.py:262, log())
[2021-03-13 13:29:43] INFO >> epoch 005: 528 / 868 loss=43.803, nll_loss=4.206, bleu=0, ppl=18.46, wps=1826.4, ups=2.74, wpb=666.2, bsz=64, num_updates=4000, lr=0.000354, gnorm=14.659, clip=0, train_wall=175, wall=1906 (progress_bar.py:262, log())
[2021-03-13 13:31:42] INFO >> epoch 005 | loss 43.918 | nll_loss 4.216 | bleu 0 | ppl 18.58 | wps 1352.2 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 4340 | lr 0.000354 | gnorm 14.857 | clip 0 | train_wall 298 | wall 2025 (progress_bar.py:269, print())
[2021-03-13 13:33:35] INFO >> epoch 005 | valid on 'valid' subset | loss 58.734 | nll_loss 5.627 | bleu 8.41821 | ppl 49.43 | wps 1692.9 | wpb 5220.2 | bsz 500.1 | num_updates 4340 | best_bleu 8.41821 (progress_bar.py:269, print())
[2021-03-13 13:33:39] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 5 @ 4340 updates, score 8.418206455933563) (writing took 3.769744 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:34:34] INFO >> epoch 006: 160 / 868 loss=42.179, nll_loss=4.051, bleu=0, ppl=16.58, wps=1145, ups=1.72, wpb=666, bsz=64, num_updates=4500, lr=0.000327, gnorm=15.275, clip=0, train_wall=167, wall=2197 (progress_bar.py:262, log())
[2021-03-13 13:37:36] INFO >> epoch 006: 660 / 868 loss=38.927, nll_loss=3.737, bleu=0, ppl=13.34, wps=1830.9, ups=2.75, wpb=666.6, bsz=64, num_updates=5000, lr=0.000327, gnorm=16.313, clip=0, train_wall=174, wall=2379 (progress_bar.py:262, log())
[2021-03-13 13:38:50] INFO >> epoch 006 | loss 38.889 | nll_loss 3.733 | bleu 0 | ppl 13.29 | wps 1354.6 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 5208 | lr 0.000327 | gnorm 16.242 | clip 0 | train_wall 297 | wall 2452 (progress_bar.py:269, print())
[2021-03-13 13:40:43] INFO >> epoch 006 | valid on 'valid' subset | loss 58.26 | nll_loss 5.582 | bleu 8.73361 | ppl 47.89 | wps 1689 | wpb 5220.2 | bsz 500.1 | num_updates 5208 | best_bleu 8.73361 (progress_bar.py:269, print())
[2021-03-13 13:40:47] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 6 @ 5208 updates, score 8.733612705066914) (writing took 3.624264 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:42:31] INFO >> epoch 007: 292 / 868 loss=36.147, nll_loss=3.47, bleu=0, ppl=11.08, wps=1130.7, ups=1.7, wpb=666.7, bsz=64, num_updates=5500, lr=0.000295, gnorm=16.643, clip=0, train_wall=170, wall=2674 (progress_bar.py:262, log())
[2021-03-13 13:45:30] INFO >> epoch 007: 792 / 868 loss=34.929, nll_loss=3.349, bleu=0, ppl=10.19, wps=1865.5, ups=2.8, wpb=667.3, bsz=64, num_updates=6000, lr=0.000295, gnorm=17.721, clip=0.2, train_wall=171, wall=2853 (progress_bar.py:262, log())
[2021-03-13 13:45:57] INFO >> epoch 007 | loss 34.551 | nll_loss 3.316 | bleu 0 | ppl 9.96 | wps 1354.4 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 6076 | lr 0.000295 | gnorm 17.318 | clip 0.1 | train_wall 297 | wall 2880 (progress_bar.py:269, print())
[2021-03-13 13:47:50] INFO >> epoch 007 | valid on 'valid' subset | loss 58.423 | nll_loss 5.597 | bleu 9.02305 | ppl 48.42 | wps 1683.7 | wpb 5220.2 | bsz 500.1 | num_updates 6076 | best_bleu 9.02305 (progress_bar.py:269, print())
[2021-03-13 13:47:54] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 7 @ 6076 updates, score 9.023047463771928) (writing took 3.691599 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 13:50:27] INFO >> epoch 008: 424 / 868 loss=31.168, nll_loss=2.986, bleu=0, ppl=7.92, wps=1125.3, ups=1.69, wpb=667.7, bsz=64, num_updates=6500, lr=0.000262, gnorm=17.657, clip=0, train_wall=172, wall=3149 (progress_bar.py:262, log())
[2021-03-13 13:53:05] INFO >> epoch 008 | loss 30.827 | nll_loss 2.959 | bleu 0 | ppl 7.78 | wps 1350.5 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 6944 | lr 0.000262 | gnorm 18.112 | clip 0.2 | train_wall 298 | wall 3308 (progress_bar.py:269, print())
[2021-03-13 13:54:59] INFO >> epoch 008 | valid on 'valid' subset | loss 58.778 | nll_loss 5.631 | bleu 9.01392 | ppl 49.57 | wps 1682.2 | wpb 5220.2 | bsz 500.1 | num_updates 6944 | best_bleu 9.02305 (progress_bar.py:269, print())
[2021-03-13 13:55:01] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 8 @ 6944 updates, score 9.013924699389015) (writing took 2.299987 seconds) (checkpoint_utils.py:81, save_checkpoint())
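In this log, `checkpoint_best.pt` is written after epochs whose validation `bleu` equals the reported `best_bleu`, and only `checkpoint_last.pt` is written otherwise (as for epoch 8 above). A small parser for the validation lines makes that pattern easy to check. The regular expression and the function names are illustrative and were written against the line format seen here, not taken from the training code.

```python
import re

# Matches lines such as:
# "... epoch 008 | valid on 'valid' subset | loss 58.778 | ... (progress_bar.py:269, print())"
VALID_RE = re.compile(r"epoch (\d+) \| valid on '(\w+)' subset \| (.*?) \(progress_bar\.py")


def parse_valid_line(line):
    """Return a dict of the numeric stats on a validation line, or None if it is not one."""
    match = VALID_RE.search(line)
    if match is None:
        return None
    stats = {"epoch": int(match.group(1))}
    for field in match.group(3).split(" | "):
        key, value = field.rsplit(" ", 1)
        stats[key] = float(value)
    return stats


def is_new_best(stats):
    """True when this epoch's BLEU is the best so far (the first epoch has no best_bleu field)."""
    return stats["bleu"] == stats.get("best_bleu", stats["bleu"])


epoch7 = parse_valid_line(
    "[2021-03-13 13:47:50] INFO >> epoch 007 | valid on 'valid' subset | loss 58.423 | "
    "nll_loss 5.597 | bleu 9.02305 | ppl 48.42 | wps 1683.7 | wpb 5220.2 | bsz 500.1 | "
    "num_updates 6076 | best_bleu 9.02305 (progress_bar.py:269, print())"
)
epoch8 = parse_valid_line(
    "[2021-03-13 13:54:59] INFO >> epoch 008 | valid on 'valid' subset | loss 58.778 | "
    "nll_loss 5.631 | bleu 9.01392 | ppl 49.57 | wps 1682.2 | wpb 5220.2 | bsz 500.1 | "
    "num_updates 6944 | best_bleu 9.02305 (progress_bar.py:269, print())"
)
print(epoch7["epoch"], is_new_best(epoch7))
print(epoch8["epoch"], is_new_best(epoch8))
```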
[2021-03-13 13:55:22] INFO >> epoch 009: 56 / 868 loss=30.652, nll_loss=2.949, bleu=0, ppl=7.72, wps=1126.6, ups=1.69, wpb=665.1, bsz=64, num_updates=7000, lr=0.000227, gnorm=18.444, clip=0.4, train_wall=172, wall=3445 (progress_bar.py:262, log())
[2021-03-13 13:58:16] INFO >> epoch 009: 556 / 868 loss=27.525, nll_loss=2.639, bleu=0, ppl=6.23, wps=1911.7, ups=2.87, wpb=667.1, bsz=64, num_updates=7500, lr=0.000227, gnorm=18.461, clip=0.2, train_wall=167, wall=3619 (progress_bar.py:262, log())
[2021-03-13 14:00:11] INFO >> epoch 009 | loss 27.7 | nll_loss 2.659 | bleu 0 | ppl 6.31 | wps 1358.5 | ups 2.04 | wpb 666.6 | bsz 64 | num_updates 7812 | lr 0.000227 | gnorm 18.723 | clip 0.5 | train_wall 297 | wall 3734 (progress_bar.py:269, print())
[2021-03-13 14:02:05] INFO >> epoch 009 | valid on 'valid' subset | loss 59.1 | nll_loss 5.662 | bleu 9.26373 | ppl 50.64 | wps 1686.3 | wpb 5220.2 | bsz 500.1 | num_updates 7812 | best_bleu 9.26373 (progress_bar.py:269, print())
[2021-03-13 14:02:08] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 9 @ 7812 updates, score 9.263730128554833) (writing took 3.457729 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:03:16] INFO >> epoch 010: 188 / 868 loss=26.813, nll_loss=2.575, bleu=0, ppl=5.96, wps=1111.2, ups=1.67, wpb=666.5, bsz=64, num_updates=8000, lr=0.000193, gnorm=18.99, clip=0.6, train_wall=175, wall=3919 (progress_bar.py:262, log())
[2021-03-13 14:06:14] INFO >> epoch 010: 688 / 868 loss=25.015, nll_loss=2.401, bleu=0, ppl=5.28, wps=1869.4, ups=2.81, wpb=666.4, bsz=64, num_updates=8500, lr=0.000193, gnorm=19.178, clip=1, train_wall=171, wall=4097 (progress_bar.py:262, log())
[2021-03-13 14:07:18] INFO >> epoch 010 | loss 25.068 | nll_loss 2.406 | bleu 0 | ppl 5.3 | wps 1354.6 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 8680 | lr 0.000193 | gnorm 19.173 | clip 0.8 | train_wall 297 | wall 4161 (progress_bar.py:269, print())
[2021-03-13 14:09:12] INFO >> epoch 010 | valid on 'valid' subset | loss 59.551 | nll_loss 5.705 | bleu 9.48661 | ppl 52.18 | wps 1685.1 | wpb 5220.2 | bsz 500.1 | num_updates 8680 | best_bleu 9.48661 (progress_bar.py:269, print())
[2021-03-13 14:09:16] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 10 @ 8680 updates, score 9.4866111296095) (writing took 3.824611 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:11:10] INFO >> epoch 011: 320 / 868 loss=23.649, nll_loss=2.269, bleu=0, ppl=4.82, wps=1128.7, ups=1.69, wpb=666.7, bsz=64, num_updates=9000, lr=0.000161, gnorm=19.132, clip=1, train_wall=171, wall=4393 (progress_bar.py:262, log())
[2021-03-13 14:14:08] INFO >> epoch 011: 820 / 868 loss=23.027, nll_loss=2.214, bleu=0, ppl=4.64, wps=1871, ups=2.81, wpb=665.5, bsz=64, num_updates=9500, lr=0.000161, gnorm=19.487, clip=0.6, train_wall=170, wall=4570 (progress_bar.py:262, log())
[2021-03-13 14:14:26] INFO >> epoch 011 | loss 22.881 | nll_loss 2.196 | bleu 0 | ppl 4.58 | wps 1353.7 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 9548 | lr 0.000161 | gnorm 19.203 | clip 0.8 | train_wall 297 | wall 4589 (progress_bar.py:269, print())
[2021-03-13 14:16:19] INFO >> epoch 011 | valid on 'valid' subset | loss 60.112 | nll_loss 5.759 | bleu 9.59903 | ppl 54.16 | wps 1681 | wpb 5220.2 | bsz 500.1 | num_updates 9548 | best_bleu 9.59903 (progress_bar.py:269, print())
[2021-03-13 14:16:23] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 11 @ 9548 updates, score 9.599033921670394) (writing took 3.832378 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:19:02] INFO >> epoch 012: 452 / 868 loss=21.012, nll_loss=2.018, bleu=0, ppl=4.05, wps=1130.2, ups=1.7, wpb=666.2, bsz=64, num_updates=10000, lr=0.000132, gnorm=19.079, clip=1.2, train_wall=170, wall=4865 (progress_bar.py:262, log())
[2021-03-13 14:21:33] INFO >> epoch 012 | loss 21.125 | nll_loss 2.028 | bleu 0 | ppl 4.08 | wps 1352.6 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 10416 | lr 0.000132 | gnorm 19.436 | clip 1.6 | train_wall 297 | wall 5016 (progress_bar.py:269, print())
[2021-03-13 14:23:27] INFO >> epoch 012 | valid on 'valid' subset | loss 60.577 | nll_loss 5.804 | bleu 9.70668 | ppl 55.86 | wps 1684.4 | wpb 5220.2 | bsz 500.1 | num_updates 10416 | best_bleu 9.70668 (progress_bar.py:269, print())
[2021-03-13 14:23:31] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 12 @ 10416 updates, score 9.70668232918501) (writing took 3.768739 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:24:01] INFO >> epoch 013: 84 / 868 loss=21.165, nll_loss=2.027, bleu=0, ppl=4.08, wps=1118.9, ups=1.67, wpb=668.2, bsz=64, num_updates=10500, lr=0.000105, gnorm=19.687, clip=2, train_wall=174, wall=5164 (progress_bar.py:262, log())
[2021-03-13 14:26:58] INFO >> epoch 013: 584 / 868 loss=19.508, nll_loss=1.874, bleu=0, ppl=3.66, wps=1878.5, ups=2.82, wpb=666.1, bsz=64, num_updates=11000, lr=0.000105, gnorm=19.197, clip=1.8, train_wall=170, wall=5341 (progress_bar.py:262, log())
[2021-03-13 14:28:41] INFO >> epoch 013 | loss 19.655 | nll_loss 1.887 | bleu 0 | ppl 3.7 | wps 1353.2 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 11284 | lr 0.000105 | gnorm 19.362 | clip 1.6 | train_wall 297 | wall 5444 (progress_bar.py:269, print())
[2021-03-13 14:30:35] INFO >> epoch 013 | valid on 'valid' subset | loss 60.964 | nll_loss 5.841 | bleu 9.79243 | ppl 57.32 | wps 1686.7 | wpb 5220.2 | bsz 500.1 | num_updates 11284 | best_bleu 9.79243 (progress_bar.py:269, print())
[2021-03-13 14:30:38] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 13 @ 11284 updates, score 9.792427764691736) (writing took 3.765152 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:31:58] INFO >> epoch 014: 216 / 868 loss=19.27, nll_loss=1.849, bleu=0, ppl=3.6, wps=1112.4, ups=1.67, wpb=667.1, bsz=64, num_updates=11500, lr=8.3e-05, gnorm=19.452, clip=1, train_wall=175, wall=5641 (progress_bar.py:262, log())
[2021-03-13 14:34:54] INFO >> epoch 014: 716 / 868 loss=18.534, nll_loss=1.777, bleu=0, ppl=3.43, wps=1890.4, ups=2.83, wpb=667.2, bsz=64, num_updates=12000, lr=8.3e-05, gnorm=19.235, clip=0.8, train_wall=169, wall=5817 (progress_bar.py:262, log())
[2021-03-13 14:35:50] INFO >> epoch 014 | loss 18.496 | nll_loss 1.775 | bleu 0 | ppl 3.42 | wps 1350.3 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 12152 | lr 8.3e-05 | gnorm 19.228 | clip 0.9 | train_wall 298 | wall 5872 (progress_bar.py:269, print())
[2021-03-13 14:37:44] INFO >> epoch 014 | valid on 'valid' subset | loss 61.322 | nll_loss 5.875 | bleu 9.95747 | ppl 58.7 | wps 1678 | wpb 5220.2 | bsz 500.1 | num_updates 12152 | best_bleu 9.95747 (progress_bar.py:269, print())
[2021-03-13 14:37:47] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 14 @ 12152 updates, score 9.957469064736461) (writing took 3.757679 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:39:53] INFO >> epoch 015: 348 / 868 loss=17.814, nll_loss=1.712, bleu=0, ppl=3.28, wps=1115.7, ups=1.68, wpb=666, bsz=64, num_updates=12500, lr=6.4e-05, gnorm=19.127, clip=1, train_wall=173, wall=6116 (progress_bar.py:262, log())
[2021-03-13 14:42:50] INFO >> epoch 015: 848 / 868 loss=17.597, nll_loss=1.69, bleu=0, ppl=3.23, wps=1882.7, ups=2.83, wpb=666.2, bsz=64, num_updates=13000, lr=6.4e-05, gnorm=19.111, clip=0.4, train_wall=169, wall=6293 (progress_bar.py:262, log())
[2021-03-13 14:42:57] INFO >> epoch 015 | loss 17.543 | nll_loss 1.684 | bleu 0 | ppl 3.21 | wps 1353.3 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 13020 | lr 6.4e-05 | gnorm 19.036 | clip 0.7 | train_wall 297 | wall 6300 (progress_bar.py:269, print())
[2021-03-13 14:44:51] INFO >> epoch 015 | valid on 'valid' subset | loss 61.611 | nll_loss 5.903 | bleu 10.0045 | ppl 59.83 | wps 1680.5 | wpb 5220.2 | bsz 500.1 | num_updates 13020 | best_bleu 10.0045 (progress_bar.py:269, print())
[2021-03-13 14:44:55] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 15 @ 13020 updates, score 10.004475807404225) (writing took 3.643793 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:47:44] INFO >> epoch 016: 480 / 868 loss=16.764, nll_loss=1.612, bleu=0, ppl=3.06, wps=1131.3, ups=1.7, wpb=665.2, bsz=64, num_updates=13500, lr=4.8e-05, gnorm=18.884, clip=0.8, train_wall=169, wall=6587 (progress_bar.py:262, log())
[2021-03-13 14:50:05] INFO >> epoch 016 | loss 16.817 | nll_loss 1.614 | bleu 0 | ppl 3.06 | wps 1350.7 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 13888 | lr 4.8e-05 | gnorm 18.988 | clip 0.6 | train_wall 298 | wall 6728 (progress_bar.py:269, print())
[2021-03-13 14:51:59] INFO >> epoch 016 | valid on 'valid' subset | loss 61.928 | nll_loss 5.933 | bleu 10.0412 | ppl 61.11 | wps 1679.1 | wpb 5220.2 | bsz 500.1 | num_updates 13888 | best_bleu 10.0412 (progress_bar.py:269, print())
[2021-03-13 14:52:03] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 16 @ 13888 updates, score 10.041209674714201) (writing took 3.598538 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 14:52:43] INFO >> epoch 017: 112 / 868 loss=16.808, nll_loss=1.609, bleu=0, ppl=3.05, wps=1117.9, ups=1.67, wpb=668.7, bsz=64, num_updates=14000, lr=3.5e-05, gnorm=19.053, clip=0.6, train_wall=174, wall=6886 (progress_bar.py:262, log())
[2021-03-13 14:55:41] INFO >> epoch 017: 612 / 868 loss=16.264, nll_loss=1.559, bleu=0, ppl=2.95, wps=1877.7, ups=2.81, wpb=667.4, bsz=64, num_updates=14500, lr=3.5e-05, gnorm=18.788, clip=0.8, train_wall=170, wall=7064 (progress_bar.py:262, log())
[2021-03-13 14:57:13] INFO >> epoch 017 | loss 16.285 | nll_loss 1.563 | bleu 0 | ppl 2.96 | wps 1353.1 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 14756 | lr 3.5e-05 | gnorm 18.859 | clip 1.2 | train_wall 297 | wall 7156 (progress_bar.py:269, print())
[2021-03-13 14:59:07] INFO >> epoch 017 | valid on 'valid' subset | loss 62.12 | nll_loss 5.952 | bleu 10.0477 | ppl 61.89 | wps 1680 | wpb 5220.2 | bsz 500.1 | num_updates 14756 | best_bleu 10.0477 (progress_bar.py:269, print())
[2021-03-13 14:59:11] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 17 @ 14756 updates, score 10.047685569458828) (writing took 3.695944 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:00:39] INFO >> epoch 018: 244 / 868 loss=16.059, nll_loss=1.545, bleu=0, ppl=2.92, wps=1116.3, ups=1.68, wpb=665.2, bsz=64, num_updates=15000, lr=2.6e-05, gnorm=18.947, clip=1.6, train_wall=173, wall=7361 (progress_bar.py:262, log())
[2021-03-13 15:03:36] INFO >> epoch 018: 744 / 868 loss=15.933, nll_loss=1.528, bleu=0, ppl=2.88, wps=1878.2, ups=2.82, wpb=667, bsz=64, num_updates=15500, lr=2.6e-05, gnorm=18.855, clip=0.4, train_wall=170, wall=7539 (progress_bar.py:262, log())
[2021-03-13 15:04:21] INFO >> epoch 018 | loss 15.884 | nll_loss 1.525 | bleu 0 | ppl 2.88 | wps 1351.9 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 15624 | lr 2.6e-05 | gnorm 18.84 | clip 0.5 | train_wall 297 | wall 7584 (progress_bar.py:269, print())
[2021-03-13 15:06:15] INFO >> epoch 018 | valid on 'valid' subset | loss 62.351 | nll_loss 5.974 | bleu 10.1459 | ppl 62.84 | wps 1677.9 | wpb 5220.2 | bsz 500.1 | num_updates 15624 | best_bleu 10.1459 (progress_bar.py:269, print())
[2021-03-13 15:06:19] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 18 @ 15624 updates, score 10.1458955766709) (writing took 3.732891 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:08:35] INFO >> epoch 019: 376 / 868 loss=15.572, nll_loss=1.498, bleu=0, ppl=2.82, wps=1112.9, ups=1.67, wpb=665.1, bsz=64, num_updates=16000, lr=1.8e-05, gnorm=18.718, clip=0.6, train_wall=174, wall=7838 (progress_bar.py:262, log())
[2021-03-13 15:11:30] INFO >> epoch 019 | loss 15.578 | nll_loss 1.495 | bleu 0 | ppl 2.82 | wps 1349.3 | ups 2.02 | wpb 666.6 | bsz 64 | num_updates 16492 | lr 1.8e-05 | gnorm 18.787 | clip 0.7 | train_wall 298 | wall 8013 (progress_bar.py:269, print())
[2021-03-13 15:13:24] INFO >> epoch 019 | valid on 'valid' subset | loss 62.38 | nll_loss 5.977 | bleu 10.1234 | ppl 62.97 | wps 1681.3 | wpb 5220.2 | bsz 500.1 | num_updates 16492 | best_bleu 10.1459 (progress_bar.py:269, print())
[2021-03-13 15:13:26] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 19 @ 16492 updates, score 10.12340506292069) (writing took 2.181532 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:13:28] INFO >> epoch 020: 8 / 868 loss=15.663, nll_loss=1.501, bleu=0, ppl=2.83, wps=1138.4, ups=1.71, wpb=667.6, bsz=64, num_updates=16500, lr=1.3e-05, gnorm=18.864, clip=0.6, train_wall=170, wall=8131 (progress_bar.py:262, log())
[2021-03-13 15:16:23] INFO >> epoch 020: 508 / 868 loss=15.273, nll_loss=1.468, bleu=0, ppl=2.77, wps=1908.5, ups=2.87, wpb=666, bsz=64, num_updates=17000, lr=1.3e-05, gnorm=18.613, clip=0.8, train_wall=167, wall=8306 (progress_bar.py:262, log())
[2021-03-13 15:18:26] INFO >> epoch 020 | loss 15.368 | nll_loss 1.475 | bleu 0 | ppl 2.78 | wps 1390.8 | ups 2.09 | wpb 666.6 | bsz 64 | num_updates 17360 | lr 1.3e-05 | gnorm 18.754 | clip 0.8 | train_wall 287 | wall 8429 (progress_bar.py:269, print())
[2021-03-13 15:20:15] INFO >> epoch 020 | valid on 'valid' subset | loss 62.472 | nll_loss 5.985 | bleu 10.1356 | ppl 63.35 | wps 1761.5 | wpb 5220.2 | bsz 500.1 | num_updates 17360 | best_bleu 10.1459 (progress_bar.py:269, print())
[2021-03-13 15:20:17] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 20 @ 17360 updates, score 10.135579790263938) (writing took 2.529248 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:21:02] INFO >> epoch 021: 140 / 868 loss=15.408, nll_loss=1.477, bleu=0, ppl=2.78, wps=1192, ups=1.79, wpb=667.1, bsz=63.9, num_updates=17500, lr=9e-06, gnorm=18.879, clip=0.6, train_wall=161, wall=8585 (progress_bar.py:262, log())
[2021-03-13 15:23:58] INFO >> epoch 021: 640 / 868 loss=15.228, nll_loss=1.461, bleu=0, ppl=2.75, wps=1901.4, ups=2.85, wpb=667.2, bsz=64, num_updates=18000, lr=9e-06, gnorm=18.671, clip=0.6, train_wall=168, wall=8761 (progress_bar.py:262, log())
[2021-03-13 15:25:15] INFO >> epoch 021 | loss 15.204 | nll_loss 1.459 | bleu 0 | ppl 2.75 | wps 1414.2 | ups 2.12 | wpb 666.6 | bsz 64 | num_updates 18228 | lr 9e-06 | gnorm 18.689 | clip 0.7 | train_wall 286 | wall 8838 (progress_bar.py:269, print())
[2021-03-13 15:27:02] INFO >> epoch 021 | valid on 'valid' subset | loss 62.536 | nll_loss 5.991 | bleu 10.1559 | ppl 63.62 | wps 1788.7 | wpb 5220.2 | bsz 500.1 | num_updates 18228 | best_bleu 10.1559 (progress_bar.py:269, print())
[2021-03-13 15:27:06] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 21 @ 18228 updates, score 10.155869442148523) (writing took 3.645213 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:28:38] INFO >> epoch 022: 272 / 868 loss=15.161, nll_loss=1.452, bleu=0, ppl=2.74, wps=1192.4, ups=1.78, wpb=668.2, bsz=64, num_updates=18500, lr=6e-06, gnorm=18.7, clip=1, train_wall=163, wall=9041 (progress_bar.py:262, log())
[2021-03-13 15:31:30] INFO >> epoch 022: 772 / 868 loss=15.065, nll_loss=1.448, bleu=0, ppl=2.73, wps=1936.9, ups=2.91, wpb=665.6, bsz=64, num_updates=19000, lr=6e-06, gnorm=18.58, clip=0.6, train_wall=165, wall=9213 (progress_bar.py:262, log())
[2021-03-13 15:32:01] INFO >> epoch 022 | loss 15.093 | nll_loss 1.449 | bleu 0 | ppl 2.73 | wps 1424.2 | ups 2.14 | wpb 666.6 | bsz 64 | num_updates 19096 | lr 6e-06 | gnorm 18.618 | clip 0.7 | train_wall 284 | wall 9244 (progress_bar.py:269, print())
[2021-03-13 15:33:49] INFO >> epoch 022 | valid on 'valid' subset | loss 62.611 | nll_loss 5.999 | bleu 10.166 | ppl 63.94 | wps 1783.1 | wpb 5220.2 | bsz 500.1 | num_updates 19096 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 15:33:52] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_best.pt (epoch 22 @ 19096 updates, score 10.165978328240914) (writing took 3.537951 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:36:08] INFO >> epoch 023: 404 / 868 loss=15.015, nll_loss=1.442, bleu=0, ppl=2.72, wps=1198.3, ups=1.8, wpb=666, bsz=64, num_updates=19500, lr=4e-06, gnorm=18.53, clip=1, train_wall=160, wall=9491 (progress_bar.py:262, log())
[2021-03-13 15:38:50] INFO >> epoch 023 | loss 15.005 | nll_loss 1.44 | bleu 0 | ppl 2.71 | wps 1417.6 | ups 2.13 | wpb 666.6 | bsz 64 | num_updates 19964 | lr 4e-06 | gnorm 18.597 | clip 0.7 | train_wall 285 | wall 9653 (progress_bar.py:269, print())
[2021-03-13 15:40:37] INFO >> epoch 023 | valid on 'valid' subset | loss 62.628 | nll_loss 6 | bleu 10.1482 | ppl 64.01 | wps 1775.3 | wpb 5220.2 | bsz 500.1 | num_updates 19964 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 15:40:40] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 23 @ 19964 updates, score 10.148158981162911) (writing took 2.327245 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:40:52] INFO >> epoch 024: 36 / 868 loss=15.008, nll_loss=1.442, bleu=0, ppl=2.72, wps=1173.1, ups=1.76, wpb=666.1, bsz=64, num_updates=20000, lr=2e-06, gnorm=18.703, clip=0.6, train_wall=167, wall=9775 (progress_bar.py:262, log())
[2021-03-13 15:43:42] INFO >> epoch 024: 536 / 868 loss=14.963, nll_loss=1.436, bleu=0, ppl=2.71, wps=1954.1, ups=2.93, wpb=666.6, bsz=64, num_updates=20500, lr=2e-06, gnorm=18.699, clip=0.8, train_wall=164, wall=9945 (progress_bar.py:262, log())
[2021-03-13 15:45:39] INFO >> epoch 024 | loss 14.996 | nll_loss 1.439 | bleu 0 | ppl 2.71 | wps 1412.7 | ups 2.12 | wpb 666.6 | bsz 64 | num_updates 20832 | lr 2e-06 | gnorm 18.642 | clip 1 | train_wall 287 | wall 10062 (progress_bar.py:269, print())
[2021-03-13 15:47:31] INFO >> epoch 024 | valid on 'valid' subset | loss 62.643 | nll_loss 6.002 | bleu 10.1485 | ppl 64.08 | wps 1716.7 | wpb 5220.2 | bsz 500.1 | num_updates 20832 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 15:47:33] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 24 @ 20832 updates, score 10.148500921326056) (writing took 2.361867 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:48:31] INFO >> epoch 025: 168 / 868 loss=15.065, nll_loss=1.444, bleu=0, ppl=2.72, wps=1155, ups=1.73, wpb=667.7, bsz=64, num_updates=21000, lr=2e-06, gnorm=18.565, clip=1.2, train_wall=168, wall=10234 (progress_bar.py:262, log())
[2021-03-13 15:51:30] INFO >> epoch 025: 668 / 868 loss=14.967, nll_loss=1.435, bleu=0, ppl=2.7, wps=1869.3, ups=2.8, wpb=667.5, bsz=64, num_updates=21500, lr=2e-06, gnorm=18.531, clip=0.2, train_wall=171, wall=10413 (progress_bar.py:262, log())
[2021-03-13 15:52:41] INFO >> epoch 025 | loss 14.955 | nll_loss 1.435 | bleu 0 | ppl 2.7 | wps 1371.6 | ups 2.06 | wpb 666.6 | bsz 64 | num_updates 21700 | lr 2e-06 | gnorm 18.565 | clip 0.3 | train_wall 296 | wall 10484 (progress_bar.py:269, print())
[2021-03-13 15:54:33] INFO >> epoch 025 | valid on 'valid' subset | loss 62.65 | nll_loss 6.002 | bleu 10.1465 | ppl 64.11 | wps 1702.9 | wpb 5220.2 | bsz 500.1 | num_updates 21700 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 15:54:35] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 25 @ 21700 updates, score 10.146479287740247) (writing took 2.108213 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 15:56:22] INFO >> epoch 026: 300 / 868 loss=14.828, nll_loss=1.428, bleu=0, ppl=2.69, wps=1138.6, ups=1.71, wpb=664.4, bsz=64, num_updates=22000, lr=1e-06, gnorm=18.581, clip=0.2, train_wall=170, wall=10705 (progress_bar.py:262, log())
[2021-03-13 15:59:21] INFO >> epoch 026: 800 / 868 loss=14.972, nll_loss=1.438, bleu=0, ppl=2.71, wps=1861.8, ups=2.79, wpb=666.4, bsz=64, num_updates=22500, lr=1e-06, gnorm=18.503, clip=0.6, train_wall=172, wall=10884 (progress_bar.py:262, log())
[2021-03-13 15:59:43] INFO >> epoch 026 | loss 14.927 | nll_loss 1.433 | bleu 0 | ppl 2.7 | wps 1369.6 | ups 2.05 | wpb 666.6 | bsz 64 | num_updates 22568 | lr 1e-06 | gnorm 18.523 | clip 0.5 | train_wall 295 | wall 10906 (progress_bar.py:269, print())
[2021-03-13 16:01:38] INFO >> epoch 026 | valid on 'valid' subset | loss 62.66 | nll_loss 6.003 | bleu 10.1486 | ppl 64.15 | wps 1672.8 | wpb 5220.2 | bsz 500.1 | num_updates 22568 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:01:40] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 26 @ 22568 updates, score 10.148622794100001) (writing took 2.168515 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:04:18] INFO >> epoch 027: 432 / 868 loss=14.868, nll_loss=1.426, bleu=0, ppl=2.69, wps=1121.4, ups=1.68, wpb=666.9, bsz=64, num_updates=23000, lr=1e-06, gnorm=18.5, clip=0.6, train_wall=173, wall=11181 (progress_bar.py:262, log())
[2021-03-13 16:06:50] INFO >> epoch 027 | loss 14.896 | nll_loss 1.43 | bleu 0 | ppl 2.69 | wps 1355.6 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 23436 | lr 1e-06 | gnorm 18.568 | clip 0.8 | train_wall 297 | wall 11333 (progress_bar.py:269, print())
[2021-03-13 16:08:44] INFO >> epoch 027 | valid on 'valid' subset | loss 62.667 | nll_loss 6.004 | bleu 10.1483 | ppl 64.18 | wps 1678.5 | wpb 5220.2 | bsz 500.1 | num_updates 23436 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:08:46] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 27 @ 23436 updates, score 10.148261521115503) (writing took 2.140262 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:09:10] INFO >> epoch 028: 64 / 868 loss=14.9, nll_loss=1.43, bleu=0, ppl=2.69, wps=1141.5, ups=1.71, wpb=666.5, bsz=64, num_updates=23500, lr=0, gnorm=18.592, clip=0.8, train_wall=168, wall=11473 (progress_bar.py:262, log())
[2021-03-13 16:12:08] INFO >> epoch 028: 564 / 868 loss=14.856, nll_loss=1.427, bleu=0, ppl=2.69, wps=1873.1, ups=2.81, wpb=666.1, bsz=64, num_updates=24000, lr=0, gnorm=18.615, clip=0.6, train_wall=170, wall=11651 (progress_bar.py:262, log())
[2021-03-13 16:13:57] INFO >> epoch 028 | loss 14.904 | nll_loss 1.431 | bleu 0 | ppl 2.7 | wps 1356.3 | ups 2.03 | wpb 666.6 | bsz 64 | num_updates 24304 | lr 0 | gnorm 18.603 | clip 0.7 | train_wall 297 | wall 11760 (progress_bar.py:269, print())
[2021-03-13 16:15:51] INFO >> epoch 028 | valid on 'valid' subset | loss 62.669 | nll_loss 6.004 | bleu 10.1507 | ppl 64.19 | wps 1681.4 | wpb 5220.2 | bsz 500.1 | num_updates 24304 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:15:53] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 28 @ 24304 updates, score 10.150664697639428) (writing took 2.160613 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:17:03] INFO >> epoch 029: 196 / 868 loss=14.928, nll_loss=1.432, bleu=0, ppl=2.7, wps=1130.1, ups=1.69, wpb=666.8, bsz=64, num_updates=24500, lr=0, gnorm=18.543, clip=0.6, train_wall=172, wall=11946 (progress_bar.py:262, log())
[2021-03-13 16:20:02] INFO >> epoch 029: 696 / 868 loss=14.99, nll_loss=1.436, bleu=0, ppl=2.71, wps=1866.3, ups=2.79, wpb=668.2, bsz=64, num_updates=25000, lr=0, gnorm=18.66, clip=0.4, train_wall=172, wall=12125 (progress_bar.py:262, log())
[2021-03-13 16:21:02] INFO >> epoch 029 | loss 14.884 | nll_loss 1.429 | bleu 0 | ppl 2.69 | wps 1359.9 | ups 2.04 | wpb 666.6 | bsz 64 | num_updates 25172 | lr 0 | gnorm 18.556 | clip 0.3 | train_wall 296 | wall 12185 (progress_bar.py:269, print())
[2021-03-13 16:22:56] INFO >> epoch 029 | valid on 'valid' subset | loss 62.67 | nll_loss 6.004 | bleu 10.1493 | ppl 64.19 | wps 1683.2 | wpb 5220.2 | bsz 500.1 | num_updates 25172 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:22:58] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 29 @ 25172 updates, score 10.149273439620009) (writing took 2.091821 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:24:53] INFO >> epoch 030: 328 / 868 loss=14.915, nll_loss=1.43, bleu=0, ppl=2.69, wps=1145.8, ups=1.72, wpb=667.1, bsz=64, num_updates=25500, lr=0, gnorm=18.691, clip=0.4, train_wall=168, wall=12416 (progress_bar.py:262, log())
[2021-03-13 16:27:55] INFO >> epoch 030: 828 / 868 loss=14.821, nll_loss=1.426, bleu=0, ppl=2.69, wps=1825.4, ups=2.74, wpb=665, bsz=64, num_updates=26000, lr=0, gnorm=18.537, clip=0.4, train_wall=175, wall=12598 (progress_bar.py:262, log())
[2021-03-13 16:28:08] INFO >> epoch 030 | loss 14.912 | nll_loss 1.431 | bleu 0 | ppl 2.7 | wps 1357.9 | ups 2.04 | wpb 666.6 | bsz 64 | num_updates 26040 | lr 0 | gnorm 18.658 | clip 0.3 | train_wall 297 | wall 12611 (progress_bar.py:269, print())
[2021-03-13 16:30:02] INFO >> epoch 030 | valid on 'valid' subset | loss 62.67 | nll_loss 6.004 | bleu 10.1507 | ppl 64.19 | wps 1688.1 | wpb 5220.2 | bsz 500.1 | num_updates 26040 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:30:04] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 30 @ 26040 updates, score 10.150652385065664) (writing took 2.641140 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:32:48] INFO >> epoch 031: 460 / 868 loss=14.928, nll_loss=1.436, bleu=0, ppl=2.71, wps=1133.9, ups=1.7, wpb=665.2, bsz=64, num_updates=26500, lr=0, gnorm=18.617, clip=0.8, train_wall=170, wall=12891 (progress_bar.py:262, log())
[2021-03-13 16:35:15] INFO >> epoch 031 | loss 14.894 | nll_loss 1.43 | bleu 0 | ppl 2.69 | wps 1356.8 | ups 2.04 | wpb 666.6 | bsz 64 | num_updates 26908 | lr 0 | gnorm 18.593 | clip 1.2 | train_wall 297 | wall 13038 (progress_bar.py:269, print())
[2021-03-13 16:37:09] INFO >> epoch 031 | valid on 'valid' subset | loss 62.671 | nll_loss 6.004 | bleu 10.15 | ppl 64.19 | wps 1674 | wpb 5220.2 | bsz 500.1 | num_updates 26908 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:37:12] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 31 @ 26908 updates, score 10.15004327688585) (writing took 2.331435 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:37:46] INFO >> epoch 032: 92 / 868 loss=14.891, nll_loss=1.427, bleu=0, ppl=2.69, wps=1122.6, ups=1.68, wpb=667.6, bsz=64, num_updates=27000, lr=0, gnorm=18.618, clip=1.4, train_wall=173, wall=13189 (progress_bar.py:262, log())
[2021-03-13 16:40:44] INFO >> epoch 032: 592 / 868 loss=14.839, nll_loss=1.424, bleu=0, ppl=2.68, wps=1869.7, ups=2.81, wpb=666.5, bsz=64, num_updates=27500, lr=0, gnorm=18.453, clip=0.8, train_wall=171, wall=13367 (progress_bar.py:262, log())
[2021-03-13 16:42:21] INFO >> epoch 032 | loss 14.899 | nll_loss 1.43 | bleu 0 | ppl 2.69 | wps 1356.8 | ups 2.04 | wpb 666.6 | bsz 64 | num_updates 27776 | lr 0 | gnorm 18.564 | clip 0.8 | train_wall 297 | wall 13464 (progress_bar.py:269, print())
[2021-03-13 16:44:16] INFO >> epoch 032 | valid on 'valid' subset | loss 62.671 | nll_loss 6.004 | bleu 10.1512 | ppl 64.2 | wps 1667.4 | wpb 5220.2 | bsz 500.1 | num_updates 27776 | best_bleu 10.166 (progress_bar.py:269, print())
[2021-03-13 16:44:19] INFO >> saved checkpoint /data/yanghe/.ncc/python_wan/summarization/data-mmap/nary_tree2seq/checkpoints/checkpoint_last.pt (epoch 32 @ 27776 updates, score 10.15119857873906) (writing took 2.400287 seconds) (checkpoint_utils.py:81, save_checkpoint())
[2021-03-13 16:44:19] INFO >> early stop since valid performance hasn't improved for last 10 runs (train.py:176, should_stop_early())
[2021-03-13 16:44:19] INFO >> early stop since valid performance hasn't improved for last 10 runs (train.py:260, single_main())
[2021-03-13 16:44:19] INFO >> done training in 13574.4 seconds (train.py:271, single_main())