Training error on BRATS21 #306

Open
FengheTan9 opened this issue Sep 20, 2023 · 10 comments

Comments


FengheTan9 commented Sep 20, 2023

Hello, I followed the steps in the README to run Swin UNETR on the BraTS21 dataset, but the accuracy did not improve.

logfile:

0 gpu 0
Batch size is: 1 epochs 300
Total parameters count 62191941
Writing Tensorboard logs to ./runs/unetr_test_dir
0 Tue Sep 19 07:03:25 2023 Epoch: 0
/opt/conda/lib/python3.8/site-packages/monai/utils/deprecate_utils.py:321: FutureWarning: monai.transforms.io.dictionary LoadImaged.init:image_only: Current default value of argument image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3.
warn_deprecated(argname, msg, warning_category)
Epoch 0/300 0/1000 loss: 0.9774 time 40.33s
Epoch 0/300 200/1000 loss: 0.9718 time 1.50s
Epoch 0/300 400/1000 loss: 0.9715 time 1.51s
Epoch 0/300 600/1000 loss: 0.9722 time 1.51s
Epoch 0/300 800/1000 loss: 0.9726 time 1.51s
Final training 0/299 loss: 0.9722 time 1514.97s
0 Tue Sep 19 07:28:40 2023 Epoch: 1
Epoch 1/300 0/1000 loss: 0.9994 time 3.36s
Epoch 1/300 200/1000 loss: 0.9618 time 1.50s
Epoch 1/300 400/1000 loss: 0.9598 time 1.26s
Epoch 1/300 600/1000 loss: 0.9581 time 1.50s
Epoch 1/300 800/1000 loss: 0.9546 time 1.51s
Final training 1/299 loss: 0.9520 time 1480.98s
0 Tue Sep 19 07:53:21 2023 Epoch: 2
Epoch 2/300 0/1000 loss: 0.8660 time 3.76s
Epoch 2/300 200/1000 loss: 0.9428 time 1.52s
Epoch 2/300 400/1000 loss: 0.9429 time 1.50s
Epoch 2/300 600/1000 loss: 0.9430 time 1.26s
Epoch 2/300 800/1000 loss: 0.9412 time 1.50s
Final training 2/299 loss: 0.9389 time 1479.49s
0 Tue Sep 19 08:18:00 2023 Epoch: 3
Epoch 3/300 0/1000 loss: 0.9594 time 3.77s
Epoch 3/300 200/1000 loss: 0.9342 time 1.53s
Epoch 3/300 400/1000 loss: 0.9320 time 1.53s
Epoch 3/300 600/1000 loss: 0.9303 time 1.52s
Epoch 3/300 800/1000 loss: 0.9293 time 1.52s
Final training 3/299 loss: 0.9282 time 1487.08s
0 Tue Sep 19 08:42:47 2023 Epoch: 4
Epoch 4/300 0/1000 loss: 0.8798 time 3.27s
Epoch 4/300 200/1000 loss: 0.9222 time 1.51s
Epoch 4/300 400/1000 loss: 0.9208 time 1.51s
Epoch 4/300 600/1000 loss: 0.9187 time 1.51s
Epoch 4/300 800/1000 loss: 0.9176 time 1.51s
Final training 4/299 loss: 0.9161 time 1473.44s
0 Tue Sep 19 09:07:21 2023 Epoch: 5
Epoch 5/300 0/1000 loss: 1.0000 time 3.56s
Epoch 5/300 200/1000 loss: 0.9144 time 1.51s
Epoch 5/300 400/1000 loss: 0.9074 time 1.52s
Epoch 5/300 600/1000 loss: 0.9048 time 1.51s
Epoch 5/300 800/1000 loss: 0.9068 time 1.44s
Final training 5/299 loss: 0.9051 time 1475.70s
0 Tue Sep 19 09:31:57 2023 Epoch: 6
Epoch 6/300 0/1000 loss: 0.8484 time 3.59s
Epoch 6/300 200/1000 loss: 0.8920 time 1.00s
Epoch 6/300 400/1000 loss: 0.8919 time 1.51s
Epoch 6/300 600/1000 loss: 0.8888 time 1.00s
Epoch 6/300 800/1000 loss: 0.8863 time 1.00s
Final training 6/299 loss: 0.8842 time 1472.87s
0 Tue Sep 19 09:56:29 2023 Epoch: 7
Epoch 7/300 0/1000 loss: 0.9901 time 3.67s
Epoch 7/300 200/1000 loss: 0.8694 time 1.50s
Epoch 7/300 400/1000 loss: 0.8691 time 1.50s
Epoch 7/300 600/1000 loss: 0.8674 time 1.51s
Epoch 7/300 800/1000 loss: 0.8699 time 1.52s
Final training 7/299 loss: 0.8652 time 1471.87s
0 Tue Sep 19 10:21:01 2023 Epoch: 8
Epoch 8/300 0/1000 loss: 0.7133 time 3.77s
Epoch 8/300 200/1000 loss: 0.8508 time 1.51s
Epoch 8/300 400/1000 loss: 0.8439 time 1.00s
Epoch 8/300 600/1000 loss: 0.8399 time 1.53s
Epoch 8/300 800/1000 loss: 0.8359 time 1.08s
Final training 8/299 loss: 0.8315 time 1483.32s
0 Tue Sep 19 10:45:45 2023 Epoch: 9
Epoch 9/300 0/1000 loss: 0.7796 time 3.66s
Epoch 9/300 200/1000 loss: 0.8162 time 1.53s
Epoch 9/300 400/1000 loss: 0.8025 time 1.52s
Epoch 9/300 600/1000 loss: 0.8013 time 1.51s
Epoch 9/300 800/1000 loss: 0.7993 time 1.52s
Final training 9/299 loss: 0.7989 time 1475.32s
Val 9/300 0/251 , Dice_TC: 0.0030307944 , Dice_WT: 0.026375564 , Dice_ET: 0.0024515165 , time 22.11s
Val 9/300 200/251 , Dice_TC: 0.007678646 , Dice_WT: 0.020653196 , Dice_ET: 0.0047179246 , time 6.02s
Final validation stats 9/299 , Dice_TC: 0.007620856 , Dice_WT: 0.020649604 , Dice_ET: 0.004677914 , time 1492.36s
new best (0.000000 --> 0.010983).
Saving checkpoint ./runs/unetr_test_dir/model.pt
Saving checkpoint ./runs/unetr_test_dir/model_final.pt
Copying to model.pt new best model!!!!
0 Tue Sep 19 11:35:19 2023 Epoch: 10
/opt/conda/lib/python3.8/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Epoch 10/300 0/1000 loss: 0.9864 time 3.49s
Epoch 10/300 200/1000 loss: 0.7814 time 1.51s
Epoch 10/300 400/1000 loss: 0.7723 time 1.51s
Epoch 10/300 600/1000 loss: 0.7662 time 1.50s
Epoch 10/300 800/1000 loss: 0.7629 time 0.99s
Final training 10/299 loss: 0.7553 time 1473.77s
0 Tue Sep 19 11:59:52 2023 Epoch: 11
Epoch 11/300 0/1000 loss: 0.4967 time 3.62s
Epoch 11/300 200/1000 loss: 0.7308 time 1.51s
Epoch 11/300 400/1000 loss: 0.7079 time 1.51s
Epoch 11/300 600/1000 loss: 0.7039 time 1.51s
Epoch 11/300 800/1000 loss: 0.6957 time 1.51s
Final training 11/299 loss: 0.6884 time 1473.34s
0 Tue Sep 19 12:24:26 2023 Epoch: 12
Epoch 12/300 0/1000 loss: 0.7155 time 3.47s
Epoch 12/300 200/1000 loss: 0.6454 time 1.51s
Epoch 12/300 400/1000 loss: 0.6413 time 1.51s
Epoch 12/300 600/1000 loss: 0.6393 time 1.51s
Epoch 12/300 800/1000 loss: 0.6233 time 1.04s
Final training 12/299 loss: 0.6175 time 1477.73s
0 Tue Sep 19 12:49:03 2023 Epoch: 13
Epoch 13/300 0/1000 loss: 0.7709 time 3.64s
Epoch 13/300 200/1000 loss: 0.5679 time 1.53s
Epoch 13/300 400/1000 loss: 0.5704 time 1.53s
Epoch 13/300 600/1000 loss: 0.5589 time 1.52s
Epoch 13/300 800/1000 loss: 0.5586 time 1.53s
Final training 13/299 loss: 0.5499 time 1487.71s
0 Tue Sep 19 13:13:51 2023 Epoch: 14
Epoch 14/300 0/1000 loss: 0.2308 time 3.69s
Epoch 14/300 200/1000 loss: 0.4844 time 1.51s
Epoch 14/300 400/1000 loss: 0.4784 time 1.50s
Epoch 14/300 600/1000 loss: 0.4911 time 1.51s
Epoch 14/300 800/1000 loss: 0.4878 time 1.51s
Final training 14/299 loss: 0.4808 time 1474.15s
0 Tue Sep 19 13:38:25 2023 Epoch: 15
Epoch 15/300 0/1000 loss: 0.3428 time 4.57s
Epoch 15/300 200/1000 loss: 0.4314 time 1.51s
Epoch 15/300 400/1000 loss: 0.4337 time 1.51s
Epoch 15/300 600/1000 loss: 0.4224 time 1.51s
Epoch 15/300 800/1000 loss: 0.4181 time 1.50s
Final training 15/299 loss: 0.4166 time 1477.63s
0 Tue Sep 19 14:03:03 2023 Epoch: 16
Epoch 16/300 0/1000 loss: 0.1402 time 3.82s
Epoch 16/300 200/1000 loss: 0.4020 time 1.51s
Epoch 16/300 400/1000 loss: 0.3914 time 1.51s
Epoch 16/300 600/1000 loss: 0.3859 time 1.48s
Epoch 16/300 800/1000 loss: 0.3818 time 1.51s
Final training 16/299 loss: 0.3783 time 1474.34s
0 Tue Sep 19 14:27:37 2023 Epoch: 17
Epoch 17/300 0/1000 loss: 0.9377 time 3.44s
Epoch 17/300 200/1000 loss: 0.3690 time 1.51s
Epoch 17/300 400/1000 loss: 0.3728 time 1.51s
Epoch 17/300 600/1000 loss: 0.3706 time 1.51s
Epoch 17/300 800/1000 loss: 0.3710 time 1.50s
Final training 17/299 loss: 0.3641 time 1472.89s
0 Tue Sep 19 14:52:10 2023 Epoch: 18
Epoch 18/300 0/1000 loss: 0.1130 time 3.50s
Epoch 18/300 200/1000 loss: 0.3218 time 1.51s
Epoch 18/300 400/1000 loss: 0.3164 time 1.53s
Epoch 18/300 600/1000 loss: 0.3273 time 1.51s
Epoch 18/300 800/1000 loss: 0.3301 time 1.53s
Final training 18/299 loss: 0.3264 time 1483.04s
0 Tue Sep 19 15:16:53 2023 Epoch: 19
Epoch 19/300 0/1000 loss: 0.1883 time 3.55s
Epoch 19/300 200/1000 loss: 0.3354 time 1.52s
Epoch 19/300 400/1000 loss: 0.3292 time 1.51s
Epoch 19/300 600/1000 loss: 0.3253 time 1.51s
Epoch 19/300 800/1000 loss: 0.3262 time 1.50s
Final training 19/299 loss: 0.3252 time 1477.95s
Val 19/300 0/251 , Dice_TC: 0.0030307944 , Dice_WT: 0.026375564 , Dice_ET: 0.0024515165 , time 8.49s
Val 19/300 200/251 , Dice_TC: 0.007678646 , Dice_WT: 0.020653196 , Dice_ET: 0.0047179246 , time 6.01s
Final validation stats 19/299 , Dice_TC: 0.007620856 , Dice_WT: 0.020649604 , Dice_ET: 0.004677914 , time 1478.38s
Saving checkpoint ./runs/unetr_test_dir/model_final.pt
0 Tue Sep 19 16:06:11 2023 Epoch: 20
Epoch 20/300 0/1000 loss: 0.3574 time 3.11s
Epoch 20/300 200/1000 loss: 0.3382 time 1.50s
Epoch 20/300 400/1000 loss: 0.3132 time 1.52s
Epoch 20/300 600/1000 loss: 0.3132 time 1.51s
Epoch 20/300 800/1000 loss: 0.3017 time 1.50s
Final training 20/299 loss: 0.3065 time 1476.15s
0 Tue Sep 19 16:30:47 2023 Epoch: 21
Epoch 21/300 0/1000 loss: 0.1812 time 3.27s
Epoch 21/300 200/1000 loss: 0.2838 time 1.51s
Epoch 21/300 400/1000 loss: 0.3070 time 1.13s
Epoch 21/300 600/1000 loss: 0.3166 time 1.04s
Epoch 21/300 800/1000 loss: 0.3261 time 1.51s
Final training 21/299 loss: 0.3198 time 1472.29s
0 Tue Sep 19 16:55:19 2023 Epoch: 22
Epoch 22/300 0/1000 loss: 0.1040 time 2.99s
Epoch 22/300 200/1000 loss: 0.2709 time 1.52s
Epoch 22/300 400/1000 loss: 0.3096 time 1.51s
Epoch 22/300 600/1000 loss: 0.3091 time 1.51s
Epoch 22/300 800/1000 loss: 0.2992 time 1.51s
Final training 22/299 loss: 0.2893 time 1477.88s
0 Tue Sep 19 17:19:57 2023 Epoch: 23
Epoch 23/300 0/1000 loss: 0.0419 time 3.38s
Epoch 23/300 200/1000 loss: 0.2871 time 1.53s
Epoch 23/300 400/1000 loss: 0.3055 time 1.52s
Epoch 23/300 600/1000 loss: 0.3159 time 1.28s
Epoch 23/300 800/1000 loss: 0.3052 time 1.52s
Final training 23/299 loss: 0.3090 time 1486.37s
0 Tue Sep 19 17:44:43 2023 Epoch: 24
Epoch 24/300 0/1000 loss: 0.0406 time 3.62s
Epoch 24/300 200/1000 loss: 0.3025 time 1.51s
Epoch 24/300 400/1000 loss: 0.2987 time 1.50s
Epoch 24/300 600/1000 loss: 0.3014 time 1.50s
Epoch 24/300 800/1000 loss: 0.3056 time 1.51s
Final training 24/299 loss: 0.3014 time 1474.28s
0 Tue Sep 19 18:09:18 2023 Epoch: 25
Epoch 25/300 0/1000 loss: 0.0554 time 3.60s
Epoch 25/300 200/1000 loss: 0.3325 time 1.51s
Epoch 25/300 400/1000 loss: 0.3288 time 1.52s
Epoch 25/300 600/1000 loss: 0.3243 time 1.38s
Epoch 25/300 800/1000 loss: 0.3055 time 1.50s
Final training 25/299 loss: 0.2995 time 1475.26s
0 Tue Sep 19 18:33:53 2023 Epoch: 26
Epoch 26/300 0/1000 loss: 0.2394 time 4.38s
Epoch 26/300 200/1000 loss: 0.3158 time 1.51s
Epoch 26/300 400/1000 loss: 0.3020 time 1.51s
Epoch 26/300 600/1000 loss: 0.3113 time 1.50s
Epoch 26/300 800/1000 loss: 0.3193 time 1.51s
Final training 26/299 loss: 0.3075 time 1474.43s
0 Tue Sep 19 18:58:27 2023 Epoch: 27
Epoch 27/300 0/1000 loss: 0.1060 time 2.91s
Epoch 27/300 200/1000 loss: 0.2557 time 1.51s
Epoch 27/300 400/1000 loss: 0.2639 time 1.51s
Epoch 27/300 600/1000 loss: 0.2699 time 1.51s
Epoch 27/300 800/1000 loss: 0.2650 time 1.25s
Final training 27/299 loss: 0.2696 time 1470.01s
0 Tue Sep 19 19:22:57 2023 Epoch: 28
Epoch 28/300 0/1000 loss: 0.3487 time 4.24s
Epoch 28/300 200/1000 loss: 0.2666 time 1.52s
Epoch 28/300 400/1000 loss: 0.2722 time 1.52s
Epoch 28/300 600/1000 loss: 0.2883 time 1.52s
Epoch 28/300 800/1000 loss: 0.2856 time 1.51s
Final training 28/299 loss: 0.2862 time 1488.47s
0 Tue Sep 19 19:47:46 2023 Epoch: 29
Epoch 29/300 0/1000 loss: 0.0461 time 4.04s
Epoch 29/300 200/1000 loss: 0.2707 time 1.52s
Epoch 29/300 400/1000 loss: 0.2753 time 1.51s
Epoch 29/300 600/1000 loss: 0.2807 time 1.50s
Epoch 29/300 800/1000 loss: 0.2855 time 1.51s
Final training 29/299 loss: 0.2824 time 1481.74s
Val 29/300 0/251 , Dice_TC: 0.0030307944 , Dice_WT: 0.026375564 , Dice_ET: 0.0024515165 , time 8.38s
Val 29/300 200/251 , Dice_TC: 0.007678646 , Dice_WT: 0.020653196 , Dice_ET: 0.0047179246 , time 6.02s
Final validation stats 29/299 , Dice_TC: 0.007620856 , Dice_WT: 0.020649604 , Dice_ET: 0.004677914 , time 1476.93s
Saving checkpoint ./runs/unetr_test_dir/model_final.pt
0 Tue Sep 19 20:37:06 2023 Epoch: 30
Epoch 30/300 0/1000 loss: 1.0000 time 3.73s
Epoch 30/300 200/1000 loss: 0.3016 time 1.51s
Epoch 30/300 400/1000 loss: 0.2875 time 1.51s
Epoch 30/300 600/1000 loss: 0.2867 time 1.50s
Epoch 30/300 800/1000 loss: 0.2903 time 1.51s
Final training 30/299 loss: 0.2868 time 1476.41s
0 Tue Sep 19 21:01:43 2023 Epoch: 31
Epoch 31/300 0/1000 loss: 0.0602 time 4.06s
Epoch 31/300 200/1000 loss: 0.2830 time 1.49s
Epoch 31/300 400/1000 loss: 0.2741 time 1.32s
Epoch 31/300 600/1000 loss: 0.2904 time 1.51s
Epoch 31/300 800/1000 loss: 0.2899 time 1.51s
Final training 31/299 loss: 0.2942 time 1477.11s
0 Tue Sep 19 21:26:20 2023 Epoch: 32
Epoch 32/300 0/1000 loss: 0.2807 time 3.87s
Epoch 32/300 200/1000 loss: 0.2938 time 1.51s
Epoch 32/300 400/1000 loss: 0.2789 time 1.51s
Epoch 32/300 600/1000 loss: 0.2694 time 1.51s
Epoch 32/300 800/1000 loss: 0.2744 time 1.53s
Final training 32/299 loss: 0.2788 time 1479.12s
0 Tue Sep 19 21:50:59 2023 Epoch: 33
Epoch 33/300 0/1000 loss: 0.1702 time 4.15s
Epoch 33/300 200/1000 loss: 0.2930 time 1.36s
Epoch 33/300 400/1000 loss: 0.2988 time 1.53s
Epoch 33/300 600/1000 loss: 0.2936 time 1.51s
Epoch 33/300 800/1000 loss: 0.2922 time 1.51s
Final training 33/299 loss: 0.2883 time 1486.31s
0 Tue Sep 19 22:15:45 2023 Epoch: 34
Epoch 34/300 0/1000 loss: 0.0522 time 3.71s
Epoch 34/300 200/1000 loss: 0.2491 time 1.51s
Epoch 34/300 400/1000 loss: 0.2687 time 1.51s
Epoch 34/300 600/1000 loss: 0.2749 time 1.51s
Epoch 34/300 800/1000 loss: 0.2729 time 1.51s
Final training 34/299 loss: 0.2695 time 1473.69s
0 Tue Sep 19 22:40:19 2023 Epoch: 35
Epoch 35/300 0/1000 loss: 0.0719 time 3.84s
Epoch 35/300 200/1000 loss: 0.2924 time 1.52s
Epoch 35/300 400/1000 loss: 0.3049 time 1.50s
Epoch 35/300 600/1000 loss: 0.3041 time 1.52s
Epoch 35/300 800/1000 loss: 0.2934 time 1.50s
Final training 35/299 loss: 0.2871 time 1473.62s
0 Tue Sep 19 23:04:52 2023 Epoch: 36
Epoch 36/300 0/1000 loss: 0.1817 time 3.70s
Epoch 36/300 200/1000 loss: 0.3181 time 1.50s
Epoch 36/300 400/1000 loss: 0.2824 time 1.50s
Epoch 36/300 600/1000 loss: 0.2897 time 1.35s
Epoch 36/300 800/1000 loss: 0.2874 time 1.51s
Final training 36/299 loss: 0.2854 time 1477.99s
0 Tue Sep 19 23:29:30 2023 Epoch: 37
Epoch 37/300 0/1000 loss: 0.7340 time 3.79s
Epoch 37/300 200/1000 loss: 0.3279 time 1.51s
Epoch 37/300 400/1000 loss: 0.3255 time 1.51s
Epoch 37/300 600/1000 loss: 0.3096 time 1.51s
Epoch 37/300 800/1000 loss: 0.3049 time 1.51s
Final training 37/299 loss: 0.3109 time 1476.81s
0 Tue Sep 19 23:54:07 2023 Epoch: 38
Epoch 38/300 0/1000 loss: 0.0719 time 3.68s
Epoch 38/300 200/1000 loss: 0.2764 time 1.53s
Epoch 38/300 400/1000 loss: 0.2820 time 1.50s
Epoch 38/300 600/1000 loss: 0.2835 time 1.53s
Epoch 38/300 800/1000 loss: 0.2860 time 1.53s
Final training 38/299 loss: 0.2845 time 1486.57s
0 Wed Sep 20 00:18:54 2023 Epoch: 39
Epoch 39/300 0/1000 loss: 0.1300 time 3.57s
Epoch 39/300 200/1000 loss: 0.2720 time 1.50s
Epoch 39/300 400/1000 loss: 0.2716 time 1.51s
Epoch 39/300 600/1000 loss: 0.2733 time 1.50s
Epoch 39/300 800/1000 loss: 0.2749 time 1.51s
Final training 39/299 loss: 0.2754 time 1479.09s
Val 39/300 0/251 , Dice_TC: 0.0030307944 , Dice_WT: 0.026375564 , Dice_ET: 0.0024515165 , time 8.34s
Val 39/300 200/251 , Dice_TC: 0.007678646 , Dice_WT: 0.020653196 , Dice_ET: 0.0047179246 , time 6.01s
Final validation stats 39/299 , Dice_TC: 0.007620856 , Dice_WT: 0.020649604 , Dice_ET: 0.004677914 , time 1476.42s
Saving checkpoint ./runs/unetr_test_dir/model_final.pt
0 Wed Sep 20 01:08:11 2023 Epoch: 40
Epoch 40/300 0/1000 loss: 0.1369 time 4.09s
Epoch 40/300 200/1000 loss: 0.2911 time 1.51s
Epoch 40/300 400/1000 loss: 0.2979 time 1.51s
Epoch 40/300 600/1000 loss: 0.2993 time 1.15s
Epoch 40/300 800/1000 loss: 0.2878 time 1.48s
Final training 40/299 loss: 0.2874 time 1476.60s
0 Wed Sep 20 01:32:47 2023 Epoch: 41
Epoch 41/300 0/1000 loss: 0.2226 time 3.65s
Epoch 41/300 200/1000 loss: 0.2734 time 1.50s
Epoch 41/300 400/1000 loss: 0.2828 time 1.51s
Epoch 41/300 600/1000 loss: 0.2833 time 1.51s
Epoch 41/300 800/1000 loss: 0.2844 time 1.50s
Final training 41/299 loss: 0.2852 time 1477.13s
0 Wed Sep 20 01:57:25 2023 Epoch: 42
Epoch 42/300 0/1000 loss: 0.0672 time 3.09s
Epoch 42/300 200/1000 loss: 0.3140 time 1.51s
Epoch 42/300 400/1000 loss: 0.3082 time 1.51s
Epoch 42/300 600/1000 loss: 0.3050 time 1.53s
Epoch 42/300 800/1000 loss: 0.3050 time 1.53s
Final training 42/299 loss: 0.2968 time 1478.77s
0 Wed Sep 20 02:22:03 2023 Epoch: 43
Epoch 43/300 0/1000 loss: 0.1963 time 3.82s
Epoch 43/300 200/1000 loss: 0.2688 time 1.52s
Epoch 43/300 400/1000 loss: 0.2781 time 1.53s
Epoch 43/300 600/1000 loss: 0.2899 time 1.51s
Epoch 43/300 800/1000 loss: 0.2832 time 1.02s
Final training 43/299 loss: 0.2827 time 1487.20s
0 Wed Sep 20 02:46:51 2023 Epoch: 44
Epoch 44/300 0/1000 loss: 0.4128 time 3.75s
Epoch 44/300 200/1000 loss: 0.2632 time 1.50s
Epoch 44/300 400/1000 loss: 0.2680 time 1.51s
Epoch 44/300 600/1000 loss: 0.2631 time 1.51s
Epoch 44/300 800/1000 loss: 0.2638 time 1.51s
Final training 44/299 loss: 0.2655 time 1477.86s
0 Wed Sep 20 03:11:28 2023 Epoch: 45
Epoch 45/300 0/1000 loss: 0.1376 time 4.29s
Epoch 45/300 200/1000 loss: 0.2968 time 1.51s
Epoch 45/300 400/1000 loss: 0.2985 time 1.50s
Epoch 45/300 600/1000 loss: 0.2912 time 1.50s
Epoch 45/300 800/1000 loss: 0.2896 time 1.51s
Final training 45/299 loss: 0.2856 time 1478.27s
0 Wed Sep 20 03:36:07 2023 Epoch: 46
Epoch 46/300 0/1000 loss: 0.2607 time 3.64s
Epoch 46/300 200/1000 loss: 0.2903 time 1.50s
Epoch 46/300 400/1000 loss: 0.2744 time 1.51s
Epoch 46/300 600/1000 loss: 0.2799 time 1.51s
Epoch 46/300 800/1000 loss: 0.2753 time 1.51s
Final training 46/299 loss: 0.2789 time 1480.81s
0 Wed Sep 20 04:00:48 2023 Epoch: 47
Epoch 47/300 0/1000 loss: 0.3602 time 3.80s
Epoch 47/300 200/1000 loss: 0.3169 time 1.52s
Epoch 47/300 400/1000 loss: 0.3123 time 1.50s
Epoch 47/300 600/1000 loss: 0.2915 time 1.50s
Epoch 47/300 800/1000 loss: 0.2926 time 1.51s
Final training 47/299 loss: 0.2928 time 1479.22s
0 Wed Sep 20 04:25:27 2023 Epoch: 48
Epoch 48/300 0/1000 loss: 0.1059 time 3.82s
Epoch 48/300 200/1000 loss: 0.2518 time 1.53s
Epoch 48/300 400/1000 loss: 0.2601 time 1.38s
Epoch 48/300 600/1000 loss: 0.2701 time 1.51s
Epoch 48/300 800/1000 loss: 0.2770 time 1.52s
Final training 48/299 loss: 0.2741 time 1489.80s
0 Wed Sep 20 04:50:17 2023 Epoch: 49
Epoch 49/300 0/1000 loss: 0.1415 time 3.74s
Epoch 49/300 200/1000 loss: 0.2665 time 1.51s
Epoch 49/300 400/1000 loss: 0.2727 time 1.51s
Epoch 49/300 600/1000 loss: 0.2614 time 1.51s
Epoch 49/300 800/1000 loss: 0.2593 time 1.33s
Final training 49/299 loss: 0.2669 time 1479.11s
Val 49/300 0/251 , Dice_TC: 0.0030307944 , Dice_WT: 0.026375564 , Dice_ET: 0.0024515165 , time 8.31s
Val 49/300 200/251 , Dice_TC: 0.007678646 , Dice_WT: 0.020653196 , Dice_ET: 0.0047179246 , time 6.02s
Final validation stats 49/299 , Dice_TC: 0.007620856 , Dice_WT: 0.020649604 , Dice_ET: 0.004677914 , time 1481.37s
Saving checkpoint ./runs/unetr_test_dir/model_final.pt
0 Wed Sep 20 05:39:39 2023 Epoch: 50
Epoch 50/300 0/1000 loss: 0.0930 time 3.74s
Epoch 50/300 200/1000 loss: 0.2738 time 1.50s
Epoch 50/300 400/1000 loss: 0.2512 time 1.51s
Epoch 50/300 600/1000 loss: 0.2680 time 1.50s
Epoch 50/300 800/1000 loss: 0.2579 time 1.51s
Final training 50/299 loss: 0.2691 time 1478.59s
0 Wed Sep 20 06:04:17 2023 Epoch: 51
Epoch 51/300 0/1000 loss: 0.3865 time 3.71s
Epoch 51/300 200/1000 loss: 0.2720 time 1.07s
Epoch 51/300 400/1000 loss: 0.2836 time 1.50s
Epoch 51/300 600/1000 loss: 0.2820 time 1.50s
Epoch 51/300 800/1000 loss: 0.2858 time 1.51s
Final training 51/299 loss: 0.2835 time 1480.44s
0 Wed Sep 20 06:28:58 2023 Epoch: 52
Epoch 52/300 0/1000 loss: 0.1500 time 3.77s
Epoch 52/300 200/1000 loss: 0.2358 time 1.39s
Epoch 52/300 400/1000 loss: 0.2420 time 1.51s
Epoch 52/300 600/1000 loss: 0.2641 time 1.53s
Epoch 52/300 800/1000 loss: 0.2652 time 1.53s
Final training 52/299 loss: 0.2637 time 1487.13s
0 Wed Sep 20 06:53:45 2023 Epoch: 53
Epoch 53/300 0/1000 loss: 0.1032 time 4.31s
Epoch 53/300 200/1000 loss: 0.2394 time 1.53s
Epoch 53/300 400/1000 loss: 0.2450 time 1.02s
Epoch 53/300 600/1000 loss: 0.2590 time 1.00s
Epoch 53/300 800/1000 loss: 0.2566 time 0.99s
Final training 53/299 loss: 0.2542 time 1098.19s
0 Wed Sep 20 07:12:03 2023 Epoch: 54
Epoch 54/300 0/1000 loss: 0.4845 time 3.43s
Epoch 54/300 200/1000 loss: 0.2559 time 1.00s
Epoch 54/300 400/1000 loss: 0.2609 time 1.00s
Epoch 54/300 600/1000 loss: 0.2561 time 1.00s
Epoch 54/300 800/1000 loss: 0.2594 time 0.99s
Final training 54/299 loss: 0.2669 time 993.89s
0 Wed Sep 20 07:28:37 2023 Epoch: 55
Epoch 55/300 0/1000 loss: 0.7291 time 3.55s
Epoch 55/300 200/1000 loss: 0.2936 time 0.99s
Epoch 55/300 400/1000 loss: 0.3097 time 1.00s
Epoch 55/300 600/1000 loss: 0.2992 time 1.00s
Epoch 55/300 800/1000 loss: 0.2919 time 0.99s
Final training 55/299 loss: 0.2890 time 992.17s
0 Wed Sep 20 07:45:09 2023 Epoch: 56
Epoch 56/300 0/1000 loss: 0.2320 time 3.72s
Epoch 56/300 200/1000 loss: 0.2644 time 1.00s
Epoch 56/300 400/1000 loss: 0.2647 time 0.99s
Epoch 56/300 600/1000 loss: 0.2566 time 1.00s
Epoch 56/300 800/1000 loss: 0.2461 time 1.01s
Final training 56/299 loss: 0.2477 time 993.08s
0 Wed Sep 20 08:01:42 2023 Epoch: 57
Epoch 57/300 0/1000 loss: 0.0641 time 3.06s
Epoch 57/300 200/1000 loss: 0.2585 time 0.96s
Epoch 57/300 400/1000 loss: 0.2577 time 1.00s
Epoch 57/300 600/1000 loss: 0.2555 time 1.00s
Epoch 57/300 800/1000 loss: 0.2671 time 1.00s
Final training 57/299 loss: 0.2727 time 990.91s
0 Wed Sep 20 08:18:13 2023 Epoch: 58
Epoch 58/300 0/1000 loss: 0.0610 time 3.17s
Epoch 58/300 200/1000 loss: 0.3100 time 1.02s
Epoch 58/300 400/1000 loss: 0.2808 time 1.02s
Epoch 58/300 600/1000 loss: 0.2696 time 1.02s
Epoch 58/300 800/1000 loss: 0.2661 time 1.03s
Final training 58/299 loss: 0.2731 time 1015.76s
0 Wed Sep 20 08:35:09 2023 Epoch: 59
Epoch 59/300 0/1000 loss: 1.0000 time 3.03s
Epoch 59/300 200/1000 loss: 0.2583 time 1.00s
Epoch 59/300 400/1000 loss: 0.2510 time 0.98s
Epoch 59/300 600/1000 loss: 0.2721 time 1.01s
Epoch 59/300 800/1000 loss: 0.2698 time 0.99s
Final training 59/299 loss: 0.2691 time 988.23s
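
Note that the validation metrics above are identical at epochs 9, 19, 29, 39 and 49, even though the training loss falls from about 0.97 to 0.27. That pattern suggests the validation path (inference or post-processing) is producing constant predictions, rather than the model failing to learn. One way to narrow this down is to compute Dice on a single validation batch by hand. A minimal sketch, assuming the usual MONAI BraTS setup (sigmoid activation over three TC/WT/ET channels); `model`, `val_loader` and the 128³ ROI are placeholders for whatever the training script actually built:

```python
# Sanity check: run one validation batch through the model and compute
# Dice by hand, bypassing the training script's aggregation.
import torch
from monai.inferers import sliding_window_inference
from monai.metrics import DiceMetric
from monai.transforms import Activations, AsDiscrete

post_sigmoid = Activations(sigmoid=True)
post_pred = AsDiscrete(threshold=0.5)
dice_metric = DiceMetric(include_background=True, reduction="mean_batch")

model.eval()
with torch.no_grad():
    batch = next(iter(val_loader))
    image, label = batch["image"].cuda(), batch["label"].cuda()
    logits = sliding_window_inference(image, roi_size=(128, 128, 128),
                                      sw_batch_size=1, predictor=model)
    pred = post_pred(post_sigmoid(logits))
    # If pred is all zeros (or all ones), the problem is in the model
    # output or the post-processing, not in the Dice aggregation.
    print("foreground voxels per channel:", pred.sum(dim=(0, 2, 3, 4)))
    dice_metric(y_pred=pred, y=label)
    print("per-channel Dice (TC/WT/ET):", dice_metric.aggregate())
```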

@odtdtdfd

Me too.


Luffy03 commented Oct 19, 2023

Same issue. My accuracy did not increase, and what's more, my training loss did not decrease either.


zbnaseri commented Oct 24, 2023

Hi. I have a memory problem running this code on BraTS20 on Colab and Kaggle. Is there any solution?
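
On the memory question: a common way to fit Swin UNETR onto smaller GPUs such as those on Colab and Kaggle is to enable activation checkpointing and shrink the crop and feature sizes. A minimal sketch, assuming the MONAI 1.x `SwinUNETR` signature; the specific values are illustrative trade-offs, not the settings from this repository's README:

```python
# Fit Swin UNETR on a smaller GPU: enable activation checkpointing and
# use a smaller crop / feature size. These values trade accuracy for memory.
import torch
from monai.networks.nets import SwinUNETR

model = SwinUNETR(
    img_size=(96, 96, 96),  # smaller ROI than the 128^3 crop in the log
    in_channels=4,           # the four BraTS MRI modalities
    out_channels=3,          # TC / WT / ET channels
    feature_size=24,         # half of the paper's 48; much lighter
    use_checkpoint=True,     # recompute activations to save memory
).cuda()

# Mixed precision further reduces activation memory during training.
scaler = torch.cuda.amp.GradScaler()
```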


pcl1121 commented Nov 29, 2023

> Same issue. My accuracy did not increase, and what's more, my training loss did not decrease either.

My training loss also does not decrease in several networks. Have you solved it?

ericspod (Member) commented Jan 6, 2024

@ahatamiz would you be able to help here? Thanks!


Navee402 commented Feb 4, 2024

> Hi. I have a memory problem running this code on BraTS20 on Colab and Kaggle. Is there any solution?

Hey! I am also encountering this problem. I'm trying to run it on an Nvidia RTX 3090, and I am getting this error:

RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Were you able to figure out the problem? Can you please help me if you have solved the issue?
Thank you so much.
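
For out-of-memory errors that appear at validation time, the sliding-window inference is often the culprit, since the stitched full-volume output can be large. A minimal sketch that keeps per-window computation on the GPU but assembles the output on the CPU, assuming MONAI's `sliding_window_inference`; `model` and `image` are placeholders for the usual setup:

```python
# Reduce inference-time GPU memory: compute each sliding window on the
# GPU, but stitch the full-volume logits together on the CPU.
import torch
from monai.inferers import sliding_window_inference

with torch.no_grad(), torch.cuda.amp.autocast():
    logits = sliding_window_inference(
        inputs=image,
        roi_size=(128, 128, 128),
        sw_batch_size=1,                 # one window at a time
        predictor=model,
        overlap=0.25,                    # lower overlap = fewer windows
        sw_device=torch.device("cuda"),  # run each window on the GPU
        device=torch.device("cpu"),      # assemble the output on the CPU
    )
```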


Dasha484 commented Mar 6, 2024

Hello, I also encountered the same issue during training, with particularly low accuracy. Have you managed to resolve it? @FengheTan9


Eight3H commented Mar 13, 2024

> Hello, I also encountered the same issue during training, with particularly low accuracy. Have you managed to resolve it? @FengheTan9

Leave your contact information and let's discuss it.


0Jmyy0 commented Jul 17, 2024

@FengheTan9 I also have the same problem: the loss keeps decreasing, but the val_Dice metric stays low. May I ask if you have solved this problem?
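
One symptom-specific check: when the training loss falls but validation Dice never moves, the labels fed to the Dice computation may not be in the 3-channel TC/WT/ET layout it expects. A minimal sketch using MONAI's `ConvertToMultiChannelBasedOnBratsClasses` transform; the file path and array loading are hypothetical stand-ins for however your labels are stored:

```python
# Check that labels follow the BraTS integer convention and convert
# cleanly to the 3-channel TC/WT/ET layout the Dice metric expects.
# "example_label.npy" is a hypothetical raw label volume of shape (H, W, D).
import numpy as np
from monai.transforms import ConvertToMultiChannelBasedOnBratsClasses

label = np.load("example_label.npy")
print("raw label values:", np.unique(label))  # BraTS uses 0, 1, 2, 4

to_channels = ConvertToMultiChannelBasedOnBratsClasses()
multi = to_channels(label)  # -> (3, H, W, D): TC, WT, ET
print("multi-channel shape:", multi.shape)
print("foreground voxels per channel:", [int(c.sum()) for c in multi])
# An all-zero channel here (e.g. no ET voxels) would keep that Dice near 0.
```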
