Slimming DenseNet #3
Thanks for your interest. We prune channels according to the BN scaling factors: we set the small factors (and their biases) to 0, and then see which channels can be removed without affecting the network. This applies to all network structures. In DenseNet, the dimension of the scaling factors actually matches the input dimension of the following convolution, because of the "pre-activation" structure. The lambda parameter needs tuning for different datasets and hyperparameters (e.g., learning rate), so you may need to check the final performance.
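For concreteness, here is a minimal PyTorch sketch of the channel-selection step described above, assuming a trained model whose BatchNorm2d layers carry the slimming sparsity penalty. The function name and the `prune_ratio` argument are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

def zero_small_bn_channels(model, prune_ratio=0.4):
    # Collect all BN scaling factors (gamma) across the whole network.
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])

    # Global threshold: the prune_ratio-quantile of |gamma|.
    threshold = torch.quantile(gammas, prune_ratio)

    # Zero out the small scaling factors and their biases; channels whose BN
    # output is now constant zero can later be removed without changing the
    # network's function.
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            mask = m.weight.data.abs().gt(threshold).float()
            m.weight.data.mul_(mask)
            m.bias.data.mul_(mask)
    return threshold
```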
Thanks for your answer. Here is an example from one part of DenseNet-40 (k=12): module.features.init_conv.weight : torch.Size([24, 3, 3, 3]), where the shape [N, C, K, K] is [#filters, #channels, kernel_size, kernel_size], and "norm.weight" is the scaling factor of the batch normalization. For each norm.weight layer, I try to prune 40% of the batch-normalization channels, which correspond to the #filters of the previous conv.weight and the #channels of the following conv.weight. How can you prune the incoming and outgoing weights in this case? Please correct me if I am making mistakes in the pruning. By the way, when the parameters of a layer are pruned, how does this affect the performance of the network? Is there any way to track how the performance changes? Thanks.
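As an illustration of the slicing this comment describes, for a plain Conv -> BN -> Conv chain without DenseNet's concatenation, here is a hypothetical sketch. The tensor names `prev_conv_w`, `bn_gamma`, `bn_beta`, and `next_conv_w` stand in for the corresponding state_dict entries and are not taken from the repository.

```python
import torch

def prune_conv_bn_conv(prev_conv_w, bn_gamma, bn_beta, next_conv_w, keep_ratio=0.6):
    # Keep the channels with the largest BN scaling factors.
    n_keep = int(bn_gamma.numel() * keep_ratio)
    keep_idx = torch.argsort(bn_gamma.abs(), descending=True)[:n_keep]
    keep_idx, _ = torch.sort(keep_idx)

    # Outgoing side of the previous conv: drop whole filters (dim 0 of [N, C, K, K]).
    prev_conv_w = prev_conv_w[keep_idx, :, :, :]

    # BN parameters shrink to the kept channels.
    bn_gamma, bn_beta = bn_gamma[keep_idx], bn_beta[keep_idx]

    # Incoming side of the next conv: drop the matching input channels (dim 1).
    next_conv_w = next_conv_w[:, keep_idx, :, :]
    return prev_conv_w, bn_gamma, bn_beta, next_conv_w
```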
Yes, I see. Also, have you published the code for the DenseNet and ResNet experiments? I need to reproduce all your experiments for evaluation. Thanks.
In case you're still interested, we've released our PyTorch implementation here: https://github.com/Eric-mingjie/network-slimming, which supports ResNet and DenseNet.
Hi @liuzhuang13,
Thank you for the great work. I saw that you leverage the scaling factors of batch normalization to prune the incoming and outgoing weights of conv layers. However, in DenseNet, after a basic block (1x1 + 3x3) the previous features are concatenated with the current ones, so the dimension of the scaling factors does not match that of the previous convolutional layer for pruning. How can you prune weights in this case?
By the way, when sparsity training of DenseNet finishes with lambda 1e-5, I notice that many scaling factors are not small enough for pruning. Does this affect the performance of the compressed network?
Thanks,
Hai
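For reference, here is a minimal sketch of a pre-activation DenseNet basic block (BN -> ReLU -> Conv) under the assumptions discussed in the answer above: because the BN acts on the concatenated features, its scaling factors line up with the input channels of the following conv, so zeroing them prunes that conv's incoming connections rather than the previous conv's output filters. The class and variable names are illustrative, not taken from the repository.

```python
import torch
import torch.nn as nn

class PreActBasicBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=12):
        super().__init__()
        # The BN acts on the concatenated features, so bn.weight (the scaling
        # factors) has `in_channels` entries -- exactly the input dimension of
        # the following conv. Zeroing one entry disables one incoming channel
        # of `self.conv`, which is what gets pruned.
        self.bn = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv(self.relu(self.bn(x)))
        # Dense connectivity: concatenate the new features onto the input.
        return torch.cat([x, out], dim=1)
```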