Submission for #102 #149
base: master
Conversation
Hi, please find below a review submitted by one of the reviewers:

Score: 6

The problem statement is clearly written in Section 1: the Adam optimizer fails when some gradients have large magnitude but appear rarely.

Code
The code is open-sourced on GitHub.

Communication with the Original Author
The authors do not mention any communication with the original authors in the paper, but a record of a discussion can be found on OpenReview.

Hyperparameter Search
The authors sometimes use hyperparameters that differ from the original authors'.

Ablation Study
No ablation study is performed by the authors.

Discussion on Results
The authors briefly discuss the reproduced results. A more detailed review with side-by-side plots would help the readers.

Recommendations for Reproducibility
The authors do not provide any particular recommendations to the original authors for improving reproducibility.

Overall Organization and Clarity
Overall, the paper is well written and easy to read. However, some parts of the paper are left for the reader to refer to the original paper.

Here are points that I was particularly impressed with:

Here are some parts of the paper I wished for more information on:

Here are some minor fixes I recommend:

Thank you! Score (1-10): 6
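The failure mode this review cites (large-magnitude but rarely occurring gradients) can be illustrated with a minimal, self-contained sketch of a single Adam step. This is illustrative code written for this thread, not code from the submission: because the second-moment estimate v is built from the same gradient that appears in the update's numerator, the very first update has essentially the same size whether the gradient is 1 or 1000, so the magnitude information of a rare large gradient is lost.

```python
import math

def adam_step(g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam step with bias correction (textbook form)."""
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate, sees g itself
    m_hat = m / (1 - b1 ** t)          # bias correction
    v_hat = v / (1 - b2 ** t)
    update = lr * m_hat / (math.sqrt(v_hat) + eps)
    return update, m, v

# First step with a modest vs. a huge gradient: the update sizes are
# nearly identical (both roughly equal to the learning rate), because
# v is dominated by the same gradient that appears in the numerator.
u_small, _, _ = adam_step(1.0, 0.0, 0.0, t=1)
u_huge, _, _ = adam_step(1000.0, 0.0, 0.0, t=1)
```

Here both `u_small` and `u_huge` come out at about 0.001 (the learning rate), which is the correlation problem the reviewed paper targets.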
Hi, please find below a review submitted by one of the reviewers:

Score: 5

The authors present a good description of the problem and exhibit a sound understanding of the problem setting throughout their report. They also do a great job of summarizing the setting and notation used in the original paper. However, I would like to point out a slight typo in the algorithm description (Line 7): the algorithm shifts the gradients by n points, not just 1. I would request the authors to make the necessary correction in the report (I assume it is just a typo and not a mistake in their code, as they use n as a hyperparameter in several experiments). The authors also use an additional normalization for v (Line 8) which I do not see in the description in the original paper. It would help the reader if the authors could explain the need for this normalization. Is it present in the original code base but not mentioned in the paper, or is it something the report authors introduce?

I would like to appreciate the authors' effort in writing the code from scratch in PyTorch (given that the original authors' code was in TensorFlow). However, I feel the authors could improve the README of their GitHub repository by adding a "how to run" section that would help future researchers aiming to build on their work.

The authors show a detailed communication with the original authors on the OpenReview forum but do not mention it in the report. It would help if they could add a link to it in the report.

The hyperparameter search was not conducted extensively. This raises questions about the robustness of the algorithm. As a reproducibility report, I would expect it to have an extensive hyperparameter search section on the replicated experiments before trying out newer experiments, and this was missing. Also, it is not explicitly clear which hyperparameters were chosen by the authors and which were taken from the original implementation. It would help to add a table to the report with three columns: Experiment, Hyperparameters used by the original authors, and Hyperparameters used in the reproducibility report. At the current stage, it is not easy to judge whether the hyperparameters were chosen by looking at the original paper or from the authors' own experience.

I feel a major contribution of this work is the experiments in a more realistic WGAN-GP setting, and I would like to appreciate the authors' efforts on the same. In addition, they replicated most of the experiments in the original paper (except Tiny-ImageNet and CIFAR-10), which gives an insight into the reproducibility of the original paper. They also pointed out discrepancies between the paper and the algorithm's implementation, which I feel is a great contribution in the context of the challenge.

It would be really helpful if the authors could add some recommendations to improve reproducibility in a structured way. Right now, they are slightly scattered and not an easy take-away from the report.
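To make the Line 7 point concrete (the algorithm shifts the gradients by n points, not 1), here is a hypothetical sketch of the delayed-second-moment idea: v is updated with the gradient from n steps ago rather than the current one, so a rare large gradient is not immediately damped by its own square. Function name, the n-step buffer, and the warm-up fallback are all my own illustrative choices, not the submission's implementation.

```python
import math
from collections import deque

def shifted_v_updates(grads, n=2, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """Illustrative sketch only: second moment v is fed the gradient
    delayed by n steps, decorrelating v_t from the current gradient g_t."""
    m, v = 0.0, 0.0
    buf = deque(maxlen=n)  # holds the last n gradients
    updates = []
    for g in grads:
        m = b1 * m + (1 - b1) * g
        if len(buf) == n:                 # a gradient from n steps ago exists
            g_old = buf[0]
            v = b2 * v + (1 - b2) * g_old * g_old
        buf.append(g)
        if v > 0:
            updates.append(lr * m / (math.sqrt(v) + eps))
        else:
            updates.append(lr * g)        # plain SGD fallback during warm-up
    return updates

# A rare large gradient at step 3 now produces a large step, because v
# has only seen the earlier small gradients, not the large one itself.
updates = shifted_v_updates([1.0, 1.0, 100.0, 1.0], n=2)
```

In this toy run the step taken at the large gradient is two orders of magnitude above the learning rate, in contrast to plain Adam where it would be capped near lr.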