"Added notebooks that are colab accessible, they should both be runnable from within colab. " #2
Conversation
The notebooks are great and overall worked well for me. I'd like to get some feedback from others who are less familiar with how METL works.
We will need to add these to the notebooks README.md and explain how to run them in Colab. We can provide a direct link to open these notebooks in Colab; ColabFold uses https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb, for example. We should document that the T4 GPU runtime is sufficient to run both notebooks. Is it necessary to connect the runtime in Colab before starting?
Will we keep the original finetuning.ipynb as well? We probably can as long as we explain their different purposes.
Good feedback :)
From my understanding, I think you have to connect to it manually if you want to use the GPU. I think it has something to do with them not wanting to automatically start billing you if you are allocating something like A100s. Adding something to the readme (and the model card) is needed too, I think.
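For reference, a minimal sketch of how a notebook cell could confirm that a GPU runtime is actually connected before finetuning. This assumes PyTorch is available in the runtime (which the METL notebooks use); it is an illustration, not necessarily what colab_finetuning.ipynb does today.

```python
# Sketch: confirm a GPU runtime (e.g. Colab's T4) is connected before training.
# Assumes torch is already installed in the runtime.
import torch

if torch.cuda.is_available():
    print("GPU runtime connected:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; in Colab, select Runtime > Change runtime type and pick a GPU.")
```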
Yeah, I think we should keep them separate. If you are running the normal one, you don't need the notebook cloning metl again (it's assumed to be cloned already) and force-downloading all of the dependencies every time you run it.
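For illustration, a rough sketch of the kind of Colab-only setup step being described, cloning the repository on each fresh runtime; the exact setup and dependency handling in colab_finetuning.ipynb may differ.

```python
# Sketch of a Colab-only setup step: clone the metl repository if it is not
# already present, so the notebook can run from a fresh runtime.
# The local finetuning.ipynb assumes the repository is already cloned.
import os
import subprocess

if not os.path.isdir("metl"):
    subprocess.run(
        ["git", "clone", "https://github.com/gitter-lab/metl.git"], check=True
    )
# Dependency installation would follow here; omitted because the actual
# requirements handling lives in the notebook itself.
```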
…b version to also remove statement about apple silicone"
notebooks/colab_finetuning.ipynb
Outdated
"\n", | ||
"For demonstration purposes, this repository contains the [avGFP dataset](https://github.com/gitter-lab/metl/tree/main/data/dms_data/avgfp) from [Sarkisyan et al. (2016)](https://doi.org/10.1038/nature17995).\n", | ||
"See the [metl-pub](https://github.com/gitter-lab/metl-pub) repository to access the other experimental datasets we used in our preprint.\n", | ||
"See the README in the [dms_data](https://github.com/gitter-lab/metl/tree/main/data/dms_data/avgfp/data/dms_data) directory for information about how to use your own experimental dataset." |
This global link and some others are broken. Some are still relative. Please double check them all.
@agitter The broken links were links to python or ipynb files that are broken in the original notebooks as well.
Additionally, the broken global links are actually linking to pages that no longer exist. The only way to get to those files is by going to older commits, when they did exist, like the other notebook does.
I do not know what you want to do with that data. I fixed the ipynb links, but I will wait on the broken globals because those files are no longer in the current version of the main repository.
An example is this dms_data link in the notebook: https://github.com/gitter-lab/metl/tree/main/data/dms_data/avgfp/data/dms_data
But the actual files were deleted, though they can be found in an earlier commit: https://github.com/gitter-lab/metl/tree/45237bbc4c6cfae1d4657930bdb732ec41a3fc79/data/dms_data
Are we still supporting this? Does it need to be referenced or talked about? And if it does, why is it not included in the current version of main?
I'm confused about these, so I am leaving them, though I did fix the broken links to Python files and notebooks (which were also broken in the normal finetuning notebook).
Got it. This is a question for @samgelman.
In this case, the link https://github.com/gitter-lab/metl/tree/main/data/dms_data/avgfp/data/dms_data is incorrect; it should just be https://github.com/gitter-lab/metl/tree/main/data/dms_data
I made another pass through the notebooks. Some of the original comments are still unresolved, including some small typo corrections. I'd like to add these to the readme table before merging. @samgelman do you want to test these in Colab before we merge?
Were the typos the links, or were there other things that you saw? I can't seem to find anything else. I thought I got all of the typo corrections you put out in the previous round of comments.
I think a lot of my small comments on the pretraining notebook were marked by GitHub as outdated but weren't incorporated. Try searching "To zero based helper functoin" as one example.
Should've gotten them now. There are still a few questions you can answer in the unresolved comments above, though.
I've reviewed both notebooks, and they are working great! I have a few minor comments for the prediction notebook:
For the finetuning notebook, there are some broken links. It looks like there may have been a find and replace, but the replacement text erroneously added "data/dms_data/avgfp" to the path.
I suggest fixing the broken links, and then other changes can be discussed / added later in a separate PR (especially the JSON vs. text file upload; I don't think that's a critical issue that should block the merge). Great job!
…edback and updated the links in the finetuning notebook"
I reran both notebooks and made a few small changes in 9912989. They both look excellent.
To run,