Refactor database for datasets #5

Open
freeman-lab opened this issue Sep 11, 2015 · 0 comments

To begin addressing #4, we should use a more flexible approach for managing datasets and their metadata. Currently we use a MongoDB collection (within Meteor) for datasets, but it's populated by querying all datasets stored in a dedicated bucket on S3 and is periodically refreshed, so it's more or less transient.

We should instead use a more persistent db that's initialized from the S3 bucket, and provide methods for users to submit datasets directly in the web app and update the db, with some form validation / checking.
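A rough sketch of the "seed once, then accept submissions" idea, just to make the intended behavior concrete. It uses Python's built-in `sqlite3` as a stand-in for whatever persistent store we end up with (not the current Meteor/Mongo setup), and the table layout and the function names `open_store`, `seed_from_bucket`, `submit_dataset`, and `list_datasets` are all hypothetical. The key point is that seeding from the bucket never overwrites a record that is already in the db.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the dataset store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS datasets ("
        "name TEXT PRIMARY KEY, url TEXT NOT NULL, source TEXT NOT NULL)"
    )
    return conn


def seed_from_bucket(conn, bucket_entries):
    """Add (name, url) pairs found in the bucket, skipping names already stored."""
    conn.executemany(
        "INSERT OR IGNORE INTO datasets (name, url, source) "
        "VALUES (?, ?, 'bucket')",
        bucket_entries,
    )
    conn.commit()


def submit_dataset(conn, name, url):
    """Add a user-submitted dataset, replacing any existing record with that name."""
    conn.execute(
        "INSERT OR REPLACE INTO datasets (name, url, source) "
        "VALUES (?, ?, 'submission')",
        (name, url),
    )
    conn.commit()


def list_datasets(conn):
    """Return all records as (name, url, source) tuples, sorted by name."""
    return conn.execute(
        "SELECT name, url, source FROM datasets ORDER BY name"
    ).fetchall()
```

With this shape, re-running the bucket seed after a restart is harmless: user submissions and earlier records stay as they are, and only datasets that are new to the db get added.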

We'll assume anyone submitting data is already hosting it publicly on S3, and the validation during submission can check that the specified resources exist. Submissions can also specify any Jupyter notebooks associated with the data. We'll need to separately address how to include the new notebooks in our notebook deployments (will create another issue for that).
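The submission check could look something like the sketch below. It assumes a submission is a dict with hypothetical fields `name`, `data_url`, and `notebook_urls`, and that "exists" means a plain HTTP `HEAD` request to the public URL succeeds. The existence check is passed in as a parameter so it can be swapped out (for tests, or for an S3 client later).

```python
import urllib.request
from urllib.parse import urlparse


def resource_exists(url, timeout=5.0):
    """Return True if a HEAD request to the public URL gets a 2xx response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def validate_submission(submission, exists=resource_exists):
    """Return a list of error messages; an empty list means the form is OK.

    The field names used here are assumptions, not an agreed schema.
    """
    errors = []
    if not submission.get("name", "").strip():
        errors.append("name is required")

    # The data itself plus any associated Jupyter notebooks.
    urls = [submission.get("data_url", "")]
    urls.extend(submission.get("notebook_urls", []))

    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append(f"not a valid http(s) URL: {url!r}")
        elif not exists(url):
            errors.append(f"resource not found: {url}")
    return errors
```

This only confirms that each URL is reachable; it does not check that the host is actually S3 or that the contents are a valid dataset or notebook.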

cc @bcipolli
