Gaia integration with Minerva/Girder #72
Comments
One decision that will inform what to do here is determining how data is stored and transferred. There are several different conventions being used here and elsewhere.
I think in general we should always route data through Minerva's server rather than allow the client to get data directly from (or post to) Gaia. Letting the client talk to Gaia directly would add complexity in a way that wouldn't scale to real-world datasets.
Thanks @jbeezley. By routing data "through Minerva's server rather than allow the client to get data directly from (or post to) Gaia", do you mean that Minerva should create a girder-worker job for Gaia that includes the data's item ID(s) (and possibly a format type to distinguish between data stored as metadata and data stored in files)? Girder-worker would then retrieve the datasets as GeoJSON, use them as input to a Gaia process, and save the process output to a new item?
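To make the proposed flow concrete, here is a minimal sketch of the job description and the three steps (retrieve, process, save). The dict layout, the `storage` flag, and the function names `build_gaia_job` and `run_gaia_job` are all hypothetical illustrations; none of this is the actual girder-worker task schema or the Gaia API. The three callables stand in for whatever Girder and Gaia calls end up being used.

```python
def build_gaia_job(item_ids, process, output_folder_id, storage="file"):
    """Describe a Gaia run as a plain dict (hypothetical schema, not girder-worker's)."""
    if storage not in ("file", "metadata"):
        raise ValueError("storage must be 'file' or 'metadata'")
    if not item_ids:
        raise ValueError("at least one input item ID is required")
    return {
        "process": process,
        # One entry per input dataset; 'storage' tells the worker where the
        # GeoJSON lives (item metadata vs. an attached file).
        "inputs": [
            {"item_id": item_id, "storage": storage, "format": "geojson"}
            for item_id in item_ids
        ],
        "output": {"folder_id": output_folder_id, "format": "geojson"},
    }


def run_gaia_job(job, fetch_geojson, run_process, save_item):
    """Worker side: retrieve inputs, run the process, save the output.

    fetch_geojson(item_id, storage) -> GeoJSON dict
    run_process(process_name, inputs) -> GeoJSON dict
    save_item(folder_id, geojson) -> whatever identifies the new item
    All three callables are placeholders supplied by the caller.
    """
    inputs = [
        fetch_geojson(spec["item_id"], spec["storage"]) for spec in job["inputs"]
    ]
    result = run_process(job["process"], inputs)
    return save_item(job["output"]["folder_id"], result)
```

With this shape the client only ever sends item IDs to Minerva; the data itself moves between Girder and the worker.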
@aashish24 based on today's phone call, does this accurately represent your ideas on how to move forward?
@jbeezley @dorukozturk @kotfic thoughts?
@mbertrand You may want to look at the girder_worker girder_io plugin for some ideas about how to leverage the girder-client library to achieve this kind of functionality. Obviously that code is tied up with girder_worker's architecture, but it essentially does what you're describing. @aashish24 rather than creating another custom distributed job management plugin, wouldn't it be better to coordinate with the girder dev team and see if we can put together a PR that meets Gaia's needs through an existing piece of infrastructure?
Thanks @kotfic, I will take a look at that plugin. It would also be my preference to use a standard girder celery/job management framework if possible.
@aashish24 geospatial vector data in minerva can come from a few sources right now:
Each of these sources would need some strategy for processing in Gaia:
Am I missing any other sources? Any particular source I should give priority to?
PS @aashish24 in girder there's also the ability to store a 'geo' field in an item's metadata via the geospatial plugin, but AFAIK this isn't used in Minerva.
Some exploratory code in a notebook here: https://gist.github.com/mbertrand/c153d0019441ef0fc2298e5359d73f2d It uses a combination of girder-client and girder-worker to retrieve GeoJSON data from Minerva items and use it as input to a Gaia process run via girder-worker.
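For readers who don't want to open the notebook, the retrieval step might look roughly like the sketch below. It is not taken from the gist. It assumes a client object exposing `getItem`, `listFile`, and `downloadFile` (which, to my understanding, `girder_client.GirderClient` provides, with `downloadFile` accepting a file-like object). The `meta_key` name and the file-extension check are hypothetical, since how Minerva actually lays out its item metadata is not assumed here.

```python
import io
import json


def geojson_from_item(client, item_id, meta_key="geojson"):
    """Return an item's GeoJSON, whether it is stored as metadata or as a file.

    `meta_key` is a hypothetical metadata field name, not Minerva's real layout.
    """
    item = client.getItem(item_id)
    meta = item.get("meta", {})

    # Case 1: the GeoJSON is stored directly in the item's metadata.
    if meta_key in meta:
        data = meta[meta_key]
        return json.loads(data) if isinstance(data, str) else data

    # Case 2: the GeoJSON is an attached file; download it into memory.
    for file_doc in client.listFile(item_id):
        if file_doc["name"].lower().endswith((".geojson", ".json")):
            buffer = io.BytesIO()
            client.downloadFile(file_doc["_id"], buffer)
            return json.loads(buffer.getvalue().decode("utf-8"))

    raise LookupError("no GeoJSON found for item %s" % item_id)
```

The returned dict could then be handed to a Gaia process as one of its inputs, and the result written back to a new item in the same way.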
Thanks @mbertrand, I am getting back to this. I will have a look.
@aashish24 Here is an alternative approach, based on several of the supplied analyses in Minerva. It is a girder plugin that allows Gaia processes to be run on Minerva GeoJSON datasets as Girder jobs (with or without celery): https://github.com/mbertrand/gaia_minerva
@mbertrand Just an FYI, once girder/girder#1553 clears you should be able to modify gaia_minerva to use remote celery jobs via the
The repo for this is https://github.com/OpenDataAnalytics/gaia_minerva |
Goal
Run Gaia processes from within Minerva
General requirements
@jbeezley @kotfic @aashish24 @dorukozturk let me know if you have any thoughts/suggestions on this, thanks.