-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add marine infrastructure dataset and model config #49
base: master
Are you sure you want to change the base?
Conversation
Changing back to draft, I'm going to add the work on deploying the prediction pipeline on Beaker here. |
…ects into favyen/marine-infra
@@ -104,7 +100,9 @@ def launch_job( | |||
config_path, | |||
"--autoresume=true", | |||
], | |||
constraints=Constraints(cluster=["ai2/jupiter-cirrascale-2"]), | |||
constraints=Constraints( | |||
cluster=["ai2/jupiter-cirrascale-2", "ai2/augusta-google-1"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Make cluster Name into constants or better a class containing all the cluster names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add refactoring of this to #100 since that PR consolidates some of the Beaker stuff.
return cur_groups | ||
|
||
|
||
class MonthlySentinel2(DataSource): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this code make more sense in rslearn? I would imagine there are other usecases for monthly sentinel2 data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for the rest of the data sources here. Add integration tests as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests. I think it makes sense for now to keep it in rslearn_projects, we will see how many applications use the same thing. If we do add to rslearn, we should refactor it so we don't need this wrapper class, ideally it'd be something in the QueryConfig maybe.
rslp/satlas/postprocess.py
Outdated
return fname, json.load(f) | ||
|
||
|
||
def apply_nms( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add tests for the postprocessing functions
applying the model globally but can be disabled for small regions to avoid | ||
the time to create the index. | ||
""" | ||
dataset_config_fname = DATASET_CONFIG_FNAME.format(application=application.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Split this into a couple functions for readability
# Smoothing is handled by a Go script. | ||
subprocess.check_call( | ||
[ | ||
"rslp/satlas/scripts/smooth_point_labels_viterbi", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't the smooth point labels be written in python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be feasible, I'm not sure, this is borrowed from https://github.com/allenai/remote-sensing-data-hub/blob/main/go_scripts/smooth_point_labels_viterbi.go and not much reason to rewrite at this time I think.
# Apply the pipeline. It will ingest data and apply the model. | ||
# We disable rtree index so that it doesn't need an hour to create it. | ||
predict_pipeline( | ||
application=Application.MARINE_INFRA, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have tests for the other applications as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it is only MARINE_INFRA / WIND_TURBINE and these are similar enough that I don't think we need to run both (the code is the same, just different model, but this test is just making sure the code runs correctly).
I think once we have SOLAR_FARM it could use its own test since there will be code specific to the segmentation.
This adds dataset/model configs for solar farm and wind turbine models, along with corresponding inference pipelines.
It also:
rslp.common.worker
system to launch workers that read tasks from a queue (Google Pub/Sub topic) and execute them. In the future all task systems (like Landsat and Sentinel-2 job launchers) should be converted to this system so that they can share the same way to execute workers -- then the individual projects just need to implement the process of writing tasks to the topic.