
Fix Landsat performance issue by using nearest neighbor interpolation #101

Open · wants to merge 1 commit into master
Conversation

@favyen2 (Collaborator) commented Feb 7, 2025

Both Landsat models are currently trained on a dataset that was materialized before allenai/rslearn#104. Because Landsat scenes are not pixel-aligned (their x/y offsets in CRS coordinates are not multiples of the m/pixel resolution), this means that:

  • Before that change, the offset was rounded to the nearest integer. This is not strictly correct, but in this case it corresponds roughly to nearest neighbor interpolation.
  • After that change, the default is bilinear resampling: the values of pixels on the pixel-aligned grid are computed from those in the source images (which for Landsat lie on an offset grid).
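To illustrate the difference (a minimal sketch, not rslearn code, using `scipy.ndimage.shift` on a 1D signal as a stand-in for the raster resampling): rounding a sub-pixel offset to the nearest integer preserves the original pixel values, while bilinear resampling blends neighboring pixels and softens edges.

```python
# Illustrative sketch: resampling a source signal whose grid is offset
# from the target pixel grid by a sub-pixel amount.
import numpy as np
from scipy.ndimage import shift

src = np.array([0.0, 0.0, 10.0, 10.0, 0.0, 0.0])  # a sharp edge
offset = 0.4  # sub-pixel offset between source and target grids

# Old behavior: round the offset to the nearest integer pixel, which is
# roughly nearest neighbor -- the original pixel values are preserved.
nearest = shift(src, round(offset), order=0)

# New default after allenai/rslearn#104: bilinear resampling (order=1),
# which interpolates between neighbors and blends values across the edge.
bilinear = shift(src, offset, order=1)
```

The model was trained on images resembling `nearest`, so feeding it the blended `bilinear` values at inference time is a train/test discrepancy.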

There seems to be a performance drop due to the discrepancy.

Bilinear:

Expected detections between 200 and 500, Actual detections: 447 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 200 and 500, Actual detections: 292 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 200 and 500, Actual detections: 366 
Expected detections between 20 and 50, Actual detections: 19 
Expected detections between 0 and 10, Actual detections: 2 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 31 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 5 
Expected detections between 20 and 50, Actual detections: 20 
Pass: 14, Fail: 1, Pass %: 93.33

Original results:

Expected detections between 200 and 500, Actual detections: 443 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 200 and 500, Actual detections: 363 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 200 and 500, Actual detections: 449 
Expected detections between 20 and 50, Actual detections: 30 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 35 
Expected detections between 0 and 10, Actual detections: 10 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 20 and 50, Actual detections: 45 
Pass: 15, Fail: 0, Pass %: 100.00

We could retrain the model, but nearest neighbor interpolation produces sharper images for human visualization (which Skylight wants), and it is convenient to use the same images for both inference and visualization. It seems retraining is not needed to switch to nearest neighbor interpolation:

Expected detections between 200 and 500, Actual detections: 396 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 200 and 500, Actual detections: 368 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 200 and 500, Actual detections: 404 
Expected detections between 20 and 50, Actual detections: 27 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 35 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 0 and 10, Actual detections: 7 
Expected detections between 20 and 50, Actual detections: 41 
Pass: 15, Fail: 0, Pass %: 100.00

The performance discrepancy is admittedly small, but this change aligns inference more closely with training and also restores the sharpness of the pan-sharpened images. (We could create a separate layer just for visualization, but that seems worse to maintain long-term.)

I also removed the relative path action, since it broke rslp.landsat_vessels.job_launcher: job_launcher passes --config "{JSON}" but the relative path action interprets the JSON as a path and prepends an absolute path to it. This action does not appear to be used anywhere, and it is confusing: a user can no longer cd to a directory outside rslearn_projects and run a config file they have put there unless they use an absolute path, because the action assumes by default that they want to run a config from inside the rslearn_projects directory (which I find unintuitive).
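The failure mode can be sketched like this (a hypothetical reconstruction; `RelativePathAction` and `REPO_ROOT` are illustrative names, not the actual rslp code): an argparse action that resolves --config relative to a repository root mangles a value that is inline JSON rather than a path.

```python
# Hypothetical sketch of the removed behavior, for illustration only.
import argparse
import os

REPO_ROOT = "/opt/rslearn_projects"  # assumed location, for illustration

class RelativePathAction(argparse.Action):
    def __call__(self, parser, namespace, value, option_string=None):
        # Blindly treats every value as a path relative to the repo root.
        setattr(namespace, self.dest, os.path.join(REPO_ROOT, value))

parser = argparse.ArgumentParser()
parser.add_argument("--config", action=RelativePathAction)

# job_launcher passes inline JSON, which is not a path at all; the action
# prepends the repo root, so downstream JSON parsing of the value fails.
args = parser.parse_args(["--config", '{"key": "value"}'])
```

Dropping the action leaves --config as a plain string, so both inline JSON and paths (relative to the current directory, or absolute) behave as the user would expect.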

@favyen2 favyen2 requested a review from yawenzzzz February 7, 2025 16:33
@favyen2 (Collaborator, Author) commented Feb 7, 2025

OK, I don't think there is actually any performance issue; results are about the same either way, just with slightly different numbers of detected vessels. So the main benefit here is that it fixes the pan-sharpened image without sacrificing performance (and without needing to retrain the model).
