
Fix Landsat performance issue by using nearest neighbor interpolation #101

Open · wants to merge 1 commit into master
Conversation

@favyen2 (Collaborator) commented Feb 7, 2025

Both Landsat models are currently trained on a dataset that was materialized before allenai/rslearn#104. Because Landsat scenes are not pixel-aligned (their x/y offsets in CRS coordinates are not multiples of the m/pixel resolution), this means that:

  • Before that change, the offset was rounded to the nearest integer. This is not strictly correct, but in this case it corresponds roughly to nearest neighbor interpolation.
  • After that change, the default is bilinear resampling: the values of pixels on the pixel-aligned grid are computed from those in the source images (which for Landsat lie on an offset grid).
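To illustrate the difference (a minimal sketch, not rslearn code, using `scipy.ndimage.shift` on a 1D signal as a stand-in for the raster resampling): rounding a sub-pixel offset to the nearest integer preserves the original pixel values, while bilinear resampling blends neighboring pixels and softens edges.

```python
# Illustrative sketch: resampling a source signal whose grid is offset
# from the target pixel grid by a sub-pixel amount.
import numpy as np
from scipy.ndimage import shift

src = np.array([0.0, 0.0, 10.0, 10.0, 0.0, 0.0])  # a sharp edge
offset = 0.4  # sub-pixel offset between source and target grids

# Old behavior: round the offset to the nearest integer pixel, which is
# roughly nearest neighbor -- the original pixel values are preserved.
nearest = shift(src, round(offset), order=0)

# New default after allenai/rslearn#104: bilinear resampling (order=1),
# which interpolates between neighbors and blends values across the edge.
bilinear = shift(src, offset, order=1)
```

The model was trained on images resembling `nearest`, so feeding it the blended `bilinear` values at inference time is a train/test discrepancy.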

There seems to be a performance drop due to the discrepancy.

Bilinear:

Expected detections between 200 and 500, Actual detections: 447 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 200 and 500, Actual detections: 292 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 200 and 500, Actual detections: 366 
Expected detections between 20 and 50, Actual detections: 19 
Expected detections between 0 and 10, Actual detections: 2 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 31 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 5 
Expected detections between 20 and 50, Actual detections: 20 
Pass: 14, Fail: 1, Pass %: 93.33

Original results:

Expected detections between 200 and 500, Actual detections: 443 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 200 and 500, Actual detections: 363 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 200 and 500, Actual detections: 449 
Expected detections between 20 and 50, Actual detections: 30 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 35 
Expected detections between 0 and 10, Actual detections: 10 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 20 and 50, Actual detections: 45 
Pass: 15, Fail: 0, Pass %: 100.00

We could retrain the model, but nearest neighbor interpolation produces sharper images for human visualization (which Skylight wants), and it is convenient to use the same images for both inference and visualization. It seems retraining is not needed to switch to nearest neighbor interpolation:

Expected detections between 200 and 500, Actual detections: 396 
Expected detections between 0 and 10, Actual detections: 8 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 200 and 500, Actual detections: 368 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 0 and 10, Actual detections: 4 
Expected detections between 0 and 10, Actual detections: 0 
Expected detections between 200 and 500, Actual detections: 404 
Expected detections between 20 and 50, Actual detections: 27 
Expected detections between 0 and 10, Actual detections: 1 
Expected detections between 0 and 10, Actual detections: 3 
Expected detections between 20 and 50, Actual detections: 35 
Expected detections between 0 and 10, Actual detections: 9 
Expected detections between 0 and 10, Actual detections: 7 
Expected detections between 20 and 50, Actual detections: 41 
Pass: 15, Fail: 0, Pass %: 100.00

The performance discrepancy is admittedly small, but this change aligns inference more closely with training and also restores the sharpness of the pan-sharpened images. (We could create a separate layer just for visualization, but that seems worse to maintain long-term.)

I also removed the relative path action, since it broke rslp.landsat_vessels.job_launcher: job_launcher passes --config "{JSON}" but the relative path action interprets the JSON as a path and prepends an absolute path to it. This action does not appear to be used anywhere, and it is confusing: a user can no longer cd to a directory outside rslearn_projects and run a config file they have put there unless they use an absolute path, because the action assumes by default that they want to run a config from inside the rslearn_projects directory (which I find unintuitive).
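The failure mode can be sketched like this (a hypothetical reconstruction; `RelativePathAction` and `REPO_ROOT` are illustrative names, not the actual rslp code): an argparse action that resolves --config relative to a repository root mangles a value that is inline JSON rather than a path.

```python
# Hypothetical sketch of the removed behavior, for illustration only.
import argparse
import os

REPO_ROOT = "/opt/rslearn_projects"  # assumed location, for illustration

class RelativePathAction(argparse.Action):
    def __call__(self, parser, namespace, value, option_string=None):
        # Blindly treats every value as a path relative to the repo root.
        setattr(namespace, self.dest, os.path.join(REPO_ROOT, value))

parser = argparse.ArgumentParser()
parser.add_argument("--config", action=RelativePathAction)

# job_launcher passes inline JSON, which is not a path at all; the action
# prepends the repo root, so downstream JSON parsing of the value fails.
args = parser.parse_args(["--config", '{"key": "value"}'])
```

Dropping the action leaves --config as a plain string, so both inline JSON and paths (relative to the current directory, or absolute) behave as the user would expect.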

@favyen2 favyen2 requested a review from yawenzzzz February 7, 2025 16:33
@favyen2 (Collaborator, Author) commented Feb 7, 2025

OK, I don't think there is actually any performance issue; results are about the same either way, just with slightly different numbers of detected vessels. So the main benefit here is that it fixes the pan-sharpened image without sacrificing performance (and without needing to retrain the model).
