Add Land cover mapping tutorial #2449

burakekim · 2024-12-05T15:45:48Z

Adding a new tutorial according to #2418

This tutorial demonstrates how to combine Sentinel-2 and ~~CDL~~ EuroCrops datasets using the ~~Sentinel2CDLDataModule~~ Sentinel2EuroCropsDataModule. It covers training a semantic segmentation model, along with evaluation and inference steps.

nilsleh · 2024-12-06T13:47:19Z

@burakekim Can you add the tutorial to the table of contents in the rst file? Then one can view it in the CI docs to review as well :)

docs/index.rst

burakekim · 2025-01-04T13:48:49Z

The initial and somewhat end-to-end draft is now out:

- downloads a Sentinel-2 patch with rasterio's windowed reading
- prepares EuroCrops
- visualizes the Sentinel-2 patch and its corresponding EuroCrops mask (with matplotlib and on a dynamic map)
- trains and loads a dummy model for qualitative and quantitative evaluation

There are quite a few things I want to correct and improve:

- train on GPU for a reasonable number of epochs, with proper dataloader and Trainer hyperparameters
- maybe host the pretrained model + Sentinel-2 patch on HF?
- use a bigger Sentinel-2 patch for training and possibly download another patch for inference or use some sort of opportunistic sampling (can we do that with GridSampler?) for proper evaluation that tones down potential spatial autocorrelation
- EuroCrops has over 300 labels, but each country has its own distinct subset. The number of classes Slovakia has is still high. Shall we just turn this into a binary crop classification?
  - addressing the question above comes down to what we want to do with the trained model, i.e., does it add value to form multi-class classification?
- there is a skimage dependency to visualize Sentinel-2 with percentile normalization; and folium, pyproj, shapely for plotting the Sentinel-2 and EuroCrops bounds on a dynamic map -- or is it fine to download 3rd party libraries for individual case studies?

P.S. In the next iteration, I am thinking of renaming the case study to Crop Type Classification. That would describe the task better

docs/tutorials/case_studies.rst

Co-authored-by: Adam J. Stewart <[email protected]>

adamjstewart · 2025-01-04T21:19:46Z

Still need to actually look at the code, but here are responses to your TODOs:

> train on GPU for a reasonable number of epochs, with proper dataloader and Trainer hyperparameters

Note that this needs to run in CI, preferably in seconds, not days. We can monkeypatch certain hyperparams to make this faster, but it shouldn't require a GPU.

> maybe host the pretrained model + Sentinel-2 patch on HF?

Happy to do this if it makes the above faster while still getting good results.

> use a bigger Sentinel-2 patch for training and possibly download another patch for inference or use some sort of opportunistic sampling (can we do that with GridSampler?) for proper evaluation that tones down potential spatial autocorrelation

Avoid big data. This needs to run in CI, where we have very limited storage, and we don't want to wait 10 min to download data during a tutorial. Not sure what you mean by opportunistic sampling, but there are various GeoDataset splitting methods that you can use to chop a tile into east/west splits, grids, etc.
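To make the east/west idea concrete, here is a minimal sketch in plain Python. It is not TorchGeo's splitting API: `Bounds` and `split_east_west` are hypothetical names used only for illustration, and the tile extent is made up. It cuts a tile's extent along a north-south line so that training and evaluation regions do not overlap.

```python
from typing import NamedTuple


class Bounds(NamedTuple):
    """Axis-aligned extent in the tile's CRS (hypothetical helper type)."""

    minx: float
    maxx: float
    miny: float
    maxy: float


def split_east_west(bounds, west_fraction=0.75):
    """Cut an extent along a north-south line into west and east parts."""
    if not 0.0 < west_fraction < 1.0:
        raise ValueError("west_fraction must be strictly between 0 and 1")
    # x coordinate of the cut line, measured from the western edge.
    cut = bounds.minx + west_fraction * (bounds.maxx - bounds.minx)
    west = bounds._replace(maxx=cut)
    east = bounds._replace(minx=cut)
    return west, east


# Made-up extent standing in for a small Sentinel-2 patch.
tile = Bounds(minx=0.0, maxx=1000.0, miny=0.0, maxy=500.0)
train_extent, test_extent = split_east_west(tile, west_fraction=0.75)
```

Keeping the two regions spatially disjoint is what reduces the leakage from spatial autocorrelation raised above; a grid-based split follows the same pattern with cuts along both axes.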

> EuroCrops has over 300 labels, but each country has its own distinct subset. The number of classes Slovakia has is still high. Shall we just turn this into a binary crop classification? addressing the question above comes down to what we want to do with the trained model, i.e., does it add value to form multi-class classification?

I think both add value. Basically, we should have some kind of binary semantic segmentation application, and some kind of multiclass semantic segmentation application. They don't both have to be for agriculture though. For binary, something like building mapping may make more sense.
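For reference, collapsing a multiclass mask into a binary one is a one-liner. The sketch below assumes NumPy is available and that class index 0 marks background (non-crop) pixels; that convention, the function name `to_binary_mask`, and the class codes in the example are all assumptions for illustration, not taken from EuroCrops.

```python
import numpy as np


def to_binary_mask(mask, background=0):
    """Map every non-background class to 1 and background to 0."""
    return (mask != background).astype(np.uint8)


# Made-up class codes; 0 is assumed to mean "no crop label".
multiclass = np.array(
    [
        [0, 12, 12],
        [0, 0, 47],
        [3, 3, 47],
    ]
)
binary = to_binary_mask(multiclass)
```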

Also, tasks involving agriculture benefit greatly from time-series data. I'm planning on extending this tutorial for time series once we add support for it. So don't worry too much about the details right now, they will change in the future. This will also make the big data problem even worse, so keep the images small for now.

> there is a skimage dependency to visualize Sentinel-2 with percentile normalization; and folium, pyproj, shapely for plotting the Sentinel-2 and EuroCrops bounds on a dynamic map -- or is it fine to download 3rd party libraries for individual case studies?

Would prefer to avoid any additional dependencies if we can. Any reason we can't plot a static map with matplotlib? eurocrops.plot(sample) and sentinel2.plot(sample) should get you pretty far. If we do need to add additional deps, they need to be installed in .github/workflows/tutorials.yaml and .github/workflows/release.yaml like we did with planetary_computer. But I'm trying to get rid of those too, since they aren't absolutely necessary and aren't tracked by dependabot like our formal deps.
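As one possible way to drop the skimage dependency, percentile normalization can be sketched with NumPy alone. This is only an illustration under assumptions: the function name `percentile_stretch`, the 2nd/98th percentile defaults, and the synthetic patch are not from the tutorial, and the input is assumed to be a `(bands, height, width)` array.

```python
import numpy as np


def percentile_stretch(image, lower=2.0, upper=98.0):
    """Rescale each band of a (bands, height, width) array to [0, 1]."""
    image = image.astype(np.float32)
    # Per-band percentiles, kept broadcastable against the image.
    lo = np.percentile(image, lower, axis=(1, 2), keepdims=True)
    hi = np.percentile(image, upper, axis=(1, 2), keepdims=True)
    # Guard against constant bands, where hi == lo.
    scale = np.maximum(hi - lo, 1e-6)
    return np.clip((image - lo) / scale, 0.0, 1.0)


# Synthetic stand-in for a 3-band Sentinel-2 patch (values are made up).
rng = np.random.default_rng(0)
patch = rng.integers(0, 10000, size=(3, 64, 64))
stretched = percentile_stretch(patch)
```

The result can then be moved to `(height, width, bands)` order, for example with `stretched.transpose(1, 2, 0)`, before passing it to matplotlib's `imshow`.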

> P.S. In the next iteration, I am thinking of renaming the case study to Crop Type Classification. That would describe the task better

I agree with the rename. Both "Crop Classification" and "Crop Type Mapping" are common names. I think the latter may actually be even more common, and more technically correct. A computer vision person may argue that this is semantic segmentation, not classification. Of course, semantic segmentation is just pixelwise classification, so the distinction isn't too important.

burakekim and others added 3 commits November 29, 2024 20:38

define cdl and s2 datamodules

3dc8f5e

Merge branch 'microsoft:main' into tutorial_be

890d9d5

debugging length none error

5e3736e

github-actions bot added the documentation label Dec 5, 2024

burakekim added 5 commits December 5, 2024 16:14

markdowns

c9fc49b

cdl to eurocrops

c2cb387

no spatiotemporal intersection error

c15ff01

solve spatiotemporal error + new France S2 + plotting

857d45f

vectordataset getitem taking forever

786e997

adamjstewart mentioned this pull request Dec 6, 2024

Add additional tutorials #2418

Open

25 tasks

set up training

78c90fe

burakekim and others added 2 commits December 6, 2024 17:55

add to index.rst and author info

83098f2

Merge branch 'main' into tutorial_be

d2ffb64

adamjstewart modified the milestones: 0.6.2, 0.6.3 Dec 8, 2024

burakekim added 2 commits January 3, 2025 21:30

Merge branch 'main' into tutorial_be

a07e9e1

I might have messed up the index.rst

427da4c

adamjstewart reviewed Jan 3, 2025

docs/index.rst (outdated, resolved)

burakekim added 9 commits January 4, 2025 01:01

solved the nodata and code-got-stuck errors

f3d26c9

solved the nodata and code-got-stuck errors

247aaa8

set up training -- train on WS, need GPU

c96043f

first complete draft and case_studies.rst addition

44b2e8b

Merge branch 'main' into tutorial_be

68f5b2c

revert index.rst?

4464a88

revert index.rst for real?

cf2fdd3

index.rst spaces on a sunny day

33641de

ok this was the last one

1e54f03

some docstring

b1f930b

adamjstewart reviewed Jan 4, 2025

docs/tutorials/case_studies.rst (outdated, resolved)

Update docs/tutorials/case_studies.rst

c50ba47

Co-authored-by: Adam J. Stewart <[email protected]>
