Model redesign #30

annehaley · 2023-10-09T20:58:35Z

This PR is the first in a series of three PRs in a large redesign effort. It is scoped to changes to the database models, their REST endpoints, their conversion functions, and the ingest process that creates objects from our sample data (the populate management command).

- Reorganize models
- Rewrite REST API for new models
- Rewrite populate script and tasks to fit creation of new models

Follow-up PRs address the other facets of the redesign effort. Ideally, we should prepare and review them all together so that they can be merged all at once.

- Write tests for CI (Enable CI testing #33)
- Adjust web client to fetch and adhere to new models (Web updates #34)

jjnesbitt · 2023-10-10T17:23:43Z

Considering this, developers can use this branch with its new migration history while the docker file refers to the new location, and they can use other branches with their old database location. If this is merged, we will no longer need the old database locations.

The local storage of volume data is meant as a temporary measure for reviewing/testing out this PR, correct (i.e. wouldn't be merged to master)? This configuration makes it otherwise pesky to do a local wipe of the database.

annehaley · 2023-10-11T13:11:31Z

> The local storage of volume data is meant as a temporary measure for reviewing/testing out this PR, correct (i.e. wouldn't be merged to master)? This configuration makes it otherwise pesky to do a local wipe of the database.

Sure, this change can be removed before merging. I find it convenient that I can see the volume-mounted folders in the project location, but I understand that's just a personal preference. Besides, it deviates from the Girder 4 setup, so this change can just be scoped to this WIP.

annehaley · 2023-10-11T13:36:34Z

I made another iteration on the design (aaa1c7f), with the following changes @AlmightyYakob and I have discussed:

- Changing the relationship between NetworkNodes and NetworkEdges, as suggested above.
- Removing the inheritance for regions. OriginalRegion and DerivedRegion just have some repeated fields instead.
- Additionally, DerivedRegion now has a reference to a VectorDataSource as its map representation, since it doesn't have a reference to a Dataset. We discussed how the region models themselves are internal representations, just like network nodes and edges. For visualization on the map, we should rely on DataSource-type objects. Therefore, DerivedRegions need a reference to one of these, which we know should be of the Vector type.
- Removing the attempt at a DataCollection model. This is a concept we should keep in mind for future iterations, but it doesn't make sense at the moment. It is intended to let a user curate their own collections, so the model would likely be tied to a User object. At this stage, we disregard User objects and have not implemented authentication, so these objects would serve little purpose. We will need them in the future, though, and they will be easy to add later, since they sit outside the main web of models.
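As a rough illustration of the region changes listed above, here is a minimal sketch using plain Python dataclasses as stand-ins for the Django models. The field names (`boundary`, `map_representation`, `source_regions`) and the polygon type are assumptions made for this example, not the definitions in the PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed geometry stand-in: a polygon is a list of (x, y) points.
Polygon = List[Tuple[float, float]]


@dataclass
class Dataset:
    name: str


@dataclass
class VectorDataSource:
    """Map representation only (GeoJSON consumed by the map view)."""
    geojson_data: dict
    dataset: Optional[Dataset] = None


@dataclass
class OriginalRegion:
    # No shared base class: name and boundary are repeated on both models.
    name: str
    boundary: List[Polygon]
    dataset: Dataset


@dataclass
class DerivedRegion:
    name: str
    boundary: List[Polygon]
    # Used only to draw the region on the map; there is no Dataset reference.
    map_representation: VectorDataSource
    source_regions: List[OriginalRegion] = field(default_factory=list)
```

The two region classes deliberately share no parent, so each can evolve separately at the cost of a few duplicated fields.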

jjnesbitt · 2023-10-11T21:10:45Z

> Additionally, DerivedRegion now has a reference to a VectorDataSource as its map representation, since it doesn't have a reference to a Dataset. We discussed how the region models themselves are internal representations, just like network nodes and edges. For visualization on the map, we should rely on DataSource-type objects. Therefore, DerivedRegions need a reference to one of these, which we know should be of the Vector type.

I think the problem with this is that derived regions can be "derived" from regions taken from multiple datasets. I've been thinking about the lingering gaps in our modeling, and I'm starting to think we need an additional data source, DerivedRegionDataSource. This would replace DerivedRegion entirely, and would largely contain the same fields.

I also think that rather than point to a dataset, a Region should refer to a VectorDataSource (which in turn refers to a dataset), since I think that more closely models what we're trying to achieve with the DataSource abstraction. What do you think? I've included a diagram below to represent this configuration.

---
UVDAT ER Diagram
---
erDiagram
Dataset {
    CharField name
    TextField description
    ForeignKey city
}

DerivedRegionDataSource {
    CharField name
    ForeignKey city
    S3FileField geojson_data
    ManyToManyField source_regions
}

VectorDataSource {
    ForeignKey dataset
    JSONField metadata
    S3FileField geojson_data
}

Region {
    BigAutoField id
    CharField name
    JSONField properties
    MultiPolygonField boundary
    ForeignKey data_source
}
%% Relations

Dataset }|--|| City : city
VectorDataSource }|--|| Dataset: dataset
Region }|--|| VectorDataSource : data_source
DerivedRegionDataSource }|--|{ Region: source_regions
DerivedRegionDataSource }|--|| City: city

annehaley · 2023-10-12T13:33:20Z

> I think the problem with this is that derived regions can be "derived" from regions taken from multiple datasets. I've been thinking about the lingering gaps in our modeling, and I'm starting to think we need an additional data source, DerivedRegionDataSource. This would replace DerivedRegion entirely, and would largely contain the same fields. I also think that rather than point to a dataset, a Region should refer to a VectorDataSource (which in turn refers to a dataset), since I think that more closely models what we're trying to achieve with the DataSource abstraction.

I'm not sure we would be able to capture all we want to represent for the DerivedRegion in a DataSource. The goal for the DataSource type is to separate the map representation from the internal representation. For the map representation of a DerivedRegion, a VectorDataSource has everything we need. I think the internal representation of the boundary and the original regions should be separate from this. With the current model, it's fine if the original regions are from multiple datasets; the DerivedRegion is made independent from any datasets and a VectorDataSource (with dataset=null) can be made for it.

The way I think of the DataSource concept is as an implementation detail for visualization. Ideally, a user should be able to interact with every other model without concerning themselves with the DataSources. They can fetch Datasets and their FileItems, or they can fetch NetworkNodes and NetworkEdges related to a Dataset, or they can fetch OriginalRegions and DerivedRegions in a City. An advanced user may be interested in the converted data and may look at the DataSources, but the average user would only want the main models. Thus, the DataSource-type models shouldn't contain any data essential to the objects themselves. They should be largely hidden from the user and consumed only by the map viz.
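To make the multi-dataset case concrete, here is a hedged sketch of building a DerivedRegion whose sources come from different datasets, with its map layer created as a VectorDataSource with `dataset=None`. The helper `derive_region`, the field names, and the sample dataset names are all hypothetical, and the boundary is simply concatenated rather than computed as a true geometric union.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    name: str


@dataclass
class OriginalRegion:
    name: str
    boundary: list  # list of polygons, each a closed ring of [x, y] points
    dataset: Dataset


@dataclass
class VectorDataSource:
    geojson_data: dict
    dataset: Optional[Dataset] = None


@dataclass
class DerivedRegion:
    name: str
    boundary: list
    source_regions: List[OriginalRegion]
    map_representation: VectorDataSource


def derive_region(name: str, sources: List[OriginalRegion]) -> DerivedRegion:
    # Simplification: concatenate polygons instead of computing a real union.
    boundary = [polygon for region in sources for polygon in region.boundary]
    geojson = {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "properties": {"name": name},
            "geometry": {
                "type": "MultiPolygon",
                # GeoJSON nests each polygon as a list of rings.
                "coordinates": [[polygon] for polygon in boundary],
            },
        }],
    }
    # The map layer carries no dataset, so mixed-dataset sources are fine.
    layer = VectorDataSource(geojson_data=geojson, dataset=None)
    return DerivedRegion(name, boundary, list(sources), layer)


# Made-up sample data: two source regions from two different datasets.
tracts = OriginalRegion("Tract 1", [[[0, 0], [1, 0], [1, 1], [0, 0]]], Dataset("Dataset A"))
zips = OriginalRegion("Zone 2", [[[1, 0], [2, 0], [2, 1], [1, 0]]], Dataset("Dataset B"))
combined = derive_region("Study area", [tracts, zips])
```

The internal representation (`boundary`, `source_regions`) stays on the region itself, while everything the map needs lives on the data source.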

annehaley · 2023-10-12T14:09:11Z

Given how I think of DataSources, I think it would be more appropriate to restore Charts to how they were before. They should be a sibling to Dataset, not a child of AbstractDataSource, since they aren't intended to be used by the map viz. I've made these changes in 1afb1f4.

jjnesbitt · 2023-10-12T21:10:10Z

> The way I think of the DataSource concept is as an implementation detail for visualization. Ideally, a user should be able to interact with every other model without concerning themselves with the DataSources. They can fetch Datasets and their FileItems, or they can fetch NetworkNodes and NetworkEdges related to a Dataset, or they can fetch OriginalRegions and DerivedRegions in a City. An advanced user may be interested in the converted data and may look at the DataSources, but the average user would only want the main models. Thus, the DataSource-type models shouldn't contain any data essential to the objects themselves. They should be largely hidden from the user and consumed only by the map viz.

This makes sense to me, and resolves my questions/concerns with DerivedRegion. I guess for that specific use case, we would use the city field to retrieve the DerivedRegions, and when it came time to view them on the map, we would use the VectorDataSource to retrieve that data.
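The two-step access pattern described here might look roughly like the following. This is a plain-Python sketch with an in-memory list standing in for a database query; the function names and the `city` string field are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VectorDataSource:
    geojson_data: dict


@dataclass
class DerivedRegion:
    name: str
    city: str
    map_representation: VectorDataSource


def derived_regions_in_city(regions: List[DerivedRegion], city: str) -> List[DerivedRegion]:
    # Step 1: listing relies only on the region's own city field.
    return [region for region in regions if region.city == city]


def geojson_for_map(region: DerivedRegion) -> dict:
    # Step 2: map data is reached only through the VectorDataSource.
    return region.map_representation.geojson_data


# Made-up sample data.
empty = {"type": "FeatureCollection", "features": []}
all_regions = [
    DerivedRegion("North side", "City A", VectorDataSource(empty)),
    DerivedRegion("Harbor", "City B", VectorDataSource(empty)),
]
in_city_a = derived_regions_in_city(all_regions, "City A")
layer_data = geojson_for_map(in_city_a[0])
```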

My only other concern then is making sure the original use case regarding time series is covered. I'm a bit fuzzy on that; maybe we should discuss this offline at some point.

annehaley · 2023-10-12T21:14:06Z

Sounds good to me. I spent today on bdcece6, which fits our data into these new models. I tried combining the flood area datasets (grouping them by type) as a test for multiple DataSources on the same Dataset. We would do the same thing with the time series data.
Let me know what you think. We can discuss more tomorrow or next week if you'd like.
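One way to picture the grouping test is the sketch below, which assumes that layers of the same type are merged into a single Dataset that then holds one DataSource per original layer. The `combine_by_type` helper and the sample names are hypothetical and are not taken from the populate script.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class DataSource:
    name: str


@dataclass
class Dataset:
    name: str
    data_sources: List[DataSource] = field(default_factory=list)


def combine_by_type(layers: Iterable[Tuple[str, str]]) -> Dict[str, Dataset]:
    """Group (layer_name, layer_type) pairs into one Dataset per type."""
    datasets: Dict[str, Dataset] = {}
    for layer_name, layer_type in layers:
        if layer_type not in datasets:
            datasets[layer_type] = Dataset(name=layer_type)
        # Each original layer becomes one DataSource on the shared Dataset.
        datasets[layer_type].data_sources.append(DataSource(name=layer_name))
    return datasets


# Made-up sample input: three flood layers of two types.
flood_layers = [
    ("Flood area, scenario 1", "type A"),
    ("Flood area, scenario 2", "type A"),
    ("Flood area, scenario 3", "type B"),
]
flood_datasets = combine_by_type(flood_layers)
```

Time series data could presumably follow the same shape, with one DataSource per time step on a single Dataset.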


jjnesbitt

I pushed 0a2c9ca, which mostly just ensures that newRegionName is set back to "" when a derived region selection is canceled (the rest is formatting/ergonomics). If that looks good to you, then this is ready.

annehaley added 8 commits October 9, 2023 15:07

Reorganize model definitions (first pass)

f6ee54c

Use relative locations for docker volume mounts and fix tox

1cde6c9

Start new migration history

27ecae0

Udate admin.py for new models

a22b715

Format new models files

3c74ab7

Update rest viewsets for new models

d894e77

Update model references in tasks

acda2c3

Various design changes

d28edfa

jjnesbitt reviewed Oct 10, 2023

uvdat/core/models/networks.py (outdated)

jjnesbitt reviewed Oct 10, 2023

uvdat/core/models/networks.py (outdated)

jjnesbitt reviewed Oct 10, 2023

uvdat/core/models/dataset.py

Another design iteration

aaa1c7f

jjnesbitt reviewed Oct 11, 2023

uvdat/core/models/networks.py (outdated)

jjnesbitt reviewed Oct 11, 2023

uvdat/core/models/networks.py (outdated)

annehaley added 2 commits October 12, 2023 09:36

Small changes to some fields

3e6b67f

Separate Charts from DataSources

1afb1f4

annehaley added 2 commits October 12, 2023 10:27

Fix Chart-FileItem relationship

cd31d9e

Update populate script and conversion tasks

bdcece6

annehaley force-pushed the model-redesign branch from ea9abbd to bdcece6 on October 12, 2023 21:03

annehaley added 3 commits October 18, 2023 13:05

Rename DataSources to MapLayers

87d8195

Change Cities into Contexts

cfd6a07

Rename "Original" to "Source"

edb1ada

annehaley and others added 17 commits October 31, 2023 12:56

Add label to Context dropdown

34b79f3

Fix applying zIndex when switching layers on the same Dataset

790e6df

Add indexes and constraints to VectorTile

6acaaf6

Remove geojson_data JSONField, reorg geojson funcs

a088c52

Add dataset classification

cef6f5e

Use tile extents instead of tile coords

a3df988

Fix type errors for AbstractLayer

0efeab8

Don't create local paths for service volumes

c96ff90

Retrieve map layers from dataset detail endpoint

f987749

Fix linting

4e19483

Squash migrations into initial

4ab6137

Add TODO

4b62b03

Exclude "Extended" serializers from serializer matches in `get_available_simulations`

55fbff5

Use already-fetched map layers, consistent use of caching and types

3d4287f

Fix reference to "dataset" object (instead of "Dataset" class)

1175a1b

Add TODO

334b445

Merge pull request #35 from OpenGeoscience/web-updates-updates

52d8d4d

jjnesbitt mentioned this pull request Nov 22, 2023

Optimize Vector Tiling #21

Closed

jjnesbitt and others added 4 commits November 22, 2023 13:39

Fix bug in getOrCreateLayerFromID

93ae876

Merge pull request #34 from OpenGeoscience/web-updates

56eab4a

Web updates

Merge pull request #33 from OpenGeoscience/ci-testing

a2026de

Enable CI testing

Merge remote-tracking branch 'origin/master' into model-redesign

6459e75

jjnesbitt reviewed Nov 27, 2023

web/src/components/MainDrawerContents.vue (outdated)

Use selectedDerivedRegions storage instead of `isDerivedRegionSelected` function

dfe4c74

annehaley requested a review from jjnesbitt November 27, 2023 16:59

Update behavior around availableDerivedRegions

0a2c9ca

jjnesbitt approved these changes Nov 27, 2023

Fix layers edge case bugs

361ec2f

annehaley merged commit 72000c2 into master Nov 27, 2023

annehaley deleted the model-redesign branch November 27, 2023 21:13
