Every time we change the way pipelines are called (e.g. by changing function arguments) we have to adapt the code in mlpp-workflows accordingly. It would be better instead if we had a stable interface between the two libraries.
It could be in the form of an output xr.Dataset object.
This could be done by simply moving the following function (defined in mlpp-workflows) to this library.
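```python
import logging
from datetime import datetime
from typing import Dict, List, Tuple

import xarray as xr

import mlpp_features

LOGGER = logging.getLogger(__name__)


def extract_features(
    data: Dict[str, xr.Dataset],
    feature_list: List[str],
    points: Tuple[List],
    reftimes: List[datetime],
    leadtimes: List[int],
) -> xr.Dataset:
    """Extract features from a given source."""
    ds = xr.Dataset()
    for feature in feature_list:
        LOGGER.info(f"FEATURE: {feature}")
        try:
            # Each pipeline is a function in mlpp_features named after the feature.
            output = getattr(mlpp_features, feature)(
                data, points, reftimes, leadtimes, ds=ds
            )
        except Exception:
            LOGGER.exception(f"{feature} pipeline failed!")
            # Skip the failed feature; `output` would be undefined or stale here.
            continue
        ds[feature] = output.chunk("auto").persist()
    LOGGER.info(ds)
    return ds
```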
It will also be easier to document how the two libraries interact since it will be just one object.
Hi @frazane, thanks for the nice suggestion. Indeed, the interface to mlpp-features is already defined as an xr.Dataset object (all pipelines return one). That said, I like the idea of moving the extract_features function to mlpp-features! Moreover, I think it could be interesting to refactor it as a class, say a FeatureStore class, and use it not only to return the feature dataset (as in the original function above), but also to discover and explore features, for example to retrieve the list of all input parameters required by a given list of features. What do you think?
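To make the FeatureStore idea a bit more concrete, here is a rough sketch of what such a class could look like. Everything in it is an assumption for the sake of discussion: the `register` decorator, the `required_inputs` and `extract` method names, and the two example features do not exist in mlpp-features. To keep the sketch dependency-free it uses plain dicts and lists where the real thing would take and return xr.Dataset objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class FeatureStore:
    """Hypothetical registry of feature pipelines and the inputs they need."""

    # Maps feature name -> (pipeline function, names of required input parameters).
    _pipelines: Dict[str, Tuple[Callable, Tuple[str, ...]]] = field(default_factory=dict)

    def register(self, name: str, inputs: Iterable[str]) -> Callable:
        """Decorator that registers a pipeline under `name` with its declared inputs."""

        def decorator(func: Callable) -> Callable:
            self._pipelines[name] = (func, tuple(inputs))
            return func

        return decorator

    def list_features(self) -> List[str]:
        """Return the names of all registered features, sorted."""
        return sorted(self._pipelines)

    def required_inputs(self, feature_list: Iterable[str]) -> List[str]:
        """Return the sorted union of input parameters needed by the given features."""
        needed = set()
        for feature in feature_list:
            _, inputs = self._pipelines[feature]
            needed.update(inputs)
        return sorted(needed)

    def extract(self, data: Dict[str, list], feature_list: Iterable[str]) -> Dict[str, list]:
        """Run each requested pipeline on `data` and collect the outputs by name."""
        return {feature: self._pipelines[feature][0](data) for feature in feature_list}


store = FeatureStore()


# Hypothetical example features, not actual mlpp-features pipelines.
@store.register("wind_speed", inputs=["u_wind", "v_wind"])
def wind_speed(data):
    return [(u**2 + v**2) ** 0.5 for u, v in zip(data["u_wind"], data["v_wind"])]


@store.register("temperature_celsius", inputs=["temperature_kelvin"])
def temperature_celsius(data):
    return [t - 273.15 for t in data["temperature_kelvin"]]
```

With something like this, mlpp-workflows would only need to call `store.required_inputs(feature_list)` to know what to load and `store.extract(data, feature_list)` to get the features, so changes to individual pipeline signatures would stay inside mlpp-features.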
@dnerini thoughts?