Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add preprocessor to extract surface values from 3D atmospheric variables #2641

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions doc/recipe/preprocessor.rst
Original file line number Diff line number Diff line change
Expand Up @@ -311,6 +311,7 @@ Preprocessor Variable s
:ref:`weighting_landsea_fraction<land/sea fraction weighting>` [#f3]_ ``sftlf``, ``sftof`` land_area_fraction, sea_area_fraction
:ref:`distance_metric<distance_metric>` [#f5]_ ``areacella``, ``areacello`` cell_area
:ref:`histogram<histogram>` [#f5]_ ``areacella``, ``areacello`` cell_area
:ref:`extract_surface_from_atm<extract_surface_from_atm>` [#f3]_ ``ps`` surface_air_pressure
===================================================================== ============================== =====================================

.. [#f3] This preprocessor requires at least one of the mentioned supplementary
Expand Down Expand Up @@ -2205,6 +2206,7 @@ The ``_volume.py`` module contains the following preprocessor functions:
* ``extract_transect``: Extract data along a line of constant latitude or
longitude.
* ``extract_trajectory``: Extract data along a specified trajectory.
* ``extract_surface_from_atm``: Extract atmospheric data at the surface.


``extract_volume``
Expand Down Expand Up @@ -2388,6 +2390,21 @@ Note that this function uses the expensive ``interpolate`` method from
See also :func:`esmvalcore.preprocessor.extract_trajectory`.


``extract_surface_from_atm``
----------------------

This function extract data at the surface for an atmospheric variable.

The function retrieves the first unmasked data points along the pressure levels
and returns the interpolated value at the corresponding surface pressure
given by the surface air pressure ``ps``.

The required supplementary surface air pressure ``ps`` is attached to
the main dataset as described in :ref:`supplementary_variables`.

See also :func:`esmvalcore.preprocessor.extract_surface_from_atm`.


.. _cycles:

Cycles
Expand Down
3 changes: 3 additions & 0 deletions esmvalcore/preprocessor/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@
extract_transect,
extract_volume,
volume_statistics,
extract_surface_from_atm,
)
from ._weighting import weighting_landsea_fraction

Expand Down Expand Up @@ -122,6 +123,8 @@
"resample_time",
# Level extraction
"extract_levels",
# Extract surface
"extract_surface_from_atm",
# Weighting
"weighting_landsea_fraction",
# Mask landsea (fx or Natural Earth)
Expand Down
22 changes: 15 additions & 7 deletions esmvalcore/preprocessor/_supplementary_vars.py
Original file line number Diff line number Diff line change
Expand Up @@ -105,13 +105,21 @@ def add_ancillary_variable(cube, ancillary_cube):
var_name=ancillary_cube.var_name,
attributes=ancillary_cube.attributes,
)
start_dim = cube.ndim - len(ancillary_var.shape)
cube.add_ancillary_variable(ancillary_var, range(start_dim, cube.ndim))
logger.debug(
"Added %s as ancillary variable in cube of %s.",
ancillary_cube.var_name,
cube.var_name,
)
data_dims = [None] * ancillary_cube.ndim
for coord in ancillary_cube.coords():
for ancillary_dim, cube_dim in zip(ancillary_cube.coord_dims(coord),
cube.coord_dims(coord)):
data_dims[ancillary_dim] = cube_dim
if None in data_dims:
none_dims = ", ".join(
str(i) for i, d in enumerate(data_dims) if d is None)
msg = (
f"Failed to add {ancillary_cube} to {cube} as ancillary variable. "
f"No coordinate associated with ancillary cube dimensions {none_dims}"
)
raise ValueError(msg)

cube.add_ancillary_variable(ancillary_var, data_dims)


def add_supplementary_variables(
Expand Down
106 changes: 106 additions & 0 deletions esmvalcore/preprocessor/_volume.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
from iris.coords import AuxCoord, CellMeasure
from iris.cube import Cube
from iris.util import broadcast_to_shape
import stratify

from ._shared import (
get_iris_aggregator,
Expand Down Expand Up @@ -603,3 +604,108 @@
points = [("latitude", latitudes), ("longitude", longitudes)]
interpolated_cube = interpolate(cube, points) # Very slow!
return interpolated_cube


def _get_first_unmasked_data(array, axis):
bouweandela marked this conversation as resolved.
Show resolved Hide resolved
"""Get first unmasked value of an array along an axis."""
mask = da.ma.getmaskarray(array)
numerical_mask = da.where(mask, -1.0, 1.0)
indices_first_positive = da.argmax(numerical_mask, axis=axis)
indices = da.meshgrid(
*[da.arange(array.shape[i]) for i in range(array.ndim) if i != axis],
indexing="ij",
)
indices = list(indices)
indices.insert(axis, indices_first_positive)
first_unmasked_data = np.array(array)[tuple(indices)]
return first_unmasked_data


@register_supplementaries(
variables=['ps'],
required='require_at_least_one'
)
def extract_surface_from_atm(
cube: Cube,
) -> Cube:
"""Extract surface from 3D atmospheric variable based on surface pressure.

Parameters
----------
cube:
Input cube.

Returns
-------
iris.cube.Cube
Collapsed cube.
"""
# Declare required variables
# - 3D atmo variable to extract at the surface
# - surface_air_pressure (ps)
ps_cube = None
try:
ps_cube = cube.ancillary_variable("surface_air_pressure")
except iris.exceptions.AncillaryVariableNotFoundError:
logger.debug(
"Ancillary variable surface air pressure not found in cube. "
"Check file availability."
)
if ps_cube:
# Fill masked data if necessary (interpolation fails with masked data)
(z_axis,) = cube.coord_dims(
cube.coord(axis="Z", dim_coords=True)
)
mask = da.ma.getmaskarray(cube.core_data())
if mask.any():
first_unmasked_data = _get_first_unmasked_data(
cube.core_data(), axis=z_axis
)
dim_map = [dim for dim in range(cube.ndim) if dim != z_axis]
first_unmasked_data = iris.util.broadcast_to_shape(
first_unmasked_data, cube.shape, dim_map
)
cube.data = da.where(
mask, first_unmasked_data, cube.core_data()
)

# Interpolation (not supported for dask arrays)
air_pressure_coord = cube.coord("air_pressure")
original_levels = iris.util.broadcast_to_shape(
air_pressure_coord.points,
cube.shape,
cube.coord_dims(air_pressure_coord),
)
target_levels = np.expand_dims(ps_cube.data, axis=z_axis)
sfc_data = stratify.interpolate(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked into using the esmvalcore.preprocessor.extract_levels function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make it work but couldn't manage to. To replace the interpolation step I tried with

sfc_data = extract_levels(
            cube=cube,
            levels=ps_cube.data,
            scheme="linear",
            coordinate="air_pressure",
            rtol=1e-7,
            atol=None
            )

but I always get the following error TypeError: extract_levels() missing 1 required positional argument: 'data' which I don't really understand where it comes from. In case you have any insights on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be caused by the preserve_float_dtype decorator around extract_levels. Using

sfc_data = extract_levels(
    cube,
    levels=ps_cube.data,
    scheme="linear",
    coordinate="air_pressure",
    rtol=1e-7,
    atol=None
)

avoids this problem. @schlunma Since you introduced the wrapper, would you have time to take a look at improving it to avoid this puzzling error?

target_levels,
original_levels,
cube.data,
axis=z_axis,
interpolation="linear",
extrapolation="linear",
)
sfc_data = np.squeeze(sfc_data, axis=z_axis)

# Construct surface var cube
indices = [slice(None)] * cube.ndim
indices[z_axis] = 0

Check notice on line 692 in esmvalcore/preprocessor/_volume.py

View check run for this annotation

Codacy Production / Codacy Static Code Analysis

esmvalcore/preprocessor/_volume.py#L692

No overload variant of "__setitem__" of "list" matches argument types "Any", "int" [call-overload] (error)
var_cube = cube[tuple(indices)]
var_cube.data = sfc_data
if var_cube.coords("air_pressure"):
var_cube.remove_coord("air_pressure")
ps_coord = iris.coords.AuxCoord(
ps_cube.data,
var_name="plev",
standard_name="air_pressure",
long_name="pressure",
units=ps_cube.units,
)
var_cube.add_aux_coord(ps_coord, np.arange(var_cube.ndim))
var_cube.var_name = cube.var_name + 's'
logger.debug("Extracting surface using surface air pressure.")

else:
raise ValueError("Surface air pressure could not be found. Stopping.")

return var_cube