-
Notifications
You must be signed in to change notification settings - Fork 416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a boundary layer module to estimate boundary height #3572
base: main
Are you sure you want to change the base?
Add a boundary layer module to estimate boundary height #3572
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall a good start, though the documentation could use a description of the assumptions or limitations of each method (i.e., this method gives odd answers at night, that one gives odd answers during the day, the concept of a boundary layer distinct from the free troposphere breaks down inside a thunderstorm).
This is a solid collection of functions to implement existing methods for soundings isolated in time.
from metpy.units import units | ||
|
||
|
||
def smooth(val, span): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XArray calls this a rolling mean. So does pandas.
Bottleneck calls this a moving-window mean.
SciPy appears to call the same thing a uniform filter.
These would likely work better in the See Also
section than as a change in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the references! I knew some equivalent functions were already existing but they are not quite exactly the same (Xarray works on xarray.Dataset, Scipy has a slightly different strategy at the edges) and, given that the function is simple enough, it was less work to write it than to look for the existing one.
Bottleneck's function seems to do exactly what I want but it is not listed in the Metpy's dependencies. Do you think it's worth adding it so I can use their moving-mean function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd have to ask one of the maintainers, but, in the meantime, would this be faster?
cumulative_sums = np.nancumsum(val)
rolling_sums = cumulative_sums[span:] - cumulative_sums[:-span]
valid_index = np.isfinite(val)
cumulative_count = np.cumsum(valid_index)
rolling_count = cumulative_count[span:] - cumulative_count[:-span]
rolling_means = rolling_sums / rolling_count
You'd need to pre-allocate rolling_means
and handle the edges still but it should work. (Alternately, use np.lib.stride_tricks.sliding_window_view
with np.nanmean
, but the note about it being slow is warranted)
Alternately, use SciPy for the bulk of the calculation, then re-do the edges the way you want.
It would probably be a good idea to check whether this takes enough time that it's worth optimizing before going too far, though (as you may have noticed, I am not good at that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the testing data I have it's almost instantaneous and, as I moved other topics, I don't really have something bigger to quickly try it on. I suggest we leave it that way for now and other users might open another issue if when need to speed it up. Would that be alright?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You picked the same name as elsewhere in MetPy, though the edge handling is again different from what you do here (they do not smooth close to the edge) and from SciPy, and they do not use nanmean
.
The bigger test data would likely be someone trying to find boundary layer height from model data somewhere, and it looks like your code is designed for a profile at a time, not arrays of profiles (either the N x Z that description implies or the Z x Y x X conventional in model output). It might be a simple matter to add an axis
keyword argument to most functions and borrow the logic from the derivative functions to create the indexers (and possibly the vertical axis auto-detection for DataArrays), but that should probably be a follow-up PR in case it isn't.
src/metpy/calc/boundarylayer.py
Outdated
u[0] = 0 * units.meter_per_second | ||
v[0] = 0 * units.meter_per_second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason you force the winds at the first layer to zero without documenting the fact, rather than asserting an additional layer at the ground (below the profiles passed) where the winds are zero to satisfy the boundary conditions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all, I just didn't have the idea of adding an extra point. Thanks for the suggestion!
However, it raises the question of what value put to the potential temperature at this extra point. I think duplicating the lowest point it the most convenient solution.
The two lines would be replaced by:
# Add a ground point with null wind to satisfy theoretical boundary conditions
u = np.insert(u, 0, 0 * units.meter_per_second)
v = np.insert(v, 0, 0 * units.meter_per_second)
height = np.insert(height, 0, 0 * units.metre)
potential_temperature = np.insert(potential_temperature, 0, potential_temperature[0])
Additionally, as the insertion of the ground changes the length of the profile, the returned profile should exclude it to keep the same size as the input ones:
return bRi[1:]
Does it look OK with these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you need to copy the point into the array? Could you just add a couple of variables for these? That would avoid the complication with the changing profile size, which would also impact idxfoot
.
Something like:
if idxfoot == 0:
u0 = 0
v0 = 0
else:
u0 = u[idxfoot]
v0 = v[idxfoot]
theta0 = theta[idxfoot]
height0 = height[idxfoot]
or even:
if idxfoot == 0:
Du = u
Dv = v
else:
Du = u - u[idxfoot]
Dv = v - v[idxfoot]
I'm not sure if height should be treated the same as velocity. I think I remember a displacement height that might complicate things?
I think the main difference between this method and yours would be if the boundary layer was below the lowest passed level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your second suggestion:
if idxfoot == 0:
Du = u
Dv = v
else:
Du = u - u[idxfoot]
Dv = v - v[idxfoot]
looks the best to me. I will go for that one. Thanks!
Regarding the problem that could arise with BL below the first level, I think the bulk Richardson method is out of scope anyway in this kind of situation, so I don't think we need to address it (rather refer to a more appropriate method).
theta0 = potential_temperature[0] | ||
|
||
if any(potential_temperature > theta0): | ||
iblh = np.where(potential_temperature > theta0)[0][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this looks like a potential temperature threshold method. I would prefer "exceeds" over "reaches" in the documentation, given the usual description of the convective boundary layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me. I will also add that it's only suited for unstable boundary layer, as suggested in the main comment. The name of the method might vary with the authors, "parcel method" is the one I have seen the most, but I can include other names (e.g. "potential temperature threshold method") in the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be a difference in expectations if the boundary layer is saturated (i.e. fog or fair-weather cumulus), but describing alternate names should avoid that.
src/metpy/calc/boundarylayer.py
Outdated
"""Calculate atmospheric boundary layer height from the relative | ||
humidity gradient | ||
|
||
It is the height where the relative humidity or specific humidity gradient reaches a minimum. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side note, it should be possible to (ab)use this function with TKE or aerosol backscatter. I take it you considered wavelet methods out-of-scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely possible to use it with TKE or aerosol backscatter. I don't mention it because the references I give are related to radiosounding data, while aerosol would come from lidar or ceilometer and TKE usually from model output, but they are very good way of deriving the BLH (especially the TKE, when turbulence is right in the model).
The wavelet method is out of scope for this first PR, yes. It would be great to have it in the future but I will not have the time to commit for that. In the meantime, this gradient method + smoothing should be equivalent to the Haar wavelet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TKE usually from model output, but they are very good way of deriving the BLH (especially the TKE, when turbulence is right in the model).
I think I've seen TKE from Doppler lidar. It's probably possible to derive something similar from Doppler radar, but the radar-based boundary-layer methods I've seen have been based on scattering/reflectivity.
The wavelet method is out of scope for this first PR, yes. It would be great to have it in the future but I will not have the time to commit for that.
Yeah, there's a lot of interesting things to do with boundary layers, and not all of them should be in a first attempt.
In the meantime, this gradient method + smoothing should be equivalent to the Haar wavelet.
Fairly close, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, Doppler lidars can provide TKE but, as far as I remember, only the most powerful (and expensive) Doppler lidars will be able to have any signal from above the boundary layer, which make it difficult to run such an estimation because the top of the BL might be too close to the end of the profile.
Anyway, the function is there, no harm trying it on TKE profiles! Do you think I should change the name (and the doc) of the function to make it more general? Something like blh_from_concentration_gradient
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, Doppler lidars can provide TKE but, as far as I remember, only the most powerful (and expensive) Doppler lidars will be able to have any signal from above the boundary layer, which make it difficult to run such an estimation because the top of the BL might be too close to the end of the profile.
I think I've seen estimates based on SNR/signal quality in that situation, but likely less frequent of a use-case.
Anyway, the function is there, no harm trying it on TKE profiles! Do you think I should change the name (and the doc) of the function to make it more general? Something like
blh_from_concentration_gradient
?
I like blh_from_concentration_gradient
. Definitely mention humidity in the docs, since that's probably what people will be looking for.
src/metpy/calc/boundarylayer.py
Outdated
Contains a collection of boundary layer height estimations. | ||
|
||
|
||
References: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably ask the maintainers about one overall References
section here with citations everywhere else vs. repeated References
sections in each function that needs them.
"""Calculate atmospheric boundary layer height from the inversion of | ||
absolute temperature gradient | ||
|
||
It is the height where the temperature gradient (absolute or potential) changes of sign. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, how well does this work for the convective boundary layer with potential temperature? Or, for that matter, with the nocturnal stable boundary layer with either?
I was expecting to see a threshold method on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I remember (my experience with this is now a bit old), this method is more suited for nocturnal stable boundary layers as it will track the end of the stable layer at the surface.
For convective boundary layer, the parcel method should be preferred to this method, as this method would gives underestimated height with a large variability.
The threshold of
tests/test_boundarylayer.py
Outdated
blh = boundarylayer.blh_from_richardson_bulk(SAMPLE_HEIGHT, potential_temperature, SAMPLE_U, SAMPLE_V) | ||
print(f"Estimated boundary layer height with blh_from_richardson_bulk:\t {blh}") | ||
|
||
blh = boundarylayer.blh_from_parcel(SAMPLE_HEIGHT, potential_temperature) | ||
print(f"Estimated boundary layer height with blh_from_parcel:\t {blh}") | ||
|
||
blh = boundarylayer.blh_from_humidity_gradient(SAMPLE_HEIGHT, SAMPLE_RELATIVE_HUMIDITY) | ||
print(f"Estimated boundary layer height with blh_from_humidity_gradient (relative humidity):\t {blh}") | ||
|
||
blh = boundarylayer.blh_from_humidity_gradient(SAMPLE_HEIGHT, specific_humidity) | ||
print(f"Estimated boundary layer height with blh_from_humidity_gradient (specific humidity):\t {blh}") | ||
|
||
blh = boundarylayer.blh_from_temperature_inversion(SAMPLE_HEIGHT, SAMPLE_TEMPERATURE) | ||
print(f"Estimated boundary layer height with blh_from_temperature_inversion (absolute temperature):\t {blh}") | ||
|
||
blh = boundarylayer.blh_from_temperature_inversion(SAMPLE_HEIGHT, potential_temperature) | ||
print(f"Estimated boundary layer height with blh_from_temperature_inversion (potential temperature):\t {blh}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the usual method is to put the calculations in a function, with some assert
s to indicate what you expect to be true. With this setup, checking that the different methods give answers within a few hundred meters of each other, or within a few hundred meters of the value you picked out from a Skew-T or tephigram, would likely get your intent across.
I'm pretty sure MetPy has an example sounding around somewhere, for the documentation if not for the tests, so you could create similar tests on that one as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can write the corresponding pytest
unitary tests based on these computations. However, I better use a standard Metpy profile rather than my own testing data. Do you know where I could find this sounding data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the review, @DWesl !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/Unidata/MetPy/tree/main/staticdata has a few files matching *_sounding.txt
that look promising. Those are used here:
MetPy/examples/Advanced_Sounding.py
Lines 30 to 31 in eaf608d
df = pd.read_fwf(get_test_data('may4_sounding.txt', as_file_obj=False), | |
skiprows=5, usecols=[0, 1, 2, 3, 6, 7], names=col_names) |
The SkewT tests look like they use p = np.geomspace(1000, 100, 10); T = np.linspace(20, -20, 10)
and similar profiles. You could build something similar for a first test (constant potential temperature and water vapor mixing ratio up to some height, then a sharp increase in the former and a sharp decrease in the latter)
import numpy as np | ||
import pandas as pd | ||
|
||
import metpy.calc as mpcalc |
Check notice
Code scanning / CodeQL
Module is imported with 'import' and 'import from' Note test
# -*-coding:utf-8 -*- | ||
"""Testing program for the MetPy boundary layer module""" | ||
|
||
import numpy as np |
Check notice
Code scanning / CodeQL
Unused import Note test
col_names = ["pressure", "height", "temperature", "dewpoint", "direction", "speed"] | ||
|
||
df = pd.read_fwf( | ||
get_test_data("may4_sounding.txt", as_file_obj=False), |
Check warning
Code scanning / CodeQL
File is not always closed Warning test
src/metpy/calc/boundarylayer.py
Outdated
N = len(val) | ||
smoothed_val = deepcopy(val) | ||
for i in range(N): | ||
smoothed_val[i] = np.nanmean(val[i - min(span, i) : i + min(span, N - i)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smoothed_val[i] = np.nanmean(val[i - min(span, i) : i + min(span, N - i)]) | |
smoothed_val[i] = np.nanmean(val[max(i - span, 0) : min(i + span, N)]) |
Would this be more clear as to the intent?
return smoothed_val | ||
|
||
|
||
def bulk_richardson_number( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might close #628
Add a boundary layer module to estimate boundary height with various classical ways from vertical profiles
Description Of Changes
Boundary layer height estimation is a parameter commonly derived from vertical profiles (from radiosondings or model profiles) with key importance in application domains such as air quality. The issue #1515 was expressing the need for a way to estimate boundary layer height with MetPy. This PR suggests a module in
calc
to do so.The newly added module
boundarylayer.py
includes four different ways to estimate the boundary layer height, each of them is linked to references:Help needed
The imports in the module do not follow the MetPy standards. They are just as if MetPy was an external module. Can someone help me correcting them?
The decorators that are present on the functions of other modules (such as
exporter
) are missing. As I'm not familiar with them, help is welcome here too.The testing program added together with the module only runs the functions in the new module on sample data, but the unitary tests are not defined yet.
Thanks for your help.
Checklist