-
-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem with special casing std_dev = 0
#283
Comments
Sorry, I am not able to keep up with all the code changes and discussion recently (I do read them quickly but I don't feel like I have enough time to give good feedback). It seems like the issue here is to find a balance between simplicity and correctness. Not special-casing 0 standard deviation for When I first read this issue, I thought it was going to relate more to #278 and the behavior of |
Yes, your comparison with #278 is apt. It hadn't specifically occurred to me but I think these are two of the main cases where attempts to make
Are you suggesting that it would be more correct for I think it is nice for Yes, as part of my vendetta against treating |
@jagerber48 I think it is inevitable that a UFloat with std_dev=0 will be prone to causing trouble or confusion. The whole point of having a UFloat is that std_dev is not 0. Handling this is an implementation choice, and backward compatibility would be nice but is not necessary. I think it is OK that Without looking at the details of the new code in #262, a few possibilities come to mind:
I think I would favor 2 and 4, but this is not a strong opinion. I also find FWIW, I would even suggest def ufloat(nominal_value, std_dev, tag=None):
if std_dev < 0.0 or abs(std_dev/nominal_value) < 1.e-15:
raise TypeError('ufloat needs a finite, positive value for `std_dev`')
.... This would require that |
@newville thanks for your response, I think it's really helpful. First off, if I try Regarding the issue here and more generally
Going through each in detail:
|
That all sounds reasonable. Any of those is OK with me. To be clear, though, for options 1 and 2, I meant only that creating a UFloat with the function x = ufloat(val, std_dev=0) could return a float, oir raises TypeError. I don't think that would cause downstream trouble or interfere with downstream uses. I would be OK with either option, as x = ufloat(7, 0) just makes no sense. Either "cast to the closest sensible object" or reply with "That ain't no ufloat!". If an ndarray of ufloats values has some value of For your use case, you say "In these cases I want to treat the values as having zero uncertainty.". I have a hard time understanding what that could mean, other than But, hey, go for it! |
It's true that it's easy for my use case and others to "wreak havoc". If If we agree that a. the behavior I've described in 3 pertaining to then I would say that 4 is the least invasive resolution to the bug. 4 changes behavior on a few already strange edge cases whereas 1 and 2 are both pretty major changes to the API, e.g. changing return type or raising an exception on previously valid input. edit: If there is a resolution to this bug that allows us preserve option 3 (i.e. the status quo) while ALSO maintaining the lazy linear combination expansion algorithm performance, I would want to know about that. I can think more about if it would be possible, but I'm a bit doubtful based on my current understanding of the algorithm. Basically, as far as my understanding goes, it is critical that edit2: And I'll repeat: Currently only a few (2 or 4 total) tests relating to |
@jagerber48 @wshanks I'm OK with option 4. I might say that 1 and 2 are sort of API changes, but also that And: option 4 only is OK too. |
@newville thanks for all the feedback. I think I still lean towards 4. One option is to do 4 but give a warning on |
Ok, I've thought more about it and have an important result. 1 and 2 fall prey to the same problem 3 does. If we go with 1 (std_dev = 0 returns a float) then if we do Note So for now I'm going to proceed with option 4 because that is the only one I see that can actually solve the issue being raised. |
@jagerber48 Sorry, I cannot tell if you are being serious or not. If calling the function Again, I mean precisely and only that calling the function
Nope. Not at all. Completey WRONG. That is not what option 1 is. Option 1 and Option 2 are about (and only about, and exactly about, and precisely about) the function "ufloat()". Period, end of story, no more to it. Your I am done with this discussion. I would like to be able to work on this project, but you have a very strong tendency to misunderstand what I say and put words in my mouth. Again, I am taking a break from this project. |
The discussion has moved on but to reply to the question earlier:
Yes, Based on the recent comments, it seems like options 1 and 2 imply option 4 in a way. By this I mean that they try to steer the user away from using 0 standard deviation because havoc may ensue (option 4), but the user is still free to generate a |
How well would using nan as the uncertainty instead of 0 work for your purposes? |
I've been pulled away from this for a couple of weeks as well. Anyways...
Yes I considered this. The answer is not well. I average in old data into new data to get sort of a moving average of the uncertainty over time. Basically the first data with real uncertainty sets the initial uncertainty if older data has no (i.e. zero) uncertainty. But if old data has Anyways, my current plan is:
|
Perfectly willing to be the continued voice of dissension here. I have no problem with UFloats having a standard deviation of zero or the ensuing havoc that will cause. I just don't care what about the result of
There are lots of oddities in the behavior and comparisons of UFloats. It's OK if some calculations blow up. Still, I would prefer that the function Again, this preference is for what the Using |
This is a subtle problem that has been an Achilles heel of #262. On master branch we have
Let
f=x**y
. In this case withx==0
and non-integery
we would expectdf/dx
to benan
. Indeed this is what is returned bypow_deriv_0
inops.py
(the function responsible for this math) https://github.com/lmfit/uncertainties/blob/master/uncertainties/ops.py#L137.We would expect
df/dy
to be 0 in this case and this is what is returned bypow_deriv_1
. So I would expect the uncertainty to beBut since
df/dx
isnan
the whole equation will be converted tonan
. So, I would expect(zero**p).s
to have anan
uncertainty. However, we can see that it does not. The reason it does not is because of this check inAffineScalarFunc.error_components
at line 509uncertainties/uncertainties/core.py
Lines 500 to 516 in 61f688f
Basically if any variable has zero std dev then it is excluded from the uncertainty calculation. My read of this is that
uncertainties
has attempted to treatUFloat
with zero std dev as a float.This has been... tricky.. to implement in #262. I was able to get it at one point, but it required a lot of special casing in various places. However, thinking about it so much has led me to discover what I would call a bug.
In the latter case we see
x_zero ** p
hasnan
std_dev despitex_zero == zero
. The problem is the catch on variables with zero std_dev only works onVariable
. Butx_zero
, is not aVariable
, it is anAffineScalarFunc
. We also can't compute the std_dev ofAffineScalarFunc
mid-computation because this is the costly step that would spoil the O(N) lazy linear combination expansion algorithm.If I eliminate that check then all tests still pass except for one test looking at
pow(zero, p)
and another looking atumath.pow(zero, p)
.I do not like treating
UFloat
with zero std_dev as a float. It requires some awkward special-casing in the code and in the tests. I do not think it will be possible to get(x_zero**p).s
to equal 0. So I don't think(zero**p).s
should be zero. I think it is ok if it isnan
. I'm not sure if docs hint thatUFloat
with zero std_dev should behave like a float, but if it is we could remove those hints. If users somehow depend on the behavior in this edge case, or they're really so surprised about it then I would say they can open an issue. But there are other surprising behaviors, for example #92.The text was updated successfully, but these errors were encountered: