I think the problem is that the rule for exp mutates the gradient it receives, which isn't safe as there's no guarantee this isn't shared with other rules -- here, the rule for + is doing Δ -> (Δ, Δ).
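To make the failure mode concrete, here is a minimal language-neutral sketch in plain Python. It is not the code from this issue or from any AD library: `add_pullback`, `exp_pullback_inplace`, `exp_pullback_copy`, and the function `f(x, b) = sum(exp(x) + b)` are all hypothetical names chosen for illustration. The rule for + hands the same gradient object to both inputs, so an in-place update in the rule for exp also changes the gradient seen by the other input.

```python
import math

def add_pullback(delta):
    # Hypothetical rule for a + b: both inputs receive the same gradient.
    # Returning the same list twice is the aliasing step Δ -> (Δ, Δ).
    return delta, delta

def exp_pullback_inplace(y):
    # Unsafe variant: scales the incoming gradient in place by y = exp(x).
    def pullback(delta):
        for i in range(len(delta)):
            delta[i] *= y[i]
        return delta
    return pullback

def exp_pullback_copy(y):
    # Safe variant: builds a new list and leaves the incoming gradient alone.
    def pullback(delta):
        return [d * yi for d, yi in zip(delta, y)]
    return pullback

# f(x, b) = sum(exp(x) + b), so df/dx = exp(x) and df/db = 1 elementwise.
x = [0.0, 1.0]
y = [math.exp(v) for v in x]

def gradients(exp_pullback):
    delta = [1.0, 1.0]              # gradient of sum w.r.t. its argument
    d_y, d_b = add_pullback(delta)  # d_y and d_b are the same object
    d_x = exp_pullback(d_y)
    return d_x, d_b

dx_bad, db_bad = gradients(exp_pullback_inplace(y))
dx_ok, db_ok = gradients(exp_pullback_copy(y))
```

With the in-place rule, `db_bad` is the very same list as `dx_bad`, so the gradient for `b` has silently been scaled by `exp(x)`. With the copying rule, `db_ok` stays at ones.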
From here #381 (comment), but not sure when this was introduced. Haven't tried to check whether other rules do this too.
Xref FluxML/Zygote.jl#981 maybe --- it would be nice to change the convention to mutate Δ freely by default, and copy only when necessary.
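One possible reading of that convention, sketched in plain Python rather than in any real AD library: the rule that duplicates a gradient is the one responsible for copying, so every other rule may treat its incoming gradient as exclusively owned and overwrite it. The names `add_pullback_copying` and `exp_pullback_owned` are hypothetical, and whether this matches what the linked issue has in mind is an assumption.

```python
import math

def add_pullback_copying(delta):
    # Fan-out is the only place where aliasing is created, so copy here.
    # Assumption: each returned gradient is then exclusively owned downstream.
    return delta, list(delta)

def exp_pullback_owned(y):
    # Allowed to mutate: under the assumed convention, nobody else holds delta.
    def pullback(delta):
        for i in range(len(delta)):
            delta[i] *= y[i]
        return delta
    return pullback

# Same toy function as before: f(x, b) = sum(exp(x) + b).
x = [0.0, 1.0]
y = [math.exp(v) for v in x]

seed = [1.0, 1.0]                          # gradient of sum w.r.t. its argument
grad_y, grad_b = add_pullback_copying(seed)
grad_x = exp_pullback_owned(y)(grad_y)     # overwrites grad_y, not grad_b
```

Here the single copy happens at the fan-out, and the in-place update in the exp rule no longer leaks into `grad_b`. The trade-off is that the duplicating rule always pays for a copy, even when no downstream rule would have mutated its input.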
At its root, this is the same problem as accumulation in the presence of aliasing.
AD is not linearly typed as such; rather, it is linearly typed if you have special dupe and elim operators, which I still need to understand.
Should these agree?