Implement covariant kernels #40
Merged
Conversation
Initial test and example appear to be working well
Should enable us to use one type for both covariance cases
The `RealSpaceKernel` implementation is working and showing good results. It is only slightly slower than the `PixelSpaceKernel`. Could get rid of the pixel-space implementation altogether, but going to keep it for now. Next I will refactor `bin_cmd_smooth` and `partial_cmd_smooth` to add the new real-space kernel.
For `Float32/64` we can use `VectorizationBase.verf` to get a ~2x speedup in `addstar!` which should be worth it as `VectorizationBase` is a dependency of `LoopVectorization` already. To support this specialization, I was able to break the main `erf` calls out into `gaussian_int_general` so that the kernel `evaluate` methods for both `GaussianPSFAsymmetric` and `GaussianPSFCovariant` now call into this single generic function. We are still keeping a fallback for general `Real` inputs to `SpecialFunctions.erf` which supports more datatypes. Also added some docstrings and cleaned up some code syntax throughout.
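A minimal sketch of the specialization pattern described above, with stand-in names: a generic entry point with a fast method for hardware floats and a fallback for other `Real` types. `approx_erf` uses the Abramowitz & Stegun 7.1.26 rational approximation purely so this example is self-contained; the package instead dispatches to `VectorizationBase.verf` (fast path) and `SpecialFunctions.erf` (generic fallback), and `gaussian_int` here is only an illustration of `gaussian_int_general`, not its actual signature.

```julia
# Stand-in erf: Abramowitz & Stegun 7.1.26, |error| < 1.5e-7.
# The package would call VectorizationBase.verf here instead.
function approx_erf(x::Float64)
    s = sign(x)
    x = abs(x)
    t = 1 / (1 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 +
           t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))))
    return s * (1 - poly * exp(-x^2))
end

# Generic entry point: integral of a standard normal PDF from a to b.
# Specialized fast path for hardware floats; a second method for
# general Reals would fall back to SpecialFunctions.erf.
gaussian_int(a::T, b::T) where {T<:Union{Float32,Float64}} =
    (approx_erf(Float64(b) / sqrt(2)) - approx_erf(Float64(a) / sqrt(2))) / 2
```

With this pattern, both kernel `evaluate` methods can call the one generic function and still hit the fast path whenever the element type allows it.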
Additional 2x speed improvement due to better SIMD. This requires ensuring that the loop indices are all inbounds *prior* to starting the loop, whereas the old implementation lazily looped over all defined indices with an explicit `checkbounds()` inside the loop. `LoopVectorization.jl` and `VectorizationBase.jl` are at risk of being deprecated in Julia 1.11, but using them together in this application gives a 4x speedup for the hottest loop in the template creation process, so I think this is worth doing for now.
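A hypothetical sketch of the loop restructuring described above: intersect the kernel stamp's index range with the image axes *before* the loop, so the body needs no bounds checks and the compiler can vectorize it. `addstar_sketch!` and the flat kernel value are illustrative, not the package implementation.

```julia
function addstar_sketch!(img::Matrix{Float64}, x0::Int, y0::Int, half::Int)
    # Clamp the stamp extent to the image axes up front, so every index
    # touched inside the loop is guaranteed inbounds.
    xs = max(1, x0 - half):min(size(img, 1), x0 + half)
    ys = max(1, y0 - half):min(size(img, 2), y0 + half)
    (isempty(xs) || isempty(ys)) && return img  # stamp entirely off-image
    @inbounds for j in ys, i in xs
        img[i, j] += 1.0  # real code would evaluate the kernel here
    end
    return img
end
```

Because the ranges are proven non-empty and inbounds before the loop starts, the same structure also admits `@turbo` from LoopVectorization.jl in place of `@inbounds`.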
add third filter for additional tests
Tried to fold `bin_cmd_smooth` into `partial_cmd_smooth` for better integration, but it allocates massively, probably because the compiler cannot determine the type of some variable. Julia specializes at function boundaries, so these allocations go away when I call out to `bin_cmd_smooth` with the same arguments. Having trouble figuring out what is causing the allocations at the moment, so I might just resort to refactoring `bin_cmd_smooth` after all.
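A minimal illustration (with hypothetical names) of the function-barrier effect described above: when a variable's type cannot be inferred in the outer function, the hot loop allocates; passing the variable across a function boundary lets Julia specialize the inner function on its concrete runtime type, which removes the allocations.

```julia
# Inner function: compiled separately for each concrete type of `x`,
# so the loop inside `sum` is type-stable.
kernel_sum(data, x) = sum(d * x for d in data)

function outer(data)
    x = rand(Bool) ? 1.0 : 1   # type-unstable: x isa Union{Float64, Int}
    return kernel_sum(data, x) # function barrier: dispatch fixes the type
end
```

This is why calling out to `bin_cmd_smooth` avoids the allocations that appear when its body is inlined into `partial_cmd_smooth`.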
Initial results look pretty good on real data. Should probably make `cov_mult` mandatory in `bin_cmd_smooth` rather than optional -- address later. I think I might have a bug in `model_cmd`, as results on synthetic data processed with that function look worse; check later. The kernel implementation is a bit inefficient because the kernel is tilted in matrix space, so many pixels are evaluated that are far from the bulk of the kernel. Could fix this with some sort of sparsity-pattern mask.
At lower Hess diagram resolutions like `size(edges) = (75, 75)` there were significant artifacts in the template model with the new covariant kernel. Doubling the y-resolution to `length=150` gives a much better model. Not obvious why this is the case, as the integration over the pixels should make the solution resolution agnostic. Might have something to do with not resolving the skewed kernel shape if the grid is too coarse? Worth more investigation.
4x performance improvement with `@turbo` loop optimizations. Same approach as the previous commit where I added `@turbo` for `PixelSpaceKernel`: make sure loop iterators aren't empty, all loop indices are inbounds, etc.
Previously if you passed a direct `edges` argument to `calculate_edges` it would return it with no checks. Downstream functions that call this method now assume the output will be a `Tuple{<:AbstractRange, <:AbstractRange}`, so we have to check the input type and error if it is incorrect. This commit also cleans the docs for `bin_cmd`.
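A hedged sketch of the new validation (the helper name here is assumed, not the package's): verify that a user-supplied `edges` is a `Tuple` of two `AbstractRange`s, since downstream callers now rely on that type, and error otherwise.

```julia
function validate_edges(edges)
    # Downstream code assumes Tuple{<:AbstractRange, <:AbstractRange},
    # so reject anything else instead of passing it through unchecked.
    edges isa Tuple{<:AbstractRange, <:AbstractRange} ||
        throw(ArgumentError("edges must be a Tuple of two AbstractRanges; " *
                            "got $(typeof(edges))"))
    return edges
end
```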
Rather than the hard-coded 0.01 deltamag in the call to `mini_spacing`, use the calculated bin steps to set the required deltamag.
No need to pass all the other kws when we've already calculated edges
Forgot this was on by default, turn off to show raw data
When pixels are very large compared to the kernel size, the old offsets could become 0 (e.g., `div(1, 2) == 0`), which would cause failures.
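An illustration of the failure mode and one possible fix (names here are hypothetical): integer division can round a small kernel extent down to zero pixels, producing an empty stamp, while clamping the half-width to at least one pixel avoids it.

```julia
half_width_old(npix::Int) = div(npix, 2)          # div(1, 2) == 0 -> empty stamp
half_width_new(npix::Int) = max(div(npix, 2), 1)  # always covers >= 1 pixel
```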
better to use the rcparam
The old size was mostly adequate, but the x-axis width was a bit low, such that I could see the truncation of the kernel in the plots I'm making for the paper. Since the methods are well optimized, we can afford to increase the size of the kernel; making a single template still takes ~3 ms, so performance remains quite good.
The previous implementation failed to include the altered `deltax` in the integration across the bin/pixel; this was a shortcut that would work well at high grid resolutions (e.g., Hess diagram size=(75, 200)) but had major issues in lower-resolution grids. Re-derive the pixel integrals (see `notes/kernels.nb`) and implement a new method for `gaussian_psf_covariant`. The integral is not entirely analytic, but one axis can be integrated analytically. After analytically integrating along the x-axis, I use 3-point Gauss-Legendre integration along the y-axis to finish the pixel integration. The new residuals look much better and performance is comparable to the `GaussianPSFAsymmetric` implementation.
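A sketch of the 3-point Gauss-Legendre rule used to finish the pixel integral along y after the analytic x integration. This is a generic quadrature helper; the kernel-specific integrand from the package is not shown. The rule is exact for polynomials up to degree 5, which is why three points suffice per pixel.

```julia
# 3-point Gauss-Legendre quadrature of f over [a, b].
function gl3(f, a, b)
    m = (a + b) / 2   # midpoint of the pixel's y-extent
    h = (b - a) / 2   # half-width
    x = sqrt(3 / 5)   # nodes at m and m +/- h*sqrt(3/5); weights 8/9 and 5/9
    return h * (5 / 9 * (f(m - h * x) + f(m + h * x)) + 8 / 9 * f(m))
end
```

In the pixel integration, `f` would be the analytic x-integral of the covariant kernel evaluated at a fixed y.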
add `kernels` docpage and fix examples to work in docs build
This was an old experimental code for implementing the covariant Gaussian kernels that have now been added into the main package.
Codecov Report

@@           Coverage Diff           @@
##             main      #40   +/-   ##
==========================================
+ Coverage   79.69%   88.06%   +8.37%
==========================================
  Files          19       18       -1
  Lines        1374     1416      +42
==========================================
+ Hits         1095     1247     +152
+ Misses        279      169     -110
Fixes #30.
This is a major PR that separates the kernels we support for describing the 2-D probability distributions of observed magnitudes given intrinsic magnitudes into `RealSpaceKernel`s and `PixelSpaceKernel`s, depending on whether their fields (centroids, widths, etc.) are defined in magnitude (real) space or in the pixel space of the Hess diagram matrix.

This was added to enable the definition of the covariant kernel `GaussianPSFCovariant`, which is easier to describe in real-space coordinates than in pixel-space coordinates. This kernel is valid for Hess diagrams where the y-axis magnitude also appears in the x-axis color. The addition of this kernel greatly improves modelling accuracy for these types of Hess diagrams. A future refactor may be able to convert this into a `PixelSpaceKernel`, but the extra code necessitated by the implementation of `addstar!` for subtypes of `RealSpaceKernel` adds very little overhead (~200 ns per call), so this is not a priority.

A new example, `examples/templates/kernels_example.jl`, has been added showing the performance of this functionality. The output of this script is spliced into a new documentation page under the `fitting/internals` heading.

New tests, including of `partial_cmd_smooth`, have been added. Calling `partial_cmd_smooth` will necessarily call most of the other functions in the main "template creation" process. Many of these methods presently lack their own unit tests, but this is low priority for now.

There was also an issue with the old `data/isochrone.txt` which has now been fixed.

I added VectorizationBase.jl to the deps to get `AbstractSIMD` and `verf` for this PR, but now I don't think it is truly necessary. However, it is a dependency of LoopVectorization.jl, which I do need, so we would have to install it anyway. For now I'm fine keeping it but might remove it in the future.