-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Arbitrarily sized stencils #426
Comments
I ran into a need for this as well. Formulated here: #426. I'm not sure which is a harder engineering challenge to integrate into Accelerate, but the type I expressed I think could support any general convolution, with a function for how to interact element wise with a kernel (Exp a -> Exp b -> Exp c) and then an associative function that reduces down these values to one element (Exp c -> Exp c -> Exp d). I don't really like the notation that stencil uses, because while its convenient to write an explicit function, using it like a kernel array is cumbersome. Would be nice to write gaussianKernel :: Acc (Matrix Float)
gaussianKernel = (use $ A.fromList ... / A.map ...)
gaussian :: Acc (Matrix Float) -> Acc (Matrix Float)
gaussian arr = stridedConvolution (Z :. 0 :. 0) (Z :. 0 :. 0) (*) (+) gaussianKernel arr |
The canny example in the accelerate-examples has some haskell land conversion that makes this sort of thing decent, though still not very pretty convolve5x1 :: Num a => [Exp a] -> Stencil5x1 a -> Exp a
convolve5x1 kernel (_, (a,b,c,d,e), _)
= P.sum $ P.zipWith (*) kernel [a,b,c,d,e]
convolve1x5 :: Num a => [Exp a] -> Stencil1x5 a -> Exp a
convolve1x5 kernel ((_,a,_), (_,b,_), (_,c,_), (_,d,_), (_,e,_))
= P.sum $ P.zipWith (*) kernel [a,b,c,d,e]
...
gaussianX :: Acc (Image Float) -> Acc (Image Float)
gaussianX = stencil (convolve5x1 gaussian) A.clamp
where
gaussian = P.map (/16) [ 1, 4, 6, 4, 1 ] But I had some long compilation time issues previously following this method (which Trevor very thankfully fixed). But I'm not sure what the scaling issues would be like if the kernel would be made much bigger. Whereas at the very worst, formulating this as convolution, you could write a binding to the cudnn libraries which support these types of things in the spirit of the many deep learning use cases. My particular use cases has been some simple object detection (looking for high density chunks of a certain feature by way of a strided convolution). Also, similar to gaussian kernels, being able to create certain angles of gabor filters as Accelerate matrices for convolution would be nice as well. |
I think it is important to tease out a few more details here. As far as I can tell, the discussion here (and in #387) are about convolutions specifically, which are generally more restricted than what In particular this ticket is asking for arbitrarily sized stencils. As a concrete example let's say a 51x51 gaussian blur, which we certainly can't do with the current interface (ignore whether or not this is a sensible thing to do). A basic building block which would allow this might be something like: stencil'
:: Exp sh -- size of the border, I promise not to read outside this offset
-> ((Exp sh -> Exp a) -> Exp b) -- given a function to read neighbouring elements (by relative offset), produce a final value
-> Boundary (Array sh a)
-> Acc (Array sh a)
-> Acc (Array sh b) ...and then implement more user-friendly functions on top of that (e.g. which iterate over the values in the tile and perform a convolution).
That is fair. The gaussian blur example works fine here because the coefficients are known statically. If they aren't then it will be more tedious to work with. For some of your use cases, I'm not sure when it is better to do the |
It should be possible to implement stencils of arbitrary size, where the size may not be known until runtime. This would look something like:
This would be used like the existing stencils, but instead of a (nested) tuple, the summary function passed an array to summarise.
As a motivating example, a Gaussian blur:
The text was updated successfully, but these errors were encountered: