
Avoid \ for non-allocating #20

Open
ChrisRackauckas opened this issue Nov 19, 2017 · 6 comments

Comments

@ChrisRackauckas

When trying to avoid allocations, note that \ actually allocates. You will instead want to use A_ldiv_B! for the non-allocating form.
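
For illustration, a minimal sketch of the difference (Julia 0.6 syntax, with a small random system as a placeholder):

A = rand(4, 4); b = rand(4)

x = A \ b               # allocates the solution (and a factorization internally)

F = lufact(A)           # factorize once, up front
y = similar(b)          # preallocated output
A_ldiv_B!(y, F, b)      # solves into y without allocating a new result vector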

@acroy
Owner

acroy commented Nov 19, 2017

This refers to chbv! (e.g. here) and padm (e.g. here).

@mforets
Copy link
Contributor

mforets commented Nov 22, 2017

A_ldiv_B! requires that the matrix A be factorized; the factorization objects are special types that depend on the properties of A. Take for example (chbv.jl L92):

for i = 1:2*p
    w .+= (A - t[i]*I) \ (a[i] * vec)
end

Here A - t[i]*I changes inside the loop. I wonder whether this is still a case where A_ldiv_B! gives better performance, since A_ldiv_B!(y, factorize(A - t[i]*I), a[i] * vec) requires calling factorize for each i.

Note that A = factorize(A) and Di = Diagonal(fill(t[i], n)) are valid factorizations, although A - Di is not.
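
For concreteness, the A_ldiv_B! variant of the loop above would look roughly like this (reusing the same names, Julia 0.6 syntax); the shifted matrix, its factorization and a[i] * vec are still allocated on every iteration:

y = similar(w)
for i = 1:2*p
    F = factorize(A - t[i]*I)      # new shifted matrix and new factorization each i
    A_ldiv_B!(y, F, a[i] * vec)    # solves into y, but a[i] * vec still allocates
    w .+= y
end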

@ChrisRackauckas
Author

Here A - t[i]*I changes inside the loop. I wonder whether this is still a case where A_ldiv_B! gives better performance, since A_ldiv_B!(y, factorize(A - t[i]*I), a[i] * vec) requires calling factorize for each i.

Well, two things here. One, \ is internally calling factorize and doing exactly this, so you'd just be going one level lower. A - t[i]*I can be done in place into a cache matrix. Then you can reduce an allocation there by using factorize! on that cache matrix, and make a[i] * vec in place into some cache vector. That gets rid of two matrix allocations and one vector allocation per iteration (the matrix ones because Julia cannot factorize in place by default without your consent!). Whether that's enough to make a performance difference depends on the type of matrix and its size, among other things. Profiling would have to be done to measure how much it matters, but matrix allocations in an inner loop could be hurting and causing GC calls.
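
For example, a quick allocation check could look like this (the problem size is just a placeholder; the second call avoids counting compilation):

using Expokit
A = rand(100, 100); v = rand(100); w = zeros(100)
Expokit.chbv!(w, A, v)         # warm up / compile
@time Expokit.chbv!(w, A, v)   # reports time and allocations of the call itself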

Next, I think factorize should be a default argument, allowing the user to choose the linear solver method. Note that factorize is actually not type-stable, but the instability is contained because the result of A_ldiv_B! is ignored. Allowing the user to set it could speed up calculations on small matrices, and would let people do crazy things.

@acroy
Owner

acroy commented Nov 23, 2017

I tried this yesterday:

function chbv2!{T<:Real}(w::Vector{T}, A, vec::Vector{T})
    p = min(length(θ), length(α))
    scale!(copy!(w, vec), α0)
    tmp_v = similar(vec, Complex128)
    B = A - θ[1]*I  # complex copy of A; its diagonal is overwritten in the loop below

    @inbounds for i = 1:p
        scale!(copy!(tmp_v, vec), α[i])
        for n=1:size(B,1)
            B[n,n] = A[n,n] - θ[i]
        end
        A_ldiv_B!( factorize(B), tmp_v)
        w .+= real( tmp_v )
    end
    return w
end

and the number of allocations is basically reduced by a factor of 2.

I thought factorize! was deprecated (julia#5526) though?

@ChrisRackauckas
Author

I thought factorize! was deprecated (julia#5526) though?

Then I would just switch it to lufact! and make it an option for the user to change. The matrix has to be square, right? So lufact! should usually be the one that makes sense.

w .+= real( tmp_v )

you'll want that as

w .+= real.( tmp_v )

or

@. w += real(tmp_v)

@acroy
Owner

acroy commented Nov 23, 2017

lufact! only works for dense arrays, right? cholfact! requires symmetric/hermitian matrices. If we use an optional argument to provide an appropriate factorize! version, we probably have to default to factorize.
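
As a rough sketch of that interface (hypothetical chbv3!, assuming a dense A and building on the chbv2! above with the module constants θ, α, α0), the factorization routine could be an optional argument with factorize as the safe default:

function chbv3!{T<:Real}(w::Vector{T}, A, vec::Vector{T}, factor = factorize)
    p = min(length(θ), length(α))
    scale!(copy!(w, vec), α0)
    tmp_v = similar(vec, Complex128)
    B = A - θ[1]*I                   # complex work matrix, same shape as A

    @inbounds for i = 1:p
        scale!(copy!(tmp_v, vec), α[i])
        copy!(B, A)                  # refresh B: an in-place factor like lufact! overwrites it
        for n = 1:size(B,1)
            B[n,n] -= θ[i]
        end
        A_ldiv_B!(factor(B), tmp_v)  # e.g. factor = lufact! avoids the extra matrix copy
        w .+= real.(tmp_v)
    end
    return w
end

Calling chbv3!(w, A, v, lufact!) would then factorize the work matrix in place for a dense square A, while the default factorize remains the safe choice otherwise.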
