MCAnalyzer

MCAnalyzer.jl provides an interface to LLVM MCA for Julia functions.

Usage

MCAnalyzer.jl provides two functions, mark_start and mark_end, which insert special markers into your code. llvm-mca then analyzes the generated object file, restricting its analysis to the code between the two markers.

Currently supported analysis modes:

  • analyze(function, types)
    • Corresponds to a basic analysis with no specific analysis flags
  • timeline(function, types)
    • Corresponds to a timeline analysis with the -timeline flag of llvm-mca
  • bottleneck(function, types)
    • Corresponds to a bottleneck analysis with the -bottleneck-analysis flag of llvm-mca

The analysis is printed to stdout.
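
As a rough sketch of how the three modes are invoked, the example below annotates a small loop and runs each analysis on it. The function name scaled_sum is hypothetical and chosen only for illustration. The calls follow the (function, types) form listed above and assume that llvm-mca is available to the package.

```julia
using MCAnalyzer

# Hypothetical kernel, used only for illustration.
function scaled_sum(A, s)
    acc = zero(eltype(A))
    for i in eachindex(A)
        mark_start()               # start marker at the top of the loop body
        @inbounds acc += s * A[i]
    end
    mark_end()                     # end marker after the loop
    return acc
end

types = (Vector{Float64}, Float64)

analyze(scaled_sum, types)     # basic analysis, no extra flags
timeline(scaled_sum, types)    # timeline analysis (-timeline)
bottleneck(scaled_sum, types)  # bottleneck analysis (-bottleneck-analysis)
```

Each call prints its report to stdout, so the three reports appear one after another.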

Supported architectures

  • HSW: Haswell
  • BDW: Broadwell
  • SKL: Skylake
  • SKX: Skylake-X

By default, analyze uses SKL, but you can supply a target architecture as a third argument, e.g. analyze(func, tt, :SKX).
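
A minimal sketch of comparing targets: the loop below runs the same analysis once per architecture. Only :SKX appears in the text above; passing the other architecture names from the list as symbols (:HSW, :BDW, :SKL) is an assumption that they follow the same convention.

```julia
using MCAnalyzer

function mysum(A)
    acc = zero(eltype(A))
    for i in eachindex(A)
        mark_start()
        @inbounds acc += A[i]
    end
    mark_end()
    return acc
end

# Assumption: every listed architecture is passed as a Symbol, like :SKX.
for arch in (:HSW, :BDW, :SKL, :SKX)
    println("=== ", arch, " ===")
    analyze(mysum, (Vector{Float64},), arch)
end
```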

Caveats

iaca 3.0 currently supports only throughput analysis, which means it is currently only useful for analyzing loops. mark_start() has to be at the beginning of the loop body and mark_end() has to come after the loop. iaca will then treat the loop as an infinite loop.

It is recommended to use @code_llvm/@code_native to inspect the IR/assembly and check that the annotations are in the expected place.
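
One way to do this check, sketched with the function forms code_llvm and code_native from Julia's InteractiveUtils standard library. They take the same (function, types) arguments as analyze. The kernel name checked_sum is hypothetical, and what the markers look like in the output depends on the MCAnalyzer version, so inspect the listing by eye rather than relying on a particular string.

```julia
using InteractiveUtils   # provides code_llvm and code_native outside the REPL
using MCAnalyzer

# Hypothetical kernel, used only for illustration.
function checked_sum(A)
    acc = zero(eltype(A))
    for i in eachindex(A)
        mark_start()
        @inbounds acc += A[i]
    end
    mark_end()
    return acc
end

types = (Vector{Float64},)

# Print the LLVM IR and the native assembly; confirm that the start marker
# sits at the top of the loop body and the end marker follows the loop.
code_llvm(stdout, checked_sum, types)
code_native(stdout, checked_sum, types)
```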

Examples

using MCAnalyzer

function mysum(A)
    acc = zero(eltype(A))
    for i in eachindex(A)
        mark_start()
        @inbounds acc += A[i]
    end
    mark_end()
    return acc
end

analyze(mysum, (Vector{Float64},))

using MCAnalyzer

function f(y::Float64)
    x = 0.0
    for i=1:100
        mark_start()
        x += 2*y*i
    end
    mark_end()
    x
end

analyze(f, (Float64,))

using MCAnalyzer

function g(y::Float64)
    x1 = x2 = x3 = x4 = x5 = x6 = x7 = 0.0
    for i=1:7:100
        mark_start()
        x1 += 2*y*i
        x2 += 2*y*(i+1)
        x3 += 2*y*(i+2)
        x4 += 2*y*(i+3)
        x5 += 2*y*(i+4)
        x6 += 2*y*(i+5)
        x7 += 2*y*(i+6)
    end
    mark_end()
    x1 + x2 + x3 + x4 + x5 + x6 + x7
end

analyze(g, Tuple{Float64})

Notes

Acknowledgment

  • @maleadt for LLVM.jl
  • @carnaval for IACA.jl, the original inspiration for this project