-
Notifications
You must be signed in to change notification settings - Fork 166
Intrinsic Functions Design
Currently both LPython and LFortran has intrinsic functions of two types:
- an actual language intrinsic function such as
chr
orlen
in Python andmodulo
orsin
in Fortran - functions that are not directly called by the user, but are still intrinsic, inserted by the frontend (AST to ASR pass) into ASR, an example is the string formatting
%
operator in Python, orformat
in Fortran. You can think of these a auxiliary functions that help the frontend represent certain operations, but from the ASR and backend perspectives they just behave like intrinsic functions.
They are all implemented in the runtime library in src/runtime
as either Python or Fortran source code. There are either just interfaces, or have an implementation:
- Interface only: those are functions that must be implemented by the backends, they cannot be implemented in the language itself. Examples are:
len
,size
,ubound
; as well as operations like bitwise integer operations, and even arithmetic operations (such as plus as in1+1
, minus, multiply, power, etc). - Implementation: functions that can be implemented in the frontend language, or more specifically, they can be implemented using other ASR operations. These are the vast majority of functions, such as trigonometric functions (
sin
,cos
, ...) and other mathematical functions, file operations, string formatting and manipulation, even things like modulo or floor. These functions are implemented either directly, or via calling into C, but in both cases they are implemented in ASR and the backends can treat them like any other function.
A special case of the "Implementation" intrinsic functions are functions like flipsign
which is an intrinsic function, inserted by an optimization pass to represent a certain operation:
- https://gitlab.com/lfortran/lfortran/-/blob/f0a924432ca1a4a83474ac11a4d13bdd2365e17a/src/runtime/builtin/lfortran_intrinsic_optimization.f90#L21
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/libasr/pass/flip_sign.cpp#L102
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/libasr/pass/pass_utils.cpp#L348
And then the backend can optionally provide a very efficient implementation:
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/libasr/codegen/asr_to_llvm.cpp#L3980
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/libasr/codegen/asr_to_llvm.cpp#L4028
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/libasr/codegen/asr_to_llvm.cpp#L4110
We will have hundreds of such optimizations like flipsign
, there is no upper bound on how many we can do. However, they all have an implementation, so if the backend doesn't want to implement them, it can just use the implementation and compile them like any other user function.
I think the number of intrinsic functions that are "Interface" only is limited.
See also the LFortran initial brainstorming for the compile time handling of intrinsic functions like sin
and the current implementation in LFortran and LPython:
- https://gitlab.com/lfortran/lfortran/-/wikis/Intrinsic-Functions
- https://gitlab.com/lfortran/lfortran/-/blob/f0a924432ca1a4a83474ac11a4d13bdd2365e17a/src/lfortran/semantics/comptime_eval.h#L41
- https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/src/lpython/semantics/python_comptime_eval.h#L31
The requirements on ASR passes and backends is that:
- they must be able to quickly identify a given function; probably using an integer ID that is unique
- there shouldn't be any overhead over representing such operation in ASR directly such as currently
BinOp
: https://github.com/lcompilers/lpython/blob/1d7ba2e6de540c7a5a926fe70ae3a32c88adcfd0/grammar/ASR.asdl#L205; that means that instead of representing*
as BinOp and checking the "op" argument (effectively an integer), it would be a FunctionCall, but one would check the ID argument (an integer), there would be just many more options for the ID (hundreds or thousands instead of 5), but it should be possible to implement it as efficiently, using the techniques from re2c --- possibly even as a large switch statement. - It should be as fast to construct the ASR, that is, if
*
is represented asmul_i32_i32
function, constructing theFunctionCall
node should be as fast as constructing theBinOp
node --- the frontend has understanding of these operations anyway, so it can load the functionmul_i32_i32
and store it in a symbol table and then just reference it quickly. With such an approach it should be possible to make it as fast as the current BinOp implementation. - We should not manage integers directly, but either some enum, or defines, so that we have one place in libasr that lists / registers these functions in some kind of a list, but underneath it would be equivalent to integer comparison for speed.
- The "interface" operations are set and the same (equivalent) for each front end language
- The functions like
flipsign
for optimizations are also the same for each language - Each frontend language then has its own functions, such as string formatting functions (Python and Fortran has different string formatting)
We might later convert even BinOp, BoolOp, StrOp into just function calls like any other intrinsic functions, but those should be done at the end, once the design is implemented, so that we can ensure the performance is the same. For now we will implement the above design for other functions first, and we get to these fundamental operations like integer arithmetic at the very end.
Some ideas:
-
For the common functions
len
,size
(interface),flipsign
,fma
(implementation) we can maintain them as ASR code (pretty printed) in libasr itself. Then libasr will have printers of ASR to the surface languages (Python, Fortran, ...), so we simply generate all the code forlen
,size
,flipsign
, ... into LPython and LFortran from libasr's ASR "reference" implementation. To update it, one can update the Python code via LPython, or Fortran code via LFortran, get new ASR and commit to libasr. As we sync libasr across LPython/LFortran, we simply regenerate this part of the runtime library. This approach would apply to all functions for which libasr must have special knowledge, that is, it needs these functions either in the LLVM or other backends (len
,flipsign
), or in the ASR passes (flipsign
). -
The functions that are unique to each frontend (such as string formatting) would simply only be maintained by that frontend. Libasr would not have any special knowledge of these (besides being intrinsic).
For intrinsic functions that cannot be implemented in the frontend language, such as len
, size
, reshape
, list.append()
, ubound
, as well as +
, -
, *
, etc. would be implemented in ASR itself. Then we can remove the lfortran/lpython_builtin
module; ASR becomes 100% self sufficient. It will not require any runtime library. The backends might choose to implement some functionality using a runtime library specific to the backend, but that's a separate issue (say a complex multiplication can be implemented as a C function, but that cannot be done at the ASR level, as the C function requires access to the "internals" of the complex number, which at the ASR level is an "opaque" intrinsic type).
Let us list all such intrinsic "operations".
+, -, *, /, **, bit and, bit or, bit xor, <<, >>
Possibly: popcnt
+, -, *, /, **
Possibly: nearest
+, -, *, /, **, %re, %im
creating a complex number from two reals (as runtime variables)
and, or, xor, not
len
, string concatenation, string indexing/slicing, allocate, deallocate
Note: "string repeat" can be implemented using the above intrinsic operations.
Note: uppercase/lowercase could be handled by casting character to int and implemented in a library. However perhaps it depends on the kind of character. If it is unicode it might be more tricky. Still it probably can be handled somehow in a library.
size
, shape
, reshape
, indexing/slicing, ubound
, lbound
, broadcasting
Possibly: concatenate, insert, append, delete_elements
array operations: +, -, *, /, ** (and "%re, %im" for complex arrays, "bit and, bit or, bit xor, <<, >>" for integer arrays, and "and, or, xor" for logical arrays) --- possibly these operations can be represented using the operators on the basic types that could accept arrays (of exactly the same shape, type and kind) as well.
allocate
, deallocate
Possibly: any
, all
, merge
allocate
, deallocate
, append
, indexing/slicing, len
, remove_item, extend, insert, clear,
Note: find can be implemented as a library function
len
, indexing/slicing
add_item, remove_item_at_index, len
, indexing, clear
add_item, remove_item, len
, clear, operations: -, |, &, ^
- Allocatables:
allocate
,deallocate
,move_alloc
(possibly can be lumped withallocate
),allocated
- Optional arguments:
present
- Conversion between types:
Cast
(formerImplicitCast
),transfer
- C interoperation:
c_loc
,c_associated
,c_f_pointer
,c_funloc
,c_sizeof
,C_F_PROCPOINTER
The proposal is being implemented step by step. See: https://gitlab.com/lfortran/lfortran/-/merge_requests/1700
With some experience implementing the proposal, here are the new requirements for ASR that guide its design:
- ASR is "raised" (not "lowered") from the frontend language; no semantic information was lost.
- It is as far as possible from the frontend language, while still allowing to transform back into it
- It is the canonical representation of the language; all semantics figured out and locally available (readily available from each ASR node; no need for some global look ups or extra passes over ASR to figure anything out)
- Backend can do a single pass over ASR to generate code (it can choose to apply some ASR passes to rewrite some ASR nodes using some simpler nodes; but it doesn't have to do that, in principle it can implement each node in a straightforward manner)
- ASR nodes are implemented in C++ in such a way to allow as fast as possible construction and walking/visiting (necessary for speed of compilation)
- It contains all basic types (logical, integer, real, complex, string, array, pointer, list, tuple, set, dict, class, derived type)
- It contains ASR nodes that operate on variables of these basic types; only those operations that require access to the details how a given type is implemented in the backend. Everything else is implemented by the frontend using existing ASR operations.
- These operations are specific to a type, so we have nodes like:
IntegerBinOp
,ComplexBinOp
,StringLen
,ArraySize
- The binary operations accept two arguments of the same type (
IntegerBinOp
only accepts integers, etc.), and also the samekind
. UseCast
to change kind or type. - We try not to split different types to different nodes, even return types, so we have:
ArrayItem
(returns an element),ArraySection
(returns an array)
- These operations are specific to a type, so we have nodes like:
- ASR is a self-contained (and complete) small language. You can implement a frontend that does not have any runtime library and it would be a very usable language, has everything one needs to implement more complicated operations. Or to rephrase it: there is no runtime library from the ASR perspective, the front end can have a "runtime library", but it compiles it and inserts into ASR, so the
libasr
accepts ASR that is self-contained and the backend only has to implement all ASR nodes. - Backends (LLVM, C++, WASM) can have their own runtime libraries of helper functions to help them implement some of the ASR features (at runtime), but those are always specific to a backend. For example, the LLVM can have some runtime library functions (implemented in C) that implement some of the string functionality; but the C++ backend might choose to just use
std::string
to implement everything.
Next step from a "minimal design":
- Add to ASR all nodes that are shared across all frontends. That includes all nodes that must be implemented by the backend, but it also includes nodes that could have been implemented in the frontend, such as
StringRepeat
. - This "broader scope" can be restated as: each frontend only has to maintain its "runtime library" of functions that are specific to that frontend. If it is shared with another frontend, it can go into ASR itself as a node. In another words: we do not need to maintain a set of ASR functionality/library shared across languages. Everything is either as ASR nodes, or in the frontend (and not shared with other frontends).
- Whether this broader design makes sense remains to be seen. Currently we only use it for
StringRepeat
.