-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Arrow language #8512
Arrow language #8512
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a correct start.
I assume that in the future we want to support "structures" of array elements. E.g. not just an ArrayFixedArrayData32
or 64-bit, but:
ArrayFixedArray
of int32
, Date32
, float
, etc. struct. That will probably lead towards abstracting/defining the structure of single array element rather than having special types of array. Possibly.
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowContext.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowLanguage.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowParser.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowParser.java
Outdated
Show resolved
Hide resolved
...e-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowFixedArrayDate32.java
Outdated
Show resolved
Hide resolved
...e-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowFixedArrayDate32.java
Outdated
Show resolved
Hide resolved
...e-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowFixedArrayDate64.java
Outdated
Show resolved
Hide resolved
engine/runtime-with-polyglot/src/test/java/org/enso/interpreter/test/VerifyArrowTest.java
Outdated
Show resolved
Hide resolved
A recommendation for study: https://www.graalvm.org/truffle/javadoc/com/oracle/truffle/api/staticobject/package-summary.html - that's the official Truffle way to represent immutable (e.g. static) objects. |
810e06e
to
fd7a132
Compare
...arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowCastToFixedSizeArrayFactory.java
Outdated
Show resolved
Hide resolved
d9e640b
to
28a06cb
Compare
...ime-language-arrow/src/main/java/org/enso/interpreter/arrow/node/ArrowCastFixedSizeNode.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/node/ArrowEvalNode.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowParser.java
Outdated
Show resolved
Hide resolved
id = ArrowLanguage.ID, | ||
name = "Truffle implementation of Arrow", | ||
characterMimeTypes = {ArrowLanguage.MIME}, | ||
defaultMimeType = ArrowLanguage.MIME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to explicitly specify internal = true
. By default internal = false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ArrowLanguage
shall be an experimental feature accessible via foreign arrow xyz = """
syntax. I am afraid the language needs to be non-internal to be exposed in the foreign
set of language (as of #7882).
In any case, it'd be good if ArrowLanguage
wasn't accessible by default (throw a parsing error for example). Possibly shield it with
var canArrow = false;
assert canArrow = true;
if (!canArrow) throw ...
A simple example with continuous fixed-size array storing dates is added. More to follow.
Tests were previously wrong and specialization had a typo, which was misleading. In order to be really comparable (not only up-to miliseconds) had to create instant for ZonedDateTime from seconds and nanoseconds.
Also simplified Date implementation to share more code.
Added an example where an IntVector is created in Java, we get a memory pointer to the buffer, and cast it in our Arrow language. The main trick was to ensure that the buffer created at the specific address has `LITTLE_ENDIAN` order. The test also revealed that we didn't type-adjust the index when writing/reading to our fixed-size int array.
It makes little sense to include the full dependency just to allow for memory-mapped byte buffer which we can construct by hand.
Rather than copying over non-null bitmaps (aka `validityBuffer`) we memory-map it, similarly to the data buffer. The code can be further simplified if we lock the type of elements stored in the ByteBuffer to a single one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be happy if there was a separate arrow-language.jar
with own module-info.class
. That way the Arrow language was clearly separated from the Enso runtime.
I'd hide the Arrow language behind a flag - otherwise it is hard to find out it is experiemental. Possibly behind -ea
for now.
build.sbt
Outdated
}, | ||
Test / addModules := Seq("org.enso.interpreter.arrow"), | ||
Test / javaOptions ++= testLogProviderOptions ++ Seq( | ||
"--add-opens=java.base/java.nio=org.enso.interpreter.arrow", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of --add-opens
, but having that in tests only is probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One can probably use --add-opens=org.enso.interpreter.arrow/org.enso.interpreter.arrow==ALL-UNNAMED
- or something like that.
...arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowCastToFixedSizeArrayFactory.java
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/NullValue.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/util/MemoryUtil.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/util/MemoryUtil.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/test/java/org/enso/interpreter/arrow/VerifyArrowTest.java
Outdated
Show resolved
Hide resolved
assertNotNull(int32Constr); | ||
Value int32Array = | ||
int32Constr.newInstance( | ||
vector.getDataBufferAddress(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
address and capacity is OK, for now. It'd be handy (for interaction with Python) to also support InteropLibrary.isPointer.
...untime-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/ByteBufferDirect.java
Show resolved
Hide resolved
That's possible already now.
|
The problem is not that it is possible, but that it is impossible to distribute runtime.jar without Arrow! |
I don't think that is the case anymore. Nothing depends on arrow project. |
I think this first step towards supporting Arrow language is sufficient enough. There is obviously more work involved but the longer I delay the merge the harder becomes to keep it in sync with develop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse the delay in my review. Apart from some minor suggestions, looks good.
engine/runtime-language-arrow/src/test/java/org/enso/interpreter/arrow/VerifyArrowTest.java
Outdated
Show resolved
Hide resolved
private BaseFixedWidthVector allocateFixedLengthVector( | ||
BufferAllocator allocator, Object[] testValues, LogicalLayout unit) { | ||
var valueCount = 0; | ||
switch (unit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: refactor to return switch expression
engine/runtime-language-arrow/src/main/java/org/enso/interpreter/arrow/ArrowParser.java
Outdated
Show resolved
Hide resolved
...time-language-arrow/src/main/java/org/enso/interpreter/arrow/runtime/ArrowFixedArrayInt.java
Outdated
Show resolved
Hide resolved
engine/runtime-language-arrow/src/test/resources/application-test.conf
Outdated
Show resolved
Hide resolved
@JaroslavTulach What do you mean that it is impossible to distribute Or did you mean that you would like to distribute the arrow language, but would like to keep it separate from the |
Co-authored-by: Pavel Marek <[email protected]>
Pull Request Description
Initial implementation of the Arrow language. Closes #7755.
Currently supported logical types are
One can currently
new[<name-of-the-type>]
cast[<name-of-the-type>]
Closes #7755.
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.