-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streaming arrow query results #317
base: main
Are you sure you want to change the base?
Conversation
|
||
func (r *arrowStreamReader) Retain() {} | ||
|
||
func (r *arrowStreamReader) Release() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite know what this function is supposed to be and where it's called, I have a feeling it's the release
method of the ArrowArrayStream
, in which case it's missing one part of the spec
The release callback MUST mark the structure as released, by setting its release member to NULL.
This release callback doesn't set the release member to NULL, which is necessary to mark the structure as being released
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Tishj,
Thank you very much for the review! It seems that Go's Arrow binding wraps a Go RecordReader
into an ArrowArrayStream
and conforms to the spec with a release wrapper: https://github.com/apache/arrow-go/blob/v18.1.0/arrow/cdata/exports.go#L175
I’ve also updated the implementation to include an atomic reference count, similar to how the official RecordReader
is implemented: https://github.com/apache/arrow-go/blob/v18.1.0/arrow/array/record.go#L44
arrow.go
Outdated
|
||
func (r *arrowStreamReader) Release() { | ||
if r.currentRec != nil { | ||
r.currentRec.Release() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a release
callback for the ArrowArray, it should check whether release
is null before calling it
A released structure is indicated by setting its release callback to NULL. Before reading and interpreting a structure’s data, consumers SHOULD check for a NULL release callback and treat it accordingly (probably by erroring out).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this is not a callback but a method, so it cannot be null.
This PR updates
Arrow.QueryContext
to return a streaming Arrow record reader, rather than materializing all query results in memory. This approach is beneficial for handling query results larger than available memory, such as when exporting a large table.