Merged
Change the behaviour of the string array back to the old behaviour, where the Value function returns a string that is backed by the Arrow memory buffer. This avoids allocating data outside of the memory allocator. The implementation of array.String has been simplified somewhat as part of the new behaviour. There are a number of places where correct behaviour relies on copies of the data being made. To avoid having to fix all of these in the same PR, a temporary ValueCopy function has been added to maintain the copying semantics. It is used everywhere the Value function was used previously, except in cases where the value is obviously processed immediately and then discarded.
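To illustrate the difference between a string backed by a buffer and a copied string, here is a minimal Go sketch. It is not the array.String implementation: a plain byte slice stands in for the Arrow memory buffer, and the helper names zeroCopyString, copiedString, and sharesMemory are hypothetical, introduced only for this example. It assumes Go 1.20 or later for unsafe.String and unsafe.StringData.

```go
package main

import (
	"fmt"
	"unsafe"
)

// zeroCopyString (hypothetical helper) returns a string that shares
// memory with b. The caller must not modify or release b while the
// returned string is still in use.
func zeroCopyString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b))
}

// copiedString (hypothetical helper) returns a string with its own
// backing memory, so it stays valid after b is reused or released,
// but the copy is not tracked by whatever allocator owns b.
func copiedString(b []byte) string {
	return string(b)
}

// sharesMemory (hypothetical helper) reports whether s starts at the
// same address as b, i.e. whether s is a view onto b and not a copy.
func sharesMemory(s string, b []byte) bool {
	if len(s) == 0 || len(b) == 0 {
		return false
	}
	return unsafe.StringData(s) == &b[0]
}

func main() {
	// Assumption: this slice stands in for an allocator-owned buffer.
	buf := []byte("hello")

	ref := zeroCopyString(buf)
	cp := copiedString(buf)

	fmt.Println(sharesMemory(ref, buf)) // the view points into buf
	fmt.Println(sharesMemory(cp, buf))  // the copy lives elsewhere
}
```

In this sketch the zero-copy string costs no extra allocation but is only safe while the buffer is alive and unmodified, which is why call sites that keep values around need a copy for now.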
appletreeisyellow (Contributor) approved these changes on May 22, 2024 and left a comment:
It seems a lot of files were touched in this PR, but the main change was adding the binaryArray interface and adapting it to different types. Many of the file changes just rename the method. I left some non-blocking comments and questions ✅
I would prefer another pair of eyes for review!
Comment on lines +176 to +183
// ValueCopy returns the value at the requested position copied into a
// new memory location. This value will remain valid after the array is
// released, but is not tracked by the memory allocator.
//
// This function is intended to be temporary while changes are being
// made to reduce the amount of unaccounted data memory.
func (a *String) ValueCopy(i int) string {
	return string(a.ValueRef(i).Bytes())
Contributor
Thanks for the comments. They are helpful! 👍
Comment on lines +198 to +200
// Buffer returns the memory buffer that contains the value.
func (r StringRef) Buffer() *arrowmem.Buffer {
	return r.buf
Contributor
I can't find anywhere that uses the Buffer() function. Is it still needed?
Contributor
Author
It will be used by the follow-up PRs in this series.
Co-authored-by: Chunchun Ye <14298407+appletreeisyellow@users.noreply.github.com>
jeffreyssmith2nd approved these changes on May 22, 2024
The cases where the VisitCopy function is being used will be addressed one at a time until we can avoid significant levels of unaccounted memory.

Checklist
Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.
experimental/docs/Spec.md has been updated
Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.