-
Notifications
You must be signed in to change notification settings - Fork 3k
API, Core: Add data file reference to DeleteFile #11443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| case 15: | ||
| return wrapped.sortOrderId(); | ||
| case 16: | ||
| if (wrapped instanceof DeleteFile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of this instanceof given that it is called in a tight loop. We need it as IndexedDataFile is used for both data and deletes. At the same time, I am not sure it is reasonable to add IndexedDeleteFile for the sake of handling 3 different fields.
Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would define referencedDataFile on ContentFile and have DataFile always return null. That's how we handle content in v1/v2. DataFile returns a constant value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid unnecessary methods in DataFile. The same goes for content offset and size, which will only be applicable to deletes. Let me think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better not to expose these methods on DataFile:
- Avoids reliance on validation outside of this block to ensure these fields are null for data files. We had bugs before that allowed persisting equality field IDs for data files.
- Avoids unnecessary methods that can be invoked but have no meaning, which we have to explain.
| case 17: | ||
| this.referencedDataFile = value != null ? value.toString() : null; | ||
| return; | ||
| case 18: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the ordinal of fileOrdinal. This normally doesn't matter as we read with a projection. We also changed it a couple of times before. Anything I missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as both reads and writes are updated, it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming!
|
Thanks for reviewing, @rdblue @amogh-jahagirdar! |
This PR adds
referenced_data_fileto our delete metadata as per the design doc for improving position deletes.This PR cannot be merged until the vote is closed. This work is part of #11122.