Skip to content

Add Athena partition projection docs#13238

Merged
martint merged 1 commit intotrinodb:masterfrom
colebow:colebow/athena-partition-projection-docs
Jul 21, 2022
Merged

Add Athena partition projection docs#13238
martint merged 1 commit intotrinodb:masterfrom
colebow:colebow/athena-partition-projection-docs

Conversation

@colebow
Copy link
Member

@colebow colebow commented Jul 19, 2022

Description

Documentation for Athena partition projection feature. Edit of #13232

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
(x) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

Copy link
Member

@aczajkowski aczajkowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a default somehow for each table?

Copy link
Member Author

@colebow colebow Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From looking at the code, the default value is null, but that null behaves identically to false because it's read in as an Optional and then used in some checks with <property>.orElse(false). So you do need to set this to true to enable it for a table. Should we say it's default false in the docs (because for our users, it may as well be), then submit a follow-up PR to explicitly set a default of false and get rid of the Optional? (I can handle that)

cc @aczajkowski

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little more complex even. Value is optional as we are not owners of it. It may be pre existing in metastore.
So we read it as optional and:

  • if Optional is empty - Projection is not active. Thats why in condition there is <property>.orElse(false)
  • if Optional is not empty
    • if value is false - Projection is there but disabled. We only validate all Projection table and column Properties. If there is some issue users may add Trino owned property partition_projection_ignore=true to by pass compatibility issues
    • if value is true - Projection is enabled. We validate all Projection table and column Properties and use Partition Projection

All above is due to fact that when in AWS Athena projection.enabled=false other properties are not validated and may contain invalid values.
And as we map them to Trino properties and the other way round we may run into issues.
So we prefere to always validate if projection.enabled is there. Or tell Trino to ignore that table projection at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And so the default is null

@colebow colebow force-pushed the colebow/athena-partition-projection-docs branch from 035db4e to b11650d Compare July 20, 2022 06:01
@colebow colebow requested a review from mosabua July 20, 2022 06:01
@aczajkowski
Copy link
Member

@colebow @mosabua Thank you for taking care of it.

@colebow colebow force-pushed the colebow/athena-partition-projection-docs branch from b11650d to 7e43705 Compare July 20, 2022 19:50
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to go.. please merge this before the release is cut @martint

@mosabua
Copy link
Member

mosabua commented Jul 21, 2022

Can you maybe look at this and merge @kokosing ?

@kokosing
Copy link
Member

CI fails

@mosabua
Copy link
Member

mosabua commented Jul 21, 2022

@colebow can you confirm that you built the docs locally and the CI failure is a false alarm?

@mosabua
Copy link
Member

mosabua commented Jul 21, 2022

@kokosing @colebow I just checked out the code and built the docs just fine locally. Must be a CI flakiness.

@aczajkowski
Copy link
Member

@mosabua Ive builded locally as well.
CI error seems unrelated

Error: TestSqlTaskManager.testFailStuckSplitTasks:278 expected [1] but found [0] 

23639[INFO] 

23640Error: Tests run: 8383, Failures: 1, Errors: 0, Skipped: 0

@kokosing
Copy link
Member

I restarted the job. Please report the flaky test as GitHub issue. We should not just ignore flaky failures.

@colebow
Copy link
Member Author

colebow commented Jul 21, 2022

@kokosing it passed this time around. I'll try to look into why that test might be flaky.

@martint martint merged commit d8b3e5d into trinodb:master Jul 21, 2022
@github-actions github-actions bot added this to the 391 milestone Jul 21, 2022
@colebow colebow deleted the colebow/athena-partition-projection-docs branch September 8, 2022 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Write docs for AWS Athena partition projection

5 participants