
Iceberg Metadata Cache for faster query planning#13338

Closed
abhinav-yadav-git wants to merge 2 commits into trinodb:master from abhinav-yadav-git:Abhinav/IcebergMetadataCache

Conversation

@abhinav-yadav-git

@abhinav-yadav-git abhinav-yadav-git commented Jul 25, 2022

Description

Similar to the HiveMetastore cache, we need an in-memory cache for the Iceberg connector as well, so that metadata operations (getTableProperties, getTableStats, etc.) can be served from cache, reducing query planning time. The Iceberg connector needs it even more: because metadata files are persisted on a remote file system, each Iceberg table scan triggers remote reads of the metadata JSON and manifest files, and parsing these files in the Iceberg layer is a costly operation.

The current implementation is similar to the Hive one, but it is strongly consistent. It relies on the fact that in Iceberg the metadata JSON is immutable: every write produces a new metadata JSON file. On each getTableHandle call, the cache checks whether the metadata location of the cached table matches the current metadata JSON location; on a mismatch, it invalidates and flushes all the individual caches (such as statsCache and propertiesCache). This makes it practical to cache metadata for long durations without compromising correctness.
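To illustrate the invalidation mechanism described above, here is a minimal sketch, not the actual patch: class, record, and method names are hypothetical, and the real implementation would wrap per-operation caches rather than a single map. The key idea is that a cached entry remembers the metadata JSON location it was built from, and a changed location signals a table write.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: because Iceberg metadata files are immutable, a changed
// metadata location means the table was written to, so cached entries built
// from the old location must be discarded and reloaded.
class IcebergMetadataCacheSketch
{
    // Pairs cached data with the metadata JSON location it was loaded from.
    record CachedTable(String metadataLocation, Map<String, String> properties) {}

    private final Map<String, CachedTable> cache = new ConcurrentHashMap<>();

    public Map<String, String> getTableProperties(String table, String currentMetadataLocation)
    {
        CachedTable cached = cache.get(table);
        if (cached != null && cached.metadataLocation().equals(currentMetadataLocation)) {
            return cached.properties(); // cache hit: metadata unchanged since load
        }
        // Miss or stale entry: (re)load from the remote file system and repopulate.
        Map<String, String> loaded = loadPropertiesFromMetadataJson(currentMetadataLocation);
        cache.put(table, new CachedTable(currentMetadataLocation, loaded));
        return loaded;
    }

    // Stand-in for the expensive remote read and parse of the metadata JSON.
    private Map<String, String> loadPropertiesFromMetadataJson(String location)
    {
        return Map.of("metadata_location", location);
    }
}
```

As long as the current metadata location is cheap to obtain from the catalog (e.g. Glue), this check keeps the cache consistent with no TTL tuning: staleness is detected on the next read rather than bounded by an expiry interval.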

With this patch, the cache is only available for Iceberg tables using the Glue catalog, not the Hive catalog. Getting it working with the Hive catalog would require some refactoring to bypass the CachingHiveMetastore layer, since Hive has its own metastore cache and we want to avoid redundant caches in the code.

Improvements: we have tested this patch on the TPC-DS suite in our lower environments, and with this change we see a 90% reduction in planning time when the underlying table metadata has not changed.

Is this change a fix, improvement, new feature, refactoring, or other?
New Feature

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)
Trino Iceberg Connector

How would you describe this change to a non-technical end user or system administrator?
Iceberg Metadata Cache for caching metadata operations.

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot

cla-bot bot commented Jul 25, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: abhinav-yadav-git.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check whether your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it with: git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails

@abhinav-yadav-git abhinav-yadav-git force-pushed the Abhinav/IcebergMetadataCache branch from 23d72f7 to 810aee1 Compare July 25, 2022 13:29
@electrum
Member

We actually don't allow usage of CachingHiveMetastore in Iceberg, so we don't need to worry about redundant caching.

@findepi
Member

findepi commented Jul 27, 2022

Can you please make sure you have filled out the CLA? For more information, see https://github.com/trinodb/cla.

@abhinav-yadav-git abhinav-yadav-git force-pushed the Abhinav/IcebergMetadataCache branch from 810aee1 to 2619685 Compare August 1, 2022 05:57
@cla-bot cla-bot bot added the cla-signed label Aug 1, 2022
@abhinav-yadav-git
Author

@findepi I have signed the CLA and refreshed the PR.

@alexjo2144
Member

alexjo2144 commented Aug 1, 2022

Between Trino and the Iceberg library there are a few conversations and approaches floating around for reducing planning time.

#11708
apache/iceberg#4518

There's also some caching in the Iceberg library that I don't think we leverage very well right now: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/BaseSnapshot.java#L54

Maybe we can consolidate the conversation somewhere.

Improvements: we have tested this patch on the TPC-DS suite in our lower environments, and with this change we see a 90% reduction in planning time when the underlying table metadata has not changed.

Also, which version did you test against? @findepi merged this change into 391; I think that was a non-trivial improvement.

@abhinav-yadav-git
Author

@alexjo2144 The Iceberg PR you shared is something we are looking at as well. Caching manifest content should help reduce both planning and scheduling time, especially for complex queries where the same table appears in multiple subqueries or nested queries, causing repetitive manifest scanning.
Irrespective of optimizations at the Iceberg layer, caching the metadata operations at the Trino layer should eliminate most of the planning time, assuming most of the planning latency comes from costly table scans. I also agree we could create a parent issue for all planning improvements and link the other issues to it for clarity.

Also, I tested my patch on Trino versions 374, 379, and 388 and saw similar improvements. We are not on 391 yet; the stats cache PR should also bring a good improvement, but I believe it caches within a single query transaction, not across transactions.

@abhinav-yadav-git abhinav-yadav-git force-pushed the Abhinav/IcebergMetadataCache branch from 2619685 to 355e17a Compare August 4, 2022 15:22
@abhinav-yadav-git
Author

abhinav-yadav-git commented Sep 5, 2022

@findepi @electrum @erichwang @alexjo2144 Hi folks, can you please review this PR?

@ElapsedSoul

mark for record

@mosabua
Member

mosabua commented Jan 12, 2024

👋 @abhinav-yadav-git - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

@github-actions

github-actions bot commented Sep 5, 2024

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Sep 5, 2024
@raunaqmorarka
Member

This should be tackled by #22739
