feat!(runtime): Support custom Runtime in Catalog#2308
feat!(runtime): Support custom Runtime in Catalog#2308CTTY wants to merge 9 commits intoapache:mainfrom
Conversation
| .metadata(metadata) | ||
| .metadata_location(metadata_location) | ||
| .identifier(table_ident) | ||
| .file_io(file_io.clone()) |
There was a problem hiding this comment.
I think we should pass runtime here as well, need to revisit
We can add a builder to IcebergTableProviderFactory and add with_runtime there. But I feel like this can be a separate effort
| pub struct RuntimeHandle { | ||
| /// Keeps the tokio runtime alive when we own it (`Runtime::new`). | ||
| /// `None` when borrowing an existing runtime via `Handle::try_current()`. | ||
| _owned: Option<Arc<tokio::runtime::Runtime>>, |
There was a problem hiding this comment.
I still don't understand why we need to keep this field, even if it's created from Runtime, I think keeping a handle would be enough?
There was a problem hiding this comment.
tokio won't guarantee the handle to complete if the runtime was dropped: https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.html#:~:text=Shutting%20down%20the,yield%20until%20completion.
There was a problem hiding this comment.
What's the problem if it's not complete? If you are worring about some actions may fail in the middle of executions, then this is not the right way to guarantee that. There are many ways to crash the program, and storing Arc<Runtime> doesn't help much and our design should not rely on it.
The problem with this design is that it makes difficult for user to understand the subtle difference between these two api. We should either always ask user to provide Arc<Runtime>, or always always only store Handle.
| use tracing::warn; | ||
|
|
||
| /// Wrapper around tokio's JoinHandle that panics on task failure. | ||
| pub struct JoinHandle<T>(task::JoinHandle<T>); |
There was a problem hiding this comment.
The error handling would be rather painful. tokio's JoinHandle's Output is Result<tokio's T, JoinError>: https://github.com/tokio-rs/tokio/blob/master/tokio/src/runtime/task/join.rs#L325
without the wrapper, we will need to handle the JoinError directly
There was a problem hiding this comment.
In this case I think it's reasonable to keep it, but we should not panic when error happens, we should convert it to iceberg's error.
| T: Send + 'static, | ||
| { | ||
| JoinHandle(task::spawn_blocking(f)) | ||
| impl Default for Runtime { |
There was a problem hiding this comment.
No, we should not implement Default
| pub struct RuntimeHandle { | ||
| /// Keeps the tokio runtime alive when we own it (`Runtime::new`). | ||
| /// `None` when borrowing an existing runtime via `Handle::try_current()`. | ||
| _owned: Option<Arc<tokio::runtime::Runtime>>, |
There was a problem hiding this comment.
What's the problem if it's not complete? If you are worring about some actions may fail in the middle of executions, then this is not the right way to guarantee that. There are many ways to crash the program, and storing Arc<Runtime> doesn't help much and our design should not rely on it.
The problem with this design is that it makes difficult for user to understand the subtle difference between these two api. We should either always ask user to provide Arc<Runtime>, or always always only store Handle.
| use tracing::warn; | ||
|
|
||
| /// Wrapper around tokio's JoinHandle that panics on task failure. | ||
| pub struct JoinHandle<T>(task::JoinHandle<T>); |
There was a problem hiding this comment.
In this case I think it's reasonable to keep it, but we should not panic when error happens, we should convert it to iceberg's error.
Which issue does this PR close?
Runtimesupport enhancement #1958What changes are included in this PR?
with_runtimetoCatalogBuildertraitAre these changes tested?
Yes